So, I'm happy to say that progress is happening. I got sound! And a waveform on the OLED. And an initial web GUI.
First, a video of the OLED and the GUI (the sound is really low and can be heard only at the end - it was late at night):
Now onto technical details...
For the HTTP and WebSocket server I used the ESPAsyncWebServer library (https://github.com/me-no-dev/ESPAsyncWebServer). It is a really nice library with one downside - it pulls in FS and SPIFFS support (the "filesystem" on the ESP32's flash chip), and with that included, uploading code takes an additional 20-30 seconds (if someone knows how to turn this off, please let me know). The library can serve files directly from SPIFFS, but I decided not to use that. It would require an additional step of uploading the html/js/css files to flash, which is OK, but I'd like to handle everything related to uploading code in one step.
So, here's my kind of hackish solution - embed the whole web page in the code as a const value (this way it gets written to flash). To keep the code small I also decided to use just vanilla JS without external libraries, and I think the final product is quite OK - the UI controls are defined in a variable and built dynamically, so it's easy to change the UI by just changing that variable. To convert the whole html page to a C header file I wrote another Python script (data/convert_to_header.py) which takes the html, removes some whitespace, escapes some characters and packs the result into a C header file.
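To give an idea of what such a conversion step involves, here is a minimal sketch in Python. It is not the actual data/convert_to_header.py - the function name html_to_header and the variable name INDEX_HTML are made up for illustration, and the whitespace handling is an assumption (it only trims each line and drops empty ones, which would also alter things like multi-line JS template strings).

```python
def html_to_header(html, var_name="INDEX_HTML"):
    """Pack an HTML page into the text of a C header (illustrative sketch).

    var_name is a hypothetical identifier, not the one used in the project.
    """
    # Trim indentation/trailing spaces on each line and drop empty lines.
    lines = [line.strip() for line in html.splitlines()]
    lines = [line for line in lines if line]

    literals = []
    for line in lines:
        # Escape backslashes first, then double quotes, so the text
        # survives inside a C string literal.
        escaped = line.replace("\\", "\\\\").replace('"', '\\"')
        # Keep a newline after every line so "//" comments in inline JS
        # do not swallow the code that follows them.
        literals.append(f'    "{escaped}\\n"')

    if not literals:
        literals.append('    ""')

    # Adjacent string literals are concatenated by the C compiler.
    body = "\n".join(literals)
    return f"#pragma once\n\nconst char {var_name}[] =\n{body};\n"
```

Calling html_to_header(open("index.html").read()) and writing the result to a .h file would then give a header that can be included from the sketch and passed to the web server as the response body.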
For now the ESP32 connects to my local WiFi network, but in real use it should start in soft AP mode. Actually, I'm thinking of first checking whether some predefined WiFi network is available and connecting to it, and starting as an AP only if it's not.
Currently, the oscillators are implemented via a simple (non-bandlimited) lookup table. The filter is Paul Kellett's design (http://musicdsp.org/showone.php?id=29) - it's quite usable but I'll probably switch to some other design.
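For readers who haven't seen one, a lookup-table oscillator boils down to a phase accumulator stepping through one stored cycle of the waveform. The sketch below shows the idea in Python for readability; it is not the firmware code, and the table size, sample rate and class name are assumptions picked for illustration. The index is simply truncated (no interpolation) and the sawtooth is naive, so it aliases at higher pitches - that is what "non-bandlimited" means here.

```python
TABLE_SIZE = 256        # assumed table length
SAMPLE_RATE = 44100.0   # assumed output sample rate in Hz

# One cycle of a naive sawtooth in [-1, 1); no band-limiting is applied.
SAW_TABLE = [2.0 * i / TABLE_SIZE - 1.0 for i in range(TABLE_SIZE)]


class TableOscillator:
    """Phase-accumulator oscillator reading from a single-cycle table."""

    def __init__(self, table, sample_rate=SAMPLE_RATE):
        self.table = table
        self.sample_rate = sample_rate
        self.phase = 0.0       # current read position, in table entries
        self.increment = 0.0   # table entries advanced per output sample

    def set_frequency(self, hz):
        # One full pass through the table per period of the waveform.
        self.increment = hz * len(self.table) / self.sample_rate

    def next_sample(self):
        # Truncate the phase to an index (no interpolation).
        sample = self.table[int(self.phase)]
        # Advance and wrap the phase so it stays inside the table.
        self.phase = (self.phase + self.increment) % len(self.table)
        return sample
```

Usage would be along the lines of osc = TableOscillator(SAW_TABLE); osc.set_frequency(440.0); and then calling osc.next_sample() once per output sample.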
This first prototype had components soldered directly to the ESP board and a lot of flying wires, so I decided to rebuild it on perfboard. Unfortunately, I disassembled the first prototype without taking a photo of it.