The 1st idea was to read audio via the I2S to USB converter of 10 years ago, then write it to a junk USB soundcard in realtime after processing. This was the 1st time that soundcard produced any useful sound in 20 years, but it was an instant failure. The latency was too long, even with MMIO & only 768 samples buffered. Most of the latency appeared to be in the USB protocol. It was so bad, it would really have to process 1 sample at a time.
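For a rough sense of how much of that delay the 768 sample buffer alone accounts for, here is a minimal sketch. The sample rate isn't stated above, so the rates in the loop are assumptions, & `buffer_latency_ms` is a made up helper name. It only counts the time to fill the buffer once, not anything added by the USB protocol or the soundcard itself.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int) -> float:
    """Time in milliseconds needed to fill a buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate_hz


# Assumed sample rates; the actual rate of the capture is not known here.
for rate in (44100, 48000, 96000):
    buffered = buffer_latency_ms(768, rate)
    single = buffer_latency_ms(1, rate)
    print(f"{rate} Hz: 768 samples = {buffered:.2f} ms, 1 sample = {single:.4f} ms")
```

If the buffer only contributes a number of this size & the measured delay is much longer, that would be consistent with most of the latency coming from the USB side.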
Another idea was using an ESP32 with no buffers. There was a definite advantage in keeping the I2S to USB converter, since multiple devices could capture I2S audio from outside the case. The ESP32 would either capture I2S directly, inside the case, or the I2S to USB converter could output something with lower latency than USB. Wifi wouldn't work very well inside the case. The I2S could reach outside the case, with some drilling. Then, either the USB converter or the ESP32 could plug in.
Then of course, there's just passing analog through the soundcard for live effects & reserving the I2S just for recording.