Streaming Received Audio over Wifi/UDP

A project log for ESP32 TNC and Audio Relay for HF/VHF Packet Radio

Wireless packet radio interface for HF/VHF/UHF transceivers, using ESP32 as KISS TNC or audio relay for use with soft modems

Ryan Kinnett 05/18/2020 at 03:36

It’s been a few weeks since my last post, but I have  made steady progress on several fronts.  I have focused mostly on the hardest part of this project:  real-time onboard FSK demodulation.  

I explored two approaches for audio sampling: one using I2S and DMA, similar to the fast audio streaming demo I described in a previous log, and another using a timer and ISR, as described by Ivan Voras here. I will write more about these experiments later, either in an update to this log or in a separate one. I got both approaches working smoothly most of the time, but the streams would drop out for a fraction of a second every few seconds. After days of trying to optimize the code and eliminate these dropouts, I still haven’t gotten to the bottom of it. I had Arduino OTA enabled and was printing to serial, doing some non-trivial batch sample processing, and streaming data over WebSockets, all at the same time. I have not yet tried isolating these interfaces to identify the bottleneck, but I will do that soon.

Despite the intermittent dropouts, I did have some success testing a tone identification method which will likely serve as the core of my onboard demodulation process. I implemented the dual sine filter algorithm described in a late edit to my earlier log regarding demodulation techniques. I set up a WebSocket to stream the isolated 1200 Hz and 2200 Hz magnitude values along with the raw audio, and tested it with a tone generator. It worked beautifully! I’ll post more about this after I work out the periodic dropout issue.
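As a rough illustration of this kind of tone discrimination, the sketch below correlates a window of samples against sine and cosine references at 1200 Hz and 2200 Hz and reports a magnitude for each. It is written in Python for readability rather than as ESP32 code, and it is not necessarily identical to the dual sine filter from the earlier log. The 13200 Hz sample rate, the 132-sample window, and the names tone_magnitude and make_tone are assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 13200  # Hz; assumed so that both tones complete whole cycles per window
WINDOW = 132         # samples per window (10 ms at the assumed rate)


def tone_magnitude(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Estimate the amplitude of freq_hz in samples by quadrature correlation."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    i_sum = sum(s * math.cos(step * n) for n, s in enumerate(samples))
    q_sum = sum(s * math.sin(step * n) for n, s in enumerate(samples))
    # Combining both components makes the result independent of the tone's phase.
    return 2.0 * math.hypot(i_sum, q_sum) / len(samples)


def make_tone(freq_hz, phase=0.7, count=WINDOW, sample_rate=SAMPLE_RATE):
    """Generate a unit-amplitude test tone (stands in for the tone generator)."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [math.sin(step * n + phase) for n in range(count)]


window = make_tone(1200)
mag_1200 = tone_magnitude(window, 1200)
mag_2200 = tone_magnitude(window, 2200)
print(f"1200 Hz: {mag_1200:.3f}   2200 Hz: {mag_2200:.3f}")
```

With a window that does not hold a whole number of cycles of each tone, some energy leaks between the two magnitudes, so the separation is less clean than in this idealized case.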

What’s next for FSK demodulation? Getting the 1200/2200 Hz discrimination algorithm working was a big step. Once I get to the bottom of the dropout problem, the remaining, smaller challenges will be 1) implementing some sort of normalization and thresholding process, 2) first-frame identification, and 3) automatic sample phase optimization/syncing.
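One possible shape for the normalization and thresholding step is sketched below in Python. It is only an illustration of the idea, not a design I have settled on: the function name classify_tone, the dead band, and the signal floor are all hypothetical values picked for the example.

```python
def classify_tone(mag_1200, mag_2200, dead_band=0.2, floor=1e-3):
    """Return 1200 or 2200 for the dominant tone, or None if undecided.

    dead_band and floor are illustrative values, not tuned for real audio.
    """
    total = mag_1200 + mag_2200
    if total < floor:
        return None  # effectively silence, nothing to decide
    # Dividing by the total maps the balance into -1..+1 at any input level.
    balance = (mag_1200 - mag_2200) / total
    if balance > dead_band:
        return 1200
    if balance < -dead_band:
        return 2200
    return None  # tones too close in strength to call


print(classify_tone(0.8, 0.1))
print(classify_tone(0.05, 0.5))
print(classify_tone(0.3, 0.3))
```

Normalizing by the sum of the two magnitudes means the same threshold applies whether the radio's audio output is loud or quiet.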

Onboard demodulation is coming along, but it still has a long way to go before it is reliable.

The more exciting news for this log is that I got half of the 2-way audio relay mode working. The ESP32 samples analog audio, buffers it, and streams it over Wi-Fi via UDP using the VBAN protocol. A VB-Audio Virtual Cable running on the PC receives the audio stream and presents it to applications as a sound card. I am very impressed by the VBAN utilities, which are fully functional donationware, and will happily endorse that project as I get this one off the ground. The relay seems to be working smoothly. I don’t hear noticeable dropouts when I sample general audio through the ESP32, although VBAN does report some packet losses which I have not yet investigated. I will post my Arduino sketch of this test soon. It should also be handy for intercom systems.
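To make the packet layout concrete, here is a minimal Python sketch of framing one block of 16-bit PCM as a VBAN audio packet. The 28-byte header layout and the sample-rate index values reflect my reading of the VBAN specification and should be checked against the official document. The stream name "ESP32", the 11025 Hz rate, and the function name build_vban_packet are assumptions for the example, not values taken from my sketch.

```python
import struct

# Subset of the VBAN sample-rate index table, as I understand the specification.
VBAN_SR_INDEX = {8000: 7, 11025: 14, 16000: 8, 22050: 15,
                 24000: 2, 32000: 9, 44100: 16, 48000: 3}
VBAN_FORMAT_INT16 = 0x01  # 16-bit PCM, no codec (assumed from the spec)


def build_vban_packet(pcm_bytes, frame_counter, sample_rate=11025,
                      channels=1, stream_name="ESP32"):
    """Prefix little-endian int16 PCM with a 28-byte VBAN audio header."""
    sample_count = len(pcm_bytes) // (2 * channels)
    if not 1 <= sample_count <= 256:
        raise ValueError("VBAN carries 1 to 256 samples per packet")
    header = struct.pack(
        "<4sBBBB16sI",
        b"VBAN",
        VBAN_SR_INDEX[sample_rate],   # audio sub-protocol is 0, so only the rate index
        sample_count - 1,             # stored as count minus one
        channels - 1,                 # stored as count minus one
        VBAN_FORMAT_INT16,
        stream_name.encode("ascii"),  # padded with zero bytes to 16 bytes
        frame_counter & 0xFFFFFFFF,   # receiver uses this to detect lost packets
    )
    return header + pcm_bytes


packet = build_vban_packet(b"\x00\x00" * 64, frame_counter=5)
print(len(packet), packet[:4])
```

The frame counter in the last header field is what lets the receiving side report packet losses like the ones mentioned above.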

I initially tested this received-audio sampling and streaming method by playing Stephen Smith’s (WA8LMF) APRS audio sample CD as analog input into the ESP32, streaming the digitized audio over UDP to VBAN Voicemeeter, then decoding the audio stream using UZ7HO’s soundmodem. This setup reliably decoded messages from the test CD. I have not yet systematically characterized reliability by counting the total decoded packets for each track, which seems to be a common benchmarking practice. I later hooked up my Baofeng UV-5R audio output to the ESP32 and successfully decoded many live APRS transmissions. This is huge!


I posted demo code here:

One thing worth noting: I originally planned to have the ESP32 broadcast audio without specifying a destination address, but found this approach to be choppy and intermittent, to the point of being completely unusable. After some googling, I found that UDP broadcast and multicast transmission are generally not recommended for applications with even moderate data volume. I will need to work out how to have the ESP32 configure the destination address dynamically. My initial thought is to include an IP address field in a configuration webpage, which the ESP32 will host once I have built a few more of the core pieces of this project.
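A minimal sketch of how such a configuration field could be checked before use is shown below, in Python for brevity. The function name validate_destination and the 192.168.1.0/24 subnet are hypothetical. The idea is simply to accept a single-host (unicast) IPv4 address and reject the broadcast and multicast addresses that performed badly.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def validate_destination(ip_text, subnet="192.168.1.0/24"):
    """Return a normalized unicast IPv4 address string or raise ValueError.

    subnet is an assumed local network, used to spot its broadcast address.
    """
    addr = ipaddress.IPv4Address(ip_text.strip())  # raises ValueError if malformed
    network = ipaddress.IPv4Network(subnet)
    if addr.is_multicast:
        raise ValueError(f"{addr} is a multicast address")
    if addr == LIMITED_BROADCAST or addr == network.broadcast_address:
        raise ValueError(f"{addr} is a broadcast address")
    return str(addr)


print(validate_destination(" 192.168.1.50 "))
for bad in ("192.168.1.255", "239.1.2.3", "not-an-ip"):
    try:
        validate_destination(bad)
    except ValueError as err:
        print("rejected:", err)
```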

Also worth noting: I'm not seeing the periodic dropouts that I observed while streaming over WebSockets. The UDP stream is continuous, and Voicemeeter very rarely indicates packet errors. I'm thrilled to watch soundmodem decode the majority of the packet transmissions that I can hear.

Here's the demo!