Audio input for 20 cents USD

A project log for 1 dollar TinyML

Can we build a Machine Learning enabled sensor for under 1 USD?

Jon Nordby, 02/25/2024 at 15:40

TLDR: Using an analog MEMS microphone with an analog opamp amplifier, it is possible to add audio processing to our sensor.
The added BOM cost for audio input is estimated to be 20 cents USD.

A two-stage amplifier with software selectable high/low gain is used to get the most out of the internal microcontroller ADC.
The quality is not expected to be Hi-Fi, but should be enough for many practical Audio Machine Learning tasks.

Ultra low cost microphones

The go-to options for a microphone for a microcontroller based system are digital MEMS (PDM/I2S/TDM protocol), analog MEMS, or analog electret microphone.

The ultra low cost microcontrollers we have found do not have peripherals for decoding I2S or PDM. It is sometimes possible to decode I2S or PDM using fast interrupts/timers or an SPI peripheral, but usually with quite some difficulty and CPU usage. Furthermore, the cheapest digital MEMS microphone we were able to find costs 66 cents. This is too large a part of our 100 cent budget, so a digital MEMS microphone is ruled out.

Below are some examples of analog microphones that could be used. All prices are in quantity 1k, from LCSC.

MEMS analog. SMD mount

Analog electret. Capsule

So there look to be multiple options within our budget.

Example of MEMS analog microphones (from CUI)

The sensitivity of the MEMS microphones is typically -38 dBV to -42 dBV, and they have noise floors of around 30-39 dB(A) SPL.
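To put these numbers in context, here is a small sketch that converts a noise floor in dB SPL into a microphone signal-to-noise ratio and an output noise level in dBV. It assumes that sensitivity is specified at the usual 94 dB SPL (1 Pa) reference level; the function names are just for illustration.

```python
REF_SPL_DB = 94.0  # assumed reference level (1 Pa) for the sensitivity spec


def mic_snr_db(noise_floor_spl_db: float) -> float:
    """SNR of the microphone relative to the 94 dB SPL reference."""
    return REF_SPL_DB - noise_floor_spl_db


def noise_floor_dbv(sensitivity_dbv: float, noise_floor_spl_db: float) -> float:
    """Output noise level of the microphone, in dBV."""
    return sensitivity_dbv - mic_snr_db(noise_floor_spl_db)


for noise_spl in (30.0, 39.0):
    print(f"noise floor {noise_spl:.0f} dB(A) SPL: "
          f"SNR {mic_snr_db(noise_spl):.0f} dB, "
          f"{noise_floor_dbv(-38.0, noise_spl):.0f} dBV at -38 dBV sensitivity")
```

Under that assumption, the microphones land at roughly 55-64 dB SNR.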

Analog pre-amplifier

Any analog microphone will need to have an external pre-amplifier
to bring the output up to a suitable level for the ADC of the microcontroller.

An opamp based pre-amplifier is the go-to solution for this. The requirements for a suitable opamp can be found using the guide in Analog Devices AN-1165, Op Amps for MEMS Microphone Preamp Circuits.

The key criteria, and their implications on opamp specifications, are as follows:

Furthermore, it must work at the voltages available in the system, typically 3.3V from a regulator, or 3.0-4.2V from a Li-ion battery.

ADC considerations

The standard bit-depth for audio is 16 bits, or 24 bits for high-end audio. To cover the full audible range, the sample rate should be 44.1/48 kHz. However, for many Machine Learning tasks 16 kHz is sufficient. Speech is sometimes processed at just 8 kHz, so this can also be used.

The Puya PY32V003 datasheet specifies power consumption at 750k samples per second. However, an ADC conversion takes 12 cycles, and the ADC clock is only guaranteed to be 1 MHz (typical is 4-8 MHz). That would leave 83k samples per second in the worst case, which is sufficient for audio. In fact, we could use an oversampling ratio of 4x or more, if we have enough CPU capacity.
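The sample rate arithmetic can be sketched as follows. This is only a back-of-the-envelope calculation that assumes conversions run back to back with no extra sampling or setup time; the helper names are hypothetical.

```python
def max_sample_rate(adc_clock_hz: float, cycles_per_conversion: int = 12) -> float:
    """Upper bound on the ADC sample rate if conversions run back to back."""
    return adc_clock_hz / cycles_per_conversion


def oversampling_ratio(adc_rate_hz: float, audio_rate_hz: float) -> int:
    """Whole number of ADC samples available per output audio sample."""
    return int(adc_rate_hz // audio_rate_hz)


worst_case = max_sample_rate(1_000_000)  # guaranteed ADC clock of 1 MHz
typical = max_sample_rate(4_000_000)     # low end of the typical 4-8 MHz range

for label, rate in (("worst case", worst_case), ("typical", typical)):
    for audio_rate in (8_000, 16_000):
        ratio = oversampling_ratio(rate, audio_rate)
        print(f"{label}: {rate / 1000:.1f} ksps, "
              f"{ratio}x oversampling at {audio_rate} Hz")
```

At a 16 kHz audio rate the worst case leaves 5 ADC samples per output sample, so a power-of-two ratio of 4x fits.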

The ADC resolution is specified as 12 bits. This means a theoretical max dynamic range of 72 dB. However, some of the lower bits will be noise, reducing the effective bit-depth. Realistically, we are probably looking at an effective bit-depth between 10 bits (60 dB) and 8 bits (48 dB).

Sound levels at a microphone input vary quite a lot in practice. The sound sources of interest may vary a lot in loudness, and the distance from source to sensor also has a large influence. With a low dynamic range this is a particular challenge: if the input signal is low, we will have a poor Signal to Noise Ratio, due to quantization and ADC noise. And if the input signal is high, we risk clipping by maxing out the ADC.
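The dynamic range figures follow from the rule of thumb of roughly 6 dB per bit. Here is a minimal sketch of that calculation, plus a rough estimate of how much SNR is left when the signal sits below full scale. The second function is a simplification that ignores everything except the level relative to full scale.

```python
import math


def dynamic_range_db(bits: float) -> float:
    """Ratio between full scale and one LSB step, in dB: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)


def snr_at_level_db(effective_bits: float, level_dbfs: float) -> float:
    """Rough SNR for a signal that peaks level_dbfs below full scale (negative value)."""
    return dynamic_range_db(effective_bits) + level_dbfs


for bits in (12, 10, 8):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")

# A quiet signal 30 dB below full scale on a 10 bit effective ADC
print(f"{snr_at_level_db(10, -30.0):.1f} dB SNR left")
```

With 8 effective bits, a signal that sits 30 dB below full scale would be left with under 20 dB of SNR, which is why the gain needs to match the expected sound level.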

Finding the gain

The gain is a critical parameter for amplifier design, as it influences almost all other requirements. Let us look at speech as a reference. Normal speech level at 3 meters is approximately 50 dB(A) SPL, and up to 90 dB(A) SPL for shouting up close. These are short-time average levels. And because the sound pressure is not constant, the max level (which the system also needs to represent) is quite a lot higher.

Given a microphone with a sensitivity of -38 dBV, and allowing for 20 dB headroom, the ideal gains would be between 65 dB (1800x) and 25 dB (18x).
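The gain calculation can be sketched like this. Two things here are assumptions and not from the original calculation: that the sensitivity is specified at the usual 94 dB SPL reference, and that the amplifier output should reach about +3 dBV (roughly 1.4 V RMS) at full scale. That target level was picked because it reproduces the 65 dB and 25 dB figures; a different ADC reference voltage would shift both gains by the same amount.

```python
REF_SPL_DB = 94.0  # assumed reference level (1 Pa) for the sensitivity spec


def mic_output_dbv(spl_db: float, sensitivity_dbv: float = -38.0) -> float:
    """Microphone output level in dBV for a given sound pressure level."""
    return sensitivity_dbv + (spl_db - REF_SPL_DB)


def required_gain_db(spl_db: float, headroom_db: float = 20.0,
                     target_dbv: float = 3.0,  # assumed full-scale level at the ADC
                     sensitivity_dbv: float = -38.0) -> float:
    """Gain that puts the average level plus headroom at the target level."""
    peak_dbv = mic_output_dbv(spl_db, sensitivity_dbv) + headroom_db
    return target_dbv - peak_dbv


def db_to_linear(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear factor."""
    return 10 ** (gain_db / 20)


for spl in (50.0, 90.0):
    gain = required_gain_db(spl)
    print(f"{spl:.0f} dB SPL: gain {gain:.0f} dB ({db_to_linear(gain):.0f}x)")
```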



A two-stage amplifier with selectable gain

Integrated Circuits for operational amplifiers come with either 1, 2, or 4 opamps. It turns out that a chip with 2 opamps can be had for basically the same price as 1. It is generally a good idea to split amplification into multiple stages, as this is less likely to hit the limits of the Gain Bandwidth Product of the opamp. However, in this case we can get another benefit which is more important: the ability to have two different gains. By providing them both to the microcontroller as separate ADC channels, we can switch between them in software. This can either be used statically in the form of a high/low switch, or it could be done dynamically by monitoring the inputs, as a very crude form of Automatic Gain Control (AGC).
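The dynamic variant could look something like the sketch below. It is written in Python for illustration only, not as firmware for the microcontroller. It assumes unsigned 12 bit samples biased around mid-scale, and the clipping margin is a made-up value that would need tuning on real hardware.

```python
ADC_MAX = 4095     # full scale of a 12 bit ADC
CLIP_MARGIN = 200  # hypothetical: counts from either rail treated as near clipping


def near_clipping(samples: list[int]) -> bool:
    """True if any sample comes close to the bottom or top of the ADC range."""
    return any(s <= CLIP_MARGIN or s >= ADC_MAX - CLIP_MARGIN for s in samples)


def select_channel(high_gain_block: list[int],
                   low_gain_block: list[int]) -> tuple[str, list[int]]:
    """Prefer the high-gain channel, fall back to low gain when it clips."""
    if near_clipping(high_gain_block):
        return "low", low_gain_block
    return "high", high_gain_block


quiet = [2048 - 100, 2048 + 50, 2048 + 100]  # high-gain channel, quiet sound
clipped = [0, 4095, 2048]                    # high-gain channel, loud sound
low = [1900, 2200, 2048]                     # low-gain channel, same loud sound

print(select_channel(quiet, low)[0])
print(select_channel(clipped, low)[0])
```

Since the two channels have different gains, any later processing would need to rescale the samples by the gain ratio, or at least know which channel a block came from.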

Selecting the operational amplifier

Now we know all the parameters to select the opamp.

From this we can compute the key opamp specs. The equations are covered in the reference design guide from Analog Devices linked previously. We need something that has:

I reviewed a bunch of cheap opamps at LCSC that can run on the relevant voltages. Their specifications can be seen in the following table:


We see that the commodity low-cost, low-power LMV321 type chips are slightly out of spec, in both noise density and gain bandwidth product. The LMV721 class of devices has more than good enough performance. The GS8621 is a good alternative that has lower power consumption.

Audio input BOM

Microphone    Goertek S15OT421-005    0.0888 USD
Opamp         Gainsil GS8632          0.0789 USD

Total of about 17 cents, rounding up to 20 cents with capacitors and resistors.


Now that we have established that the hardware should be able to receive the audio,
we need to validate that we are able to process the audio signal with our rather weak microcontroller. That will be the topic of an upcoming post.