Can we build a Machine Learning enabled sensor for under 1 USD?
TLDR: Using an analog MEMS microphone with an analog opamp amplifier, it is possible to add audio processing to our sensor.
The added BOM cost for audio input is estimated to be 20 cents USD.
A two-stage amplifier with software-selectable high/low gain is used to get the most out of the microcontroller's internal ADC.
The quality is not expected to be Hi-Fi, but should be enough for many practical Audio Machine Learning tasks.
The go-to options for a microphone in a microcontroller-based system are a digital MEMS microphone (PDM/I2S/TDM protocol), an analog MEMS microphone, or an analog electret microphone.
The ultra low cost microcontrollers we have found do not have peripherals for decoding I2S or PDM. It is sometimes possible to decode I2S or PDM using fast interrupts/timers or an SPI peripheral, but usually with considerable difficulty and CPU usage. Furthermore, the cheapest digital MEMS microphone we were able to find costs 66 cents. This is too large a part of our 100 cent budget, so a digital MEMS microphone is ruled out.
Below are some examples of analog microphones that could be used. All prices are in quantity 1k, from LCSC.
MEMS analog. SMD mount
Analog electret. Capsule
So there appear to be multiple options within our budget.
The sensitivity of the MEMS microphones is typically -38 dBV to -42 dBV, and their noise floors are around 30-39 dB(A) SPL.
Any analog microphone will need an external pre-amplifier to bring the output up to a suitable level for the ADC of the microcontroller.
An opamp based pre-amplifier is the go-to solution for this. The requirements for a suitable opamp can be found using the guide in Analog Devices AN-1165, Op Amps for MEMS Microphone Preamp Circuits.
The key criteria, and their implications on opamp specifications, are as follows:
Furthermore, it must work at the voltages available in the system, typically 3.3V from a regulator, or 3.0-4.2V from Li-ion battery.
The standard bit-depth for audio is 16 bit, or 24 bits for high-end audio. To cover the full audible range, the samplerate should be 44.1/48 kHz. However, for many Machine Learning tasks 16 kHz is sufficient. Speech is sometimes processed at just 8 kHz, so this can also be used.
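The sample-rate choice directly sets the raw data rate the firmware must handle. A small helper to illustrate: at 16 kHz and 16 bits, one second of audio is 32 kB, which already exceeds the 4 kB RAM of the target microcontroller, so audio must be processed in small blocks rather than buffered whole:

```c
// Raw audio data rate for a given sample rate and bit depth.
// E.g. 16 kHz at 16 bits is 32000 bytes per second.
static unsigned long audio_bytes_per_second(unsigned long rate_hz,
                                            unsigned bits_per_sample) {
    return rate_hz * (bits_per_sample / 8u);
}
```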
The Puya PY32F003 datasheet specifies power consumption at 750k samples per second. However, an ADC conversion takes 12 cycles, and the ADC clock is only guaranteed to be 1 MHz (typical is 4-8 MHz). That would leave 83k samples per second in the worst case, which is sufficient for audio. In fact, we could use an oversampling ratio of 4x or more - if we have enough CPU capacity.
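The worst-case throughput figure above is just the guaranteed ADC clock divided by the cycles per conversion; a one-line sketch:

```c
// Worst-case ADC sample rate: guaranteed ADC clock divided by the
// number of ADC clock cycles one conversion takes.
// With a 1 MHz clock and 12 cycles/conversion this gives ~83k samples/s.
static unsigned long max_sample_rate(unsigned long adc_clock_hz,
                                     unsigned cycles_per_conversion) {
    return adc_clock_hz / cycles_per_conversion;
}
```

Even in this worst case there is headroom for >4x oversampling at a 16 kHz audio rate.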
The ADC resolution is specified as 12 bits. This means a theoretical max dynamic range of 72 dB. However, some of the lower bits will be noise, reducing the effective bit depth. Realistically, we are probably looking at an effective bit depth between 10 bits (60 dB) and 8 bits (48 dB). Practical sound levels at a microphone input vary quite a lot in practice. The sound sources of interest may vary a lot in loudness, and the distance from source to sensor also has a large influence. Especially for low dynamic range, this is a challenge: if the input signal is low, we will have a poor Signal to Noise Ratio, due to quantization and ADC noise. Or, if the input signal is high, we risk clipping due to maxing out the ADC.
The gain is a critical...
First prototype boards arrived this week.
In the weekend I did basic tests of all the subsystems:
As always with a first revision, there are some issues here and there. But thankfully all of them have usable workarounds. So we can develop with this board.
Examples of issues identified:
Next step will be to write some more firmware to validate more in detail that the board is functional. This includes:
I made an initial development board. It supports both sound-based and accelerometer-based ML tasks, as well as using the LEDs as a color detector. The board is intended for developing and validating the tech stack; further cost-optimization will happen with later revisions.
These are the key components
Using a pre-built and FCC certified module for Bluetooth Low Energy, the Holtek BM7161.
This is a simple module based around the low cost BC7161 chip.
An initial batch of 10 boards have been ordered from JLCPCB.
Also did a check of the BOM costs. At 200 boards, the components except for passives cost:
Additionally, there are around 20 capacitors, 1 small inductor, and 20 resistors needed.
This is estimated to be between 0.15 - 0.20 USD per board.
So it looks feasible to get below the 1 USD target BOM, for as low as 200 boards.
Also designed a small 3d-printed case, with holes for the microphone and LEDs / light sensor.
This looks to be just barely doable on the chosen microcontroller (4 kB RAM and 32 kB FLASH).
Expected RAM usage is 0.5 kB to 3.0 kB, and FLASH usage between 10 kB and 32 kB.
There are accelerometers available that add 20 to 30 cents USD to the Bill of Materials.
Random Forest on time-domain features can do a good job at Activity Recognition.
The open-source library emlearn has efficient Random Forest implementation for microcontrollers.
The most common sub-task for Activity Recognition using accelerometers is Human Activity Recognition (HAR). It can be used for Activities of Daily Living (ADL) recognition such as walking, sitting/standing, running, biking etc. This is now a standard feature on fitness watches and smartphones etc.
But there are ranges of other use-cases that are more specialized. For example:
And many, many more. So this would be a good task to be able to do.
To have a sub 1 USD sensor that can perform this task, we naturally need a very low cost accelerometer.
Looking at LCSC (in January 2024), we can find:
The Silan SC7A20 chip is said to be a clone of LIS2DH.
So there appear to be several options in the 20-30 cent USD range.
Combined with a 20 cent microcontroller, we are still below 50% of our 1 dollar budget.
It seems that our project will have a 32-bit microcontroller with around 4 kB RAM and 32 kB FLASH (such as the Puya PY32F003x6). This sets the constraints that our entire firmware needs to fit inside. The firmware needs to collect data from the sensors, process the sensor data, run the Machine Learning model, and then transmit (or store) the output data. We would like to use under 50% of RAM and FLASH for buffers and model combined, so under 2 kB RAM and under 16 kB FLASH.
We are considering an ML architecture where accelerometer samples are collected into fixed-length windows (typically a few seconds long) that are classified independently. Simple features are extracted from each of the windows, and a Random Forest is used for classification. The entire flow is illustrated in the following image, which is from A systematic review of smartphone-based human activity recognition methods for health research.
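The window-collection step of this architecture is straightforward; a minimal sketch, where the window length and sample type are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

// Fixed-length window collection, as in the architecture described above.
// Window length is illustrative (e.g. ~2.5 s of one axis at 50 Hz).
#define WINDOW_LENGTH 128

typedef struct {
    int16_t samples[WINDOW_LENGTH];
    int count;
} Window;

// Append one sample. Returns true when the window is full and ready
// for feature extraction + classification; since windows are
// classified independently, the window is then reset.
static bool window_push(Window *w, int16_t sample) {
    w->samples[w->count++] = sample;
    if (w->count == WINDOW_LENGTH) {
        w->count = 0;
        return true;
    }
    return false;
}
```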
This kind of architecture was used in the paper Are Microcontrollers Ready for Deep Learning-Based Human Activity Recognition? The paper shows that it is possible to perform similarly to a deep-learning approach, but with resource usage that is 10x to 100x lower. They were able to run on Cortex-M3, Cortex-M4F and Cortex-M7 microcontrollers with at least 96 kB RAM and 512 kB FLASH. But we need to fit into 5% of that resource budget...
The input and intermediate buffers tend to take up a considerable amount of RAM, so an appropriate tradeoff between sampling rate, precision (bit width) and window length (in time) needs to be found. Because we are continuously sampling while also processing the data on the fly, double-buffering may be needed. In the following table, we can see the RAM usage for input buffers holding the sensor data from an accelerometer. The first two configurations were used in the previously mentioned paper:
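The buffer sizes in such a table come from a simple product of the parameters; a sketch, with the example numbers (50 Hz, 2 s window) purely illustrative:

```c
#include <stddef.h>

// Estimated RAM for 3-axis accelerometer input buffers:
// samples-per-window * 3 axes * bytes-per-sample,
// doubled when double-buffering is used.
static size_t buffer_bytes(unsigned sample_rate_hz, float window_seconds,
                           unsigned bytes_per_sample, int double_buffered) {
    const size_t samples = (size_t)(sample_rate_hz * window_seconds);
    const size_t bytes = samples * 3u * bytes_per_sample;
    return double_buffered ? 2u * bytes : bytes;
}
```

For example, 50 Hz with a 2-second window at 16-bit, double-buffered, is 1200 bytes; dropping to 8-bit single-buffered is 300 bytes. This is how the rate/precision/length tradeoff shows up against the 2 kB RAM budget.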
16 bit is the typical full range of accelerometers, so it preserves all the data. It may be possible to reduce this down to 8 bit without sacrificing much performance...
If the complete BOM for the sensor is to be under 1 USD, the microcontroller needs to cost well below this - preferably below 25%, in order to leave budget for sensors, power and communication.
Thankfully, there have been a lot of improvements in this area over the last years. Looking at LCSC.com, we can find some interesting candidates:
There are also a very few sub-1 USD microcontrollers that have integrated wireless connectivity.
If we budget 10-20 cents USD for the microcontroller, then we get around:
At this price point the WCH CH32V003 or the Puya PY32F003x6 look like the most attractive options. Both have decent support in the open community. The WCH CH32 can be targeted with cnlohr/ch32v003fun and the Puya with py32f0-template.
What kind of ML tasks can we manage to perform on such a small CPU? That is the topic for the next steps.