
Playing back audio using Pulse Code Modulation (PCM)

A project log for Storing and playing back lofi audio on an MCU

Software and hardware for storing 8-kHz, 8-bit (or less) audio on an AVR MCU, and playing it back

Johan Carlsson, 04/24/2022 at 05:18

I like to think about Pulse Code Modulation (PCM) playback as similar to AM radio: there's a high-frequency carrier wave that you amplitude modulate with your low-frequency input signal. The amplitude modulation is done indirectly, by letting the signal strength set the duty cycle of the carrier wave, which has constant frequency and amplitude (strictly speaking, this output stage is pulse-width modulation, PWM, driven by the PCM samples). The duty cycle is one for maximal signal strength and zero for minimal signal strength. After low-pass filtering to get rid of the carrier wave, the output signal approximates the input signal. For Michael Smith's PCM library for AVR MCUs, the carrier wave is a 62.5 kHz ultrasonic square wave. Here is the measured output on Pin 11 of an Arduino clone (with the ATmega328P chip) that is running PCM:

The amplitude of the input signal is set to zero until t = 0.09 ms, when it is cranked up to half the maximum value, resulting in a duty cycle of 1/2. As can be seen, some of the carrier wave bleeds through even with the amplitude turned all the way down. I'm not sure why. [Edit: On ATmega168 and ATmega328 the timers have a minimum duty cycle of 1/256 in fast-PWM mode, as explained in an excellent blog post by @Ken Shirriff.] The sampling frequency of the PCM library is 8 kHz, so the duty cycle can only be changed once every 7.8125 periods of the carrier wave (compare with the seven periods shown in the plot). The separation of time scales between carrier and signal is thus not great, which motivated the development of the active filter documented in my previous log entry. The square wave is not that impressive; one could of course do better with an NE555. However, the nice thing about generating the carrier wave on an AVR MCU is that the modulation also becomes easy to implement.
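The numbers in this paragraph can be checked with a little arithmetic. The sketch below is plain C meant to run on a PC, not code from the PCM library. It assumes a 16 MHz system clock and an 8-bit timer running in fast-PWM mode without prescaler, which is what would give a 62.5 kHz carrier; the function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed values: 16 MHz clock, 8-bit timer, no prescaler, 8 kHz audio. */
#define F_CLK_HZ    16000000UL
#define TIMER_STEPS 256UL
#define SAMPLE_HZ   8000UL

/* Carrier frequency in Hz: one PWM period per timer overflow. */
unsigned long carrier_hz(void)
{
    return F_CLK_HZ / TIMER_STEPS;
}

/* Number of carrier periods between two updates of the duty cycle. */
double carrier_periods_per_sample(void)
{
    return (double)carrier_hz() / (double)SAMPLE_HZ;
}

/* Duty cycle in fast-PWM mode for an 8-bit compare value.
 * The output is high for (compare + 1) out of 256 timer ticks,
 * so a compare value of 0 still gives a duty cycle of 1/256. */
double fast_pwm_duty(uint8_t compare)
{
    return ((double)compare + 1.0) / (double)TIMER_STEPS;
}
```

Under these assumptions `carrier_hz()` gives 62500, `carrier_periods_per_sample()` gives 7.8125, and `fast_pwm_duty(0)` gives 1/256, which would account for the narrow spikes seen at zero amplitude.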

The duty-cycle modulation is done in an Interrupt Service Routine (ISR) that reads the audio-amplitude data byte by byte from flash memory. Resources are scarce, both memory and clock cycles, so it is not possible to decompress audio that uses some fancy compression scheme. With 8 kHz sampling frequency, the Nyquist frequency is 4 kHz. You can't really halve that without going from lofi to nofi, so to extend playback time the only option is to reduce the bit depth of the signal samples.
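To get a feel for what a lower bit depth buys, here is a rough playback-time estimate in plain C. The ATmega328P has 32 KB of flash, part of which is needed for the program itself, so the 30000-byte audio budget used in the note below is only an illustrative guess, and the function names are hypothetical.

```c
#define SAMPLE_HZ 8000UL

/* Number of samples that fit in `bytes` of flash.
 * bit_depth must be 8, 4, 2 or 1. */
unsigned long samples_in_flash(unsigned long bytes, unsigned bit_depth)
{
    return bytes * (8UL / bit_depth);
}

/* Playback time in seconds at the 8 kHz sampling frequency. */
double playback_seconds(unsigned long bytes, unsigned bit_depth)
{
    return (double)samples_in_flash(bytes, bit_depth) / (double)SAMPLE_HZ;
}
```

With a 30000-byte budget this gives 3.75 s at 8 bits, 7.5 s at 4 bits, 15 s at 2 bits and 30 s at 1 bit.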

The original PCM library only supports 8-bit depth, but I've extended it to also play back bit-crushed audio with a bit depth of 4, 2, or 1. I've also made some other changes, so that the code builds and links with avr-gcc without any Arduino code, and so that several audio clips can be played back, each more than once. The latest version of my code can be found in the cardeaduino GitHub repo, but I'll upload the source files to this project page too.
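As an illustration of what playing back packed samples involves, here is a minimal sketch of how one sample could be extracted and scaled back to 8 bits. It is not the code from the repo: the function name, the bit order (first sample in the most significant bits of each byte) and the scaling are all assumptions. It reads from an ordinary array so it can be tried on a PC; on the AVR the byte would instead be fetched from flash, for example with `pgm_read_byte` from avr-libc.

```c
#include <stddef.h>
#include <stdint.h>

/* Return sample number `index` from a packed array as a value in 0..255.
 * bit_depth must be 8, 4, 2 or 1.
 * Assumed layout: first sample in the most significant bits of each byte. */
uint8_t unpack_sample(const uint8_t *data, size_t index, unsigned bit_depth)
{
    unsigned per_byte = 8u / bit_depth;            /* samples per byte */
    uint8_t  byte     = data[index / per_byte];
    unsigned slot     = (unsigned)(index % per_byte);
    unsigned shift    = 8u - bit_depth * (slot + 1u);
    unsigned mask     = (1u << bit_depth) - 1u;
    unsigned code     = ((unsigned)byte >> shift) & mask;

    /* Scale 0..mask to 0..255. This is exact because mask (1, 3, 15
     * or 255) divides 255. */
    return (uint8_t)(code * (255u / mask));
}
```

The ISR would call something like this once per sample period and write the result to the timer's compare register. Scaling to the full 0..255 range is a design choice; a plain left shift would be cheaper but would never reach full scale.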

To debug and test my code I have been using a 1 kHz sine wave generated as a wav file by SoX. The PCM library wants the audio signal in the form of an array of unsigned chars stored in flash memory. To accommodate PCM I've used a converter I call wav2h, which can also be found in the cardeaduino repo. It is based on wav2c by Mathieu Brethes; I've added bit crushing and made some other minor changes. Wav2h takes a mono, 8 kHz wav file as input and outputs a header file with a data array containing the audio samples, as well as some metadata. I've started referring to the output format as "raudio". The header file can be included in an Arduino sketch (cardeaduino.ino in the namesake repo) or in the C source for avr-gcc (cardeaduino.c).
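The bit-crushing step on the converter side can be sketched as follows. This is not the wav2h source: it assumes that crushing simply keeps the most significant bits of each 8-bit sample and that samples are packed with the first one in the most significant bits of each byte. The actual raudio layout may differ, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduce n 8-bit samples to bit_depth bits (8, 4, 2 or 1) and pack them
 * into `out`, first sample in the most significant bits of each byte.
 * `out` must have room for (n * bit_depth + 7) / 8 bytes.
 * Returns the number of bytes written. */
size_t crush_and_pack(const uint8_t *in, size_t n, unsigned bit_depth,
                      uint8_t *out)
{
    unsigned per_byte = 8u / bit_depth;            /* samples per byte */
    size_t   n_bytes  = (n + per_byte - 1u) / per_byte;

    for (size_t b = 0; b < n_bytes; b++)
        out[b] = 0;

    for (size_t i = 0; i < n; i++) {
        /* Crush by truncation: keep only the top bit_depth bits. */
        unsigned code  = (unsigned)in[i] >> (8u - bit_depth);
        unsigned slot  = (unsigned)(i % per_byte);
        unsigned shift = 8u - bit_depth * (slot + 1u);
        out[i / per_byte] |= (uint8_t)(code << shift);
    }
    return n_bytes;
}
```

For example, the four samples 0x00, 0x7F, 0x80, 0xFF become the two bytes 0x07, 0x8F at 4 bits, the single byte 0x1B at 2 bits, and 0x30 at 1 bit.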

Here is what the full resolution (8-bit) "raudio" version of the sine wave looks like:

One thing I learned from this is that SoX doesn't put out a rail-to-rail signal in its wav output files! Here's the bit-crushed version, with 4-bit depth (two samples stored in a single byte):

Literally 2-bit version (four samples per byte):

And finally the 1-bit version (eight samples per byte):

I haven't done any extensive testing yet, but tentatively I think that the 4-bit version might be generally useful for lofi. It sounds pretty similar to the 8-bit version, but with some fairly tolerable noise added. Two-bit might be acceptable in some cases too.
