The speech waveform is highly redundant, so compression is useful.
The filter bank formulation described below was used to analyse speech
into a number of log-distributed frequency bands. The energy in each of the
N filter outputs was sampled at a rate around 128 times slower than the
original waveform, then reconstructed using just N pure sine waves with
amplitudes set to the N filter amplitudes. The result is a slightly
sing-song version of the original voice signal. The audio sample rate is
8 kHz. Analysis and reconstruction were both done in Matlab. The Matlab
program also writes the filter output to a header file for a C program
that uses 15 filter channels, with the filter power down-sampled to one
value every 16 milliseconds (approximately one fundamental period).
Overall compression is 8:1 from the 8-bit, 8 kHz input samples to the
sampled filter outputs. The Matlab analyser and reconstruction code is here.
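
The compression figure can be checked with a little arithmetic. The sketch below uses the numbers quoted above (8 kHz, 16 ms, 15 channels) and assumes, which the text does not state, that each filter amplitude is stored in one byte per frame. The function names are made up for illustration.

```c
/* Figures quoted in the text. */
#define SAMPLE_RATE_HZ      8000  /* 8 kHz audio, one 8-bit sample each */
#define FRAME_MS            16    /* filter power kept once every 16 ms */
#define NUM_CHANNELS        15    /* filter channels */

/* ASSUMPTION (not stated in the text): one byte per stored amplitude. */
#define BYTES_PER_AMPLITUDE 1

/* Number of 8-bit input samples covered by one 16 ms frame. */
int samples_per_frame(void)
{
    return SAMPLE_RATE_HZ * FRAME_MS / 1000;
}

/* Number of bytes stored for one frame of filter amplitudes. */
int bytes_per_frame(void)
{
    return NUM_CHANNELS * BYTES_PER_AMPLITUDE;
}

/* Input bytes divided by stored bytes. */
double compression_ratio(void)
{
    return (double)samples_per_frame() / (double)bytes_per_frame();
}
```

Under that one-byte assumption a frame covers 128 input samples and stores 15 values, which is consistent with both the "128 times slower" sampling and the roughly 8:1 compression mentioned above.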
The C header file written by the Matlab program contains the filter coefficients, and
some constants that the PIC32 reconstruction program needs. The C program
defines N direct digital synthesis units and scales their amplitude
according to filter coefficients stored in the header file, then blasts
The spectrum of the reconstructed speech looks nothing like the original, but the speech is understandable.
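
A bank of direct digital synthesis units of the kind described above might look roughly like the following. This is a minimal sketch, not the actual PIC32 program: the struct, the function names, the 32-bit phase accumulator width and the tiny 16-entry sine table are all assumptions chosen to keep the example short. A real program would most likely use a larger table and would reload each amplitude from the header-file data once per 16 ms frame.

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 8000u   /* audio sample rate from the text */
#define NUM_UNITS      15      /* one DDS unit per filter channel */

/* ASSUMPTION: a coarse 16-entry sine table, round(127 * sin(2*pi*k/16)). */
static const int8_t sine_table[16] = {
      0,   49,   90,  117,  127,  117,   90,   49,
      0,  -49,  -90, -117, -127, -117,  -90,  -49
};

/* One DDS unit: a phase accumulator, its per-sample step, and a gain. */
typedef struct {
    uint32_t phase;      /* wraps naturally at 2^32, one full sine cycle */
    uint32_t increment;  /* phase step per output sample */
    uint8_t  amplitude;  /* taken from the stored filter amplitude */
} dds_unit;

/* Set the unit's frequency: increment = freq * 2^32 / sample rate. */
void dds_set_frequency(dds_unit *u, uint32_t freq_hz)
{
    u->phase = 0;
    u->increment = (uint32_t)(((uint64_t)freq_hz << 32) / SAMPLE_RATE_HZ);
}

/* Produce one output sample: sum of amplitude-scaled sines, then step phases. */
int32_t dds_next_sample(dds_unit units[], int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        /* Top 4 bits of the phase select one of the 16 table entries. */
        sum += (int32_t)units[i].amplitude * sine_table[units[i].phase >> 28];
        units[i].phase += units[i].increment;
    }
    return sum;
}
```

In use, each of the NUM_UNITS units would be given the centre frequency of one filter band with dds_set_frequency, its amplitude field would be refreshed from the stored filter data every 128 samples, and dds_next_sample would be called once per output sample at 8 kHz.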