The speech waveform is highly redundant, so it can be compressed considerably.
The filter bank formulation described below was used to analyse speech
into a number of log-distributed frequency bands. The energy in each of
the N filter outputs was sampled at a rate around 128 times slower than
the original waveform, and the speech was then reconstructed using just
N pure sine waves with
amplitudes adjusted to the N filter amplitudes. The result is a slightly
sing-song version of the original voice signal. The audio sample rate is
8 kHz. Analysis was done in Matlab. Reconstruction was done both in
Matlab and in a C program running on the PIC32, which takes the filter
outputs from a header file. A voice sample was analysed and reconstructed
using 15 filter channels, with the filter power down-sampled to one
value every 16 milliseconds (approximately one fundamental period).
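The power down-sampling step can be sketched in C as a block average of
squared samples. This is only one plausible reading of the step, and the
Matlab analyser may well smooth or decimate differently. The function name
frame_power and its arguments are made up for illustration. A frame length
of 128 samples corresponds to 16 milliseconds at 8 kHz.

```c
#include <assert.h>
#include <stddef.h>

/* Mean power of one filter channel, one value per frame.
 * x         : output samples of a single band-pass filter
 * n         : number of samples in x
 * frame_len : samples per frame (128 gives 16 ms at 8 kHz)
 * out       : receives one mean-power value per complete frame
 * out_cap   : capacity of out
 * Returns the number of frames written. A trailing partial frame
 * is dropped (an assumption, not taken from the original code). */
size_t frame_power(const float *x, size_t n, size_t frame_len,
                   float *out, size_t out_cap)
{
    assert(x != NULL && out != NULL && frame_len > 0);

    size_t frames = n / frame_len;
    if (frames > out_cap) {
        frames = out_cap;
    }
    for (size_t f = 0; f < frames; f++) {
        float acc = 0.0f;
        for (size_t i = 0; i < frame_len; i++) {
            float s = x[f * frame_len + i];
            acc += s * s;               /* accumulate squared samples */
        }
        out[f] = acc / (float)frame_len; /* mean power of this frame */
    }
    return frames;
}
```

Running this once per filter channel yields the N slowly varying values per
frame that drive the reconstruction.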
Overall compression is about 8:1 from the 8-bit, 8 kHz input samples to
the sampled filter outputs. The Matlab analysis and reconstruction code
is here.
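The compression figure can be checked with a little arithmetic, sketched
here in C. The text does not say how many bits are stored per filter
amplitude, so 8 bits is an assumption, and compression_ratio is a
hypothetical helper, not part of the original program.

```c
#include <assert.h>

/* Ratio of raw bits to coded bits over one analysis frame.
 * input_bits     : bits per input sample (8 in the text)
 * decimation     : input samples per frame (128 in the text)
 * channels       : number of filter channels (15 in the text)
 * amplitude_bits : bits stored per filter amplitude (ASSUMED, not stated) */
double compression_ratio(int input_bits, int decimation,
                         int channels, int amplitude_bits)
{
    assert(input_bits > 0 && decimation > 0);
    assert(channels > 0 && amplitude_bits > 0);

    double bits_in  = (double)input_bits * decimation;    /* raw bits per frame   */
    double bits_out = (double)channels * amplitude_bits;  /* coded bits per frame */
    return bits_in / bits_out;
}
```

With 8-bit input, 128 samples per frame, 15 channels and the assumed 8 bits
per amplitude, compression_ratio(8, 128, 15, 8) is 1024/120, or roughly 8.5.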
The C header file written by the Matlab program contains the filter
amplitudes and some constants that the PIC32 reconstruction program
needs. The C program defines N direct digital synthesis units, scales
their amplitudes according to the filter amplitudes stored in the header
file, then sends the result to a 12-bit, SPI-attached DAC. The program
runs under ProtoThreads.
The spectrum of the reconstructed speech looks nothing like the original, but the speech is still understandable.
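The direct digital synthesis reconstruction can be sketched in portable C
as follows. This is a minimal illustration, not the PIC32 program itself:
the names dds_init and dds_next_sample, the 256-entry sine table, the
32-bit phase accumulators, and the assumption that the amplitudes are
pre-scaled so that their sum does not exceed 1 are all choices made for
this sketch. The SPI write to the DAC and the ProtoThreads scheduling are
hardware-specific and left out.

```c
#include <assert.h>
#include <stdint.h>

#define N_CHANNELS     15       /* filter channels, as in the text          */
#define TABLE_SIZE     256      /* sine table length (assumed)              */
#define SAMPLE_RATE_HZ 8000.0   /* audio sample rate, as in the text        */

static float    sine_table[TABLE_SIZE];
static uint32_t phase_acc[N_CHANNELS];  /* 32-bit phase accumulators        */
static uint32_t phase_inc[N_CHANNELS];  /* phase step per output sample     */

/* Build the sine table and set one phase increment per channel.
 * centre_hz holds the sine frequency for each channel (values assumed). */
void dds_init(const float centre_hz[N_CHANNELS])
{
    /* Fill the table by rotating a unit vector by 2*pi/256 per step,
     * which avoids needing the math library. */
    const double c = 0.99969881869620422;   /* cos(2*pi/256) */
    const double s = 0.024541228522912288;  /* sin(2*pi/256) */
    double re = 1.0, im = 0.0;
    for (int i = 0; i < TABLE_SIZE; i++) {
        sine_table[i] = (float)im;
        double t = re * c - im * s;
        im = re * s + im * c;
        re = t;
    }
    for (int k = 0; k < N_CHANNELS; k++) {
        assert(centre_hz[k] > 0.0f && centre_hz[k] < SAMPLE_RATE_HZ / 2.0);
        /* increment = f / Fs * 2^32 */
        phase_inc[k] = (uint32_t)(centre_hz[k] * 4294967296.0 / SAMPLE_RATE_HZ);
        phase_acc[k] = 0;
    }
}

/* Produce one 12-bit DAC code from the current channel amplitudes.
 * amp[k] would be refreshed from the header-file data once per frame. */
uint16_t dds_next_sample(const float amp[N_CHANNELS])
{
    float sum = 0.0f;
    for (int k = 0; k < N_CHANNELS; k++) {
        phase_acc[k] += phase_inc[k];                      /* wraps mod 2^32 */
        sum += amp[k] * sine_table[phase_acc[k] >> 24];    /* top 8 bits     */
    }
    /* Map [-1, 1] onto the 12-bit range 0..4095 and clamp. */
    float v = 2047.5f + 2047.5f * sum;
    if (v < 0.0f)    v = 0.0f;
    if (v > 4095.0f) v = 4095.0f;
    return (uint16_t)v;
}
```

On the PIC32, dds_next_sample would be called once per 8 kHz sample tick and
its result written to the DAC, with the amplitude array advanced to the next
stored frame every 128 samples.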