Codec2 expects its input at a nominal signal level to work properly. To test how Codec2 encodes our packets, the PC will generate speech audio on its line-out (what voltage level should be used here?). That output is connected to the left line-in of the SGTL5000 audio codec, which converts the audio voltage levels to signed 16bit PCM samples.
Maximum audio output voltage level
As a laptop only has a headphone output and no line-out, I used an external sound device. The cheapest possible USB audio card has been used here. It only costs €2.
To find the maximum amplitude it can deliver, we download a 1kHz sine wave 0dB file (maximum amplitude). The values of the audio samples vary from -1 to +1.
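If you would rather generate the test tone than download it, the following is a minimal sketch using only the Python standard library. The function name `write_sine_wav`, the output file name and the 48kHz sample rate are my own assumptions for illustration; any sample rate your sound card supports will do.

```python
import math
import struct
import wave


def write_sine_wav(path, freq_hz=1000.0, sample_rate=48000, seconds=5.0, amplitude=1.0):
    """Write a mono 16-bit PCM sine wave. amplitude=1.0 means 0dB (full scale)."""
    peak = int(round(amplitude * 32767))
    n_samples = int(sample_rate * seconds)
    frames = bytearray()
    for n in range(n_samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate
        value = int(round(peak * math.sin(phase)))
        frames += struct.pack("<h", value)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_samples


if __name__ == "__main__":
    # Hypothetical file name, pick whatever suits you.
    write_sine_wav("sine_1kHz_0dB.wav")
```

Passing `amplitude=0.5` gives a tone about 6dB below full scale, which can be handy to check that the measured voltage scales as expected.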
Play it and set your computer sound volume to maximum. Then measure the amplitude. If the waveform starts clipping, then there's a problem in your audio system.
The unloaded headphone output of the Lenovo L580 Thinkpad even goes up to 1.68Vp (=3.36Vpp). Note that the SGTL5000 only accepts line-in voltage levels up to 2.83Vpp (=1Vrms).
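The conversions between peak, peak-to-peak and RMS voltage are easy to get wrong, so here is a small sketch of the arithmetic for a sine wave. The helper names are hypothetical; the 1.68Vp and 2.83Vpp figures are the ones mentioned above.

```python
import math


def vp_to_vpp(vp):
    """Peak to peak-to-peak voltage."""
    return 2.0 * vp


def vpp_to_vrms(vpp):
    """Peak-to-peak to RMS voltage, valid for a sine wave only."""
    return vpp / (2.0 * math.sqrt(2.0))


def attenuation_db(source_vpp, limit_vpp):
    """Gain in dB that brings source_vpp to limit_vpp (negative means attenuate)."""
    return 20.0 * math.log10(limit_vpp / source_vpp)


SGTL5000_LINE_IN_MAX_VPP = 2.83  # limit quoted in the text above

if __name__ == "__main__":
    laptop_vpp = vp_to_vpp(1.68)
    print("laptop output: %.2f Vpp, %.2f Vrms" % (laptop_vpp, vpp_to_vrms(laptop_vpp)))
    print("gain needed to stay within line-in range: %.2f dB"
          % attenuation_db(laptop_vpp, SGTL5000_LINE_IN_MAX_VPP))
```

For the laptop output this works out to roughly 1.5dB of attenuation, so backing off the volume slightly should be enough.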
Ok, so now we know that different audio sources have different maximum output voltages.
Nominal signal level
Maximum signal level is -1 to +1, but what should we use as nominal signal level? Let's download a speech sample from a news report; a broadcast like that should have its level set correctly.
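One way to put a number on the level of such a speech sample is to compute its peak and RMS values relative to full scale. The sketch below assumes the sample has been saved as a 16bit PCM WAV file; the function names and the file name are hypothetical, and the RMS over the whole file is only a rough indication because pauses in the speech pull it down.

```python
import math
import struct
import wave


def read_wav_normalized(path):
    """Read a 16-bit PCM WAV file, return samples scaled to -1..+1 (channels interleaved)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    return [s / 32768.0 for s in struct.unpack("<%dh" % count, raw)]


def to_db(x):
    """Linear amplitude (1.0 = full scale) to dB; silence maps to -infinity."""
    return 20.0 * math.log10(x) if x > 0 else float("-inf")


def levels_dbfs(samples):
    """Return (peak, RMS) of normalized samples, both in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return to_db(peak), to_db(rms)


if __name__ == "__main__":
    # Hypothetical file name for the downloaded news sample.
    peak_db, rms_db = levels_dbfs(read_wav_normalized("news_sample.wav"))
    print("peak: %.1f dB, RMS: %.1f dB" % (peak_db, rms_db))
```

As a sanity check, a full-scale sine wave should report a peak of 0dB and an RMS of about -3dB.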
SGTL5000 audio codec
The analog gain stage before the ADC (controlled by the CHIP_ANA_ADC_CTRL register) of the SGTL5000 will need to be adjusted so that when a 0dB sine wave is played at maximum amplitude from the USB-sound card, it will result in 16bit samples that are also maximum amplitude.
Let's take a 100Hz, 0dB sine wave, so that we have 80 samples per cycle. Remember we're using an 8kHz sampling frequency because that's a Codec2 requirement. Of course we could sample at a higher frequency, but then the ESP32 would have to downsample again.
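Once some samples have been captured (for example copied from the serial output to the PC), a few lines are enough to check the peak value and to spot clipping. This is only a sketch: the function names are hypothetical, and it treats a long run of consecutive samples at full scale as the sign of clipping, since a clean sine wave only touches its peak briefly.

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """Number of samples in one period of the test tone."""
    return sample_rate_hz / tone_hz


def peak_fraction(samples, full_scale=32767):
    """Largest absolute sample value as a fraction of 16-bit full scale."""
    return max(abs(s) for s in samples) / full_scale


def longest_full_scale_run(samples, full_scale=32767):
    """Length of the longest run of consecutive samples stuck at full scale."""
    longest = current = 0
    for s in samples:
        if abs(s) >= full_scale:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


if __name__ == "__main__":
    print(samples_per_cycle(8000, 100))  # 8kHz sampling, 100Hz tone
```

A peak fraction close to 1 with a run length of only one or two samples suggests the gain is about right; a run of many samples means the ADC input is being overdriven.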
The I2S samples could be printed to the Arduino serial plotter to get an idea of the amplitude.