The basic idea
Sample the sound with a microphone, do some filtering and weighting, calculate the noise level in real time on an ESP32 and display the result on a small screen.

It should be quite simple; however, as usual, the devil is in the details.
Sampling sound with I2S digital microphone
The good thing about digital microphones is that you don't need to worry about the analog part: pre-amplification, linearity and speed of the MCU's ADC, and so on. The digital values you receive are also already referenced to sound pressure level (SPL). The datasheet should list the output level for a given SPL, measured with a 1 kHz pure sine tone (e.g. -26 dBFS @ 94 dB SPL). This is usually expressed in dBFS (decibels relative to full scale), so for e.g. -26 dBFS you can convert it to an absolute amplitude based on the maximum value the mic can send. For 24-bit data this is (2^23 − 1) * 10^(−26/20) ~= 420426. This is the amplitude you should expect to receive if you, for example, put the microphone inside a sound level calibrator.
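The conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the names `MIC_BITS`, `MIC_SENSITIVITY_DBFS` and `dbfs_to_amplitude` are illustrative and not taken from the project source.

```python
# Convert a datasheet sensitivity (dBFS at 94 dB SPL) into the expected
# sample amplitude for a signed integer sample of a given width.
# All names here are illustrative, not taken from the project source.

MIC_BITS = 24                  # valid bits per sample
MIC_SENSITIVITY_DBFS = -26.0   # datasheet sensitivity at 94 dB SPL, 1 kHz


def dbfs_to_amplitude(dbfs: float, bits: int) -> float:
    """Return the amplitude corresponding to `dbfs` for signed `bits`-bit data."""
    full_scale = 2 ** (bits - 1) - 1   # largest positive sample value
    return full_scale * 10 ** (dbfs / 20)


ref_amplitude = dbfs_to_amplitude(MIC_SENSITIVITY_DBFS, MIC_BITS)
print(round(ref_amplitude))
```

Changing `MIC_SENSITIVITY_DBFS` or `MIC_BITS` to match another microphone's datasheet gives the corresponding reference amplitude.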
The microphone of choice for this project is the TDK/InvenSense ICS-43434, or more specifically its breakout board available at Tindie. One good thing about this microphone is that its sensitivity is specified as +/-1 dB. This means our measurement of a 94 dB, 1 kHz pure sine tone should be -26 dBFS, +/-1 dB, without any additional calibration. This is pretty good considering I do not have access to any calibration equipment. You can also use the older INMP441 mic, widely available as a cheap breakout board on e.g. Aliexpress, but that one has its sensitivity specified as +/-3 dB.
The hardware
Breadboard friendly, see the list of components

Equalization and weighting
MEMS microphones are usually not ideal, and there should be a frequency response plot in the datasheet. If that curve deviates beyond acceptable limits, we first need to equalize (i.e. flatten) the microphone's native response over the measurement range (20 Hz - 20 kHz), before we measure the actual SPL (i.e. Z-weighted) or apply any weighting filter. We can do this with a digital IIR filter designed to match the inverse of the datasheet frequency plot. See the 'ics43434.m' file for my humble attempt at designing a filter to equalize the ICS-43434. You can copy/paste the math into Octave Online to calculate the coefficients and display the IIR filter frequency response graphs. TLDR, the flattened frequency response should look like this (blue line):

The next step is to apply the frequency weighting, in this case the most common (but probably not the most correct) A-weighting, also implemented as an IIR filter. The coefficients for this filter were taken from here, for a sampling frequency of 48 kHz.
The actual implementation of the IIR filters is taken (and slightly modified for single precision and performance) from the nice Arduino digital filter library, and the ESP32 with its FPU has the required grunt to do the math continuously while sampling.
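To illustrate what such an IIR stage does per sample, here is a minimal sketch of a cascade of second-order sections (biquads) in transposed direct form II. It is not the code from the Arduino filter library, and the coefficients used in the example are simple placeholders, not the equalization or A-weighting coefficients from the project.

```python
# Minimal biquad cascade sketch (transposed direct form II).
# Class, function names and coefficients are illustrative placeholders.
from dataclasses import dataclass


@dataclass
class Biquad:
    # Coefficients normalized so that a0 == 1
    b0: float
    b1: float
    b2: float
    a1: float
    a2: float
    # Filter state, carried over between samples
    z1: float = 0.0
    z2: float = 0.0

    def process(self, x: float) -> float:
        y = self.b0 * x + self.z1
        self.z1 = self.b1 * x - self.a1 * y + self.z2
        self.z2 = self.b2 * x - self.a2 * y
        return y


def run_cascade(sections, samples):
    """Pass each sample through every section in order and return the outputs."""
    out = []
    for x in samples:
        for section in sections:
            x = section.process(x)
        out.append(x)
    return out


# Placeholder one-pole low-pass: y[n] = 0.5*x[n] + 0.5*y[n-1]
lowpass = Biquad(b0=0.5, b1=0.0, b2=0.0, a1=-0.5, a2=0.0)
print(run_cascade([lowpass], [1.0, 1.0, 1.0]))
```

Equalization and weighting can then be chained by placing both sets of sections in the same list, so each sample passes through the equalizer first and the weighting filter second.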
The measurement
And from there it is straightforward. I calculate the RMS of the sampled signal, convert it to decibels referenced to the datasheet value for 94 dB SPL and display the value.
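As a rough sketch of that calculation: the snippet below assumes the datasheet reference (420426 for -26 dBFS at 24 bits) is the peak amplitude of a sine wave, so its RMS is that value divided by the square root of two. The constant and function names are illustrative, and the way the actual sketch handles this offset may differ.

```python
# RMS of a block of samples converted to dB SPL.
# Assumption: MIC_REF_AMPLITUDE is the *peak* amplitude of a sine at 94 dB SPL.
import math

MIC_REF_DB = 94.0
MIC_REF_AMPLITUDE = 420426.0
SAMPLE_RATE = 48000


def rms(samples):
    """Root mean square of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spl_db(samples):
    """Level in dB SPL, referenced to the RMS of the 94 dB reference sine."""
    ref_rms = MIC_REF_AMPLITUDE / math.sqrt(2)
    return MIC_REF_DB + 20 * math.log10(rms(samples) / ref_rms)


# Synthetic 1 kHz tone at the reference amplitude: 480 samples = 10 full periods
tone = [MIC_REF_AMPLITUDE * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE)
        for n in range(480)]
print(round(spl_db(tone), 2))
```

Halving the amplitude of the tone lowers the result by about 6 dB, which is a quick sanity check for the logarithm.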
Sound level measurements are only meaningful in the context of the sampling duration (see Wikipedia). By default, the Arduino sketch displays the LAeq(125ms) measurements as a horizontal line at the top of the screen and the LAeq(1sec) measurement as a numeric value. It also prints the measured value on the serial monitor, so you can graph it with Arduino's 'Serial plotter'.
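The relation between the two durations can be sketched as follows: sums of squares are accumulated per 125 ms block (6000 samples at 48 kHz), and eight such sums are combined for the 1 second value. This averages energy, not decibels. The reference RMS of 1000.0 and all names are hypothetical values chosen for the example, and the samples are assumed to be already weighted.

```python
# Leq over 125 ms blocks and over 1 s, from the same sums of squares.
# REF_RMS is a hypothetical calibration value used only for this example.
import math

SAMPLE_RATE = 48000
BLOCK = SAMPLE_RATE // 8   # 6000 samples = 125 ms
REF_RMS = 1000.0           # RMS that corresponds to REF_DB
REF_DB = 94.0


def block_sums(samples, block=BLOCK):
    """Sum of squares for each complete block of `block` samples."""
    return [sum(s * s for s in samples[i:i + block])
            for i in range(0, len(samples) - block + 1, block)]


def leq_db(sum_sq, count):
    """Equivalent continuous level from a sum of squares over `count` samples."""
    return REF_DB + 10 * math.log10(sum_sq / count / REF_RMS ** 2)


# Half a second at the reference level, then half a second 20 dB quieter
signal = [REF_RMS] * (SAMPLE_RATE // 2) + [REF_RMS / 10] * (SAMPLE_RATE // 2)
sums = block_sums(signal)
short_leq = [leq_db(s, BLOCK) for s in sums]          # eight 125 ms values
long_leq = leq_db(sum(sums), BLOCK * len(sums))       # one 1 s value
print([round(v, 1) for v in short_leq], round(long_leq, 2))
```

The 1 second value ends up much closer to the louder half than the arithmetic mean of the short values, which is why the duration has to be stated with any Leq figure.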
Source code and IIR filter math are available on GitHub.
Hello Ivan. Firstly love this project and I've learnt masses already. Struggling to get my head around some of the code but heck, learning hurts (well, me anyway)!
I'm trying to modify it to use two microphones, each producing a weighted Leq_RMS, using I2S synchronised L/R channels, i.e. tie the ICS43434 SD and BCLK lines together and set the I2S channel_format to I2S_CHANNEL_FMT_RIGHT_LEFT. Two channels means twice the samples, but they come in alternately according to the WS state, if I read the datasheet correctly.
I'm guessing the Weighting + SumSquares filters are going to need adapting to pick out the L or R sample data and deal with them separately. Any suggestions on this approach, or have you attempted this yourself already?
Reason I want to do this is to detect 2D direction of a moving noise, as well as its SPL.
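One possible way to handle the interleaved data described above, as a sketch only: split the buffer into two per-channel lists and keep a separate sum of squares (and, in a real implementation, separate filter state) for each microphone. Which slot holds the left or right microphone is an assumption here and should be verified on the actual hardware and driver version. The function names are hypothetical.

```python
# De-interleave a stereo I2S buffer and accumulate per-channel sums of squares.
# Assumption: samples arrive as [slot0, slot1, slot0, slot1, ...]; whether
# slot0 is the left or the right microphone must be checked on real hardware.


def split_channels(frames):
    """Split an interleaved buffer into (slot0_samples, slot1_samples)."""
    if len(frames) % 2:
        raise ValueError("interleaved buffer must hold an even number of samples")
    return frames[0::2], frames[1::2]


def sum_squares(samples):
    """Sum of squared samples, the quantity accumulated before taking the RMS."""
    return sum(s * s for s in samples)


buffer = [1, 10, 2, 20, 3, 30]   # toy interleaved data
mic_a, mic_b = split_channels(buffer)
print(mic_a, mic_b, sum_squares(mic_a), sum_squares(mic_b))
```

Each channel then needs its own filter instances, because IIR filters carry state from one sample to the next and that state must not be shared between microphones.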