Algorithm for Note Detection

As mentioned in the previous step, the detection is difficult due to the presence of multiple frequencies in the audio samples.

The program works in the following flow:

1. Data acquisition:

- This section takes 128 samples of audio data. The spacing between two samples (which sets the sampling frequency) depends on the frequency range of interest. In this case, the time between two samples is used to apply the Hann window function and to accumulate the sums for the amplitude/RMS calculation. The code also does a rough zero shift by subtracting 500 from the analogRead value. This value can be changed if required, but it works well for a typical case. Further, some delay needs to be added to get a sampling frequency of around 1200 Hz. With a 1200 Hz sampling frequency, frequencies up to 600 Hz can be detected.

for(int i=0;i<128;i++)
{
  a=analogRead(Mic_pin)-500;               // rough zero shift
  sum1=sum1+a;                             // to average value
  sum2=sum2+a*a;                           // to RMS value
  a=a*(sin(i*3.14/128)*sin(i*3.14/128));   // Hann window
  in[i]=4*a;                               // scaling for float to int conversion
  delayMicroseconds(195);                  // based on operation frequency range
}
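The sin² factor in the loop is the Hann window written in a different form, since sin²(πi/N) = 0.5·(1 − cos(2πi/N)). As a minimal sketch in plain C++ (the function names and the kPi constant are illustrative and not part of the original sketch), the two forms can be compared like this:

```cpp
#include <cmath>

const double kPi = 3.14159265358979323846;

// Hann weight for sample i of n, written as sin^2 like the acquisition loop.
double hannSinSquared(int i, int n) {
    double s = std::sin(i * kPi / n);
    return s * s;
}

// The same weight in the textbook raised-cosine form.
double hannCosine(int i, int n) {
    return 0.5 * (1.0 - std::cos(2.0 * kPi * i / n));
}
```

The weight is 0 for the first sample and 1 in the middle of the block (i = 64 for 128 samples), so the edges of each block are faded out before the FFT.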

2. FFT:

Once the data is ready, the FFT is performed using EasyFFT. The EasyFFT function is modified to fix the FFT size at 128 samples, and also to reduce memory consumption. The original EasyFFT function is designed to handle up to 1028 samples (with a compatible board), while we only need 128 samples. This code uses around 20% less memory than the original EasyFFT function.

Once the FFT is done, the code returns the 5 most dominant frequency peaks for further analysis. These frequencies are arranged in descending order of amplitude.
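The internals of the modified EasyFFT are not shown in this step, so the following is only a sketch of the peak-picking idea in plain C++: find the local maxima of a magnitude spectrum, convert the bin index to a frequency, and keep the strongest five. The Peak struct, the topPeaks function and its parameters are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Peak {
    double frequencyHz;
    double amplitude;
};

// Returns up to maxPeaks local maxima of a magnitude spectrum, strongest first.
// magnitude[k] is the amplitude of FFT bin k, and
// binWidthHz = sampling frequency / number of samples.
std::vector<Peak> topPeaks(const std::vector<double>& magnitude,
                           double binWidthHz, std::size_t maxPeaks = 5) {
    std::vector<Peak> peaks;
    for (std::size_t k = 1; k + 1 < magnitude.size(); ++k) {
        // A bin is a peak when it rises above its left neighbour
        // and is not lower than its right neighbour.
        if (magnitude[k] > magnitude[k - 1] && magnitude[k] >= magnitude[k + 1]) {
            peaks.push_back({k * binWidthHz, magnitude[k]});
        }
    }
    std::sort(peaks.begin(), peaks.end(),
              [](const Peak& a, const Peak& b) { return a.amplitude > b.amplitude; });
    if (peaks.size() > maxPeaks) peaks.resize(maxPeaks);
    return peaks;
}
```

With 128 samples, each bin is 1/128 of the sampling frequency wide, which is why low notes that are only a few Hz apart are hard to separate.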

3. Note detection:

For every peak, the code detects the possible note associated with it. This code only scans up to 1200 Hz. The detected note is not necessarily the one at the frequency with the maximum amplitude.

All frequencies are mapped to a value between 0 and 255.

First the octave is detected: for example, 65.4 Hz to 130.8 Hz represents one octave, and 130.8 Hz to 261.6 Hz represents another. Within every octave, frequencies are mapped from 0 to 255, with the mapping running from C to C'.

// else-if chain, so a value already mapped to 0..255 is not mapped a second time
if(f_peaks[i]>1040){f_peaks[i]=0;}        // discard peaks above the range of interest
else if(f_peaks[i]>=65.4   && f_peaks[i]<=130.8) {f_peaks[i]=255*((f_peaks[i]/65.4)-1);}
else if(f_peaks[i]>=130.8  && f_peaks[i]<=261.6) {f_peaks[i]=255*((f_peaks[i]/130.8)-1);}
else if(f_peaks[i]>=261.6  && f_peaks[i]<=523.25){f_peaks[i]=255*((f_peaks[i]/261.6)-1);}
else if(f_peaks[i]>=523.25 && f_peaks[i]<=1046)  {f_peaks[i]=255*((f_peaks[i]/523.25)-1);}
else if(f_peaks[i]>=1046 && f_peaks[i]<=2093)    {f_peaks[i]=255*((f_peaks[i]/1046)-1);}   // not reached while the 1040 Hz cut-off is active

NoteV array values are used to assign the note to the detected frequencies.

byte NoteV[13]={8,23,40,57,76,96,116,138,162,187,213,241,255};
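To make the mapping concrete, here is a small sketch in plain C++ that goes from a frequency to a note number. It assumes that each NoteV entry is the upper edge of one semitone bin and that the 13th bin (up to 255) is the C of the next octave; the lookup loop itself is not shown in this step, so treat that as my reading of the table. The function names and the -1 return value for out-of-range input are illustrative.

```cpp
#include <cstdint>

const std::uint8_t NoteV[13] = {8, 23, 40, 57, 76, 96, 116, 138, 162, 187, 213, 241, 255};

// Maps a frequency to 0..255 within its octave (C to C'), or returns -1
// when the frequency is outside 65.4 Hz .. 1040 Hz.
double octavePosition(double freqHz) {
    if (freqHz < 65.4 || freqHz > 1040.0) return -1.0;
    double base = 65.4;                        // C2
    if (freqHz >= 523.25)     base = 523.25;   // C5
    else if (freqHz >= 261.6) base = 261.6;    // C4
    else if (freqHz >= 130.8) base = 130.8;    // C3
    return 255.0 * (freqHz / base - 1.0);
}

// Returns the note number (0 = C, 1 = C#, ... 11 = B), or -1 if out of range.
int noteNumber(double freqHz) {
    double pos = octavePosition(freqHz);
    if (pos < 0.0) return -1;
    for (int k = 0; k < 13; ++k) {
        if (pos <= NoteV[k]) return k % 12;    // k == 12 is the next octave's C
    }
    return 0;  // a hair above 255 (the octave edges are rounded) is still C
}
```

For example, 440 Hz falls in the 261.6 Hz to 523.25 Hz octave and maps to 255*(440/261.6 - 1), which is about 174. That lies between 162 and 187 in NoteV, so the note number is 9 (A).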

a. Note detection:

After calculating the note for every frequency, it may be the case that multiple frequencies suggest the same note. To give an accurate output, the code also considers repetitions. It adds up the values for each note based on amplitude order and repetitions, and picks the note with the maximum total.
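The voting idea can be sketched as follows in plain C++. The exact weights used by the original code are not shown in this step, so the scheme here (the strongest peak gets the highest weight, decreasing by one for each following peak) is an assumption, and pickNote is a hypothetical name.

```cpp
#include <vector>

// notesByRank holds the note number (0..11) of each peak, strongest peak first.
// Entries outside 0..11 are ignored. Returns the winning note, or -1 if none.
int pickNote(const std::vector<int>& notesByRank) {
    int score[12] = {0};
    int weight = static_cast<int>(notesByRank.size());
    for (int note : notesByRank) {
        if (note >= 0 && note < 12) score[note] += weight;  // stronger peaks count more
        --weight;
    }
    int best = -1;
    for (int n = 0; n < 12; ++n) {
        if (score[n] > 0 && (best < 0 || score[n] > score[best])) best = n;
    }
    return best;
}
```

With five peaks suggesting A, E, E, C, E (strongest first), E collects 4 + 3 + 1 = 8 points against 5 for A, so E wins even though the strongest single peak was A.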

b. Chord detection:

for (int i=0;i<12;i++)
{
  in[20+i]=in[i]*in[i+4]*in[i+7];   // major chord check (root, +4, +7 semitones)
  in[32+i]=in[i]*in[i+3]*in[i+7];   // minor chord check (root, +3, +7 semitones)
}

This section checks for all chords by multiplying the note values with each other according to the major and minor chord combinations. It also reuses the same input array for data storage.

Further, the chord with the maximum likelihood (the largest product) is selected for display.
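The selection step can be sketched like this in plain C++. Instead of reusing the input array, this version keeps the 12 note scores in their own array and wraps around with a modulo. The Chord struct and the pickChord function are hypothetical names used only for illustration.

```cpp
struct Chord {
    int root;      // 0 = C ... 11 = B, or -1 when no chord scored above zero
    bool isMinor;
};

// score[n] is the accumulated strength of note n (0..11) from the peak analysis.
Chord pickChord(const long score[12]) {
    Chord best = {-1, false};
    long bestProduct = 0;
    for (int i = 0; i < 12; ++i) {
        long fifth = score[(i + 7) % 12];
        long major = score[i] * score[(i + 4) % 12] * fifth;  // root, major third, fifth
        long minor = score[i] * score[(i + 3) % 12] * fifth;  // root, minor third, fifth
        if (major > bestProduct) { bestProduct = major; best = {i, false}; }
        if (minor > bestProduct) { bestProduct = minor; best = {i, true}; }
    }
    return best;
}
```

Because the three scores are multiplied, a chord only gets a non-zero product when all three of its notes were detected, which is what makes the largest product a reasonable choice.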

Application

Using the code is straightforward; however, there are several limitations that need to be kept in mind while using it. The code can be copied as it is and used for note detection. The points below need to be considered while using it.

1. Pin Assignment:

The pin assignment needs to be modified based on the pin the microphone is attached to. For my experiment, I kept it on analog pin 7.

void setup() {
  Serial.begin(250000);
  Mic_pin = A7;
}

2. Microphone sensitivity:

The microphone sensitivity needs to be adjusted so that a waveform with good amplitude is generated. Most microphone modules come with a sensitivity setting. Select a sensitivity such that the signal is neither too small nor clipped due to excessive amplitude.

3. Amplitude threshold:

This code activates only if the signal amplitude is high enough. This setting needs to be set manually by the user. The value depends on the microphone sensitivity as well as the application.

if(sum2-sum1>5){
.
.

In the above code, sum2 gives the RMS value while sum1 gives the mean value, so the difference between these two values gives the amplitude of the sound signal. In my case, it works properly with an amplitude value of around 5.
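As a rough illustration of this check in plain C++, the sketch below turns the two running sums into a mean and an RMS value and compares their difference with the threshold. The division by the sample count and the square root are assumed to happen after the sampling loop, since that part is not shown here. The function names are illustrative.

```cpp
#include <cmath>
#include <vector>

// Returns RMS minus mean of the zero-shifted samples, the quantity that the
// threshold test compares against.
double loudness(const std::vector<int>& samples) {
    if (samples.empty()) return 0.0;
    double sum1 = 0.0, sum2 = 0.0;
    for (int a : samples) {
        sum1 += a;                            // running sum for the mean
        sum2 += static_cast<double>(a) * a;   // running sum of squares for the RMS
    }
    double mean = sum1 / samples.size();
    double rms = std::sqrt(sum2 / samples.size());
    return rms - mean;
}

bool isLoudEnough(const std::vector<int>& samples, double threshold = 5.0) {
    return loudness(samples) > threshold;
}
```

In silence with a small positive leftover offset, the RMS equals the mean and the difference stays near zero. With a negative leftover offset the two no longer cancel, so it is worth checking the zero-shift value (500 here) against what the microphone actually reads when idle.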

4. By default, this code prints the detected note. However, if you are planning to use the note for some other purpose, the directly assigned number should be used, for example C=0, C#=1, D=2, D#=3, and onward.
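If the number needs to be turned back into a name elsewhere in a project, a lookup table is enough. This is a small sketch in plain C++ following the numbering given here. The noteName function and the "?" fallback are illustrative choices.

```cpp
#include <string>

// Note numbers as used by the detector: C = 0, C# = 1, D = 2, ... B = 11.
std::string noteName(int note) {
    static const char* const names[12] = {"C", "C#", "D", "D#", "E", "F",
                                          "F#", "G", "G#", "A", "A#", "B"};
    if (note < 0 || note > 11) return "?";
    return names[note];
}
```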

5. If the instrument produces higher frequencies, the code may give false output. The maximum frequency is limited by the sampling frequency, so you may play around with the delay value below to get optimum output. The code below uses a delay of 195 microseconds, which may be tweaked. This will affect the overall execution time.

for(int i=0;i<128;i++)
{
  a=analogRead(Mic_pin)-500;               // rough zero shift
  sum1=sum1+a;                             // to average value
  sum2=sum2+a*a;                           // to RMS value
  a=a*(sin(i*3.14/128)*sin(i*3.14/128));   // Hann window
  in[i]=4*a;                               // scaling for float to int conversion
  delayMicroseconds(195);                  // based on operation frequency range
}

6. This code will only work up to a frequency of 2000 Hz. By eliminating the delay between samples, a sampling frequency of around 3-4 kHz can be obtained.

Precautions:

Demonstration:

https://www.youtube.com/watch?v=ikmrsRl5hfc

https://www.youtube.com/watch?v=mZhGm_FKuSY

Summary

Note detection is computationally intensive work, and getting real-time output is very difficult, especially on Arduino. This code can give around 6.6 note readings per second (with the 195 microsecond delay added). It works well with the piano and some other instruments.

I hope this code and tutorial will be helpful in your music-related projects. In case of any doubts or suggestions, feel free to get in touch.