Neural networks are built on what is called a perceptron, a computational model of a single neuron. A perceptron consists of one or more inputs, a processor and a single output.

Most of the information in this post comes from this free online book, which I really recommend having a look at.

The perceptron returns an output based on its inputs and weights, then recalculates the weights depending on the error between that output and the desired one. An extra input called the bias is added to handle the case where all the inputs are 0, since the weighted sum would otherwise always be 0.
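
Written out, the rule described above looks roughly like this. The symbols are my own shorthand, not taken from the book: x_i are the inputs (the last one being the bias), w_i the weights, d the desired output and c the learning constant.

```latex
\[
\text{guess} = \sum_{i=0}^{n-1} w_i x_i, \qquad
\text{error} = d - \text{guess}, \qquad
w_i \leftarrow w_i + c \cdot \text{error} \cdot x_i
\]
```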

My inputs will be 4 simple push buttons, and the outputs 4 LEDs. I'm going to add 4 more buttons on the output side to use during training, and 4 more LEDs as input indicators. Finally, one more button switches between training and guessing mode. I'm running all this on the ATmega328P.

The schematic looks like this:

The idea here is to teach it to light an LED at the output depending on which button we press at the input. At first, the program starts with some untrained weights for each input:

#define psize 5   /* 4 button inputs + 1 bias */
float weights[psize] = {0.5, -0.1, 0.8, -0.6, 0.6};

All the input states are set to 0, except the bias, which is always set to 1:

unsigned char inputs[psize] = {0, 0, 0, 0, 1};   /* last entry is the bias */
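
As a rough sketch of how the four button states could end up in that array: the helper below is hypothetical. The name `load_inputs` and the idea that bit i of a mask holds the state of button i are my assumptions for illustration; the real program reads the ATmega328P pins directly.

```c
#include <stdint.h>

#define PSIZE 5  /* 4 button inputs + 1 bias, as in the post */

/* Hypothetical helper: bit i of button_mask is assumed to be 1
 * while button i is pressed. */
static void load_inputs(uint8_t button_mask, uint8_t inputs[PSIZE])
{
    for (uint8_t i = 0; i < PSIZE - 1; i++) {
        inputs[i] = (button_mask >> i) & 1u;
    }
    inputs[PSIZE - 1] = 1;  /* the bias input is always 1 */
}
```

Calling it with a mask of `0x05`, for example, marks buttons 0 and 2 as pressed and leaves the bias at 1.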

In guess mode, the perceptron calculates the sum of the weights multiplied by the inputs:

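guess = 0;
for (i = 0; i < psize; i++) {
    /* inputs are 0 or 1, so adding the weight equals weights[i] * inputs[i] */
    if (inputs[i] != 0) guess += weights[i];
}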

The sum is our output.
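
Here is a self-contained sketch of the guess step. The post does not say how the sum is turned into one of the 4 LEDs, so `sum_to_led` (round to the nearest index and clamp to 0..3) is only my assumption, and both function names are made up for this example.

```c
#include <stdint.h>

#define PSIZE 5  /* 4 button inputs + 1 bias */

/* Weighted sum of the inputs (each input is 0 or 1). */
static float perceptron_sum(const float weights[PSIZE],
                            const uint8_t inputs[PSIZE])
{
    float sum = 0.0f;
    for (uint8_t i = 0; i < PSIZE; i++) {
        sum += weights[i] * inputs[i];
    }
    return sum;
}

/* Assumption: round the sum to the nearest LED index, clamped to 0..3. */
static uint8_t sum_to_led(float sum)
{
    if (sum <= 0.0f) return 0;
    if (sum >= 3.0f) return 3;
    return (uint8_t)(sum + 0.5f);
}
```

With the untrained weights from above and only the third button pressed, the sum is 0.8 + 0.6 = 1.4, which this mapping would show on LED 1.
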
In training mode, we calculate the error relative to the desired result. I chose to represent the output state with the values 0, 1, 2 and 3, which correspond to the LED we want to turn on at the output.

We enter the correct answer using the output buttons, calculate the error, and update the weights:

error = keypressed - guess;

for (i = 0; i < psize; i++) {
    weights[i] += learnConst * error * inputs[i];
}

A learning constant is included in the calculation as well, to adjust whether we want harsh or soft weight changes.
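
To see the effect of the learning constant, here is a minimal sketch of one complete training step as a function. The name `train_step` and the value 0.1 used in the note below are my own choices for illustration, not the ones in the actual firmware.

```c
#include <stdint.h>

#define PSIZE 5  /* 4 button inputs + 1 bias */

/* One training step: guess, measure the error, nudge the weights.
 * Returns the error measured before the update. */
static float train_step(float weights[PSIZE], const uint8_t inputs[PSIZE],
                        float desired, float learn_const)
{
    float guess = 0.0f;
    for (uint8_t i = 0; i < PSIZE; i++) {
        guess += weights[i] * inputs[i];
    }

    float error = desired - guess;

    /* Only weights whose input is 1 actually change. */
    for (uint8_t i = 0; i < PSIZE; i++) {
        weights[i] += learn_const * error * inputs[i];
    }
    return error;
}
```

With the starting weights, the third button pressed, a desired output of 2 and a learning constant of 0.1, the first error is 0.6 and the second is 0.48. Two inputs are active (the button and the bias), so each repetition shrinks the error by 2 x 0.1 = 20%. A larger constant closes the gap faster but overshoots more easily.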

As you may have noticed, the calculations use float variables, which take up quite a lot of memory. I struggled a lot to reduce the code enough for the challenge, and just a moment before giving up I found the avr-gcc fixed-point type library, which helped reduce the code size significantly! If anyone has a different approach, please share it!

./getSize 
AVR Memory Usage
----------------
Device: atmega328p

Program:     960 bytes (2.9% Full)
(.text + .data + .bootloader)

Data:         10 bytes (0.5% Full)
(.data + .bss + .noinit)
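
For anyone who wants to play with the fixed-point idea on a PC first, here is a hand-rolled sketch using plain integers. To be clear, this is not the avr-gcc fixed-point types used for the build above, just a portable stand-in: it assumes a Q8.8 format (8 integer bits, 8 fractional bits, so 1.0 is stored as 256), and every name in it is made up for the example.

```c
#include <stdint.h>

#define PSIZE 5
#define Q_ONE 256        /* 1.0 in Q8.8 */

typedef int16_t q8_8;    /* assumed format: 8 integer bits, 8 fractional bits */

/* Multiply two Q8.8 values; widen to 32 bits so the product cannot overflow. */
static q8_8 q_mul(q8_8 a, q8_8 b)
{
    return (q8_8)(((int32_t)a * b) / Q_ONE);
}

/* Same training step as in the post, using integer arithmetic only.
 * Returns the error measured before the update. */
static q8_8 q_train_step(q8_8 weights[PSIZE], const uint8_t inputs[PSIZE],
                         q8_8 desired, q8_8 learn_const)
{
    q8_8 guess = 0;
    for (uint8_t i = 0; i < PSIZE; i++) {
        if (inputs[i]) guess += weights[i];
    }

    q8_8 error = desired - guess;
    q8_8 delta = q_mul(learn_const, error);  /* same step for every active input */

    for (uint8_t i = 0; i < PSIZE; i++) {
        if (inputs[i]) weights[i] += delta;
    }
    return error;
}
```

The weights from the start of the post become roughly {128, -26, 205, -154, 154} in this format, and a learning constant of 0.125 is exactly 32. Because everything is integer math, no software floating-point routines are needed, at the cost of some rounding in the small fractions.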