Some day I want to try what happens if I connect a blank neural network as a reward applicator (so that I never have to specify a reward myself), with this reward net fed by some kind of long-term memory net.
Updates to the reward net would be modeled on something like "neuron cell energy" variables, with threshold values that trigger backward signals to this applicator and somehow associate the current input with the long memory.
The input layer from the normal sensors would drive the actions as usual, but it would also feed long_memory > reward_applicator, giving a closed-loop system.
I would not be able to give a reward for walking towards the food, but if the long memory + reward net happens to help the agent reach the food, the neurons receive their energy and that is recorded. Otherwise... natural selection.
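
A rough sketch of that loop in plain Python, just to make the wiring concrete. Everything specific here is an assumption for illustration, not a tested design: the single-layer nets, the names `Agent`, `trace`, `receive_energy`, `run_life` and `evolve`, the Hebbian-style policy update, the threshold of 1.0, and the toy 2D world where the sensors report the direction to the food.

```python
import copy
import math
import random

def make_net(n_in, n_out, rng):
    """'Blank' single-layer net: small random weights, nothing trained."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def forward(net, x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in net]

class Agent:
    ENERGY_THRESHOLD = 1.0  # assumed value, not from the original idea

    def __init__(self, rng):
        self.policy = make_net(2, 2, rng)             # sensors -> actions
        self.long_memory = make_net(2, 4, rng)        # sensors -> memory
        self.reward_applicator = make_net(4, 1, rng)  # memory -> internal reward
        self.trace = [0.0] * 4       # slow running average of memory activity
        self.cell_energy = 0.0       # energy since the last backward signal
        self.energy_this_life = 0.0  # only used for selection

    def step(self, sensors):
        action = forward(self.policy, sensors)
        memory = forward(self.long_memory, sensors)
        self.trace = [0.9 * t + 0.1 * m for t, m in zip(self.trace, memory)]
        reward = forward(self.reward_applicator, self.trace)[0]
        # Closed loop: the self-generated reward nudges the policy
        # (Hebbian-style rule, chosen here only for simplicity).
        for i, row in enumerate(self.policy):
            for j, s in enumerate(sensors):
                row[j] += 0.05 * reward * action[i] * s
        return action

    def receive_energy(self, amount):
        self.cell_energy += amount
        self.energy_this_life += amount
        if self.cell_energy >= self.ENERGY_THRESHOLD:
            # Backward signal: tie the current memory trace to a higher reward.
            weights = self.reward_applicator[0]
            for k, t in enumerate(self.trace):
                weights[k] += 0.1 * t
            self.cell_energy = 0.0

def run_life(agent, rng, steps=200):
    """One lifetime in a toy 2D world; returns the energy collected."""
    agent.energy_this_life = 0.0
    pos = [0.0, 0.0]
    food = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    for _ in range(steps):
        dx, dy = food[0] - pos[0], food[1] - pos[1]
        dist = math.hypot(dx, dy)
        if dist < 0.2:
            agent.receive_energy(1.0)  # the only signal from outside
            food = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
            continue
        ax, ay = agent.step([dx / dist, dy / dist])
        pos[0] += 0.1 * ax
        pos[1] += 0.1 * ay
    return agent.energy_this_life

def mutate(agent, rng, sigma=0.05):
    child = copy.deepcopy(agent)
    for net in (child.policy, child.long_memory, child.reward_applicator):
        for row in net:
            for j in range(len(row)):
                row[j] += rng.gauss(0.0, sigma)
    child.trace = [0.0] * 4
    child.cell_energy = 0.0
    return child

def evolve(pop_size=10, generations=5, seed=0):
    rng = random.Random(seed)
    population = [Agent(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: rank by energy collected, keep the top half.
        population.sort(key=lambda a: run_life(a, rng), reverse=True)
        survivors = population[: max(1, pop_size // 2)]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return population
```

Note that nothing in `run_life` rewards moving towards the food. The only outside signal is the energy on arrival, and whether the internal reward ever becomes useful is exactly the open question.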