Summary

A physical robot that moves like a rat inside an elevated plus-maze. The robot runs a simple Recurrent Neural Network onboard and communicates with a desktop running a Genetic Algorithm, which optimizes the network’s weights.

Notes

  • Elevated plus-maze is divided into 21 areas: 5 in each arm, 1 in the central position.
  • Used a Genetic Algorithm to train an Elman Network, a type of Recurrent Neural Network (network with state). The Genetic Algorithm optimizes the synaptic weights.
  • Used a physical robot in an environment similar to an elevated plus-maze, arguing that this can generate more realistic models. The robot has sensors that mimic those of a real rat. These sensor values are fed into the neural network, which produces the next movement.
  • The robot’s trajectory provides the parameters for the fitness function, which is based on the behavior of real rats.
  • Linear activation function in the hidden layer, sign function in the output layer. In the Elman Network architecture, the hidden-layer activations obtained in time step t are fed back as inputs to time step t+1, through what they call the “context layer” (see the forward-pass sketch after these notes).
  • The Neural Network was written in C and runs directly on the robot’s microcontroller.
  • The Genetic Algorithm was written in C++, runs on a desktop, and communicates with the robot via radio.
  • The Genetic Algorithm uses single-point crossover and integer mutation with a uniform distribution. Elitism is also used to keep the best individuals in the next generation (see the operator sketch after these notes).
  • Seven infrared sensors: four to detect walls, three to read position markers on the maze. The neural network has 11 units in the input layer: the four wall-detecting infrared sensors, one for the current position, one for the previous position (according to the robot’s internal representation), and five for the hidden-layer activations from the previous step (the context layer).
  • Sample of 30 real rats.
  • The fitness function aims to minimize the difference between the following parameters, measured for real rats and for the robot (see the fitness sketch after these notes):
    • # entries (open arms)
    • # entries (closed arms)
    • Time spent in seconds (open arms)
    • Time spent in seconds (closed arms)
    • Time spent in seconds (center)
    • # entries on extremity (open arms)
    • # entries on extremity (closed arms)
    • Transitions between positions (open arms)
    • Transitions between positions (closed arms)
  • First experiment’s Genetic Algorithm, only for testing, in a maze with only open arms: 20 individuals, mutation rate of 0.01, crossover rate of 0.4, and chromosome length of 48.
  • Second experiment’s Genetic Algorithm, in a standard maze: 60 individuals, mutation rate of 0.028, crossover rate of 0.2, chromosome length of 72, and a limit of 10 generations.
  • Individuals spending less than 10% of their time moving had their fitness reduced by a factor of 1/3. This was implemented after observing individuals in the first generations just spinning in place.
  • The authors argue that the periods of temporarily decreasing fitness were due to “communication faults between the base station and the robot.”
  • They noticed robots making cyclic movements (visiting the same places, returning to the original position, and repeating) until the end of the experiment. These cycles had 2-9 steps. They suggested that using a more complex network would increase the number of steps in these cycles.
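
Below, a minimal C sketch of the forward pass described in the notes above: 11 inputs (four wall sensors, current position, previous position, five context units), a linear hidden layer, a sign-function output layer, and the context copy-back. The number of output units (2 here) and the double encoding are my assumptions; the notes don’t record the output encoding.

    #define N_IN     11   /* 4 wall sensors + current pos + previous pos + 5 context */
    #define N_HIDDEN  5
    #define N_OUT     2   /* hypothetical output encoding for the next movement */

    typedef struct {
        double w_ih[N_HIDDEN][N_IN];  /* input  -> hidden weights (from the GA) */
        double w_ho[N_OUT][N_HIDDEN]; /* hidden -> output weights (from the GA) */
        double context[N_HIDDEN];     /* hidden activations from step t-1 */
    } ElmanNet;

    static int sign(double x) { return x >= 0.0 ? 1 : -1; }

    /* walls[0..3]: wall IR readings; pos/prev_pos: decoded position markers. */
    void elman_step(ElmanNet *net, const double walls[4],
                    double pos, double prev_pos, int out[N_OUT])
    {
        double in[N_IN], hidden[N_HIDDEN];

        /* Assemble the 11-unit input layer. */
        for (int i = 0; i < 4; i++) in[i] = walls[i];
        in[4] = pos;
        in[5] = prev_pos;
        for (int i = 0; i < N_HIDDEN; i++) in[6 + i] = net->context[i];

        /* Hidden layer: linear activation, i.e. just the weighted sum. */
        for (int i = 0; i < N_HIDDEN; i++) {
            hidden[i] = 0.0;
            for (int j = 0; j < N_IN; j++) hidden[i] += net->w_ih[i][j] * in[j];
        }

        /* Output layer: sign function decides the next movement. */
        for (int i = 0; i < N_OUT; i++) {
            double s = 0.0;
            for (int j = 0; j < N_HIDDEN; j++) s += net->w_ho[i][j] * hidden[j];
            out[i] = sign(s);
        }

        /* Context layer: feed hidden activations back for step t+1. */
        for (int i = 0; i < N_HIDDEN; i++) net->context[i] = hidden[i];
    }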
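
And a sketch of the GA operators mentioned above: single-point crossover, integer mutation with a uniform distribution, and elitism. The integer gene range is my assumption; only the operators themselves and the chromosome length of 72 come from the notes.

    #include <stdlib.h>

    #define CHROM_LEN 72    /* second experiment */
    #define GENE_MIN -128   /* hypothetical integer weight range */
    #define GENE_MAX  127

    typedef struct { int genes[CHROM_LEN]; double fitness; } Individual;

    /* Single-point crossover: children swap tails after a random cut. */
    void crossover(const Individual *a, const Individual *b,
                   Individual *c1, Individual *c2)
    {
        int cut = 1 + rand() % (CHROM_LEN - 1);
        for (int i = 0; i < CHROM_LEN; i++) {
            c1->genes[i] = (i < cut) ? a->genes[i] : b->genes[i];
            c2->genes[i] = (i < cut) ? b->genes[i] : a->genes[i];
        }
    }

    /* Integer mutation: with probability `rate`, redraw a gene uniformly. */
    void mutate(Individual *ind, double rate)
    {
        for (int i = 0; i < CHROM_LEN; i++)
            if (rand() / (double)RAND_MAX < rate)
                ind->genes[i] = GENE_MIN + rand() % (GENE_MAX - GENE_MIN + 1);
    }

    /* Elitism: copy the best individual unchanged into the next generation
       (assuming higher fitness is better). */
    void keep_elite(const Individual *pop, int n, Individual *next)
    {
        int best = 0;
        for (int i = 1; i < n; i++)
            if (pop[i].fitness > pop[best].fitness) best = i;
        next[0] = pop[best];
    }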
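
Finally, a sketch of the fitness idea: compare the nine behavioral parameters listed above against the mean values from the 30 real rats, and apply the immobility penalty. The absolute-difference sum, the 1/(1+error) mapping, and reading “reduced by a factor of 1/3” as multiplication by 1/3 are all my assumptions about details the notes don’t pin down.

    #include <math.h>

    /* Parameter order: entries open/closed, time open/closed/center,
       extremity entries open/closed, transitions open/closed. */
    #define N_PARAMS 9

    double fitness(const double robot[N_PARAMS],
                   const double rat_mean[N_PARAMS],
                   double fraction_time_moving)
    {
        double error = 0.0;
        for (int i = 0; i < N_PARAMS; i++)
            error += fabs(robot[i] - rat_mean[i]);

        /* Assumed mapping: smaller rat-robot difference -> higher fitness. */
        double fit = 1.0 / (1.0 + error);

        /* Penalty from the notes: individuals moving less than 10% of the
           time have their fitness reduced by a factor of 1/3. */
        if (fraction_time_moving < 0.10)
            fit /= 3.0;

        return fit;
    }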

Thoughts

  • Like all the other models optimizing for # of entries and time spent in arms, this one models average behavior. There’s a chance that these aggregate results will be good, but transitions between actions (e.g. deciding whether to go right or to stay in the same place for a second or two) won’t be, since they are not considered.
  • The idea of having a physical robot might seem interesting for demonstration purposes, and even for collecting some kind of real-world data, but it reduces the speed at which the model can learn. Also, the authors blame the communication between the robot and the desktop for sub-optimal improvements between generations.
  • Quite an interesting approach for evolving the network. I understand that there’s no backpropagation, only optimization of the weights by the Genetic Algorithm.
  • Again, a sample size of less than 35. It might be interesting to experiment with more real rats.