Summary

Review article on Neuroevolution. From 2008.

Notes

  • In its simplest form, a neuron model is a weighted sum of inputs transformed by a typically nonlinear static function (a minimal sketch appears after these notes).
  • Evolutionary algorithms as an alternative, or complement, to other commonly used algorithms such as back-propagation.
  • The main benefit of using an evolutionary algorithm here is that it can evolve multiple characteristics of the network concurrently, and the fitness definition can be broader than a simple error function. The paper also suggests combining it with “Hebbian learning.”
  • Approaches are divided by how the neural network is represented: direct, developmental, and implicit. The first is mostly used for evolving fixed-size networks, while the latter two are more flexible.
  • Direct representations: one-to-one mapping between the parameters of the network and genes (a toy example appears after these notes).
    • A technique called “dynamic encoding” uses the bits to represent the most significant part of each parameter; once the search converges to a satisfactory solution, the encoding switches to representing the least significant part.
    • “Center of Mass Encoding” is a self-adaptive encoding method.
    • Floating-point representation of synaptic weights shows excellent results for small, fixed architectures, especially when combined with an evolution strategy called “covariance matrix adaptation” (CMA-ES).
    • Study shows better performance when compared to back-propagation.
    • Problem of “competing conventions,” where very different networks produce the same behavior. Another problem is premature convergence to local optima.
    • An algorithm called SANE (symbiotic, adaptive neuro-evolution) evolves individual neurons that connect only to the input and output layers; networks are assembled from random combinations of these neurons. The fitness of each neuron is the average fitness of all networks it participated in.
    • Evolving topologies is challenging because changes usually decrease fitness, even when they have the potential to increase it with further evolution.
    • NEAT (NeuroEvolution of Augmenting Topologies) is a method for encoding and evolving both topologies and weights. It is designed to counter the previously mentioned problems of evolving topologies and of competing conventions. Each gene carries a record of its chronological order of appearance, which is used to align genomes during crossover (mitigating competing conventions) and to build sub-populations (species of similar genomes) that cross over internally.
    • Direct representations have excellent results for small networks but don’t work so well in larger ones because of scalability issues in the genetic string size.
  • Developmental representations: encode the developmental process that will describe the neural network.
    • One approach uses a binary matrix, grown by rewriting rules, to represent the architecture and connectivity pattern of the network.
    • The genome is divided into blocks of five elements, each defining a rewriting rule. The set of rules is fixed, and recombining these blocks produces the connectivity of the network.
    • A specific method models the genotype with a binary-tree structure. There’s also a proposal to allow a terminal node to connect to another tree, reusing previously learned configurations.
    • This approach tends to generate regular and modular networks.
  • Implicit encoding: connections are not explicitly encoded in the genome but result from interactions with so-called “regulatory regions.”
    • It is a recent approach inspired by biological gene regulatory networks (GRNs), i.e., an abstraction of the biochemical process of gene regulation.
    • A gene is expressed through the synthesis of proteins specified by its so-called “coding region.”
    • “Interaction map” is a function that receives two genes and returns the interaction between them.
    • Interactions have variable length.
  • When a suitable topology is already known, it’s better to use simpler representations.
  • Dynamic neural networks are hard to train and evolutionary algorithms have been used for this with success.
  • Continuous-time recurrent neural networks (CTRNN): represent the neural network as a set of differential equations. One of the simplest and most widely used dynamic models.
  • Spiking neuron models: in biological neurons, information is transmitted as spikes, and the rate of spikes a neuron emits is roughly proportional to its activation. They can decrease the use of computational resources. Implementation is similar to a CTRNN, replacing the output function with a comparison against a threshold value: when the activation exceeds the threshold, a spike is emitted. Research suggests this is a useful model for tasks with memory-dependent dynamics, resulting in less complex solutions (a sketch of both dynamic models appears after these notes).
  • It is possible to combine evolutionary algorithms with other search methods. Since back-propagation is sensitive to initial weight values, evolution can be used to find a good starting point. Better initial values may make training faster and better, by up to two orders of magnitude.
  • Lamarckian evolution (characters acquired through learning are coded back into the genotype) is more effective than Darwinian evolution (they are not coded back) in environments where the input-output mapping does not change over time. When it changes, Darwinian outperforms Lamarckian.
  • Reinforcement learning: a class of algorithms that seeks to estimate the value of each state so it can favor actions generating a higher reward. The biological system responsible for this is yet to be fully understood.
  • The actor-critic model: the critic module receives the current, external reinforcement value and outputs an estimate of the weighted sum of future rewards. The actor module, on the other hand, outputs the probability of executing each possible action. Finding the optimal structure usually depends a lot on the human designing the model (a small tabular sketch appears after these notes).
  • GasNets emulate the diffusion of gases: neurons are distributed in a 2D space and emit “gases” that modulate the connectivity and activation functions of nearby neurons.
  • In contrast to many other learning methods, neuroevolution requires weaker constraints from the problem.
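
To make the first note concrete, here is a minimal sketch of the simplest neuron model: a weighted sum of inputs passed through a nonlinear static function. NumPy and tanh are my assumed choices, not something prescribed by the article.

```python
import numpy as np

def neuron(inputs, weights, bias=0.0, activation=np.tanh):
    """Simplest neuron model: weighted sum of inputs passed through a
    (typically nonlinear) static function."""
    return activation(np.dot(weights, inputs) + bias)

# Example: a neuron with three inputs.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.3])
print(neuron(x, w))  # tanh(-0.95) ~= -0.74
```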
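
A toy illustration of a direct representation, assuming a fixed 2-2-1 feedforward network and an XOR task (both are my choices for the example): the genome is just the flattened parameter vector, and a simple truncation-selection loop with Gaussian mutation evolves it. Real systems would add crossover or use something like the CMA-ES mentioned in the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed 2-2-1 feedforward topology; the genome is the flattened parameter vector.
N_GENES = 2 * 2 + 2 + 2 * 1 + 1   # hidden weights + hidden biases + output weights + output bias

def decode(genome):
    """Direct encoding: one-to-one mapping between genes and network parameters."""
    w1 = genome[0:4].reshape(2, 2)
    b1 = genome[4:6]
    w2 = genome[6:8].reshape(2, 1)
    b2 = genome[8:9]
    return w1, b1, w2, b2

def forward(genome, x):
    w1, b1, w2, b2 = decode(genome)
    h = np.tanh(x @ w1 + b1)
    return np.tanh(h @ w2 + b2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

def fitness(genome):
    """Any scalar objective works; here, negative squared error on XOR."""
    return -np.sum((forward(genome, X) - Y) ** 2)

# Simple truncation-selection evolution with Gaussian mutation only (no crossover).
pop = rng.normal(0.0, 1.0, size=(20, N_GENES))
for generation in range(500):
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-5:]]                    # keep the 5 best genomes
    children = np.repeat(parents, 3, axis=0) + rng.normal(0.0, 0.1, size=(15, N_GENES))
    pop = np.vstack([parents, children])                      # elitism: parents survive unchanged

best = max(pop, key=fitness)
print(np.round(forward(best, X), 2))  # should move toward [[0], [1], [1], [0]]
```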
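
A sketch of the two dynamic models from the notes, using Euler integration and parameters I picked for illustration. The CTRNN step follows the standard formulation tau_i * dy_i/dt = -y_i + sum_j w_ij * sigma(y_j + theta_j) + I_i, and the spiking variant replaces the output function with a comparison against a threshold.

```python
import numpy as np

def ctrnn_step(y, W, tau, theta, I, dt=0.01):
    """One Euler step of a continuous-time recurrent neural network:
    tau_i * dy_i/dt = -y_i + sum_j w_ij * sigma(y_j + theta_j) + I_i."""
    sigma = 1.0 / (1.0 + np.exp(-(y + theta)))   # firing rate of each neuron
    dydt = (-y + W @ sigma + I) / tau
    return y + dt * dydt

def spiking_step(v, W, tau, I, v_thresh=1.0, v_reset=0.0, dt=0.01):
    """Leaky integrate-and-fire variant: same leaky integration, but the output
    is a binary spike emitted when the activation crosses the threshold."""
    spikes = (v >= v_thresh).astype(float)       # compare activation with threshold
    v = np.where(spikes > 0, v_reset, v)         # reset neurons that spiked
    dvdt = (-v + W @ spikes + I) / tau
    return v + dt * dvdt, spikes

# Example: simulate 3 fully connected CTRNN neurons for 1000 steps.
n = 3
rng = np.random.default_rng(1)
W = rng.normal(0, 1, (n, n))
y = np.zeros(n)
for _ in range(1000):
    y = ctrnn_step(y, W, tau=np.ones(n), theta=np.zeros(n), I=np.array([0.5, 0.0, -0.5]))
print(y)  # the spiking version is driven the same way, via spiking_step
```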
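
A small tabular actor-critic sketch on an assumed toy chain task (the task, step sizes, and softmax actor are my choices, not from the article): the critic estimates the discounted sum of future rewards per state, and its TD error drives both the critic and actor updates.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy chain task: 5 states, actions move left/right, reward 1 at the rightmost state.
n_states, n_actions = 5, 2
V = np.zeros(n_states)                    # critic: estimated discounted future reward per state
prefs = np.zeros((n_states, n_actions))   # actor: action preferences -> softmax probabilities
alpha, beta, gamma = 0.1, 0.1, 0.9        # critic step size, actor step size, discount factor

def policy(s):
    p = np.exp(prefs[s] - prefs[s].max())
    return p / p.sum()

for episode in range(500):
    s = 0
    while s != n_states - 1:
        a = rng.choice(n_actions, p=policy(s))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        done = s_next == n_states - 1
        td_error = r + (0.0 if done else gamma * V[s_next]) - V[s]
        V[s] += alpha * td_error           # critic update
        prefs[s, a] += beta * td_error     # actor update: reinforce actions with positive TD error
        s = s_next

print(np.round(V, 2))  # values should increase toward the rewarded end of the chain
```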

Thoughts

  • An extensive review showing how evolutionary algorithms may stand as a smarter alternative to Grid Search.
  • With 91 references, it should be expected to cover many different research directions and topics within the Neuroevolution subject. It is hard to grasp everything on a first read, but it seems to be a useful reference when I want to return to the subject and go deeper into any of the topics.
  • Methods such as those using developmental representations look very similar to how a human brain works, reusing existing structures. They allow one network to connect to another that is more specialized in the task it needs to perform in specific cases.
  • I am not sure whether it is a reflection of the writing or just the complexity of the approach, but the explanation of the developmental representation (and, to a lesser degree, of implicit encoding) does not seem to be enough. The references should have better descriptions of these methods.