Road to ML Engineer #73 - Indirect Encodings

Last Edited: 1/13/2026

This blog post introduces indirect encodings in neuroevolution.

ML

In the last article, we covered approaches to applying evolutionary algorithms to neural networks, with a main focus on NEAT. NEAT evolves network topology using a direct genomic encoding with historical markings. It not only showed the potential of neuroevolution for finding robust and minimal solutions but also demonstrated how much the choice of encoding matters to the effectiveness of evolutionary algorithms.

However, direct encoding often fails to capture the symmetries we observe in organisms (limbs, brains, etc.) and even in neural networks; for example, NEAT's direct encoding still doesn't fully handle the permutation invariance of neural networks. Moreover, genetic mutations and crossovers on a direct encoding typically produce only incremental changes in the phenotype, which can make it easier for populations to get stuck in local optima. Hence, in this article, we discuss approaches to indirect encoding, which primarily mimic development in biology.

Cell-Chemistry & Grammatical Encodings

The most intuitive developmental approach to indirect encoding is to draw inspiration from DNA and how such compact genetic representation is transcribed differently to result in various cell fates. In biology, the underlying mechanism involves proteins called transcription factors, which bind and interact with specific DNA segments and control the activation and inhibition of gene transcription (the synthesis of specific types of proteins, including transcription factors and others needed to build a cell). These factors form a complex relationship across various parts of the organism, modeled by a network called a genetic regulatory network (GRN).

The activation and location of these transcription factors are determined by the concentration of morphogens (a type of protein, which can themselves be transcription factors) that are initially randomly distributed but form complex patterns through reactions and diffusion, as described by the Turing reaction-diffusion model. (I am by no means an expert in this field, and the above is just my interpretation of cell chemistry, which may be inaccurate or imprecise.) We can utilize either the low-level abstraction of the Turing reaction-diffusion model (using initial neuron locations and diffusion parameters as the indirect encoding) or GRNs as the indirect encoding for neuroevolution. Indeed, some examples exist, such as the simplified rule-based GRN approach by Reisinger, J., & Miikkulainen, R. (2007) (though I could not find the details of the implementation for replication). However, much of the cell-chemistry approach to indirect encoding remains under-explored.
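To make the reaction-diffusion idea concrete, here is a minimal NumPy simulation of the Gray-Scott model, a well-known two-chemical instance of Turing's reaction-diffusion system. This is only an illustrative sketch, not any published GRN encoding; the parameter values (Du, Dv, f, k) are common demo choices, and the initial seeded patch is an arbitrary assumption.

```python
import numpy as np

def laplacian(Z):
    # Discrete Laplacian with periodic boundaries via neighbor shifts
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0)
            + np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

def gray_scott(steps=2000, n=64, Du=0.16, Dv=0.08, f=0.035, k=0.065, seed=0):
    # U: substrate, V: activator; start near the trivial state (U=1, V=0)
    # with a small noisy patch so patterns can emerge
    rng = np.random.default_rng(seed)
    U = np.ones((n, n))
    V = np.zeros((n, n))
    U[n//2-5:n//2+5, n//2-5:n//2+5] = 0.5
    V[n//2-5:n//2+5, n//2-5:n//2+5] = 0.25
    V += 0.01 * rng.random((n, n))
    for _ in range(steps):
        UVV = U * V * V            # reaction term: U + 2V -> 3V
        U += Du * laplacian(U) - UVV + f * (1 - U)
        V += Dv * laplacian(V) + UVV - (f + k) * V
    return U, V
```

Uniformly distributed morphogens stay uniform; it is the interplay of local reaction and unequal diffusion rates (Du > Dv) that breaks symmetry into spots and stripes, which is the mechanism the morphogen story above relies on.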

In contrast to the cell-chemistry approach, grammatical encodings represent a high-level abstraction of development, utilizing grammatical rewrite systems. An L-system is an example of such a system, applying rewrite rules to all characters in a string over time. L-systems can produce plant-like structures with symmetries and regularities; they have been shown to generate realistic trees when slight stochasticity is added, and to evolve better and more natural table designs than direct encoding. Cellular encoding (CE) is another grammatical encoding method, specifically designed for evolving neural networks: it represents grammars with trees (each intermediate node corresponding to an architectural modification on a neuron) that are evolved using genetic programming. Other methods have been proposed, though the challenge remains in choosing the appropriate grammatical representation.
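The parallel-rewrite mechanism of an L-system fits in a few lines. The sketch below uses Lindenmayer's classic algae system (A → AB, B → A), whose string lengths follow the Fibonacci sequence; this is a textbook example, not taken from the article:

```python
def lsystem(axiom, rules, iterations):
    # Rewrite EVERY symbol in parallel, once per iteration;
    # symbols without a rule are copied unchanged.
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(c, c) for c in s)
    return s

# Lindenmayer's algae system: A -> AB, B -> A
rules = {"A": "AB", "B": "A"}
print(lsystem("A", rules, 5))  # -> ABAABABAABAAB
```

A compact rule set like this is the genotype; the (much larger) expanded string is the phenotype, which is exactly the genotype-phenotype gap that makes grammatical encodings indirect. Plant-drawing variants add symbols like `F`, `+`, `-`, `[`, `]` that a turtle interpreter turns into branching geometry.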

Learning

While the two approaches discussed in the previous section are based on the physical development of biological systems (particularly molecules and cells), learning from interactions with the environment (adaptation of structures and parameters) is an integral part of biological development and a form of indirect encoding (the genotype does not directly represent the phenotype). Hence, we can somehow combine neuroevolution with learning mechanisms like gradient descent to also achieve indirect encoding.

Backprop NEAT

There are mainly two ways to integrate learning into evolutionary algorithms. One reflects Lamarckian evolution (a disproven theory suggesting that learned physical traits are inherited) and epigenetics (the study of how environments can influence gene activity and even affect offspring) by coding the learned parameters back into the genome for the reproduction of the next generation. Although this approach is biologically implausible to some extent, it has shown some success in evolutionary computation, particularly in image processing. The risk of premature convergence, where the whole population updates in the same direction, can be mitigated by training individuals on different batches of data and by other techniques for maintaining diversity.

The other approach reflects the Baldwin effect in Darwinian evolution (a biologically plausible theory suggesting that natural selection favors individuals genetically predisposed to learn well) by using the learned phenotype only for fitness computation, not for reproduction. This Darwinian approach was implemented as BackpropNEAT and tested on a circle-dataset classification task (a supervised task) alongside vanilla NEAT. BackpropNEAT used mini-batch gradient descent with the Adam optimizer (batch size of 32) to train the neural networks on 512 training points before computing fitness on a test dataset of 200 points, though the learned weights were NOT encoded back into the genome. (Training and fitness computations were all parallelized with vectorization. I cannot disclose my implementation here, so I highly recommend implementing both the Lamarckian and Darwinian approaches yourself as practice.)
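Since the implementation is left as an exercise, here is only a minimal, hypothetical sketch of how the two schemes differ. `train` and `fitness` are stand-in callables (e.g. a few gradient steps and a test-set score), genomes are flat parameter lists, and truncation selection with Gaussian mutation stands in for NEAT's actual speciation, crossover, and topology mutations:

```python
import random

def evolve(population, train, fitness, lamarckian, generations=10):
    # population: list of genomes (flat lists of floats)
    # train(genome) -> learned parameters; fitness(params) -> score
    for _ in range(generations):
        scored = []
        for genome in population:
            learned = train(genome)        # lifetime learning (e.g. SGD)
            score = fitness(learned)       # fitness always uses the LEARNED phenotype
            if lamarckian:
                genome = learned           # Lamarckian: write learning back into the genome
            scored.append((score, genome)) # Darwinian: genome stays as it was born
        scored.sort(key=lambda t: t[0], reverse=True)
        parents = [g for _, g in scored[: len(scored) // 2]]
        # truncation selection + Gaussian mutation (a stand-in for NEAT's operators)
        population = [[w + random.gauss(0, 0.1) for w in random.choice(parents)]
                      for _ in range(len(scored))]
    return population
```

The single `if lamarckian` line is the entire conceptual difference: both schemes evaluate the learned phenotype, but only the Lamarckian one inherits it.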

As a result, while vanilla NEAT could only achieve a test accuracy of approximately 88% after over 100 generations, Darwinian BackpropNEAT achieved 98% test accuracy in just 7 generations by utilizing gradient descent for parameter tuning and focusing its evolutionary search on architecture optimization. This result already demonstrates the potential of synergies between neuroevolution and other learning algorithms. We can also combine gradient-based learning and NEAT for control tasks by using policy gradient methods such as REINFORCE (details of which are available in the article Road to ML Engineer #43 - Policy Gradient Methods). I highly recommend you try implementing it along with other algorithms.

Hebbian Learning vs. Recurrency

Indirect encoding through learning is also achievable with adaptation mechanisms such as recurrency (which adapts activations according to previous states during inference) and Hebbian learning. Hebbian learning updates weights locally without using actual outputs, according to the equation \Delta w_{ij} = \alpha_{ij} o_i o_j - \beta_{ij} w_{ij}, where \Delta w_{ij} is the update to the weight connecting nodes i and j, o_i and o_j are the corresponding activations, and \alpha_{ij} and \beta_{ij} are (either learnable or fixed) learning and decay rates, respectively. It is based on the idea that a connection is important when both of its endpoints are active simultaneously, and it is closer to biological learning.
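The update rule can be sketched directly. For simplicity this uses scalar alpha and beta shared across connections (the learnable variant tested below would evolve per-connection values), and it applies the clipping mentioned later as a guard against runaway activations; none of this is the article's withheld implementation.

```python
import numpy as np

def hebbian_step(W, pre, post, alpha, beta):
    # dW_ij = alpha * o_i * o_j - beta * w_ij  (local rule: no targets, no gradients)
    dW = alpha * np.outer(pre, post) - beta * W
    # clip so correlated co-firing cannot drive weights to infinity/NaN
    return np.clip(W + dW, -5.0, 5.0)

# usage: connections strengthen wherever pre- and post-activations co-fire
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 2))
x = np.array([1.0, 0.0, 1.0, 0.0])
for _ in range(10):
    o = np.tanh(x @ W)                          # forward pass
    W = hebbian_step(W, x, o, alpha=0.05, beta=0.01)
```

Note that the decay term -beta * w_ij is what keeps weights bounded in expectation; with beta = 0 only the clip prevents divergence.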

Hebbian learning with learnable alpha and beta, and recurrency, were both implemented and tested on the cart-pole task. (The results from vanilla NEAT are available in the previous article, Road to ML Engineer #72 - NEAT. The implementations cannot be provided here, so I recommend implementing them yourself. Hint: use clipping to prevent activations from going to infinity or NaN.) Hebbian learning performed the worst, failing even to keep the cart on the screen or the pole up, while NEAT with recurrent connections found a slightly worse solution than vanilla NEAT. The degradation with Hebbian learning may be due to the dramatic increase in the number of parameters to optimize (effectively tripling it), and recurrency's slightly worse performance may be attributable to recurrent weights below or above 1, which can cause vanishing or exploding activations depending on the activation functions. Nonetheless, these results do not imply that these approaches are ineffective in general; they may prove useful in more complex tasks that require weak adaptation.

Conclusion

In this article, we covered biology-inspired approaches to indirect encoding: cell-chemistry, grammatical, and learning-based approaches. Although many indirect encodings have shown potential for better exploiting symmetries and improving neuroevolution (as BackpropNEAT did), more research and discovery are needed to fully realize that potential. In the next article, we will discuss another, potentially more flexible and robust, non-biology-inspired approach to indirect encoding.

Resources