A genetic algorithm (GA) is a very fascinating method for solving an optimization problem. They can be useful when trying to solve optimization problems with a very large search space (solution space respectively).
From the very first moment I heard about genetic algorithms, I was impressed by the idea of simulating natural evolution in an algorithm to solve a search problem. Natural evolution is responsible for the most beautiful but also complex plants and creatures on this planet, so indeed it must be a good idea to implement some of its functioning in an algorithm (at a very trivial scale).
Since I decided to delve deeper into this very interesting topic, I want to share my findings and opinions about genetic algorithms here.
Genetic Algorithms: The Basics
A genetic algorithm in general, is an evolutionary algorithm. This means, that a GA holds a population of possible solutions that evolve with each evolutionary step. For this reason, evolutionary algorithms are sometimes also called “population-based optimization techniques”. At each evolutionary step (or let’s just call it evolution) the adaption of the individuals in the population is considered as the optimization process. To make it more practical: In each evolution the best performing individuals, are selected to survive and are used for reproduction for the next evolutionary step. During reproduction, two operations, namely crossover and mutation are performed, to produce a rich variety of better solution candidates at each step.
If we think of a population of 100 solution candidates, first we must determine how good each solution candidate is. This can be done with a so called “fitness function”. A fitness function is used to assign a “fitness” to each individual in the population. Based on this fitness, we can select the 10 fittest individuals for reproduction (they are also called “parents”), which means they populate their genetic code by crossover and mutation into the next population. The other individuals of the current population are simply neglected, in order to increase the overall population fitness and produce individuals with higher fitness in each evolution. Having this basic idea in mind, it is completely clear that the population must get better with each evolution, but it is not the “all-in-one solution” for every optimization problem as we will see later.
Hearing about genetic algorithms for the first time, this short introduction may be a bit hard to grasp. In order to fully understand how and why this optimization algorithm works, I can strongly suggest to just google for an introductory article or watch out for proper explanations on YouTube. Furthermore, you can look for an application of a GA with the Travelling Salesman Problem (TSP) if you are a more practical guy. You can find a ton of good articles and if you are new to this topic, you should familiarize with fitness functions, solution representations, selection methods, crossover and mutation operators and maybe work through a case study regarding TSP. Otherwise the following could be difficult to understand – but you can try anyway!
The Canonical Genetic Algorithm
If we think in a rather philosophical way, one could say, that natural evolution can only work with a little portion of coincidence. Philosophically, I can neither agree nor disagree with this hypothesis, but mathematically I have good news! First of all, the concept and the inner workings of a genetic algorithms can be mathematically described. Second (more importantly!!) it is proven that the overall fitness of a population increases over time with the proper settings, parameters and operators.
In 1975 John Holland analyzed genetic algorithms and especially the influence of selection, crossover and mutation on the individuals belonging to a population. His goal was to provide and prove a formal model for the functioning of GAs. His consideration for a GA is the so called “Canonical Genetic Algorithm” and has the following setup:
- Binary representation
- Generational replacement
- Population of constant size
- Roulette wheel selection
- Bit flip mutation
Based on these settings and the mathematical definition of selection, crossover, and mutation, he proposed two major cornerstones of GAs, which are the “Schema Theorem” and the “Building Block Hypothesis”. Just for completeness I will try to explain these two topics with my own words and you are very welcome to correct me if I do a mistake here.
With the schema theorem Holland proved that the number of individuals having above average fitness will increase (even exponentially) in subsequent generations. This schema theorem relies on the building block hypothesis, which states that genetic algorithms combine parts (building blocks) of a solution into longer building blocks to increase the quality as the algorithm proceeds. If you do not fully get what this means, it is not that big of a deal – for now. Don’t get me wrong, this is one of the most important concepts when dealing with GAs, since it proves that the whole optimization process is really working, but if you are not planning to design your own genetic operators, it is enough to understand that genetic algorithms are mathematically proven.
Attention! These theorems only hold under the conditions of a Canonical GA and strongly depend on the solution representation and the genetic operators (selection, crossover, mutation). Or in other words: If working with genetic algorithms, you should know what you are doing, otherwise you will not get good results.
That’s it for now and I hope you could get an idea of genetic algorithms in general. We revised some theory about genetic algorithms, when talking about the canonical genetic algorithm, the schema theorem, and the building block hypothesis. Maybe I will do an in-depth article about this theory in future, if it is desired, but for now I will focus on some practical usage of GAs. Stay tuned for more content!