Key Takeaways
- Genetic algorithms are a family of search, optimization, and learning algorithms inspired by the principles of natural evolution.
- Genetic algorithms can be applied to search and optimization problems, such as planning, scheduling, gaming, and analytics.
- Genetic algorithms can be used to improve machine learning and deep learning models in several different ways, such as feature selection, hyperparameter tuning and architecture optimization.
- Genetic algorithms can be utilized for reinforcement learning by optimizing the cumulative reward that the agent seeks.
- Genetic Programming is a special case of Genetic Algorithms that can be used to generate computer programs with a desired functionality.
- There are numerous other bio-inspired algorithms that can solve problems using biological models and behaviors.
Hands-On Genetic Algorithms with Python by Eyal Wirsansky is a new book that explores the world of genetic algorithms and their use in solving search, optimization, and AI-related tasks, as well as in improving machine learning models. InfoQ interviewed Eyal Wirsansky about how genetic algorithms work and what they can be used for.
In addition to our interview, InfoQ was able to obtain a sample chapter which can be downloaded here or, if you prefer to get it directly from the author, download it here by joining his AI-training mailing list.
InfoQ: How do genetic algorithms work?
Eyal Wirsansky: Genetic algorithms are a family of search algorithms inspired by the principles of evolution in nature. They imitate the process of natural selection and reproduction, by starting with a set of random solutions, evaluating each one of them, then selecting the better ones to create the next generation of solutions. As generations go by, the solutions we have get better at solving the problem. This way, genetic algorithms can produce high-quality solutions for various problems involving search, optimization, and learning. At the same time, their analogy to natural evolution allows genetic algorithms to overcome some of the hurdles encountered by traditional search and optimization algorithms, especially for problems with a large number of parameters and complex mathematical representations.
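The evolutionary loop described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: it maximizes a toy "OneMax" objective (counting the 1-bits in a bit string) using tournament selection, single-point crossover, and bit-flip mutation; all names and parameter values are illustrative choices.

```python
import random

random.seed(42)

GENOME_LEN = 20
POP_SIZE = 30
GENERATIONS = 40


def fitness(genome):
    # Toy objective ("OneMax"): count the 1-bits; the optimum is all ones.
    return sum(genome)


def tournament(pop, k=3):
    # Selection: the fittest of k randomly chosen individuals wins.
    return max(random.sample(pop, k), key=fitness)


def crossover(a, b):
    # Single-point crossover exchanges genetic material between two parents.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]


def mutate(genome, rate=0.02):
    # Mutation: flip each bit with a small probability.
    return [g ^ 1 if random.random() < rate else g for g in genome]


# Start with a set of random solutions, then evolve generation by generation.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population = [mutate(crossover(tournament(population),
                                   tournament(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best))
```

After a few dozen generations the best individual is at or near the optimum of 20, even though the algorithm never examines the objective's structure, only fitness values.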
InfoQ: What type of problems do genetic algorithms solve?
Wirsansky: Genetic algorithms can be used for solving almost any type of problem, but they particularly shine where traditional algorithms cannot be used, or fail to produce usable results within a practical amount of time. Examples include problems with a very complex or nonexistent mathematical representation, problems where the number of variables involved is large, and problems with noisy or inconsistent input data. In addition, genetic algorithms are better equipped to handle ‘deceptive’ problems, where traditional algorithms may get ‘trapped’ in a suboptimal solution.
Genetic algorithms can even deal with cases where there is no way to evaluate an individual solution by itself, as long as there is a way to compare two solutions and determine which of them is better. An example can be a machine learning-based agent that drives a car in a simulated race. A genetic algorithm can optimize and tune the agent by having different versions of it compete against each other to determine which version is better.
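This comparison-only setting can be sketched as follows. The example is a toy stand-in, not from the book: the evolutionary loop never sees a numeric score, only a `better(a, b)` comparator, which here happens to be backed by a hidden target value purely so the example is self-contained.

```python
import random

random.seed(1)

# Hidden quality measure: closeness to a target value. The algorithm below
# never reads it directly; it only asks which of two candidates is better,
# just as two racing agents can be compared head-to-head without a score.
TARGET = 0.7


def better(a, b):
    # Comparator: returns whichever candidate is closer to the hidden target.
    return a if abs(a - TARGET) < abs(b - TARGET) else b


population = [random.random() for _ in range(20)]

for _ in range(100):
    # Pairwise tournament selection using only the comparator,
    # followed by a small Gaussian mutation.
    population = [better(random.choice(population), random.choice(population))
                  + random.gauss(0, 0.05)
                  for _ in range(20)]

# The final "winner" is found with comparisons alone.
winner = population[0]
for cand in population[1:]:
    winner = better(winner, cand)
print(winner)
```

Despite having no fitness function in the usual sense, the population converges toward the hidden optimum because relative comparisons are enough to drive selection.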
InfoQ: What are the best use cases for genetic algorithms?
Wirsansky: The most common use case is where we need to assemble a solution using a combination of many different available parts; we want to select the best combination, but the number of possible combinations is too large to try them all. Genetic algorithms can usually find a good combination within a reasonable amount of time. Examples include scheduling personnel, planning delivery routes, designing bridge structures, constructing the best machine learning model from many available building blocks, and finding the best architecture for a deep learning model.
Another interesting use case is where the evaluation is based on people’s opinion or response. For example, you can use the genetic algorithm approach to determine the design parameters for a web site—such as color palette, font size, and location of components on the page—that will achieve the best response from customers, such as conversion or retention. This idea can also be used for ‘genetic art’— artificially created paintings or music that prove pleasant to the human eye (or ear).
Genetic algorithms can also be used for ‘ongoing optimization’—cases where the best solution may change over time. The algorithm can run continuously within the changing environment and respond dynamically to these changes by updating the best solution based on the current generation.
InfoQ: How can genetic algorithms select the best subset of features for supervised learning?
Wirsansky: In many cases, reducing the number of features—used as inputs for a model in supervised learning—can increase the model’s accuracy, as some of the features may be irrelevant or redundant. This will also result in a simpler, better generalizing model. But we need to figure out which are the features that we want to keep. As this comes down to finding the best combination of features out of a potentially immense number of possible combinations, genetic algorithms provide a very practical approach. Each potential solution is represented by a list of booleans, one for each feature.
The value of each boolean (‘0’ or ‘1’) represents the absence or presence of the corresponding feature. These lists of boolean values serve as the genetic material: they can be exchanged between solutions when we ‘mate’ them, or mutated by randomly flipping values. Using these ‘mating’ and ‘mutation’ operations, we create new generations out of preceding ones, while giving an advantage to solutions that yielded better-performing models. After a while, we are left with some good solutions, each representing a subset of the features. This is demonstrated in Chapter 7 of the book (our sample chapter) with the UCI ‘Zoo’ dataset using Python code, where the best performance was achieved by selecting six particular features out of the original sixteen.
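A minimal sketch of this boolean encoding follows. It is an illustration only, not the book's Chapter 7 code: a hypothetical scoring function stands in for the cross-validated model accuracy that a real feature-selection run would compute, and the set of "informative" features is invented for the demo.

```python
import random

random.seed(7)

N_FEATURES = 16  # e.g., the Zoo dataset has 16 input features

# Hypothetical ground truth for the demo: only these features carry signal.
# The score rewards including them and penalizes redundant extras, standing
# in for cross-validated accuracy of a model trained on the selected subset.
INFORMATIVE = {0, 3, 5, 8, 11, 13}


def score(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    extras = sum(mask) - hits
    return hits - 0.25 * extras


def mutate(mask, rate=1.0 / N_FEATURES):
    # Flip each boolean with a small probability.
    return [bit ^ 1 if random.random() < rate else bit for bit in mask]


def crossover(a, b):
    # Exchange genetic material at a single cut point.
    cut = random.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]


pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(30)]
for _ in range(60):
    pop.sort(key=score, reverse=True)
    parents = pop[:10]  # truncation selection with elitism
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(20)]

best = max(pop, key=score)
print([i for i, bit in enumerate(best) if bit])  # indices of kept features
```

In a real pipeline the `score` function would train and evaluate the model on just the selected columns, which is exactly where most of the running time goes.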
InfoQ: What are the benefits that we can get from using genetic algorithms with machine learning for hyperparameter tuning?
Wirsansky: Every machine learning model utilizes a set of hyperparameters—values that are set before the training takes place and affect the way the learning is done. The combined effect of hyperparameters on the performance of the model can be significant. Unfortunately, finding the best combination of the hyperparameter values—also known as hyperparameter tuning—can be as difficult as finding a needle in a haystack.
Two common approaches are grid search and random search, each with its own disadvantages. Genetic algorithms can be used in two ways to improve upon these methods. One way is by optimizing the grid search: instead of trying out every combination on the grid, we can search only a subset of combinations and still arrive at a good one. The other way is to conduct a full search over the hyperparameter space, as genetic algorithms are capable of handling a large number of parameters as well as different parameter types: continuous, discrete, and categorical. These two approaches are demonstrated in Chapter 8 of the book with the UCI ‘Wine’ dataset using Python code.
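The mixed-type search mentioned above can be sketched as follows. This is an illustration rather than the book's Chapter 8 code: the hyperparameter names are borrowed from a typical SVM-style model, and a synthetic `score` function stands in for real cross-validated accuracy.

```python
import random

random.seed(3)

# Hypothetical mixed search space: one continuous, one discrete,
# and one categorical hyperparameter.
KERNELS = ["linear", "poly", "rbf"]


def random_individual():
    return {"C": random.uniform(0.01, 100.0),   # continuous
            "degree": random.randint(1, 5),     # discrete
            "kernel": random.choice(KERNELS)}   # categorical


def score(ind):
    # Stand-in for cross-validated accuracy; by construction it peaks
    # at C=10, degree=3, kernel="rbf".
    s = -abs(ind["C"] - 10.0) / 100.0 - abs(ind["degree"] - 3) * 0.1
    return s + (0.3 if ind["kernel"] == "rbf" else 0.0)


def mutate(ind):
    # Mutate one hyperparameter, respecting its type.
    child = dict(ind)
    key = random.choice(list(child))
    if key == "C":
        child["C"] = min(100.0, max(0.01, child["C"] + random.gauss(0, 5)))
    elif key == "degree":
        child["degree"] = random.randint(1, 5)
    else:
        child["kernel"] = random.choice(KERNELS)
    return child


def crossover(a, b):
    # Uniform crossover: each hyperparameter comes from either parent.
    return {k: random.choice([a[k], b[k]]) for k in a}


pop = [random_individual() for _ in range(20)]
for _ in range(50):
    pop.sort(key=score, reverse=True)
    elite = pop[:5]
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(15)]

best = max(pop, key=score)
print(best)
```

Note how the same genetic operators handle all three parameter types at once, which is awkward for a plain grid search and impossible for gradient-based tuners.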
InfoQ: How can genetic algorithms be used in Reinforcement Learning?
Wirsansky: Reinforcement Learning (RL) is a very exciting and promising branch of machine learning, with the potential to handle complex, everyday-life-like tasks. Unlike supervised learning, RL does not provide immediate 'right/wrong' feedback; instead, it provides an environment where a longer-term, cumulative reward is sought. This kind of setting can be viewed as an optimization problem, another area where genetic algorithms excel.
As a result, genetic algorithms can be utilized for reinforcement learning in several different ways. One example can be determining the weights and biases of a neural network that interacts with its environment by mapping input values to output values. Chapter 10 of the book includes two examples of applying genetic algorithms to RL tasks, using the OpenAI Gym environments ‘mountain-car’ and ‘cart-pole’.
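The weight-evolution idea can be sketched as follows. This is a toy illustration, not the book's Gym examples: a two-line simulated environment stands in for ‘mountain-car’ or ‘cart-pole’, and a tiny linear policy stands in for a neural network, with its weights evolved to maximize the episode's cumulative reward.

```python
import random

random.seed(5)


def episode_reward(weights):
    # Toy stand-in for a Gym-style environment: a point starts off-center,
    # and the linear policy (action = w0 * position + w1) pushes it around.
    # The cumulative reward is higher the closer the point stays to zero.
    pos, reward = 1.0, 0.0
    for _ in range(50):
        action = weights[0] * pos + weights[1]
        pos += 0.1 * action
        reward -= pos * pos
    return reward


def mutate(w, sigma=0.2):
    # Gaussian perturbation of every weight.
    return [wi + random.gauss(0, sigma) for wi in w]


# Simple evolutionary loop: keep an elite, refill with mutated copies.
pop = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(20)]
for _ in range(40):
    pop.sort(key=episode_reward, reverse=True)
    elite = pop[:5]
    pop = elite + [mutate(random.choice(elite)) for _ in range(15)]

best = max(pop, key=episode_reward)
print(round(episode_reward(best), 3))
```

The same structure scales up directly: replace the two weights with the flattened weights and biases of a neural network, and `episode_reward` with a rollout in a real environment.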
InfoQ: What is bio-inspired computing?
Wirsansky: Genetic algorithms are just one branch within a larger family of algorithms called Evolutionary Computation, all inspired by Darwinian evolution. One particularly interesting member of this family is Genetic Programming, which evolves computer programs aimed at solving a specific problem. More broadly, as evolutionary computation techniques are based on various biological systems or behaviors, they can be considered part of the algorithm family known as Bio-inspired Computing.
Among the many fascinating members of this family are Ant Colony Optimization, which imitates the way certain species of ants locate food and mark the paths to it, giving an advantage to closer and richer food sources; Artificial Immune Systems, capable of identifying and learning new threats, as well as applying the acquired knowledge to respond faster the next time a similar threat is detected; and Particle Swarm Optimization, based on the behavior of flocks of birds or schools of fish, where individuals within the group work together towards a common goal without central supervision.
Another related, broad field of computation is Artificial Life, which involves systems and processes that imitate natural life in various ways, such as computer simulations and robotic systems. Chapter 12 of the book includes two relevant examples written in Python: one solves a problem using genetic programming, and the other uses particle swarm optimization.
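To make the swarm idea concrete, here is a minimal particle swarm optimization sketch, an illustration rather than the book's Chapter 12 code. Each particle blends three pulls (its own momentum, its personal best, and the swarm's best) with no central supervisor dictating moves; the objective function and coefficients are arbitrary demo choices.

```python
import random

random.seed(11)


def f(x, y):
    # Objective to minimize; its global minimum is at (3, -1).
    return (x - 3) ** 2 + (y + 1) ** 2


W, C1, C2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social coefficients

particles = []
for _ in range(15):
    pos = [random.uniform(-10, 10), random.uniform(-10, 10)]
    particles.append({"pos": pos, "vel": [0.0, 0.0], "best": list(pos)})

global_best = min((p["best"] for p in particles), key=lambda b: f(*b))

for _ in range(100):
    for p in particles:
        for d in range(2):
            # Velocity blends inertia, the pull toward the particle's own
            # best-known point, and the pull toward the swarm's best.
            p["vel"][d] = (W * p["vel"][d]
                           + C1 * random.random() * (p["best"][d] - p["pos"][d])
                           + C2 * random.random() * (global_best[d] - p["pos"][d]))
            p["pos"][d] += p["vel"][d]
        # Update personal and global bests when a particle improves.
        if f(*p["pos"]) < f(*p["best"]):
            p["best"] = list(p["pos"])
            if f(*p["best"]) < f(*global_best):
                global_best = list(p["best"])

print([round(v, 2) for v in global_best])
```

Within a hundred iterations the swarm homes in on the minimum, illustrating the decentralized cooperation described above.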
About the Book Author:
Eyal Wirsansky is a senior software engineer, a technology community leader, and an artificial intelligence researcher and consultant. Eyal started his software engineering career as a pioneer in the field of voice over IP, and he now has over 20 years of experience creating a variety of high-performing enterprise solutions. While in graduate school, he focused his research on genetic algorithms and neural networks. One outcome of his research is a novel supervised machine learning algorithm that combines the two. Eyal leads the Jacksonville (FL) Java user group, hosts the Artificial Intelligence for Enterprise virtual user group, and writes the developer-oriented artificial intelligence blog, ai4java.