Chapter 1

Introduction

Digital computers and the rise of the information age have revolutionized the modern lifestyle. The invention of digital computers has enabled us to digitize numerous areas of our lives. This digitization allows us to outsource many tedious daily tasks to computers that previously would have required a human. An everyday example is the modern word processor, which features a built-in spell checker that automatically checks documents for spelling and grammar mistakes.

As computers have grown faster and more computationally powerful, we have been able to use them to perform increasingly complex tasks, such as understanding human speech and even predicting the weather with some accuracy. This constant innovation allows us to outsource a growing number of tasks to computers. A present-day computer can likely execute billions of operations a second, but however technically capable computers become, unless they can learn and adapt themselves to better suit the problems presented to them, they'll always be limited to whatever rules or code we humans write for them.

The field of artificial intelligence, and its subfield of genetic algorithms, is beginning to tackle some of the more complex problems faced in today's digital world. By applying genetic algorithms to real-world applications, it is possible to solve problems that would be nearly impossible to solve by more traditional computing methods.

What is Artificial Intelligence?

In 1950, Alan Turing, a mathematician and early computer scientist, wrote a famous paper titled "Computing Machinery and Intelligence", in which he asked, "Can machines think?" His question caused much debate on what intelligence actually is and what the fundamental limitations of computers might be.

Many early computer scientists believed computers would not only be able to demonstrate intelligent behavior, but that they would achieve human-level intelligence within just a few decades of research. This notion was indicated by Herbert A. Simon in 1965 when he declared, "Machines will be capable, within twenty years, of doing any work a man can do." Of course now, over 50 years later, we know that Simon's prediction was far from reality, but at the time many computer scientists agreed with his position and made it their goal to create a "strong AI" machine: a machine that is at least as intellectually capable as a human at completing any task it is given.

Today, more than 50 years after Alan Turing's famous question was posed, whether machines will eventually be able to think in a way similar to humans remains largely unanswered. To this day, his paper and his thoughts on what it means to "think" are still widely debated by philosophers and computer scientists alike.

Although we're still far from creating machines able to replicate the intelligence of humans, we have undoubtedly made significant advances in artificial intelligence over the last few decades. Since the 1950s, the focus on "strong AI", developing artificial intelligence comparable to that of humans, has been shifting in favor of "weak AI": the development of more narrowly focused intelligent machines, which is much more achievable in the short term. This narrower focus has allowed computer scientists to create practical and seemingly intelligent systems, such as Apple's Siri and Google's self-driving car.

When creating a weak AI system, researchers typically focus on building a system or machine that is only as "intelligent" as it needs to be to solve a relatively narrow problem. This means simpler algorithms and less computing power can be applied while still achieving results. In comparison, strong AI research focuses on building a machine intelligent and capable enough to tackle any problem that we humans can. This makes building a final product with strong AI much less practical, due to the scope of the problem.

In only a few decades, weak AI systems have become a common component of our modern lifestyle. From playing chess to helping humans fly fighter jets, weak AI systems have proven themselves useful in solving problems once thought to be solvable only by humans. As digital computers become smaller and more computationally capable, the usefulness of these systems is only likely to increase.

Biological Analogies

When early computer scientists were first trying to build artificially intelligent systems, they frequently looked to nature for inspiration on how their algorithms could work. By creating models that mimic processes found in nature, computer scientists were able to give their algorithms the ability to evolve, and even to replicate characteristics of the human brain. It was by implementing these biologically inspired algorithms that the early pioneers were able, for the first time, to give their machines the ability to adapt, learn, and control aspects of their environments.

By using different biological analogies as guiding metaphors to develop artificially intelligent systems, computer scientists created distinct fields of research. Naturally, the different biological systems that inspired each field of research have their own specific advantages and applications. One successful field, and the one this book focuses on, is evolutionary computation, in which genetic algorithms make up the majority of the research. Other fields focus on slightly different areas, such as modeling the human brain. This field of research is called artificial neural networks, and it uses models of the biological nervous system to mimic its learning and data-processing capabilities.

History of Evolutionary Computation

Evolutionary computation was first explored as an optimization tool in the 1950s, when computer scientists were playing with the idea of applying Darwinian principles of biological evolution to a population of candidate solutions. They theorized that it might be possible to apply evolutionary operators such as crossover (an analog to biological reproduction) and mutation (the process by which new genetic information is added to the genome). It's these operators, when coupled with selection pressure, that give genetic algorithms the ability to "evolve" new solutions over time.

In the 1960s, "evolution strategies", an optimization technique applying the ideas of natural selection and evolution, was first proposed by Rechenberg (1965, 1973), and his ideas were later expanded on by Schwefel (1975, 1977). Other computer scientists at the time were working independently on similar lines of research, such as Fogel, Owens, and Walsh (1966), who were the first to introduce the field of evolutionary programming. Their technique involved representing candidate solutions as finite-state machines and applying mutation to create new solutions.

During the 1950s and 1960s, some biologists studying evolution began experimenting with simulating evolution using computers. However, it was Holland (1975) who first invented and developed the concept of genetic algorithms during the 1960s and 1970s, finally presenting his ideas in 1975 in his groundbreaking book, "Adaptation in Natural and Artificial Systems". Holland's book demonstrated how Darwinian evolution could be abstracted and modeled using computers for use in optimization strategies. It explained how biological chromosomes can be modeled as strings of 1s and 0s, and how populations of these chromosomes can be "evolved" using techniques found in natural selection, such as mutation, selection, and crossover.

Holland's original definition of a genetic algorithm has gradually changed over the decades since it was first introduced in the 1970s. This is partly because researchers working in the field of evolutionary computation have occasionally brought ideas from the different approaches together. Although this has blurred the lines between many of the methodologies, it has provided us with a rich set of tools that can help us better tackle specific problems. The term "genetic algorithm" in this book will be used to refer both to Holland's classical vision of a genetic algorithm and to the wider, present-day interpretation of the words.

To this day, computer scientists are still looking to biology and biological systems for ideas on how to create better algorithms. One of the more recent biologically inspired optimization algorithms is Ant Colony Optimization, first proposed by Dorigo (1992). Ant Colony Optimization models the behavior of ants as a method for solving various optimization problems, such as the Traveling Salesman Problem.

The Advantage of Evolutionary Computation

The very rate at which intelligent machines have been adopted within our society is an acknowledgement of their usefulness. The vast majority of problems we use computers to solve can be reduced to relatively simple, static decision problems. These problems can become rapidly more complex as the number of possible inputs and outputs increases, and they are further complicated when the solution needs to adapt to a changing problem. In addition, some problems may require an algorithm to search through a huge number of possible solutions in an attempt to find a feasible one. Depending on the number of solutions that need to be searched through, classical computational methods may not be able to find a feasible solution in the time frame available, even using a supercomputer. It's in these circumstances that evolutionary computation can offer a helping hand.

To give you an idea of a typical problem we can solve with classical computational methods, consider a traffic light system. Traffic lights are relatively simple systems that only require a basic level of intelligence to operate. A traffic light system will usually have just a few inputs, which can alert it to events such as a car or pedestrian waiting to use the junction. It then needs to manage those inputs and correctly change the lights so that cars and pedestrians can use the junction efficiently without causing any accidents. Although a certain amount of knowledge is required to operate a traffic light system, its inputs and outputs are basic enough that a set of instructions to operate it can be designed and programmed by humans without much trouble.

Often, we need an intelligent system to handle more complex inputs and outputs. This can make it difficult, or even impossible, for a human to program a set of instructions that correctly maps the inputs to a viable output. In cases where the complexity of the problem makes it impractical for a human programmer to solve with code, optimization and learning algorithms provide a method for using the computer's processing power to find a solution to the problem itself. An example of this might be a fraud detection system that recognizes fraudulent transactions based on transaction information. Although a relationship may exist between the transaction data and fraudulent activity, it could depend on many subtleties within the data itself. It's these subtle patterns in the input that can be hard for a human to code for, making the problem a good candidate for evolutionary computation.

Evolutionary algorithms are also useful when humans don't know how to solve a problem. A classic example of this occurred when NASA was looking for an antenna design that met all of its requirements for a 2006 space mission. NASA wrote a genetic algorithm that evolved an antenna design to meet its specific design constraints, such as signal quality, size, weight, and cost. In this example, NASA didn't know how to design an antenna that would fit all of its requirements, so it wrote a program that could evolve one instead.

Another situation in which we may want to apply an evolutionary computation strategy is when the problem is constantly changing, requiring an adaptive solution. One such problem is building an algorithm to make predictions about the stock market. An algorithm that makes accurate predictions one week might not make accurate predictions the following week. This is due to the ever-shifting patterns and trends of the stock market, which make prediction algorithms very unreliable unless they're able to quickly adapt to changing patterns as they occur. Evolutionary computation can accommodate these changes by providing a method by which the prediction algorithm can be adapted as necessary.

Finally, some problems require searching through a large, possibly infinite, number of potential solutions to find the best, or a good enough, solution to the problem at hand. Fundamentally, all evolutionary algorithms can be viewed as search algorithms that search through a set of possible solutions looking for the best, or "fittest", solution. You may be able to visualize this if you think of all the potential combinations of genes found in an organism's genome as candidate solutions. Biological evolution is great at searching through these possible genetic sequences to find a solution that sufficiently suits its environment. In larger search spaces, it's likely that the best solution to a given problem won't be found, even when using evolutionary algorithms. However, this is rarely an issue for most optimization problems, because typically we only require a solution good enough to get the job done.

The approach provided by evolutionary computation can be thought of as a "bottom-up" paradigm, in which all the complexity that emerges from an algorithm comes from simple underlying rules. The alternative would be a "top-down" approach, in which all the complexity demonstrated by the algorithm has to be written by humans. Genetic algorithms are fairly simple to develop, making them an appealing choice when a complex algorithm would otherwise be required to solve the problem.

Here is a list of features which can make a problem a good candidate for an evolutionary algorithm:

  • When the problem is too difficult to solve by writing explicit code
  • When a human isn’t sure how to solve the problem
  • If a problem is constantly changing
  • When it’s not feasible to search through each possible solution
  • When a “good-enough” solution is acceptable

Biological Evolution

Biological evolution, through the process of natural selection, was first proposed by Charles Darwin (1859) in his book, "On the Origin of Species". It was his concept of biological evolution that inspired early computer scientists to adapt biological evolution as a model for the optimization techniques found in evolutionary computation algorithms.

Because many of the ideas and concepts used in genetic algorithms stem directly from biological evolution, a basic familiarity with the subject is useful for a deeper understanding of the field. With that said, before we begin exploring genetic algorithms, let's first run through the (somewhat simplified) basics of biological evolution.

All organisms contain DNA which encodes all of the different traits that make up that organism. DNA can be thought of as life’s instruction manual to create the organism from scratch. Changing the DNA of an organism will change its traits such as eye and hair color. DNA is made up of individual genes, and it is these genes that are responsible for encoding the specific traits of an organism.

An organism's genes are grouped together in chromosomes, and a complete set of chromosomes makes up an organism's genome. All organisms have at least one chromosome, but most have many more; humans, for example, have 46 chromosomes, and some species have more than 1,000! In genetic algorithms, we usually refer to the chromosome as the candidate solution, because genetic algorithms typically use a single chromosome to encode the candidate solution.

Each of the possible settings for a specific trait is called an "allele", and the position in the chromosome where that trait is encoded is called a "locus". We refer to a specific genome as a "genotype", and the physical organism that genotype encodes is called the "phenotype".

When two organisms mate, DNA from both organisms is brought together and combined so that the resulting organism, usually referred to as the offspring, acquires 50% of its DNA from its first parent and the other 50% from the second. Every so often, a gene in the organism's DNA will mutate, providing it with genetic information found in neither of its parents. These mutations provide the population with genetic diversity by adding genes that weren't available beforehand. All the possible genetic information in the population is referred to as the population's "gene pool".

If the resulting organism is fit enough to survive in its environment, it is likely to mate itself, allowing its DNA to continue into future populations. If, however, the resulting organism isn't fit enough to survive and eventually mate, its genetic material won't propagate into future populations. This is why evolution is occasionally referred to as "survival of the fittest": only the fittest individuals survive and pass on their DNA. It's this selective pressure that slowly guides evolution toward increasingly fit and better-adapted individuals.

An Example of Biological Evolution

To help clarify how this process will gradually lead to the evolution of increasingly fitter individuals, consider the following example:

On a distant planet there exists a species that takes the shape of a white square.

The white square species has lived for thousands of years in peace, until recently when a new species arrived, the black circle.

The black circle species were carnivores and began feeding on the white square population.

The white squares didn't have any way to defend themselves against the black circles, until one day one of the surviving white squares randomly mutated from white to black. The black circles no longer saw the new black square as food, because it was the same color as they were.

Some of the surviving square population mated, creating a new generation of squares. Some of these new squares inherited the black square color gene.

However, the white colored squares continued to be eaten…

Eventually, thanks to their evolutionary advantage of looking similar to the black circles, the black squares were no longer eaten, and the only squares left were black ones.

No longer prey to the black circle, the black squares were once again free to live in peace.

Basic Terminology

Genetic algorithms were built on the concepts of biological evolution, so if you're familiar with the terminology of evolution, you'll likely notice overlap with the terminology used when working with genetic algorithms. The similarities between the fields are, of course, due to evolutionary algorithms, and more specifically genetic algorithms, being analogous to processes found in nature.

Terms

Before we go deeper into the field of genetic algorithms, it's important that we first understand some of the basic language and terminology used. As the book progresses, more complex terminology will be introduced as required. Below is a list of some of the more common terms for reference, followed by a short code sketch showing how they might map onto a simple data structure.

  • Population – A collection of candidate solutions that can have genetic operators such as mutation and crossover applied to them.
  • Candidate Solution – A possible solution to a given problem.
  • Gene – The indivisible building block making up the chromosome. Classically, a gene is a 0 or a 1.
  • Chromosome – A string of genes. A chromosome defines a specific candidate solution. A typical chromosome with a binary encoding might look like "01101011".
  • Mutation – The process by which genes in a candidate solution are randomly altered to create new traits.
  • Crossover – The process by which chromosomes are combined to create a new candidate solution. This is sometimes referred to as recombination.
  • Selection – The technique of picking candidate solutions to breed the next generation of solutions.
  • Fitness – A score that measures the extent to which a candidate solution is adapted to a given problem.
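
To make these terms concrete, the following is a minimal sketch, in Java, of how a candidate solution might be represented using a binary encoding. The class and method names here are illustrative, not taken from any particular library.

import java.util.Random;

// A minimal candidate solution: a binary chromosome plus a fitness score.
public class Individual {
    private final int[] chromosome; // each element is a gene: 0 or 1
    private double fitness = -1;    // -1 marks "not yet evaluated"

    // Create an individual with a random chromosome of the given length.
    public Individual(int chromosomeLength, Random random) {
        this.chromosome = new int[chromosomeLength];
        for (int i = 0; i < chromosomeLength; i++) {
            this.chromosome[i] = random.nextBoolean() ? 1 : 0;
        }
    }

    public int getGene(int offset) { return chromosome[offset]; }
    public void setGene(int offset, int gene) { chromosome[offset] = gene; }
    public int size() { return chromosome.length; }

    public double getFitness() { return fitness; }
    public void setFitness(double fitness) { this.fitness = fitness; }
}

A population, in these terms, is simply a collection of Individual objects, and the operator sketches later in this chapter read and write genes through methods like getGene and setGene.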

Search Spaces

In computer science, when dealing with optimization problems that have many candidate solutions to be searched through, we refer to the collection of all solutions as a "search space". Each point within the search space represents one candidate solution for the given problem. Within the search space there is a notion of distance: solutions placed closer to one another are more likely to express similar traits than solutions placed further apart. To understand how these distances are organized in the search space, consider the following example using a binary genetic representation:

"101" is only one difference away from "111", because only one change (flipping the middle 0 to a 1) is required to transition from "101" to "111". This means these solutions are only one space apart in the search space.

"000", on the other hand, is three differences away from "111", giving it a distance of 3 and placing "000" three spaces from "111" in the search space.
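
This notion of distance between binary chromosomes is known as the Hamming distance: the number of positions at which two equal-length strings differ. Here is a minimal sketch in Java (the method name is illustrative):

// Counts the positions at which two equal-length binary strings differ,
// e.g. hammingDistance("101", "111") == 1, hammingDistance("000", "111") == 3.
public static int hammingDistance(String a, String b) {
    if (a.length() != b.length()) {
        throw new IllegalArgumentException("Strings must be the same length");
    }
    int distance = 0;
    for (int i = 0; i < a.length(); i++) {
        if (a.charAt(i) != b.charAt(i)) {
            distance++;
        }
    }
    return distance;
}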

Because solutions with fewer changes between them are grouped nearer to one another, the distance between two solutions in the search space can be used to approximate how similar their characteristics are. Many search algorithms use this insight to improve their search results.

Fitness Landscapes

When candidate solutions found within the search space are labeled by their individual fitness levels we can begin to think of the search space as a “fitness landscape”. Figure 1-1 provides an example of what a 2D fitness landscape might look like.

Figure 1-1. A 2D fitness landscape

On the bottom axis of our fitness landscape is the value we're optimizing, and on the left axis is its corresponding fitness value. Note that this is typically an oversimplification of what would be found in practice: most real-world applications have multiple values to optimize, creating a multi-dimensional fitness landscape.

In the example above, the fitness value of every candidate solution in the search space can be seen. This makes it easy to see where the fittest solution is located; however, for this to be possible in reality, each candidate solution in the search space would need to have its fitness evaluated. For complex problems with exponentially large search spaces, it just isn't feasible to evaluate every solution's fitness value. In these cases, it is the search algorithm's job to find where the best solution likely resides while only ever having a tiny proportion of the search space visible. Figure 1-2 is an example of what a search algorithm might typically see.

Figure 1-2. A more typical search fitness space

Consider an algorithm searching through a search space of one billion (1,000,000,000) possible solutions. Even if each solution takes only 1 second to evaluate and be assigned a fitness value, it would still take over 30 years to explicitly search through every potential solution (one billion seconds is roughly 31.7 years). If we don't know the fitness value of each solution in the search space, then we can't definitively know where the best solution resides. In this case, the only reasonable approach is to use a search algorithm capable of finding a good-enough solution in the time frame available. Under these conditions, genetic algorithms, and evolutionary algorithms in general, are very effective at finding feasible, near-optimal solutions in a relatively short time frame.

Genetic algorithms take a population-based approach to searching the search space. As part of their search strategy, genetic algorithms assume that two well-ranking solutions can be combined to form an even fitter offspring. This process can be visualized on our fitness landscape (Figure 1-3).

Figure 1-3. Parent and offspring in the fitness plot
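
One common way to implement this combination is single-point crossover: pick a random point in the chromosome, then copy genes before that point from one parent and genes after it from the other. The following is a minimal sketch, reusing the illustrative Individual class from earlier; this is one of several possible crossover operators, not the only one.

// Single-point crossover: the offspring takes parent1's genes before a
// random crossover point and parent2's genes from that point onward.
public static Individual crossover(Individual parent1, Individual parent2, Random random) {
    Individual offspring = new Individual(parent1.size(), random);
    int crossoverPoint = random.nextInt(parent1.size());
    for (int i = 0; i < parent1.size(); i++) {
        offspring.setGene(i, i < crossoverPoint ? parent1.getGene(i) : parent2.getGene(i));
    }
    return offspring;
}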

The mutation operator found in genetic algorithms allows us to search the close neighbors of a specific candidate solution. When mutation is applied to a gene, its value is randomly changed. This can be pictured as taking a single step in the search space (Figure 1-4).

Figure 1-4. A fitness plot showing the mutation
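
In code, this single step can be as simple as picking one gene at random and flipping it. A minimal sketch, under the same assumptions as the earlier examples:

// Bit-flip mutation as a single step in the search space:
// choose one gene at random and invert it (0 becomes 1, 1 becomes 0).
public static void mutateOneGene(Individual individual, Random random) {
    int offset = random.nextInt(individual.size());
    individual.setGene(offset, 1 - individual.getGene(offset));
}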

With both crossover and mutation, it is possible to end up with a solution less fit than the one we originally set out with (Figure 1-5).

Figure 1-5. A poor fitness solution

In these circumstances, if the solution performs poorly enough, it will eventually be removed from the gene pool during the selection process. Small negative changes in individual candidate solutions are fine, as long as the population's average trend is toward fitter solutions.

Local Optimums

An obstacle to consider when implementing an optimization algorithm is how well it can escape from locally optimal positions in the search space. To better visualize what a local optimum is, refer to Figure 1-6.

Figure 1-6. A local optimum can be deceiving

Here we can see two hills in the fitness landscape with peaks of slightly different heights. As mentioned earlier, the optimization algorithm isn't able to see the entire fitness landscape; the best it can do is find solutions that it believes are likely to be in an optimal position in the search space. Because of this characteristic, an optimization algorithm can often unknowingly focus its search on suboptimal portions of the search space.

This problem quickly becomes noticeable when implementing a simple hill climbing algorithm to solve problems of any sufficient complexity. A simple hill climber has no inherent method of dealing with local optimums, and as a result will often terminate its search in locally optimal regions of the search space. A simple stochastic hill climber is comparable to a genetic algorithm without a population or crossover. The algorithm is fairly easy to understand: it starts at a random point in the search space, then attempts to find a better solution by evaluating its neighboring solutions. When the hill climber finds a better solution among its neighbors, it moves to the new position and restarts the search process. This process gradually finds improved solutions by taking steps up whatever hill it finds itself on in the search space, hence the name "hill climber". When the hill climber can no longer find a better solution, it assumes it is at the top of the hill and stops the search.

Figure 1-7 illustrates how a typical run-through of a hill climber algorithm might look.

Figure 1-7. How the hill climber works

The diagram above demonstrates how a simple hill climber algorithm can easily return a locally optimal solution if its search begins in a locally optimal area of the search space.

Although there isn't any guaranteed way to avoid local optimums without first evaluating the entire search space, there are many variations of the algorithm that can help. One of the most basic and effective is random-restart hill climbing, which simply runs the hill climbing algorithm multiple times from random starting positions and returns the best solution found across its various runs. This optimization method is relatively easy to implement and surprisingly effective. Other approaches, such as simulated annealing (see Kirkpatrick, Gelatt, and Vecchi (1983)) and tabu search (see Glover (1989, 1990)), are variations on the hill climbing algorithm with properties that help them escape local optimums.
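
To illustrate the idea, here is a minimal random-restart hill climber over binary strings. It uses "count the 1s" (the OneMax toy problem) as a stand-in fitness function; note that OneMax is deliberately easy and has no local optima, so this sketch only demonstrates the mechanics, not the failure mode. All names and values are illustrative.

import java.util.Arrays;
import java.util.Random;

public class RandomRestartHillClimber {

    // Stand-in fitness function (OneMax): count the 1s in the chromosome.
    static int fitness(int[] chromosome) {
        int sum = 0;
        for (int gene : chromosome) sum += gene;
        return sum;
    }

    // Climb from a random starting point: keep any single-bit flip that
    // improves fitness, and stop when no neighbor is better.
    static int[] climb(int length, Random random) {
        int[] current = new int[length];
        for (int i = 0; i < length; i++) current[i] = random.nextInt(2);

        int bestFitness = fitness(current);
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int i = 0; i < length; i++) {
                current[i] = 1 - current[i];        // try flipping gene i
                int candidate = fitness(current);
                if (candidate > bestFitness) {      // keep the improvement
                    bestFitness = candidate;
                    improved = true;
                } else {
                    current[i] = 1 - current[i];    // undo the flip
                }
            }
        }
        return current;
    }

    // Random restart: run the climber several times from random starting
    // positions and keep the best solution found across all runs.
    public static void main(String[] args) {
        Random random = new Random();
        int[] best = null;
        for (int run = 0; run < 10; run++) {
            int[] candidate = climb(20, random);
            if (best == null || fitness(candidate) > fitness(best)) {
                best = candidate;
            }
        }
        System.out.println("Best found: " + Arrays.toString(best)
                + ", fitness: " + fitness(best));
    }
}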

Genetic algorithms are surprisingly effective at avoiding local optimums and retrieving solutions that are close to optimal. One way they achieve this is by having a population, which allows them to sample a large area of the search space, locating the best areas in which to continue the search. Figure 1-8 shows how the population might be distributed at initialization.

Figure 1-8. Sample areas at initialization

After a few generations have passed, the population will begin to converge toward where the best solutions were found in the previous generations. This is because less fit solutions are removed during the selection process, making way for new, fitter solutions to be created during crossover and mutation (Figure 1-9).

Figure 1-9. The fitness landscape after several generations

The mutation operator also plays a role in evading local optimums. Mutation allows a solution to jump from its current position to another position in the search space, which can lead to the discovery of fitter solutions in more optimal areas.

Parameters

Although all genetic algorithms are based on the same concepts, their specific implementations can vary quite a bit. One way specific implementations vary is in their parameters. A basic genetic algorithm has at least a few parameters that need to be considered during implementation: the main three are the mutation rate, the population size, and the crossover rate.

Mutation Rate

The mutation rate is the probability that a specific gene in a solution's chromosome will be mutated. There is technically no correct value for the mutation rate of a genetic algorithm, but some mutation rates provide vastly better results than others. A higher mutation rate allows for more genetic diversity in the population and can help the algorithm avoid local optimums. However, a mutation rate that's too high causes too much genetic variation between generations, causing the algorithm to lose good solutions it found in the previous population.

If the mutation rate is too low, the algorithm can take an unreasonably long time to move across the search space, hindering its ability to find a satisfactory solution. A mutation rate that's too high can also prolong the time it takes to find an acceptable solution: although a high mutation rate can help the genetic algorithm avoid getting stuck in local optimums, when it's set too high it has a negative impact on the search. This, as noted above, is because the solutions in each generation are mutated to such a large extent that they're practically randomized after mutation has been applied.

To understand why a well-configured mutation rate is important, consider two binary-encoded candidate solutions, "100" and "101". Without mutation, new solutions can only come from crossover. However, when we cross over these solutions, there are only two possible outcomes for the offspring: "100" or "101". This is because the only difference between the parents' genomes is their last bit: if the offspring receives its last bit from the first parent, that bit will be a "0"; if from the second, it will be a "1". If the algorithm needs to find an alternative solution, it must mutate an existing solution, giving it new genetic information that isn't available elsewhere in the gene pool.

The mutation rate should be set to a value that allows for enough diversity to prevent the algorithm from plateauing, but not so much that it causes the algorithm to lose valuable genetic information from the previous population. This balance depends on the nature of the problem being solved.
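
In code, the mutation rate is typically applied per gene: each gene in the chromosome is mutated independently with the given probability. A minimal sketch, again reusing the illustrative Individual class:

// Apply bit-flip mutation gene by gene. Each gene mutates independently
// with probability mutationRate, so a rate of 0.01 flips roughly one gene
// per hundred on average.
public static void mutate(Individual individual, double mutationRate, Random random) {
    for (int i = 0; i < individual.size(); i++) {
        if (random.nextDouble() < mutationRate) {
            individual.setGene(i, 1 - individual.getGene(i));
        }
    }
}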

Population Size

The population size is simply the number of individuals in the genetic algorithm's population in any one generation. The larger the population, the more of the search space the algorithm can sample, which helps lead it in the direction of more accurate, globally optimal solutions. A small population size will often result in the algorithm finding less desirable solutions in locally optimal areas of the search space, but it requires fewer computational resources per generation.

Here again, as with the mutation rate, a balance needs to be found for optimal performance of the genetic algorithm, and likewise, the population size required will change depending on the nature of the problem being solved. Large, hilly search spaces commonly require a larger population size to find the best solutions. Interestingly, when picking a population size, there is a point at which increasing the size ceases to provide much improvement in the accuracy of the solutions found, and instead simply slows execution down due to the extra computation needed to process the additional individuals. A population size around this transition usually provides the best balance between resources and results.
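
In code, the population size usually appears as nothing more than the number of random individuals created at initialization and maintained in each generation. A minimal sketch, reusing the illustrative Individual class:

// Create a random initial population. A larger populationSize samples more
// of the search space per generation, at a higher computational cost.
public static Individual[] initPopulation(int populationSize, int chromosomeLength, Random random) {
    Individual[] population = new Individual[populationSize];
    for (int i = 0; i < populationSize; i++) {
        population[i] = new Individual(chromosomeLength, random);
    }
    return population;
}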

Crossover Rate

The frequency with which crossover is applied also affects the overall performance of the genetic algorithm. Changing the crossover rate adjusts the chance that solutions in the population will have the crossover operator applied to them. A high rate allows many new, potentially superior, solutions to be found during the crossover phase. A lower rate helps keep the genetic information from fitter individuals intact for the next generation. The crossover rate should usually be set reasonably high, promoting the search for new solutions while allowing a small percentage of the population to be carried unaffected into the next generation.
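
In code, the crossover rate is usually just a probability check that decides whether a selected parent is recombined or carried through unchanged. A minimal sketch, reusing the illustrative crossover method from earlier:

// Apply crossover with probability crossoverRate; otherwise pass the first
// parent into the next generation unchanged, keeping its genes intact.
public static Individual maybeCrossover(Individual parent1, Individual parent2,
        double crossoverRate, Random random) {
    if (random.nextDouble() < crossoverRate) {
        return crossover(parent1, parent2, random);
    }
    return parent1;
}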

Genetic Representations

Aside from the parameters, another component that can affect a genetic algorithm's performance is the genetic representation used. This is the way the genetic information is encoded within the chromosomes. Better representations encode the solution in a way that is expressive while also being easily evolvable. Holland's (1975) genetic algorithm was based on a binary genetic representation: he proposed using chromosomes composed of strings of 0s and 1s. This binary representation is probably the simplest encoding available; however, for many problems it isn't quite expressive enough to be a suitable first choice. Consider an example in which a binary representation is used to encode an integer being optimized for use in some function. Here, "000" represents 0 and "111" represents 7, as they typically would in binary. If the first gene in the chromosome is mutated, by flipping the bit from 0 to 1 or from 1 to 0, the encoded value changes by 4 ("111" = 7, "011" = 3). However, if the final gene in the chromosome is changed, the encoded value only changes by 1 ("111" = 7, "110" = 6). The mutation operator thus has a different effect on the candidate solution depending on which gene in the chromosome is being operated on. This disparity isn't ideal, as it reduces the performance and predictability of the algorithm. For this example, it would have been better to use an integer representation with a complementary mutation operator that adds or subtracts a relatively small amount to the gene's value.
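
A small sketch contrasting the two mutation behaviors described above (the values and the step size of 1 are illustrative):

import java.util.Random;

public class RepresentationExample {
    public static void main(String[] args) {
        Random random = new Random();

        // Binary representation: a flipped bit changes the encoded value
        // by a different amount depending on its position.
        int encoded = 0b111;                  // 7
        System.out.println(encoded ^ 0b100);  // first bit flipped: 3 (a change of 4)
        System.out.println(encoded ^ 0b001);  // last bit flipped:  6 (a change of 1)

        // Integer representation with a complementary mutation operator:
        // add or subtract a small amount, so every mutation has a
        // comparably small, predictable effect.
        int gene = 7;
        int mutated = gene + (random.nextBoolean() ? 1 : -1);
        System.out.println(mutated);          // 6 or 8
    }
}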

Aside from simple binary representations and integers, genetic algorithms can use floating-point numbers, tree-based representations, objects, and any other data structure required for their genetic encoding. Picking the right representation is key to building an effective genetic algorithm.

Termination

Genetic algorithms can continue to evolve new candidate solutions for as long as necessary. Depending on the nature of the problem, a genetic algorithm could run for anywhere from a few seconds to many years! The condition under which a genetic algorithm finishes its search is called its termination condition.

Some typical termination conditions would be:

  • A maximum number of generations is reached
  • Its allocated time limit has been exceeded
  • A solution has been found that meets the required criteria
  • The algorithm has reached a plateau

Occasionally it might be preferable to implement multiple termination conditions. For example, it can be convenient to set a maximum time limit with the possibility of terminating earlier if an adequate solution is found.
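
A minimal sketch of such a combined check in Java (the names and thresholds are illustrative):

// Terminate when any condition is met: a generation cap, a time budget,
// or a best solution that reaches the target fitness.
public static boolean shouldTerminate(int generation, int maxGenerations,
        long startTimeMillis, long timeLimitMillis,
        double bestFitness, double targetFitness) {
    return generation >= maxGenerations
            || System.currentTimeMillis() - startTimeMillis >= timeLimitMillis
            || bestFitness >= targetFitness;
}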

The Search Process

To finish the chapter, let's take a step-by-step look at the basic process behind a genetic algorithm, illustrated in Figure 1-10 and sketched in code after the steps.

Figure 1-10. A general genetic algorithm process

  1. Genetic algorithms begin by initializing a population of candidate solutions. This is typically done randomly to provide an even coverage of the entire search space.
  2. Next, the population is evaluated by assigning a fitness value to each individual in the population. In this stage we would often want to take note of the current fittest solution, and the average fitness of the population.
  3. After evaluation, the algorithm decides whether it should terminate the search depending on the termination conditions set. Usually this will be because the algorithm has reached a fixed number of generations or an adequate solution has been found.
  4. If the termination condition is not met, the population goes through a selection stage in which individuals from the population are selected based on their fitness score – the higher the fitness, the better chance an individual has of being selected.
  5. The next stage is to apply crossover and mutation to the selected individuals. This stage is where new individuals are created for the next generation.
  6. At this point the new population goes back to the evaluation step and the process starts again. We call each cycle of this loop a generation.
  7. When the termination condition is finally met, the algorithm breaks out of the loop and typically returns its final search results to the user.
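
To tie the steps together, the following is a complete, minimal genetic algorithm in Java for the OneMax toy problem (maximize the number of 1s in a binary chromosome). Every parameter value, and the use of tournament selection, is an illustrative choice rather than a requirement; the numbered comments correspond to the steps above.

import java.util.Random;

public class SimpleGeneticAlgorithm {
    static final int POPULATION_SIZE = 50;
    static final int CHROMOSOME_LENGTH = 32;
    static final double MUTATION_RATE = 0.01;
    static final double CROSSOVER_RATE = 0.9;
    static final int MAX_GENERATIONS = 200;
    static final Random RANDOM = new Random();

    // Step 2. Evaluation: fitness is simply the number of 1s (OneMax).
    static int fitness(int[] chromosome) {
        int sum = 0;
        for (int gene : chromosome) sum += gene;
        return sum;
    }

    // Step 4. Selection: pick the fitter of two random individuals
    // (a tournament of size two; fitter individuals win more often).
    static int[] selectParent(int[][] population) {
        int[] a = population[RANDOM.nextInt(population.length)];
        int[] b = population[RANDOM.nextInt(population.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    public static void main(String[] args) {
        // Step 1. Initialization: a population of random chromosomes.
        int[][] population = new int[POPULATION_SIZE][CHROMOSOME_LENGTH];
        for (int[] chromosome : population)
            for (int i = 0; i < CHROMOSOME_LENGTH; i++)
                chromosome[i] = RANDOM.nextInt(2);

        // Steps 3 and 6. Loop over generations until termination; here the
        // termination condition is simply a maximum number of generations.
        for (int generation = 0; generation < MAX_GENERATIONS; generation++) {
            int[][] nextGeneration = new int[POPULATION_SIZE][];
            for (int i = 0; i < POPULATION_SIZE; i++) {
                int[] parent1 = selectParent(population);
                int[] parent2 = selectParent(population);

                // Step 5. Crossover (single point), gated by the crossover rate.
                int[] offspring;
                if (RANDOM.nextDouble() < CROSSOVER_RATE) {
                    offspring = new int[CHROMOSOME_LENGTH];
                    int point = RANDOM.nextInt(CHROMOSOME_LENGTH);
                    for (int g = 0; g < CHROMOSOME_LENGTH; g++)
                        offspring[g] = g < point ? parent1[g] : parent2[g];
                } else {
                    offspring = parent1.clone(); // carried through unchanged
                }

                // Step 5. Mutation: flip each gene with probability MUTATION_RATE.
                for (int g = 0; g < CHROMOSOME_LENGTH; g++)
                    if (RANDOM.nextDouble() < MUTATION_RATE)
                        offspring[g] = 1 - offspring[g];

                nextGeneration[i] = offspring;
            }
            population = nextGeneration;
        }

        // Step 7. Return the fittest individual from the final population.
        int[] best = population[0];
        for (int[] candidate : population)
            if (fitness(candidate) > fitness(best)) best = candidate;
        System.out.println("Best fitness: " + fitness(best) + " out of " + CHROMOSOME_LENGTH);
    }
}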

CITATIONS

Turing, A.M. (1950). "Computing Machinery and Intelligence"

Simon, H.A. (1965). "The Shape of Automation for Men and Management"

Barricelli, N.A. (1957). "Symbiogenetic Evolution Processes Realised by Artificial Methods"

Darwin, C. (1859). "On the Origin of Species"

Dorigo, M. (1992). "Optimization, Learning and Natural Algorithms"

Rechenberg, I. (1965). "Cybernetic Solution Path of an Experimental Problem"

Rechenberg, I. (1973). "Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution"

Schwefel, H.-P. (1975). "Evolutionsstrategie und numerische Optimierung"

Schwefel, H.-P. (1977). "Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie"

Fogel, L.J.; Owens, A.J.; and Walsh, M.J. (1966). "Artificial Intelligence through Simulated Evolution"

Holland, J.H. (1975). "Adaptation in Natural and Artificial Systems"

Glover, F. (1989). "Tabu Search. Part I"

Glover, F. (1990). "Tabu Search. Part II"

Kirkpatrick, S.; Gelatt, C.D., Jr.; and Vecchi, M.P. (1983). "Optimization by Simulated Annealing"
