Traveling Salesman
Chapter 4

Introduction

In this chapter, we are going to explore the traveling salesman problem and how it can be solved using a genetic algorithm. In doing so, we will be looking at the properties of the traveling salesman problem and how we can use those properties to design the genetic algorithm.

The traveling salesman problem (TSP) is a classic optimization problem, studied as far back as the 1800s. The traveling salesman problem involves finding the most efficient route through a collection of cities, visiting each city exactly once.

The traveling salesman problem is often described in terms of optimizing a route through a collection of cities; however, the traveling salesman problem can be applied to other applications. For example, the notion of cities can be thought of as customers for certain applications, or even as soldering points on a microchip. The idea of distance can also be revised to consider other constraints such as time.

In its simplest form, cities can be represented as nodes on a graph, with the distance between each city represented by the length of the edges (see Figure 4-1). A “route” or a “tour”, simply defines which edges should be used, and in what order. The route’s score can then be calculated by summing the edges used in the route.

9781484203293_Fig04-01.jpg

Figure 4-1. Our graph showing the cities and the respective distances between them

During the 1900s, the traveling salesman problem was studied by many mathematicians and scientists; however to this day the problem remains unsolved. The only guaranteed method to produce an optimal solution for the traveling salesman problem is by using a brute force algorithm. A brute force algorithm is an algorithm designed to systematically try every possible solution. Then you find the optimal solution from the complete set of candidate solutions. Attempting solving the traveling salesman problem with a brute force algorithm is an extremely hard task because the number of potential solutions experiences factorial growth as the number of cities is increased. Factorial functions grow even faster than exponential functions, which is why it’s so hard to brute force the traveling salesman problem. For example, for 5 cities, there are 120 possible solutions (1x2x3x4x5), and for 10 cities that number will have increased to 3,628,800 solutions! By 15 cities, there are over a trillion solutions. At 60 cities, there are more possible solutions than there are atoms in the observable universe.

Brute force algorithms can be used to find optimal solutions when there are only a few cities but they become more and more challenging as the number of cities increase. Even when techniques are applied to remove reverse and identical routes, it still becomes quickly infeasible to find an optimal solution in a reasonable amount of time.

In reality, we know that finding an optimal solution usually isn’t necessary because a good enough solution will typically be all that’s required. There are a number of different algorithms that can quickly find solutions that are likely to be within only a couple of percent of optimal. One of the more commonly used algorithms is the nearest neighbor algorithm. With this algorithm, a starting city is picked at random. Then, the next nearest unvisited city is found and picked as the second city in the route. This process of picking the next nearest the unvisited city is continued until all cities have been visited and a complete route has been found. The nearest neighbor algorithm has been shown to be surprisingly effective at producing reasonable solutions that score within a fraction the optimal solution. Better still, this can be done in a very short time frame. These characteristics make it an attractive solution in many situations, and a possible alternative to genetic algorithms.

The Problem

The problem we will be tackling in this implementation is a typical traveling salesman problem in which we need to optimize a route through a collection of cities. We can generate a number of random cities in 2D space by setting each city to a random x, y position.

When finding the distance between two cities we will simply use the shortest length between the cities as the distance. We can calculate this distance with the following equation:

Equa.jpg

Often, the problem will be more complex than this. In this example we are assuming a direct ideal path exists between each city; this is also known as the “Euclidean distance”. This usually isn’t going to be a typical case as there could be various obstacles making the actual shortest path much longer than the Euclidean distance. We are also assuming that traveling from City-A to City-B takes just as long as traveling from City-B to City-A. Again, in reality this is rarely the case. Often there will be obstacles such as one-way roads that will affect the distance between cities while traveling in a certain direction. An implementation of the traveling salesman problem where the distance between the cities changes depending on the direction is called an asymmetric traveling salesman problem.

Implementation

It is time to go ahead and tackle the problem using our knowledge of genetic algorithms. After setting up a new Java/Eclipse package for this problem, we’ll begin by encode the route.

Before You Start

This chapter will build on the code you developed in Chapter 3. Before you start, create a new Eclipse or NetBeans project, or create a new package in your existing project for this book called “chapter4”.

Copy the Individual, Population, and GeneticAlgorithm classes over from Chapter 3 and import them into chapter4. Make sure to update the package name at the top of each class file! They should all say “package chapter4” at the very top.

Open the GeneticAlgorithm class and delete the following methods: calcFitness, evalPopulation, crossoverPopulation, and mutatePopulation. You’ll rewrite those methods in the course of this chapter.

Next, open the Individual class and delete the constructor with the signature “public Individual(int chromosomeLength)”. There are two constructors in the Individual class, so be careful to delete the correct one! The constructor to delete is the one that randomly initializes the chromosome; you’ll rewrite that as well in this chapter.

The Population class from Chapter 3 requires no modifications other than the package name at the top of the file.

Encoding

The encoding we choose in this example will need to be able to encode a list of cities in order. We can do this by assigning each city a unique ID then referencing it using a chromosome in the order of the candidate route. This type of encoding where sequences of genes are used is known as permutation encoding and is very well suited to the traveling salesman problem.

The first thing we need to do is assign our cities unique IDs. If we have 5 cities to visit, we can simply assign them the IDs: 1,2,3,4,5. Then when our genetic algorithm has found a route, our chromosome may order the city IDs to look like the following: 3,4,1,2,5. This simply means we would start at city 3, then travel to city 4, then city 1, then city 2, then city 5, before returning back to city 3 to complete the route.

Initialization

Before we start optimizing a route, we need to create some cities. As mentioned earlier, we can generate random cities by picking random x,y coordinates and using them to define a city location.

First, we need to create a City class, which can create and store a city as well as calculate the shortest distance to another city.

package chapter4;
public class City {
      private int x;
      private int y;

      public City(int x, int y) {
            this.x = x;
            this.y = y;
      }

      public double distanceFrom(City city) {
            // Give difference in x,y
            double deltaXSq = Math.pow((city.getX() - this.getX()), 2);
            double deltaYSq = Math.pow((city.getY() - this.getY()), 2);

            // Calculate shortest path
            double distance = Math.sqrt(Math.abs(deltaXSq + deltaYSq));
            return distance;
      }

      public int getX() {
            return this.x;
      }

      public int getY() {
            return this.y;
      }
}

The City class has a constructor that takes an x and y coordinate to create a city on a 2D plane. The class also contains a distanceFrom method that calculates the straight-line distance from the current city to another city using the Pythagorean Theorem. Finally, there are two getter methods which can be used to retrieve the city’s x and y position.

Next, we should reinstate the Individual class constructor we deleted in the “Before You Start” section. The traveling salesman problem has different constraints on the chromosome from those of our last two problems. Recall that the only constraints in the robot controller problem were that the chromosome must be 128 bits long, and that it must be binary.

Unfortunately, this is not the case with the traveling salesman problem; the constraints are more involved and dictate the initialization, crossover, and mutation techniques we can use. In this case, the chromosome must be of a certain length (however long the city tour is), but an additional constraint is that each city must be visited once and only once, otherwise the chromosome is invalid. There can be no duplicate genes in the chromosome, and there can be no omitted cities in the chromosome.

We can easily create a naïve Individual constructor without any randomness. Simply create a chromosome with each city’s index in it: 1, 2, 3, 4, 5, 6…, etc. Randomizing the initial chromosome is an exercise for the reader at the end of the chapter.

Add the following constructor to the Individual class. You can place this anywhere you like, but near the top is a good location for constructors. As always, comments and docblocks have been omitted here, but please see the provided Eclipse project accompanying this book for additional commentary.

public Individual(int chromosomeLength) {
      // Create random individual
      int[] individual;
      individual = new int[chromosomeLength];

      for (int gene = 0; gene < chromosomeLength; gene++) {
            individual[gene] = gene;
      }

      this.chromosome = individual;
}

At this point, we can create our executive class and its “main” method. Create a new Java class called “TSP” in package “chapter4” by using the File image New image Class menu item. As in Chapter 3, we’ll stub out the genetic algorithm pseudocode with a number of TODOs so we can mark our progress through the implementation.

Let’s also take this opportunity to initialize an array of 100 randomly generated City objects at the top of the “main” method. Simply generate random x and y coordinates and pass them to the City constructor. Make sure your TSP class looks like the following:

package chapter4;
public class TSP {
      public static int maxGenerations = 3000;
      public static void main(String[] args) {

            int numCities = 100;
            City cities[] = new City[numCities];

            // Loop to create random cities
            for (int cityIndex = 0; cityIndex < numCities; cityIndex++) {
                  int xPos = (int) (100 * Math.random());
                  int yPos = (int) (100 * Math.random());
                  cities[cityIndex] = new City(xPos, yPos);
            }

            // Initial GA
            GeneticAlgorithm ga = new GeneticAlgorithm(100, 0.001, 0.9, 2, 5);

            // Initialize population
            Population population = ga.initPopulation(cities.length);

            // TODO: Evaluate population

            // Keep track of current generation
            int generation = 1;

            // Start evolution loop
            while (ga.isTerminationConditionMet(generation, maxGenerations) == false) {
                  // TODO: Print fittest individual from population

                  // TODO: Apply crossover

                  // TODO: Apply mutation

                  // TODO: Evaluate population

                  // Increment the current generation
                  generation++;
            }

            // TODO: Display results
      }
}

Hopefully, this procedure is becoming familiar; we’re once again beginning to implement the pseudocode presented at the beginning of Chapter 2. We’ve also generated an array of City objects that we’ll use in our evaluation methods, much like how we generated a Maze object to evaluate individuals against in the last chapter.

The rest is rote: initialize a GeneticAlgorithm object (including population size, mutation rate, crossover rate, elitism count, and tournament size), then initialize a population. The individuals’ chromosome length must be the same as the number of cities we wish to visit.

We get to reuse the simple “max generations” termination condition from the previous chapter, so we’re left with only six TODOs and a working loop this time. Let’s begin, as usual, with the evaluation and fitness scoring methods.

Evaluation

Now we need to evaluate the population and assign fitness values to the individuals so we know which perform the best. The first step is to define the fitness function for the problem. Here, we simply need to calculate the total distance of the route given by the individual’s chromosome.

First, we need to create a new class that can store a route and calculate its total distance. Create a new class called “Route” in package “chapter4” and insert the following code:

package chapter4;

public class Route {
      private City route[];
      private double distance = 0;

      public Route(Individual individual, City cities[]) {
            // Get individual’s chromosome
            int chromosome[] = individual.getChromosome();
            // Create route
            this.route = new City[cities.length];
            for (int geneIndex = 0; geneIndex < chromosome.length; geneIndex++) {
                  this.route[geneIndex] = cities[chromosome[geneIndex]];
            }
      }

      public double getDistance() {
            if (this.distance > 0) {
                  return this.distance;
            }

            // Loop over cities in route and calculate route distance
            double totalDistance = 0;
            for (int cityIndex = 0; cityIndex + 1 < this.route.length; cityIndex++) {
                  totalDistance += this.route[cityIndex].distanceFrom(this.route[cityIndex + 1]);
            }

            totalDistance += this.route[this.route.length - 1].distanceFrom(this.route[0]);
            this.distance = totalDistance;

            return totalDistance;
      }
}

This class contains only a constructor and a single method to calculate the total route distance. The constructor accepts an Individual and a list of City definitions (the same City array we created in the TSP class’ “main” function). The constructor then builds an array of City objects in the order of the Individual’s chromosome; this data structure makes it simple to evaluate the total route distance in the getDistance method.

The getDistance method loops through the route array (an ordered array of City objects) and calls the City class’ “distanceFrom” method to calculate the distance between each two cities in turn, summing as it goes.

To implement this fitness scoring method, we need to update the calcFitness function in the GeneticAlgorithm class. The calcFitness class should delegate the distance calculation to the Route class and, in order to do so, it needs to accept our City definition array and pass it to the Route class.

Add the following method to the GeneticAlgorithm class, anywhere in the file.

public double calcFitness(Individual individual, City cities[]){
    // Get fitness
    Route route = new Route(individual, cities);
    double fitness = 1 / route.getDistance();

    // Store fitness
    individual.setFitness(fitness);

    return fitness;
}

In this function, the fitness is calculated by dividing 1 by the total route distance – a shorter distance therefore has a higher score. After the fitness has been calculated, it is stored for quick recall in case it is needed again.

Now we can update our evalPopulation method in the GeneticAlgorithm class to accept the cities parameter and find the fitness for every individual in the population.

public void evalPopulation(Population population, City cities[]){
    double populationFitness = 0;

    // Loop over population evaluating individuals and summing population fitness
    for (Individual individual : population.getIndividuals()) {
        populationFitness += this.calcFitness(individual, cities);
    }

    double avgFitness = populationFitness / population.size();
    population.setPopulationFitness(avgFitness);
}

As usual, this function loops over the population and each individual’s fitness is calculated. Unlike previous implementations, we’re calculating an average population fitness instead of a total population fitness. (Since we’re using tournament selection rather than roulette wheel selection, we don’t actually need the population’s fitness; nothing would change if we simply didn’t record this value.)

At this point, we can now resolve four of our TODOs in the TSP class’ “main” method related to evaluation and displaying results. Update the TSP class to represent the following. The four TODOs resolved were the two “Evaluate population” lines (before the loop and inside the loop), the “Print fittest individual from population” line at the top of the loop, and the “Display results” line after the loop.

package chapter4;
public class TSP {
      public static int maxGenerations = 3000;
      public static void main(String[] args) {

            int numCities = 100;
            City cities[] = new City[numCities];

            // Loop to create random cities
            for (int cityIndex = 0; cityIndex < numCities; cityIndex++) {
                  int xPos = (int) (100 * Math.random());
                  int yPos = (int) (100 * Math.random());
                  cities[cityIndex] = new City(xPos, yPos);
            }

            // Initial GA
            GeneticAlgorithm ga = new GeneticAlgorithm(100, 0.001, 0.9, 2, 5);

            // Initialize population
            Population population = ga.initPopulation(cities.length);

            // Evaluate population
            ga.evalPopulation(population, cities);

            // Keep track of current generation
            int generation = 1;

            // Start evolution loop
            while (ga.isTerminationConditionMet(generation, maxGenerations) == false) {
                  // Print fittest individual from population
                  Route route = new Route(population.getFittest(0), cities);
                  System.out.println("G"+generation+" Best distance: " + route.getDistance());

                  // TODO: Apply crossover

                  // TODO: Apply mutation

                  // Evaluate population
                  ga.evalPopulation(population, cities);

                  // Increment the current generation
                  generation++;
            }

            // Display results
            System.out.println("Stopped after " + maxGenerations + " generations.");
            Route route = new Route(population.getFittest(0), cities);
            System.out.println("Best distance: " + route.getDistance());
      }
}

At this point, we can click “Run” and the loop will go through the motions, printing the same thing 3,000 times but showing no change. This, of course, is expected; we need to implement crossover and mutation as our two remaining TODOs.

Termination Check

As we have already learned, there is no way to know if we have found an optimal solution to the traveling salesman problem unless we try every possible solution. This means the termination check we use in this implementation cannot terminate when the optimal solution has been found, because it simply has no way of knowing.

Seeing as we have no way to terminate when the optimal solution has been found, we can simply allow the algorithm to run for a set number of generations before finally terminating, and therefore we are able to reuse the isTerminationConditionMet method in the GeneticAlgorithm class from Chapter 3.

Note, however, that in situations like these—where the best solution can not be known—that there are many sophisticated techniques for termination beyond simply setting an upper bound on the number of generations.

One common technique is to measure the improvement of the population’s fitness over time. If the population is still improving rapidly, you may want to allow the algorithm to continue on. Once the population stops improving, you can end the evolution and present the best solution.

You may never find the globally optimum solution in a complex solution space like the traveling salesman problem’s, but there are many strong local optima, and a plateau in progress typically indicates that you’ve found one of these local optima.

There are several ways you can measure the progress of a genetic algorithm over time. The simplest method is to measure the number of consecutive generations where there has been no improvement in the best individual. If the number of generations with no improvement has crossed some threshold, for instance 500 generations with no improvement, you can stop the algorithm.

One drawback of this simple approach with large solution spaces is that you may see constant improvement in the population’s fitness – it just may be painfully slow! There are so many combinations that it’s feasible to get a one-point improvement every dozen generations or so, and you’ll never actually encounter 500 consecutive generations with no improvement. You could, of course, set a maximum upper bound on generations irrespective of improvement. You could also implement a more sophisticated technique, such as taking the moving average of various windows and comparing them against each other. If the fitness improvement has been trending downwards for several windows, stop the algorithm.

In our case, however, we’ll stick with the naïve approach from Chapter 3, and leave it up to the reader to implement a better termination condition as an exercise at the end of the chapter.

Crossover

With the traveling salesman problem, both the genes and the order of the genes in the chromosome are very important. In fact, for the traveling salesman problem we shouldn’t ever have more than one copy of a specific gene in our chromosome. This is because it would create an invalid solution because a city shouldn’t be visited more than once on a given route. Consider a case where we have three cities: City A, City B and City C. A route of A,B,C is valid; however, a route of C,B,C is not: this route visits City C twice, and also never visits City A. Because of this, it’s essential that we find and apply a crossover method that produces valid results for our problem.

We also need to be respectful of the ordering of the parent’s chromosomes during the crossover process. This is because the order of the chromosome affects the solution’s fitness. In fact, it’s only the order that matters. To understand better why this is the case, consider how these following two routes are completely different even though they contain the exact same genes:

Route 1: A,B,C,D,E
Route 2: C,A,D,B,E

We previously looked at uniform crossover; however, the uniform crossover method works on the level of individual genes and doesn’t respect the order of the chromosome. Single-point and two-point crossover methods do a better job as they deal with chunks of the chromosome, which will preserve order within those chunks. The problem with single-point and two-point crossover though, is that they aren’t careful about which genes are being added and removed from the chromosome. This means we are likely to end up with invalid solutions with chromosomes that contain more than one reference for the same city or have cities missing altogether.

A crossover method, which addresses both these problems, is ordered crossover. In this crossover method, a subset of the first parent’s chromosome is selected. That subset is then added to the child chromosome in the same position.

9781484203293_unFig04-01.jpg

The next step is to add the second parent’s genetic information to the offspring’s chromosome. We do this by starting from the end position of the selected subset, and then include each gene from parent 2, which isn’t already in the offspring’s chromosome.

In this example, we would start from the gene “2” checking if it can be found in the offspring’s chromosome. Because 2 isn’t currently in the offspring’s chromosome, we can add it to the first available position in the offspring’s chromosome. Then, because we reached the end of parent 2’s chromosome, we go back to the first gene, “5”. This time the 5 is in the offspring’s chromosome so we skip it and move onto the 1. We keep doing this until we end up with the following:

9781484203293_unFig04-02.jpg

This method of crossover retains a lot of the order from the parents, but also ensures solutions remain valid for problems such as the traveling salesman problem.

There’s one aspect of this algorithm we can’t pull off with our current Individual class: this technique requires checking the offspring’s chromosome for the existence of a specific gene. In previous chapters, we didn’t have specific genes – we had binary chromosomes – so there was no need to implement a method that checks for the existence of a gene in a chromosome.

Fortunately, this is an easy method to add. Open the Individual class and add a method called “containsGene” anywhere in the file:

public boolean containsGene(int gene) {
      for (int i = 0; i < this.chromosome.length; i++) {
            if (this.chromosome[i] == gene) {
                  return true;
            }
      }
      return false;
}

This method looks at each gene in the chromosome, and if it finds the gene it’s looking for it will return true; otherwise it returns false. The usage of this method addresses the question: “Does this solution visit City #5? Let’s call individual.containsGene(5) to find out.”

We’re now ready to apply our ordered crossover method to our genetic algorithm by updating the GeneticAlgorithm class. Like in the previous chapter, we can implement tournament selection as our selection method used for crossover, but we haven’t modified the selectParent method from the last chapter.

Add this crossoverPopulation method to the GeneticAlgorithm class:

public Population crossoverPopulation(Population population){
    // Create new population
    Population newPopulation = new Population(population.size());

    // Loop over current population by fitness
    for (int populationIndex = 0; populationIndex < population.size(); populationIndex++) {
        // Get parent1
        Individual parent1 = population.getFittest(populationIndex);

        // Apply crossover to this individual?
        if (this.crossoverRate > Math.random() && populationIndex >= this.elitismCount) {
            // Find parent2 with tournament selection
            Individual parent2 = this.selectParent(population);

        // Create blank offspring chromosome
            int offspringChromosome[] = new int[parent1.getChromosomeLength()];
            Arrays.fill(offspringChromosome, -1);
            Individual offspring = new Individual(offspringChromosome);

            // Get subset of parent chromosomes
            int substrPos1 = (int) (Math.random() * parent1.getChromosomeLength());
            int substrPos2 = (int) (Math.random() * parent1.getChromosomeLength());

            // make the smaller the start and the larger the end
            final int startSubstr = Math.min(substrPos1, substrPos2);
            final int endSubstr = Math.max(substrPos1, substrPos2);

            // Loop and add the sub tour from parent1 to our child
            for (int i = startSubstr; i < endSubstr; i++) {
                offspring.setGene(i, parent1.getGene(i));
            }

            // Loop through parent2's city tour
            for (int i = 0; i < parent2.getChromosomeLength(); i++) {
                int parent2Gene = i + endSubstr;
                if (parent2Gene >= parent2.getChromosomeLength()) {
                    parent2Gene -= parent2.getChromosomeLength();
                }

                // If offspring doesn’t have the city add it
                if (offspring.containsGene(parent2.getGene(parent2Gene)) == false) {
                    // Loop to find a spare position in the child’s tour
                    for (int ii = 0; ii < offspring.getChromosomeLength(); ii++) {
                        // Spare position found, add city
                        if (offspring.getGene(ii) == -1) {
                            offspring.setGene(ii, parent2.getGene(parent2Gene));
                            break;
                        }
                    }
                }
            }

            // Add child
            newPopulation.setIndividual(populationIndex, offspring);
        } else {
            // Add individual to new population without applying crossover
            newPopulation.setIndividual(populationIndex, parent1);
        }
    }

    return newPopulation;
}

In this method, we first create a new population to hold the offspring. Then, the current population is looped over in the order of the fittest individual first. If elitism is enabled, the first few elite individuals are skipped over and added to the new population unaltered. The remaining individuals are then considered for crossover using the crossover rate. If crossover is to be applied to the individual, a parent is selected using the selectParent method (in this case, selectParent implements tournament selection as in Chapter 3) and a new blank individual is created.

Next, two random positions in the parent1’s chromosome are picked and the subset of genetic information between those positions are added to the offspring’s chromosome. Finally, the remaining genetic information needed is added in the order found in parent2; then when complete, the individual is added into the new population.

We can now implement our crossoverPopulation method into our “main” method in the TSP class and resolve one of our TODOs. Find “TODO: Apply crossover” and replace it with:

// Apply crossover
population = ga.crossoverPopulation(population);

Clicking “Run” at this point should result in a working algorithm! After 3,000 generations, you should expect to see a best distance of approximately 1,500. However, as you may recall, crossover alone is prone to getting stuck in local optima, and you may find that the algorithm plateaus. Mutation is our way of randomly dropping candidates in new locations on our solution space, and can help improve long-term results at the cost of short-term gain.

Mutation

Like with crossover, the type of mutation we use for the traveling salesman problem is important because again, we need to ensure the chromosome is valid after it has been applied. A method in which we randomly change a single value of a gene would likely cause repeats in the chromosome and as a result, the chromosome would be invalid.

An easy solution to this is called swap mutation, which is an algorithm that will simply swap the genetic information at two points. Swap mutation works by looping though the genes in the individual’s chromosome with each gene being considered for mutation determined by the mutation rate. If a gene is selected for mutation, another random gene in the chromosome is picked and then their positions are swapped.

9781484203293_unFig04-03.jpg

This process ensures no duplicate genes are created and that any resulting offspring will be valid solutions.

To implement this mutation method, first add the mutatePopulation method to the GeneticAlgorithm class.

public Population mutatePopulation(Population population){
    // Initialize new population
    Population newPopulation = new Population(this.populationSize);

    // Loop over current population by fitness
    for (int populationIndex = 0; populationIndex < population.size(); populationIndex++) {
        Individual individual = population.getFittest(populationIndex);

        // Skip mutation if this is an elite individual
        if (populationIndex >= this.elitismCount) {
            // System.out.println("Mutating population member "+populationIndex);
            // Loop over individual’s genes
            for (int geneIndex = 0; geneIndex < individual.getChromosomeLength(); geneIndex++) {
            // Does this gene need mutation?
                if (this.mutationRate > Math.random()) {
                    // Get new gene position
                    int newGenePos = (int) (Math.random() * individual.getChromosomeLength());
                    // Get genes to swap
                    int gene1 = individual.getGene(newGenePos);
                    int gene2 = individual.getGene(geneIndex);
                    // Swap genes
                    individual.setGene(geneIndex, gene1);
                    individual.setGene(newGenePos, gene2);
                }
            }
        }

        // Add individual to population
        newPopulation.setIndividual(populationIndex, individual);
    }

    // Return mutated population
    return newPopulation;
}

The first step of this method is to create a new population to hold the mutated individuals. Next, the population is looped over starting with the fittest individuals. If elitism is enabled, the first few individuals are skipped and added to the new population unaltered. The chromosomes from the remaining individuals are then looped over and each gene is considered for mutation individually depending on the mutation rate. If a gene is to be mutated, another random gene from the individual is picked and the genes are swapped. Finally, the mutated individual is added to the new population.

Now we can add the mutation method to our TSP class’ “main” method and resolve our final TODO. Find the comment that says “TODO: Apply mutation” and replace it with:

// Apply mutation
population = ga.mutatePopulation(population);

Execution

The final code for the TSP class should look like this:

package chapter4;
public class TSP {
      public static int maxGenerations = 3000;
      public static void main(String[] args) {

            // Create cities
            int numCities = 100;
            City cities[] = new City[numCities];

            // Loop to create random cities
            for (int cityIndex = 0; cityIndex < numCities; cityIndex++) {
                  // Generate x,y position
                  int xPos = (int) (100 * Math.random());
                  int yPos = (int) (100 * Math.random());

                  // Add city
                  cities[cityIndex] = new City(xPos, yPos);
            }

            // Initial GA
            GeneticAlgorithm ga = new GeneticAlgorithm(100, 0.001, 0.9, 2, 5);

            // Initialize population
            Population population = ga.initPopulation(cities.length);

            // Evaluate population
            //ga.evalPopulation(population, cities);

            Route startRoute = new Route(population.getFittest(0), cities);
            System.out.println("Start Distance: " + startRoute.getDistance());

            // Keep track of current generation
            int generation = 1;
            // Start evolution loop
            while (ga.isTerminationConditionMet(generation, maxGenerations) == false) {
                  // Print fittest individual from population
                  Route route = new Route(population.getFittest(0), cities);
                  System.out.println("G"+generation+" Best distance: " + route.getDistance());

                  // Apply crossover
                  population = ga.crossoverPopulation(population);

                  // Apply mutation
                  population = ga.mutatePopulation(population);

                  // Evaluate population
                  ga.evalPopulation(population, cities);

                  // Increment the current generation
                  generation++;
            }

            System.out.println("Stopped after " + maxGenerations + " generations.");
            Route route = new Route(population.getFittest(0), cities);
            System.out.println("Best distance: " + route.getDistance());

      }
}

Additionally, check your GeneticAlgorithm class against the following properties and method signatures. If you missed implementing one of these methods, or one of your method signatures doesn’t match, please go back and resolve the issue now.

package chapter4;
import java.util.Arrays;

public class GeneticAlgorithm {

      private int populationSize;
      private double mutationRate;
      private double crossoverRate;
      private int elitismCount;
      protected int tournamentSize;

      public GeneticAlgorithm(int populationSize, double mutationRate, double crossoverRate, int elitismCount, int tournamentSize) { }
      public Population initPopulation(int chromosomeLength){ }
      public boolean isTerminationConditionMet(int generationsCount, int maxGenerations) { }
      public double calcFitness(Individual individual, City cities[]) { }
      public void evalPopulation(Population population, City cities[]) { }
      public Individual selectParent(Population population) { }
      public Population crossoverPopulation(Population population) { }
      public Population mutatePopulation(Population population) { }

}

Finally, make sure you’ve updated the Individual class by replacing the constructor and adding the containsGene method.

At this point, click “Run” and observe the output. As usual, remind yourself that genetic algorithms are stochastic processes dictated by statistics, and that you can’t make any conclusions based on just one trial. Unfortunately, the fact that this problem initializes a random set of cities means that each run of the program will have a different optimal solution; this makes experimenting with the genetic algorithm parameters and judging their performance difficult. The first exercise at the end of the chapter is to hard-code a list of cities so that you can accurately benchmark the algorithm’s performance.

However, there are a few interesting observations you can make at this point. Run the algorithm a number of times and observe the best distance. Are they all similar? At least in the same ballpark? This is interesting, because each run uses a different set of cities. But upon reflection, it makes sense: while the cities are in different locations each time, there are still 100 of them every time, and they’re still placed randomly on a 100x100 map, which means that we can estimate pretty easily the problem solution’s total distance.

Consider a 100x100 map – which has an area of 10,000 units – but instead of visiting 100 cities, your goal is to visit 10,000 cities. If the cities are placed evenly on the map (one on each grid point), the best solution should be a distance of exactly 10,100 (visiting every tile on the map in a zigzag). If, instead of evenly distributing the cities, you were to randomly distribute those 10,000 cities, the best solution would be a statistical distribution centering at 10,000, varying slightly from run to run due to the randomness of the placement.

Now we can work backward and consider fewer cities; place only 25 cities evenly on the map and the shortest route is 600 units. The relationship here becomes clear: the distance is related to the square root of the map’s area times the square root of the number of cities. Using this relationship we find that 100 cities placed evenly has a minimum distance of 1,100 (that’s IEq1.jpg; the added square root term at the end accounts for the north-south travel, but we can drop that term when we start speaking statistically). If we place those same 100 cities randomly on our map, we can expect the minimum distance to be a distribution centering around 1,000. Similarly, 1,000 cities should have a best distance near 3,100.

If you play with the number of cities, you’ll find that the algorithm easily confirms these suspicions for smaller numbers, but naturally, it has difficulty finding the minimum for more than 100 cities.

Now that we understand the relationship between the map size, the number of cities, and the expected best distance, we can experiment even without a constant set of cities and use our statistical expectations instead. One area of particular interest is the effect of mutation on the result quality.

If you imagine the solution space as a landscape of lots of rolling hills, a genetic algorithm is like dropping 100 people with different behaviors at random locations on the landscape and seeing which individual finds the lowest valley. (In this case, because our Individual constructor isn’t random, we’re actually dropping all individuals on the same spot.) Through the generations, individuals and their offspring will move downhill and then stop when they’ve found the lowest valley near them. Mutation, however, picks them up and drops them into a new random location – the location may be better or worse than the previous one, but at least it’s a new, unique location that allows them to continue their search in a fresh landscape.

Mutation, however, can often have short-term detriments. Mutations can be either favorable or unfavorable – which is why we use elitism to protect the best individuals from mutation. However, the diversity that mutation introduces can have profound long-term effects, by placing an individual on a landscape that may otherwise not be explored: imagine a tall volcano with a huge chasm in the middle that contains the lowest point in the landscape (the global optimum, surrounded by unfavorable landscape). It’s unlikely that any population would climb the volcano and find the global optimum at the center – unless a random mutation placed an individual on the lip of the volcano.

With that being said, observe the effects of different mutation rates and elitism counts on long-running (tens of thousands of generations) difficult problems (200 cities or more). Does mutation help or hurt? Does elitism help or hurt?

Summary

The traveling salesman problem is a classic optimization problem that asks: what is the shortest possible route between a list of cities, visiting each city once, and returning back to the initial city?

It is an unsolved optimization problem in which an optimal solution can only be found if a brute force algorithm is used. However, because the number of possible solutions to the traveling salesman problem grows so rapidly with each city added, it soon becomes infeasible to brute force solutions even with the most powerful computers. For these cases, heuristic methods are used to find a good approximation instead.

We covered a basic implementation of the traveling salesman problem using cities on a 2D graph and connected them by straight-line distance.

Using an ordered list of city IDs for the chromosomes’ encoding, we were able to represent a solution to the traveling salesman problem. However, because each city ID must appear in the encoding at least once, and only once, we looked at two new crossover and mutation methods that can keep to this constraint: ordered crossover and swap mutation.

In ordered crossover, a random subset of parent1’s chromosome is added to the offspring, then the remaining genetic information needed is added to the offspring in the order it’s found in parent2’s chromosome. This method of adding a subset of parent1’s genetic information, then only adding the missing remaining genetic information from parent2 guarantees each city ID will appear once and only once in the solution.

In swap mutation, two genes are selected and their positions are swapped. Again, this mutation method guarantees a valid solution for the traveling salesman problem, as it doesn’t allow a city ID to be removed completely, nor can it cause a city to appear twice.

Exercises

  1. Hard-code cities into the TSP class “main” method so that you can accurately take benchmarks of performance.
  2. Add support for both, shortest route and quickest route using user defined distances and times between cities.
  3. Add support for an asymmetric TSP (the cost of traveling from A to B may not equal the cost from traveling from B to A).
  4. Modify the Individual class constructor to randomize the order of cities. How does this affect the performance of the algorithm?
  5. Update the termination condition to measure the algorithm’s progress and quit when no significant progress is being made. How does this affect the algorithm’s performance and results?
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset