This post is best viewed using the light theme.

This post uses GA to generate a high-quality solution to the Traveling Salesman Problem.

Traveling Salesman Problem using Genetic Algorithm

This blog post is regarding using a genetic algorithm to solve the Traveling Salesman Problem. In a one-liner, the TSP asks the following question: Given a list of cities and the distances between each pair of the cities, what is the shortest possible route that visits each city and returns to the origin city?"

The conditions in this scenario are that no point can be visited twice and it must return to the starting point. The selected starting point here is New York. (The starting point does not matter in this scenario.). There are times, that a point maybe the revisited more than once to achieve a better solution. The number of cities in this scenario is 13. In this specific implementation, it will never visit the same city twice.

The inspiration for this post is based on the google OR-Tools found here. This blog post; uses Genetic Algorithm to obtain the answer. It is implemented with a web worker which runs in the browser based on JavaScript.

I will also reuse the genetic algorithm implementation written for another blog post however with different fitness functions and different cross-over methodologies.

Location Coordinates Shorthand
New York 40, -74 A
Los Angeles 34, -118 B
Chicago 41, -87 C
Minneapolis 44, -93 D
Denver 39, -104 E
Dallas 32,-96 F
Seattle 47,-122 G
Boston 42,-71 H
San Francisco 37,-122 I
St. Louis 38,-90 J
Houston 29,-95 K
Phoenix 33,-111 L
Salt Lake City 40,-111 M

Total number of cities - 13.

For the Genetic Algorithm to work, a distance matrix needs to be given to it. This distance matrix is based on the “Euclidean Distance” and not the road network distance. The distance matrix is obtained from here which has 13 cities in the United States.

The Genetic Algorithm Solution

Please click run to see the results based on different cross over and selection methods. You can repeat this for different methodologies

It can be observed that the selection method random tends to not give a good result as it would defeat the purpose of the GA algorithm. The current mutation rate of the GA is set to 0.2 for this purpose. The starting population size is set to 20.

Due to the nature of GA, each run under the given settings will give a different solution as I have defaulted the number of generations to 500. This includes running with the same cross over methodology and selection methodology.

The fitness in general would depend on the cross over methodology. For example, if the roulette wheel methodology is used, it can be observed that the average fitness tends to spike more.

The suggested answer based on the Google OR tools is

New York -> Boston -> Chicago -> Minneapolis -> Denver -> Salt Lake City -> Seattle -> San Francisco -> Los Angeles -> Phoenix -> Houston -> Dallas -> St Louis -> New York which gives the total distance of 7293 miles which is also the minimal tour length.

The GA however does not obtain this solution. It does however, generate a high quality solution really quick.

Lessons from this post

  • The earth is not flat! Mapping putting coordinates using latitude and longitude on a chart; would work in a different way so it displays beautifully. Latitude and longitude need to be swapped.

  • Most chart APIs do not let you specify both the x-axis and y-axis at the same time. This is especially true if the chart is able to generate SVG diagrams. SVG diagrams are always nicer and would generally be more responsive at the end of the day.

  • You can use a series graph to draw lines from point to point in the chartist API. However, chartist API does not like the situation where there are two values on the same axis. (So it is not able to draw a straight line on the x-axis because of the nature of a series chart. An example of this is where there is a point on 30,55 and 30,65.

  • There are specific data sets in which people benchmark their TSP solutions.

  • The GA will downgrade into a random search if the mutation rate is too high. However, the mutation rate can always be changed to tailor to the specific use case.