Supporting material for the paper:

Parallel Ant Colony Optimization Algorithms
for the Traveling Salesman Problem

by Max Manfrin, Mauro Birattari, Thomas Stützle, and Marco Dorigo
April 2006

Accepted as a full paper at ANTS 2006: Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence

Table of Contents
  1. Abstract
  2. Algorithms
  3. Raw experimental data
  4. Results
  5. Second communication scheme

Abstract:

There are two reasons for parallelizing a metaheuristic if one is interested in performance: (i) given a fixed time to search, the aim is to increase the quality of the solutions found in that time; (ii) given a fixed solution quality, the aim is to reduce the time needed to find a solution not worse than that quality. In this article, we study the impact of communication when we parallelize a high-performing ant colony optimization (ACO) algorithm for the traveling salesman problem using message passing libraries. In particular, we examine synchronous and asynchronous communications on different interconnection topologies. We find that the simplest way of parallelizing the ACO algorithms, based on parallel independent runs, is surprisingly effective; we give some reasons as to why this is the case.

Keywords: ant colony optimization, parallel programming, traveling salesman problem, message passing



Algorithms:

We modify version 1.0 of ACOTSP, adding the quadrant nearest-neighbor heuristic with 20 neighbors and the MPI code that allows communication among the different colonies. Except for Parallel Independent Runs (PIR), in which no communication takes place, the first letter in the name of each algorithm identifies how the communication is implemented: S means communication is synchronous, and A means communication is asynchronous. The following interconnection topologies are tested (a code sketch contrasting the two communication modes follows the list):

  CC  completely connected
  RW  replace worst
  H   hypercube
  R   ring
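
To make the difference between the S- and A-variants concrete, the following minimal C/MPI sketch shows both a synchronous and an asynchronous exchange of best-so-far tour lengths on a ring. This is an illustration written for this page, not the actual ACOTSP/MPI code: the function names (exchange_sync, exchange_async), the choice to exchange only tour lengths rather than full solutions, and the iteration count are simplifying assumptions.

    /* ring_sketch.c -- compile with: mpicc ring_sketch.c -o ring_sketch
     * Synchronous variant:   mpirun -n 8 ./ring_sketch
     * Asynchronous variant:  mpirun -n 8 ./ring_sketch async
     * Illustrative only; not the ACOTSP/MPI code used in the paper. */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS 10   /* stand-in for the real stopping criterion */

    /* Synchronous exchange: every colony blocks at the exchange point
     * until the communication with its ring neighbors has completed. */
    static void exchange_sync(double *best_len, int rank, int size)
    {
        double recv_len;
        MPI_Sendrecv(best_len, 1, MPI_DOUBLE, (rank + 1) % size, 0,
                     &recv_len, 1, MPI_DOUBLE, (rank - 1 + size) % size, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        if (recv_len < *best_len)
            *best_len = recv_len;          /* adopt the better value */
    }

    /* Asynchronous exchange: post a non-blocking send, then consume only
     * the messages that have already arrived; never wait for a neighbor.
     * Returns the number of messages received. */
    static int exchange_async(double *best_len, int rank, int size,
                              MPI_Request *req, double *send_buf)
    {
        double recv_len;
        int flag, got = 0;

        MPI_Wait(req, MPI_STATUS_IGNORE);  /* previous send buffer free */
        *send_buf = *best_len;
        MPI_Isend(send_buf, 1, MPI_DOUBLE, (rank + 1) % size, 0,
                  MPI_COMM_WORLD, req);

        MPI_Iprobe((rank - 1 + size) % size, 0, MPI_COMM_WORLD, &flag,
                   MPI_STATUS_IGNORE);
        while (flag) {
            MPI_Recv(&recv_len, 1, MPI_DOUBLE, (rank - 1 + size) % size,
                     0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (recv_len < *best_len)
                *best_len = recv_len;
            got++;
            MPI_Iprobe((rank - 1 + size) % size, 0, MPI_COMM_WORLD,
                       &flag, MPI_STATUS_IGNORE);
        }
        return got;
    }

    int main(int argc, char **argv)
    {
        int rank, size, iter, async, got = 0;
        double best_len, recv_len, send_buf = 0.0;
        MPI_Request req = MPI_REQUEST_NULL;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        async = (argc > 1);

        best_len = 1000.0 + rank;          /* stand-in for a real tour */
        for (iter = 0; iter < ITERS; iter++) {
            /* ... one MMAS iteration would go here ... */
            if (async)
                got += exchange_async(&best_len, rank, size, &req,
                                      &send_buf);
            else
                exchange_sync(&best_len, rank, size);
        }
        /* In the asynchronous variant, match the messages still in
         * flight so that no send is left unreceived at finalization. */
        while (async && got < ITERS) {
            MPI_Recv(&recv_len, 1, MPI_DOUBLE, (rank - 1 + size) % size,
                     0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (recv_len < best_len)
                best_len = recv_len;
            got++;
        }
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("colony %d: best length %.1f\n", rank, best_len);
        MPI_Finalize();
        return 0;
    }

The key design difference: MPI_Sendrecv forces every colony to stop at each exchange point, so the colonies proceed in lockstep, while the MPI_Isend/MPI_Iprobe pair lets a colony keep searching and pick up incoming information whenever it happens to arrive.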

To have a reference algorithm for comparison, we also tested the equivalent sequential MAX-MIN Ant System (MMAS) algorithm running: (i) for k times the wall-clock time of the parallel algorithms (SEQ), where k = 8 is the number of colonies used, so that SEQ consumes the same total CPU time as a parallel run; and (ii) for the same wall-clock time as the parallel algorithms (SEQ2).


Raw experimental data:

We tested the algorithms on 10 instances available from TSPLIB. The output of all the experiments for each algorithm is available in the following table:

Algorithm File
SCC sync_completely_connected.txt.gz (552 KB)
ACC async_completely_connected.txt.gz (580 KB)
SRW sync_replace_worst.txt.gz (544 KB)
ARW async_replace_worst.txt.gz (664 KB)
SH sync_hypercube.txt.gz (580 KB)
AH async_hypercube.txt.gz (584 KB)
SR sync_ring.txt.gz (588 KB)
AR async_ring.txt.gz (588 KB)
PIR parallel_independent.txt.gz (460 KB)
SEQ sequential.txt.gz (64 KB)
SEQ2 sequential2.txt.gz (60 KB)

Results:

Results for each of the 10 instances grouped by algorithm

Instance  Boxplot of normalized results
pr1002    eps format, png format
u1060     eps format, png format
pcb1173   eps format, png format
d1291     eps format, png format
nrw1379   eps format, png format
fl1577    eps format, png format
vm1748    eps format, png format
rl1889    eps format, png format
d2103     eps format, png format
pr2392    eps format, png format

Aggregate results over all 10 instances grouped by algorithm

Run time                   Kind of boxplot                                     Files
full                       normalized results                                  eps format, png format
full                       normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/2    normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/4    normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/8    normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/16   normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/32   normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/64   normalized results restricted to values in [0,1]   eps format, png format
reduced by a factor 1/128  normalized results restricted to values in [0,1]   eps format, png format

Even though the boxplots indicate that the parallel algorithms achieve, on average, better performance than the sequential ones, the impact of communication on performance appears to be negative. One reason might be that the run times are rather long, and MMAS easily converges within them. PIR can count on multiple independent search paths to explore the search space, which reduces the effect of this "stagnation" behavior. The other parallel algorithms, by contrast, accelerate convergence toward the same solution because of their frequent exchange of information, as can be verified from the traces of the algorithms.
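
To illustrate how little infrastructure PIR needs, here is a minimal C/MPI sketch of the scheme (again ours, not the experimental code): each MPI process runs its own independently seeded search, and the only communication is a single reduction at the very end to locate the best tour length. The seed offset and the stand-in "search" are placeholders.

    /* pir_sketch.c -- compile with: mpicc pir_sketch.c -o pir_sketch
     * Parallel independent runs: one colony per process, no
     * communication during the search, one reduction at the end.
     * Illustrative only; not the code used in the experiments. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        struct { double len; int rank; } local, best;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        srand(1234 + rank);            /* a different seed per colony */

        /* Stand-in for a full, independent MMAS run; in the real setup
         * this is where all the search time is spent. */
        local.len = 1000.0 + rand() % 100;
        local.rank = rank;

        /* The only communication: locate the colony holding the best
         * tour length (MPI_MINLOC pairs the value with its owner). */
        MPI_Reduce(&local, &best, 1, MPI_DOUBLE_INT, MPI_MINLOC, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("best length %.1f found by colony %d of %d\n",
                   best.len, best.rank, size);
        MPI_Finalize();
        return 0;
    }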



Second communication scheme:

In a second set of experiments, we tested a coarser communication scheme on two of the algorithms (on the same 10 instances from TSPLIB); a sketch of such a scheme follows the table. The output of all the experiments for each algorithm is available in the following table:

Algorithm File
SRW2 sync_replace_worst2.txt.gz (657 KB)
SR2 sync_ring2.txt.gz (658 KB)
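
For illustration, a coarser scheme can be obtained simply by spacing out the exchanges. The fragment below is a drop-in replacement for the main loop of the ring sketch in the Algorithms section, reusing its exchange_sync; the interval EXCHANGE_EVERY = 100 is purely hypothetical, as the interval actually used in the SRW2/SR2 experiments is not specified on this page.

    /* Coarser communication: colonies exchange information only every
     * EXCHANGE_EVERY iterations instead of at every iteration.
     * EXCHANGE_EVERY = 100 is a hypothetical value. */
    #define EXCHANGE_EVERY 100

    for (iter = 0; iter < 10 * EXCHANGE_EVERY; iter++) {
        /* ... one MMAS iteration would go here ... */
        if ((iter + 1) % EXCHANGE_EVERY == 0)
            exchange_sync(&best_len, rank, size);
    }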

Aggregate results over all 10 instances grouped by algorithm

Run time  Kind of boxplot     Files
full      normalized results  eps format, png format