The reality gap – the intrinsic discrepancy between reality and
simulation – is a critical issue in the off-line automatic
design of control software for robot swarms.
It is understood that the reality gap manifests itself as a drop in
performance: when control software generated in simulation is ported
to physical robots, the performance observed is often disappointing
compared with the one obtained in simulation.
In the literature, performance drop is commonly explained as resulting
from reality being more complex than simulation – or equivalently,
from simulation being too simplistic.
We challenge this explanation.
In a first experiment, we show that performance drop might result from
an artificial, simulation-only reality gap: control software is
generated on the basis of a simulation model and assessed on a second
one.
We will call this second model a pseudo-reality.
We selected the simulation model to be used as a pseudo-reality by
trial and error, so as to qualitatively replicate previously published
observations made in experiments with physical robots.
The results of the first experiment show that performance drop
occurs even if we can exclude that pseudo-reality is more complex than
the simulation model used for the design.
In a second experiment, we eliminate the trial-and-error selection of
the first experiment by assessing control software across multiple
pseudo-realities, which are sampled uniformly around the original
simulation model used for the design.
The results of the second experiment confirm those of the first one
and show that they do not depend on the specific pseudo-reality we
previously selected by trial and error.
Moreover, they suggest that one could use multiple pseudo-realities to
evaluate automatic design methods and, from this simulation-only
evaluation, infer their intrinsic robustness to the reality gap.
About this supplementary material page
This page provides supplementary information about the two experiments conducted. The sections below contain videos of the behaviors displayed in the design context and in pseudo-reality, p-values tables, and extra plots. In the videos and the tables, we use two letters to identify the design and evaluation context: the first letter indicates the simulation model (A or B) on the basis of which the control software was generated, and the second indicates the simulation model (A or B) on which the control software was evaluated.
In a p-values table, the left-hand part (all the cells (i, j) with j < i) reports the p-values obtained from Wilcoxon rank-sum tests, and the right-hand part (all the cells (i, j) with i < j) reports the corresponding relation between the two variables compared. In some cases, the comparison between performances is not meaningful: for example, comparing the performance of two different design methods, one assessed in simulation and the other one in reality. In these cases, we did not apply the Wilcoxon rank-sum test, and the corresponding cells contain the characters "NA".
In the right-hand part, a cell can be empty or contain one of the following characters: ">" or "<". If a cell (i, j) with i < j is empty, the corresponding p-value in cell (j, i) is greater than 0.05. If a cell (i, j) with i < j contains the character ">", the p-value in cell (j, i) is less than 0.05, and the variable of the i-th row is significantly greater than the variable of the j-th column. Similarly, if a cell (i, j) with i < j contains the character "<", the p-value in cell (j, i) is less than 0.05, and the variable of the i-th row is significantly smaller than the variable of the j-th column.
For example, in the p-values table of Study 1: Foraging, the cell (1, 2) contains ">", which means that the performance of control software generated by EvoStick on the basis of model A is significantly greater when it is evaluated on model A than when it is evaluated on model B. Indeed, the corresponding p-value, reported in cell (2, 1), is equal to 2.69e-67, which is smaller than 0.05.
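The reading convention described above can be sketched in a few lines of Python. This is only an illustration, not code from the authors: the function name `interpret_cell`, the nested-list layout, and the row labels are assumptions. The only values taken from this page are the ">" symbol and the p-value 2.69e-67 of the foraging example, shown here as a 2-by-2 fragment of a table.

```python
ALPHA = 0.05  # significance threshold used in the p-values tables


def interpret_cell(table, labels, i, j):
    """Describe the comparison stored in cells (i, j) and (j, i), with i < j (1-indexed)."""
    if not i < j:
        raise ValueError("expected i < j: symbols are stored in the right-hand part")
    symbol = table[i - 1][j - 1]   # right-hand part: "", ">", "<" or "NA"
    p_value = table[j - 1][i - 1]  # left-hand part: a p-value or "NA"
    row, col = labels[i - 1], labels[j - 1]
    if symbol == "NA" or p_value == "NA":
        return f"{row} vs {col}: not compared"
    # Sanity check: a symbol must appear exactly when the p-value is below ALPHA.
    if (symbol in (">", "<")) != (p_value < ALPHA):
        raise ValueError("symbol and p-value are inconsistent")
    if symbol == ">":
        return f"{row} is significantly greater than {col} (p = {p_value:.2e})"
    if symbol == "<":
        return f"{row} is significantly smaller than {col} (p = {p_value:.2e})"
    return f"{row} vs {col}: no significant difference (p = {p_value:.2e})"


# Hypothetical labels following the two-letter convention (design model, evaluation model).
labels = ["EvoStick AA", "EvoStick AB"]
table = [
    [None, ">"],        # cell (1, 2): relation symbol
    [2.69e-67, None],   # cell (2, 1): p-value of the same comparison
]
print(interpret_cell(table, labels, 1, 2))
```

Diagonal cells are left as `None` because a variable is never compared with itself.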
The automatic design methods are available as open-source packages: AutoMoDe-Chocolate and EvoStick. A technical report on how to install and use these packages is available here.
Experiment 1: design model and pseudo-reality are fixed
The instances of control software and the results of the first experiment are available for download here:
Results. They can be imported in R using read.csv("ResultsExp1.csv").
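For readers who want to recompute a comparison from the downloaded results, the following is a minimal Python sketch of a two-sided Wilcoxon rank-sum test. It is not the authors' code: it uses the normal approximation with a tie correction and no continuity correction, so its p-values can differ slightly from those of R's `wilcox.test`, which may apply an exact test or a continuity correction. The function name `rank_sum_p_value` and the sample values are toy assumptions, not data from the experiments.

```python
import math


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which sample each value came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of the 1-indexed ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2  # Mann-Whitney U statistic of sample x
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability


# Toy performance samples (hypothetical, for illustration only).
design_context = [float(v) for v in range(21, 41)]
pseudo_reality = [float(v) for v in range(1, 21)]
print(rank_sum_p_value(design_context, pseudo_reality))
```

A p-value below 0.05 would correspond to a ">" or "<" entry in the right-hand part of a p-values table.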
Foraging
The arena contains two source areas (black circles) and a nest (white
area). A light is placed behind the nest to help the robots to
navigate.
In this idealized version of foraging, a robot is deemed to retrieve
an object when it enters a source and then the nest. The goal of the
swarm is to retrieve as many objects as possible.
Videos
Design context
Pseudo-reality
EvoStick
Vanilla
Chocolate
P-values table
Aggregation
The swarm must select one of the two black areas and aggregate
there. The objective function is computed at the end of the experimental
run, and is maximized when all robots are either on the top or the
bottom area.
Videos
Design context
Pseudo-reality
EvoStick
Vanilla
Chocolate
P-values table
Experiment 2: design model is fixed, pseudo-reality is sampled
Results, sampled models and instances of control software obtained in the second experiment are available for download here:
Results. They can be imported in R using read.csv("ResultsExp2.csv").
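The idea of sampling pseudo-realities uniformly around the design model can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the parameter names, their values, the relative range, and the number of samples are hypothetical and are not taken from the experiment; the actual sampled models are the ones provided in the download above.

```python
import random


def sample_pseudo_reality(design_model, relative_range, rng):
    """Draw one pseudo-reality: each parameter is sampled uniformly
    within +/- relative_range of its value in the design model."""
    return {
        name: rng.uniform(value * (1 - relative_range), value * (1 + relative_range))
        for name, value in design_model.items()
    }


# Hypothetical design model: parameter names and values are placeholders.
design_model = {"sensor_noise": 0.05, "actuator_noise": 0.05}
rng = random.Random(42)  # fixed seed so that the sampling is reproducible

# Hypothetical choice: 10 pseudo-realities within +/- 50% of the design values.
pseudo_realities = [sample_pseudo_reality(design_model, 0.5, rng) for _ in range(10)]
for model in pseudo_realities:
    print(model)
```

Control software generated on the design model would then be evaluated once on each sampled model, rather than on a single pseudo-reality selected by trial and error.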