IRIDIA - Supplementary Information (ISSN: 2684-2041)

Supplementary material for the paper:

Simulation-only experiments to mimic the effects of the reality gap in the automatic design of robot swarms

Antoine Ligot, Mauro Birattari

IRIDIA, Université Libre de Bruxelles, Brussels, Belgium


Table of Contents
  1. Abstract
  2. Experiment 1: design model and pseudo-reality are fixed
  3. Experiment 2: design model is fixed, pseudo-reality is sampled

Abstract

The reality gap – the intrinsic discrepancy between reality and simulation – is a critical issue in the off-line automatic design of control software for robot swarms. It is understood that the reality gap manifests itself as a drop in performance: when control software generated in simulation is ported to physical robots, the performance observed is often disappointing compared with the one obtained in simulation. In the literature, performance drop is commonly explained as resulting from reality being more complex than simulation – or equivalently, from simulation being too simplistic. We challenge this explanation. In a first experiment, we show that performance drop might result from an artificial, simulation-only reality gap: control software is generated on the basis of a simulation model and assessed on a second one. We will call this second model a pseudo-reality. We selected the simulation model to be used as a pseudo-reality by trial and error, so as to qualitatively replicate previously published observations made in experiments with physical robots. The results of the first experiment show that performance drop occurs even if we can exclude that pseudo-reality is more complex than the simulation model used for the design. In a second experiment, we eliminate the trial-and-error selection of the first experiment by assessing control software across multiple pseudo-realities, which are sampled uniformly around the original simulation model used for the design. The results of the second experiment confirm those of the first one and show that they do not depend on the specific pseudo-reality we previously selected by trial and error. Moreover, they suggest that one could use multiple pseudo-realities to evaluate automatic design methods and, from this simulation-only evaluation, infer their intrinsic robustness to the reality gap.

About this supplementary material page

On this page, you will find supplementary information about the two experiments conducted. The sections below contain videos of the behaviors displayed in the design context and in pseudo-reality, tables of p-values, and extra plots. In the videos and the tables, two letters identify the design and evaluation contexts: the first indicates the simulation model (A or B) on the basis of which the control software was generated, and the second indicates the simulation model (A or B) on which the control software was evaluated. For example, AB denotes control software generated on model A and evaluated on model B.

In a p-values table, we report the p-values resulting from Wilcoxon rank-sum tests in the left-hand part (all the cells (i, j) with j < i), and the corresponding relation between the two variables compared in the right-hand part (all the cells (i, j) with i < j). In some cases, the comparison between performances is not meaningful: for example, comparing the performance of two different design methods, one assessed in simulation and the other in reality. In these cases, we did not apply the Wilcoxon rank-sum test, and the corresponding cells contain the characters "NA". In the right-hand part, a cell can be empty or contain one of the following characters: ">" or "<". If a cell (i, j) with i < j is empty, the corresponding p-value in cell (j, i) is greater than 0.05. If a cell (i, j) with i < j contains the character ">", the p-value in cell (j, i) is less than 0.05, and the variable of the i-th row is significantly greater than the variable of the j-th column. Similarly, if a cell (i, j) with i < j contains the character "<", the p-value in cell (j, i) is less than 0.05, and the variable of the i-th row is significantly smaller than the variable of the j-th column. For example, in the p-values table of Experiment 1: Foraging, the cell (1, 2) contains ">", which means that the performance of control software generated by EvoStick on the basis of model A is significantly greater when it is evaluated on model A than when it is evaluated on model B. Indeed, the corresponding p-value, reported in cell (2, 1), is equal to 2.69e-67, which is smaller than 0.05.
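The table layout described above can be sketched in Python. This is a minimal illustration and not the procedure used to produce the tables on this page: `rank_sum_test` uses the normal approximation of the Wilcoxon rank-sum test without tie or continuity correction, and the function names, the `comparable` rule, and the sample values are all hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p); z > 0 means values in x tend to exceed those in y.
    Tied values get average ranks; no tie or continuity correction.
    """
    values = sorted(list(x) + list(y))
    rank_of = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    w = sum(rank_of[v] for v in x)  # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return z, math.erfc(abs(z) / math.sqrt(2))


def comparable(a, b):
    """Hypothetical rule: compare only within a method or within a context."""
    return a[0] == b[0] or a[1] == b[1]


def p_value_table(samples, alpha=0.05):
    labels = list(samples)
    n = len(labels)
    table = [[""] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if not comparable(labels[i], labels[j]):
                table[i][j] = table[j][i] = "NA"
                continue
            z, p = rank_sum_test(samples[labels[i]], samples[labels[j]])
            table[j][i] = f"{p:.2e}"  # p-value in the left-hand part
            if p < alpha:             # relation in the right-hand part
                table[i][j] = ">" if z > 0 else "<"
    return table


# Made-up performance samples, keyed by (design method, context letters).
samples = {
    ("EvoStick", "AA"): [20, 22, 23, 25, 27, 28, 30, 31],
    ("EvoStick", "AB"): [1, 2, 3, 4, 5, 6, 7, 8],
    ("Chocolate", "AA"): [12, 13, 14, 15, 16, 17, 18, 19],
    ("Chocolate", "AB"): [9, 10, 11, 12.5, 13.5, 14.5, 15.5, 16.5],
}
table = p_value_table(samples)
for label, row in zip(samples, table):
    print(label, row)
```

With these made-up samples, cell (1, 2) holds ">" and cell (2, 1) holds the matching p-value, mirroring the reading convention explained above.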

The automatic design methods are available as open-source packages: AutoMoDe-Chocolate and EvoStick. A technical report on how to install and use these packages is available here.

Experiment 1: design model and pseudo-reality are fixed

The instances of control software and the results of the first experiment are available for download here:

Foraging

The arena contains two source areas (black circles) and a nest (white area). A light is placed behind the nest to help the robots to navigate. In this idealized version of foraging, a robot is deemed to retrieve an object when it enters a source and then the nest. The goal of the swarm is to retrieve as many objects as possible.
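The retrieval rule stated above (a source visit followed by a nest visit counts as one object) can be sketched as a small per-robot state machine. The zone labels, the trace format, and the function names are hypothetical and are not taken from the released software.

```python
def count_retrieved(zone_trace):
    """Count the objects retrieved by one robot.

    zone_trace is the sequence of zones the robot occupies over time,
    each one of "source", "nest", or "none" (hypothetical encoding).
    """
    carrying = False
    retrieved = 0
    for zone in zone_trace:
        if zone == "source":
            carrying = True          # robot entered a source area
        elif zone == "nest" and carrying:
            retrieved += 1           # source, then nest: one object retrieved
            carrying = False
    return retrieved


def swarm_score(traces):
    """Total number of objects retrieved by the whole swarm."""
    return sum(count_retrieved(trace) for trace in traces)


traces = [
    ["none", "source", "none", "nest", "nest", "source", "source", "nest"],
    ["nest", "none", "source", "none"],
]
print(swarm_score(traces))
```

Entering the nest without a prior source visit, or staying in the nest, does not add to the count in this sketch.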

Videos

[Videos: EvoStick, Vanilla, and Chocolate, each shown in the design context and in pseudo-reality.]

P-values table

Aggregation

The swarm must select one of the two black areas and aggregate there. The objective function is computed at the end of the experimental run, and is maximized when all robots are either on the top or the bottom area.
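One objective function consistent with this description is the fraction of robots located on the more populated of the two areas at the end of the run. The form below is an assumption made for illustration: the text above only states that the function is maximized when all robots are on the same area, and the zone labels are hypothetical.

```python
def aggregation_score(final_zones):
    """Fraction of robots on the more populated black area at the end of a run.

    final_zones holds one label per robot: "top", "bottom", or "outside"
    (hypothetical encoding). The value is 1.0 only when every robot is on
    the same black area.
    """
    if not final_zones:
        raise ValueError("at least one robot is required")
    n_top = final_zones.count("top")
    n_bottom = final_zones.count("bottom")
    return max(n_top, n_bottom) / len(final_zones)


print(aggregation_score(["top"] * 10 + ["bottom"] * 5 + ["outside"] * 5))
```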

Videos

[Videos: EvoStick, Vanilla, and Chocolate, each shown in the design context and in pseudo-reality.]

P-values table

Experiment 2: design model is fixed, pseudo-reality is sampled

Results, sampled models and instances of control software obtained in the second experiment are available for download here:

P-values table Foraging

P-values table Aggregation

Norms of differences between models

[Plots: six groups, each covering EvoStick and Chocolate on Aggregation and Foraging.]

Differences of norms of models

[Plots: six groups, each covering EvoStick and Chocolate on Aggregation and Foraging.]

Cosine similarity

[Plots: two groups, each covering EvoStick and Chocolate on Aggregation and Foraging.]
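The three quantities named in the headings of this section can be told apart with a short sketch. It assumes that each simulation model is summarized as a vector of numeric parameters and that the norm is Euclidean. Both are assumptions, as are the sign convention of `difference_of_norms`, the sampling range, and all names and values below.

```python
import math
import random


def norm(v):
    """Euclidean norm of a parameter vector."""
    return math.sqrt(sum(x * x for x in v))


def norm_of_difference(design, pseudo):
    """Distance between the two models: ||pseudo - design||."""
    return norm([p - d for d, p in zip(design, pseudo)])


def difference_of_norms(design, pseudo):
    """Signed change in magnitude: ||pseudo|| - ||design||."""
    return norm(pseudo) - norm(design)


def cosine_similarity(design, pseudo):
    """Cosine of the angle between the two parameter vectors."""
    dot = sum(d * p for d, p in zip(design, pseudo))
    return dot / (norm(design) * norm(pseudo))


def sample_pseudo_reality(design, half_width, rng):
    """Draw each parameter uniformly within +/- half_width of the design value."""
    return [d + rng.uniform(-half_width, half_width) for d in design]


rng = random.Random(0)
design_model = [0.05, 0.05, 0.05]  # made-up parameter values
pseudo_reality = sample_pseudo_reality(design_model, 0.05, rng)
print(norm_of_difference(design_model, pseudo_reality))
print(difference_of_norms(design_model, pseudo_reality))
print(cosine_similarity(design_model, pseudo_reality))
```

The norm of the difference is never negative and is zero only when the two vectors coincide, whereas the difference of norms can be zero for two distinct models of equal magnitude.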