Cognitive abilities in swarm robotics: developing a swarm that can collectively sequence tasks

Lorenzo Garattoni and Mauro Birattari (September 2020)

Supplementary movies

`Supplementary movie S1`

A typical experimental run of Mark I₃ on 20 e-puck robots. Mark I₃ is a swarm of reactive robots that sequences three tasks under the hypothesis that a robot is notified of an error as soon as it performs a task in the wrong order. Mark I₃ is able to sequence the tasks and perform them 10 times in the correct order. The correct order of the three tasks, which is unknown to the robots, is red, then green, and finally blue.

Below, we sketch the highlights of movie S1 to simplify the comprehension of the operations of TS-Swarm in this and the following movies. Each block of text is preceded by a timestamp or an interval (typeset in boldface), which identifies the moment or the segment of the movie described by the block itself. In the movie, timestamps are displayed in the lower right corner of the frame.

00:00:00 - 00:00:19 The 20 robots are distributed randomly in the hexagonal arena. The three tasks to be performed by the robots (red, green, and blue) are randomly assigned to the three TAMs that are placed along the perimeter of the arena. In this example, the red task is assigned to the TAM on the left of the image, the green task is assigned to the right TAM, and the blue task is assigned to the TAM at the bottom of the image. The correct order of execution is red, green, blue, and it is unknown to the robots at the beginning of the run. All robots start by assuming the role of runner and move randomly in the environment.

00:00:20 - 00:01:22 Robots that see a task attempt to perform it. Three robots, one per TAM, engage in task execution. The robot that performs the red task receives positive feedback, as this task is the first to be performed in the correct sequence. The robots that perform the green and the blue tasks receive negative feedback. After receiving the feedback, the three robots exit their respective TAMs by moving along a straight line for about 0.25 m; they then stop and become guardians. They signal their role by displaying the color cyan with their LEDs. Because of the positive feedback received, the guardian of the red task locally broadcasts a range-and-bearing message that contains CND = 0 and CONF = 1. The other two guardians, which received negative feedback, both broadcast CND = 1 and CONF = 0.

00:00:52 - 00:01:49 The guardian of the red task signals nearby runners that they should initiate the construction of a branch of chain. The guardian indicates that the chain should be built on its right-hand side by sending a directional message only from the range-and-bearing emitters placed on the right side of its body. The chain extends on the right side of the first guardian, one robot after the other. The current tail of the chain displays the color magenta and the chain links the color yellow.

00:01:50 - 00:03:49 The branch of chain extends until its tail spots a guardian and establishes a connection. Here, the guardian spotted by the tail of the first branch is the one of the blue task. When the guardian of the blue task is reached by the first branch, it initiates the construction of a new branch that, following the same process, eventually reaches and connects the guardian of the green task. While the chain is being built, the runners navigate along it and perform the tasks they encounter on their way, if instructed to do so by the respective guardian.

00:02:02 - 00:02:55 A runner that has not performed any task before arrives at the red task and is instructed by the guardian to perform it. This happens because the guardian knows that it is guarding the first task in the sequence. After executing the task and receiving positive feedback, the runner navigates along the first branch of chain by keeping the chain members on its left.

00:02:56 - 00:05:47 The runner arrives at the guardian of the blue task. As no other robot has performed the blue task before our runner (excluding the guardian itself), the guardian is still broadcasting CONF = 0 and CND = 1. As the task counter of the runner, after correctly executing the first task, has value CNT = 1, the condition CONF = 0 and CND <= CNT is satisfied. The runner is thus instructed to perform the blue task as the second task. Here, the runner struggles to enter the TAM because other runners block its way.

00:03:50 - 00:04:40 In the meantime, the second branch of the chain is being built. When it reaches the wall of the arena, the entire branch turns, sweeping anticlockwise around the originating guardian. The sweeping motion is triggered by the tail: its goal is to disentangle the branch of chain from the wall and enable its further extension.

00:05:48 - 00:06:21 The runner that we were following eventually enters the blue TAM. However, its execution results in a failure, as (we know that) the blue task is the third task to be performed. The runner therefore receives negative feedback. The notification of this negative feedback causes the guardian to update its information to CONF = 0 and CND = 2. From this update on, all further runners that have performed only the first task will be instructed by the guardian to skip the blue task and continue their navigation along the chain. (The runner that we were following aborts the execution of the sequence. It will continue along the chain until it again reaches the guardian of the red task, the first of the sequence. From there, instructed by the guardians, it will restart the execution of the sequence from scratch.)

00:06:22 - 00:07:21 A runner reaches the green task after executing the red one and skipping the blue. The runner's task counter is CNT = 1 and the guardian broadcasts CONF = 0 and CND = 1. Therefore, the condition CONF = 0 and CND <= CNT is satisfied, and the runner is instructed to perform the task. The execution results in a success and the runner updates its counter to CNT = 2. Upon notification, the guardian updates its information to CONF = 1 and CND = 1. (It should be noted that the runner enters the TAM at 00:06:38 but cannot establish communication with it. It exits and enters again at 00:06:58.)

00:07:22 - 00:07:33 The runner has now reached the end of the chain and revolves around the third (and last) guardian in the chain that acts as a turning point.

00:08:10 - 00:08:36 Proceeding along the chain, the runner reaches the blue task, which it had previously skipped. As its task counter is now CNT = 2 and the information of the blue task's guardian is CONF = 0 and CND = 2, the condition CONF = 0 and CND <= CNT is satisfied. The runner thus performs the blue task. The execution is successful and the notification of the positive feedback allows the guardian to update its information to CONF = 1 and CND = 2. A first sequence has been successfully performed and the task-sequencing process is complete: all the guardians have converged to the correct policy for instructing the runners. From now on, runners will repeatedly travel along the chain and perform the tasks in the correct order.
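The guardian bookkeeping followed in the highlights above can be summarized in a short Python sketch. It is only an illustration inferred from the values reported for movie S1, not the controller that runs on the robots. The class `Guardian`, its methods, and the function `replay_movie_s1` are hypothetical names. The update rule (CND = CNT on success, CND = CNT + 1 on failure) is read off the three updates described above, and the behavior of a confirmed guardian (admit a runner only when CNT equals CND) is an assumption, since the text states only the condition for CONF = 0.

```python
from dataclasses import dataclass


@dataclass
class Guardian:
    """Hypothetical model of the information a guardian broadcasts."""
    cnd: int = 0   # CND: candidate position of the guarded task in the sequence
    conf: int = 0  # CONF: 1 once that position was confirmed by positive feedback

    def instructs_to_perform(self, cnt: int) -> bool:
        """Decide whether a runner with task counter `cnt` should perform the task."""
        if self.conf == 1:
            # Assumption: a confirmed guardian admits only runners whose
            # counter matches the confirmed position.
            return cnt == self.cnd
        # Condition stated in the text: CONF = 0 and CND <= CNT.
        return self.cnd <= cnt

    def notify(self, cnt: int, success: bool) -> None:
        """Update after an attempt by a robot whose counter was `cnt` before it."""
        if success:
            self.cnd, self.conf = cnt, 1
        else:
            self.cnd, self.conf = cnt + 1, 0


def replay_movie_s1():
    """Replay the guardian updates described for movie S1 as (CND, CONF) pairs."""
    red, green, blue = Guardian(), Guardian(), Guardian()
    # The future guardians attempt their own task with CNT = 0.
    red.notify(0, success=True)
    green.notify(0, success=False)
    blue.notify(0, success=False)
    trace = [(red.cnd, red.conf), (green.cnd, green.conf), (blue.cnd, blue.conf)]

    # A runner with CNT = 0 is admitted at red, then tries blue with CNT = 1 and fails.
    assert red.instructs_to_perform(0)
    assert blue.instructs_to_perform(1)
    blue.notify(1, success=False)
    trace.append((blue.cnd, blue.conf))

    # A later runner with CNT = 1 skips blue and succeeds at green.
    assert not blue.instructs_to_perform(1)
    assert green.instructs_to_perform(1)
    green.notify(1, success=True)
    trace.append((green.cnd, green.conf))

    # The same runner, now with CNT = 2, comes back to blue and succeeds.
    assert blue.instructs_to_perform(2)
    blue.notify(2, success=True)
    trace.append((blue.cnd, blue.conf))
    return trace


if __name__ == "__main__":
    print(replay_movie_s1())
```

Under these assumptions, the replay ends with red, green, and blue confirmed at positions 0, 1, and 2, which matches the converged state described at 00:08:36.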

00:18:25 The run stops after the tenth execution of the correct sequence.

`Supplementary movie S2`

A typical run of Mark I₃ on the robots and a typical run of Mark I₃ in simulation, side by side. This movie enables a visual assessment of the realism of the simulation environment.

`Supplementary movie S3`

The scalability study of Mark I₃ in simulation. The movie shows the highlights of five runs of Mark I₃ in five experimental settings. In each setting, we double the surface of the arena and we increase the number of robots by a factor of √2 with respect to the previous one. In every setting, Mark I₃ is able to sequence the three tasks and perform them 10 times in the correct order.
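The scaling rule can be made concrete with a few lines of Python. This is a minimal sketch: the function `scaled_settings` is a hypothetical helper, the base area and base number of robots passed to it are placeholders rather than values taken from the study, and rounding the robot count to the nearest integer is an assumption.

```python
import math


def scaled_settings(base_area, base_robots, n_settings=5):
    """Return (area, robots) per setting: area doubles, robots grow by sqrt(2)."""
    settings = []
    for k in range(n_settings):
        area = base_area * 2 ** k
        # Assumption: the robot count is rounded to the nearest integer.
        robots = round(base_robots * math.sqrt(2) ** k)
        settings.append((area, robots))
    return settings


if __name__ == "__main__":
    # Placeholder base values, chosen only for illustration.
    for area, robots in scaled_settings(base_area=1.0, base_robots=20):
        print(f"area x{area:g}: {robots} robots, density {robots / area:.2f}")
```

Because the area grows faster than the number of robots, the robot density shrinks by a factor of about 1/√2 from one setting to the next.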

`Supplementary movie S4`

A typical experimental run of Mark I₄ in simulation. Mark I₄ is a swarm of reactive robots that sequences four tasks under the hypothesis that a robot is notified of an error as soon as it performs a task in the wrong order. Mark I₄ is able to sequence the tasks and perform them 10 times in the correct order. The correct order of execution of the four tasks, which is unknown to the robots, is red, then green, blue, and finally orange.

`Supplementary movie S5`

A typical experimental run of Mark II₃ in simulation. Mark II₃ is a swarm of reactive robots that sequences three tasks under the hypothesis that a robot has to perform an entire sequence before being notified of a possible error. Mark II₃ is able to sequence the tasks and perform them 10 times in the correct order.

`Supplementary movie S6`

A typical experimental run of Mark II₄ in simulation. Mark II₄ is a swarm of reactive robots that sequences four tasks under the hypothesis that a robot has to perform an entire sequence before being notified of a possible error. Mark II₄ is able to sequence the tasks and perform them 10 times in the correct order.