Sophisticated collective foraging with minimalist agents: a swarm robotics test

Abstract

How groups of cooperative foragers can achieve efficient and robust collective foraging is of interest both to biologists studying social insects and engineers designing swarm robotics systems. Of particular interest are distance-quality trade-offs and swarm-size-dependent foraging strategies. Here, we present a collective foraging system based on virtual pheromones, tested in simulation and in swarms of up to 200 physical robots. Our individual agent controllers are highly simplified, as they are based on binary pheromone sensors. Despite being simple, our individual controllers are able to reproduce classical foraging experiments conducted with more capable real ants that sense pheromone concentration and follow its gradient. One key feature of our controllers is a control parameter which balances the trade-off between distance selectivity and quality selectivity of individual foragers. We construct an optimal foraging theory model that accounts for distance and quality of resources, as well as overcrowding, and predicts a swarm-size-dependent strategy. We test swarms implementing our controllers against our optimality model and find that, for moderate swarm sizes, they can be parameterised to approximate the optimal foraging strategy. This study demonstrates the sufficiency of simple individual agent rules to generate sophisticated collective foraging behaviour.

Introduction

Collective central-place foraging by super-organismal social insect colonies elegantly and scalably solves the problem of resource collection in a heterogeneous and uncertain environment (Olsson et al. 2008; Traniello 1989; Detrain and Deneubourg 2008). Accordingly, engineers have drawn inspiration from social insects to design swarm robotics systems that collectively solve foraging-like tasks in parallel (Labella et al. 2004; Hamann and Wörn 2006; Liu et al. 2006; Campo and Dorigo 2007; Winfield 2009; Berman et al. 2011; Pini et al. 2014; Reina et al. 2015a; Ferrante et al. 2015; Scheidler et al. 2016; Essche et al. 2015; Pitonakova et al. 2016, 2018; Hamann 2018b). Engineering and biology share common core interests in the efficiency of behaviour-generating mechanisms (e.g. Parker and Smith 1990; Houston and McNamara 1999; Ferrante et al. 2013; Gauci et al. 2014; Özdemir et al. 2018), and scalability (e.g. Rubenstein et al. 2014b; Khaluf et al. 2017; Poissonnier et al. 2019).

Here, we extend a previous study of pheromone-based collective foraging (Font Llenas et al. 2018); robots coordinate to find item sources in an unknown environment, collect an item, and transport it back to a central depot. Each robot has limited cognitive abilities and a minimal memory; it simply uses binary pheromone sensors and follows a reactive behaviour with a minimal set of states. Despite the limited capabilities of the robots and the simplicity of their individual behaviour, the resulting collective behaviour qualitatively reproduces patterns observed in real foraging ant colonies where the individuals have a more capable sensory system (i.e. pheromone concentration sensors) and a more complex behaviour [i.e. decisions based on difference of pheromone concentration (Thienen et al. 2014) or on number of collisions with other ants (Fourcassié et al. 2010)]. Our resulting collective behaviour is able to manage the distance-quality trade-off, and to approximate the optimal allocation of foragers to resources using quality-sensitive modulation of pheromone deposition and distance-sensitive abandonment rules. The emergent bidirectional collective movement of foragers between sources and depot is affected by crowding which is expected to reduce the efficiency of forage transportation from popular resources (Burd et al. 2002; Dussutour et al. 2004; Fourcassié et al. 2010; Banks 1999; Leduc et al. 2012). To assess the performance of the emergent collective behaviour, we built an optimal foraging model that explicitly takes account of crowding and we compared its predictions against the results of simulations with swarms of varying sizes and experiments with up to 200 physical robots. 
Our results are of potential interest to both swarm engineers and behavioural ecologists, in that they demonstrate the sufficiency of very simple individual agents to generate sophisticated collective behaviour, as well as its scalability, and reproduce empirically observed or theoretically predicted patterns. This study follows previous work that used swarm robotics as a useful tool in advancing the understanding of biological systems (Garnier 2011; Webb 2012; Wischmann et al. 2012; Mitri et al. 2013; Bose et al. 2017).

Related works

Previous engineering studies have investigated the use of stigmergy as a form of indirect communication within robot swarms where robots communicate with others by modifying the environment. Significant attention has been given to the use of indirect stigmergic communication to coordinate the collection of resources spread in the environment (Goss et al. 1992; Werger and Matarić 1996; Payton et al. 2001; Nouyan et al. 2009; Campo et al. 2010; Ducatelle et al. 2011b; Hoff et al. 2012; Purnamadjaja and Russell 2007). Engineers have mainly been inspired by social insect behaviours, especially the behaviour of some ant species that we overview in Sect. 2.1. In Sect. 2.2, we review the techniques that engineers have adopted to implement stigmergy-based foraging robots. Finally, in Sect. 2.3, we introduce optimal foraging theory and present previous theoretical models of collective foraging.

Stigmergy-based foraging in ant colonies

Some ant species coordinate their food collection by leaving pheromone trails when returning from a discovered resource to their nest (Wilson 1962; Hölldobler and Wilson 1990). In these ant species, the deposited pheromone trails serve as a positive feedback mechanism for mass recruitment which guides nest-mates to the location of a discovered source of forage (Sumpter and Pratt 2003). Foraging ants, equipped with pheromone concentration sensors (Thienen et al. 2014), reach food sources by following the deposited pheromone trails with a preference to higher concentration trails (Hangartner 1969; Van Vorhis Key and Baker 1982; Choe et al. 2012). The modulation of positive feedback [e.g. as a function of the source quality (Beckers et al. 1993; Portha et al. 2004; Shaffer et al. 2013) or footprint frequency (Devigne et al. 2004)] allows ant colonies to reach various collective patterns, such as selecting the best-quality food source available in the environment (Beckers et al. 1990, 1993; Reid et al. 2012; Shaffer et al. 2013), selecting the shortest path linking the food source to the nest (Goss et al. 1989; Deneubourg et al. 1990), and balancing predation risk and food quality (Nonacs and Dill 1990).

In addition to the ability of collective resource exploitation, adaptation to environmental fluctuations is a critically important ability for many biological organisms (Tsimring 2014), including foraging ants (Dussutour et al. 2009). The mechanisms behind mass recruitment abilities (i.e. positive feedback) are generally in opposition to those that allow adaptation and flexibility (Tabone et al. 2010; Tsimring 2014); therefore, organisms showing adaptability are generally capable of a more complex behaviour. A remarkably interesting example is offered by Monomorium pharaonis ants which make use of repellent pheromone as a form of negative feedback (Stickland et al. 1999; Robinson et al. 2005, 2008; Detrain and Deneubourg 2006). These ants use the repellent pheromone to mark unrewarding trails, which could thus be a strategy to stop the exploitation of trails that lead to depleted food sources. Other evidence of adaptability in ants has been documented by Beckers et al. (1990) who showed that Tetramorium caespitum ants are able to refocus their foraging efforts from a previously selected lower-quality food source to a newly available higher-quality food source. These ants are able to adapt to environmental changes because, in addition to pheromone-based recruitment, they use tandem running to recruit ants to newly available higher-quality food sources (Beckers et al. 1990). In contrast, Lasius niger ants, using pheromone-based recruitment only, are unable to switch their foraging efforts to the newly available food source. In fact, Lasius niger ants only rely on indirect forms of negative feedback, which may arise from physical constraints at the food source (e.g. overcrowding or food depletion) or within the nest (e.g. filling of food reserve) (Detrain and Deneubourg 2006). Finally, in another study, Shaffer et al. (2013) showed that Temnothorax rugatulus ants employing quality-dependent linear recruitment and quality-dependent abandonment are able to adapt to environmental changes. T. rugatulus ants select the best-quality food source in case of two unequal-quality sources, exploit the two sources equally if they have equal qualities, and refocus their foraging efforts in case of changes in relative qualities (Shaffer et al. 2013).

Stigmergy-based foraging in swarm robotics

To implement the pheromone-based recruitment mechanism in a robot swarm, an important question concerns the means of implementing pheromone trails; in particular, how the robots deposit pheromone, how the pheromone trails in the environment evolve, and how pheromone can be sensed by the robots. Here, we categorise state-of-the-art work in this area into three main approaches: beacon robots, robots with on-board actuators and sensors, and smart environments.

In the first category of robotic systems, some robots are tasked as static beacon robots (Goss et al. 1992; Werger and Matarić 1996; Payton et al. 2001; Nouyan et al. 2009; Campo et al. 2010; Ducatelle et al. 2011b; Hoff et al. 2012), which have the functions of storing pheromone levels and communicating with other robots in their neighbourhood. The biggest advantage of this approach is that the system can be implemented with simple robots in largely unknown and unstructured environments. However, there are some limitations: (i) allocation of beacon robots means they are not actively contributing to the main task, such as foraging; (ii) in large environments, the number of beacon robots increases in order to cope with the communication requirements, thereby further limiting the number of robots performing main tasks; (iii) beacon robots become obstacles themselves which restrict the movements of other robot agents. These issues can be overcome by the creation of mobile beacon robots, which can contribute to a main task as well as acting as beacons concurrently (Sperati et al. 2011; Ducatelle et al. 2011a). However, the performance of the latter approach relies on finding the correct balance between the swarm size and the communication range as a function of the environment size.

Researchers have made several attempts to equip robots with on-board actuators and sensors to implement indirect communication. For example, one early solution was to install marker pens on robots so they could draw lines on the path as pheromone trails (Svennebring and Koenig 2004). This method improved robots’ performance in the area coverage task; however, it did not incorporate pheromone evaporation or diffusion which are features of real ant trails; evaporation in particular is considered important to avoid runaway positive feedback (Garnier et al. 2007, 2013). Another design, proposed by Purnamadjaja and Russell (2007), equipped robots with devices to emit and detect gas, which then provided guidance to robots towards a source area. The main limitation of this design was the high volatility of the chemicals used. Mayet et al. (2010) presented a technique of energising phosphorescent paint using UV-LEDs mounted on E-Puck robots to mark the path, together with sensors for picking up the glowing paint signal representing the pheromone trail. Although this allowed emulation of pheromone decay, diffusion could not be emulated. A more recent study (Fujisawa et al. 2008, 2014) used ethanol for indirect communication signals between robots, with an ethanol pump and an ethanol sensor installed on each robot, which preserved the four characteristics of pheromone: evaporation, diffusion, locality (i.e. pheromone level is only affected by the local environments), and reactivity (i.e. pheromone evolution is based on reactions with the environment).

Perhaps the most popular approach in implementing pheromone communication is through a smart environment (Sugawara et al. 2004; Garnier et al. 2007; Hecker et al. 2012; Garnier et al. 2013; Arvin et al. 2015; Valentini et al. 2018), which has the capability to store and to supply virtual pheromone information to robot agents in real-time. The popularity arises from the fact that this approach is generally low cost and easily adaptable to different sizes of swarm and environment. However, smart environments may be difficult to install and use for real applications; rather, such setups are often employed for targeted research experiments. This category can be further divided into three classes: the usage of (i) radio-frequency identification (RFID) tags (Mamei and Zambonelli 2005, 2007; Herianto et al. 2007; Herianto and Kurabayashi 2009; Bosien et al. 2012; Khaliq et al. 2014); (ii) simulated pheromone environments, using projected light or other custom hardware for virtual pheromone implementations (Sugawara et al. 2004; Garnier et al. 2007, 2013; Arvin et al. 2015; Valentini et al. 2018); and (iii) augmented reality tools in which a virtual environment is sensed and acted on by robots using virtual sensors and actuators (Reina et al. 2015b, 2017).

Optimal foraging theory

Foragers make economic decisions; hence, optimality models need to be based on suitable assumptions about ‘currencies of costs’ and benefits, as well as on constraints which may originate from features in the environment where foraging takes place (extrinsic) or inherent to the animals (intrinsic) (Stephens and Krebs 1986). It is often assumed that reproductive success (or fitness) and foraging behaviour are linked (Pyke 1984; Houston and McNamara 1999). Regarding currencies, i.e. the quantities to be maximised to achieve optimality, foraging animals often face a trade-off involving energy and time (Houston and McNamara 1999). Typically, an animal gains energy from eating a food item, but it also needs to invest time in handling such an item. Hence, if the quality of the food item is poor, the animal must decide whether to pick it, or to leave it and continue searching for better items. If the animal is a central-place forager (Orians and Pearson 1979; Kacelnik 1984), then its nest is the central place and food needs to be transported from the food source to the nest, where it is consumed.

Traditionally, two different currencies have often been used in foraging theory: the net rate of energy gain and efficiency (Kacelnik 1984; Houston and McNamara 2014). Whereas the net rate of energy gain is computed as the difference between the forager’s gross rate of gain and its rate of energy expenditure, efficiency is derived by dividing gross rate of energy gain by rate of energy consumption (Houston and McNamara 2014). In honeybees, for example, there is mounting evidence that maximising energetic efficiency provides a better account of the observed foraging behaviour (Schmid-Hempel et al. 1985; Seeley 1986, 1994; Cox and Myerscough 2003; Houston and McNamara 2014; Baveco et al. 2016). However, optimal foraging theory does not always apply to real systems, as has, for instance, been noted for leaf-cutting ants (Kacelnik 1993). Another study investigating seed-harvester ants, which always carry exactly one seed, made use of a different currency involving the seed mass to study optimal foraging (López 1987). Developing a theory that works for several foraging species seems inherently difficult, as mechanisms underlying foraging can be quite different (Traniello 1989). For example, red harvester ants (Pogonomyrmex barbatus) do not rely on pheromone trails during foraging; rather, interactions between ants at the nest site regulate their foraging behaviour (Gordon 1991; Greene and Gordon 2003; Pagliara et al. 2018). There are, however, many ant species where the production of pheromone trails is crucial in the foraging process (Wilson 1962; Hölldobler and Wilson 1990; Detrain et al. 1999; Nicolis and Deneubourg 1999; Sumpter and Pratt 2003). In principle, other aspects also need to be considered when a foraging model is developed, which are more generally related to the overall state of the forager (e.g. competing alternative activities) and the conditions characterising the foraging landscape (such as predation risk) (Houston and McNamara 1999).

We emphasise that this section has only touched on the complex nature of foraging behaviour of animals and insect colonies and that it is by no means an exhaustive collection of references. For the latter, we refer to more comprehensive overviews, e.g. see Charnov (1976); Pyke (1984); Stephens and Krebs (1986); Houston and McNamara (1999). In our swarm robotics study, we made use of several aspects of the biological systems discussed above (see Sects. 3 and 4) and we constructed a model for optimal resource collection which is described in Sect. 5 and in Appendix A.

Resource collection in an unknown environment

In this section, we formally define the investigated problem and the required capabilities of the robot (Sect. 3.1), then we describe the robotic platform (Sect. 3.2) and the augmented reality technology in use (ARK, Sect. 3.3).

The resource collection task

In this study, we investigate the problem of resource collection by a swarm composed of S robots. The environment has n circular source areas of radius \(10\, \mathrm{cm}\), denoted by \(A_i\) with \(i \in \{1,\ldots ,n\}\), which are scattered around a central depot. Each area \(A_i\) offers resource items of quality \(Q_i\). The quality is a numerical indication of the importance of the resource with respect to the task that will be performed; this is similar to the nutritional value of food items in animal foraging. In this work, we are interested in the foraging process at steady state; therefore, we assume sources which never deplete. If a robot enters a source area, it immediately collects one virtual item (or object) and returns it to the central circular depot (of radius \(10\, \mathrm{cm}\)). We do not take into account any handling time of the resource item. Also, we do not consider the time spent in the resource patch, as the robot immediately finds an object and returns to the depot (no exploration within the source area). The load carried back to the nest site is always one item at a time. Travelling takes place with the same speed independent of the load carried (i.e. either unloaded or loaded with one object). Keeping these aspects in abstract terms helps to focus the study on the collective motion aspect and allocation of robots to source areas. In fact, this study focuses on strategies to coordinate the robot motion between depot and source areas through decentralised self-organising mechanisms. In particular, we explore how indirect communication in the form of virtual pheromone trails can allow the robot swarm to balance the trade-off between the quality of resource items and the distance between the source area and the central depot.

The robots have limited computational and memory capabilities and need to operate in an unknown environment. Robots are incapable of memorising the locations of source areas and instead rely on pheromone trails to relocate previously discovered sources. This form of indirect communication requires the robots to be able to apply and read temporary marks in the environment. Additionally, we assume that robots always know the direction to the depot [similarly to path integration in ants and in other social insects (Collett and Collett 2002; Bregy et al. 2008; Heinze et al. 2018)] and are able to detect walls in front of them. However, robots do not possess any form of direct communication amongst each other and cannot perceive other robots in their surroundings.

The Kilobot robot

This study is conducted using Kilobots (Fig. 1a), which are minimalistic robots widely employed in swarm robotics research, with very limited capabilities provided by a small range of sensors and actuators (Rubenstein et al. 2014a). The Kilobot moves on a flat surface through a pair of vibration motors that allow the robot to perform a slip-stick differential-drive motion. A Kilobot moves at a speed of \(v_0\approx 1\, \mathrm{cm}/\mathrm{s}\) and rotates at \(\sim 40\,{{}^\circ }/{\mathrm{s}}\). It also has an infrared (IR) transceiver to communicate with other Kilobots in a range of \(10\, \mathrm{cm}\) and to receive messages from an overhead control board (OHC), an RGB LED to display internal states through colours, and an ambient-light sensor. The OHC allows users to quickly program large swarms through wireless IR communication, and in our case, is used to augment the Kilobots with virtual sensors and actuators (see Sect. 3.3). While the Kilobot is quite limited in its capabilities, its simplicity results in a low-cost and easy-to-operate platform which is highly scalable.

Increasing Kilobots’ capabilities through augmented reality

To overcome the Kilobot’s limitations, researchers implemented open-source technology to extend the Kilobot’s capabilities via customisable virtual sensors and actuators (Reina et al. 2017; Valentini et al. 2018). This technology allows Kilobots to operate in an augmented reality in which, in addition to the real world, the Kilobots can sense and modify a computer-simulated environment in real-time (see Sect. 2.2). Two implementations of this technology have been proposed in recent years: the augmented reality for Kilobots (ARK) by Reina et al. (2017) and the Kilogrid by Valentini et al. (2018). In this study, we use the ARK system because of its low installation cost and its ability to automatically perform several house-keeping tasks such as motor calibration, unique ID assignment, and experiment video-recording.

ARK consists of an overhead camera array to track the Kilobots, an IR-OHC to communicate to the Kilobots, and a computer (base control station, BCS) to simulate the virtual environment. The information about the virtual sensors is computed on the BCS and communicated to the specific robot with addressed messages via the OHC. The information about virtual actuators is computed on-board by the Kilobots, communicated with colour-coded messages via LEDs visible by the overhead cameras, and processed by the BCS which updates the virtual environment. Additionally, the BCS updates the temporal dynamics of the virtual environment. In this way, each Kilobot can receive personalised information about its virtual sensors in real-time and autonomously decides when to modify the virtual environment through virtual actuators.

In this study, we employ ARK to allow robots to apply and read virtual pheromone which evaporates and diffuses over time. We equip the Kilobots with five virtual sensors and one virtual actuator. In particular, each robot is equipped with:

  • area sensor (either depot or source): the Kilobot is able to perceive if it is within the depot or a source area (this information is encoded in 2 bits);

  • item quality sensor: the Kilobot is able to estimate the quality of the item it retrieves from the source area. Additionally, when the Kilobot enters the depot, it can estimate the quality of the items that have been collected up to now (this information is encoded in 4 bits);

  • depot direction sensor: the Kilobot always knows its relative direction to the depot (this information is encoded in 4 bits);

  • wall sensor: the Kilobot can sense if there is a wall at a distance of \(\sim 5\,\hbox {cm}\) in front of itself; note that this does not allow the Kilobot to sense the presence of other robots (this information is encoded in 4 bits);

  • pheromone gland actuator: the Kilobot can deposit a drop of pheromone at its location (it expresses this behaviour by blinking its LED blue);

  • pheromone antennae: the Kilobot can sense the presence of pheromone at a distance of \(\sim 3.5\,\hbox {cm}\) from its centre in front of itself (this information is encoded in 4 bits, see Fig. 1b).

Fig. 1

a A picture of a Kilobot with a 3D printed ring [originally designed for the study of Pratissoli et al. (2019)] which considerably improves ARK’s performance in terms of tracking and LED colour detection. b Kilobots sense via ARK the presence of virtual pheromone in front of themselves at a distance of \(\sim 3.5\,\hbox {cm}\) in four \(45^\circ \)-wide sectors. The virtual sensor indicates the presence or absence of pheromone as binary values, therefore, the Kilobot has no information about the pheromone quantity or concentration difference. In this illustration, pheromone is represented as blue circles, and thus, the virtual sensor readings are [1, 0, 1, 0]. When an exploring Kilobot detects pheromone, it interrupts random exploration and moves towards the detected pheromone. If more than one sector has pheromone (as in the illustration), to decide its motion direction the robot compares the sectors’ direction with the depot direction (depot illustrated as a house and direction differences as red and green angles), and moves towards the largest angle (green arrow) (Colour figure online)

To store information about the pheromone, ARK models the environment as a discrete 2D matrix with cells of \(6.7\times 6.7\, {\mathrm{mm}^2}\). At each time-step of length \(\varDelta t=0.5\, \mathrm{s}\), ARK updates the pheromone matrix by adding the pheromone deposited by the robots (each drop consists of an increment of \(\phi =250\) in the cell under the robot’s centre), and computes evaporation and diffusion of the pheromone. Each matrix cell \(m(i,j)\) is updated as

$$\begin{aligned} m(i,j) = m(i,j)[\, \mathrm {e}^{\log (0.5) \epsilon \, \varDelta t} - 4\,\gamma \, \varDelta t\,] + [m(i, j \pm 1)+m(i \pm 1, j)]\gamma \, \varDelta t, \end{aligned}$$
(1)

where the parameters \(\epsilon =0.1\) and \(\gamma =0.02\) are the evaporation and diffusion rates, respectively. Equation (1) is a discrete realisation of Fick’s law of diffusion (Fick 1855), where we introduce the exponential term to take into account the pheromone evaporation consistently with studies from biology (Garnier et al. 2013).
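As an illustration, the update of Eq. (1) can be sketched in a few lines of Python. This is a minimal sketch rather than the ARK implementation: the function name `update_pheromone`, the zero-pheromone treatment of cells outside the grid, and the choice to add drops before applying decay are assumptions made here. Note that, without diffusion, the exponential term equals \(0.5^{\epsilon \varDelta t}\), so with \(\epsilon =0.1\) the pheromone in a cell halves every \(1/\epsilon = 10\, \mathrm{s}\).

```python
import math

EPSILON = 0.1   # evaporation rate (value given in the text)
GAMMA = 0.02    # diffusion rate (value given in the text)
DT = 0.5        # time-step length in seconds (value given in the text)
PHI = 250.0     # pheromone increment per deposited drop (value given in the text)


def update_pheromone(m, deposits=()):
    """Return a new grid after one time-step of deposition, evaporation and diffusion.

    m        -- list of rows (lists of floats): the pheromone matrix
    deposits -- iterable of (i, j) cells in which a robot dropped pheromone
    Cells outside the grid are treated as holding no pheromone (assumption).
    """
    rows, cols = len(m), len(m[0])

    # Copy the grid and add the new drops first (ordering is an assumption).
    grid = [row[:] for row in m]
    for i, j in deposits:
        grid[i][j] += PHI

    # Fraction of a cell's own pheromone that stays in place, as in Eq. (1).
    keep = math.exp(math.log(0.5) * EPSILON * DT) - 4 * GAMMA * DT

    def cell(i, j):
        # Explicit bounds check, so negative indices never wrap around.
        return grid[i][j] if 0 <= i < rows and 0 <= j < cols else 0.0

    new = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            neighbours = (cell(i, j - 1) + cell(i, j + 1)
                          + cell(i - 1, j) + cell(i + 1, j))
            new[i][j] = grid[i][j] * keep + neighbours * GAMMA * DT
    return new
```

For a single drop in the centre of an otherwise empty grid, one step leaves most of the pheromone in the centre cell and moves a fraction \(\gamma \, \varDelta t\) of it to each of the four neighbours; away from the grid border, the total amount decreases only through evaporation.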

A simple individual behaviour for complex coordination

The individual robot behaviour is relatively simple and can be described by the probabilistic finite state machine (PFSM) illustrated in Fig. 2. The main structure of the behaviour is based on the control software designed by Font Llenas et al. (2018). The behaviour has been enriched by adding a new Obstacle Avoidance state (indicated as AO in Fig. 2), by including an additional form of indirect communication that enables adaptability to different quality scales (as described in Sect. 4.1), and by allowing for probabilistic transitions and tuneable pheromone functions (as described in Sect. 4.2).

The robots do not have previous knowledge about the number, location, and items’ quality of the source areas. Therefore, a robot starts by exploring the environment to discover source areas (state RW in Fig. 2). Due to the Kilobot’s limited capabilities (see Sect. 3.2), the exploration is performed via an isotropic random walk which is a simple and efficient method to search for targets in an unknown environment (Dimidov et al. 2016). The random walk alternates straight motion for 10 s with a uniformly random rotation in \([-\pi , \pi ]\). Upon encountering a source area, the robot (virtually) picks up an item and transports it to the depot (state GD in Fig. 2). As indicated in Sect. 3.1, we assume that the robots are limited in memory and only able to keep track of the direction towards a single location in space, in our case the direction to the depot. This assumption is in line with the behaviour of several ant species which rely on path integration to return to the nest (Collett and Collett 2002; Bregy et al. 2008; Heinze et al. 2018). The robots follow the direction to the depot to bring back collected items. Instead, to memorise the source locations, the robots rely on their stigmergic coordination which represents a form of collective memory. Therefore, on its way to the depot, the robot lays down virtual pheromone to allow itself, as well as other robots, to find the source area again. Every four seconds, the robot takes a probabilistic decision to deposit the next pheromone drop using the function \(P_\phi (Q_i)\), which is a function of the collected item’s quality \(Q_i\). The function \(P_\phi (Q_i)\) is given by Eq. (2) and described in detail in Sect. 4.2. On arriving at the depot, the robot unloads the item and probabilistically decides [according to Eq. (3)] to turn back to follow the just-formed pheromone trail (state TB in Fig. 2), or to interrupt its exploitation of this source area and to resume exploration through random walk.
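The exploration phase described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Kilobot controller: the function name `random_walk` is hypothetical, rotations are treated as instantaneous (a real Kilobot rotates at about \(40^\circ /\mathrm{s}\)), and walls, sources, and pheromone are ignored.

```python
import math
import random

SPEED_CM_S = 1.0    # approximate Kilobot speed given in the text
STRAIGHT_S = 10.0   # duration of each straight segment given in the text


def random_walk(segments, seed=None, x=0.0, y=0.0, heading=0.0):
    """Return the list of positions (in cm) visited at the end of each segment.

    Each segment is a uniformly random rotation in [-pi, pi] followed by
    straight motion for STRAIGHT_S seconds. Rotation time is neglected
    (assumption).
    """
    rng = random.Random(seed)
    path = [(x, y)]
    for _ in range(segments):
        heading += rng.uniform(-math.pi, math.pi)
        x += SPEED_CM_S * STRAIGHT_S * math.cos(heading)
        y += SPEED_CM_S * STRAIGHT_S * math.sin(heading)
        path.append((x, y))
    return path
```

With these values, each straight segment covers about 10 cm, which is the same order of magnitude as the radius of the source areas and of the depot.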
When a robot senses virtual pheromone via the virtual antennae (composed of four sectors described in Sect. 3.3), the robot follows the trail by moving in the direction of the triggered antennae sector (state FP in Fig. 2). If the robot senses pheromone in more than one direction, e.g. both left and right sectors as in the illustration of Fig. 1b, the robot compares the sensed-pheromone directions with the direction to the depot (red and green angles in Fig. 1b) and moves in the direction with the largest difference (green arrow in Fig. 1b). This decision relies on the assumptions that robots only deposit pheromone on their straight path from a source area to the depot and that they always have access to the depot vector.
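The sector-selection rule can be sketched as below. This is a minimal sketch under stated assumptions: the sector centres of \(\pm 22.5^\circ \) and \(\pm 67.5^\circ \) relative to the robot's heading are inferred here from the four \(45^\circ \)-wide frontal sectors, the sign convention (negative angles to the left) is arbitrary, and the names `angular_difference` and `choose_sector` are hypothetical.

```python
# Assumed centres of the four 45-degree-wide frontal sectors, in degrees
# relative to the robot's heading (negative = left, positive = right).
SECTOR_CENTRES_DEG = (-67.5, -22.5, 22.5, 67.5)


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def choose_sector(readings, depot_dir_deg):
    """Return the index of the sector to follow, or None if no pheromone is sensed.

    readings      -- four binary values, one per sector, as in Fig. 1b
    depot_dir_deg -- direction to the depot relative to the robot's heading
    Among the sectors that sense pheromone, the one pointing furthest away
    from the depot is selected; ties go to the lowest index (assumption).
    """
    candidates = [k for k, bit in enumerate(readings) if bit]
    if not candidates:
        return None
    return max(candidates,
               key=lambda k: angular_difference(SECTOR_CENTRES_DEG[k], depot_dir_deg))
```

For the readings [1, 0, 1, 0] of Fig. 1b and a depot located on the robot's left, the function selects the sector on the right, since trails are laid on straight paths from source to depot and moving away from the depot leads towards the source.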

Compared with previous studies (Font Llenas et al. 2018), the robot’s behaviour has been enriched through the inclusion of obstacle avoidance (state AO in Fig. 2). In fact, robots have been equipped with a virtual sensor to detect walls (see Sect. 3.3). The robot reacts to a wall only if sensed in a frontal position, i.e. the two central sectors in the range \([-\,45^\circ ,45^\circ ]\) of the robot’s heading (note that the virtual wall sensor is composed of four sectors, arranged like those of the virtual antennae of Fig. 1b). Upon wall detection, the robot turns by about \(22.5^\circ \) away from the sensed obstacle, then moves straight for 2.5 s, and finally returns to the go depot (GD) state if it carries an item, or to the random walk (RW) state otherwise. This behaviour may be triggered multiple times, until no obstacle is sensed in the central sectors. In case of symmetric sensing, i.e. both central sectors sense an obstacle, the robot uses the lateral obstacle sectors as a tie-breaker to turn in the freest direction. In the case of complete symmetry, the direction is selected at random.
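The turn-direction rule of the obstacle avoidance state can be sketched as follows. This is only an illustration: the ordering of the four wall-sensor sectors (left lateral, left central, right central, right lateral) and the function name `avoidance_turn` are assumptions made here.

```python
import random


def avoidance_turn(wall, rng=random):
    """Return 'left', 'right', or None (no reaction) for a wall-sensor reading.

    wall -- four binary values, assumed to be ordered as
            (left lateral, left central, right central, right lateral)
    """
    left_lat, left_cen, right_cen, right_lat = wall

    # The robot reacts only to walls sensed in the two central sectors.
    if not (left_cen or right_cen):
        return None

    # Wall on one side only: turn away from it.
    if left_cen and not right_cen:
        return "right"
    if right_cen and not left_cen:
        return "left"

    # Both central sectors triggered: lateral sectors act as tie-breaker.
    if left_lat and not right_lat:
        return "right"
    if right_lat and not left_lat:
        return "left"

    # Complete symmetry: pick a direction at random.
    return rng.choice(("left", "right"))
```

After choosing the direction, the robot would turn by about \(22.5^\circ \), move straight for 2.5 s, and re-evaluate the sensor, so several consecutive calls may be needed before the central sectors are clear.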

Fig. 2

Probabilistic finite state machine (PFSM) of the individual robot behaviour. Circles represent states, and arrows are transitions. Robots start exploring the environment through a random walk (RW); when they find a source area they collect an item and return to the depot (GD) laying pheromone according to Eq. (2). Once arrived at the depot, they either turn back (TB) or resume exploration (RW). When explorer robots detect pheromone, they follow it (FP). When robots detect a wall, they avoid it (AO). Controlling individuals through this simple PFSM leads to sophisticated collective foraging dynamics

Adaptivity to relative quality differences

The robots do not have any prior information about the range of the sources’ qualities that the unknown environment can offer. In order to allow the swarm to tune its behaviour to an unknown quality range, the individual robots update over time their knowledge of the best currently available quality \(Q_{\max }\). Initially, the robot has no prior knowledge about the quality range and thus ranks the first source it finds as the best available. Over time, the robot repeatedly compares its own estimate of the best available quality \(Q_{\max }\) with the qualities of items collected by other swarm members. The communication between robots is indirect and takes place within the depot. Each time a robot enters the depot, it can see the qualities of the items collected by the swarm until now; thus, the robot compares its \(Q_{\max }\) with the best quality among those items and, if the latter is higher, updates its \(Q_{\max }\) accordingly. This mechanism is consistent with animal behaviour where individuals can assess the nutrient quality of the swarm’s reserves and compare against their own (Dussutour and Simpson 2009; Arganda et al. 2014).
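The update of \(Q_{\max }\) amounts to keeping a running maximum, as in the sketch below. The class name `QualityRange` and its methods are hypothetical, and updating \(Q_{\max }\) when the robot itself collects a better item is an assumption made here for completeness.

```python
class QualityRange:
    """Running estimate of the best available quality Q_max held by one robot."""

    def __init__(self):
        # No prior knowledge about the quality range.
        self.q_max = None

    def observe_item(self, quality):
        """Update Q_max with the quality of an item the robot has just collected.

        The first item found is ranked as the best available.
        """
        if self.q_max is None or quality > self.q_max:
            self.q_max = quality

    def visit_depot(self, collected_qualities):
        """Update Q_max from the qualities of the items stored in the depot."""
        for quality in collected_qualities:
            self.observe_item(quality)
```

A robot that first finds a source of quality 4 and later sees an item of quality 7 in the depot ends up with \(Q_{\max }=7\); subsequent visits that show only lower qualities leave the estimate unchanged.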

In our study, we consider unlimited item sources to investigate the steady-state regime; however, in case of limited sources (i.e. with a limited number of items) the robots may update their quality range by only observing the latest collected items. In this way, we expect the swarm to be able to adapt flexibly to the appearance or depletion of sources.
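The quality-range update can be sketched as follows. The function names, the representation of the depot contents as a list of qualities, and the window length are hypothetical choices for illustration; the second function corresponds to the variant suggested for limited sources, which is not evaluated in this study.

```python
from collections import deque


def update_q_max(q_max, depot_qualities):
    """Return the robot's best-known quality after a depot visit.

    q_max is None until the robot has found its first source.
    depot_qualities lists the qualities of items collected by the swarm.
    """
    best_seen = max(depot_qualities, default=None)
    if best_seen is None:
        return q_max  # empty depot: nothing to compare against
    if q_max is None or best_seen > q_max:
        return best_seen
    return q_max


def windowed_q_max(depot_qualities, window=5):
    """Variant for depletable sources: consider only the latest items.

    The window length is an assumed parameter of this sketch.
    """
    recent = deque(depot_qualities, maxlen=window)
    return max(recent) if recent else None
```

With the windowed variant, the best-known quality can decrease again once a high-quality source is depleted and its items leave the observation window.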

Modulation of the individual rules to obtain a plastic behaviour

After collecting an item, the robot returns to the depot laying a pheromone trail. The pheromone trail acts as a form of indirect communication between robots which inform each other about paths connecting depot to discovered sources. Collective contribution to these trails leads to a form of swarm memory which allows the swarm to remember the location of sources in the environment. In fact, our simple robots cannot internally store sources’ locations, although the swarm, as a whole, can remember locations through pheromone trails. A pheromone trail is formed by a sequence of drops that the robot deposits via its virtual pheromone gland (see Sect. 3.3). Similar to the approach of Font Llenas et al. (2018), a robot probabilistically decides every four seconds whether to lay the next drop or not. In the previous work, we implemented a simple linear function to map the quality \(Q_i\) into a pheromone deposition probability, i.e. \(P_\phi (Q_i)=Q_i/Q_{\max }\). Linking the pheromone deposition function to perceived source quality allowed the swarm to give priority to better-quality sources over inferior sources.

In this study, we implement a tuneable function to allow the robot to regulate its selectivity on the quality through a single parameter \(\alpha \ge 0\). The probability to deposit the next pheromone drop is given by

$$\begin{aligned} P_\phi (Q_i)=\mathrm {e}^{\alpha (Q_i-Q_{\max })Q_i^{-1}} \; . \end{aligned}$$
(2)

The individual robots have access to \(\alpha \) in a decentralised way and can alter this value to vary the global response. With \(\alpha > 1\), the function has an exponential shape on \(Q_i\), resulting in high selectivity in favour of the highest-quality sources. A value of \(\alpha \approx 1\) leads to an (approximately) linear response and therefore approximates the function investigated in Font Llenas et al. (2018); Eq. (2) is thus a generalisation of the previous specific function. Finally, decreasing \(\alpha \) below 1 gradually flattens the function, which in the limit \(\alpha =0\) becomes the constant \(P_\phi (Q_i)=1\); this results in constant pheromone trails irrespective of the sources’ qualities.
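As an illustration, Eq. (2) and the resulting probabilistic drop decision can be written as below. The function names are hypothetical, and the random draw is only a sketch of how a robot might apply the probability at each decision step.

```python
import math
import random


def deposition_probability(q_i, q_max, alpha):
    """Eq. (2): P_phi(Q_i) = exp(alpha * (Q_i - Q_max) / Q_i)."""
    if q_i <= 0:
        raise ValueError("source quality must be positive")
    return math.exp(alpha * (q_i - q_max) / q_i)


def lays_drop(q_i, q_max, alpha, rng=random):
    """Bernoulli decision on whether to deposit the next pheromone drop."""
    return rng.random() < deposition_probability(q_i, q_max, alpha)
```

For the best available source (\(Q_i=Q_{\max }\)) the probability is 1 for every \(\alpha \), whereas for an inferior source it shrinks as \(\alpha \) grows, which is the selectivity effect described above.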

To further expand the individual robot capabilities to be able to balance the distance-quality trade-off, we introduce a decay function \(P_d(t_i)\) that robots use, upon arrival in the depot with an item (event indicated with the letter ‘a’ in Fig. 2), to decide whether to keep exploiting the same source or to start exploring for new sources. \(P_d(t_i)\) is inspired by similar abandonment behaviours observed in social insects [e.g. foraging ants (Shaffer et al. 2013) and house-hunting honeybees (Seeley et al. 2012)] and allows the robots to abandon exploiting source \(A_i\) that required a long travel time \(t_i\) (either because it is distant or has an overcrowded path). The travel time \(t_i\) is measured by the robots as the time spent between the item collection (from the source \(A_i\)) and the item deposition (in the depot). The function \(P_d(t_i)\), similarly to \(P_\phi (Q_i)\) of Eq. (2), is modulated by the parameter \(\alpha \) as

$$\begin{aligned} P_{d}(t_i)=(\alpha +1)^{-2} \mathrm {e}^{\frac{t_i-t_{\max }}{ (\alpha +1)\sqrt{t_i}} } \end{aligned}$$
(3)

where \(t_{\max }\) is a parameter indicating the robot’s prior knowledge on the maximum acceptable time to return from a source. The value of \(t_{\max }\) could be adaptively tuned (similarly to \(Q_{\max }\) in Sect. 4.1), although in this study we do not explore this aspect and we fix \(t_{\max }= 100 \,{\hbox {s}}\). Assuming a fixed \(t_{\max }\) is reasonable, because in both biological and artificial systems source areas may be accepted only if they are located within a certain maximum distance (or travel time \(t_i\)) from the depot that is decided a priori.
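A direct transcription of Eq. (3) is sketched below. Reading \(P_d(t_i)\) as the probability of abandoning the source, and clamping it to 1 for travel times beyond \(t_{\max }\), are assumptions of this sketch; the function names are hypothetical.

```python
import math

T_MAX = 100.0  # seconds, the value fixed in this study


def abandonment_value(t_i, alpha, t_max=T_MAX):
    """Eq. (3): (alpha+1)^-2 * exp((t_i - t_max) / ((alpha+1) * sqrt(t_i)))."""
    if t_i <= 0:
        raise ValueError("travel time must be positive")
    exponent = (t_i - t_max) / ((alpha + 1) * math.sqrt(t_i))
    return (alpha + 1) ** -2 * math.exp(exponent)


def abandonment_probability(t_i, alpha, t_max=T_MAX):
    """Eq. (3) clamped to [0, 1]; the clamping is an assumption."""
    return min(1.0, abandonment_value(t_i, alpha, t_max))
```

With \(\alpha =0\) the value grows steeply with \(t_i\) and reaches 1 at \(t_i=t_{\max }\), while with \(\alpha =10\) the prefactor \((\alpha +1)^{-2}=1/121\) keeps it small over the travel times considered here.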

Equations (2) and (3) are linked by the parameter \(\alpha \) which the robots can regulate to alter the swarm behaviour. Increasing \(\alpha >1\) has the combined effect of increasing discriminability on quality \(Q_i\) and flattening \(P_d(t_i) \approx 0\) for any distance; therefore, the swarm ignores distance but selects the higher-quality source. Conversely, small \(\alpha < 1\) flattens out quality differences \(P_\phi (Q_i) \approx 1\) and accentuates differences on travel time with an exponential abandonment \(P_d(t_i)\) on high travel times; this leads to a system where the only discriminating factor on source selection is distance due to a combination of evaporation and abandonment on farther sources. Finally, intermediate values \(\alpha \approx 1\) give a quasi-linear response of \(P_\phi (Q_i)\) and sublinear \(P_d(t_i)>0\) which allow the swarm to balance the distance-quality trade-off [similarly to what has been reported in Font Llenas et al. (2018)].

An optimal resource collection model

In this section, we model the optimal resource collection by the robot swarm through a mathematical model inspired by general aspects of optimal foraging theory (Kacelnik 1984; Houston and McNamara 2014).

Our model describes the utility gained by collection of resource items discounted by the cost incurred in transporting these items to the depot. The main components of our model are the items’ qualities, the allocation of robots to various source areas, and the source–depot travel time. We model the robot allocation as \(\rho _j\) (with \(j \in \{1,\ldots ,n\}\)) which is the fraction of robots on the trail between central depot and source area \(A_j\). All robots that are actively involved in transportation of items from the n sources are called workers; their fraction is denoted by \(\rho _\mathrm{w}=\sum ^n_{j=1} \rho _j\). The remaining robots that explore the landscape are called explorers; their fraction is denoted by \(\rho _\mathrm{e}=1-\rho _\mathrm{w}\). The travel time is a function of the source–depot distance and of the traffic congestion on the path. In fact, crowded paths lead to frequent collisions between robots and result in longer travel times. The model is derived in Appendix A; here, we report the main quantity which is the swarm yield R, defined as

$$\begin{aligned} \begin{aligned} R= \sum _{j=1}^{n} \frac{q_j\,\beta _j\,\rho _j\,S}{\tilde{d}_j^2} , \qquad {\text {with}} \quad \tilde{d}_j= d_j + v_0\,T_{C,j}(\rho _j\,S) \,, \end{aligned} \end{aligned}$$
(4)

where S is the swarm size, \(q_j=Q_j/Q_{\max }\) is the normalised quality of source area \(A_j\), \(\rho _j\) is the fraction of robots on the trail between central depot and source area \(A_j\), the parameter \(\beta _j\) is a fitting parameter characterising the relationship between the number of collected items from source \(A_j\) and the number of robots on the trail to \(A_j\) (see Eq. (7) in Appendix A), \(d_j\) is the distance between source area \(A_j\) and depot, \(v_0=1\, \mathrm{cm }/\mathrm{s}\) is the Kilobot’s speed, and the function \(T_{C,j}(\rho _j\,S)\), defined in Eq. (12), models the additional travel time arising from traffic congestion. Therefore, Eq. (4) models traffic congestion as an increase of the travel distance \(d_j\) by the additional length \(v_0\,T_{C,j}(\rho _j\,S)\), which yields the effective distance \(\tilde{d}_j\).
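Equation (4) can be evaluated numerically as in the following sketch. Because the congestion term \(T_{C,j}\) is defined in Eq. (12) of the appendix, it is passed in here as a user-supplied function; using a single function for all sources, as well as all names in the code, are simplifying assumptions.

```python
V0 = 0.01  # Kilobot speed in m/s (1 cm/s)


def swarm_yield(q, beta, rho, d, swarm_size, congestion_time, v0=V0):
    """Eq. (4): sum over sources of q_j * beta_j * rho_j * S / d_eff_j**2.

    congestion_time(n) returns the extra travel time in seconds for n
    robots on a trail; it stands in for T_C,j of Eq. (12).
    Distances d are given in metres.
    """
    total = 0.0
    for q_j, beta_j, rho_j, d_j in zip(q, beta, rho, d):
        workers_j = rho_j * swarm_size
        d_eff = d_j + v0 * congestion_time(workers_j)
        total += q_j * beta_j * workers_j / d_eff ** 2
    return total


def no_congestion(n):
    """Collision-free limit: no additional travel time."""
    return 0.0


# Two sources at 1 m, evenly split swarm of 200, no congestion.
example = swarm_yield([1.0, 0.5], [0.965, 0.965], [0.5, 0.5],
                      [1.0, 1.0], 200, no_congestion)
```

In the collision-free limit the yield reduces to a quality-weighted worker count divided by the squared distance, so the example evaluates to \(0.965\cdot 100 + 0.5\cdot 0.965\cdot 100 = 144.75\,\text {m}^{-2}\).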

Estimation of model parameters from simulation data

The model of Appendix A has three free parameters per source area (\(T_{0,j}\), \(\beta _j\), and \(\kappa _j\), with \(j\in \{1,\ldots ,n\}\)) that need to be estimated from data. To do so, we use the relationship between the number of robots on a path and the number of collected items given in Eq. (7). For the case of two source areas, the results of fitting are depicted in Fig. 3 and summarised in Table 2 (in Appendix A). As shown in Fig. 3, for small-to-medium numbers of robots on a trail, the number of collected items per time interval increases linearly with the number of robots on a trail, whereas for medium-to-large numbers of robots on a trail, we observe a nonlinear decay. This type of curve is widespread in natural and artificial systems and is often referred to as the Universal Scalability Law (Gunther 2000; Krause et al. 2002; Hamann 2012, 2013, 2018a).

Fig. 3

Fits of Eq. (7) to data generated by physics-based simulations in order to obtain the model parameters reported in Table 2 in Appendix A. Fitting is performed in case of \(n=2\) source areas with different quality and equal distance in a, equal quality and different distance in b, and equal quality and distance in c. Data points are represented using symbols, and fits are represented using lines (circles and solid grey lines show collection from source \(A_1\), while triangles and dash-dotted blue lines show collection from source \(A_2\)). Error bars represent 95% confidence intervals. There is a linear growth for small-to-medium numbers of robots on a path, and a nonlinear decay for medium-to-large numbers of robots on a path. This type of growth-decay curve as a function of population size is widespread in nature (Krause et al. 2002) as well as in engineering (Gunther 2000) (Colour figure online)

Basic properties of the optimal resource collection model

To study the basic properties of the yield function R in Eq. (4), we consider the case of resource collection in an environment with \(n=2\) source areas. The robot swarm aims at optimally allocating its robots between the two source areas to maximise the yield R. For simplicity, we assume that all robots in the swarm are involved in resource collection (i.e. all robots are workers and \(\rho _\mathrm{w}=1\), \(\rho _\mathrm{e}=0\)); the fraction \(\rho _1=\rho \) collects items from source \(A_1\), and the fraction \(\rho _2=1-\rho \) collects items from source \(A_2\). The yield function in Eq. (4) is then given as

$$\begin{aligned} \begin{aligned} R(\rho ) = R_1(\rho ) + R_2(\rho ) \,;\,\, R_1(\rho ) = \frac{q_1\,\beta _1\,\rho \,S}{\tilde{d}_1^2} ,\,\, R_2(\rho )=\frac{q_2\,\beta _2\,(1-\rho )\,S}{\tilde{d}_2^2} , \end{aligned} \end{aligned}$$
(5)

Here, we are interested in how the robot swarm allocates its resources. Therefore, we explicitly mention the dependency of R on \(\rho \) in Eq. (5); in what follows we derive the optimal value of \(0\le \rho \le 1\) that maximises Eq. (5). Different outcomes are possible. If we consider increasing \(\rho \), where \(\rho \in [0,1]\), then we have

  • 1 global maximum at \(\rho =1\) (all workers allocated to source area \(A_1\)), if \(R(\rho )\) monotonically increases,

  • 1 global maximum at \(\rho =0\) (all workers allocated to source area \(A_2\)), if \(R(\rho )\) monotonically decreases, or

  • either 1 global maximum or 2 local maxima (one of which is also the global maximum) where \(0< \rho < 1\) (workers split between source area \(A_1\) and \(A_2\)).

For the last case, we can derive the optimal swarm deployment with respect to \(\rho \) from \(\partial R/\partial \rho =0\). We give the full expressions of the first-order derivatives in Eq. (15) in Appendix B and use a graphical approach in this section to picture the behaviour of \(R(\rho )\). Without loss of generality, below we make use of the averaged quantities obtained for \(d_1=d_2=1\,\text {m}\) and \(q_1=q_2=1\) (reported in Table 2) to demonstrate the basic behaviour of the model.
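The graphical approach can be mimicked numerically by evaluating Eq. (5) on a grid of \(\rho \) values and taking the maximiser. In the sketch below, the congestion law `toy_congestion` (extra travel time proportional to the number of robots on the trail) is a purely hypothetical stand-in for Eq. (12), so the numbers it produces are illustrative and do not reproduce the figures of this section.

```python
V0 = 0.01  # Kilobot speed in m/s (1 cm/s)


def two_source_yield(rho, swarm_size, q1, q2, d1, d2, beta, congestion_time):
    """Eq. (5) for n = 2 with rho_1 = rho and rho_2 = 1 - rho."""
    n1 = rho * swarm_size
    n2 = (1.0 - rho) * swarm_size
    d1_eff = d1 + V0 * congestion_time(n1)
    d2_eff = d2 + V0 * congestion_time(n2)
    return q1 * beta * n1 / d1_eff ** 2 + q2 * beta * n2 / d2_eff ** 2


def best_allocation(yield_of_rho, steps=1000):
    """Grid search for the rho in [0, 1] that maximises the yield."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=yield_of_rho)


def no_congestion(n):
    return 0.0


def toy_congestion(n):
    """Hypothetical law (NOT Eq. (12)): one extra second per robot."""
    return 1.0 * n


BETA = 0.965
far = best_allocation(
    lambda r: two_source_yield(r, 200, 1.0, 0.5, 3.5, 3.5, BETA, no_congestion))
near = best_allocation(
    lambda r: two_source_yield(r, 200, 1.0, 0.5, 0.6, 0.6, BETA, toy_congestion))
```

Without congestion the yield is linear in \(\rho \) and the maximiser sits on the boundary (`far` equals 1), whereas the toy congestion law moves the maximiser into the interior of the interval (`near` lies strictly between 0 and 1), matching the first and third cases of the list above.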

Equal distances and varying qualities (fixed swarm size)

We expect that the effect of crowding on a trail will lead to different behaviours when the source areas are near the depot compared with the case when they are sufficiently far away that crowding can be neglected. We consider both scenarios, with \(n=2\) source areas that are either equally far from or equally near to the depot, and show the corresponding results in Fig. 4a, b, respectively, for fixed \(q_1=1\) and varying \(q_2 \in \{0.5,0.75,1\}\). In case the qualities of items contained in the two different source areas are different, at first glance it seems intuitive to allocate as many workers as possible to the source with the higher-quality items. However, this strategy may lead to frequent collisions on the transport path and hence to traffic congestion that slows down the resource income. This means that there is a limitation on the item collection efficiency which depends on the number of workers and the space available on the transport trail.

Fig. 4

Model predictions of yield R depending on worker allocation \(\rho \) for equally distant sources \(d_1=d_2=3.5\,\text {m}\), in a, and equally nearby sources \(d_1=d_2=0.6\,\text {m}\), in b, for fixed \(q_1=1\) and varying \(q_2 \in \{0.5,0.75,1\}\). When sources are relatively far (a), it is optimal to allocate all workers to the better-quality source area, whereas for source areas in close proximity (b) the yield is maximised if the trail between the higher-quality option and depot does not become overcrowded. Parameters: \(\beta _1=\beta _2={\bar{\beta }}=0.965\), \(T_{0,1}=T_{0,2}=\bar{T}_0=0.029\), \(\kappa _1=\kappa _2={\bar{\kappa }}=2.321\), and \(S=200\)

Figure 4a shows that for sufficiently large distances between source area and depot it is indeed optimal to allocate all workers to the source containing higher-quality items. If the qualities of resource items in the two source areas are also equal, then the yield is, albeit only marginally, larger if both source areas are exploited equally. When both source areas are near the depot, the optimal strategy is different. Exploiting both sources equally does not give the highest yield; instead, the best strategy is to avoid traffic congestion on the trail leading to the higher-quality items (low \(\rho \) in Fig. 4b). Interestingly, reducing collisions and congestion on the higher-quality source path means allocating more workers to the lower-quality source. Even when both source areas provide items of equal quality, it is better to focus on either of the two available sources to optimise the resource income from the other (Fig. 4b).

Equal qualities and varying distances (fixed swarm size)

Let us now consider the case when both available source areas contain objects of equal quality. Given a fixed swarm size, optimising the transport yield R should then be affected by the distance of each source area to the depot. In Fig. 5a, we depict the corresponding yield function for equal qualities \(q_1=q_2=1\), swarm size \(S=200\), fixed \(A_1\)’s distance \(d_1=0.6\, \mathrm{m}\), and varying \(A_2\)’s distance \(d_2\in \{0.3\,{\hbox {m}},0.6\,{\hbox {m}},0.9\,{\hbox {m}}\}\). The model results predict an effect of overcrowding: the optimal strategy consists of allocating most robots to the more distant source area in order to keep the path to the closer source congestion-free and to allow for more efficient resource collection.

Fig. 5

Model predictions of yield R depending on worker allocation \(\rho \) for equal qualities \(q_1=q_2=1\). In a, we show the effect of distance on R; we fixed swarm size \(S=200\) and \(A_1\)’s distance \(d_1=0.6 \, \mathrm{m}\), and varied \(A_2\)’s distance \(d_2\in \{0.3\, \mathrm{m},0.6\, \mathrm{m},0.9\, \mathrm{m}\}\). Due to the effect of overcrowding, the maximum yield is attained when only a limited number of workers, 10–20%, collect from the nearer source in order to keep the path free from traffic congestion. In b, we report the effect of the swarm size S when it is larger than, smaller than, or equal to the critical size \(S_\mathrm{c}\). The sources have equal qualities \(q_1=q_2=1\) and depot-source distances \(d_1=d_2=0.6\, \mathrm{m}\). The critical swarm size \(S_\mathrm{c}\) characterises the effect of overcrowding, i.e. when the swarm is sufficiently large (\(S>S_\mathrm{c}\)) it is optimal to keep at least one path with fewer than 50% of the workers; otherwise, the effect of overcrowding would decrease the income of resources on both paths. Parameters: \(\beta _1=\beta _2={\bar{\beta }}=0.965\), \(T_{0,1}=T_{0,2}=\bar{T}_0=0.029\), \(\kappa _1=\kappa _2={\bar{\kappa }}=2.321\)

Critical swarm size for equal qualities and equal distances

Through our model, we can derive the critical size \(S_\mathrm{c}\), below which the best predicted strategy is to equally split workers between the two source areas, assuming sources at equal distances and with equal qualities. We analytically derive the expression to obtain \(S_\mathrm{c}\) in Appendix C and we depict in Fig. 5b how R varies for swarm sizes S larger, smaller, or equal to \(S_\mathrm{c}\). If the swarm size gets too large (\(S>S_\mathrm{c}\)), it is optimal to allocate more robots to one source although collection from either source would give the same reward and incur identical costs. This means that the robot swarm should avoid overcrowding both paths to maximise the yield from resource collection. However, compared with the case \(S<S_\mathrm{c}\), the possible yield for \(S>S_\mathrm{c}\) is smaller, i.e. if the swarm exceeds its critical size \(S_\mathrm{c}\) of collecting workers it cannot achieve the maximum yield it could possibly achieve for a smaller number of workers involved in the resource transportation. This result highlights the importance of controlling the number of workers to maximise the global intake; a strategy implemented in a decentralised fashion by ants (Charbonneau et al. 2015; Pagliara et al. 2018), and recently investigated in the context of swarm robotics (Mayya et al. 2019).
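The notion of a critical size can be illustrated numerically without the closed-form expression of Appendix C. The sketch below assumes two identical sources and a hypothetical congestion law in which the extra travel time grows linearly with the number of robots on a trail (c seconds per robot); this law is not Eq. (12), and the resulting threshold is a property of the toy model only. The even split \(\rho =0.5\) stops being a local maximum when the curvature of \(R(\rho )\) at that point changes sign, which is located here by bisection on the swarm size.

```python
V0 = 0.01  # Kilobot speed in m/s (1 cm/s)


def symmetric_yield(rho, swarm_size, d, c):
    """Yield for two identical sources under the toy law T_C(n) = c * n.

    Quality and beta are set to 1; they do not affect the location
    of the maxima in the symmetric case.
    """
    def branch(fraction):
        n = fraction * swarm_size
        return n / (d + V0 * c * n) ** 2
    return branch(rho) + branch(1.0 - rho)


def curvature_at_even_split(swarm_size, d, c, h=1e-3):
    """Finite-difference second derivative of R at rho = 0.5 (times h^2)."""
    return (symmetric_yield(0.5 + h, swarm_size, d, c)
            - 2.0 * symmetric_yield(0.5, swarm_size, d, c)
            + symmetric_yield(0.5 - h, swarm_size, d, c))


def critical_size(d, c, lo=1.0, hi=10_000.0, iterations=60):
    """Bisection for the swarm size at which the curvature changes sign."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if curvature_at_even_split(mid, d, c) < 0.0:
            lo = mid  # even split is still a local maximum
        else:
            hi = mid  # even split has become a local minimum
    return 0.5 * (lo + hi)


s_c = critical_size(d=0.6, c=1.0)
```

For this toy law the sign change can also be worked out by hand and falls at \(S=4d/(v_0 c)\), i.e. about 240 robots for \(d=0.6\,\text {m}\) and \(c=1\,\text {s}\) per robot, which the bisection recovers. The value obtained with the fitted congestion function of the model will differ.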

Results

Through physics-based simulations, we systematically tested a variety of experimental conditions to study the performance of the proposed system. We validated some of the simulation results through experiments with up to 200 physical Kilobots. In Sect. 6.1, we present a set of simulation results that highlight the benefits of having introduced a virtual wall sensor, adaptability to unknown environmental scenarios, and behaviour modulation to balance the distance-quality trade-off. In Sect. 6.2, we compare the model predictions against robot swarm simulations for different swarm sizes.

The physics-based simulations were conducted with ARGoS (Pinciroli et al. 2012, 2018) which is a state-of-the-art swarm robotics simulator that accurately and efficiently simulates the Kilobots and the ARK system via a dedicated plug-in (Pinciroli et al. 2018). The physical robot experiments were run with fully charged Kilobots whose motors have been automatically calibrated through ARK (Reina et al. 2017). The videos of these experiments are augmented by superimposing the virtual environment information (see a sample image in Fig. 6) and available as online supplementary material (Online Resource 1-9) and at https://www.youtube.com/playlist?list=PLCGKY9OHLZwMaGeB6cxVfxmHwhBFqKF7a. The robot simulation code is open source and available online at https://github.com/DiODeProject/PheromoneKilobotSwarmIntell.

Fig. 6

A picture of an experiment with 50 physical Kilobots, with the virtual environment superimposed on the image. The red (bottom-left) source area \(A_1\) has quality \(Q_1=10\), while the yellow (top-right) source area \(A_2\) has quality \(Q_2=4\). The sources are placed at \(d_1=1\, \mathrm{m}\) and \(d_2=0.6\, \mathrm{m}\) from the central (blue) depot. The (light-blue) shades represent the pheromone trails that the robots deposit and follow. Full videos available as supplementary material (Online Resource 1-9) and at https://www.youtube.com/playlist?list=PLCGKY9OHLZwMaGeB6cxVfxmHwhBFqKF7a (Colour figure online)

Results show tuneable and adaptive swarm responses

We report here the simulation and physical robot results to show evidence of the behaviours obtained through obstacle avoidance, adaptivity, and individual function modulation.

Obstacle avoidance Figure 7b shows a screenshot of an experiment inspired by the well-known study of Goss et al. (1989), which showed that ants are able to exploit the shorter path in double-bridge experiments with branching paths of different lengths. In our system, the individual robots have lower cognitive capabilities than the individual ants. In fact, the Kilobots cannot distinguish pheromone intensity, follow its gradient, or make decisions with respect to differences in pheromone concentration. Nevertheless, the robot swarm was able to preferentially exploit the shorter path. This outcome was not limited to conditions where the pheromone evaporation was too high to sustain a trail on the longer path while still allowing one on the shorter path; it also applied to scenarios in which both paths were viable. In fact, we tested the swarm in an environment where we blocked the shorter path and only the longer path was active (see Fig. 7a), and the robots exploited the longer path, as illustrated in the plot of Fig. 7c. Similar double-bridge experimental setups have been emulated and investigated in previous swarm robotics studies such as Montes de Oca et al. (2010) and Scheidler et al. (2016), in which, however, the swarm behaviour and goal were different.

Fig. 7

A swarm experiment with 50 simulated Kilobots inspired by the ants’ double-bridge experiment by Goss et al. (1989), in which two paths, a longer path (1.8 m long) and a shorter path (1.4 m long), connected source to depot. When the simulated swarm had access to only the longer path (a), the Kilobots reinforced pheromone on that path and used it for their collections. Instead, when both paths were available (b), the Kilobots disregarded the longer path and (almost exclusively) used the shorter for their collections. c Shows the number of robots on the two paths at the end of one simulated hour (boxes range from 1st to 3rd quartile of the data from 100 simulations and indicate the median with a horizontal line; the whiskers extend to 1.5 IQR). The individual Kilobots cannot follow a pheromone gradient nor detect any difference in pheromone concentration. Despite their limited individual capabilities, the robot swarm shows (in certain experimental conditions) behaviour similar to ants’ colonies, which instead rely on much higher cognitive abilities at the individual level

Our results indicate that, for certain types of experimental conditions, cognitively simpler individuals would suffice to reproduce the collective-level behaviour observed in colonies of more complex ants. However, we believe that the ants, exploiting gradient sensing, are more flexible and can optimise path lengths in a larger range of environments than our robotic system. In fact, our results might vary if we increased the robot density and/or varied the paths’ lengths. However, we cannot ascribe the observed behaviour to the manually tuned maximum travel time \(t_{\max }=100\, \mathrm{s}\) of Eq. (3) because our experiments were conducted with \(\alpha =10\) which flattens Eq. (3) to zero for every path length. Therefore, the observed dynamics emerged from a more complex interplay between the Kilobots’ behaviour and the virtual pheromone dynamics, and resulted in an efficient swarm selection of the shortest path.

Adaptivity As described in Sect. 4.1, the swarm is able to adapt to any quality range and to produce a response that only considers the ratio between qualities rather than the absolute quality values. Figure 8 shows the system’s response to three scenarios with \(n=2\) sources with the same quality ratio (i.e. \(Q_2/Q_1= 0.4\)) but different absolute quality values (i.e. \(Q_1=15, \; Q_2 = 6\) on the left, \(Q_1=10, \; Q_2 = 4\) in the centre, and \(Q_1=5, \; Q_2 = 2\) on the right of the x-axes of Figs. 8a, b). The results show that the adaptive strategy (white boxplots) adapted to any condition and, as the quality ratio remained the same, the swarm response also remained the same. Instead, the constant range strategy (dark boxplots) relied on absolute quality values and led to the desired outcome only when the prior knowledge on the quality range matched the environment’s range (central experimental scenario of Fig. 8). The ability to respond to the relative quality of food sources, rather than to an absolute quality range, has been recently documented also in foraging ants (Wendt et al. 2018).

Fig. 8

Simulation results showing the adaptivity of the system. We measured the number of collected items in a and the number of robots on each path in b for the two source areas, the superior \(A_1\) and inferior \(A_2\), both at equal distance \(d_1=d_2=1\, \mathrm{m}\). We kept the same quality ratio, i.e. \(Q_2/Q_1=0.4\), but varied the absolute value of the objects (indicated on the x axis). All experiments were conducted with swarms of \(S=50\) Kilobots and an intermediate value of \(\alpha =0.85\) in Eqs. (2) and (3). Boxes range from 1st to 3rd quartile of the data from 100 simulations and indicate the median with a horizontal line; the whiskers extend to 1.5 IQR. Having a constant range (dark boxplots) shows good results only if the predefined range matches the actual range of the environment (central experiment). Instead, an adaptive strategy allows the swarm to exploit resources as a function of their relative qualities in a range adapted to the environment

Behaviour modulation Via Eqs. (2) and (3), the individual robots can modulate their behaviour to give priority to closer (low \(\alpha \)) or better-quality (high \(\alpha \)) source areas. This modulation at the individual level translates to different collective responses at the swarm level. We investigated such dynamics in swarms of \(S=50\) Kilobots operating in an environment with \(n=2\) sources: a superior source area \(A_1\) at distance \(d_1=1\, \mathrm{m}\) with \(Q_1=10\) and an inferior source area \(A_2\) with \(Q_2=4\) and varying distance \(d_2\in [0.5, 1]{\mathrm{m}}\). The relatively small swarm size was motivated by preliminary results reported in Font Llenas et al. (2018), which showed that large swarms do not discriminate between sources as there are enough robots to maximally exploit any area. Figure 9 shows the effect of the three tested values of \(\alpha \in \{0,0.85,10\}\) on the swarm dynamics. Using \(\alpha =0\) promoted distance selectivity: the simulated swarm had the highest item collection per minute (panel a) from the closest source (\(A_2\)), to which the majority of the workers were deployed (panel b). Using \(\alpha =10\) promoted quality selectivity: the simulated swarm had the highest item collection per minute from the highest-quality source (\(A_1\)), to which the majority of the robots were deployed. Finally, intermediate values of \(\alpha \), e.g. \(\alpha =0.85\), led to a distance-quality trade-off where the swarm exploited the nearer inferior-quality source only if it was much closer than the superior-quality source.

Fig. 9

Effect of the modulation of the parameter \(\alpha \) from Eqs. (2) and (3) to favour nearer source areas (\(\alpha =0\)), to favour the best-quality sources (\(\alpha =10\)), or to balance the distance-quality trade-off (\(0< \alpha < 10\)). Results of \(\alpha =0\) are shown in light-grey, \(\alpha =0.85\) in dark-grey, and \(\alpha =10\) in black. We report the results for simulations and physical robots experiments of one hour each in scenarios with \(n=2\) sources. We excluded the initial exploration phase and indicate mean values for the last 30 min. Physical robots results are indicated as solid symbols with vertical bars indicating the 95% confidence intervals of 3 runs for each condition (the symbols are slightly shifted to avoid bar overlaps but all represent results for \(d_2=0.6\, \mathrm{m}\)). Lines represent the mean of 100 simulations (shaded areas are 95% confidence intervals). Source \(A_1\) had quality \(Q_1=10\) and was located at distance \(d_1=1\, \mathrm{m}\); source \(A_2\) had quality \(Q_2=4\) and varying distance \(d_2 \in [0.5,1.0]{\mathrm{m}}\). We report the rate of collected items per minute in a, the mean number of robots on each path in b, and the rate per minute of collected items weighted by the normalised quality \(q_1=1.0\) and \(q_2=0.5\) in c. Individual robots can locally modulate the decentralised parameter \(\alpha \) to lead the swarm to a range of different collective responses, e.g. selecting almost exclusively the best-quality source (high \(\alpha \)) or balancing the distance-quality trade-off (low \(\alpha \)). Physical robots are less efficient than simulations; however, ordering between sources is preserved; this confirms the effects of \(\alpha \)-modulation observed in simulation

We ran three experiments with 50 physical robots for each of the two limit cases of quality-selective \(\alpha =10\) (solid black symbols) and of distance-selective \(\alpha =0\) (solid light-grey symbols). The videos of these six experiments are available as online supplementary material (Online Resource 1-6) and at  https://www.youtube.com/playlist?list=PLCGKY9OHLZwMaGeB6cxVfxmHwhBFqKF7a. Physical robots showed a resource collection less efficient than simulation; despite this, in both cases, the two strategies favoured either the best-quality or the nearest source area, as shown by the simulations. We explain the observed difference between reality and simulation (the reality gap) as a motion speed difference between robots and simulation. In fact, the simulation was accurately tuned on the movement of fully charged Kilobots (Pinciroli et al. 2018), but did not take into account that the robot’s speed was reduced over time due to the decrease of its battery level.

Figure 9c shows the rate per minute of collected items weighted by their normalised qualities (\(q_1=1.0\) and \(q_2=0.5\)). We did not include any cost because in our experiments every robot moved constantly and continuously (either as worker or as explorer). Therefore, the swarm incurred a constant cost independent of the collections (this would be different if, as ants do, some individuals would stop exploration to save energy (Charbonneau et al. 2015), or to avoid overcrowding as discussed above). Interestingly, the results show that there was not one \(\alpha \)-value that was better than all others; rather the best strategy varied in relation to the environment. For large distance difference, i.e. \(d_2 \ll d_1\), the distance-selective strategy (\(\alpha =0\)) displayed the highest weighted collection. Conversely, for similar distances, the best strategy consisted of favouring the best-quality source (\(\alpha =10\)), analogously to what has been observed in some species of ants which focused their foraging efforts on the richer of two equally distant sugar sources (Beckers et al. 1993; Shaffer et al. 2013).
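The quality-weighted collection rate is a simple weighted sum of the per-source rates, as sketched below. The function name is hypothetical and the rates in the example are made-up numbers for illustration, not measured data.

```python
def weighted_collection_rate(rates_per_min, normalised_qualities):
    """Sum of per-source collection rates weighted by normalised quality."""
    if len(rates_per_min) != len(normalised_qualities):
        raise ValueError("one quality value is needed per source")
    return sum(r * q for r, q in zip(rates_per_min, normalised_qualities))


# Illustrative rates only: 2 items/min from A_1 and 4 items/min from A_2,
# weighted with q_1 = 1.0 and q_2 = 0.5 as in Fig. 9c.
example = weighted_collection_rate([2.0, 4.0], [1.0, 0.5])
```

In this example both sources contribute equally to the weighted rate, although the inferior source delivers twice as many items per minute.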

Comparison of model and simulation data

Here, we compare the performance of binary resource collections for varying swarm sizes S and varying \(\alpha \) which regulates the swarm strategy [as from pheromone deposition in Eq. (2) and trail abandonment in Eq. (3)]. The plot in Fig. 10 shows the yield R as a function of the fraction of workers allocated to source \(A_1\) (with \(\rho _1=\rho \)) divided by the fraction of total workers involved in resource collection \(\rho _\mathrm{w}\), and of the number of worker robots \(\rho _\mathrm{w}\,S\) (i.e. involved in collecting resource items).
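The two plot coordinates can be obtained from the number of robots counted on each path, as in this minimal sketch. The function name and the input format (a list of per-path robot counts with source \(A_1\) first) are assumptions.

```python
def allocation_coordinates(robots_on_path, swarm_size):
    """Return (rho / rho_w, rho_w * S, rho_e) from per-path robot counts.

    robots_on_path[0] is the number of workers on the trail to A_1.
    """
    workers = sum(robots_on_path)      # rho_w * S
    rho_w = workers / swarm_size       # fraction of workers
    rho_e = 1.0 - rho_w                # fraction of explorers
    if workers == 0:
        normalised = float("nan")      # allocation undefined without workers
    else:
        normalised = robots_on_path[0] / workers  # rho / rho_w
    return normalised, workers, rho_e


# Example: 30 robots on the trail to A_1 and 10 on the trail to A_2, S = 50.
x, y, explorers = allocation_coordinates([30, 10], 50)
```

Here three quarters of the 40 workers serve source \(A_1\), and the remaining 20% of the swarm explores.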

Fig. 10

Comparison of model with simulations and experiments: Total yield R as a function of the normalised swarm allocation \(\rho /\rho _\mathrm{w}\) and the number of worker robots \(\rho _\mathrm{w}\,S\). We report the predicted yield R from the model of Eq. (4) as a colour heatmap, and we overlay robot simulations for three strategies: distance-selective \(\alpha =0\) (circles), distance-quality trade-off \(\alpha =0.85\) (diamonds), and quality-selective \(\alpha =10\) (triangles). We report simulations for swarm sizes \(S=50\) (cyan), \(S=100\) (green), \(S=200\) (purple) and \(S=500\) (white). Under the model’s assumptions, the simulated robot swarm performs best for \(S=200\) and \(\alpha =0.85\) (\(R=150.6\,\text {m}^{-2}\)) in a, \(\alpha =10\) (\(R=177.1\,\text {m}^{-2}\)) in b and \(\alpha =10\) (\(R=120.4\,\text {m}^{-2}\)) in c. Swarms of large size (\(S=500\)) do not achieve good performance as they equally exploit both sources and do not avoid overcrowding. The star symbol in c was obtained from three experiments with 200 Kilobots assuming \(\alpha =0.85\) (see online videos). Error bars represent 95% confidence intervals. Parameters: \(\beta _j\), \(T_{0,j}\) and \(\kappa _j\) are given in Table 2 (Colour figure online)

The best-performing swarms have an intermediate size (i.e. \(S=200\)). Relatively small swarms allocate robots more selectively, depending on the implemented strategy. For instance, in Fig. 10a, the quality-selective strategy (\(\alpha =10\), indicated as triangles) allocates workers predominantly to the best-quality source (\(\rho /\rho _\mathrm{w} > 0.8\)) when \(S \le 200\). Instead, large swarms of \(S=500\) do not discriminate between sources and exploit both equally. The distance-selective strategy (\(\alpha =0\), indicated as circles) in Fig. 10b shows a much smaller deviation, visible only for the smallest swarm. Such a change in the swarm response is not an obvious result because robots cannot perceive each other; it is an emergent property.

In general, simulations and the model differ most for swarms of size \(S=500\). In fact, for large swarms, the model predicts that the best strategy would be to allocate only a limited number of robots to the best path, in order to avoid overcrowding. We suggest that such a strategy could be implemented by allowing the robots to sense their peers (which they cannot do in this study). In the current strategy, we tried to counter overcrowding by including the trail abandonment function of Eq. (3), although this was not sufficient to break the symmetric exploitation in large swarms. The resulting dynamics for \(S=500\) are an equal split between the two paths (Fig. 10a), which could be caused by physical ‘pushing’ between individuals, similarly to what is observed in some experiments of ants’ traffic organisation (Dussutour et al. 2004, 2005; Fourcassié et al. 2010).

To investigate how collisions between individuals affect the collective dynamics, we reproduced the results of Fig. 10 in the collision-free case, in which we removed any effect of physical interactions between robots. Figure 11 reports the model results with a null traffic congestion contribution, i.e. Eq. (12) becomes \(T_{C,j}(\rho _j\,S)=0\). We overlay the simulation results with deactivated collisions, i.e. the Kilobots’ physical body is not simulated and robots can move through each other.

As expected, the model predicts that for every number of workers \(\rho _\mathrm{w}\,S\), the best strategy is always to allocate all workers to the best-quality source (Fig. 11a), or to the closest source (Fig. 11b). Some of the simulations approximate such an optimal behaviour. In the case of asymmetric qualities (Fig. 11a), the quality-selective strategy (\(\alpha =10\) represented as triangles) has high values of \(\rho \). Similarly, the closer area in Fig. 11b is largely exploited by distance-selective strategies (\(\alpha =0\) represented as circles and \(\alpha =0.85\) represented as diamonds).
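The qualitative effect of switching off congestion can be illustrated with a toy calculation. The sketch below is not the model of Eq. (4) or the congestion term of Eq. (12): it assumes a hypothetical yield in which each worker on source j delivers quality \(q_j\) once per round trip, and the round-trip time grows with the number of workers as a power law. The symbols \(T_0\), \(\kappa \) and \(\beta \) are borrowed as names only, and all numerical values are made up.

```python
def toy_yield(n1, n2, q=(1.0, 0.5), t0=(10.0, 10.0), kappa=(0.0, 0.0), beta=2.0):
    """Hypothetical yield of n1 and n2 workers on two sources (toy model only)."""
    total = 0.0
    for n, q_j, t0_j, kappa_j in zip((n1, n2), q, t0, kappa):
        # Assumed round-trip time: free-flow time plus a power-law congestion term.
        travel_time = t0_j + kappa_j * n ** beta
        total += q_j * n / travel_time
    return total


def best_allocation(workers, **model_params):
    """Number of workers on source 1 that maximises the toy yield (exhaustive search)."""
    return max(
        range(workers + 1),
        key=lambda n1: toy_yield(n1, workers - n1, **model_params),
    )


# kappa = 0 plays the role of the collision-free condition.
no_congestion = best_allocation(100)
with_congestion = best_allocation(100, kappa=(0.01, 0.01))
```

In this toy setting, a null congestion term makes the yield linear in the allocation, so the search places all 100 workers on the better source. With a positive congestion term the maximum moves to a mixed allocation, which mirrors the qualitative contrast between Fig. 10 and Fig. 11.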

Fig. 11

Total yield R as a function of the normalised swarm allocation \(\rho /\rho _\mathrm{w}\) and the number of worker robots \(\rho _\mathrm{w}\,S\) in the collision-free condition. We removed the effect of physical interactions (i.e. collisions between robots) that may cause traffic congestion. We report the predicted yield R from the model of Eq. (4) as a colour heatmap and overlay robot simulations for three strategies: distance-selective \(\alpha =0\) (circle), distance-quality trade-off \(\alpha =0.85\) (diamond), and quality-selective \(\alpha =10\) (triangle). We report simulations for swarm sizes \(S=50\) (cyan), \(S=100\) (green), \(S=200\) (purple) and \(S=500\) (white). Without collisions, the predicted best strategy is to allocate all workers to the best-quality or closest source area. The collision-free simulations approximate this result when the corresponding strategy is activated, e.g. quality-selective \(\alpha =10\) (triangle) in panel (a) and distance-selective \(\alpha =0\) (circle) in panel (b). Error bars represent 95% confidence intervals. Parameters: \(\beta _j\), \(T_{0,j}\) and \(\kappa _j\) are given in Table 2 (Colour figure online)

Discussion

Our results show how simple individual agents can collectively forage in a sophisticated manner. We assumed a minimal cognitive architecture including maintenance of a home vector [well evidenced in ants (Collett and Collett 2002; Heinze et al. 2018)], and simple binary detection of pheromone trails and obstacles; our agents are thus much simpler than real ants. Combined with a simple pheromone deposition rule with a single tuneable parameter, however, we are able to qualitatively reproduce classical results such as the shortest path exploitation observed in lab ant colonies (Goss et al. 1989), and able to manage the classical distance-quality trade-off of foraging. We have further derived an optimality model accounting for congestion costs in foraging and examined the effect of resource distribution and colony size on the optimal distribution of foragers over forage patches. While others have previously considered the effect of colony size on recruitment strategy (Planqué et al. 2010; Pagliara et al. 2018; Mayya et al. 2019), our analysis instead assumes the recruitment strategy, and considers the optimal distribution. Our simple heuristic agent controllers are able to approximate the optimal distribution for relatively small swarm sizes, although large swarms depart from optimality. Large swarms cause crowded environments which require strategies to clear paths in order to reduce traffic congestion. We identify two possible strategies to limit traffic congestion: modifying the abandonment strategy or enriching the individual behaviour with collision-reactive states. In this work, after abandonment, the robots simply resumed exploration. The effects of this abandonment strategy are limited as robots quickly rediscover a path (which may be already congested). We believe that a better abandonment strategy [e.g. to stay at the depot for a period of time before resuming exploration, similar to ants (Pagliara et al. 2018)] could improve the results of the abandonment behaviour introduced in this work. Complementarily, traffic flow can be maintained undisrupted even in relatively crowded conditions by individual ants changing their behaviour as a function of collisions with other ants (Dussutour et al. 2004; Poissonnier et al. 2019). Inspired by these results, the robot behaviour could be enriched with new collision-dependent states.
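The difference between the abandonment behaviour used in this work and the suggested alternative can be sketched as a small state machine. This is a hypothetical illustration, not the robot controller: the `RESTING` state, the `rest_ticks` parameter and the class layout are assumptions, and only the worker and explorer roles come from the text.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    EXPLORER = auto()
    WORKER = auto()
    RESTING = auto()  # hypothetical state, not part of the controller in this study


@dataclass
class Robot:
    state: State = State.WORKER
    rest_left: int = 0

    def abandon_trail(self, rest_ticks=0):
        """Leave the trail; rest_ticks=0 means the robot simply resumes exploration."""
        if rest_ticks > 0:
            # Suggested variant: wait at the depot before exploring again.
            self.state = State.RESTING
            self.rest_left = rest_ticks
        else:
            self.state = State.EXPLORER

    def tick(self):
        """Advance one control step; only the resting timer is modelled here."""
        if self.state is State.RESTING:
            self.rest_left -= 1
            if self.rest_left <= 0:
                self.state = State.EXPLORER


# Current behaviour: immediate return to exploration.
current = Robot()
current.abandon_trail()

# Suggested variant with an arbitrary rest of three control steps.
rested = Robot()
rested.abandon_trail(rest_ticks=3)
```

A resting robot stays off the paths for `rest_ticks` steps, which is the mechanism by which such a variant might relieve a congested trail; whether it would do so in practice is untested here.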

Our results are complementary to other approaches to minimal controllers necessary for collective behaviour in the swarm robotics field (Gauci et al. 2014; Özdemir et al. 2018). Simple controllers increase the transferability to various robotics platforms thanks to their limited hardware requirements. Additionally, simple behaviours generally reduce the impact of the reality gap and preserve consistent dynamics in reality and simulations, as shown in our experiments where the same control software produced qualitatively similar results.

Our results illustrate the sophisticated collective dynamics that can be generated even by simple agents, which should be of interest to biologists and of practical utility to engineers. Similarly, our study of swarm size, and the scalability of foraging success, should interest both biologists and engineers, although it is worth noting that at least in some species of ants congestion is much less of a problem than in robots (Hönicke et al. 2015; Poissonnier et al. 2019). In Sect. 6.2, we investigated a case closer to biology in which congestion did not affect the travel time, with model and simulations adapted accordingly. Nevertheless, we argue that taking a unifying perspective on the biology and engineering of collective foraging is illuminating, through both their similarities and their differences.

Notes

  1. Lasius niger ants follow a similar behaviour, laying pheromone trails on their way back to the nest while depositing a quantity of pheromone proportional to the quality of the foraged food (Portha et al. 2004; Czaczkes et al. 2013).

References

  1. Arganda, S., Nicolis, S. C., Perochain, A., Péchabadens, C., Latil, G., & Dussutour, A. (2014). Collective choice in ants: The role of protein and carbohydrates ratios. Journal of Insect Physiology, 69, 19–26.

  2. Arvin, F., Yue, S., & Xiong, C. (2015). Colias-\(\phi \): An autonomous micro robot for artificial pheromone communication. International Journal of Mechanical Engineering and Robotics Research, 4(4), 349–353.

  3. Banks, J. H. (1999). Investigation of some characteristics of congested flow. Transportation research record, 1678(1), 128–134.

  4. Baveco, J. M., Focks, A., Belgers, D., van der Steen, J. J., Boesten, J. J., & Roessink, I. (2016). An energetics-based honeybee nectar-foraging model used to assess the potential for landscape-level pesticide exposure dilution. PeerJ, 4, e2293.

  5. Beckers, R., Deneubourg, J. L., & Goss, S. (1993). Modulation of trail laying in the ant Lasius niger (Hymenoptera: Formicidae) and its role in the collective selection of a food source. Journal of Insect Behavior, 6(6), 751–759.

  6. Beckers, R., Deneubourg, J. L., Goss, S., & Pasteels, J. M. (1990). Collective decision making through food recruitment. Insectes Sociaux, 37(3), 258–267.

  7. Berman, S., Kumar, V., & Nagpal, R. (2011). Design of control policies for spatially inhomogeneous robot swarms with application to commercial pollination. In Proceedings of the 2011 IEEE/RSJ international conference on robotics and automation (ICRA 2011) (pp. 378–385). IEEE.

  8. Bose, T., Reina, A., & Marshall, J. A. R. (2017). Collective decision-making. Current Opinion in Behavioral Sciences, 16, 30–34.

  9. Bosien, A., Turau, V., & Zambonelli, F. (2012). Approaches to fast sequential inventory and path following in RFID-enriched environments. International Journal of Radio Frequency Identification Technology and Applications, 4(1), 28–48.

  10. Bregy, P., Sommer, S., & Wehner, R. (2008). Nest-mark orientation versus vector navigation in desert ants. Journal of Experimental Biology, 211(12), 1868–1873.

  11. Burd, M., Archer, D., Aranwela, N., & Stradling, D. J. (2002). Traffic dynamics of the leaf-cutting ant, Atta cephalotes. The American Naturalist, 159(3), 283–293.

  12. Campo, A., & Dorigo, M. (2007). Efficient multi-foraging in swarm robotics. In F. Almeida e Costa, L. M. Rocha, E. Costa, I. Harvey, & A. Coutinho (Eds.), Advances in artificial life (ECAL 2007). LNCS (Vol. 4648, pp. 696–705). Berlin: Springer.

  13. Campo, A., Gutiérrez, Á., Nouyan, S., Pinciroli, C., Longchamp, V., Garnier, S., et al. (2010). Artificial pheromone for path selection by a foraging swarm of robots. Biological Cybernetics, 103(5), 339–352.

  14. Charbonneau, D., Hillis, N., & Dornhaus, A. (2015). ‘Lazy’ in nature: Ant colony time budgets show high ‘inactivity’ in the field as well as in the lab. Insectes Sociaux, 62(1), 31–35.

  15. Charnov, E. L. (1976). Optimal foraging, the marginal value theorem. Theoretical Population Biology, 9(2), 129–136.

  16. Choe, D. H., Villafuerte, D. B., & Tsutsui, N. D. (2012). Trail pheromone of the Argentine ant, Linepithema humile (Mayr) (Hymenoptera: Formicidae). PLoS ONE, 7(9), e45016.

  17. Collett, T. S., & Collett, M. (2002). Memory use in insect visual navigation. Nature Reviews Neuroscience, 3(7), 542–552.

  18. Cox, M. D., & Myerscough, M. R. (2003). A flexible model of foraging by a honey bee colony: The effects of individual behaviour on foraging success. Journal of Theoretical Biology, 223(2), 179–197.

  19. Czaczkes, T. J., Grüter, C., Ellis, L., Wood, E., & Ratnieks, F. L. W. (2013). Ant foraging on complex trails: Route learning and the role of trail pheromones in Lasius niger. The Journal of Experimental Biology, 216(2), 188–197.

  20. Deneubourg, J. L., Aron, S., Goss, S., & Pasteels, J. M. (1990). The self-organizing exploratory pattern of the argentine ant. Journal of Insect Behavior, 3(2), 159–168.

  21. Detrain, C., & Deneubourg, J. L. (2006). Self-organized structures in a superorganism: Do ants “behave” like molecules? Physics of Life Reviews, 3(3), 162–187.

  22. Detrain, C., & Deneubourg, J. L. (2008). Collective decision-making and foraging patterns in ants and honeybees. Advances in Insect Physiology, 35(08), 123–173.

  23. Detrain, C., Deneubourg, J. L., & Pasteels, J. M. (Eds.). (1999). Information processing in social insects. Basel: Birkhäuser.

  24. Devigne, C., Renon, A. J., & Detrain, C. (2004). Out of sight but not out of mind: Modulation of recruitment according to home range marking in ants. Animal Behaviour, 67(6), 1023–1029.

  25. Dimidov, C., Oriolo, G., & Trianni, V. (2016). Random walks in swarm robotics: An experiment with Kilobots. In M. Dorigo, et al. (Eds.), Swarm intelligence (ANTS 2016), LNCS (Vol. 9882, pp. 185–196). Berlin: Springer.

  26. Ducatelle, F., Di Caro, G. A., Pinciroli, C., Mondada, F., & Gambardella, L. M. (2011a). Communication assisted navigation in robotic swarms: self-organization and cooperation. In Proceedings of the 2011 IEEE/RSJ international conference on intelligent robots and systems (IROS 2011) (pp. 4981–4988). IEEE.

  27. Ducatelle, F., Di Caro, G. A., Pinciroli, C., & Gambardella, L. M. (2011b). Self-organized cooperation between robotic swarms. Swarm Intelligence, 5(2), 73–96.

  28. Dussutour, A., Beekman, M., Nicolis, S. C., & Meyer, B. (2009). Noise improves collective decision-making by ants in dynamic environments. Proceedings of the Royal Society of London B: Biological Sciences, 276(1677), 4353–4361.

  29. Dussutour, A., Deneubourg, J. L., & Fourcassié, V. (2005). Temporal organization of bi-directional traffic in the ant Lasius niger (L.). Journal of Experimental Biology, 208(15), 2903–2912.

  30. Dussutour, A., Fourcassié, V., Helbing, D., & Deneubourg, J. L. (2004). Optimal traffic organization in ants under crowded conditions. Nature, 428(6978), 70–73.

  31. Dussutour, A., & Simpson, S. J. (2009). Communal nutrition in ants. Current Biology, 19(9), 740–744.

  32. Essche, S. V., Ferrante, E., Turgut, A. E., Lon, R. V., Holvoet, T., & Wenseleers, T. (2015). Environmental factors promoting the evolution of recruitment strategies in swarms of foraging robots. In Proceedings of the 1st international symposium on swarm behavior and bio-inspired robotics (pp. 1–8). MIT Press.

  33. Ferrante, E., Turgut, A. E., Dorigo, M., & Huepe, C. (2013). Elasticity-based mechanism for the collective motion of self-propelled particles with springlike interactions: A model system for natural and artificial swarms. Physical Review Letters, 111(26), 268302.

  34. Ferrante, E., Turgut, A. E., Duéñez-Guzmán, E., Dorigo, M., & Wenseleers, T. (2015). Evolution of self-organized task specialization in robot swarms. PLoS Computational Biology, 11(8), e1004273.

  35. Fick, A. (1855). Ueber diffusion. Annalen der Physik, 170(1), 59–86.

  36. Font Llenas, A., Talamali, M. S., Xu, X., Marshall, J. A. R., & Reina, A. (2018). Quality-sensitive foraging by a robot swarm through virtual pheromone trails. In M. Dorigo, M. Birattari, C. Blum, A. Christensen, A. Reina, & V. Trianni (Eds.), Swarm Intelligence (ANTS 2018), LNCS (Vol. 11172, pp. 135–149). Berlin: Springer.

  37. Fourcassié, V., Dussutour, A., & Deneubourg, J. L. (2010). Ant traffic rules. Journal of Experimental Biology, 213(14), 2357–2363.

  38. Fujisawa, R., Dobata, S., Kubota, D., Imamura, H., & Matsuno, F. (2008). Dependency by concentration of pheromone trail for multiple robots. In M. Dorigo, M. Birattari, C. Blum, M. Clerc, T. Stützle, & A. F. T. Winfield (Eds.), Ant colony optimization and swarm intelligence (ANTS 2008), LNCS (Vol. 5217, pp. 283–290). Berlin: Springer.

  39. Fujisawa, R., Dobata, S., Sugawara, K., & Matsuno, F. (2014). Designing pheromone communication in swarm robotics: Group foraging behavior mediated by chemical substance. Swarm Intelligence, 8(3), 227–246.

  40. Garnier, S., Tâche, F., Combe, M., Grimal, A., & Theraulaz, G. (2007). Alice in pheromone land: An experimental setup for the study of ant-like robots. In Proceedings of the 2007 IEEE swarm intelligence symposium (SIS 2007) (pp. 37–44). IEEE.

  41. Garnier, S. (2011). From ants to robots and back: How robotics can contribute to the study of collective animal behavior. In Y. Meng & Y. Jin (Eds.), Bio-Inspired Self-Organizing Robotic Systems, SCI (Vol. 355, pp. 105–120). Berlin: Springer.

  42. Garnier, S., Combe, M., Jost, C., & Theraulaz, G. (2013). Do ants need to estimate the geometrical properties of trail bifurcations to find an efficient route? A swarm robotics test bed. PLoS Computational Biology, 9(3), e1002903.

  43. Gauci, M., Chen, J., Li, W., Dodd, T. J., & Groß, R. (2014). Self-organized aggregation without computation. The International Journal of Robotics Research, 33(8), 1145–1161.

  44. Gordon, D. M. (1991). Behavioral flexibility and the foraging ecology of seed-eating ants. The American Naturalist, 138(2), 379–411.

  45. Goss, S., Deneubourg, J. L., Bourgine, P., & Varela, E. (1992). Harvesting by a group of robots. In 1st European conference on artificial Life (pp. 195–204). MIT Press.

  46. Goss, S., Deneubourg, J. L., & Pasteels, J. M. (1989). Self-organized shortcuts in the Argentine ant. Naturwissenschaften, 76(12), 579–581.

  47. Greene, M. J., & Gordon, D. M. (2003). Cuticular hydrocarbons inform task decisions. Nature, 423(6935), 32–32.

  48. Gunther, N. J. (2000). The practical performance analyst. Bloomington: Authors Choice Press.

  49. Hamann, H., & Wörn, H. (2006). An analytical and spatial model of foraging in a swarm of robots. In International workshop on swarm robotics (pp. 43–55). Springer.

  50. Hamann, H. (2012). Towards swarm calculus: Universal properties of swarm performance and collective decisions. In M. Dorigo, et al. (Eds.), Swarm intelligence (ANTS 2012), LNCS (Vol. 7461, pp. 168–179). Berlin: Springer.

  51. Hamann, H. (2013). Towards swarm calculus: Urn models of collective decisions and universal properties of swarm performance. Swarm Intelligence, 7(2–3), 145–172.

  52. Hamann, H. (2018a). Superlinear scalability in parallel computing and multi-robot systems: Shared resources, collaboration, and network topology. In M. Berekovic, R. Buchty, H. Hamann, D. Koch, & T. Pionteck (Eds.), International conference on architecture of computing systems (ARCS 2018), LNCS (Vol. 10793, pp. 31–42). Berlin: Springer.

  53. Hamann, H. (2018b). Swarm robotics: A formal approach. Berlin: Springer.

  54. Hangartner, W. (1969). Orientierung von Lasius fuliginosus Latr. an einer Gabelung der Geruchsspur. Insectes Sociaux, 16(1), 55–60.

  55. Hecker, J. P., Letendre, K., Stolleis, K., Washington, D., & Moses, M. E. (2012). Formica ex Machina: Ant swarm foraging from physical to virtual and back again. In M. Dorigo, et al. (Eds.), Swarm intelligence (ANTS 2012), LNCS (Vol. 7461, pp. 252–259). Berlin: Springer.

  56. Heinze, S., Narendra, A., & Cheung, A. (2018). Principles of insect path integration. Current Biology, 28(17), R1043–R1058.

  57. Herianto, & Kurabayashi, D. (2009). Realization of an artificial pheromone system in random data carriers using RFID tags for autonomous navigation. In Proceedings of the 2009 IEEE/RSJ international conference on robotics and automation (ICRA 2009) (pp. 2288–2293). IEEE.

  58. Herianto, Sakakibara, T., & Kurabayashi, D. (2007). Artificial pheromone system using RFID for navigation of autonomous robots. Journal of Bionic Engineering, 4(4), 245–253.

  59. Hoff, N., Wood, R., & Nagpal, R. (2012). Distributed colony-level algorithm switching for robot swarm foraging. In A. Martinoli, et al. (Eds.), Distributed autonomous robotic systems (DARS 2010), STAR (Vol. 83, pp. 417–430). Berlin: Springer.

  60. Hölldobler, B., & Wilson, E. O. (1990). The Ants. Cambridge: Harvard University Press.

  61. Hönicke, C., Bliss, P., & Moritz, R. F. A. (2015). Effect of density on traffic and velocity on trunk trails of Formica pratensis. The Science of Nature, 102(3–4), 17.

  62. Houston, A. I., & McNamara, J. M. (1999). Models of adaptive behaviour: An approach based on state. Cambridge: Cambridge University Press.

  63. Houston, A. I., & McNamara, J. M. (2014). Foraging currencies, metabolism and behavioural routines. Journal of Animal Ecology, 83(1), 30–40.

  64. Kacelnik, A. (1984). Central place foraging in Starlings (Sturnus vulgaris). I. patch residence time. The Journal of Animal Ecology, 53(1), 283.

  65. Kacelnik, A. (1993). Leaf-cutting ants tease optimal foraging theorists. Trends in Ecology & Evolution, 8(10), 346–348.

  66. Khaliq, A. A., Di Rocco, M., & Saffiotti, A. (2014). Stigmergic algorithms for multiple minimalistic robots on an RFID floor. Swarm Intelligence, 8(3), 199–225.

  67. Khaluf, Y., Ferrante, E., Simoens, P., & Huepe, C. (2017). Scale invariance in natural and artificial collective systems: A review. Journal of The Royal Society Interface, 14(136), 20170662.

  68. Krause, J., & Ruxton, G. D. (2002). Living in groups. Oxford: Oxford University Press.

  69. Labella, T. H., Dorigo, M., & Deneubourg, J. L. (2004). Efficiency and task allocation in prey retrieval. In A. J. Ijspeert, M. Murata, & N. Wakamiya (Eds.), Biologically inspired approaches to advanced information technology, LNCS (Vol. 3141, pp. 274–289). Berlin: Springer.

  70. Leduc, C., Padberg-Gehle, K., Varga, V., Helbing, D., Diez, S., & Howard, J. (2012). Molecular crowding creates traffic jams of kinesin motors on microtubules. Proceedings of the National Academy of Sciences, 109(16), 6100–6105.

  71. Liu, W., Winfield, A., Sa, J., Chen, J., & Dou, L. (2006). Strategies for energy optimisation in a swarm of foraging robots. In E. Şahin, W. M. Spears, & A. F. T. Winfield (Eds.), Swarm robotics, LNCS (Vol. 4433, pp. 14–26). Berlin: Springer.

  72. López, J. (1987). Optimal foraging in seed-harvester ants: Computer-aided simulation. Ecology, 68(6), 1630–1633.

  73. Mamei, M., & Zambonelli, F. (2005). Physical deployment of digital pheromones through RFID technology. In Proceedings of the 2005 IEEE swarm intelligence symposium (SIS 2005) (pp. 281–288). IEEE.

  74. Mamei, M., & Zambonelli, F. (2007). Pervasive pheromone-based interaction with RFID tags. ACM Transactions on Autonomous and Adaptive Systems, 2(2), 4.

  75. Mayet, R., Roberz, J., Schmickl, T., & Crailsheim, K. (2010). Antbots: A feasible visual emulation of pheromone trails for swarm robots. In M. Dorigo, et al. (Eds.), Swarm intelligence (ANTS 2010), LNCS (Vol. 6234, pp. 84–94). Berlin: Springer.

  76. Mayya, S., Pierpaoli, P., & Egerstedt, M. (2019). Voluntary retreat for decentralized interference reduction in robot swarms. In Proceedings of the 2019 IEEE/RSJ international conference on robotics and automation (ICRA 2019) (pp. 9667–9673). IEEE.

  77. Mitri, S., Wischmann, S., Floreano, D., & Keller, L. (2013). Using robots to understand social behaviour. Biological Reviews, 88(1), 31–39.

  78. Montes de Oca, M., Ferrante, E., Scheidler, A., Pinciroli, C., Birattari, M., & Dorigo, M. (2010). Majority-rule opinion dynamics with differential latency: A mechanism for self-organized collective decision-making. Swarm Intelligence, 5(3–4), 305–327.

  79. Nicolis, S. C., & Deneubourg, J. L. (1999). Emerging patterns and food recruitment in ants: An analytical study. Journal of Theoretical Biology, 198(4), 575–592.

  80. Nonacs, P., & Dill, L. M. (1990). Mortality risk vs. food quality trade-offs in a common currency: Ant patch preferences. Ecology, 71(5), 1886–1892.

  81. Nouyan, S., Groß, R., Bonani, M., Mondada, F., & Dorigo, M. (2009). Teamwork in self-organized robot colonies. IEEE Transactions on Evolutionary Computation, 13(4), 695–711.

  82. Olsson, O., Brown, J. S., & Helf, K. L. (2008). A guide to central place effects in foraging. Theoretical Population Biology, 74(1), 22–33.

  83. Orians, G. H., & Pearson, N. E. (1979). On the theory of central place foraging. In D. J. Horn, R. D. Mitchell, & G. R. Stairs (Eds.), Analysis of ecological systems (pp. 154–177). Columbus: The Ohio State University Press.

  84. Özdemir, A., Gauci, M., Bonnet, S., & Groß, R. (2018). Finding consensus without computation. IEEE Robotics and Automation Letters, 3(3), 1346–1353.

  85. Pagliara, R., Gordon, D. M., & Leonard, N. E. (2018). Regulation of harvester ant foraging as a closed-loop excitable system. PLOS Computational Biology, 14(12), e1006200.

  86. Parker, G. A., & Smith, J. M. (1990). Optimality theory in evolutionary biology. Nature, 348(6296), 27.

  87. Payton, D. W., Daily, M., Estowski, R., Howard, M., & Lee, C. (2001). Pheromone robotics. Autonomous Robots, 11(3), 319–324.

  88. Pinciroli, C., Talamali, M. S., Reina, A., Marshall, J. A. R., & Trianni, V. (2018). Simulating Kilobots within ARGoS: Models and experimental validation. In M. Dorigo, M. Birattari, C. Blum, A. Christensen, A. Reina, & V. Trianni (Eds.), Swarm intelligence (ANTS 2018), LNCS (Vol. 11172, pp. 176–187). Berlin: Springer.

  89. Pinciroli, C., Trianni, V., O’Grady, R., Pini, G., Brutschy, A., Brambilla, M., et al. (2012). ARGoS: A modular, parallel, multi-engine simulator for multi-robot systems. Swarm Intelligence, 6(4), 271–295.

  90. Pini, G., Brutschy, A., Scheidler, A., Dorigo, M., & Birattari, M. (2014). Task partitioning in a robot swarm: Object retrieval as a sequence of subtasks with direct object transfer. Artificial Life, 20(3), 291–317.

  91. Pitonakova, L., Crowder, R., & Bullock, S. (2016). Information flow principles for plasticity in foraging robot swarms. Swarm Intelligence, 10(1), 33–63.

  92. Pitonakova, L., Crowder, R., & Bullock, S. (2018). The Information–Cost–Reward framework for understanding robot swarm foraging. Swarm Intelligence, 12(1), 71–96.

  93. Planqué, R., Van Den Berg, J. B., & Franks, N. R. (2010). Recruitment strategies and colony size in ants. PLoS One, 5(8), e11664.

  94. Poissonnier, L. A., Motsch, S., Gautrais, J., Buhl, J., & Dussutour, A. (2019). Still flowing, experimental investigation of ant traffic under crowded conditions. eLife (in press).

  95. Portha, S., Deneubourg, J. L., & Detrain, C. (2004). How food type and brood influence foraging decisions of Lasius niger scouts. Animal Behaviour, 68(1), 115–122.

  96. Pratissoli, F., Reina, A., Kaszubowski Lopes, Y., Sabattini, L., & Groß, R. (2019). A soft-bodied modular reconfigurable robotic system composed of interconnected Kilobots. In Proceedings of the 2019 IEEE international symposium on multi-robot and multi-agent systems (MRS 2019). (in press).

  97. Purnamadjaja, A. H., & Russell, R. A. (2007). Guiding robots’ behaviors using pheromone communication. Autonomous Robots, 23(2), 113–130.

  98. Pyke, G. H. (1984). Optimal foraging theory: A critical review. Annual Review of Ecology, Evolution, and Systematics, 15(1), 523–75.

  99. Reid, C. R., Latty, T., & Beekman, M. (2012). Making a trail: Informed Argentine ants lead colony to the best food by U-turning coupled with enhanced pheromone laying. Animal Behaviour, 84(6), 1579–1587.

  100. Reina, A., Salvaro, M., Francesca, G., Garattoni, L., Pinciroli, C., Dorigo, M., & Birattari, M. (2015b). Augmented reality for robots: Virtual sensing technology applied to a swarm of e-pucks. In Proceedings of the 2015 NASA/ESA conference on adaptive hardware and systems (AHS 2015) (pp. 1–6). IEEE.

  101. Reina, A., Cope, A. J., Nikolaidis, E., Marshall, J. A. R., & Sabo, C. (2017). ARK: Augmented reality for Kilobots. IEEE Robotics and Automation Letters, 2(3), 1755–1761.

  102. Reina, A., Miletitch, R., Dorigo, M., & Trianni, V. (2015a). A quantitative micro-macro link for collective decisions: The shortest path discovery/selection example. Swarm Intelligence, 9(2–3), 75–102.

  103. Robinson, E. J. H., Jackson, D. E., Holcombe, M., & Ratnieks, F. L. W. (2005). ‘No entry’ signal in ant foraging. Nature, 438(7067), 442–442.

  104. Robinson, E. J. H., Ratnieks, F. L., & Holcombe, M. (2008). An agent-based model to investigate the roles of attractive and repellent pheromones in ant decision making during foraging. Journal of Theoretical Biology, 255(2), 250–258.

  105. Rubenstein, M., Ahler, C., Hoff, N., Cabrera, A., & Nagpal, R. (2014a). Kilobot: A low cost robot with scalable operations designed for collective behaviors. Robotics and Autonomous Systems, 62(7), 966–975.

  106. Rubenstein, M., Cornejo, A., & Nagpal, R. (2014b). Programmable self-assembly in a thousand-robot swarm. Science, 345(6198), 795–799.

  107. Scheidler, A., Brutschy, A., Ferrante, E., & Dorigo, M. (2016). The k-unanimity rule for self-organized decision-making in swarms of robots. IEEE Transactions on Cybernetics, 46(5), 1175–1188.

  108. Schmid-Hempel, P., Kacelnik, A., & Houston, A. I. (1985). Honeybees maximize efficiency by not filling their crop. Behavioral Ecology and Sociobiology, 17(1), 61–66.

  109. Seeley, T. D. (1986). Social foraging by honeybees: How colonies allocate foragers among patches of flowers. Behavioral Ecology and Sociobiology, 19(5), 343–354.

  110. Seeley, T. D. (1994). Honey bee foragers as sensory units of their colonies. Behavioral Ecology and Sociobiology, 34(1), 51–62.

  111. Seeley, T. D., Visscher, P. K., Schlegel, T., Hogan, P. M., Franks, N. R., & Marshall, J. A. R. (2012). Stop signals provide cross inhibition in collective decision-making by honeybee swarms. Science, 335(6064), 108–111.

  112. Shaffer, Z., Sasaki, T., & Pratt, S. C. (2013). Linear recruitment leads to allocation and flexibility in collective foraging by ants. Animal Behaviour, 86(5), 967–975.

  113. Sperati, V., Trianni, V., & Nolfi, S. (2011). Self-organised path formation in a swarm of robots. Swarm Intelligence, 5(2), 97–119.

  114. Stephens, D. W., & Krebs, J. (1986). Foraging theory. Princeton: Princeton University Press.

  115. Stickland, T. R., Britton, N. F., & Franks, N. R. (1999). Models of information flow in ant foraging: The benefits of both attractive and repulsive signals. In C. Detrain, J. L. Deneubourg, & J. M. Pasteels (Eds.), Information processing in social insects (pp. 83–100). Basel: Birkhäuser.

  116. Sugawara, K., Kazama, T., & Watanabe, T. (2004). Foraging behavior of interacting robots with virtual pheromone. In Proceedings of the 2004 IEEE/RSJ international conference on intelligent robots and systems (IROS 2004) (Vol. 3, pp. 3074–3079). IEEE.

  117. Sumpter, D. J. T., & Pratt, S. C. (2003). A modelling framework for understanding social insect foraging. Behavioral Ecology and Sociobiology, 53(3), 131–144.

  118. Svennebring, J., & Koenig, S. (2004). Building terrain-covering ant robots: A feasibility study. Autonomous Robots, 16(3), 313–332.

  119. Tabone, M., Ermentrout, B., & Doiron, B. (2010). Balancing organization and flexibility in foraging dynamics. Journal of Theoretical Biology, 266(3), 391–400.

  120. Thienen, W. V., Metzler, D., Choe, D. H., & Witte, V. (2014). Pheromone communication in ants: A detailed analysis of concentration-dependent decisions in three species. Behavioral Ecology and Sociobiology, 68(10), 1611–1627.

  121. Traniello, J. F. (1989). Foraging strategies of ants. Annual Review of Entomology, 34(1), 191–210.

  122. Tsimring, L. S. (2014). Noise in biology. Reports on Progress in Physics, 77(2), 026601.

  123. Valentini, G., Antoun, A., Trabattoni, M., Wiandt, B., Tamura, Y., Hocquard, E., et al. (2018). Kilogrid: A novel experimental environment for the Kilobot robot. Swarm Intelligence, 12(3), 245–266.

  124. Van Vorhis Key, S. E., & Baker, T. C. (1982). Trail-following responses of the Argentine ant, Iridomyrmex humilis (Mayr), to a synthetic trail pheromone component and analogs. Journal of Chemical Ecology, 8(1), 3–14.

  125. Webb, B. (2012). Cognition in insects. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1603), 2715–2722.

  126. Wendt, S., Strunk, K. S., Heinze, J., Roider, A., & Czaczkes, T. J. (2018). Relative value perception in an insect: Positive and negative incentive contrasts in ants. bioRxiv, 330241.

  127. Werger, B. B., & Matarić, M. J. (1996). Robotic “food” chains: Externalization of state and program for minimal-agent foraging. In From animals to animats 4. Proceedings of the 4th international conference on simulation of adaptive behavior (SAB 96) (pp. 625–634). MIT Press.

  128. Wilson, E. O. (1962). Chemical communication among workers of the fire ant Solenopsis saevissima (Fr. Smith) 1. The organization of mass-foraging. Animal Behaviour, 10(1–2), 134–147.

  129. Winfield, A. F. T. (2009). Foraging robots. In R. A. Meyers (Ed.), Encyclopedia of complexity and system science (pp. 3682–3700). Berlin: Springer.

  130. Wischmann, S., Floreano, D., & Keller, L. (2012). Historical contingency affects signaling strategies and competitive abilities in evolving populations of simulated robots. Proceedings of the National Academy of Sciences, 109(3), 864–868.

Acknowledgements

We thank Alex Cope and Michael Port for helpful discussions and for their crucial support in tackling the numerous hardware challenges of this project. We also thank Elva JH Robinson for valuable discussions on collective foraging in ants. Finally, we thank Federico Pratissoli for sharing with us the design of the Kilobot’s ring shown in Fig. 1a and Francesco Canciani for helping to conduct the Kilobot experiments.

Author information

Corresponding author

Correspondence to Mohamed S. Talamali.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This study was funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 647704).

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 110938 KB)

Supplementary material 2 (mp4 115725 KB)

Supplementary material 3 (mp4 116852 KB)

Supplementary material 4 (mp4 108680 KB)

Supplementary material 5 (mp4 108887 KB)

Supplementary material 6 (mp4 109797 KB)

Supplementary material 7 (mp4 157855 KB)

Supplementary material 8 (mp4 158136 KB)

Supplementary material 9 (mp4 157008 KB)

Appendices

Derivation of an optimal resource collection model

To link the behaviour of the robot swarm with optimal resource collection, we developed a mathematical model which is inspired by optimal foraging theory and relates to maximising energetic efficiency (Kacelnik 1984; Houston and McNamara 2014). An overview of the parameters used in our model is given in Table 1. The following derivation assumes that the robot swarm is in a steady state reached after time \(T_\mathrm{eq}\). We further assume that the simulation ends at time \(T_\mathrm{end}\). The time between \(T_\mathrm{eq}\) and \(T_\mathrm{end}\) is divided into \(N_t\) smaller intervals with length \(\varDelta t=(T_\mathrm{end}-T_\mathrm{eq})/N_t\). In our study, we have \(T_\mathrm{eq}=30\,\mathrm {min}\) and \(T_\mathrm{end}=60\,\mathrm {min}\). This means \(\varDelta t=1\,\mathrm {min}\) if we assume \(N_t=30\) intervals. The swarm size is denoted by S. We can now define the total utility gain of the swarm in the steady state related to the collection of items from source area \(A_j\) as

$$\begin{aligned} g_j = e_0\,q_j\,\varDelta U_j, \end{aligned}$$
(6)

where \(e_0\) is a utility unit, \(q_j=Q_j/Q_{\max }\) is the quality \(Q_j\) of source area \(A_j\) normalised by \(Q_{\max }\), the maximum quality value known to the robot swarm, and \(\varDelta U_j=U_j(T_\mathrm{end})-U_j(T_\mathrm{eq})\) is the total number of items carried back from source area \(A_j\) to the central depot within the time interval \(T_\mathrm{end}-T_\mathrm{eq}\). Note that if \(Q_j=Q_{\max }\) (i.e. \(q_j=1\)), then the utility gained per resource item carried back to the depot is \(e_0\). We may assume that the number of items collected from source \(A_j\) has the following functional relationship with the number of robots on the trail

$$\begin{aligned} \varDelta U_j = \varphi _j\,(T_\mathrm{end}-T_\mathrm{eq})\,\beta _j\,\rho _j\,S , \end{aligned}$$
(7)

where \(\rho _j\) is the fraction of robots on the trail between the central depot and source area \(A_j\), with \(\sum _j \rho _j = 1-\rho _\mathrm{e} = \rho _\mathrm{w}\). Here, \(\rho _\mathrm{e}\) denotes the fraction of robots that explore the landscape and are not involved in object transportation (explorers), and \(\rho _\mathrm{w}\) is the total fraction of robots actively involved in the transportation of resource items (workers). The parameter \(\beta _j\) is a constant characterising the relationship between the number of workers collecting from source \(A_j\) and the number of collected items carried back to the central depot, and \(\varphi _j\) is a foraging rate, which we may approximate by

$$\begin{aligned} \varphi _j =\frac{1}{T_j^\mathrm{U}+T_j^\mathrm{L}} , \end{aligned}$$
(8)

where \(T_j^\mathrm{L}\) and \(T_j^\mathrm{U}\) are the travel times to overcome the distance between central depot and source \(A_j\) with and without load, respectively.

Table 1 Overview of parameters used in the derivation of the optimal foraging model

The cost for one robot in the swarm (unit cost) associated with travelling between depot and source \(A_j\) to collect resource items is given as

$$\begin{aligned} c_j= c_j^\mathrm{U}\,T_j^\mathrm{U} + c_j^\mathrm{L}\,T_j^\mathrm{L} , \end{aligned}$$
(9)

where \(c_j^\mathrm{L}\) and \(c_j^\mathrm{U}\) are the costs per time interval related to travelling with an item (\(\mathrm{L} =\mathrm{loaded}\)) and without (\(\mathrm{U}=\mathrm{unloaded}\)). Using Eqs. (6) and (9), let us define the transport yield (in the steady state) per unit cost as

$$\begin{aligned} \begin{aligned} \tilde{R}= \sum _{j=1}^{n} \frac{g_j}{c_j} =&\sum _{j=1}^{n} \frac{e_0\,q_j\,\varDelta U_j}{c_j^\mathrm{U}\,T_j^\mathrm{U} + c_j^\mathrm{L}\,T_j^\mathrm{L}} \\ =&\sum _{j=1}^{n} \frac{e_0\,(T_\mathrm{end}-T_\mathrm{eq})\,\beta _j\,q_j\,\rho _j\,S}{\left( c_j^\mathrm{U}\,T_j^\mathrm{U} + c_j^\mathrm{L}\,T_j^\mathrm{L}\right) \left( T_j^\mathrm{U} + T_j^\mathrm{L}\right) } \,. \end{aligned} \end{aligned}$$
(10)

As we consider virtual (weightless) objects, the cost for travelling with and without load is identical, hence \(c_j^\mathrm{L}=c_j^\mathrm{U}=c_0\). Similarly, travel times with and without load may be assumed to be identical, that is, we have \(T_j^\mathrm{U}=T_j^\mathrm{L}=T_j\). We can then express the travel time as

$$\begin{aligned} T_j = \frac{d_j}{v_0} + T_{C,j}(\rho _j\,S) , \end{aligned}$$
(11)

where \(d_j\) is the distance between depot and source area \(A_j\), and \(v_0\approx 1\, \mathrm{cm }/\mathrm{s}\) is the travel speed of the Kilobot (see Sect. 3.2). The term \(T_{C,j}(\rho _j\,S)\) is a function that takes into account how the travel time is influenced by the number of robots \(\rho _j\,S\) on the path to \(A_j\). Overcrowded paths cause prolonged travel time due to frequent collisions between robots. We model the additional time arising from traffic congestion as

$$\begin{aligned} T_{C,j}(\rho _j\,S)=T_{0,j}\,\left( \exp \left[ \kappa _j\,\frac{\rho _j\,S}{N_{\mathrm{crit},j}}\right] -1\right) , \end{aligned}$$
(12)

where \(T_{0,j}\) is a constant which sets the time scale of the additional travel time, \(\kappa _j\) is a constant included to fine-tune the nonlinear effect of overcrowding on the path to \(A_j\), and \(N_{\mathrm{crit},j}\) is the critical number of robots above which traffic congestion may have a significant effect. This means that \(T_{C,j}(\rho _j\,S)\) is negligible if \(\kappa _j\,\rho _j\,S\ll N_{\mathrm{crit},j}\). The critical number \(N_{\mathrm{crit},j}\) may be obtained using

$$\begin{aligned} N_{\mathrm{crit},j} = {\max }\left( \lfloor \xi _j \rfloor ,\,1 \right) ,\,\, \xi _j=\frac{d_j}{Z_{\mathrm{KB}}} , \end{aligned}$$
(13)

where \(Z_{\mathrm{KB}}=3.3\,\text {cm}\) is the diameter of a Kilobot and \(\lfloor \xi _j \rfloor \) returns the greatest integer less than or equal to \(\xi _j\).
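As a minimal sketch of how Eqs. (11) to (13) combine, the following Python functions compute the critical number, the congestion delay and the resulting travel time. The function names and the example values \(T_{0,j}=10\,\mathrm{s}\) and \(\kappa _j=1\) are assumptions chosen for illustration; they are not the estimates reported in Table 2. Lengths are in cm and times in s.

```python
import math

Z_KB_CM = 3.3   # Kilobot diameter in cm (value from the text)
V0_CM_S = 1.0   # approximate Kilobot speed in cm/s (value from the text)


def n_crit(d_cm: float, z_kb_cm: float = Z_KB_CM) -> int:
    """Eq. (13): critical number of robots on a path of length d_cm."""
    return max(math.floor(d_cm / z_kb_cm), 1)


def congestion_time(n_robots: float, d_cm: float, t0_s: float, kappa: float) -> float:
    """Eq. (12): extra travel time caused by n_robots = rho_j * S on the path."""
    return t0_s * (math.exp(kappa * n_robots / n_crit(d_cm)) - 1.0)


def travel_time(n_robots: float, d_cm: float, t0_s: float, kappa: float,
                v0_cm_s: float = V0_CM_S) -> float:
    """Eq. (11): free travel time plus congestion delay."""
    return d_cm / v0_cm_s + congestion_time(n_robots, d_cm, t0_s, kappa)


if __name__ == "__main__":
    # t0_s = 10 and kappa = 1 are placeholders, not the Table 2 estimates.
    for n in (0, 10, 30, 60):
        print(n, round(travel_time(n, d_cm=100.0, t0_s=10.0, kappa=1.0), 1))
```

For a path of 100 cm, the sketch gives a critical number of 30 robots, and the delay vanishes when no robot is on the path, so the travel time reduces to \(d_j/v_0\).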

Normalising Eq. (10) by the experimental constant \(K=e_0\,v_0^2\,(T_\mathrm{end}-T_\mathrm{eq})/(4\,c_0)\), we can derive the normalised yield as

$$\begin{aligned} \begin{aligned} R= \frac{\tilde{R}}{K} = \sum _{j=1}^{n} \frac{q_j\,\beta _j\,\rho _j\,S}{\tilde{d}_j^2} , \qquad \text {with} \quad \tilde{d}_j= d_j + v_0\,T_{C,j}(\rho _j\,S) , \end{aligned} \end{aligned}$$
(14)

which is identical to Eq. (4) and is repeated here for convenience. Traffic congestion thus acts as an increase of the travel distance \(d_j\) between the central depot and source area \(A_j\) by the additional length \(v_0\,T_{C,j}(\rho _j\,S)\). The normalised yield, R, in Eqs. (4) and (14) has the physical unit \(\text {m}^{-2}\), which seems appropriate as we consider robots looking for source areas in a two-dimensional space.
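The step from Eq. (10) to Eq. (14) can be made explicit as follows. This is a sketch that only substitutes the two simplifying assumptions stated above (\(c_j^\mathrm{L}=c_j^\mathrm{U}=c_0\) and \(T_j^\mathrm{U}=T_j^\mathrm{L}=T_j\)) together with Eq. (11); it introduces no new symbols.

$$\begin{aligned} \left( c_j^\mathrm{U}\,T_j^\mathrm{U} + c_j^\mathrm{L}\,T_j^\mathrm{L}\right) \left( T_j^\mathrm{U} + T_j^\mathrm{L}\right)&= \left( 2\,c_0\,T_j\right) \left( 2\,T_j\right) = 4\,c_0\,T_j^2 , \\ T_j&= \frac{d_j + v_0\,T_{C,j}(\rho _j\,S)}{v_0} = \frac{\tilde{d}_j}{v_0} , \\ \tilde{R}&= \frac{e_0\,v_0^2\,(T_\mathrm{end}-T_\mathrm{eq})}{4\,c_0}\, \sum _{j=1}^{n} \frac{q_j\,\beta _j\,\rho _j\,S}{\tilde{d}_j^2} = K\,R \,. \end{aligned}$$

The prefactor in the last line is the experimental constant K, so dividing by K leaves only the sum over source areas.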

Derivation of first and second-order derivatives of yield function with respect to swarm allocation for binary resource collection

Here, we derive the first and second-order partial derivatives that correspond to the binary resource collection problem discussed in Sect. 5.2. We let \(\rho _1=\rho \) and \(\rho _2=\rho _\mathrm{w}-\rho \). The first-order derivative of the yield function in Eq. (5) with respect to \(\rho \) is given as

$$\begin{aligned} \begin{aligned} \frac{\partial R}{\partial \rho } = \frac{\partial R_1}{\partial \rho } + \frac{\partial R_2}{\partial \rho } = \frac{\partial R_1}{\partial T_{C,1}(\rho )}\,\frac{\partial T_{C,1}(\rho )}{\partial \rho }+ \frac{\partial R_2}{\partial T_{C,2}(\rho )}\,\frac{\partial T_{C,2}(\rho )}{\partial \rho } , \end{aligned} \end{aligned}$$
(15)

where \(T_{C,j}(\rho )\) is given in Eq. (12). Performing the derivatives we find

$$\begin{aligned} \begin{aligned} \frac{\partial R}{\partial \rho } =&S\,\Biggl \{ \frac{q_1\,\beta _1}{\tilde{d}_1^2} - \frac{q_2\,\beta _2}{\tilde{d}_2^2} - 2\,v_0\,S\,\Biggl (\frac{\rho \,q_1\,\beta _1\,\kappa _1\,T_{0,1}}{N_{\mathrm{crit},1}\,\tilde{d}_1^3} \exp \left[ \kappa _1\,\frac{\rho \,S}{N_{\mathrm{crit},1}}\right] \\&+ \frac{(\rho -\rho _\mathrm{w})\,q_2\,\beta _2\,\kappa _2\,T_{0,2}}{N_{\mathrm{crit},2}\,\tilde{d}_2^3}\,\exp \left[ \kappa _2\,\frac{(\rho _\mathrm{w}-\rho )\,S}{N_{\mathrm{crit},2}}\right] \Biggr ) \Biggr \} , \end{aligned} \end{aligned}$$
(16)

where \(\tilde{d}_j\) (\(j=1,2\)) is given in Eq. (14). Note that we set \(\rho _\mathrm{w}=1\) in Sect. 5.2, for simplicity.
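A quick way to gain confidence in Eq. (16) is to compare it with a numerical derivative of the yield function. The sketch below transcribes Eq. (14) for two sources and Eq. (16) into Python and evaluates a central finite difference. All parameter values in `SOURCES`, the swarm size and the helper names are hypothetical choices for illustration, not the estimates of Table 2; lengths are in cm and times in s.

```python
import math

V0 = 1.0    # cm/s, Kilobot speed from the text
Z_KB = 3.3  # cm, Kilobot diameter from the text

# Hypothetical source parameters (NOT the Table 2 estimates).
SOURCES = [
    dict(q=1.0, beta=1.0, kappa=1.0, t0=10.0, d=100.0),
    dict(q=0.5, beta=1.0, kappa=1.0, t0=10.0, d=60.0),
]


def n_crit(d):
    """Eq. (13)."""
    return max(math.floor(d / Z_KB), 1)


def d_tilde(src, n_robots):
    """Effective distance of Eq. (14) with the congestion term of Eq. (12)."""
    e = math.exp(src["kappa"] * n_robots / n_crit(src["d"]))
    return src["d"] + V0 * src["t0"] * (e - 1.0)


def yield_r(rho, s, rho_w=1.0):
    """Normalised yield for two sources, rho_1 = rho and rho_2 = rho_w - rho."""
    total = 0.0
    for src, frac in zip(SOURCES, (rho, rho_w - rho)):
        total += src["q"] * src["beta"] * frac * s / d_tilde(src, frac * s) ** 2
    return total


def dyield_drho(rho, s, rho_w=1.0):
    """Analytic first derivative, transcribed term by term from Eq. (16)."""
    fracs = (rho, rho_w - rho)      # robot fractions on each trail
    prefs = (rho, rho - rho_w)      # prefactors as they appear in Eq. (16)
    signs = (1.0, -1.0)             # signs of the two q*beta/d^2 terms
    direct = 0.0
    congestion = 0.0
    for src, frac, pref, sign in zip(SOURCES, fracs, prefs, signs):
        n = n_crit(src["d"])
        dt = d_tilde(src, frac * s)
        direct += sign * src["q"] * src["beta"] / dt ** 2
        congestion += (pref * src["q"] * src["beta"] * src["kappa"] * src["t0"]
                       / (n * dt ** 3)) * math.exp(src["kappa"] * frac * s / n)
    return s * (direct - 2.0 * V0 * s * congestion)


def central_difference(rho, s, h=1e-6):
    """Numerical derivative of the yield with respect to rho."""
    return (yield_r(rho + h, s) - yield_r(rho - h, s)) / (2.0 * h)


if __name__ == "__main__":
    for rho in (0.3, 0.5, 0.7):
        print(rho, dyield_drho(rho, 50.0), central_difference(rho, 50.0))
```

If the two printed columns agree to several digits for a range of allocations, the transcription of Eq. (16) is consistent with the yield function it was derived from.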

Using the expression in Eq. (16), we obtain the second-order partial derivative as

$$\begin{aligned} \begin{aligned} \frac{\partial ^2 R}{\partial \rho ^2}&= 2\,v_0\,S^2\,\left[ \frac{q_1\,\beta _1\,\kappa _1\,T_{0,1}}{N_{\mathrm{crit},1}}\,g_1(\rho ) + \frac{q_2\,\beta _2\,\kappa _2\,T_{0,2}}{N_{\mathrm{crit},2}}\,g_2(\rho ) \right] \\ g_1(\rho )&= \frac{\exp \left[ \kappa _1\,\frac{\rho \,S}{N_{\mathrm{crit},1}}\right] }{\tilde{d}_1^3}\, \Biggl (\frac{\kappa _1\,\rho \,S}{N_{\mathrm{crit},1}}\Biggl [ \frac{3\,v_0\,T_{0,1}\,\exp \left[ \kappa _1\,\frac{\rho \,S}{N_{\mathrm{crit},1}}\right] }{\tilde{d}_1} -1 \Biggr ] -2 \Biggr ) \\ g_2(\rho )&= \frac{\exp \left[ \kappa _2\,\frac{(\rho _\mathrm{w}-\rho )\,S}{N_{\mathrm{crit},2}}\right] }{\tilde{d}_2^3} \\&\quad \times \Biggl (\frac{\kappa _2\,(\rho _\mathrm{w}-\rho )\,S}{N_{\mathrm{crit},2}} \Biggl [ \frac{3\,v_0\,T_{0,2}\,\exp \left[ \kappa _2\,\frac{(\rho _\mathrm{w}-\rho )\,S}{N_{\mathrm{crit},2}}\right] }{\tilde{d}_2} -1 \Biggr ] -2 \Biggr ) \,. \end{aligned} \end{aligned}$$
(17)

Deriving the critical swarm size for equal qualities and equal distances

The critical swarm size, denoted by \(S_\mathrm{c}\), indicates when it is better to collect items from both source areas when we assume equal qualities and equal distances. The critical size \(S_\mathrm{c}\) may be derived from solving

$$\begin{aligned} \frac{3\,v_0\,\bar{T}_0}{\tilde{d}}\,\exp \left[ \frac{{\bar{\kappa }}\,S_\mathrm{c}}{2\,N_{\mathrm{crit}}}\right] = 1 + \frac{4\,N_{\mathrm{crit}}}{{\bar{\kappa }}\,S_\mathrm{c}} , \end{aligned}$$
(18)

where we used \(\tilde{d}_1=\tilde{d}_2=\tilde{d}\), \(N_{\mathrm{crit},1}=N_{\mathrm{crit},2}=N_{\mathrm{crit}}\) and \(\rho _\mathrm{w}=1\) (giving \(\rho =1/2\)). Further to this, we substituted \(\kappa _1=\kappa _2={\bar{\kappa }}=(\kappa _1+\kappa _2)/2\), \(T_{0,1}=T_{0,2}=\bar{T}_0=(T_{0,1}+T_{0,2})/2\) and \(\beta _1=\beta _2={\bar{\beta }}=(\beta _1+\beta _2)/2\), where the model parameters are given in Table 2. The expression in Eq. (18) follows from the second-order derivative according to

$$\begin{aligned} \left. \frac{\partial ^2 R(\rho ;\,S=S_\mathrm{c})}{\partial \rho ^2} \right| _{\rho =\frac{1}{2}}=0 , \end{aligned}$$
(19)

at which point the global maximum at \(\rho = 1/2\) becomes a local minimum and two local maxima (\(\rho \ne 1/2\)) arise with equal yields R for \(S\ge S_\mathrm{c}\). The full expression of the second-order derivative \(\partial ^2 R/\partial \rho ^2\) is given in Eq. (17) in Appendix B.

Table 2 Overview of estimated model parameters
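Because \(\tilde{d}\) itself depends on \(S_\mathrm{c}\), Eq. (18) is implicit and has to be solved numerically. The following sketch uses plain bisection and then checks Eq. (19) by confirming that a finite-difference curvature of the yield at \(\rho =1/2\) changes sign at the computed root. The values of `D`, `T0`, `KAPPA` and `BETA` are hypothetical placeholders rather than the Table 2 estimates, and bisection is simply one convenient root finder, not necessarily the method used in the study. Lengths are in cm and times in s.

```python
import math

V0 = 1.0      # cm/s, from the text
Z_KB = 3.3    # cm, from the text
D = 100.0     # cm, hypothetical common distance d_1 = d_2
T0 = 10.0     # s, hypothetical mean time scale (T-bar_0)
KAPPA = 1.0   # hypothetical mean kappa
BETA = 1.0    # hypothetical mean beta (it cancels in Eq. 18)
N_CRIT = max(math.floor(D / Z_KB), 1)  # Eq. (13)


def d_tilde(n_robots):
    """Effective distance of Eq. (14) for n_robots on one path."""
    return D + V0 * T0 * (math.exp(KAPPA * n_robots / N_CRIT) - 1.0)


def eq18_residual(s):
    """Left-hand side minus right-hand side of Eq. (18)."""
    e = math.exp(KAPPA * s / (2.0 * N_CRIT))
    return 3.0 * V0 * T0 * e / d_tilde(s / 2.0) - 1.0 - 4.0 * N_CRIT / (KAPPA * s)


def critical_swarm_size(lo=1.0, hi=1000.0, iterations=200):
    """Bisection on Eq. (18); requires a sign change of the residual on [lo, hi]."""
    if eq18_residual(lo) > 0.0 or eq18_residual(hi) < 0.0:
        raise ValueError("root not bracketed")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if eq18_residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def yield_r(rho, s):
    """Eq. (14) for two identical sources with q = 1 and rho_w = 1."""
    return sum(BETA * f * s / d_tilde(f * s) ** 2 for f in (rho, 1.0 - rho))


def curvature_at_half(s, h=1e-3):
    """Second central difference approximating the derivative in Eq. (19)."""
    return (yield_r(0.5 + h, s) - 2.0 * yield_r(0.5, s) + yield_r(0.5 - h, s)) / h ** 2


if __name__ == "__main__":
    s_c = critical_swarm_size()
    print("S_c =", s_c)
    print("curvature below / above:", curvature_at_half(0.8 * s_c), curvature_at_half(1.2 * s_c))
```

With these placeholder values the root lies between 150 and 155 robots. The curvature is negative below it and positive above it, which is the change from a maximum to a local minimum at \(\rho =1/2\) described above.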

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Talamali, M.S., Bose, T., Haire, M. et al. Sophisticated collective foraging with minimalist agents: a swarm robotics test. Swarm Intell 14, 25–56 (2020). https://doi.org/10.1007/s11721-019-00176-9

Keywords

  • Foraging
  • Swarm robotics
  • Stigmergy
  • Kilobot
  • Augmented reality
  • Traffic congestion