1 Introduction

Exploration and rescue, evacuation from dangerous areas, surveillance, and crowd control are all examples of multi-agent herding problems in which two kinds of agents interact.

In these problems, a set of “active” agents (the herders) must drive a set of “passive” agents (the herd) towards a desired goal region and confine them therein (Long et al. 2020; Nolfi 2002).

In most cases, repulsive forces exerted by the herders on the herd are exploited to drive the movements of the passive agents to be corralled; at times, cooperation among the herders (such as attractive forces between them) is used to enhance the herding performance.

Notable herding solutions are those proposed in Vaughan et al. (2000), Lien et al. (2004), Strombom et al. (2014), Paranjape et al. (2018), Licitra et al. (2019) for single herders and in Lien et al. (2005), Haque et al. (2009), Lee and Kim (2017), Pierson and Schwager (2018), Nalepka et al. (2017b), Montijano et al. (2013), Varava et al. (2017), Song et al. (2021), Chipade and Panagou (2019), Sebastián and Montijano (2021) for multiple herders.

One of the problems to be addressed in designing control strategies for herder agents is endowing them with the ability to decide, at any given time, which passive agent to target when multiple herders and targets are present.

For the sake of comparison with our approach, we now briefly review the most relevant research from the literature addressing multi-agent herding, where multiple herders are required to collect and drive a group of passive agents towards a desired goal region.

1.1 Related work

One of the earliest solutions to the herding problem was proposed by Lien et al. (2004, 2005). The trajectories followed by passive and herder agents were generated using global rule-based roadmaps—abstract representations of the walkable paths given as a directed graph (Wilmarth et al. 1999). Numerical simulations showed that multiple herders were successful in coping with increasing sizes of the herd. Nevertheless, herders’ performance worsened as the flocking tendency of passive agents decreased.

Multi-agent herding scenarios were also considered in Haque et al. (2009, 2011). Here the authors addressed the problem of controlling a group of herders so as to entrap a group of passive agents in a region from which they could not escape. To solve this problem, each herder was pre-assigned a region of influence. A target's motion was then influenced by a specific herder only if the target happened to be within that herder's region of influence; otherwise, targets travelled at constant speed with a heading aligned to that of their neighbouring agents. The velocities of the herders were regulated according to those of the passive agents with which they interacted.

Another multi-agent herding scenario, in which multiple herders are required to collect and patrol a group of passive agents, was considered in Lee and Kim (2017). Inspired by the limited visual field of real sheepdogs and the absence of centralised coordination among them, that work proposed a herding algorithm based entirely on local control rules. The dynamics of both herders and passive agents were modelled as the linear combination of potential field-like forces within a sensing area. In addition to these basic dynamics, passive agents were also subject to a repulsive force from the herders. Herders were controlled by an appropriate input selected as a function of their distance from the nearest passive agent and their distance from a desired goal. The result of the proposed shepherding behaviour was the emergence of an arc formation among the herders [a similar formation was instead hard-coded in the algorithm presented earlier in Lien et al. (2005)]. Numerical simulations showed the effectiveness of the approach under the assumption that passive agents tend to flock together. In this case, herders could indeed collect and herd multiple sub-flocks without any explicit coordination rule.

In Robotics, feedback control strategies have recently been presented to solve multi-agent herding problems and guarantee convergence of the overall system. For instance, Pierson and Schwager (2018) investigated the case of multiple herder agents regulating the mean position of a group of flocking passive agents. An arc-based strategy was proposed for the herders to surround and drive the targets towards a desired goal region. The proposed control law and its convergence properties were explored by modelling the whole herd as a single unicycle controlled by means of a point-offset technique. Montijano et al. (2013) proposed a herding strategy based on elliptical orbits to entrap a passive agent whose position is uncertain. Varava et al. (2017) and Song et al. (2021) developed a “herding by caging” solution, based on geometric considerations and motion planning techniques, to arrange the herder agents around the flock. A similar formation was presented in Chipade and Panagou (2019), and further developed in Chipade et al. (2021), to let herders identify clusters of flocking adversarial agents, dynamically encircle them and drive them to a safe zone. Recently, Sebastián and Montijano (2021) developed analytical and numerical control design procedures to compute suitable actions to herd evading agents to a desired position, even when the nonlinearities in the evaders’ dynamics yield implicit equations.

A different approach was used in Cognitive Science (Nalepka et al. 2015, 2017a, b, 2019), where a model of the herding agent was derived from experimental observations of how two human players herd a group of randomly moving agents in a virtual reality setting. It was observed that, at the beginning of the task, all pairs of human players adopted a search and recovery strategy, with each player individually chasing the farthest passive agent in the half of the game field assigned to them and driving it inside the desired containment region. Once all agents were gathered inside the goal region, most pairs of human herders were observed to switch to an entirely different containment strategy, exhibiting an oscillatory movement along an arc around the goal region and effectively creating a “repulsive wall” that keeps the passive agents therein (Nalepka et al. 2017a). To reproduce this behaviour in artificial agents, a nonlinear model was proposed by Nalepka et al. (2019) in which the switch from search and recovery to the oscillatory containment strategy is induced by a Hopf bifurcation triggered by a change in the distance of the herd agents from the goal region.

With regard to a single herder agent gathering a group of passive agents one by one, Licitra et al. (2017) employed a backstepping control strategy in which the herder chases one target at a time, switching among different targets and succeeding in collecting them within a goal region of interest. This idea was further developed in Licitra et al. (2018, 2019), where other control strategies and further uncertainties in the herd’s dynamics were investigated.

1.2 Contributions of this paper

In this paper, we consider the case of a small group of herders chasing a much larger group of passive agents whose dynamics, as often happens with natural agents such as fish, birds or bacteria, is stochastic and driven by Brownian noise. However, contrary to what is usually done in the rest of the literature (Haque et al. 2009; Lien et al. 2004; Pierson and Schwager 2018; Lee and Kim 2017; Chipade and Panagou 2019), we do not assume any flocking behaviour among the passive agents, which makes the problem harder to solve, as each target needs to be tracked and collected independently of the others.

To solve the problem, we present a simple, yet effective, dynamic herding strategy consisting of local feedback control laws for the herder agents and a set of target selection rules that govern how herders make decentralised decisions about which targets to follow. A herder’s action is based on global knowledge of the environment and of the positions of all other agents. With respect to other solutions in the literature (Lien et al. 2004; Pierson and Schwager 2018; Chipade et al. 2021; Song et al. 2021), our approach does not rely on ad hoc formation control strategies to force the herders to surround the herd; rather, we enforce cooperation between herders by dynamically dividing the plane among them by means of simple yet effective and robust rules that can easily be implemented on real robots.

We then numerically analyse how robust these strategies are to parameter perturbations, uncertainties and unmodelled disturbances in the passive agents' dynamics. Moreover, we assess how different choices of the target selection rule affect the overall effectiveness of the proposed methodology. Finally, we test the effectiveness of the proposed strategies in solving the herding problem, first in ROS simulations and then in experiments on real robots conducted via the online Robotarium platform (Pickem et al. 2017; Wilson et al. 2020).

2 The herding problem

We consider the problem of controlling \({N}_H\ge 2\) herder agents in order for them to drive a group of \(N_T > N_H\) passive agents in the plane (\(\mathbb {R}^2\)) towards a goal region and contain them therein. We denote by \(\mathbf {y}_{j}\) the position in Cartesian coordinates of the j-th herder in the plane and by \(\mathbf {x}_{i}\) that of the i-th passive agent. We denote as \((r_{j},\, \theta _{j})\) and \((\rho _{i},\, \phi _{i})\) their respective positions in polar coordinates, as shown in Fig. 1. We assume the goal of the herders is to drive the passive agents towards a circular containment region \(\mathcal {G}\) of radius \(r^\star \) centred at \(\mathbf {x}^{\star }\). Without loss of generality, we set \(\mathbf {x}^{\star }\) to be the origin of \(\mathbb {R}^2\).

Assuming the herders obey simple double-integrator dynamics in the plane, the herding problem can be formulated as that of designing the control action u governing the dynamics of the herders, given by

$$\begin{aligned} m\, \varvec{\ddot{y}}_{j}=u(t,\mathbf {x}_{1}, \ldots , \mathbf {x}_{N_T},\mathbf {y}_{1},\ldots , \mathbf {y}_{N_H}), \end{aligned}$$
(1)

where m denotes the mass of the herders, taken to be unitary. The control action must be designed so that the herders influence the dynamics of the passive agents (which will be specified in the next section) and guarantee that

$$\begin{aligned} \Vert \mathbf {x}_{i}(t) - \mathbf {x}^{\star }\Vert \le r^{\star }, \quad \forall i, \forall t\ge t_\mathrm {g}, \end{aligned}$$
(2)

where \(\Vert \cdot \Vert \) denotes the Euclidean norm; that is, all passive agents are contained, after some finite gathering time \(t_\mathrm {g}\), in the desired region \(\mathcal {G}\). A herding trial is said to be successful in the time interval [0, T] if condition (2) holds for some \(t_\mathrm {g} \in [0, T]\). We also assume that an annular safety region \(\mathcal {B}\) of width \(\varDelta r^\star \) surrounds the goal region; the herders leave this buffer between themselves and the region where the targets are contained.
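For concreteness, condition (2) reduces to a containment test on sampled target trajectories. The sketch below (Python; array shapes and names are illustrative choices of ours, not part of the paper's implementation) returns the gathering time \(t_\mathrm {g}\) of a trial, if one exists.

```python
import numpy as np

def gathering_check(traj_targets, times, x_star, r_star):
    """Success condition (2): every target stays inside the goal region G
    from some finite gathering time t_g onwards.
    traj_targets: array of shape (T, N_T, 2); times: array of shape (T,).
    Returns t_g, or None if the trial is unsuccessful."""
    inside = np.linalg.norm(traj_targets - x_star, axis=2) <= r_star
    all_inside = inside.all(axis=1)        # True when every target is in G
    for k in range(len(times)):
        if all_inside[k:].all():           # contained from t_k to the end
            return times[k]
    return None
```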

In what follows, we will assume that (i) herder and passive agents can move freely in \(\mathbb {R}^2\); (ii) herder agents have global knowledge of the environment and of the positions of the other agents therein.

Fig. 1

Illustration of the spatial arrangement in the herding problem. The herder agent \( \mathbf{y}_{j} \) (yellow square), with polar coordinates \(( r_{j},\,\theta _{j} )\), must relocate the target agent \(\mathbf{x} _{i}\) (green ball), with polar coordinates \((\rho _{i},\, \phi _{i}) \), to the containment region \(\mathcal {G}\) (solid red circle) of centre \( \mathbf {x}^{\star } \) and radius \( r^{\star } \). The buffer region \(\mathcal {B}\), of width \(\varDelta r^\star \), is depicted as a dashed red circle (Color figure online)

3 Target dynamics

Taking inspiration from Nalepka et al. (2017b), we assume that, when interacting with the herders, passive agents are repelled by them and move away in the opposite direction, while in the absence of any external interaction they randomly diffuse in the plane. Specifically, we assume passive agents move according to the following stochastic dynamics

$$\begin{aligned} d \mathbf{x}_{i}(t) = \mathbf{V}_{r,i}(t) dt + \alpha _b d \mathbf{W}_{i}(t), \end{aligned}$$
(3)

where \(\mathbf{V}_{r,i}\) describes the repulsion exerted by all the herders on the i-th passive agent, \(\mathbf{W}_{i}=[W_{i,1}\), \(W_{i,2}]^\top \) is a 2-dimensional standard Wiener process and \(\alpha _b>0\) is a constant. We suppose the distance travelled by the passive agents depends on how close the herder agents are, and model this effect by considering a potential field centred on the j-th herder, given by \({v}_{i,j}={1}/{(\Vert \mathbf{x}_{i} - \mathbf{y}_{j} \Vert )}\), exerting on the passive agents an action proportional to the negative of its gradient (Pierson and Schwager 2018). Specifically, the dynamics of the i-th passive agent is influenced by all the herders through the reaction term

$$\begin{aligned} \mathbf{V}_{r,i}(t) = - \alpha _r \sum _{j=1}^{N_H} \frac{\partial v_{i,j}}{\partial \mathbf{x}_{i}} = \alpha _r \sum _{j=1}^{N_H} \frac{\mathbf{x}_{i} (t) - \mathbf{y}_{j}(t)}{\Vert \mathbf{x}_{i}(t) - \mathbf{y}_{j}(t)\Vert ^3}, \end{aligned}$$
(4)

where \(\alpha _r>0\) is a constant. Uncertainties in the repulsive reaction term (4) can be seen as captured by the additive noise term in (3).

Notice that the velocity of all passive agents is completely determined by (3)–(4) and we do not assume any upper bound on its maximum value. The position of the i-th passive agent when it is targeted by the j-th herder will be denoted as \(\varvec{\tilde{x}}_{i,j}\) or in polar coordinates as \((\tilde{\rho }_{i,j},\tilde{\phi }_{i,j})\).
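For concreteness, the target dynamics (3)–(4) can be integrated with a standard Euler–Maruyama scheme. The sketch below is a minimal implementation; the time step and parameter values are illustrative assumptions, not those of Appendix B.

```python
import numpy as np

def repulsive_drift(x, herders, alpha_r):
    """Reaction term (4): repulsion exerted on a target at x by all herders
    (array of shape (N_H, 2))."""
    diff = x - herders                                  # herder -> target vectors
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    return alpha_r * (diff / dist**3).sum(axis=0)

def step_targets(targets, herders, alpha_r, alpha_b, dt, rng):
    """One Euler-Maruyama step of the stochastic dynamics (3)."""
    drift = np.array([repulsive_drift(x, herders, alpha_r) for x in targets])
    dW = np.sqrt(dt) * rng.standard_normal(targets.shape)  # Wiener increments
    return targets + drift * dt + alpha_b * dW

# Illustrative use: 7 targets diffusing between 2 herders
rng = np.random.default_rng(0)
targets = rng.uniform(-2, 2, size=(7, 2))
herders = np.array([[2.0, 0.0], [-2.0, 0.0]])
targets = step_targets(targets, herders, alpha_r=0.4, alpha_b=0.01, dt=0.01, rng=rng)
```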

4 Herder dynamics and control rules

Our solution to the herding problem consists of two layered strategies: (i) a local control law driving the motion of each herder towards its selected target, so as to push the target inside the goal region, and (ii) a target selection strategy through which herders decide which target to chase. When the herd is fully gathered, the herders switch to an idling condition, keeping themselves within the safety region surrounding the goal region.

4.1 Local control strategy

For the sake of comparison with the strategy presented in Nalepka et al. (2017b, 2019), we derive in polar coordinates the control law we propose to drive each herder. Albeit not resulting in the shortest possible path travelled by the herders, the controller expressed in polar coordinates ensures circumnavigation of the goal region, preventing passive agents already contained therein from being scattered around. Specifically, the control input to the j-th herder dynamics (1) is defined as \(\mathbf{u}_{j}=u_{r,j}\, \hat{r}_j + u_{\theta ,j}\, \hat{\theta }_j\), where \(\hat{r}_j = [\cos \theta _{j}, \, \sin \theta _{j}]^\top \) and \(\hat{\theta }_j = \hat{r}_j^\perp \) are unit vectors, and its components are chosen as

$$\begin{aligned} u_{r,j}(t) = - b_r \dot{r}_{j}(t)-\mathcal {R}(\varvec{\tilde{x}}_{i,j},t), \end{aligned}$$
(5)
$$\begin{aligned} u_{\theta ,j}(t) = - b_\theta \dot{\theta }_{j}(t)-\mathcal {T}(\varvec{\tilde{x}}_{i,j},t), \end{aligned}$$
(6)

with \(b_r,\,b_\theta >0\), and where the feedback terms \(\mathcal {R}(\varvec{\tilde{x}}_{i,j},t)\) and \(\mathcal {T}(\varvec{\tilde{x}}_{i,j},t)\) are elastic forces that drive the herder towards the target i and push it towards the containment region \(\mathcal {G}\). Such forces are chosen as

$$\begin{aligned} \begin{aligned} \mathcal {R}(\varvec{\tilde{x}}_{i,j},t)&= \epsilon _{r} \, \Big [ r_{j}(t) - \xi _{j}(t) \, (\tilde{\rho }_{i,j}(t) + \varDelta r^{\star }) \\&\quad - (1 - \xi _{j}(t)) \, ( r^{\star } + \varDelta r^{\star } ) \Big ] , \end{aligned} \end{aligned}$$
(7)
$$\begin{aligned} \begin{aligned} \mathcal {T}(\varvec{\tilde{x}}_{i,j},t)&= \epsilon _{\theta } \, \Big [ \theta _{j}(t) - \xi _{j}(t) \tilde{\phi }_{i,j}(t) \\&\quad - (1 - \xi _{j}(t)) \psi (t) \Big ] . \end{aligned} \end{aligned}$$
(8)

with \(\epsilon _{r},\,\epsilon _{\theta }>0\), and where \(\xi _{j}\) regulates the switching policy between collecting and idling behaviours. That is, \(\xi _{j} = 1\) if \(\tilde{\rho }_{i,j} \ge r^{\star }\), and \(\xi _{j} = 0\) if \(\tilde{\rho }_{i,j} < r^{\star }\), so that the herder is attracted to the position of the i-th target \(\varvec{\tilde{x}}_{i,j}\) (plus a radial offset \(\varDelta r^\star \)) when the current target is outside the containment region (\( \xi _{j} = 1\)), and to the idling position \((r^{\star } + \varDelta r^{\star },\, \psi )\), in polar coordinates, close to the boundary of the buffer region otherwise (\( \xi _{j} = 0\)). The value of the idling angle \(\psi \) depends on the specific choice of the target selection strategy employed, as discussed next. Note that the control laws (5)–(6) are much simpler than those presented by Nalepka et al. (2017b), as they neither contain higher-order nonlinear terms nor are complemented by parameter adaptation rules (see Nalepka et al. 2017b for further details). Moreover, as for the passive agents, we do not assume any upper bound on the maximum velocity of the herders.
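As an illustration of the switching law (5)–(8), the minimal sketch below (Python; variable names are ours, not taken from the paper) computes the polar control components for one herder chasing one target.

```python
def herder_control(r, r_dot, th, th_dot, rho_t, phi_t, psi,
                   r_star, dr_star, b_r, b_th, eps_r, eps_th):
    """Polar-coordinate control (5)-(8) for one herder and one target.

    (r, th): herder position; (r_dot, th_dot): its rates;
    (rho_t, phi_t): targeted agent position; psi: idling angle."""
    xi = 1.0 if rho_t >= r_star else 0.0              # collect vs idle switch
    R = eps_r * (r - xi * (rho_t + dr_star)
                 - (1.0 - xi) * (r_star + dr_star))   # radial spring (7)
    T = eps_th * (th - xi * phi_t - (1.0 - xi) * psi) # angular spring (8)
    u_r = -b_r * r_dot - R                            # radial input (5)
    u_th = -b_th * th_dot - T                         # angular input (6)
    return u_r, u_th
```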

4.2 Target selection strategies

In the case of a single herder chasing multiple agents, the most common strategy in the literature is for the herder to select as target either the farthest passive agent from the goal region or the centre of mass of the flocking herd (Vaughan et al. 2000; Strombom et al. 2014; Licitra et al. 2017). When two or more herders are involved, the problem is usually solved using a formation control approach, letting the herders surround the herd and then drive it towards the goal region (Pierson and Schwager 2018; Lien et al. 2004). Rather than using formation control techniques or solving off-line or on-line optimisation problems, as in dynamic target assignment (e.g., Bürger et al. 2011), here we present a set of simple, yet effective, target selection strategies that exploit the spatial distribution of the herders, allowing them to cooperatively select their targets without any computationally expensive optimisation problem having to be solved on-line.

Fig. 2

Graphical representation of the target selection strategies. Herders are depicted as yellow squares, passive agents as green balls. The colours in which the game field is divided correspond to regions assigned to different herders. Herder \(\mathbf {y}_{j}\) is currently chasing target agent \(\tilde{\mathbf {x}}_{i,j}\), while passive agent \(\mathbf {x}_{i}\) is not chased by any herder (Color figure online)

We present four different herding strategies, starting from the simplest case where herders globally look for the target farthest from the goal region. A graphical illustration of the four strategies is reported in Fig. 2 for \(N_H=3\) herders.

Global search strategy (no partitioning). Each herder selects the farthest passive agent from the containment region that is not currently targeted by any other herder (Fig. 2a). Being the simplest, this strategy is presented for the sake of comparison only and is not meant for implementation on real robots.

Static arena partitioning. At the beginning of the trial, and for all of its duration, the plane is partitioned into \(N_H\) circular sectors of width \(2 \pi / {N}_H\,\mathrm {rad}\) centred at \(\mathbf {x}^{\star }\). Each herder is then assigned one sector to patrol and selects the passive agent therein that is farthest from \(\mathcal {G}\) (Fig. 2b). Note that this is the same herding strategy used in Nalepka et al. (2017b) for \(N_H = 2\) herders.

Dynamic leader-follower (LF) target selection strategy. At the beginning of the trial, herders are labelled from 1 to \(N_H\) in anticlockwise order, starting from a randomly selected herder which is assigned the leader role. The plane is then partitioned dynamically into different regions as follows. The leader starts by selecting the farthest passive agent from \(\mathcal {G}\) whose angular position \(\tilde{\phi }_{i,1}\) is such that

$$\begin{aligned} \tilde{\phi }_{i,1} \in \left( \theta _{1} - \frac{1}{2}\frac{2\pi }{ N_H},\, \theta _{1} + \frac{1}{2}\frac{2\pi }{N_H} \right] , \end{aligned}$$

where \(\theta _{1}\) is the angular position of the leader at time t. Then, each of the other follower herders (\(j=2,\ldots ,N_H\)), in ascending order, selects its target as the passive agent farthest from \(\mathcal {G}\) such that

$$\begin{aligned} \tilde{\phi }_{i,j} \in \bigg ( \theta _{1} - \frac{1}{2} \frac{2\pi }{ N_H} + \zeta _{j} , \theta _{1} + \frac{1}{2} \frac{2\pi }{ N_H} + \zeta _{j} \bigg ], \end{aligned}$$

with \(\zeta _{j} = {2 \pi }(j-1)/{{N}_H}\). As the leader chases the selected target and moves in the plane, the partition described above changes dynamically, so that a different circular sector with constant angular width \(2\pi /N_H\,\mathrm {rad}\) is assigned to each follower at any time instant. Figure 2c depicts the case for \(N_H=3\) in which the sector \((\theta _{1}- \frac{\pi }{3}, \theta _{1}+ \frac{\pi }{3}]\) is assigned to the leader herder (assumed to be \(\mathbf {y}_1\)), while the rest of the plane is assigned equally to the other two herders.
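A possible encoding of the leader-follower sector rule is sketched below, assuming angles are handled modulo \(2\pi \); the helper names are illustrative, not the authors' own.

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def lf_sector(j, theta_leader, N_H):
    """Half-open angular sector (lo, hi] assigned to herder j under the
    leader-follower rule; the leader carries label j = 1."""
    zeta_j = TWO_PI * (j - 1) / N_H
    return (theta_leader - np.pi / N_H + zeta_j,
            theta_leader + np.pi / N_H + zeta_j)

def select_target(targets_polar, lo, hi):
    """Pick, among the targets (rho_i, phi_i) whose angle falls in (lo, hi]
    modulo 2*pi, the one farthest from the goal; None if the sector is empty."""
    width = (hi - lo) % TWO_PI or TWO_PI   # treat zero width as the full circle
    candidates = [(rho, phi) for rho, phi in targets_polar
                  if 0.0 < (phi - lo) % TWO_PI <= width]
    return max(candidates, key=lambda t: t[0], default=None)
```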

Dynamic peer-to-peer (P2P) target selection strategy. At the beginning of the trial, herders are labelled from 1 to \(N_H\) as in the previous strategy. Denoting as \(\zeta _j^+\) the angular difference at time t between the positions of herder j and herder \((j+1)\, \mathrm {mod}\, N_H\), and as \(\zeta _j^-\) that between herder j and herder \((j+N_H-1)\, \mathrm {mod}\, N_H\), herder j selects the farthest passive agent from \(\mathcal {G}\) whose angular position is such that

$$\begin{aligned} \tilde{\phi }_{i,j} \in \bigg ( \theta _{j} - \frac{\zeta _j^-}{2}, \, \theta _{j} + \frac{\zeta _j^+}{2} \bigg ]. \end{aligned}$$

Unlike in the previous case, the width of the circular sector assigned to each herder now also changes dynamically, as it depends on the relative angular positions of the herders in the plane.
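The peer-to-peer bounds admit an analogous local computation; the sketch below (again with illustrative names, and 0-based indexing in place of the 1-based labels used in the text) derives them from the neighbours' angular positions.

```python
import numpy as np

def p2p_sector(theta, j):
    """Sector (lo, hi] of herder j under the peer-to-peer rule, given the
    array theta of all the herders' angular positions, labelled
    anticlockwise (0-based indices here)."""
    two_pi = 2.0 * np.pi
    N_H = len(theta)
    zeta_plus = (theta[(j + 1) % N_H] - theta[j]) % two_pi    # gap to next herder
    zeta_minus = (theta[j] - theta[(j - 1) % N_H]) % two_pi   # gap to previous herder
    return theta[j] - zeta_minus / 2.0, theta[j] + zeta_plus / 2.0
```

The select_target helper from the previous sketch can be reused unchanged, since the two dynamic strategies differ only in how the sector bounds are computed.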

The idling angle \(\psi \) in (8) is set equal to the angular position \(\tilde{\phi }_{i,j}\) of the last contained target for the global search strategy; otherwise, it is set equal to the midpoint of the angular sector assigned at each time to the herder. In so doing, the herder is made to rest at a point that minimises its angular distance from any passive agent escaping the containment region into the assigned angular sector.

A crucial difference between the herding strategies presented above is the nature (local vs global) and amount of information that herders must possess to select their next target. Specifically, when the global search strategy is used, every herder needs to know the position \(\mathbf {x}_{i}\) of every passive agent in the plane not currently targeted by other herders. In the case of static arena partitioning, instead, a herder needs to know only its assigned (constant) circular sector together with the position \(\mathbf {x}_{i}\) of every passive agent in that sector.

For the dynamic target selection strategies, less information is generally required. Indeed, in the dynamic leader-follower strategy the herders, knowing \(N_H\), can either self-select their sector (if they act as leader) or determine their respective sector from the position of the leader \(\mathbf {y}_1\). Similarly, in the dynamic peer-to-peer strategy, herders can self-select their sectors by using the angles \(\zeta _j^+\) and \(\zeta _j^-\).

Note that in the event of perfect radial alignment of a herder and its target, the herder might push the target away from, rather than towards, the goal region (Fig. 3).

Although this condition is very unlikely to persist, due to the random motion of the passive agents, the problem can be avoided by extending the herder dynamics in (1) with a circumnavigation force \(u^{\perp }_{j}(t)\). This force is orthogonal to the vector \(\varDelta \mathbf {x}_{ij} = \mathbf {x}_{i}-\mathbf {y}_{j}\), and its amplitude depends on the angle \(\chi _{ij}\) between \(\varDelta \mathbf {x}_{ij}\) and \(\mathbf {y}_{j}\), so that it is maximal when the two vectors are parallel (\(\chi _{ij}=0\)) and zero when they are anti-parallel (\(\chi _{ij}=\pi \)). Specifically, it is defined as:

$$\begin{aligned} u^{\perp }_{j}(t)=\bar{U} \cdot v(t) \cdot \cos ^2 \left( {\frac{\chi _{ij}}{2}} \right) \frac{\varDelta \mathbf {x}_{ij}^{\perp }}{\Vert \varDelta \mathbf {x}_{ij}\Vert }, \end{aligned}$$
(9)

where \(\bar{U}>0\) is the maximum amplitude, and the value of \(v \in \{-1,1\}\) depends on which half of the assigned sector the herder currently occupies, so as to guarantee that the target agent is always pushed towards the interior of the sector.
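The force (9) is straightforward to compute from the agents' positions. A minimal sketch follows, assuming \(\chi _{ij}\) is obtained from the standard inner-product formula and that the sign \(v\) is supplied by the sector logic above.

```python
import numpy as np

def circumnavigation_force(x_i, y_j, v, U_bar):
    """Circumnavigation force (9): orthogonal to the herder-to-target
    vector, maximal at chi = 0 and zero at chi = pi.
    v in {-1, +1} selects the rotation sense (which half of the sector
    the herder occupies)."""
    dx = x_i - y_j                                    # herder -> target vector
    cos_chi = np.dot(dx, y_j) / (np.linalg.norm(dx) * np.linalg.norm(y_j))
    chi = np.arccos(np.clip(cos_chi, -1.0, 1.0))      # angle between dx and y_j
    perp = np.array([-dx[1], dx[0]])                  # dx rotated by +90 degrees
    return U_bar * v * np.cos(chi / 2.0)**2 * perp / np.linalg.norm(dx)
```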

Fig. 3

Graphical representation of the circumnavigation force \(u^{\perp }_{j}(t)\) in the case where the herder and its target are not aligned (a \(\chi _{ij}\in (0,\pi )\)) and in the cases where the herder is perfectly aligned behind (b \(\chi _{ij}=\pi \)) or ahead (c \(\chi _{ij}=0\)) of the passive agent w.r.t. the containment region \(\mathcal {G}\)

5 Numerical validation

The herding performance of the proposed control strategies has been evaluated through a set of numerical experiments aimed at (i) assessing their effectiveness in achieving the herding goal; (ii) comparing different target selection strategies; (iii) studying the robustness of each strategy to parameter variations. The implementation and validation of the strategies in a more realistic robotic environment are reported in the next section, which includes ROS simulations and experiments.

5.1 Performance metrics

We use the following metrics (see Appendix A for their formal definitions) to evaluate the performance of the different strategies. Specifically, for each of the proposed strategies we computed (i) the gathering time \(t_\mathrm {g}\), (ii) the average length \(d_\mathrm {g}\) of the path travelled by the herders until all targets are contained, (iii) the average total length \(d_\mathrm {tot}\) of the path travelled by the herders over the whole herding trial, (iv) the mean distance \(D_T\) between the herd’s centre of mass and the centre of the containment region, and (v) the herd agents’ spread \(S_{\%}\).

Note that lower values of \(t_\mathrm {g}\) correspond to better herding performance, as the herders take a shorter time to gather all the passive agents in the goal region. Also, lower values of \(D_T\) and \(S_{\%}\) correspond to a tighter containment of the passive agents in the goal region, while lower values of \(d_\mathrm {g}\) and \(d_\mathrm {tot}\) correspond to a more efficient herding capability during the gathering and containment of the herd.
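Appendix A is not reproduced here, so the sketch below should be read as a plausible rendering of \(d_\mathrm {g}\)/\(d_\mathrm {tot}\) and \(D_T\) based on their verbal descriptions above, not as the authors' exact definitions.

```python
import numpy as np

def mean_path_length(traj_herders):
    """Average length of the herders' paths over a (sub-)trajectory of
    shape (T, N_H, 2): use the slice up to t_g for d_g, the whole
    trajectory for d_tot."""
    steps = np.diff(traj_herders, axis=0)    # (T-1, N_H, 2) displacements
    return np.linalg.norm(steps, axis=2).sum(axis=0).mean()

def mean_com_distance(traj_targets, x_star):
    """Mean distance D_T between the herd's centre of mass and the centre
    of the containment region, averaged over time."""
    com = traj_targets.mean(axis=1)          # (T, 2) centre of mass
    return np.linalg.norm(com - x_star, axis=1).mean()
```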

5.2 Performance analysis

We carried out 50 simulation trials with \(N_T=7\) passive agents and either \(N_H=2\) or \(N_H=3\) herders, starting from random initial conditions. All simulation trials were found to be successful, that is, condition (2) was verified in each of them. (All simulation parameters and the description of the simulation setup adopted here are reported in “Appendix B”.)

Table 1 Average performance and standard deviation over 50 successful trials of different herding strategies for \(N_T = 7\) passive agents

The results of our numerical investigation are reported in Table 1. As expected, when herders search globally for agents to chase, their average total path \(d_\mathrm {tot}\) is notably larger than when dynamic target selection strategies are used, indicating that this strategy would be the least efficient if implemented, besides requiring complete information about the agents. Therefore, in what follows we discuss this strategy only for the sake of completeness and not for the purpose of implementation.

As regards the aggregation of the herd in terms of \(D_T\) and \(S_{\%}\), all other strategies presented comparable results in terms of both mean and standard deviation. The dynamic strategies showed better gathering performance (\(t_\mathrm {g}\) and \(d_\mathrm {g}\)) than static arena partitioning. We therefore find that, in general, a higher level of cooperation between herders and a more efficient coverage of the plane, such as those guaranteed by the dynamic strategies, yield better overall herding performance, which is more suitable for realistic implementations in robots or virtual agents that are bound to move at limited speed.

Fig. 4

Robustness analysis of the proposed herding strategies for two herders (\(N_H=2\)) to variations of the herd size \(N_T\) and of the repulsive reaction coefficient \(\alpha _r\). \(N_T\) was varied between 3 and 60 agents, with increments of 3, while \(\alpha _r\) was varied between 0.05 and 2.5, with increments of 0.05. For each pair (\(N_T\), \(\alpha _r\)) the corresponding metric was averaged over 15 simulation trials starting from random initial positions. The coloured plots were obtained by interpolation of the computed values (Color figure online)

5.3 Robustness analysis

Next, we analysed the robustness of the proposed herding strategies to variations of the herd size and of the magnitude of the repulsive reaction to the herders exhibited by the passive agents (Fig. 4). Specifically, we varied \(N_T\) between 3 and 60 and the repulsion parameter \(\alpha _r\) in (4) between 0.05 and 2.5, while keeping \(N_H=2\). We found that all strategies succeed in herding up to 60 agents over a large region of parameter values (see the blue areas in Fig. 4a). The global strategy, where herders patrol the entire plane, is found, as expected, to be the least efficient in terms of total distance travelled by the herders (Fig. 4b), while the dynamic peer-to-peer strategy offers the best compromise, and the best robustness properties, in terms of containment performance (Fig. 4a) and efficiency (Fig. 4b).

To further validate these findings we carried out 50 simulation trials for the herding scenario in which \(N_H=3\) herders are required to herd \(N_T=60\) passive agents.

In this more challenging scenario not all the trials were found to be successful, as per (2), owing to at least one passive agent escaping the containment region; we therefore averaged the resulting performance over the successful trials only (Table 2). Herders adopting the global and peer-to-peer strategies successfully herded all agents in over 50% of the trials. Moreover, herders globally searching for the target to chase spent on average slightly less time gathering the targets (\(t_g = 12.96\)) and achieved and maintained a lower herd spread (\(S_\%=0.48\)). However, the path travelled to achieve the goal (\(d_\mathrm {tot}\)) was significantly larger than when the static or dynamic selection strategies were adopted.

Table 2 Average performance over successful trials of different herding strategies for \(N_T = 60\) passive agents

6 Experimental validation

To validate the strategies we propose in more realistic robotic settings, we complemented the numerical simulations presented in Sect. 5 with simulations in ROS and experiments on real robots conducted on the online Robotarium platform (Pickem et al. 2017; Wilson et al. 2020).

6.1 ROS simulations

ROS is an advanced framework for robot software development that provides tools to support the user throughout the whole development cycle, from low-level control and communication to deployment on real robots. We used the Gazebo software package to test the designed control architecture on accurate 3D models of commercial robots, simulating their dynamics and physical interaction with the virtual environment.

We considered a scenario where \(N_T=3\) passive agents need to be herded by \(N_H=2\) robotic herders. All agents were implemented as Pioneer 3-DX robots, a commercially available two-wheel, two-motor differential-drive platform whose detailed model is available in Gazebo (Fig. 5). The desired trajectories for the robots were generated using Eqs. (3) and (5)–(6) for the passive and herder robots, respectively; these trajectories serve as reference signals for the on-board inner control loop, which generates the required tangential and angular velocities (see “Appendix C” for further details).
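Appendix C is not included here; as an illustration of what such an inner loop can look like, the sketch below uses a standard point-tracking law for differential-drive robots, with gains k_v and k_w chosen for illustration only (this is an assumed scheme, not necessarily the one used in the paper).

```python
import numpy as np

def inner_loop(p_ref, p, heading, k_v=1.0, k_w=2.0):
    """Map a planar reference position p_ref to tangential and angular
    velocity commands (v, w) for a differential-drive robot located at
    p with the given heading."""
    err = p_ref - p
    desired = np.arctan2(err[1], err[0])               # bearing of the reference
    ang_err = np.arctan2(np.sin(desired - heading),
                         np.cos(desired - heading))    # wrapped to (-pi, pi]
    v = k_v * np.linalg.norm(err) * np.cos(ang_err)    # slow down when misaligned
    w = k_w * ang_err
    return v, w
```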

Fig. 5

Overview of the Gazebo-ROS application, with the 3D model of the Pioneer 3-DX robot (a) and a landscape view of the simulated environment (b)

Fig. 6

ROS simulations. Top panels show the trajectories of the passive agents (green lines) and of the herders (grey lines) adopting the a static arena partitioning, b leader-follower and c peer-to-peer herding strategies, simulated in the Gazebo environment. The containment region \(\mathcal {G}\) is depicted as a red circle. Black square marks denote the initial and final (solid coloured) positions of the herders. Green circle marks show the initial and final (solid coloured) positions of the passive agents. Bottom panels show that all herders are able to collect the herd in less than \(500\,\mathrm {s}\) while respecting the angular bounds (red lines) prescribed by the d static arena partitioning, e leader-follower and f peer-to-peer herding strategies (Color figure online)

Examples of ROS simulations are reported in Fig. 6, where all the target selection strategies that were tested (static arena partitioning, leader-follower, peer-to-peer) were found to be successful, with the herder robots being able to gather all the passive robots in the containment region. Figure 6 also shows that the angular positions of the herders remain within the bounds defining the sector of the plane assigned to them for patrolling.

6.2 Robotarium experiments

Robotarium is a remotely accessible swarm robotics research platform, equipped with GRITSBot robots, which allows rapid deployment and testing of custom control algorithms (Pickem et al. 2015; Wilson et al. 2020). To comply with the limited space of the arena (\(3.2\,\mathrm {m} \times 2\,\mathrm {m}\)) and with the safety protocols implemented in the platform to avoid collisions between robots (whose diameter is \(11\,\mathrm {cm}\)), we considered a scenario with \(N_T=4\) passive robots and \(N_H=2\) herder robots; a herding scenario that was also considered in Rigoli et al. (2020), Auletta et al. (2021) to study and model the selection strategies adopted by pairs of human-driven herder agents.

Herder parameters were selected as described in Appendix B, while the coefficients of diffusion and repulsion in the dynamics of the passive agents (3) were scaled to \((\alpha _b,\alpha _r)= (0.001,0.4)\) to comply with the physical constraints of the GRITSBot hardware, which has a maximum tangential speed of \(20\,\mathrm {cm/s}\) and a maximum rotational speed of about \(3.6\,\mathrm {rad/s}\). The results of the experimental tests are reported in Fig. 7. Both dynamic strategies were found to be effective in containing all the passive robots with a gathering time \(t_\mathrm {g}\) of less than 70 s, with the peer-to-peer strategy guaranteeing slightly faster convergence (Fig. 7c) than the leader-follower strategy (Fig. 7b) over all 5 trials that were performed. The movies of two illustrative trials are available in the Supplementary Material.

Note that the dynamic control strategies we propose were also found to perform well when implemented on real robots with constraints imposed on their maximum velocities.

Fig. 7

Robotarium experiments. a Overview of the Robotarium arena, with GRITSBot robots, for a herding scenario with \(N_H=2\) herder robots and \( N_T = 4 \) passive robots, and the evolution in time of the distance of the farthest passive robot from the containment region \(\mathcal {G}\) when the b leader-follower and c peer-to-peer strategies are employed, for each trial. The mobile GRITSBots representing passive agents were initialised with random initial positions chosen as \(\mathbf {x}_{i}(0) = 2\, r^\star \mathrm {e}^{\jmath \phi _{i}(0)}\), with \(\phi _{i}(0)\) drawn with uniform distribution from the interval \((-\pi ,\pi ]\). The radius \( r^\star \) of the containment region was chosen equal to \(0.3\,\mathrm {m} \), i.e., equal to a third of the length of the arena

7 Conclusions

We presented a control strategy to solve the herding problem in the scenario where a group of herders chases a group of stochastic passive agents. Our approach combines a set of local rules, driving the herders according to the targets’ positions, with a herding strategy through which the plane is partitioned among the herders, who then select the target to chase within the sector assigned to them either statically or dynamically. Our results show the effectiveness of the proposed strategy and its ability to cope with an increasing number of passive agents and with variations of the repulsive force the agents feel when the herders approach them. Finally, we tested our control strategy via simulations in ROS and experiments on real robots, showing that our herding solutions are effective and viable in more realistic scenarios.

We wish to emphasise that to date our approach is the only one available in the literature to drive multiple herders to collect and contain a group of multiple agents that do not possess a tendency to flock and whose dynamics is stochastic. A pressing open problem is to derive a formal proof of convergence of the overall control system.