Introduction

With the advent of the artificial intelligence era, unmanned aerial vehicles (UAVs) have become substantially more autonomous and intelligent, and are now widely used for various tasks [1, 2]. However, a single aircraft has limited capability and resources [3], which leads to low efficiency and a low success rate on complex tasks such as the transportation of large-scale supplies. A multi-UAV system is an unmanned intelligent cluster inspired by the self-organizing behaviors of biological groups, which significantly improves task efficiency through cooperation and complementary capabilities. Therefore, cooperative control technology for aerial clusters, with its distribution, autonomy and robustness, is a promising solution. The aerial swarm has numerous potential applications, such as monitoring missions [4], UAV-assisted wireless coverage [5], search and rescue [6], target tracking [7], cooperative exploration [8], and air traffic [9]. However, some drones in a swarm have limited perceptual and computational capabilities, making it difficult to obtain abundant information about the environment and their companions. It is a challenge for a drone swarm to pass through an unknown environment without collision while relying only on the local information obtained by each individual drone.

Although the swarm has advantages in number and function, these advantages bring great challenges to formation control. The phenomenon of collective motion in nature provides inspiration for solving this problem [10]. Although individual animals have limited perceptual ability and seldom obtain global information, they can form coordinated and orderly collective motion without apparent leaders. This fascinating collective behavior arises from local interactions between individuals in gregarious living beings. The earliest collective intelligence algorithm recognized in academia is the Boids model proposed by Reynolds [11], a microscopic agent-based model constructed from a phenomenological perspective that follows the separation-alignment-cohesion (SAC) rules. Compared with centralized control, this decentralized self-organizing control provides great advantages for the flying robot swarm [12]. The former depends on a central computing node, whereas in the latter each agent is an independent computing node, which improves robustness to individual faults. Self-organization means that any global motion mode results from local decisions. The decisions of each agent involve only a limited number of neighbors, independent of the aerial swarm size [13]. This is highly critical for drone swarm control, since it means we can deploy dozens or hundreds of drones with limited computing resources.

Although the self-organizing control strategy based on biological inspiration has many advantages, it still faces challenges when applied to aerial swarms operating in complex environments. A central problem is translating the phenomena of collective motion into interpretable mathematical models and deploying them on multi-robot systems. Vásárhelyi et al. [14] proposed a flocking model for real drones and carried out field experiments with a self-organizing swarm of 30 drones. This work realized flocking flight and obstacle avoidance of the aerial swarm, like flocks of birds, in predefined confined environments. However, such studies focus on reproducing the collective behavior of social animals in nature without considering applications in unknown environments and complex tasks [15]. The visual perception mechanism by which individuals use the visual cortex to obtain information provides inspiration for flying robot swarms operating in unknown environments [16, 17]. Since perception and operation of drones in unknown environments is difficult, we use a bionic vision mechanism to achieve perception of the unknown environment. Besides, in practical applications it is unnecessary, or even impossible, to ensure that all individuals are informed of the desired region, for instance when the aerial swarm operates under substantial interference or in dangerous environments. In most animal groups, only a small number of individuals have knowledge of the food source or migration route [18, 19].

In this paper, we focus on the problem of a swarm of drones traversing an unknown environment without collision. A novel adaptive goal-directed bio-inspired flocking algorithm is designed. Each drone in the swarm works independently as a computing node and cooperates with others through local information interaction. Instead of relying on prior knowledge of the environment, drones perceive the environment in real time through visual mechanisms. The flocking swarm is composed of informed agents and uninformed agents that participate equally in the overall mission. In addition, each drone moves with the same self-organizing rules. We propose an adaptive goal-directed strategy based on visual perception, which can significantly improve the efficiency of the drone swarm traversing the unknown environment. Finally, the feasibility of the proposed algorithm is verified by numerical simulation and physical experiments with 6 DJI Tello quadrotors.

The rest of this paper is organized as follows. “Related work” section covers some existing related works. The drone model, communication topology and evaluation metrics are presented in “Problem statement” section. In “Proposed flocking algorithm for drone swarm” section, the flocking algorithm suitable for the drone swarm traversing unknown environments is proposed, including non-AGDS and AGDS. In “Validation proposed flocking algorithms: typical scenario” section, the flocking algorithm is verified by simulation and experiment in a typical scenario. The validation and analysis of flocking algorithm in an extended scenario are given in “Extended simulation validation: complicated scenarios” section. Finally, this paper is concluded in “Conclusions” section.

Related work

Collective behavior modelling

The research on the mechanisms of self-organizing behaviors promotes the development of flying robot swarms. Vicsek et al. [20] studied the minimal conditions of collective motion and proposed the Vicsek model, which considers only the velocity alignment rule between neighbors. Couzin et al. [21] proposed the Couzin model, which defines the zones and priorities of the SAC rules. Levine et al. [22] proposed a social force model based on Newtonian mechanics to study vortex motion, which considers both velocity and position synergy. Wu et al. [23] proposed an autonomous cooperative flocking algorithm for heterogeneous swarms, which solves the escort problem in the group cruise. By analyzing biological data, some scholars have found unique behavioral mechanisms, such as topological communication [24], hierarchical interaction [25], and scale-free behavioral correlation [26]. Besides, developments in perceptual neuroscience have also promoted the study of collective biological behavior. Pearce et al. [17] proposed a model for long-range information exchange in bird flocks, based on each individual's projected view out through the flock, and provided a solution for density control. Bastien et al. [27] proposed a purely vision-based model of collective behavior, showing that organized collective behavior can be generated by relying solely on individual visual information. That model focuses mainly on perceptual information within the group without considering perception of the environment. However, most of these works only explain the mechanism of collective motion from mathematical models or biological data and have seldom been verified in artificial systems.

Reproducing natural collective motion with swarm robots

With the miniaturization of hardware, some scholars have devoted themselves to deploying self-organizing algorithms on swarm robots. Slavkov et al. [28] created emergent morphologies in a large swarm of real robots and reproduced the self-organized morphogenesis of cells. Berlinger et al. [29] designed a fish-inspired robot swarm with implicit communication, which realized a variety of fish-school behaviors such as synchrony, dispersion/aggregation, dynamic circle formation, and search-capture. Werfel et al. [30] developed a termite-inspired robot construction team that builds complex predetermined structures. Rubenstein et al. [31] used a thousand-robot swarm to complete the self-assembly of complex two-dimensional shapes. Vásárhelyi et al. [14] deployed a flocking algorithm on 30 autonomous drones to achieve flocking flight and obstacle avoidance like a flock of birds. Balázs et al. [32] proposed the WillFull model for the persistence-responsivity trade-off in the swarm and used 52 autonomous drones to achieve abrupt collective turns like schools of fish or flocks of birds. Thus far, this is the largest self-organizing decentralized aerial swarm in an outdoor environment. The above works are mainly concerned with reproducing natural collective motions on artificial collective systems.

Passing-through experiments using swarm robots and collective models

Based on the above research, swarm robots are expected to perform given missions in various scenarios. Quan et al. [33] proposed a distributed control algorithm that pre-plans the airspace as obstacle-free virtual pipelines. This work addressed the pipeline-passage problem of the drone swarm and was experimentally verified with 6 multicopters. Liu et al. [34] developed a pragmatic distributed flexible formation protocol to pass through a narrow and irregular channel; two leaders tracked predefined channel boundaries, followers tracked the leaders, and a field experiment was carried out with 3 unmanned surface vessels. Nevertheless, these studies presuppose that the environment is predefined and known to the swarm individuals. Virágh et al. [35] provided a model of a general autonomous flying robot integrated into a realistic simulation framework and achieved collective target tracking with 9 autonomous aerial robots in the real world. Balázs et al. [36] proposed a decentralized air traffic control solution for dense traffic problems and implemented crosswalk and package-delivery applications on 30 autonomous drones. Soria et al. [37, 38] deployed a predictive model on a multicopter swarm, which ensured rapid and safe collective migration in cluttered environments. Schilling et al. [39] proposed learning vision-based swarm behaviors end to end directly from raw images and implemented collision-free navigation of a drone swarm in the real world. In all of these works, each agent in the drone swarm has knowledge of the desired region.

To sum up, research on self-organizing flocking algorithms for aerial swarms in various applications is still in its infancy. This paper mainly solves the problem of a drone swarm traversing an unknown environment. Unlike the above works, we propose a self-organizing flocking algorithm that does not require every agent to have knowledge of the desired region. This is particularly beneficial for drone swarms operating in interference-prone or harsh electromagnetic environments, where knowledge of the desired region cannot be guaranteed for each aircraft. Building on a mathematical bionic vision mechanism for perceiving the environment, we propose a vision-based adaptive goal-directed strategy, which effectively improves the efficiency of the drone swarm traversing a complex canyon-like environment.

Problem statement

In this section, a drone dynamics model is introduced first, including constraints on velocity and acceleration. Then, the communication topology of the drone swarm is presented. Finally, several evaluation metrics of the algorithm are proposed.

Individual drone dynamics

Consider an aerial swarm consisting of N individuals indexed 1, 2, ..., N. The dynamics of a single drone can be simplified to a second-order integrator model [40]. Therefore, we assume that the i-th drone, flying in altitude-hold mode, satisfies the following model:

$$\begin{aligned} \begin{aligned} \varvec{\dot{p}}_i(t)&={\varvec{v}}_i(t) \\ \varvec{\dot{v}}_i(t)&= -\frac{1}{\tau _i}({\varvec{v}}_i(t)-{\varvec{v}}_{i\_\text {d}}(t)) \end{aligned} \end{aligned}$$
(1)

where \({\varvec{p}}_i(t) \in {\mathbb {R}}^2\) and \({\varvec{v}}_i(t) \in {\mathbb {R}}^2\) are the position and velocity vectors of the i-th drone at time t, \(\tau _i \in {\mathbb {R}}^+\) is the characteristic time of the velocity response, and \({\varvec{v}}_{i\_\text {d}}(t) \in {\mathbb {R}}^2\) is the desired velocity vector of the i-th drone at time t, which is also the output of the flocking algorithm. We do not consider differences between drones and thus assume that \(\tau _i = \tau \) for all drones. Because of the physical constraints of the drone, the magnitudes of the velocity and acceleration are limited to \(v_{\max }\) and \(a_{\max }\), respectively. The limits are imposed by

$$\begin{aligned} {\varvec{v}}_{i\_\text {d}}(t)=\frac{{\varvec{v}}_{i\_\text {d}}(t)}{\vert {\varvec{v}}_{i\_\text {d}}(t)\vert }\cdot \min \left\{ \vert {\varvec{v}}_{i\_\text {d}}(t)\vert , v_{\max } \right\} \end{aligned}$$
(2)
$$\begin{aligned} {\varvec{a}}_i(t)=\frac{{\varvec{v}}_{i\_\text {d}}(t)-{\varvec{v}}_i(t)}{\vert {\varvec{v}}_{i\_\text {d}}(t)-{\varvec{v}}_i(t)\vert }\cdot \min \left\{ \frac{\vert {\varvec{v}}_{i\_\text {d}}(t)-{\varvec{v}}_i(t)\vert }{\delta t}, a_{\max } \right\} \end{aligned}$$
(3)

where \(\delta t\) is the interval of velocity updates.
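For concreteness, the model of Eqs. (1)-(3) can be sketched as a discrete-time update. The constants `TAU`, `V_MAX`, `A_MAX` and `DT` below are illustrative assumptions for the sketch, not the parameters used in our experiments.

```python
import numpy as np

TAU = 0.5    # velocity response time tau (s), assumed for illustration
V_MAX = 2.0  # speed limit v_max (m/s)
A_MAX = 1.5  # acceleration limit a_max (m/s^2)
DT = 0.05    # velocity update interval delta t (s)

def clamp_norm(vec, limit):
    """Scale vec down so that its magnitude does not exceed limit."""
    norm = np.linalg.norm(vec)
    if norm > limit:
        return vec * (limit / norm)
    return vec

def step(p, v, v_des, dt=DT):
    """One Euler step of Eq. (1) with the limits of Eqs. (2) and (3)."""
    v_des = clamp_norm(v_des, V_MAX)            # speed limit, Eq. (2)
    a = clamp_norm((v_des - v) / TAU, A_MAX)    # first-order response, capped
    v_new = clamp_norm(v + a * dt, V_MAX)
    p_new = p + v_new * dt
    return p_new, v_new
```

Clamping the desired velocity before computing the acceleration mirrors the order of Eqs. (2) and (3).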

Communication network between drones

To describe the model of the drone swarm in mathematical language, we use an undirected graph [41] \({\mathcal {G}}(t)=({\mathcal {V}},{\mathcal {E}}(t))\) to establish the communication network among the N drones, where \({\mathcal {V}}=\{v_1,v_2,\ldots ,v_N\}\) is the set of vertices representing the drones and \({\mathcal {E}}(t) \subseteq {\mathcal {V}} \times {\mathcal {V}}\) is the edge set that defines the communication links at time t.

Definition 1

The neighbor set \({\mathcal {N}}_i(t)\) of the i-th drone at time t is denoted by

$$\begin{aligned} {\mathcal {N}}_i(t)=\{v_j\in {\mathcal {V}}\vert (v_j,v_i)\in {\mathcal {E}}(t)\} \end{aligned}$$
(4)

Definition 2

The adjacency matrix \(\varvec{{\mathcal {A}}}(t)=[a_{ij}(t)]\in {\mathbb {R}}^{N \times N}\) represents the relationship between nodes and is expressed as

$$\begin{aligned} {a_{ij}(t)} = {\left\{ \begin{array}{ll} 1,&{}{\text {if}}\ (v_j,v_i)\in {\mathcal {E}}(t) \\ {0,}&{}{\text {otherwise.}} \end{array}\right. } \end{aligned}$$
(5)

If drone i can communicate directly with drone j, \(a_{ij}(t)=1\), otherwise \(a_{ij}(t)=0\). Note that \({\mathcal {G}}(t)\) is an undirected graph, thus \(a_{ij}(t)=a_{ji}(t)\) and \(i \in {\mathcal {N}}_j(t) \Leftrightarrow j \in {\mathcal {N}}_i(t)\) hold.
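As an illustration, the graph objects above can be realized in a few lines. The fixed communication radius `r_com` is an assumption made only for this sketch, since this section does not specify how edges are formed.

```python
import numpy as np

def adjacency(positions, r_com):
    """Build the symmetric 0/1 adjacency matrix of Eq. (5).
    positions: (N, 2) array; two drones are linked if within r_com."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    a = (dist <= r_com).astype(int)
    np.fill_diagonal(a, 0)   # no self-loops
    return a

def neighbors(a, i):
    """Neighbor set N_i(t) of Eq. (4), read off row i of the matrix."""
    return [j for j in range(a.shape[0]) if a[i, j] == 1]
```

Because the distance matrix is symmetric, the construction automatically satisfies \(a_{ij}=a_{ji}\).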

Evaluation metrics for swarm flocking

We consider three different categories of metrics to evaluate the application of the drone swarm traversing an unknown environment.

Overall efficiency

Time and quantity are used to judge the overall efficiency of the drone swarm traversing the unknown environment. The task is completed if all the drones are located within the desired region.

\(T_{\text {end}}\) indicates the time that the drone swarm requires to complete the task.

\(N_{\text {end}}\) represents the number of drones located in the desired region within \(T_{\text {end}}\).

The task completion rate (TCR) is the ratio of the number of completed runs to the total number of simulation runs.

Consensus

We use speed and direction to evaluate the consensus of the aerial swarm.

The normalized speed is expressed by

$$\begin{aligned} \Psi _{\text {vel}} = \frac{1}{T_{\text {end}}}\frac{1}{Nv_0}\int _{0}^{T_{\text {end}}} \left| \sum \limits _{i=1}^{N}{\varvec{v}}_i(t)\right| dt \end{aligned}$$
(6)

Note that \(\Psi _{\text {vel}}\) characterizes the degree to which the flocking velocity tends to the preferred migration speed \(v_0\); \(\Psi _{\text {vel}}=1\) only if all drones move in a common direction at speed \(v_0\).

The velocity correlation of the swarm can be evaluated by the following form:

$$\begin{aligned} \Psi _{\text {corr}} = \frac{1}{T_{\text {end}}}\frac{1}{N}\int _{0}^{T_{\text {end}}}\sum \limits _{i=1}^{N}\left( \frac{1}{N_i^{clu}-1}\sum \limits _{j\in C_i }\frac{{\varvec{v}}_i(t) \cdot {\varvec{v}}_j(t)}{\vert {\varvec{v}}_i(t)\vert \vert {\varvec{v}}_j(t)\vert }\right) dt \end{aligned}$$
(7)

where \(N_i^{clu}\) is the number of drones in the cluster containing the i-th drone, and \(C_i\) is the set of drones in that cluster excluding the i-th drone. Note that the value of the velocity correlation \(\Psi _{\text {corr}}\) lies between \(-1\) (complete disorder) and 1 (perfect order).
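A discrete-time approximation of Eqs. (6) and (7) may look as follows, under the simplifying assumption that the whole swarm forms a single cluster (so \(N_i^{clu}=N\) and \(C_i\) contains everyone but drone i).

```python
import numpy as np

def psi_vel(vel_history, v0, dt):
    """Eq. (6): vel_history is a (T, N, 2) array sampled every dt seconds."""
    t_end = vel_history.shape[0] * dt
    n = vel_history.shape[1]
    summed = np.linalg.norm(vel_history.sum(axis=1), axis=-1)  # |sum_i v_i(t)|
    return summed.sum() * dt / (t_end * n * v0)

def psi_corr(vel_history, dt):
    """Eq. (7) for a single cluster: mean pairwise cosine of velocities."""
    t_end, n = vel_history.shape[0] * dt, vel_history.shape[1]
    total = 0.0
    for frame in vel_history:
        unit = frame / np.linalg.norm(frame, axis=-1, keepdims=True)
        cos = unit @ unit.T                  # all pairwise cosines
        total += (cos.sum() - n) / (n - 1)   # drop the n self-terms
    return total * dt / (t_end * n)
```

For a perfectly aligned swarm cruising at \(v_0\), both metrics evaluate to 1.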

Safety

The safety assessment covers four aspects: collisions between drones and obstacles, collisions between drones, and the minimum and average inter-drone distances.

The ability of drones to avoid collisions with obstacles (or boundaries) is characterized by \(\Psi _{\text {obs}}\), which is governed by

$$\begin{aligned} \Psi _{\text {obs}} = \frac{1}{T_{\text {end}}}\frac{1}{N}\int _{0}^{T_{\text {end}}}\sum \limits _{i=1}^{N}f({\varvec{p}} _i,t)dt \end{aligned}$$
(8)

where \(f({\varvec{p}}_i,t)=1\) when the position of the i-th drone is inside an obstacle or outside the walls, and 0 otherwise. Consequently, \(\Psi _{\text {obs}}\) equals 0 when no collision occurs, and 1 when all agents are in collision for the entire task. The Heaviside step function \({\mathcal {H}}(x)\), used in the metrics and rules below, is defined as follows:

$$\begin{aligned} {\mathcal {H}}(x) = \left\{ \begin{aligned}&1 ,&x \ge 0 \\&0 ,&x <0 \end{aligned} \right. \end{aligned}$$
(9)

\(\Psi _{\text {coll}}\) represents the ability of drones to avoid collision with each other, which can be evaluated by

$$\begin{aligned} \Psi _{\text {coll}} = \frac{1}{T_{\text {end}}}\frac{1}{N(N-1)}\int _{0}^{T_{\text {end}}}\sum \limits _{i=1}^{N}\sum \limits _{j\ne i}{\mathcal {H}}(R-d_{ij}(t))dt \end{aligned}$$
(10)

where \(d_{ij}=\vert {\varvec{p}}_i-{\varvec{p}}_j\vert \) is the distance between the i-th and j-th drones. If no collision is registered, \(\Psi _{\text {coll}}=0\), whereas \(\Psi _{\text {coll}}=1\) when all pairs of drones are colliding.

\(r_{\text {min}}\) and \(r_{\text {ave}}\) are the minimum distance and average distance between all drones, respectively.
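The two collision metrics can be approximated from logged trajectories as below. `inside_obstacle` stands in for the indicator \(f({\varvec{p}}_i,t)\) and is an assumed callback; both functions return 0 for a perfectly safe flight, matching the semantics stated above.

```python
import numpy as np

def psi_obs(pos_history, inside_obstacle):
    """Fraction of (drone, time) samples inside an obstacle or outside walls.
    pos_history: (T, N, 2) positions; inside_obstacle: point -> bool."""
    t_frames, n = pos_history.shape[:2]
    hits = sum(inside_obstacle(p) for frame in pos_history for p in frame)
    return hits / (t_frames * n)

def psi_coll(pos_history, r_coll):
    """Fraction of ordered drone pairs closer than r_coll, averaged over time."""
    t_frames, n = pos_history.shape[:2]
    total = 0
    for frame in pos_history:
        d = np.linalg.norm(frame[:, None, :] - frame[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)        # ignore self-distances
        total += int((d < r_coll).sum())   # ordered colliding pairs
    return total / (t_frames * n * (n - 1))
```

The time average over frames replaces the \(1/T_{\text {end}}\int \cdot \,dt\) factor, so the sampling interval cancels out.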

Inspired by [14], we use a genetic algorithm to find the optimal parameters that maximize \(\Psi _{\text {vel}}\) and \(\Psi _{\text {corr}}\) while minimizing \(\Psi _{\text {coll}}\) and \(\Psi _{\text {obs}}\).

Proposed flocking algorithm for drone swarm

In this section, the desired dynamics of each agent are determined by its neighbors \({\mathcal {N}}_i(t)\) and the local environment. A flocking algorithm for the drone swarm traversing unknown environments is proposed first, in both non-AGDS and AGDS variants. Then, each term of the flocking algorithm is introduced in detail.

General flocking law

This paper mainly solves the problem of a drone swarm traversing an unknown environment as a group, under the condition that not all agents can be guaranteed to have knowledge of the desired region. As shown in Fig. 1, the proposed flocking algorithm is expressed mathematically as follows:

$$\begin{aligned} {\varvec{v}}_{i\_\text {d}}= {\varvec{v}}_{i\_\text {sac}} + {\varvec{v}}_{i\_\text {v}} + (1-\lambda _i){\varvec{v}}_{i\_\text {w}} + \lambda _i{\varvec{v}}_{i\_\text {t}} \end{aligned}$$
(11)

where \({\varvec{v}}_{i\_\text {sac}}\) is the SAC interaction term comprising separation \(({\varvec{v}}_{i\_\text {s}})\), alignment \(({\varvec{v}}_{i\_\text {a}})\) and cohesion \(({\varvec{v}}_{i\_\text {c}})\), \({\varvec{v}}_{i\_\text {v}}\) is the visual perception term for obstacle avoidance, \(\lambda _i\) is the informed coefficient representing the agent's role, \({\varvec{v}}_{i\_\text {w}}\) is the will propagation term, and \({\varvec{v}}_{i\_\text {t}}\) is the task information term.

Fig. 1
figure 1

Schematic diagram of the desired velocity \({\varvec{v}}_{i\_\text {d}}\). a The velocity component of the informed agent. b The velocity component of the uninformed agent

We define the drones equipped with task information as informed agents. The identities of the informed agents are hidden, so feedback information must be obtained from neighbors; the status of all agents is therefore equal. This also means that the loss of some informed agents does not affect the information interaction within the flock. Informed agents can also be added dynamically, or previously ordinary agents can be designated as informed agents.
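A minimal sketch of Eq. (11) makes the role of the informed coefficient explicit; the four term vectors passed in are placeholders for the components derived in the following subsections.

```python
import numpy as np

def desired_velocity(v_sac, v_vis, v_will, v_task, informed):
    """Eq. (11): informed selects lambda_i = 1 (informed) or 0 (uninformed)."""
    lam = 1.0 if informed else 0.0
    return v_sac + v_vis + (1.0 - lam) * v_will + lam * v_task
```

An informed agent follows the task information term, while an uninformed agent substitutes the will propagation term in its place; the SAC and visual terms act on every drone identically.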

Separation-alignment-cohesion interaction

We are inspired by a well-known model that allows a drone swarm to roam in confined environments [14]. That model includes a separation rule to prevent collisions between drones and an alignment rule to reduce velocity oscillations. In addition, we include a cohesion rule to prevent the drone swarm from splitting up. We integrate these three basic requirements as follows:

$$\begin{aligned} {\varvec{v}}_{i\_\text {sac}} = {\varvec{v}}_{i\_\text {s}} + {\varvec{v}}_{i\_\text {a}} + {\varvec{v}}_{i\_\text {c}} \end{aligned}$$
(12)

where \({\varvec{v}}_{i\_\text {s}}\), \({\varvec{v}}_{i\_\text {a}}\) and \({\varvec{v}}_{i\_\text {c}}\) define the requirements of separation, alignment, and cohesion, respectively. The three requirements are illustrated in Fig. 2.

Fig. 2
figure 2

Schematic diagram of the SAC rule interaction term. Different colors indicate different sizes of separation, alignment, and cohesion zone ranges

Separation rule

Each drone needs to avoid collisions with other drones; this is one of the highest-priority rules. The separation between drones is defined by

$$\begin{aligned} {\varvec{v}}_{i\_\text {s}}=p_{\text {sep}}\sum _{j=1}^{N}a_{ij}(r_{\text {sep}}-d_{ij})\cdot \frac{{\varvec{p}}_i-{\varvec{p}}_j}{d_{ij}}\cdot {\mathcal {H}}(r_{\text {sep}}-d_{ij}) \end{aligned}$$
(13)

where \(p_{\text {sep}}\) is the linear gain of the collision avoidance effect, and \(r_{\text {sep}}\) is the radius of the separation zone.

Alignment rule

In actual flocking, the velocity alignment rule is essential. It reduces oscillation and prevents dangerous situations by eliminating excessive velocity differences between nearby neighbors. The alignment rule is defined by

$$\begin{aligned} {\varvec{v}}_{i\_\text {a}}=p_{\text {ali}}\sum _{j=1}^{N}a_{ij}\left( v_{ij}-\Delta v^{\text {max}}_{ij}\right) \cdot \frac{{\varvec{v}}_j-{\varvec{v}}_i}{v_{ij}}\cdot {\mathcal {H}}\left( v_{ij}-\Delta v^{\text {max}}_{ij}\right) \end{aligned}$$
(14)

where \(p_{\text {ali}}\) is the linear gain of the velocity alignment effect, \(\Delta v^{\text {max}}_{ij}\) is the maximum allowable velocity difference defined in Ref. [14], and \(v_{ij}=\vert {\varvec{v}}_i-{\varvec{v}}_j\vert \) is the magnitude of the velocity difference between the i-th and j-th drones.

Cohesion rule

When the drone swarm traverses environments with obstacles, it can easily split into multiple sub-swarms. If no individual in a subgroup has knowledge of the desired region, the drones in that subgroup can hardly reach the desired region. The cohesion rule is

$$\begin{aligned} {\varvec{v}}_{i\_\text {c}}=p_{\text {coh}}\sum _{j=1}^{N}{a}_{ij}\left( d_{ij}-r_{\text {coh}}\right) \cdot \frac{{\varvec{p}}_j-{\varvec{p}}_i}{d_{ij}}\cdot {\mathcal {H}}\left( d_{ij}-r_{\text {coh}}\right) \end{aligned}$$
(15)

where \(p_{\text {coh}}\) is the linear gain of cohesion and \(r_{\text {coh}}\) is the distance at which the cohesion rule starts to work. Note that \(r_{\text {coh}}\ge r_{\text {sep}}\).
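The three rules of Eqs. (13)-(15) can be sketched for a single drone as follows. The gains and radii passed in are free parameters (in our setting they would come from the metric optimization), and the alignment contribution is written so that it pulls the drone toward its neighbor's velocity.

```python
import numpy as np

def sac_term(i, pos, vel, a_row, p_sep, r_sep, p_ali, dv_max, p_coh, r_coh):
    """SAC interaction term of Eq. (12) for drone i.
    pos, vel: (N, 2) arrays; a_row: row i of the adjacency matrix."""
    v = np.zeros(2)
    for j in range(len(pos)):
        if j == i or a_row[j] == 0:
            continue
        d = np.linalg.norm(pos[i] - pos[j])
        if d < r_sep:                                   # separation, Eq. (13)
            v += p_sep * (r_sep - d) * (pos[i] - pos[j]) / d
        dv = np.linalg.norm(vel[i] - vel[j])
        if dv > dv_max:                                 # alignment, Eq. (14)
            v += p_ali * (dv - dv_max) * (vel[j] - vel[i]) / dv
        if d > r_coh:                                   # cohesion, Eq. (15)
            v += p_coh * (d - r_coh) * (pos[j] - pos[i]) / d
    return v
```

With \(r_{\text {coh}}\ge r_{\text {sep}}\), at most one of the separation and cohesion branches fires for a given neighbor, producing a dead zone of neutral distances between the two radii.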

Visual perception

In our model, we regard each drone as a disk with radius R. The visual projection field of the drone is a circle with radius \(r_{\text {vis}}\). The projection field of drone i is divided into L parts, labeled \(l = 1,2,\ldots ,L\). Each line of sight \(\phi _{il}\) is abstracted as the segment from the center to the circumference (see Fig. 3a). We suppose the drone has a 360-degree field of view, but walls and obstacles block its lines of sight [17]. The angle between two adjacent lines of sight is \(\Delta \Phi = 2\pi /L\). Obviously, the closer the robot is to walls and obstacles, the larger the obscured part of its field of view. We therefore define the projection field function as follows (see Fig. 3b):

$$\begin{aligned} {V(\phi _i)} = {\left\{ \begin{array}{ll} 1,&{}{\text { if the line of sight is blocked}}\\ {0,}&{}{\text {otherwise.}} \end{array}\right. } \end{aligned}$$
(16)
Fig. 3
figure 3

Schematic diagram of the drone using visual mechanism to perceive the environment. a The field of view projection of the drone. The black square represents the unknown environment. The red disk in the center of the square represents the i-th drone, and its line of sight is a solid black line evenly distributed around it. The blue circle represents the range of vision of the drone, with a radius \(r_{\text {vis}}\). The gray area indicates that it is out of the line of sight. The six green disks and the two solid black lines through the grey area represent obstacles and walls present in the environment, respectively. b The projection field function of the drone in (a). c The velocity direction \({{\textbf {v}}}_i\) of the drone is used as the reference direction (\({{\textbf {x}}}_v\)) of the field of view projection field. The upper and lower boundaries of the angle of the area occupied by the field of view are \(\phi _{il}^{\text {upp}}\) and \(\phi _{il}^{\text {low}}\), respectively. The angle of the occupied area of the visual field and the angle of its center are \(\phi _{il}^{\text {upp}}-\phi _{il}^{\text {low}}\) and \(1/2(\phi _{il}^{\text {upp}}+\phi _{il}^{\text {low}})\), respectively

As shown in Fig. 3c, the direction of the velocity \({\varvec{v}}_i\) of drone i is aligned with the \({{\textbf {x}}}_v\) axis, while its counterclockwise normal is the \({{\textbf {y}}}_v\) axis. We define the upper and lower bounds of the angle occupied in the field-of-view projection field as \(\phi _{il}^{\text {upp}}\) and \(\phi _{il}^{\text {low}}\), respectively. The motion of drone i based on the visual mechanism is then formulated as follows:

$$\begin{aligned} {\varvec{v}}_{i\_\text {v}}=\varvec{{\mathcal {R}}}\sum _{l=1}^{L}\left( -\sin \frac{\phi _{il}^{\text {upp}}-\phi _{il}^{\text {low}}}{2} \left[ \begin{matrix} p_{\text {fb}} \cos \frac{\phi _{il}^{\text {upp}}+\phi _{il}^{\text {low}}}{2}\\ p_{\text {lr}} \sin \frac{\phi _{il}^{\text {upp}}+\phi _{il}^{\text {low}}}{2} \end{matrix} \right] \right) \end{aligned}$$
(17)

where \(p_{\text {fb}}\) and \(p_{\text {lr}}\) are linear coefficients that characterize shifting and steering, respectively. Note that the rotation matrix to transform a vector from the body coordinate system \({{\textbf {O}}} _v- {} {{\textbf {x}}} _v{{\textbf {y}}} _v\) to the world coordinate system \({{\textbf {O}}}-{{\textbf {x}}}{{\textbf {y}}} \) is given by

$$\begin{aligned} \varvec{{\mathcal {R}}}=\left[ \begin{matrix} \cos \psi _i &{}\quad -\sin \psi _i \\ \sin \psi _i &{}\quad \cos \psi _i \end{matrix} \right] \end{aligned}$$
(18)

where \(\psi _i\) is the heading of the i-th drone expressed in the coordinate system \({{\textbf {O}}}-{{\textbf {x}}}{{\textbf {y}}} \).

Fig. 4
figure 4

Schematic diagram of the obstacle avoidance function of the visual perception term. a represents the change of the projection of the field of view of the obstacle located in front of and behind the drone i. The black arrow indicates the velocity direction of the drone i. The depth of the color represents the strength of the obstacle’s influence. b represents the change of the field of view projection when obstacles are located on the left and right sides of the drone i

The motion of the drone can be decomposed into radial and normal motion. The former characterizes acceleration and deceleration, and the latter describes left and right steering [27]. The drone thus generates avoidance behavior by sensing the size and position of the projections of obstacles or walls in its field of view. As shown in Fig. 4, the vision term \({\varvec{v}}_{i\_\text {v}}\) produces a short-range repulsion: the first component of the vector in Eq. (17) yields repulsion in the front-back direction (Fig. 4a), and the second component yields repulsion in the left-right direction (Fig. 4b).
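A sketch of Eqs. (17)-(18): each blocked region of the projection field contributes a repulsion whose strength grows with the region's angular width, computed in the body frame and then rotated into the world frame. Representing the blocked regions as a list of angular intervals is an implementation assumption.

```python
import numpy as np

def visual_term(blocked, p_fb, p_lr, psi):
    """Eq. (17): blocked is a list of (phi_low, phi_upp) intervals in the
    body frame (x_v along the current velocity); psi is the heading."""
    v = np.zeros(2)
    for phi_low, phi_upp in blocked:
        half_width = 0.5 * (phi_upp - phi_low)
        mid = 0.5 * (phi_upp + phi_low)
        v += -np.sin(half_width) * np.array([p_fb * np.cos(mid),
                                             p_lr * np.sin(mid)])
    # Eq. (18): rotate from the body frame into the world frame
    rot = np.array([[np.cos(psi), -np.sin(psi)],
                    [np.sin(psi),  np.cos(psi)]])
    return rot @ v
```

An obstacle straight ahead (an interval centred on zero) produces a purely backward push, reproducing the braking behavior of Fig. 4a.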

Will propagation

Due to the limited communication and sensing capabilities of a single drone, it is difficult for the aerial swarm to avoid obstacles quickly when it encounters them. Remarkably, when a starling encounters natural enemies or obstacles, it quickly spreads this valuable information through the flock by calls and other means, enabling rapid escape or avoidance [17]. In a real environment, it is therefore essential to transmit valuable information quickly between robots. For example, when a robot at the front of the flock perceives an obstacle, it can quickly transmit this information to the other robots so that the flock can respond promptly with obstacle avoidance. To tackle such issues, a will propagation term \({\varvec{v}}_{i\_\text {w}}\) is introduced and expressed as

$$\begin{aligned} {\varvec{v}}_{i\_\text {w}}=v_0{\mathcal {W}}, \end{aligned}$$
(19)

where \({\mathcal {W}}\) is a unit vector, which indicates the direction of the will propagation, given by [32]

$$\begin{aligned} {\mathcal {W}}={\mathcal {N}}((1-\Omega _i){\mathcal {R}}[{\varvec{v}}_i;s_i^{WiSt}\delta t]+\Omega _iv_0\hat{{\varvec{v}}}_i^{WiSt}), \end{aligned}$$
(20)

where \(\hat{{\varvec{v}}}_i^{WiSt}\) is the desired velocity of the WiSt model, \({\mathcal {N}}(\cdot )\) is the normalization operator, \({\mathcal {R}}[{\varvec{v}};\theta ]\) is the operator that rotates the vector \({\varvec{v}}\) by the angle \(\theta \) in the \({{\textbf {x}}}-{{\textbf {y}}}\) plane, and \(s_i^{WiSt}\) is the intrinsic angular momentum. Note that \(\Omega _i\) distinguishes between strong and weak will. For further discussion of Eq. (20), refer to Ref. [32].
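Eqs. (19)-(20) can be sketched as below, treating the WiSt desired velocity \(\hat{{\varvec{v}}}_i^{WiSt}\) as a given input, since its computation is specified in Ref. [32].

```python
import numpy as np

def rotate(v, theta):
    """The operator R[v; theta]: rotate v by theta in the x-y plane."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * v[0] - s * v[1], s * v[0] + c * v[1]])

def will_term(v_i, v_wist, omega, spin, v0, dt):
    """Eqs. (19)-(20): blend the drone's rotated own velocity with the
    WiSt desired velocity, normalize, and scale to the preferred speed.
    spin is the intrinsic angular momentum s_i^WiSt."""
    w = (1.0 - omega) * rotate(v_i, spin * dt) + omega * v0 * v_wist
    w /= np.linalg.norm(w)   # normalization operator N(.)
    return v0 * w            # Eq. (19)
```

With \(\Omega _i=1\) (strong will) the term reduces to the WiSt direction at speed \(v_0\); with \(\Omega _i=0\) the drone keeps turning its own velocity by \(s_i^{WiSt}\delta t\) per step.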

Task information with adaptive goal-directed strategy

Collective motion based on SAC self-organizing rules often shows an aimless roaming phenomenon. Thus, we design a task information term to ensure that the drones can reach the desired region. The primary function of this term is to give the drones knowledge of the desired region. The task information can be represented by the preferred direction \({\varvec{v}}_{\text {pre}}\), i.e., a unit velocity vector pointing to the desired region. The task information term is therefore defined as

$$\begin{aligned} {{\varvec{v}}}_{i\_t} = v_0{\varvec{v}}_{\text {pre}} \end{aligned}$$
(21)

where \(v_0\) is the preferred speed. Eq. (21) represents the non-adaptive goal-directed strategy (non-AGDS).

However, drones operating in unknown environments often encounter serious conflicts between task information and obstacle avoidance, which can leave a drone caught in a dilemma. In nature, migrating or foraging animals adjust their behavior based on real-time information about the environment. Therefore, we propose an adaptive goal-directed strategy (AGDS) based on visual perception.

Fig. 5

Schematic diagram of the task information term with adaptive goal-directed strategy. The orange arrow \({{\textbf {v}}}_i\) indicates the velocity direction of the drone. The red arrow \({{\textbf {v}}}_{\text {pre}}\) indicates the preferred direction to the desired region. The pink arrow \({{\textbf {v}}}_{i\_\text {t}}\) indicates the adaptive task information direction. \(\Phi _i^{\text {task}}\) is the angle of rotation

Firstly, the angle \(\Delta \Phi _{\text {safe}}\) required for safe passage of the drone is determined as follows:

$$\begin{aligned} \Delta \Phi _{\text {safe}} = 2\arctan \left( \frac{R}{r_{\text {vis}}}\right) \end{aligned}$$
(22)

We can then calculate the number \(n_{s}\) of regions of the visual field occupied by the safe angle \(\Delta \Phi _{\text {safe}}\), given by

$$\begin{aligned} n_{s} = \frac{\Delta \Phi _{\text {safe}}}{\Delta \Phi } \end{aligned}$$
(23)

The lower and upper bounds of the safe angle \(\Delta \Phi _{\text {safe}}\) in the projection field are \(\phi ^{\text {low}}_{\text {safe}}\) and \(\phi ^{\text {upp}}_{\text {safe}}\), respectively. Meanwhile, we determine the number \(n_{\theta }\) of regions of the visual field occupied by the angle \(\theta \) between the velocity \({\varvec{v}}_i\) and the goal-directed velocity \({\varvec{v}}_{\text {pre}}\). Next, we transform the original projection field \(V(\phi )\) (see Fig. 3a), whose origin is the velocity \({\varvec{v}}_i\), into the projection field \(V(\phi _{\text {task}})\) (see Fig. 5), whose origin is the goal-directed velocity \({\varvec{v}}_{\text {pre}}\), by rotating the field by \(n_{\theta }\) regions.
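A minimal Python sketch of Eqs. (22)–(23) and the field rotation follows, assuming the projection field is discretized into equal angular sectors (one occupancy value per sector); the function name and sector convention are illustrative assumptions:

```python
import numpy as np

def rotate_projection_field(V, v_i, v_pre, R=0.1, r_vis=0.4):
    """Sketch of Eqs. (22)-(23): compute the safe angle and its sector count,
    then re-index the projection field so it starts from v_pre."""
    n = len(V)
    dphi = 2.0 * np.pi / n                   # sector width, ΔΦ
    dphi_safe = 2.0 * np.arctan(R / r_vis)   # Eq. (22): safe passage angle
    n_s = int(np.ceil(dphi_safe / dphi))     # Eq. (23): sectors in the safe angle
    # angle θ between the current velocity and the preferred direction
    theta = np.arctan2(v_pre[1], v_pre[0]) - np.arctan2(v_i[1], v_i[0])
    n_theta = int(round(theta / dphi)) % n
    # V(φ_task): the same field rotated by n_θ sectors to start from v_pre
    V_task = np.roll(V, -n_theta)
    return V_task, n_s, n_theta
```

With \(R=0.1\) m and \(r_{\text{vis}}=0.4\) m as in the paper, \(\Delta\Phi_{\text{safe}} = 2\arctan(0.25) \approx 0.49\) rad.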

Depending on whether there are obstacles in the line of sight along the task direction, two situations must be handled:

The first situation without obstacles is given by

$$\begin{aligned} \sum _{\phi ^{\text {low}}_{\text {safe}}}^{\phi ^{\text {upp}}_{\text {safe}}}V(\phi _{\text {safe}})=0, \end{aligned}$$
(24)

which implies that the line of sight is not blocked within the range of the safe angle \(\Delta \Phi _{\text {safe}}\); in this case the task information term of the drone is given by Eq. (21).

Another situation with obstacles is expressed as

$$\begin{aligned} \sum _{\phi ^{\text {low}}_{\text {safe}}}^{\phi ^{\text {upp}}_{\text {safe}}}V(\phi _{\text {safe}})\ne 0, \end{aligned}$$
(25)

which implies that there is a conflict between the preferred direction and obstacle avoidance. The goal direction can be adaptively corrected to alleviate such a conflict. We define a cross-correlation function to represent the relationship between the passable direction and the projection field as follows:

$$\begin{aligned} V_c(\phi _{\text {task}})=\frac{1}{\Delta \Phi _{\text {safe}}}\sum _{k=-n_{s}/2}^{n_{s}/2} \varvec{{\mathcal {M}}}(k\Delta \Phi )V(\phi _{\text {task}}+k\Delta \Phi )\Delta \Phi , \end{aligned}$$
(26)

where \(\varvec{{\mathcal {M}}}(k\Delta \Phi )\in {\mathbb {R}}^{n_{s}}\) is a weighting matrix indicating the degree of security within the safe region \(\Delta \Phi _\text {safe}\); its elements are all set to 1. Therefore, the value of \(V_c(\phi _{\text {task}})\) lies in the range [0, 1]. Any region that satisfies \(V_c(\phi _{\text {task}})=0\) is a passable direction.
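The blocked-window test of Eqs. (24)–(25) and the cross-correlation of Eq. (26) can be sketched as follows. With all weights \(\varvec{\mathcal {M}} = 1\), the normalization \(\Delta\Phi/\Delta\Phi_{\text{safe}} \approx 1/(n_s+1)\) makes Eq. (26) approximately a sliding-window mean; function names and the wrap-around indexing are illustrative assumptions:

```python
import numpy as np

def safe_blocked(V_task, n_s):
    """Eqs. (24)-(25): the safe window around the preferred direction
    (sector 0 of the rotated field) is blocked iff its occupancy sum is nonzero."""
    n = len(V_task)
    half = n_s // 2
    return sum(V_task[k % n] for k in range(-half, half + 1)) != 0

def cross_correlation(V_task, n_s):
    """Sketch of Eq. (26): window-averaged occupancy over the safe angle,
    so V_c in [0, 1] and V_c == 0 marks a passable direction."""
    n = len(V_task)
    half = n_s // 2
    V_c = np.zeros(n)
    for i in range(n):
        # sum V(φ_task + kΔΦ) for k in [-n_s/2, n_s/2], wrapping around the circle
        window = [V_task[(i + k) % n] for k in range(-half, half + 1)]
        V_c[i] = np.mean(window)
    return V_c
```

If `safe_blocked` is false, the non-AGDS term of Eq. (21) is used directly; otherwise `cross_correlation` supplies the field for the correction in Eqs. (27)–(30).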

\(V_c(\phi _{\text {task}})\) is divided into two parts (\(V_c^{\text {L}}\) and \(V_c^{\text {R}}\)) with the initial preferred direction \({\varvec{v}}_{\text {pre}}\) as the origin, and the passable directions on both sides are obtained:

$$\begin{aligned} \left\{ \begin{aligned} \phi _{im}^{\text {L}}=\arg \min (V_c^\text {L})\\ \phi _{im}^{\text {R}}=\arg \min (V_c^\text {R}) \end{aligned} \right. \end{aligned}$$
(27)

We then convert the passable directions in Eq. (27) to the projection field \(V(\phi )\) with the velocity \({\varvec{v}}_i\) as the origin and compute the sight line of the passable directions as follows:

$$\begin{aligned} \phi _{im}=\mathop {\arg \min }\limits _{\phi _{im}^{\text {task}}}[\vert \phi _{im}^{\text {task}}-n_{\theta }\vert ] \end{aligned}$$
(28)

where \(\phi _{im}^{\text {task}}\in [\phi _{im}^{\text {L}},\phi _{im}^{\text {R}}]\). Thus, the angle corresponding to the sight line \(\phi _{im}\) in the projection field \(V(\phi )\) is \(\Phi _{i\_\text {m}}\). This angle is converted to the world coordinate system \({{\textbf {O}}}-{{\textbf {x}}}{{\textbf {y}}}\) as follows:

$$\begin{aligned} \Phi _i^{\text {task}} = \psi _i + \Phi _{i\_\text {m}} \end{aligned}$$
(29)

Finally, the task information with adaptive goal-directed strategy (AGDS) is defined as follows:

$$\begin{aligned} {{\varvec{v}}}_{i\_\text {t}} = v_0\left[ \begin{matrix} \cos \Phi _i^{\text {task}} \\ \sin \Phi _i^{\text {task}} \end{matrix} \right] \end{aligned}$$
(30)

Validation of the proposed flocking algorithms: typical scenario

In this section, a typical scenario is designed to verify the proposed non-AGDS and AGDS flocking algorithms. Simulations and experiments are presented in the following subsections to show the effectiveness of the proposed method; a video of the simulations and experiments is available as a supplemental movie (Footnote 1).

Simulation validation

Fig. 6

Schematic diagram of the flight area and initial positions of drones. The six drones are placed in the flight area, and the initial position of the informed agent can be any one of the six positions

Simulation setup and parameters

A typical scenario of six drones with \(r_{\text {cm}}\) = 2.0 m, \(r_{\text {vis}}\) = 0.4 m, \(r_{\text {sep}}\) = 0.4 m, \(r_{\text {coh}}\) = 0.8 m and \(v_0\) = 0.3 m/s, as shown in Fig. 6, is designed. In this typical scene of 4 m \(\times \) 6.5 m, there is an obstacle region of 2.6 m \(\times \) 1.2 m in front of the desired region. The initial positions of the drones are fixed in the start region. The parameters of the algorithm are \(p_{\text {sep}}\) = 1.00, \(p_{\text {ali}}\) = 1.12, \(p_{\text {coh}}\) = 0.15, \(p_{\text {fb}} = p_{\text {lr}}\) = 0.31.

Simulation results and comparisons

Fig. 7

Simulation diagram of the flight trajectory of the drone swarm when the informed agent is in different initial positions. a Simulation results of the proposed flocking algorithm with non-AGDS. TCR is the task completion rate from 15 simulations. b Simulation results of the proposed flocking algorithm with AGDS

The task information terms of the flocking algorithms with non-AGDS and AGDS are represented by Eq. (21) and Eq. (30), respectively. Figure 7a shows the simulation results of the proposed flocking algorithm with non-AGDS. When the initial position of the informed agent is 3# or 4#, the drone swarm successfully reaches the desired region. As shown in Fig. 6, 3# and 4# are the positions closest to the passable area, which means that the informed agent can quickly escape the influence of obstacles and thus reduce the conflict between task information and obstacle avoidance. In stark contrast, 1# and 6#, which are farthest from the passable area, are the initial positions most likely to fall into the conflicting region; 2# and 5# are intermediate cases between these two extremes. Figure 7b shows the simulation results of the proposed flocking algorithm with AGDS. Here, no initial position of the informed agent causes the drone swarm to fail to reach the desired region. In addition, the TCR of this method is significantly higher than that of the flocking algorithm with non-AGDS. When task information conflicts with obstacle avoidance, AGDS adaptively modifies the task information term according to the obstacle information in the field of vision of the informed agent. This greatly alleviates the conflict between task information and obstacle avoidance and prevents the informed agent from oscillating continually near obstacles, which is essential for physical robots.

Figure 8a shows the task completion time of the non-AGDS flocking algorithm when the informed agent is in different positions in simulation. Compared with position 4#, 3# is farther from the obstacle and less prone to conflicts between task information and obstacle avoidance. Therefore, when the initial position of the informed agent is 3#, the time \(T_{\text {end}}\) it takes for the drone swarm to reach the desired region is shorter, only 43.4 s. Compared with the flocking algorithm with non-AGDS, the flocking algorithm with AGDS reduces the time it takes the drone swarm to traverse the unknown environment by more than half. In Fig. 8b, the drone swarm successfully reaches the desired region within 30 s, and the shortest time is only 22.8 s.

Fig. 8

Time it takes for the drone swarm to traverse the typical scenario in simulation. a The task completion time of the non-AGDS flocking algorithm when the informed agent is in different positions. b The task completion time of the AGDS flocking algorithm when the informed agent is in different positions

Experimental validation

Experimental setup and parameters

We establish the physical environment shown in Fig. 9. We use six DJI Tello drones with commercial autopilots for the physical experiments. Based on this DJI autopilot, the velocity of the Tello drone can track the velocity command (see Eq. (11)) generated by the flocking algorithm within a reasonable time. A motion capture system, OptiTrack, is installed indoors. The optical motion capture cameras obtain the ground truth of the position and velocity of each drone by sensing the infrared-reflective markers on the drone. The ground control station is connected to OptiTrack and the drones through a local network. The information transfer of the experimental system and the configuration of the drone are detailed in Fig. 9a. The drone swarm takes off from the start region (green area), traverses the task area (gray area), and finally lands in the desired region (red area). Note that the drone swarm may face various adverse situations in the task area, such as obstacles and walls. The parameters of the algorithm are \(p_{\text {sep}}\) = 5.00, \(p_{\text {ali}}\) = 1.12, \(p_{\text {coh}}\) = 0.15, \(p_{\text {fb}}=p_{\text {lr}}\) = 0.24.

Fig. 9

Configuration of the physical flight platform based on motion capture. a The platform consists of six DJI Tello drones, a ground control station, a router for building a local area network, and OptiTrack optical motion capture cameras. The hardware composition of the drone is shown in the purple block diagram. Some drones are equipped with task information (orange arrow) to guide the goal-directed flight of the drone swarm. When all drones pass through the gray area to the desired region (red zone), the task is completed. b The real-world experimental environment

Fig. 10

Experimental diagram of the flight trajectory of the drone swarm when the informed agent is in different initial positions. a Experimental results of the proposed flocking algorithm with non-AGDS. The red and blue disks represent the informed and uninformed agents, respectively. b Experimental results of the proposed flocking algorithm with AGDS

Experimental results and comparisons

In the indoor experiments, we set the experimental time threshold to 80 s. In addition, due to the limited site area, we set the desired region as a closed region, which introduces an uncertain deviation into the task completion time of each experiment. Figure 10a shows the experimental results of the proposed flocking algorithm with non-AGDS. The experiments confirm the conclusions from the simulation. Only when the initial position of the informed agent is 3# or 4# can the drone swarm quickly reach the desired region. When the initial position of the informed agent is 1# or 6#, conflicts between task information and obstacle avoidance arise very easily, making it difficult for the drone swarm to reach the desired region. Figure 10b shows the experimental results of the proposed flocking algorithm with AGDS. In this set of experiments, the drone swarms all reach the desired region under the action of the adaptive strategy.

Figure 11a shows the task completion time of the non-AGDS flocking algorithm when the informed agent is in different positions in the experiment. When the initial position of the informed agent is 3# and 4#, the drone swarm takes 47.77 s and 51.43 s, respectively, to reach the desired region. Figure 11b shows the task completion time of the AGDS flocking algorithm when the informed agent is in different positions in the experiment. The experimental results show that the drone swarm reaches the desired area within the experimental time threshold.

Fig. 11

Time it takes for the drone swarm to traverse the typical scenario in experiment. a The task completion time of the non-AGDS flocking algorithm when the informed agent is in different positions. b The task completion time of the AGDS flocking algorithm when the informed agent is in different positions

Further batch analysis on flocking performance

To assess the reproducibility of the results, we performed 15 stochastic simulations for each of the six initial positions of the informed agent and for the two strategies, and we report here the aggregated performance results (Fig. 12). The initial position of the informed agent has a great influence on the flocking algorithm with non-AGDS. When the initial position of the informed agent is 3#, both \(T_{\text {end}}=32.66\pm 7.52\,\)s and \(N_{\text {end}}=6\pm 0\) are significantly better than for the other initial positions. In contrast, the flocking algorithm with AGDS performs better on \(T_{\text {end}}\) and \(N_{\text {end}}\) (1#, \(T_{\text {end}}=30.05\pm 1.98\,\)s; 2#, \(T_{\text {end}}=30.45\pm 3.00\,\)s; 3#, \(T_{\text {end}}=30.18\pm 3.64\,\)s; 4#, \(T_{\text {end}}=23.53\pm 3.09\,\)s; 5#, \(T_{\text {end}}=21.74\pm 0.15\,\)s; 6#, \(T_{\text {end}}=25.87\pm 2.78\,\)s; \(1\#\sim 6\)#, \(N_{\text {end}}=6.00\pm 0.00\)).

Fig. 12

Aggregated results (average and standard deviation) of 15 stochastic simulations of the different initial positions of the informed agent for the non-AGDS and AGDS flocking algorithm. The represented metrics are the traversal time \(T_{\text {end}}\), the normalized swarm speed \(\Psi _{\text {vel}}\), the minimum distance \(r_{\text {min}}\), the number \(N_{\text {end}}\) of individuals arriving at the desired region, the velocity correlation \(\Psi _{\text {corr}}\) and the average distance \(r_{\text {ave}}\)

Fig. 13

Simulation of the flocking algorithm with AGDS in a complex irregular environment. a The flight trajectories of the drone swarm when traversing the complex and irregular environment. b The evaluation metrics of the flocking algorithm with AGDS

Table 1 Aggregate performance of the informed agent at different initial positions in the canyon-like environment

Regardless of the initial position of the informed agent, the drone swarm successfully reaches the desired region. This means that informed agents can be added or removed during the flight without causing conflict between task information and obstacle avoidance. The aerial swarm deployed with the AGDS flocking algorithm also performs very well in tracking the preferred swarm speed (\(\Psi _{\text {vel}}=0.98\pm 0.01, 1.00\pm 0.02, 1.01\pm 0.01, 0.98\pm 0.01, 0.98\pm 0.03, 0.97\pm 0.01\)). In addition, the drone swarm maintains better velocity consistency \(\Psi _{\text {corr}}\) during flight. In terms of safety, the minimum distance \(r_{\text {min}}\) and the average distance \(r_{\text {ave}}\) between drones under both strategies are in the safe range (\(>2R\)).

Extended simulation validation: complicated scenarios

In the real world, a drone swarm often operates in complex and dangerous environments, for example, exploring pathways and transporting supplies in post-disaster canyons. In such environments, there is no way to ensure that every drone has information about the desired region. To extend the proposed flocking algorithm with AGDS to practical transportation applications, we need to verify it in a more complicated scenario. Due to the limited coverage of the indoor motion capture system, we design a canyon-like environment in simulation. The canyon-like environment is a 16 m long irregular flight area containing many obstacles that conflict with the task information.

Figure 13a shows the flight trajectories of the drone swarm when traversing the complex and irregular environment. As shown in Fig. 13b, it takes the drone swarm only 58.4 s to traverse the complex and irregular environment. The drone swarm maintains a high normalized swarm speed \(\Psi _{\text {vel}}\) and velocity correlation \(\Psi _{\text {corr}}\) throughout the flight (\(\Psi _{\text {vel}}=0.94\pm 0.06\) and \(\Psi _{\text {corr}}=0.93\pm 0.09\)). In addition, there are no collisions between the drones (\(\Psi _{\text {coll}}=\Psi _{\text {obs}}=0\)), and the distance between drones is kept within a safe range (\(r_{\text {min}}=0.40\pm 0.01\) and \(r_{\text {ave}}=0.77\pm 0.12\)).

To assess the reproducibility of the results in the canyon-like environment, we performed 15 stochastic simulations for each of the six initial positions of the informed agent, and we report here the aggregated performance results. From Table 1, it is evident that the proposed flocking algorithm with AGDS is independent of the initial position of the informed agent. The drone swarm takes about 59.03 ± 5.23 s to traverse the complex canyon-like environment. The drone swarm also performs well in velocity correlation and in tracking the preferred migration speed \(v_0\) (\(\Psi _{\text {corr}}=0.90\pm 0.03\) and \(\Psi _{\text {vel}}=0.98\pm 0.01\)). In addition, there are no collisions among the drones, and a safe distance is maintained between them (\(r_{\text {min}}=0.42\pm 0.01 > 2R\)).

Conclusions

The work presented in this paper proposed and developed an adaptive goal-directed swarm flocking algorithm for specific missions flying through an unknown environment. Simulations and experiments with six drones have been conducted to validate the flocking algorithm. The proposed AGDS algorithm possesses greater feasibility and efficiency in both a typical regular scenario and an extended canyon-like environment.

Furthermore, this work applied the bio-inspired flocking algorithm to practical missions of traversing unknown environments, rather than aimless collective flight in confined environments [14]. The flocking swarm is composed of informed and uninformed agents that operate collectively to accomplish an overall mission. Only the informed agents, a small minority, know the desired region, while the remaining uninformed drones have no knowledge of the destination. The informed and uninformed agents follow identical self-organization rules and obtain feedback from neighbors within their communication range.

The initial position of the informed agent markedly affects the efficiency with which the drone swarm traverses the unknown environment. The uninformed agents take the drone group away from the conflict region through the alignment and cohesion operators. On the other hand, the informed agent alleviates the conflict between task information and self-organizing rules through AGDS, which makes the proposed flocking algorithm successful regardless of the initial position.

Our future work will concentrate on deploying more informed agents in swarm flocking and studying their influence on the traversal success ratio. Furthermore, safety and risk analyses are required to extend the developed flocking algorithm to Unmanned Aircraft System Traffic Management (UTM), especially in urban-like environments.