AGDS: adaptive goal-directed strategy for swarm drones flying through unknown environments

This paper addresses the challenging problem of guiding a drone swarm through an unknown environment to a desired region. A bio-inspired flocking algorithm with an adaptive goal-directed strategy (AGDS) is proposed and developed for drone swarms operating in unknown environments. Each drone employs a biological visual mechanism to sense obstacles within its local perceptible scope. Task information about the destination is given only to a few specified drones (named informed agents), rather than to all drones (the uninformed agents). With the proposed flocking scheme, the informed agents operate collectively with the remaining uninformed agents to accomplish a common, overall mission. The AGDS and a non-adaptive goal-directed strategy (non-AGDS) are both presented and evaluated by numerical simulation. Experiments flying six DJI Tello quadrotors indoors are conducted to validate the developed flocking algorithm, and additional validations in canyon-like complicated scenarios have also been carried out. Both simulation and experimental results demonstrate the efficiency of the proposed swarm flocking algorithm with AGDS.


Introduction
With the advent of the artificial intelligence era, unmanned aerial vehicles (UAVs) have become substantially more autonomous and intelligent, and have become popular for various tasks [1,2]. However, a single aircraft has limited capability and resources [3], which results in low efficiency and success rates on complex tasks such as the transportation of large-scale supplies. The multiple unmanned aerial system is an unmanned intelligent cluster inspired by the self-organizing behaviors of biological groups, which significantly improves task efficiency through cooperation and complementary capabilities. Cooperative control of an aerial cluster with distribution, autonomy, and robustness is therefore a promising solution. The aerial swarm has numerous potential applications, such as monitoring missions [4], UAV-assisted wireless coverage [5], search and rescue [6], target tracking [7], cooperative exploration [8], and air traffic [9]. However, drones in a swarm may have limited perceptual and computational capabilities, making it difficult to obtain abundant information about the environment and their companions. It is challenging for a drone swarm to pass through an unknown environment without collisions while relying only on the local information obtained by each individual drone.
Although the swarm has advantages in number and function, these advantages bring great challenges to formation control. The phenomenon of collective motion in nature provides inspiration for solving this problem [10]. Although individual animals have limited perceptual ability and seldom obtain global information, they can form coordinated and orderly collective motion without apparent leaders. This fascinating collective behavior arises from local interactions between individuals in gregarious living beings. The earliest collective intelligence algorithm recognized by academic circles is the Boids model proposed by Reynolds [11], a microscopic agent-based model constructed from a phenomenological perspective that follows the separation-alignment-cohesion (SAC) rules. Compared with centralized control, such decentralized self-organizing control provides great advantages for flying robot swarms [12]. The former depends on a central computing node, while in the latter each agent is an independent computing node, improving robustness to individual faults. Self-organization means that any global motion mode results from local decisions. The decisions of each agent depend only on a limited number of neighbors, independent of the swarm size [13]. This is critical for drone swarm control: it means that dozens or hundreds of drones with limited computing resources can be deployed.
Although biologically inspired self-organizing control has many advantages, it still faces challenges when applied to aerial swarms operating in complex environments. A central problem is translating the phenomena of collective motion into interpretable mathematical models and deploying them on multi-robot systems. Vásárhelyi et al. [14] proposed a flocking model for real drones and carried out field experiments with a self-organizing swarm of 30 drones. This work realized flocking flight and obstacle avoidance of an aerial swarm, like flocks of birds, in predefined confined environments. However, such studies focus on reproducing the collective behavior of social animals in nature without considering applications in unknown environments and complex tasks [15]. The visual perception mechanism by which individuals use the visual cortex to obtain information provides inspiration for flying robot swarms operating in unknown environments [16,17]. Perception and operation in unknown environments are difficult, so we use a bionic vision mechanism to perceive the unknown environment. Besides, in practical applications, it is unnecessary, or even impossible, to ensure that all individuals are informed of the desired region, for example when the aerial swarm operates in environments with substantial interference or danger. In most animal groups, only a small number of individuals have knowledge of the food source or migration route [18,19].
In this paper, we focus on the problem of a swarm of drones traversing an unknown environment without collision. A novel adaptive goal-directed bio-inspired flocking algorithm is designed. Each drone in the swarm works independently as a computing node and cooperates with the others through local information interaction. Instead of relying on prior knowledge of the environment, the drones perceive the environment in real time through visual mechanisms. The flocking swarm is composed of informed agents and uninformed agents that operate equally in the overall mission, and every drone moves according to the same self-organizing rules. We propose an adaptive goal-directed strategy based on visual perception, which significantly improves the efficiency of a drone swarm traversing an unknown environment. Finally, the feasibility of the proposed algorithm is verified by numerical simulation and physical experiments with six DJI Tello quadrotors.
The rest of this paper is organized as follows. "Related work" section reviews existing related work. The drone model, communication topology, and evaluation metrics are presented in "Problem statement" section. In "Proposed flocking algorithm for drone swarm" section, the flocking algorithm suitable for a drone swarm traversing unknown environments is proposed, including the non-AGDS and AGDS variants. In "Validation of proposed flocking algorithms: typical scenario" section, the flocking algorithm is verified by simulation and experiment in a typical scenario. The validation and analysis of the flocking algorithm in an extended scenario are given in "Extended simulation validation: complicated scenarios" section. Finally, this paper is concluded in "Conclusions" section.

Related work

Collective behavior modelling
Research on the mechanism of self-organizing behaviors has promoted the development of the flying robot swarm. Vicsek et al. [20] studied the minimal conditions for collective motion and proposed the Vicsek model, which only considers the velocity alignment rule between neighbors. Couzin et al. [21] proposed the Couzin model, which defines the zones and priorities of the SAC rules. Levine et al. [22] proposed a social force model based on Newtonian mechanics to study vortex motion, considering both velocity and position synergy. Wu et al. [23] proposed an autonomous cooperative flocking algorithm for a heterogeneous swarm, which solves the escort problem during group cruising. By analyzing biological data, some scholars have found unique behavioral mechanisms, such as topological communication [24], hierarchical interaction [25], and scale-free behavioral correlation [26]. Besides, the development of perceptual neuroscience has also promoted the study of collective biological behavior. Pearce et al. [17] proposed a model for long-range information exchange in bird flocks based on each individual's projected view out through the flock and provided a solution for density control. Bastien et al. [27] proposed a purely vision-based model of collective behavior, showing that organized collective behavior can be generated by relying solely on individual visual information. That model focuses mainly on perceptual information within the group without considering perception of the environment. However, most of these works only explain the mechanism of collective motion from mathematics or biological data and have seldom been verified in artificial systems.

Reproducing natural collective motion with swarm robots
With the miniaturization of hardware, some scholars have devoted themselves to deploying self-organizing algorithms on swarm robots. Slavkov et al. [28] created emergent morphologies in a large swarm of real robots and reproduced the self-organized morphogenetic phenomenon of cells. Berlinger et al. [29] designed a fish-inspired robot swarm with implicit communication, which realized a variety of fish school behaviors such as synchrony, dispersion/aggregation, dynamic circle formation, and search-capture. Werfel et al. [30] developed a termite-inspired robot construction team that constructs complex predetermined structures. Rubenstein et al. [31] used a thousand-robot swarm to complete the self-assembly of complex two-dimensional shapes. Vásárhelyi et al. [14] deployed a flocking algorithm on 30 autonomous drones to achieve flocking flight and obstacle avoidance like a flock of birds. Balázs et al. [32] proposed the WillFull model for the persistence-responsivity trade-off in a swarm and used 52 autonomous drones to achieve abrupt collective turns like schools of fish or flocks of birds; thus far, this is the largest self-organizing decentralized aerial swarm flown in an outdoor environment. The above works are mainly concerned with reproducing natural collective motions on artificial collective systems.

Passing-through experiments using swarm robots and collective models
Based on the above research, swarm robots are expected to perform given missions in various scenarios. Quan et al. [33] proposed a distributed virtual pipeline control algorithm that pre-plans the airspace as obstacle-free virtual pipelines. This work addressed the pipeline passage problem of the drone swarm and was experimentally verified using 6 multicopters. Liu et al. [34] developed a pragmatic distributed flexible formation protocol for passing through a narrow and irregular channel; two leaders tracked predefined channel boundaries, followers tracked the leaders, and a field experiment was carried out with 3 unmanned surface vessels. Nevertheless, the premise of this study is that the environment is predefined and known to the swarm individuals. Virágh et al. [35] provided a model of a general autonomous flying robot integrated into a realistic simulation framework and achieved collective target tracking using 9 autonomous aerial robots in the real world. Balázs et al. [36] proposed a decentralized air traffic control solution for dense traffic problems and implemented crosswalk and package-delivery applications on 30 autonomous drones. Soria et al. [37,38] deployed a predictive model on a multicopter swarm, ensuring rapid and safe collective migration in cluttered environments. Schilling et al. [39] proposed learning vision-based swarm behaviors end-to-end directly from raw images and implemented collision-free navigation of a drone swarm in the real world. In these works, however, each agent in the drone swarm has knowledge of the desired region.
To sum up, research on self-organizing flocking algorithms for aerial swarms in practical applications is still in its infancy. This paper mainly solves the problem of a drone swarm traversing an unknown environment. Different from the above works, we propose a self-organizing flocking algorithm that does not require every agent to have knowledge of the desired region. This is particularly beneficial for drone swarms operating in environments with interference or electromagnetic disturbance, where knowledge of the desired region cannot be guaranteed for each aircraft. Building on a mathematical bionic vision mechanism for perceiving the environment, we propose a vision-based adaptive goal-directed strategy, which effectively improves the efficiency of a drone swarm traversing a complex canyon-like environment.

Problem statement
In this section, a drone dynamics model is introduced first, including constraints on velocity and acceleration. Then, the communication topology of the drone swarm is presented. Finally, several evaluation metrics of the algorithm are proposed.

Individual drone dynamics
Consider an aerial swarm consisting of N individuals indexed 1, 2, ..., N. The dynamics of a single drone can be simplified to a second-order integrator model [40]. We therefore assume that the i-th drone, flying in altitude-hold mode, satisfies

ṗ_i(t) = v_i(t),
v̇_i(t) = (v_{i_d}(t) − v_i(t)) / τ_i,

where p_i(t) ∈ R^2 and v_i(t) ∈ R^2 are respectively the position vector and velocity vector of the i-th drone at time t, τ_i ∈ R^+ is the characteristic time of the velocity response, and v_{i_d}(t) ∈ R^2 is the desired velocity vector of the i-th drone at time t, i.e., the output of the flocking algorithm. We do not consider differences between drones and thus assume τ_i = τ for all drones. Because of the physical constraints of the drone, the magnitudes of velocity and acceleration are limited to v_max and a_max, respectively:

|v_i(t)| ≤ v_max,   |v_i(t + δt) − v_i(t)| / δt ≤ a_max,

where δt is the interval of velocity updates.
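The second-order integrator with velocity and acceleration saturation can be simulated with a short discrete-time sketch. The numerical values of tau, dt, v_max, and a_max below are illustrative, not the paper's settings:

```python
import numpy as np

def clamp_norm(vec, limit):
    """Scale vec down so its Euclidean norm does not exceed limit."""
    n = np.linalg.norm(vec)
    return vec if n <= limit else vec * (limit / n)

def step_drone(p, v, v_d, tau=1.0, dt=0.05, v_max=2.0, a_max=1.5):
    """One update of the second-order integrator model:
    p' = v,  v' = (v_d - v)/tau, with |v| <= v_max and |a| <= a_max."""
    a = clamp_norm((v_d - v) / tau, a_max)   # acceleration saturation
    v_new = clamp_norm(v + a * dt, v_max)    # velocity saturation
    p_new = p + v_new * dt
    return p_new, v_new
```

Calling `step_drone` repeatedly with the flocking algorithm's output as `v_d` yields a trajectory that respects both saturation rules.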

Communication network between drones
To describe the drone swarm in mathematical language, we model the communication network as an undirected graph G = (V, E) [41], where the node set V = {1, 2, ..., N} represents the drones and the edge set E ⊆ V × V represents the communication links.

Definition 1 The neighbor set N_i(t) of the i-th drone consists of all drones within its communication range at time t.

Definition 2 The adjacency matrix A = [a_ij] ∈ R^{N×N} represents the relationship between nodes, which is expressed as

a_ij = 1 if j ∈ N_i(t), and a_ij = 0 otherwise (with a_ii = 0).
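A distance-based adjacency matrix of this kind can be built as follows; the parameter name `r_com` for the communication range is an assumption for illustration:

```python
import numpy as np

def adjacency(positions, r_com):
    """Adjacency matrix A of the undirected interaction graph:
    a_ij = 1 if drones i and j are within communication range r_com
    (i != j), otherwise 0. Symmetric because the graph is undirected."""
    P = np.asarray(positions, dtype=float)
    diff = P[:, None, :] - P[None, :, :]        # pairwise differences
    dist = np.linalg.norm(diff, axis=-1)        # pairwise distances
    A = (dist <= r_com).astype(int)
    np.fill_diagonal(A, 0)                      # no self-loops
    return A
```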

Evaluation metrics for swarm flocking
We consider three different categories of metrics to evaluate the application of the drone swarm traversing an unknown environment.

Overall efficiency
Time and quantity are used to judge the overall efficiency of the drone swarm traversing the unknown environment. The task is completed when all drones are located within the desired region. T_end indicates the time the drone swarm requires to complete the task, and N_end represents the number of drones located in the desired region within T_end.
The task completion rate (TCR) is the ratio of the number of completed runs to the total number of simulation runs.

Consensus
We use speed and direction to evaluate the consensus of the aerial swarm.
The normalized speed is expressed by Note that vel characterizes the degree to which the flocking speed tends to the preferred migration speed v 0 , and vel = 1 only if the flocking speed is equal to v 0 . The velocity correlation of the swarm can be evaluated by the following form: where N clu i is the number of drones in the cluster containing the i-th drone. C i is the set where the cluster excluding the i-th drone. Note that, the value of the velocity correlation, corr , are between −1 (complete disorder) and 1 (perfect order).

Safety
The safety assessment of the drones is divided into four parts: the distance between drones and obstacles, collisions between drones, the minimum inter-drone distance, and the average inter-drone distance.
The ability of drones to avoid collisions with obstacles (or boundaries) is characterized by Φ_obs, which is governed by

Φ_obs = (1/N) Σ_{i=1}^{N} f(p_i),

where f(p_i) = 1 when the position of the i-th drone is inside an obstacle or outside the wall, and f(p_i) = 0 otherwise. Thus Φ_obs equals 0 when there is no collision and 1 when all agents collide. Next, H(x) is a Heaviside step function:

H(x) = 1 if x ≥ 0, and H(x) = 0 otherwise.

Φ_coll represents the ability of drones to avoid collisions with each other, which can be evaluated by

Φ_coll = (1/(N(N − 1))) Σ_{i=1}^{N} Σ_{j≠i} H(2R − d_ij),   (10)

where d_ij = |p_i − p_j| is the distance between the i-th and j-th drones. If no collision is registered, Φ_coll = 0, whereas Φ_coll = 1 when all pairs of drones are colliding. Finally, r_min and r_ave are the minimum distance and average distance between all drones, respectively.
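The pairwise safety statistics can be sketched for a single position snapshot as follows; the collision threshold 2R follows the disk model with radius R used later in the paper:

```python
import numpy as np

def safety_metrics(positions, R):
    """Pairwise distance statistics and collision ratio.
    A pair counts as colliding when d_ij < 2R (each drone is a disk
    of radius R). Returns r_min, r_ave, and the colliding-pair ratio."""
    P = np.asarray(positions, dtype=float)
    N = len(P)
    dists, collisions = [], 0
    for i in range(N):
        for j in range(i + 1, N):
            d = np.linalg.norm(P[i] - P[j])
            dists.append(d)
            if d < 2 * R:
                collisions += 1
    pairs = N * (N - 1) // 2
    return {"r_min": min(dists),
            "r_ave": float(np.mean(dists)),
            "phi_coll": collisions / pairs}
```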
Inspired by [14], we use a genetic algorithm to find the optimal parameter set that maximizes Φ_vel and Φ_corr while minimizing Φ_coll and Φ_obs.

Proposed flocking algorithm for drone swarm
In this section, the desired dynamics of each agent are determined by its neighbors N_i(t) and the local environment. A flocking algorithm for a drone swarm traversing unknown environments is proposed first, in non-AGDS and AGDS variants. Then, each term of the flocking algorithm is introduced in detail.

General flocking law
This paper mainly solves the problem of a drone swarm traversing an unknown environment as a group when not all agents can be guaranteed to have knowledge of the desired region. As shown in Fig. 1, the proposed flocking algorithm is expressed as

v_{i_d} = v_{i_sac} + v_{i_v} + v_{i_w} + λ_i v_{i_t},

where v_{i_sac} is the SAC interaction term that includes separation, alignment, and cohesion, v_{i_v} is the visual perception term for obstacle avoidance, λ_i is the informed coefficient used to represent the agent's role (λ_i = 1 for informed agents and λ_i = 0 otherwise), v_{i_w} is the will propagation term, and v_{i_t} is the task information term. We define the drones equipped with task information as informed agents. The identities of the informed agents are hidden, and information is obtained only through feedback from neighbors, so the status of all agents is equal. This also means that the loss of some informed agents does not affect the information interaction within the flock. Informed agents can also be added dynamically, or previously ordinary agents can be designated as informed agents.
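The composition of the desired velocity can be sketched as below. The equal-weight sum and the binary informed coefficient are assumptions for illustration; the paper's own equation fixes the exact weighting:

```python
import numpy as np

def desired_velocity(v_sac, v_vis, v_will, v_task, informed):
    """Compose drone i's desired velocity from the four terms of the
    flocking law: SAC interaction, visual obstacle avoidance, will
    propagation, and (for informed agents only) task information.
    lambda_i = 1 for an informed agent, 0 for an uninformed one."""
    lam = 1.0 if informed else 0.0
    return v_sac + v_vis + v_will + lam * v_task
```

An uninformed agent simply receives a zero task term, so the same control law runs on every drone regardless of role.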

Separation-alignment-cohesion interaction
We are inspired by a well-known model that allows a drone swarm to roam in confined environments [14]. The model includes a separation rule to prevent collisions between drones and an alignment rule to reduce velocity oscillations. In addition, we include a cohesion rule to prevent the swarm from splitting up. We therefore integrate these three basic requirements into one term:

v_{i_sac} = v_{i_s} + v_{i_a} + v_{i_c},

where v_{i_s}, v_{i_a}, and v_{i_c} define the requirements of separation, alignment, and cohesion, respectively. The three requirements are illustrated in Fig. 2.

Separation rule
Each drone needs to avoid collisions with other drones, which is one of the highest-priority rules. The separation between drones is defined by

v_{i_s} = p_sep Σ_{j∈N_i(t), d_ij<r_sep} (r_sep − d_ij) (p_i − p_j)/d_ij,

where p_sep is the linear gain of the collision-avoidance effect and r_sep is the radius of the separation zone.
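A linear short-range repulsion of this kind can be sketched as follows. The gain p_sep = 5.0 is taken from the experimental parameters reported later; r_sep = 0.6 is illustrative:

```python
import numpy as np

def separation(p_i, neighbor_positions, p_sep=5.0, r_sep=0.6):
    """Separation rule sketch: each neighbor closer than r_sep pushes
    drone i away with strength proportional to the overlap r_sep - d_ij,
    along the unit vector pointing away from that neighbor."""
    v = np.zeros(2)
    for p_j in neighbor_positions:
        d = np.linalg.norm(p_i - p_j)
        if 0 < d < r_sep:
            v += p_sep * (r_sep - d) * (p_i - p_j) / d
    return v
```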

Alignment rule
In actual flocking, the velocity alignment rule is essential. It reduces oscillation and prevents dangerous situations by eliminating excessively large velocity differences between nearby neighbors. The alignment rule is defined by

v_{i_a} = p_ali Σ_{j∈N_i(t)} max(v_ij − v_ij^max, 0) (v_j − v_i)/v_ij,

where p_ali is the linear gain of the velocity alignment effect, v_ij^max is the maximum allowable velocity difference in Ref. [14], and v_ij = |v_i − v_j| is the magnitude of the velocity difference between the i-th and j-th drones.
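A simplified alignment term can be sketched as below: only the part of a neighbor's velocity difference that exceeds the allowed threshold contributes. The gain p_ali = 1.12 matches the experimental parameters; the threshold value is illustrative, and the paper follows the friction-like alignment of Ref. [14] rather than this reduction:

```python
import numpy as np

def alignment(v_i, neighbor_velocities, p_ali=1.12, v_diff_max=0.5):
    """Alignment rule sketch: neighbors whose velocity differs from
    v_i by more than v_diff_max pull v_i toward their velocity, with
    strength proportional to the excess difference."""
    v = np.zeros(2)
    for v_j in neighbor_velocities:
        diff = v_j - v_i
        mag = np.linalg.norm(diff)
        excess = mag - v_diff_max
        if excess > 0:
            v += p_ali * excess * diff / mag
    return v
```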

Cohesion rule
When the drone swarm traverses environments with obstacles, it can easily split into multiple sub-swarms. If no individual in a subgroup has knowledge of the desired region, the drones in that subgroup will be unable to reach the desired region. The cohesion rule is therefore defined by

v_{i_c} = p_coh Σ_{j∈N_i(t), d_ij>r_coh} (d_ij − r_coh) (p_j − p_i)/d_ij,

where p_coh is the linear gain of cohesion and r_coh is the distance at which the cohesion rule starts to work. Note that r_coh ≥ r_sep.
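The mirror image of the separation term, a linear long-range attraction, can be sketched as below. The gain p_coh = 0.15 matches the experimental parameters; r_coh = 1.5 is illustrative:

```python
import numpy as np

def cohesion(p_i, neighbor_positions, p_coh=0.15, r_coh=1.5):
    """Cohesion rule sketch: each neighbor farther than r_coh pulls
    drone i toward it with strength proportional to d_ij - r_coh,
    preventing the swarm from splitting into sub-swarms."""
    v = np.zeros(2)
    for p_j in neighbor_positions:
        d = np.linalg.norm(p_j - p_i)
        if d > r_coh:
            v += p_coh * (d - r_coh) * (p_j - p_i) / d
    return v
```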

Visual perception
In our model, we regard each drone as a disk with radius R. The vision projection field of the drone is a circle with radius r_vis. The projection field of drone i is divided into L sectors labeled l = 1, 2, ..., L. A line of sight φ_il is thus abstracted as the segment connecting the center to the circumference (see Fig. 3a). Suppose the drone has a 360-degree field of view, but walls and obstacles block its lines of sight [17]. The angle between two adjacent lines of sight is Δφ = 2π/L. Obviously, the closer the robot is to walls and obstacles, the larger the obscured part of its field of view. We therefore define the projection field function V(φ_il), equal to 1 when the line of sight φ_il is blocked within r_vis and 0 otherwise (see Fig. 3b). As shown in Fig. 3c, the velocity v_i direction of drone i is aligned with the x_v axis, while its counterclockwise normal direction is the y_v axis. We define the upper and lower bounds of the angle occupied by an obstacle in the projection field as φ_il^upp and φ_il^low, respectively. The motion of drone i based on the visual mechanism is then formulated as

v_{i_v} = R(ψ_i) ∫_0^{2π} V(φ) [−p_fb cos φ, −p_lr sin φ]^T dφ,   (17)

where p_fb and p_lr are linear coefficients that characterize shifting and steering, respectively. The rotation matrix that transforms a vector from the body coordinate system O_v-x_v y_v to the world coordinate system O-xy is

R(ψ_i) = [[cos ψ_i, −sin ψ_i], [sin ψ_i, cos ψ_i]],

where ψ_i is the heading of the i-th drone expressed in the coordinate system O-xy.
The motion of the drone can be decomposed into radial motion and normal motion. The former characterizes acceleration and deceleration behavior, and the latter describes left and right steering behavior [27]. The drone therefore generates avoidance behavior by sensing the size and position occupied by the projection of obstacles or walls in its field of view. As shown in Fig. 4, the vision term v_{i_v} produces a short-range repulsion: the first component of the vector in Eq. (17) yields repulsion in the front-back direction (Fig. 4a), and the second component yields repulsion in the left-right direction (Fig. 4b).

Will propagation
Due to the limited communication and sensing capabilities of a single drone, it is difficult for an aerial swarm to react quickly when it encounters obstacles. Remarkably, when a starling encounters natural enemies or obstacles, it quickly spreads this valuable information through the flock via calls and other means so that the flock can rapidly escape or avoid the obstacle [17]. In a real environment, it is therefore essential to transmit valuable information quickly between robots.
For example, when a robot at the front of the flock perceives an obstacle, it can quickly transmit this information to the other robots so that the flock can respond promptly with obstacle-avoidance maneuvers. To tackle such issues, a will propagation term v_{i_w} is introduced, expressed as

v_{i_w} = ε_i W_i,

where ε_i can be used to distinguish between strong and weak will, and W_i is a unit vector indicating the direction of the will propagation, given by the WiSt model [32] in Eq. (20), in which v_i^WiSt is the desired velocity of the WiSt model, N(·) is the normalization operator, R[v; θ] is the operator that rotates vector v by angle θ in the x-y plane, and s_i^WiSt is the intrinsic angular momentum. Further discussion of Eq. (20) may be found in Ref. [32].

Task information with adaptive goal-directed strategy
Collective motions based on SAC self-organizing rules often show aimless roaming. Thus, we design a task information term to ensure that the drones reach the desired region. The primary function of this term is to give the drones knowledge of the desired region. The task information can be represented by the preferred direction v_pre, a unit velocity vector pointing toward the desired region. The task information term is therefore defined as

v_{i_t} = v_0 v_pre,   (21)

where v_0 is the preferred speed. Eq. (21) represents the non-adaptive goal-directed strategy (non-AGDS). However, drones operating in unknown environments often encounter serious conflicts between task information and obstacle avoidance, leaving a drone caught in a dilemma. In nature, migrating or foraging animals adjust their behavior based on real-time information about the environment. We therefore propose an adaptive goal-directed strategy (AGDS) based on visual perception.
Firstly, the angle φ_safe suitable for the safe passage of a drone is determined. We then calculate the number n_s of visual-field sectors occupied by the safe angle Φ_safe, given by n_s = ⌈φ_safe/Δφ⌉. The upper and lower bounds of the safe angle Φ_safe in the projection field are φ_safe^low and φ_safe^upp, respectively. Meanwhile, we determine the number n_θ of sectors occupied by the angle θ between the velocity v_i and the goal-directed velocity v_pre. Next, we transform the original projection field V(φ) (see Fig. 3a), whose reference direction is the velocity v_i, into the projection field V(φ_task) (see Fig. 5), whose reference direction is the goal-directed velocity v_pre, by rotating the field by n_θ sectors.
According to whether there are obstacles along the line of sight of the task direction, there are two situations to deal with. The first situation, without obstacles, is given by V(φ_task) = 0 for all sectors within the safe angle Φ_safe, which implies that the line of sight is not blocked within the safe angle; the task information term of the drone is then given by Eq. (21).
Another situation, with obstacles, is expressed as V(φ_task) ≠ 0 within the safe angle Φ_safe, which implies that there is a conflict between the preferred direction and obstacle avoidance. The goal direction can be adaptively corrected to alleviate this conflict. We define a cross-correlation function to represent the relationship between the passable direction and the projection field:

V_c(φ_task) = (1/n_s) Σ_{k=1}^{n_s} M(k) V(φ_task + kΔφ),

where M(k) ∈ R^{n_s} is a weighting matrix indicating the degree of security of the safe region Φ_safe, whose elements are all set to 1. The value of V_c(φ_task) therefore lies in the range [0, 1], and any direction satisfying V_c(φ_task) = 0 is a passable direction. V_c(φ_task) is divided into two parts (V_c^L and V_c^R), with the initial preferred direction v_pre as the origin, and the passable directions on both sides are obtained as in Eq. (27). These passable directions are converted back to the projection field V(φ), with the velocity v_i as the origin, and the sight line φ_im of the nearest passable direction is calculated via Eq. (28). The angle corresponding to the sight line φ_im in the projection field V(φ) is Θ_{i_m}, which is converted to the world coordinate system O-xy as

ψ_{i_m} = Θ_{i_m} + ψ_i.   (29)

Finally, the task information term with the adaptive goal-directed strategy (AGDS) is defined as

v_{i_t} = v_0 [cos ψ_{i_m}, sin ψ_{i_m}]^T.   (30)
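The core of the adaptive correction is a search for the nearest window of free sectors around the goal direction. The sketch below assumes a binary projection field already rotated so that sector 0 points along v_pre, and uses a simple sliding-window test in place of the cross-correlation; it returns the offset (in sectors) to the nearest passable direction:

```python
def adaptive_goal_direction(V_task, n_s):
    """AGDS direction-search sketch. V_task is the binary projection
    field re-indexed so sector 0 points along the preferred direction
    v_pre. If the n_s sectors around the goal are free, keep v_pre
    (offset 0); otherwise return the smallest rotation, in sectors,
    to a window of n_s consecutive free sectors (the nearest passable
    direction), or None if no such window exists."""
    L = len(V_task)
    half = n_s // 2

    def window_free(center):
        return all(V_task[(center + k) % L] == 0
                   for k in range(-half, half + 1))

    if window_free(0):
        return 0                     # goal direction is already passable
    for off in range(1, L // 2 + 1):
        for c in (off, -off):        # check both sides, nearest first
            if window_free(c):
                return c
    return None                      # fully blocked field of view
```

The returned sector offset corresponds to the corrected sight line, which would then be converted to a world-frame heading for the task information term.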

Validation of proposed flocking algorithms: typical scenario
In this section, a typical scenario is designed to verify the proposed non-AGDS and AGDS flocking algorithms. Simulations and experiments are presented in the following subsections to show the effectiveness of the proposed method.

Simulation results and comparisons
The task information terms of the flocking algorithm with non-AGDS and AGDS are represented by Eq. (21) and Eq. (30), respectively. Figure 7a shows the simulation results of the proposed flocking algorithm with non-AGDS. When the initial position of the informed agent is 3# or 4#, the drone swarm successfully reaches the desired region. As shown in Fig. 6, 3# and 4# are the positions closest to the passable area, which means the informed agent can quickly escape the effect of obstacles and reduce the conflict between task information and obstacle avoidance. In stark contrast, 1# and 6#, which are farthest from the passable area, are the initial positions most likely to fall into the conflicting region. Positions 2# and 5# are intermediate cases between these two extremes.

Figure 7b shows the simulation results of the proposed flocking algorithm with AGDS. The initial position of the informed agent no longer causes the drone swarm to fail to reach the desired region. In addition, the TCR of this method is significantly higher than that of the flocking algorithm with non-AGDS. When task information conflicts with obstacle avoidance, AGDS adaptively modifies the task information term according to the obstacle information in the informed agent's field of vision. This greatly alleviates the conflict between task information and obstacle avoidance and prevents the informed agent from constantly oscillating near obstacles, which is very important for robots.

Figure 8a shows the task completion time of the non-AGDS flocking algorithm for different initial positions of the informed agent in simulation. Compared with position 4#, position 3# is farther from the obstacle and less prone to conflict between task information and obstacle avoidance. Therefore, when the initial position of the informed agent is 3#, the time T_end the drone swarm takes to reach the desired region is shorter, only 43.4s.
Compared with the flocking algorithm with non-AGDS, the flocking algorithm with AGDS reduces the time it takes for a swarm of drones to traverse an unknown environment by more than half. In Fig. 8b, the drone swarm can successfully reach the desired region within 30s, and the shortest time is only 22.8s.

Experimental setup and parameters
We establish the physical environment shown in Fig. 9. We use 6 DJI Tello drones with commercial autopilots for the physical experiments. Based on this DJI autopilot, the velocity of the Tello drone can track the velocity command (see Eq. (11)) generated by the flocking algorithm in a reasonable time. A motion capture system, OptiTrack, is installed indoors. The optical motion capture cameras obtain the ground truth of each drone's position and velocity by sensing the markers on the drone with infrared light. The ground control station is connected to OptiTrack and the drones over a local network. The information transfer of the experimental system and the configuration of the drone are given in detail in Fig. 9a. The drone swarm takes off from the start region (green area), traverses the task area (gray area), and finally lands in the desired region (red area). Note that the drone swarm may face various adverse situations in the task area, such as obstacles and walls. The parameters of the algorithm are p_sep = 5.00, p_ali = 1.12, p_coh = 0.15, and p_fb = p_lr = 0.24.
Fig. 9 Configuration of the physical flight platform based on motion capture. a The platform consists of six DJI Tello drones, a ground control station, a router for building a local area network, and OptiTrack optical motion capture cameras. The hardware composition of the drone is shown in the purple block diagram. Some drones are equipped with task information (orange arrow) to guide the goal-directed flight of the drone swarm. When all drones pass through the gray area to the desired region (red zone), the task is completed. b The real-world experimental environment (front and back views, showing the obstacle region and desired region)

Experimental results and comparisons
In the indoor experiments, we set the experimental time threshold to 80s. Besides, due to the limited site area, we set the desired region as a closed region, which leads to an uncertain deviation in the task completion time of each experiment. Figure 10a shows the experimental results of the proposed flocking algorithm with non-AGDS. The experiments confirm the conclusions from the simulation: only when the initial position of the informed agent is 3# or 4# does the drone swarm quickly reach the desired region. When the initial position of the informed agent is 1# or 6#, conflicts between task information and obstacle avoidance arise easily, making it difficult for the drone swarm to reach the desired region. Figure 10b shows the experimental results of the proposed flocking algorithm with AGDS. In this set of experiments, the drone swarm reaches the desired region in every run under the adaptive strategy. Figure 11a shows the task completion time of the non-AGDS flocking algorithm for different initial positions of the informed agent in the experiments. When the initial position of the informed agent is 3# or 4#, the drone swarm takes 47.77s or 51.43s, respectively, to reach the desired region. Figure 11b shows the corresponding task completion times of the AGDS flocking algorithm. The experimental results show that the drone swarm reaches the desired region within the experimental time threshold.

Further batch analysis on flocking performance
To assess the reproducibility of the results, we performed 15 stochastic simulations for each of the six initial positions of the informed agent and for each of the two strategies; the aggregated performance results are reported in Fig. 12. The initial position of the informed agent strongly influences the flocking algorithm with non-AGDS: only when the informed agent starts at 3# are both T end = 32.66 ± 7.52 s and N end = 6.00 ± 0.00 significantly better than for the other initial positions. In contrast, the flocking algorithm with AGDS performs well on T end and N end for every initial position (1#, T end = 30.05 ± 1.98 s; 2#, T end = 30.45 ± 3.00 s; 3#, T end = 30.18 ± 3.64 s; 4#, T end = 23.53 ± 3.09 s; 5#, T end = 21.74 ± 0.15 s; 6#, T end = 25.87 ± 2.78 s; 1# ∼ 6#, N end = 6.00 ± 0.00).
Regardless of the initial position of the informed agent, the drone swarm successfully reaches the desired region. This also means that informed agents can be added or removed during flight without causing a conflict between the task information and obstacle avoidance. The aerial swarm deployed with the AGDS flocking algorithm also tracks the preferred swarm speed very well ( vel = 0.98 ± 0.01, 1.00 ± 0.02, 1.01 ± 0.01, 0.98 ± 0.01, 0.98 ± 0.03, 0.97 ± 0.01 for initial positions 1#–6#). In addition, the drone swarm maintains good velocity consistency corr during flight. In terms of safety, the minimum distance r min and the average distance r ave between drones remain within the safe range (> 2R) for both strategies.
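The performance metrics above ( vel, corr, r min, r ave, and the > 2R safety check) can be computed directly from the drones' states. The following sketch is our own illustrative implementation, not the paper's code, assuming planar positions and velocities and a known preferred speed v0:

```python
import numpy as np

def flocking_metrics(pos, vel, v0, R):
    """Compute swarm metrics at one time step.

    pos, vel : (N, 2) arrays of drone positions and velocities.
    v0       : preferred migration speed (scalar).
    R        : drone radius; inter-drone distances should stay above 2R.
    """
    speeds = np.linalg.norm(vel, axis=1)
    vel_norm = speeds.mean() / v0          # normalized swarm speed

    unit = vel / speeds[:, None]           # unit velocity vectors
    n = len(pos)
    iu = np.triu_indices(n, k=1)           # unique drone pairs (i < j)
    corr = (unit @ unit.T)[iu].mean()      # mean pairwise velocity correlation

    # Pairwise inter-drone distances
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)[iu]
    return {"vel": vel_norm, "corr": corr,
            "r_min": d.min(), "r_ave": d.mean(),
            "safe": bool(d.min() > 2 * R)}
```

For a perfectly aligned swarm moving at v0, the sketch yields vel = corr = 1, matching the ideal values against which the reported statistics are judged.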

Extended simulation validation: complicated scenarios
In the real world, drone swarms often operate in complex and dangerous environments, for example, when exploring pathways or transporting supplies in post-disaster canyons. In such environments, there is no way to ensure that every drone has information about the desired region. To extend the proposed flocking algorithm with AGDS toward practical transportation applications, we need to verify it in a more complicated scenario. Because the coverage of indoor motion-capture systems is limited, we design a canyon-like environment in simulation. The canyon-like environment is a 16 m-long irregular flight area containing many obstacles that conflict with the task information. Figure 13a shows the flight trajectories of the drone swarm traversing the complex and irregular environment. As shown in Fig. 13b, the swarm takes only 58.4 s to traverse it, maintaining a high normalized swarm speed vel and velocity correlation corr throughout the flight ( vel = 0.94 ± 0.06 and corr = 0.93 ± 0.09). In addition, there are no collisions between the drones ( coll = obs = 0), and the distance between drones is kept within the safe range (r min = 0.40 ± 0.01 and r ave = 0.77 ± 0.12).
Table 1 Flocking performance in the canyon-like environment for the six initial positions (1#–6#) of the informed agent
N end: 6.00 ± 0.00, 6.00 ± 0.00, 6.00 ± 0.00, 6.00 ± 0.00, 6.00 ± 0.00, 6.00 ± 0.00
vel: 0.98 ± 0.01, 0.97 ± 0.02, 0.96 ± 0.01, 0.98 ± 0.01, 0.99 ± 0.03, 0.98 ± 0.01
corr: 0.94 ± 0.02, 0.91 ± 0.03, 0.88 ± 0.05, 0.94 ± 0.03, 0.81 ± 0.04, 0.95 ± 0.01
obs: 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00
coll: 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00, 0.00 ± 0.00
To assess the reproducibility of the results in the canyon-like environment, we performed 15 stochastic simulations for each of the six initial positions of the informed agent; the aggregated performance results are reported here. From Table 1, it is evident that the proposed flocking algorithm with AGDS is independent of the initial position of the informed agent. The drone swarm takes about 59.03 ± 5.23 s to traverse the complex canyon-like environment. The swarm also performs well at velocity correlation and at tracking the preferred migration speed v 0 ( corr = 0.90 ± 0.03 and vel = 0.98 ± 0.01). In addition, there are no collisions among the drones, and a safe distance is maintained between them (r min = 0.42 ± 0.01 > 2R).
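The mean ± standard-deviation statistics reported across the 15 stochastic trials can be produced with a small aggregation helper. This is an illustrative sketch of the batch analysis, with synthetic trial values standing in for the simulator output:

```python
import numpy as np

def aggregate_trials(results):
    """Aggregate per-trial metric dicts into (mean, std) pairs.

    results: list of dicts, one per stochastic trial,
             e.g. [{"T_end": 58.4, "corr": 0.93}, ...].
    """
    out = {}
    for key in results[0]:
        vals = np.array([r[key] for r in results], dtype=float)
        out[key] = (vals.mean(), vals.std())
    return out

# Example: 15 trials for one initial position of the informed agent
# (synthetic numbers; real values come from the flocking simulator)
rng = np.random.default_rng(0)
trials = [{"T_end": 59.0 + rng.normal(0.0, 5.0),
           "corr": 0.90 + rng.normal(0.0, 0.03)} for _ in range(15)]
stats = aggregate_trials(trials)  # e.g. stats["T_end"] ≈ (mean, std)
```

Running this once per initial position and per strategy yields the entries summarized in Fig. 12 and Table 1.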

Conclusions
The work presented in this paper proposed and developed an adaptive goal-directed swarm flocking algorithm for specific missions flying through unknown environments. Simulations and experiments with six drones have been conducted to validate the flocking algorithm. The proposed AGDS algorithm possesses greater feasibility and efficiency both in a typical regular scenario and in an extended canyon-like environment.
Furthermore, this work employed the bio-inspired flocking algorithm in the practical application of traversing unknown environments, rather than in aimless collective flight within confined environments [14]. The flocking swarm is composed of informed agents and uninformed agents that operate collectively to accomplish an overall mission. Only the informed agents, a small minority, know the desired region to reach; the remaining uninformed drones have no knowledge of the destination. The informed and uninformed agents follow identical self-organization rules and obtain feedback from neighbors within their communicable range.
The initial position of the informed agent substantially affects the efficiency with which the drone swarm traverses the unknown environment. The uninformed agents steer the drone group away from the conflict region through the alignment and cohesion operators. The informed agent, in turn, alleviates the conflict between task information and the self-organizing rules through AGDS, which makes the proposed flocking algorithm successful regardless of the initial position.
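The interplay described above can be sketched as a velocity-command blend in which only the informed agent adds a goal term, attenuated when obstacle avoidance is active. The attenuation law and the gain k_goal below are our own illustrative assumptions, not the paper's control law:

```python
import numpy as np

def agds_velocity(v_align, v_cohesion, v_separation, v_obstacle,
                  goal_dir, v0, informed, k_goal_max=1.0):
    """Illustrative per-drone velocity command (a sketch, not the paper's exact law).

    The informed agent's goal term is scaled down as the obstacle-avoidance
    response grows, easing the conflict between task information and the
    self-organization rules; uninformed agents simply omit the goal term.
    """
    v_cmd = v_align + v_cohesion + v_separation + v_obstacle
    if informed:
        # Assumed adaptive gain: full goal-seeking in free space,
        # attenuated as the obstacle-avoidance response strengthens.
        k_goal = k_goal_max / (1.0 + np.linalg.norm(v_obstacle))
        v_cmd = v_cmd + k_goal * v0 * goal_dir
    return v_cmd
```

In free space the informed agent is fully goal-directed, while near obstacles the goal term shrinks and the shared self-organization terms dominate, which is the qualitative behavior AGDS is designed to achieve.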
Our future work will concentrate on swarm flocking with a larger number of informed agents and on their influence on the traversal success ratio. Furthermore, safety and risk analyses are required to extend the developed flocking algorithm to Unmanned Aircraft System Traffic Management (UTM), especially in urban-like environments.