Routing Prediction Strategy for UAV Swarm Network Using Pigeon-Inspired Optimization-Based Neural Network

A routing prediction strategy via a pigeon-inspired optimization (PIO)-based neural network (NN) is designed for UAV swarm networks with highly dynamic topology. The proposed strategy can predict the performance of the neighboring nodes as the next hop. For more precise prediction and less computational complexity, the states of the UAV swarm motion and the network are considered as the prior information, and the PIO-based NN framework is established. Based on the system model, PIO is applied to find the optimal weight matrices of the NN-based routing prediction model. The matrix of the hop count index function is calculated using this prediction model. The proposed strategy can directly determine the next hop based on the prediction results or can be combined with other routing methods to maintain a balance between the stability and the shortest path. Numerical simulations are conducted to demonstrate the effectiveness of the proposed strategy.


Introduction
UAV swarms play an important role in disaster relief, city management, geological reconnaissance and other tasks that are difficult for humans [1]. Such swarms can work in complex and dangerous environments at low cost and with fewer casualties [2]. The cooperative work of a UAV swarm depends on communication, which requires determining an appropriate routing path [3].
It is difficult for traditional routing protocols to meet the demands of UAV swarm networks because UAVs move at high speed and the relative motion among the UAVs in a swarm is intense [4]. Therefore, the network topology is highly dynamic and cannot be used as prior information when determining the routing path. To address these problems, many studies have focused on routing methods based on machine learning (ML) since they require less prior information and are more flexible [5, 6]. The ML algorithms applied to routing methods include reinforcement learning (RL) [7-9], neural networks (NNs) [5, 10-12], swarm intelligence algorithms [13], and combinations of different ML algorithms [13-17]. Although ML-based routing methods are intelligent, they can lead to routing oscillations since the ML algorithms are not robust.
Mohong Zheng (mohong_zheng@163.com), The 7th Research Institute of China Electronics Technology Group Corporation, Guangzhou 51000, China
To find the routing path more intelligently and stably, some researchers have employed ML to evaluate routing information, such as the performance of the neighbor node and the routing path [18-20]. In this case, the ML algorithms do not directly determine the next hop; instead, the predicted routing information is provided as a reference for routing. These ML-based routing methods can estimate the optimal routing path for moving nodes or predict the performance of neighboring nodes through trial-and-error structures, imitating path searching and optimal problem solving [21, 22]. However, some ML algorithms that depend on online learning are difficult to apply because they occupy much of the computing resources of the UAVs [23, 24]. In addition, some ML-based routing information prediction methods ignore the features of UAV movement. Those features can serve as prior information for more precise prediction and can be easily obtained from the navigation and control systems of the UAV [25].
To apply the predicted routing information to the UAV swarm network for routing path determination, a routing prediction strategy based on the combination of PIO and NN (PIONN) is proposed. The strategy includes offline training and routing prediction. In offline training, the states of the UAV swarm and the network are used as prior information for precise prediction. The PIO method is applied to find the optimal weight matrices of the NN-based routing prediction model, which reduces the computational complexity and avoids differentiation in the learning process. In routing prediction, the matrix of the hop count index function is calculated according to the PIONN routing prediction (PIONNRP) model.
The remainder of this paper is organized as follows: in Sect. 2, the features of UAV movement are extracted as the prior information for the routing prediction model, and the PIONN framework is designed. In Sect. 3, the routing prediction strategy based on PIONN is presented. In Sect. 4, simulations are conducted to evaluate the performance of PIONNRP. Section 5 presents the conclusions of this paper.

Feature Extraction of Movement
The UAV swarm includes $N_{UAV}$ nodes, and $A = \{a_1, a_2, \cdots, a_{N_{UAV}}\}$ denotes the set of the swarm. Each node $a_i \in A$ has a set of neighbors, and $\mathrm{hop}(a_i, a_j) \in \mathbb{N}^+$ is the function of the minimum hop count from node $a_i$ to node $a_j$.
Determining a proper next hop can optimize a routing path. A next hop that leads to a smaller $\mathrm{hop}(a_i, a_t)$ is likely to be a neighbor of the originating node $a_i$ that is flying toward the target node $a_t$. Hence, the relative position and relative velocity among the originating node, neighboring node and target node should be considered.
As shown in Fig. 1, to evaluate the performance of node $a_{ij}$ as the next hop, the forward distance $r_{for}(i,j,t)$ and forward speed $v_{for}(i,j,t)$ are defined as:

$$
\begin{cases}
r_{for}(i,j,t) = \|\mathbf{r}_{for}(i,j,t)\| = \|\mathbf{r}(i,j)\| \cos[\theta_r(i,j,t)] \\
v_{for}(i,j,t) = \|\mathbf{v}_{for}(i,j,t)\| = \|\mathbf{v}(i,j)\| \cos[\theta_v(i,j,t)]
\end{cases}
\tag{1}
$$

where $\mathbf{r}_{for}(i,j,t) \in \mathbb{R}^3$ and $\mathbf{v}_{for}(i,j,t)$ are the forward position and forward velocity among the originating node $a_i$, neighboring node $a_{ij}$ and target node $a_t \in A$, with $t \neq i, j$, respectively. $\mathbf{r}(i,j) \in \mathbb{R}^3$ is the position between node $a_i$ and node $a_{ij}$, and $\mathbf{v}(i,j) \in \mathbb{R}^3$ is the velocity of node $a_{ij}$. $\theta_r(i,j,t) \in \mathbb{R}$ is the angle between $\mathbf{r}(i,j)$ and $\mathbf{r}(i,t)$, and $\theta_v(i,j,t) \in \mathbb{R}$ is the angle between $\mathbf{v}(i,j)$ and $\mathbf{r}(i,t)$.
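The forward distance and forward speed of Eq. (1) can be illustrated with a minimal sketch. It assumes that node positions and the neighbor velocity are available as 3-D NumPy arrays and computes each norm-times-cosine term as a dot product with the unit vector pointing from the originating node to the target. The function name `forward_components` and its argument names are hypothetical and do not come from the paper.

```python
import numpy as np

def forward_components(p_i, p_j, p_t, v_j):
    """Return (r_for, v_for) for originating node i, neighbor j and target t.

    p_i, p_j, p_t: positions of the three nodes, shape (3,).
    v_j: velocity of the neighbor node, shape (3,).
    Assumes the target does not coincide with the originating node.
    """
    r_ij = p_j - p_i                       # r(i, j): neighbor relative to node i
    r_it = p_t - p_i                       # r(i, t): target relative to node i
    u_it = r_it / np.linalg.norm(r_it)     # unit vector toward the target
    r_for = float(np.dot(r_ij, u_it))      # ||r(i, j)|| * cos(theta_r)
    v_for = float(np.dot(v_j, u_it))       # ||v(i, j)|| * cos(theta_v)
    return r_for, v_for

# Example: target along the x-axis, neighbor ahead and to the side.
r_for, v_for = forward_components(
    np.array([0.0, 0.0, 0.0]),
    np.array([3.0, 4.0, 0.0]),
    np.array([10.0, 0.0, 0.0]),
    np.array([1.0, 1.0, 0.0]),
)
```

In this example only the x-components contribute, since the target lies along the x-axis. A negative value would indicate a neighbor that lies behind the originating node or flies away from the target.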

Routing Prediction Model
The hop count index function of the neighbor node $a_{ij}$ is defined to evaluate the performance of this neighbor as the next hop to the target node $a_t$. In a network with a known path, the hop count index function $H(a_{ij}, a_t) \in \mathbb{R}$ is the minimum hop count from node $a_{ij}$ to node $a_t$. Otherwise, it is obtained by prediction:

$$
H(a_{ij}, a_t) =
\begin{cases}
\mathrm{hop}(a_{ij}, a_t), & \text{known path} \\
\hat{h}(a_{ij}, a_t), & \text{else}
\end{cases}
$$

where $\hat{h}(a_{ij}, a_t) \in \mathbb{R}$ is the predicted hop count index function of the neighbor node $a_{ij}$. $\hat{h}(a_{ij}, a_t)$ is predicted using PIONN, which is a fully connected neural network with two hidden layers. The first hidden layer includes 16 neurons, while the second hidden layer includes 4 neurons.
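The input $u(i,j,t) \in \mathbb{R}^4$ collects the forward distance $r_{for}(i,j,t)$, the forward speed $v_{for}(i,j,t)$, and the delay $T_{de}(i,j) \in \mathbb{R}$ and maximum bandwidth $B(i,j) \in \mathbb{R}$ from node $a_i$ to $a_{ij}$. The output $\hat{y}(i,j,t) \in \mathbb{R}$ is the predicted hop count index function. The prediction passes the input through the two hidden layers and the output layer, where $W_{uh} \in \mathbb{R}^{16\times4}$, $W_{hh} \in \mathbb{R}^{4\times16}$ and $W_{hy} \in \mathbb{R}^{1\times4}$ are the weight matrices, and $z_1(i,j,t) \in \mathbb{R}^{16}$ and $z_2(i,j,t) \in \mathbb{R}^4$ are the outputs of the first and second hidden layers, respectively.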
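The layer dimensions stated above (4 inputs, 16 and 4 hidden neurons, 1 output) can be made concrete with a short forward-pass sketch. The ordering of the input components, the `tanh` activation on the hidden layers and the linear output layer are assumptions made for illustration only, since the paper's own activation functions are not recoverable here. The function name `pionn_forward` and the example input values are likewise hypothetical.

```python
import numpy as np

def pionn_forward(u, W_uh, W_hh, W_hy, act=np.tanh):
    """Forward pass of a 4-16-4-1 fully connected network.

    u: input vector, shape (4,), assumed order [r_for, v_for, T_de, B].
    W_uh: (16, 4), W_hh: (4, 16), W_hy: (1, 4) weight matrices.
    act: hidden-layer activation (tanh is an assumption).
    """
    z1 = act(W_uh @ u)             # output of the first hidden layer, shape (16,)
    z2 = act(W_hh @ z1)            # output of the second hidden layer, shape (4,)
    y_hat = (W_hy @ z2).item()     # scalar prediction (linear output assumed)
    return z1, z2, y_hat

# Example with random weights and an arbitrary input vector.
rng = np.random.default_rng(0)
W_uh = rng.normal(size=(16, 4))
W_hh = rng.normal(size=(4, 16))
W_hy = rng.normal(size=(1, 4))
u = np.array([3.0, 1.0, 0.02, 5.0])
z1, z2, y_hat = pionn_forward(u, W_uh, W_hh, W_hy)
```

In practice the four inputs have very different scales, so some normalization before the first layer would probably be needed. That step is omitted here because the paper does not describe it.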

PIONN Framework
To accurately predict the hop count index function, it is necessary to obtain the optimal weight matrices of the NN-based routing prediction model. To address this problem, the PIO-based NN framework is proposed.

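Definition 1 In the learning process of the PIONN, let $W_{uh} \in \mathbb{R}^{16\times4}$, $W_{hh} \in \mathbb{R}^{4\times16}$, and $W_{hy} \in \mathbb{R}^{1\times4}$ be three different pigeon groups. The individuals of each group are indexed by the sample pair $p = 1, \ldots, N_{set}$, where $N_{set}$ is the number of sample pairs in the database. This means that each pigeon group includes $N_{set}$ individuals.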
Definition 2 Pigeon groups $W_{uh}$, $W_{hh}$, and $W_{hy}$ move in the spaces $V_{uh} \in \mathbb{R}^{16\times4}$, $V_{hh} \in \mathbb{R}^{4\times16}$, and $V_{hy} \in \mathbb{R}^{1\times4}$, respectively. At the $k$th iteration, the positions of the individuals in the pigeon groups are $W_{uh}(p,k)$, $W_{hh}(p,k)$, and $W_{hy}(p,k)$. The three pigeon groups perform a cooperative flight to decrease the error function of the output layer $e_y(p,k) \in \mathbb{R}$ given in Eq. (7), with the theoretical output of the output layer $\tilde{y}(k) = h(k)$.
To satisfy $\lim_{k \to k_{max}} e_y(p,k) = 0$, where $k_{max}$ is the index of the last iteration, the center positions $W^c_{uh}(k)$, $W^c_{hh}(k)$, and $W^c_{hy}(k)$ are defined as the guidance. All individuals in a pigeon group fly toward the center of their group, as shown in Fig. 2 with $N_{set} = 6$ as an example.
The central positions $W^c_{uh}(k)$, $W^c_{hh}(k)$, and $W^c_{hy}(k)$ are calculated based on the quality of the $p$th pigeon individual, which is measured by $\mathrm{fitness}_{N1}(p,k)$, $\mathrm{fitness}_{N2}(p,k)$, and $\mathrm{fitness}_y(p,k)$. These fitness values depend on the error functions of the corresponding layers, where $e_{N1}(p,k) \in \mathbb{R}^{16\times1}$ and $e_{N2}(p,k) \in \mathbb{R}^{4\times1}$ are the virtual error functions of the first and second hidden layers, respectively.

Fig. 2 The movement of pigeon group $W_{uh}$

Definition 3 In the $k$th iteration, $\tilde{z}_1(p,k) \in \mathbb{R}^{16\times1}$ and $\tilde{z}_2(p,k) \in \mathbb{R}^{4\times1}$ denote the virtual target outputs of the first and second hidden layers for the $p$th sample pair, respectively.

Based on this definition, the virtual error functions $e_{N1}(p,k)$ and $e_{N2}(p,k)$ are derived using $W^+_{hy}(p,k)$ and $W^+_{hh}(p,k)$, the pseudoinverse matrices of $W_{hy}(p,k)$ and $W_{hh}(p,k)$, respectively. Because $W_{hy}(p,k)$ and $W_{hh}(p,k)$ are not square matrices, they cannot be inverted directly, so the pseudoinverse solutions are used to rewrite Eqs. (12)-(13).

Fig. 3 The structure of the routing prediction based on PIONN
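The role of the pseudoinverse can be illustrated with a small sketch. It assumes that the target output is mapped back through the layers linearly, so that the virtual target of the second hidden layer is the pseudoinverse of the output weights applied to the target output, and likewise for the first hidden layer. It also assumes that each virtual error is the difference between the virtual target and the actual layer output. These forms are assumptions for illustration and are not taken from the paper's equations. The function name `virtual_targets_and_errors` is hypothetical, and `numpy.linalg.pinv` computes the Moore-Penrose pseudoinverse.

```python
import numpy as np

def virtual_targets_and_errors(y_target, z1, z2, W_hh, W_hy):
    """Back-map a scalar target through non-square weight matrices.

    y_target: theoretical output of the output layer (scalar).
    z1: (16,), z2: (4,) actual hidden-layer outputs.
    W_hh: (4, 16), W_hy: (1, 4) weight matrices of one individual.
    """
    # Assumed linear back-mapping: pinv(W_hy) has shape (4, 1).
    z2_target = np.linalg.pinv(W_hy) @ np.atleast_1d(float(y_target))
    # pinv(W_hh) has shape (16, 4).
    z1_target = np.linalg.pinv(W_hh) @ z2_target
    e_N2 = z2_target - z2      # virtual error of the second hidden layer
    e_N1 = z1_target - z1      # virtual error of the first hidden layer
    return z1_target, z2_target, e_N1, e_N2

rng = np.random.default_rng(0)
W_hh = rng.normal(size=(4, 16))
W_hy = rng.normal(size=(1, 4))
z1 = rng.normal(size=16)
z2 = rng.normal(size=4)
z1_t, z2_t, e_N1, e_N2 = virtual_targets_and_errors(3.0, z1, z2, W_hh, W_hy)
```

For a wide matrix with full row rank, the pseudoinverse returns the minimum-norm vector that reproduces the requested output exactly. This is why no gradient needs to be computed to obtain a target for each hidden layer.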

Routing Prediction Strategy Based on PIONN
The routing prediction strategy is designed to predict the hop count index function $H(a_{ij}, a_t)$, which evaluates the performance of the neighbor node $a_{ij}$ as the next hop to the target node $a_t$. The prediction results are used as a reference when choosing the route. The proposed strategy provides only the predicted hop count index function, which can be used to select the next hop directly or can be combined with most routing methods. In this way, the routing choice can be more intelligent while remaining stable. The structure of the proposed routing prediction strategy is shown in Fig. 3. Before the mission, the routing prediction model is obtained by offline training based on PIONN with a database, which is built from the historical states of the network and the UAV swarm and the corresponding hop count index function. Since the routing path is known for the previous missions, the hop count index function in the database is the hop count from the neighbor node to the target node.
During the mission, the hop count index function of each neighbor node is predicted in real time based on the routing prediction model and the states of the network and the UAV swarm. The real-time network state includes the delay $T_{de}(i,j)$ and maximum bandwidth $B(i,j)$ from the originating node $a_i$ to its neighbor node $a_{ij}$, while the real-time state of the UAV swarm includes the forward distance $r_{for}(i,j,t)$ and forward speed $v_{for}(i,j,t)$ from the originating node $a_i$ to the target node $a_t$ with node $a_{ij}$ as the next hop.

Offline Training
The offline training procedures based on PIONN are shown in Fig. 4:
Step 1: Importing database. The input and output of the PIONN are obtained from the database for each sample pair, where $p \le N_{set}$ ($p \in \mathbb{N}^+$) is the index of the sample pair.
Step 2: Initialization of the learning process. The iteration index of the learning process is initialized as $k = 1$, and all initial weight matrices $W_{uh}(p,0)$, $W_{hh}(p,0)$ and $W_{hy}(p,0)$ are chosen.
Step 3: Calculation of the output and error functions of each layer. The outputs of the first and second hidden layers and the output of the output layer are obtained with the current weight matrices. The error function of the output layer $e_y(p,k)$ is obtained based on Eq. (7), while the virtual error functions of the first and second hidden layers, $e_{N1}(p,k)$ and $e_{N2}(p,k)$, are obtained from the virtual target outputs of Definition 3.
Step 4: Update of the weight matrices. The weight matrices of the three pigeon groups are updated, where $\eta_{hy}$, $\eta_{hh}$, and $\eta_{uh}$ are the learning steps.
Step 5: Determination. If $e_y(p,k) \le \varepsilon$ or $k = k_{max}$ is satisfied, where $\varepsilon$ is a small positive value, the learning process ends and the optimal weight matrices are obtained. Otherwise, let $k = k + 1$ and go to Step 3.
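The update and stopping logic of Steps 4 and 5 can be sketched for a single pigeon group. The sketch assumes a fitness that is the reciprocal of the error norm, a center that is the fitness-weighted average of the individuals, and an update that moves each individual a fraction of the way toward that center. All three forms follow the general idea of the PIO landmark operator and are assumptions rather than the paper's exact equations. The function names, the smoothing constant `delta` and the use of the largest error over all sample pairs in the stopping test are hypothetical.

```python
import numpy as np

def center_position(W, err, delta=1e-6):
    """Fitness-weighted center of one pigeon group.

    W: stacked individuals, shape (N_set, rows, cols).
    err: error of each individual, shape (N_set,) or (N_set, m).
    delta: small constant avoiding division by zero (assumption).
    """
    err = np.asarray(err, dtype=float).reshape(len(W), -1)
    fitness = 1.0 / (np.linalg.norm(err, axis=1) + delta)   # assumed fitness form
    weights = fitness / fitness.sum()
    return np.tensordot(weights, W, axes=1)                 # shape (rows, cols)

def move_toward_center(W, center, eta):
    """Move every individual a fraction eta (learning step) toward the center."""
    return W + eta * (center - W)

def should_stop(e_y, k, eps, k_max):
    """Stop when the largest output error is below eps or k reaches k_max."""
    return bool(np.max(np.abs(e_y)) <= eps or k >= k_max)

# Example: two individuals of the W_hy group, the first with zero error.
W = np.array([[[1.0, 0.0, 0.0, 0.0]],
              [[0.0, 1.0, 0.0, 0.0]]])
center = center_position(W, err=[0.0, 10.0])
W_next = move_toward_center(W, center, eta=0.5)
```

With this weighting, individuals with small errors dominate the center, so the group is pulled toward its better-performing members. Setting the learning step to 1 would collapse the whole group onto the center in a single iteration.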

Routing Prediction
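The hop count index function of the $j$th neighbor node is predicted by the PIONNRP model with the optimal weight matrices obtained from offline training. The predictions for all neighbor nodes and target nodes are then collected into the matrix of the hop count index function, where $N_{NEI}$ and $N_{TAR}$ are the numbers of neighbor nodes and target nodes, respectively.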
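Assembling the matrix of the hop count index function can be sketched as follows. The sketch assumes one row per neighbor node and one column per target node, and it takes the prediction model as an arbitrary callable. It also applies the rule stated earlier that a known hop count takes precedence over a prediction, with unknown entries marked as NaN. The function name `hop_index_matrix`, the array layout and the NaN convention are hypothetical.

```python
import numpy as np

def hop_index_matrix(features, predict, known_hops=None):
    """Build the N_NEI x N_TAR matrix of the hop count index function.

    features: inputs u(i, j, t), shape (N_NEI, N_TAR, 4).
    predict: callable mapping one input vector to a scalar prediction.
    known_hops: optional (N_NEI, N_TAR) array, NaN where no path is known.
    """
    n_nei, n_tar, _ = features.shape
    H = np.empty((n_nei, n_tar))
    for j in range(n_nei):
        for t in range(n_tar):
            H[j, t] = predict(features[j, t])      # predicted hop count index
    if known_hops is not None:
        known = np.asarray(known_hops, dtype=float)
        H = np.where(np.isnan(known), H, known)    # known hop count wins
    return H

# Example with a placeholder predictor that sums the inputs.
features = np.ones((2, 3, 4))
known = np.array([[np.nan, 2.0, np.nan],
                  [1.0, np.nan, np.nan]])
H = hop_index_matrix(features, lambda u: float(u.sum()), known)
```

In a real deployment the placeholder predictor would be replaced by the trained model evaluated with the optimal weight matrices.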

Performance Evaluation
To evaluate the performance of the proposed routing prediction strategy, four routing strategies are implemented in simulation scenarios: greedy perimeter stateless routing (GPSR) [26], the back-propagation NN-based routing prediction (BPNNRP) strategy [27], PIONNRP, and the combination of PIONNRP and GPSR (PIONNRP + GPSR), as shown in Table 1.
If PIONNRP is implemented without other routing methods, the next hop is selected directly from the predicted hop count index function. Since PIONNRP is a routing prediction strategy, it can also be combined with other routing methods to maintain a balance between stability and a low hop count. Therefore, PIONNRP + GPSR is also used to determine the next hop. It follows a simple strategy: if rand < ζ, the next hop is selected by PIONNRP; otherwise, it is selected by GPSR, where rand ∈ [0, 1] is a random number and ζ ∈ (0, 1) is a constant.
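The combined selection rule can be sketched in a few lines. The sketch assumes that PIONNRP on its own picks the neighbor with the smallest predicted hop count index function, which is an assumption about the paper's selection equation, and that the GPSR choice is computed elsewhere and passed in as an index. The function name `select_next_hop` and its parameters are hypothetical.

```python
import random

def select_next_hop(h_column, gpsr_choice, zeta, rng=random):
    """Pick the next hop for one target node.

    h_column: predicted hop count index values, one per neighbor.
    gpsr_choice: index of the neighbor chosen by GPSR (computed elsewhere).
    zeta: probability of following the PIONNRP prediction.
    rng: object with a random() method returning a value in [0, 1).
    """
    if rng.random() < zeta:
        # PIONNRP branch: neighbor with the smallest predicted index (assumed).
        return min(range(len(h_column)), key=h_column.__getitem__)
    return gpsr_choice          # GPSR branch

# Example: with zeta = 1 the prediction is always followed, with zeta = 0 never.
predicted = [3.2, 1.1, 2.0]
always_pionnrp = select_next_hop(predicted, gpsr_choice=0, zeta=1.0, rng=random.Random(0))
always_gpsr = select_next_hop(predicted, gpsr_choice=0, zeta=0.0, rng=random.Random(0))
```

The two extreme values are used here only to make the example deterministic. In the paper ζ lies strictly between 0 and 1, so both branches are taken with some probability.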
The parameters of the UAV swarm in the simulation are shown in Table 2, and its position and velocity are shown in Fig. 5. The swarm flies toward the destination while avoiding the forbidden zone, which is shown as a sphere in Fig. 5a.
There are three cases in the simulation, which differ in the half-power beam width (HPBW). HPBW is the angular width of the main beam between the points at which the magnitude of the radiation pattern falls 3 dB below its peak. Therefore, within limits, a larger HPBW yields more neighbors. The HPBW of Case 1 is larger than that of Case 2, which in turn is larger than that of Case 3. Hence, the average number of neighbors differs among the cases. In each case, the number of sample pairs in the database is $N_{set} = 73500$, and the number of test simulations is 1,225,000. Each case includes four scenarios, in which PIONNRP, PIONNRP + GPSR, GPSR, and BPNNRP are implemented. Their parameters are shown in Table 3. The training times of BPNNRP and PIONNRP were 7311.79 s and 2189.06 s, respectively. The training time of PIONNRP was thus about 70% shorter than that of BPNNRP, which indicates that the computational complexity of PIONNRP was reduced. For detailed comparison, different values of the time-to-live (TTL) are used in the simulations. TTL is the limit imposed on a data packet to prevent it from circulating forever in the network. It can be regarded as the maximum number of hops for which the data packet remains valid in the network.
The performance evaluations of Case 1 are shown in Fig. 6. In this case, since the average number of neighbors was considerable, as shown in Fig. 6a, it was not difficult to find an appropriate routing path. Therefore, the delivery failure ratios of all routing strategies were less than 10% with TTL ∈ {10, 15, 20, 25}. PIONNRP and PIONNRP + GPSR performed much better than GPSR and BPNNRP. Compared with the traditional routing protocol, the average hop counts of PIONNRP were approximately 20.5%, 39.7%, 44.6%, 46.9%, and 47.8% less than those of GPSR with varying TTL because GPSR considered only the distance among nodes and ignored the historical experience. Compared with the ML-based routing strategy, although both PIONNRP and BPNNRP could find a short path, the delivery failure ratios of PIONNRP were 43.8%, 70.0%, 77.5%, 84.1%, and 91.1% less than those of BPNNRP with varying TTL, respectively. In particular, PIONNRP + GPSR performed slightly better than PIONNRP because it was more stable, with a lower delivery failure ratio.
The simulation results of Case 2 are shown in Fig. 7. The average number of neighbors is less than that of Case 1 due to the smaller HPBW. Similarly, PIONNRP and PIONNRP + GPSR performed better than GPSR and BPNNRP because the delivery failure ratios, average hop counts, and variances of hop count of GPSR and BPNNRP were larger than those of PIONNRP and PIONNRP + GPSR. Comparing PIONNRP and PIONNRP + GPSR, PIONNRP paid more attention to shorter routing paths, while PIONNRP + GPSR focused on keeping a balance between stability and short paths. In this case, the average hop counts of PIONNRP were 9.3%, 20.1%, 18.6%, 15.0%, and 12.9% less than those of PIONNRP + GPSR with varying TTL, while its delivery failure ratios were 56.8%, 79.3%, and 99.2% larger than those of PIONNRP + GPSR with TTL ∈ {5, 10, 15}, respectively. With TTL ∈ {20, 25}, the delivery failure ratios of PIONNRP + GPSR were 0%.
The simulation results of Case 3 are shown in Fig. 8. Compared with Cases 1-2, the average number of neighbors in this case decreases due to the smaller HPBW, and the average delivery failure ratio increases. Similarly, PIONNRP and PIONNRP + GPSR performed better than GPSR and BPNNRP. As shown in Fig. 8c, the delivery failure ratios of GPSR and BPNNRP were unacceptably high. Comparing PIONNRP and PIONNRP + GPSR, PIONNRP + GPSR still maintained a balance between delivery success ratio and hop count, although the neighbors were not dense.
The comparisons among the four routing strategies are shown in Table 4, and the conclusions are given as follows:
(a) The traditional routing protocol, GPSR, failed to find an appropriate routing path for the UAV swarm network.
(b) Although BPNNRP could find a shorter path with lower end-to-end delay than GPSR, its delivery failure ratio was still unacceptable.
(c) Compared with GPSR and BPNNRP, PIONNRP could find a shorter path and decrease the delivery failure ratio for the UAV swarm network.
(d) PIONNRP + GPSR performed better than PIONNRP since it could find an appropriate routing path with low end-to-end delay and delivery failure ratio.
Since the combination of PIONNRP and GPSR is better in most cases, it is important to choose ζ appropriately. A larger ζ is suggested for cases with a more highly dynamic topology because PIONNRP can cope with uncertainties.

Conclusions
In this paper, a routing prediction strategy is designed based on the combination of PIO and NN for UAV swarm networks. The proposed strategy predicts the performance of neighbors for routing paths through a PIO-based NN framework without the topology as prior information. Therefore, it can be applied to networks with highly dynamic topology. PIONNRP can be implemented to select the next hop according to the prediction results or be combined with other routing methods. Simulation results have demonstrated the efficiency of the proposed strategy. This routing prediction strategy provides a method to apply ML to routing with a balance between intelligence and stability.

Fig. 6 Performance evaluation with varying TTL of case 1

Fig. 7 Performance evaluation with varying TTL of case 2

Fig. 8 Performance evaluation with varying TTL of case 3

Table 3 Parameters of scenarios