UAV flight coordination for communication networks: genetic algorithms versus game theory

The autonomous coordinated flying of groups of unmanned aerial vehicles that maximise network coverage to mobile ground-based units by efficiently utilising the available on-board power is a complex problem. Their coordination involves the fulfilment of multiple objectives that are directly dependent on dynamic, unpredictable and uncontrollable phenomena. In this paper, two systems are presented and compared based on their ability to reposition fixed-wing unmanned aerial vehicles to maintain a useful airborne wireless network topology. Genetic algorithms and non-cooperative games are employed for the generation of optimal flying solutions. The two methods consider realistic kinematics for hydrocarbon-powered medium-altitude, long-endurance aircraft. Coupled with a communication model that addresses environmental conditions, they optimise flying to maximise the number of supported ground-based units. Results of large-scale scenarios highlight the ability of genetic algorithms to evolve flexible sets of manoeuvres that keep the flying vehicles separated and provide optimal solutions over shorter settling times. In comparison, game theory is found to identify strategies of predefined manoeuvres that maximise coverage but require more time to converge.


Introduction
The provision of wide-area communication is typically addressed using static land-based methods or satellites. The use of unmanned aerial vehicles (UAVs) can provide a dynamic mobile network, overcoming the problem of shadowing effects from obstructions or changes in demand, with the ability to reconfigure the equipment spatially. When compared with satellite communication, the reduced slant range of a UAV-based system offers improved round trip time signals and a 50-60 dB reduction in free space loss. It makes possible a series of desirable performance advantages, such as reducing power consumption, increasing bandwidth or simplifying the antenna requirements.
Authors: Alexandros Giagkos (a.giagkos@aston.ac.uk), Elio Tuci (elio.tuci@unamur.be), Myra S. Wilson (mxw@aber.ac.uk), Philip B. Charlesworth (charlep@hope.ac.uk)
This study describes, evaluates and compares two methods that allow a small group of hydrocarbon-powered medium-altitude, long-endurance (MALE) UAVs to generate flying manoeuvres autonomously to provide a communication network backbone. Simulated scenarios using a large number of ground-based mobile units (hereinafter referred to as mobiles) are seen as metaphors for a disaster region in which police, military and first aid units synchronously operate in a coordinated way. The units need full-duplex communication links to arrange their tasks and to share data related to their missions. The group of MALE UAVs autonomously and dynamically relocate according to the movements of the mobiles with the goal of maximising the coverage while taking into account the available power for the communication.
The autonomous, adaptive repositioning is generated using two different methods. The first method is based on the use of genetic algorithms (GAs), which at regular intervals select the best set of flying manoeuvres (i.e. one for each UAV) based on a fitness function. The second employs game theory, where the UAVs participate in a non-cooperative game (NCG) whose outcome determines their next flying positions.
Both GAs and NCG provide the UAVs with the autonomy to decide where to fly next, given the current status of the system, i.e. the positions and directions of the mobiles and present positions of the UAVs. The approaches allow the fleet to adaptively adjust its location based on changes in demand, removing the need for a planning function that generally comes with an undesired load to the communication network. They offer an effective way to represent, manipulate and ultimately generate efficient flying manoeuvres.
The contribution of this work is twofold. Firstly, unlike relevant research that mainly deals with multi-UAV path planning, i.e. finding optimal collision-free flying strategies between two points, our work addresses the need to efficiently utilise the limited power budget to facilitate an airborne network infrastructure. Secondly, both presented methods employ a kinematic model that factors the velocity and turn radius of fixed-wing UAVs and considers unconstrained horizontal motion, with a regulatory limit on altitude only.
Moreover, it deals with realistic flight paths of MALE aircraft which are typically capable of 12-24-h flight. As such, our results reflect the behaviours of real systems rather than abstract models (e.g. simulated quadcopters of limited flight duration). In a similar vein, the radio model employed in this work also aims at realism; unlike most research that considers the basic form of the Friis model with free space path loss, the thermal noise to reflect the actual operating environment is considered.
Results from a comparison in highly dynamic large-scale environments and a qualitative discussion highlight the differences, strengths and limitations of each methodological approach. It is found that both GAs and NCG can fulfil the coverage objective, with the GAs approach reaching an optimal result, i.e. converging, faster than the NCG in all scenarios. The GAs method, which relies on a single master UAV, is found to be less robust due to this centralisation. Also, the results indicate that the flying strategies produced by the GAs provide a collision-free relocation mechanism for multiple UAVs that demonstrate agent specialisation.
The rest of the paper is structured as follows. Section 2 discusses relevant research works and Sect. 3 formally presents the problem and describes the UAVs' kinematic and communication models, followed by the proposed approaches to generate flying trajectories. Section 4 outlines the experimental methodology and Sect. 5 presents the results of this study. Finally, Sect. 6 summarises the findings and discusses future work.

Background
Industry and research recognise that area coverage for communication services is a promising application area for UAVs (Zeng et al. 2016). The ability to rapidly reconfigure multi-UAV networks is deemed an essential advantage over terrestrial ones. This leads to exciting technical challenges about how such networks need to be implemented and coordinated (Gupta et al. 2016).
Research efforts exist that address the planning of efficient paths rather than area coverage (Kim et al. 2011; Tsourdos et al. 2011; Shiyou et al. 2012; Bortoff 2000; Leonard et al. 2012; Shin et al. 2012). While these works primarily address the problem of timeliness in providing coverage, they make it apparent that the multi-UAV coordination process requires some communication between the participating UAVs.
In Holmberg and Olsson (2008), Burdakov et al. (2010) the authors address the interconnection of UAVs operating beyond the line of sight with a ground station. The critical aspect of this research is the optimal positioning of the relay UAVs. This range extension model includes a single or multiple UAVs in the relay chain. More recently, research focuses on path planning and networking of numerous UAVs providing area sensor coverage (Yanmaz et al. 2011;Yanmaz 2012).
Although a variety of approaches for the coordination of the actions and paths of multiple UAVs are proposed, the remainder of this section examines those based on the use of GAs and game theory. A more comprehensive review of the state of the art in motion planning algorithms and coordination of actions in UAVs is found in Goerzen et al. (2010), Dadkhah and Mettler (2012).

Evolutionary algorithms
A GA-based global route planning system for multiple UAVs is presented in Zhang and Duan (2015). The authors suggest a differential evolution algorithm for flying solutions that are designed to be short and at low altitude, also considering flying constraints such as the turning angle, maximum climbing/descending slope and terrain abnormalities. The presented results suggest that the system can obtain optimal feasible paths during every run, and outperforms the state of the art in convergence speed. Çakıcı et al. (2016) present coordinated guidance for multi-UAV systems using GAs. Desired points in a terrain space, i.e. waypoints, under certain constraints, are determined by the evolution of chromosomes that encode heading angles, velocities and altitude information. Simulated results show that the UAVs when flying according to the evolved solutions can reach the destination points while avoiding restricted regions.
Path planning mechanisms based on GAs have been developed to avoid obstacles (Rathbun et al. 2002) or to manage application-related performance constraints (Jia and Vagners 2004). Roberge et al. (2013) compare GAs and particle swarm optimisation and conclude that the former can outperform when planning the path of a single UAV. Their analysis considers factors like fuel consumption and terrain avoidance, demonstrating a new level of complexity in path planning. Carruthers et al. (2005) investigate a search method for multi-UAV missions related to surveillance and searching in unknown areas using GAs. Their work allows several UAVs to dynamically fly through a search space and autonomously navigate by avoiding obstacles, without a priori knowledge of the environment. Although their system assumes perfect communication and central control of UAVs from take-off time to the end of the mission, the authors employ a genetic algorithm to make an exhaustive search of the mission area by generating the set of fittest next positions of the UAVs.
Finally, Agogino et al. (2012) look at the coordination of multiple UAVs required to fly over a targeted area to provide network communication to ground-based customers. Evolutionary computation techniques are used in their study to optimise network parameters such as power level and antenna orientation to maximise area coverage and download effectiveness to the end-users. While utilising the same type of scenario, the work described in this current paper uses evolutionary computation techniques differently. In Agogino et al. (2012), the UAV movements are generated with swarm intelligence techniques, and the evolutionary algorithm optimises network-related parameters, whereas in our study, the GAs generate the UAV trajectories.

Game theory
Yan et al. (2004) apply game theory methods to the problem of route planning for teams of UAVs. After defining multiple objectives and constraints that limit the UAVs' flying capabilities, a game is designed that involves several players (UAVs), each seeking to optimise its behaviour with respect to the possible actions of the other players. The UAV route planning problem is solved by looking for the Nash equilibrium (NE) of the game. The results highlight the feasibility of generating routes for teams of UAVs by using game theory methods.
Zong et al. (2012) recognise that UAV networks could be coordinated using a game heuristic, noting that the action set needed to be restricted to limit the computational effort. Their work considered inflatable UAVs operating at a fixed altitude of 20 km. This is much higher than the altitude at which a hydrocarbon-powered, winged UAV would operate. Furthermore, the flight dynamics of inflatable and winged UAVs are very different. Their UAVs had small footprints, totalling about 20% of the scenario area, so that they concentrated solely on clusters of users, the aim being to maximise the data rate available to those users rather than to maximise the number of users that can be supported. The paper offers an insight into how game theory could be applied to solve the problem of UAV coordination. Note that since their 2012 article, little has been written concerning the application of game theory to optimise the coverage of UAV networks.
Sujit and Ghose (2004) address the problem of obtaining optimal strategies for searching in unknown environments. The environment is partitioned into a collection of identical cells that can be navigated by two UAVs. The resulting search space is represented as an uncertainty map, where each cell contains a value that represents a probabilistic interpretation of the uncertainty of whether an undefined mass occupies the cell location. The objective of the UAVs is to select search routes that visit those cells with large uncertainty values. The authors assume that the UAVs can communicate with each other and decide upon a beneficial global decision, leading to a cooperative solution. The results show that several mixed-strategy NE can be identified and used in the absence of a pure strategy NE. However, an increase in the computational time is found when increasing the depth of the exploratory search environment to obtain optimal strategies, as well as when using more UAVs in the search space.
Park et al. (2018) designed a feedback-based formation control algorithm that allows a group of small abstract quadcopter-like UAVs to form a wireless ad hoc network around a ground control station (i.e. a gateway) in a 500 m × 500 m area. Acting as access points, the UAVs regularly broadcast topological information such as their own and any next-hop positions, and the number of connected devices. The control algorithm then considers connectivity, the distance between UAVs (i.e. distribution of access points) and congestion caused by the network traffic to fly the abstract quadcopters. Interestingly, the congestion parameter dictates every UAV's altitude change; the more congested the single UAV-to-ground network is, the lower the UAV flies to explicitly narrow the coverage space and reduce the connections. The generated formations result in an NE. Each UAV has a set of strategy profiles consisting of possible movements in a period with a certain speed, and a payoff function calculating the network throughput difference achieved by each movement. The system is found to dynamically transform the network coverage of the infrastructure, leading to the enhancement of throughput as compared to a static topology. Nevertheless, the work is not comparable to the one presented in this paper due to (i) the need for a ground control station, (ii) the lack of realistic UAV kinematic models, (iii) the small scale of terrain and altitudes (starting at 60 m) and (iv) the lack of radio frequency (RF) power consideration to support multiple mobiles.
The development of algorithms for autonomous UAV coordination, guidance and manoeuvrability is still a new research area. The reviewed research suggests that both the GAs and the game theory approaches can be effective in allowing single UAVs or groups of UAVs to move in a mission space autonomously. They are effective both in cases in which the mission requires the generation and use of near-optimal paths among fixed control points and in cases where the vehicles are required to operate in unknown terrain. In this study, the area of applicability of the methods mentioned above is extended by showing that both the GAs and game theory can be effectively used for coordinating UAVs on communications area coverage missions.

Problem description
The majority of the research found in the literature deals with the coordination of multiple UAVs from a path planning point of view. In this paper, the coordination of network-enabled UAVs that provide communication coverage to multiple mobile users on the ground is addressed. Both proposed systems utilise MALE unmanned aerial vehicles, whose payload and flight systems' RF power budget are derived from a generator driven by the motor shaft. There are limits to the power rating of the payload power regulator and radio amplifier. These fixed-wing UAVs typically remain airborne for over 12 h and fly at high altitudes over the mobiles, performing flying steps that are recalculated within 3 min. The footprints are typically 40-50 km in diameter. Given an average speed of 30 miles per hour, i.e. 13 m per second, mobiles will travel approximately 2.4 km within the 3-min cycle. Although this renders the coverage less prone to significant changes for a few cycles, the autonomously generated flying strategies need to consider up-to-date mobile positions to succeed. Consequently, the UAVs may execute small changes to their positions to include those mobiles closer to the edge of their covered area.
The problem addressed in this paper is as follows. Let U = {1, 2, ..., u} be a set of UAVs and G = {1, 2, ..., g} be a set of mobiles in the scenario. Let C ⊆ G be the set of mobiles covered by u ∈ U. For every g ∈ C, u is guaranteed to spend power $p_u^g$ to support the communication link between them. Finally, let $P_u^{\max}$ be the maximum power available for communication for u. Both GAs and NCG are then designed to optimise the objective given in Eq. 1. Although minimising power consumption is not formally expressed in Eq. 1, it is reflected by the mechanism responsible for populating every set C, as described in later sections.
The average power budget of a typical MALE UAV to the payload is approximately 500 W. Typically, this is divided between the processors, antenna gimbals, inter-UAV links, positional broadcast and other systems. An estimate of 100 W is available for the power amplifier which, with 50% amplifier efficiency, leads to an average of $P_{\max} = 50$ W of usable radio frequency power. Note that the presented work does not specify any particular protocol stack. The proposed models provide solutions that regulate the available bandwidth, which can be mapped on to any communication service.

Fixed-wing kinematics
The decision making for both GA-based and NCG-based coordination systems considers simulated fixed-wing UAVs modelled as points whose positions are defined by the latitude (φ), the longitude (λ), and an altitude (h) in a geographic coordinate system. Each point is associated with a direction vector that corresponds to the vehicle heading (θ ). The kinematic model that describes the UAV movements is based on a 6-DoF model in which a vehicle's motion consists of unrelated turns of constant speed in the horizontal and vertical planes. Giagkos et al. (2020) present a detailed description of the MALE kinematic model. Notice that UAVs can only fly within a predefined altitude limit. This restriction is implemented by forbidding altitude additions or subtractions in cases where the maximum or minimum permitted values are reached.
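The constant-speed turning behaviour described above can be illustrated with a minimal sketch. It assumes a level coordinated turn (turn rate g·tan β / v), a local flat x/y frame in metres in place of latitude and longitude, and a hypothetical cruise speed; these simplifications and all function names are illustrative assumptions and are not taken from the kinematic model of Giagkos et al. (2020).

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def turn_radius(speed, bank_deg):
    """Radius (m) of a level coordinated turn at the given bank angle."""
    return speed ** 2 / (G * math.tan(math.radians(bank_deg)))


def circle_time(speed, bank_deg):
    """Time (s) needed to complete one full turn circle."""
    return 2 * math.pi * turn_radius(speed, bank_deg) / speed


def step(x, y, heading, h, speed, bank_deg, climb, dt, h_min, h_max):
    """Advance a point-mass UAV by dt seconds at constant speed.

    heading is in radians; a zero bank angle flies straight and the sign
    of the bank angle selects the turn direction. An altitude change that
    would leave [h_min, h_max] is refused, as described in the text.
    """
    heading_rate = G * math.tan(math.radians(bank_deg)) / speed
    heading = (heading + heading_rate * dt) % (2 * math.pi)
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    new_h = h + climb * dt
    if h_min <= new_h <= h_max:
        h = new_h
    return x, y, heading, h
```

With an assumed speed of 50 m/s and the 48° maximum bank angle, this sketch gives a turn radius of roughly 230 m and a circle time just under 29 s; both figures depend entirely on the assumed speed.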

Communication model
The UAVs are expected to fly continually over the scenario area to provide network infrastructure for the mobiles. They are equipped with two radio antennae: (i) an isotropic one used to transmit between each other and (ii) a horn-shaped one used to transfer data to the ground, where the mobiles are operating. They have limited power for the communication, denoted as $P_{\max}$, with which they have to provide as many communication links as possible with the ground. Note that all participating vehicles are expected to be equipped with a Global Positioning System (GPS) and can periodically broadcast information about their current positions.
A realistic model for the radio communications channel between UAVs and ground-based mobiles underpins this research. Many authors choose the Friis model (Friis 1946), which only considers the free space path loss between a transmitter and a receiver and ignores the range limitations imposed by channel noise. Yan et al. (2019) also produced a useful review of channel models. The authors include a set of guidelines for calculating link budgets over air-to-surface channels and advice on which models are most appropriate for different environments. In this paper, a model that includes free space path loss and thermal noise from both the sky and the ground is adopted, as proposed in Maral et al. (2020).
Within the communication network, links are treated independently, and transmission is considered successful when a UAV transmitter can feed its antenna with enough power to satisfy the quality requirements. No matter which modulation and demodulation scheme is applied at the higher protocol layers, a communication link is considered of good quality if the ratio of the energy per bit of information $E_b$ to the thermal noise in 1 Hz bandwidth $N_0$ is maintained. Eq. 2 expresses the transmitting power $P_t$ required to cover a mobile at slant range d.
In Eq. 2, $R_b = 2$ Mbit/s is the desired data rate, $E_b/N_0 = 10$ dB is the normalised signal to noise ratio, f = 5 GHz is the frequency of operation and $G_r = 1$ is the gain of the omnidirectional antenna of the receiver. $T_{sys}$ is the equivalent noise temperature of the receiver in Kelvin and K is Boltzmann's constant ($1.38 \times 10^{-23}$ J·K$^{-1}$). The gain of the transmitter $G_t$, commonly given by the antenna manufacturer, is calculated as per Eq. 3, with θ = 170° corresponding to the half power beam width angle of the horn-shaped antenna and η = 0.95, the efficiency of its transmission.
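The link budget described above can be sketched as follows. Because Eqs. 2 and 3 are not reproduced here, the code uses the textbook relation $E_b/N_0 = P_r / (R_b K T_{sys})$ with free space path loss, and approximates the horn gain by an ideal conical beam, $G_t = 2\eta / (1 - \cos(\theta/2))$. Both forms, the default noise temperature of 290 K and the function names are assumptions made for illustration, not the authors' exact expressions.

```python
import math

C = 299_792_458.0        # speed of light, m/s
K_BOLTZMANN = 1.38e-23   # J/K, value quoted in the text


def horn_gain(beamwidth_deg=170.0, efficiency=0.95):
    """Linear gain of an idealised conical beam with the given full angle."""
    half_angle = math.radians(beamwidth_deg / 2.0)
    return efficiency * 2.0 / (1.0 - math.cos(half_angle))


def required_tx_power(d, t_sys=290.0, rb=2e6, ebn0_db=10.0, f=5e9,
                      g_r=1.0, g_t=None):
    """Transmit power (W) needed to hold Eb/N0 at slant range d (m).

    t_sys is an assumed receiver noise temperature; the text gives no value.
    """
    if g_t is None:
        g_t = horn_gain()
    ebn0 = 10.0 ** (ebn0_db / 10.0)                 # dB to linear ratio
    path_loss = (4.0 * math.pi * d * f / C) ** 2    # free space path loss
    return ebn0 * rb * K_BOLTZMANN * t_sys * path_loss / (g_t * g_r)
```

Under these assumptions the required power grows with the square of the slant range, so doubling the range quadruples the power drawn from the budget, which is why mobiles near the footprint centre are the cheapest to support.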
The existence of obstacles in the terrain is introduced by an elevation angle γ shown in Fig. 1 and calculated as per Eq. 4 using h, the UAV's altitude, and d, the slant range. A communication link is achieved when γ ≥ ω, with ω = 10°. Below this elevation angle, the ground noise component of the receiver noise temperature rises rapidly from 10 to 50 Kelvin, adversely affecting the value of $T_{sys}$ (Committee 1990). Subsequently, if γ < ω then the factor p in Eq. 5 is set to 0, reflecting the fact that no power is dedicated to that specific link and thus the UAV does not cover the corresponding mobile. Also, the communication link is ultimately considered achievable if and only if $P_t$ is less than or equal to the remaining $P_{\max}$, the maximum power available for communications each cycle. For further details on computing slant range values, the reader is encouraged to consult Giagkos et al. (2020).
Fig. 1 One UAV and one mobile positioned within the UAV's conical footprint. In this picture, d is the slant range; α the angle of the communication link; θ is the beam width angle of the horn-shaped antenna that defines the area within which links are possible; γ is the elevation angle; ω is the minimum elevation angle below which no communication link can be established; and h is the UAV altitude
Figure 1 depicts a UAV that provides network coverage to a mobile positioned within the conical footprint of its directional antennae. The greater the UAV's altitude h, the wider its conical footprint on the ground, and thus the greater the area covered. The longer the slant range d between the transmitter and the receiver, the higher the signal power required to support the communication. The slant angle α of the mobile for the UAV is calculated by applying spherical trigonometry using the available GPS data that each network user is expected to broadcast at regular intervals.
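The two link conditions above (minimum elevation angle and remaining power) can be sketched as a feasibility check. The sketch assumes flat ground, so that the elevation angle is asin(h/d); this form stands in for Eq. 4, which is not reproduced here, and the function names are hypothetical.

```python
import math


def elevation_angle_deg(h, d):
    """Elevation angle (degrees) for altitude h and slant range d, flat ground."""
    if d < h:
        raise ValueError("slant range cannot be shorter than the altitude")
    return math.degrees(math.asin(h / d))


def link_possible(h, d, p_t, p_remaining, min_elev_deg=10.0):
    """True when the elevation constraint holds and p_t fits the power left."""
    return elevation_angle_deg(h, d) >= min_elev_deg and p_t <= p_remaining


def footprint_radius(h, min_elev_deg=10.0):
    """Ground radius (m) inside which the elevation constraint is satisfied."""
    return h / math.tan(math.radians(min_elev_deg))
```

On flat ground the 10° minimum elevation angle is a tighter limit than the 170° beam width, so in this sketch the footprint radius is set by the elevation constraint and scales linearly with altitude.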
A mobile needs to lie within the footprint of at least one UAV to be part of the network. Coverage is granted only if the UAV responsible for providing the network link has enough power to maintain that link.

Comparison with terrestrial and satellite-based systems
Table 1 summarises findings from comparing the employment of UAVs with other common methods to deliver area coverage.

Table 1 Density (mobile/km²) supported by each delivery method (the values of the row "Free space path loss at edge" have not survived)
Terrestrial: 1+ to 10⁻²
UAV: 10⁻² to 10⁻⁵
GEO spot beam: 10⁻⁴ to 10⁻⁶
LEO: 10⁻⁵ to 10⁻⁸
GEO Earth cover beam: 10⁻⁷ to 10⁻⁹

Given that a terrestrial base station has a typical cell diameter of 20-40 km, the number of mobiles it can support is limited by the available spectrum and power. This leads to typical densities greater than 10⁻² mobile/km². Densities below that level, commonly found in sparsely populated rural areas, render terrestrial infrastructure less economical. A spot beam from a geostationary orbit (GEO) system's satellite can serve these areas quite well. A typical 2° spot beam from a modern Ku- or Ka-band satellite tends to be power limited, even when multiple channels are 'stacked' in the same beam. Thus, the supported mobiles' density tends to be between 10⁻⁴ and 10⁻⁶ mobile/km². Finally, the 17.4° Earth cover beam from a GEO system is best used to serve very low mobile densities, namely between 10⁻⁷ and 10⁻⁹ mobile/km².
Furthermore, the new generation of low Earth orbiting (LEO) satellites is also deployed with a single wide footprint. Their lower altitudes provide smaller beams than GEO Earth cover beams but larger than spot beams. LEO satellites achieve lower path losses than GEO technologies (i.e. 15-20 dB lower than GEO). Although this leads to stronger received signals, the single wide beam limits the effective mobile density to between 10⁻⁵ and 10⁻⁸ mobile/km².
UAV-based communications compare favourably with terrestrial and satellite systems. The path losses are typically 22 dB higher than those of terrestrial systems and between 50 and 67 dB lower than those of satellite systems. The stronger signals are spread over a smaller area than satellite beams, leading to densities between 10⁻² and 10⁻⁵ mobile/km². This plugs a gap between terrestrial systems, which would often be uneconomical at these densities, and satellite systems, which would provide a weaker service to the supported mobiles.
Round trip time (RTT) is another important aspect to consider when comparing network coverage technologies, particularly when employed network protocols use acknowledgement packets. Note that telephony systems are also vulnerable to echoes, with the ITU-T recommending an echo delay of no more than 400 ms (ITU 2000). UAVs compare very favourably with terrestrial systems and have a clear advantage over satellite systems as shown in Table 1.
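The path loss and round trip comparisons above can be checked with a short sketch. It assumes a hypothetical 20 km UAV slant range, the standard GEO altitude of 35,786 km, pure propagation delay to the relay and back, and the same carrier frequency for both systems; the last assumption is made only to isolate the effect of range, since real satellite systems use other bands. Function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def fspl_db(d, f):
    """Free space path loss in dB over distance d (m) at frequency f (Hz)."""
    return 20.0 * math.log10(4.0 * math.pi * d * f / C)


def round_trip_ms(d):
    """Propagation-only round trip time (ms) to a relay at distance d (m)."""
    return 2.0 * d / C * 1e3


UAV_RANGE = 20e3        # assumed slant range to a UAV, m
GEO_RANGE = 35_786e3    # GEO altitude above the equator, m

loss_gap_db = fspl_db(GEO_RANGE, 5e9) - fspl_db(UAV_RANGE, 5e9)
geo_rtt_ms = round_trip_ms(GEO_RANGE)
uav_rtt_ms = round_trip_ms(UAV_RANGE)
```

Under these assumptions the path loss gap is roughly 65 dB, which falls inside the 50-67 dB range quoted above, and the propagation round trip drops from roughly 239 ms for GEO to a fraction of a millisecond for the UAV.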
Finally, the availability of the communications service is compared. Terrestrial and GEO systems are continuously available within their footprints, except for occasional deep rain fades in the higher satellite frequency bands. Their orbital characteristics constrain LEO-based systems to give a maximum of 8 min before the satellite moves out of range and the connection is handed off to the next satellite. This handoff process has been the source of dropped or interrupted connections on previous generations of LEO satellites. On the contrary, UAV-based methods can offer a more flexible service than satellite or terrestrial systems, as UAVs can be relocated to maximise their usefulness to the supported mobiles.

Proposed approaches to coordinate the UAV movements
As stated previously, both approaches require that the UAVs broadcast data about their location and the location of mobiles within their footprint. By considering the data, the UAVs can generate their next manoeuvres to maximise the joint coverage of the group. With the GA, the system is designed with a single master UAV that gathers the position data, runs the GA and broadcasts the next moves to the other UAVs. Note that due to the assumption that UAV-to-UAV communication is not affected by the power consumption nor the flying of the UAVs, a member of the group is arbitrarily selected to act as the master at the beginning of the mission. In reality, a selection protocol will need to be in place. However, this is currently outside of the scope of the presented work. Data updates are broadcast every 3 s, a sufficient interval to facilitate data transfer in connection to the mobility patterns of the mobiles and their speeds (Charlesworth 2015).
The broadcast data are used to predict the positions of all vehicles (UAVs and mobiles) after the duration of a flying path. The system tolerates the error in predicted positions, based not only on the effect of the delay in evolving solutions but also on the realistic fact that the mobiles might change direction unexpectedly. It is found that even when mobiles move with high speeds, this method allows the UAVs to pace their flying according to the mobility pattern of the supported mobiles (Giagkos et al. 2014). The NCG requires that all the UAVs simultaneously decide on their next move by solving the same game, and synchronously move to their respective next locations corresponding to the unique NE of that game.
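The position prediction step can be illustrated with a dead-reckoning sketch. The text does not state how positions are extrapolated, so the straight-line, constant-speed model, the flat x/y frame and the function name below are all assumptions.

```python
import math


def predict_position(x, y, heading_rad, speed, duration):
    """Extrapolate a position along a straight line at constant speed.

    x and y are in metres in a local flat frame, heading_rad is measured
    from the x axis, speed is in m/s and duration in seconds.
    """
    return (x + speed * math.cos(heading_rad) * duration,
            y + speed * math.sin(heading_rad) * duration)
```

A mobile that changes direction during the interval ends up away from the predicted point, which is the prediction error the system is said to tolerate.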

The genetic algorithm
In the absence of instructions generated by the GAs on the master UAV, the flying vehicles perform clockwise turn circle manoeuvres with the maximum bank angle (48°) to keep the current position. Once the GAs have generated a new set of manoeuvres (one for each UAV), the master UAV broadcasts the solutions to the team using the network. Note that in reality, various network phenomena can jeopardise UAV-to-UAV communication, such as the applicability and effectiveness of the MAC and routing protocols, and potential bottlenecks. However, in the context of this work, it is assumed that, during communication, there is no packet loss and that a dynamic routing protocol allows flawless data relay within the topology. If there is a delay in the transmission of valuable information between the UAVs, the latter are expected to keep flying in a circular motion, waiting longer for a decision to be made. Nevertheless, due to the size of UAVs and their footprints in this problem, the performance of the system is less prone to packet loss (Charlesworth 2015).
A flying manoeuvre is described by a Dubins path of 3 segments (Dubins 1961). Each segment is represented by a bank angle and the duration of execution, and can be either a straight line or a left/right turn, depending on the given bank angle (see Fig. 2a). A Dubins path may request an alteration to the UAV's altitude. Each segment of the manoeuvre can vary in duration, but the sum of the durations of the three segments must be equal to a fixed time interval corresponding to the time required to complete a circle with the maximum bank angle. This time constraint ensures that whatever manoeuvre is executed, the system remains synchronised, with UAVs that start and finish their respective manoeuvres at the same time. The GAs have limited time to search for and generate the next flying manoeuvres: the search must take less time than each UAV needs to perform two turn circle manoeuvres at its current position.
Before the end of the second turn, the GAs are expected to have found the new best positions, which are immediately transmitted by the master UAV to the other UAVs. At the end of the second turn, each UAV executes the flying manoeuvre transmitted to it. During the execution of the latest generated manoeuvres, each UAV gathers recent GPS data for each of its respective mobiles, and transmits this data to the master. Figure 3 depicts the internal mechanism of the master UAV. Both a turn circle manoeuvre and a previous Dubins path are performed in step intervals (1 s in simulation). Note that once complete, the master UAV adopts any newly evolved solution and communicates it to the rest of the fleet. Concurrently, the generation of the next solution is initialised, taking into consideration any up-to-date GPS data collected. Figure 2b describes this sequence of events, which repeats until the end of the mission.
Flying instructions are encoded into chromosomes as shown in Table 2. There are 8 genes, 6 of them defining the bank angle $\beta_i$ and duration $\delta t_i$ of the manoeuvre segment i with i ∈ {1, 2, 3}. The remaining two genes define the variation to the altitude δh and whether that variation will be applied within the duration of the Dubins path (i.e. within the time interval corresponding to $\sum_{i=1}^{3} \delta t_i$, for three path segments). Bank angles, durations and altitude changes are encoded with real-valued genes chosen uniformly at random from the range [0, 1]. An exception to this is the gene arbitrating an altitude change, which is represented by a binary value b.
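One possible way to decode such a chromosome into a manoeuvre is sketched below. Table 2 is not reproduced here, so the gene order, the linear mapping of [0, 1] onto the bank angle range of ±48°, the normalisation of the three durations so that they sum to the fixed interval, and the `max_delta_h` parameter are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

MAX_BANK_DEG = 48.0  # maximum bank angle quoted in the text


@dataclass
class Manoeuvre:
    segments: List[Tuple[float, float]]  # (bank angle in degrees, duration in s)
    delta_h: float                       # requested altitude change, m
    apply_delta_h: bool                  # whether the change is applied


def decode(chromosome, total_time, max_delta_h):
    """Map 8 genes to a three-segment manoeuvre lasting total_time seconds.

    Assumed layout: [b1, t1, b2, t2, b3, t3, dh, flag], with the first
    seven genes in [0, 1] and the last one binary.
    """
    if len(chromosome) != 8:
        raise ValueError("expected 8 genes")
    banks = [(2.0 * chromosome[i] - 1.0) * MAX_BANK_DEG for i in (0, 2, 4)]
    raw = [chromosome[i] for i in (1, 3, 5)]
    total = sum(raw)
    # Normalise so that the three durations always add up to total_time.
    shares = [r / total for r in raw] if total > 0 else [1.0 / 3.0] * 3
    durations = [s * total_time for s in shares]
    delta_h = (2.0 * chromosome[6] - 1.0) * max_delta_h
    return Manoeuvre(list(zip(banks, durations)), delta_h, bool(chromosome[7]))
```

Normalising the durations is one simple way to honour the fixed-interval constraint regardless of the gene values; other encodings would satisfy the same constraint.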
The GAs use the linear ranking to generate the flying manoeuvres (Goldberg 1989). At generation 0, a population composed of M × N random chromosomes are created, with N corresponding to the number of UAVs in the group and M = 100 indicating the number of groups or solutions. A solution is made of H chromosomes. For each new generation following the first one, the H chromosomes corresponding to the best performing group/solution (i.e. the elite) are retained unchanged and copied to the new population. Each of the chromosomes for the remaining solutions is formed by first selecting two old solutions using roulette wheel selection. Two chromosomes, each randomly selected from among the members of the chosen solutions are recombined with a probability of 0.3 to reproduce one new chromosome. After a single-point crossover operator is applied for reproduction, each parameter of the new chromosome is in turn mutated. Mutation entails that a random Gaussian offset is applied to each real-valued gene, with a probability of 0.05. The mean of the Gaussian is 0, and its standard deviation is 0.1. During evolution, all real-valued genes are constrained to remain within the range [0, 1]. For binary genes, mutation corresponds to switching the state of the gene.
The process is repeated to form M − 1 new solutions of H chromosomes. The GAs run for 200 generations or until an allowed computation time has elapsed. The use of time as a stopping criterion is an efficient way to avoid undesired effects such as long delays in the execution of the UAV's position update phase. Note that beyond a certain size of UAV fleet, it would be beneficial to allocate more time than the GAs can currently use to find a solution. This is because with progressively more UAVs, the search space becomes bigger and the time to evaluate each solution increases. Nevertheless, the search time allocated to the GAs has to remain shorter than the time taken by UAVs to perform the turn circle manoeuvre. Thus, with a large fleet of UAVs, the entire system has to be adjusted to allow the GAs enough time to find good solutions.
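The two variation operators described above can be illustrated with a minimal sketch. It assumes a chromosome is a plain list of 8 genes with the binary altitude gene stored last (the gene order, the function names and the `rng` parameter are illustrative assumptions, not taken from the paper), and that the binary gene is flipped with the same 0.05 probability as the real-valued genes, which the text does not state explicitly.

```python
import random

def single_point_crossover(parent_a, parent_b, p_cross=0.3, rng=random):
    """Recombine two chromosomes with probability p_cross, else copy parent_a."""
    if rng.random() < p_cross:
        cut = rng.randrange(1, len(parent_a))  # cut point strictly inside the list
        return parent_a[:cut] + parent_b[cut:]
    return list(parent_a)

def mutate(chromosome, binary_index=7, p_mut=0.05, sigma=0.1, rng=random):
    """Mutate each gene independently with probability p_mut.

    Real-valued genes receive a Gaussian offset (mean 0, std sigma) and are
    clamped to [0, 1]; the binary gene (assumed at binary_index) is flipped.
    """
    child = list(chromosome)
    for i, gene in enumerate(child):
        if rng.random() >= p_mut:
            continue  # this gene is left untouched
        if i == binary_index:
            child[i] = 1 - gene
        else:
            child[i] = min(1.0, max(0.0, gene + rng.gauss(0.0, sigma)))
    return child

if __name__ == "__main__":
    rng = random.Random(42)
    a = [rng.random() for _ in range(7)] + [0]
    b = [rng.random() for _ in range(7)] + [1]
    print(mutate(single_point_crossover(a, b, rng=rng), rng=rng))
```

Clamping after the Gaussian offset is one simple way to honour the [0, 1] constraint; the source does not say whether clamping or another repair rule is used.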
The fitness of each group/solution is shared by all the chromosomes forming the solution. The group fitness is computed by summing the number of uniquely supported mobiles per UAV, with UAVs assumed to be positioned at their respective next locations as per Eq. 6.
with $U$ being the set of all UAVs in the system, $C_n$ the packing array of the $n$th UAV in $U$, and $|G|$ the total number of mobiles in the scenario. The packing array $C_n$ is populated using Algorithm 1. This algorithm ensures that the number of mobiles assigned to each UAV is maximised while the power consumption is minimised. Bearing in mind that the speed at which mobiles move across a footprint is the important factor, the packing algorithm favours the mobiles that are close to the centre of the footprint. These require the lowest power; hence, the UAV can support more mobiles by finding clusters and keeping them close to the middle. The algorithm starts by initialising empty packing arrays for all UAVs of a scenario. These are used to accommodate and assign mobiles to their supporting UAVs. An important component of the algorithm is the sorted logical map $M$. The map consists of all UAVs and the mobiles that are found inside their footprints, sorted in ascending order by the power required to support them at their slant range. Following that criterion, the algorithm iterates over all UAVs and populates their packing arrays with mobiles from their footprints, making sure that there is enough power to maintain the UAV-to-mobile communication links.

Algorithm 1 Packing algorithm
Require: A logical map M of all mobiles in G, organised w.r.t. UAVs' footprints, sorted by power in ascending order
Ensure: A set C of packing arrays
  let packing arrays C[n] ← ∅ ∀ n ∈ U
  while G is not empty of mobiles do
    for each u in logical map M do    ▷ For every UAV in the map
      let g be the first mobile found in u's sorted list
      let p be the power required to support g
      if power budget(u) − p ≥ 0 then    ▷ Ensures that there is enough power to maintain the link
        C[u] ← g
        power budget(u) = power budget(u) − p
        remove remaining instances of g from G
  return C

The algorithm terminates when all mobiles are considered, ensuring that the resulting packing arrays will contain as many of the mobiles as possible, prioritising those that require less power to be supported. Mobiles that are predicted to be positioned within the footprint of more than one UAV are assigned to the UAV with the smallest slant range if that UAV has enough power to provide coverage. Otherwise, they are assigned to other UAVs. It is important to notice that artificial evolution uses information retrieved from frequent broadcast messages sent by all the vehicles. To ensure that the manoeuvres are generated according to valuable positional information, distances between antennae and in-turn power estimation are calculated based on predicted positions.
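The packing procedure of Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: the dictionary layout of the logical map, the names `pack_mobiles`, `sorted_map` and `power_budget`, and the rule that a mobile a UAV cannot afford is simply dropped from that UAV's list (so that the loop always terminates) are all assumptions made for the example.

```python
def pack_mobiles(sorted_map, power_budget):
    """Greedy round-robin packing in the spirit of Algorithm 1.

    sorted_map:   {uav: [(mobile, required_power), ...]} with each list
                  sorted by required power in ascending order.
    power_budget: {uav: available RF power}.
    Returns {uav: [mobiles assigned to that UAV]}.
    """
    packing = {uav: [] for uav in sorted_map}
    budget = dict(power_budget)                  # do not mutate the caller's dict
    queues = {uav: list(c) for uav, c in sorted_map.items()}
    assigned = set()                             # mobiles already served by some UAV

    progress = True
    while progress:                              # stop once every list is exhausted
        progress = False
        for uav, queue in queues.items():
            # Skip mobiles that another UAV has already taken.
            while queue and queue[0][0] in assigned:
                queue.pop(0)
            if not queue:
                continue
            mobile, power = queue.pop(0)         # cheapest remaining candidate
            progress = True
            if budget[uav] - power >= 0:         # enough power to maintain the link
                packing[uav].append(mobile)
                budget[uav] -= power
                assigned.add(mobile)
    return packing

if __name__ == "__main__":
    demo_map = {"uav1": [("a", 1.0), ("b", 2.0)],
                "uav2": [("b", 1.5), ("c", 3.0)]}
    print(pack_mobiles(demo_map, {"uav1": 3.0, "uav2": 2.0}))
```

In the demo, mobile "b" lies in both footprints and ends up with the UAV that needs less power to serve it, while "c" is left unserved because the remaining budget of its only candidate UAV is too small. The round-robin order does not guarantee the smallest-slant-range assignment in every case.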
At the end of each evolutionary process, manoeuvres corresponding to the best solution of the last generation are communicated to the respective team members for implementation along with the mobile-to-UAV allocation table. Each UAV executes the received flying instructions and serves those mobiles that are supposed to be served based on their predicted position. The larger the number of mobiles covered, the fitter that particular solution is when the UAVs execute the manoeuvres. This logic allows the GAs to generate team solutions that maximise the network coverage by assigning mobiles to UAVs that can spend less power to serve them.

The non-cooperative game
A non-cooperative game (NCG) is used to coordinate the movements of a group of simulated UAVs. Non-cooperative games are games in which the players have a common objective but do not form teams or coalitions to achieve that objective. In our system, each player (i.e. UAV) has a set of possible next locations $A$ and a pay-off set $C$ containing the number of mobiles that are supported within that UAV's footprint. An action $a^*$ is then defined as per Eq. 7.

$$a^* \in \arg\max_{a \in A} c(a) \qquad (7)$$

Note that $A$ can be a discrete set of actions or a continuous set. The strategy adopted by all players is a probability distribution across $A$. The NCG generates a UAV's next location based on the anticipated status of the system at the end of its current manoeuvre. Each UAV then selects an action that maximises its pay-off, given the actions chosen by other UAVs. This set of optimal actions by each of the UAVs defines the best response by each UAV to every other UAV and is consequently an equilibrium state, i.e. the Nash equilibrium (NE). The airspace over the mission area is configured as a pattern of hexagonal cells, as shown in Fig. 4. At the start of each run of the game, each UAV is circling in cell 1. In the simplest case of maintaining a constant altitude, each UAV has the following choices: either it can move in circles in cell 1, or it can relocate to circle in the adjacent cells 2-7. It could also choose to climb at its maximum climb rate and circle in cells 8-14, or descend to circle in cells 15-21. That gives each UAV a set of 21 actions that can be adopted. Generally speaking, if each of the $u$ UAVs has $k$ possible strategies, the number of combinations of strategies increases as $k^u$. The size of the pay-off matrix must also increase as $k^u$.
The cellular pattern has symmetry about the current location that allows all potential locations to be reached at the same time. The time allocated to complete the manoeuvre is sufficient for the UAV to complete one circle at its current altitude, move to the next cell and complete one circle at its new altitude. The small differences in path length between level flight, climbing and descent are absorbed in the completion of the two circuits, giving a manoeuvre time of about 5 min. This ensures that all UAVs have settled into their new altitude before planning the next manoeuvre, and allows the decisions of all UAVs to be synchronised. The pay-off matrix of the NCG contains the coverage of all UAVs for all combinations of actions. In other words, the pay-off matrix is populated by calculating the number of vehicles that can be supported by each UAV for all $k^u$ combinations of strategies. The vehicle-to-UAV allocation is done following the criteria detailed in Algorithm 1. The single NE, usually a mixed-strategy Nash equilibrium (MSNE), of the game is used to define the strategy that should be adopted by every single UAV. Chatterjee's method (Chatterjee 2009) is used to solve the game and thus identify the best NE out of all those that can exist. The method starts by assuming a random solution, then progressively refines that solution until the error between successive iterations is less than a given threshold.

[Table fragment: Interval of GPS data broadcast, 3 s]
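To make the growth of the pay-off matrix concrete: with $k = 21$ actions, 2 UAVs give $21^2 = 441$ joint actions and 3 UAVs give $21^3 = 9261$. The sketch below enumerates every joint action and then searches for pure-strategy Nash equilibria by checking unilateral deviations. It is only an illustration of the best-response idea: it is not Chatterjee's method, it does not handle mixed strategies, and the function names and the `coverage_fn` callback are hypothetical.

```python
from itertools import product

def build_payoffs(num_uavs, num_actions, coverage_fn):
    """Map every joint action (one action index per UAV) to per-UAV pay-offs.

    coverage_fn(joint) must return a tuple with one coverage value per UAV.
    The resulting dictionary has num_actions ** num_uavs entries.
    """
    return {joint: coverage_fn(joint)
            for joint in product(range(num_actions), repeat=num_uavs)}

def pure_nash_equilibria(payoffs, num_uavs, num_actions):
    """Return joint actions from which no single UAV gains by deviating alone."""
    equilibria = []
    for joint, pay in payoffs.items():
        stable = True
        for u in range(num_uavs):
            for alt in range(num_actions):
                deviation = joint[:u] + (alt,) + joint[u + 1:]
                if payoffs[deviation][u] > pay[u]:
                    stable = False  # UAV u would rather play `alt`
                    break
            if not stable:
                break
        if stable:
            equilibria.append(joint)
    return equilibria

if __name__ == "__main__":
    mobiles_per_cell = [4, 3, 1]  # toy data, not from the paper

    def toy_coverage(joint):
        # UAVs sharing a cell split its mobiles evenly (integer division).
        return tuple(mobiles_per_cell[a] // joint.count(a) for a in joint)

    table = build_payoffs(2, 3, toy_coverage)
    print(len(table), pure_nash_equilibria(table, 2, 3))
```

In the toy game the two equilibria place the UAVs in different cells, which mirrors the intuition that competing players avoid sharing the same mobiles.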
Contrary to the GA-based approach, the NCG system is fully distributed, as all UAVs would generate identical pay-off matrices from identical location data. Thus they are expected to reach an identical NE solution (see also Charlesworth 2013 for further details). Both the GA and the NCG approaches require each UAV to access information concerning the positions of all UAVs and the position and direction of motion of all vehicles. Both systems therefore require global information sharing.

Experimental methodology and evaluation metrics
To thoroughly examine the performance of the two approaches, a series of experiments is conducted. They are designed to evaluate not only the ability of the coordination systems to provide coverage by balancing the power consumption in large-scale scenarios, but also to investigate the flying behaviours that emerge. Table 3 provides a summary of the parameters used throughout the experiments. It is assumed that the mobiles utilise low-cost omnidirectional antennas and that the data rate between mobiles and UAVs is fixed at 2 Mbit/s, a communication rate that is considered high enough to support video streaming, emails with attachments, and transmission of images between mobiles.
Coverage is defined as the number of vehicles uniquely supported by a UAV. This concept of 'uniqueness' indicates a one-to-one mapping between pairs of communicating UAVs and mobiles. Providing unique coverage maximises the number of mobiles that can be supported. The cost of this strategy is reflected by a slight increase in the complexity of the payload packing algorithm and handoff delays, in case the path between the mobile and UAV becomes unavailable. In practice, these disadvantages are found to be insignificant. Figure 5 shows the typical shape of a coverage graph. Coverage starts at some arbitrary level depending entirely on the initial conditions of the experiment and settles towards a mean coverage μ with an associated standard deviation σ. The first metric, the settling time $T_{set}$, is the time that each method takes to reach a value within 3σ of the mean.
The value of coverage that needs to be exceeded, $C_{set}$, can be defined as $C_{set} = \mu - 3\sigma$. The other two metrics are the mean and standard deviation of the steady-state coverage. The mean, μ, indicates the effectiveness of the method in achieving coverage while the standard deviation σ shows how consistently that level of coverage is supported.
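The three metrics can be computed from a coverage time series as in the following minimal sketch. The paper does not say how the steady-state portion of a run is delimited, so the `steady_from` parameter (the time after which samples count as steady state) is an assumption of this example, as are the function name and the use of the population standard deviation.

```python
from statistics import mean, pstdev

def settling_metrics(times, coverage, steady_from):
    """Return (mu, sigma, c_set, t_set) for one coverage trace.

    mu, sigma: mean and standard deviation of samples with t >= steady_from.
    c_set:     coverage threshold mu - 3 * sigma.
    t_set:     first time at which coverage reaches c_set (None if never).
    """
    steady = [c for t, c in zip(times, coverage) if t >= steady_from]
    mu = mean(steady)
    sigma = pstdev(steady)
    c_set = mu - 3 * sigma
    t_set = next((t for t, c in zip(times, coverage) if c >= c_set), None)
    return mu, sigma, c_set, t_set

if __name__ == "__main__":
    t = [0, 1, 2, 3, 4, 5]
    cov = [2, 5, 9, 10, 12, 8]
    print(settling_metrics(t, cov, steady_from=3))
```

Note that a noisier steady state (larger σ) lowers the threshold and therefore tends to shorten the measured settling time, which matters when comparing methods with very different variability.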
The initial conditions of all the simulations ensure consistency when measuring the settling time (defined in the next section). Selected results from varying the number of UAVs (groups of 2 and 3 UAVs) and the number of mobiles (scenarios with 20, 100, 150 and 200 mobiles) are presented to highlight performance, scalability and convergence strengths and limitations. The mobiles, randomly distributed in the 100 km × 100 km simulation area, employ a random waypoint mobility model by which each vehicle pauses for 120 s before it randomly selects a direction to move at 30 mph. Using the same scenarios with identical initial conditions gives an insight into how the UAVs behave as a group. For every approach, each scenario is simulated ten times, and average results are presented in the next section.

Results and discussion
Results showing a comparison of the two coordination approaches are presented in this section. The similarities as well as the differences between the proposed GAs and NCG approaches are discussed along with an analysis of their emergent flying behaviours and their optimisation capabilities.

Coverage for different number of mobiles
To understand the associations between coverage and power consumption, experimental results from flying 2 UAVs over 20 mobiles are depicted in Fig. 6. The 20 mobiles represent a sparsely distributed group within a large terrain size requiring higher flights and in turn more power to be spent in maintaining links. Figure 6a and b shows that both approaches satisfy the undemanding challenge of supporting all mobiles but demonstrate significant differences in their approaches. The GAs exhibit some imbalance between the 2 UAVs with one supporting 12 and the other supporting 8 mobiles. The consistency shown over the 10 runs provides an indication that the GA-based approach, evaluating the fitness at a group level rather than the individuals, allows specialisation in flying to emerge. The NCG-based approach balances the load between the 2 UAVs with each supporting 10 mobiles for the majority of the flight. Figure 6c and d shows that the RF power required to support the 20 mobiles varies between 60 and 75 W for the GAs, and between 40 and 70 W for the NCG. These findings coupled with the coverage results suggest that the specific power for the GAs is just over 3 W per vehicle, and for the NCG is just under 3 W per vehicle. The main factor that affects specific power is the slant range, dictated primarily by the position of a UAV relative to the mobiles that it supports. The indications from the sparse scenario with 20 mobiles are that the NCG is marginally better than the GAs at positioning the UAVs closer to mobiles. Figure 8 shows the coverage results for both approaches using 2 UAVs covering 100 and 200 mobiles. The differences between the two can now be clearly seen. The GAs quickly achieve and maintain a steady value of total coverage demonstrating signs of specialisation between the 2 UAVs with UAV1 providing consistently higher coverage than UAV2. 
The NCG demonstrates less consistent coverage, with the total coverage taking some time to settle within a constant range. Visual inspection indicates that the coverage of the NCG is higher than that of the GAs.
By analysing the power data, a value of between 0.7 and 1.3 W per mobile covered is found. This suggests that the total power of 2 UAV payloads of 50 W each should be enough to cover 140 mobiles. When 200 mobiles are available, only one of the proposed approaches, the NCG, achieves coverage of 140 mobiles. The GAs are found to be less adept at exploiting clusters, as indicated by the relatively flat curves in Fig. 8a and c. The geometry of the scenario and the distribution of the mobiles explain this observation. Recall that the scenario is a 100 km × 100 km square and the UAV footprints are circular. Given a random and uniform distribution of the mobiles within the terrain, total coverage can only be achieved when all the mobiles lie within the circular footprints. This can only occur if the latter completely encompass the terrain, implying that the UAVs cruise at extremely high altitudes. However, such altitudes increase the slant range to the mobiles and demand more power, negatively affecting the number of mobiles which can be supported.
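The power-budget estimate above is simple arithmetic: total payload power divided by the specific power per mobile. A short sketch, with a hypothetical helper name and using only the figures quoted in the text (2 payloads of 50 W, 0.7 to 1.3 W per mobile), shows the range this implies.

```python
def supportable_mobiles(num_uavs, payload_power_w, specific_power_w):
    """Upper bound on mobiles served if power were the only constraint."""
    total_power = num_uavs * payload_power_w
    return int(total_power // specific_power_w)

if __name__ == "__main__":
    best = supportable_mobiles(2, 50, 0.7)   # lowest specific power quoted
    worst = supportable_mobiles(2, 50, 1.3)  # highest specific power quoted
    print(best, worst)
```

The optimistic end of the range, roughly 140 mobiles, matches the figure quoted in the text; the pessimistic end is about half of that, which is why the achieved coverage depends so strongly on how close the UAVs can get to clusters of mobiles.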
In practice, the mobiles are not uniformly distributed across the scenario but, because of their mobility, they tend to form clusters that slowly form and disperse. A method for relocating the UAVs is likely to be more effective if it can react to the formation of clusters and reposition the flying agents closer to the centre of the clusters.
When a third UAV is introduced the total RF power and area of the footprints increases, allowing more mobiles to be supported. Figure 7 depicts that, in all cases, the total coverage increases. It is clear that the GAs still provide consistent cover, and the addition of the third member of the flying group seems to make any specialisation unnecessary. The NCG manages to achieve better coverage results as compared to the GAs, with its variability of coverage being still apparent.
A more detailed analysis of the coverage metrics for these results can be seen in Table 4, where the statistical significance of the results is depicted. Remember that $C_{set}$, defined as μ − 3σ, is the coverage threshold and $T_{set}$ is the settling time. The mean coverage for the NCG is higher than that of the GAs for all instances except 3 UAVs and 100 mobiles. Analysis of variance (ANOVA) on the two datasets, the distributions of 10 runs for each approach, shows that the coverage data for 100 mobiles is similar for both approaches (p = 0.05). The results of covering 150 mobiles show a definite difference for 2 UAVs but no significant difference for 3. When there are 200 mobiles, there is a definite improvement in coverage when the NCG is used. The standard deviations for the NCG approach are larger than those of the GAs, as indicated by Figs. 7 and 8. The GA-based approach provides a constant, reliable service. The settling time for the GAs is consistently short and increases with the number of mobiles, whereas there is no clear pattern for the settling time of the NCG. When 3 UAVs provide coverage for 100 mobiles, the NCG is marginally faster than the GAs, but in all other cases its settling time is significantly longer. This anomalous case can be explained by comparing the values of $C_{set}$ for the GAs and the NCG; the large σ for the NCG generates a lower value of $C_{set}$ and hence a shorter $T_{set}$. The behaviour is also explained by the tendency of the NCG-based approach to find and exploit clusters of mobiles. The time taken to locate clusters and reposition UAVs is entirely dependent on the initial conditions. This is a weakness in this approach as it could take a long time to settle whereas, in identical initial conditions, the GAs are found to be much faster.

Fig. 9 (panels a-d) Altitude versus time for GAs and NCG decision approach of 2 UAVs supporting 20 mobiles

Flying behaviours when supporting different numbers of mobiles
The flight paths of the UAVs, defined in terms of their latitude, longitude and altitude, offer some insight into the relative performance of the two approaches. Example runs of the experiments are selected to demonstrate their associated behaviours. Figure 9 presents exemplar flying trajectories for 2 UAVs supporting 20 mobiles. Power is not a significant limitation for this example, but due to the sparse distribution of the mobiles it is necessary for the UAVs to increase their footprint area by climbing. Thus both approaches require the UAVs to climb to an altitude where their footprints are sufficiently large to permit visibility of all the mobiles.
The flexible flying provided by the GAs makes the UAVs fly higher and reach the maximum altitude of 22000 ft in an attempt to maximise their footprint areas. In this particular example, UAV1 moves towards the centre of the terrain while UAV2 demonstrates little change in either latitude or longitude. These emergent flying strategies allow the group to successfully achieve a joint coverage of all the mobiles with UAV1 covering more than UAV2, as seen in Fig. 6a. The different behaviours and coverage values suggest some degree of agent specialisation between the 2 UAVs, an effect of altruism that results from the group evaluation.
The conservative flying approach taken by the NCG causes both UAVs to move towards the scenario centre, as shown in Fig. 9d, and fly at a lower altitude of around 18000 ft. The 2D flight trajectories such as in Fig. 9c demonstrate the frequent repositioning of the UAVs and the complexity of their manoeuvres. Their almost symmetric locations and limited movements suggest that each has achieved stable coverage and has no further need to relocate. This is supported by Fig. 6a which depicts each UAV supporting approximately half the mobiles. The flying behaviours produced by both approaches differ when more mobiles seek coverage (see Fig. 10). Observing Fig. 10a, it is seen that the GAs move both UAVs towards the centre of the scenario's terrain and, for the remainder of the time, they execute small location changes in a clearly defined area. As the number of mobiles increases to 200, the area within which the UAVs manoeuvre also increases, as seen in Fig. 10c. This is expected behaviour: the higher the spatial density of the mobiles, the lower the UAVs will choose to fly, as shown in Fig. 11. Subsequently, the smaller footprints encourage the UAVs to make small responses to marginal changes in coverage.
The horizontal movement of the UAVs using the NCG demonstrates an interesting pattern. When there are only 100 mobiles, the 2 flying agents demonstrate a competing behaviour. Their flight paths overlap slightly as they move around the centre of the scenario trying to find marginal improvements to their coverage. As the number of mobiles increases to 200, the overlap reduces; there are plenty of mobiles to be found without the necessity to compete too aggressively.
Adding a third UAV increases both the available RF power by a third and the combined area of the footprints. It can be expected that coverage will increase and that the 3 UAVs will be less constrained when finding optimal solutions. The vertical movement of the UAVs controlled by the GAs is shown in the altitude plots of Fig. 12a and c. The mobile spatial density is lowest with 100 mobiles; therefore, the GAs encourage flying higher, pushing UAVs to climb to increase their footprints. Interestingly, with 200 mobiles available all 3 UAVs start by descending. However, UAV3 has some difficulty maintaining its coverage and responds by climbing to increase its footprint size. The GAs show evident and consistent behaviour with 3 UAVs. Figure 13a and c show all flying agents initially moving towards the scenario centre. Once their coverage has reached a practical maximum, they make marginal location changes in response to movement of the mobiles, maintaining a clear separation between each UAV and its neighbours. In this way, they maximise their coverage. Game theory demonstrates clear evidence of competition when looking at the NCG results. Figures 12b and 13b

Two approaches to autonomous MALE UAV flying to enable ground-to-ground communication over challenging terrain are discussed and compared. The first approach applies GAs whereas the second employs game theory to generate flying solutions. Both approaches are designed to maximise the coverage, that is the number of mobiles that can be supported during the mission, considering the limited available power dedicated to communications. Notice that although the problem is concerned with the ability of the systems to increase the number of covered mobiles as well as the efficiency in managing the power, it is treated as a single-objective problem by both approaches. This is because the operating packing mechanism (discussed in Sect. 3) encapsulates the concept of power management when allocating mobiles to their supporting UAVs. Furthermore, the NCG is not intended as an optimisation method; instead, its application explores whether optimal solutions can be found if agents are in competition. Both approaches are capable of supporting more mobiles than other satellite-based methods while achieving less packet loss and significantly shorter round-trip times. This is due to the constraints satellite approaches have, ranging from shorter footprint diameters to restricted access times. The two approaches are compared through a series of simulations for large-scale scenarios. Coverage, power consumption and scalability as well as flying behaviours are thoroughly investigated, highlighting the pros and cons of each approach. An initial observation is that the GA-based approach sought optimal solutions and sometimes allowed the UAVs to specialise. The competitive nature of the NCG results in all UAVs seeking marginal increases in coverage at the expense of competitors. Both methods were found to fulfil the objective of providing adaptive communication coverage, with the GAs being able to maintain a consistent result throughout the mission. This is due to the flexibility of the flying behaviours offered by the current design of chromosomes in the GAs.
The NCG achieves higher coverage than the GAs but with greater variability due to its conservative and quantised flying behaviour. For instance, with 2 UAVs and 200 mobiles the NCG scores μ = 152 with σ = 3.4, whereas the GAs achieve μ = 113 with σ = 2.3. Results are similar when a third UAV is introduced: μ = 179 with σ = 3.9 and μ = 164 with σ = 2.1 for the NCG and the GAs, respectively. It is suggested that smaller quantisation steps might reduce the variability of the coverage values, and this could be an interesting question for future research.
In terms of quickly converging to a sufficient separation, the GAs are found to require less time (i.e. settling time $T_{set}$ = 107 compared to 2415 for the NCG) and to be able to specialise the resulting flying behaviours due to their flexibility. The NCG is seen to require more time as the UAVs follow a similar trend in traversing shorter distances per flying step while making frequent altitude changes to manage power.
When small, and thus sparse, groups of mobiles are concerned (i.e. |G| = 20), the NCG is marginally better than the GAs in power consumption as it requires less than 3 W per mobile, with the GAs exceeding this consumption. However, as the number of mobiles increases, it is found that the specific power (i.e. the power required to support one mobile) tends to remain between 0.8 W and 1.3 W per mobile for both approaches.
As the number of mobiles and their supporting UAVs increases, it is easier for both approaches to suggest locations with low slant ranges, and hence lower values of specific power. Under these conditions, both approaches tend to reduce the operating altitude of all members of the flying group. This mostly results from the increased spatial density of the mobiles and the consequent ability of the UAVs to satisfy their coverage demands from smaller footprints.
Results related to the flight paths of the UAVs show different behaviours from the two compared approaches. The GAs tend to maintain clear horizontal separation between the UAVs. In contrast, the NCG sometimes allowed their flight paths to overlap, a consequence of the competition between UAV players in response to local increases in vehicle spatial density. These results also indicate that flying strategies produced by the GAs promise a collision-free relocation of multiple UAVs.
The NCG assumes distributed planning and decision making, but still needs to share positional data between UAVs over the same unreliable core network. All position-related information is shared, but each UAV decides on its own what its next action will be, by anticipating the actions of the others. In the current form, the GAs use centralised planning, with the need to collect data and distribute a plan over a network with some probability of data loss or corruption but, with synchronisation and common rules for genetic operators, it would be possible to produce a distributed version. Finally, this paper uses coverage as the primary criterion for evaluating mission performance and RF power as the primary constraint. The consideration of other performance criteria and constraints as well as the incorporation of collision avoidance and traffic-related performance metrics could improve the realism of both approaches while establishing some exciting possible directions for future research.
Funding This research was funded by Airbus Group Endeavr Wales.

Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.