Endogenous stochastic optimisation for relief distribution assisted with unmanned aerial vehicles

Unmanned aerial vehicles (UAVs) have been increasingly viewed as useful tools to assist humanitarian response in recent years. While organisations already employ UAVs for damage assessment during relief delivery, there is a lack of research into formalising a problem that considers both aspects simultaneously. This paper presents a novel endogenous stochastic vehicle routing problem that coordinates UAV and relief vehicle deployments to minimise overall mission cost. The algorithm considers stochastic damage levels in a transport network, with UAVs surveying the network to determine the actual network damages. Ground vehicles are simultaneously routed based on the information gathered by the UAVs. A case study based on the Haiti road network is solved using a greedy solution approach and an adapted genetic algorithm. Both methods provide a significant improvement in vehicle travel time compared to a deterministic approach and a non-assisted relief delivery operation, demonstrating the benefits of UAV-assisted response.


Introduction
Disaster mortality has increased by 180% between 1994-2004 and 2004-2013, reaching 99,700 yearly deaths in the latter decade (CRED 2015). While three megadisasters (2004Asian Tsunami, 2008Cyclone Nargis and 2010 are the main contributors to this trend, their exclusion still yields a 17% rise in mortality over the same period. Another concerning factor is that climate change will increase the frequency of weather-related disasters. Efforts to protect vulnerable communities are undertaken by humanitarian organisations, who are increasing their relief operations budgets (Van Wassenhove and Pedraza Martinez 2012). As an indicative example, the United Nations (UN) World Food Programme (WFP) raised its spending by 60% between 2011 and 2017 (WFP 2017), exceeding $5 billion in 2016.
In the aftermath of a natural disaster, the main immediate response operation consists of transporting and distributing essential goods for survival. This stage is essential for the long-term recovery and sustainability of the affected community, as survival rates reduce exponentially during the first 72 hours after a disaster event (Huang and Lien 2012). Limited accessibility as a result of extensive damage to infrastructure is the main hindrance to relief operations, isolating communities and causing severe operational delays (Meier et al. 2016).
The prohibitive cost of manned aerial operations is a key motivation for humanitarian organisations to integrate unmanned aerial vehicles (UAVs) in disaster response . In particular, damage analysis and assessment have been the focus of organisations and researchers alike (Bravo and Leiras 2016) given the current technological limitations related to empty take-off weight and range, which hamper their applicability to other activities. The low weight, easy transportation and high data quality of UAVs provide a unique opportunity for respondents to collect and assess infrastructure damage during disaster response operations. A recent example is the 2016 Ecuador earthquake response, where UAVs were deployed within a day after the disaster to assess road damage and provided essential data to the Ministry of Transport and National Secretariat for Risk Management, informing their relief routing decisions (DuPlessis 2016).
Despite successful pilot applications and technological improvements, some barriers still preclude the widespread use of UAVs for post-disaster damage assessment. For instance, image processing and analysis are still notably timeconsuming, as exemplified during the 2016 Ecuador deployment, where processing the survey results required 12 to 24 hours of continuous effort (FSD 2016).
For an efficient use of the limited human and technological resources available in the immediate aftermath of a disaster, UAVs must be deployed when and where they provide the greatest benefits to the affected community. These missions include assessing key infrastructure damage and estimating population needs. Moreover, flight times should be minimised given energy shortages and safety risks (risk of injury due to malfunction or vandalism) in humanitarian settings (Van Wassenhove 2006;Van Wassenhove and Pedraza Martinez 2012;Meier et al. 2016). Overall, the optimisation of UAV deployments for road 1 3 Endogenous stochastic optimisation for relief distribution… damage assessment requires the development of a bespoke modelling approach that considers knowledge of the network state acquired over time.
Our literature review (Sect. 2) reveals that the problem of combined UAV-based network assessment and relief distribution is yet to be formulated and solved. Moreover, existing optimisation problems that evaluate relief distribution and network damage assessment generally assume deterministic or exogenous stochastic conditions.
To address this gap, we propose a mathematical model for optimising the provision of humanitarian relief supported by UAV-based damage assessment. The problem formulation presented assumes that the extent of damage to the road network is stochastic and that UAVs are deployed for surveying the physical conditions of the transport infrastructure, reducing the uncertainty related to ground vehicle travel times and costs. Simultaneously, ground vehicles are routed through the network, delivering relief cargo.
Therefore, the problem is defined as endogenously stochastic, where the uncertainty in the network depends on the routing decisions of UAVs and ground vehicles. A greedy sequential solution algorithm and a meta-heuristic (a genetic algorithm) are presented to determine optimal routing itineraries at discrete time steps. Thus, the contribution of this paper is threefold: 1. To the best of the authors' knowledge, it is the first to formalise an optimisation problem to model the UAV-assisted endogenous stochastic travel time relief distribution. This paper proposes a metric to quantify the benefit of integrating UAV road assessment with humanitarian response. 2. A greedy exact solution procedure is proposed. However, additional simplifications are required to ensure reasonable runtime due to its high complexity. A relaxed version of the problem is solved using a meta-heuristic, which is used to solve larger problem instances. 3. It demonstrates the solution methods' applicability by solving a realistic case study based on the 2010 Haiti earthquake. The results show that important benefits are obtained when introducing UAVs in humanitarian response operations.
This paper is structured as follows: Sect. 2 analyses the current literature, followed by Sect. 3 which defines the mathematical model. The solution methods described in Sect. 4 are applied to a numerical case study focusing on a hypothetical response to an earthquake in Haiti in Sect. 5. Conclusions and recommendations for future work are provided in Sect. 6.

Background
The following section presents an overview of the current research on endogenous stochastic optimisation methods and humanitarian relief optimisation models that consider road network damages or accessibility. Despite the two aspects being related, no study thus far incorporates both of them simultaneously. As such, the review analyses the state-of-the-art of both topics separately. A summary of the literature is presented in Table 1.

Post-disaster relief optimisation
This review focuses on studies that optimise the routing decisions of relief distribution or network repair vehicles. Generally, these decisions are captured by the vehicle routing problem (VRP), which can incorporate vehicle capacity constraints, stochastic demand and travel time. Duque and Sörensen (2011) developed a greedy randomised adaptive search procedure for allocating engineering resources to repair road networks. A similar problem instance is solved by Tuzun Aksu and Ozdamar (2014), who presented an integer programming (IP) model which relaxes the VRP to a path-based algorithm by ignoring travel times. The "debris removal problem" is further studied by Sahin et al. (2016). In this study, a recursive mixed integer linear programming model (MILP) is developed to maximise the cumulative network accessibility while minimising clearing time. Similarly, Akbari and Salman (2017) conceived an exact MILP model to generate a synchronised road clearing schedule for multiple engineering teams. Kim et al. (2018) adopted a MILP model to minimise the weighted sum of total damages and the completion time while considering time-dependent damage characteristics. Balcik et al. (2008) proposed a two-step approach to solve the relief distribution problem. In the upper layer, a travelling salesman problem (TSP) generates vehicle routes, and vehicle itineraries are produced in the second step. Özdamar (2011) incorporates evacuation and refuelling considerations to the relief distribution problem. The proposed algorithm relaxes the VRP problem to a static integer flow formulation, significantly improving its runtime. Vahdani et al. (2018) conceived a two-phase, multi-objective, multi-period and multi-commodity mixed integer programming model which locates the distribution centres and plans vehicle routes.
The combined problem of relief distribution and network repair has been studied by Yan and Shih (2009). The proposed multi-objective MILP model is capable of simultaneously planning emergency road repair operations and the subsequent relief distribution deployments for multiple commodities. Ransikarbum and Mason (2016) developed a goal-programming-based (GP) multi-objective integrated response and recovery model that optimises network recovery and relief distribution simultaneously. Finally, Li and Teo (2018) conceived a multi-period bi-level programming model to optimise road repair and relief optimisation. They employ a genetic algorithm to maximise relief delivery quantity and network accessibility.

Exogenous stochastic optimisation
After a disaster, the state of the road network is stochastic and should be assessed before planning debris removal or supplies distribution. Many studies consider these uncertainties by adopting the two-stage stochastic programming (2SP) modelling technique for depicting the post-disaster humanitarian logistics response under an uncertain demand of relief supplies.
Van Hentenryck et al. (2010) focused on modelling the optimisation of large-scale disaster relief and recovery operations as a single commodity allocation problem.

3
A similar framework with multiple commodity types is presented by Tofighi et al. (2016).
Several studies solve the warehouse-location demand-allocation problem, with particular focus on hurricanes (Rawls and Turnquist 2012;Pacheco and Batta 2016), earthquakes (Mete and Zabinsky 2010) and floods (Garrido et al. 2015). Noyan (2012) studied risk-averse location-allocation optimisation instead of common riskneutral problems, while Bozorgi-Amiri et al. (2013) presented a multi-objective robust stochastic programming problem considering uncertain demand, supply, cost and transportation.
The common goal of the established stochastic models is to minimise the total expected cost among all possible scenarios. The operational costs generally comprise ordering and shipping costs, as well as a parametric penalty in case of shortages under extreme scenarios.
In almost all the studies, the optimal logistics configuration was identified by obtaining the exact solution to the 2SP model through an optimal solver. The reason for adopting the optimisation solver is that only a reasonable number of possible scenarios were derived from the case study and the probability of a scenario occurring is independent of the decisions made in the previous time steps. In other words, the uncertainty is assumed exogenous in nature. In addition, it is worth mentioning that the calculation of transportation costs in most of the previous studies only considered the shipping quantity, instead of evaluating the uncertain road condition which may have a prominent influence on travel times.

Endogenous stochastic programming
The following section explores the research that focuses on endogenous stochastic problems, where the decisions directly affect the level of uncertainty in the system (Jonsbraten 1998). This includes studies that consider fluctuating road conditions as another major source of uncertainty during the modelling process. Khaligh and Mirhassani (2016) solved a single-vehicle routing problem with stochastic demands, where the actual demand of a customer is revealed only once the customer is visited. Considering the problem of hospital operation room scheduling, Hooshmand et al. (2018) developed a genetic algorithm to solve the endogenous time-stochastic scheduling problem. Vayanos et al. (2011) studied the production planning problem in offshore oil fields considering endogenous uncertainties. Nohadani and Sharma (2008) conceived a generalised robust linear optimisation model applied to a shortest path problem. The algorithm considers arc lengths to be stochastic and affected by routing decisions. Similar research has been carried out by Boyles and Waller (2010), who instead proposed a minimum cost flow problem.
On another thread, several researchers have focused on reducing the number of non-anticipativity constraints (NACs) in multi-stage stochastic programming models with endogenous uncertainties. The methods adopted included sequential scenario decomposition heuristics and Lagrangian decomposition (Gupta and Grossmann 2011;Apap 2017).

Endogenous stochastic optimisation for relief distribution…
The studies analysed above focus on the reduction of NACs, which grow exponentially in size with larger problem instances. In practice, the algorithms developed are applied to scenarios that deviate significantly from our own.
Only Khaligh and Mirhassani (2016), Nohadani and Sharma (2008) and Boyles and Waller (2010) study the VRP or similar formulations, while others focus on scheduling problems. However, Khaligh and Mirhassani (2016) solved the stochastic demand VRP and used simplifications based on vehicle payload to reduce the scenario tree. Travel time is modelled as a stochastic parameter by Nohadani and Sharma (2008) and Boyles and Waller (2010), but both studies propose a static problem where variable uncertainties do not condition future decisions.
Thus, an important contribution of this paper is the proposal of a mathematical formulation for the time-discretised endogenous stochastic multiple VRP which assesses the aerial assessment carried out by UAVs for revealing the road damage conditions. A unique aspect of our work is that both types of agents (UAV and truck) are capable of reducing network uncertainty. The above research was consulted when formulating the mathematical model presented in the next section.

Model description
The model presented in this section captures key decisions at the operational stage of humanitarian relief distribution in the aftermath of a natural disaster. The distribution of relief is carried out by ground vehicles assisted by UAVs, which assess network damages. The objective is to reduce the operational costs of delivering cargo by optimising UAV and ground vehicle routing decisions over the mission time horizon. Note that the operational costs are calculated based on the vehicle travel times, such that a reduction in costs leads to a reduction in mission time.
A graph G v = {V, E v } composed of nodes V and edges E v represents the road network connecting human settlements. It is assumed that the road network has been subjected to damage, the degree of which is stochastic at the start of the mission. This stochastic component d s ij is assigned to each edge ij ∈ E v , modifying the travel time and travel cost of the link. In this problem, we consider a time horizon that is represented as a discrete set of time periods t such that t ∈ T . The stochastic component is assumed to be time independent.
At every time step, UAVs are deployed to collect information on the road network, revealing the actual cost associated with network damages. The network exploration is captured by the binary decision variable b t ij , indicating that a link is A scenario tree s ∈ S captures all the potential routing options of the UAV.
Note that UAVs are assumed to operate in beyond visual line of sight (BVLOS) conditions in spite of current drone regulations disallowing them to do so without prior clearance (Stöcker et al. 2017). However, UAV deployments have been permitted previously for road damage assessment in Ecuador and, at time of writing, WFP is collaborating with national authorities to develop regulations for humanitarian UAV deployments (DuPlessis 2016; Vornic 2017).
After the UAV is deployed at each time step, cargo vehicles traverse G v transporting relief to every demand node ∀i ∈ N . Given common post-processing and analysis related to road damage assessment by UAVs (FSD 2016), it is assumed that explorations at time step t i inform routing decisions in future time steps t > t i . This behaviour is shown in Fig. 1, where a drone is deployed at time t = 0 , and the information gathered by that drone is available for routing at t = 1 and t = 2 . We adopt the following modelling assumptions: Assumption 1 Ground and aerial vehicles must return to the depot at the end of the time step. A time step t corresponds to a single UAV and truck journey. While the UAV is assumed to be deployed before the truck, we consider that flight preparation, battery recharge, data consolidation and analysis take place during the same time step. This assumption is based on current case studies that require several hours to collect and analyse UAV data (DuPlessis 2016; FSD 2016). Consequentially, it is also assumed that the UAV battery is fully charged before each deployment.
Assumption 2 Average vehicle speeds are related to the Modified Mercalli Intensity (MMI). The MMI is a standardised measurement of the ground movement intensity. Hence, it is assumed that areas with greater ground movements are exposed to more severe damages, leading to lower average vehicle speeds (see Assumption 3). At the start of the mission, respondents have no knowledge of the extent of link damages or the MMI score, thus assuming an average vehicle speed v avg .
Assumption 3 Vehicle speeds are normally distributed. Variations in spot velocities are commonly modelled using normal distributions (Adeke et al. 2018;Naidu 2019). This study assumes that the same standard deviation occurs regardless of damage, but the distribution mean of each link is calculated based on Assumption 2.

The model presented requires two graphs
The former represents the road network and contains information on the deterministic pre-disaster transport travel time g ij , deterministic damage travel time contribution d ij Fig. 1 Truck and UAV routing example. A dashed line indicates an unexplored link, a grey line determines that the link has been explored when the vehicle is being routed, while the green and blue indicate the route followed by the drone and the truck, respectively 1 3 Endogenous stochastic optimisation for relief distribution… and stochastic damage time contribution d ij . G d forms a fully directed graph that the UAV uses to traverse through the network. Edge set E d contains information on the flight times f ij and the approximate battery consumption q ij , defined as percentage battery use. Vehicular payload capacity and node demand compose the remaining required inputs.

Mathematical formulation
The problem described above is consolidated into a single mathematical formulation. The reader is referred to Table 2 for the definitions of the variables and parameters used in the remainder of this section. The linearisation of the problem is presented in the Appendix. ( The overall objective function is presented in equation (1), which consists of seven components defined in (1.1-1.7). Equations (1.1-1.2) describe the two scenarioindependent objectives: UAV travelling cost and route exploration cost, respectively. The scenario-dependent components are evaluated in Equation (1.3) and are regulated by p s and s ij , where p s indicates the probability for scenario s to occur and s ij relates the explored link ij with scenario s. This approach allows us to only consider the damage levels of the explored link in the scenarios during truck routing. The remaining components relate to the costs associated with truck routing. Equations (1.4-1.6) calculate the total vehicle routing cost. The deterministic pre-disaster cost component is evaluated in (1.4), while damage contributions are considered in (1.5) and (1.6). The former uses a deterministic component based on explored links (where b t ij = 1 ), while the latter considers a stochastic cost distribution if the traversed link was not previously explored (where b t ij = 0). It is important to note that the cost values determined in (1.1-1.6) are proportional to the vehicle travel time, as they are calculated using a value of time parameter c. Therefore, the objective minimises response time.
Finally, a penalty associated with undelivered demand is proposed in equation (1.7). The undelivered demand is calculated using the slack variables r st i and h st i , and parameter P relates to the value of life, which serves as a measurement of the social costs related to the response operation (Holguin-Veras et al. 2013).
The full mathematical problem is described as follows:

3
Endogenous stochastic optimisation for relief distribution… The problem constraints are defined by (2.1-2.22). UAV movement is controlled by (2.1-2.4): constraint (2.1) ensures that the UAV starts at the depot, while (2.2) limits the UAV to visit a node once only. Constraint (2.3) allows UAVs to wait only at the depot, meaning that the UAV is not deployed when x t iik = 1 ∀i ∈ D . An equal number of outgoing and incoming UAVs at each node i is defined in (2.4).
We limit UAV range based on the battery consumption matrix q ij , implemented in (2.5-2.6). The variable o jk is defined in (2.5) and represents the percentage battery consumption of the UAV k when arriving to node j. A value of 1 indicates that the usable battery charge has been fully consumed, while a value of 0 represents a fully charged battery. The value of o t jk is bounded to 0 when j ∈ D to replicate its recharging behaviour. Constraint (2.6) ensures that battery levels never exceed their capacity.
Constraints (2.7-2.12) monitor the vehicle and cargo flows throughout the network, with (2.7-2.10) confining the truck's routing behaviour, similar to (2.1-2.4). Constraint (2.7) ensures that the vehicles start at the depot, (2.8) prohibits all vehicles to travel to the same demand node, (2.9) allows only two visits to each node, and (2.10) ensures vehicle flow equilibrium.
Constraints (2.11) and (2.12) monitor cargo flow through the network. The first limits cargo transportation to vehicle payload, while the second assures cargo equilibrium is satisfied considering the slack variables ( r s,t i and h s,t i ). These variables Endogenous stochastic optimisation for relief distribution… indicate the quantity of cargo that remains undelivered at each time step, and are used by the objective component (1.7). Constraint (2.13) ensures that explored link u t ij can only be selected if a drone has traversed this particular link. The explored link variable b t ij is updated by (2.14), giving ground vehicles access to the actual link cost as per (1.5-1.6). Constraint (2.15) ensures that if a certain link is explored at time t i , it will remain explored in all future time steps t ≥ t i . The non-anticipativity constraints (NACs) in (2.16)

Non-anticipativity constraints
The proposed problem formulation reveals actual link damage only if explored by a UAV or traversed by a vehicle in a previous time step. In other words, the timing of the revealed information is decision dependent, and the exploration decisions are conditioned by the available information at the start of the time step. Therefore, given two scenarios s and s ′ at time step t, if the information revealed is the same for s and s � ∀t � ∈ T ∶ t � < t , the scenarios are defined as indistinguishable. NACs ensure that decisions made in indistinguishable scenarios s and s ′ at time step t remain identical. This phenomenon is illustrated in Fig. 2, which depicts a sample scenario tree with three stages and nine scenarios, where each branch represents the three damage levels of a newly explored link (based on a 3-point approximation of the normal distribution). In Fig. 2, all scenarios s at t = 0 are indistinguishable, illustrating the behaviour of our algorithm. Given that no damage has yet been assessed and no routing decisions have been made, all scenarios must be indistinguishable at this stage. Therefore, the NAC at t = 0 is: Fig. 2 Scenario tree example where the normal distribution is modelled using the three-point estimation method For t > 0 , NACs determine both truck, cargo and UAV routing decisions, respectively. By referring again to Fig. 2, the NAC at t = 1 would state that x 1,t ijk = x 2,t ijk = x 3,t ijk , as they all share the same scenario history. In the problem formulation above, the decision depends on the revealed link damages and unsatisfied demand nodes. For convenience, we define a variable z s,t i = r s,t i + h s,t i that calculates all undelivered demand in scenario s at time step t. The NACs for t > 0 are defined in equations (4-6), considering drone, truck and cargo routing, respectively.

Solution method
The adoption of exact solution methods for solving the full-branch scenario tree and NACs result in an exceedingly large problem size even for small instances. Equation (7) shows how the number of variables N relates to the number of time steps T , edges E , vertices V , scenarios S , drones K d and ground vehicles K v . In turn, the number of scenarios equates to the total number of possible combinations of link damages at each time step as defined by equation (8), where n indicates the number of link damage levels. As it can be observed, the number of scenarios increases nonlinearly as the network size increases (number of links), and hence quickly becomes unsolvable for exact solvers. (3) 1 3

Endogenous stochastic optimisation for relief distribution…
The formulation of the proposed problem instance is similar to a time-dependent vehicle routing problem (TD-VRP), where the aim is to assign vehicles to a route to minimise total distance travelled. The problem is NP-complete, as it is a generalisation of the travelling salesman problem, so any exact solution method is unscalable and not applicable to realistic problem instance sizes. Previous research relies on heuristics and meta-heuristics to overcome the scalability issues of exact solution methods for the TD-VRP (Gendreau et al. 2015;Kok et al. 2012) and multi-stage endogenous stochastic problems (Gupta and Grossmann 2011;Hooshmand and MirHassani 2016;Hooshmand et al. 2018;Apap 2017). Thus, the pursuit of exact solutions for these kinds of problems is infeasible within acceptable computational time and resources. Therefore, we propose two solution approaches: a greedy sequential method presented in Sect. 4.1, and a meta-heuristic described in Sect. 4.2.

Sequential algorithm
We reduce the number of variables considered by using a greedy sequential model, solving each time step t ∈ T . A workflow of the model is presented in Fig. 3.
The process commences by generating an initial problem instance with randomised network damages s p . An exact optimisation solver is used to optimise the routing decisions that minimise the objective function (1) for t = 0 . The resulting drone and truck routings are used to update the network demand and network damages in order to carry forward the information gathered at t = 0 . The process continues with increasing values of t until all demand is delivered.
The proposed solution ensures that at each step t i the exact solver minimises the operational cost given the cumulative knowledge gathered for t < t i . However, the main problem of the sequential method is that it ignores potential future decisions

Fig. 3 Sequential algorithm workflow
for t > t i . Thus, the greedy decisions made at t i , while optimal for the proposed time step, may result in deviations from the global optimum.
In exchange for a reduced solution optimality, the greedy sequential model eliminates the need to implement NACs and reduces the scenario tree complexity. The latter is shown in equations (9)-(10), where the number of combinations is dependent on the number of links E and the number of explored links per time step B . As a result, the approach considers only a limited number of explorations at every time step. The potential number of scenarios quickly becomes infeasible given the total combinations of link explorations.
Hence, we enforce B = 1 , following current UAV range limitations, image post-processing and data analysis requirements for damage assessment FSD 2016). Furthermore, the greedy nature of the approach provides a lower-bound solution for the original problem. By developing a genetic algorithm, we can extend the problem to multiple drones, vehicles and explorations per time step, as well as eliminate its greedy nature.

Genetic algorithm
A genetic algorithm (GA) is a widely used meta-heuristic developed in the 1970s that employs probabilistic-based search to explore the solution space (Davis 1991). GAs have been used extensively to solve the VRP due to their linear scaling with problem size (Karakatič and Podgorelec 2015). Their performance level has been shown to be on par with other heuristics and have provided the best-known solutions for some particular problem instances (Berger and Barkaoui 2003;Guezouli and Abdelhamid 2017). Therefore, we selected the GA as an appropriate solution method for our mathematical optimisation.
Five core processes govern the behaviour of the GA: solution generation, evaluation, selection, crossover and mutation. The length of the search, governed by the number of iterations , and the population size z alter the optimality level the GA can achieve. The structure of the components is presented in Algorithm 1.
The generation phase creates an initial solution set. Every solution is then evaluated using equation (1), which returns a scalar fitness value indicating its optimality. The implementation of the evaluation stage is presented in Algorithm 2, which is called by Algorithm 1 in Line 7.
The selection phase determines which solutions are discarded. We propose a modified roulette method that associates a selection probability to each solution based on their fitness value, so that sub-optimal solutions have a greater probability of being discarded but may still continue to the next generation to improve gene diversity.
Finally, the crossover and mutation processes constitute the main solution manipulation methods of the GA. The former combines two solutions, while the latter Endogenous stochastic optimisation for relief distribution… alters a solution individually. The probability of occurrence of each process is controlled by parameters p c and p m , respectively. These randomised alterations aid the GA to avoid local optima, allowing it to navigate through complex search spaces.
Solutions in the GA are encoded as two integer arrays indicating truck and drone routes, respectively, where each node and link is given a unique identification value. Assuming that each demand node has to be visited once, the solutions can be structured as the routing sequence of a travelling salesman problem, where the only information contained is the schedule specifying the route sequence. Given a maximum payload allowance, the route sequence is then divided into the separate time steps, as illustrated in Fig. 4. The same structure is followed in the drone routing, which determines the links to explore and discretises the route into time steps given a maximum battery capacity.
While the route followed by the UAV can be generated in a simple manner, given that the UAVs traverse a fully connected graph, trucks are restricted to a pre-defined road network. Therefore, at each time step t, Dijkstra's algorithm (Dijkstra 1959) is used to calculate the shortest paths for each schedule using the estimated damage when b t ij = 0 and actual link damages otherwise. The proposed structure facilitates solution generation and manipulation. Truck route generation consists of shuffling the order of demand nodes and randomly assigning a node to each vehicle. A similar process is followed for drone routing generation, albeit not all links need to be explored.
Crossover exchanges the genes between two points with another solution. Finally, mutation contains generic operations that affect both drone and truck routing, and specific operations applied to each independently.
The generic mutator operators include: inversion, which inverts the order of the vehicle route; self-swap, that exchanges two genes of the vehicle route, and intervehicle-swap, which exchanges a gene with that of a different vehicle of the same solution. Specifically for truck routing, an operator is used to re-order a set of genes using the nearest neighbour heuristic based on the perceived costs. Drone routing mutation contains insertion and deletion steps based on the links most traversed by the trucks.
The application of any of the mentioned processes is randomly selected. Each process contains a probability of activation, namely p v , p s , p v , p n , p i and p d for inversion, self-swap, inter-vehicle-swap, nearest neighbour, insertion and deletion, respectively. A single operation on both truck and drone routing is permitted for every mutation process.

Case study
We demonstrate our approach in a case study based on the road network around Port-au-Prince, the capital of Haiti. The region has been affected by numerous disasters in its history, and was subject to one of the deadliest earthquakes ever recorded in 2010 (Government of the Republic of Haiti 2010).
The case study area contains 25 cities that are interconnected by the highway system, consisting of 66 uni-directional links (as illustrated in Fig. 5). The international airport of Port-au-Prince represents the only port of entry to the network, and is established as the distribution depot. At every time step, UAVs and ground vehicles must leave from and return to this depot, given constraints (2.1), (2.4), (2.7) and (2.10). The humanitarian mission consists of delivering materials from the depot to the 25 cities. It is assumed that a single visit satisfies initial demand necessities, and that payload capacity limits delivery to two nodes in a single trip.
Vehicles must traverse a damaged network. Where damages are unknown, respondents assume a velocity v avg = 40 . Otherwise, links are assigned an average velocity v avg depending on the MMI measurements reported by USAID (USAID 2010), as shown in Table 3. The selected speeds have been previously recorded in post-earthquake environments (Mas et al. 2012;Wisetjindawat et al. 2014).
Damaged link velocities are generated using a normal distribution, where the mean of the distribution coincides with the average velocities shown in Table 3. Note that these do not mean that the vehicle travels at v avg throughout the whole link, but represents the delay caused by the network damages (Table 4).

Sequential algorithm results
We define a smaller problem instance given the limitations of the sequential greedy algorithm outlined in Sect. 4. The eastern region of the network is evaluated, which consists of 10 nodes connected by 22 links (refer to Fig. 6f). As an initial test run, we assume a constant MMI of 7.
We run the computational experiment on a workstation with a 64-bit Intel Core i7-4790 CPU and 8 GB RAM. Python 3 and Gurobi 8.0 were used to model and solve the algorithm. The experiment is repeated for 1000 scenarios, where link 1 3 damages are randomly and independently assigned using a normal distribution. Each scenario runs in 16 s, with some minor variations.
To provide an insight into the solution structures, we present the routing behaviour of the worst-case scenario (largest percentage network damage) in Fig. 6, with a detailed breakdown of the routing decisions presented in Table 5. Note that each direction of the link is assumed to be explored separately. To show the information that trucks have access to, we present UAV explorations at time step t that correspond to UAV deployments at t − 1 . In other words, a UAV deployed at time step 0 is shown in Fig. 6 at time step 1.
It can be observed in most cases the route explored by the UAV is immediately utilised by the ground vehicle in the following time step. The only exception is the use of the UAV to explore link CB-KC at time step 4, which provides an alternate route to enter and exit the depot in case the damage levels of neighbouring links are too high.
We compare the performance of our algorithm with a deterministic approach, where routing decisions are made based on expected link costs, with our results summarised in Fig. 7. We observe that the stochastic approach provides significant travel time savings in all scenarios, with larger differences observed at  A similar behaviour is shown in Fig. 8, which presents the results of a further experiment carried out to compare performance with and without UAV support, considering damage-visibility only based on previous truck routing decisions. In this case, the UAV-supported response also provides shorter travel times and lower deviations under all network damage configurations compared to the unsupported relief distribution, with average travel time savings of 20%.

Genetic algorithm results
The GA is developed using Python 3 and executed on the same workstation. Parameter values of p c = 0.5 , p m = 0.1 and z = 100 are used based on the results of manual tuning, which is presented in the Appendix. Mutation operator parameters are tuned so that all operators have the same probability of being activated. The number of generations is kept constant at 100. The GA runs without an exploration limitthe UAV is capable of exploring multiple links in the same time step.
In terms of total travel time, the GA provides significant performance improvements compared to the sequential greedy algorithm. Reductions between 10 and 20% can be observed in Fig. 9 depending on network damage levels. The main disadvantage of the GA method is the increased variability in solutions as a result of its probability-based search. As an example, the GA provides upper bound solutions at a similar level to the greedy algorithm between 50 and 60% network damage, despite the average travel cost being approximately 20% lower. Figure 10 shows that battery discharge rates increase significantly when using the GA in comparison with the sequential algorithm. The ability to visit multiple links in the same time step triples the total battery consumption compared to the greedy case. This is expected, as logically the exploration of multiple links in a single deployment requires longer flight times, hence greater battery use than the greedy approach. Similar to Fig. 9, the GA algorithm provides a greater solution variability due to its stochastic nature.

UAV performance
In this section, we compare the performance of the GA with and without UAV assistance. To achieve this, we define perceived cost as the cost estimation made by respondents to traverse a certain path given the cumulative knowledge on the Fig. 9 Average routing costs with network damage for the sequential greedy and GA solution methods 1 3 network damages. This means that a path of cost c may be perceived as c v , and if c v < c , the respondents may be routed to a worse path than expected. Figure 11 presents the percentage deviation between the perceived costs and the actual cost for the traversed truck paths. One can see that the drone-assisted respondents have a lower deviation between perceived and actual routing costs than unassisted respondents. In addition, the maximum cost deviations are lowered by 5-8% on average.
The current figure does not take into account how the UAV aids with route selection. Figure 12 presents the cost deviations between the selected and best available paths. As observed, the UAVs provide near 0% deviation from the best path at all damage levels, meaning that the vehicles are able to select the best route possible to each destination in most cases. Furthermore, the average path travel time deviation ranges between 2% and 6%, and the cost deviation variability is also significantly reduced.

Penalty parameter variation
The penalty parameter P quantifies the cost of delaying the delivery of relief at a node. The intended purpose of the parameter is to minimise the mission length. As a result, it affected the routing decisions made by the algorithm, as greater P values incite faster responses. Conversely, this allows less time for the UAV to explore the network, and may result in higher uncertainty encountered by the vehicle. We evaluate the mission length with variations in the P parameter in Fig. 13. We observe that low values of P result in significantly longer mission times (a maximum mission time of 20 days is allowed). The greatest reduction is observed between 50 and 150, were the penalty exceeds the exploration costs.

Large study case
A final set of experiments is carried out that simulates the relief response to the full 25-node network. A penalty parameter P = 500 is selected, as informed by the results in Sect. 5.2.2. Initially, we assume that two vehicles and two drones are deployed simultaneously given the larger network size.
Similarly to the small case study, both the UAV-assisted and the non-assisted problem instance have comparable deviations between the perceived and actual path cost for their chosen itineraries (Fig. 14). However, Fig. 15 shows that smaller deviations are observed when estimating the cost of the best available path. As a result, the UAV-assisted response provides shorter paths. As seen in Fig. 16, the UAVassisted response reduces the cost between the best possible path and the actual path in all damage levels. The constant 2-3% benefit mirrors the difference observed in Fig. 15.
Consequently, the UAV-assisted response not only reduces the average travel time by 10-20% at all damage levels, but also decreases the deviation significantly Fig. 11 Deviation between perceived and actual path cost with and without drone-assisted reconnaissance throughout. In summary, UAV-based reconnaissance generates a benefit to the humanitarian response, as it consistently provides information that allows trucks to select better routes to each destination (Figs. 16,17,18).
When comparing the response with different UAV fleet sizes, we obtain evident improvements when using two UAVs compared to one (Fig. 19), while additional UAVs provide very similar results (Fig. 20). A critical level of knowledge of the network is obtained when two UAVs are used, and adding more UAVs has minimal impact on the optimality of the routing (Table 6).

Discussion
Our analysis has shown that UAVs provide significant improvements to humanitarian response operations by reducing uncertainty in the road network state and informing routing decisions. However, several barriers limit their widespread implementation. In the first instance, current regulations limit UAV flight in BVLOS conditions (Stöcker et al. 2017). Despite permissions being provided by aviation authorities on a case-by-case basis, current processes are slow and would inevitably lag humanitarian operation timelines.
Some simplifying assumptions in our study relate to weather conditions and air space management. Given the low altitude requirements of road damage inspection, Fig. 12 Deviation between chosen and best possible path traversed with and without drone-assisted reconnaissance the latter assumption is reasonable in areas where drone activity is low. However, the rapid increase in drone use will require efficient trajectory management for humanitarian use.
Additionally, our current methodology assumes weather conditions will not impede drone operations, thus not accounting for flight range deviations as a consequence of temperature variability, precipitation and wind. Furthermore, localised weather events in the form of rain or low visibility may reduce routing options at different time steps, forcing the UAV and vehicle to explore alternative regions. This simplification limits the applicability of the algorithm as it cannot capture energy requirements accurately, especially in weather-related disasters.
Our exact approach considers that a single link is explored at every time step to limit the number of scenarios to be optimised. While the assumption may be appropriate when only one UAV is used, given its range limitations and data processing timelines, the use of multiple UAVs for widespread exploration requires a relaxation of this rule. However, traffic management systems and regulations that will permit such operations are in development at the time of writing, and are expected to be operational in the near future (EASA 2015; Kopardekar et al. 2016;Chakrabarty et al. 2019). Thus, we expect that concerns regarding BVLOS drone operations will be addressed in due course by the relevant authorities.
Additionally, extending the UAV range would require the location of recharge hubs. This includes allowing UAVs to depart from and return to different locations, and selecting optimal locations for the recharge hubs. From an algorithmic perspective, the improvements discussed above would increase algorithm runtime exponentially, as any of the relaxations above results in new exploration combinations. In addition, providing non-greedy solutions to this problem would require the development of intricate scenario trees that are dependent on routing decisions made throughout the mission time horizon.

Conclusions
Infrastructure damage remains as one of the key factors delaying relief distribution. To overcome this problem, humanitarian organisations are increasingly relying on UAVs for rapid damage assessment. While UAV routing has been explored by researchers in the context of logistics, military missions and surveillance, this article seeks to formalise and solve the UAV-assisted vehicle routing problem with endogenous uncertainty.
In particular, this algorithm coordinates UAV and truck deployments simultaneously over the course of a relief distribution operation. Trucks are used to deliver relief resources and UAVs serve to assess the network damages, which are assumed to be unknown at the start of the mission. A sequential greedy stochastic optimisation method and an adapted genetic algorithm are applied to a case study based on the 2010 Haiti earthquake. Our analysis indicates that our algorithms provide Fig. 14 Deviation between perceived and actual path cost with and without drone-assisted reconnaissance in 25-node case significant performance gains compared to deterministic methods, which are widely used in the literature.
Technological limitations still hinder widespread adoption of UAVs in humanitarian operations. Their susceptibility to changing weather patterns, as well as maintenance and required access to electricity bring further uncertainty given potential failure states mid-flight. Collision avoidance further adds an additional operational risk. Taking them into account necessitates the development of trajectory management algorithms for humanitarian operations. In the metropolitan context, the urban air mobility (UAM) concept has been introduced to manage UAV operations. These aspects have not been considered in this paper.
The proposed problem and solution method represent a foundation from which future research may be conducted and improve the modelling methodology by relaxing some simplifying assumptions. In the context of earthquake response, aftershocks may induce further network damages, meaning data collected by UAVs may become outdated. Introducing this time-dependent and stochastic damage component would require the development of a complex link damage model.
Further extensions to the algorithm presented in this paper could implement multi-depot operations, expand the case study size, and provide alternative objective functions, such as evaluating service equity and safety. Aspects of coordination can be evaluated by including multiple independent organisations. Another stream of research could seek to reduce the NACs, which would provide significant    improvements to the applicability of exact solutions methods. These will compose our future work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.

Fig. 20
Routing costs with 2 and 3 UAVs in 25-node case Table 6 Summary of results

Topic Findings
Sequential and genetic algorithm Reduction in average travel times of 10-20% by GA. Increase in travel time deviations of 50% by the GA. Increase in battery consumption significantly as GA relaxes the drone exploration constraint UAV-assisted and truck response Reduction of average travel times by 20% using UAVs in both the 10and 25-node networks. Small reduction in perceived and actual path costs when using UAVs. No significant difference is observed in the 25-node network. Reduction of perceived and actual costs deviations between best and chosen paths by using UAVs. In the small network, the deviation approximates 0%, while the large network reduces the deviation by 3% Penalty Parameter Reductions in mission length until P = 500 Number of UAVs in response Reductions of average response travel time by 4% between one and two UAVs used. Optimal number of UAVs used at two, as three UAVs do not provide additional routing savings Finally, the battery relationship is linearised by the introduction of n t ijk in (14.1-14.6), which monitors battery level to the previous node visited. The constraint set presented below modifies equations (2.5) and (2.6).