1 Introduction

Alongside the rise of ‘last-mile’ delivery in contemporary urban logistics, drone delivery has rapidly developed and provides ever-increasing applications, given its outstanding triple-bottom-line performance. Commercially, drone delivery helps reduce labour costs with its pilot-free feature, enables faster travel speed than traditional truck-load delivery, and provides a road network constraint-free transport solution (Agatz et al., 2018). Socially, drone delivery can ease traffic congestion (Dukkanci et al., 2021) and can be used to support post-disaster resilience by providing risk assessment, mapping and temporary communication network creation (Chowdhury et al., 2017; Tang & Veelenturf, 2019). Most importantly, in contrast to conventional means of transport, drone delivery can significantly contribute to companies’ green agenda by reducing greenhouse gas emissions (Goodchild & Toy, 2018). Driven by the merits of drone delivery, an increasing number of drone applications are seen in various business fields. For instance, German logistics giant DHL successfully flew its Parcelcopter 4.0 over Lake Victoria with a 60 km flight in 2018 (DHL, 2018); Amazon has implemented its Prime Air system by deploying drones to service its customers since 2016 (Amazon, 2020); and Chinese e-commerce giant, JD.com, has already established 150 operational sites for delivery drones since 2017 (Unmanned Cargo, 2017).

However, drones are powered by lithium-ion batteries and, like other electric devices, face battery disposal problems, meaning that much of the sustainability benefit offered by drone delivery could be offset. As reported by Forbes (Rapier, 2020), the exploding sales of electric devices, especially vehicles, will result in 11 million metric tons of used lithium-ion batteries reaching the end of their service lives in 2030, while, unfortunately, the global recycling rate for lithium-ion batteries is below 5%. When such an enormous quantity of batteries is disposed of in landfills, the resulting environmental problems (such as air contamination and pollution of water supplies and soil) will be inevitable and severe (Katwala, 2018). As stated by Edge et al. (2021), lithium-ion batteries wear down with each charge, and their maximal capacity degrades gradually until they can no longer be charged. Therefore, optimising the lithium-ion battery charging plan is crucial for prolonging battery lifespan, and will ultimately help mitigate the environmental consequences of battery disposal. This explains the prevalence of relevant initiatives introduced in the electric vehicle industry (e.g., Pelletier et al., 2017; Yang & Sun, 2015).

However, because of the differences stemming from drone batteries and drone-assisted delivery, solutions applied to address battery-related operational issues in the electric vehicle industry have very limited effectiveness for drones. First, drone batteries have much smaller capacity than those of electric vehicles. In general, battery capacities of current electric vehicles range from 30 to 100 kWh, with voltage from 400 to 800 V (i.e., 37,500 to 250,000 mAh) (Pollard, 2020), whereas battery capacities for drones can range from 2700 to 40,000 mAh (Figliozzi et al., 2018). As a result, drones’ service coverage is highly constrained and there is little space for battery performance optimisation. Second, in contrast to electric vehicles, drones do not have a public charging network on which to rely; rather, every operator maintains its own recharging facility to support its operations. Therefore, existing approaches in the electric vehicle sector, such as optimising recharging network design and vehicle recharging decisions with given grid systems, are difficult to directly incorporate.
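The mAh range quoted above for electric vehicles follows from dividing pack energy by nominal voltage. A minimal sketch of that arithmetic is given below; the function name `capacity_mah` is introduced here purely for illustration, and pairing the smallest pack with the highest voltage (and the largest pack with the lowest voltage) is an assumption about how the two bounds were obtained.

```python
def capacity_mah(energy_kwh: float, voltage_v: float) -> float:
    """Convert pack energy (kWh) at a nominal voltage (V) to charge capacity (mAh)."""
    watt_hours = energy_kwh * 1000.0      # kWh -> Wh
    amp_hours = watt_hours / voltage_v    # Wh / V -> Ah
    return amp_hours * 1000.0             # Ah -> mAh


# Assumed pairing of the bounds quoted in the text.
ev_low = capacity_mah(30, 800)    # smallest pack, highest voltage
ev_high = capacity_mah(100, 400)  # largest pack, lowest voltage
print(ev_low, ev_high)
```

Under this pairing, the two results reproduce the 37,500 to 250,000 mAh range given for electric vehicles, against which the 2700 to 40,000 mAh drone range can be compared.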

Although the above issues limit battery optimisation for an individual operator, the battery capacity and charging facility constraints could be alleviated if multiple operators agreed to share their fleets. If one operator’s drones could be dispatched and collected by other operators, the service coverage of this operator could be significantly expanded and more business opportunities could be fulfilled. Blockchain has been proposed as a means of providing a trust-free sharing system that automatically creates an immutable, consensually agreed and publicly available record of past transactions, thereby mitigating trust issues in a peer-to-peer system (Greiner & Wang, 2015; Hastig & Sodhi, 2020). A blockchain refers to a cryptographically secured distributed ledger with a decentralised consensus mechanism (Wang, Wu, et al., 2017). It securely keeps data shared among its users and allows users to transact valuable assets in a public and pseudonymous setup, without reliance on an intermediary or central authority (Glaser, 2017). As Hawlitschek et al. (2018) stated, mutual trust is a fundamental prerequisite for a sharing economy, as it determines the maximum sharing ability (i.e., the sum of shared information and resources from all participants) that such a sharing system can achieve. When blockchain is used in a sharing economy, smart contracts can be created to connect operators and customers, process payments, and collect data under a trustworthy process. Most importantly, blockchain stands out as a sharing platform solution because of its superior data security, and its advantages in ride sharing have already been recognised in practice. For example, the Russian company Darenta implemented a blockchain-enabled car sharing platform that makes information about shared cars and drivers traceable while protecting the associated transactions from cyber-attacks (Darenta, 2021).
In particular, this sharing platform is similar to Airbnb in that customers can lease out their own vehicles or lease vehicles from others, but community trust is not built upon reviews. Because blockchain technology records all on-the-road data completely and shares them transparently, customers who lease out their vehicles need not worry about their assets being mistreated by lessees. On the lessees’ side, blockchain gives them peace of mind because they have full information about the rented capacity, even though it is sourced from someone they do not know. For the proposed drone sharing operation, although fleet capacities are only shared among operators, the purpose of adopting blockchain is the same. Once a blockchain-enabled platform is offered (e.g., by an independent service provider), all drone operators become the ‘customers’ of the platform, and information about each operator and its fleet conditions becomes transparent to every participant. Consequently, trust is enhanced because no operator needs to worry about the condition of the sourced fleet or the safety of its own fleet while it is shared with other operators. Given that drone delivery is highly constrained by load capacity, battery capacity and infrastructure resources, even a small increase in shared resources could make a significant difference. Hence, it makes both economic and practical sense to manage drone operations on a blockchain-enabled sharing system.

In light of the above discussion, this study aimed to support operations managers in facilitating sustainable supply chain development by optimising delivery routes and extending drone battery lifespan through a blockchain-enabled drone sharing approach. The remainder of this paper is organised as follows. Section 2 reviews the relevant literature and identifies the research gaps, while Sect. 3 explores the underlying problems in more detail and formulates them mathematically. Section 4 presents the methodology in detail, and Sect. 5 conducts various numerical experiments to illustrate the novelty of this research. Finally, Sect. 6 discusses the main research findings and highlights directions for future studies.

2 Literature review

Given the scope of this research, this section reviews existing studies on drone routing optimisation and the blockchain-enabled sharing economy to establish the academic significance of this study.

2.1 Drone routing and operation optimisation

Research on drone delivery originally derived from investigations into vehicle routing problems (VRPs). In urban logistics, especially in the last-mile delivery field, the key objective of VRP and its variants is determining a set of optimal routes performed by vehicles with limited capacity and operational constraints to serve a certain group of customers (e.g., Lahyani et al., 2015; Laporte, 2009; Villegas et al., 2013; Wang & Sheu, 2019; Wang, Choi, et al., 2020; Wang, Poikonen, et al., 2017). By incorporating drone delivery to replace or assist the traditional truck-based transport mode, new challenges are arising. For instance, drone service is much more constrained by limited capacity, limited flight range and so forth. When drones operate solely or collaboratively with trucks to serve different customers, existing VRP studies are difficult to apply, as model formulation and the associated solution strategy must be adaptive to the research context (Wang & Sheu, 2019).

In addition, when sustainability is considered for traditional VRP problems (e.g., Govindan et al., 2014, 2019), the focus is minimisation of mileage-related carbon footprint, while, for drone problems, the major sustainability issue is effective battery usage (Katwala, 2018). Therefore, a new research stream has developed to incorporate features of drone delivery in classic VRPs. In particular, San et al. (2016) detailed the process of managing a fleet of heterogeneous drones to serve customers at different locations. The study showed that the key operational challenge of drone application is limited load capacity. As a result, many demands must be strategically fulfilled by multiple trips. Choi and Schonfeld (2017) investigated an automated drone delivery system by considering drone battery capacity, load capacity and flight range constraints. Compared with classic VRP studies, drone-based VRP is challenging because certain drone features significantly limit their service range. Therefore, while the above studies sought to increase the effective flying journey of drones to reduce unnecessary flying distance, another stream of research aims to adopt different initiatives to improve drones’ service coverage.

To expand the service range of drones, Murray and Chu (2015) introduced the ‘travelling salesman’ problem with drone applications (TSP-D), whereby a drone collaborates with a truck to distribute customer parcels at a minimum completion time of two vehicles. The findings summarised two types of drone/truck tandem scheduling problems—the flying sidekick travelling salesman problem (FSTSP) and parallel drone scheduling travelling salesman problem (PDSTSP)—to improve the operation of drone-and-truck coordination. Specifically, the first refers to a drone working in tandem with a delivery truck, while the latter describes a scenario with a distribution centre located in close proximity to customers on top of FSTSP. By applying a route-first, cluster-second heuristic approach, their solution is effective with up to 10 customers. Following a similar methodology, Ha et al. (2018) further improved this heuristic approach with a local search and greedy randomised adaptive search procedure (GRASP) technique, which allowed it to be effective with 50 and 100 customer nodes.

By modelling TSP-D with integer programming, Agatz et al. (2018) proposed both an exact solution and a heuristic to solve the problem. A worst-case approximation guarantee was derived theoretically, and the performance of the heuristic was compared experimentally with that of the exact solution. In a similar context, Wang, Wu, et al. (2017) studied TSP-D with a focus on deriving worst-case bounds for ratios of the total delivery time of different delivery options. Marinelli et al. (2017) further extended FSTSP studies by considering whether drone launch and rendezvous operations can be performed along route arcs. Ham (2018) extended the PDSTSP by using drones for both drop-off and pickup activities. Ulmer and Thomas (2018) considered a heterogeneous fleet and used an approximate dynamic programming solution to solve the TSP-D, with customer orders realised dynamically during each working shift. Yurek and Ozmutlu (2018) provided an approach for solving TSP-D based on decomposition of the problem into two components. Wang and Sheu (2019) further allowed docking hubs in a vehicle routing problem with drones (VRPD), which provided flexibility for coordination between trucks and drones. Moshref-Javadi et al. (2020) modelled a joint routing problem of a truck with one or multiple drones, and designed a hybrid tabu search-simulated annealing algorithm to reduce customer waiting times.

Further, D’Andrea (2014) stated that the energy consumption rate for drones should vary depending on the condition of the payload. This is significant for drones because a drone’s average payload is much higher than its own weight. To enhance the feasibility of drone applications, many existing studies have explored the TSP-D in different situations. In particular, given the non-linear components of energy consumption, some research has employed linear approximation-based techniques to reduce the associated complexity (e.g., Dorling et al., 2017; Jeong et al., 2019; Troudi et al., 2018). Further, to identify the key parameters affecting the energy consumption behaviour of drones, some studies have discussed the similarities and differences among various models (e.g., Kirschstein, 2020; Zhang et al., 2021). Moreover, some research has directly formulated drone energy consumption in a non-linear form and solved the resulting TSP-D exactly (e.g., Dukkanci et al., 2021). All these research streams have shifted the focus of TSP-D problems to energy and battery considerations, and have enhanced the practicability of the research by modelling energy consumption more comprehensively. However, there is a lack of research investigating how operators can effectively extend the use of drone batteries and reduce the effects of battery capacity constraints on drones’ daily operations.

To fill this gap, some recent research introduced battery recharging or replacing facilities to extend the overall length of a single drone flight. For example, Hong et al. (2018) investigated a range-restricted recharging station coverage model for drone delivery service planning. Asadi and Pinkley (2021) studied scheduling, allocation and inventory replenishment at battery swap stations to overcome battery degradation for drones and electric vehicles. Cokyasar (2021) introduced automated battery swapping machines (ABSMs) to the design of a drone-based service network, which is greatly promoted by the United States government. Although the introduction of these facilities can extend the service time of a drone, some important issues have been identified and remain unaddressed. First, existing studies on battery swapping or recharging facilities mainly focused on overall network design, rather than charging optimisation for individual drones. Thus, there is a lack of decision support for deriving a specific drone routing and battery charging plan for daily operations in different situations. Second, the lifespans of lithium-ion batteries vary with how often a battery is recharged and how much charge remains when it is recharged. According to Edge et al. (2021), the less residual charge a battery holds before each charging, the greater its capacity fade. This battery degradation effect can greatly affect drone operations, especially regarding the fixed costs for battery replacements and environmental effects. However, this has not been explored in existing studies. Third, similar to the essence of recharging points and ABSMs, drone fleet sharing can be an approach for extending the flight range of drones, and does not require a huge upfront investment. However, it requires an effective and trustworthy platform to connect all participating operators for collaboration.
Unlike ABSMs, which extend drone flight range but do not alter fleet size, drone sharing can significantly increase each operator’s total capacity and enable more profitable business opportunities. Fourth, considering the significant environmental effects of battery disposal, it is worth including this factor with other economic indicators for joint optimisation. It is both academically and practically significant to reduce the negative effects of battery disposal and drone operational costs through drone routing, recharging and fleet sharing decisions.

2.2 Blockchain-enabled sharing economy

The sharing economy was first introduced by Lessig (2008) and refers to a business model that consumes resources collaboratively through activities of sharing, exchanging and renting, while not owning the goods. Given the merits of a sharing economy in improving asset use rate, cost control ability and operational flexibilities, a growing number of studies from the transportation field have considered its application for operational improvements. For instance, Lu et al. (2017) considered a parking lot allocation problem with fleet sharing and repositioning. Kabra et al. (2020) proposed a structural choice model to analyse the effects of stations and availability of vehicles in a bike sharing system. He et al. (2020) analytically revealed insights for deploying a charging network and operating an electric vehicle sharing system. However, as Nesta (2015) stated, trust is the most critical currency to buy participants’ shared resources on a sharing platform, yet, regardless of which best practices are adopted, it is never guaranteed that transactions between people will be trustworthy. To overcome this problem, blockchain is considered a trust-enhancement mechanism for a sharing economy.

The study by Greiner and Wang (2015) introduced the notion of ‘trust-free’ systems and called for the implementation of blockchain technology to automate an immutable, consensually agreed and publicly available record of past transactions that is governed by the whole system to mitigate trust issues in peer-to-peer systems. This can also help replace traditional contracts with a contractual agreement embedded in the process of the system itself. In this sense, costly mechanisms to build trust in intermediaries or interpersonal trust are rendered obsolete by design (Greiner & Wang, 2015). Later research by Fröwis and Böhme (2017) examined how such a smart contract structure contributes to a trust-free system through integrity over time. Pazaitis et al. (2017) used the illustrative case of Backfeed to reveal the potential of using blockchain to better support the dynamics of social sharing, and build a theoretical framework to assess and distribute social values. Previous research has also considered the use of blockchain to enhance platform connection ability for computational resource-sharing systems, such as grid computing and cloud computing (Hong et al., 2017; Stanciu, 2017; Teslya & Smirnov, 2018).

In all of the above studies, blockchain as a trust-enhancement solution for the sharing economy was mainly explored at a strategic or tactical level, while its implications at the operational level were barely discussed. In particular, the question of how the performance of a specific application could be boosted by blockchain was not evaluated quantitatively, and the question of how operational decisions should be adjusted accordingly was not answered. Hua et al. (2018) proposed a blockchain-based solution for battery refuelling. The lifetime information for all batteries was recorded in the blockchain network, so that owners of electric vehicles and station grids were guaranteed fair transactions for battery sharing. Although their research applied blockchain technology to construct a battery refuelling application, it did not examine how operational decision making was affected accordingly.

Given the above research gaps, this paper aimed to address the challenges of drone routing problems through a blockchain-enabled fleet sharing solution. The key contributions are threefold. First, this is the first drone routing research considering fleet sharing and the effectiveness of using blockchain technology for sharing ability enhancement. Second, this study considers the drone battery degradation effect together with routing and charging decisions. Third, this study is the first to consider the joint optimisation of operational costs and environmental effects.

3 Problem formulation

This section presents the problem in detail and proposes a mixed-integer programming formulation to achieve the research objectives. This approach will help operators explicitly minimise battery-related costs and other operational costs for sustainable supply chain development. The proposed model also considers battery-related features as constraints in formulating the problem.

3.1 Model description

This study considers a drone routing problem (DRP) with blockchain-enabled fleet sharing among multiple operators, which we define as a DRPBFS problem. In particular, operators in DRPBFS manage the same type of drones and serve their customers with the same type of products. Demands from customers are shared with each operator, and every single piece of demand is framed with a time window, which specifies how early and how late each customer would like to accept delivery. To serve customers within the given time window, every operator owns a depot in which commodities and drones are kept, and, from there, the operator must decide a cost-effective plan about which drone will be deployed for a specific demand with a certain route at a specific time point. Among the cost components, the composition of battery costs is particularly complicated. Specifically, when a rechargeable battery runs from fully charged until the next recharging, this is labelled one charge cycle, and every lithium-ion battery has a cycle-to-failure (CTF) value. This value indicates the maximal number of cycles that a battery lasts and, once it is exhausted, operators must purchase a new battery for replacement, and the old one will be disposed of. However, a drone battery’s CTF value is not fixed, but varies depending on the depth-of-discharge (DOD) level for each charge cycle. The DOD level indicates how much of a battery’s charge is used before it is recharged. According to Jeong et al. (2015), the CTF value of a battery should exponentially decrease as DOD increases, which is also known as the battery wear effect or battery degradation.

Therefore, although the cost of purchasing a new battery is fixed, the capital cost associated with each charge cycle can differ and is largely determined by the DOD level reached before the battery is charged again. In this regard, operators must balance CTF consumption against the DOD level in each cycle and make decisions that consider both drone routing and battery wear. This is a complicated yet important issue: even though draining a battery allows more demands to be fulfilled in one flight, it accelerates battery failure. Conversely, recharging a battery at a low DOD level makes less use of each charge cycle, yet slows the capacity loss of the drone battery in the long run. This study aims to address these issues by introducing a blockchain-enabled fleet sharing platform. In this way, a drone does not need to return to its departure depot for reloading and battery charging. Instead, it can fly to another operator’s depot, where it is swapped for a fully charged drone that continues the work. This enables the shift from independent business operations to collaboration, and operators can integrate all information to optimise their decisions about which routes the shared drones will use, and when and where to recharge them, under the new context. Moreover, different sharing protocols create different levels of trust in the corresponding sharing system. When operators have doubts about the shared resources (e.g., the condition of shared drones, or whether other operators have fully revealed their demand information), they will also discount the level of their own resources that they share with others. When blockchain technology is incorporated as the sharing protocol, all information can be securely accessed and operators will have maximum willingness to share their resources. Hence, the maximum sharing ability will be provided by the underlying sharing platform.
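To illustrate this trade-off numerically, the sketch below evaluates the CTF polynomial of Eq. (2), using the coefficients reported in Sect. 3.2, together with the battery cost attributed to a single charge cycle, \((1+\delta )F/{f}_{CTF}(y)\), as it appears in objective (1). The purchase cost `F = 100` and the disposal ratio `delta = 0.2` are hypothetical values chosen only for illustration, and the function names are not taken from the paper.

```python
# Coefficients of Eq. (2) as reported in Sect. 3.2 (Jeong et al., 2015).
A0, A1, A2, A3 = -4790.0, 7427.0, -1077.0, 55.4


def ctf(dod: float) -> float:
    """Cycle-to-failure value f_CTF(y) for a depth of discharge y in (0, 1]."""
    if not 0.0 < dod <= 1.0:
        raise ValueError("DOD must lie in (0, 1]")
    return A0 + A1 / dod + A2 / dod**2 + A3 / dod**3


def cost_per_cycle(dod: float, purchase_cost: float, disposal_ratio: float) -> float:
    """Battery cost charged to one cycle: (1 + delta) * F / f_CTF(y)."""
    return (1.0 + disposal_ratio) * purchase_cost / ctf(dod)


F, delta = 100.0, 0.2  # ASSUMPTION: illustrative values, not from the paper
for dod in (0.2, 0.5, 0.8, 1.0):
    print(f"DOD={dod:.1f}  CTF={ctf(dod):9.1f}  "
          f"cost/cycle={cost_per_cycle(dod, F, delta):.4f}")
```

With these coefficients, the polynomial decreases over the whole interval (0, 1], so a full discharge (DOD = 1) leaves about 1615 cycles and carries the highest per-cycle battery cost among the levels printed.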

Based on the above narrative, the following assumptions are made to keep the problem tractable:

  1. Every customer is served only once.

  2. The travel speed of each drone is assumed to be fixed.

  3. The weight of each customer demand will be no greater than the maximal capacity of one drone.

  4. Every depot owns the same type of drones and has the same commodities for service.

  5. Each drone can serve multiple customers within one flight if its capacity allows.

  6. Each drone can depart and land at any depot.

  7. There is no need for a drone to have the same departure and arrival depot in one flight.

  8. By the end of the overall planning horizon, the drone fleet size for each depot should remain the same as its initial amount.

  9. Drones always depart from depots with a full battery.

  10. The use of blockchain can avoid all trust issues and maximise the sharing ability of a sharing platform.

The assumptions are justified as follows. Assumptions 1 to 5 are commonly used in DRP-related literature (e.g., Chen et al., 2020). Although drone fleet sharing is barely discussed in the existing literature, this practice is not uncommon in collaborative VRP studies (Nataraj et al., 2019; Yao et al., 2019). Similar to research from Wang, Li, et al. (2020), under a fleet sharing context, Assumptions 6 to 8 provide every operator equal access to the shared fleet and help form the foundation for collaboration. Assumption 9 reflects a common industrial practice. Assumption 10 is supported by different literature (e.g., Greiner & Wang, 2015; Pazaitis et al., 2017).

3.2 Notations and problem formulation

The notation below illustrates all the relevant sets, parameters and decision variables of the proposed model.

Notations

Sets

\(O\): Set of depots (operators)

\(C\): Set of customers

\({C}^{S}\): Subset of customers that can be shared among multiple depots

\({C}^{i}\): Subset of customers that can only be served by drones dispatched from the i-th depot node, \(\forall i\in O\)

\(D\): Set of drones

\(A\): Set of arcs

\(N\): Set of nodes

Indices

\(i,j,l\): Indices of node set \(N\)

\((i,j)\): Index of arc set \(A\)

\(d\): Index of drone set \(D\)

Parameters

\(G=(N,A)\): Directed graph with node set \(N\) and arc set \(A\), where \(N=C\cup O\) and \(A=\left\{\left(i,j\right)\mid i,j\in N,i\ne j\right\}\)

\(m\): Cardinality of the customer set (\(m=|C|\))

\(n\): Cardinality of the operator set (\(n=|O|\))

\({\overline{T} }_{ij}\): Travel time for arc \(\left(i,j\right)\), \(\forall (i,j)\in A\)

\({C}_{ij}\): Travel cost for arc \(\left(i,j\right)\), \(\forall \left(i,j\right)\in A\)

\({Q}_{i}\): Demand of customer node \(i\), \(\forall i\in C\)

\({T}_{i}^{LA}\): Latest arrival time requested by customer node \(i\), \(\forall i\in C\)

\({T}_{i}^{ED}\): Earliest departure time allowed by customer node \(i\), \(\forall i\in C\)

\({K}_{i}\): Number of drones owned by depot node \(i\), \(\forall i\in O\)

\(B\): Maximal load capacity of a drone

\(S\): Speed of a drone

\(E\): Maximal flight duration of a drone

\(F\): Purchase cost of a new battery

\(\delta \): Ratio of battery disposal cost to purchase cost

\(M\): Sufficiently large number

State variables

\({y}^{d}\): Depth of discharge (\(DOD\)) of drone \(d\) before it is charged again, \(\forall d\in D\)

\(CTF\): Drone battery cycle-to-failure value

\({e}_{i}^{d}\): Cumulative flight time when drone \(d\) arrives at node \(i\), \(\forall i\in N, d\in D\)

\({t}_{i}^{d}\): Arrival time of drone \(d\) at node \(i\), \(\forall i\in N, d\in D\)

\({v}_{i}^{d}\): Equals 1 if drone \(d\) is dispatched from depot \(i\) (\(\forall i\in O\)), and 0 otherwise

Decision variables

\({x}_{ij}^{d}\): Equals 1 if drone \(d\) is deployed on arc \(\left(i,j\right)\), and 0 otherwise, \(\forall (i,j)\in A, d\in D\)

The problem is first mapped out in a directed graph \(G=(N,A)\), where \(N=C\cup O\) and \(A=\left\{\left(i,j\right)\mid i,j\in N,i\ne j\right\}\). In particular, node set \(N\) contains both customer and depot nodes and arc set \(A\) contains all arcs between distinct nodes. Further, the customer set is denoted as \(C=\left\{1,2,\dots ,m\right\}\), and is further partitioned as \(C={C}^{S}\cup \bigcup_{i\in O}{C}^{i}\). This indicates that subset \({C}^{S}\) contains customers who can be served by shared resources, whereas subset \({C}^{i}\) contains customers who can only be served by the appointed depot \(i\) (\(\forall i\in O\)). The depot set is denoted \(O=\left\{m+1,m+2,\dots ,m+n\right\}\), where \(m\) is the number of customers (i.e., \(m=|C|\)) and \(n\) is the number of depots (i.e., \(n=|O|\)). With regard to each customer node \(i\) (\(\forall i\in C\)), \({Q}_{i}\) represents its demand, which is associated with a hard time window \(\left[{T}_{i}^{LA},{T}_{i}^{ED}\right]\). For each depot node \(i\) (\(\forall i\in O\)), drones can be dispatched to serve customers and returned for charging, and all available drones are denoted as set \(D=\left\{1,2,\dots ,\left|D\right|\right\}\), where \(d\) is the index of this set (i.e., \(\forall d\in D\)). Benefitting from the blockchain-enabled fleet sharing, drones can be deployed from and returned to any depot if that is more cost effective; however, for asset management purposes, every depot should hold the same number of drones as its initial stock by the end of the planning horizon. Further, the load capacity of each drone is \(B\), and every single customer demand should be no more than a full drone load (i.e., \(B\ge {Q}_{i},\forall i\in C\)). As stated by Assumption 2, drones travel over each arc \((i,j)\) (\(\forall (i,j)\in A\)) with speed \(S\), and, given the triangle inequality, the shortest path between any two nodes \(i\) and \(j\) in a service area is always the straight Euclidean segment from \(i\) to \(j\).
Therefore, the corresponding travel time for arc \((i,j)\) can be denoted by \({\overline{T} }_{ij}\) (e.g., Chen et al., 2020) and the associated cost is \({C}_{ij}\). Without loss of generality, we also do not consider the service time of each customer, since it can be implicitly indicated by \({\overline{T} }_{ij}\) as well. Accordingly, formulation of the original problem (OP) is given below:

$$\mathrm{min}\sum_{d\in D}\frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}\sum_{\left(i,j\right)\in A:i\in O}{x}_{ij}^{d}+\sum_{d\in D}\sum_{\left(i,j\right)\in A}{C}_{ij}{x}_{ij}^{d}$$
(1)

s.t.

$${f}_{CTF}\left({y}^{d}\right)={a}_{0}+{a}_{1}{{y}^{d}}^{-1}+{a}_{2}{{y}^{d}}^{-2}+{a}_{3}{{y}^{d}}^{-3}, \forall d\in D$$
(2)

where \({a}_{0}=-4790\), \({a}_{1}=7427\), \({a}_{2}=-1077\) and \({a}_{3}=55.4\), respectively:

$$\sum_{d\in D}\sum_{\left(i,j\right)\in A}{x}_{ij}^{d}=1, \qquad\forall i\in C$$
(3)
$$\sum_{d\in D}\sum_{\left(i,j\right)\in A}{x}_{ij}^{d}\le {K}_{i},\qquad \forall i\in O$$
(4)
$$\sum_{\left(i,j\right)\in A:i\in O}{x}_{ij}^{d}\le 1, \qquad\forall d\in D$$
(5)
$$\sum_{\left(i,j\right)\in A}{x}_{ij}^{d}-\sum_{\left(j,i\right)\in A}{x}_{ji}^{d}=0, \qquad\forall i\in C, d\in D$$
(6)
$$\sum_{d\in D}\sum_{\left(i,j\right)\in A}{x}_{ij}^{d}-\sum_{d\in D}\sum_{\left(j,i\right)\in A}{x}_{ji}^{d}=0,\qquad \forall i\in O$$
(7)
$${v}_{i}^{d}={x}_{ij}^{d},\qquad \forall i\in O, j\in C,d\in D$$
(8)
$${x}_{jl}^{d}\le M\left(1-{v}_{i}^{d}\right), \qquad\forall i\in O,\; j,l \in C\backslash \left({C}^{S}\cup {C}^{i}\right),\; d\in D$$
(9)
$$M\left({x}_{ji}^{d}-1\right)+{\overline{T} }_{ji}\le {t}_{i}^{d}, \qquad\forall j\in O,i\in C, d\in D$$
(10)
$$M\left({x}_{ij}^{d}-1\right)+{t}_{i}^{d}+{\overline{T} }_{ij}\le {t}_{j}^{d},\qquad \forall i\in C, j\in N, d\in D$$
(11)
$${{T}_{i}^{LA}\le t}_{i}^{d}\le {T}_{i}^{ED},\qquad \forall i\in C, d\in D$$
(12)
$$M\left({x}_{ji}^{d}-1\right)+{\overline{T} }_{ji}\le {e}_{i}^{d}, \qquad\forall j\in O,i\in C, d\in D$$
(13)
$$M\left({x}_{ij}^{d}-1\right)+{e}_{i}^{d}+{\overline{T} }_{ij}\le {e}_{j}^{d}, \qquad\forall i\in C, j\in N, d\in D$$
(14)
$${\frac{{e}_{i}^{d}}{E}\le y}^{d}\le 1, \qquad\forall i\in O, d\in D$$
(15)
$$\sum_{\left(i,j\right)\in A}{Q}_{i}{x}_{ij}^{d}\le B, \qquad\forall d\in D$$
(16)
$$0\le {e}_{i}^{d}\le E, \qquad\forall i\in O, d\in D$$
(17)
$${x}_{ij}^{d}\in \left\{\mathrm{0,1}\right\}, \qquad\forall d\in D, \left(i,j\right)\in A$$
(18)

Equation (1) presents the objective function of the OP model. It minimises the sum of two cost components: the first term is the sum of battery-related costs (i.e., new replacement costs and sustainable costs) for all dispatched drones, and the second term is the sum of travelling costs for all drones over all arcs. To investigate the effects of the battery charging decision on battery lifespan, Eq. (2) is a logarithmic polynomial function that describes the relationship (depicted in Fig. 1) between battery depth of discharge (\(DOD\)) and battery cycle-to-failure (\(CTF\)) (Jeong et al., 2015). Equation (3) guarantees that every customer is served exactly once. Equation (4) enforces that the number of drones dispatched from each depot does not exceed the number available there. Equation (5) indicates that each drone is dispatched at most once, so not all drones need to be used. Equations (6) and (7) balance the flow at customer and depot nodes. Equation (8) identifies the departure depot for a specific drone. Equation (9) forbids the drone from visiting the unshared customer nodes belonging to other depots. By introducing a sufficiently large constant \(M\), we can linearise the ‘if–then’ constraint: if drone \(d\) is dispatched from depot \(i\), then drone \(d\) does not visit customer \(l\). In practice, the domains of the associated variables can be used to calibrate the value of the big \(M\). For constraints (9), the variables \({x}_{jl}^{d}\) and \({v}_{i}^{d}\) both take values in \(\left\{0,1\right\}\), so their upper bounds are both 1. We can therefore set \(M\) to any number greater than or equal to 1, such as 1, which ensures that the ‘if–then’ constraints are not violated. Constraints (10) can be rewritten as \({\overline{T} }_{ji}-{t}_{i}^{d}\le M\left(1-{x}_{ji}^{d}\right), \forall j\in O,i\in C, d\in D\).
Since \({t}_{i}^{d}\ge {T}_{i}^{LA}\) and \({x}_{ji}^{d}\in \left\{0,1\right\}\), setting \(M\) such that \(\underset{\forall j\in O,i\in C}{\mathrm{max}}\left\{{\overline{T} }_{ji}-{T}_{i}^{LA}\right\}\le M\) ensures that the ‘if–then’ constraints are not violated. Constraints (11) can be rewritten as \({t}_{i}^{d}+{\overline{T} }_{ij}-{t}_{j}^{d}\le M\left(1-{x}_{ij}^{d}\right), \forall i\in C, j\in N, d\in D\). Since \({t}_{i}^{d}\ge {T}_{i}^{LA}\), \({t}_{j}^{d}\le {T}_{j}^{ED}\) and \({x}_{ij}^{d}\in \left\{0,1\right\}\), setting \(M\) such that \(\underset{\forall i\in C, j\in N}{\mathrm{max}}\left\{{\overline{T} }_{ij}-{T}_{i}^{LA}+{T}_{j}^{ED}\right\}\le M\) ensures that the ‘if–then’ constraints are not violated. Equations (10) to (12) ensure that the time window of each customer is not violated. Equations (13) and (14) describe flow conservation in terms of the drone’s energy. The value of \(M\) is calibrated in the same way as for constraints (10)–(11). For constraints (13), \(M\) needs to satisfy \(\underset{\forall j\in O,i\in C}{\mathrm{max}}\left\{{\overline{T} }_{ji}\right\}\le M\). For constraints (14), \(M\) needs to satisfy \(\underset{\forall i\in C, j\in N}{\mathrm{max}}\left\{{\overline{T} }_{ij}+E\right\}\le M\). Equation (15) specifies the value range of \(DOD\) for a drone battery. Equation (16) ensures that the load of each drone is within its maximal capacity. Equations (17) and (18) define the domains of the decision variables.

Fig. 1 Relationship between \(CTF\) and \(DOD\)
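As a quick numerical check of Eq. (2) and the first term of Eq. (1), the following minimal sketch evaluates the cycle-to-failure polynomial and the resulting battery-related cost of one flight. The function names are illustrative only, and the default values \(F=500\) and \(\delta =2\) are assumptions borrowed from the experimental settings reported in Sect. 5.1.1.

```python
# Coefficients of Eq. (2) as given in the text.
A0, A1, A2, A3 = -4790.0, 7427.0, -1077.0, 55.4


def cycles_to_failure(dod: float) -> float:
    """Eq. (2): battery cycles-to-failure for a depth of discharge in (0, 1]."""
    if not 0.0 < dod <= 1.0:
        raise ValueError("depth of discharge must lie in (0, 1]")
    inv = 1.0 / dod
    return A0 + A1 * inv + A2 * inv**2 + A3 * inv**3


def battery_cost_per_flight(dod: float, purchase_cost: float = 500.0,
                            delta: float = 2.0) -> float:
    """Battery-related cost of one dispatched drone: (1 + delta) * F / f_CTF(y)."""
    return (1.0 + delta) * purchase_cost / cycles_to_failure(dod)
```

Under these assumed parameter values, a full discharge (\(DOD=1\)) gives about 1615 cycles and a cost of roughly 0.93 per flight, whereas \(DOD=0.5\) gives about 6199 cycles and a cost of roughly 0.24 per flight, which illustrates why shallower discharges reduce the battery-related term.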

Proposition 1

The DRPBFS is an NP-complete problem.

Proof DRPBFS reduces to a vehicle routing problem with time windows (VRPTW) if the network contains only one depot and the deployed drones have no flight duration constraints; that is, the VRPTW is a special case of DRPBFS. This completes the proof.

4 Solution strategy

This section introduces the methodology used to solve the problem. As an NP-complete problem (Proposition 1), DRPBFS is difficult to solve directly with commercial optimisation software, such as IBM CPLEX or Gurobi, especially when the problem scale is relatively large. To enhance the practicability of this study, we propose a new method that solves the problem more efficiently.

4.1 OP reformulation

Given the battery wear effect, the corresponding cost function brings a non-linear term into the objective function of the OP formulation. To simplify the solution process, we first reformulate OP to remove the non-linear component. In particular, we introduce a new set of auxiliary variables, \({z}^{d} (d\in D)\), to replace the non-linear part, \(\sum_{d\in D}\frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}\sum_{\left(i,j\right)\in A:i\in O}{x}_{ij}^{d}\), in the OP formulation. To ensure the new formulation is equivalent to OP, a series of new constraints are added based on the McCormick envelope (Lundell et al., 2009). The details of this new formulation are presented below.

P2:

$$\mathrm{min}\sum_{d\in D}{z}^{d}+\sum_{d\in D}\sum_{\left(i,j\right)\in A}{C}_{ij}{x}_{ij}^{d}$$
(19)
$$Eq. (3)-(18)$$
(20)
$${z}^{d}\le M\sum_{\left(i,j\right)\in A:i\in O}{x}_{ij}^{d}, \qquad\forall d\in D$$
(21)
$$\frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}-M\left(1-\sum_{\left(i,j\right)\in A:i\in O}{x}_{ij}^{d}\right){\le z}^{d}\le \frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}, \qquad\forall d\in D$$
(22)
$${z}^{d}\ge 0, \qquad\forall d\in D$$
(23)

Equation (20) is a copy of the constraints from OP. With the newly added Eq. (21) and (22), the new formulation P2 no longer contains any non-linear term. For constraint (21), as \({z}^{d}\le \frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}\) and \({x}_{ij}^{d}\in \left\{\mathrm{0,1}\right\}\), if we set \(M\) satisfying \(\underset{{0\le y}^{d}\le 1}{\mathrm{max}}\left\{\frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}\right\}\le M,\) the ‘if–then’ constraints will not be violated. For constraints (22), as \({z}^{d}\ge 0\) and \({x}_{ij}^{d}\in \left\{\mathrm{0,1}\right\}\), if we set \(M\) satisfying \(\underset{{0\le y}^{d}\le 1}{\mathrm{max}}\left\{\frac{\left(1+\delta \right)F}{{f}_{CTF}\left({y}^{d}\right)}\right\}\le M,\) the ‘if–then’ constraints will not be violated. Equation (23) defines the domains of \({z}^{d}\).
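To see how Eqs. (21)–(23) pin down \({z}^{d}\), the following sketch computes the interval of values that \({z}^{d}\) may take for a given dispatch indicator and battery cost. It is only an illustration: the function name and arguments are hypothetical, and \(M\) is assumed to be at least as large as the largest possible battery cost.

```python
def z_bounds(dispatched: int, cost: float, big_m: float):
    """Feasible interval (lower, upper) for z^d under Eqs. (21)-(23).

    dispatched: value of the sum of x_ij^d over arcs leaving a depot (0 or 1).
    cost: (1 + delta) * F / f_CTF(y^d) for the current y^d.
    big_m: the constant M, assumed to be no smaller than any possible cost.
    """
    lower = max(0.0, cost - big_m * (1 - dispatched))  # Eq. (22) left side and Eq. (23)
    upper = min(big_m * dispatched, cost)              # Eq. (21) and Eq. (22) right side
    return lower, upper
```

When the drone is dispatched, the interval collapses to the battery cost itself; when it is not, the interval collapses to zero, so the product of the cost and the dispatch indicator is reproduced without a bilinear term.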

4.2 Branch-and-price algorithm

Based on P2, we further reformulate it to a path-based model that includes both the master problem (MP) and sub-problem (SP), so a branch-and-price-based algorithm can be developed.

4.3 Dantzig-Wolfe (DW) decomposition

In P2, only Eqs. (3), (4) and (7) (originally from OP) link all drones together. Following the principle of DW decomposition, they are therefore the constraints retained in the master problem of the path-based reformulation (P3). In DRPBFS, a drone path is defined as a path that starts and ends at a depot (not necessarily the same one) and visits a series of customers in between. The MP of P3 can then be obtained with the support of some new parameters and decision variables:

Sets

\(\Omega \): Set of feasible drone paths, indexed by \(k\)

\({\Omega }_{1}\): Subset of \(\Omega \)

Parameters

\({c}^{k}\): Travel cost of drone path \(k\), \(\forall k\in\Omega \)

\({a}_{ij}^{k}\): Equals 1 if path \(k\) goes through the arc \((i,j) \in A\), and 0 otherwise

\({z}^{k}\): Battery-related cost for path \(k\), \(\forall k\in\Omega \)

Decision variables

\({\theta }_{k}\): Equals 1 if path \(k\) is selected, and 0 otherwise

P3-MP:

$$\mathrm{min}\sum_{k\in\Omega }\left({z}^{k}+{c}^{k}\right)\cdot {\theta }_{k}$$
(24)

s.t.

$$\sum_{k\in\Omega }\sum_{\left(i,j\right)\in A}{a}_{ij}^{k}{\theta }_{k}\ge 1, \qquad\forall i\in C$$
(25)
$$\sum_{k\in \Omega }\sum_{\left(i,j\right)\in A}{a}_{ij}^{k}{\theta }_{k}\le {K}_{i}, \qquad\forall i\in O$$
(26)
$$\sum_{k\in \Omega }\sum_{\left(i,j\right)\in A}{a}_{ij}^{k}{\theta }_{k}-\sum_{k\in \Omega }\sum_{\left(j,i\right)\in A}{a}_{ji}^{k}{\theta }_{k}=0, \qquad\forall i\in O$$
(27)
$${\theta }_{k}\in \left\{\mathrm{0,1}\right\}, \qquad \forall k\in\Omega $$
(28)

Equation (24) is the objective function that minimises both travelling- and battery-related costs for all selected paths. Equation (25) indicates that every demand should be served. To accelerate convergence, it is written as an inequality, which loosens the constraint. Each customer will nevertheless not be visited more than once, because additional visits are penalised by the objective function under the triangle inequality. Equation (26) constrains the maximum number of drones that can be used at each depot. Equation (27) represents the flow conservation at each depot. Equation (28) defines the domain of the decision variable \({\theta }_{k}\). By relaxing \({\theta }_{k}\) to a continuous variable and restricting the columns to a subset \({\Omega }_{1}\) of \(\Omega \), we obtain the linearly relaxed restricted master problem (RLMP). In addition, we use the nearest neighbour (NN) heuristic to generate the initial solution for the RMP.

Parameters

\({\overline{c} }_{ij}\): Reduced cost of arc \((i,j) \in A\)

Decision variables

\(z\): Battery-related cost for drone path

\({t}_{i}\): Time when drone arrives at customer \(i\)

\({e}_{i}\): Cumulative flight time when drone arrives at customer \(i\)

\(y\): Depth of discharge (\(DOD\)) of drone

\({s}_{ij}\): Equals 1 if the arc \((i,j) \in A\) is selected, and 0 otherwise

State variable

\({v}_{i}\): Equals 1 if depot is the departure for a route, and 0 otherwise

Next, we present the pricing strategy to formulate the sub-problem (P3-SP) with the support of the above parameters and decision variables. Let \(\uplambda \), \(\mu \) and \(\nu \) be the dual vectors that are related to Eq. (25) to (27) (the corresponding dual model is in Appendix B). Then the reduced cost of a drone path is:

$$\overline{c}= z+\sum_{\left(i,j\right)\in A:i\in O}\left({C}_{ij}-{\mu }_{i}-{\nu }_{i}\right){a}_{ij}^{k}+\sum_{\left(i,j\right)\in A:i,j\in C}\left({C}_{ij}-{\lambda }_{i}\right){a}_{ij}^{k}+\sum_{\left(i,j\right)\in A:j\in O}\left({C}_{ij}+{\nu }_{j}-{\lambda }_{i}\right){a}_{ij}^{k}$$
(29)

If we exclude the battery cost \(z\) from the above equation, the reduced cost of a drone path can be further decomposed by arc, as given in Eq. (30):

$${\overline{c} }_{ij}=\left\{\begin{array}{ll}{C}_{ij}-{\mu }_{i}-{\nu }_{i} ,& i\in O\\ {C}_{ij}{-\lambda }_{i} ,& i,j\in C\\ {C}_{ij}+{\nu }_{j}{-\lambda }_{i},& j\in O\end{array}\right.$$
(30)

For a given arc \(\left(i,j\right)\) from a drone path, \({\overline{c} }_{ij}\) represents an integrated form of the reduced costs (with \(z\) excluded) based on the types of its corresponding arc nodes. If this is used to replace the second, third and fourth term in Eq. (29), the sub-problem of P3 can be obtained as below:

P3-SP:

$$\mathrm{min}z+\sum_{\left(i,j\right)\in A}{\overline{c} }_{ij}{s}_{ij}$$
(31)

s.t.

$$\sum_{\left(i,j\right)\in A:i\in O}{s}_{ij} =\sum_{\left(j,i\right)\in A:i\in O}{s}_{ji} =1$$
(32)
$$\sum_{\left(i,j\right)\in A}{s}_{ij}-\sum_{\left(j,i\right)\in A}{s}_{ji}=0 , \forall i\in C$$
(33)
$${v}_{i}={s}_{ij},\forall i\in O, j\in C$$
(34)
$${s}_{jl}\le M(1-{v}_{i}),\forall i\in O, j,l \in {C\backslash (C}^{S}\cup {C}^{i})$$
(35)
$$M\left({s}_{ji}-1\right)+{\overline{T} }_{ji}\le {t}_{i} , \forall j\in O,i\in C$$
(36)
$$M\left({s}_{ij}-1\right)+{t}_{i}+{\overline{T} }_{ij}\le {t}_{j} , \forall i\in C,j\in N$$
(37)
$${T}_{i}^{LA}\le {t}_{i}\le {T}_{i}^{ED} , \forall i\in C$$
(38)
$$M\left({s}_{ji}-1\right)+{\overline{T} }_{ji}\le {e}_{i}, \forall j\in O,i\in C$$
(39)
$$M\left({s}_{ij}-1\right)+{e}_{i}+{\overline{T} }_{ij}\le {e}_{j}, \forall i\in C, j\in N$$
(40)
$$0\le {e}_{i}\le E, \forall i\in O$$
(41)
$$\frac{{e}_{i}}{E}\le y\le 1, \forall i\in O$$
(42)
$$\sum_{\left(i,j\right)\in A}{Q}_{i}{s}_{ij}\le B$$
(43)
$$z\le M\sum_{\left(i,j\right)\in A:i\in O}{s}_{ij}$$
(44)
$$\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}-M\left(1-\sum_{\left(i,j\right)\in A:i\in O}{s}_{ij} \right)\le z\le \frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}$$
(45)
$$z\ge 0$$
(46)
$${s}_{ij}\in \left\{\mathrm{0,1}\right\},\forall \left(i,j\right)\in A$$
(47)

Lemma 1. The battery cost \(z\) for a drone path is \(\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}\).

Proof.

With the help of Eq. (32), the component \(\sum_{\left(i,j\right)\in A:i\in O}{s}_{ij}\) in both Eq. (44) and (45) can be dropped and replaced by 1. Therefore, Eq. (44) and (45) will be equivalent to the following inequality:

$$\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}\le z\le \frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}$$
(48)

Equation (48) is further equivalent to \(z=\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}\). This completes the proof. Hence, P3-SP can be rewritten as:

P4-SP:

$$\mathrm{min}\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}+\sum_{\left(i,j\right)\in A}{\overline{c} }_{ij}{s}_{ij}$$
(49)

s.t.

$$Eq. \,\,(32)-(43), (46)-(47)$$
(50)

Equation (49) is the objective function that minimises the reduced cost of a column for a drone path. Equation (50) copies the constraints from the initial formulation of P3-SP. In particular, Eq. (32) indicates that a drone can be dispatched from and returned to any depot. Equation (33) ensures flow conservation at each customer node. Equations (34) and (35) forbid a route from combining a departure depot with unshared customer nodes belonging to other depots. Equations (36) to (38) enforce the time window requirement at each customer node. For constraints (35)–(37), the values of \(M\) are set in the same way as for constraints (9)–(11), respectively; that is, they need to satisfy \(1\le M\), \(\underset{\forall j\in O,i\in C}{\mathrm{max}}\left\{{\overline{T} }_{ji}-{T}_{i}^{LA}\right\}\le M\) and \(\underset{\forall i\in C, j\in N}{\mathrm{max}}\left\{{\overline{T} }_{ij}-{T}_{i}^{LA}+{T}_{j}^{ED}\right\}\le M\), respectively. Equations (39) to (43) ensure the feasibility of flight duration, load capacity and \(DOD\). For constraints (39)–(40), the values of \(M\) are set in the same way as for constraints (13)–(14), respectively; that is, they need to satisfy \(\underset{\forall j\in O,i\in C}{\mathrm{max}}\left\{{\overline{T} }_{ji}\right\}\le M\) and \(\underset{\forall i\in C, j\in N}{\mathrm{max}}\left\{{\overline{T} }_{ij}+E\right\}\le M\), respectively. For constraints (44)–(45), the value of \(M\) is set in the same way as for constraints (21)–(22), i.e., \(\underset{0\le y\le 1}{\mathrm{max}}\left\{\frac{\left(1+\delta \right)F}{{f}_{CTF}\left(y\right)}\right\}\le M\). Equations (46) and (47) define the domains of the decision variables.
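The arc-wise reduced cost of Eq. (30), which enters the objective of the sub-problem, can be sketched as follows. This is an illustrative helper only: the function name and the dictionary-based data layout are assumptions, and every node outside the depot set is treated as a customer.

```python
def arc_reduced_cost(i, j, cost, depots, lam, mu, nu):
    """Eq. (30): reduced cost of arc (i, j), with the battery cost z excluded.

    cost: dict mapping (i, j) to the travel cost C_ij.
    depots: set of depot nodes O; every other node is treated as a customer.
    lam, mu, nu: dicts holding the duals of Eqs. (25), (26) and (27).
    """
    if i in depots:                       # arc leaving a depot
        return cost[i, j] - mu[i] - nu[i]
    if j in depots:                       # arc returning to a depot
        return cost[i, j] + nu[j] - lam[i]
    return cost[i, j] - lam[i]            # customer-to-customer arc
```

Summing these values along a path and adding the battery cost \(z\) gives the reduced cost of the corresponding column.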

4.3.1 Branch-and-price solution for P3-MP

With the above reformulation, we further introduce a branch-and-price algorithm to solve P3-MP. The structure of this algorithm is given as follows:

Step 1: Solve the RMP with the initial columns obtained by the NN heuristic.

Step 2: Solve P4-SP (see Sect. 4.3.2) to generate columns with negative reduced cost. Notably, P4-SP does not need to be solved to optimality in every iteration, since any column with negative reduced cost can improve the objective function of the RMP. Therefore, whenever such a column is found, it is added to the RMP, which is then re-solved. If no such column exists, go to Step 3.

Step 3: If an integer solution is obtained, the algorithm terminates and the integer solution is returned as the optimal solution. Otherwise, a branch-and-bound approach (see Sect. 4.4) is used to search for an integer solution by repeating the column generation process described above.
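Steps 1 and 2 above can be sketched as a generic column generation loop. The code below is a skeleton under stated assumptions rather than the implementation used in the paper: `solve_rmp` and `price` are hypothetical callables supplied by the user (for example, an LP solve of the RMP and the labelling algorithm for P4-SP), and the tolerance and iteration limit are illustrative.

```python
def column_generation(solve_rmp, price, columns, tol=1e-6, max_iter=1000):
    """Column generation loop of Steps 1 and 2 (a sketch).

    solve_rmp(columns) -> (objective, duals): solves the relaxed RMP (hypothetical).
    price(duals) -> list of (column, reduced_cost): solves P4-SP, possibly heuristically.
    columns: initial columns, e.g. those produced by the NN heuristic.
    """
    columns = list(columns)
    for _ in range(max_iter):
        objective, duals = solve_rmp(columns)
        improving = [col for col, rc in price(duals)
                     if rc < -tol and col not in columns]
        if not improving:            # no column with negative reduced cost: go to Step 3
            return objective, columns
        columns.extend(improving)    # add the new columns and re-solve the RMP
    raise RuntimeError("column generation did not converge")
```

The loop returns the relaxed objective and the final column pool; checking integrality and branching (Step 3) would be layered on top of it.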

4.3.2 Labelling algorithm for P4-SP

To solve the sub-problem, P4-SP can be cast as a classical elementary shortest path problem with resource constraints (ESPPRC), which is NP-hard (Garey & Johnson, 1979). A dynamic programming algorithm known as the ‘labelling algorithm’ can then be applied (e.g., Feillet et al., 2004; Righini & Salani, 2006). Unlike in a general ESPPRC, the cumulative flight time resource, \({e}_{i}\), affects the objective function of P4-SP and contributes an additional reduced cost to the column for the drone path, which requires different handling techniques. Based on the labelling algorithm implemented by Desrochers (1988), we develop the following process to fit our sub-problem settings.

(1) Label structure

This is used to record the up-to-node \(i\) partial path and label it with a series of information (e.g., resource consumption information). In our problem, the label structure is expressed as Eq. (51):

$${L}_{i}=\left\{i,predecessor ({L}_{i}),r({L}_{i}),v({L}_{i}),t({L}_{i}),e({L}_{i}),q({L}_{i})\right\}$$
(51)

In the above equation, \(i\) is the last node in the partial path; \(predecessor ({L}_{i})\) is the parental label, which extends to the current \({L}_{i}\); and \(r({L}_{i})\) is the accumulated reduced cost of \({L}_{i}\), which includes two cost components from Eq. (31). In particular, we define the first reduced cost component as \(g\left(e\left({L}_{i}\right)\right)\) and the second as \(c\left({L}_{i}\right)\). \(v({L}_{i})\) is a vector that performs as a tabu list that records visited nodes and un-accessible nodes (Feillet et al., 2004), \(t({L}_{i})\) is the arrival time of the drone at node \(i\), \(e({L}_{i})\) is the accumulated flight time of the drone at node \(i\), and \(q({L}_{i})\) is the accumulated weight of customer parcels for the partial path.

(2) Label extension process

To begin this process, we start with a label \({L}_{i}\) with the minimum reduced cost. A new label \({L}_{j}\) is obtained by extending \({L}_{i}\) through arc (\(i,j\)) if the time window and resources-based constraints are satisfied. The relationship between \({L}_{i}\) and \({L}_{j}\) is shown as follows:

$$t\left({L}_{j}\right)=\mathrm{max}\left[t\left({L}_{i}\right)+{\overline{T} }_{ij},{T}_{j}^{LA}\right]$$
(52)
$$e\left({L}_{j}\right)=e\left({L}_{i}\right)+{\overline{T} }_{ij}$$
(53)
$$q\left({L}_{j}\right)=q\left({L}_{i}\right)+{Q}_{j}$$
(54)
$$c\left({L}_{j}\right)=\left\{\begin{array}{ll}c\left({L}_{i}\right)+{C}_{ij}-{\mu }_{i}-{\nu }_{i},& i\in O\\ c\left({L}_{i}\right)+{C}_{ij}-{\lambda }_{i},& i,j\in C\\ c\left({L}_{i}\right)+{C}_{ij}-{\lambda }_{i}+{\nu }_{j},& j\in O\end{array}\right.$$
(55)
$$predecessor\left({L}_{j}\right)={L}_{i}$$
(56)
$$r\left({L}_{j}\right)=g\left(e\left({L}_{j}\right)\right)+c\left({L}_{j}\right)$$
(57)

Note that the update rules for \(v\left({L}_{j}\right)\) also consider the time window and demand of each candidate next node, \(h\), to which node \(j\) may extend. If node \(h\) is not accessible, it is added to the inaccessible node set \(H\), and this set is further included in \(v\left({L}_{j}\right)\):

$$H=\left\{h\,|\,t\left({L}_{j}\right)+{\overline{T} }_{jh}>{T}_{h}^{ED}\ \text{or}\ e\left({L}_{j}\right)+{\overline{T} }_{jh}>E\ \text{or}\ q\left({L}_{j}\right)+{Q}_{h}>B\right\}$$
(58)
$$v\left({L}_{j}\right)=v\left({L}_{i}\right)\cup H\cup i$$
(59)
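The label extension step can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes that a drone arriving early waits until the time window opens at \({T}_{j}^{LA}\), it checks feasibility when a label is extended instead of precomputing the set \(H\), and it assumes the caller supplies a wide time window and zero demand for depot nodes. All names (`Label`, `extend`, `reduced_cost`, and the dictionary arguments) are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Label:
    node: int                 # last node i of the partial path
    pred: Optional["Label"]   # parent label
    c: float                  # accumulated arc reduced cost c(L)
    visited: FrozenSet[int]   # v(L): nodes that may no longer be visited
    t: float                  # arrival time t(L)
    e: float                  # accumulated flight time e(L)
    q: float                  # accumulated parcel weight q(L)


def extend(label, j, travel, red_cost, tw, demand, E, B):
    """Extend a label along arc (label.node, j); return None if infeasible.

    travel, red_cost: dicts keyed by arc (i, j); tw: dict node -> (open, close);
    demand: dict node -> weight; E: maximal flight time; B: load capacity.
    """
    i = label.node
    if j in label.visited:
        return None
    t = max(label.t + travel[i, j], tw[j][0])   # wait if the drone arrives early
    e = label.e + travel[i, j]                  # Eq. (53)
    q = label.q + demand[j]                     # Eq. (54)
    if t > tw[j][1] or e > E or q > B:          # time window, flight time, load
        return None
    return Label(j, label, label.c + red_cost[i, j], label.visited | {j}, t, e, q)


def reduced_cost(label, g):
    """Eq. (57): r(L) = g(e(L)) + c(L), with g the battery cost as a function of e."""
    return g(label.e) + label.c
```

Keeping \(c(L)\) and \(e(L)\) separately in the label, and evaluating \(g\) only when needed, reflects the fact that the battery cost depends non-linearly on the cumulative flight time.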

(3) Dominance rule

To discard unprofitable labels, the following dominance rule is applied. The two labels, \({L}_{i}^{1}=\left\{i, predecessor ({L}_{i}^{1}),{r({L}_{i})}^{1},{v({L}_{i})}^{1},{t({L}_{i})}^{1},{e({L}_{i})}^{1},{q({L}_{i})}^{1}\right\}\) and \({L}_{i}^{2}=\left\{i, predecessor ({L}_{i}^{2}),{r({L}_{i})}^{2},{v({L}_{i})}^{2},{t({L}_{i})}^{2},{e({L}_{i})}^{2},{q({L}_{i})}^{2}\right\}\), represent two different partial paths with the same last node \(i\). If the following conditions (Desrochers, 1988) are satisfied, \({L}_{i}^{1}\) dominates \({L}_{i}^{2}\):

  • lower reduced cost: \({r({L}_{i})}^{1}\le {r({L}_{i})}^{2}\)

  • fewer visited nodes: \({v({L}_{i})}^{1}\subseteq {v({L}_{i})}^{2}\)

  • earlier arriving time: \({t({L}_{i})}^{1}\le {t({L}_{i})}^{2}\)

  • lower cumulative flight time: \({e({L}_{i})}^{1}\le {e({L}_{i})}^{2}\)

  • lower cumulative parcel weight: \({q({L}_{i})}^{1}\le {q({L}_{i})}^{2}\).

If a label is dominated by other labels, the extension process for this label should be terminated and this label should be deleted.
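The five dominance conditions listed above can be checked with a short routine such as the one below. It is a minimal sketch: labels are represented here as plain dictionaries with hypothetical keys, and the pairwise pruning loop is written for clarity rather than speed.

```python
def dominates(l1, l2) -> bool:
    """True if label l1 dominates label l2 under the five listed conditions.

    Each label is a dict with keys 'node', 'r', 'v' (a set), 't', 'e' and 'q'.
    """
    return (l1["node"] == l2["node"]
            and l1["r"] <= l2["r"]      # lower reduced cost
            and l1["v"] <= l2["v"]      # subset test on visited/inaccessible nodes
            and l1["t"] <= l2["t"]      # earlier arrival time
            and l1["e"] <= l2["e"]      # lower cumulative flight time
            and l1["q"] <= l2["q"])     # lower cumulative parcel weight


def prune(labels):
    """Keep only the labels that are not dominated by another label in the list."""
    kept = []
    for k, cand in enumerate(labels):
        dominated = any(
            dominates(other, cand) and (not dominates(cand, other) or m < k)
            for m, other in enumerate(labels) if m != k
        )
        if not dominated:
            kept.append(cand)
    return kept
```

When two labels are identical in every compared attribute, the tie-break on the list index keeps exactly one of them.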

4.4 Branching strategy

If a fractional solution is obtained from the above column generation, a branch-and-bound (BB) process must be performed. Given that branching on path variables is difficult, we instead use the method proposed by Danna and Le Pape (2005) to select arcs for BB. Specifically, suppose x* is the optimal solution of the RLMP at the current search node in the BB tree. If x* is integer, branching is not necessary. Otherwise, we iterate over all paths with fractional values and select the arc (i, j) with the largest value of \(\sum\nolimits_{k \in {\Omega_1}} {{x_{ij}}} \) to branch on. In addition, given that adding additional constraints to MP will result in increased dual prices, we branch on the fractional arcs indirectly by modifying their associated reduced costs (Degraeve & Jans, 2007). To search the BB tree, we use a depth-first strategy so that feasible solutions at leaf nodes are obtained quickly for pruning. In particular, the search starts from the root node in one direction. If the target solution is found, the process in this direction is terminated. Otherwise, the previous node is used to start the same process again in a different direction. The BB procedure is shown below:

  • Step 1: If the solution does not yield an integer solution at the searching node, select one of the arcs (i, j) to branch on, and go to Step 2.

  • Step 2: After selecting the arc (i, j), make two branches according to the following rules:

    i. The first child node: For the arc (i, j) that must be served (defined as (i, j) = 1) by drones, if both i and j are customer nodes, arcs \(\left( {i,l} \right)\left( {\forall l \in C,l \ne j} \right) \) and \(\left( {l,j} \right)\left( {\forall l \in C,l \ne i} \right) \) should be deleted so that all remaining routes that visit node i or j must pass through arc (i, j). This differs slightly when either i or j is a depot: the arcs entering and leaving the depot cannot be deleted, because a depot can be visited more than once.

    ii. The second child node: For the arc (i, j) that cannot be served (defined as (i, j) = 0) by drones, we only need to delete the arcs on which (i, j) = 0 is imposed.

After deleting the arcs according to the rules above, the corresponding routes must also be deleted from the column sets of RLMP.
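The arc selection rule used for branching can be sketched as follows. This is an illustration under assumptions: paths are given as node sequences with their RLMP values, arc flows are accumulated over fractional paths only, and arcs whose accumulated flow is already integer are skipped, which is an added assumption not stated in the text. If no arc qualifies, the function returns `None` and another rule would be needed.

```python
from collections import defaultdict


def select_branching_arc(paths, theta, tol=1e-6):
    """Pick the arc to branch on from a fractional RLMP solution.

    paths: list of node sequences, one per column in the RLMP.
    theta: list of path values from the RLMP solution, aligned with paths.
    Returns None when no fractional arc is found.
    """
    flow = defaultdict(float)
    for path, value in zip(paths, theta):
        if tol < value < 1.0 - tol:                  # fractional paths only
            for arc in zip(path, path[1:]):
                flow[arc] += value
    # Assumption: arcs whose accumulated flow is already integer are skipped.
    candidates = {a: f for a, f in flow.items() if abs(f - round(f)) > tol}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)       # arc with the largest flow
```

The selected arc would then be fixed to 1 in the first child node and to 0 in the second, following Step 2.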

5 Numerical experiment

In this section, a series of numerical experiments are designed to demonstrate both the methodological and managerial significance of our research. Using a piecewise linear approximation with Gurobi (PLA-Gurobi), we first conduct a preliminary test that examines its solution quality against an enumeration method on very small instances. We then benchmark the computational performance of our BP-based solution against PLA-Gurobi for problems of various scales. Further, to explore the effects of battery degradation on drone operations, we conduct our experiments with different objective functions (i.e., with or without battery-related cost optimisation). Finally, to uncover further insights, sensitivity analyses are undertaken that vary both economic and operational parameters. All experiments are performed on a Windows 10 machine with four 64-bit 3.4 GHz Intel Core processors and 64 GB RAM. The model is programmed in Java and the solver is Gurobi 9.0.0.

5.1 Computational performance benchmarking

5.1.1 Dataset and parameter settings

Instances used for all our experiments are generated with reference to the Hexa-B Hexacopter fleet (Dorling et al., 2017). We set the purchase cost of a new battery (\(F\)) to $500 and the battery disposal-related costs to twice the purchase cost (i.e., \(\delta =2\)). The load capacity of a drone (\(B\)) is 3 kg, and the maximal flight duration (\(E\)) is 50 min at a speed of \(S=6\) m/s. At the beginning of our experiments, unshared customer nodes are not considered (i.e., \({C}^{i}=\varnothing ,\forall i\in O\)).

To mark the position of each node, we introduce a coordinate system with its origin at (0, 0) and kilometres as the base unit of distance. In the preliminary test, the operation is performed between only two depots, with coordinates (−3, 0) and (3, 0), respectively. The service coverage of each depot is a circle with a radius of 5 km. To keep the preliminary test simple, we only consider scenarios with two or three customers within each service coverage. The position of a customer relative to its depot is defined by [l*sinα, l*cosα], where l is the direct distance between the customer and its depot and α is the angle of this customer to the x-axis. Also, l is uniformly distributed over [0, 5], and α is uniformly distributed over [0, 2π]. Figure 2 below illustrates this configuration for one scenario. The initial drone fleet size is the same for all depots. Across the instances in the preliminary test, we increase the drone fleet size of every depot from two to five drones. The overall planning horizon is two hours, and the time window of each customer is generated as follows: the latest arrival time requested by the customer is drawn uniformly from [0, 120], and a duration drawn uniformly from [20, 50] is then added to it to obtain the earliest departure time. Customer demands are drawn uniformly from {0.5 kg, 1 kg, 1.5 kg, 2 kg}. To refer to a specific experiment, we use the code format A##-##-#. In particular, A## represents the number of depots considered in the instance, the following ## represents the number of customers serviced by each depot, and the last # is a sample ID for instances sharing the same scale.

Fig. 2 Example of customer and depot distribution for preliminary test
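The instance generation rules described above can be sketched as follows. The code is an illustrative reading of the text, not the authors' generator: the function names are hypothetical, the time unit is assumed to be minutes, and the coordinate offsets follow the expression [l*sinα, l*cosα] as written.

```python
import math
import random


def generate_customer(depot_xy, rng, radius=5.0, horizon=120.0):
    """Sample one customer around a depot following the described settings.

    Returns (x, y, window_start, window_end, demand_kg); times are in minutes.
    """
    l = rng.uniform(0.0, radius)               # distance to the depot (km)
    alpha = rng.uniform(0.0, 2.0 * math.pi)    # angle of the customer
    x = depot_xy[0] + l * math.sin(alpha)      # offsets as written in the text
    y = depot_xy[1] + l * math.cos(alpha)
    start = rng.uniform(0.0, horizon)          # T^LA
    end = start + rng.uniform(20.0, 50.0)      # T^ED
    demand = rng.choice([0.5, 1.0, 1.5, 2.0])  # kg
    return x, y, start, end, demand


def travel_time_min(p, q, speed_mps=6.0):
    """Straight-line travel time in minutes between two points given in km."""
    return math.dist(p, q) * 1000.0 / speed_mps / 60.0
```

For example, `generate_customer((-3.0, 0.0), random.Random(0))` draws one customer for the depot at (−3, 0); at 6 m/s, a customer on the edge of the 5 km service circle is just under 14 min from its depot.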

For experiments with larger instances, we consider scenarios with two, three, four and five depots, respectively. These depots are randomly located on a circle with a radius of 5 km and its centre at the origin of the coordinate system. Each depot’s service coverage is the same as above, yet covers five to 20 customers, depending on the instance. An example of such a setting is illustrated in Fig. 3 below. In addition, for different instances, the number of drones in each depot can range from four to 10. All other settings remain the same as in the preliminary test.

Fig. 3 Example of customer and depot distribution for large instance

5.1.2 Preliminary test

Because of the non-linear component in its objective function (i.e., the battery-related cost term), the OP model cannot be solved directly by an off-the-shelf solver. To evaluate the solution quality and computational performance of the proposed BP algorithm, we derive an enumeration-based method (see details in Appendix C) and the PLA-Gurobi (see details in Appendix D) for benchmarking. The enumeration method also serves as the baseline for assessing the solution quality of PLA-Gurobi. To keep the enumeration method tractable, we begin the experiment on a very small scale. Table 1 below presents the experimental results.

Table 1 Results on small instances (K = number of drones; Opt. = optimality; CPU = computation time: second)

In Table 1, the results from enumeration provide the baseline for benchmarking. As a solution to an NP-complete problem, the CPU time of the enumeration approach increases exponentially alongside the increase of problem scale. Comparatively, the use of piecewise approximation and BP is much more efficient. For the comparison of the objective values: (1) the BP algorithm, as an exact solution, produces exactly the same results as the enumeration and (2) a high level of accuracy for the results is obtained from PLA-Gurobi, given that seven results produced have zero gap to enumeration and the other three have relatively small gaps. Taking PLA-Gurobi’s solution quality and computational efficiency into account, PLA-Gurobi is used as the benchmark in the next experiments with larger instances. In particular, we use different problem scales to further evaluate our solution’s computational performance in efficiency.

5.1.3 Experiments for medium-scale instances

This section examines scenarios including two and three depots. Accordingly, each depot’s number of customers ranges from five to ten in each experiment, and, in total, this yields 15 instances explicitly. The PLA-Gurobi and BP algorithm are both implemented to solve these instances. We also set 7200 s as the time tolerance for each instance. The details of the results are presented in Table 2.

Table 2 Computational results for 15 instances (LB = lower bound at root node; Gap = % diff. of opt. to Gurobi)

Regarding the results, all 15 instances are solved to optimality by the BP algorithm in less than 0.3 s. By comparison, PLA-Gurobi solves only seven of the 15 instances within our time tolerance (i.e., 7200 s), and even some of the solved instances take hundreds of seconds.

Comparing the LBs at the root node, we find that the LB obtained by PLA-Gurobi is far weaker than that of the BP algorithm, which is why PLA-Gurobi spends a great deal of time on convergence. In addition, for instances such as A2-10-1 and A2-10-4, even though PLA-Gurobi finds the optimal solution, the gap between its UB and LB is still more than 30% when its computation time reaches the tolerance. This can be explained by the fact that the LB obtained from the relaxation of the OP model is worse than that obtained from the RMP (Song et al., 2017). To further evaluate the BP algorithm’s computational capability, the next section enlarges the experimental instance scales.

5.1.4 Experiments for large-scale instances

Given the computational limit of PLA-Gurobi (reflected in Sect. 5.1.3), it is excluded from the following large-scale experiments. Instead, only the BP algorithm is implemented for the experiments below (in Table 3). In particular, we examine scenarios with four and five depots. Each depot services 15 or 20 customers, and, in total, 20 different scenarios are explored.

Table 3 Computational results of BP algorithm on large-scale instances

The results obtained by the BP algorithm on large-scale instances are shown in Table 3. For all instances, the BP algorithm obtains the optimal solution, with a gap (i.e., \(\frac{(Opt. - LB)}{Opt.}\times 100\%\)) of zero. As the instance size increases, the computational time of the BP algorithm increases greatly. Nevertheless, the algorithm remains practical, as it obtains an optimal solution for a problem with 100 nodes (i.e., from A5-20-1 to A5-20-5) within an average of 5000 s.

5.2 Comparison of DRPBFS and DRP

To investigate the operational effectiveness of the blockchain-enabled fleet sharing, we compare the DRPBFS with DRP (i.e., drone routing without fleet sharing) under different instances. Specifically, the two cost components (travel costs and battery-related costs) in our objective function are discussed explicitly. Table 4 below details the experimental results comparing travel costs, battery-related costs and total costs for the two models under different instances. In addition, when a flight has a different departure depot from the arrival one, it indicates this flight is completed by a shared drone. In this sense, we introduce an indicator named ‘ST rate’ to record the percentage of such flights out of the total flights for each instance. This can be further used to explore how the level of fleet sharing frequency would be affected by different operational factors.
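The ‘ST rate’ indicator defined above can be computed with a one-line ratio, sketched below. The function name and the representation of a flight as a (departure depot, arrival depot) pair are assumptions made for illustration.

```python
def st_rate(flights):
    """Percentage of flights whose arrival depot differs from the departure depot.

    flights: list of (departure_depot, arrival_depot) pairs, one per flight.
    """
    if not flights:
        return 0.0
    shared = sum(1 for dep, arr in flights if dep != arr)
    return 100.0 * shared / len(flights)
```

For instance, four flights of which two end at a depot other than their departure depot give an ST rate of 50%.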

Table 4 Comparison analysis of DRP and DRPBFS

As the results in Table 4 demonstrate, DRPBFS has a lower total operational cost than DRP for all 15 examined instances. In particular, given the large cost reduction from battery-related cost savings, although DRPBFS may not always yield the paths with the lowest travelling cost, the reduction in total costs is still significant, ranging from 2.91% to 14.34%. In addition, as the problem scale goes up, the cost benefits from fleet sharing become increasingly pronounced, with a 2.91% average total cost reduction compared with DRP for 30 nodes (A3-10), a 7.02% reduction for 60 nodes (A4-15), and a 14.34% reduction for 100 nodes (A5-20). The reasons for this phenomenon are twofold. First, fleet sharing provides much better flexibility for each flight. Given that each drone does not have to return to its departure depot, the level of \(DOD\) for each drone can be maintained at a higher level, as opposed to the non-fleet sharing operation, after each flight is completed. Moreover, given how \(DOD\) affects the \(CTF\) of a battery, this helps reduce battery-related costs. Second, as the problem size increases (from 30 to 60 to 100 nodes), the drone fleet sharing rate increases significantly (from 3.10% to 7.50% to 15.40% on average). It makes business sense that, as more customers and depots are included, more cost-reduction opportunities emerge. Given that DRPBFS promotes fleet sharing for overall optimisation, fleet sharing has more room to take effect when the problem size is relatively large.

5.3 Sensitivity analysis

In this section, we conduct sensitivity analysis by varying the form of objective functions, values of battery-related costs and speed of drones. By doing so, we seek to uncover some insights about sustainable drone operation and how it may be affected by the development of technology.

5.3.1 Effect of different objective function

This section discusses the effects of different objective functions that optimise battery costs only (BC), travel costs only (TC) and battery plus travel costs (BC + TC). Accordingly, two more objective functions (Eqs. (60) and (61)) are introduced for the pricing sub-problems, which correspond to the \(\mathrm{BC}\) and \(\mathrm{TC}\) objectives:

$$\overline{c}_{BC} = z^{k} - \sum_{(i,j)\in A:\, i\in O}\left(\mu_{i}+\nu_{i}\right)a_{ij}^{k} - \sum_{i\in C}\sum_{(i,j)\in A}\lambda_{i}\,a_{ij}^{k} + \sum_{(j,i)\in A:\, i\in O}\nu_{i}\,a_{ji}^{k}$$
(60)
$$\overline{c}_{TC} = \sum_{(i,j)\in A:\, i\in O}\left(c_{ij}-\mu_{i}-\nu_{i}\right)a_{ij}^{k} + \sum_{i\in C}\sum_{(i,j)\in A}\left(c_{ij}-\lambda_{i}\right)a_{ij}^{k} + \sum_{(j,i)\in A:\, i\in O}\left(c_{ji}+\nu_{i}\right)a_{ji}^{k}$$
(61)
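To make Eq. (60) concrete, the following minimal sketch evaluates the BC reduced cost for one candidate path. It assumes that the path is given as a list of arcs (so that \(a_{ij}^{k}=1\) exactly for the listed arcs) and that the dual values are supplied as dictionaries; the function name, the data layout and all numbers are hypothetical and are not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

Arc = Tuple[int, int]  # (from node i, to node j)


def reduced_cost_bc(
    path_arcs: List[Arc],
    battery_cost: float,     # z^k, battery-related cost of path k
    depots: Set[int],        # O
    customers: Set[int],     # C
    mu: Dict[int, float],    # dual values indexed by depot
    nu: Dict[int, float],    # dual values indexed by depot
    lam: Dict[int, float],   # dual values indexed by customer
) -> float:
    """Evaluate Eq. (60) term by term for the arcs used by one path."""
    value = battery_cost
    for i, j in path_arcs:
        if i in depots:      # arc leaving depot i: -(mu_i + nu_i)
            value -= mu[i] + nu[i]
        if i in customers:   # arc leaving customer i: -lambda_i
            value -= lam[i]
        if j in depots:      # arc entering depot j: +nu_j
            value += nu[j]
    return value


# Hypothetical data: depots 0 and 1, customers 2 and 3, path 0 -> 2 -> 3 -> 1.
rc = reduced_cost_bc(
    path_arcs=[(0, 2), (2, 3), (3, 1)],
    battery_cost=6.0,
    depots={0, 1},
    customers={2, 3},
    mu={0: 1.0, 1: 0.5},
    nu={0: 0.25, 1: 0.5},
    lam={2: 2.0, 3: 1.5},
)
print(rc)  # 6.0 - 1.25 - 2.0 - 1.5 + 0.5 = 1.75
```

In a column generation scheme for a minimisation problem, a path would normally be added to the restricted master problem only when this value is negative; the TC variant in Eq. (61) could be evaluated in the same way by adding the arc costs \(c_{ij}\) as printed.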

Instance A2-10-3 generated from the above section is selected for further analysis. Under different objective functions, Table 5 summarises all the paths with their associated costs for the underlying instance. In addition, we draw all the trips surrounding depot 1 under the three models (Fig. 4) to provide a closer presentation of their differences.

Table 5 Trips generated by different objectives for depot 1 in instance A2-10-3
Fig. 4 Trips of depot 1 for three types of model

Comparing Fig. 4a with the other two panels, an additional trip is incurred for the BC model. Inevitably, its associated travel cost is higher than in the other two models, which also explains the higher total travel distance generated by the BC model in Table 5. However, even with an increased total travel distance, it is interesting to note that the total battery cost for the BC model is lower than for the other two. This leads us to think that the economic effect of a lower \(DOD\) could be much greater than that of additional battery charges. The additional trip in the BC model means that the battery of a drone will be charged one more time than in the other two models; however, with each trip shorter in the BC model, the level of \(DOD\) will be higher comparatively. Examining the trip plans for customer 9 across the three models further clarifies this finding. Both the TC and BC + TC models choose to service customer 9 together with other customers within one flight, yet the BC model deliberately deploys a separate drone for this customer, so that a certain level of \(DOD\) can be maintained.

5.3.2 Effect of battery disposal cost and purchase cost

This section conducts sensitivity analysis to investigate the effect of the battery disposal cost and purchase cost with the use of instance A2-10-5. The new battery purchase cost starts with our initial setting (i.e., \(F=\$500\)), yet we exclude the battery disposal cost (i.e., \(\updelta =0\)). Following this, we analyse both the TC and TC + BC models by decreasing the value of \(F\) with step size $50. Figure 5 below compares the results in detail.

Fig. 5 TC versus TC + BC with varying battery purchase cost

Alongside the decrease in battery purchase cost, the overall optimisation results for TC and TC + BC converge gradually. This means that the shortest path may not necessarily yield an optimal cost control when battery cost is relatively high, as it could accelerate the degradation of batteries. Conversely, if battery purchase costs could be largely reduced, schedules produced from the TC model would be more likely to be the optimal ones and are easier to implement. Therefore, if new battery technology emerges (e.g., reduced battery wear effect or lower manufacturing costs), it would allow drone operators to focus on path planning and simplify their operation.

Next, we evaluate the effects of varying the ratio of battery disposal cost to purchase cost, \(\updelta \). In addition to the above discussed technology improvement, we believe that battery disposal-related cost could continuously decrease as well. Therefore, we set the ratio \(\updelta \) to 0, 0.5, 1.0, 1.5, 2.0 and 2.5, respectively, to perform experiments under three different battery purchasing costs. Figure 6 below plots the experimental results in detail.

Fig. 6 Costs for different \(\updelta \) and \(F\)

Similar to the experiment results of battery purchase cost, a smaller \(\updelta \) makes TC optimisation closer to TC + BC optimisation for all three battery purchase costs. Notably, when battery purchase cost is $50 and \(\updelta \) is between 1.0 and 1.3, the TC, TC + BC and BC optimisation is binding, and, if \(\updelta \) goes down further, TC + BC moves towards TC, as the influence of battery-related costs becomes very minor. This means that technological advancements in both battery manufacturing and battery disposal handling will help drone operators better focus on pure distance-based scheduling.

5.3.3 Effect of drone speed

This section investigates the effect of drone speed on drone fleet sharing operations. Assuming that drone technology will continue to evolve towards faster drones, we conduct tests with drone speeds varying from 6 to 10 m/s (in increments of 1 m/s per test) to observe the corresponding changes in total costs and sharing rate. The details of those tests are summarised in Fig. 7 below.

Fig. 7 Change in total cost and sharing rate for different drone speeds

Figure 7 illustrates the change in total costs and fleet sharing rate as drone speed increases by 1 m/s in each experiment. The plot of the total cost indicates that improving drone speed produces a continuous cost reduction. It makes sense that a higher drone speed allows a shorter service lead time. Consequently, if the battery of a drone remains at the same capacity, a much higher \(DOD\) level can be maintained after finishing the same trip. Therefore, a lower battery-related cost can drive the total cost downwards. Turning to the fleet sharing rate, it is interesting to see that it increases initially, yet, after a certain speed point (9 m/s), fleet sharing is no longer favoured, as a sharp decline in the rate occurs. The rationale for this is that an increase in drone speed provides more opportunities and better flexibility for fleet sharing (e.g., it becomes easier to return a shared drone to its owner); thus, more fleet sharing activities are performed. However, when a drone is fast enough to service, on its own, as many customers as could previously be covered only through fleet sharing, fleet sharing starts to lose its economic attractiveness, as the service coverage of one drone is significantly expanded. This implies that the development of drone technology can effectively help extract the value of fleet sharing operation, and may eventually replace the role of fleet sharing unless other operational concerns are introduced.
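The link between speed and service lead time can be illustrated with simple arithmetic. The sketch below computes the flight time of one trip over the tested speed range; the trip distance of 3000 m is a hypothetical value, and how flight time translates into battery use depends on the energy model, which is not reproduced here.

```python
def flight_time_s(distance_m: float, speed_mps: float) -> float:
    """Flight time in seconds for a trip flown at constant speed."""
    if speed_mps <= 0:
        raise ValueError("speed must be positive")
    return distance_m / speed_mps


TRIP_DISTANCE_M = 3000.0  # hypothetical trip length, not taken from the experiments

# Speeds tested in this section: 6 to 10 m/s in steps of 1 m/s.
for speed in range(6, 11):
    print(f"{speed} m/s -> {flight_time_s(TRIP_DISTANCE_M, speed):.1f} s")
```

For this hypothetical trip, the flight time falls from 500 s at 6 m/s to 300 s at 10 m/s.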

5.3.4 Effect of blockchain

This section varies the sharing ability to represent the use of different fleet sharing protocols. Specifically, thanks to the merits of blockchain technology, the blockchain-enabled platform allows all operators to fully share their demands and drone fleets. By contrast, other types of sharing platforms have a discounted sharing rate, ranging from 10% to 90% (i.e., \(\frac{\sum_{i\in O}|{C}^{i}|}{|C|}\) ranges from 0.1 to 0.9). Given five different instances with a scale of 60 nodes (i.e., four depots with 15 customers each), some key economic factors are examined, including battery cost, travelling cost, total cost, node sharing rate and drone sharing rate. Table 6 summarises the average experimental results.

Table 6 Experiment results (average) for different types of sharing platforms

As the sharing ability of the platform increases (from non-sharing to 100% sharing ability), we see decreases in both costs and total fleet sizes. In particular, platforms with a higher sharing ability contribute significantly to the reduction in both battery-related costs (from $17.12 to $16.21, a 5.3% reduction) and total travelling costs (from $10.63 to $10.09, a 5.0% reduction). This means that improved sharing ability enhances not only the economic benefits but also the overall sustainability, as both battery- and travelling-related environmental impacts are reduced. In addition, since the total travelling distance is reduced by improved sharing ability, the overall fleet size for all operators can be reduced by up to 3.5% (from 28.2 to 27.2 on average), which can further help reduce capital costs.
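The relative reductions quoted above follow from the reported averages by simple arithmetic. The following sketch recomputes them from the figures given in the text; the function name is hypothetical, and small differences in the last digit may arise because the reported averages are themselves rounded.

```python
def pct_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("reduction is undefined for a zero baseline")
    return (before - after) / before * 100.0


# Average values quoted in the text (non-sharing versus 100% sharing ability).
reported = [
    ("battery-related cost", 17.12, 16.21),
    ("travelling cost", 10.63, 10.09),
    ("fleet size", 28.2, 27.2),
]

for label, before, after in reported:
    print(f"{label}: {pct_reduction(before, after):.2f}% reduction")
```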

Another interesting finding from this experiment is the change in the node sharing rate and drone sharing rate. The node sharing rate refers to how many customer nodes are open to be served by other operators, while the drone sharing rate refers to how many times a drone has different departure and arrival depots. The former helps us understand how often the corresponding sharing platform promotes demand sharing, while the latter helps us understand how often it promotes fleet sharing. The node sharing rate and drone sharing rate are the percentages of these two quantities relative to total nodes and total drones, respectively. Figure 8 below plots how they change as the platform sharing ability increases.

Fig. 8 Node sharing rate versus drone sharing rate with respect to different platform sharing ability

As Fig. 8 illustrates, the better the sharing ability of a platform, the more visits the customer nodes that are open for sharing will receive. This makes sense, as the objective of improving platform sharing ability is to allow more customer demands to be served by other operators, if that is economically beneficial. However, counterintuitively, the drone sharing rate does not change in the same direction as the platform sharing ability. Instead, it increases initially, as the customer node sharing rate increases; however, after reaching a certain point, it shows a steady decrease, despite the customer node sharing rate increasing continuously. The reason for this behaviour is that, when the customer sharing rate is relatively low, more customer nodes must be served by their corresponding depots. If those unshared customer nodes are far away from their corresponding depot, yet closer to other operators, it will be more cost effective to fly the drone to its nearest depot, instead of returning it to the departure depot. Hence, the chance for drones to be shared increases. Conversely, since a shared customer node can be served by multiple operators, the nearest depot is more likely to be chosen as the service provider, which makes it more cost effective to return the drone to its departure depot. In this sense, drones are less likely to be shared.

6 Conclusion

In this research, we have studied a DRP based on blockchain-enabled fleet sharing. Specifically, we designed the optimal drone path that makes the best use of drone sharing to minimise both travelling- and battery-related costs. Given the battery wear effect, we proposed a non-linear mixed-integer formulation to describe its associated cost component. We then developed a series of reformulation and solution techniques to make the problem tractable. In comparison with an enumeration-based method and Gurobi-based linear approximation method, our solution showed capability to solve a large-scale test problem within a reasonable computational time.

Experiments were performed to illustrate how the use of fleet sharing affects different operational costs, compared with non-fleet sharing operations. In addition, the merits of using a blockchain-enabled fleet sharing protocol were demonstrated through comparison with other sharing types. To illustrate the sensitivity of the optimisation to technological evolution, the effect of varying battery-related costs and drone speed was investigated. In summary, the key findings of this research include the following:

  1. Given the strong effects of DOD on battery lifespan, it is preferable for drone operators to prioritise an average level of DOD in their drone fleet over other decisions.

  2. Improving battery-related technologies will help drone operators simplify their overall operational planning, given reduced battery-related effects.

  3. A sharing platform with higher sharing ability is beneficial for all operators, as it helps drive down all cost components, yet a higher level of shared demands is not necessarily completed upon increased use of shared drones.

  4. To efficiently solve this problem, the exact branch-and-price algorithm was developed. Instances of up to 100 customers can be solved optimally in the allowed time limit by the proposed algorithm, which offers practical applications.

While this paper has contributed new fundamental knowledge that can be used to improve both the economic and the sustainability performance of drone operations, it should also inspire others to research this new topic, which has barely been studied in the literature to date. This study could be further expanded in two directions. First, our current model does not consider the effects of different payloads. As some research has asserted (e.g., D’Andrea, 2014), the energy consumption rate and flying speed of a drone vary with its payload. Therefore, it would be interesting to see how this consideration would further affect the optimisation process. Second, our current model assumes that all depots have the same customer distribution. As a result, the demand pattern is relatively balanced and repositioning of empty drones is barely required. In reality, an unbalanced demand pattern among different operators is much more common. Therefore, it would be practically meaningful to incorporate empty drone repositioning as part of a DRP with fleet sharing.