A responsive ant colony optimization for large-scale dynamic vehicle routing problems via pheromone diversity enhancement

Large-scale dynamic vehicle routing problem (LSDVRP) is exhibiting extensive application prospect with the rapid growth of online logistics, whereas a few approaches have been developed to address LSDVRPs. The difficulty in solving LSDVRPs lies in that it requires quick response and high adaptability to numerous newly appeared customers in LSDVRPs. To overcome this difficulty, in this paper, we propose a responsive ant colony optimization algorithm, termed as RACO, for efficiently addressing LSDVRPs. In the proposed RACO, a pheromone diversity enhancing method is suggested to generate diverse pheromone matrices for quickly responding to newly appeared customer requests in solving LSDVRPs. A pheromone ensemble technique is further designed to produce a high-quality initial population that well adapts to the new customer requests by making use of diverse pheromone matrices. Empirical results on a set of 12 LSDVRP test instances demonstrate the effectiveness of the suggested pheromone diversity enhancing method in quickly responding to newly appeared customer requests for solving LSDVRPs. Moreover, we investigate the computational cost and the traveling cost obtained by the proposed RACO to evaluate responsiveness and adaptability of the proposed RACO, respectively. Comparison with four state-of-the-art approaches to DVRPs validates the superiority of the proposed RACO in addressing LSDVRPs in terms of responsiveness and adaptability.


Introduction
In recent years, the dynamic vehicle routing problem (DVRP) has been extensively studied due to its broad applications in real-world logistics, such as the online small-parcel last mile delivery [9,42] and the dial-a-ride problem [6,23]. A typical DVRP involves a fleet of vehicles in a central depot, a set of known customers in different geographic locations, and a set of unknown customer requests that may arrive during the traveling of these vehicles throughout a working day [20,29]. The main task of DVRP is to dynamically plan optimal traveling routes for these vehicles as new customer requests continuously arrive, so that all the customer requests will be satisfied by these vehicles [2,33,34].
To handle the DVRP task, various approaches have been developed and most of them are heuristics or metaheuristics, since DVRPs are NP-hard [3,24,41,45]. Nevertheless, the emergence of large-scale dynamic vehicle routing problem (LSDVRP) poses new challenges to the approaches to DVRPs, which is expected to play an increasingly important part in practical logistics planning with the growth of online logistics requirement [16,28]. For example, the same-day delivery service offered by online Tmall store in China should dynamically plan routes for vehicles to satisfy a considerable number of customer requests that continue revealing and being received during each working day. The delivery efficiency is among the most important factors for customers' satisfaction and commercial profit, and hence, couriers have to quickly adjust the traveling routes to finish delivery tasks before the end of the working day. As a result, planning routes in LSDVRPs is considered as a critical decision task for increasing both customers' satisfaction and commercial profit in real-world logistics.
Compared to typical DVRPs, LSDVRPs need to take a large number of customers into account when planning optimal traveling routes for vehicles, arising in two main issues in addressing LSDVRPs. First, numerous newly appeared customers during the traveling of vehicles lead to a high requirement for quick response to these new customers, by which LSDVRPs can be efficiently addressed. Second, the numerous new customers also require high adaptability of route planning to dynamic environment, since dramatic changes will occur in customer network because of the new customers. As far as we know, there are few studies working on approaches to LSDVRPs that can quickly respond to numerous newly appeared customers. Therefore, in this paper, we propose a responsive ant colony optimization algorithm, termed as RACO, for solving LSDVRPs via pheromone diversity enhancement that provides quick response to newly appeared customers in LSDVRPs. In the proposed RACO, a pheromone diversity enhancing method is suggested for quickly responding to newly appeared customers, based on which a pheromone ensemble technique is further designed to produce an initial population of high adaptability to new customers. To sum up, the main contributions of this paper are as follows: -We propose a pheromone diversity enhancing method to realize quick response to newly appeared customers for solving LSDVRPs, where the pheromone matrix in ant colony optimization algorithm varies to achieve good diversity that facilitates handling newly appeared customers. With the variations of pheromone matrix, a pheromone ensemble technique is specially designed to generate a highly adaptable initial population by constructing high-quality routes. By this means, the proposed pheromone diversity enhancement is able to provide quick and adaptive response to newly appeared customers for solving LSDVRPs. -Based on the proposed pheromone diversity enhancement, we suggest a responsive ant colony optimization algorithm (RACO) to address LSDVRPs, where newly appeared customers can be efficiently tackled using the proposed pheromone diversity enhancement. More-over, a greedy-based insertion is adopted to facilitate accommodating new customers into planned routes at low traveling cost for solving LSDVRPs. The proposed RACO is able to efficiently and effectively find highquality traveling routes for vehicles in solving LSDVRPs. -Empirical results on 12 LSDVRP test instances demonstrate that the proposed pheromone diversity enhancement successfully facilitates efficient and adaptive response to newly appeared customers in solving LSDVRPs. Comparison with four state-of-the-art approaches to DVRPs validates the superiority of the proposed RACO for solving LSDVRPs in terms of responsiveness and adaptability.
The remainder of this paper is organized as follows. "Related work on DVRPs" provides an introduction to existing works on DVRPs. "Problem formulation" presents the mathematical formulation of DVRP. "The proposed RACO" elaborates the proposed RACO, followed by empirical results reported and discussed in "Empirical results and analysis". "Conclusion" draws the conclusions and outlines the future work on LSDVRPs.

Related work on DVRPs
In this section, we first review studies on different types of DVRPs according to dynamic attributes. Then, we describe some representative approaches to DVRPs. Based on different dynamic attributes in real world, three types of DVRPs are commonly used in DVRP modeling, i.e., VRP with stochastic customer request or demand [19,32], VRP with uncertain service time [7,37], and VRP with uncertain traveling time [5,35]. Jia et al. [19] developed a dynamic logistic dispatching system where the underlying model is the DVRP that allows new customer requests being received as a working day processes. Chen et al. [7] considered the uncertainty of service time in DVRPs, and thus intended to find solutions that have not only low traveling cost but also low sensitivity to deviations of service times. Sabar et al. [35] investigated DVRP with uncertain traveling time where the traveling time between a pair of customers may vary because of possible traffic congestion.
To solve various DVRPs, a considerable number of approaches have been proposed and can be roughly categorized into three types [3,24,25,45]. The first type of approaches to DVRPs is to maintain or enhance population diversity in optimization for adapting to dynamic changes in DVRPs, in case that some non-optimal solutions outperform the optimal solutions when dynamic changes occur [1,35,41]. Wang et al. [41] proposed an ensemble learning-based multi-objective evolutionary algorithm (EL-DMOEA) to solve DVRPs with time windows, where a population-based prediction strategy, an immigrant strategy, and a random strategy are employed to enhance the population diversity and accelerate the convergence of the evolutionary algorithm. Experimental results on a set of DVRPTW test benchmarks revealed that EL-DMOEA significantly outperformed four state-of-the-art approaches to DVRPs in terms of both convergence and diversity. AbdAllah et al. [1] suggested an enhanced genetic algorithm to deal with DVRPs by means of increasing population diversity and the capability to escape from local optima. The experimental results on a set of popular DVRP benchmarks have shown the superiority of the enhanced genetic algorithm over several existing approaches in handling DVRPs. Sabar et al. [35] developed a self-adaptive evolutionary algorithm to address DVRPs with traffic congestion, which evolves a set of configurations to effectively handle dynamic changes by guiding the search process towards promising areas. These configurations consider parameter values, operator types, combination of operators, and order of operator invocation for the evolutionary algorithm. Empirical results on two well-known vehicle routing problems with traffic congestion demonstrated that the self-adaptive evolutionary algorithm performed better than a standard EA and several other stateof-the-art approaches for solving DVRPs.
The second type of approaches to DVRPs is to utilize a memory scheme for transferring some previously search information to the search when dynamic changes occur, so that the search does not need to be initiated from scratch [5,14,31]. Ferrucci et al. [14] proposed a new pro-active realtime Tabu search algorithm to handle DVRPs with urgent delivery goods, where some past request information is used to generate stochastic knowledge about future requests for actively guiding vehicles to new request-likely areas before the arrival of new requests. The proposed Tabu search algorithm was compared to a deterministic approach on various test scenarios and validated to attain near-optimal or optimal solutions for all the test scenarios. Cao et al. [5] suggested a unified pheromone-based traffic management framework to reduce traffic congestion in dynamic vehicle rerouting and traffic light control, in which the pheromone of routes is collected and then reused to evaluate real-time traffic conditions as well as to predict expected road congestion levels in near future. By predicting road congestion, a pro-active vehicle rerouting strategy is used to assign routes to vehicles before they enter congested roads based on global distance and local pheromone. Experimental results revealed that the suggested framework was superior over seven existing approaches in terms of both solution quality and robustness. Okulewicz and Mańdziuk [31] devised a knowledge transfer strategy within a two-phase multi-swarm particle swarm optimizer to solve DVRPs, where the best previously found routes are transferred to a new optimization after dynamic changes occur. Empirical results demonstrated that the devised algo-rithm held better performance than several state-of-the-art approaches to DVRPs.
The third type of approaches to DVRPs is to adopt local search in quality improvement of found solutions, which is one of the most popular methods for searching high-quality routes for solving various VRPs [8,12,26]. Euchi et al. [12] suggested an artificial ant colony algorithm using 2-opt local search to solve DVRPs, where the 2-opt local search refines best found routes by swapping customers at each generation to achieve a lower traveling cost. Comparison with several state-of-the-art approaches to DVRPs verified that the suggested algorithm reached a highly competitive performance for solving DVRPs. Mavrovouniotis et al. [26] developed a memetic ACO algorithm integrated with a local search operator to address symmetric and asymmetric dynamic traveling salesman problems (DTSPs). The local search operator is configured with various moves to cope with asymmetric cases without performance deterioration on symmetric cases. Experimental results demonstrated the efficiency of the proposed memetic algorithm in addressing DTSPs compared with other state-of-the-art approaches to DVRPs. Chen et al. [8] proposed an adaptive large neighborhood search algorithm to solve DVRPs with limited vehicles and hard-time windows, which adopts an ad-hoc destroy/repair heuristics and a periodic perturbation procedure. Simulation results on Lackner's benchmark showed that the proposed algorithm can efficiently solve real-time DVRPs by attaining highquality solutions.
In addition to the above three types of approaches, some other representative works have also been reported in tackling DVRPs [17,18,30]. Muñoz-Carpintero et al. [30] designed different configurations of particle swarm optimization and genetic algorithms within a proposed ad-hoc methodology, to solve dynamic pickup and delivery problems in real time. Simulation results on real-world instances showed that the designed approach had much better performance than a well-known heuristic method. James et al. [18] proposed a novel deep reinforcement learning-based neural combinatorial optimization strategy to realize online vehicle routing, which is first transformed to a vehicle tour generation problem and then solved by applying a structural graph embedded pointer network to these tours iteratively. The proposed strategy was comprehensively investigated on a real-world transportation network, and simulation results verified that the proposed strategy significantly outperformed some existing strategies for solving online vehicle routing problems. Guo et al. [17] suggested a two-phase robust particle swarm optimization which builds robust virtual routes for all the potential dynamic customers in advance, to solve a multi-objective DVRP. New vehicle routes are then created by removing all dynamic customers from the robust virtual routes when dynamic changes occur. Empirical results demonstrated that the suggested method produced routes that were more stable and robust than several existing multiobjective DVRP methods.
Despite that a number of approaches have been developed to solve DVRPs, few of them devote to handling LSD-VRPs that have higher requirement for responsiveness and adaptability to newly appeared customers in LSDVRPs compared to DVRPs. LSDVRPs deserve lots of attention from researchers, since LSDVRPs widely exist in real-world logistics, such as real-time ride sharing and taxi dispatch [16,28]. Due to the complexity of planning routes, DVRPs with over 500 customers can be considered as large-scale DVRPs, which can be also attributed to the development of VRP benchmarks [22,40]. In this paper, we propose a responsive ACO via pheromone diversity enhancement to tackle LSDVRPs, where a pheromone diversity enhancing method is suggested to respond to new customers, and a pheromone ensemble technique is designed for adaptation to these customers.

Problem formulation
In this section, we introduce the mathematical formulation of DVRP as presented in [44]. The formulation of CVRP is first given, since DVRP is extended from CVRP [10,20]. Then, the formulation of DVRP is presented.
CVRP is defined based on an undirected graph G = (V , E), where V = {v 0 , v 1 , . . . , v n } is a node set composed of a depot node v 0 and a set of customer nodes v 1 , . . . , v n at different locations, E = {e i j }(i, j = 0, 1, . . . , n), is a set of arcs between nodes with different traveling costs, and e i j denotes the arc between customers v i and v j . There are a fleet of capacitated vehicles in the depot, which are defined to be homogeneous and thus have identical capacity. The main task of CVRP is to find optimal traveling routes for these vehicles at the lowest traveling cost, so that all the customers can be served by these vehicles. Formally, CVRP can be defined as follows: where X = [x i j ] is a matrix consisting of decision variables, x i j = 1 indicates that the node v j is visited after the node v i is visited and 0 otherwise, W = [w i j ] denotes a distance matrix, and w i j denotes the traveling distance between nodes v i and v j . Equation 1 defines the objective of CVRP, which aims at minimizing the total traveling cost of vehicles.
In addition, the CVRP should also satisfy the following constraints: where H = [h i ] is a vector denoting vehicle routes that are obtained from X , m is the number of vehicles provided by the depot, |s| is the number of elements in the set s, q h i+1 denotes the demand of customer node v h i+1 , and c denotes the capacity of each vehicle. Constraints (2) guarantee that each customer should be visited exactly once. Constraint (3) makes sure that the number of dispatched vehicles cannot exceed the total number of vehicles in the depot. Constraint (4) ensures that the vehicle routes will not contain disconnected routes. Constraint (5) indicates that the total customer demand satisfied by a vehicle should not be larger than the capacity of this vehicle. Constraint (6) means that each decision variable can be set to either 1 or 0. Based on the CVRP formulation, a DVRP can be defined as a series of CVRPs by means of decomposing a working day into a set of time slices with equal length, which has been suggested in [20,29] and widely adopted in DVRP study [13,47]. The DVRP at each time slice is regarded as a static CVRP considering both the unvisited known customers and new customers appearing at previous time slice, except for the DVRP at the first time slice, where only the unvisited known customers left from the previous working day are considered. Parameters n ts , T co , T ac , T are used in creating LSDVRP instances based on a set of customer requests that will arrive during the working day. The n ts denotes the number of time slices, indicating how often DVRP will be renewed and solved. The T co denotes the cut-off time after which any newly arrived customer requests will be postponed to the next working day. For creating a DVRP instance based on a static VRP instance, the T co is considered as a time threshold after which the newly arrived customer requests are assumed to be known at the beginning of current working day. The T ac is the commitment time, indicating that a customer request should be committed to a vehicle at least T ac seconds prior to the estimated time of departing from last visited customer, which is regarded as a reaction time for vehicles after being committed new requests.
The objective of DVRP is to minimize the total traveling cost of routes that are executed by vehicles in one working day, which is defined as follows: where n ts is the number of time slices, Y = {y i jk } is a set of matrices and each of them consists of the decision variables denoting the routes that are executed by vehicles at one time slice, y i jk = 1 indicates that the node v k is visited after the node v j is visited at the i-th time slice and 0 otherwise, U = {u i jk } is a set of distance matrices, and u i jk denotes the traveling distance between nodes v j and v k at the ith time slice. Here, i can be equal to n ts +1, since a working day starts with a set of unvisited known customers left from previous working day. The constraints of DVRP can be decomposed into a series of constraints of CVRPs at the above time slices, where at each time slice, the constraints are the same as those presented from Formulas (2) to (6), except that the decision variables and the capacity of each vehicle will vary with the execution of routes.

The proposed RACO
In this section, we elaborate the proposed RACO by three steps. First, the framework of the proposed RACO is introduced. Second, we explain the pheromone diversity enhancing method and the pheromone ensemble technique in details. Finally, other details in the proposed RACO are presented.

Framework of the proposed RACO
Evolutionary algorithms are among the most popular metaheuristics and have been demonstrated to be effective in various large-scale optimization [38,39,46]. Hence, we develop RACO under the framework of the evolutionary algorithm named ant colony optimization (ACO), which has also exhibited effectiveness in solving DVRPs as suggested in [29]. Figure 1 depicts the framework of the proposed RACO, which mainly consists of five steps. First, an initial DVRP is input into ACO at first time slice of a working day, where the known customer requests are left from previous working day. Second, ACO proceeds to search the optimal routes for the input DVRP at one time slice. Third, a new DVRP is created by collecting the customer requests that are not Algorithm 1: RACO Output: R (optimal routes) 1 S ← Initial DVRP creation (C p ) 2 /*C p denotes a set of known customer requests left from previous working day*/ 3 for i = 1 : n ts + 1 do 4 /*n ts is the number of time slices*/ 5 R, τ ← ACO(S, P) 6 /*R and τ denote the optimal routes and the pheromone trails Renew vehicle capacities and start points 16 Return R committed to vehicles and those newly appearing at previous time slice, in which the customer requests that are not committed to vehicles are obtained based on the routes planned by ACO. Fourth, a pheromone diversity enhancing method is suggested to improve the diversity of pheromone matrix in ACO for a response to the new DVRP. Fifth, a pheromone ensemble technique is used to aggregate important information in diverse pheromone matrices, from which a high-quality initial population can be produced to adapt to the new DVRP. The high-quality initial population will be provided to ACO for the optimization of new DVRP at next time slice. The second to the fifth steps will repeat for each time slice and terminate until the end of current working day, after which the optimal routes that are practically executed by vehicles will finally be output. Algorithm 1 presents framework of the proposed RACO in more details.

Pheromone diversity enhancement
Despite that ACO is able to solve DVRP at each time slice, it is difficult for ACO to handle the numerous newly appeared customers in LSDVRPs. Therefore, a pheromone diversity enhancing method is devised in the proposed RACO to quickly respond to these new customers for solving LSD-VRPs. The pheromone diversity enhancing method is first used to generate a set of diverse pheromone matrices based on the new customers at each time slice, and then, a pheromone ensemble technique is applied to these pheromone matrices to produce a high-quality initial population for ACO at next time slice. The above two steps are described in details in this section.
To begin with, Fig. 2 illustrates the pheromone diversity enhancing method, which produces a set of diverse pheromone matrices via three main steps after receiving newly appeared customers. First, the dynamism of newly appeared customers is detected by comparing these new customers with current DVRP. Then, a transformation matrix is generated based on the detected dynamism and distance matrix for known customers in current DVRP. The generated transformation matrix is finally utilized to transfer current pheromone matrix to a set of diversely distributed pheromone matrices.
More specifically, Algorithm 2 presents detailed steps in the pheromone diversity enhancing method. First, the dynamism is detected and calculated as the ratio of number of newly appeared customers to the number of currently known unvisited customers d s . Then, the Monte Carlo Sam- pling is adopted to randomly select d s × N ×(N −1) 2 pairs of customers from the distance matrix W under a set of probability P A for selecting pairs of customers, where N is the number of known unvisited customers at current time slice. In probability set P A, the probability for each pair of customers is determined according to the distance from the arc connecting the customer pair to the new customer that gets closest to the arc, which is formally defined as follows: n is the new customer that gets closest to the arc e i j , and dis max denotes the largest distance between a new customer to a known arc connecting currently known unvisited customers. This step is designed based on the idea that the closer a new customer is to an arc, the more greatly the new customer will influence the arc, which is expected to change the value of pheromone trail on this arc. The Monte Carlo Sampling can thus be applied to currently known customer network based on the probability set P A to obtain d s × N ×(N −1) 2 pairs of customers between which the pheromone is worthy of change in tackling newly appeared customers.
To change the pheromone between each selected customer pair, a transformation matrix should be computed by Therefore, a set of piecewise functions are defined to compute the transformation coefficient values for the transformation matrix, which are listed as follows: where ε is an extremely small value to avoid an invalid division, dis arc(i, j,C i j n ) is the smallest traveling cost of the arc connecting the customer v i , v j and the new customer C i j n that gets closest to the arc connecting the customers v i and v j , avg(W ) and avg(τ ) denote the average values of W and τ , respectively. By this way, a transformation matrix is produced and applied to current pheromone matrix, so that a new pheromone matrix is obtained. To ensure diversity of the pheromone matrix, the transferring for current pheromone matrix will be implemented for n s times, where n s is set as the maximum number of sampling when the Monte Carlo Sampling covers all the customer pairs that satisfying the conditions in a) and c) in the transformation rule. Consequently, a set of diverse pheromone matrices is obtained and expected to effectively respond to newly appeared customers, since these matrices are of good diversity and are generated based on the dynamism and current DVRP.
To employ these pheromone matrices in handling newly appeared customers in LSDVRPs, a pheromone ensemble technique is further suggested for producing a high-quality population based on these pheromone matrices, which is presented in Algorithm 3. In the pheromone ensemble technique, each of these diverse pheromone matrices leads the route construction presented in [29], so that low-cost routes can be obtained and grouped into their leading pheromone matrix. Then, a greedy-based insertion method is introduced from [43] and adopted in accommodating newly appeared customers into the constructed routes. Finally, the interpheromone crossover and mutation are applied to these routes for a set of offspring routes, which ensembles routes from different groups by crossover and mutation operators presented in [4]. A population is then created for initiating the DVRP optimization at next time slice.

Details in the proposed RACO
This section describes some details in the proposed RACO, including the ACO framework, pheromone-guided route construction, pheromone update, and greedy-based insertion.
Inspired by the foraging behavior of ants, ACO is a popular metaheuristic in solving route optimization problems, where Update t c 12 Return R O each ant determines the routes of searching food according to the pheromone trails previously left by other ants. In this paper, we adopt the ACO algorithm presented in [29], which has been validated to be effective on solving DVRPs. Algorithm 4 describes main framework of ACO, which consists of five steps. First, the pheromone trail is initialized as τ = 1 n×Cost(P I ) for the route segment between each pair of known customer and depot nodes, where n is the number of known customer nodes, and Cost(P I ) is the traveling distance of routes that are obtained by a greedy post-insertion algorithm as presented in [11]. Second, a population is generated using a pheromone-guided route construction method. Third, individuals in the created population are evaluated, by which the optimal routes are identified and recorded. Fourth, 2-opt local search is applied to the optimal routes for achieving lower traveling cost. Fifth, the pheromone trail of each route segment is updated according to the previously found routes. The second step to the fifth step will be repeated until the termination of one time slice, and the routes that are practically executed by vehicles will be output as the best routes found by ACO.
Compared to other metaheuristics, ACO is characterized with the pheromone-guided route construction for population creation at each iteration. To generate each individual, the pheromone-guided route construction determines the selection of each route segment based on the distance and the pheromone trails of each route segment. Formally, the next visited node for each ant in the pheromone-guided route construction is determined by a probability distribution of a set of feasible customers that are possibly visited next where τ i j is the value of pheromone trail for arc(i, j), the η i j is the attractiveness of arc(i, j) and is calculated as η i j = 1 w i j , and the α, β are two parameters that control the importance of τ i j and η i j , respectively. The pheromone-guided route construction can generate high-quality population using traveling distance and pheromone trails between nodes.
To acquire higher quality population over iterations, the pheromone-guided route construction often equips with pheromone update for constructing routes with lower traveling cost. In the ACO for DVRPs, there are three types of pheromone update strategies: route pheromone update for renewing pheromone of each route, generation pheromone update for renewing pheromone over generations, and dynamic pheromone update for renewing pheromone over time slices [29]. In the route pheromone update, the pheromone trail between nodes v i and v j is updated when an ant k visits node v j after node v i , which is based on the following rule: where τ i j denotes the pheromone trail on the route segment between nodes v i and v j , τ 0 is the initial value of pheromone trail for each route segment between each pair of known nodes, ρ denotes the pheromone evaporation rate that determines how much previous pheromone trails will maintain, n c is the number of known customer nodes, and L nn denotes the traveling distance of the routes found by a nearest neighbor heuristic as suggested in [15]. Once the route construction for the population at each generation is completed, the generation pheromone update is applied to the pheromone of the population as follows: where f r denotes the traveling distance of the best routes r found in previous optimization. When the ACO process terminates at each time slice, the dynamic pheromone update is employed to ensure that the pheromone trails can adapt to dynamic changes in DVRPs. Specifically, the pheromone trails are updated at the beginning of DVRP optimization at next time slice as follows: where γ r denotes the pheromone evaporation rate between DVRPs at adjacent time slices, τ o i j denotes the value of τ i j in DVRP at previous time slice, and c 0 denotes a set of customers that are received at both of the two adjacent time slices.
With the three types of pheromone update strategies, the ACO for DVRPs can proceed to search for high-quality individuals in solving DVRPs. In the pheromone diversity enhancing method, the initial population is created by a greedy-based initialization method, which is presented in Algorithm 5 and mainly consists of the following three steps. In each individual, each newly appeared customer is first inserted into the position before or after a known customer that is nearest to the customer in the individual, where the specific insertion position is determined based on whether the obtained individual achieves a lower traveling cost. Then, the route pheromone update is implemented on the created route as presented in Eq. (9), The above two steps will be repeated until until all the newly appeared customers are inserted into the individual. Next, the 2-opt local search [12] is applied to the individual for achieving a lower traveling cost in the individual. The above three steps will be repeated until all the individuals are renewed, after which an initial population is output to the ant colony optimizer.

Empirical results and analysis
This section first demonstrates the effectiveness of the suggested pheromone diversity enhancement by comparing it to ACO (i.e., ACO without the pheromone diversity enhancement) in solving large-scale DVRPs. Then, the superiority of the proposed RACO is verified in comparison with four stateof-the-art approaches to DVRPs. Finally, the responsiveness and adaptability of the proposed RACO are discussed.

Test instances
We evaluate the performance of the proposed RACO on a set of 12 LSDVRP test instances with 560-1200 customers. The DVRP creation method suggested in [20] is used to create these LSDVRP instances based on 12 large-scale static VRP 'N C ' denotes the number of customers, 'Capacity' denotes the capacity of each vehicle, 'T ' denotes the length of a working day, and 'S t ' denotes the service time for each customer instances proposed by Li et al. [21]. Specifically, Table 1 presents the properties of the 12 LSDVRP test instances.

Compared algorithms
According to the categorization of approaches to DVRPs in Sect. 2, four state-of-the-art approaches to DVRPs are selected for comparison with the proposed RACO on 12 LSDVRP test instance, namely, MIACO [27] (based on diversity), 2MPSO [31] (based on memory), S-VNS [36] (based on local search), and ES-LC [48] (others).

Parameter setting
The parameter setting in the proposed RACO is listed in Table 2. The parameters in 2MPSO, S-VNS, and ES-LC are set according to the references where 2MPSO and ES-LC were originally suggested [31,36,48], since the parameter settings in their original study have been validated to be effective in solving a set of commonly used DVRP test instances. For two ACO-based algorithms RACO and MIACO, the parameters that they have in common are set to the same values for a fair comparison, whereas their unique parameters are set the same as those suggested in the empirical study in [44]. On each LSDVRP test instance, each algorithm considered in empirical study is implemented for 30 independent runs and 480 s at each time slice. The empirical study is performed on MATLAB 2017b, in a PC with a 1.80 GHz Intel Core i7-8565U CPU and the Windows 7 SP1 64 bit operating system.
'Ave' and 'Std' denote the average and standard deviation of traveling cost over 30 runs, respectively. The Wilcoxon rank sum test is performed at a significance level of 0.05, where '+', '≈', and '−' indicate that the performance of RACO is significantly better, similar, and worse than that of the corresponding compared algorithm on the test instance, respectively

Effectiveness of pheromone diversity enhancement
In this subsection, we verify the effectiveness of the pheromone diversity enhancement by comparing the proposed RACO with the ACO without the pheromone diversity enhancement (namely, the ACO used in [29]). Table 3 presents the traveling cost achieved by the proposed RACO and ACO on 12 LSDVRP instances, where the Wilcoxon rank sum test is used to identify the performance difference between RACO and ACO at a significance level of 0.05. It can be seen from Table 3 that the proposed RACO achieves lower traveling cost than ACO on eleven of the 12 LSDVRP instances. Based on the Wilcoxon rank sum test, the proposed RACO significantly outperforms ACO in terms of traveling cost on 9 of the 12 instances, while the proposed RACO has a similar performance to ACO on the remaining three instances. Hence, the pheromone diversity enhancement strengthens the performance of ACO in solving LSDVRPs. Moreover, it can also be observed that the strengthening effect of pheromone diversity enhancement can maintain high level on most of the 12 LSDVRP instances. The proposed RACO reduces over 10% traveling cost compared to ACO on instances L600, L760, L800, L840, L960, L1120, and L1200, and particularly reduces around 15% traveling cost on instances L600, L760, L800, and L840. On instances L880 and L1040, the proposed RACO reduces about 7% traveling cost compared to ACO. On instances L560 and L720, minor traveling cost reduction can also be reached by the proposed RACO. Despite that the traveling cost achieved by the proposed RACO is about 3% higher than that achieved by ACO on L640, the proposed RACO is still competitive to ACO, since the corresponding result of Wilcoxon rank sum test indicates that the performance of the two algorithms is similar on instance L640. More specifically, Fig. 3 presents the traveling cost obtained by RACO, RACO-D (RACO without pheromone diversity enhancing method), RACO-E (RACO without pheromone ensemble technique), and RACO-G (RACO without greedy-based initialization) on twelve LSDVRP instances. According to Fig. 3, each step in the pheromone diversity enhancement is effective in reducing the traveling Traveling cost 10 4 RACO RACO-D RACO-E RACO-G cost obtained by RACO, including the diversity enhancing method, the pheromone ensemble technique, and the greedybased initialization. Particularly, the diversity enhancing method and the pheromone ensemble technique are more effective in strengthening the performance of RACO than the greedy-based initialization, since they facilitate RACO in reducing more traveling cost than the greedy-based initialization on most of the 12 instances.
Therefore, we can conclude from Table 3 that the pheromone diversity enhancement is effective in improving the performance of ACO in addressing LSDVRPs.

Performance comparison between the proposed RACO and existing approaches to DVRPs
In this subsection, we investigate the performance of the proposed RACO in addressing LSDVRPs by comparing it with four state-of-the-art approaches to DVRPs. Tables 4 presents statistical results of the proposed RACO, MIACO, S-VNS, 2MPSO, and ES-LC on 12 LSDVRP instances averaging over 30 runs. As can be observed from Tables 4, the proposed RACO achieves the lowest traveling cost on 9 of the 12 LSDVRP test instances among all the compared algorithms. According to the Wilcoxon rank sum test, the proposed RACO overall exhibits significantly better performance than each of the four state-of-the-art approaches to DVRPs on most of the 12 instances. Compared to MIACO which achieves the best performance among the four existing algorithms, the proposed RACO performs significantly better, similarly and significantly worse on eight, three, and one LSDVRP instances, respectively. On larger scale instances among the 12 instances L960, L1040, L1120, and L1200, the proposed RACO maintains significantly better performance than the four existing algorithms except for only one case, where the proposed RACO has a similar performance to ES-LC on L1120. This reveals that the proposed RACO is potentially highly effective in solving LSDVRPs, which is expected to be applied to very large-scale DVRPs in the future.
To sum up, the proposed RACO is superior over state-ofthe-art approaches to DVRPs in terms of traveling cost.

Responsiveness and adaptability of the proposed RACO
In this section, we discuss the responsiveness and adaptability of the proposed RACO by comparing it four state-of-the-art approaches to DVRPs. Figures 4 and 5 show the responsiveness and adaptability of RACO, MIACO, S-VNS, 2MPSO, and ES-LC, averaging over 30 runs on 12 LSDVRP test instances. The responsiveness of each compared algorithm is represented as the computational cost by which the algorithm converges to the lowest traveling cost when dynamic changes occur in LSDVRPs. The adaptability of each compared algorithm is represented as the relative traveling cost increasing reached by the algorithm between the first time slice and the second time slice in LSDVRPs. Note that the evaluation for adaptability only considers the traveling cost increasing between the first time slice and the second time slice, since the relative traveling cost increasing in later time slices is too small (i.e., the original traveling cost at current time slice is very large) to facilitate in identifying the adaptability difference between compared algorithms.
As can be observed from Figs. 4 and 5, the proposed RACO is superior over four existing algorithms in terms of both responsiveness and adaptability on all the 12 LSD-VRP test instances. For responsiveness, the proposed RACO maintains rapidly converging to the lowest traveling cost with less than 1/3 length of one time slice on all the 12 instances in handling dynamic changes in LSDVRPs, while  the proposed RACO adapts very well to newly appeared customers with less than 10% traveling cost increasing on instances L640, L720, L760, and L960. It is validated that the proposed RACO has higher adaptability to numerous newly appeared customers than existing approaches to DVRPs in solving LSDVRPs. This is because that the pheromone ensemble technique generates high-quality initial population to adapt the proposed RACO to new customers in solving LSDVRPs.
To sum up, the proposed RACO has higher responsiveness and adaptability in handling numerous newly appeared customers in LSDVRPs than existing approaches to DVRPs.

Conclusions
This paper has proposed a responsive ACO (RACO) via pheromone diversity enhancement to handle numerous newly appeared customers in solving LSDVRPs. In the proposed RACO, we have suggested a pheromone diversity enhancing method to quickly respond to newly appeared customers, and designed a pheromone ensemble technique to produce a highquality initial population well adapting to new customers in solving LSDVRPs. Comparison with four state-of-the-art approaches to DVRPs has validated the superiority of the proposed RACO in addressing LSDVRPs in terms of adaptability and responsiveness. This paper has demonstrated the effectiveness of pheromone diversity enhancement for solving DVRPs, making use of dynamism detection and diversity enhancement to handle dynamic changes. However, there is still plenty of room to fully leverage not only the dynamism information but also current DVRP information for dealing with LSDVRPs. Hence, an interesting idea is to revise learning methods to reflect the relationship between the dynamism information and the DVRP renewing so as to quickly guide the search to global optima in solving LSDVRPs. Moreover, the effectiveness of the proposed RACO needs to be further verified on some real-world LSDVRP applications. Availability of data and materials Not applicable.

Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.

Code availability Not applicable.
Humans and/or animals The results reported in our study do not involve any humans and/or animals.

Ethics approval Not applicable.
Consent to participate Not applicable.

Consent for publication Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.