Solving the vehicle routing problem by a hybrid meta-heuristic algorithm

The vehicle routing problem (VRP) is one of the most important combinational optimization problems that has nowadays received much attention because of its real application in industrial and service problems. The VRP involves routing a fleet of vehicles, each of them visiting a set of nodes such that every node is visited by exactly one vehicle only once. So, the objective is to minimize the total distance traveled by all the vehicles. This paper presents a hybrid two-phase algorithm called sweep algorithm (SW) + ant colony system (ACS) for the classical VRP. At the first stage, the VRP is solved by the SW, and at the second stage, the ACS and 3-opt local search are used for improving the solutions. Extensive computational tests on standard instances from the literature confirm the effectiveness of the presented approach.


Background
One of the parameters that is always in mind in production and services is the reduction of the product's expense since lower product costs yield increase in the company's competitive advantage in terms of production and services, so the company's profit is growth.Furthermore, minimizing the transportation cost is one of the cost reduction methods in which the goods are transported from one place to other places with minimum cost.Therefore, effectively resolving the vehicle routing problem (VRP) is incredibly important in a distribution network of a logistics system.
A lot of research has been carried out in the field of logistics, from the traveling salesman problem to complex dynamic routing problems.Among the prominent problems in distribution and logistics, the vehicle routing problem and its extensions have been widely studied for many years, mainly because of their applications in real world logistics and transportation problems.The VRP has been proven to be a non-deterministic polynomial time problem (Bodin & Golden 1981).This means that the time of the VRP solution grows exponentially with increasing the distribution points.
A large number of techniques in the literature deal with the VRP and its variations such as VRP with pickup and delivery, time windows VRP, stochastic VRP, multidepot VRP and others.Generally, the techniques used for solving the VRP can be categorized into exact, heuristic and hybrid methods.Although exact algorithms are appropriate for instances with small size, they are not often suitable for real instances owing to the computational time required to obtain an optimal solution.There have been many papers proposing exact algorithms for solving the VRP.These algorithms are based on column generation approach (Choi & Tcha 2007), implicit enumeration algorithm (Ruiz et al. 2004), branch-andbound method (Valle et al. 2011), etc.The VRP is an NP-hard problem and difficult to solve by exact methods within acceptable computing times (Lee et al. 2003).In other words, when the problem size of the VRP is increased, the exact methods cannot solve it.So, heuristic methods are necessary to be used for solving instances with large size in a reasonable amount of time.These algorithms can be divided into two main groups including classical heuristics and meta-heuristics.Since heuristic approaches are not so efficient for escaping local optimum values, the obtained solutions by them may have a large disparity compared to the best known or sub-optimal solutions.Some of the well-known heuristic algorithms are gravitational emulation search (Clarke & Wright 1964), local search (Gillett & Miller 1974) and Lin-Kernighan (Christofides et al. 1979).
A new kind of emerged algorithm basically tries to combine basic heuristic methods in higher level frameworks aimed at efficiently and effectively exploring a search space in the last 30 years.These methods are nowadays commonly called meta-heuristics.Since the meta-heuristic approaches are very efficient for escaping from local optimum, they are one of the best group algorithms for solving combinatorial optimization problem.A great number of meta-heuristic algorithms have been applied to the VRP, including genetic algorithm (Baker & Ayechew 2003), large neighborhood search (Hong 2012;Pisinger & Ropke 2010), tabu search (TS) (Leung et al. 2011), simulated annealing (Osman 1993), memetic algorithm (Ngueveu et al. 2010), neural networks (Su & Chen 1999) and particle swarm optimization (Ai & Kachitvichyanukul 2009).On the other hand, a limited amount of research addressing the VRP has used the ant colony optimization (ACO), with candidate lists and ranking techniques to improve the ability of a single ACO to solve the VRP (Bullnheimer et al. 1998;Bullnheimer et al. 1999).
Recently, many researchers have found that the employment of hybridization in optimization problems can improve the quality of problem solving in comparison with heuristic and meta-heuristic approaches.Since hybrid algorithms such as genetic algorithms with a local search (Prins 2009;Lima et al. 2004), genetic algorithms with sweep algorithm (SW) and nearest addition method (Wang & Lu 2009), ACO and greedy heuristics (Zhang & Tang 2009), simulated annealing and TS (Lin et al. 2009), neural networks and genetic algorithms (Potvin et al. 1996a;Potvin et al. 1996b), genetic algorithms and ACO (Reimann et al. 2001), etc. have greater ability to find an optimal solution, they have been considered by researchers and scientists in recent years.
The ACO approach is one of the famous meta-heuristic algorithms that simulates the ants' food-hunting behavior and is used for solving combinatorial optimization problems that do not have a known efficient algorithm.This technique, one of the most powerful methods compared to other meta-heuristic algorithms nowadays, was introduced by Dorigo et al. in (Dorigo 1992) (Dorigo et al. 1991;Dorigo 1992).Because a limited amount of research addressing the VRP has used the ACO combined with heuristic algorithms, a hybrid meta-heuristic algorithm based on ant colony system is proposed for the VRP in this paper.In this algorithm, the VRP is solved by the SW firstly, and then the ant colony system (ACS) and 3-opt local search are used for improving the solutions.
In the following parts of this paper, in 'VRP mathematical model' , we will explain the formulation of VRP.
Then, at the beginning of 'Methods' , we specially explain the SW and ACS, and then, the combination of ACS and SW is extendedly explained in this section.In 'Numerical calculations' , the proposed algorithm will be compared with some of the other algorithms on standard problems, which are included in the VRP library.Finally in the last section, the conclusions are presented.

VRP mathematical model
The VRP is one of the applied problems in the real world.This problem consists of a large number of customers in which everyone has a known demand level.These customers must be supplied from node 0 called the depot.Delivery routes which start and finish at the depot are required for vehicles so that all customer demands are satisfied and each customer is visited by just one vehicle.Vehicle capacities are given and limited by Q.So, the VRP can be explained mathematically as follows.
Let G (V, A) be a perfect undirected connected graph with a vertex set V = {0, 1, . .., n} and an edge set A = {(i, j):i, j 2 V, i 6 ¼ j}.If the graph is not perfect, we can replace the lack of any edge with an edge that has an infinite size.Vertex 0 represents the depot; each of the other vertex i 2 V − {0} is a customer with a nonnegative demand q i , and a non-negative distance c ij is associated with each arc (i, j) 2 A and represents the travel cost from node i to node j.The cost matrix is symmetric, i.e., for all i, j 2 V, c ij = c ji .The use of the loop arc (i, i) is not allowed and defining c ii = + ∞ for all i 2 V.
So, the solution for the VRP determines a set of delivery routes that satisfies the requirements of distribution points and obtains the minimum total cost for all vehicles.In practice, minimizing the total cost is equivalent to minimizing the total distance traveled by m > 1 vehicles.
For presenting the integer linear programming model for VRP, the variables below are introduced: n = the number of nodes for each instance.m = the number of vehicles used for each instance.
x ij ¼ 1 if the vehicle travels directly from node ito node j 0 otherwise: Hence, one of the integer programming formulations for the VRP can be written as follows: subject to The objective function (1) minimizes the total distance traveled in a tour.Constraint sets (2) and (3) ensure that the vehicle arrives once at each node and m times at the depot.Constraint sets (4) and ( 5) ensure that the vehicle leaves each node once and leaves the depot m times.Constraint set (6) prevents vehicles from carrying loads more than their capacity.Constraint sets ( 7) and ( 8) avoid the presence of sub-tours for each vehicle.Finally, Constraint sets ( 9) and ( 10) define binary conditions on the variables.

Numerical calculations
At the first stage in this section, sensitivity analyses of parameters in the proposed algorithm (PA) are performed, and at the second stage, the SW + ACS, which was discussed in the previous section, is analyzed using several benchmarking problems of Christofides et al. (Christofides et al. 1979) and Taillard (Taillard 1993).The proposed algorithm was coded using MATLAB language executed on a computer with a 1.00-GB RAM and an Intel, 2.80-GHz CPU.

Sensitivity analyses of parameters
In this section, the parameters in our algorithm are tuned, in which there are four parameters including α, β, ρ and q 0 .The ranges of the four parameters were set to α 2 {1, 2, 3}, β 2 {2, 4, 6}, ρ 2 {0.1, 0.2, 0.3} and q 0 2 {0.9, 0.95, 0.99}.When tuning the parameters, the instance E-n51-k5 was determined as the test problem.Then, the algorithm with each parameter combination for this instance was tested five times.
Based on the gained results, the algorithm with the smaller weight parameter (α) of pheromone trails possesses higher performance.This may be attributed to the fact that in the SW + ACS, the initial pheromone trails are large values.If using the large control factor of the pheromone trail, the effect of visibility value is weakened and results in a premature convergence.In addition, the qualities of the solutions of the algorithms with β = 2 are better than those of 4 and 6.
From the test results, it can be found that by setting the evaporation factor to 0.1, the proposed algorithm can yield better solutions.This can be attributed to the fact that if pheromone evaporation is too rapid, it more easily results in the search to be trapped in local minimum.In other words, the smaller evaporation factor can ensure the sufficient diversity of search space and guide following ants to explore better solutions.

Benchmark instances
In the instances of Christofides et al. (Christofides et al. 1979), which can be downloaded from the DEIS - Operations Research Group Library of Instances (DEIS -Operations Research Group 2012), the results obtained from calculations by the proposed methods are compared with other algorithms including saving algorithm (SA), saving algorithm + 3-opt (SA + 3-Opt), TS search and ACO.The first two of these algorithms are traditional, and the next two are relatively new.These algorithms were executed on standard instances containing between 51 and 200 nodes of the VRP problem including E-n51-k5, E-n76-k10, E-n101-k8, E-n101-k10, E-n121-k7, M-n151-k12 and M-n200-k17.The number at the right of each instance shows the number of vehicles, and the middle number indicates the number of related customers.
Table 1 summarizes the computational results of the proposed, compared and other mentioned algorithms.Moreover, to show the method's performance more clearly, we present the best known solutions (BKSs) that have been published in the related literature in this table.
Besides, Figure 2 shows a comparison between the gap values of the heuristic and meta-heuristic algorithms, where the gap is defined as the percentage of deviation from the best known solution in the literature.The gap is equal to 100[c(s ** ) − c(s * )]/c(s * ), where s ** is the best solution found by the algorithm for a given instance, and s * is the overall best known solution for the same instance on the web.A zero gap indicates that the best known solution is found by the algorithm.
As can be seen from Table 1, the proposed algorithm finds the optimal solution for one out of seven problems that are published in the literature.The results indicate that SW + ACS is a competitive approach compared to the traditional and new meta-heuristics.For instances E-n101-k8 and M-n200-k17, the gap is relatively as high as 3%.However, in other instances, the proposed algorithm finds nearly the best known solution, i.e., the gap is below 0.76%, and overall, the average difference is 1.1%.The performance comparison of results shows that the SW + ACS method clearly yields better solutions than the ACO, saving algorithm and its modified version (SA + 3-opt).
Moreover, computational results of the proposed algorithm and TS show that these algorithms have a close competition, and the proposed algorithm gives four better solutions than the TS.In other words, the performance of the SW + ACS is better in reaching the suboptimal solution than the TS (Table 1).
From the statistical viewpoint in Table 2, there is no statistical significant difference between the PA and other meta-heuristic algorithms (p value = 0.9881).Furthermore, it can be observed from both the medians plot (Figure 3) and the means plot (Figure 4) that the PA is  better than the ACO, SA + 3-Opt and SA algorithms.Also, by observing these figures, we can yield the approximate equality of the PA and TS methods.
In Table 3, the values obtained by the common ACO and proposed algorithm are shown.The results indicate that these two algorithms have very good ability for small problems, have approximately similar behavior and can converge to the best solutions.However, this ability of ACO against the proposed algorithm decreases when the number of nodes and feasible solutions increases.
Table 4 shows the results obtained for the second problem instances including 12 instances proposed by Taillard and presents the comparison between the best results of our algorithm and the BKS.As shown in Table 4, in three cases, the BKS can be found by our algorithm.In other cases, the deviation between the BKS reported and the solution found by the proposed algorithm is very low.In addition, the proposed algorithm finds closely the BKS for most of the instances, i.e., the gap is below 1.10%.

Conclusions
In this paper, a hybrid algorithm combining ACS, SW and 3-opt local search was proposed for solving the VRP.The SW + ACS is more efficient than the ACO.
Especially for large-size problems, this algorithm yields better solutions compared with previous algorithms.Another benefit of this algorithm is that when ACS is used, the number of nodes and complexity of the problem are decreased.As a result, the algorithm can solve these small problems easily.So, it is not necessary for the algorithm to be executed many times for finding the best solutions since the SW and 3-opt local search do not have great time requirements.It also seems that the combination of the proposed algorithm and genetic or simulated annealing algorithms will yield better results.Using the proposed algorithm for other versions of the VRP is suggested for future research.

Methods
Because the VRP is one of the most important combinatorial optimization problems, a hybrid algorithm called SW + ACS is proposed in this paper.In this section, the SW is presented first then the ACS will be explained, and finally, the hybrid algorithm will be analyzed in more detail.

Sweep algorithm
Gillett and Miller (Gillett & Miller 1974) proposed a SW in 1974 for Euclidean networks, which ranks and links demand points by their polar coordinate angle (Figure 5).The polar coordinate angle is calculated as follows: It is noted that if y(i) − y(0) < 0 and y(i) − y(0) ≥ 0 then − π < An(i) < 0 and 0 ≤ An(i) ≤ π, respectively.In this algorithm, which is one of the famous and powerful heuristic algorithms, at first, the polar coordinates of all n customers are calculated, where the depot is node 0 and so An(1) = 0.Then, sweeping (clockwise or counterclockwise) is started from customer i, which has not been visited, from the smallest angle to the largest angle until all customers are included in a tour.
ACS is one of the most powerful versions of ACO.This meta-heuristic algorithm was introduced in 1996 by Dorigo and Gambardella (Dorigo & Ganbardella 1997), which was strongly inspired by the ant system (AS).So, this algorithm achieves performance improvements because of the introduction of new mechanisms based on ideas not included in the original AS.The ACS algorithm has two major changes to the rules employed in the AS algorithm, namely: (1)A novel transition rule that favors either exploitation or exploration is introduced.Node j that is next to node i, among the unvisited nodes J i k , is selected by ant k in the route.According to the following transition rule, the probability of each node being visited is Where j * = the unvisited node j in J i k for which [τ ir (t)][η ir (t)] β is maximized.τ ij (t) = the amount of pheromone that is on the edge joining nodes i and j. η ij (t) = the heuristic information for the ant visibility measure defined here as the reciprocal of the distance between node i and node j. α, β = two control parameters that represent the relative importance of the amount of pheromone on the edge between node i and node j compared to the ant visibility value respectively.q = a random variable in [0, 1].q 0 = a given arbitrary parameter fixed before the program is started such that when q is less than q 0 , the ant employs exploitation to select node j* as the  next node in its tour, whereas if q exceeds q 0 , the ant uses probabilistic exploration to select the next node in its tour.
(2)There are two ways for updating pheromone trail as follows: Local updating: When an ant moves from node i to node j, it updates the amount of pheromone on the traversed edge using the following formula: where τ 0 , the initial amount of pheromone, is calculated as τ 0 = (nC i ) −1 , n is the number of nodes, C i is the cost of the initial tour produced by a construction heuristic such as the nearest neighbor heuristic, SW, etc., and ρ is a parameter called evaporation rate in the range [0, 1] regulating the reduction of pheromone on the edges.It should be noted that local updating has an important effect, such that whenever an ant traverses an edge (i, j), its pheromone trail τ ij is reduced.So, the edge becomes less desirable for the ants in future iterations.Local updating not only encourages an increase in the exploration of edges that have not been visited yet but also helps avoid poor stagnation situations.Global updating: When all tours are generated by ants, the edges that belong to the best tour are updated using the following formula: where C b is the cost of the best tour T b found yet.It is important to note that the pheromones of the edges belonging to the best tour are only updated in the global updating.This encourages ants to search in the vicinity of this best tour in future iterations.

Hybrid algorithm
From another viewpoint, there are also two heuristic groups for solving combinatorial optimization problems: construction algorithms that produce a feasible solution themselves, and improvement algorithms that can improve the solutions with having a feasible solution.Here, a construction algorithm, namely SW, and two improvement algorithms, namely ACS and 3-opt local search, are used.In this algorithm, the ACS is applied for improving every route of the vehicle; however, the nodes of each vehicle should be unchanged.On the other hand, the 3-opt local search is used for changing nodes and improving each vehicle.In other words, if this algorithm is separated into two parts, the first one to improve the route of the vehicle without changing the vehicle's nodes and the second one to improve it only by changing the vehicle's nodes, then the ACS does operation 1 and the 3-opt local search does operation 2 (Figure 6).In this method, first, the nodes that should be visited by vehicles are ordered with respect to the depot by SW, and then, they are set in an array.Second, all of the   states in which the above-mentioned order is preserved in spite of a change in their locations are obtained.For example, if 1 2 3 is the order obtained for a feasible solution, then orders 2 3 1 and 3 1 2 are set in other arrays.
Then for each array, the vehicle starts to move from the depot and visits the nodes in the arrays in the order described until it is not possible to add a further node.This means that if the load of a vehicle is more than its capacity, it returns to the depot and repeats these steps until there is no more node to be visited.When vehicle routes are obtained, the ACS is implemented for every route until the best route is obtained.Furthermore, the vast literature on meta-heuristics tells us that a promising approach for obtaining highquality solutions is to couple a local search algorithm with a mechanism to generate initial solutions.The ACO's definition includes the possibility of using local search as daemon actions.Daemon actions are used to implement centralized actions that cannot be performed by a single ant.An example of daemon actions is the activation of a local optimization procedure.In implementation, 3-opt local search procedure is used to obtain more improvement in the algorithm's performance.
This algorithm, which is shown in Figure 7, is based on omitting three arcs of the tour that are not adjacent and connecting them again by another method.It can be noted that there are several routes for connecting nodes and producing the tour again, but a state that satisfies the problem's constraints is acceptable.So, this unique tour will be accepted only if, first, the above constraints are not violated specially regarding each vehicle's capacity,

Figure 1
Figure 1 Sample of solving the VRP.

Figure 2
Figure2Computational results of the meta-heuristics.

Figure 8
Figure 8 Description of the proposed algorithm.

Table 1
Comparing algorithms for standard VRP problems

Table 3
Mean gap values of ACO and the SW + ACS

Table 4
Computational results for the benchmark of TaillardValues in bold are the three cases where the best known solution can be found by our algorithm.PA, proposed algorithm; BKS, best known solution.