1 Introduction

The vehicle routing problem (VRP) addresses one of the most important issues in distribution operations: how to use a limited number of vehicles to supply the demands of customers from a depot. Over the years, researchers have developed many variants of this problem to handle different distribution requirements. The most common VRP constraints include maximum vehicle capacity, maximum travel distance and customer service time windows. In some VRP applications, the size of the items to be distributed determines whether they can be fitted into the loading space; a typical example is the delivery of industrial machinery. In this type of problem, expressing demand by weight alone is not enough, and the dimensions of the items must also be taken into account.

Such problems can be addressed by solving a two- or three-dimensional bin packing problem (2BPP or 3BPP) and the VRP as separate processes. This approach, however, severely reduces the likelihood of obtaining optimal, desirable or even feasible solutions. Incorporating the features and constraints of the loading problem into the VRP has therefore led to new problems that address both issues simultaneously. The two-dimensional loading capacitated vehicle routing problem (2L-CVRP) and the three-dimensional loading capacitated vehicle routing problem (3L-CVRP) are among the most widely applied formulations in this regard. The difference between the two lies in the nature of the demanded items: the basic assumption of 2L-CVRP is that items cannot be stacked on top of each other, whereas 3L-CVRP makes no such assumption. This paper adopts the 2L-CVRP assumptions regarding loading constraints.

The typical 2L-CVRP assumes that each arc has a constant travel time throughout the planning horizon. In most large cities, however, traffic congestion at different times of day (especially during rush hours) can significantly alter the time required to traverse a given arc. Traffic congestion also affects the labor and fleet cost structure (Figliozzi, 2010). Incorporating these effects in the model of a distribution service operating in an urban environment can prevent the model from producing sub-optimal and inefficient solutions.

In this paper, travel time is modeled as a function of the time at which the vehicle sets out from the origin node. A piecewise linear function is used to inject the concept of time dependency into the travel time. The proposed model also satisfies the first-in-first-out (FIFO) property, an important feature of the time-dependent vehicle routing problem (TDVRP).

In addition, balancing the loads distributed by the vehicles not only prevents employee dissatisfaction but also reduces abnormal vehicle depreciation, so load balancing can increase the efficiency of the model. In this paper, the load balance requirement is defined with respect to the weight of the items assigned to each vehicle.

This paper presents an extended version of the 2L-CVRP called the two-dimensional loading time-dependent vehicle routing problem (2L-TDVRP), which integrates two-dimensional loading constraints and time dependency with a load balancing requirement. Given the NP-hard nature of the problem, a meta-heuristic algorithm called elitist non-dominated sorting local search (ENSLS) is proposed to solve it. The results obtained by the proposed algorithm on small-scale problem instances are compared with those of an exact solution method, and the results on medium-scale instances are compared with those of the NSGA-II and SPEA2 meta-heuristics.

The rest of this paper is structured as follows: Section 2 reviews the literature on this topic, Section 3 describes the problem and presents its mathematical model, Section 4 describes the proposed solution approach, Section 5 presents the computational results, and Section 6 presents the conclusions and recommendations for future research.

2 Review of the literature

The VRP literature traces back to the work of Dantzig et al (1954), in which the authors studied a large-scale traveling salesman problem (TSP) and proposed a solution method. Clarke and Wright (1964) were the first researchers to consider the problem with more than one vehicle, and Golden et al (1977) were the first to use the term "vehicle routing" in the title of a paper. More information on this subject can be found in Toth and Vigo (2002) and Kumar and Panneerselvam (2012).

The time-dependent vehicle routing problem is one of the most widely known variants of the VRP. Malandraki (1989) was the first to introduce a mathematical model of the TDVRP, and Malandraki and Daskin (1992) later proposed a mixed-integer model that uses a step function to calculate the travel time between two customers; they solved this model with a number of simple heuristics.

In the real world, speed variations do not occur as instantaneous jumps, so unlike the aforementioned studies, Hill and Benton (1992) used a speed function to avoid such jumps and provided a compact mathematical model, several methods to estimate its parameters, and a solution method. Donati et al (2008) considered two hierarchical objective functions that optimize the number of tours and the total travel time on the selected routes; they used a meta-heuristic based on the ant colony optimization (ACO) algorithm to solve this TDVRP. Soler et al (2009) first used a multistage approach to convert the TDVRP into an asymmetric VRP and then solved that problem using methods available in the VRP literature. Figliozzi (2012) presented a route construction and improvement method for the problem. Zhang et al (2014) assessed the time-dependent vehicle routing problem with simultaneous pickup and delivery; they proposed an integer programming model and solved it with a hybrid of ACO and tabu search algorithms.

The FIFO property is among the most realistic features of the TDVRP and is defined as follows: when a vehicle departs node i (the origin node) at time \( t_{i} \), its arrival time at the destination node is always less than in the case where it departs node i at a time \( t_{i}^{\prime} > t_{i} \). Most previous studies that have introduced a comprehensive mathematical model for the TDVRP or its real-world applications have neglected the FIFO property. Here, a piecewise linear travel time function whose segment slopes are restricted to values greater than −1 is used to model the travel time.

The piecewise linear function used in this paper follows an approach rather different from other studies in the literature. In previous studies, upholding the FIFO property requires deriving the travel time from a speed function, and this dependence imposes extra computation; in this paper, the travel time function is independent of any speed function. Ichoua et al (2003) stated that the FIFO property is respected if the slope of the travel time function is greater than −1; however, they did not develop a mathematical model based on this property, as their focus was on heuristic solution methods. The mathematical model developed by Jabali et al (2012) for the time-dependent vehicle routing problem relied on the FIFO property, but it allows instantaneous jumps in speed between time periods, which is not realistic. Their model also assumes only one daily change in the speed function, and injecting a greater number of daily speed changes would require complex modifications. In contrast, the present paper uses a piecewise linear travel time function, which accommodates continuous daily changes in travel time and allows a large number of daily travel time variations to be incorporated into the model easily.

A review of the literature on simultaneous routing and loading problems shows that most studies in this area focus on incorporating loading constraints into the capacitated vehicle routing problem (CVRP). This problem was first introduced by Iori et al (2007), who used an exact approach to solve small-scale instances. Gendreau et al (2008) were the first researchers to propose a meta-heuristic (based on tabu search) for large-scale instances of the problem; their algorithm used lower bounds and heuristic and exact methods to check the loading constraints. Zachariadis et al (2009) implemented a guided tabu search and used a set of heuristic methods to check the loading constraints. Fuellerer et al (2009) proposed an ACO-based meta-heuristic in which heuristic methods check the loading constraints; the most important feature of these heuristics is their ability to handle variable item orientation. Strodl et al (2010) used a variable neighborhood search (VNS) method to solve the problem, with exact and heuristic methods checking the loading constraints. Leung et al (2011) presented an improved version of guided tabu search and used heuristic methods to check the loading constraints.

Duhamel et al (2011) presented a method that first converts the loading constraints of the 2L-CVRP into the constraints of a resource-constrained project scheduling problem, then solves the converted problem with a GRASP × ELS algorithm and finally converts the solutions back to the 2L-CVRP form. Zachariadis et al (2013) presented a meta-heuristic whose most important feature is its compact structure; their approach solves the problem with a local search, uses diversification strategies to avoid getting trapped in local optima, and checks the loading constraints with heuristic methods. Hamdi-Dhaoui et al (2014) assessed the variant in which there are conflicts between customers' items and provided a bi-objective model for it; their first objective function optimizes the total service cost, while the second balances the items assigned to vehicles with respect to their area. They used a meta-heuristic algorithm to solve the problem and heuristic methods to check the loading constraints. Dominguez et al (2014) examined instances in which item rotation is allowed and used a routing algorithm with improved heuristics to check the loading constraints.

The literature on this topic also contains similar problems that extend classical routing and loading problems. Malapert et al (2008) proposed an extended version of the 2L-CVRP with simultaneous pickup and delivery constraints; they converted the loading constraints into scheduling constraints and used a constraint programming model to solve the problem. Leung et al (2013) assessed the VRP with a heterogeneous fleet, combining simulated annealing with a local search heuristic for the routing problem and using a variety of heuristic methods to address the loading constraints. Khebbache-Hadji et al (2013) proposed the 2L-CVRP with time windows and presented a meta-heuristic solution approach combining a memetic algorithm with loading heuristics. Dominguez et al (2016) also assessed the VRP with a heterogeneous fleet and proposed a heuristic approach for the scenarios in which item rotation is or is not allowed, using two heuristic methods to check the loading constraints. More information on the VRP with two-dimensional loading constraints can be found in Iori and Martello (2010) and Iori and Martello (2013).

3 Problem statement

Assume the directed graph \( G = \left( {V, A} \right) \), where V is the set of nodes \( 0, 1, \ldots , n \); node 0 represents the depot and nodes \( 1, \ldots , n \) represent the customers. A denotes the set of arcs \( \left( {i,j} \right), i, j \in V \). The first objective of the problem is to determine routes that minimize the total service time. The second objective is to balance the distribution of items assigned to vehicles with respect to the weight of the items. The customers are located in an urban area, so the travel time between any two nodes depends on the departure time from the origin node. As stated before, a piecewise linear function whose segment slopes are restricted to values greater than −1 is used to model the travel time. This function also forces changes in travel time to be smooth rather than stepwise, making the model more realistic. Figure 1 clarifies how this function guarantees the FIFO property. Suppose that vehicle A starts to traverse an arc at time \( t_{\text{A}} \) and another vehicle B starts to traverse the same arc at time \( t_{\text{B}} > t_{\text{A}} \). According to the travel time function, the travel durations of vehicles A and B are \( D_{\text{A}} \) and \( D_{\text{B}} \), so vehicle A reaches the destination at time \( t_{\text{A}} + D_{\text{A}} \). Since the slope is greater than −1, \( t_{\text{B}} - t_{\text{A}} \) is greater than \( D_{\text{A}} - D_{\text{B}} \), and therefore \( t_{\text{A}} + D_{\text{A}} \) (the arrival time of vehicle A) is less than \( t_{\text{B}} + D_{\text{B}} \) (the arrival time of vehicle B).

Figure 1 Guarantee of the FIFO property by a piecewise linear function with slopes restricted to values greater than −1
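The FIFO argument above can be checked numerically. The following minimal Python sketch evaluates a piecewise linear travel-time function by the same interpolation that appears later in constraint (2) and verifies that a later departure never yields an earlier arrival; the breakpoints and travel times are illustrative values, not taken from the paper's instances.

```python
# Minimal sketch: piecewise linear travel time and the FIFO property.
# Breakpoints T and travel times C are illustrative, not from the paper.
import bisect

T = [0.0, 30.0, 60.0, 90.0, 120.0]   # interval bounds T^0 .. T^4 for one arc
C = [10.0, 18.0, 25.0, 15.0, 12.0]   # travel time at each breakpoint

def travel_time(t):
    """Linear interpolation of the travel time for a departure at time t
    (the same interpolation used in constraint (2))."""
    if t <= T[0]:
        return C[0]
    if t >= T[-1]:
        return C[-1]
    m = bisect.bisect_right(T, t)                   # upper breakpoint index
    slope = (C[m] - C[m - 1]) / (T[m] - T[m - 1])   # segment slope (> -1 here)
    return C[m - 1] + slope * (t - T[m - 1])

# Every segment slope is greater than -1, so a later departure never yields
# an earlier arrival: t_A < t_B implies t_A + D_A <= t_B + D_B.
for t_a in range(0, 120, 5):
    for t_b in range(t_a + 1, 121, 5):
        assert t_a + travel_time(t_a) <= t_b + travel_time(t_b)
print("FIFO holds for all sampled departure pairs")
```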

Figure 2 Corresponding rectangular surface (for coordinate \( (xw_{B}, yh_{A}) \))

3.1 Assumptions

This modeling process is also based on several other assumptions:

  • The travel time between two nodes also depends on the direction of travel.

  • The fleet is homogeneous (in terms of operation cost, capacity and speed).

  • All available vehicles must be used.

  • Customers’ demanded items are rectangular shaped.

  • Customers should be allocated to the vehicles with respect to weight and loading constraints.

It should be noted that loading constraints considered for the presented model are based on the assumptions of “two-dimensional unrestricted oriented loading.” Further information in this regard can be found in Fuellerer et al (2009).

3.2 Mathematical model

This section describes the proposed mathematical model. Before presenting the model, its parameters and variables need to be introduced. The model parameters are as follows:

  • The set of nodes i = 0, …, n where 0 denotes the depot; \( i, \, j, \, p \) are the indices associated with this set.

  • The set of vehicles k = 1, …, K; k is the index associated with this set.

  • The set of time intervals m = 1, …, M; m is the index associated with this set.

  • The set of items it = 1, …, A; it and jt are the indices associated with this set.

\( T_{ij}^{0} \): The start of the time horizon

\( T_{ij}^{m} \): Upper bound of the m-th time interval

\( C_{ij}^{m} \): The travel time from node i to node j when the travel starts at time \( T_{ij}^{m} \)

\( st_{i} \): The service time of customer i

\( t_{0} \): Departure time from the depot

\( D_{i} \): The demand of customer i

\( w_{it} \): The width of the it-th item

\( h_{it} \): The height of the it-th item

\( W \): The width of the vehicles' loading space

\( H \): The height of the vehicles' loading space

\( S \): The area of the vehicles' loading space (\( S = W \times H \))

\( Q \): The capacity of the vehicles

\( Bel_{it}^{i} \): Equals one if item it belongs to customer i, and zero otherwise

\( B \): A big number

The model variables are as follows:

\( x_{ij}^{km} \): A binary variable equal to one when vehicle k moves from node i to node j in time period m, and zero otherwise

\( \theta_{ij} \): The travel time from node i to node j

\( t_{i} \): The departure time from customer i

\( z_{i}^{k} \): A binary variable equal to one when vehicle k transports the items of customer i, and zero otherwise

\( \varphi_{it}^{k} \): A binary variable equal to one when vehicle k transports item it, and zero otherwise

\( xw_{it} \): The X-coordinate (along the width) of the bottom-left corner of item it

\( yh_{it} \): The Y-coordinate (along the height) of the bottom-left corner of item it

\( l_{it,jt} \): A binary variable equal to one when item it is placed to the left of item jt, and zero otherwise

\( b_{it,jt} \): A binary variable equal to one when item it is placed below item jt, and zero otherwise

\( u_{i}^{k} \): An auxiliary variable for sub-tour elimination

\( Bl \): The maximum weight load on the vehicles

The mathematical model is explained in the following:

$$ \begin{aligned} & \hbox{min}\, O_{1} = \mathop \sum \limits_{i = 0}^{n} \mathop \sum \limits_{j = 0}^{n} \theta_{ij} \\ &\hbox{min}\, O_{2} = Bl \\ \end{aligned} $$
(1)
$$ \theta_{ij} = \mathop \sum \limits_{k = 1}^{K} \mathop \sum \limits_{m = 1}^{M} \left( {C_{ij}^{m - 1} + \left( {\frac{{C_{ij}^{m} - C_{ij}^{m - 1} }}{{T_{ij}^{m} - T_{ij}^{m - 1} }}} \right)\left( {t_{i} - T_{ij}^{m - 1} } \right)} \right)x_{ij}^{km} ,\quad i = 0, \ldots , n;\,j = 0, \ldots , n; $$
(2)
$$ t_{0} = 0 $$
(3)
$$ t_{j} = \mathop \sum \limits_{i = 0}^{n} t_{i} \mathop \sum \limits_{k = 1}^{K} \mathop \sum \limits_{m = 1}^{M} x_{ij}^{km} + \mathop \sum \limits_{i = 0}^{n} \theta_{ij} + st_{j} ,\, j = 1, \ldots , n; $$
(4)
$$ t_{i} + B \cdot \mathop \sum \limits_{k = 1}^{K} x_{ij}^{km} \le T_{ij}^{m} + B, \quad i = 0, \ldots , n;\,j = 0, \ldots , n;\,m = 1, \ldots , M; $$
(5)
$$ t_{i} - T_{ij}^{m - 1} \cdot \mathop \sum \limits_{k = 1}^{K} x_{ij}^{km} \ge 0, \quad i = 0, \ldots , n;\,j = 0, \ldots , n;\,m = 1, \ldots , M; $$
(6)
$$ \mathop \sum \limits_{i = 0}^{n} \mathop \sum \limits_{k = 1}^{K} \mathop \sum \limits_{m = 1}^{M} x_{ij}^{km} = 1, \quad j = 1, \ldots , n; $$
(7)
$$ \mathop \sum \limits_{j = 1}^{n} \mathop \sum \limits_{m = 1}^{M} x_{0j}^{km} = 1,\quad k = 1, \ldots , K; $$
(8)
$$ \mathop \sum \limits_{i = 0}^{n} \mathop \sum \limits_{m = 1}^{M} x_{ip}^{km} - \mathop \sum \limits_{j = 0}^{n} \mathop \sum \limits_{m = 1}^{M} x_{pj}^{km} = 0,\quad p = 0, \ldots , n;\,k = 1, \ldots ,K; $$
(9)
$$ u_{i}^{k} - u_{j}^{k} + n\mathop \sum \limits_{m = 1}^{M} x_{ij}^{km} \le n - 1, \quad i = 0, \ldots n;\,j = 1, \ldots n;\,k = 1, \ldots , K; $$
(10)
$$ \sum\limits_{j = 0}^{n} {\sum\limits_{m = 1}^{M} {x_{ij}^{km} } } = z_{i}^{k} , \quad i = 0, \ldots , n;\,k = 1, \ldots ,K; $$
(11)
$$ \varphi_{it}^{k} = Bel_{it}^{i} \cdot z_{i}^{k} ,\quad i = 1, \ldots , n;\,k = 1, \ldots , K;\,it = 1, \ldots ,A; $$
(12)
$$ \mathop \sum \limits_{i = 1}^{n} z_{i}^{k} \cdot D_{i} \le Q, \quad k = 1, \ldots , K; $$
(13)
$$ xw_{it} + w_{it} \le W, \quad it = 1, \ldots ,A; $$
(14)
$$ yh_{it} + h_{it} \le H,\quad it = 1, \ldots ,A; $$
(15)
$$ l_{it,jt} + l_{jt,it} + b_{it,jt} + b_{jt,it} + \left( {1 - \varphi_{it}^{k} } \right) + \left( {1 - \varphi_{jt}^{k} } \right) \ge 1,\quad it,\,jt = 1, \ldots , A;\,it \ne jt;\,k = 1, \ldots ,K; $$
(16)
$$ xw_{it} + w_{it} + W \cdot l_{it,jt} \le xw_{jt} + W, \quad it,\,jt = 1, \ldots ,A;\,it \ne jt; $$
(17)
$$ yh_{it} + h_{it} + H \cdot b_{it,jt} \le yh_{jt} + H, \quad it,\,jt = 1, \ldots ,A;\,it \ne jt; $$
(18)
$$ \mathop \sum \limits_{i = 0}^{n} z_{i}^{k} D_{i} \le Bl\quad \forall k = 1, \ldots ,K; $$
(19)
$$ \begin{aligned} & x_{ij}^{km} ,z_{i}^{k} ,\varphi_{it}^{k} ,l_{it,jt} ,b_{it,jt} \in \left\{ {0, 1} \right\}; \\ & \theta_{ij} ,t_{i} , u_{i}^{k} ,xw_{it} ,yh_{it} \in R^{ + }. \\ \end{aligned} $$
(20)

Equation (1) expresses the two objective functions, which minimize the total service time and the maximum weight load on the vehicles. Constraint (2) calculates the travel time for arc (i, j). Constraints (3) and (4) determine the departure times from the depot and from node j, respectively. Constraints (5) and (6) ensure that vehicle k can traverse arc (i, j) in time period m only if its departure time from node i falls within that time interval. Constraint (7) ensures that each customer is served exactly once. Constraint (8) ensures that vehicles start their tours from the depot. Constraint (9) states that each vehicle entering a node must eventually leave it. Constraint (10) eliminates sub-tours. Constraints (11) and (12) allocate customers and items to vehicles, respectively. Constraint (13) enforces the vehicle capacity. Constraints (14) and (15) ensure that each demanded item is placed within the loading space. Constraints (16) to (18) prevent items from overlapping in the loading space. Constraint (19) defines the maximum weight load on the vehicles, and constraint (20) determines the type and domain of the variables.

3.3 Linearization of the second and fourth constraints

After substituting \( b_{ij}^{km} = t_{i} \cdot x_{ij}^{km} \), constraints (2) and (4) take the following form:

$$ \theta_{ij} = \mathop \sum \limits_{k = 1}^{K} \mathop \sum \limits_{m = 1}^{M} \left( {C_{ij}^{m - 1} \cdot x_{ij}^{km} + \left( {\frac{{C_{ij}^{m} - C_{ij}^{m - 1} }}{{T_{ij}^{m} - T_{ij}^{m - 1} }}} \right)\left( {b_{ij}^{km} - T_{ij}^{m - 1} .x_{ij}^{km} } \right)} \right), \quad i,j = 0, \ldots , n; $$
(21)
$$ t_{j} = \mathop \sum \limits_{{\begin{array}{*{20}c} {i = 0} \\ {i \ne j} \\ \end{array} }}^{n} \mathop \sum \limits_{k = 1}^{K} \mathop \sum \limits_{m = 1}^{M} b_{ij}^{km} + \mathop \sum \limits_{{\begin{array}{*{20}c} {i = 0} \\ {i \ne j} \\ \end{array} }}^{n} \theta_{ij} + st_{j} ,\quad j = 1, \ldots , n. $$
(22)

This linearization also requires adding the following four constraints to the model:

$$ b_{ij}^{km} \le t_{i} ,\quad i, j = 0, \ldots , n;\,k = 1, \ldots , K;\,m = 1, \ldots , M; $$
(23)
$$ b_{ij}^{km} \ge t_{i} - B\left( {1 - x_{ij}^{km} } \right),\quad i, j = 0, \ldots , n;\,k = 1, \ldots , K;\,m = 1, \ldots , M; $$
(24)
$$ b_{ij}^{km} \le Bx_{ij}^{km} , \quad i, j = 0, \ldots , n;k = 1, \ldots , K;m = 1, \ldots , M; $$
(25)
$$ b_{ij}^{km} \ge 0, \quad i, j = 0, \ldots , n;k = 1, \ldots , K;m = 1, \ldots , M. $$
(26)

Adding the above constraints and using the linearized constraints reduces the computational time needed to solve the problem, which is the advantage of linear models over nonlinear ones.
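The effect of the big-M constraints can be illustrated with a small numerical check. The Python sketch below uses arbitrary values for \( t_{i} \) and B and confirms that constraints (23)-(26) pin the auxiliary variable to the product \( t_{i} \cdot x_{ij}^{km} \) for both values of the binary variable.

```python
# A small numerical check (with arbitrary values) that constraints (23)-(26)
# force the auxiliary variable b to equal t * x for both values of binary x.
B = 1000.0   # big number, larger than any feasible departure time
t = 37.5     # an arbitrary departure time t_i

def b_bounds(x):
    """Interval [lower, upper] allowed for b by constraints (23)-(26)."""
    upper = min(t, B * x)              # (23) and (25)
    lower = max(t - B * (1 - x), 0.0)  # (24) and (26)
    return lower, upper

assert b_bounds(0) == (0.0, 0.0)       # x = 0 forces b = 0 = t * x
assert b_bounds(1) == (t, t)           # x = 1 forces b = t = t * x
print("the linearization reproduces b = t * x")
```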

4 Solution approach

This section discusses the solution method used for the described problem. It first presents a brief introduction to multi-objective optimization and then explains some notions about the heuristic algorithms used for checking the feasibility of solutions in terms of loading constraints. In the end, it presents the proposed algorithm and discusses the general concepts of NSGA-II and SPEA2.

4.1 Multi-objective optimization

Many real-world problems involve several conflicting objectives, which has led to the introduction and application of multi-objective optimization models and methods. Multi-objective problems express the superiority of solutions via the concept of dominance, which is defined as follows: assume, without loss of generality, a minimization problem and let x and y be solution vectors with j-th objective function values \( f_{j} \left( x \right) \) and \( f_{j} \left( y \right) \), respectively; dominance is then defined according to Equation (27):

$$ x\,{\text{dominates}}\,y \Leftrightarrow \forall j = 1, \ldots ,m: f_{j} \left( x \right) \le f_{j} \left( y \right)\;\& \;\exists j_{0} \in \left\{ {1, \ldots ,m} \right\}: f_{{j_{0} }} \left( x \right) < f_{{j_{0} }} \left( y \right); $$
(27)

where m is the number of objective functions. Unlike mono-objective problems, where the goal is to find a single optimal solution, in multi-objective problems the goal is to find a set of solutions that no other solution in the solution space dominates. Each such solution is called a Pareto optimal solution, their set is called the Pareto optimal set, and the image of the Pareto optimal set in the objective space is called the Pareto front.
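Equation (27) translates directly into a small helper, shown here as a Python sketch for a minimization problem; it is reused in the later sketches of this section.

```python
# Dominance relation of Equation (27) for a minimization problem.
def dominates(fx, fy):
    """True if objective vector fx dominates fy (both objectives minimized)."""
    return all(a <= b for a, b in zip(fx, fy)) and any(a < b for a, b in zip(fx, fy))

# (10, 5) dominates (12, 5); (10, 5) and (8, 9) do not dominate each other.
assert dominates((10, 5), (12, 5))
assert not dominates((10, 5), (8, 9)) and not dominates((8, 9), (10, 5))
```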

4.2 Loading heuristics

This paper uses several heuristic methods to check the feasibility of loading items into the loading space: bottom-left fill (Y-axis) (Chazelle, 1983), bottom-left fill (X-axis) (Chazelle, 1983), maximum touching perimeter (Lodi et al, 1999), maximum touching perimeter no walls (Lodi et al, 1999) and minimum area (Zachariadis et al, 2009). When the routing constraints and weight requirements of a solution are satisfied, these methods are applied (in the above order) to check the loading feasibility of that solution. The methods load items according to a predetermined sequence. This paper uses two such sequences; when an algorithm fails to obtain a feasible solution via the first sequence, it proceeds to the second. The first sequence is obtained by sorting all items assigned to a vehicle in non-ascending order of their area. The second sequence follows the order of visits: items whose customers are visited first are loaded last, and all items of the same customer are sorted in non-ascending order of area. All of these methods place a given item at a location taken from a list of accessible coordinates. After an item is registered at a location in the loading space, the coordinates of that location are removed from the list and a maximum of four new coordinates are added to it. Each heuristic method uses its own criterion to select a coordinate from the list of accessible coordinates; these criteria are shown in Table 1. More detailed information can be found in Zachariadis et al (2009).

Table 1 Coordinate selection criteria used by heuristic methods

It should be noted that if none of the heuristic methods can obtain a feasible loading for an assessed route, the route will be reported as infeasible and its objective function will be penalized.
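To make the mechanism concrete, the following Python sketch is a simplified version of the bottom-left-fill (Y-axis) check: items are placed at the accessible coordinate with the lowest Y value (ties broken by X). The real heuristics manage up to four new candidate coordinates per placement and use the selection criteria of Table 1; this sketch adds only the two basic corner points and is therefore only an illustration of the idea.

```python
# Simplified bottom-left-fill (Y-axis) feasibility sketch for one vehicle.
def bottom_left_fill(items, W, H):
    """items: list of (w, h) rectangles in loading order.
    Returns placements [(x, y, w, h)] or None if some item does not fit."""
    placed = []
    candidates = [(0.0, 0.0)]            # accessible coordinates (x, y)

    def fits(x, y, w, h):
        if x + w > W or y + h > H:
            return False                 # outside the loading space
        for (px, py, pw, ph) in placed:  # overlap test against placed items
            if x < px + pw and px < x + w and y < py + ph and py < y + h:
                return False
        return True

    for (w, h) in items:
        # lowest-Y, then lowest-X accessible coordinate that accommodates the item
        options = sorted(candidates, key=lambda c: (c[1], c[0]))
        pos = next(((x, y) for (x, y) in options if fits(x, y, w, h)), None)
        if pos is None:
            return None                  # loading infeasible for this sequence
        x, y = pos
        placed.append((x, y, w, h))
        candidates.remove(pos)
        candidates += [(x + w, y), (x, y + h)]   # two new corner points
    return placed

# Items sorted by non-ascending area, as in the first loading sequence.
items = sorted([(3, 2), (2, 2), (4, 1)], key=lambda r: r[0] * r[1], reverse=True)
print(bottom_left_fill(items, W=5, H=4))
```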

4.3 ENSLS

The proposed algorithm, called ENSLS, is an extended version of NSGA-II (Deb et al, 2002). Like NSGA-II, it ranks the solutions according to how many times they are dominated by other members of the population, using a non-dominated sorting algorithm to rank the solutions and form a number of fronts. The non-dominated sorting algorithm classifies the solutions into fronts as follows: it first assigns all non-dominated solutions to the first front and discards them from the solution set; it then finds the non-dominated solutions in the remaining set, assigns them to the second front and discards them as well, repeating this loop until all solutions are assigned to a front. Full details can be found in Deb et al (2002). Diversity preservation is an important issue in multi-objective optimization, so (like NSGA-II) this method uses a parameter called the crowding distance to estimate the density of the neighborhood of a solution. This parameter, defined within a front, measures the distance of a member from its previous and next members. The first step in calculating it is to determine the distance of each member i with respect to the j-th objective function by Equation (28), which uses the sorted values of the objective functions.

$$ DI_{i}^{j} = \frac{{\left| {f_{j} \left( {x_{i + 1} } \right) - f_{j} \left( {x_{i - 1} } \right)} \right|}}{{f_{j}^{\hbox{max} } - f_{j}^{\hbox{min} } }}, \quad j = 1, \ldots ,m; $$
(28)

in which \( f_{j} \left( {x_{i + 1} } \right) \) and \( f_{j} \left( {x_{i - 1} } \right) \) are the j-th objective function values of the next and previous members, and \( f_{j}^{\hbox{max} } \) and \( f_{j}^{\hbox{min} } \) are the best and worst values of the j-th objective in the front. The crowding distance of the members at the corners of the front is set to a large number. Ultimately, the crowding distance of the i-th member of the population is obtained using Equation (29).

$$ Cd_{i} = \sum\nolimits_{j = 1}^{m} {DI_{i}^{j} } $$
(29)

According to Equation (29), a higher crowding distance indicates a higher fitness, since the member is located in a less dense area.
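The following Python sketch gathers the two computations shared by ENSLS and NSGA-II, non-dominated sorting and the crowding distance of Equations (28) and (29); it favours clarity over the faster bookkeeping of the original NSGA-II procedure.

```python
# Sketch of non-dominated sorting and crowding distance (Equations (28)-(29)).
def dominates(fx, fy):
    return all(a <= b for a, b in zip(fx, fy)) and any(a < b for a, b in zip(fx, fy))

def non_dominated_sort(objs):
    """objs: list of objective tuples (minimized). Returns fronts as index lists."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

def crowding_distance(objs, front):
    """Crowding distance of the members of one front."""
    dist = {i: 0.0 for i in front}
    for j in range(len(objs[0])):                        # one objective at a time
        ordered = sorted(front, key=lambda i: objs[i][j])
        f_min, f_max = objs[ordered[0]][j], objs[ordered[-1]][j]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")   # corner members
        if f_max == f_min:
            continue
        for pos in range(1, len(ordered) - 1):           # Equation (28) per member
            gap = objs[ordered[pos + 1]][j] - objs[ordered[pos - 1]][j]
            dist[ordered[pos]] += gap / (f_max - f_min)  # summed as in Equation (29)
    return dist
```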

ENSLS first generates an initial population, but unlike NSGA-II, which uses the typical GA operators such as crossover and mutation to generate a new population, it uses the concept of neighborhood for this purpose. Before running the algorithm, the probability of selecting each member for a move to its neighborhood is determined; based on this probability, the method determines the number of members to be selected in each iteration, and binary tournament selection is used to select them. The method then merges the population of neighbors with the previous generation and uses the non-dominated sorting algorithm to classify the merged population into several fronts. Within each front, members are sorted by crowding distance, the best solutions being those with the highest crowding distance. The method then selects a number of solutions from the merged population (equal to the population size) and transfers them to the next generation (Figure 3). This process of generating a new population, merging, sorting and transferring is repeated until the termination condition is satisfied. After the last iteration, the members of the first front are reported as the Pareto front approximation.

Figure 3 Method of updating the population in ENSLS

4.3.1 Initial population

Half of the initial solutions are generated by a random method, and the other half by a modified nearest neighbor random method. In the random method, customers are randomly assigned to vehicles. The modified nearest neighbor random method generates solutions via the following steps (a simplified sketch is given after the list):

  1. It first sorts the customers in ascending order of their distance from the depot and then assigns the K top customers to the K vehicles, where K is the number of vehicles.

  2. Starting from the first vehicle, as long as the weight and loading constraints allow, it randomly selects one of the two customers nearest to the last customer assigned to that vehicle and assigns it to the vehicle.

  3. Once a weight or loading constraint is violated, it proceeds to the next vehicle and repeats the process until all customers are allocated to vehicles.
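The Python sketch below follows these steps under simplifying assumptions: Euclidean coordinates, only the weight constraint checked (the full method also verifies two-dimensional loading feasibility), and customers still unassigned when every vehicle is full are left out rather than reassigned.

```python
# Simplified sketch of the modified nearest-neighbour random constructor.
import math
import random

def construct(coords, demands, K, Q, depot=0):
    """coords: node -> (x, y); demands: customer -> weight. Returns K routes."""
    dist = lambda a, b: math.dist(coords[a], coords[b])
    unassigned = sorted(demands, key=lambda c: dist(depot, c))
    routes, loads = [[] for _ in range(K)], [0.0] * K

    # Step 1: seed each vehicle with one of the K customers closest to the depot.
    for k in range(K):
        seed = unassigned.pop(0)
        routes[k].append(seed)
        loads[k] += demands[seed]

    # Steps 2-3: extend each route with one of the two nearest unassigned
    # customers while the capacity allows, then move on to the next vehicle.
    for k in range(K):
        while unassigned:
            last = routes[k][-1]
            nearest = sorted(unassigned, key=lambda c: dist(last, c))[:2]
            feasible = [c for c in nearest if loads[k] + demands[c] <= Q]
            if not feasible:
                break                     # capacity exhausted: next vehicle
            chosen = random.choice(feasible)
            routes[k].append(chosen)
            loads[k] += demands[chosen]
            unassigned.remove(chosen)
    return routes
```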

4.3.2 Neighborhood

The neighborhood is defined by three operators: swap (Waters, 1987), 2-opt (Croes, 1958; Lin, 1965) and or-opt (Waters, 1987). The swap operator switches the positions of two customers assigned to one vehicle or to two different vehicles; Figure 4 illustrates how it works.

Figure 4 Swap operator

This figure shows how the operator swaps customers within a single route (left) or between two different routes (right). Customers are marked with circles, and bold circles represent the customers selected for the swap; the top and bottom parts of the figure show the routes before and after the swap. The second operator is a variant of 2-opt; Figure 5 shows how it processes two customers of the same vehicle or of two different vehicles.

Figure 5 2-opt operator

As this figure shows, when both customers are from the same route (left), this operator reverses the sequence between the two, and when customers are from different routes (right), it replaces the sequences located after the two (including the selected customers).

The third operator is a variant of or-opt, which relocates a single customer. Figure 6 shows how it operates on a single route (left) or on two different routes (right); a compact sketch of all three moves follows Figure 6.

Figure 6 Or-opt operator
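If routes are represented as plain lists of customer ids, the three moves can be written compactly. The Python sketch below assumes the caller chooses the positions i and j and checks feasibility (weight and loading) afterwards, as the algorithm does.

```python
# Compact sketch of the three neighbourhood moves on routes (lists of ids).
def swap(route_a, route_b, i, j):
    """Exchange the customer at position i of route_a with position j of
    route_b (the two arguments may refer to the same route)."""
    route_a[i], route_b[j] = route_b[j], route_a[i]

def two_opt(route_a, route_b, i, j):
    """Same route: reverse the segment between i and j. Different routes:
    exchange the tails starting at i and j (selected customers included)."""
    if route_a is route_b:
        route_a[i:j + 1] = reversed(route_a[i:j + 1])
    else:
        route_a[i:], route_b[j:] = route_b[j:], route_a[i:]

def or_opt(route_a, route_b, i, j):
    """Relocate the customer at position i of route_a to position j of route_b."""
    route_b.insert(j, route_a.pop(i))

r1, r2 = [1, 2, 3, 4], [5, 6, 7]
two_opt(r1, r2, 2, 1)
print(r1, r2)   # [1, 2, 6, 7] [5, 3, 4]
```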

4.3.3 Flowchart of ENSLS

Figure 7 shows different steps of the proposed method.

Figure 7 Flowchart of ENSLS

4.4 An introduction to NSGA-II

NSGA-II is an extension of the genetic algorithm (Holland, 1975) for multi-objective optimization. The previous sections described the parts of NSGA-II that are also used by the proposed algorithm, so this section only discusses the part that differs, i.e., the generation of children.

4.4.1 Crossover and mutation operators in NSGA-II

After selecting the parents, which is done via binary tournament in this paper, four operators, namely two-point, three-point, OX (Oliver et al, 1987) and AEX (Grefenstette et al, 1985), all with equal probability of being used, are employed to generate the children. More information regarding the OX and AEX operators can be found in Puljić and Manger (2013). The swap operator is used for mutation.
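As an illustration of how such a permutation operator works, the following Python sketch implements a simplified OX variant: it keeps a random segment of the first parent and fills the remaining positions, from left to right rather than wrapping after the second cut point, with the missing customers in the order they appear in the second parent.

```python
# Simplified sketch of the order crossover (OX) on customer permutations.
import random

def ox(parent1, parent2):
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = parent1[a:b + 1]               # keep a segment of parent1
    fill = iter([g for g in parent2 if g not in child])  # rest, in parent2 order
    for i in range(n):
        if child[i] is None:
            child[i] = next(fill)
    return child

print(ox([1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]))
```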

4.5 An introduction to SPEA2

SPEA2 was first introduced by Zitzler et al (2001). Unlike NSGA-II and ENSLS, which store the elite population within the main one, SPEA2 stores this population in a separate set called the archive. At the start of the algorithm, the archive is empty. The algorithm generates an initial population and, at each iteration, merges the members of the main and archive populations; it then assigns a fitness value to each member of the merged population. This value is based on the number of members that dominate the member, as well as the density of the region in which it is located, and is obtained via the following steps: the algorithm first calculates the number of members dominated by member i and calls it \( S_{i} \); it then obtains the primary fitness of member i by summing the S values of the members that dominate i; finally, it calculates the secondary fitness of each member (a value less than one) based on the density of the region in which the member is located. The final fitness is the sum of the primary and secondary fitness values.
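A Python sketch of this fitness assignment is given below. The strength values, the primary fitness and a density term below one follow the description above; the k-th nearest-neighbour rule with k equal to the square root of the population size is taken from the original SPEA2 paper (Zitzler et al, 2001) rather than from this text, so it should be read as an assumption.

```python
# Sketch of SPEA2 fitness assignment (lower final fitness is better).
import math

def dominates(fx, fy):
    return all(a <= b for a, b in zip(fx, fy)) and any(a < b for a, b in zip(fx, fy))

def spea2_fitness(objs):
    n = len(objs)
    strength = [sum(dominates(objs[i], objs[j]) for j in range(n)) for i in range(n)]
    raw = [sum(strength[j] for j in range(n) if dominates(objs[j], objs[i]))
           for i in range(n)]                            # primary fitness
    k = max(1, int(math.sqrt(n)))                        # SPEA2 neighbour rule
    fitness = []
    for i in range(n):
        dists = sorted(math.dist(objs[i], objs[j]) for j in range(n) if j != i)
        density = 1.0 / (dists[k - 1] + 2.0) if dists else 0.0   # secondary, < 1
        fitness.append(raw[i] + density)
    return fitness
```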

The algorithm must then generate a new archive. If the number of non-dominated members of the merged population is less than or equal to the archive size, the algorithm sorts the members of the merged population by fitness value and moves the best of them to the archive; otherwise, it uses a truncation operator to select the best non-dominated members up to the archive size. The truncation operator calculates the distance of each member from the others and sorts these distances for each member. It then starts removing the members with the smallest distances; when two or more members have the same distance, the operator moves to the next shortest distance and, when necessary, continues this process down to the (k − 1)-th distance (where k is the number of non-dominated members).

Finally, the algorithm uses mutation and crossover operators to generate a new population, which replaces the current one. This process of merging, calculating fitness values, creating the new archive and generating the new population is repeated until the termination condition is satisfied. After the last iteration, the non-dominated members of the archive are reported as the Pareto front approximation.

This algorithm generates the initial solutions by the same procedure explained in Section 4.3.1. It performs crossover and mutation through the same operators described in Section 4.4.1. Further information in this regard can be found in Zitzler et al (2001).

5 Computational results

This section presents and analyzes the computational results. First, the method of generating problem instances is described; then the tuning of the algorithm parameters is explained; afterward the performance evaluation criteria are defined; and finally the performance of the algorithms on small- and medium-scale problem instances is evaluated. NSGA-II, SPEA2 and the proposed algorithm were coded in MATLAB and executed on a Core i5 2.5 GHz PC with 6 GB of RAM under Windows 8.1.

5.1 Problem instances

Since this is the first paper to study this particular problem, no readily available problem instances existed, and the needed instances were generated. Two groups of instances were created. The first group includes 12 instances with 5–10 customers and 2–3 vehicles, based on the E016-03m, E022-04g and E033-03n instances from the 2L-CVRP literature (available at http://www.or.deis.unibo.it/research.html). The coordinates and demands of the customers in our instances follow these problem instances; since our problems have 5–10 customers, the coordinates and demands of that number of customers are used. The dimensions of the items are changed from those in the original instances, and only one item is attributed to each customer, because otherwise these problems could not be solved by the exact method. The vehicle capacity of the original instances is also changed to obtain more challenging problems. In the rest of this paper, these problems are referred to as small-scale problem instances. The number of customers and vehicles for these instances can be seen in Table 2:

Table 2 Number of customers and vehicles for small-scale instances

The second group of problems, named medium-scale problem instances, includes 25 problems with 15–100 customers and 4–22 vehicles. In these instances, the coordinates, demands and dimensions of items and vehicles follow selected 2L-CVRP problems: five instances were selected from each of the classes E016-05m, E036-11h, E051-05e, E072-04f and E101-14s. The number of customers and vehicles for these instances can be seen in Table 3:

Table 3 Number of customers and vehicles for medium-scale instances

To add the concept of time dependency to the problems, a traffic pattern was designed and attached to each arc. This traffic pattern consists of five time intervals and starts at the beginning of the workday. The time required to traverse an arc within the first time interval is taken as the distance of that arc, and \( T_{4} \) is then defined by Equation (30):

$$ T_{4} = \left( {\frac{n}{K} + 1} \right) \cdot \bar{T} $$
(30)

In the above equation, n is the number of customers, K is the number of vehicles, and \( \bar{T} \) is the average distance between customers. The procedures shown in Table 4 are used to calculate the lengths of the other time intervals (\( t_{i} \) denotes the end of the i-th time interval).

Table 4 Specifications of defined time interval

In Table 4, \( T_{\hbox{max} } \) is the length of the longest arc in the problem. The travel time on each arc thus depends on the time interval, the slope of the line in that interval and the length of the arc. Figure 8 shows an instance of the piecewise linear function with the five time intervals described in Table 4.

Figure 8 Piecewise linear function used in problem instances

5.2 Parameter setting

The performance of meta-heuristic methods largely depends on the values of their parameters, so proper parameter setting can have a significant impact on the performance of any approach that employs these methods. This paper uses the Taguchi method to set the parameters. The Taguchi method for the design of experiments, introduced by Taguchi in 1960, is among the most reliable methods for adjusting parameters and can provide good settings with the smallest possible number of experiments (Roy, 2010); using it instead of the classic full factorial approach significantly reduces the time and cost of the tests required to set the parameters. Based on the number of selected parameters and their factor levels, the Taguchi method uses orthogonal arrays as experiment matrices. The parameters of ENSLS are: the probability of selecting a solution for a move to its neighborhood (\( P_{nb} \)), the population size in each iteration (npopulation) and the number of iterations without improvement (non-improve). The parameters of NSGA-II are: the probabilities of using the crossover and mutation operators (\( P_{c} \) and \( P_{m} \), respectively), the population size in each iteration (npopulation) and the number of iterations without improvement (non-improve). SPEA2 has the same parameters plus the archive size (narchive). ENSLS thus has fewer parameters, which can be considered an advantage. The values of all the above parameters were determined with the Taguchi method and are presented in Table 5.

Table 5 Values set for parameters

5.3 Performance metrics

In mono-objective optimization, the goal is to find a single optimal solution, so the solution methods proposed for these problems can be compared by comparing their reported objective function values. Runtime or computation time is another common metric for such comparison. In multi-objective optimization, however, solutions should not only have a good quality in terms of objective function values, but also have a good diversity and spread to cover more points of the Pareto front. In this paper, the following parameters are used to evaluate the performance of the proposed algorithms.

5.3.1 Quality metric (QM)

To check the quality of the solutions, all solutions obtained by all methods are compared with each other, a merged set of non-dominated solutions is formed, and the contribution of each algorithm to this set is determined. A higher contribution indicates better performance of the algorithm.

5.3.2 Spacing metric (SM)

The spacing metric measures the uniformity of the distribution of solutions. The literature provides several measures for this metric; this paper uses the one introduced by Deb et al (2002), which is calculated via Equation (31):

$$ {\text{SM}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n - 1} \left| {d_{i} - \bar{d}} \right|}}{{\left( {n - 1} \right)\bar{d}}} $$
(31)

where n denotes the number of obtained solutions, d i is the distance between two adjacent solutions in the objective space, and \( \bar{d} \) is the average of all d i s. Lower values of this metric point to better performance of the algorithm.

5.3.3 Diversity metric (DM)

This metric, introduced by Zitzler (1999), measures the extent of the non-dominated solutions obtained by an algorithm and is calculated by Equation (32).

$$ DM = \sqrt {\left( {\frac{{\mathop {\hbox{max} }\limits_{i} f_{1i} - \mathop {\hbox{min} }\limits_{i} f_{1i} }}{{f_{{1{\text{total}}}}^{\hbox{max} } - f_{{ 1 {\text{total}}}}^{\hbox{min} } }}} \right)^{2} + \left( {\frac{{\mathop {\hbox{max} }\limits_{i} f_{2i} - \mathop {\hbox{min} }\limits_{i} f_{2i} }}{{f_{{ 2 {\text{total}}}}^{\hbox{max} } - f_{{ 2 {\text{total}}}}^{\hbox{min} } }}} \right)^{2} } $$
(32)

In the above equation, \( f_{{i{\text{total}}}}^{\hbox{max} } \) and \( f_{{i{\text{total}}}}^{\hbox{min} } \) are the maximum and minimum values of i-th objective function in the solutions of all of the compared algorithms, respectively. The higher the value of DM, the better the performance of the algorithm.
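Both metrics are direct to compute for a bi-objective front. The Python sketch below transcribes Equations (31) and (32); `front` holds the objective pairs obtained by one algorithm and `all_points` pools the solutions of every compared algorithm, and degenerate fronts (fewer than two distinct points) are not handled.

```python
# Spacing (Equation (31)) and diversity (Equation (32)) for a bi-objective front.
import math

def spacing(front):
    pts = sorted(front)                            # adjacency taken along objective 1
    d = [math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1)]
    d_bar = sum(d) / len(d)
    return sum(abs(di - d_bar) for di in d) / ((len(pts) - 1) * d_bar)

def diversity(front, all_points):
    total = 0.0
    for j in range(2):                             # the two objectives
        spread = max(p[j] for p in front) - min(p[j] for p in front)
        scale = max(p[j] for p in all_points) - min(p[j] for p in all_points)
        total += (spread / scale) ** 2
    return math.sqrt(total)
```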

5.3.4 Objective functions value

In this paper, one of the criteria used for the comparison of algorithms is the best value for each objective function obtained by each algorithm.

5.3.5 Computational time

Another criterion used for performance evaluation is the computation time. Time to find best solutions (OPT) and time to completion (TOT) are reported for all assessed algorithms.

5.4 Performance of the proposed algorithm in solving the small-scale problem instances

To evaluate the performance of the proposed method, its results were compared with the results of the two other algorithms and with the results of an exact method. At this stage, the comparison metrics were the best value obtained for each objective function and the computational time. To find the optimal objective function values with the exact method, the model was solved for one objective function irrespective of the other, and the same process was then followed for the other objective; this gives the best possible value of each objective function. The time reported for the exact method is the sum of the times required to obtain the optimal solutions for the two objective functions. ENSLS, NSGA-II and SPEA2 were then each run three times, and the best objective function values, the time to find the best solutions and the time to run to completion were recorded; finally, the averages of these values over the runs were calculated for each algorithm. The results are presented in Tables 6 and 7. Note that "gap%" in Table 6 shows the deviation of the solutions from the solution of the exact method. All three methods showed similar performance on the second objective function, so deviations were not calculated for it.

Table 6 Computational results obtained for small-scale problem instances (part 1)
Table 7 Computational results obtained for small-scale problem instances (part 2)

As the table shows, the proposed algorithm outperformed the other tested algorithms and generated the solutions closest to those of the exact method. The average errors of ENSLS, NSGA-II and SPEA2 for the first objective function are 0.28, 0.34 and 0.48%, respectively, which demonstrates the good performance of ENSLS in this respect. In the small-scale instances, all algorithms yielded almost identical values for the second objective function because of the small problem size: there are only a handful of possible ways to allocate the load among the vehicles, so all algorithms can reach optimal solutions. The computational times for the small-scale problem instances are shown in Table 7.

In terms of OPT Time, on average, ENSLS is about 41 and 30% faster than NSGA-II and SPEA2, respectively. The average OPT Time values for ENSLS, NSGA-II and SPEA2 are 1.30, 2.22 and 1.86, respectively, which demonstrates the good performance of the proposed algorithm. All algorithms showed approximately similar performance in terms of the difference between OPT Time and TOT Time.

5.5 Performance of the proposed algorithm in solving the medium-scale problem instances

To gauge the performance of the algorithms on these instances, each of the three algorithms was again run three times. The best run of each algorithm was used for the comparison in terms of quality, whereas the averages over the three runs of the diversity and spacing metrics, the best objective function values and the computational times were used for the other comparisons. The results are presented in Tables 8 and 9.

Table 8 Computational results obtained for medium-scale problem instances (part 1)
Table 9 Computational results obtained for medium-scale problem instances (part 2)

The results presented in Table 8 indicate that, in terms of the quality metric (QM), the proposed algorithm is clearly superior to the other two methods, as more than half of the best non-dominated solutions were obtained by it. In terms of the SM and DM metrics, the differences between the methods are negligible. Figure 9 gives a clearer picture of the quality of the solutions obtained by each algorithm for each instance; note that higher values of the quality metric indicate better performance.

Figure 9 QM values obtained for each medium-scale problem instance

Figure 9 shows that in most problem instances the proposed algorithm provided higher quality solutions, and this quality difference increases with the number of customers. This is reflected in the fact that in 64% of the problem instances, half or more of the final non-dominated solutions were contributed by this algorithm; moreover, in 28% of the instances, all final non-dominated solutions were contributed by it, which demonstrates the clear superiority of ENSLS in terms of the quality metric.

The SM values obtained by each algorithm for each problem instance are plotted in Figure 10. Note that lower values of the spacing metric represent the better performance of the algorithm.

Figure 10 SM values obtained for each medium-scale problem instance

Figure 10 shows a minor advantage for SPEA2: SPEA2, ENSLS and NSGA-II had the best performance in 48, 32 and 20% of the instances, respectively. In terms of the average SM value, the performances of all tested algorithms were similar.

The next stage of comparison is the diversity metric. Figure 11 shows the DM values for the tested algorithms for each problem instance. Note that higher values of the DM demonstrate the better performance of the algorithm.

Figure 11 DM values obtained for each medium-scale problem instance

Figure 11 shows a minor advantage for the proposed algorithm, which had the best performance in 40% of the problems, while NSGA-II and SPEA2 were the best algorithms in 32 and 28% of the instances, respectively. In terms of the average DM value, there is no significant difference between the algorithms. The objective function values and computational times obtained for the medium-scale problem instances are presented in Table 9.

The results presented in Table 9 demonstrate the good performance of the proposed method in terms of the best objective function values. The average errors of ENSLS, NSGA-II and SPEA2 for the first objective function are 0.26, 4.22 and 6.04%, respectively, and for the second objective function these average errors are 0.40, 1.42 and 1.21%.

In terms of OPT Time, on average, SPEA2 is about 15 and 4% faster than ENSLS and NSGA-II, respectively; in terms of TOT Time, it is about 16 and 4% faster, respectively. Figures 12 and 13 show the time to find the best solutions and the time to run to completion for each algorithm and each individual problem instance.

Figure 12 Time to find the best solutions for each medium-scale problem instance

Figure 13 Time to run to completion for each medium-scale problem instance

Examining these graphs shows that SPEA2, ENSLS and NSGA-II had the best OPT Time in 44, 28 and 28% of the problem instances, respectively, and the best TOT Time in 44, 32 and 24% of the instances, respectively. As mentioned, SPEA2 also showed the best average performance in terms of both OPT Time and TOT Time.

Overall, these comparisons show that the proposed algorithm clearly outperformed the other two algorithms in terms of solution quality and also has an advantage in terms of solution diversity; in terms of the spacing metric, however, SPEA2 performed better. Given the simultaneous importance of solution quality and computational time, we conclude that the proposed algorithm (ENSLS) performed better than NSGA-II and SPEA2. The non-dominated solutions obtained by the tested algorithms for instance #23 are presented in Figure 14.

Figure 14 Non-dominated solutions obtained by the tested meta-heuristic algorithms for instance #23

Figure 14 demonstrates the superiority of the proposed method over NSGA-II and SPEA2 in exploring the Pareto front. To illustrate the solution improvement process, the non-dominated solutions of instance #18 after 500, 1000, 2000 and 3000 iterations are plotted in Figure 15.

Figure 15 Non-dominated solutions of instance #18 obtained by the ENSLS algorithm after 500, 1000, 2000 and 3000 iterations

Figure 15 shows the good performance of the proposed algorithm in solving the problem and the significant improvement of the solutions after 3000 iterations.

6 Conclusions and recommendations for future studies

This paper studied the two-dimensional loading time-dependent vehicle routing problem. As previously mentioned, despite the possible applications of this problem in distribution networks, the literature on simultaneous routing and loading problems lacks proper investigation of it. This paper first introduced the problem and then presented a bi-objective mathematical model that incorporates the FIFO property of the TDVRP. Given the NP-hard nature of the problem, an approach called ENSLS was developed to solve it. To evaluate the performance of the proposed algorithm on small- and medium-scale problem instances, its results were compared with those of two other algorithms (SPEA2 and NSGA-II). While all algorithms performed well on the small-scale instances, the proposed algorithm showed the best performance, with an error of only 0.28% in the first objective function, compared with 0.34 and 0.48% for NSGA-II and SPEA2, respectively. The average time to find the best solutions and the average time to run to completion for ENSLS were 1.30 and 5.14, respectively, compared with 2.22 and 6.90 for NSGA-II and 1.86 and 6.59 for SPEA2. In the medium-scale problems, the average values of QM, SM, DM, the best value of objective 1, the best value of objective 2, the time to find the best solutions and the time to run to completion were 0.63, 0.50, 0.98, 1099.75, 4665.41, 1117.10 and 1234.28 for the proposed algorithm; 0.24, 0.55, 1.00, 1156.90, 4667.15, 981.61 and 1088.85 for NSGA-II; and 0.13, 0.43, 0.95, 1185.22, 4676.76, 940.30 and 1040.93 for SPEA2. These results demonstrate the good performance of the proposed algorithm in solving small- and medium-scale problem instances and therefore its applicability to the assessed problem.

Future research could consider this problem in scenarios where the items demanded by customers have non-rectangular cross sections. Integrating the time-dependent vehicle routing problem with pickup and delivery with loading constraints could also be an interesting field of research. Researchers could also explore other solution approaches for the presented problem with higher performance and accuracy.