1 Introduction

Real-world problems often involve a large number of decision variables and constraints, making it impossible to find the global optimal solution within the given time budget. When tackling large-scale optimization problems, the divide-and-conquer approach is commonly adopted to decompose the overall problem into smaller sub-problems (Boyd et al. 2007; Omidvar et al. 2014; Mei et al. 2014a). For many real-world problems, the sub-problems are naturally defined. For example, in supply chain management (Thomas and Griffin 1996; Stadtler 2005; Melo et al. 2009), each stage or operation such as procurement, production and distribution can correspond to a sub-problem. However, such sub-problems often remain interdependent. As mentioned in Michalewicz (2012), one of the main complexities of real-world problems is the interdependence between sub-problems, which makes many conventional approaches ineffective. As a result, even if each sub-problem has been intensively investigated, it is still an open question how to integrate the high-quality partial solutions for the sub-problems to obtain a global optimum or at least a high-quality solution for the overall problem. Therefore, it is important to investigate how to tackle the interdependence between sub-problems.

To facilitate such investigation, Bonyadi et al. (2013) recently defined a benchmark problem called the Travelling Thief Problem (TTP). TTP is a combination of two well-known combinatorial optimization problems, i.e., the Travelling Salesman Problem (TSP) and the Knapsack Problem (KP). Specifically, a thief is to visit a set of cities and pick some items from the cities to put in a rented knapsack. Each item has a value and a weight. The knapsack has a limited capacity that cannot be exceeded by the total weight of the picked items. In the end, the thief has to pay the rent for the knapsack, which depends on the travel time. TTP aims to find a tour for the thief to visit all the cities exactly once, pick some items along the way and finally return to the starting city, so that the benefit of the visit, which is the total value of the picked items minus the rent of the knapsack, is maximized. Since TSP and KP have been intensively investigated, TTP makes it possible to concentrate on the interdependence between sub-problems.

An example of a potentially relevant real-world application of TTP is the capacitated arc routing problem (Dror 2000) with service profit. Although there have been extensive studies for solving various forms of the capacitated arc routing problem depending on different practical scenarios [e.g., the classic model (Mei et al. 2009a, b; Tang et al. 2009; Fu et al. 2010), the multi-objective model (Mei et al. 2011a), the stochastic model (Mei et al. 2010), the periodic model (Mei et al. 2011b) and the large-scale model (Mei et al. 2013, 2014a, b)], two important practical issues have been overlooked so far. One is the service profit, which is the profit that can be gained by serving the customers. Each customer may have a different profit/demand ratio. Thus, given the limited capacity of the vehicle, one may need to serve only a subset of the customers with higher profit/demand ratios to maximize the final benefit. The other factor is the dependency of the travel cost of the vehicle on its load. A heavier load leads to a higher consumption of petrol, and thus a higher travel cost. In this case, it would be more desirable to serve the customers with a higher demand first to save the travel cost of the subsequent route. With the above factors taken into account, the resultant arc routing problem can be modelled as a TTP.

In this paper, the interdependence of TSP and KP in TTP is investigated both theoretically and empirically. First, the mathematical formulation of TTP is developed and analysed to show how the two sub-problems interact with each other. Then, a Cooperative Co-evolution algorithm (CC) (including a standard and a dynamic version) and a Memetic Algorithm (MA) are developed. CC solves TSP and KP separately, and transfers the information between them in each generation. MA solves TTP as a whole. Standard crossover and mutation operators are employed. The proposed algorithms were compared on the benchmark instances proposed in Bonyadi et al. (2013), and the results showed that MA managed to obtain much better solutions than CC for all the test instances. In other words, with the same crossover and mutation operators for each sub-problem, a more appropriate way of integrating the optimization process of the sub-problems can result in a significantly better solution. This demonstrates that considering the interdependence between sub-problems is important for obtaining high-quality solutions for the overall problem. Moreover, the theoretical analysis establishes the fundamental understanding of the problem.

The rest of the paper is organized as follows: TTP is formulated and analysed in Sect. 2. After that, CC and MA are depicted in Sect. 3. Then, the experimental studies are carried out in Sect. 4. Finally, the conclusion and future work are described in Sect. 5.

2 Travelling Thief Problem

In this section, TTP is introduced. The mathematical formulation is first described in Sect. 2.1 and then analysed in Sect. 2.2, particularly in terms of the interdependence between the TSP and KP decision variables in the objective function.

2.1 Mathematical formulation

TTP is a combination of TSP and KP. In TSP, \(n\) cities with the distance matrix of \(D_{n \times n}\) are given, where \(d_{ij}\) is the distance from city \(i\) to \(j\). In KP, there are \(m\) items. Each item \(i\) has a weight \(w_i\), a value \(b_i\) and a set of available cities \(A_i\). For example, \(A_i = \{1,2,5\}\) implies that item \(i\) can only be picked from city \(1\), \(2\) or \(5\). A thief aims to visit all the cities exactly once, pick items on the way and finally come back to the starting city. The thief rents a knapsack to carry the items, which has a capacity of \(Q\). The rent of the knapsack is \(R\) per time unit. The speed of the thief decreases linearly with the increase of the total weight of carried items and is computed by the following formula:

$$\begin{aligned} v = v_{\max }-(v_{\max }-v_{\min })\frac{\bar{w}}{Q}, \end{aligned}$$
(1)

where \(0 \le \bar{w} \le Q\) is the current total weight of the picked items. When the knapsack is empty (\(\bar{w} = 0\)), the speed is maximized (\(v = v_{\max }\)). When the knapsack is full (\(\bar{w} = Q\)), the speed is minimized (\(v = v_{\min }\)). Then, the benefit gained by the thief is defined as the total value of the picked items minus the rent of the knapsack.

Figure 1 illustrates an example of a TTP solution that travels through the path A–B–C–D–A, picking items \(1\) and \(2\) at cities A and C, respectively. The weights and values of the items are \(w_1=b_1=2\) and \(w_2=b_2=1\). The numbers associated with the arcs indicate the distances between the cities. The total value of the picked items is \(b_1+b_2=3\). The travel speeds between each pair of cities are \(v_{\mathrm{AB}}=v_{\mathrm{BC}}=4-3\cdot 2/3=2\) and \(v_{\mathrm{CD}}=v_{\mathrm{DA}}=1\). Then, the total travel time is \((2+1.5)/2+(1+1.5)/1=4.25\). Finally, the benefit of the travel is \(3-1\cdot 4.25=-1.25\) (a loss of \(1.25\)).

Fig. 1 An example of a TTP solution
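The arithmetic of this example can be reproduced with a few lines of Python. This is only a minimal sketch: the parameter values \(v_{\max }=4\), \(v_{\min }=1\), \(Q=3\) and \(R=1\), as well as the assignment of the distances to the individual arcs, are inferred from the numbers quoted in the text and are therefore assumptions.

```python
def speed(w, v_max, v_min, Q):
    """Eq. (1): the speed decreases linearly with the carried weight w (0 <= w <= Q)."""
    return v_max - (v_max - v_min) * w / Q


# Parameter values inferred from the worked example (assumptions, see lead-in).
V_MAX, V_MIN, Q, R = 4.0, 1.0, 3.0, 1.0

# (distance of the arc, weight carried while travelling along it)
legs = [
    (2.0, 2.0),  # A -> B, item 1 (weight 2) was picked at A
    (1.5, 2.0),  # B -> C
    (1.0, 3.0),  # C -> D, item 2 (weight 1) was picked at C
    (1.5, 3.0),  # D -> A
]

total_value = 2.0 + 1.0  # b_1 + b_2
travel_time = sum(d / speed(w, V_MAX, V_MIN, Q) for d, w in legs)
benefit = total_value - R * travel_time

print(travel_time, benefit)
```

The two quantities printed are the total travel time and the benefit discussed in the example.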

To develop a mathematical formulation for TTP, the 0–1 decision variables \(x_{ij}\) (\(i,j=1,\ldots ,n\)), \(y_i\) (\(i=1,\ldots ,n\)) and \(z_{ij}\) (\(i=1,\ldots ,m; j=1,\ldots ,n\)) are defined. The TSP decision variable \(x_{ij}\) takes \(1\) if the tour contains the link from city \(i\) to \(j\), and \(0\) otherwise. The starting city decision variable \(y_i\) equals \(1\) if city \(i\) is the starting city, and \(0\) otherwise. The KP decision variable \(z_{ij}\) takes \(1\) if item \(i\) is picked in city \(j\), and \(0\) otherwise. Then, TTP can be formulated as follows:

$$\begin{aligned}&\max \quad \mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z}) \end{aligned}$$
(2)
$$\begin{aligned}&s.t.{:} \sum _{i=1,i\ne j}^{n}x_{ij} = 1, \quad j = 1, \dots , n\end{aligned}$$
(3)
$$\begin{aligned}&\qquad \quad \sum _{j=1,j\ne i}^{n}x_{ij} = 1, \quad i = 1, \dots , n \end{aligned}$$
(4)
$$\begin{aligned}&\qquad \quad u_i-u_j+nx_{ij} \le n-1, \quad 1 \le i \ne j \le n \end{aligned}$$
(5)
$$\begin{aligned}&\qquad \quad \sum _{i=1}^{n}y_i = 1 \end{aligned}$$
(6)
$$\begin{aligned}&\qquad \quad \sum _{j \in A_k} z_{kj} \le 1, \quad k = 1, \dots , m \end{aligned}$$
(7)
$$\begin{aligned}&\qquad \quad \sum _{j \notin A_k} z_{kj} = 0, \quad k = 1, \dots , m \end{aligned}$$
(8)
$$\begin{aligned}&\qquad \quad \sum _{k=1}^{m}\sum _{j=1}^{n}w_kz_{kj} \le Q \end{aligned}$$
(9)
$$\begin{aligned}&\qquad \quad x_{ij}, y_i, z_{kj} \in \{0,1\}, u_i \ge 0,\nonumber \\&\qquad \quad \quad i,j = 1, \dots n; \; k = 1, \dots , m \end{aligned}$$
(10)

The objective (2) is to maximize the benefit \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\), whose definition is complex and thus will be described in detail later. The constraints (3)–(5) are standard constraints that ensure the validity of the TSP solution. In Eq. (5), the \(u_i\)’s (\(i = 1,\ldots ,n\)) are non-negative artificial variables to avoid solutions with sub-tours. The constraint (6) indicates that the tour has exactly one starting city. The constraints (7)–(9) imply that each item is picked at most once from its set of available cities, and the total weight of the picked items cannot exceed the capacity of the knapsack. The constraint (10) defines the domain of the variables.

In Eq. (2), \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\) is defined as follows:

$$\begin{aligned} \mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z}) = \sum _{i=1}^{m}\sum _{j=1}^{n}b_iz_{ij}-R \cdot T, \end{aligned}$$
(11)

where \(T\) is the total travelling time that is calculated as

$$\begin{aligned}&T = \sum _{j=1}^{n}y_jT_j \end{aligned}$$
(12)
$$\begin{aligned}&T_j = \sum _{l=1}^{n}\frac{P_{jl}-P_{j(l-1)}}{v_{\max }-(v_{\max }-v_{\min })\bar{w}_{jl}/Q} \end{aligned}$$
(13)
$$\begin{aligned}&P_{jl} = \sum _{\begin{array}{c} k_1,\dots ,k_l=1 \\ k_1\ne \dots \ne k_l \end{array}}^{l}(d_{jk_1}+d_{k_1k_2}+\dots +d_{k_{l-1}k_l})\nonumber \\&\qquad \qquad \times x_{jk_1}x_{k_1k_2} \dots x_{k_{l-1}k_l} \end{aligned}$$
(14)
$$\begin{aligned}&\bar{w}_{jl} = \sum _{\begin{array}{c} k_1,\dots ,k_r=1 \\ k_1\ne \dots \ne k_r \end{array}}^{l}\left( \sum _{r=1}^{l}\sum _{i=1}^{m}w_iz_{ik_r}\right) x_{jk_1}x_{k_1k_2} \dots x_{k_{l-1}k_l} \end{aligned}$$
(15)

In Eq. (12), \(T\) is defined as the total travelling time \(T_j\) starting from city \(j\) where \(y_j=1\). In Eq. (13), \(P_{jl}\) stands for the distance of the path with \(l\) links starting from city \(j\), and \(\bar{w}_{jl}\) is the current total weight of the picked items after visiting \(l\) cities excluding the starting city \(j\). They are calculated by Eqs. (14) and (15), respectively. Note that \(x_{jk_1}x_{k_1k_2} \dots x_{k_{l-1}k_l}\) equals \(1\) if \(x_{jk_1}=x_{k_1k_2}=\dots =x_{k_{l-1}k_l}=1\), i.e., the solution \((\mathbf {x},\mathbf {y},\mathbf {z})\) includes an \(l\)-length path \((j, k_1, \ldots , k_l)\), and \(0\) otherwise. Therefore, Eq. (14) only counts the total distance of the path that exists in the solution \((\mathbf {x},\mathbf {y},\mathbf {z})\), and Eq. (15) only sums up the weights of the items picked along that path.

2.2 Problem analysis

From Eqs. (3)–(10), one can see that in the constraints, the decision variables \(\mathbf {x}\), \(\mathbf {y}\) and \(\mathbf {z}\) are independent of each other: Eqs. (3)–(5) only consist of \(\mathbf {x}\), Eq. (6) only includes \(\mathbf {y}\), and Eqs. (7)–(9) solely involve \(\mathbf {z}\). However, as shown in Eqs. (11)–(15), there is a non-linear relationship between the variables in the objective \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\). For example, Eq. (15) includes the product of the \(z_{ij}\)’s and \(x_{ij}\)’s, and Eq. (13) involves the quotient of the \(x_{ij}\)’s and \(z_{ij}\)’s. In the above formulation, it is difficult, if not impossible, to find an additive separation of \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\). That is, one cannot find functions \(\mathcal {G}_1(\mathbf {x})\), \(\mathcal {G}_2(\mathbf {y})\) and \(\mathcal {G}_3(\mathbf {z})\) such that \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z}) = \mathcal {G}_1(\mathbf {x})+\mathcal {G}_2(\mathbf {y})+\mathcal {G}_3(\mathbf {z})\). In other words, the overall problem \(\mathcal {P}(\mathbf {x},\mathbf {y},\mathbf {z})\) cannot be decomposed into independent sub-problems \(\mathcal {P}_1(\mathbf {x})\), \(\mathcal {P}_2(\mathbf {y})\) and \(\mathcal {P}_3(\mathbf {z})\) such that \(\mathcal {OBJ}(\mathcal {P}) = \mathcal {OBJ}(\mathcal {P}_1)+\mathcal {OBJ}(\mathcal {P}_2)+\mathcal {OBJ}(\mathcal {P}_3)\), where \(\mathcal {OBJ}(\mathcal {P})\) stands for the objective function of the problem \(\mathcal {P}\).

The above analysis enables us to better understand why solving the sub-problems individually can hardly lead to high-quality solutions. Taking TTP as an example, Bonyadi et al. (2013) designed a simple decomposition of TTP into TSP and KP by setting \(\mathcal {G}_1(\mathbf {x},\mathbf {y}) = \mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {0})\cdot v_{\max }/R = -td(\mathbf {x})\) and \(\mathcal {G}_3(\mathbf {z}) = \sum _{i=1}^{m}\sum _{j=1}^{n}b_iz_{ij}\), where \(td(\mathbf {x})\) stands for the total distance of the TSP tour \(\mathbf {x}\), which is independent of the starting city decision variables \(\mathbf {y}\). In other words, TTP was decomposed into TSP and KP with standard objective functions (minimizing total distance for TSP and maximizing total value for KP). However, the preliminary experimental results showed that such a decomposition cannot lead to good TTP solutions. Based on the above analysis, the reason is that the original objective \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\) is not the sum of \(\mathcal {G}_1(\mathbf {x},\mathbf {y})\) and \(\mathcal {G}_3(\mathbf {z})\). Thus, optimizing \(\mathcal {G}_1(\mathbf {x},\mathbf {y})\) and \(\mathcal {G}_3(\mathbf {z})\) is not directly related to optimizing \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\) itself.
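The identity \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {0})\cdot v_{\max }/R = -td(\mathbf {x})\) can be checked directly from Eqs. (11)–(13). The following short derivation assumes the convention \(P_{j0}=0\), which is not stated explicitly in the formulation. With \(\mathbf {z}=\mathbf {0}\) no item is carried, so \(\bar{w}_{jl}=0\) for every \(l\) and each link is travelled at \(v_{\max }\):

$$\begin{aligned} T_j&= \sum _{l=1}^{n}\frac{P_{jl}-P_{j(l-1)}}{v_{\max }} = \frac{P_{jn}}{v_{\max }} = \frac{td(\mathbf {x})}{v_{\max }}, \\ \mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {0})&= 0 - R\cdot \frac{td(\mathbf {x})}{v_{\max }} \quad \Rightarrow \quad \mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {0})\cdot \frac{v_{\max }}{R} = -td(\mathbf {x}). \end{aligned}$$

Here \(P_{jn}\) is the length of the closed tour, which does not depend on the starting city \(j\).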

To summarize, the mathematical formulation of TTP shows that the objective \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\) is not additively separable. Therefore, one cannot expect that solving the TSP and KP sub-problems individually will obtain competitive TTP solutions, since their objectives are not fully correlated with \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\). In this paper, each solution is evaluated directly with respect to the original objective \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\), given that no TSP and KP objective functions strongly correlated with \(\mathcal {G}(\mathbf {x},\mathbf {y},\mathbf {z})\) have been found so far.

3 Solving TTP with meta-heuristics

According to the mathematical formulation described in Sect. 2.1, TTP is a complex nonlinear integer optimization problem. TTP is also NP-hard, since TSP, which has been proved to be NP-hard (Papadimitriou 1977), is the special case of TTP with \(w_i = b_i = 0, \forall i = 1,\ldots ,m\). In this situation, meta-heuristics are good alternatives, as they have been demonstrated to be able to obtain competitive solutions within a reasonable computational budget for various NP-hard combinatorial optimization problems (Mei et al. 2009a, 2011a, b; Tang et al. 2009; Fuellerer et al. 2010; Bolduc et al. 2010; De Giovanni and Pezzella 2010; Sbihi 2010). In the following, two meta-heuristic approaches are proposed for solving TTP. The former is a Cooperative Co-evolution algorithm (CC) (Potter and De Jong 1994) that optimizes the TSP and KP decision variables separately and exchanges the information between them regularly. The latter is a Memetic Algorithm (MA) (Moscato 1989) which considers TTP as a whole and optimizes all the decision variables simultaneously. Next, the two algorithms are described in turn. Then, their computational complexities are analysed.

3.1 Cooperative Co-evolution

The two sub-problems of TTP, i.e., TSP and KP, are both well-known combinatorial optimization problems. They have been investigated intensively, and various algorithms have been proposed for solving them (Lin and Kernighan 1973; Dorigo and Gambardella 1997; Horowitz and Sahni 1974; Fidanova 2007). However, the algorithm for TTP is not straightforward due to the interdependence between the TSP and KP decision variables in the objective. In this case, an intuitive approach is to optimize the TSP and KP decision variables separately and transfer the information between them during the optimization. The Cooperative Co-evolution (CC) (Potter and De Jong 1994) is a standard approach to this end. It decomposes the decision variables into a number of subcomponents and evolves them separately. The transfer of information is conducted by the collaboration between the subcomponents occurring in evaluation. When evaluating an individual of a subcomponent, it is combined with the collaborators (e.g., the individual with the best fitness value) that are selected from the other subcomponents. Then, its fitness is set corresponding to that of the combined individual(s) of the overall problem.

As mentioned in Wiegand et al. (2001), when selecting the collaborators, there are three main issues that affect the performance of the CC: collaborator selection pressure, collaboration pool size and collaboration credit assignment. They are described as follows:

  • Collaborator selection pressure: The degree of greediness of selecting a collaborator. In general, if the subcomponents are independent from each other, then one should set the strongest selection pressure, i.e., select the best-so-far individuals as collaborators. On the other hand, for the non-linearly interdependent subcomponents, a weak selection pressure is more promising, e.g., selecting the collaborators randomly (Wiegand et al. 2001). Another empirical study (Stoen 2006) also showed that proportional selection performs better than random selection.

  • Collaboration pool size: The number of collaborators selected from each other subcomponent. A larger pool size leads to a more comprehensive exploration of the solution space and thus a better final solution quality. However, it induces a higher time complexity since it requires more fitness evaluations to obtain the fitness of an individual. A better alternative is to adaptively change the pool size during the optimization process (Panait and Luke 2005).

  • Collaboration credit assignment: The method of assigning the fitness value based on the objective values obtained together with the collaborators. The empirical studies (Wiegand et al. 2001) showed that the optimistic strategy that assigns the fitness of an individual as the objective value of its best collaboration generally leads to the best results.

Based on the previous studies, the collaboration strategy in the proposed CC for TTP is set as follows: when evaluating an individual of TSP (KP, resp.), the best \(k\) individuals of KP (TSP, resp.) are selected to be collaborators. Then, the fitness of the individual is set as the best objective value among the \(k\) objective values.
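The collaboration strategy can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function names `select_collaborators` and `collaborative_fitness` are hypothetical, and the objective is passed in as an arbitrary callable standing in for \(\mathcal {G}\).

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def select_collaborators(population: Sequence[A],
                         fitness: Sequence[float],
                         k: int) -> List[A]:
    """Return the k individuals with the largest fitness values."""
    order = sorted(range(len(population)),
                   key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in order[:k]]


def collaborative_fitness(individual: A,
                          collaborators: Sequence[B],
                          objective: Callable[[A, B], float]) -> float:
    """Optimistic credit assignment: the best objective value among
    the collaborations with each of the k collaborators."""
    return max(objective(individual, c) for c in collaborators)


# Toy usage with a stand-in objective (not the TTP benefit).
toy_objective = lambda a, b: -(a - b) ** 2
best_two = select_collaborators([1, 5, 9], [0.1, 0.9, 0.5], k=2)
score = collaborative_fitness(8, best_two, toy_objective)
```

Each fitness evaluation costs \(k\) objective evaluations, which is why a larger pool size increases the computational effort per individual.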

The issue of collaboration in CC has been overlooked so far, and most of the limited studies are focused on continuous optimization problems (Potter and De Jong 1994; Potter 1997; Wiegand et al. 2001; Bull 2001; Panait and Luke 2005; Stoen 2006). For the combinatorial optimization problems, Bonyadi and Moghaddam (2009) proposed a CC for multi-processor task scheduling, in which the collaboration pool size equals the population size, i.e., all the individuals are selected as collaborators. Ibrahimov et al. (2012) proposed a CC for a simple two-silo supply chain problem, which selects the best individual plus two random individuals for collaboration.

The CC for TTP is depicted in Algo. 1. In lines 7 and 8, solveTSP\(()\) and solveKP\(()\) are described in Algos. 2 and 3, which have the same framework. First, two parents are selected randomly and the crossover operator is applied to them. Then, the local search process is conducted with the probability of \(P_{ls}\). Finally, all the generated offspring are combined with the original population and the best \(N\) (\(k\), resp.) individuals are selected to form the new population (collaborators, resp.). In the algorithm, the fitness function \(\mathcal {F}()\) returns the best objective value of all the collaborations, i.e.,

$$\begin{aligned} \mathcal {F}(\mathbf {x},\mathbf {CZ})&= \max _{l\in \{1,\dots ,k\}}\{\mathcal {G}(\mathbf {x},\mathbf {cz}_l)\} \end{aligned}$$
(16)
$$\begin{aligned} \mathcal {F}(\mathbf {z},\mathbf {CX})&= \max _{l\in \{1,\dots ,k\}}\{\mathcal {G}(\mathbf {cx}_l,\mathbf {z})\} \end{aligned}$$
(17)
Algo. 1 The CC for TTP
Algo. 2 solveTSP\(()\)
Algo. 3 solveKP\(()\)

Note that in Eqs. (16) and (17), the objective function \(\mathcal {G}(\mathbf {x},\mathbf {z})\) does not take the starting city decision variables \(\mathbf {y}\) into account. This is because in the algorithm, a TTP solution \(\mathbf {(x,z)}\) is represented as the combination of a TSP tour \(\mathbf {x} = (x_1, \ldots , x_n)\) and a KP picking plan \(\mathbf {z} = (z_1, \ldots , z_m)\). \(\mathbf {x}\) is a permutation of the \(n\) cities, with \(x_i \in \{1, \ldots , n\}, \forall i = 1, \ldots , n\), and \(z_i \in A_i \cup \{0\}, \forall i = 1, \ldots , m\) indicates the city to pick the item \(i\). \(z_i = 0\) implies that item \(i\) is not picked throughout the way. The TSP tour naturally starts from city \(x_1\). Thus, the starting city is implicitly determined by \(\mathbf {x}\), and \(\mathbf {y}\) can be eliminated. Given a TTP solution \(\mathbf {(x,z)}\), its benefit is computed by Algo. 4. The computational complexity of \(\mathcal {G}(\mathbf {x},\mathbf {z})\) is \(O(nm)\).

Algo. 4 Computing the benefit \(\mathcal {G}(\mathbf {x},\mathbf {z})\) of a TTP solution
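The evaluation of a solution in the permutation representation can be illustrated with the following sketch. It is not a reproduction of Algo. 4: the function name `ttp_benefit` and its signature are hypothetical, cities are 0-based, `None` replaces the value \(0\) for an unpicked item, and items picked in a city are assumed to be carried on the link leaving that city, as in the worked example of Fig. 1. This particular sketch first aggregates the picked weight per city and then walks the tour once.

```python
def ttp_benefit(tour, plan, dist, weights, values, Q, R, v_max, v_min):
    """Benefit of a TTP solution (tour, plan).

    tour : permutation of city ids; the thief starts at tour[0] and returns to it.
    plan : plan[i] is the city where item i is picked, or None if it is not picked.
    dist : dist[a][b] is the distance from city a to city b.
    """
    picked_weight = {}  # city -> total weight picked in that city
    total_value = 0.0
    for i, city in enumerate(plan):
        if city is not None:
            picked_weight[city] = picked_weight.get(city, 0.0) + weights[i]
            total_value += values[i]

    load, time = 0.0, 0.0
    n = len(tour)
    for pos, city in enumerate(tour):
        load += picked_weight.get(city, 0.0)  # pick before leaving the city
        if load > Q:
            raise ValueError("knapsack capacity exceeded")
        nxt = tour[(pos + 1) % n]  # the last link closes the tour
        time += dist[city][nxt] / (v_max - (v_max - v_min) * load / Q)
    return total_value - R * time


# Instance inferred from the worked example (cities A, B, C, D -> 0, 1, 2, 3);
# the distances of arcs not used by the tour are arbitrary placeholders.
dist = [[0, 2, 9, 1.5],
        [2, 0, 1.5, 9],
        [9, 1.5, 0, 1],
        [1.5, 9, 1, 0]]
g = ttp_benefit([0, 1, 2, 3], [0, 2], dist, [2, 1], [2, 1],
                Q=3, R=1, v_max=4, v_min=1)
```

Because the speed on every link depends on the items picked earlier in the tour, changing either the tour or the picking plan generally changes the time of many links at once.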

Conventional crossover and mutation operators for the TSP and KP are adopted here. Specifically, the ordered crossover (Oliver et al. 1987) and 2-opt (Croes 1958) operators are used for the TSP, and the traditional one-point crossover, flip and exchange operators are used for the KP. They are described in detail as follows:

Ordered Crossover (OX): Given two tours \(\mathbf {x}_1 = (x_{11}, \ldots , x_{1n})\) and \(\mathbf {x}_2 = (x_{21}, \ldots , x_{2n})\), two cutting positions \(1\le p \le q \le n\) are randomly selected, and \((x_{1p},\ldots ,x_{1q})\) is copied to the corresponding positions of the offspring \((x'_p,\ldots ,x'_q)\). After that, \(\mathbf {x}_2\) is scanned from position \(q+1\) to the end and then from beginning to position \(q\). The unduplicated elements are placed one after another in \(\mathbf {x}'\) from position \(q+1\) to the end, and then from beginning to position \(p-1\). The complexity of the OX operator is \(O(n)\).
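A minimal sketch of the OX operator as described above is given below. It uses 0-based positions, and the function name `ordered_crossover` and the optional arguments `p`, `q` and `rng` are introduced here only for illustration.

```python
import random


def ordered_crossover(x1, x2, p=None, q=None, rng=random):
    """Ordered crossover of two tours with 0-based cut positions p <= q (inclusive)."""
    n = len(x1)
    if p is None or q is None:
        p, q = sorted((rng.randrange(n), rng.randrange(n)))

    child = [None] * n
    child[p:q + 1] = x1[p:q + 1]          # copy the segment from the first parent
    used = set(x1[p:q + 1])

    # Scan x2 from position q+1 to the end, then from the beginning to q.
    donors = [x2[(q + 1 + s) % n] for s in range(n)]
    fill = [city for city in donors if city not in used]

    # Fill the child from position q+1 to the end, then from the beginning to p-1.
    targets = [(q + 1 + s) % n for s in range(n - (q - p + 1))]
    for pos, city in zip(targets, fill):
        child[pos] = city
    return child


child = ordered_crossover([1, 2, 3, 4, 5, 6, 7, 8],
                          [3, 7, 5, 1, 6, 8, 2, 4], p=2, q=4)
```

The offspring keeps the segment of the first parent in place and inherits the relative order of the remaining cities from the second parent, so it is always a valid permutation.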

2-opt: Given a tour \(\mathbf {x} = (x_1, \ldots , x_n)\), two cutting positions \(1\le p < q \le n\) are chosen and the sub-tour in between is inverted. The offspring is \(\mathbf {x}' = (x_1, \ldots , x_{p-1}, x_q, x_{q-1}, \ldots , x_p, x_{q+1}, \ldots , x_n)\). During the local search, the neighbourhood size defined by the 2-opt operator is \(O(n^2)\).
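The 2-opt move and its neighbourhood can be sketched as follows, using 0-based positions. The function names are hypothetical and the sketch only enumerates neighbours; it does not evaluate them.

```python
def two_opt_move(tour, p, q):
    """Invert the segment tour[p..q] (0-based, inclusive, p < q)."""
    return tour[:p] + tour[p:q + 1][::-1] + tour[q + 1:]


def two_opt_neighbourhood(tour):
    """Yield every tour obtained by one segment inversion: n(n-1)/2 candidates."""
    n = len(tour)
    for p in range(n - 1):
        for q in range(p + 1, n):
            yield two_opt_move(tour, p, q)


neighbour = two_opt_move([1, 2, 3, 4, 5], 1, 3)
size = len(list(two_opt_neighbourhood([1, 2, 3, 4, 5])))
```

The number of position pairs grows quadratically with \(n\), which matches the \(O(n^2)\) neighbourhood size stated above.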

One-Point Crossover (OPX): Given two picking plans \(\mathbf {z}_1 = (z_{11}, \ldots , z_{1m})\) and \(\mathbf {z}_2 = (z_{21}, \ldots , z_{2m})\), a cutting position \(1 \le p \le m\) is picked, and then the offspring is set to \(\mathbf {z}' = (z_{11}, \ldots z_{1(p-1)}, z_{2p}, \ldots z_{2m})\). The OPX operator has a computational complexity of \(O(m)\).

Flip: Given a picking plan \(\mathbf {z} = (z_1, \ldots , z_m)\), a position \(1 \le p \le m\) is selected, and \(z_p\) is replaced by a different value \(z_p' \in A_p \cup \{0\}\). During the local search, the neighbourhood size defined by the Flip operator is \(O(\sum _{i=1}^{m}|A_i|) = O(nm)\), since each position \(p\) offers at most \(|A_p| \le n\) alternative values.

Exchange (EX): Given a picking plan \(\mathbf {z} = (z_1, \ldots , z_m)\), two positions \(1 \le p < q \le m\) are selected, and the values of \(z_p\) and \(z_q\) are exchanged. To keep feasibility, it is required that \(z_q \in A_p \cup \{0\}\) and \(z_p \in A_q \cup \{0\}\). During the local search, the neighbourhood size defined by the EX operator is \(O(m^2)\).
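The three KP operators can be sketched together as follows. The sketch follows the representation used in the text (1-based city ids, \(0\) for an unpicked item); the function names and the list `A` of available-city sets are introduced for illustration, no-op exchanges of equal values are skipped, and the capacity constraint is deliberately not checked here.

```python
def one_point_crossover(z1, z2, p):
    """Offspring takes z1 before the 0-based cut p and z2 from p onwards."""
    return z1[:p] + z2[p:]


def flip_neighbourhood(z, A):
    """Plans differing from z in exactly one position p, new value in A[p] or 0."""
    for p, current in enumerate(z):
        for city in [0, *sorted(A[p])]:
            if city != current:
                yield z[:p] + [city] + z[p + 1:]


def exchange_neighbourhood(z, A):
    """Plans obtained by swapping z[p] and z[q] when both values stay available."""
    for p in range(len(z) - 1):
        for q in range(p + 1, len(z)):
            ok_p = z[q] == 0 or z[q] in A[p]
            ok_q = z[p] == 0 or z[p] in A[q]
            if ok_p and ok_q and z[p] != z[q]:
                yield z[:p] + [z[q]] + z[p + 1:q] + [z[p]] + z[q + 1:]


A = [{1, 2}, {1, 2}, {3}]          # available cities of items 1..3 (toy data)
z = [1, 0, 3]
offspring = one_point_crossover([1, 0, 3], [2, 2, 0], 1)
flips = list(flip_neighbourhood(z, A))
swaps = list(exchange_neighbourhood(z, A))
```

In this toy instance the Flip neighbourhood contains one plan per alternative value of each position, and only the first two items can exchange their values without leaving their sets of available cities.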

Besides the above CC, which will be referred to as the Standard CC (SCC) for the sake of clarity, a variation named the Dynamic CC (DCC) that dynamically updates the collaborators within each generation is developed. From lines 7 and 8 of Algo. 1, one can see that the collaborators are updated after all the sub-problems have been solved. Therefore, within each generation, the latter sub-problem (i.e., KP) cannot use the updated collaborators obtained by the former sub-problem (i.e., TSP). To increase efficiency, the DCC simply replaces line 8 with the following line:

$$\begin{aligned} (\mathbf {Z}^{(g+1)},\mathbf {CZ}^{(g+1)})=\text {solveKP}(\mathbf {Z}^{(g)}, \mathbf {CX}^{(g+1)}); \end{aligned}$$

In other words, the old collaborators \(\mathbf {CX}^{(g)}\) are replaced by the updated ones \(\mathbf {CX}^{(g+1)}\).
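The difference between SCC and DCC within one generation can be sketched as follows. The sketch assumes that lines 7 and 8 of Algo. 1 call solveTSP and solveKP in this order and that each call returns the new population together with the new collaborators; `cc_generation` and the two stub solvers are hypothetical placeholders, not the authors' code.

```python
def cc_generation(X, Z, CX, CZ, solve_tsp, solve_kp, dynamic=False):
    """One generation of CC.

    solve_tsp(X, CZ) and solve_kp(Z, CX) are assumed to return
    (new_population, new_collaborators).
    """
    X_next, CX_next = solve_tsp(X, CZ)
    # SCC: KP still sees the old TSP collaborators CX.
    # DCC: KP already sees the collaborators produced in this generation.
    Z_next, CZ_next = solve_kp(Z, CX_next if dynamic else CX)
    return X_next, Z_next, CX_next, CZ_next


# Stub solvers that only record which TSP collaborators the KP solver received.
seen = []


def stub_solve_tsp(X, CZ):
    return X, "CX_new"


def stub_solve_kp(Z, CX):
    seen.append(CX)
    return Z, "CZ_new"


cc_generation([], [], "CX_old", "CZ_old", stub_solve_tsp, stub_solve_kp, dynamic=False)
cc_generation([], [], "CX_old", "CZ_old", stub_solve_tsp, stub_solve_kp, dynamic=True)
```

The only change between the two variants is the second argument passed to the KP solver.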

3.2 Memetic Algorithm

Based on the above crossover and local search operators, an MA is proposed for the overall problem. In the MA, the TSP and KP are solved together by combining the aforementioned operators. To be specific, the crossover of a TTP solution is conducted by applying the OX and OPX operators to its tour and picking plan simultaneously. Then, during the local search, the neighbourhood of the current solution is defined as the union of the neighbourhoods induced by all the 2-opt, Flip and EX operators.

The framework of the proposed MA is described in Algo. 5. In line 5, solveTTP() is described in Algo. 6. The only difference between solveTTP() and solveTSP() or solveKP() is in lines 6–7 and lines 15–18, which are the crossover and neighbourhood definition during the local search, respectively.

Algo. 5 The framework of the proposed MA
Algo. 6 solveTTP\(()\)

3.3 Computational complexity analysis

The computational complexities of the proposed algorithms are as follows:

$$\begin{aligned}&O(\text {CC}) = g_{\max }(O(\text {solveTSP})+O(\text {solveKP}))\end{aligned}$$
(18)
$$\begin{aligned}&O(\text {solveTSP}) = N_{off}(O(\text {OX})+O(\mathcal {F})+P_{ls}L_{1}S_{1}O_{ls}(\mathcal {F}))\nonumber \\&\qquad \,\,\qquad \qquad \qquad +O(\text {sort})\end{aligned}$$
(19)
$$\begin{aligned}&O(\text {solveKP}) = N_{off}(O(\text {OPX})+O(\mathcal {F})+P_{ls}L_{2}S_{2}O_{ls}(\mathcal {F}))\nonumber \\&\qquad \qquad \qquad \qquad +O(\text {sort})\end{aligned}$$
(20)
$$\begin{aligned}&O(\text {MA}) = g_{\max }O(\text {solveTTP})\end{aligned}$$
(21)
$$\begin{aligned}&O(\text {solveTTP}) = N_{off}(O(\text {OX})+O(\text {OPX})+2O(\mathcal {G})\nonumber \\&\qquad \,\,\qquad \qquad \qquad +P_{ls}L_{3}S_{3}O_{ls}(\mathcal {G}))+O(\text {sort}), \end{aligned}$$
(22)

where \(g_{\max }\) is the maximal number of generations, and \(N_{off}\) is the number of offspring generated in each generation. \(L_1\), \(L_2\) and \(L_3\) stand for the average number of local search steps for TSP, KP and TTP, and \(S_1\), \(S_2\) and \(S_3\) are the neighbourhood sizes of the local search processes in TSP, KP and TTP, respectively. \(O(\cdot )\) stands for the complexity of the corresponding algorithm or operation, and \(O_{ls}(\cdot )\) indicates the complexity of evaluating a neighbouring solution with respect to \(\mathcal {F}\) or \(\mathcal {G}\) during the local search. Given the current solution \(\mathbf {s}\) and \(\mathcal {G}(\mathbf {s})\), the evaluation of each neighbouring solution may be much faster by computing the difference on \(\mathcal {G}\) caused by the modification, i.e., \(\mathcal {G}(\mathbf {s}') = \mathcal {G}(\mathbf {s})+\Delta \mathcal {G}(\mathbf {s},\mathbf {s}')\). For example, when applying the 2-opt operator to TSP that minimizes the total distance, we have \(O_{ls}(tc(\mathbf {s}')) = O(\Delta tc(\mathbf {s},\mathbf {s}')) = O(1)\), which is much lower than \(O(tc(\mathbf {s})) = O(n)\). However, in TTP, \(O_{ls}(\mathcal {G})=O(\mathcal {G})=O(nm)\), since it is still necessary to calculate the speed between each pair of adjacent cities in the tour. Then, based on Eqs. (16) and (17), we have \(O(\mathcal {F})=kO(\mathcal {G})=kO(nm)\) and \(O_{ls}(\mathcal {F})=kO_{ls}(\mathcal {G})=kO(nm)\).

Besides, we already have

$$\begin{aligned} S_1&= S(\text {2-opt}) = O(n^2)\end{aligned}$$
(23)
$$\begin{aligned} S_2&= S(\text {Flip})+S(\text {EX}) = O(nm)+O(m^2) \end{aligned}$$
(24)
$$\begin{aligned} S_3&= S(\text {2-opt})+S(\text {Flip})+S(\text {EX})\nonumber \\&= O(n^2)+O(nm)+O(m^2) \end{aligned}$$
(25)

It is also known that \(O(\text {OX}) = O(n)\), \(O(\text {OPX}) = O(m)\) and \(O(\text {sort}) = O(N_{off}\log N_{off})\). Clearly, the complexities of the algorithms are dominated by that of the local search. Then, we have

$$\begin{aligned}&O(\text {CC}) = kg_{\max }N_{off}P_{ls}(L_1O(n^3m)\nonumber \\&\qquad \qquad \quad +L_2O(n^2m^2)+L_2O(nm^3)) \end{aligned}$$
(26)
$$\begin{aligned}&O(\text {MA}) = g_{\max }N_{off}P_{ls}(L_3O(n^3m)\nonumber \\&\qquad \qquad \qquad +L_3O(n^2m^2)+L_3O(nm^3)) \end{aligned}$$
(27)
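As a brief check of these expressions, the dominating local search terms of Eqs. (19), (20) and (22) can be expanded with the neighbourhood sizes of Eqs. (23)–(25) and the evaluation costs \(O_{ls}(\mathcal {F})=kO(nm)\) and \(O_{ls}(\mathcal {G})=O(nm)\):

$$\begin{aligned} P_{ls}L_1S_1O_{ls}(\mathcal {F})&= P_{ls}L_1\cdot O(n^2)\cdot kO(nm) = kP_{ls}L_1O(n^3m), \\ P_{ls}L_2S_2O_{ls}(\mathcal {F})&= P_{ls}L_2\cdot (O(nm)+O(m^2))\cdot kO(nm) = kP_{ls}L_2(O(n^2m^2)+O(nm^3)), \\ P_{ls}L_3S_3O_{ls}(\mathcal {G})&= P_{ls}L_3\cdot (O(n^2)+O(nm)+O(m^2))\cdot O(nm) \\&= P_{ls}L_3(O(n^3m)+O(n^2m^2)+O(nm^3)). \end{aligned}$$

Multiplying each term by \(g_{\max }N_{off}\) gives the totals for CC and MA.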

Under the assumption that \(L_1\), \(L_2\) and \(L_3\) are nearly the same, the computational complexity of CC is approximately \(k\) times that of MA. In other words, when \(k=1\), CC and MA are expected to have comparable computational complexity. This will be verified in the experimental studies.

4 Experimental studies

In this section, the proposed CC and MA are compared on the TTP benchmark instances to investigate their performance.

4.1 Experimental settings

A representative subset of the TTP benchmark instances generated by Bonyadi et al. (2013) is selected to compare the performance of the proposed algorithms. The benchmark set includes instances with various features, with the number of cities \(n\) from 10 to 100 and the number of items \(m\) from 10 to 150. For each parameter setting of the problem, 10 instances were generated randomly. As a result, there are 540 instances in total. For the sake of simplicity, for each parameter setting with \(10 \le n,m \le 100\), only the first of the 10 generated instances is chosen as a representative. The selected subset consists of 39 instances. Note that for some instances, the benefit may be negative because the values of the items are insufficient compared to the knapsack rent.

The complete parameter settings of the compared algorithms are given in Table 1. The population size, number of offspring and probability of local search are set in the standard way that has been verified to be effective on similar combinatorial optimization problems (Tang et al. 2009; Mei et al. 2011a). For CC, \(k=1\) and \(k=3\) are tested to investigate the effect of \(k\) on the performance of CC. The number of generations is set to 100 for MA and CC with \(k=1\). For CC with \(k=3\), the number of generations is set to \(\lceil 100/k \rceil = 34\) so that the compared algorithms have a similar total number of fitness evaluations. Each algorithm is run 30 times independently.

Table 1 The parameter settings of the compared algorithms

4.2 Results and discussions

First, the average performance of the proposed algorithms is compared. Tables 2, 3, 4 show the mean and standard deviation of the final benefits obtained by the 30 independent runs of SCC, DCC and MA on the benchmark instances, whose features are included in their names. For an instance named \(n\)–\(m\)–ID–\(\tau \), \(n\) and \(m\) stand for the number of cities and items, ID is the identity of the instance (all are \(1\)’s here, since they are the first instance in each category), and \(\tau \) indicates the tightness of the capacity constraint, which is the capacity of the knapsack over the total weight of the items. For each instance, the result of the algorithm that performed significantly better than the other compared algorithms using the Wilcoxon’s rank sum test (Wilcoxon 1945) at the significance level of 0.05 is marked in bold.

Table 2 Mean and standard deviation of the benefits obtained by the 30 independent runs of the proposed algorithms on the benchmark instances from 10-10-1-25 to 20-30-1-75
Table 3 Mean and standard deviation of the benefits obtained by the 30 independent runs of the proposed algorithms on the benchmark instances from 50-15-1-25 to 50-75-1-75
Table 4 Mean and standard deviation of the benefits obtained by the 30 independent runs of the proposed algorithms on the benchmark instances from 100-10-1-25 to 100-100-1-75
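The significance marking used in Tables 2, 3, 4 can be illustrated with a small pure-Python version of the two-sided rank sum test. This is a minimal sketch that uses the normal approximation without a tie correction in the variance; the function names and the sample values in the usage example are hypothetical and are not the benefits reported in the tables.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p). A positive z means that x tends to be larger than y.
    Ties receive average ranks; the variance is not tie-corrected.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


def significantly_better(candidate, others, alpha=0.05):
    """True if candidate beats every other sample at level alpha.

    Benefits are maximized, so 'better' means larger.
    """
    for other in others:
        z, p = rank_sum_test(candidate, other)
        if not (z > 0 and p < alpha):
            return False
    return True


if __name__ == "__main__":
    # Hypothetical final benefits of two algorithms over 10 runs each.
    algo_a = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
    algo_b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(rank_sum_test(algo_a, algo_b))
    print(significantly_better(algo_a, [algo_b]))
```

With 30 runs per algorithm, as in the experiments, the normal approximation is generally considered adequate; for very small samples an exact test would be the safer choice.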

It can be seen that MA obtained significantly better results than SCC and DCC with both \(k=1\) and \(k=3\) on all the 39 benchmark instances, with larger mean and smaller standard deviation. This implies that MA can obtain better solutions more reliably. For both tested \(k\) values, SCC generally obtained better solutions than DCC, which indicates that it is better to update the collaborators after solving all the sub-problems. A likely reason is that updating the collaborators too frequently drives the search to a local optimum quickly and, because of the strong selection pressure, makes it difficult to escape from that local optimum.

Among the proposed CC algorithms, SCC and DCC with \(k=3\) outperformed the ones with \(k=1\) for all the instances except the large ones (\(n=100\) and \(m \ge 50\)). This shows that for the instances with small or medium solution space, a larger \(k\) can lead to a better result since it has a wider neighborhood and thus is stronger in exploration. On the other hand, for the large-scale instances, a smaller \(k\) is a better option to allow more generations given a fixed total number of fitness evaluations.

Tables 5, 6, 7 show the benefits of the best solution and average number of fitness evaluations of the proposed algorithms on the benchmark instances. The best benefits among the compared ones are marked in bold. During the local search, each computation of the objective value of the neighbouring solutions is considered as a complete fitness evaluation, given that there is no simplified evaluation as in TSP or KP alone.

Table 5 The benefits of the best solution and average number of fitness evaluations of the proposed algorithms from 10-10-1-25 to 20-30-1-75
Table 6 The benefits of the best solution and average number of fitness evaluations of the proposed algorithms from 50-15-1-25 to 50-75-1-75
Table 7 The benefits of the best solution and average number of fitness evaluations of the proposed algorithms from 100-10-1-25 to 100-100-1-75
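The counting convention described above, in which every objective computation during local search counts as one complete fitness evaluation, can be made concrete with a small wrapper. The sketch below uses a toy one-dimensional objective and a first-improvement hill climber as stand-ins; the names `CountingObjective` and `hill_climb` are hypothetical, and neither the toy objective nor the search loop is the local search used by the proposed algorithms.

```python
class CountingObjective:
    """Wraps an objective function and counts every call to it."""

    def __init__(self, objective):
        self.objective = objective
        self.evaluations = 0

    def __call__(self, solution):
        self.evaluations += 1  # each call is one complete evaluation
        return self.objective(solution)


def hill_climb(solution, neighbours, objective):
    """First-improvement hill climbing for a maximization problem."""
    current, current_value = solution, objective(solution)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current):
            value = objective(candidate)
            if value > current_value:
                current, current_value = candidate, value
                improved = True
                break
    return current, current_value


if __name__ == "__main__":
    # Toy stand-in objective with a single maximum at x = 3.
    f = CountingObjective(lambda x: -(x - 3) ** 2)
    best, value = hill_climb(0, lambda x: (x - 1, x + 1), f)
    print(best, value, f.evaluations)
```

Because no cheaper incremental evaluation is assumed, the counter grows with every neighbour inspected, including the rejected ones.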

From the tables, one can see that the best performance of the algorithms is consistent with the average performance. MA performed the best, obtaining the best solutions on all the benchmark instances. SCC with \(k=1\) comes next, obtaining the best solutions on 25 out of the 39 instances. DCC with \(k=1\) performed worse than the corresponding SCC, achieving the best solutions on only 15 instances. Both SCC and DCC with \(k=3\) obtained the best solutions on 13 instances. In terms of computational effort, the compared algorithms have comparable average numbers of fitness evaluations when the problem size is not large. This is consistent with the analysis in Eqs. (26) and (27) and indicates that the average numbers of local search steps \(L_1\), \(L_2\) and \(L_3\) are nearly the same for the small- and medium-sized instances. For the larger instances (\(m, n \ge 50\)), the SCC variants require many more fitness evaluations than the other compared algorithms. Note that DCC generally needs fewer fitness evaluations than SCC, especially on the larger instances. This is because the dynamic change of the collaborators speeds up the convergence of the search process and thus reduces the number of steps (\(L_1\) and \(L_2\) in Eq. (26)) needed to reach the local optimum. Besides, given the same number of generations, the number of fitness evaluations increases significantly with \(n\) and \(m\), which is mainly caused by the growth of the neighbourhood sizes \(S_1=O(n^2)\), \(S_2=O(nm)+O(m^2)\) and \(S_3=O(n^2)+O(nm)+O(m^2)\).
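To give a feel for how quickly the neighbourhood sizes grow, the sketch below evaluates the leading-order terms of \(S_1\), \(S_2\) and \(S_3\) for a few \((n, m)\) pairs taken from the benchmark names. All constant factors are dropped, so the numbers are orders of growth rather than actual neighbourhood sizes, and the function name `neighbourhood_orders` is an illustrative assumption.

```python
def neighbourhood_orders(n, m):
    """Leading-order terms of the neighbourhood sizes (constants dropped).

    S1 ~ n^2, S2 ~ n*m + m^2, S3 ~ n^2 + n*m + m^2.
    """
    s1 = n * n
    s2 = n * m + m * m
    s3 = s1 + s2
    return s1, s2, s3


if __name__ == "__main__":
    for n, m in [(10, 10), (20, 30), (50, 75), (100, 100)]:
        s1, s2, s3 = neighbourhood_orders(n, m)
        print(f"n={n:3d} m={m:3d}  S1~{s1:6d}  S2~{s2:6d}  S3~{s3:6d}")
```

Going from \(n=m=10\) to \(n=m=100\) multiplies each term by 100, which matches the observation that the number of fitness evaluations per generation rises sharply on the larger instances.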

The convergence curves of the compared algorithms on selected representative instances are shown in Figs. 2, 3, 4, 5, 6, 7, 8, where the \(x\)-axis and \(y\)-axis show the number of fitness evaluations and the average benefit of the best-so-far solutions over the different runs, respectively. The selected instances cover the following four categories: (1) small \(n\) and \(m\); (2) small \(n\) and large \(m\); (3) large \(n\) and small \(m\); and (4) large \(n\) and \(m\). MA clearly performed better than the CC algorithms: in almost all the instances, the curve of MA is consistently above those of the other compared algorithms. Since MA solves TTP as a whole, its advantage over the CC algorithms supports the importance of considering the interdependence between the sub-problems of TTP.
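The quantity plotted on the \(y\)-axis can be computed as sketched below: each run is first turned into a best-so-far curve, and the curves are then averaged point by point. The sketch assumes that every run records one benefit per fitness evaluation and that curves are truncated to the shortest run; both are simplifying assumptions, and the sample values are made up for illustration.

```python
def best_so_far(benefits):
    """Running maximum of the benefits seen during one run."""
    curve, best = [], float("-inf")
    for b in benefits:
        best = max(best, b)
        curve.append(best)
    return curve


def average_curve(runs):
    """Point-wise mean of the best-so-far curves of several runs.

    Runs of different length are truncated to the shortest one.
    """
    curves = [best_so_far(r) for r in runs]
    length = min(len(c) for c in curves)
    return [sum(c[i] for c in curves) / len(curves) for i in range(length)]


if __name__ == "__main__":
    # Made-up benefits of two short runs (benefits may be negative).
    runs = [[-5, 3, 1], [0, -2, 4]]
    print(average_curve(runs))
```

Averaging best-so-far values rather than raw benefits keeps each curve non-decreasing, so a crossing of two curves reflects a real change in which algorithm is ahead on average.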

Fig. 2 Convergence curves of the compared algorithms on the TTP instances with \(n=10\) and \(m=10\)
Fig. 3 Convergence curves of the compared algorithms on the TTP instances with \(n=20\) and \(m=10\)
Fig. 4 Convergence curves of the compared algorithms on the TTP instances with \(n=20\) and \(m=30\)
Fig. 5 Convergence curves of the compared algorithms on the TTP instances with \(n=50\) and \(m=15\)
Fig. 6 Convergence curves of the compared algorithms on the TTP instances with \(n=50\) and \(m=75\)
Fig. 7 Convergence curves of the compared algorithms on the TTP instances with \(n=100\) and \(m=10\)
Fig. 8 Convergence curves of the compared algorithms on the TTP instances with \(n=100\) and \(m=100\)

Between the CC algorithms, one can see that DCC converges much faster, but generally obtains worse final results than the corresponding SCC. This implies that the combination of \(k=1\) and dynamic update of the collaborators leads to such a strong selection pressure that the search process becomes stuck in a local optimum at a very early stage and can hardly escape from it. In most of the instances, the CC algorithms with \(k=3\) converged more slowly than the ones with \(k=1\) at the earlier stage of the search. This is due to the much larger number of fitness evaluations (nearly \(k\) times as many) within each generation. However, their curves cross those of the CC algorithms with \(k=1\) (e.g., Figs. 4, 7) and end up above them.

In summary, the competitiveness of the proposed MA sheds light on developing algorithms for complex real-world problems consisting of interdependent sub-problems. First, by solving TTP as a whole, MA can be seen as considering the interdependence between the sub-problems more comprehensively than CC. Second, the properly designed framework and the employed operators lead to a computational complexity comparable to that of CC. In other words, MA explores the solution space more effectively than CC by choosing better “directions” during the search process. This is similar to the idea behind numerical optimization methods that use gradient information, such as the steepest descent and Quasi-Newton methods, and behind CMA-ES (Hansen 2006) in the evolutionary computation field. This implies that, when tackling the interdependence between the sub-problems, the major issue is to design a proper measure that reflects the gradient, or the dependence of the objective value on changes of the decision variables, in the complex combinatorial solution space, based on which one can find the best “direction” during the search process.

5 Conclusion

This paper investigates the interdependence between sub-problems of a complex problem in the context of TTP, which is a simple but representative benchmark problem. The analysis is conducted both theoretically and empirically. First, the mathematical formulations of TTP show that the non-linear interdependence of the sub-problems in the objective function makes it difficult, if not impossible, to decompose the problem into independent sub-problems. The NP-hardness also makes exact methods applicable only to small-sized instances. Then, a CC, which consists of a standard and a dynamic version, and an MA are proposed to solve the problem approximately. The former optimizes the sub-problems separately and exchanges the information in each generation, while the latter solves the problem as a whole. The advantage of MA over CC on the benchmark instances illustrates the importance of considering the interdependence between sub-problems. The significance of the research reported here may go beyond TTP, because there are other similar problems that are composed of two or more sub-problems, each of which is NP-hard. For example, Gupta and Yao (2002) described a combined Vehicle Routing with Time Windows and Facility Location Allocation Problem, which is composed of Vehicle Routing with Time Windows and Facility Location Allocation. The research in this paper may help to understand and solve such problems as well.

In the future, more sophisticated operators such as 3-opt and the Lin–Kernighan (LK) heuristic (Lin and Kernighan 1973) can be employed in an attempt to enhance the search capability of the algorithm. More importantly, measures that take the interdependence between the sub-problems into account, reflecting the dependence of the objective value on changes of the decision variables, are to be designed, so that frameworks can be developed systematically by identifying the best “direction” during the optimization process rather than heuristically.