1 Introduction

Vehicle routing problems and covering problems are important and fundamental problems in Operations Research and Logistics. In this paper, we study the recently introduced time-constrained maximal covering routing problem (TCMCRP), which is a generalization of well-known routing problems such as the orienteering problem (see, e.g., Golden et al. (1987); Gunawan et al. (2016)), and in particular the team orienteering problem (see, e.g., Chao et al. (1996)), and the maximal covering location problem (see, e.g., Church and Velle (1974)). The problem was introduced in Amiri and Salari (2019), and applications in health care were discussed.

In the TCMCRP, we are given a directed graph \(G=(V,A)\), where \(V= 0 \cup F \cup C\) is the set of vertices. The vertex 0 represents the central depot, F the set of facilities and C the set of customers. The arc set \(A={A_{0F}}\cup {A_{FC}}\) is defined as set of routing arcs \({A_{0F}}=\{(i,i'):i,i'\in 0 \cup F\}\) (i.e., the complete directed graph on \(0 \cup F\)) and assignment arcs \({A_{FC}}\subseteq \{(i,j):i \in F, j \in C\}\) (i.e., the assignment arcs are a subset of all possible facility/customer connections). Each arc \((i,i') \in {A_{0F}}\) has a travel distance \(d_{ii'}>0\) associated with it. Moreover, let P represent the set of \(k=|P|\) available vehicles, and let \(L_p\) be a distance limit for each \(p\in P\). A feasible solution consists of one Hamiltonian cycle on a subset of \(0 \cup F\) for each \(p \in P\). Each of these cycles must contain 0 and respect the distance limit \(L_p\). A facility F can only appear in at most one cycle. We will also refer to these cycles as tours \(T_1,\ldots , T_k\). A customer is covered by a solution if there exists an assignment arc in \({A_{FC}}\) between the customer and a facility visited in one of the tours. The goal is to find a feasible solution which maximizes the number of covered customers.

Note that in the instances from literature, \(L_p\) is the same for all vehicles, and the distance function is Euclidean. Both are common assumptions in vehicle routing problems. Moreover, the mixed-integer programming (MIP) formulation presented in Amiri and Salari (2019) implicitly assumes that the distance function satisfies the triangle inequalityFootnote 1. In this work, we present two new MIP formulations for the problem. The first formulation assumes that the distance function satisfies the triangle inequality and \(L_p\) is the same for all vehicles. The second formulations do not need these assumptions. Figure 1 shows an exemplary instance graph of the TCMCRP and its optimal solution for four vehicles and a given distance limit.

Fig. 1
figure 1

Exemplary instance graph of the TCMCRP and its optimal solution for four vehicles and a given distance limit. The blue circle is the central depot, orange boxes are facilities and green triangles are customers. The gray edges in 1a between facilities and customers denote which customers are covered by each facility. For better readability, the arcs between facilities, and facilities and the central depot are not displayed. In the solution 1b, the arcs of the optimal solution are indicated in black, and all facilities and customers not in the solution are grayed out


Contribution and paper outline The TCMCRP was recently introduced in Amiri and Salari (2019), where the authors presented a flow-based MIP model, an iterated local search, a tabu search and a variable neighborhood search for it. They evaluated their algorithms on instances derived from the well-known TSPLIB (Reinelt 1991). In this paper, we develop two exact solution algorithms for the problem based on new MIP-models. One algorithm is based on a compact model, i.e., a model with a polynomial number of variables and constraints. The other model contains an exponential number of constraints. We also describe preprocessing techniques, valid inequalities and primal heuristics for both models. Since for the compact model the presented set of valid inequalities has exponential size, we use branch-and-cut (see, e.g., Conforti et al. (2014)) in both solution algorithms.

In a computational study, we evaluate our solution approaches on the instances from Amiri and Salari (2019). The study reveals that our algorithms are able to find the provably optimal solution for 123 instances, where the optimal solution was not known before, and only 3 instances remain unsolved. Moreover, for most of the instances, our algorithms only take a few seconds, and thus are up to five magnitudes faster than the algorithms presented in Amiri and Salari (2019). Finally, we also discuss some issues with the instances used in Amiri and Salari (2019) (e.g., not all customers have a facility associated with it, and the optimal solution often contains just all reachable customers; the calculation of \(L_p\) is not the same as described in the paper), and introduce a set of new and more difficult instances.

The paper is organized as follows: In the remainder of this section, we give an overview of related work. In Sect. 2, we present our two new MIP-models, together with preprocessing/variable-fixing procedures and valid inequalities. In Sect. 3, we discuss additional details about the developed branch-and-cut framework, such as separation procedures for the valid inequalities and primal heuristics. Section 4 contains the computational study, and Sect. 5 concludes the paper.


Related work As the studied problem is a quite general routing/covering problem, there is naturally a vast number of related work; the paper introducing the TCMCRP (Amiri and Salari 2019) contains a quite exhaustive and up-to-date discussion of related problems. We thus focus the discussion about related work on the team orienteering problem (TOP), since both of our models are extensions of models for the TOP. For a general overview on routing problems, we refer to, e.g., Toth and Vigo (2014) and for a general overview on facility location/covering problems, we refer to, e.g., Laporte et al. (2015) (Chapter 5).

The TOP is an extension of the orienteering problem (OP) to multiple vehicles. The OP was first introduced in Tsiligirides (1984). In the OP, we are given a central depot, a set of profitable customers which can be visited, and a distance limit. The goal is to find the most profitable Hamiltonian cycle on a subset of the customers, the cycle must also contain the depot and respect a given distance limit. Sometimes the OP is defined with a start depot and an end depot, and a Hamiltonian path on a subset of the customers from start to end is searched. Moreover, there is a variant of the OP called selective traveling salesman problem (see Gendreau et al. (1998); Laporte and Martello (1990)), with additional compulsory vertices, which must be in any feasible cycle/path. There exist also many other variants with additional side-constraints (such as capacities or time-windows), for more details, see, e.g., the survey Gunawan et al. (2016). Regarding successful exact approaches for the OP, there are several papers (Fischetti et al. 1998; Gendreau et al. 1998; Leifer and Rosenwein 1994) using branch-and-cut approaches based on models with generalized subtour elimination constraints (GSECs)/connectivity cuts(CCs). Similar GSECs/CCs will also be used in our approaches.

The TOP was introduced in Chao et al. (1996), where a heuristic was proposed. The TOP extends the OP by introducing k (homogeneous) vehicles, i.e., the goal is now to find k Hamiltonian cycles/paths containing the depot and respecting the distance limit, instead of a single one. Note that the TOP can be seen as a special case of the TCMCRP, with a one-to-one-correspondence between facilities and customers. A variant of the TOP with heterogeneous vehicles (denoted as multiple tour maximum collection problem) was considered in Butt and Ryan (1999). Several column generation, and branch-and-price(-and-cut) approaches were proposed for the TOP (Boussier et al. 2007; Butt and Ryan 1999; Keshtkaran et al. 2016; Poggi et al. 2010). An exponential size formulation using GSECs and solved by branch-and-cut was developed in Dang et al. (2013); El-Hajj et al. (2016). In Bianchessi et al. (2018), the authors presented a compact model for the TOP based on a formulation of Maffioli and Sciomachen (1997) for the sequential ordering problem. They strengthen the model by separating CCs and were able to solve additional instances to optimality.

Aside from the TOP, another strongly related problem to the TCMCRP is the time-constrained maximal covering salesman problem (TCMCSP), which was introduced in Naji-Azimi and Salari (2014). The TCMCSP is the single-vehicle variant of the TCMCRP. In Naji-Azimi and Salari (2014), the authors presented a flow-based MIP model and some heuristics for the TCMCSP ( Amiri and Salari (2019) is basically the extension of the approaches in Naji-Azimi and Salari (2014) to the TCMCRP). In Ozbaygin et al. (2016), an exact solution algorithm based on GSECs for a variant of the TCMCSP was proposed.

2 Mixed integer programming models and valid inequalities

We first present the compact model, together with its associated preprocessing and valid inequalities, and then the exponential-sized model, together with its associated preprocessing and valid inequalities. Note that for the compact model the presented set of valid inequalities has exponential size. Thus, we also use branch-and-cut in the algorithm based on the compact model. For later use, for a subset \(S \subseteq 0 \cup F\), let \(\delta ^+(S)=\{(i,i')\in {A_{0F}}: i \in S, i' \not \in S\}\) and \(\delta ^-(S)=\{(i,i')\in {A_{0F}}: i \not \in S, i' \in S\}\) be the set of outgoing, resp., incoming arcs of the cut induced by S. In both models, we allow solutions using less than k vehicles, as we present some valid inequalities based on optimality-arguments, namely constraints (C-FD). These inequalities may cut off some optimal solutions, if an instance has multiple optimal solutions, and the remaining optimal solutions may use less than k vehicles. This situation is detailed below in Example 1.

2.1 Compact model

This formulation follows the approach proposed for the TOP in Bianchessi et al. (2018). For this formulation and its associated valid inequalities, we assume that the distance function fulfills the triangle inequality and that the vehicles are homogeneous. Let L denote the homogeneous distance limit, i.e., \(L_p = L\) for \(p \in P\). Let binary variables \(x_{ii'}=1\), for \((i,i')\in {A_{0F}}\), iff arc \((i,i')\) is traveled by a vehicle in the solution. Let binary variables \(y_{i}=1\), \(i \in F\), iff facility i is visited in the solution, and binary variables \(z_{j}=1\), \(j \in C\), iff customer j is covered by the solution. Moreover, let continuous variables \(f_{ii'}\) for \((i,i')\in {A_{0F}}\) indicate the traveled distance from the central depot at facility \(i'\) for a vehicle arriving from i. Let integer variable \(w \in \{1, \ldots , k\}\) indicate the number of vehicles used in the solution. The compact model, denoted by (C), is as follows.

$$\begin{aligned} \max \quad \sum _{j \in C} z_j&\end{aligned}$$
(C-OBJ)
$$\begin{aligned} s.t.\quad \sum _{(i,j)\in {A_{FC}}} y_i&\ge z_j&\quad \forall j \in C \end{aligned}$$
(C-LINK)
$$\begin{aligned} \sum _{(i,i') \in {{A_{0F}}}} x_{ii'}&=y_i&\quad \forall i \in F \end{aligned}$$
(C-OUT)
$$\begin{aligned} \sum _{(i',i) \in {{A_{0F}}}} x_{i'i}&=y_i&\quad \forall i \in F \end{aligned}$$
(C-IN)
$$\begin{aligned} \sum _{(0,i) \in {{A_{0F}}}} x_{0i}&=w&\end{aligned}$$
(C-OUT0)
$$\begin{aligned} \sum _{(i,0) \in {{A_{0F}}}} x_{i0}&=w&\end{aligned}$$
(C-IN0)
$$\begin{aligned} f_{0i}&=d_{0i}x_{0i}&\quad \forall i \in F \end{aligned}$$
(C-FLOW0)
$$\begin{aligned} \sum _{i' \in \delta ^+(i)} f_{ii'} - \sum _{i' \in \delta ^-(i)} f_{i'i}&= {\sum _{(i,i'') \in {A_{0F}}}} d_{ii''}x_{ii''}&\quad \forall i \in F \end{aligned}$$
(C-FLOW)
$$\begin{aligned} f_{ii'}&\le (L-d_{i'0})x_{ii'}&\quad \forall (i,i') \in {A_{0F}}, {i' \ne 0} \end{aligned}$$
(C-DIST)
$$\begin{aligned} y_i&\in \{0,1\}&\quad \forall i \in F \end{aligned}$$
(C-Y)
$$\begin{aligned} z_j&\in \{0,1\}&\quad \forall j \in C \end{aligned}$$
(C-Z)
$$\begin{aligned} x_{ii'}&\in \{0,1\}&\quad \forall (i,i') \in {A_{0F}}\end{aligned}$$
(C-X)
$$\begin{aligned} f_{ii'}&\ge 0&\quad \forall (i,i') \in {A_{0F}}\end{aligned}$$
(C-F)
$$\begin{aligned} w&\in \{1, \ldots , k\} \end{aligned}$$
(C-W)

The objective function (C-OBJ) and constraints (C-LINK) ensure that a customer is only counted in the objective function, if a facility covering it is visited in the solution. Constraints (C-IN) and (C-OUT) ensure that each visited facility has one incoming and one outgoing arc. Constraints (C-OUT0) and (C-IN0) make sure that there are exactly w vehicles leaving and entering the depot. The solution defined by the previous four set of constraints (plus integrality of the variables) will consist of w or more cycles, with w of these cycles containing the central depot, i.e., subtours are possible. Potential subtours are prohibited using flow-conservation constraints (C-FLOW0) and (C-FLOW). These constraints ensure that the flow-variables \(f_{ii'}\) encode the distance traveled from the depot to facility \(i'\). We note that there is no feasible way to set the values of the flow-variables of arcs contained in a potential subtour, as the total traveled distance can only increase along any selected path. Thus, these constraints forbid subtours. Moreover, together with constraints (C-DIST), these constraints also model the distance limit of a tour: For any facility i in the solution, the traveled distance must allow to go back from i to the central depot within the distance limit. Note that constraints (C-DIST) need that the distance function fulfills the triangle inequality, otherwise there could be a shorter, non-direct connection from \(i'\) to the central depot, and the constraints would be too restrictive. Constraints (C-DIST) also link the flow-variables and the arc-variables, i.e., flow is only allowed on an arc, if the arc is selected in the solution. Finally, constraints (C-Y) to (C-W) define the variables.

Valid Inequalities Next, we present some valid inequalities for (C), including variable-fixing procedures, which can be applied in a preprocessing step.

The first set of inequalities is concerned with dominance between facilities. The inequalities use optimality-arguments, i.e., they may cut off some feasible solutions, but there is at least one optimal solution fulfilling all of them.

Theorem 1

Let \(i, i' \in F\), and \(C(i'')=\{j\in C:(i'',j)\in {A_{FC}}\}\) for \(i''=i,i'\). Suppose \(C(i') \subseteq C(i)\). Then, the following facility dominance inequalities

$$\begin{aligned} y_i+y_{i'} \le 1 \end{aligned}$$
(C-FD)

are valid for (C) , i.e., there exists at least one optimal solution fulfilling all of them. Moreover,

$$\begin{aligned} x_{ii'} =0\text { and } x_{i'i} =0 \end{aligned}$$
(C-FDA)

is also valid.

Proof

As i covers at least the same customers also covered by \(i'\), there is no improvement in the objective function by including \(i'\) in any solution containing i. Moreover, as the distance function fulfills the triangle inequality, any tour containing both i and \(i'\) will never be shorter than a tour just containing i. Thus, for each solution containing both i and \(i'\), another solution just containing i with the same objective, and same or shorter tour-lengths can be constructed. \(\square\)

We note that inequalities (C-FD) may cut off all optimal solutions using exactly k vehicles, as shown in the following example.

Example 1

Let \(k=3\), \(F=\{i_1, i_2, i_3\}\), \(C=\{j_1,j_2,j_3\}\) and \(A_{FC}=\{(i_1,j_1), (i_1,j_2), (i_2,j_1), (i_3,j_3)\}\). Moreover, suppose that due to the distances and the given distance limit, each tour can only visit one facility. We have two optimal solutions: (i) using two vehicles, with one visiting \(i_1\) and one visiting \(i_3\), (ii) using three vehicles, and each vehicle visits one facility. As facility \(i_1\) dominates facility \(i_2\), facility dominance inequalities (C-FD) only allow solution (i).

The following variable fixing exploits the distance limit L, similar ideas have been used in Bianchessi et al. (2018); Dang et al. (2013); El-Hajj et al. (2016) for the TOP and in Ozbaygin et al. (2016) for the TCMCSP, they can be seen as special-case of the path inequalities for the OP proposed in Fischetti et al. (1998).

Theorem 2

Let \(i \in F\) with \({d_{0i}+d_{i0}}>L\). Then

$$\begin{aligned} y_{i} =0 \end{aligned}$$
(C-FIXF)

is valid for (C).

Let \(i, i' \in F\) with \(d_{0i}+d_{ii'}+d_{i'0}>L\). Then

$$\begin{aligned} x_{ii'} =0 \end{aligned}$$
(C-FIXA)

is valid for (C).

Let \(F(j)=\{i\in F:(i,j)\in {A_{FC}}\}\) for \(j\in C\). If for all \(i \in F(j)\), we have \({d_{0i}+d_{i0}}>L\), then

$$\begin{aligned} z_{j} =0 \end{aligned}$$
(C-FIXC)

is valid for (C).

Proof

Obvious, as the distance limit does not allow the shortest cycle containing \((i,i')\), resp., i. Moreover, if no facility covering j can be reached given the distance limit, j cannot be in any solution.\(\square\)

The following global constraint (C-DISTG) on the length of all tours can also be added, as well as constraints (C-FLOWER) which impose lower bounds on the flow-variables \(f_{ii'}\) (see Bianchessi et al. (2018)).

$$\begin{aligned} \sum _{(i,i') \in {A_{0F}}} d_{ii'} x_{ii'} \le w L , \end{aligned}$$
(C-DISTG)
$$\begin{aligned} f_{ii'} \ge (d_{0i}+d_{ii'}) x_{ii'} . \end{aligned}$$
(C-FLOWER)

While the inequalities in the model already ensure that the solution is connected (and hence, consists of k cycles containing the central depot), the model can be strengthened by adding connectivity cuts (C-CC) (see, e.g., Bianchessi et al. (2018)).

$$\begin{aligned} \sum _{(i'',i')\in \delta ^-(S)} x_{i''i'} \ge y_i \quad \forall S {\subseteq } F, i \in S: |S| \ge 2 \end{aligned}$$
(C-CC)

The inequalities ensure that there is at least one arc going from \((F \setminus S) \cup 0\) to S, if a facility i in S is chosen in a solution. While these inequalities would also ensure that the solution is connected, we cannot replace the flow-conservation constraints (C-FLOW) with them, as the flow-conservation constraints are also needed for modeling the distance limit. As there are exponentially many inequalities (C-CC), we separate them on-the-fly in a branch-and-cut, see Sect. 3.1 for the separation. In the following, we present various liftings of these inequalities.

Theorem 3

Let \(S {\subseteq } F\) with \(|S| \ge 2\) and \({\mathcal {F}} \subseteq S\) with \(d_{0i}+d_{ii'}+d_{i'0}>L\) for all pairs \(i,i' \in {\mathcal {F}}\). Then, the following inequality is valid

$$\begin{aligned} \sum _{(i'',i')\in \delta ^-(S)} x_{i''i'} \ge \sum _{i \in {\mathcal {F}}} y_i. \end{aligned}$$
(C-CC-FIXA)

Proof

If any of the facilities in \({\mathcal {F}}\) is in a solution, at least one of the arcs associated with the variables on the left-hand-side must be taken to ensure connectivity to the central depot. Moreover, each tour in a solution can only contain at most one of the facilities in \({\mathcal {F}}\), due to the condition defining \({\mathcal {F}}\). Thus, each \(y_i\) with value one on the right-hand-side needs its own tour. This implies that at least as many arcs associated with variables on the left-hand-side must be in a solution as there are facilities from \({\mathcal {F}}\) in this solution. \(\square\)

Let LB be a given lower bound for the objective value, e.g., the value of the current incumbent solution during branch-and-cut. Using LB, an optimality-based lifting of (C-CC) may be possible.

Theorem 4

Let \(S {\subseteq } F\) with \(|S| \ge 2\). Let \(Z(S)=|\{j:(i,j)\in {A_{FC}}, i \not \in S\}|\), i.e., the number of customers, which can be served by facilities not in S. Suppose \(Z(S)\le LB\), then the following inequality is valid

$$\begin{aligned} \sum _{(i'',i')\in \delta ^-(S)} x_{i''i'} \ge 1 \end{aligned}$$
(C-CC-OPT)

Proof

As \(Z(S)\le LB\), facilities outside of S cannot serve enough customers to provide an improved solution, thus, at least one tour must visit facilities in S to provide a solution with value better than LB. \(\square\)

Another lifted version of (C-CC) can be obtained using the facility dominance inequalities (C-FD).

Theorem 5

Let \(i, i' \in S \subseteq F\), and \(C(i'')=\{j\in C:(i'',j)\in {A_{FC}}\}\) for \(i''=i,i'\). Suppose \(C(i') \subseteq C(i)\). Then, the following inequality is valid

$$\begin{aligned} \sum _{(i''',i'')\in \delta ^-(S)} x_{i'''i''} \ge y_i+y_{i'} \end{aligned}$$
(C-CC-FD)

Proof

Both \(i,i'\) are in S and there exists an optimal solution, where at most one of them will be visited, following from the same arguments as in the proof of Theorem 1. \(\square\)

Finally, there is also a version of inequalities (C-CC), which use the customer variables on the right-hand-side.

Theorem 6

Let \(j \in C\) and \(F(j)=\{i\in F:(i,j)\in {A_{FC}}\}\). Let \(S \subseteq F\) and suppose \(F(j) \subseteq S\). Then, the following inequality is valid

$$\begin{aligned} \sum _{(i',i)\in \delta ^-(S)} x_{i'i} \ge z_j \end{aligned}$$
(C-CC-CUST)

Proof

If \(z_j\) is one, customer j is covered in the solution. Thus, at least one of the facilities in F(j) must be in the solution. As \(F(j) \subseteq S\), at least one arc must go into S to allow at least one of the facilities in F(j) to be connected to the central depot. \(\square\)

Note that in any LP-relaxation solution \((x^*,y^*,z^*,f^*,w^*)\) of (C), \(z^*_j \ge y^*_i\) for \((i,j) \in A_{FC}\), as the objective function maximizes \(\sum _{j \in C} z_j\). Thus, whenever there is a violated inequality (C-CC) with S fulfilling the conditions of (C-CC-CUST), there is an inequality (C-CC-CUST) with at least the same violation.

2.2 Exponential model

This model follows the formulation of Dang et al. (2013) and El-Hajj et al. (2016) for the TOP and allows for heterogeneous vehicles, i.e., different distance limits \(L_p\) for each \(p \in P\). Binary variables \(z_{j}\), \(j \in C\) have the same meaning as in model (C). Let binary variables \(y^p_{i}=1\), \(i \in F\), \(p \in P\), iff facility i gets visited by the tour of vehicle \(p \in P\). Moreover, let binary variables \(x^p_{ii'}=1\), for \((i,i')\in {A_{0F}}\), iff arc \((i,i')\) is traveled by vehicle \(p \in P\) in the solution. Let binary variable \(w_p = 1, p \in P\) iff vehicle p is used in the solution. The model, denoted by (E), is as follows.

$$\begin{aligned} \max \quad \sum _{j \in C} z_j&\end{aligned}$$
(E-OBJ)
$$\begin{aligned} s.t.\quad \sum _{p \in P}\sum _{(i,j)\in {A_{FC}}} y^p_i&\ge z_j&\quad \forall j \in C \end{aligned}$$
(E-LINK)
$$\begin{aligned} \sum _{p \in P} y^p_i&\le 1&\quad \forall i \in F \end{aligned}$$
(E-ONEF)
$$\begin{aligned} \sum _{(i,i') \in {A_{0F}}} x^p_{ii'}&=y^p_i&\quad \forall i \in F, p \in P \end{aligned}$$
(E-OUT)
$$\begin{aligned} \sum _{(i',i) \in {A_{0F}}} x^p_{i'i}&=y^p_i&\quad \forall i \in F, p \in P \end{aligned}$$
(E-IN)
$$\begin{aligned} \sum _{(0,i') \in {A_{0F}}} x^p_{0i'}&=w_p&\quad \forall p \in P \end{aligned}$$
(E-OUT0)
$$\begin{aligned} \sum _{(i',0) \in {A_{0F}}} x^p_{i'0}&=w_p&\quad \forall p \in P \end{aligned}$$
(E-IN0)
$$\begin{aligned} \sum _{(i'',i')\in \delta ^-(S)} x^p_{i''i'}&\ge y^p_i&\quad \forall S {\subseteq } F, i \in S: |S| \ge 2, p \in P \end{aligned}$$
(E-CC)
$$\begin{aligned} \sum _{(i,i') \in {A_{0F}}} d_{ii'} x^p_{ii'}&\le L_p w_p&\quad \forall p \in P \end{aligned}$$
(E-DIST)
$$\begin{aligned} y^p_i&\in \{0,1\}&\quad \forall i \in F, p \in P \end{aligned}$$
(E-Y)
$$\begin{aligned} z_j&\in \{0,1\}&\quad \forall j \in C \end{aligned}$$
(E-Z)
$$\begin{aligned} x^p_{ii'}&\in \{0,1\}&\quad \forall (i,i') \in {A_{0F}}, p \in P \end{aligned}$$
(E-X)
$$\begin{aligned} w_p&\in \{0,1\}&p \in P \end{aligned}$$
(E-W)

The objective function (E-OBJ) and constraints (E-LINK) are the same as in model (C). Constraints (E-ONEF) make sure that each facility is only visited by one vehicle (in case of distances satisfying the triangle inequality, these constraints are redundant, as using a facility in more than one tour will only result in larger distances). For each vehicle \(p \in P\), constraints (E-OUT) and (E-IN) ensure that if a facility is visited by vehicle p, the vehicle enters and leaves the facility. Moreover, constraints (E-OUT0) and (E-IN0) ensure that each used vehicle enters and leaves the depot. Thus, the previous four sets of constraints make sure that the solution for each used vehicle consists of one or more cycles, and one of these cycles starts and ends at the depot. Connectivity cuts (E-CC) ensure that a facility i visited by a vehicle p must be connected to the central depot: For any set S containing i, at least one arc going from \((F \setminus S) \cup 0\) to S must be taken by vehicle p. Thus, the solution contains exactly one cycle for each vehicle, and this cycle must contain the central depot. As the set of inequalities (E-CC) is of exponential size, we separate them on-the-fly, when they are violated. Separation of the inequalities is discussed in Sect. 3.1. The distance limit is enforced by constraints (E-DIST). The variables are defined by constraints (E-Y) to (E-W).


Valid Inequalities Similar to model (C), several valid inequalities and variable-fixing procedures can be defined. Some of these inequalities are adaptions of the inequalities for (C); however, there are also additional inequalities. While formulation (E) makes no assumption on the distance function, all of the valid inequalities need that the distance function is symmetric and fulfills the triangle inequality.

Facility domination inequalities (C-FD) can be adapted as follows for facilities \(i,i'\in F\) to obtain inequalities (E-FD)

$$\begin{aligned} \sum _{p \in P} (y^p_i +y^p_{i'}) \le 1 . \end{aligned}$$
(E-FD)

Adaption of the variable fixings (C-FDA), (C-FIXF), (C-FIXA) and (C-FIXC) are straightforward. We denote them as (E-FDA), (E-FIXF), (E-FIXA) and (E-FIXC). Moreover, the following conflict-constraints (E-CLQ) can be derived (see also the incompatibility clique cuts for the TOP in Dang et al. (2013) and El-Hajj et al. (2016))

Theorem 7

Let \(p \in P\) and \({\mathcal {F}} \subseteq F\) with \(d_{0i}+d_{ii'}+d_{i'0}>L_p\) for all pairs \(i,i' \in {\mathcal {F}}\). Then, the following inequality is valid.

$$\begin{aligned} \sum _{i \in {\mathcal {F}}} y^p_i \le w_p . \end{aligned}$$
(E-CLQ)

Proof

Due to the assumption that the distances fulfill the triangle inequality and the condition \(d_{0i}+d_{ii'}+d_{i'0}>L_p\) holds for all pairs \(i,i' \in {\mathcal {F}}\), at most one of the facilities \(i \in {\mathcal {F}}\) can be visited by vehicle p. \(\square\)

There is an exponential number of inequalities (E-CLQ). We thus do not use all of them in our solution framework, but only a subset, resp., we separate them on-the-fly, for more details on this see Sect. 3.1. Inequalities (E-CLQ) can be used for lifting (E-CC) as shown in the following.

Theorem 8

Let \(S {\subseteq } F\) with \(|S| \ge 2\). Let \(p \in P\) and \({\mathcal {F}} \subseteq S\) with \(d_{0i}+d_{ii'}+d_{i'0}>L_p\) for all pairs \(i,i' \in {\mathcal {F}}\). Then, the following inequality is valid

$$\begin{aligned} \sum _{(i'',i')\in \delta ^-(S)} x^p_{i''i'} \ge \sum _{i \in {\mathcal {F}}} y^p_i \end{aligned}$$
(E-CC-CLQ)

Proof

Only at most one of the variables \(y^p_i\) on the right-hand-side can be one in a feasible solution, due to the assumption that the distances fulfill the triangle inequality and the condition \(d_{0i}+d_{ii'}+d_{i'0}>L_p\) holds for all pairs \(i,i' \in {\mathcal {F}}\). \(\square\)

Moreover, a lifted version of (E-CC) using facility dominance, similar to (C-CC-FD) can also be defined.

Theorem 9

Let \(p \in P\), \(i, i' \in S \subseteq F\), and \(C(i'')=\{j\in C:(i'',j)\in {A_{FC}}\}\) for \(i''=i,i'\). Suppose \(C(i') \subseteq C(i)\). Then, the following inequality is valid

$$\begin{aligned} \sum _{(i''',i'')\in \delta ^-(S)} x^p_{i'''i''} \ge y^p_i+y^p_{i'} \end{aligned}$$
(E-CC-FD)

Proof

Similar to the proof of Theorem 5.\(\square\)


Symmetry Breaking Inequalities In case \(L_p=L, p \in P\), there can be symmetric solutions. In order to break these symmetries, the following set of inequalities (E-SYM) can be used. Let \(p_1, p_2, \ldots , p_{k}\) denote an arbitrary ordering of the vehicles. Inequalities (E-SYM) impose that a vehicle with lower index needs to visit at least as many facilities in its tour than a vehicle with a higher index (see also Dang et al. (2013) and El-Hajj et al. (2016) for similar inequalities for the TOP)

$$\begin{aligned} \sum _{i \in F} y^{p_\ell }_i \ge \sum _{i \in F} y^{p_{\ell +1}}_i , \quad 1\le \ell \le k-1. \end{aligned}$$
(E-SYM)

3 Algorithmic frameworks

In this section, we discuss further details of our solution frameworks based on models (C), resp., (E). The solutions frameworks were implemented in C++ using CPLEX 12.9 as MIP solver. In Sect. 4, we report results obtained by several combinations of the ingredients presented in this section.

3.1 Separation algorithms

Model (C) has polynomial size, however, the family of valid inequalities (C-CC) is of exponential size. Thus, we do not add all of them in the beginning, but separate them on-the-fly, when they are violated, i.e., we use branch-and-cut. The same holds for inequalities (E-CC) of model (E). In fact, the separation procedure for both families of inequalities is the same, the only difference is that inequalities (C-CC)are defined on variables x, y, and (E-CC) on \(x^p,y^p\), \(p \in P\). Naturally, all the lifted versions/variants of (C-CC) and (E-CC) are also families of inequalities with exponential size. Finally, the family of inequalities (E-FD) of (E) is also of exponential size.

For separation, we used the LazyConstraintCallback, which gets called by CPLEX to check integer solutions and the UserCutCallback, which gets called by CPLEX to check fractional solutions. When using cuts which remove integer solutions, nonlinear reductions and dual presolve reductions of CPLEX must be deactivated. CPLEX does this automatically, when the LazyConstraintCallback is used CPLEX (2020a, 2020b).


Separation of (C-CC) and (E-CC) for fractional solutions In case \((x^*,y^*)\) is fractional, it is well-known that connectivity cuts like (C-CC) can be separated using maximum flow computations on a graph, where the arc capacities are set to \(x^*\) (see, e.g., Bianchessi et al. (2018) and Fischetti et al. (1998)). A connectivity cut with facility i on the right-hand-side is violated if the maximum flow from the central depot 0 to i is less than the value of \(y^*_i\). Let S be the set containing i in such a minimum cut. The set S is giving a violated constraint (C-CC). In order to speed up the separation, we sort the facilities in decreasing order by the values of \(y^*_i\) and separate in this order. Whenever we find a violated inequality, we remove all facilities in S for separation of further inequalities. The maximum flow/minimum cut computation is done using the algorithm of Cherkassky and Goldberg (1995). This algorithm may also return a second minimum cut \(S'\), and in such a case, we also add the violated constraint induced by \(S'\). Moreover, we add a small \(\epsilon =10^{-5}\) to all capacities before separation to favor the separation of minimal cardinality cuts, see, e.g., Koch and Martin (1998) for more details on the last two techniques. Separation of inequalities (E-CC) is done in a similar way.


Separation of (E-CC) for integer solutions As the flow-conservation constraints (C-FLOW) already ensure connectivity of the solution in model (C), inequalities (C-CC) will never be violated when \((x^*,y^*)\) is integer. In model (E), when \((x^{p*},y^{p*})\) is integer, the induced solution will consist of one or more cycles for each vehicle p. For each cycle S not containing the central depot 0, a constraint (E-CC) is added. These cycles can be found by finding the connected components of the induced solution using, e.g., a breadth-first search. Each component not containing the central depot is such a cycle S. For i on the right-hand-side of (C-CC), we take a facility with the largest number of associated customers, ties are broken by taking the one with smallest index.


Separation of Inequalities (E-CLQ) The set \({\mathcal {F}}\) of facilities on the left-hand-side of inequalities (E-CLQ) for a given vehicle p is a clique in the graph with edge set \(E=\{\{i,i'\} \in F \times F:d_{0i}+d_{ii'}+d_{i'0}>L_p\}\). Thus, for exact separation of these inequalities, we would need to solve the NP-hard maximum weighted clique problem (see, e.g.,(Bomze et al. 1999)) with LP-values \((y^{p*})\) as vertex weights. To allow for fast separation, we have implemented a greedy heuristic for separating these inequalities. Let deg(i) be the degree of a vertex in the graph defined by (F, E). The heuristic is described in Algorithm 1.

figure a

In some of our tested settings, we initialize the model with a subset of the inequalities of family (E-CLQ). This subset is constructed by a modified version of the separation heuristic. In this modified version, we use \(deg(i)+deg(i')\) in line 5 and \(deg(i'')\) in line 14 as of course there are no LP-values available at initialization.

Separation of Lifted Inequalities (C-CC-FIXA), (C-CC-OPT), (C-CC-FD), (C-CC-CUST), (E-CC-CLQ), (E-CC-FD)

We do not explicitly separate the lifted inequalities, but instead try to lift inequalities (C-CC) (resp., (E-CC)) when they are found by the separation routine. As discussed in the previous paragraph, inequalities (C-CC) can only be violated by fractional solutions. Thus, when only using them, it would be enough to use the UserCutCallback of CPLEX. However, in their lifted versions, these inequalities may cut off integer solutions, as some of the liftings are optimality-based. Thus, for settings using the liftings of (C-CC), we install an empty LazyConstraintCallback so that CPLEX deactivates its nonlinear reductions and dual presolve reductions.

For lifting a violated inequality (C-CC) given by some \(S\subseteq F\) and \(i \in F\), we proceed with the following steps. We only move to the next step, if none of the previous steps was successful. A lifting is deemed successful, if the (LP-)value of the variables on right-hand-side is larger than \(y^*_i\).

  1. 1.

    Try to lift it to an inequality (C-CC-FIXA). This means, we have to find a set \({\mathcal {F}} \subseteq S\) fulfilling the same conditions as in the separation of inequalities (E-CLQ). We use a modified version of the heuristic presented for separating (E-CLQ) (on the subset S) for finding \({\mathcal {F}}\). In this modified version, we do not construct \(E'\), but only try to grow a single set \({\mathcal {F}}\). This set is initialized with i, and vertex \(i'=\mathop {\mathrm {arg~max}}\limits _{i''\in F \cap S: d_{0i}+d_{ii''}+d_{i''0}} deg(i'') y^*_{i''}\). If no \(i'\) exists (due to \(i''\in F \cap S: d_{0i}+d_{ii''}+d_{i''0}\) being an empty set), we do not proceed with the lifting, and go to the next step.

  2. 2.

    Try to lift it to an inequality (C-CC-OPT). This is done by checking, if \(Z(S)\le LB\) for the detected set S and the current incumbent objective value LB.

  3. 3.

    Try to lift it to an inequality (C-CC-FD). This is done by checking all candidates fulfilling the facility dominance condition. If there is more than one candidate, we take the one with largest LP-value, ties are broken by taking the one with smallest index.

  4. 4.

    Try to lift it to an inequality (C-CC-CUST), if there are any \(j \in C\) with \(F(j)\subseteq S\). If yes, let \(j^*\) be the customer with maximum \(z^*_j\)-value among the customers fulfilling the condition, ties are broken by taking the one with smallest index. If \(z^*_{j^*}>y^*_i\), we do the lifting.

In case of a violated inequality (E-CC), we proceed with the following steps. Again, we only move to the next step, if none of the previous steps was successful.

  1. 1.

    Try to lift it to an inequality (E-CC-CLQ). This is done in a similar way as the lifting of (C-CC) to (C-CC-FIXA).

  2. 2.

    Try to lift it to an inequality (E-CC-FD). This is done in a similar way as the lifting of (C-CC) to (C-CC-FD).

Additional details of the separation routine In order to avoid spending too much time in the separation routines, we limit the number of rounds of the separation-loop to 20 in the root node of the branch-and-cut, and to five in all other nodes. Moreover, when using formulation (C) we initialize our model with \(x_{ii'}+x_{i'i}\le y_i\) for each \(i \in F\) and the five nearest \(i' \in F\) to i, these constraints are a special case of (C-CC) for \(|S|=2\). The corresponding inequalities in case of formulation (E) are \(x^v_{ii'}+x^v_{i'i}\le y^v_i\). These inequalities forbid subtours of size two. Note that they do not involve the central depot 0 as one of the vertices, as a tour going to a single facility is feasible, and would be forbidden by a constraint \(x_{0i}+x_{i0}\le y_i\).

3.2 Branching priorities

Due to the structure of our models, branching on different set of variables will have different impacts on the structure of the solutions obtainable in the nodes of the branch-and-bound/branch-and-cut tree. CPLEX, which is the branch-and-cut solver we use, allows to give branching priorities to variables. In our implementation, we give the highest branching priorities to the facility variables y (resp., \(y^v\)), as for fixed facility variables, the customers covered in the solution can be found by inspection, and fixed facility variables have also implications on the arcs in the solution.

3.3 Primal heuristics

We implemented primal heuristics to construct feasible solutions guided by the LP-solutions for (C) and (E) using the HeuristicCallback of CPLEX. The heuristics consist of a construction phase, which is slightly different for (C) and (E), and an improvement phase, which does not use the LP-solutions, and thus is the same for both underlying models.


Construction heuristic guided by the LP-solution of (C) The heuristic iteratively constructs k cycles in a greedy fashion using a modified distance function \({\bar{d}}_{ii'}= d_{ii'}(1-x^*_{ii'}+\epsilon )\), where \(x^*\) is the LP-solution and \(\epsilon =10^{-5}\). We refer to a (partial) solution \({\mathcal {S}}\) as \(({\mathcal {T}}, {\mathcal {F}},{\mathcal {C}})\), where \({\mathcal {T}}=\{{\mathcal {T}}_1,\ldots , {\mathcal {T}}_k\}\) is the set of tours of the solution, \({\mathcal {F}}\) is the set of facilities visited in the solution, and \({\mathcal {C}}\) is the set of customers covered. Tours will be indicated by the ordered set of facilities (and central depot 0) visited in the tour. For a facility i and a subset \(F'\subset F\) of facilities, let \({\mathcal {C}}(i,F')\) be the set of customers covered by i, but not by any facility in \(F'\). The heuristic is described in Algorithm 2.

figure b

Construction Heuristic Guided by the LP-solution of (E) The construction heuristic in this case is similar to the one used with (C); the only difference is the modified distance function \({\bar{d}}\). As we have LP-values \((x^{p*})\) for each \(p \in P\), we use a modified distance function \({\bar{d}}^p\) for each vehicle p where \({\bar{d}}^p_{ii'}= d_{ii'}(1-x^{p*}_{ii'}+\epsilon )\) for \(\epsilon =10^{-5}\).


Improvement heuristic The improvement heuristic tries to improve a constructed solution \(({\mathcal {T}}, {\mathcal {F}},{\mathcal {C}})\) by inserting facilities covering customers, which are not yet covered by facilities in the solution. The heuristic is detailed in Algorithm 3.

figure c

First, all facilities, which are dominated by another facility in the solution are removed, and the tours are updated. Let candidateF be a list of all facilities not in the current solution, and which cover at least one customer not covered by the current solution. We sort the facilities in candidateF in descending order based on the number of uncovered customers they would cover. This sorted list is denoted as sortedCandidateF. We then iterate through facilities \(i \in sortedCandidateF\) and try insertion of i in each tour \({\mathcal {T}}_1,\ldots , T_k\) at each possible position. Let \(i^*\) be the first facility on the list sortedCandidateF, which can be inserted in a tour. We insert \(i^*\) at the position and tour, which leads to the smallest increase in tour length. Then, the whole procedure is restarted from the beginning, as adding \(i^*\) to the solution may lead to some new facilities in the solution being dominated. The algorithm terminates when there is no more improving facility \(i^*\) found.

4 Computational results

The runs were carried out on an Intel Xeon E5-2670v2 machine with 2.5 GHz and 3GB of memory using a single thread, and all CPLEX parameters (except branching priorities) are left at their default values.

4.1 Comparison with the MIP Approach of Amiri and Salari (2019)

In this section, we compare our approaches with the MIP approach presented in Amiri and Salari (2019) using the instances introduced in the same paper. We obtained these instances on request from the authors and made them available at https://msinnl.github.io/pages/instancescodes.html. These instances are denoted as AMSAL in the following. For the runs in this section, we used a timelimit of 600 seconds.

The instances are based on TSPLIB-instances with 52, 76, 100, 150, 200, 318, 417, 575, 657 and 724 vertices. For each underlying TSPLIB-instance, three different instances were created by taking 50%, 60% and 70% of the vertices as customers, and the remaining vertices except one as facilities and one vertex as central depot. For each facility, the customers it can cover are randomly chosen from the five nearest ones. Here, we discovered an issue with the instances, namely there are often some customers which cannot be covered by any facility, and thus are useless (this can affect up to 90% of the customers of an instance in some cases). For most of the instances, this even led to the situation that all customers which could be covered by some facility in the instance were in the optimal solution (especially after taking account also customers unreachable due to the distance limit as described in Theorem 2). Figure 2 depicts two instances to illustrate this issue (we refer to instances by \(\texttt {|F|-|C|-k-L}\)). The issue is extremely pronounced, as the facility–customer-split is not chosen randomly among the vertices of the underlying TSPLIB-instance, but the first 50% (resp., 40%, 30%) of vertices in the TSPLIB-instance are taken as facilities. From the figures, it can be seen that these vertices are clustered by location.

Fig. 2
figure 2

Two instance graphs of the TCMCRP from the set AMSAL from Amiri and Salari (2019). The blue circle is the central depot, orange boxes are facilities, and green triangles are customers. The gray edges between facilities and customers denote which customers are covered by each facility

Three different values for the number of vehicles (k) are considered, namely \(k=2,3,4\). In Amiri and Salari (2019), the following formula, for \(\alpha \in \{1,0.9,0.8\}\) is given to define the value for L:

$$\begin{aligned} L= \frac{\alpha }{k} \cdot |F| \cdot \frac{\sum _{i \in F \cup 0} \sum _{i' \in F \cup 0} d_{ii'}}{(|F|\cdot (|F|+1))/2 } \end{aligned}$$

However, when verifying this formula, we noticed that it does not give the correct value, which could be obtained with the following formula instead:

$$\begin{aligned} L= \cdot \frac{\alpha }{k} \cdot |F| \cdot \frac{\sum _{i \in F \cup 0} \sum _{i' \in F \cup 0} d_{ii'}}{(|F|\cdot (|F|+1))/4}. \end{aligned}$$

The formula could be verified, as the respective values of L are also written explicitly in the obtained instance files and in the result tables in Amiri and Salari (2019). However, in some of the instance files, there were wrong values, and also some entries in the tables in Amiri and Salari (2019) have wrong entries, in particular in Table 3 containing the so-called large instances containing 318 vertices and more. We discuss this in more detail later. We corrected these errors in the instance files for our computational study and our uploaded instances also consist of the corrected instances. In total, this set contains \(10\cdot 3 \cdot 3 \cdot 3=270\) instances (ten underlying graphs, and the different parameters for |C|, k and L). From the discussion above, one can already see that these instances are maybe not too meaningful for benchmarking. However, as they are the only instances from literature for the problem, we still consider them, but also introduce new instances to evaluate our approaches in Sect. 4.2.


Comparison of framework ingredients First, we are interested in the effect of the different models, valid inequalities, branching priorities and the primal heuristic. We compare the following settings:

  • C: Model (C) without any variable fixing and valid inequalities. The primal heuristic is not used and branching priorities are left at default.

  • C+: C with variable fixing ((C-FDA), (C-FIXA),(C-FIXC),(C-FIXF)) and valid inequalities (C-FD), (C-DISTG), (C-FLOWER) added at initialization. Note that the resulting model is still compact.

  • C++: C+ with the separation of connectivity cuts (C-CC) (and the liftings (C-CC-OPT), (C-CC-FD), (C-CC-CUST), (C-CC-FIXA)) and also branching priorities as described in Sect. 3.2.

  • C++H: C++ with the primal heuristic as described in Sect. 3.3.

  • C++HS: C++H, where some of the polynomial-sized families of inequalities are separated by enumeration instead of adding it at initialization. The separated inequalities are (C-FD), (C-DIST) and (C-FLOWER). Moreover, explicit upper bounds \(f_{ii'}\le max(L-d_{i'0},0)\) are added at initialization to provide an initial upper bound on the \(f_{ii'}\)-variables, as constraints (C-DIST), which upper-bound \(f_{ii'}\) in connection with \(x_{ii'}\), are now separated.

  • E: Model (E) without any variable fixing and valid inequalities and lifting. The primal heuristic is not used and branching priorities are left at default.

  • E+: Setting E with variable fixing ((E-FDA),(E-FIXA),(E-FIXC),(E-FIXF)) and symmetry breaking inequalities (E-SYM) added at initialization. Valid inequalities (E-FD) are separated by enumeration. Moreover, a subset of inequalities (E-CLQ) is also added at initialization as described in Sect. 3.1.

  • E++: Model (E) with variable fixing ((E-FDA),(E-FIXA),(E-FIXC),(E-FIXF)), and symmetry breaking inequalities (E-SYM) added at initialization. Valid inequalities (E-FD) are separated by enumeration. Inequalities (E-CLQ) are separated, and liftings ((E-CC-CLQ) and (E-CC-FD)) of the connectivity cuts (E-CC) are used. Moreover, branching priorities as described in Sect. 3.2 are also used.

  • E++H: E++ with the primal heuristic as described in Sect. 3.3.

  • E++H2: E++H, where inequalities (E-CLQ) are not only separated, but a subset of them is also added at initialization as described in Sect. 3.1.

Figures 3 and 4 give plots of the runtime to optimality for the instances from set AMSAL for these settings.

Fig. 3
figure 3

Runtime to optimality for instance set AMSAL and different settings based on (C)

Fig. 4
figure 4

Runtime to optimality for instance set AMSAL and different settings based on (E)

Looking at Figs. 3 and 4, the approaches based on model (E) seem more promising, as all settings except E manage to solve nearly all of the instances to optimality within the given timelimit (E++H2 and E++H solver more than 85% of the instances within 25 seconds). The situation for settings based on (C) is more diverse, C++HS and C++H also manage to solve nearly all of the instances to optimality, but the runtime is worse. C+ and E have nearly identical performance (solving about 75% of the instances), and both are considerably better than C (solving about 50% of the instances) and C++ (solving about 62.5% of the instances). Thus, it seems the variable fixing and the valid inequalities are quite helpful regardless of the model, while connectivity cuts (C-CC) are actually hurting the performance for the approach based on (C). Potential explanations for this could be that (i) separation takes too long (leading to a slow node-throughput in the branch-and-bound), (ii) the internal CPLEX heuristics are less effective in finding feasible solutions due to the presence of separated inequalities, (iii) the upper bounds of the LP-relaxation after adding the variable fixing and the valid inequalities are already quite good due to the particular structure of the AMSAL instances. Interestingly, settings C++HS and C++H, which also use the connectivity cuts (C-CC), work better than C+. Both these settings are based on C++, but (i) additionally use the primal heuristic (C++H), resp., ii) also separate some inequalities (C++HS) instead of initializing with them, which leads to a smaller initial model and a faster node-throughput in the branch-and-bound.


Detailed results Next, we give a detailed overview of the results for the instance AMSAL and settings C+, C++HS, and E++H2. Settings C++HS, and E++H2 are the best settings based on (C) and (E), respectively. We also include C+ in the tables as the model used in this setting is compact and the setting uses no heuristic. Thus, it can be given directly to a MIP-solver without the need of implementing separation procedures or heuristics. This could make it an attractive choice for a practitioner. We also compare our results with the ones reported by the MIP approach of Amiri and Salari (2019). The results in Amiri and Salari (2019) were obtained using CPLEX 12.3 on an Intel Core i7 2.93 GHz processor with 3.49 GB of RAM. Tables 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 gives the results for these instances. In the tables, we report for each instance the number of vertices (|V|), the number of facilities (|F|), the number of customers (|C|), the number of vehicles (k), the distance limit (L), the number of customers covered by at least one facility which is reachable from the central depot (\(|C_R|\) “reachable customers,” i.e., following Theorem 2). For each setting, we report the runtime (t[s]), value of the best solution found (\(z^*\)), optimality gap (\(g [\%]\), calculated as \(100\cdot ((UB-z^*)/z^*)\), where UB is the value of the upper bound found by the setting), and number of branch-and-bound nodes (\(\#nBBn\); only for our settings, as these are not reported in Amiri and Salari (2019)). A bold entry in \(z^*\) indicates that optimality has be proven (as can also be seen by the optimality gap \(g [\%]\) being zero). An entry of “-” in \(z^*\) and \(g [\%]\) indicates that no feasible solution could be found within the timelimit. A bold entry in \(|C_R|\) indicates, that the optimal solution consists of all reachable customers. An italic entry in the column \(z^*\) for the results of Amiri and Salari (2019) indicates that the reported solution value has some issues, which are discussed in detail in the following paragraph.

Note that for instances with up to 200 vertices, the facility–customer coverage allocation is the same for each underlying graph and facility/customer split. This means that for the same |F|-|C|-p, the optimal value of instances with larger L always gives an upper bound to the optimal value of instances with smaller L. The following instances files contained wrong values for L, and accordingly, there were also wrong results reported for these instances in Amiri and Salari (2019): instance 15-36-2-1090.05 was the same as instance 15-36-2-1368.07, and instance 59-140-4-5005.31 was the same as instance 59-140-4-5630.98. With regard to instance 15-36-2-1368.07, the value in the instance file and in the result table in Amiri and Salari (2019) is also not the same, it is 1386.07 versus 1386.03, with the former being the correct value when verified using the formula mentioned in the beginning of the section. For 15-36-2-1090.05, the value of L in the table is also not the same as the one obtained by calculation with the formula, the correct value is 1094.46. Moreover, for instances 29-70-2-5574.4, 29-70-2-4955.06, 29-70-3-4129.21, 29-70-3-3716.29 the table in Amiri and Salari (2019) reports an optimal value of 36, while we found different (lower) optimal values. The (optimal) results for instances 25-26-4-894.71 and 25-26-4-795.30 also seem wrong compared to our results, and also conflict with the construction of the instances, as the optimal solution value reported for the instance with lower distance limit is larger than the one for the instance with larger distance limit Finally, in Table 3 in Amiri and Salari (2019) (containing the instances with 318 vertices and more), there are many wrong combinations of |F|-|C|-p-L (i.e., not as occurring in the instance files). Moreover, \(L=1061.21\) in the table should read 51061.21, and \(L=7243.53\) is occurring twice in the table, while 6699.96 is occurring in an instance file, but missing in the table. As the values of L are unique in the instances, the result of Table 3 could still be useful for comparison by assuming that the best objective value reported for a given L gives a correct value and only |F|, |C| and k was mixed up in the table. However, when we compared the reported solution values for a given L with our obtained solution values for the instance with this L, it was often larger than the optimal value we found and even larger than the number of true customers of the instance. Thus, it is not clear to us how to directly compare the results, resp., which of the solution values in Table 3 of Amiri and Salari (2019) correspond to which instance. Hence we do not report the detailed results of Amiri and Salari (2019) for instances of AMSAL with 318 vertices or more, but just note, that for only 38 of these 135 instances, the MIP approach of Amiri and Salari (2019) managed to find a feasible solution within the given timelimit of 18000 seconds, and for 32 of them, optimality is reported (and for 112 of the other instances, leading to a total number of 144 out of 270, for which optimality is reported in Amiri and Salari (2019)).

The results show that E++H2 manages to solve 266 out of 270 instances to optimality, C++HS 255 instances and C+ 216 instances. For one of the four instances not solved to optimality by E++H2, instance 200-79-120-4-6669.35, C++HS find the optimal solution. We note that for the larger instances, C+ often does not manage to leave the root node (see, e.g., Table 10) within the given timelimit, as the size of the model likely becomes prohibitive. This means CPLEX may take a very long time for solving the initial LP-relaxation and also for its internal heuristics, which rely on the LP-relaxation.

For smaller instances, the performance is quite similar to E++H2 and C++HS. Moreover, for 239 instances of the 270 instances, the value of the optimal solution is identical to the number of reachable terminals \(|C_R|\). In particular, for all instances with 318 vertices or more, this is happening.

Table 1 Detailed results for instance set AMSAL with 52 vertices
Table 2 Detailed results for instance set AMSAL with 76 vertices
Table 3 Detailed results for instance set AMSAL with 100 vertices
Table 4 Detailed results for instance set AMSAL with 150 vertices
Table 5 Detailed results for instance set AMSAL with 200 vertices
Table 6 Detailed results for instance set AMSAL for instances with 318 vertices
Table 7 Detailed results for instance set AMSAL for instances with 417 vertices
Table 8 Detailed results for instance set AMSAL for instances with 575 vertices
Table 9 Detailed results for instance set AMSAL for instances with 657 vertices
Table 10 Detailed results for instance set AMSAL for instances with 724 vertices

4.2 Evaluating our MIP approaches on new instances

As the previous section has shown, due to the structure of the instances from AMSAL, it is quite easy to find tight upper bounds. Thus, the main difficulty to solve these instances to optimality is to find the optimal solutions and our approaches seem quite effective also for this purpose. In this section, we introduce a new set of instances, denoted as NEW. The instances are designed in such a way, that for all customers, there are at least two facilities covering it in the underlying graph. Note that due to the nature of the problem, it can still happen, that after applying the distance limit-based variable fixing/preprocessing as described in Theorem 2 that some customers cannot be reached for some values of L. For the runs in this section, we used a timelimit of 1800 seconds.

The instances are made available at https://msinnl.github.io/pages/instancescodes.html and are constructed as follows: |F| facilities and |C| customers are picked by taking random integers within [0, 1000] as the location of them in the Euclidean plane. The central depot is placed at point (500, 500). This is done for the following pairs of (|F|, |C|): (75, 225), (100, 300), (125, 375), for each pair three graphs are constructed. For the facility-customer coverage, we randomly pick between two and five of the nearest facilities for each customer. With this we ensure, that each customer can be covered (and also no trivial one-to-one relationship between some facility and customer can occur). Based on these underlying graphs, instances are created by choosing \(k \in \{2,3,4\}\), and setting L using the following scheme for \(\alpha =\{0.1,0.15,0.2,0.25,0.3\}\): Let \(L(\alpha )\) be the sum of the distances of the \(\lceil \alpha |F|\rceil\) nearest facilities to the central depot. The value of L is set as \(L(\alpha )/l\). In total, this set contains \(3\cdot 3 \cdot 3 \cdot 5=135\) instances (three underlying graphs for each (|F|, |C|) and the different parameters for k and L).

Again, we first analyze the effect of our different settings for the algorithms. Figures 5 and 6 give plots of the runtime to optimality and Figs. 7 and 8 gives plots of the optimality gap for the instances from set NEW and our settings.

Fig. 5
figure 5

Runtime to optimality for instance set NEW and different settings based on (C). For better readability, the y-axis only goes up to 60% of the instances

Fig. 6
figure 6

Runtime to optimality for instance set NEW and different settings based on (E). For better readability, the y-axis only goes up to 60% of the instances

Fig. 7
figure 7

Optimality gap for instance set NEW and different settings based on (C). For better readability, the x-axis only goes up to 20% gap

Fig. 8
figure 8

Optimality gap for instance set NEW and different settings based on (E). For better readability, the x-axis only goes up to 20% gap

From Figures 5 and 6, we can see that these instances seem much more difficult to solve to optimality than the ones from set AMSAL. The best performing settings, namely C++HS, C++H, E++H2 and E++H, only manage to solve about 45% of the instances to optimality within the timelimit. In general, all settings based on (E), except E, have a quite similar performance. For settings based on (C), the basic setting C has the worst performance. Thus, as already seen in the results for set AMSAL, the variable fixings seem quite effective. However, differently to the results for set AMSAL, C++ now performs better than C+. Looking also at the optimality gaps given in Figs. 7 and 8, the settings based on (C) seem to performance a little better than the settings based on (E). Moreover, the performance difference of the settings becomes more pronounced.

Tables 11, 12, 13 give detailed results for the instances of set NEW and settings C+, C++HS and E++H2. In the tables, we report for each instance the number of vehicles (k), the distance limit (L) and the number of customers covered by at least one facility which is reachable from the central depot within the given distance limit (\(|C_R|\) “reachable customers,” i.e., following Theorem 2). For each setting, we report the runtime (t[s]), value of the best solution found (\(z^*\)), optimality gap (\(g [\%]\)), and number of branch-and-bound nodes (\(\#nBBn\)). A bold entry in \(z^*\) indicates that optimality has be proven (as can also be seen by the optimality gap \(g [\%]\) being zero). A bold entry in \(|C_R|\) indicates, that the optimal solution consists of all reachable customers.

In the tables, we can see that the instances with smaller distance limit seem to be easier, as more of them can be solved to optimality compared to instances with larger distance limit. This is not too surprising, as with smaller distance limit, the feasible region of the problem becomes smaller, and a smaller number of facilities are reachable within the distance limit. In general, the instances seem to become harder when the number of nodes is larger. The number of vehicles does not seem to influence the performance much. There are 18 out of 135 instances, where the optimal solution value is equal to \(|C_R|\). Setting C+ manages to solve 51 instances to optimality, C++HS manages to solve 61 instances to optimality and E++H2 manages to solve 59 instances to optimality.

Table 11 Detailed results for instance set NEW for instances with 301 vertices
Table 12 Detailed results for instance set NEW for instances with 401 vertices
Table 13 Detailed results for instance set NEW for instances with 501 vertices

5 Conclusion

In this paper, we studied the recently introduced time-constrained maximal covering routing problem. The problem is a generalization of well-known problems such as the (team) orienteering problem and maximal covering location. In the problem, we are given a central depot, facilities and customers. Each customer can be served by a subset of the facilities. Moreover, we are given distances between the facilities (and central depot), a distance limit L and a number of vehicles k. A feasible solution consists of k Hamiltonian cycles on subsets of the facilities and the central depot. All cycles must contain the central depot and respect the distance limit L. The goal is to maximize the number of customers covered by the facilities in the solution. The problem was introduced in Amiri and Salari (2019), where an exact mixed-integer programming (MIP) approach and several metaheuristics were proposed.

We introduced two new MIP formulations and presented exact solution frameworks based on these MIP formulations. We evaluated our solution approaches on the instances from literature for the problem (from Amiri and Salari (2019)). The computational study revealed that our algorithms were able to find the provably optimal solution for 267 out of 270 instances, including 123 instances, for which the optimal solution was not known before. Moreover, for most of the instances, our algorithms only took a few seconds, being up to five magnitude faster than the exact MIP approach presented in Amiri and Salari (2019). The computational study also showed that the instances from Amiri and Salari (2019) have some issues, which potentially decrease their usefulness as benchmark instances. In particular, there are often many customers, which are not associated with any facility in the instance, and thus can never be in any feasible solution. In many instances, the value of the optimal solution is then simply similar to the number of the remaining customers. We thus also introduced a new set of more challenging instances.

There are several avenues for further work: Naturally, similar to other routing problems, additional side-constraints based on real-life considerations can be added, such as multiple-depots, time-windows, capacities, or uncertainty. Moreover, weighted customers and assignment-costs for customers could also be interesting extensions. To deal with more difficult instances, such as the ones introduced in this paper, the design and (re-)evaluation of metaheuristics could be a promising topic. We note that Amiri and Salari (2019) already proposed and tested some metaheuristics for the problem, however, as discussed, the instances used in their work to evaluate their approaches had some issues. Investigating an exact approach based on a formulation with exponentially many variables (i.e., column generation/branch-and-price) could also be a fruitful topic, as such approaches often work quite well for routing problems of similar type.