1 Introduction

One of the most common assumptions in mathematical programming is that the exact values of the input data are fixed and known in advance. However, in a large variety of real applications this is not the case: some parameters used in the model are only estimates of the real parameters and/or are subject to uncertainty. A solution that is optimal for the nominal values of the parameters can then be suboptimal (or even infeasible) for the actual parameters. From a practical viewpoint, it is thus extremely important to take these aspects into account when real-world applications are considered.

Two main approaches have been proposed in the literature to deal with uncertainty, namely robust optimization and stochastic optimization.

Robust optimization [1, 2, 5, 15] assumes full knowledge of the domain of the uncertain parameters, i.e., of all “reasonable” realizations of the data. Classic robust optimization computes a solution that is optimal under the worst-case realization of uncertainty, imposing hard constraints that reduce the feasible set and forbid solutions that become infeasible under certain realizations of the data. The computational tractability of the resulting model depends on the complexity of the underlying nominal problem and on the properties of the uncertainty domain. Robust models typically produce solutions that are too conservative in terms of cost compared to non-robust solutions. For this reason, less conservative robust models have been proposed in the literature, either fixing in advance the maximum worsening of the solution value [8, 14], or relaxing the level of protection against uncertainty and allowing for a possible recovery of the solution [10]. As shown in [4], the solution produced by a robust optimization model can be arbitrarily bad in terms of cost compared to the optimal expected cost of the associated stochastic problem.

Stochastic optimization [6, 13] is mainly used when some information about the probability distribution of the actual data is available. According to this paradigm, the solution is obtained by solving an auxiliary optimization problem that includes additional variables modeling uncertainty, and penalizes feasible solutions that are most likely to become infeasible. Computing the expected value of a stochastic program exactly is computationally hard in general, and most solution approaches are based on heuristic algorithms. In some applications, chance constraints are used to require the solution to be feasible with a given probability [12]. Unfortunately, in real applications the number of scenarios to be considered is very large, producing huge models that are extremely hard to solve in practice. In addition, feasibility of the solution in the actual scenario cannot be guaranteed, and the approach may have to be combined with the definition of some recourse actions.

Indeed, in many applications there are a number of “wait-and-see” decisions that have to be taken after the actual realization of uncertainty. In these models, which are usually denoted as two-stage, some strategic decisions have to be taken in advance, while some operational decisions are used to recover the quality or the feasibility of the solution. The K-adaptability approach, first introduced in [3], is a variant of two-stage optimization in which the user pre-selects a finite set of (say) K solutions and chooses the best of these solutions at a second stage, when the exact nature of uncertainty materializes. This is extremely useful in situations where uncertainty is modeled by a finite number of scenarios, the actual scenario materializes only a short time before decisions have to be taken, and the hardness of the problem prevents computing a “reasonable” solution from scratch. This paradigm can be applied to any optimization problem. In the remainder of the paper, we call the inner optimization problem the underlying problem.

As an example, consider a routing problem in which a fleet of vehicles has to be used to serve a given set of customers, whose demands (or time windows) are known only a few minutes before service starts. In this case, using an optimal solution computed according to the nominal values of the parameters may be impractical, as this solution may be very bad in terms of cost or even infeasible. At the same time, computing a high-quality solution in a very short time may be extremely hard from a computational viewpoint. On the other hand, choosing a solution among K precomputed solutions can be done in a reasonably small time and improves the objective value compared to the classical stochastic approach, in which only one solution is precomputed. Another relevant application of the K-adaptability approach is disaster management, where the decision maker has to prepare escape plans for an evacuation that have to be trained in advance [9]; clearly, only training a small number of such plans is realistic. In other cases, the alternative solutions may have to be prepared physically, e.g., by establishing links in a network, so that computing and implementing a solution after the scenario has materialized is again not possible, and only a small set of candidate solutions can be considered.

Compared with classical two-stage optimization, another advantage of K-adaptability is that it allows the user to control the flexibility that is used to cope with the different scenarios of the problem. This may be crucial in many applications, e.g., school bus routing and home care services for elderly persons. In these contexts, it is preferable to limit the number of different agents that provide service to each customer on different days. This is an immediate byproduct of the K-adaptability approach, as each day the actual solution is taken from the set of the K precomputed solutions. Changing the value of parameter K allows the decision maker to implement different policies. Small values of K correspond to a conservative policy in which the actual solution has a small variability. On the other hand, the larger the value of K, the larger the variability of the solution and, consequently, the better the solution fits the actual scenario.

Finally, as recently observed in [16], K-adaptability appears to be consistent with human decision-making, which tends to forecast a limited number of alternative solutions and to apply one of them, instead of computing an optimal solution for each realization of the data. Indeed, changing decisions too frequently may be confusing to the final user, who may reject the proposed solution.

In this paper we consider an underlying optimization problem of the form

$$\begin{aligned} \begin{array}{ll} \min &{} \xi ^T x \\ \text{ s.t. } &{} x \in X \subseteq {\mathbb {R}}^n, \end{array} \end{aligned}$$
(1)

in which uncertainty depends on the realization of a random variable \(\omega \in \Omega \) and may affect both the objective function \(\xi (\omega )\) and the feasible region \(X(\omega )\). The objective is to determine at most K solutions \(x^1, x^2, \ldots , x^K\) to the underlying problem (1) such that the expected value of the best of these solutions is a minimum. In the remainder of the paper, a set of K solutions to the underlying problem will be denoted by \({\mathcal {X}}_K\) and will be called a set of solutions. Thus, the resulting problem is

$$\begin{aligned} \begin{array}{ll} \min &{} E \bigg (\min \{\xi (\omega )^T x: x\in {\mathcal {X}}_K \cap X(\omega ) \} \bigg ) \\ \text{ s.t. } &{} |{\mathcal {X}}_K|\le K. \end{array} \end{aligned}$$
(2)

We assume that uncertainty is described by discrete variables, i.e., it can be modeled by defining a number (say) \(\ell \) of scenarios. Each scenario j (\(j=1,\ldots , \ell \)) is associated with a probability \(p_j > 0\), an objective function \(\xi ^j \in {\mathbb {R}}^n\), and a feasible set \(X^j = \{x \in {\mathbb {R}}^n: A^j x \le b^j\}\), which we assume to be bounded. Possible restrictions on the integrality of the variables are included in the definition of the feasible set \(X^j\). Thus, the subproblem associated with scenario j can be rewritten as \(\min \{{\xi ^j}^T x : x \in X^j\}\), and problem (2) can be reformulated as

$$\begin{aligned} \begin{array}{ll} \min &{} \displaystyle {\sum _{j = 1}^{\ell } p_j} \min \{ {\xi ^j}^T x: x\in {\mathcal {X}}_K\cap X^j \} \\ \text{ s.t. } &{} |{\mathcal {X}}_K|\le K. \end{array} \end{aligned}$$

To simplify the notation, we incorporate the probabilities into the objective function vectors, i.e., we consider the following problem

$$\begin{aligned} \begin{array}{lll} \min &{} \displaystyle {\sum _{j = 1}^{\ell }} \min \{ {\xi ^j}^T x: x\in {\mathcal {X}}_K\cap X^j \} &{} \\ \text{ s.t. } &{} |{\mathcal {X}}_K|\le K. &{} \end{array} \end{aligned}$$
(3)

We call the resulting problem the Discrete Constrained Min-E-Min problem (DCmEm). We say that an instance of (DCmEm) is feasible when there exists a set of solutions \({\mathcal {X}}_K\) such that, for each scenario j, \({\mathcal {X}}_K\cap X^j \ne \emptyset \).
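
As a minimal illustration (a toy example of our own, not part of the benchmark considered later), take \(n=1\), \(\ell =2\), \(X^1 = \{x \in {\mathbb {R}}: 0 \le x \le 1\}\), \(X^2 = \{x \in {\mathbb {R}}: 2 \le x \le 3\}\), and \(\xi ^1 = \xi ^2 = 1\). Since \(X^1 \cap X^2 = \emptyset \), no single solution is feasible for both scenarios, so the instance is infeasible for \(K=1\); for \(K=2\), the set \({\mathcal {X}}_2 = \{0, 2\}\) is feasible and optimal with value

$$\begin{aligned} \min \{ x: x\in {\mathcal {X}}_2\cap X^1 \} + \min \{ x: x\in {\mathcal {X}}_2\cap X^2 \} = 0 + 2 = 2. \end{aligned}$$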

Recently, a special case of problem (3) in which uncertainty affects the objective function only has been addressed in [7]. For this special case, which will be denoted as (mEm) in the following, the authors studied the computational complexity, gave approximation results, and introduced mathematical formulations and algorithms. A similar context was considered in [16], which analyzed the relation between classical two-stage optimization and the min-max-min version of K-adaptability. In addition, a branch-and-bound approach for the optimal solution of the problem was introduced, establishing conditions for its asymptotic and finite-time convergence.

Contributions The first main contribution of this paper is an investigation of the complexity of Problem (DCmEm). We show that deciding whether an instance of (DCmEm) is feasible is \(\mathcal {NP}\)-hard, even if \(K\ge 2\) is fixed and the underlying problem can be solved in polynomial time. In case K is part of the input, we prove that the same result holds even if all the coefficients of the constraints defining the feasible region are binary and \(|X^j|\) is polynomial in the input size for every scenario j. The second major contribution is the development of mathematical models and sound exact approaches (including a branch-and-price algorithm) for (DCmEm). The third contribution is a computational evaluation of the performance of the proposed algorithms on a large benchmark of randomly generated instances, obtained by considering underlying problems with different levels of complexity.

Outline The rest of this paper is organized as follows. In the next section, we investigate the complexity of checking feasibility of (DCmEm) in terms of \(\mathcal {NP}\)-hardness and fixed-parameter tractability. In Sect. 3, we describe a compact formulation and two different Set Partitioning models of (DCmEm). All other exact approaches, including the branch-and-price algorithm, are presented in Sect. 4. In Sect. 5 we show how using approximation algorithms for solving the underlying problem affects the solution quality of (DCmEm). The different exact algorithms for solving (DCmEm) are then analyzed in an extensive computational study, which is presented in Sect. 6. Section 7 concludes the paper.

2 Complexity results

In [7], the authors showed that (mEm) is \(\mathcal {NP}\)-hard. In addition, they proved that this problem is not in APX if K is part of the input, and that the same holds for \(K\ge 3\) if K is not part of the input. Moreover, they showed that the problem is W[2]-hard for parameter K and W[1]-hard for parameter \(n-K\). Trivially, all these results apply to (DCmEm) as well.

In the next subsection, we consider (DCmEm) in the case where uncertainty affects the problem constraints as well, and discuss the complexity of deciding whether an instance of (DCmEm) is feasible (i.e., whether there exists a set of at most K solutions to the underlying problem with non-empty intersection with the feasible region of each scenario). Finally, Sect. 2.2 introduces a subproblem that will be used in our algorithms and analyzes its complexity.

2.1 Uncertain coefficients in the constraints

Hardness results trivially follow when the underlying problem is hard. Hence, in this section we consider the case where the underlying problem is solvable in polynomial time. The proofs in this section may use binary variables to define the underlying problems, but all these problems remain solvable in polynomial time for each scenario.

Theorem 1

Deciding whether an instance of (DCmEm) is feasible is W[2]-hard parameterized by K if K is part of the input, even if all the coefficients are binary and \(|X^j|\) is a polynomial in the input size for every scenario j.

Proof

We prove the statement by reduction from the W[2]-hard Set Cover problem [11]: given a positive integer \({\overline{K}}\), a set \(U = \{u_1, u_2, \ldots \}\) of items, and a collection \(S = \{s_1, s_2, \ldots \}\) of subsets of U, the Set Cover problem asks whether there exists a sub-collection of S of cardinality at most \({\overline{K}}\) such that every item belongs to at least one of the chosen subsets. Set Cover is known to be W[2]-hard parameterized by \({\overline{K}}\).

We will show that, given an instance of Set Cover, we can define an instance of (DCmEm) that is feasible if and only if the Set Cover instance has a positive answer. For the reduction, we define a (DCmEm) instance with \(n := |S|\) variables and \(\ell := |U|\) scenarios. For each scenario j, we define a polynomial-time solvable problem where the feasible set \(X^j\) is the set of standard basis vectors of \({\mathbb {R}}^n\) that satisfy a single constraint of the form \({a^j}^T x \ge b\), where \(b := 1\) and the coefficient of each variable \(x_h\) (\(h=1,\ldots ,n\)) is defined as:

$$\begin{aligned} a^{j}_{h} := {\left\{ \begin{array}{ll} \begin{array}{rl} 1 &{} \quad \text {if } u_j \in s_h,\\ 0 &{} \quad \text {otherwise.} \end{array} \end{array}\right. } \end{aligned}$$

Finally, we set \(K := {\overline{K}}\), and therefore we have a parameterized reduction.

A set of solutions \({\mathcal {X}}_K\) of (DCmEm) induces a Set Cover solution as follows: if the h-th basis vector belongs to \({\mathcal {X}}_K\), then subset \(s_h\) belongs to the Set Cover solution. With this definition, the j-th item is included in at least one selected subset if and only if there exists a selected subset \(s_h\) such that \(u_j \in s_h\); this means that the associated coefficient \(a^{j}_{h}\) is 1, i.e., the h-th solution is feasible for scenario j. Hence we have a Set Cover if and only if, for every scenario j, at least one solution in \({\mathcal {X}}_K\) is feasible. \(\square \)
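
To make the construction concrete, the following sketch (an illustration of our own; class and method names are hypothetical) builds the scenario coefficients \(a^j_h\) of the reduction from a Set Cover instance: one scenario per item, one variable per subset, and \(a^j_h = 1\) exactly when \(u_j \in s_h\).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetCoverReduction {

    // Builds the coefficient matrix a[j][h] of the (DCmEm) feasibility instance:
    // one scenario per item u_j, one variable per subset s_h, and a[j][h] = 1
    // iff u_j belongs to s_h. Scenario j then accepts exactly the standard basis
    // vectors e_h with a[j][h] = 1, i.e., those satisfying a^j x >= 1.
    static int[][] buildCoefficients(int numItems, List<Set<Integer>> subsets) {
        int[][] a = new int[numItems][subsets.size()];
        for (int j = 0; j < numItems; j++) {
            for (int h = 0; h < subsets.size(); h++) {
                a[j][h] = subsets.get(h).contains(j) ? 1 : 0;
            }
        }
        return a;
    }

    public static void main(String[] args) {
        // Items {0,1,2} and subsets s_0 = {0,1}, s_1 = {1,2}: picking both subsets
        // (K = 2) covers every item, hence the (DCmEm) instance is feasible.
        List<Set<Integer>> subsets = Arrays.asList(
                new HashSet<>(Arrays.asList(0, 1)),
                new HashSet<>(Arrays.asList(1, 2)));
        for (int[] row : buildCoefficients(3, subsets)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```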

We now discuss a similar result for the case of fixed K.

Theorem 2

Deciding whether an instance of (DCmEm) is feasible is \(\mathcal {NP}\)-hard even for fixed \(K=2\), i.e., when K is not part of the input.

Proof

We show \(\mathcal {NP}\)-hardness by reduction from the decision variant of the Maximum Cut Problem. Let \(G=(V,E)\) be an undirected graph, and let \(W \subseteq V\) be a subset of the vertex set. The induced cut is the set of edges that have exactly one endpoint in W, i.e., \(\delta (W) = \{(v_i, v_j) \in E: v_i \in W, v_j \notin W\}\). Given a positive integer q, the Maximum Cut Problem asks whether there exists a subset W such that \(|\delta (W)|\ge q\).

Given a graph \( G = (V, E)\), we define a (DCmEm) instance with \(n := 2|V|\) variables and \(\ell := 2 |V|\) scenarios. In all scenarios, we define a polynomial-time solvable problem where we impose that

$$\begin{aligned} x_h \in \{0,1\} \quad h = 1, \ldots , |V|, \end{aligned}$$

i.e., the first |V| variables are constrained to be binary and

$$\begin{aligned} x_h \ge \epsilon , \quad h = |V|+1, \ldots , 2|V|, \end{aligned}$$

where \(\epsilon = \frac{1}{|V|+1}\). In addition, scenarios are partitioned into two classes having the same cardinality |V|. Each scenario j in the first class (\(j=1,\ldots ,|V|\)) is characterized by two additional constraints

$$\begin{aligned} \sum _{h = 1}^{|V|} a^j_h x_h = x_{|V| + j}, \quad \quad \text{ and } \quad \quad \sum _{h = 1}^{|V|} x_{|V|+h} \ge q, \end{aligned}$$

where the coefficients \(a^j\) are defined as follows

$$\begin{aligned} a^j_h:={\left\{ \begin{array}{ll} \begin{array}{rl} \delta (v_j) &{} \quad \text { if } h = j,\\ -1 &{} \quad \text { if }(v_h, v_j) \in E,\\ 0 &{} \quad \text { otherwise,} \end{array}\end{array}\right. } \end{aligned}$$

and \(\delta (v_j) = |\{(v_i, v_j) \in E\}|\) denotes the degree of a given vertex \(v_j \in V\).

Finally, each scenario j in the second class is characterized by a single constraint

$$\begin{aligned} x_{|V| + j} = \epsilon . \end{aligned}$$

Assume that (DCmEm) has a feasible solution \({\mathcal {X}}_2 = \{x^1, x^2\}\), i.e., for each scenario either \(x^1\) or \(x^2\) is feasible (or both).

We first observe that, for each \(j = 1, \ldots , |V|\), the j-th scenario in the first class and the j-th scenario in the second class cannot be covered by the same solution. Indeed, while the first constraint of the former scenario sets variable \(x_{|V| + j}\) to an integer value, in the latter scenario the same variable must take a fractional value. Thus, for no scenario can both \(x^1\) and \(x^2\) be feasible, and each scenario is covered by exactly one solution. Let \(S \subseteq \{1, \ldots , |V|\}\) be the set of indices of the scenarios in the first class for which \(x^1\) is feasible, and let \(j \in S\) be one of these scenarios. Then, we must have \(x^1_j = 1\), since otherwise \(\sum _{h = 1}^{|V|} a^j_h x^1_h \le 0\), which is not compatible with the domain of variable \(x_{|V| + j}\).

By the previous observation, \(x^2\) is not feasible for scenario j. Thus, in case \(x^2_j = 1\), we can define a new solution having \(x^2_j = 0\), since this operation does not affect the feasibility of \(x^2\) for the scenarios in which this solution is feasible. Hence, in the first solution, variable \(x_{|V| + j}\) takes a value equal to the number of neighbors of vertex \(v_j\) whose associated scenario is not in S. Finally, observe that \(x^1\) is feasible in all scenarios in the second class that are not in S, i.e., \(x^1_{|V| + h} = \epsilon \) for all \(h \notin S\). Given the second constraint of each scenario in S, solution \(x^1\) must satisfy

$$\begin{aligned} \sum _{h \in S} x^1_{|V| + h} \ge q \end{aligned}$$
(4)

as the total contribution of the remaining \(x^1_{|V| + h}\) variables for \(h \notin S\) is smaller than 1.

Fig. 1 Example of the construction of the proof of Theorem 2

Condition (4) implies that the cut induced by vertex set \(W = \{v_j: j \in S\}\) has size at least equal to q.

Conversely, assume now that we have a subset W of vertices whose associated cut has size q (or more), and define the following set of solutions \({\mathcal {X}}_2 = \{x^1, x^2\}\): for each vertex \(v_j \in W\) we set

$$\begin{aligned} x^1_j := 1, \quad x^1_{|V| + j} := \sum _{h = 1}^{|V|} a^j_h x^1_h, \quad x^2_j := 0, \quad x^2_{|V| + j} := \epsilon \end{aligned}$$

while for each vertex \(v_j \notin W\) we have

$$\begin{aligned} x^1_j := 0, \quad x^1_{|V| + j} := \epsilon , \quad x^2_j := 1, \quad x^2_{|V| + j} := \sum _{h = 1}^{|V|} a^j_h x^2_h, \end{aligned}$$

where the \(a^j_h\) coefficients are the same as in the first part of the proof. First of all observe that each variable \(x_{|V| + j}\) associated with a vertex \(v_j \in W\) has, in the first solution, a value equal to the number of edges that connect \(v_j\) to vertices outside W.

We now prove that every scenario admits a feasible solution in \({\mathcal {X}}_2\). Let j (say) be a scenario in the first class such that \(v_j \in W\). By construction, solution \(x^1\) satisfies the first constraint of the scenario. In addition, for each vertex \(v_h \in W\), variable \(x^1_{|V| + h}\) gives the number of edges that connect vertex \(v_h\) to vertices on the other side of the partition; finally, for vertices \(v_h \notin W\), the associated variable is set to a positive value. Thus, we have

$$\begin{aligned} \sum _{h = |V| + 1}^{2|V|} x^1_h = \sum _{h = 1}^{|V|} x^1_{|V| + h} \ge \sum _{\begin{array}{c} h = 1\\ v_h \in W \end{array}}^{|V|} x^1_{|V| + h} \ge q \end{aligned}$$

where the last inequality derives from the value of the cut induced by W. The case with \(v_j \notin W\) is analogous, using solution \(x^2\).

Finally, it is easy to see that solution \(x^2\) (resp. \(x^1\)) is feasible for each scenario j of the second set such that \(v_j \notin W\) (resp. \(v_j \in W\)).

Figure 1 gives an example of the construction, and presents a Maximum Cut instance with positive answer and the associated scenarios (the domains of the variables are omitted). \(\square \)

Theorem 3

Deciding whether an instance of (DCmEm) is feasible is \(\mathcal {NP}\)-hard even for fixed \(K \ge 3\), i.e., when K is not part of the input.

Proof

For \(K\ge 3\), we show \(\mathcal {NP}\)-hardness by reduction from the Vertex Coloring Problem. A \({\overline{K}}\)-vertex coloring of a graph \(G=(V,E)\) is an assignment of one of \({\overline{K}}\) different colors to each vertex in V such that no edge in E connects two vertices with the same color. Vertices that receive the same color are called a color class. It is easy to see that each color class is a stable set and that the problem remains \(\mathcal {NP}\)-hard even when a vertex is allowed to belong to more than one color class.

Given a graph \(G=(V,E)\) and an integer \({\overline{K}}\), we define a (DCmEm) instance with \(n := |V|\) binary variables and \(\ell := |V|\) scenarios. For each scenario j, we define a polynomial-time solvable problem with a single constraint of the form \({a^j}^T x \le b\), where the coefficient of each variable \(x_h\) is given by

$$\begin{aligned} a^{j}_{h} := {\left\{ \begin{array}{ll} \begin{array}{rl} -1 &{} \quad \text {if } h = j,\\ 1 &{} \quad \text {if }(v_j, v_h) \in E,\\ 0 &{} \quad \text {otherwise.} \end{array} \end{array}\right. } \end{aligned}$$

Finally, we set \(b := -1\) and define \(K := {\overline{K}}\).

A set of solutions \({\mathcal {X}}_K\) induces the color classes for Vertex Coloring as follows: for each solution \(x^i \in {\mathcal {X}}_K\), all vertices corresponding to variables that take value 1 in \(x^i\) belong to the i-th color class. This assignment satisfies the coloring constraint, as the scenario constraints forbid two adjacent vertices from being taken in the same solution. Finally, feasibility of each scenario for at least one solution implies that every vertex belongs to a color class.

Conversely, given a Vertex Coloring solution, we can define a set of solutions \({\mathcal {X}}_K\) as follows: each solution \(x^i\) sets to 1 all variables associated with vertices belonging to the i-th color class. Remember that each color class corresponds to a stable set. Thus, a feasible solution for the j-th scenario is induced by the color class including vertex \(v_j\), as by definition this color class cannot include any neighbor of \(v_j\). Since all vertices receive a color, all scenarios are covered by at least one solution.

\(\square \)

2.2 The completion problem

The Completion problem is a variant of (DCmEm) arising when \({\overline{K}}\) (say) solutions have been fixed and are part of the input. The objective is then to determine the remaining \(K_f := K - {\overline{K}}\) free solutions. We denote by \({\mathcal {X}}_{{\overline{K}}}\) the set of fixed solutions and by \({\mathcal {X}}_{K_f}\) the set of free solutions. The completion problem can be formulated as follows

$$\begin{aligned} \begin{array}{ll} \min &{} \displaystyle {\sum _{j = 1}^{\ell }\min \{{\xi ^j}^T x: x\in ({\mathcal {X}}_{{\overline{K}}}\cup {\mathcal {X}}_{K_f})\cap X^j\}} \\ \text{ s.t. } &{} |{\mathcal {X}}_{K_f}|\le K_f. \end{array} \end{aligned}$$

The completion problem appears as a subproblem in a solution approach based on column generation (see Sect. 4.2.1). We now discuss its complexity.

Theorem 4

The completion problem is \(\mathcal {NP}\)-hard and W[2]-hard if \(2 \le K_f < \ell \).

Proof

It is easy to see that an instance of (DCmEm) with parameter K can be modeled as an instance of the completion problem with parameters \(K_f = K\) and \({\overline{K}} = 1\), using a dummy fixed solution that is infeasible (or extremely costly) for all scenarios. The result follows from the complexity of (DCmEm). \(\square \)

More interesting is what happens for \(K_f = 1\): in this case the completion problem turns out to be \(\mathcal {NP}\)-hard as soon as at least one solution is fixed, though (DCmEm) with \(K=1\) is not.

Theorem 5

The completion problem is \(\mathcal {NP}\)-hard even if \(K_f = {\overline{K}} = 1\), the uncertainty affects the objective function only and \(X = \{0,1\}^n\).

Proof

We prove the statement by reduction from the Maximum Cut Problem.

Given a graph \(G = (V, E)\), we define an instance of the completion problem with \({\overline{K}} = K_f = 1\) as follows. There are \(n := |V|\) binary variables and \(\ell := 2|E|\) scenarios, all with no explicit constraints, i.e., \(X^j := \{0,1\}^n\) (\(j=1, \ldots , \ell \)). For each edge \(e = (u, v) \in E\), there are two scenarios numbered as e and \(|E|+e\). For the former, the objective function coefficients for the variables are

$$\begin{aligned} \xi ^{e}_{h} := {\left\{ \begin{array}{ll} \begin{array}{rl} -1 &{} \quad \text {if } h = u,\\ 2 &{}\quad \text {if } h = v,\\ 0 &{} \quad \text {otherwise,} \end{array} \end{array}\right. } \end{aligned}$$

while for the latter the objective function is defined as follows

$$\begin{aligned} \xi ^{|E|+e}_{h} := {\left\{ \begin{array}{ll} \begin{array}{rl} 2 &{} \quad \text {if } h = u,\\ -1 &{} \quad \text {if } h = v,\\ 0 &{} \quad \text {otherwise.} \end{array} \end{array}\right. } \end{aligned}$$

The underlying problem associated with each scenario is solvable in polynomial time by inspection. Finally, the fixed solution is the all-zero vector, hence it has zero cost for every scenario.

We now show that there exists a subset \(W \subseteq V\) of vertices whose induced cut has size q or more if and only if there exists a solution for the completion problem with value of \(-q\) or less.

Assume that G has a cut with value q, i.e., there exists a subset W of vertices such that \( | \{ e = (u,v) \in E: u \in W, v \notin W \} | \ge q\). Define a solution for the completion problem by setting \(x^1_h = 1\) if \(h \in W\) and 0 otherwise. Since the number of edges with exactly one endpoint in W is at least q, there are at least q scenarios for which this solution has value \(-1\). For all the remaining scenarios there exists a solution with zero value, hence the value of the completion problem is at most \(-q\).

Conversely, assume now that a solution of the completion problem exists with value \(-q\) or less. Define the set W that includes all the vertices associated with variables that take value 1 in the free solution. As the completion problem has value at most \(-q\), there are at least q scenarios whose value is \(-1\). Every such scenario is associated with an edge that must have exactly one endpoint in W (otherwise, the value of that scenario would be non-negative), hence the thesis. \(\square \)

3 Mathematical models

Before discussing mathematical formulations that can be used to model (DCmEm), observe that there are relevant situations in which uncertainty affects the feasible set at an interdiction level. This special case arises, e.g., when the underlying problem is a shortest path or spanning tree problem and uncertainty affects the availability of some edges in some scenarios. In this case, one can change the objective function of each scenario to strongly penalize the use of forbidden edges, thus reducing the problem to an (mEm) and allowing its solution through the algorithms for the unconstrained case.
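
Concretely, denoting by \(F^j\) the set of edges forbidden in scenario j (our own notation for this sketch) and by M a sufficiently large constant, one possible cost modification is

$$\begin{aligned} {\tilde{\xi }}^j_e := {\left\{ \begin{array}{ll} M &{} \quad \text {if } e \in F^j,\\ \xi ^j_e &{} \quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

so that every scenario keeps the nominal feasible set while forbidden edges are never used by an optimal solution of finite cost.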

3.1 Compact formulation

We now present a quadratic programming formulation, derived from a similar model in [7], that involves a polynomial number of variables and constraints. From now on, we will say that a scenario j is covered by the i-th solution if solution i is feasible for scenario j and it is selected, among those that are feasible, for that scenario. For each solution i and scenario j, let \(y_{ij}\) be a binary variable taking value 1 if and only if scenario j is covered by solution i. The model reads

$$\begin{aligned}&\min \displaystyle {\sum _{j=1}^{\ell }\xi ^{jT} \left( \sum _{i=1}^{K} y_{ij} \, x^i \right) } \end{aligned}$$
(5)
$$\begin{aligned}&\text{ s.t. } \displaystyle {\sum _{i=1}^{K} y_{ij} = 1} \quad j = 1, \ldots , \ell \end{aligned}$$
(6)
$$\begin{aligned}&y_{ij}=1 \Rightarrow x^i \in X^j \quad i = 1, \ldots , K; j = 1, \ldots , \ell \end{aligned}$$
(7)
$$\begin{aligned}&y_{ij} \in \{0, 1\} \quad i = 1, \ldots , K; j = 1, \ldots , \ell . \end{aligned}$$
(8)

Constraints (6) impose that each scenario is covered by exactly one solution, whose cost is taken into account in the objective function (5). Domain constraints for the y variables are imposed by (8). Constraints (7) guarantee that each scenario is covered by a solution that is feasible for that scenario. We now discuss possible methods for handling this kind of constraint when using a general-purpose MILP solver. Denoting by n the number of variables in the underlying problem, and assuming that the feasible set of each scenario j is defined as \(X^j=\{x \in {\mathbb {R}}^n : a_r^{jT} x\ge b^j_r; r = 1, \ldots , m\}\), i.e., it is characterized by m linear constraints, the first option is to apply linearization and to replace (7) by

$$\begin{aligned} a_r^{jT} \, x^i \, + M_{j,r} (1 - y_{ij}) \ge b_r^j \quad i = 1, \ldots , K; j = 1, \ldots , \ell ; r = 1, \ldots , m \end{aligned}$$

where \(M_{j,r} := b_r^j- \min _{x\in X^j}\{a_r^{jT} \, x\}\).

A second option is to exploit the so-called indicator constraints available in state-of-the-art MILP solvers, which are a modelling tool to express disjunctive conditions. We can then replace constraints (7) by

$$\begin{aligned} y_{ij}=1 \, \Rightarrow \, a_r^{jT} \, x^i \ge b_r^j \quad i = 1, \ldots , K; j = 1, \ldots , \ell ; r = 1, \ldots , m. \end{aligned}$$

Preliminary computational experiments showed that, on our benchmark, the two approaches were comparable, though the former was more robust from a numerical viewpoint and was therefore adopted in the subsequent experiments. Finally, observe that the objective function contains bilinear terms as well. Each such term can be handled by introducing an additional variable and a bilinear constraint, to be linearized in a similar way. In this case, however, we experienced better performance by letting the solver determine the best way to handle the non-linearities.
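
For completeness, one standard linearization of the bilinear terms (a sketch under the assumption of finite component bounds \(l_h \le x^i_h \le u_h\) valid for all candidate solutions, which can be derived from the bounded feasible sets) introduces variables \(w^{ij} := y_{ij} \, x^i\) together with the constraints

$$\begin{aligned} l_h \, y_{ij} \le w^{ij}_h \le u_h \, y_{ij}, \qquad x^i_h - u_h \, (1 - y_{ij}) \le w^{ij}_h \le x^i_h - l_h \, (1 - y_{ij}), \end{aligned}$$

for all i, j, and h, so that the objective (5) can be rewritten as \(\sum _{j=1}^{\ell } \xi ^{jT} \sum _{i=1}^{K} w^{ij}\).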

3.2 Set partitioning formulations

We now introduce two novel formulations for (DCmEm). Both formulations are pure 0–1 linear programs and involve an exponential number of variables.

Assignment Formulation We derive the first formulation by working in the space of solutions. Let us denote by \(Q = \cup _j X^j\) the set of all solutions that are feasible for at least one scenario. With a small abuse of notation, for each solution \(p \in Q\), we also denote by p the index of the solution in set Q. For each solution \(p \in Q\), let us introduce a binary variable \(\sigma _p\), taking value 1 if solution p is among the K chosen solutions, and 0 otherwise. In addition, for each solution p and scenario j, let \(\rho _{jp}\) be a binary variable taking value 1 if and only if scenario j is assigned to solution p. Finally, let \(r_{jp}\) be the cost of solution p in scenario j; \(r_{jp}\) and variable \(\rho _{jp}\) are set to zero if \(p \notin X^j\) (the variable is simply not defined in a practical implementation of the model).

Given the choice of the decision variables, we will refer to this model as the Assignment Formulation (AF). The model is as follows

$$\begin{aligned} \begin{array}{lll} \min &{} \displaystyle {\sum _{j = 1}^{\ell } \sum _{p \in Q} r_{jp} \, \rho _{jp}} &{} \\ \text{ s.t. } &{} \displaystyle {\sum _{p \in Q} \rho _{jp} = 1} &{} j = 1, \ldots , \ell \\ &{} \sigma _p \ge \rho _{jp} &{} j = 1, \ldots , \ell ; p \in Q \\ &{} \displaystyle {\sum _{p \in Q} \sigma _p \le K} &{} \\ &{} \displaystyle {\rho _{jp} = 0} &{} j = 1, \ldots , \ell ; p \notin X^j \\ &{} \sigma _p \in \{0, 1\} &{} p \in Q \\ &{} \rho _{jp} \in \{0, 1\} &{} j = 1, \ldots , \ell ; p \in Q. \end{array} \end{aligned}$$
(AF)

The first set of constraints ensures that every scenario is covered by exactly one solution, while the second and third sets of constraints ensure that at most K solutions are used.

Observe that this model could involve an infinite number of variables in case Q contains an infinite number of solutions, as happens if the underlying problem includes continuous variables. In this case, we can restrict Q to contain only “relevant” solutions, i.e., all the extreme points of each set obtained as the intersection of the feasible sets of any collection of scenarios. Since the number of scenarios is finite and each feasible set is bounded, the number of such intersections is finite and each of them defines a bounded polyhedral region. Thus, the overall number of extreme points is finite, though possibly exponential with respect to the input size.

Subsets Formulation To define the second model, that we call Subsets Formulation (SF), let \({{\mathcal {F}}}\) be the family of all feasible subsets of the scenarios, i.e., all subsets of scenarios for which the intersection of the associated feasible regions is non-empty. Each subset \(S \in {{\mathcal {F}}}\) can be described by an \(\ell \)-dimensional binary vector z whose j-th component is equal to 1 if scenario \(j \in S\), and 0 otherwise. Given a subset \(S \in {{\mathcal {F}}}\), the associated cost c(S) is defined by

$$\begin{aligned} \begin{array}{lll} c(S) := &{} \displaystyle {\min \sum _{j \in S} \xi ^{jT} x} &{} \\ \text{ s.t. } &{} x \in X^j &{} j \in S. \end{array} \end{aligned}$$
(9)

In the remainder, we will assume that an oracle is available for solving subproblem (9). In addition, the solution x(S) returned by the oracle will be denoted as the solution induced by S.

For each feasible subset of scenarios \(S_t \in {{\mathcal {F}}}\), denote by \(c_t := c(S_t)\) the cost of the induced solution, and introduce a binary variable \(\vartheta _t\): if \(\vartheta _t\) takes value 1, all scenarios in subset \(S_t\) are covered by the same solution. Then, we obtain the following model

$$\begin{aligned} \begin{array}{lll} \min &{} \displaystyle {\sum _{t = 1}^{|{{\mathcal {F}}}|} c_t \, \vartheta _t} &{} \\ \text{ s.t. } &{} \displaystyle {\sum _{t = 1}^{|{{\mathcal {F}}}|} z_{jt} \, \vartheta _t = 1} &{} \quad j = 1, \ldots , \ell \\ &{} \displaystyle {\sum _{t = 1}^{|{{\mathcal {F}}}|} \vartheta _t \le K} &{}\\ &{} \vartheta _t \in \{0, 1\} &{} \quad t = 1, \ldots , |{{\mathcal {F}}}|. \end{array} \end{aligned}$$
(SF)

The model requires selecting at most K subsets, so that each scenario belongs to exactly one selected subset. By associating the i-th selected subset with solution \(x^i\), we obtain a set of solutions for (DCmEm), whose cost is given by the sum of the costs of the selected subsets.

Formulations (SF) and (AF) require enumeration of all the feasible subsets of scenarios and of all feasible solutions for every scenario, respectively. Enumeration can be performed implicitly by means of column generation techniques, as will be described in Sect. 4.2.1. We conclude this section by showing that the two set partitioning models have the same tightness in terms of continuous relaxation.

Theorem 6

Formulations (AF) and (SF) are equivalent in terms of continuous relaxation.

Proof

In the following we will denote by (AF)\(_c\) and (SF)\(_c\) the continuous relaxations of formulations (AF) and (SF), respectively. To prove the statement we show that, given an optimal solution of (SF)\(_c\), there exists a feasible solution of (AF)\(_c\) with the same value, and vice versa. \(\square \)

Given an optimal solution \(\vartheta \) of (SF)\(_c\), the corresponding solution for (AF)\(_c\) is defined as follows:

  1. for each \(p \in Q\) do
     • \(\sigma _p := 0\);
     • for \(j:=1\) to \(\ell \) do \(\rho _{jp} := 0\);
     endfor
  2. for each \(S_t \in \mathcal {F}\) such that \(\vartheta _t > 0\) do
     • let p be the solution induced by subset \(S_t\);
     • \(\sigma _p := \sigma _p + \vartheta _t\);
     • for \(j:=1\) to \(\ell \) do \(\rho _{jp} := \rho _{jp} + z_{jt} \, \vartheta _t\)
     endfor

Step 1 initializes an empty solution, while the second step defines the correct value for the \(\sigma \) and \(\rho \) variables. In this step, only subsets \(S_t\) that are actually selected in solution \(\vartheta \) are taken into account. For each such subset, the associated induced solution p is considered, i.e., an optimal solution of subproblem (9). Every positive value \(\vartheta _t\) is used to increase the value of a single \(\sigma \) variable, hence \(K \ge \sum _{t=1}^{|{{\mathcal {F}}}|} \vartheta _t = \sum _{p \in Q} \sigma _p\). Similarly, for each scenario j, each variable \(\rho _{jp}\) is increased by \(\vartheta _t\) when considering a subset \(S_t\) that (i) is selected; (ii) includes scenario \(j \in S_t\) (i.e., \(z_{jt} = 1\)); and (iii) induces solution p. It follows that \(1 = \sum _{t=1}^{|{{\mathcal {F}}}|} z_{jt} \, \vartheta _t = \sum _{p \in Q} \rho _{jp}\), hence solution \((\sigma , \rho )\) is feasible to (AF)\(_c\). Finally, consider a selected subset \(S_t\), which contributes with a cost \(c_t \vartheta _t\) to the objective function of (SF)\(_c\). Let p be the associated induced solution and note that, by definition, we have \(c_t = \sum _{j \in S_t} r_{jp}\). Increasing by \(\vartheta _t\) each variable \(\rho _{jp}\) (\(j \in S_t\)) produces a cost increase equal to \(\sum _{j \in S_t} r_{jp} \, \vartheta _t = \vartheta _t \, \sum _{j \in S_t} r_{jp} = c_t \, \vartheta _t\) in the objective function of (AF)\(_c\). Hence, the two solutions have the same cost.

Assume now that \((\sigma , \rho )\) is an optimal solution for (AF)\(_c\). The following procedure defines a solution \(\vartheta \) for (SF)\(_c\):

  1. for each \(S_t \in {{\mathcal {F}}}\) do \(\vartheta _t := 0\);
  2. for each \(p \in Q\) do
     • for \(j:=1\) to \(\ell \) do \({\bar{\rho }}_{jp} := \rho _{jp}\);
     • while there exists a \(j\in \{1,\dots ,\ell \}\) such that \({\bar{\rho }}_{jp} >0\) do
       • let \(S_t = \{j : {\bar{\rho }}_{jp} >0\}\), and \(\rho _{\min } := \min _{j \in S_t} \{{\bar{\rho }}_{jp} \}\);
       • \(\vartheta _t := \vartheta _t + \rho _{\min }\);
       • for each \(j \in S_t\) do \({\bar{\rho }}_{jp} := {\bar{\rho }}_{jp} - \rho _{\min }\)
       endwhile
     endfor

At each iteration of the for loop, we consider a solution p and possibly define a number of different sets \(S_t\). By optimality of \((\sigma , \rho )\), for each such set \(S_t\), p is an optimal solution for the associated problem (9), and \(c_t\) is the sum of the costs of solution p in the scenarios in \(S_t\). Thus, (AF)\(_c\) and (SF)\(_c\) have the same solution value. By construction, for each \(p \in Q\), we have

$$\begin{aligned} \sum _{t:p \text { optimal for } S_t }\vartheta _t=\sigma _p = \max _{ j\in \{1,\dots ,\ell \}} \rho _{jp}, \end{aligned}$$

which makes the sum of all the \(\vartheta _t\) at most K. In addition, for each scenario j, we have \(\sum _{t:j \in S_t} \vartheta _t = \sum _{p \in Q} \rho _{jp} = 1\). Hence, solution \(\vartheta \) is feasible for (SF)\(_c\).
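
For illustration, a compact sketch of the second conversion (our own rendering of the decomposition loop above; the data layout is hypothetical) processes one solution p at a time, where rho holds its fractional assignment values over the scenarios:

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class AfToSfConversion {

    // Decomposes the fractional assignment values rho[j] of one solution p into
    // (subset of scenarios, weight) pairs, i.e., into columns of (SF)_c.
    static Map<BitSet, Double> decompose(double[] rhoP) {
        Map<BitSet, Double> theta = new HashMap<>();
        double[] bar = rhoP.clone();
        while (true) {
            BitSet subset = new BitSet(bar.length);
            double min = Double.MAX_VALUE;
            for (int j = 0; j < bar.length; j++) {
                if (bar[j] > 1e-9) {            // scenario j still has residual mass
                    subset.set(j);
                    min = Math.min(min, bar[j]);
                }
            }
            if (subset.isEmpty()) break;        // nothing left to assign for p
            theta.merge(subset, min, Double::sum);
            for (int j = subset.nextSetBit(0); j >= 0; j = subset.nextSetBit(j + 1)) {
                bar[j] -= min;                  // subtract the weight just assigned
            }
        }
        return theta;
    }

    public static void main(String[] args) {
        // Fractional rho values of one solution p over 4 scenarios.
        Map<BitSet, Double> theta = decompose(new double[] {0.5, 0.5, 0.2, 0.0});
        theta.forEach((s, v) -> System.out.println("S = " + s + ", theta = " + v));
    }
}
```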

4 Exact algorithms

Solving the set partitioning formulations of Sect. 3.2 requires enumerating all feasible solutions and all feasible subsets of scenarios, respectively, which is possible only for instances of small size. In this section we describe two alternative exact methods for solving instances that cannot be attacked by the direct application of an MILP solver.

4.1 Enumeration of partitions

The first exact method is an extension of an approach introduced in [7] for the (mEm). This algorithm is proven to be an oracle-polynomial time algorithm if \(\ell -K\) is fixed, i.e., it provides a positive complexity result for this special case of (DCmEm). The proposed enumerative approach is based on the following consideration: let P be a partition of the scenarios into K subsets \(S_1, S_2, \ldots , S_K\). Then, an optimal solution for (DCmEm) can be determined by computing K solutions, the i-th being the solution induced by subset \(S_i\) according to (9). The K solutions induced by the subsets constitute a set of solutions for (DCmEm); with an abuse of notation, we denote this set as the solution induced by P.

Using this observation, one can design an exact algorithm that defines all possible partitions of the scenarios into K subsets and, for each partition P, computes the induced solution. It is easy to see that, to avoid symmetric partitions and empty subsets (which are unnecessary to define an optimal solution), it is enough to consider “only” \(S_{\ell ,K}\) partitions, where \(S_{\ell ,K}\) is the Stirling number of the second kind.

This algorithm has two main pitfalls. First, the number of partitions to be considered can be very large, in particular when \(\ell -K\) is large. In addition, the determination of the solution induced by a subset of scenarios requires solving a problem that does not have the same structure as the deterministic underlying problem, and may be much more challenging than the latter from a computational viewpoint. For example, if the underlying problem is a knapsack problem, then the problem to be solved is a multidimensional knapsack problem, in which different knapsack constraints (one for every scenario) have to be considered.
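
As an illustration of the enumeration step (a sketch of our own; the oracle of problem (9) is left as a placeholder), the partitions can be generated via restricted growth strings, assigning each scenario to one of at most K blocks while block indices are opened in increasing order, which avoids symmetric duplicates:

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionEnumerator {

    // Recursively assigns scenario 'j' to one of the blocks opened so far, or to a
    // new block (as long as fewer than K blocks are open). Each complete assignment
    // is a partition of the scenarios into at most K non-empty subsets; symmetric
    // duplicates are avoided because block indices are opened in increasing order.
    static void enumerate(int j, int ell, int K, int[] block, int open, List<int[]> out) {
        if (j == ell) {
            out.add(block.clone());
            return;
        }
        int limit = Math.min(open + 1, K);
        for (int b = 0; b < limit; b++) {
            block[j] = b;
            enumerate(j + 1, ell, K, block, Math.max(open, b + 1), out);
        }
    }

    public static void main(String[] args) {
        List<int[]> partitions = new ArrayList<>();
        enumerate(0, 4, 2, new int[4], 0, partitions);
        // For ell = 4 scenarios and K = 2 this yields S(4,1) + S(4,2) = 1 + 7 = 8
        // partitions; an exact algorithm would call the oracle (9) on each block.
        System.out.println(partitions.size() + " partitions");
    }
}
```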

4.2 Branch-and-price algorithm

In this section we present an exact algorithm based on the set partitioning formulation (SF) introduced in Sect. 3.2, which is more convenient for generating columns than (AF). Section 4.2.2 describes a branch-and-price scheme that uses, at each node, the algorithm described in the next subsection for computing a lower bound by solving the LP relaxation of the model.

4.2.1 Column generation

By dropping the binary requirements, the domain of the variables of model (SF) can be replaced by \(\vartheta _t \ge 0\) (\(t = 1, \ldots , |{{\mathcal {F}}}|\)), and hence the dual of the resulting model is:

$$\begin{aligned} \begin{array}{ll} \max &{} \displaystyle {\sum _{j = 1}^{\ell } \lambda _j + K \, \mu } \\ \text{ s.t. } &{} \displaystyle {\sum _{j = 1}^{\ell } z_{jt} \, \lambda _j + \mu \le c_t } \quad t = 1, \ldots , |{{\mathcal {F}}}|\\ &{} \mu \le 0, \end{array} \end{aligned}$$

where \(\lambda _j\) (\(j = 1, \ldots , \ell \)) and \(\mu \) are the dual variables associated with the constraints of (SF)\(_c\).

Column generation defines a restricted master problem, in which a subset of the \(\vartheta _t\) variables is used, and solves this continuous problem to optimality. Let \(\lambda ^*_j\) (\(j = 1, \ldots , \ell \)) and \(\mu ^*\) be an optimal dual solution to the restricted master problem. Then, column generation asks for a variable (column) that has a negative reduced cost, i.e., whose associated dual constraint is violated. For a given subset S of scenarios, the associated dual constraint is violated if

$$\begin{aligned} \sum _{j \in S} \lambda ^*_j + \mu ^* > c(S) \end{aligned}$$

where \(c(S) = \sum _{j \in S} {\xi ^j}^T x(S)\) is the cost of the induced solution. This solution must satisfy all constraints of the scenarios in the subset. The problem of determining the subset S (if any) whose associated dual constraint is maximally violated can be formulated by introducing, for each scenario j, a binary variable \(\pi _j\) taking value 1 if and only if scenario j belongs to subset S. The reduced cost \({\bar{c}}(S)\) of this subset is given by the optimal solution of the following problem

$$\begin{aligned} \begin{array}{lll} {\bar{c}}(S) = \min &{} \sum _{j = 1}^{\ell } ({\xi ^j}^T x - \lambda ^*_j) \, \pi _j - \mu ^* &{} \\ \text{ s.t. } &{} \pi _j = 1 \Rightarrow x \in X^j &{} j = 1, \ldots , \ell \\ &{} \pi _j \in \{0,1\} &{} j = 1, \ldots , \ell . \end{array} \end{aligned}$$
(10)

If the optimal solution of the model has a negative value, then the subset of scenarios \(S = \{j : \pi _j = 1\}\) corresponds to a variable with negative reduced cost and should be added on-the-fly to the current restricted master.

The model includes a set of non-linear implications that can be either expressed by linear constraints using large M coefficients, or formulated as indicator constraints (see the discussion in Sect. 3.1). In our experiments with a general-purpose solver, we experienced better performance using the first strategy. In addition, the objective function involves the product of the x and \(\pi \) variables, where the latter are binary. Each bilinear term can be replaced by an additional variable whose value is defined either by an indicator constraint or by linearization with a large M coefficient.

Note that the pricing problem is \(\mathcal {NP}\)-hard, as shown by the following reduction from the Completion problem (see Sect. 2.2).

Theorem 7

The pricing problem (10) is equivalent to the Completion problem with \(K_f = {\bar{K}} = 1\).

Proof

We can reformulate the objective function of the pricing problem as

$$\begin{aligned} \min \sum _{j = 1}^{\ell } {\xi ^j}^T x \pi _j + \sum _{j = 1}^{\ell } \lambda ^*_j (1 - \pi _j) -\sum _{j = 1}^{\ell } \lambda ^*_j - \mu ^* \end{aligned}$$

where the last two terms are constant and do not depend on the variables.

We now show that the pricing problem can be reduced to the Completion problem. Given an instance of the former, we define an instance of the latter with \(n+1\) variables and \(\ell +1\) scenarios. Every scenario \(j\le \ell \) gets the feasible set \((X^j\cup \{0\}^n)\times \{0,1\}\). As to the objective function, it has a coefficient \(\xi ^j_h\) for each variable \(x_h\) (\(j=1,\ldots ,\ell ; h = 1, \ldots , n\)), and a coefficient \(\lambda ^*_j\) for variable \(x_{n+1}\). The last scenario has a feasible set \(X^{\ell +1} = \{0, 1\}^{n+1}\), zero objective function for the first n variables, and a coefficient equal to M for the last variable. Finally, we set the fixed solution as follows: \({\bar{x}}_h = 0\) for \(h = 1, \ldots , n\) and \({\bar{x}}_{n+1} = 1\).

Consider a scenario j covered by the fixed solution: the contribution to the objective value is \(\lambda ^*_j\) if \(j\le \ell \), and M otherwise. Let x be the free solution determined solving the Completion problem with \(K_f = {\bar{K}} = 1\). For a sufficiently large value of M, the free solution must have \(x_{n+1} = 0\), i.e., the solution covers the last scenario with zero cost. For the remaining scenarios \(j \in \{1, \ldots , \ell \}\), the free solution has a cost \({\xi ^j}^T \, x\), i.e., each scenario j will be covered by the free solution x if \(x \in X^j\) and \({\xi ^j}^T \, x < \lambda ^*_j\) holds, and by the fixed solution otherwise. In other words, the scenarios that are covered by the free solution determine the set S that corresponds to the optimal solution of the pricing problem.

We now reduce the Completion problem to the pricing problem. To this aim, given a fixed solution \({\bar{x}}\), we set \(\lambda ^*_j = {\xi ^j}^T {\bar{x}}\) for each scenario j, and use the same set of scenarios in both problems. It is easy to see that the objective functions of the two problems are equivalent to each other and that the free solution of the Completion problem corresponds to the solution x of the pricing problem. \(\square \)

Fig. 2 Heuristic algorithm for pricing

As the pricing problem is in general \(\mathcal {NP}\)-hard, it makes sense to solve it using a heuristic algorithm, resorting to an exact method only in case the former fails to produce a variable with negative reduced cost. Our heuristic algorithm for pricing, described in Fig. 2, takes as input two integer parameters \(p_1\) and \(p_2\). In our implementation, we used \(p_1=5000\) and \(p_2=500\), as this setting produced good results in our preliminary tests. The algorithm randomly selects \(p_1\) subsets of scenarios (chosen with equal probability) and checks whether one of them corresponds to a column with negative reduced cost. As already observed, evaluating the reduced cost of a subset S requires computing the solution induced by S, which can be time consuming in practice. For this reason, the following heuristic rule is used to reduce the number of subsets that are evaluated: for each candidate subset, we determine an approximated reduced cost, and compute the real reduced cost for the \(p_2\) most promising subsets only. The reduced cost of a subset S is given by \(- \mu ^* - \sum _{j \in S} \lambda ^*_j + \sum _{j \in S} {\xi ^j}^T x(S)\), where x(S) is the solution induced by S. Hence, an approximate algorithm for determining this solution produces an approximate value for the reduced cost as well. The design of the specific algorithm used to compute an approximate induced solution depends on the underlying problem.

As a further improvement to the pricing step, note that the reduced cost of a column (variable) consists of a linear term depending on the dual variables plus the cost of the induced solution. Computing the second part can be very time consuming, depending on the underlying problem, whereas the first part can be computed with limited computational effort. Observe that only the first part of the reduced cost changes through the column generation process. Therefore, by creating a hash table with sets of scenarios as keys and the corresponding induced solution and cost as values, one can compute “for free” the reduced cost of a column involving a subset of scenarios that has already been encountered.

Every time an induced solution is computed and not added to the master (as it does not have a negative reduced cost), we store it and its cost in this table. This may happen if heuristic pricing is used, as many subsets of scenarios are considered and the induced solution for each subset is computed, but also if an exact solver is used for the pricing problem. In the latter case, indeed, one can exploit the possibility offered by modern solvers to provide, in addition to the optimal one, a pool of feasible solutions with no computational overhead. Before using the heuristic, we compute the reduced cost (taking into account the updated dual values) for each variable corresponding to a subset of scenarios stored in the hash table; if a variable with negative reduced cost is found, we add it to the master problem, which is then re-optimized. Our computational experiments (see Sect. 6) show that this may have a dramatic impact on the performance of the overall procedure, mainly when the underlying problem is hard to solve. Using hash tables makes the column generation procedure progressively faster as more iterations are executed. As a consequence, the method shows a speed-up during the exploration of the enumeration tree in the branch-and-price algorithm, because large parts of these hash tables can be passed on from a parent node to its children.
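
A minimal sketch of this caching scheme (our own illustration; the oracle call stands in for the problem-specific solver of subproblem (9)) keys the table by the bit set of scenarios in the subset:

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class InducedSolutionCache {

    private final Map<BitSet, Double> inducedCost = new HashMap<>();
    private final ToDoubleFunction<BitSet> oracle;   // solves subproblem (9) for a subset S

    InducedSolutionCache(ToDoubleFunction<BitSet> oracle) {
        this.oracle = oracle;
    }

    // Reduced cost of the column associated with subset S for duals (lambda, mu).
    // The expensive induced cost c(S) is computed at most once per subset and then
    // reused across pricing iterations, where only the dual part changes.
    double reducedCost(BitSet S, double[] lambda, double mu) {
        double cS = inducedCost.computeIfAbsent((BitSet) S.clone(),
                key -> oracle.applyAsDouble(key));
        double dualPart = -mu;
        for (int j = S.nextSetBit(0); j >= 0; j = S.nextSetBit(j + 1)) {
            dualPart -= lambda[j];
        }
        return cS + dualPart;
    }
}
```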

4.2.2 Branching scheme

The algorithm for computing the LP relaxation of model (SF) has been embedded into a branch-and-price algorithm. At the root node, a feasible solution is computed using a fast heuristic algorithm [7]. At each node, the continuous relaxation of the current subproblem is solved, producing a lower bound on the optimal solution value of the current subproblem. If the solution of the relaxation is integer, the incumbent may be improved and the node is fathomed. Otherwise, if the lower bound is smaller than the incumbent value, the optimal solution of the continuous relaxation is used to branch, producing two subproblems that are explored according to a depth-first strategy. In particular, let a be a scenario that is included in more than one subset selected in the current fractional solution (note that such a scenario always exists in a non-integer solution). Let \(S_1\) and \(S_2\) be two of these subsets, and let b be a scenario that belongs to the symmetric difference of \(S_1\) and \(S_2\). In the first node we impose that scenarios a and b belong to the same subset, while in the other node we forbid it. Observe that these branching rules affect the pricing subproblems at the descendant nodes. However, a nice property of this branching scheme is that handling these modifications is very easy both in the heuristic column generation and in the exact pricing model. Indeed, in the latter case, it is enough to enforce in (10) the additional constraints \(\pi _a = \pi _b\) for the first node, and \(\pi _a + \pi _b \le 1\) for the second one.

In general, we have a number of candidate scenarios for a and b, thus we use the following tie-breaking rules. For each scenario j, we define a score

$$\begin{aligned} \mathtt{score}^1_j = \sum _{t: j \in S_t}\min \{1-\vartheta _t,\vartheta _t\}, \end{aligned}$$

which measures the total distance to the closest integer over all the variables associated with subsets that contain j; we then select as a the scenario having the maximum score.

For a given a, we assign to each scenario a second score

$$\begin{aligned} \mathtt{score}^2_j = \min \big \{{|\{t: \vartheta _t> 0, a \in S_t, j \notin S_t\}|,|\{t: \vartheta _t >0, a \in S_t, j \in S_t\}|} \big \}, \end{aligned}$$

and take as b the scenario that maximizes this figure. In this way, we define a branching that balances, between the two descendant nodes, the number of subsets in the current solution that violate the branching conditions.
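
The two scores can be computed directly from the fractional solution; a possible sketch (our own; the column representation and the tolerance are assumptions) is:

```java
import java.util.BitSet;
import java.util.List;

public class BranchingScores {

    // score1(j): total distance to the closest integer over all variables theta[t]
    // whose subset S_t contains scenario j; the scenario a maximizing it is chosen first.
    static double score1(int j, List<BitSet> subsets, double[] theta) {
        double s = 0.0;
        for (int t = 0; t < theta.length; t++) {
            if (subsets.get(t).get(j)) {
                s += Math.min(1.0 - theta[t], theta[t]);
            }
        }
        return s;
    }

    // score2(j) for a fixed scenario a: the smaller of the number of positive columns
    // containing a but not j and of those containing both a and j; the scenario b
    // maximizing it balances the columns affected by the two branching decisions.
    static int score2(int j, int a, List<BitSet> subsets, double[] theta) {
        int without = 0, with = 0;
        for (int t = 0; t < theta.length; t++) {
            if (theta[t] > 1e-9 && subsets.get(t).get(a)) {
                if (subsets.get(t).get(j)) with++; else without++;
            }
        }
        return Math.min(without, with);
    }
}
```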

5 Approximation

In Sect. 2 we discussed the complexity of checking feasibility of (DCmEm), proving hardness results even when the underlying problem is solvable in polynomial time. We now present some results concerning the approximation of (DCmEm). Given the negative results on approximability in [7], we focus on problem (9), i.e., on computing the solution induced by a subset of scenarios S; observe that this problem can be hard even if the underlying problem is polynomial-time solvable. We show that replacing the exact oracle for problem (9) with an approximation algorithm yields an approximation algorithm for (DCmEm) as well.

Theorem 8

Assuming that the optimal values of the oracle are either all non-negative or all non-positive, using an \(\alpha \)-approximation algorithm as the oracle for optimizing over a subset of scenarios (problem (9)) gives an \(\alpha \)-approximation algorithm with respect to the algorithm embedding an exact oracle.

Proof

Without loss of generality, we assume that all the optimal values of the oracle are non-negative and that we have a minimization problem. (A similar proof can be given in case of non-positive optimal values).

For a given set S of scenarios, let \(val_\alpha (S)\) be the solution value obtained solving problem (9) for S through an \(\alpha \)-approximation algorithm. With an abuse of notation, given an arbitrary partition \({\widetilde{P}} = \{S_1, S_2, \ldots , S_K\}\) of the scenarios, we denote by \(val_\alpha ({\widetilde{P}})\) the solution value obtained executing the \(\alpha \)-approximation algorithm for each \(S\in {\widetilde{P}}\). Finally, let \(val_1(S)\) and \(val_1({\widetilde{P}})\) be the same figures when an exact oracle is used. By definition of \(\alpha \)-approximation, for each subset S of scenarios in the partition, we have \(val_\alpha (S) \le \alpha \cdot val_1(S)\), and hence

$$\begin{aligned} val_\alpha ({\widetilde{P}}) = \sum _{S\in {\widetilde{P}}} val_\alpha (S) \le \sum _{S\in {\widetilde{P}}} \alpha \cdot val_1(S) = \alpha \cdot val_1({\widetilde{P}}) \end{aligned}$$

Denote now by \(val_\alpha ^*=\underset{P}{\min }\ val_\alpha (P)\) the best solution value, over all possible partitions, using the \(\alpha \)-approximation algorithm. We have

$$\begin{aligned} val_\alpha ^* \le val_\alpha ({\widetilde{P}}) \le \alpha \cdot val_1({\widetilde{P}}) \end{aligned}$$
(11)

Finally, let \(val_1^*\) denote the optimal solution value for the problem, i.e., the value obtained by solving the subproblems associated with an optimal partition using an exact oracle.

Observe that (11) is valid for any partition \({\widetilde{P}}\). Using an optimal partition, we get \(val_\alpha ^* \le \alpha \cdot val_1^*\), which concludes the proof. \(\square \)

Note that Theorem 8 requires that the optimal values of the subproblems are either all non-negative or all non-positive, a typically mild assumption in practice. However, the theorem does not imply the existence of a polynomial-time \(\alpha \)-approximation algorithm for (DCmEm), not even when a polynomial-time \(\alpha \)-approximation algorithm for subproblem (9) is available, as (DCmEm) cannot be solved in oracle-polynomial time in general (and is \(\mathcal {NP}\)-hard even for underlying problems that are solvable in polynomial time).

Theorem 9

Replacing the exact oracle with an algorithm having an absolute approximation guarantee of \(OPT+c\) yields an algorithm for (DCmEm) having an absolute approximation guarantee of \(OPT+K\cdot c\).

Proof

The proof is similar to that of Theorem 8. The only difference is that, for each subset S of scenarios, the algorithm has an absolute approximation guarantee, i.e., the associated solution value satisfies

$$\begin{aligned} val_\alpha (S)\le c + val_1(S) \end{aligned}$$

where c is a positive constant. Using the approximation algorithm for all subsets of scenarios we get

$$\begin{aligned} val_\alpha ({\widetilde{P}}) = \sum _{S\in {\widetilde{P}}} val_\alpha (S) \le \sum _{S\in {\widetilde{P}}} \left( c + val_1(S)\right) = K \cdot c + val_1({\widetilde{P}}) \end{aligned}$$

\(\square \)

Note that, in the case of an absolute approximation, we do not need to assume that the optimal values of the oracle are all non-negative or all non-positive.
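
Again with arbitrary numbers for illustration: with \(K = 3\), \(c = 2\), and exact subset values 4, 5 and 6 for a given partition \({\widetilde{P}}\), the approximate values are at most 6, 7 and 8, hence

$$\begin{aligned} val_\alpha ({\widetilde{P}}) \le (4+2) + (5+2) + (6+2) = 15 + 3 \cdot 2 = val_1({\widetilde{P}}) + K \cdot c. \end{aligned}$$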

6 Computational experiments

The described algorithms were implemented in Java version \(1.8.0\_212\) and executed on machines equipped with Intel Xeon processors running at 2.6 GHz. For all the linear and quadratic integer programming problems we used IBM-ILOG Cplex version 12.9. The branch-and-price algorithm was implemented without using any predefined framework.

In our experiments we considered the following three underlying problems, with increasing levels of complexity:

  • Maximum flow problems For these problems, we always used a square grid graph as network. The capacities of the arcs were generated in each scenario as independent, uniformly distributed rational numbers between 0.0 and 1.0, while the entries of each cost vector \(\xi \) were set to \(-1.0\) if the corresponding arc leaves the source and to 0.0 otherwise. All numbers were rounded to the second decimal digit;

  • Knapsack problems In each scenario, the profits and weights of the items were randomly generated as uncorrelated, uniformly distributed values between 0.0 and 1.0. The capacity was always set to 0.75W, where W denotes the current sum of the weights of the items. In this case, all values were rounded to the third decimal digit (a minimal generation sketch is given after this list);

  • Multidimensional knapsack problems We considered a variant of the knapsack problem in which there are 5 knapsack constraints. The profit, weight and capacity values for each scenario were generated as in the knapsack case.
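
The following minimal sketch (ours, not the actual generator used for the experiments; class and method names are hypothetical) illustrates how one scenario of a Knapsack instance can be generated according to the description above: uncorrelated profits and weights uniformly distributed between 0.0 and 1.0, capacity equal to 0.75W, and all values rounded to the third decimal digit.

import java.util.Random;

/* Minimal sketch (hypothetical names): generate one knapsack scenario with
   uncorrelated uniform profits and weights in [0,1], capacity 0.75 * W,
   all values rounded to the third decimal digit. */
public class KnapsackScenarioGenerator {

    static double round3(double v) { return Math.round(v * 1000.0) / 1000.0; }

    // Returns {profits, weights, {capacity}} for a scenario with n items.
    static double[][] generateScenario(int n, Random rng) {
        double[] profit = new double[n], weight = new double[n];
        double sumW = 0.0;
        for (int i = 0; i < n; i++) {
            profit[i] = round3(rng.nextDouble());  // uniform in [0,1), uncorrelated
            weight[i] = round3(rng.nextDouble());
            sumW += weight[i];
        }
        double capacity = round3(0.75 * sumW);     // 0.75 * sum of the weights
        return new double[][] { profit, weight, { capacity } };
    }
}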

In this section we compare exact algorithms for (DCmEm) on instances defined with different values of K and \(\ell \). For the maximum flow problem, variables are associated with arcs, and we used \(n \in \{24, 40\}\). For knapsack and multidimensional knapsack instances, variables are associated with items, and we used \(n \in \{10, 15\}\). In all cases, for each combination of parameters K, \(\ell \), and n, we randomly generated 10 instances.

We compare the following four exact algorithms, each executed with a time limit of 600 seconds per instance:

  • Compact formulation direct application of the solver to the compact formulation (5)–(8). In this case, we linearized constraints (7) using large M coefficients instead of using indicator constraints (the generic big-M scheme is recalled after this list), and let the solver handle the linearization of the objective function;

  • Set partitioning direct application of the solver to the set partitioning formulation SF. In our implementation, we spent at most 90% of the time limit on subset enumeration; in case this limit is reached, the solver is applied to a restricted formulation, thus producing a heuristic approach;

  • Enum. of partitions enumeration of partitions according to the procedure described in Sect. 4.1;

  • Branch-and-price enumerative algorithm based on column generation, as described in Sect. 4.2.
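
For the first method, since constraints (7) are not restated here, we only recall the generic big-M scheme used to replace an indicator constraint: a condition of the form \(z = 1 \Rightarrow a^\top x \le b\), with binary z, can be written as a linear constraint provided that M is a valid upper bound on \(a^\top x - b\) over the feasible region:

$$\begin{aligned} a^\top x \le b + M (1 - z). \end{aligned}$$

When \(z = 1\) the constraint reduces to \(a^\top x \le b\); when \(z = 0\) it is deactivated by the choice of M.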

Tables 1, 2 and 3 report the results for the different classes of instances. Each table reports, for each algorithm:

  • value average ratio (over 10 instances) between the solution value produced by the algorithm and the best known solution value for each instance;

  • # TL number of instances (out of 10) for which the algorithm hit the time limit;

  • time average computing time (computed over the instances solved to optimality only).

Table 1 Results for the exact methods for the maximum flow problem
Table 2 Results for the exact methods for the knapsack problem
Table 3 Results for the exact methods for the multidimensional knapsack problem with 5 constraints

The results in Table 1 show that, on the Maximum Flow Problem instances, the compact formulation is the best approach for \(K \le 3\). For larger values of K, this method could solve to proven optimality only a few instances with small values of K and/or \(\ell \), though it produced high-quality solutions in the other cases. The branch-and-price algorithm is the best approach for \(K\ge 4\), and it is the only algorithm able to solve instances for all settings. Finally, note that the enumeration of partitions is able to solve a large fraction of the instances. However, the computing time required by this algorithm is typically larger than that of set partitioning and branch-and-price. Also, it strongly depends on the number of scenarios, and the solution quality achieved on unsolved instances is unsatisfactory in some cases.

For the single Knapsack Problem (see Table 2) the situation is similar: the compact formulation performs well for most settings with \(K = 2\) and for \(K = 3\), while it becomes impractical for larger values of K. For most of the other settings, branch-and-price outperforms the other algorithms. Observe that there are some instances for which Set Partitioning does not reach the global time limit but hits the time limit imposed on the subset enumeration phase, which prevents it from computing a provably optimal solution.

The results show that an increase of \(\ell \) has a higher impact on the running time of all algorithms except the branch-and-price algorithm. Therefore, we expect the branch-and-price algorithm to outperform the compact formulation even more on larger instances, and possibly also on some instances with \(K=2\) if a larger time limit is used. However, when both algorithms hit the time limit, the compact formulation frequently has a slightly better solution value than the branch-and-price algorithm. This is due to the lack of heuristics in the latter approach, while the former takes advantage of the heuristic algorithms that the MILP solver may apply to a compact formulation. The integration of heuristic algorithms at the various nodes of the branch-and-price algorithm could improve the performance of this method, but this is outside the scope of the current paper.

Finally, Table 3 illustrates the results of the tests for the Multidimensional Knapsack Problem. In this case too, the compact formulation is the best exact algorithm, but only for \(K \le 3\) and \(\ell =15\). The enumeration of partitions solves to optimality only the cases with \(\ell =15\). In all the other settings, branch-and-price is the best algorithm.

Table 4 Results for the exact methods for large Knapsack instances

6.1 Large instances

In order to determine the computational limits of the proposed approaches, we considered an additional benchmark of randomly generated instances whose size is larger than those addressed in the previous experiments. In all these instances, the underlying problem is the Knapsack Problem, which has an intermediate level of computational complexity between the Maximum Flow and the Multidimensional Knapsack problems. The additional instances are generated exactly as those addressed in Table 2, except for the choice of parameters K, \(\ell \), and n, whose values are increased up to 15, 30, and 50, respectively.

For these experiments we tested the compact formulation, the set partitioning and the branch-and-price algorithms with a time limit of two hours per instance. The results are given in Table 4 and report, for each algorithm, the average ratio (over 10 instances) between the solution value produced by the algorithm and the best known solution value for each instance (column “value”). In addition, for the branch-and-price algorithm, we report the number of instances (out of 10) for which the algorithm hit the time limit (column “TL”) and the average computing time over the instances solved to optimality (column “time”). These two columns are not reported for the compact formulation and the set partitioning, which hit the time limit on all the instances in the benchmark.

The results in Table 4 show that the compact formulation and the set partitioning are unable to handle instances characterized by large values of K, \(\ell \) or n. Note that, in many cases, the set partitioning formulation is unable even to compute a feasible solution within the time limit; in these situations we report a ‘–’ in the corresponding entry of column “value”. The performance of the branch-and-price algorithm improves with larger values of K: the number of instances (out of 100) that are solved to optimality is equal to 13 for \(K=2\), 20 for \(K=5\), 52 for \(K=10\) and 81 for \(K=15\). Conversely, increasing the number n of items produces a worsening of the results: the algorithm hits the time limit for 11 instances with \(n = 15\) (out of 80), 54 times for \(n = 25\), and 70 times for \(n = 50\). As to parameter \(\ell \), its value seems to have a minor impact in this analysis: the algorithm solves 105 instances (out of 200) with \(\ell =25\) and 61 instances with \(\ell =30\). Finally, as far as the quality of the produced solutions is concerned, the branch-and-price algorithm is the best on average, though set partitioning is very effective on those instances for which it is able to compute a feasible solution within the time limit.

7 Conclusions

We considered stochastic problems in which recourse actions can be taken after the exact nature of uncertainty materializes, and one is allowed to implement a specific solution chosen among K solutions that have been computed in advance. The resulting K-adaptability paradigm is extremely challenging both from a theoretical and from a computational viewpoint: we proved that even determining whether the problem has a solution is \(\mathcal {NP}\)-hard, also when the underlying problem can be solved in polynomial time. We introduced mathematical formulations of the problem and exact solution approaches. Finally, an extensive computational analysis has been carried out on a large set of randomly generated instances.