Robust two-stage combinatorial optimization problems under convex second-stage cost uncertainty

In this paper a class of robust two-stage combinatorial optimization problems is discussed. It is assumed that the uncertain second-stage costs are specified in the form of a convex uncertainty set, in particular a polyhedral or an ellipsoidal one. It is shown that the robust two-stage versions of basic network optimization and selection problems are NP-hard, even in very restrictive cases. Some exact and approximation algorithms for the general problem are constructed. Polynomial and approximation algorithms for the robust two-stage versions of basic problems, such as the selection and shortest path problems, are also provided.


Introduction
In a traditional combinatorial optimization problem we seek a cheapest object composed of elements chosen from a finite element set E. For example, E can be a set of arcs of a given graph with specified arc costs, and we wish to compute an s-t path, spanning tree, perfect matching etc. with minimum cost (see, for example, Ahuja et al. (1993), Papadimitriou and Steiglitz (1998)). In many practical situations the exact values of the element costs are unknown. An uncertainty (scenario) set U is then provided, which contains all realizations of the element costs, called scenarios, which may occur. The probability distribution in U can be known, partially known, or unknown. In the latter case the robust optimization framework can be used, which consists in computing a solution that minimizes the cost under a worst-case scenario. Single-stage robust combinatorial optimization problems, under various uncertainty sets, have been extensively discussed over the last decade. Surveys of the results in this area can be found in Aissi et al. (2009), Kasperski and Zieliński (2016), Goerigk and Schöbel (2016), Buchheim and Kurtz (2018). For these problems a complete solution must be determined before the true scenario is revealed.
In many practical applications a solution can be constructed in more than one stage. For combinatorial problems, a part of the object can be chosen now (in the first stage) and completed in the future (in the second stage), after the cost structure has changed. Typically, the first-stage costs are known, while the second-stage costs can only be predicted to belong to an uncertainty set U. The first such models were discussed in Dhamdhere et al. (2005), Flaxman et al. (2006), Katriel et al. (2008), Kasperski and Zieliński (2011), where the robust two-stage spanning tree and perfect matching problems were considered. In these papers, the uncertainty set U contains K explicitly listed scenarios. Several negative and positive complexity results for this uncertainty representation were established. Some of them have been recently extended in Goerigk et al. (2020), where the robust two-stage shortest path problem has also been investigated. In Kasperski and Zieliński (2017) and Chassein et al. (2018) the robust two-stage selection problem has been explored. The problem is NP-hard for the discrete uncertainty representation, but it is polynomially solvable under a special case of polyhedral uncertainty set, called continuous budgeted uncertainty (see Chassein et al. 2018).
Robust two-stage problems belong to the class of three-level, min-max-min optimization problems. In mathematical programming, this approach is also called adjustable robustness (see, e.g., Ben-Tal et al. 2004; Yanıkoglu et al. 2019; Ardestani-Jaafari and Delage 2016). Namely, some variables must be determined before the realization of the uncertain parameters, while the remaining variables can be chosen after the realization. Several such models have been recently considered in combinatorial optimization, which can be represented as 0-1 programming problems. Among them there is the robust two-stage problem discussed in this paper, but also the robust recoverable models (Büsing 2011, 2012) and the K-adaptability approach (Buchheim and Kurtz 2017; Hanasusanto et al. 2015). In general, problems of this type can be hard to solve exactly. A standard approach is to apply row and column generation techniques, which consist in solving a sequence of MIP formulations (see, e.g., Zeng and Zhao (2013)). However, this method can be inefficient for larger problems, especially when the underlying deterministic problem is already NP-hard. Therefore, faster approximation algorithms can be useful in this case.
In this paper we consider the class of robust two-stage combinatorial problems under convex uncertainty, i.e. when the uncertainty set U modeling the uncertain second-stage costs is convex. Important special cases are polyhedral and ellipsoidal uncertainty, which are widely used in single-stage robust optimization. Notice that in the problems discussed in Dhamdhere et al. (2005), Flaxman et al. (2006), Katriel et al. (2008), Kasperski and Zieliński (2011), U contains a fixed number of scenarios, so it is not a convex set. The problem formulation and the description of the uncertainty sets are provided in Sect. 2. The complexity status of basic problems, in particular network optimization and selection problems, has been open to date. In Sect. 3 we show that all these basic problems are NP-hard, under both polyhedral and ellipsoidal uncertainty. In Sect. 4, we construct compact MIP formulations for a special class of robust two-stage combinatorial problems and show several of their properties. In Sect. 5, we propose an algorithm for the general problem, which returns an approximate solution with a guaranteed worst-case ratio. This algorithm does not run in polynomial time. However, it requires solving only one (possibly NP-hard) MIP formulation, while a compact MIP formulation for the general case is unknown. Finally, in Sects. 6, 7 and 8 we study the robust two-stage versions of three particular problems, namely the selection, the representatives selection and the shortest path problems. We show some additional negative and positive complexity results for them. A number of questions concerning the robust two-stage approach remain open. We state them in the last section.

Problem formulation
Consider the following generic combinatorial optimization problem P:

    min { c^T x : x ∈ X },

where c = [c_1, ..., c_n]^T is a vector of nonnegative costs and X ⊆ {0,1}^n is a set of feasible solutions. We will denote by [n] the set {1, ..., n}. In this paper we consider the general problem P, as well as the following special cases:

1. Let G = (V, A) be a given network, where c_i is the cost of arc a_i ∈ A. Set X contains the characteristic vectors of some objects in G, for example the simple s-t paths or the spanning trees. Hence P is the Shortest Path or the Spanning Tree problem, respectively. These basic network optimization problems are polynomially solvable, see, e.g., Ahuja et al. (1993), Papadimitriou and Steiglitz (1998).
2. Let E = {e_1, ..., e_n} be a set of items. Each item e_i ∈ E has a cost c_i, and we wish to choose exactly p items out of E to minimize the total cost. Set X contains the characteristic vectors of the feasible selections, i.e. X = {x ∈ {0,1}^n : Σ_{i∈[n]} x_i = p}. This is the Selection problem, whose robust single- and two-stage versions were discussed in Averbakh (2001), Conde (2004), Kasperski and Zieliński (2017), Chassein et al. (2018).
3. Let E = {e_1, ..., e_n} be a set of tools (items), partitioned into a family of disjoint sets T_l, l ∈ [ℓ]. Each tool e_i ∈ E has a cost c_i, and we wish to select exactly one tool from each subset T_l to minimize their total cost. Set X contains the characteristic vectors of the feasible selections, i.e. X = {x ∈ {0,1}^n : Σ_{i∈T_l} x_i = 1, l ∈ [ℓ]}. This is the Representatives Selection problem (RS for short), whose robust single-stage version was considered in Dolgui and Kovalev (2012), Deineko and Woeginger (2013), Kasperski et al. (2015).
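Both deterministic special cases above admit simple greedy solutions; the following sketch (function names are ours, for illustration only) computes them by sorting:

```python
# Deterministic Selection: choose exactly p of n items with minimum total cost.
# Deterministic RS: choose the cheapest tool in each part T_l.
# A minimal sketch, not from the paper.

def solve_selection(costs, p):
    """Return (optimal value, 0-1 characteristic vector x)."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    chosen = set(order[:p])                      # the p cheapest items
    x = [1 if i in chosen else 0 for i in range(len(costs))]
    return sum(costs[i] for i in chosen), x

def solve_rs(costs, parts):
    """parts is a list of disjoint index sets T_l; pick one cheapest tool per part."""
    return sum(min(costs[i] for i in part) for part in parts)

value, x = solve_selection([4, 1, 3, 2], p=2)    # value 3, x = [0, 1, 0, 1]
```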
Given a vector x ∈ {0,1}^n, let us define the following set of recourse actions:

    R(x) = {y ∈ {0,1}^n : x + y ∈ X},

and the set of partial solutions:

    X' = {x ∈ {0,1}^n : R(x) ≠ ∅}.

Observe that X ⊆ X' and X' contains all vectors which can be completed to a feasible solution in X. A partial solution x ∈ X' is completed in the second stage, i.e. we choose y ∈ R(x), which yields (x + y) ∈ X. The overall cost of the constructed solution is c^T x + c̃^T y for a fixed second-stage cost vector c̃ = [c̃_1, ..., c̃_n]^T. We assume that the vector of first-stage costs c is known, while the vector of second-stage costs is uncertain and belongs to a specified uncertainty (scenario) set U ⊂ R^n_+. In this paper, we discuss the following robust two-stage problem:

    RTSt:  min_{x∈X'} ( c^T x + max_{c̃∈U} min_{y∈R(x)} c̃^T y ).

The RTSt problem is a robust two-stage version of the problem P. It is worth pointing out that RTSt is a generalization of four problems, which we also examine in this paper. Namely, given x ∈ X' and c̃ ∈ U, we consider the following incremental problem:

    Inc(x, c̃) = min_{y∈R(x)} c̃^T y.

Given a scenario c̃ ∈ U, we study the following two-stage problem:

    TSt(c̃) = min_{x∈X'} ( c^T x + Inc(x, c̃) ).

Finally, given x ∈ X', we also consider the following evaluation problem:

    Eval(x) = c^T x + max_{c̃∈U} Inc(x, c̃).

A scenario c̃ which maximizes Inc(x, c̃) is called a worst scenario for x. The inner maximization problem max_{c̃∈U} min_{y∈R(x)} c̃^T y is called the adversarial problem. Notice that the robust two-stage problem can be equivalently represented as follows:

    RTSt:  min_{x∈X'} Eval(x).

Further notice that the two-stage problem is a special case of RTSt, in which U = {c̃} contains only one scenario. The following fact is exploited later in this paper:

Observation 1 Computing TSt(c̃) for a given c̃ ∈ U (solving the two-stage problem) has the same complexity as solving the underlying deterministic problem P.
Proof Given a scenario c̃ ∈ U, define the cost vector ĉ = [ĉ_1, ..., ĉ_n]^T with ĉ_i = min{c_i, c̃_i}, i ∈ [n], and let ẑ be an optimal solution to problem P for the costs ĉ. Consider the solution (x̂, ŷ) constructed as follows: x̂_i = 1 if ẑ_i = 1 and c_i ≤ c̃_i, and ŷ_i = 1 if ẑ_i = 1 and c_i > c̃_i (all remaining components are 0). It is easy to verify that (x̂, ŷ) is an optimal solution to the two-stage problem with the objective value TSt(c̃).
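The construction behind Observation 1 can be sketched in code. We take Selection as the underlying problem P (our choice for illustration; the argument works for any P), solve it once under the elementwise-minimum costs, and split the chosen items by cheaper stage:

```python
# Sketch of Observation 1 for Selection as the underlying problem P (an
# assumption for illustration): TSt(c_hat) is obtained by solving P once
# under the elementwise-minimum costs.

def two_stage_selection(c_first, c_hat, p):
    n = len(c_first)
    merged = [min(c_first[i], c_hat[i]) for i in range(n)]
    order = sorted(range(n), key=lambda i: merged[i])
    chosen = order[:p]              # optimal solution of P under merged costs
    x = [0] * n                     # first-stage part
    y = [0] * n                     # second-stage part
    for i in chosen:
        if c_first[i] <= c_hat[i]:
            x[i] = 1                # cheaper to buy item i in the first stage
        else:
            y[i] = 1                # cheaper under the scenario c_hat
    value = sum(c_first[i] * x[i] + c_hat[i] * y[i] for i in range(n))
    return value, x, y

value, x, y = two_stage_selection([5, 1, 4], [2, 3, 10], p=2)
```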
In this paper, we examine the following three types of convex uncertainty sets:

    U^HP = {c̃ = c̄ + δ : Aδ ≤ b, δ ≥ 0}  (an H-polytope),
    U^VP = conv{c̃_1, ..., c̃_K}  (a V-polytope),
    U^E  = {c̃ = c̄ + Aδ : ||δ||_2 ≤ 1}  (an ellipsoid),

where c̄ = [c̄_1, ..., c̄_n]^T is the vector of nominal second-stage costs. The vector δ and the matrix A represent deviations of the second-stage costs from their nominal values. Notice that δ ∈ R^n, A ∈ R^{m×n} in U^HP and δ ∈ R^m, A ∈ R^{n×m} in U^E; the dimensions of δ and A will be clear from the context. We assume that all the uncertainty sets are bounded. The sets U^HP and U^VP are two representations of polyhedral uncertainty. By the decomposition theorem (Schrijver 1998, Chapter 7.2), both representations are equivalent, i.e. a bounded U^HP can be represented as U^VP and vice versa. However, the corresponding transformations need not be polynomial. Thus the complexity results for one type of polytope do not carry over to the other, and we consider them separately. The set U^E represents ellipsoidal uncertainty, which is a common uncertainty representation in robust optimization (see, e.g., Ben-Tal et al. (2009)). We also study the following special cases of U^HP:

    U^HP_0 = {c̃ = c̄ + δ : 0 ≤ δ ≤ d, Σ_{i∈[n]} δ_i ≤ Γ},
    U^HP_1 = {c̃ = c̄ + δ : 0 ≤ δ ≤ d, Σ_{i∈U_j} δ_i ≤ Γ_j, j ∈ [K]}.

The set U^HP_0 is called continuous budgeted uncertainty (Nasrabadi and Orlin 2013; Chassein et al. 2018) and can be considered a variant of the classic set defined in Bertsimas and Sim (2004a). It bounds the total amount of deviation instead of limiting the number of second-stage costs that can deviate. However, the computational properties of RTSt with U^HP_0 and with the budgeted sets from Bertsimas and Sim (2004a) can be different (see, e.g., Pessoa et al. 2021; Bougeret et al. 2020; Tadayon and Smith 2015). In the set U^HP_1 we have K budget constraints defined for some (not necessarily disjoint) subsets U_1, ..., U_K ⊆ [n].
The uncertainty set U^HP is a general polyhedron and thus generalizes more specialized cases, such as the locally budgeted uncertainty sets (Goerigk and Lendl 2020) or the knapsack uncertainty sets discussed in Poss (2018), Pessoa et al. (2021), where the coefficients of A need to be nonnegative (note that for nonnegative decision variables, constraints of this type suffice to define the relevant down-monotone closure of the uncertainty set).

General hardness results
The robust two-stage problem is not easier than the underlying deterministic problem P. So, it is interesting to characterize the complexity of RTSt when P is polynomially solvable. In this section we focus on a core problem, which is a special case of all the particular problems studied in Sect. 2. We will show that it is NP-hard under U^VP, U^HP and U^E. Hence we get hardness results for all the particular problems. Consider the following set of feasible solutions:

    X_1 = {[1, ..., 1]^T} = {1},

i.e. X_1 contains only the vector of ones. We have X'_1 = {x ∈ {0,1}^n : x_1 + ... + x_n ≤ n} = {0,1}^n, and R(x) = {1 − x} contains only one solution, as there is only one recourse action for each x ∈ X'_1. Hence, the robust two-stage version of the problem with X_1, denoted RTSt_1, can be rewritten as follows:

    min_{x∈{0,1}^n} ( c^T x + max_{c̃∈U} c̃^T (1 − x) ).

The following result is known:

Theorem 1 (Kasperski and Zieliński 2017; Goerigk et al. 2020) The RTSt_1 problem with U = {c̃_1, c̃_2} ⊂ R^n_+ is NP-hard. Furthermore, if U = {c̃_1, ..., c̃_K} ⊂ R^n_+ and K is part of the input, then RTSt_1 is strongly NP-hard.
We use Theorem 1 to prove the next complexity results. First observe that the value of max_{c̃∈U} c̃^T (1 − x) does not change if we replace a convex set U by the set of its extreme points. This means that by constructing a convex uncertainty set with the vectors c̃_1, ..., c̃_K as its extreme points, we can carry over the hardness results. This is straightforward for uncertainty sets of type U^VP. Hence, we immediately get the following corollary:

Corollary 1
The RTSt_1 problem with uncertainty set U^VP is NP-hard when K = 2 and strongly NP-hard when K is part of the input.
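The observation above means that, for U^VP = conv{c̃_1, ..., c̃_K}, evaluating the adversarial value of a fixed x in RTSt_1 reduces to scanning the K generating scenarios. A minimal sketch (the instance is made up for illustration):

```python
# The adversarial value max_{c~ in U} c~^T (1 - x) over a convex U is attained
# at an extreme point, so for U^VP = conv{c1, ..., cK} it suffices to scan the
# K listed scenarios.

def rtst1_objective(c_first, x, scenarios):
    """c^T x + max over listed scenarios of c~^T (1 - x)."""
    first = sum(ci * xi for ci, xi in zip(c_first, x))
    worst = max(sum(s[i] * (1 - x[i]) for i in range(len(x))) for s in scenarios)
    return first + worst

obj = rtst1_objective([1, 1], [1, 0], [[0, 3], [2, 2]])  # 1 + max{3, 2} = 4
```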
For the uncertainty sets U^HP and U^E, proofs along the same idea require slight technical additions.

Theorem 2 The RTSt_1 problem with uncertainty set U^HP is strongly NP-hard.

Proof Given an instance I = (n, c, U = {c̃_1, ..., c̃_K}) of RTSt_1, we construct an instance I_1 with K additional variables x_{n+1}, ..., x_{n+K}, whose second-stage costs encode the convex combinations Σ_{j∈[K]} λ_j c̃_j of the scenarios. Since the first-stage costs of the variables x_{n+1}, ..., x_{n+K} are 0, we can fix x_{n+1} = ... = x_{n+K} = 1 in every optimal solution to the instance I_1. The problem then reduces to RTSt_1 with the scenario set conv{c̃_1, ..., c̃_K}. Consequently, the problem with instance I_1 is equivalent to the strongly NP-hard problem with the instance I.
Note that the reduction in the proof of Theorem 2 constructs an uncertainty set U^HP with a non-constant number of constraints. We will show in Sect. 4 that if the number of constraints in the description of U^HP (except for the nonnegativity constraints) is constant, then the problem RTSt_1 is polynomially solvable.

Theorem 3
The RTSt_1 problem with uncertainty set U^E is NP-hard.
Proof Given an instance I = (n, c, U = {c̃_1, c̃_2}) of RTSt_1, define c̄ = c̃_1 + c̃_2, A = c̃_1 − c̃_2 and y = 1 − x. We use the following equality (see Bertsimas and Sim 2004b):

    max_{c̃∈U^E} c̃^T y = c̄^T y + sqrt(y^T A A^T y) = c̄^T y + |(c̃_1 − c̃_2)^T y| = 2 · max{c̃_1^T y, c̃_2^T y}.

In consequence, the NP-hard problem with the instance I is equivalent to RTSt_1 with the first-stage costs 2c and the ellipsoidal uncertainty set U^E = {c̄ + Aδ : ||δ||_2 ≤ 1}, where A is an n × 1 matrix and δ is a 1-dimensional vector (intuitively, we map the unit sphere to the line segment joining 2c̃_1 and 2c̃_2).

Theorem 4 The robust two-stage versions of the Shortest Path, Spanning Tree, Selection and RS problems are NP-hard under the uncertainty sets U^VP, U^HP and U^E.

Proof It is easy to see that RTSt_1 is a special case of the RTSt Selection problem, with p = n, and of the RTSt RS problem, with n singleton sets T_i = {e_i}, i ∈ [n]. To see that it is also a special case of the basic network problems, consider the (chain) network G = (V, A) shown in Fig. 1. This network contains exactly one s-t path and one spanning tree. So the problem is only to decide, for each arc, whether to choose it in the first or in the second stage, which is equivalent to solving RTSt_1. In Sect. 8 we will show that the hardness result of Theorem 4 can be strengthened for the two-stage version of the Shortest Path problem.

Compact formulations
In this section we construct compact formulations for a special class of problems under the uncertainty sets U^HP and U^E. We will assume that

    X = {x ∈ {0,1}^n : Hx ≥ g}   (5)

and that the polyhedron

    N = {x ∈ R^n : Hx ≥ g, 0 ≤ x ≤ 1}   (6)

is integral, i.e. all its vertices are 0-1 vectors. Important examples where the set of feasible solutions is described by N are the shortest path and the selection problems discussed in Sect. 2. We can also use the constraints Hx = g to describe X, and the further reasoning will be the same. We can rewrite the inner adversarial problem (notice that x ∈ {0,1}^n is fixed) as follows:

    max_{c̃∈U} min_{y∈R(x)} c̃^T y = max_{c̃∈U} min { c̃^T y : H(x + y) ≥ g, 0 ≤ y ≤ 1 − x },   (7)

where the last equality follows from the integrality assumptions and the fact that x is a fixed binary vector (note that N is integral if and only if {y ∈ R^n : H(x + y) ≥ g, 0 ≤ x + y ≤ 1} is integral for all binary x, see Lemma 6).
Since U and {y : H(y + x) ≥ g, 0 ≤ y ≤ 1 − x} are convex (compact) sets and c̃^T y is a concave-convex function, by the minimax theorem (von Neumann 1928) we can exchange max and min in the adversarial problem (a similar technique to cope with adjustable integer variables was used in Ardestani-Jaafari and Delage (2016)). As a result, the RTSt problem can be rewritten as:

    min c^T x + max_{c̃∈U} c̃^T y
    s.t. H(x + y) ≥ g
         x + y ≤ 1
         x ∈ {0,1}^n, y ≥ 0   (8)

If U = U^HP, then we can dualize the inner maximization problem in (8). As the result we get the following compact MIP formulation for RTSt under U^HP:

    min c^T x + c̄^T y + b^T u
    s.t. H(x + y) ≥ g
         x + y ≤ 1
         u^T A ≥ y^T
         x ∈ {0,1}^n, y, u ≥ 0   (9)

Observation 2 The integrality gap of (9) is at least Ω(n) for the RTSt Shortest Path problem under the uncertainty set U^HP_0.
Proof Consider the instance of RTSt Shortest Path shown in Fig. 2. Set X contains the characteristic vectors of the simple paths from s to t in a graph G composed of the arcs (s, i) and (i, t), i ∈ [m]. Notice that m = n/2, where n is the number of arcs in G (variables) and m is the number of s-t paths in G (solutions). It is easy to see that the optimal objective value of (9) equals m. In the relaxation of (9) (see also the relaxation of (8)) we can fix x_{si} = 1/m, y_{si} = 0, x_{it} = 0 and y_{it} = 1/m for each i ∈ [m]. The cost of this solution is 1, which gives the integrality gap of m = Ω(n).
Problem (9) can be solved in polynomial time for RTSt Selection under U^HP_0 (Chassein et al. 2018). In Sect. 7 we will show that the same result holds for RTSt RS under U^HP_0. On the other hand, (9) is strongly NP-hard for an arbitrary U^HP, when the constraint H(y + x) ≥ g becomes y_1 + ... + y_n + x_1 + ... + x_n = n, i.e. when (9) models the RTSt_1 problem (see Sect. 3). We now show that RTSt_1 is polynomially solvable when there is only a constant number of constraints in U^HP, except for the nonnegativity constraints (note that the hardness result in Sect. 3 requires an unbounded number of constraints).

Theorem 5 The RTSt_1 problem can be solved in polynomial time if the matrix A in U^HP has a constant number of rows.
Proof The basic idea of the proof is to enumerate all possible bases of the dual of the adversarial problem. As the number of rows of A is constant, it is then possible to check each basis separately in polynomial time (compare also to (Poss 2018, Theorem 3)).
To this end, consider the formulation (9) for RTSt_1 with u = (u_1, ..., u_m) for a constant m. Let us assume that x and y = 1 − x are fixed. The remaining optimization problem can be rewritten as the following linear program with n additional slack variables s:

    min { b^T u : A^T u − s = y, u, s ≥ 0 },

whose constraint matrix [A^T | −I_n] contains the n × n identity matrix I_n. Since U^HP is nonempty and bounded, there is an optimal n × n basis matrix B for this problem, corresponding to basic variables u_B and s_B, so that (u_B, s_B) = B^{-1} y ≥ 0. We will use the fact that the matrix B^{-1} has a special structure: by reordering the constraints and variables, we can assume that its first rows correspond to u_B and the remaining rows to s_B. Notice that y_i ∈ {0,1} for each i ∈ [n], because x_i ∈ {0,1} and x_i + y_i = 1 for all i ∈ [n]. Let us write y = (ỹ, ŷ), where ỹ denotes the first m components of y, and ŷ denotes the remaining n − m components. Using this notation, for fixed values of ỹ the constraint B^{-1} y ≥ 0 results in upper bounds on ŷ. The remaining optimization problem is hence trivial to solve by packing the cheaper version of each item, where this is possible with respect to these upper bounds.
There are C(n + m, n) = O((m + n)^m) candidates for the choice of a basis, and for each candidate we enumerate O(2^m) values of the ỹ-variables involved. For a fixed m, the resulting complexity is thus polynomial in the input size.
Let us now focus on ellipsoidal uncertainty. If U = U^E, then (7) can be rewritten as

    min { c̄^T y + ||A^T y||_2 : H(y + x) ≥ g, 0 ≤ y ≤ 1 − x }.
Consequently, we get the following compact program for RTSt under U^E:

    min c^T x + c̄^T y + ||A^T y||_2
    s.t. H(x + y) ≥ g
         x + y ≤ 1
         x, y ∈ {0,1}^n   (11)

Problem (11) is a quadratic 0-1 optimization problem, which can be difficult to solve. In Sect. 5 we will propose some methods of computing approximate solutions to (11).

Observation 3 The integrality gap of (11) is at least Ω(n) for the RTSt Shortest Path problem under the uncertainty set U^E.

The reasoning is the same as in the proof of Observation 2.

Computing approximate solutions
A compact formulation for the general RTSt problem is unknown. Therefore, solving the problem requires special row and column generation techniques (see, e.g., Zeng and Zhao (2013)). As this method may involve solving many hard MIP formulations, it can be inefficient for large problems. In this section we propose algorithms that return solutions with a guaranteed distance to the optimum. We will discuss the general case as well as cases that can be modeled as the min-max problem (8).

General approximation results
Let X be expressed as (5), but now no assumptions on the polyhedron N (see (6)) are imposed. So, the underlying deterministic problem can be NP-hard and also hard to approximate. Consider the following min-max problem:

    min_{(x,y)∈Z} max_{c̃∈U} ( c^T x + c̃^T y ),   (12)

where Z = {(x, y) ∈ {0,1}^n × {0,1}^n : x + y ∈ X}. By interchanging the min-max operators in (12) and relaxing the integrality constraints, we get a lower bound LB on the optimal objective value OPT of the RTSt problem:

    LB ≤ OPT.   (13)

Let (x*, y*) ∈ Z be an optimal solution to the min-max problem (12), with the objective value UB. Then

    Eval(x*) ≤ c^T x* + max_{c̃∈U} c̃^T y* = UB.   (14)

We thus get OPT ≤ Eval(x*) ≤ UB ≤ ρ · LB ≤ ρ · OPT, where ρ = UB/LB, and x* ∈ X' is a ρ-approximate first-stage solution to RTSt, i.e. a solution whose value Eval(x*) is within a factor of ρ of the value of an optimal solution to RTSt.
For the uncertainty sets U^HP and U^E the value of LB can be computed in polynomial time by solving convex optimization problems, and for U^VP by solving an LP problem.
On the other hand, the upper bound and approximate solution x x x * can be computed by solving a compact 0-1 problem (after dualizing the inner maximization problem in (12)). In the next part of this section we will show a special case of the problem for which x x x * can be computed in polynomial time.
We now consider polyhedral uncertainty. Using duality, the min-max problem (12) under U^HP can be represented as the following MIP formulation:

    min c^T x + c̄^T y + b^T u
    s.t. x + y ∈ X
         u^T A ≥ y^T
         x, y ∈ {0,1}^n, u ≥ 0   (15)

The relaxation of (15), used to compute LB, is an LP problem, so it can be solved in polynomial time. The problem (15) itself can be more complex. However, it can be easier to solve than the original robust two-stage problem. Using (13) and (14), we get the following theorem:

Theorem 6 Let x* be an optimal solution to (15). Then x* is a ρ-approximate first-stage solution to the RTSt problem, where ρ is the integrality gap of (15).
We now describe a case in which x* can be computed in polynomial time, which yields a ρ-approximation algorithm for the robust two-stage problem. Namely, we consider the continuous budgeted uncertainty U^HP_0. Fix y and consider the following problem:

    max_{c̃∈U^HP_0} c̃^T y.
This problem can be solved by observing that either the whole budget Γ is allocated to y, or the allocation is blocked by the upper bounds on the deviations. So, for a binary y,

    max_{c̃∈U^HP_0} c̃^T y = c̄^T y + min{Γ, d^T y} = min{ c̄^T y + Γ, (c̄ + d)^T y }.

In consequence, the min-max problem reduces to solving two two-stage problems, one with the scenario c̄ (plus the constant Γ) and one with the scenario c̄ + d, which can be done in polynomial time if the underlying problem P is polynomially solvable (see Observation 1). So, in this case a ρ-approximate solution x* can be computed in polynomial time.
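The case analysis above can be sketched as follows (our notation: nominal costs c̄ are `cbar`, deviation bounds `d`, budget `gamma`; for binary y both expressions coincide):

```python
# Under continuous budgeted uncertainty U^HP_0, for binary y the adversary
# either spends the whole budget gamma on y or is blocked by the bounds d:
#   max_{c~ in U^HP_0} c~^T y = cbar^T y + min{gamma, d^T y}
#                             = min{cbar^T y + gamma, (cbar + d)^T y}.
# A sketch for illustration only.

def adversarial_value_budgeted(cbar, d, gamma, y):
    base = sum(cb * yi for cb, yi in zip(cbar, y))
    dev = sum(di * yi for di, yi in zip(d, y))
    return base + min(gamma, dev)

def adversarial_value_two_scenarios(cbar, d, gamma, y):
    s1 = sum(cb * yi for cb, yi in zip(cbar, y)) + gamma        # scenario cbar + budget
    s2 = sum((cb + di) * yi for cb, di, yi in zip(cbar, d, y))  # scenario cbar + d
    return min(s1, s2)
```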

Approximating the problems with the integrality property
In this section we propose some methods of constructing approximate solutions to the RTSt problem when the polyhedron N (see (6)) satisfies the integrality property. Recall that in this case we can represent RTSt as the min-max formulation (8), so from now on we explore the approximability of (8). Let ĉ ∈ U be any fixed scenario. The two-stage problem (see Sect. 2) with the scenario ĉ in the second stage can then be formulated as follows:

    min c^T x + ĉ^T y
    s.t. H(x + y) ≥ g
         x + y ≤ 1
         x, y ∈ {0,1}^n   (16)
Using Observation 1, we can solve (16) in polynomial time by solving one underlying deterministic problem P. We now show how to obtain an approximate solution to (8) by solving (16) for an appropriately chosen scenario ĉ. Let (x̂, ŷ) ∈ Z be an optimal solution to (16).

Lemma 1 If c̃_i ≤ t·ĉ_i for each i ∈ [n] (c̃ ≤ tĉ for short) and each c̃ ∈ U, then (x̂, ŷ) is a t-approximate solution to (8).
Proof Let (x*, y*) be an optimal solution to (8). We then have

    max_{c̃∈U} ( c^T x̂ + c̃^T ŷ ) ≤ t ( c^T x̂ + ĉ^T ŷ ) ≤ t ( c^T x* + ĉ^T y* ) ≤ t ( c^T x* + max_{c̃∈U} c̃^T y* ).

The first inequality follows from the assumption that c̃ ≤ tĉ for all c̃ ∈ U and t ≥ 1. The second inequality holds because (x̂, ŷ) is an optimal solution to (16), and this optimal solution does not change when we relax y ∈ {0,1}^n to 0 ≤ y ≤ 1 in (16), due to the integrality property assumed.
Accordingly, we can compute the best guarantee t by solving the following convex optimization problem:

    min_{ĉ∈U} max_{i∈[n]} ( max_{c̃∈U} c̃_i ) / ĉ_i,

where the values max_{c̃∈U} c̃_i, i ∈ [n], have to be precomputed by solving n additional convex problems.

Polyhedral uncertainty
The next two theorems are consequences of Lemma 1.

Theorem 7 Problem (8)
Proof Consider the formulation (18), with variables x ∈ {0,1}^n and y, u ≥ 0. Since y_i ∈ [0,1] for each i ∈ [n], we can assume u_j ∈ [0,1] for each j ∈ [K]. Let us fix ε = 1/t for some integer t ≥ 1, and consider the set of numbers E = {0, ε, 2ε, 3ε, ..., 1}. Fix a vector (u_1, ..., u_K), where u_j ∈ E. The problem (18) then reduces to (17), where d_i = min{ Σ_{j∈[K]: i∈U_j} u_j, 1 }, i ∈ [n]. Let us enumerate all (1/ε + 1)^K vectors u with components u_j ∈ E, j ∈ [K], and solve (17) for each such vector. Assume that (x̂, ŷ, û) is the enumerated solution having the minimum objective value in (18) (notice that (x̂, ŷ, û) is feasible to (18)). Let (x*, y*, u*) be an optimal solution to (18), and round up the components of u* to the nearest values in E. As a result we get a feasible solution with a cost at most (1 + ε) times the optimum. Furthermore, the cost of this solution is not greater than the cost of (x̂, ŷ, û), because the rounded vector u* has been enumerated. Under the assumption that K is constant and (17) can be solved in polynomial time, we get an FPTAS for (8) under U^HP_1.
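The enumeration step of the proof can be sketched as follows; `grid_vectors` is a hypothetical helper (ours, not from the paper) that lists all (1/ε + 1)^K candidate vectors u, for each of which the subproblem (17) would then be solved:

```python
# Enumerate all vectors u in E^K with E = {0, eps, 2*eps, ..., 1} for eps = 1/t.
# A sketch of the grid enumeration behind the FPTAS; for a constant K the
# number of candidates, (t + 1)^K, is polynomial in 1/eps.
import itertools

def grid_vectors(t, K):
    """All u in E^K for eps = 1/t, i.e. (t + 1)^K candidates."""
    E = [j / t for j in range(t + 1)]
    return list(itertools.product(E, repeat=K))

candidates = grid_vectors(t=2, K=2)   # eps = 1/2 -> 9 candidate vectors
```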
We will show how to solve (17) for particular problems in Sects. 6, 7 and 8.

Ellipsoidal uncertainty
In this section we focus on constructing approximate solutions to (11), which is a compact formulation of (8) under the ellipsoidal uncertainty U^E. As (11) is a quadratic 0-1 problem, it can be hard to solve. Consider the following linearization of (11):

    min c^T x + c̄^T y + ||A^T y||_1
    s.t. H(x + y) ≥ g
         x + y ≤ 1
         x, y ∈ {0,1}^n   (20)

Theorem 10 Let (x̂, ŷ) be an optimal solution to (20) and (x*, y*) be an optimal solution to (11). Then

    c^T x̂ + c̄^T ŷ + ||A^T ŷ||_2 ≤ √n · ( c^T x* + c̄^T y* + ||A^T y*||_2 ).
Proof We use the following well-known inequalities:

    (1/√n) · ||A^T y||_1 ≤ ||A^T y||_2 ≤ ||A^T y||_1.

Using them, we get

    c^T x̂ + c̄^T ŷ + ||A^T ŷ||_2 ≤ c^T x̂ + c̄^T ŷ + ||A^T ŷ||_1 ≤ c^T x* + c̄^T y* + ||A^T y*||_1 ≤ c^T x* + c̄^T y* + √n · ||A^T y*||_2 ≤ √n · ( c^T x* + c̄^T y* + ||A^T y*||_2 ),

and the theorem follows.
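The norm inequalities used in the proof are easy to verify numerically (a sketch on a made-up vector v standing in for A^T y):

```python
# Numeric check of ||v||_1 / sqrt(n) <= ||v||_2 <= ||v||_1 for v in R^n.
import math

def norms(v):
    n1 = sum(abs(vi) for vi in v)                  # 1-norm
    n2 = math.sqrt(sum(vi * vi for vi in v))       # 2-norm
    return n1, n2

v = [3.0, -4.0, 12.0]
n1, n2 = norms(v)
# n1 = 19.0, n2 = 13.0, and 19/sqrt(3) <= 13 <= 19 holds
```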

Observation 4 Problem (20) is NP-hard.

Proof It follows directly from the proof of Theorem 3: it is easy to see that ||A^T y||_2 = ||A^T y||_1 for the matrix A constructed in that proof.

Problem (22) is a two-stage problem with one second-stage scenario ĉ, so it is polynomially solvable according to Observation 1, if problem P is solvable in polynomial time. Notice that relaxing y ∈ {0,1}^n to y ≥ 0 does not change an optimal solution to (22), due to the integrality assumption. Now Theorem 10 implies the result.

Robust two-stage selection problem
In this section we investigate in more detail the robust two-stage version of the Selection problem. In Sect. 3 we proved that this problem is NP-hard. Let us also recall that RTSt Selection is polynomially solvable under U^HP_0 (Chassein et al. 2018). We will show that the problem is approximable within 2 for a wide class of uncertainty sets. We will make the following assumption:

Assumption 1 The problem of computing arg max_{c̃∈U} c̃^T y, for any fixed y ∈ [0,1]^n, can be solved in polynomial time.
Assumption 1 is satisfied for all convex uncertainty sets considered in this paper. We prove the following result:

Theorem 12 If Assumption 1 is satisfied, then RTSt Selection with convex uncertainty is approximable within 2.
Proof Using the results from Sect. 4 (see (8)), the RTSt Selection problem can be represented as the following program:

    min c^T x + t   (23)
    s.t. t ≥ c̃^T y,  c̃ ∈ U   (24)
         Σ_{i∈[n]} (x_i + y_i) = p   (25)
         x + y ≤ 1   (26)
         x ∈ {0,1}^n   (27)
         y ≥ 0   (28)

Consider the following linear programming relaxation of (23)-(28):

    min c^T x + t   (29)
    s.t. t ≥ c̃^T y,  c̃ ∈ U   (30)
         Σ_{i∈[n]} (x_i + y_i) = p   (31)
         x + y ≤ 1   (32)
         x ∈ [0,1]^n   (33)
         y ≥ 0   (34)

The set of feasible solutions to the relaxation is convex. Hence, one can solve (29)-(34) in polynomial time by the ellipsoid algorithm, after providing a separation oracle for this set (see, e.g., Grötschel et al. (1993)). We have to check if (x, y, t) ∈ R^{2n+1} satisfies (30)-(34) and, if not, a violated constraint must be indicated. This is trivial for the constraints (31)-(34). For the infinite family of constraints (30), Assumption 1 can be used. Namely, we find in polynomial time c* = arg max_{c̃∈U} c̃^T y. If t < (c*)^T y, then a violated constraint is identified. Otherwise, all the constraints (30) are satisfied.

Assume w.l.o.g. that c_1 ≤ c_2 ≤ ... ≤ c_n. Let (x*, y*, t*) be an optimal solution to the relaxation (29)-(34). We first note that, given y*, the optimal values of x* can be obtained in the following greedy way. Set p* := p − Σ_{i∈[n]} y*_i. For i = 1, ..., n, assign x*_i := min{p*, 1 − y*_i} and update p* := p* − x*_i. Let ℓ ∈ [n] be such that x*_i > 0 for every i ≤ ℓ and x*_i = 0 for every i > ℓ. It is easily seen that x*_i + y*_i = 1 for all i ∈ [ℓ − 1]. Therefore, the quantity p − Σ_{i∈[ℓ−1]} (x*_i + y*_i) = x*_ℓ + Σ_{i≥ℓ} y*_i must be integral. Notice also that 0 < x*_ℓ + y*_ℓ < 1 may happen. We now construct a solution (x̂, ŷ, t̂) that is feasible to (23)-(28). Fix t̂ = 2t*. For each i ∈ [n] we set x̂_i = 1 if x*_i ≥ 1/2, and x̂_i = 0 otherwise. We further set ŷ_i = 1 − x̂_i for i = 1, ..., ℓ − 1, ŷ_ℓ = 0 if x̂_ℓ = 1, and ŷ_ℓ = min{2y*_ℓ, 1} otherwise. For all i > ℓ, we set ŷ_i = min{2y*_i, 1}.
It holds that

    Σ_{i∈[n]} (x̂_i + ŷ_i) = ℓ − 1 + x̂_ℓ + ŷ_ℓ + Σ_{i>ℓ} ŷ_i.

To show that this is at least p = ℓ − 1 + x*_ℓ + y*_ℓ + Σ_{i>ℓ} y*_i, we need to show that Σ_{i≥ℓ} min{y*_i, 1/2} ≥ x*_ℓ in the case x̂_ℓ = 0. This is true since x*_ℓ < 1/2 in this case, and Σ_{i≥ℓ} y*_i + x*_ℓ is an integer that is at least 1. By the construction, ŷ ≤ 2y*. Hence c̃^T ŷ ≤ 2 c̃^T y* ≤ 2t* = t̂ for each c̃ ∈ U, so the constraints (24) are satisfied. Notice that Σ_{i∈[n]} (x̂_i + ŷ_i) can be greater than p. However, it is easy to reduce this sum to p without increasing the objective value or violating (24), by decreasing the values of x̂_i and ŷ_i, i ∈ [n]. Because x̂ ≤ 2x*, we get c^T x̂ ≤ 2 c^T x*, which together with t̂ = 2t* completes the proof.
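The rounding step of the proof can be sketched as follows (a sketch with our function name; indices are 0-based in the code, while `ell` is the 1-based critical index ℓ from the proof, and the inputs are assumed to come from the relaxation):

```python
# Rounding from the proof of Theorem 12: given a fractional optimum
# (x*, y*, t*) of the relaxation (items sorted by first-stage cost), build an
# integral first stage x^ and a second-stage vector y^ at most doubled in cost.

def round_selection(x_star, y_star, t_star, ell):
    n = len(x_star)
    x_hat = [1 if x_star[i] >= 0.5 else 0 for i in range(n)]
    y_hat = [0.0] * n
    for i in range(ell - 1):          # indices i < ell: complete to x_i + y_i = 1
        y_hat[i] = 1 - x_hat[i]
    i = ell - 1                       # the critical index ell (0-based)
    y_hat[i] = 0.0 if x_hat[i] == 1 else min(2 * y_star[i], 1.0)
    for i in range(ell, n):           # indices i > ell: double and cap at 1
        y_hat[i] = min(2 * y_star[i], 1.0)
    return x_hat, y_hat, 2 * t_star
```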

Theorem 13
The approximation guarantee of the rounding algorithm presented in the proof of Theorem 12 is tight, even if p = n and U = U HP has a single constraint.
Proof Consider the following problem instance with p = n = 2 and first-stage costs C = [10, γ]ᵀ, where μ > 0 and γ, ε > 0 are small values. An optimal solution to this problem is to set y_1 = x_2 = 1, with objective function value 1 + γ. An optimal solution to the relaxation is y_1 = 1, x_2 = 1/2 − μ and y_2 = 1/2 + μ. Applying the algorithm from the proof of Theorem 12, we round x_2 to 0 and y_2 to 1. The objective value of this solution is 2/(1 + 2μ) + ε. As μ, γ, ε approach 0, the ratio of the objective value of the approximate solution to the optimal objective value approaches 2.
Let us now investigate the problem with uncertainty set U^HP. The compact MIP formulation (9) for this case takes the following form:

min Cᵀx + bᵀu (35)
s.t. Σ_{i∈[n]} (x_i + y_i) = p (36)
x + y ≤ 1 (37)
uᵀA ≥ yᵀ (38)
x ∈ {0, 1}^n (39)
y, u ≥ 0 (40)

where u is the vector of dual variables of the inner problem max_{c∈U^HP} cᵀy.

Theorem 14 The integrality gap of problem (35)-(40) is at least 4/3.
Proof Consider the problem with n = p = 2, first-stage costs C = [10, 1]ᵀ and an uncertainty set U^HP described by a single constraint. An optimal solution to this problem is y_1 = x_2 = 1 with objective value 2, while an optimal solution to the LP relaxation is y_1 = u = 1 and x_2 = y_2 = 1/2 with cost 3/2. Hence the integrality gap is at least 2/(3/2) = 4/3.
Notice that there is still a gap between the 2-approximation algorithm and the integrality gap of 4/3 of the LP relaxation of (35)-(40). Closing this gap is an interesting open problem. We now show a positive approximation result for the uncertainty set U^HP_1, a special case of U^HP.

Theorem 15 If the number of budget constraints in U^HP_1 is constant, then RTSt Selection with U^HP_1 admits an FPTAS.

Proof Using Theorem 9, it is enough to show that the following problem (41) is polynomially solvable, where d_i ∈ E = {0, ε, 2ε, . . . , 1}, i ∈ [n]. We will first show the following property of (41):

Fig. 3 Illustration of the dynamic algorithm
Property 1 There is an optimal solution to (41) in which y i ∈ E for each i ∈ [n].
Proof Let (x, y) be an optimal solution to (41). Since Σ_{i∈[n]} (x_i + y_i) = p, the quantity Σ_{i∈[n]} y_i = p − Σ_{i∈[n]} x_i must be integral. Let us sort the variables so that c_1 ≤ c_2 ≤ ··· ≤ c_n. Let ℓ be the first index such that y_ℓ ∉ E. Notice that 0 < y_ℓ < d_ℓ. We get Σ_{i∈[ℓ−1]} y_i = kε for some integer k ≥ 0. Hence kε + y_ℓ cannot be integral and Σ_{j>ℓ} y_j > 0. Set y_ℓ := min{y_ℓ + Σ_{j>ℓ} y_j, d_ℓ} and decrease the values of an appropriate number of y_j, j > ℓ, so that Σ_{i∈[n]} (x_i + y_i) = p still holds. If y_ℓ = d_ℓ, then we are done, as d_ℓ ∈ E. If y_ℓ absorbs the whole remaining amount Σ_{j>ℓ} y_j, then kε + y_ℓ = p − Σ_{i∈[n]} x_i is integral and thus y_ℓ ∈ E. Observe that this transformation does not destroy the feasibility of the solution. Furthermore, it does not increase the solution cost, since the amounts are moved from more expensive to cheaper variables. After applying it a finite number of times, we get an optimal solution satisfying the property.
Property 1 allows us to solve (41) by a dynamic programming approach. Indeed, using the fact that x_i ∈ {0, 1} and y_i ∈ E for every i ∈ [n], at each stage i ∈ [n] we have to fix the pair (x_i, y_i), where the feasible assignments are (0, ε), (0, 2ε), . . . , (0, d_i), and (1, 0). A fragment of the computations is shown in Fig. 3. With each arc we can associate a cost c_i x_i + c_i y_i. Notice that sometimes there may exist two feasible pairs between two states (see the transition from state (s, 1) in Fig. 3). In this case, we choose the assignment with the smaller cost.
The running time of the dynamic algorithm is O(np^2 (1/ε)^2), so it is polynomial when ε > 0 is fixed. By Theorem 9, the overall running time of the FPTAS is O(np^2 (1/ε)^{K+2}).
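The stage-wise computation can be sketched in Python. This is an illustrative sketch under two assumptions not present in the original: the objective of (41) is abstracted into a user-supplied cost(i, x_i, y_i) function, and the committed amount Σ_j (x_j + y_j) is tracked in integer units of ε:

```python
from fractions import Fraction

def solve_inner_dp(d, p, eps, cost):
    """DP sketch for problem (41): process items in stages and fix the
    pair (x_i, y_i) at stage i, with x_i in {0, 1} and y_i a multiple
    of eps in [0, d_i] (Property 1).  The state is the total amount
    sum_j (x_j + y_j) committed so far, counted in units of eps."""
    unit = Fraction(eps)                 # exact arithmetic, no drift
    one = int(1 / unit)                  # the value 1 in units of eps
    target = p * one                     # the value p in units of eps
    dp = {0: 0.0}                        # state -> cheapest cost so far
    for i in range(len(d)):
        nxt = {}
        for s, val in dp.items():
            # x_i = 0 and y_i = k * eps for k = 0, 1, ..., d_i / eps
            for k in range(int(Fraction(d[i]) / unit) + 1):
                s2 = s + k
                if s2 <= target:
                    c2 = val + cost(i, 0, float(k * unit))
                    if c2 < nxt.get(s2, float("inf")):
                        nxt[s2] = c2
            # x_i = 1 and y_i = 0 contributes one full unit
            s2 = s + one
            if s2 <= target:
                c2 = val + cost(i, 1, 0.0)
                if c2 < nxt.get(s2, float("inf")):
                    nxt[s2] = c2
        dp = nxt
    return dp.get(target, float("inf"))
```

The state space has O(p/ε) entries and each item contributes O(1/ε) transitions, matching the O(np^2 (1/ε)^2) bound when the per-item cost is evaluated in constant time.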

Robust two-stage RS problem
In this section we investigate in more detail the robust two-stage version of the RS problem. In Sect. 3 we proved that this problem is NP-hard. First observe that for each set T_l, l ∈ [ℓ], we have to decide whether to choose a tool in the first or in the second stage. In the former case we always choose the cheapest tool. Hence the problem can be simplified, and the MIP formulation (8) takes the following form:

min ĉᵀx + t (42)
s.t. t ≥ cᵀy for each c ∈ U (43)
x_l + Σ_{j∈T_l} y_j = 1, l ∈ [ℓ] (44)
x ∈ {0, 1}^ℓ (45)
y ≥ 0 (46)

In the above formulation, x is a vector of binary variables corresponding to the tool sets T_1, . . . , T_ℓ, and ĉ = [ĉ_1, . . . , ĉ_ℓ]ᵀ, where ĉ_l, l ∈ [ℓ], is the smallest first-stage cost of the tools in T_l, i.e. ĉ_l = min_{j∈T_l} C_j. Note also that there are no constraints x_l + y_j ≤ 1, l ∈ [ℓ], j ∈ T_l, as they are implied by (44). The resulting problem can be 2-approximated by a simple rounding algorithm (Shmoys and Swamy 2006). For the sake of completeness, the proof is presented here.

Theorem 16 If Assumption 1 is satisfied, then RTSt RS with convex uncertainty is approximable within 2.
Proof Using the same argument as in the proof of Theorem 12, we can solve the relaxation of (42)-(46) in polynomial time. Consider an optimal solution (x*, y*, t*) to the relaxation. We form a corresponding rounded solution (x̂, ŷ, t̂) as follows. We fix t̂ = 2t*. For each l ∈ [ℓ], if x*_l ≥ 0.5, then we fix x̂_l = 1 and ŷ_j = 0 for each j ∈ T_l; if Σ_{j∈T_l} y*_j ≥ 0.5, then we set x̂_l = 0 and ŷ_j = y*_j / Σ_{k∈T_l} y*_k for each j ∈ T_l. Obviously, in this case Σ_{j∈T_l} ŷ_j = 1 and ŷ_j ≤ 2y*_j for each j ∈ T_l. Hence cᵀŷ ≤ 2cᵀy* ≤ 2t* = t̂ for all c ∈ U. Thus the rounded solution is feasible to (42)-(46) and its cost is at most 2 times the optimum.
Using the same instance as in the proof of Theorem 13, one can show that the worst case ratio of the approximation algorithm is attained.

Theorem 17 If the number of budget constraints in U^HP_1 is constant, then RTSt RS with U^HP_1 admits an FPTAS.

Proof According to Theorem 9, it is enough to show that the following problem is polynomially solvable, where d_j ∈ E = {0, ε, 2ε, . . . , 1}, j ∈ [n]. We first renumber the variables in each set T_l, l ∈ [ℓ], so that they are ordered with respect to nondecreasing values of c_j. For each tool set T_l, we greedily allocate the largest possible values to y_j, j ∈ T_l, so that the total amount allocated does not exceed 1. If Σ_{j∈T_l} y_j < 1 or ĉ_l ≤ Σ_{j∈T_l} c_j y_j, then we fix x_l = 1 and set y_j = 0 for j ∈ T_l; otherwise we fix x_l = 0 and keep the allocated values of y_j, j ∈ T_l. Using the fact that the variables were initially sorted, the optimal solution can be found in O(n) time. Using Theorem 9, we can construct an FPTAS for the problem with running time O(n log n + n(1/ε)^K).
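The per-set decision rule from the proof above can be sketched in Python; the data layout (index lists for the sets T_l, per-tool bounds d_j) and the function name are illustrative assumptions:

```python
def solve_rs_inner(sets, c_hat, c, d):
    """Greedy sketch for the inner RS problem of Theorem 17.
    sets[l] lists the tool indices of T_l; c_hat[l] is the cheapest
    first-stage cost in T_l; c[j] and d[j] are the second-stage cost
    and the upper bound on y_j.  For each set, either the cheapest
    first-stage tool is bought (x_l = 1) or one unit is allocated
    greedily to the cheapest second-stage variables."""
    x = [0] * len(sets)
    y = [0.0] * len(c)
    for l, T in enumerate(sets):
        rem = 1.0
        alloc = {}
        for j in sorted(T, key=lambda j: c[j]):   # cheapest tools first
            a = min(d[j], rem)
            alloc[j] = a
            rem -= a
        total = sum(alloc.values())
        stage2_cost = sum(c[j] * alloc[j] for j in T)
        if total < 1 or c_hat[l] <= stage2_cost:  # first stage is better
            x[l] = 1
        else:                                     # keep the allocation
            for j, a in alloc.items():
                y[j] = a
    return x, y
```

Sorting once up front gives the O(n log n) term of the overall running time; each subsequent evaluation is linear, as the proof states.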

Theorem 18
The RTSt RS problem under U^HP_0 can be solved in O(n^2 log n) time.
Proof The MIP formulation for the problem under U^HP_0 takes the following form (see (9)), which can be represented, equivalently, as problem (49) with objective min ĉᵀx + cᵀy + π + Σ_{j∈[n]} (· · ·). We now show the following claim:

Claim 1 There is an optimal solution to (49) in which π = 0 or π = 1/p for some p ∈ [n].
are feasible solutions to (50) for sufficiently small ε > 0. Such an ε exists since π ∈ (0, 1). This contradicts our assumption that (u, v, w, t, π) is a vertex (basic feasible) solution. The proof for the second case, S' ⊂ S, may be handled in much the same way. It suffices to notice that for each constraint Σ_{j∈T_l} (u_j + v_j) = 1, l ∈ S \ S', there exists at least one j ∈ T_l such that 0 < u_j < π or 0 < v_j < 1 − π. Using this fact, one can build (u', v', w', t', π') ∈ R^{4n+1} to arrive at a contradiction with the assumption that (u, v, w, t, π) is a vertex solution. We have thus proved that there always exists at least one constraint l ∈ S' such that Σ_{j∈T_l} (u_j + v_j) = p_l π = 1, where p_l ≥ 2. Hence π = 1/p_l whenever π ∈ (0, 1). After adding the boundary values of π, i.e. 0 and 1, Claim 1 follows.
Problem (49) can be rewritten as follows, where the original variables y_j, j ∈ [n], in (48) are restored by y_j = πû_j + (1 − π)v̂_j. Using Claim 1, let us fix a candidate value for π. We can now sort, within each set T_l, the variables û_j and v̂_j with respect to nondecreasing values of the costs c_j and c_j + d_j, respectively, and either set x_l = 1 or pack from û and v̂ in nondecreasing cost order until Σ_{j∈T_l} (πû_j + (1 − π)v̂_j) reaches 1. As there are O(n) values of π to check, the overall time required by this method is O(n^2 log n) (see Fig. 4).

Robust two-stage shortest path problem
In Sect. 3 we have shown that the RTSt Shortest Path problem is strongly NP-hard even in a very restrictive case, when the cardinality of the set of feasible solutions equals 1. We now show that this hardness result can be strengthened.

Theorem 19
The RTSt Shortest Path problem under U^VP = conv{c^1, . . . , c^K} is hard to approximate within log^{1−ε} K for any ε > 0 unless NP ⊆ DTIME(n^{poly log n}), even for series-parallel graphs.
Proof Consider the following Min-Max Shortest Path problem. We are given a series-parallel graph G = (V, A) with scenario set U = {c^1, . . . , c^K} ⊆ R^{|A|}_+, where scenario c^j is a realization of the arc costs. We seek an s − t path P in G whose maximum cost over U is minimum. This problem is hard to approximate within log^{1−ε} K for any ε > 0 unless NP ⊆ DTIME(n^{poly log n}) (Kasperski and Zieliński 2009). We construct a cost-preserving reduction from Min-Max Shortest Path to RTSt Shortest Path with U^VP. Let us define the network G' = (V', A') by splitting each arc (v_i, v_j) ∈ A into two arcs, namely (v_i, v_{ij}) (a dashed arc) and (v_{ij}, v_j) (a solid arc). Observe that only dashed arcs can be selected in the first stage for any partial solution x with Eval(x) < M, and only solid arcs can be selected in the second stage. Furthermore, if a dashed arc (v_i, v_{ij}) is selected in the first stage, then, in order to ensure that the solution built is an s − t path in G', the solid arc (v_{ij}, v_j) must be selected in the second stage. So the choice of the arcs in the first stage uniquely determines the set of arcs chosen in the second stage. Let x and y ∈ R(x) be such a solution to the RTSt problem with total cost less than M. The pair (x, y) is a characteristic vector of an s − t path in G'. Since the first-stage costs of the dashed arcs are 0, we obtain equality (53). Suppose there is an s − t path P = (v_s, v_{i_1}, v_{i_2}, . . . , v_t) in G whose maximum cost over U is equal to c. Path P corresponds to the path P' = (v_s, v_{si_1}, v_{i_1}, v_{i_1 i_2}, . . . , v_t) composed of alternating dashed and solid arcs. If x is the characteristic vector of all dashed arcs in P', then y is the characteristic vector of all solid arcs in P'. According to (53) and the construction of c̃^k, k ∈ [K], we have Eval(x) = c.
Suppose that there is a solution x, y ∈ R(x) to RTSt such that Eval(x) = c. The characteristic vectors x, y describe a path P' = (v_s, v_{si_1}, v_{i_1}, v_{i_1 i_2}, . . . , v_t) in G' with alternating dashed and solid arcs, where x is the characteristic vector of the dashed arcs and y is the characteristic vector of the solid arcs in P'. Using (53), we get max_{c̃ ∈ {c̃^1, . . . , c̃^K}} c̃ᵀy = c. By the construction of the scenarios, we conclude that the maximum cost over U of the corresponding path P in G is equal to c. Recall that the problem has a K-approximation algorithm under U^VP (see Theorem 7).
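A minimal sketch of the arc-splitting step of this reduction; representing the new node v_{ij} by the tuple (i, j) is an implementation assumption:

```python
def split_arcs(arcs):
    """Arc-splitting step from the proof of Theorem 19: every arc
    (u, v) of G becomes a dashed arc (u, v_uv) followed by a solid arc
    (v_uv, v) in G'.  Dashed arcs receive first-stage cost 0; mapping
    the scenario costs onto the solid arcs is omitted here."""
    dashed, solid = [], []
    for (u, v) in arcs:
        mid = (u, v)                 # fresh node v_uv for arc (u, v)
        dashed.append((u, mid))      # first-stage (dashed) arc
        solid.append((mid, v))       # second-stage (solid) arc
    return dashed, solid
```

Any s − t path in G' then alternates dashed and solid arcs, which is what makes the first-stage choice determine the second-stage choice in the proof.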

Theorem 20
The RTSt Shortest Path problem in a graph G = (V, A) under U^HP is hard to approximate within log^{1−ε} |A| for any ε > 0 unless NP ⊆ DTIME(n^{poly log n}), even if G is a series-parallel graph.
We will reduce the problem of solving (54), for fixed d_{ij}, (i, j) ∈ A, to that of finding a shortest s − t path in an auxiliary directed multigraph G' = (V', A') built as follows. We first set V' = V and A' = A and associate with each arc (i, j) ∈ A' the cost c_{ij}. We then compute, for each pair of nodes i ∈ V and j ∈ V, i ≠ j, a cheapest unit flow from i to j in the original graph G with respect to the costs c_{ij} and arc capacities d_{ij}, and add the arc (i, j) to A' with cost equal to the cost of this flow, denoted by ĉ_{ij}. Note that ĉ_{ij} is bounded, if a feasible unit flow exists, since the costs c_{ij} are nonnegative. If there is no feasible unit flow from i to j, then we do not include (i, j) in A'. The resulting G' is a multigraph with nonnegative arc costs.
Finally, we find a shortest s − t path P in G'. We can construct an optimal solution to (54) as follows. For each arc (i, j) ∈ P: if (i, j) has cost c_{ij}, then set x_{ij} = 1; otherwise (if (i, j) has cost ĉ_{ij}), fix the variables y to the optimal solution of the corresponding min-cost unit flow problem from i to j. The remaining variables in (54) are set to zero. Since the shortest path and the minimum cost flow problems are polynomially solvable, problem (54) is polynomially solvable as well. By Theorem 9, the problem admits an FPTAS.
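The final step can be sketched as follows, assuming the unit-flow costs ĉ_{ij} have already been computed by a min-cost flow routine; the function name and the adjacency encoding are illustrative:

```python
import heapq

def auxiliary_shortest_path(nodes, arcs, flow_cost, s, t):
    """Shortest s-t path in the auxiliary multigraph G'.  `arcs` holds
    the original arcs as (i, j, c_ij); flow_cost[(i, j)] is the
    (assumed precomputed) cost c-hat_ij of a cheapest unit flow from i
    to j.  Returns the length of a shortest s-t path in G'."""
    adj = {v: [] for v in nodes}
    for (i, j, cij) in arcs:
        adj[i].append((j, cij))            # original arc, cost c_ij
    for (i, j), w in flow_cost.items():
        adj[i].append((j, w))              # flow arc, cost c-hat_ij
    dist = {v: float("inf") for v in nodes}
    dist[s] = 0.0
    pq = [(0.0, s)]
    while pq:                              # Dijkstra: costs nonnegative
        dv, v = heapq.heappop(pq)
        if dv > dist[v]:
            continue
        for (u, w) in adj[v]:
            if dv + w < dist[u]:
                dist[u] = dv + w
                heapq.heappush(pq, (dist[u], u))
    return dist[t]
```

Dijkstra's algorithm applies because all arc costs in G', including the flow-arc costs, are nonnegative.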

Conclusions and open problems
In this paper we have discussed a class of robust two-stage combinatorial optimization problems, where the second-stage costs are uncertain and are specified by a convex uncertainty set. We have investigated the general problem as well as several of its special cases. The results obtained for the particular problems are summarized in Table 1.
One can see that there is still a number of interesting open questions concerning the robust two-stage approach. The complexity status of the two network optimization problems under U^HP_0 is still open. The complexity status of all the problems under U^HP_1, when the number of budget constraints is part of the input, is also open. Moreover, no positive or negative approximation results have been established for the robust two-stage version of the Spanning Tree problem. For the selection problems, better approximation algorithms may exist. For ellipsoidal uncertainty, we only know that the basic problems are NP-hard. The question whether they are strongly NP-hard and hard to approximate remains open. Furthermore, a distinction between axis-parallel and general ellipsoids may further highlight differences in complexity. Additionally, it is an interesting question whether it is possible to approximate ellipsoids by polyhedra to derive approximation results in the two-stage setting, as in Han et al. (2016), Poss (2018). It is also interesting to consider the RTSt problem under the budgeted uncertainty sets from Bertsimas and Sim (2004a) or under more general knapsack uncertainty sets, which are special cases of the polyhedral uncertainty U^HP.

Availability of data and material Data sharing not applicable to this article as no datasets were generated or analysed during the current study.