Oracle-based algorithms for binary two-stage robust optimization

In this work we study binary two-stage robust optimization problems with objective uncertainty. We present an algorithm that efficiently calculates lower bounds for the binary two-stage robust problem by alternately solving the underlying deterministic problem and an adversarial problem. For the deterministic problem any oracle can be used which returns an optimal solution for every possible scenario. We show that this lower bound can be integrated into a branch and bound procedure, where the branching is performed only over the first-stage decision variables. All results even hold for non-linear objective functions which are concave in the uncertain parameters. As an alternative solution method we apply a column-and-constraint generation algorithm to the binary two-stage robust problem with objective uncertainty. We test both algorithms on benchmark instances of the uncapacitated single-allocation hub-location problem and of the capital budgeting problem. Our results show that the branch and bound procedure outperforms the column-and-constraint generation algorithm.


Introduction
The concept of robust optimization was created to tackle optimization problems with uncertain parameters. The basic idea behind this concept is to use uncertainty sets instead of probability distributions to model uncertainty. More precisely, it is assumed that all realizations of the uncertain parameters, called scenarios, are contained in a known uncertainty set. Instead of optimizing the expected objective value or a given risk measure, as is common in the field of stochastic optimization, in the robust optimization framework we calculate solutions which are optimal in the worst case and which are feasible for all scenarios in the uncertainty set.
The concept was first introduced in [67]. Later it was studied for combinatorial optimization problems with discrete uncertainty sets in [53], for conic and ellipsoidal uncertainty in [13,14], for semi-definite and least-square problems in [39,40] and for budgeted uncertainty in [26,27]. An overview of the robust optimization literature can be found in [2,10,15,32].
The so-called robust counterpart is known to be NP-hard for most of the classical combinatorial problems, although most of them can be solved in polynomial time in their deterministic versions; see [53]. Furthermore it is a well-known drawback of this approach that the optimal solutions are often too conservative for practical purposes [27]. To obtain better and less conservative solutions several new ideas have been developed to improve the concept of robustness; see e.g. [1,43,53,55,63].
Inspired by the concept of two-stage stochastic programming, a further extension of the classical robust approach which has attracted increasing attention in the last decade is the concept of two-stage robustness, sometimes called adjustable robustness, first introduced in [12]. The idea behind this approach is tailored for problems which have two different kinds of decision variables: first-stage decisions which have to be made here-and-now and second-stage decisions which can be determined after the uncertain parameters are known, sometimes called wait-and-see decisions. As in the classical robust framework it is assumed that all uncertain scenarios are contained in a known uncertainty set and the worst-case objective value is optimized. The main difference to the classical approach is that the second-stage decisions do not have to be made in advance but can be chosen as the best reaction to a scenario after it has occurred. This approach can be modeled by min-max-min problems in general. Famous applications occur in the field of network design problems where in the first stage a capacity on an edge must be bought such that, after the real costs on each edge are known, a minimum cost flow is sent from a source to a sink which can only use the bought capacities [21]. An overview of recent results for two-stage robustness can be found in [72]. Several concepts closely related to the two-stage robust concept were introduced in [1,30,55].
In this work we study binary two-stage robust optimization problems. We consider underlying deterministic problems of the form
\[ \min_{(x,y)\in Z} f(x,y,c) \tag{CP} \]
where $f: Z \times \mathbb{R}^m \to \mathbb{R}$, the set $Z \subseteq \{0,1\}^{n_1+n_2}$ contains all incidence vectors of the feasible solutions and is assumed to be non-empty, $c \in \mathbb{R}^m$ is a given parameter vector and $f(x,y,\cdot)$ is concave for each given $(x,y) \in Z$. The variables $x$ are called first-stage solutions and the variables $y$ are called second-stage solutions. We assume that the vector $c$ is uncertain and all possible realizations of $c$ are contained in a convex uncertainty set $U \subset \mathbb{R}^m$. The binary two-stage robust problem is then defined by
\[ \min_{x\in X}\ \max_{c\in U}\ \min_{y\in Y(x)} f(x,y,c) \tag{2RP} \]
where $X \subset \{0,1\}^{n_1}$ is the projection of $Z$ onto the $x$-variables, i.e.
\[ X := \{x \in \{0,1\}^{n_1} \mid \exists\, y \in \{0,1\}^{n_2} : (x,y) \in Z\} \]
and $Y(x) := \{y \in \{0,1\}^{n_2} \mid (x,y) \in Z\}$. Note that all results presented in this paper are still valid if the recourse variables are non-integer. We do not consider uncertainty affecting the constraints of the problem, which is a situation often occurring in practice for most of the classical combinatorial optimization problems. Problem (2RP) can be interpreted as follows: In the first stage, before knowing the precise uncertain vector $c$, the decisions $x \in X$ have to be made. Afterwards, when the cost vector is known, we can choose the best feasible second-stage solution $y \in Y(x)$ for the given costs. As usual in robust optimization we measure the worst case over all possible scenarios in $U$. Note that by our definition of the set $Y(x)$ and since the uncertainty only affects the objective function, there always exists a feasible second-stage solution $y \in Y(x)$ for each first-stage solution $x \in X$.
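To make the min-max-min structure of Problem (2RP) concrete, the following minimal sketch evaluates it by brute force on a toy instance. It rests on assumptions that are not part of the paper: the objective is linear, f(z, c) = c·z, the set Z is given as an explicit list, and the uncertainty set is a line segment U = {(1−t)·ca + t·cb : t ∈ [0, 1]}. Under these assumptions every solution z defines a line in t, so the inner max-min can be solved exactly by checking the endpoints and the pairwise crossings. All names (`max_min_on_segment`, `solve_2rp_brute_force`, `ca`, `cb`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def max_min_on_segment(lines):
    """Maximise min_i (a_i + b_i * t) over t in [0, 1]; returns (value, t)."""
    candidates = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if b1 != b2:
            t = Fraction(a2 - a1) / (b1 - b2)  # point where the two lines cross
            if 0 < t < 1:
                candidates.add(t)
    return max((min(a + b * t for a, b in lines), t) for t in candidates)

def line(z, ca, cb):
    """Cost of z under scenario (1-t)*ca + t*cb, written as a + b*t."""
    a = sum(p * zj for p, zj in zip(ca, z))
    b = sum((q - p) * zj for p, q, zj in zip(ca, cb, z))
    return a, b

def solve_2rp_brute_force(Z, n1, ca, cb):
    """Enumerate all first-stage solutions x and evaluate max_c min_{y in Y(x)}."""
    best = None
    for x in sorted({z[:n1] for z in Z}):          # X = projection of Z
        lines = [line(z, ca, cb) for z in Z if z[:n1] == x]   # Y(x)
        value, _ = max_min_on_segment(lines)
        if best is None or value < best[0]:
            best = (value, x)
    return best

# Toy instance: z = (x1, y1, y2); x1 = 1 unlocks the second recourse option.
Z = [(0, 1, 0), (1, 1, 0), (1, 0, 1)]
ca, cb = (1, 0, 4), (1, 4, 0)
value, x_opt = solve_2rp_brute_force(Z, 1, ca, cb)
print(value, x_opt)
```

On this instance the first-stage decision x1 = 0 has worst-case cost 4, whereas x1 = 1 has worst-case cost 3 because the second stage can react to the scenario.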
Problem (2RP) has already been studied in the literature and several exact algorithms as well as approximation algorithms have been proposed; see Sect. 1.1. While several of the existing methods are able to handle uncertainty in the constraints, it is often assumed that a polyhedral description of the sets $X$ and $Y(x)$ is given. Besides the latter limitation, most of the methods are based on dualizations or reformulations which destroy the structure of the original problem (CP). Often the uncertainty set is even restricted to be a polyhedron. In this work we derive the first oracle-based exact algorithm which solves Problem (2RP) for any deterministic problem by alternately solving the deterministic Problem (CP) and an adversarial problem presented later. For the deterministic problem any oracle can be used which returns an optimal solution of (CP) for every possible scenario in $U$. The advantage of the latter method is that the structure of the underlying problem is preserved and any existing algorithms which were derived for the underlying problem can be used. Furthermore our algorithm works for most of the common convex uncertainty sets. Additionally we apply the column-and-constraint generation algorithm (CCG) presented in [73] to Problem (2RP) and compare it to our new method.
In Sect. 1.1 we will give an overview of the literature related to two-stage robust optimization problems. In Sect. 2 we derive an oracle-based branch and bound procedure to solve Problem (2RP). Furthermore we apply the results in [73] to Problem (2RP). Finally in Sect. 3.1 we apply both methods to the uncapacitated single-allocation hub-location problem and the capital budgeting problem and test them on classical benchmark instances from the literature.
Our main contributions:
• We adapt the oracle-based algorithm derived in [29] and show that it can be used to calculate a lower bound for Problem (2RP) which can be implemented in a branch and bound procedure where the branching is performed over the first-stage solutions. The calculation of the lower bound can be applied to the common convex uncertainty sets and is done by alternately calling an adversarial problem over $U$ and an oracle which returns an optimal solution of Problem (CP) for a given scenario $c \in U$. Therefore any solution algorithm of the deterministic problem can be used to calculate this lower bound.
• We apply the CCG algorithm presented in [73] to Problem (2RP) and show that calculating the upper bound can also be done by the same oracle-based algorithm as above.
• We apply the branch and bound procedure and the CCG algorithm to the uncapacitated single-allocation hub-location problem and the capital budgeting problem and show that the branch and bound procedure outperforms the CCG algorithm.

Related literature
Linear two-stage robust optimization, sometimes called adjustable robust optimization, was first introduced in [12]. The authors show that the problem is NP-hard even if $X$ and $Y$ are given by linear uncertain constraints and all variables are real; see also [57]. In [12] the authors propose to approximate the problem by assuming that the optimal values of the wait-and-see variables $y$ are affine functions of the uncertain parameters. These so-called affine decision rules were studied in the robust context in several articles for the case of real recourse; see e.g. [6,11,34,37,47,54,64,71]. Furthermore in several works special cases are derived for which a decision rule structure is known which is optimal; see [20,22,48]. Further non-linear decision rules are studied in [72].
Lower bounds for two-stage robust problems can be derived by considering a finite subset of scenarios in $U$. Then for each selected scenario $c$ a duplicate $y^c$ of the second-stage solution is added to the problem; see [7,36,45]. The authors in [24] first dualize the inner minimization and maximization problem and then apply the latter finite scenario approach to the dual problem to obtain stronger lower bounds. Note that while the finite scenario approach can also be applied to the case when the second-stage solutions are integers, for the dualization approach the second-stage variables have to be relaxed to real variables. Unfortunately both lower bounds cannot be used in a branch and bound scheme since for a complete fixation of the first-stage variables the bounds are not necessarily exact.
Exact methods for real recourse are based on the idea of Benders' decomposition, see [23,44,51,70], or column-and-constraint generation [25,73]. Note that for the Benders' decomposition approaches the second-stage solutions have to be real since dualizations of the second-stage problem are used. In contrast to this the CCG algorithm even works for integer recourse; see [74]. We will apply the latter method to our problem in Sect. 2.2.

For the case of integer recourse, i.e. the second-stage variables $y$ are modeled as integer variables, decision rules have been applied to Problem (2RP) in [18,19] to approximate the problem. Another approximation approach is called k-adaptability and was introduced in [16]. The idea is to calculate k second-stage solutions in the first stage and to allow choosing the best of these solutions in the second stage. Clearly, since the set of possible second-stage solutions is restricted compared to the original problem, this idea leads to an approximation of the problem. Solution methods and the quality of this approximation were studied in [22,46,68]. In [46] it is shown that the k-adaptability problem is exact if k is chosen larger than the dimension of the problem. The authors in [30,31,42] apply the k-adaptability concept to one-stage combinatorial problems to calculate a set of solutions which is worst-case optimal if for each scenario the best of these solutions can be chosen. They furthermore show that solving this problem can be done in polynomial time if an oracle for the deterministic problem exists and if the number of calculated solutions is larger than or equal to the dimension of the problem. To solve the problem in the latter case they present an oracle-based algorithm which we will use in Sect. 2. The k-adaptability concept was also applied to the case that the uncertain parameters follow a discrete probability distribution [33].
Besides the exact algorithms in [73,74], approximation methods based on uncertainty set splitting were derived in the literature to approximate two-stage robust problems with integer recourse; see [17,61].
For two-stage robust problems with non-linear robust constraints, decision rules have been applied in [58,69]. The two-stage problem is studied for second order conic optimization problems in [28]. In [9,56] the authors derive robust counterparts of uncertain non-linear constraints. Note that all the latter results were developed for real second-stage solutions.
While this work was under peer review a similar approach to solve two-stage robust optimization problems with uncertainty only affecting the objective function was published; see [5]. The authors study Problem (2RP) with linear objective functions and mixed-integer recourse variables, while the set $Y(x)$ is modeled by linear constraints. They study a relaxation of the lower bound presented in Sect. 2 which is implemented in a branch and bound procedure. In contrast to the algorithm described in this work, the method in [5] is not based on the use of oracles for the deterministic problem. Therefore it cannot make use of fast solution methods for (CP) such as combinatorial algorithms or compact formulations with uncertain parameters appearing in the constraints; see Sect. 3.1.

Binary two-stage robustness
In this section we analyze the binary two-stage robust problem (2RP) with convex uncertainty sets $U$ and derive general lower bounds which can be calculated by an oracle-based algorithm and which can be implemented in a branch and bound procedure. The branching will be done over the first-stage solutions.
The classical approach to derive lower bounds in a branch and bound procedure is relaxing the integrality and solving the relaxed problem. Applying this approach to the second-stage decisions of Problem (2RP) is not useful, since for a given $x \in X$ and $c \in U$ an optimal solution of the relaxed second-stage problem may not be contained in $\operatorname{conv}(Y(x))$, e.g. if the relaxation of $Y(x)$ is a polytope which is not integral. It may even be the case that a linear description of $\operatorname{conv}(Y(x))$ is not known. Therefore, even if all first-stage variables are fixed, the lower bound obtained by relaxing the second-stage solution variables would not necessarily be exact and an optimal solution cannot be guaranteed using a branch and bound scheme. In the following lemma we derive a lower bound for Problem (2RP) which is exact if all first-stage solutions are fixed.
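Proof By changing the order of the outer minimum and the inner maximum in Problem (2RP) we obtain the inequality
\[ \min_{x\in X}\ \max_{c\in U}\ \min_{y\in Y(x)} f(x,y,c) \;\ge\; \max_{c\in U}\ \min_{x\in X}\ \min_{y\in Y(x)} f(x,y,c). \]
Merging the two minimum expressions and using $Z \subseteq \operatorname{conv}(Z)$ yields
\[ \max_{c\in U}\ \min_{x\in X}\ \min_{y\in Y(x)} f(x,y,c) \;=\; \max_{c\in U}\ \min_{(x,y)\in Z} f(x,y,c) \;\ge\; \max_{c\in U}\ \min_{(x,y)\in \operatorname{conv}(Z)} f(x,y,c), \]
where the right-hand side is the optimal value of Problem (LB), which proves the result. ◻
Note that, since $f$ is concave in $c$ and since the pointwise minimum of concave functions is always concave, we have to maximize a concave objective function in Problem (LB). In [30] the authors analyze Problem (LB) for the case that $f$ is a linear function in $(x,y)$ and $c$. They prove that it can be solved in oracle-polynomial time, i.e. by a polynomial time algorithm if solving the deterministic problem (CP) is done by an oracle in constant time. Furthermore if we fix a solution $x \in X$, then the bound (LB) is exact, which we prove in the following.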

Proposition 1 If all first-stage variables are fixed then (LB) is equal to the exact objective value of the fixed first-stage solution.
Proof Let $x \in X$ be the fixed first-stage solution. Then Problem (LB), restricted to this fixed $x$, is equivalent to
\[ \max_{c\in U}\ \min_{y\in Y(x)} f(x,y,c), \tag{1} \]
which is the objective value of $x$ in Problem (2RP). This proves the result. ◻
The result of Proposition 1 indicates that the lower bound (LB) can be integrated in a branch and bound procedure.
In [30] it was proved that, given an oracle to solve the deterministic problem over $Y(x)$ for each given $x$, if $f$ is linear in $(x,y)$ and $c$ and under further mild assumptions, Problem (1) can be solved in oracle-polynomial time. Together with Proposition 1 a direct consequence is that, if the dimension $n_1$ of the first-stage solutions is fixed, then we can enumerate over all possible first-stage solutions and compare the objective values in oracle-polynomial time. Hence, Problem (2RP) can be solved in polynomial time given an oracle for the optimization problem over $Y(x)$ for each $x \in X$.
The authors in [30] present a practical algorithm, based on the idea of column generation, for the case that $f$ is a linear function. Applied to the more general Problem (LB) the algorithm can be derived as follows: The algorithm starts with a subset of solutions $Z' \subset Z$, leading to problem
\[ \max_{c\in U}\ \min_{z\in Z'} f(z,c) \tag{2} \]
and then iteratively adds new solutions to $Z'$ until optimality can be ensured. The solution which is added in each iteration is the one which has the largest impact on the optimal value. To find this solution Problem (2) can be reformulated by applying a level set transformation. The reformulation is given by
\[ \max\ \mu \quad \text{s.t.} \quad f(z,c) \ge \mu \ \ \forall z \in Z', \quad \mu \in \mathbb{R},\ c \in U. \tag{3} \]
For an optimal solution $(\mu^*, c^*)$ of the latter problem, we search for the solution $z \in Z$ which most violates the constraint $f(z,c^*) \ge \mu^*$, i.e. the solution with the largest improvement on the optimal value of Problem (2). The latter task can be done by minimizing the objective function $f(z,c^*)$ over all $z \in Z$, i.e. solving the deterministic problem (CP) under scenario $c^*$ by using any exact algorithm. If we can find a $z^* \in Z$ such that $f(z^*,c^*) < \mu^*$, then we add $z^*$ to $Z'$ and repeat the procedure. If no such solution can be found, then $f(z,c^*) \ge \mu^*$ holds for all $z \in Z$ and therefore $\mu^*$ is the optimal value of (LB). The procedure described above is presented in Algorithm 1.
Note that the problem in Step 3 depends on the uncertainty set $U$ and on the properties of $f$. If $f$ is a linear function in $c$, for polyhedral or ellipsoidal uncertainty sets this is a continuous linear or quadratic problem, respectively. Both problems can be solved by the latest versions of optimization software like CPLEX [49]. Therefore the algorithm can be implemented for each deterministic problem by using any exact algorithm to solve the deterministic problem in Step 4.
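The alternation between the adversarial problem and the oracle can be sketched as follows. This is a minimal illustration under assumptions that are not from the paper: the objective is linear, f(z, c) = c·z, the uncertainty set is a line segment between two hypothetical cost vectors `ca` and `cb` (so the adversarial step reduces to maximising a concave piecewise-linear function of one parameter t), and the oracle is a brute-force search over an explicit list Z standing in for any exact algorithm for (CP).

```python
from fractions import Fraction
from itertools import combinations

def max_min_on_segment(lines):
    """Adversarial step: maximise min_i (a_i + b_i * t) over t in [0, 1]."""
    candidates = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if b1 != b2:
            t = Fraction(a2 - a1) / (b1 - b2)
            if 0 < t < 1:
                candidates.add(t)
    return max((min(a + b * t for a, b in lines), t) for t in candidates)

def cost(z, c):
    return sum(cj * zj for cj, zj in zip(c, z))

def line(z, ca, cb):
    """Cost of z under scenario (1-t)*ca + t*cb, written as a + b*t."""
    return cost(z, ca), cost(z, cb) - cost(z, ca)

def algorithm1(oracle, z_start, ca, cb):
    """Returns (mu, Z_prime): the lower bound and the generated solutions."""
    Z_prime = [z_start]
    while True:
        # Adversarial problem over U for the current subset Z_prime.
        mu, t = max_min_on_segment([line(z, ca, cb) for z in Z_prime])
        c_star = [(1 - t) * p + t * q for p, q in zip(ca, cb)]
        # Deterministic problem (CP) under scenario c_star, solved by the oracle.
        z_new = oracle(c_star)
        if cost(z_new, c_star) >= mu:      # no violated constraint: mu is optimal
            return mu, Z_prime
        Z_prime.append(z_new)

# Toy instance with z = (x1, y1, y2); the oracle simply enumerates Z.
Z = [(0, 1, 0), (1, 1, 0), (1, 0, 1)]
ca, cb = (1, 0, 4), (1, 4, 0)
brute_force_oracle = lambda c: min(Z, key=lambda z: cost(z, c))
mu, Z_prime = algorithm1(brute_force_oracle, Z[0], ca, cb)
print(mu, Z_prime)
```

On this instance the loop stops after generating two solutions and returns the bound 5/2. Exact rational arithmetic is used here only to avoid tolerance issues in the stopping test; a solver-based implementation would need a numerical tolerance instead.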
The main advantage of this feature is that we do not have to restrict to deterministic problems which can be modeled by a linear compact formulation as it is the case in [5]. Instead we can use any combinatorial algorithm or even mixed-integer formulations where the uncertain parameters appear in the constraints; see Sect. 3.1. We only require an arbitrary procedure which returns an optimal solution for the given scenario. In [42] the authors applied the latter algorithm to the min-max-min robust capacitated vehicle routing problem and showed that on classical benchmark instances the number of iterations of Algorithm 1 is significantly smaller than the dimension of $Z$ in general.
Note that besides the optimal value of Problem (LB) the algorithm returns a set of feasible solutions $Z' \subseteq Z$ and not a solution in $\operatorname{conv}(Z)$. By the correctness of the algorithm the optimal solution in $\operatorname{conv}(Z)$ must be contained in $\operatorname{conv}(Z')$ and could be calculated by finding the optimal convex combination of the solutions in $Z'$, which can be done by solving the problem
\[ \max_{c\in U}\ \min_{\lambda \ge 0,\ \sum_{z\in Z'} \lambda_z = 1} f\Big(\sum_{z\in Z'} \lambda_z z,\ c\Big) \]
for the given set $Z'$. If $f$ is continuous, quasi-convex in $z$ and quasi-concave in $c$ then the latter problem is equivalent to
\[ \min_{\lambda \ge 0,\ \sum_{z\in Z'} \lambda_z = 1}\ \max_{c\in U} f\Big(\sum_{z\in Z'} \lambda_z z,\ c\Big) \tag{4} \]
by Sion's theorem [65]. Dualizing the inner maximization problem over $U$ (e.g. by using the convex conjugate [9]) this is a continuous minimization problem. If $f$ is a linear function this problem is a linear or a quadratic problem for polyhedral or ellipsoidal uncertainty, respectively. Nevertheless in our branch and bound procedure for non-linear functions $f$ the set $Z'$ is sufficient as we will see in Sect. 2.1. A practical advantage of the set $Z'$ is that it contains a set of second-stage policies which can be used in practical applications. Instead of solving the second-stage problem each time after a scenario has occurred, which may be a computationally hard problem, we can choose the best of the pre-calculated second-stage policies in $Z'$ for the actual scenario. The latter task can be done by just comparing the objective values of all solutions in $Z'$ for the given scenario. Note that the returned set of solutions need not contain the optimal solution for each scenario. Nevertheless we will show in Sect. 3.2.1 that the calculated solutions perform very well on average over random scenarios in $U$.
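Selecting the best pre-calculated policy for a realized scenario amounts to a single pass over the stored solutions. The sketch below assumes, for illustration only, a linear objective f(z, c) = c·z and solutions stored as 0/1 tuples; the function name `best_policy` is hypothetical.

```python
def best_policy(Z_prime, c):
    """Return the stored solution with the smallest cost under scenario c."""
    return min(Z_prime, key=lambda z: sum(cj * zj for cj, zj in zip(c, z)))

# Two stored policies sharing the first-stage decision x1 = 1.
Z_prime = [(1, 1, 0), (1, 0, 1)]
print(best_policy(Z_prime, (1, 1, 3)))   # first recourse option is cheaper
print(best_policy(Z_prime, (1, 3, 1)))   # second recourse option is cheaper
```

The cost of this lookup grows linearly with the number of stored policies, independently of how hard the second-stage problem itself is.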

Oracle-based branch and bound algorithm
Using the results of the previous section we can easily derive a classical branch and bound procedure to solve Problem (2RP). The idea is to branch over the first-stage solutions $x \in X$ and to calculate the lower bound (LB) in each node of the branch and bound tree to possibly prune the current branch. All details needed to implement a branch and bound procedure are presented in the following.

Handling fixations
In each node of the branch and bound tree we have a given set of fixations for the $x$-variables, i.e. a set of indices $I_0 \subset [n_1]$ such that $x_i = 0$ for each $i \in I_0$ and a set of indices $I_1 \subset [n_1] \setminus I_0$ such that $x_i = 1$ for each $i \in I_1$. Therefore in each node for the given fixations we have to solve the problem
\[ \max_{c\in U}\ \min_{\substack{(x,y)\in \operatorname{conv}(Z) \\ x_i = 0\ \forall i\in I_0,\ x_i = 1\ \forall i\in I_1}} f(x,y,c) \tag{5} \]
or to decide that the latter problem is infeasible. It is easy to see that the latter problem, if it is feasible, can be solved by Algorithm 1 by including the given fixations into the set $Z$. Note that here the oracle for the deterministic problem must be able to handle variable fixations. Nevertheless for most of the classical problems fixations can easily be implemented in most algorithms.

Warm starts
In each node of the branch and bound tree Algorithm 1 returns a set $Z' \subset Z$ of feasible solutions satisfying the given fixations. For each possible child node we can select the set $Z'' \subset Z'$ of solutions which satisfy the new fixations and warm-start Algorithm 1 with the set $Z''$ in the child node.

Branching strategy
An easy branching strategy can be established as follows: For the calculated set of solutions $Z'$ returned by Algorithm 1 we define the vector $\tilde{x} \in [0,1]^{n_1}$ by
\[ \tilde{x}_i := \frac{1}{|Z'|} \sum_{(x,y)\in Z'} x_i \]
for all $i \in [n_1]$, i.e. the value $\tilde{x}_i$ is the fraction of solutions in $Z'$ for which $x_i = 1$ holds. We can then use any of the classical branching rules, e.g. we can decide to branch on the index $i$ for which the value $\tilde{x}_i$ is the closest to 0.5. Another, computationally more expensive approach is to calculate the optimal convex combination of the solutions in $Z'$, i.e. after calculating the optimal $Z'$ by Algorithm 1 we calculate an optimal solution $\lambda^*$ of Problem (4) and define
\[ \tilde{x} := \sum_{(x,y)\in Z'} \lambda^*_{(x,y)}\, x. \]
Now we can again use any classical branching strategy on $\tilde{x}$. Note that if a first-stage variable has the same value in each of the solutions in $Z'$ then the corresponding entry of $\tilde{x}$ also has this value.
When moving on to the next open branch and bound node to be processed, we choose the one with the smallest lower bound.
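The simple branching rule can be sketched in a few lines. The code below assumes that solutions are stored as 0/1 tuples whose first `n1` entries are the first-stage variables; the function names and the `fixed` argument are hypothetical.

```python
from fractions import Fraction

def first_stage_fractions(Z_prime, n1):
    """Entry i is the fraction of solutions in Z_prime with x_i = 1."""
    k = len(Z_prime)
    return [Fraction(sum(z[i] for z in Z_prime), k) for i in range(n1)]

def branching_index(x_tilde, fixed=()):
    """Index of the unfixed variable whose fraction is closest to 1/2."""
    free = [i for i in range(len(x_tilde)) if i not in fixed]
    if not free:
        return None
    return min(free, key=lambda i: abs(x_tilde[i] - Fraction(1, 2)))

# Four generated solutions (x1, x2, y1); x2 equals 1 in all of them.
Z_prime = [(0, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 0)]
x_tilde = first_stage_fractions(Z_prime, 2)
print(x_tilde, branching_index(x_tilde))
```

Here the fractions are 3/4 and 1, so the rule branches on the first variable. A variable that takes the same value in all generated solutions has distance exactly 1/2 and is therefore never preferred over a fractional one.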

Calculating feasible solutions
In each node of the branch and bound tree we want to find a feasible solution to update the upper bound on our optimal value. We do this as follows: In each branch and bound node Algorithm 1 calculates a set $Z' \subseteq Z$ of feasible solutions. If all of the generated solutions in $Z'$ have the same first-stage solution $x$, then the optimal solution of (5) has binary first-stage variables and we obtain a feasible solution $x \in X$ which has the objective value $\mu^*$ returned by the algorithm. If the first-stage variables are not the same for all $z \in Z'$ then we can either choose an arbitrary first-stage solution given by any $z \in Z'$ or we can calculate the objective value of all first-stage solutions in $Z'$ and choose the one with the best objective value. To this end we have to solve
\[ \max_{c\in U}\ \min_{y\in Y(x)} f(x,y,c) \]
for any first-stage solution $x$ given in $Z'$. Note that the latter problem can again be solved by Algorithm 1, replacing the deterministic problem in Step 4 by
\[ \min_{y\in Y(x)} f(x,y,c^*). \]
If $X = \{0,1\}^{n_1}$, as it is the case for the hub-location problem (see Sect. 3.2.1), then calculating all objective values as above can be avoided and finding a good feasible solution can be done by rounding each component of the vector $\tilde{x}$ calculated in the previous paragraph.
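The ingredients of this section (node lower bounds with fixations, warm starts, branching on the fraction closest to 0.5, best-first node selection, and incumbents from nodes whose generated solutions share one first-stage vector) can be combined as in the following minimal sketch. It is an illustration under assumptions not taken from the paper: a linear objective, an explicit list Z with a brute-force oracle, and a segment uncertainty set between the hypothetical cost vectors `ca` and `cb`.

```python
import heapq
from fractions import Fraction
from itertools import combinations

def max_min_on_segment(lines):
    """Maximise min_i (a_i + b_i * t) over t in [0, 1]; returns (value, t)."""
    cands = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if b1 != b2:
            t = Fraction(a2 - a1) / (b1 - b2)
            if 0 < t < 1:
                cands.add(t)
    return max((min(a + b * t for a, b in lines), t) for t in cands)

def cost(z, c):
    return sum(cj * zj for cj, zj in zip(c, z))

def algorithm1(Z_node, start, ca, cb):
    """Lower bound of a node whose feasible solutions are Z_node (non-empty)."""
    Z_prime = list(start)
    while True:
        lines = [(cost(z, ca), cost(z, cb) - cost(z, ca)) for z in Z_prime]
        mu, t = max_min_on_segment(lines)
        c = [(1 - t) * p + t * q for p, q in zip(ca, cb)]
        z_new = min(Z_node, key=lambda z: cost(z, c))   # oracle with fixations
        if cost(z_new, c) >= mu:
            return mu, Z_prime
        Z_prime.append(z_new)

def branch_and_bound(Z, n1, ca, cb):
    incumbent = (None, None)                  # (upper bound, first-stage solution)
    tie = 0                                   # tie-breaker so dicts are never compared
    heap = [(*algorithm1(Z, [Z[0]], ca, cb), tie, {})]
    heap = [(mu, tie, fix, Zp) for mu, Zp, tie, fix in heap]
    while heap:
        bound, _, fix, Zp = heapq.heappop(heap)         # best-first selection
        if incumbent[0] is not None and bound >= incumbent[0]:
            break                                       # all open nodes are pruned
        xs = {z[:n1] for z in Zp}
        if len(xs) == 1:                                # bound is exact for this x
            incumbent = (bound, xs.pop())
            continue
        x_tilde = [Fraction(sum(z[i] for z in Zp), len(Zp)) for i in range(n1)]
        free = [i for i in range(n1) if i not in fix]
        i = min(free, key=lambda j: abs(x_tilde[j] - Fraction(1, 2)))
        for v in (0, 1):
            child = {**fix, i: v}
            Z_child = [z for z in Z if all(z[j] == w for j, w in child.items())]
            if not Z_child:
                continue                                # infeasible child node
            warm = [z for z in Zp if z[i] == v] or [Z_child[0]]
            mu, Zp_child = algorithm1(Z_child, warm, ca, cb)
            tie += 1
            heapq.heappush(heap, (mu, tie, child, Zp_child))
    return incumbent

Z = [(0, 1, 0), (1, 1, 0), (1, 0, 1)]       # z = (x1, y1, y2)
ca, cb = (1, 0, 4), (1, 4, 0)
print(branch_and_bound(Z, 1, ca, cb))
```

On the toy instance the root bound is 5/2, the two child nodes have bounds 4 (x1 = 0) and 3 (x1 = 1), and the procedure returns the value 3 with x1 = 1. In a real implementation the filtering of Z would be replaced by passing the fixations to the oracle.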


Oracle-based column-and-constraint algorithm
In [73] a column-and-constraint generation method (CCG) was introduced to solve two-stage robust problems with real recourse variables. In [74] the authors show how the algorithm can be applied to two-stage robust problems with mixed-integer recourse variables. In both cases the algorithm is studied for problems with uncertain constraints. In this section we will apply the algorithm to Problem (2RP), i.e. to the special case of objective uncertainty, and show that we can again use Algorithm 1 to solve one crucial step in the CCG. In the following we derive the CCG algorithm for Problem (2RP). For more details see [73,74]. Using a level set transformation Problem (2RP) can be reformulated by
\[ \min\ \mu \quad \text{s.t.} \quad \mu \ge \min_{y\in Y(x)} f(x,y,c)\ \ \forall c \in U, \quad x \in X,\ \mu \in \mathbb{R}. \]
If we choose any finite subset of scenarios $c^1, \ldots, c^l \in U$ we obtain the lower bound
\[ \min\ \mu \quad \text{s.t.} \quad \mu \ge \min_{y\in Y(x)} f(x,y,c^i)\ \ \forall i = 1,\ldots,l, \quad x \in X,\ \mu \in \mathbb{R}, \]
which is equivalent to problem
\[ \min\ \mu \quad \text{s.t.} \quad \mu \ge f(x,y^i,c^i),\ \ y^i \in Y(x)\ \ \forall i = 1,\ldots,l, \quad x \in X,\ \mu \in \mathbb{R}. \tag{8} \]
The algorithm in [73] now iteratively calculates an optimal solution $(x^*, \mu^*)$ of the latter problem (8), which is a lower bound for Problem (2RP), and afterwards calculates a worst-case scenario $c^{l+1} \in U$ by
\[ c^{l+1} \in \operatorname*{arg\,max}_{c\in U}\ \min_{y\in Y(x^*)} f(x^*,y,c). \tag{9} \]
The optimal value of Problem (9) is the objective value of solution $x^* \in X$ and therefore an upper bound for Problem (2RP). Afterwards new variables $y^{l+1}$ and the constraint
\[ \mu \ge f(x, y^{l+1}, c^{l+1}) \]
are added to Problem (8) and we iterate the latter procedure until the optimal value of Problem (9) no longer exceeds $\mu^*$. Clearly a solution $(x^*, \mu^*)$ fulfilling the latter condition is optimal for Problem (2RP). Following the proof of Proposition 1 the worst-case scenario in (9) can be calculated by Algorithm 1. This can be done since we do not consider uncertainty in the constraints, while in the more general framework in [73] this is not possible.
The main difference of the latter procedure to our branch and bound algorithm is that in a branch and bound node only a subset of first-stage variables is fixed while the rest are relaxed. Then we use Algorithm 1 to calculate a lower bound for the given fixations. In the CCG procedure in each iteration a first-stage solution is calculated by Problem (8) and therefore all variables are fixed when Algorithm 1 is applied to calculate the worst-case scenario. Nevertheless the number of constraints and the number of variables of Problem (8) increase iteratively, since each second-stage variable has to be duplicated in each iteration, while in the branch and bound procedure we always iterate over the same number of first-stage variables. In Sect. 3.2.1 we will compare both algorithms on benchmark instances of the uncapacitated single-allocation hub location problem and the capital budgeting problem.
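The CCG loop described above can be illustrated on a toy instance. The sketch below is not the implementation of [73]; it assumes a linear objective, an explicit list Z, and a segment uncertainty set between the hypothetical cost vectors `ca` and `cb`. The master problem (8) is solved by enumerating all first-stage solutions, and the worst-case scenario of Problem (9) is found by enumerating Y(x*) directly, where the paper would call Algorithm 1.

```python
from fractions import Fraction
from itertools import combinations

def max_min_on_segment(lines):
    """Maximise min_i (a_i + b_i * t) over t in [0, 1]; returns (value, t)."""
    cands = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if b1 != b2:
            t = Fraction(a2 - a1) / (b1 - b2)
            if 0 < t < 1:
                cands.add(t)
    return max((min(a + b * t for a, b in lines), t) for t in cands)

def cost(z, c):
    return sum(cj * zj for cj, zj in zip(c, z))

def ccg(Z, n1, ca, cb):
    """Returns (optimal value, first-stage solution, number of scenarios used)."""
    X = sorted({z[:n1] for z in Z})
    # For each x, the cost of every y in Y(x) as a line a + b*t along the segment.
    lines_of = {x: [(cost(z, ca), cost(z, cb) - cost(z, ca))
                    for z in Z if z[:n1] == x] for x in X}
    scenarios = [Fraction(0)]            # start with the single scenario c = ca
    while True:
        # Master problem (8): best first-stage solution against known scenarios.
        lower, x_star = min(
            (max(min(a + b * t for a, b in lines_of[x]) for t in scenarios), x)
            for x in X)
        # Problem (9): worst-case scenario and objective value of x_star.
        upper, t_new = max_min_on_segment(lines_of[x_star])
        if lower >= upper:               # bounds coincide: x_star is optimal
            return upper, x_star, len(scenarios)
        scenarios.append(t_new)

Z = [(0, 1, 0), (1, 1, 0), (1, 0, 1)]   # z = (x1, y1, y2)
ca, cb = (1, 0, 4), (1, 4, 0)
print(ccg(Z, 1, ca, cb))
```

On this instance the lower bounds of the three iterations are 0, 1 and 3, and the method stops with value 3 for x1 = 1 after generating three scenarios. The growth of the scenario list mirrors the growth of Problem (8) discussed in the text.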

The uncapacitated single-allocation hub location problem with uncertain demands
In this section the oracle-based branch and bound algorithm is exemplarily applied to the single-allocation hub location problem which can be naturally defined as a two-stage problem. Furthermore due to its quadratic objective function it perfectly fits into the non-linear framework.
Hub-location problems address the strategic planning of a transportation network with many sources and sinks. In many applications sending all commodities over direct connections would be too expensive in operation. Instead, some locations are considered to serve as transshipment points and are then called hubs. Thus, strongly consolidated transportation links are established. The bundling of shipments usually outweighs the additional costs of hubs and detours. Important applications of this problem arise in air freight [50], postal and parcel transport services [41], telecommunication networks [52] and public transport networks [59]. The recent surveys of [3,35] provide a comprehensive overview of the various variations and solution approaches of the hub location problem.
The main source of uncertainty in single-allocation hub location problems is demand fluctuation. Thus, it is important to include this uncertainty when deciding hub locations and allocations of the nodes to the hubs. Installing a hub is a long-term decision which lasts for many years or even for several decades. In contrast, the allocations to the hub nodes are mid-to-short-term decisions as they can be changed over time. In [62] the variable allocation variant for single-allocation hub location problems under stochastic demand uncertainty is proposed.
We consider a directed graph $G = (N, A)$, where $N = \{1, 2, \ldots, n\}$ corresponds to the set of nodes that denote the origins, destinations, and possible hub locations, and $A$ is a set of arcs that indicate possible direct links between the different nodes. Let $w_{ij} \ge 0$ be the amount of flow to be transported from node $i$ to node $j$ and $d_{ij}$ the distance between two nodes $i$ and $j$. We denote by $O_i = \sum_{j\in N} w_{ij}$ and $D_i = \sum_{j\in N} w_{ji}$ the total outgoing flow from node $i$ and the total incoming flow to node $i$, respectively. For each $k \in N$, the value $f_k$ represents the fixed set-up cost for locating a hub at node $k$. The cost per unit of flow for each path $i - k - m - j$ from an origin node $i$ to a destination node $j$ passing through hubs $k$ and $m$, respectively, is $\chi d_{ik} + \alpha d_{km} + \delta d_{mj}$, where $\chi$, $\alpha$, and $\delta$ are the nonnegative collection, transfer, and distribution costs, respectively, and $d_{ik}$, $d_{km}$, and $d_{mj}$ are the distances between the given pairs of nodes. Typically $\alpha \le \min\{\chi, \delta\}$ since otherwise using a hub would not be beneficial. Note that if hub nodes are fully interconnected, every path between an origin and a destination node will contain at least one and at most two hubs. The SAHLP consists of selecting a subset of nodes as hubs and assigning the remaining nodes to these hubs such that each spoke node is assigned to exactly one hub with the objective of minimizing the overall costs of the network.
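The cost structure described above can be made concrete by evaluating a given hub selection and allocation path by path. The sketch below is an illustration only: nodes are indexed from 0, `alloc[i]` is the hub assigned to node i, and the parameter names `collect`, `transfer` and `distribute` stand for the collection, transfer and distribution cost factors. The data are made up.

```python
def sahlp_cost(w, d, f, hubs, alloc, collect, transfer, distribute):
    """Set-up cost plus routing cost of every flow w[i][j] along i - k - m - j."""
    n = len(w)
    if any(alloc[i] not in hubs for i in range(n)):
        raise ValueError("every node must be allocated to an open hub")
    setup = sum(f[k] for k in hubs)
    routing = 0
    for i in range(n):
        for j in range(n):
            k, m = alloc[i], alloc[j]          # hubs of origin and destination
            unit = collect * d[i][k] + transfer * d[k][m] + distribute * d[m][j]
            routing += w[i][j] * unit
    return setup + routing

def total_flows(w):
    """Total outgoing flow O_i and total incoming flow D_i of every node."""
    n = len(w)
    O = [sum(w[i]) for i in range(n)]
    D = [sum(w[j][i] for j in range(n)) for i in range(n)]
    return O, D

# Made-up data for three nodes.
w = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]          # flows
d = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]          # distances
f = [10, 10, 10]                               # set-up costs
one_hub = sahlp_cost(w, d, f, {1}, [1, 1, 1], 2, 1, 2)
two_hubs = sahlp_cost(w, d, f, {0, 1}, [0, 1, 1], 2, 1, 2)
print(total_flows(w), one_hub, two_hubs)
```

With these numbers, opening hubs at nodes 0 and 1 costs 44 in total, slightly less than the 46 obtained with a single hub at node 1, since the cheaper inter-hub transfer offsets the second set-up cost.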
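To formulate the SAHLP, we follow the first formulation of this problem introduced by O'Kelly [60]. Two types of decision variables are introduced. First, the variables
\[ x_k = \begin{cases} 1 & \text{if node } k \text{ is a hub node} \\ 0 & \text{otherwise} \end{cases} \]
indicate whether a node is used as a hub in the transportation network. Second, the variables
\[ y_{ik} = \begin{cases} 1 & \text{if node } i \text{ is allocated to a hub located at node } k \\ 0 & \text{otherwise} \end{cases} \]
show how the nodes are allocated to the hub nodes. SAHLP can then be formulated as the following binary quadratic program:
\begin{align}
\min\ & \sum_{k\in N} f_k x_k + \sum_{i\in N}\sum_{k\in N} d_{ik} (\chi O_i + \delta D_i)\, y_{ik} + \sum_{i\in N}\sum_{k\in N}\sum_{j\in N}\sum_{m\in N} \alpha\, d_{km} w_{ij}\, y_{ik} y_{jm} \notag \\
\text{s.t.}\ & \sum_{k\in N} y_{ik} = 1 && \forall i \in N \tag{11} \\
& y_{ik} \le x_k && \forall i, k \in N \tag{12} \\
& x_k,\ y_{ik} \in \{0,1\} && \forall i, k \in N \tag{13}
\end{align}
The objective is to minimize the total costs of the network which include the costs of setting up the hubs, the costs of collection and distribution of items between the spoke nodes and the hubs, and the costs of transfer between the hubs. Constraints (11) indicate that each node $i$ is allocated to precisely one hub (i.e. single allocation) while Constraints (12) enforce that node $i$ is allocated to a node $k$ only if $k$ is selected as a hub node. The binary conditions are enforced by Constraints (13).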
In order to solve SAHLP, many solution methods have been proposed in the literature. The classical approach to obtain an exact solution is to linearize the quadratic objective function. In [41,66] two mixed-integer linear programming (MILP) formulations for the problem have been proposed which are based on a path and a flow representation, respectively. The path-based formulation in [66] has $O(|N|^4)$ variables and $O(|N|^3)$ constraints and its linear programming (LP) relaxation was shown to provide tight lower bounds. However, due to the large number of variables and constraints, the path-based formulation can only be solved for instances of relatively small sizes. Alternatively, the flow-based formulation of [41] uses only $O(|N|^3)$ variables and $O(|N|^2)$ constraints to linearize the problem. To formulate the flow-based SAHLP model (SAHLP-flow), new variables $z_{ikm}$ are defined as the total amount of flow originating at node $i$ and routed via hubs located at nodes $k$ then $m$, respectively. SAHLP-flow is formulated as
\begin{align}
\min\ & \sum_{k\in N} f_k x_k + \sum_{i\in N}\sum_{k\in N} d_{ik} (\chi O_i + \delta D_i)\, y_{ik} + \sum_{i\in N}\sum_{k\in N}\sum_{m\in N} \alpha\, d_{km} z_{ikm} \notag \\
\text{s.t.}\ & (11), (12), (13) \notag \\
& \sum_{m\in N} z_{ikm} - \sum_{m\in N} z_{imk} = O_i y_{ik} - \sum_{j\in N} w_{ij} y_{jk} && \forall i, k \in N \tag{14} \\
& \sum_{m\in N,\, m\ne k} z_{ikm} \le O_i y_{ik} && \forall i, k \in N \tag{15} \\
& z_{ikm} \ge 0 && \forall i, k, m \in N \tag{16}
\end{align}
Similar to SAHLP, the objective function minimizes the hub setup costs, the costs of collection and distribution, and the inter-hub transfer costs. Besides Constraints (11), (12), (13) which are also used in SAHLP, Constraints (14) are flow balance constraints while Constraints (15) ensure that a flow is possible from spoke $i$ to hub $k$ only if node $i$ is allocated to hub $k$; see [38]. Finally, Constraints (16) indicate the non-negativity restriction on the variables $z$.
The presented flow-based formulation is typically regarded as the most effective linearized formulation for obtaining exact solutions of the single-allocation hub location problem. In our computations we use this simple solution method to solve Step 4 in Algorithm 1. Note that although the uncertain parameters $w_{ij}$ appear in the constraints of the flow-based formulation, we can use this formulation as an oracle in our algorithm, while other methods which require linear programming formulations without uncertainty in the constraints cannot make use of it.
The SAHLP splits up naturally into first- and second-stage problems, as the decision variables in the SAHLP are subject to different planning horizons as discussed above. Therefore, the two-stage robust SAHLP can be modeled as Problem (SAHLP-2RP), in which the hub variables are the first-stage decisions and the allocation variables, subject to Constraints (11), (12), (13), are the second-stage decisions. We assume that $U \subset \mathbb{R}^{n^2}_+$ is a convex uncertainty set. Note that this classical formulation is a quadratic two-stage robust problem. To solve Problem (SAHLP-2RP) we use the branch and bound procedure described in Sect. 2. To this end lower bounds can be calculated by Algorithm 1, implementing the flow linearization SAHLP-flow in CPLEX [49] to solve the oracle in Step 4. The variable fixations in each node of the branch and bound tree can be added as constraints to the SAHLP-flow formulation.

Computational results
In this section we apply the branch and bound method derived in Sect. 2.1 and the CCG method presented in Sect. 2.2 to the SAHLP. Both algorithms were implemented in C++. For the branch and bound procedure we calculate the lower and upper bounds by Algorithm 1 as discussed in the previous sections. The dual solution x is calculated as presented in (6). The branching is performed on the variable $x_i$ whose value is closest to 0.5. A feasible solution is calculated by rounding the entries of x to the closest integer value. Note that by this rounding procedure we always obtain a feasible first-stage solution for the SAHLP since we do not have restrictions on the first-stage variables. For the selection of the next branch and bound node to be processed we use the best-first strategy, i.e. the node with the smallest dual bound is processed next.
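The three node-handling rules described above (branching on the entry closest to 0.5, rounding to obtain a first-stage solution, and best-first node selection) can be illustrated by the following minimal Python sketch. It is not the C++ implementation used for the experiments; the function and class names, the set `fixed` of already fixed indices and the tie-breaking at exactly 0.5 are assumptions made for illustration.

```python
import heapq


def branching_index(x_dual, fixed):
    """Return the index of the free variable whose value is closest to 0.5.

    x_dual: list of floats in [0, 1] (the dual solution of the node).
    fixed:  set of indices already fixed in the current node (hypothetical).
    Returns None if every variable is fixed.
    """
    free = [i for i in range(len(x_dual)) if i not in fixed]
    if not free:
        return None
    return min(free, key=lambda i: abs(x_dual[i] - 0.5))


def round_to_binary(x_dual):
    """Round every entry to the closest integer (0.5 is rounded up here)."""
    return [1 if value >= 0.5 else 0 for value in x_dual]


class BestFirstQueue:
    """Open nodes ordered by dual bound; the smallest bound is popped first."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so that nodes are never compared

    def push(self, dual_bound, node):
        heapq.heappush(self._heap, (dual_bound, self._counter, node))
        self._counter += 1

    def pop(self):
        dual_bound, _, node = heapq.heappop(self._heap)
        return dual_bound, node

    def __len__(self):
        return len(self._heap)
```

For the SAHLP the rounded vector can be used directly, since the first-stage variables are unrestricted; for problems with first-stage constraints a feasibility check would have to follow the rounding step.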
For the CCG algorithm we implemented Problem (8) in CPLEX 12.8 while Problem (9) is solved by Algorithm 1. In Algorithm 1 the dual problem in Step 3 is solved by CPLEX 12.8 [49]. As deterministic oracle in Step 4 we use the flow linearization SAHLP-flow presented in Sect. 3.1, which was also implemented in CPLEX 12.8. After termination of Algorithm 1 we delete all solutions z from the calculated set Z′ which have a non-zero slack in the dual problem in Step 3, i.e. for which $f(z, c^*)$ is strictly larger than the optimal value of the dual problem in the last iteration of Algorithm 1. By dualizing the dual problem in Step 3 it can be shown that the optimal value does not change when all calculated solutions with non-zero slack are discarded.
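The clean-up step at the end of Algorithm 1 amounts to a simple filter. The following Python sketch shows the idea under stated assumptions: `f` is any callable evaluating the objective of a solution in a scenario, `c_star` and `opt_value` are the optimal scenario and the optimal value of the dual problem in the last iteration, and the numerical tolerance `tol` is a hypothetical parameter that is not mentioned in the text.

```python
def drop_slack_solutions(solutions, f, c_star, opt_value, tol=1e-9):
    """Keep only the solutions whose constraint is tight in the dual problem.

    A solution z has non-zero slack if f(z, c_star) > opt_value; such
    solutions are removed. The tolerance guards against rounding errors.
    """
    return [z for z in solutions if f(z, c_star) <= opt_value + tol]


if __name__ == "__main__":
    # Toy data: a linear objective and three candidate solutions.
    def linear_cost(z, c):
        return sum(z_i * c_i for z_i, c_i in zip(z, c))

    candidates = [(1, 0), (0, 1), (1, 1)]
    scenario = (2.0, 3.0)
    # The dual optimal value equals the smallest objective value over Z'.
    value = min(linear_cost(z, scenario) for z in candidates)
    print(drop_slack_solutions(candidates, linear_cost, scenario, value))
```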

Generation of random instances
We generated random instances as follows: As a basis for our instances we use a selection of instances of the AP and the CAB datasets, which have been studied intensively in the hub location literature. The AP instances are based on the mail flows of Australia Post and were introduced in [41]. The CAB instances contain airline passenger interactions between 25 major cities in the United States of America and were first studied in [60]. Both datasets can be found in [8]. Since there is only one CAB instance available, we introduce three additional instances (cab1 to cab3) by varying the demand values as follows: For each node pair i, j ∈ N, the demand values are drawn randomly from the interval $[0.01\,\bar w_{ij}, 10\,\bar w_{ij}]$, where $\bar w_{ij}$ is the demand value of the original CAB instance.

The number of locations n together with the pairwise distances $d_{ij}$ is given by the instance data. The set-up costs for hub locations are also given by the instance data in case of the AP instances. According to [4], the set-up costs at node k are set to $15 \log(O_k)$ for the CAB instances. The collection, transfer and distribution costs are set to $\chi = 3$, $\alpha = 0.75$ and $\delta = 2$ for the AP instances, while for the CAB instances $\chi = 1$, $\delta = 1$ and $\alpha$ is varied in $\{0.2, 1\}$.

For each instance and each $\Gamma \in \{0.02n^2, 0.1n^2\}$, rounded down if fractional, we generate 10 random budgeted uncertainty sets U. Here $\bar w$ are the flows given by the AP or CAB instances, respectively, while $\hat w_{ij}$ is chosen randomly in $[0, \bar w_{ij}]$ for each i, j ∈ N, i.e. the change in demand can be at most 100% of the given mean $\bar w_{ij}$.
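A standard form of a budgeted uncertainty set that agrees with this description (nominal flows, maximal deviations and a budget bounding the total deviation) is sketched below. It is an assumption and not necessarily the exact definition used for the experiments; in particular the name $\eta_{ij}$ for the scaling factors is introduced here only for illustration.

```latex
\[
U = \Bigl\{\, w \in \mathbb{R}^{n^2} \;:\;
      w_{ij} = \bar w_{ij} + \eta_{ij}\, \hat w_{ij},\quad
      \sum_{i, j \in N} \eta_{ij} \le \Gamma,\quad
      0 \le \eta_{ij} \le 1 \ \ \forall\, i, j \in N \,\Bigr\}
\]
```

With $\hat w_{ij} \in [0, \bar w_{ij}]$ every flow in such a set lies between $\bar w_{ij}$ and $2\,\bar w_{ij}$, and the budget $\Gamma$ limits how many flows can attain their maximal deviation simultaneously.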

Analysis of results
The results for the branch and bound procedure are presented in Tables 1 and 2. Each row shows the average over all 10 random instances of the following values from left to right: the instance name; the number of locations n for the AP instances; the value $\Gamma$ of the budgeted uncertainty set U; the value of $\alpha$ for the CAB instances; the gap $\Delta_{\mathrm{det}}$ in %, i.e. the percentage difference between the optimal value of Problem (2RP) and the deterministic problem with weights $\bar w$; the total solution time t in seconds; the number of nodes solved in the branch and bound tree; the percentage difference $\Delta_{\mathrm{root}}$ of the upper bound and the lower bound calculated for the root problem of the branch and bound tree; the total number of oracle calls; the average number of iterations $i_{lb}$ of Algorithm 1 to calculate the lower bounds; the average number of iterations $i_{ub}$ of Algorithm 1 to calculate the upper bounds; the number of solutions returned by the branch and bound method or the number of iterations of the CCG, respectively; the average percentage difference $\Delta$ (over 10 random scenarios in U) between the best solution in Z′ and the deterministic optimal solution in each scenario.

More precisely, to obtain the value $\Delta$ we generate 10 random scenarios in U by the following procedure: We first create $n^2$ uniformly distributed random numbers $s_i$ in $[0, \Gamma]$ and define $s_0 := 0$. Assume the numbers are given in increasing order. We then define one deviation factor per flow as the consecutive difference $s_i - s_{i-1}$. If one of these factors is larger than 1, we start the procedure again. The random scenario w is then obtained by adding to each nominal flow $\bar w_{ij}$ the deviation $\hat w_{ij}$ scaled by the corresponding factor. After generating 10 random scenarios $w^1, \dots, w^{10}$, in each scenario we compare the costs of the best solution in Z′ to the costs of the optimal solution in the scenario, i.e. for the optimal first-stage solution x we define $\Delta_l$ as the percentage difference between these two costs in scenario $w^l$ and set $\Delta$ to the average of all $\Delta_l$. For the CCG algorithm we define Z′ as the set of solutions calculated in the last iteration by Problem (8). Note that since the set of optimal second-stage solutions in Z′ is not unique and in particular may not be the same for both algorithms, the value of $\Delta$ can be different for the branch and bound procedure and for the CCG.
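The scenario sampling procedure used for the value Δ can be sketched in Python as follows. This is an illustrative reading of the description, not the code used for the experiments: the function names, the nested-list representation of the flow matrices, the retry limit `max_tries` and the order in which the factors are assigned to the node pairs are assumptions.

```python
import random


def sample_scaling_factors(count, gamma, rng, max_tries=10_000):
    """Draw `count` factors in [0, 1] whose sum is at most `gamma`.

    Uniform numbers in [0, gamma] are sorted and their consecutive
    differences are used as factors; the draw is repeated whenever a
    factor exceeds 1.
    """
    for _ in range(max_tries):
        s = sorted(rng.uniform(0.0, gamma) for _ in range(count))
        factors = [s[0]] + [s[i] - s[i - 1] for i in range(1, count)]
        if all(factor <= 1.0 for factor in factors):
            return factors
    raise RuntimeError("no admissible sample found within max_tries")


def sample_scenario(w_bar, w_hat, gamma, rng):
    """Return a random flow matrix w = w_bar + factor * w_hat (entrywise)."""
    n = len(w_bar)
    factors = sample_scaling_factors(n * n, gamma, rng)
    return [
        [w_bar[i][j] + factors[i * n + j] * w_hat[i][j] for j in range(n)]
        for i in range(n)
    ]


def percentage_gap(cost_of_policy, cost_of_optimum):
    """Percentage difference between a policy cost and the optimal cost."""
    return 100.0 * (cost_of_policy - cost_of_optimum) / cost_of_optimum
```

Averaging `percentage_gap` over the sampled scenarios then yields the reported value; evaluating the two costs requires the deterministic oracle and is therefore not part of this sketch.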
The results for the AP instances are shown in Table 1. The gap $\Delta_{\mathrm{det}}$ increases with $\Gamma$ and with the dimension. The number of calculated nodes in the branch and bound tree is in most cases close to 1 and seems to remain constant with increasing dimension. Nevertheless the runtime increases with the dimension and with $\Gamma$, which is mainly due to the increasing runtime of Algorithm 1. Here with higher dimension the calculation time of the deterministic problem increases, while with increasing $\Gamma$ the number of iterations of Algorithm 1 increases, which was already observed in [30,42]. Another positive observation is that the root gap is very small in general, mostly 0 and never larger than 34%. The number of iterations of Algorithm 1 is larger for the calculations of the lower bound than for the upper bound, which is because not all hub variables are fixed in the former case. Nevertheless the number of iterations is very low and never larger than 2.2 for the lower bound and 1.2 for the upper bound. This leads to a very small number of policies calculated by Algorithm 1 and to a very small number of oracle calls in total. Finally the values of $\Delta$ indicate that the returned second-stage solutions are optimal in most of the scenarios, as $\Delta$ is 0 for most of the instances. Note that for larger dimensions we did not determine the $\Delta$ values due to the time-consuming computations.
The computations for the CAB instances are presented in Table 2. The results look similar to the results for the AP instances. The gap $\Delta_{\mathrm{det}}$ is larger for larger values of $\Gamma$ and $\alpha$. The root gap is again very small for most of the instances and never larger than 30%. The number of nodes in the branch and bound tree is very low, but in general higher than that for the AP instances. Nevertheless it is never larger than 8 on average. In contrast to the AP instances the total runtime does not increase much with increasing $\Gamma$. Instead the runtime increases significantly with increasing $\alpha$. The reason for this is the larger number of iterations performed by Algorithm 1 to calculate the lower and the upper bounds. Comparing the calculated solutions to the optimal values on random scenarios, the percentage difference $\Delta$ is again very close to 0 for all of the instances.
All results for the CCG algorithm are presented in Tables 3 and 4. Each row shows the average over all 10 random instances of the following values from left to right: the instance name; the number of locations n for the AP instances; the value $\Gamma$ of the budgeted uncertainty set U; the value of $\alpha$ for the CAB instances; the total solution time t in seconds; the average time $t_{lb}$ in seconds to solve the lower bound Problem (8); the average time $t_{ub}$ in seconds to solve the upper bound Problem (9); the number of solutions l calculated by Problem (8), which is equal to the number of iterations of the CCG algorithm; the average percentage difference $\Delta$ (over 10 random scenarios $w \in U$) between the best of the solutions calculated in the last iteration by Problem (8) and the deterministic optimal solution in each scenario w; see the definition of $\Delta$ above.

The results of the CCG algorithm are less convincing. We could solve AP instances with up to 50 locations in reasonable time, while with the branch and bound procedure we managed to solve instances with 90 locations. Furthermore the runtime is at least three times as large as for the branch and bound method for most of the instances and even larger for growing dimension. The same effect holds for the CAB instances. Here the runtime is much higher for the instances with $\alpha = 1$. The large runtime of the CCG is mainly caused by the lower bound Problem (8). The calculations of the upper bound, solved by Algorithm 1, are less time consuming, at most 6 s on average. The number of calculated solutions, i.e. the number of iterations, is slightly larger than that for the branch and bound procedure but still very small, never larger than 5. A positive effect is that the performance $\Delta$ of the calculated solutions on random scenarios is very close to 0 for all instances.
In Fig. 1 we compare the runtimes in seconds of both algorithms. The results show that the runtime of the CCG method increases rapidly for more than 25 locations and is always much larger than the runtime of the branch and bound method. For the larger value of $\Gamma$ the runtime of the CCG method explodes if n is larger than 40.
Analysis of results for hard instances
For the realistic instances calculated above the number of nodes in the branch and bound tree, the number of iterations of the CCG as well as the number of iterations of Algorithm 1 are very low. The same effect occurs for most of the randomly generated instances we tested. To test the boundaries of our algorithm we generated further instances in the same way as the instances above, with the only difference that the values $\hat w_{ij}$ are randomly drawn in $[0, 10\,\bar w_{ij}]$, i.e. the uncertainty sets are much larger. Furthermore for the AP instances we varied $\alpha \in \{0.75, 1.5\}$. The results for the branch and bound procedure are presented in Table 5. For the CCG algorithm we could not even solve instances with 25 locations in reasonable time.

The results in Table 5 show that the number of nodes in the branch and bound tree and the number of iterations of Algorithm 1 are larger than those for the realistic instances above but still never get larger than 33 and 12, respectively. Both values are larger for the CAB instances. The number of nodes decreases with increasing dimension and with increasing $\alpha$. The same holds for the root gap, which is lower than that for the realistic instances for most of the instances. Clearly the gap $\Delta_{\mathrm{det}}$ is much larger than for the smaller uncertainty sets. Similar to the results above the number of iterations for the calculations of the lower and the upper bounds and therefore the number of total oracle calls seem to be independent of the dimension. The same holds for the number of calculated second-stage solutions. The performance of these solutions over random scenarios is worse than for the realistic instances above, but $\Delta$ is still very small and never larger than 0.3%. For the CAB instances it is larger for $\alpha = 1$. For the CCG algorithm the results are not very convincing. Even for instances with 20 locations finding an optimal solution took more than 16 hours on average for $\alpha = 0.75$. Interestingly here the instances with smaller $\alpha$ were harder to solve (Table 6).
In Fig. 2 we present the development of several problem parameters over $\alpha$ for the 20LL instance. All values are the average over 10 random uncertainty sets with random deviations $\hat w_{ij} \in [0, 10\,\bar w_{ij}]$. Cost parameters are defined as above by $\chi = 3$ and $\delta = 2$. Figure 2 shows that the number of nodes in the branch and bound tree rapidly decreases with increasing $\alpha$. Furthermore the number of iterations performed by Algorithm 1 to calculate the upper bounds and the number of returned policies in Z′ increase until $\alpha = 2$ and afterwards slowly decrease. The number of iterations performed by Algorithm 1 to calculate the lower bounds is nearly constant and slightly decreases. The root gap of the branch and bound procedure decreases with increasing $\alpha$ and tends to 0. In contrast to this the performance of the returned policies in Z′, indicated by $\Delta$, seems to get worse with increasing $\alpha$, and seems to be constant for $\alpha \ge 2$. Nevertheless all $\Delta$ values are very small and remain close to 0.2% for $\alpha \ge 2$.
In summary the results show that the number of nodes of the branch and bound procedure and the number of iterations of Algorithm 1 are very low for the realistic instances of the SAHLP. Hence we could solve instances with up to 90 locations in less than 4 h. Furthermore the number of calculated policies |Z′| is very low for the hub location problem but they perform very well on random scenarios. For the larger uncertainty sets, the number of nodes of the branch and bound procedure and the number of iterations of Algorithm 1 are larger but still very low compared to the dimension of the problem. Furthermore the latter values seem to be nearly constant with increasing dimension. The runtime and the number of iterations of Algorithm 1 increase with increasing $\alpha$ while the number of nodes of the branch and bound tree decreases.
An example of an optimal solution of a random instance with 20 locations and $\hat w_{ij}$ randomly drawn in $[0, 10\,\bar w_{ij}]$ is shown in Fig. 3. The figure shows the optimal solution of the nominal scenario $\bar w$ and the three returned solutions in Z′. The number of hubs is larger in the two-stage robust solution than in the deterministic solution, since for flexible re-allocation after a scenario occurred it can be beneficial to build further hubs in advance. Furthermore the figure indicates that a hub which is used by many locations in the deterministic solution may not be used by the second-stage reactions of the two-stage solution.

The capital budgeting problem
In this section the oracle-based branch and bound algorithm and the CCG algorithm are exemplarily applied to the two-stage robust capital budgeting problem studied in [5] which can be naturally defined as a two-stage problem.
The capital budgeting problem (CB) is an investment planning problem, where a subset of n projects has to be selected. Each project i ∈ [n] has costs $c_i$ and an uncertain profit $\tilde p_i$ which depends on a set of m risk factors $\xi \in U \subset \mathbb{R}^m$. The profits are given by $\tilde p_i(\xi) = (1 + \tfrac{1}{2} Q_i^\top \xi)\, \bar p_i$, where $\bar p_i$ are the nominal profits and $Q_i$ is the i-th row of the factor loading matrix. For each project the company can decide if it wants to invest in the project here-and-now or if it wants to wait until the risk factors are known. If an investment is postponed to the second stage the profit generated by the project is $f \tilde p_i$ where $0 \le f < 1$. The costs of a project are the same in the first and the second stage. The company has a given budget B for investing in projects and can additionally take out a loan of volume $C_1$ with costs $\lambda$ in the first stage and a loan of volume $C_2$ with costs $\lambda\mu$ in the second stage where $\mu > 1$. The aim is to maximize the worst-case profit. This problem can be formulated as Problem (17), where $X = \{(x, x_0) \in \{0, 1\}^{n+1} \mid c^\top x \le B + C_1 x_0\}$ is the set of feasible first-stage solutions. For the set of feasible second-stage solutions and more details see [5].
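The profit model and the first-stage budget constraint can be illustrated with a small Python sketch. All function and parameter names are hypothetical; the loan costs and the second-stage budget constraint are deliberately left out, so `gross_profit` only evaluates the profit part of the objective for a given scenario of the risk factors.

```python
def scenario_profits(nominal_profits, factor_loadings, risk_factors):
    """Profit of every project: (1 + 0.5 * Q_i^T xi) * nominal profit.

    factor_loadings[i] is the i-th row Q_i of the factor loading matrix.
    """
    return [
        (1.0 + 0.5 * sum(q * xi for q, xi in zip(row, risk_factors))) * p
        for row, p in zip(factor_loadings, nominal_profits)
    ]


def first_stage_feasible(x, x0, costs, budget, loan_volume_1):
    """Check c^T x <= B + C_1 * x0 for binary x and binary loan decision x0."""
    return sum(c * x_i for c, x_i in zip(costs, x)) <= budget + loan_volume_1 * x0


def gross_profit(x, y, profits, f):
    """Profit of first-stage investments x plus postponed investments y.

    Postponed projects only generate the fraction f (0 <= f < 1) of the profit.
    """
    return sum(p * (x_i + f * y_i) for p, x_i, y_i in zip(profits, x, y))


if __name__ == "__main__":
    profits = scenario_profits([10.0, 20.0], [[1.0, 0.0], [0.0, -1.0]], [0.5, 0.5])
    print(profits)
    print(first_stage_feasible([1, 1], 0, [3.0, 4.0], 6.0, 2.0))
    print(gross_profit([1, 0], [0, 1], profits, 0.8))
```

Evaluating the worst case over all risk factors in U is the task of the adversarial problem and is not attempted here.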

Computational results
In this section we apply the branch and bound method derived in Sect. 2 and the CCG method presented in Sect. 2.2 to the capital budgeting problem. The implementation of both algorithms is the same as in Sect. 3.1. As deterministic oracle in Step 4 of Algorithm 1 we implemented the deterministic version of the integer programming formulation of Problem (17) in CPLEX 12.8. Note that since we consider a maximization problem here, the terms upper bound and lower bound are swapped.
We compare both variants of calculating a dual solution x presented in (6) and (7), which we denote by DualSol-Avg and DualSol-Opt, respectively. The branching is performed on the variable whose value is closest to 0.5. A feasible first-stage solution is obtained by rounding the entries of x to the closest integer value. If this solution is not feasible we choose the first solution which was returned by Algorithm 1 after the calculation of the upper bound.

Analysis of results
The results for the branch and bound procedure are presented in Tables 7 and 8. Each row in Table 7 shows the average over all 20 instances of the following values from left to right: the number of projects n; the number of risk factors m; the total number of nodes solved in the branch and bound tree; the total number of oracle calls; the total time t in seconds to solve the instance to optimality; the percentage of instances which could be solved to optimality within the time limit of 7200 s.
Each row in Table 8 shows the average over all 20 instances of the following values from left to right: the number of projects n; the number of risk factors m; the gap $\Delta_{\mathrm{det}}$ in %; the root gap $\Delta_{\mathrm{root}}$ in %; the average number of iterations $i_{ub}$ of Algorithm 1 to calculate the upper bounds; the average number of iterations $i_{lb}$ of Algorithm 1 to calculate the lower bounds; the number of solutions |Z′| Algorithm 1 returned for the optimal first-stage solution x; the average percentage difference $\Delta$ (over 10 random scenarios in U) between the best solution in Z′ and the deterministic optimal solution in each scenario; see Sect. 3.1 for a precise definition. All values are presented for both variants, DualSol-Avg and DualSol-Opt. The bold-faced values indicate which of the two variants is better.

The results in Table 7 indicate that the DualSol-Opt variant performs much better on most of the instances. The larger computational effort which is made to calculate the optimal dual solution does not have an impact on the total runtime since the number of processed nodes in the branch and bound tree is much smaller. For both variants the number of nodes processed in the branch and bound tree and the number of oracle calls are significantly larger than for the SAHLP; compare to Sect. 3.1. Both values and therefore the runtime increase with increasing m. Interestingly the instances with dimension n = 30 and n = 40 seem to be the hardest to solve. The total runtime for the instances with m = 8 is very large. Nevertheless for most of the configurations all instances could be solved within the time limit.
In contrast to the latter results, the values in Table 8 are not much larger than for the SAHLP. The root gap is better for the DualSol-Opt variant for most of the instances but is very small for both methods and at most 8%. The number of iterations performed to calculate the upper and the lower bounds and the number of calculated solutions are slightly larger than for the SAHLP but still very small. All values seem to be independent of the dimension and the number of risk factors. The gap $\Delta$ is slightly larger than for the SAHLP but still at most 1%.
The results for the CCG are presented in Table 9. Each row shows the average over all 20 instances of the following values from left to right: the number of projects n; the number of risk factors m; the percentage of instances which could be solved to optimality within the time limit of 7200 s; the optimality gap of the CCG after 7200 s; the total solution time t in seconds (exceeding the time limit is counted as 7200 s); the average solution time $t_{ub}$ to calculate the upper bounds; the average solution time $t_{lb}$ to calculate the lower bounds; the total number of iterations; the average percentage difference $\Delta$ (over 10 random scenarios in U) between the best solution in Z′ and the deterministic optimal solution in each scenario. Here Z′ is the set of solutions calculated by Algorithm 1 in the last iteration. Note that we stopped the calculations for each instance after 7200 s, since for several instances the memory used by CPLEX was too large. Therefore the runtimes cannot be compared to the runtimes of the branch and bound method.
As for the SAHLP the results of the CCG algorithm are less convincing. The number of instances solved to optimality within the time limit is much smaller than for the branch and bound method, sometimes smaller than 55%. Nevertheless the optimality gap after the time limit is very small, at most 3.6%. The number of iterations is small for most of the instances and seems to be independent of the dimension. It increases with increasing m. As for the SAHLP most of the runtime is used to calculate the upper bound problem. The gap $\Delta$ is smaller than 1% for most of the instances, as is the case for the branch and bound method.
To summarize, for the two-stage robust capital budgeting problem the number of nodes processed in the branch and bound tree and the number of oracle calls are significantly larger than for the SAHLP. Nevertheless, since the deterministic problem can be solved much faster, the total runtime is not larger for the instances with small m. Although most of the instances could be solved within the time limit by the branch and bound method, the runtime for instances with m = 8 can be very large. Still, the branch and bound method solves significantly more instances to optimality than the CCG, even though the optimality gap of the CCG after the time limit is very small.

Conclusion
In this paper we derive a branch and bound procedure to solve robust binary two-stage problems for a wide class of objective functions. We show that the oracle-based column generation algorithm presented in [30] can be adapted to calculate lower bounds which can be used in a classical branch and bound procedure. The whole procedure can be implemented for any algorithm solving the underlying deterministic problem. Furthermore we apply the column-and-constraint generation algorithm studied in [73] to our problem and show that again the oracle-based algorithm in [30] can be used to solve one step of the procedure. We test both algorithms on classical benchmark instances of the single-allocation hub location problem and on random instances of the capital budgeting problem. We show that the number of nodes in the branch and bound tree, the number of iterations of the CCG algorithm as well as the number of iterations of the column generation algorithm are very low for the SAHLP, while the number of branch and bound nodes increases significantly for the capital budgeting problem. Nevertheless our branch and bound procedure is much faster than the CCG algorithm and can solve larger instances in reasonable time. Furthermore our computational results indicate that for both algorithms the precalculated second-stage solutions perform very well on random scenarios.

Fig. 1 Development of the runtime in seconds of both algorithms

Fig. 2 Development of the parameters of the branch and bound procedure over $\alpha$ for the 20LL instance with random deviations $\hat w_{ij} \in [0, 10\,\bar w_{ij}]$, $\chi = 3$ and $\delta = 2$. The graphs in the right plot are presented in logarithmic scale

Fig. 3 The optimal solution of the nominal scenario $\bar w$ (top left) and the optimal two-stage robust solution presented by all 3 solutions in Z′ returned by Algorithm 1 in the optimal branch and bound node for a 20LL instance with $\alpha = 1.5$ and $\Gamma = 40$

Table 1
Results of the branch and bound procedure for AP instances

Table 2
Results of the branch and bound procedure for CAB instances

Table 3
Results of the CCG algorithm for AP instances

Table 5
Results of the branch and bound procedure for instances with large deviations and $\Gamma = 0.1n^2$

Table 7
Results of the branch and bound procedure for both variants. The last row shows the average over all values of the corresponding column

Table 8
Results of the branch and bound procedure for both variants. The last row shows the average over all values of the corresponding column

Table 9
Results of the CCG algorithm. The last row shows the average over all values of the corresponding column. Columns: n; m; Opt (%); Gap (%); t (s); $t_{ub}$ (s); $t_{lb}$ (s); #Iter.; Δ (%)