1 Introduction

In practice, most decision-making problems must be solved in view of uncertain parameters. In the operations research domain, two fundamental frameworks exist to inform such decisions. The stochastic optimization framework captures uncertainty by using probability distributions and aims to optimize an expected objective. Initially introduced in the seminal work of Dantzig [26], the framework has been intensively studied, and scholars have used it to solve a vast variety of problems including production planning [6, 55], relief networks [49], expansion planning [62, 64], and newsvendor problems [52]. While stochastic optimization performs well on many problem classes, finding tractable formulations is often challenging. Additionally, the data needed to approximate probability distributions might not always be available, and gathering data is often a costly and time-consuming process. The robust optimization (RO) framework overcomes many of these shortcomings by capturing uncertainty through distribution-free uncertainty sets instead of probability distributions. By choosing well-representable uncertainty sets, RO often offers computationally tractable formulations that scale well on a variety of optimization problems. In recent years, these favorable properties have led to a steep increase in research interest, see, e.g., [8, 11, 20, 36, 63]. The flexibility of uncertainty sets and the scalability of solution methods also make RO very attractive for practical purposes, and it has been widely applied to many operations management problems [45].

In traditional RO, all decisions must be made before the uncertainty realization is revealed. In real-world situations, however, some decisions can often be delayed until after (part of) the uncertainty realization is known. As a consequence, RO may lead to excessively conservative solutions. To remedy this drawback, Ben-Tal et al. [9] introduced the concept of adjustable robust optimization (ARO), where some decisions can be delayed until the uncertainty realization is (partly) known. In general ARO, uncertainty realizations are revealed over multiple stages and decisions can be made after each reveal. A decision made in stage t can thus be modeled as a function of all uncertainties associated with previous stages \(t'\le t\).

While ARO improves decision-making in theory, it is equivalent to RO on some special problem instances, where static decision policies yield optimal adjustable solutions [9, 22]. Using arguments similar to those used for the optimality of static solutions, Marandi and den Hertog [46] identified conditions under which optimal adjustable decisions are independent of some uncertain parameters. In general, however, static policies do not yield optimal solutions in the ARO setting. Elucidating this, Haddad-Sisakht and Ryan [37] identified a collection of sufficient conditions that imply the suboptimality of static policies and a strict improvement of ARO over RO.

In general, even the task of finding optimal adjustable solutions in the special case of two-stage ARO proves to be computationally intractable [9]. Accordingly, recent works have developed many approximation schemes for ARO that often yield good and sometimes even optimal results in practice, see, e.g., [9, 19, 42, 58]. In the special case with only two decision stages, the first-stage decisions are fixed before any uncertain parameters are known and the second-stage decisions use full knowledge of the uncertainty realization. Two-stage ARO already has many applications in practice and has been widely studied in the literature, see, e.g., [14, 16, 18, 22, 23, 39, 40, 59].

Still, many real-world problems exhibit inherently multi-stage characteristics and cannot be modeled by two-stage ARO. Examples include variants of inventory management [12], humanitarian relief [13], and facility location [5]. The transition from two-stage to multi-stage ARO introduces two main challenges. First, multi-stage ARO problems entail nonanticipativity restrictions that disallow decisions from utilizing future information. Second, many approaches that solve the problem by iteratively splitting the uncertainty space, like scenario trees [41] and adaptive partitioning [16, 53], grow exponentially in the number of stages. As a consequence, many results found for two-stage ARO do not readily generalize to multi-stage scenarios. Against this background, we design piecewise affine policies for multi-stage ARO that overcome the previously mentioned challenges and, although the extension is not straightforward, carry many of the best-known approximation bounds for two-stage problems over to the multi-stage setting.

In the remainder of this section, we formally introduce our problem (Sect. 1.1), discuss closely related work (Sect. 1.2), and summarize our main contributions (Sect. 1.3).

1.1 Problem description

In this work we study multi-stage adjustable robust optimization with covering constraints and a positive affine uncertain right hand side. Specifically, we consider the following problem:

$$\begin{aligned} Z_{AR}({\mathcal {U}}) = \min _{{\varvec{x}}(\varvec{\xi })} \max _{\varvec{\xi }\in {\mathcal {U}}} \quad&{\varvec{c}}^\intercal {\varvec{x}}(\varvec{\xi })\\ {{\,\mathrm{\text {s.t.}}\,}}\quad&{\varvec{A}}{\varvec{x}}(\varvec{\xi }) \ge {\varvec{D}}\varvec{\xi }+ {\varvec{d}}\qquad \forall \varvec{\xi }\in {\mathcal {U}}\end{aligned}$$
(1)

with \({\varvec{A}}\in {\mathbb {R}}^{l\times n}\), \({\varvec{c}}\in {\mathbb {R}}^{n}\), \({\varvec{D}}\in {\mathbb {R}}^{l\times m}_+\), \({\varvec{d}}\in {\mathbb {R}}^l_+\), and compact \({\mathcal {U}}\subset {\mathbb {R}}^{m}_+\). Here, \(m\) is the number of uncertain parameters, \(n\) is the number of decisions, and \(l\) is the number of constraints. To model the problem’s T stages, we split the uncertainty vector \(\varvec{\xi }\) into T sub-vectors \(\varvec{\xi }= \left( \varvec{\xi }^1, \dots , \varvec{\xi }^T\right) \) with \(\varvec{\xi }^t\) being the uncertainty vector realized in stage t. In the following, we denote by \({\underline{\varvec{\xi }}}^t:=\left( \varvec{\xi }^1, \dots , \varvec{\xi }^t\right) \) the vector of all uncertainties with known realization in stage t. Similarly, the adjustable decision vector \({\varvec{x}}(\varvec{\xi })\) divides into \({\varvec{x}}(\varvec{\xi }):= \left( {\varvec{x}}^1({\underline{\varvec{\xi }}}^1), \dots , {\varvec{x}}^T({\underline{\varvec{\xi }}}^T)\right) \), where the decision \({\varvec{x}}^t\) made in stage t has to preserve nonanticipativity and may only depend on those uncertainties \({\underline{\varvec{\xi }}}^t\) whose realization is known in stage t. We explicitly allow \(\varvec{\xi }^1\) to be zero-dimensional making the initial decision \({\varvec{x}}^1\) non-adjustable. Finally, we denote by \({\underline{{\varvec{x}}}}^t(\varvec{\xi }):=\left( {\varvec{x}}^1({\underline{\varvec{\xi }}}^1), \dots , {\varvec{x}}^t({\underline{\varvec{\xi }}}^t)\right) \) the vector of all decisions in the first t stages. Figure 1 visualizes the multi-stage decision process with nonanticipativity restrictions.

Fig. 1

Illustration of multi-stage decision making over T stages. In each stage t a fraction \(\varvec{\xi }^t\) of the uncertainty is realized and decisions \({\varvec{x}}^t\) are made. Here, decisions \({\varvec{x}}^t\) may only depend on those uncertainties \({\underline{\varvec{\xi }}}^t\) whose realization is known in stage t

Unless explicitly stated otherwise, the following assumption holds w.l.o.g. throughout the paper.

Assumption 1

\({\mathcal {U}}\subseteq [0,1]^{m}\) is convex, full-dimensional with \({\varvec{e}}_i\in {\mathcal {U}}\) for all \(i\in \{1,\dots , m\}\), and down-monotone, i.e., \(\forall \varvec{\xi }\in {\mathcal {U}}, {\varvec{0}}\le \varvec{\xi }'\le \varvec{\xi }:\varvec{\xi }'\in {\mathcal {U}}\).

Down-monotonicity holds because \({\varvec{D}},{\varvec{d}},\varvec{\xi }\) are all non-negative, and thus constraints become less restrictive for smaller values of \(\varvec{\xi }\). Convexity holds due to the linearity of the problem, and \({\varvec{e}}_i\in {\mathcal {U}}\subseteq [0,1]^m\) holds as \({\mathcal {U}}\) is compact and \({\varvec{D}}\) can be re-scaled appropriately. We note that the non-negativity assumption on the right-hand side does restrict the problem space. As Bertsimas and Goyal [18] point out, this assumption prevents the introduction of uncertain or constant upper bounds. However, upper bounds in other decision variables are still possible as \({\varvec{A}}\) is not restricted, and \({\varvec{D}}, {\varvec{d}}\) can be zero. Overall, Problem (1) covers a wide range of problem classes including network design [50, 61], capacity planning [44, 48, 51], as well as versions of inventory management [12, 60] where capacities are unbounded or subject to the decision maker's choice.

In the context of multi-stage decision making, some works require stagewise uncertainty, i.e., that the uncertainty set \({\mathcal {U}}\) consists of uncertainty sets \({\mathcal {U}}_1,\dots ,{\mathcal {U}}_T\) for each stage, see, e.g., [19, 20, 32]. Like other approaches based on decision rules [17, 43], we do not need these restricting assumptions. However, we show how to utilize the existence of such a structure in Sect. 3.

1.2 Related work

Feige et al. [30] show that already the two-stage version of Problem (1) with \({\varvec{D}}= {\varvec{1}}, {\varvec{d}}={\varvec{0}}\) and \({\varvec{A}}\) being a 0-1-matrix is hard to approximate within a factor better than \(\Omega (\log m)\), even for budgeted uncertainty sets. As finding optimal general policies \({\varvec{x}}\) is thus intractable, a common technique to obtain tractable formulations is to restrict the function space.

In this context, Ben-Tal et al. [9] consider \({\varvec{x}}\) to be affine in \(\varvec{\xi }\). Specifically, they propose \({\varvec{x}}^t\) to be of the form \({{\varvec{x}}^t({\underline{\varvec{\xi }}}^t) = {\varvec{P}}^t {\underline{\varvec{\xi }}}^t + {\varvec{q}}^t}\). Affine policies have been found to deliver good results in practice [1, 10] and are even optimal for some special problems [19, 42, 58]. Further popular decision rules include segregate affine [24, 25], piecewise constant [20], piecewise affine [14, 31], and polynomial [4, 21] policies, as well as combinations of these [54]. For surveys on adjustable policies we refer to Delage and Iancu [28] and Yanıkoğlu et al. [63].
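As a minimal illustration of such an affine decision rule, the following sketch evaluates \({\varvec{x}}^t({\underline{\varvec{\xi }}}^t) = {\varvec{P}}^t {\underline{\varvec{\xi }}}^t + {\varvec{q}}^t\) stage by stage; the dimensions and coefficient values are our own illustrative assumptions, not taken from any of the cited works.

```python
import numpy as np

# Toy setting: T = 3 stages, one uncertain parameter revealed per stage.
T = 3
xi = [np.array([0.2]), np.array([0.5]), np.array([0.1])]

# Affine rule coefficients: P^t has one column per revealed parameter,
# so the stage-t decision depends only on the prefix (xi^1, ..., xi^t).
P = [np.zeros((1, 1)), np.ones((1, 2)), np.ones((1, 3))]
q = [np.array([1.0])] * T

x = []
for t in range(T):
    prefix = np.concatenate(xi[:t + 1])  # nonanticipativity: no future stages
    x.append(P[t] @ prefix + q[t])
print(np.concatenate(x))  # stage decisions x^1, x^2, x^3
```

Evaluating the rule in this prefix-wise fashion makes nonanticipativity explicit: changing \(\varvec{\xi }^3\) cannot alter \({\varvec{x}}^1\) or \({\varvec{x}}^2\).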

A key question that arises when using policies to solve ARO problems is how good the solutions are compared to an optimal unrestricted solution. To answer this question, many approximation schemes for a priori and a posteriori bounds have been proposed. In the context of a posteriori bounds, the focus lies on finding tight upper and lower approximation problems. Hadjiyiannis et al. [38] estimate the suboptimality of affine decision rules using sample scenarios from the uncertainty set. Similar sample lower bounds are used by Bertsimas and Georghiou [17] to bound the performance of piecewise affine policies. Kuhn et al. [43] investigate the optimality of affine policies by using the gap between affine solutions on the primal and the dual of the problem. Georghiou et al. [31] generalize this primal-dual approach to affine policies on lifted uncertainty sets. Building on both of the previous approaches, Georghiou et al. [33] propose a convergent hierarchy of policies that combine affine policies with extreme point scenario samples. Daryalal et al. [27] construct lower bounds by relaxing nonanticipativity and stage-connecting constraints in multi-stage ARO. They then use these lower bounds to construct primal solutions in a rolling horizon manner.

In the context of a priori bounds, most approximation schemes have been proposed for the two-stage version of Problem (1). For general uncertainty sets on the two-stage version of (1), Bertsimas and Goyal [18] show that affine policies yield an \(O(\sqrt{m})\) approximation if \({\varvec{c}}\) and \({\varvec{x}}\) are non-negative. They further construct a set of instances where this bound is tight, showing that no better general bounds for affine policies exist. Using geometric properties of the uncertainty sets, Bertsimas and Bidkhori [15] improve on these bounds for some commonly used sets including budgeted uncertainty, norm balls, and intersections of norm balls. Ben-Tal et al. [14] propose new piecewise affine decision rules for the two-stage problem that improve these bounds even further on some sets. In addition to strong theoretical bounds, this new approach also yields promising numerical results that can be computed orders of magnitude faster than solutions for affine adjustable policies. For budgeted uncertainty sets and some generalizations thereof, Housni and Goyal [40] show that affine policies even yield optimal approximations with an asymptotic bound, i.e., asymptotic behavior of the approximation bound, see, e.g., [56], of \(O\left( \frac{\log m}{\log \log m}\right) \). This bound was shown to be tight by Feige et al. [30] under reasonable complexity assumptions, namely that 3SAT cannot be solved in \(2^{O(\sqrt{m})}\) time on instances of size \(m\). We present an overview of known a priori approximation bounds for some commonly used uncertainty sets on two-stage ARO in Table 4 of “Appendix A”.

To the best of our knowledge, Bertsimas et al. [20] are so far the only ones to provide a priori bounds for multi-stage ARO. They show these bounds for piecewise constant policies using geometric properties of the uncertainty sets. More specifically, they consider multi-stage uncertainty networks, where the uncertainty realization is taken from one of multiple independent uncertainty sets in each stage. While the choice of the set selected in each stage may depend on the sets selected before, the uncertainty sets are otherwise independent. Although this assumption is fairly general, it still leaves uncovered many commonly used sets where uncertainty is dependent over multiple stages, including the widely used hypersphere and budgeted uncertainty sets.

As can be seen, previous work has predominantly focused on providing tighter approximation bounds for two-stage ARO, inevitably raising the question of whether similar bounds also hold for multi-stage ARO. This work contributes towards answering this question by extending many of the currently best-known a priori bounds for two-stage ARO to the multi-stage setting. In doing so, we are, to the best of our knowledge, the first to provide a priori approximation bounds for the multi-stage ARO Problem (1), where uncertainty sets can range over multiple stages. Unless explicitly stated otherwise, we always refer to a priori approximation bounds when discussing approximation bounds in the remainder of this paper.

1.3 Our contributions

With this work, we extend the existing literature in multiple ways, where our main contributions are as follows.

Tractable piecewise affine policies for multi-stage ARO: Motivated by piecewise affine policies for two-stage ARO [14], we present a framework to construct policies that can be used to efficiently find good solutions for the multi-stage ARO Problem (1). Instead of solving the problem directly for uncertainty \({\mathcal {U}}\), we first approximate \({\mathcal {U}}\) by a dominating set \({\hat{{\mathcal {U}}}}\). To do so, we define the concept of nonanticipative multi-stage domination and show that this new definition of domination fulfills similar properties to two-stage domination. Based on this new definition, we then construct dominating sets \({\hat{{\mathcal {U}}}}\) such that solutions on \({\hat{{\mathcal {U}}}}\) can be found efficiently. More specifically, we choose \({\hat{{\mathcal {U}}}}\) to be a polytope for which worst-case solutions can be computed by a linear program (LP) over its vertices. In order to ensure nonanticipativity, which is the main challenge of this construction, we introduce a new set of constraints on the vertices that guarantee the existence of nonanticipative extensions from the vertex solutions to the full set \({\hat{{\mathcal {U}}}}\). Finally, we show how to use the solution on the dominating set \({\hat{{\mathcal {U}}}}\) to construct a valid solution for the original uncertainty set \({\mathcal {U}}\).

Approximation bounds: To the best of our knowledge, we provide the first approximation bounds for the multi-stage Problem (1) with general uncertainty sets. More specifically, we show that our policies yield \(O(\sqrt{m})\) approximations of fully adjustable policies. While this bound is tight for our type of policies in general, we show that better bounds hold for many commonly used uncertainty sets.

While our main contribution is to extend approximation bounds to multi-stage ARO, Problem (1) is also less restrictive than the problems previously discussed in the literature on approximation bounds. In addition to being restricted to two-stage ARO, previous work often assumed \({\varvec{c}}\) and \({\varvec{x}}\) to be non-negative [14, 15, 18]. Ben-Tal et al. [14] additionally restricted parts of \({\varvec{A}}\), i.e., they require the parts of \({\varvec{A}}\) associated with the first-stage decision to be non-negative. Our policies do not need this assumption. However, we show that Problem (1) is unbounded whenever there is a feasible \({\varvec{x}}\) with \({\varvec{c}}^\intercal {\varvec{x}}< 0\), due to the non-negativity of the right-hand side. As a consequence, our policies do not readily extend to general maximization problems.

From a theoretical perspective, mainly asymptotic bounds are of interest. In practice, however, the exact approximation factors also matter. Throughout the paper, we thus always give the asymptotic as well as the exact bounds. We compare all our bounds to the previously best-known bounds for the two-stage setting given in Ben-Tal et al. [14] and show that our constructions yield both constant-factor and asymptotic improvements.

Comparison with affine policies: Using the newly found bounds, we show that no approximation bound for affine policies on hypersphere uncertainty exists that is better than the bound we show for our policies. For budgeted uncertainty, on the other hand, we show that affine policies strictly dominate our piecewise affine policies. These findings confirm results that have been reported for the two-stage variant, where affine policies do not perform well for hypersphere uncertainty [18], but very well for budgeted uncertainty [40].

Improvement heuristic: Due to inherent properties of our policy construction, resulting solutions are overly pessimistic on instances where the impact on the objective varies significantly between different uncertainty dimensions. To diminish this effect, we introduce an improvement heuristic that performs at least as well as our policies and that can be integrated into the LP used to construct our policies. While these modifications come at the cost of higher solution times, they allow for significant objective improvements on some instance classes.

Tightening piecewise affine policies via lifting: We show that in the context of Problem (1) the piecewise affine policies via lifting presented by Georghiou et al. [31] yield equivalent solutions to affine policies. To prevent this from happening, we construct tightened piecewise affine policies via lifting using insights from our piecewise affine policies. These new policies integrate the approximative power of affine policies, and our piecewise affine policies and are guaranteed to perform at least as well as the individual policies they combine.

Numerical evidence: Finally, we present two sets of numerical experiments showing that our policies can be solved orders of magnitude faster than the affine adjustable policies presented by Ben-Tal et al. [9], the piecewise affine policies via lifting presented by Georghiou et al. [31], and the near-optimal piecewise affine policies by Bertsimas and Georghiou [17], while often yielding comparable or better results. First, we study a slightly modified version of the tests presented in Ben-Tal et al. [14], allowing us to demonstrate our policies’ scalability and the impact of our improvement heuristic. Second, we focus on demand covering instances to demonstrate the good performance of our policies on a problem that resembles a practical application. We refer to our git repository (https://github.com/tumBAIS/piecewise-affine-ARO) for all material necessary to reproduce the numerical results outlined in this paper.

Comparison against closely related work: Compared to the closely related work by Ben-Tal et al. [14], who first introduced the concept of domination in the context of ARO, our contributions are manifold. First, we extend domination-based piecewise affine policies to a wider class of problems by switching from a two-stage to a multi-stage setting and relaxing assumptions. We discuss the structural reasons that make this extension non-trivial at the beginning of Sect. 2. In addition to showing stronger approximation guarantees for our policies, we conduct comprehensive theoretical and numerical comparisons with other adjustable policies. Based on these comparisons, we construct two new policies that mitigate the weaknesses of domination and integrate its strengths with the strengths of other policies. More specifically, the first policy integrates finding a good outer approximation of the uncertainty set into the optimization process. The second policy integrates structural results from domination into lifting policies, cf. [31]. As a result, we get a hierarchy of piecewise affine adjustable policies with provable relative performance guarantees. We give an overview of all policies constructed in our work and their performance guarantees relative to other policies in Fig. 2.

Fig. 2

Relations between multi-stage ARO policies compared in this paper. An arc from a policy P to another policy \(P'\) states \(Z_P \le Z_{P'}\), where \(Z_P\) and \(Z_{P'}\) are optimal objective values for the ARO Problem (1) solved with policies P and \(P'\), respectively. Dashed arcs only hold for hypersphere (H) or budgeted (B) uncertainty. Relations proved for the first time in this paper are highlighted (blue, bold). The compared policies are: static policies (static); piecewise affine policies via domination by Ben-Tal et al. [14] (PAPBT); our piecewise affine policies via domination (PAP), cf. Sects. 2 and 3; affine policies [9] (AFF); our piecewise affine policies with rescaling (SPAP), cf. Sect. 4; near-optimal piecewise affine policies [17] (BG); piecewise affine policies via lifting [31] (LIFT); our tightened piecewise affine policies via lifting (TLIFT), cf. Sect. 5

The rest of this paper is structured as follows. In Sect. 2, we introduce our policies and elaborate on their construction. In Sect. 3, we present our approximation bounds for the multi-stage ARO Problem (1). We present an improvement heuristic for our policies in Sect. 4. Using the results of Sects. 2 and 3, we construct tightened piecewise affine policies via lifting in Sect. 5. Finally, we provide numerical evidence for the performance of our policies compared to other state-of-the-art policies in Sect. 6. Section 7 concludes this paper with a brief reflection on our work and avenues for future research. To keep the paper concise, we defer proofs that could interrupt the reading flow to “Appendices B–O”.

2 Framework for piecewise affine multi-stage policies

In this section, we present our piecewise affine framework for the multi-stage ARO Problem (1). The main rationale of our framework is to construct new uncertainty sets \({\hat{{\mathcal {U}}}}\) that dominate the original uncertainty sets \({\mathcal {U}}\). With this, our framework follows a similar rationale as the two-stage framework from Ben-Tal et al. [14]. For a problem \(Z_{AR}({\mathcal {U}})\) we construct \({\hat{{\mathcal {U}}}}\) in such a way that \(Z_{AR}({\hat{{\mathcal {U}}}})\) can be efficiently solved, and a solution of \(Z_{AR}({\hat{{\mathcal {U}}}})\) can be used to generate solutions for \(Z_{AR}({\mathcal {U}})\).

In this context, we note that one cannot straightforwardly apply the construction scheme used by Ben-Tal et al. [14] due to nonanticipativity requirements. More specifically, Ben-Tal et al. [14] construct \({\hat{{\mathcal {U}}}}\) as polytopes, where it is well known that worst-case solutions always occur on extreme points, as any solution can be represented by convex combinations of extreme point solutions. The construction of these convex combinations, however, is not guaranteed to be nonanticipative in the multi-stage setting. To overcome this challenge, we incorporate nonanticipativity in the concept of uncertainty set domination and extend it to a multi-stage setting.

Definition 1

(Domination) Given an uncertainty set \({\mathcal {U}}\subseteq {\mathbb {R}}^{m}_+\), we say that \({\mathcal {U}}\) is dominated by \({\hat{{\mathcal {U}}}}\subseteq {\mathbb {R}}^{m}_+\) if there is a domination function \({\varvec{h}}:{\mathcal {U}}\rightarrow {\hat{{\mathcal {U}}}}\) with \({\varvec{h}}(\varvec{\xi })\ge \varvec{\xi }\), and \({\varvec{h}}\) can be expressed as \({\varvec{h}}(\varvec{\xi }) = \left( {\varvec{h}}^1({\underline{\varvec{\xi }}}^1), \dots , {\varvec{h}}^T({\underline{\varvec{\xi }}}^T)\right) \) where \({\varvec{h}}^t\) maps to the uncertainties in stage t and depends on uncertainties up to that stage.

Intuitively, an uncertainty set \({\hat{{\mathcal {U}}}}\) dominates another set \({\mathcal {U}}\) if for every point \(\varvec{\xi }\in {\mathcal {U}}\) there is a point \({\hat{\varvec{\xi }}}\in {\hat{{\mathcal {U}}}}\) that is at least as large in each component, i.e., \({\hat{\varvec{\xi }}}\ge \varvec{\xi }\). Later, we show that the dominating set \({\hat{{\mathcal {U}}}}\) can be constructed as the convex hull of \(m+1\) vertices \({\varvec{v}}_0,\dots ,{\varvec{v}}_m\). We also show how to construct dominating functions \({\varvec{h}}\) for these vertex-induced dominating sets. Figure 3 illustrates the hypersphere uncertainty set \({\mathcal {U}}=\left\{ \varvec{\xi }\in {\mathbb {R}}_+^{m} \Big \vert \left\Vert \varvec{\xi }\right\Vert _2^2\le 1\right\} \) together with our dominating set \({\hat{{\mathcal {U}}}}\) (2) and the dominating function \({\varvec{h}}\) (4) for \(m=2\).

Fig. 3

Two-dimensional hypersphere uncertainty set \({\mathcal {U}}\) with (dashed) dominating set \({\hat{{\mathcal {U}}}}\) (2) induced by the convex combination of vertices \({\varvec{v}}_0,{\varvec{v}}_1,{\varvec{v}}_2\) and dominating function \({\varvec{h}}\) (4) that maps a point \(\varvec{\xi }\in {\mathcal {U}}\) to a point \({\hat{\varvec{\xi }}}\in {\hat{{\mathcal {U}}}}\)

Due to the non-negativity of the problem’s right-hand side, domination at most restricts the set of feasible solutions. As a consequence, each feasible solution for a realization \({\hat{\varvec{\xi }}}\in {\hat{{\mathcal {U}}}}\) is also a feasible solution for all realizations \(\varvec{\xi }\in {\mathcal {U}}\) that are dominated by \({\hat{\varvec{\xi }}}\). Using this property, we can derive piecewise affine policies for \(Z_{AR}({\mathcal {U}})\) from solutions of \(Z_{AR}({\hat{{\mathcal {U}}}})\). Since \({\mathcal {U}}\) is full-dimensional and down-monotone by Assumption 1, there always exists a factor \(\beta \ge 0\) such that the scaled set \(\beta {\mathcal {U}}\) contains \({\hat{{\mathcal {U}}}}\). Theorem 1 shows that with this factor \(\beta \), solutions of problem \(Z_{AR}({\hat{{\mathcal {U}}}})\) are \(\beta \)-approximations for problem \(Z_{AR}({\mathcal {U}})\). It also shows that \(Z_{AR}({\hat{{\mathcal {U}}}})\) is unbounded exactly when \(Z_{AR}({\mathcal {U}})\) is unbounded. Thus, we assume for the remainder of this paper w.l.o.g. that both \(Z_{AR}({\mathcal {U}})\) and \(Z_{AR}({\hat{{\mathcal {U}}}})\) are bounded.

Theorem 1

Consider an uncertainty set \({\mathcal {U}}\) from Problem (1) and a dominating set \({\hat{{\mathcal {U}}}}\). Let \(\beta \ge 1\) be such that \(\forall {\hat{\varvec{\xi }}}\in {\hat{{\mathcal {U}}}} :\frac{1}{\beta }{\hat{\varvec{\xi }}}\in {\mathcal {U}}\). Moreover, let \(Z_{AR}({\mathcal {U}})\) and \(Z_{AR}({\hat{{\mathcal {U}}}})\) be optimal values of Problem (1). Then, either \(Z_{AR}({\mathcal {U}})\) and \(Z_{AR}({\hat{{\mathcal {U}}}})\) are unbounded or

$$\begin{aligned} 0\le Z_{AR}({\mathcal {U}}) \le Z_{AR}({\hat{{\mathcal {U}}}}) \le \beta \cdot Z_{AR}({\mathcal {U}}). \end{aligned}$$

We present the proof for Theorem 1 in “Appendix B”.
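To make the role of \(\beta \) concrete: for a norm-ball uncertainty set, the smallest valid \(\beta \) for a vertex-induced polytope \({\hat{{\mathcal {U}}}}\) is attained at one of its vertices, since the norm is convex and its maximum over a polytope occurs at an extreme point. The sketch below computes this factor for a two-dimensional hypersphere set; the vertex values are illustrative assumptions, not the optimized construction from the paper.

```python
import numpy as np

# Hypersphere uncertainty U = {xi >= 0 : ||xi||_2 <= 1}.  For a dominating
# polytope with base vertex v0 and vertices v_i = v0 + rho_i * e_i, the
# condition (1/beta) * xi-hat in U for all xi-hat reduces to checking the
# vertices, so beta = max_i ||v_i||_2.  Vertex values are illustrative.
m = 2
v0 = np.full(m, 0.5)
rho = np.ones(m)
vertices = [v0] + [v0 + r * e for r, e in zip(rho, np.eye(m))]

beta = max(np.linalg.norm(v) for v in vertices)
print(round(beta, 4))  # scaling factor: (1/beta) * U-hat lies inside U
```

By Theorem 1, solving the problem on this \({\hat{{\mathcal {U}}}}\) would then give a \(\beta \)-approximation of \(Z_{AR}({\mathcal {U}})\) for these assumed vertices.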

In the remainder of this section, we demonstrate how the results of Theorem 1 can be used to efficiently construct \(\beta \)-approximations for \(Z_{AR}({\mathcal {U}})\). To this end, we show in Sect. 2.1 how to construct dominating polytopes \({\hat{{\mathcal {U}}}}\) and efficiently find solutions for \(Z_{AR}({\hat{{\mathcal {U}}}})\) that comply with nonanticipativity requirements. Then, in Sect. 2.2, we construct the dominating function \({\varvec{h}}:{\mathcal {U}}\rightarrow {\hat{{\mathcal {U}}}}\), which allows us to extend these solutions to solutions for \(Z_{AR}({\mathcal {U}})\).

2.1 Construction of the dominating set

In the following, we construct a dominating set in the form of a polytope for which the worst-case solution can be efficiently found by solving a linear program on its vertices. Specifically, for an uncertainty set \({\mathcal {U}}\), we consider dominating sets \({\hat{{\mathcal {U}}}}\) of the form

$$\begin{aligned} {\hat{{\mathcal {U}}}}:= {{\,\textrm{conv}\,}}({\varvec{v}}_0, {\varvec{v}}_1, \dots , {\varvec{v}}_{m}) \end{aligned}$$
(2)

where for all \(i\in \{0,\dots , m\}: \frac{1}{\beta } {\varvec{v}}_i\in {\mathcal {U}}\) and for all \(i\in \{1,\dots , m\}: {\varvec{v}}_i = {\varvec{v}}_0 + \rho _i {\varvec{e}}_i\) for some \(\rho _i\in {\mathbb {R}}_+\). We postpone the construction of the domination function \({\varvec{h}}\), the base vertex \({\varvec{v}}_0\), and parameters \(\rho _1,\dots , \rho _{m}\) to Sect. 2.2 and first focus on the construction of solutions for \(Z_{AR}({\hat{{\mathcal {U}}}})\). Here, we extend the notation on \({\varvec{x}}\) and \(\varvec{\xi }\) introduced in Sect. 1.1 to \({\varvec{x}}_i\) and \({\varvec{v}}_i\). Consequently, \({\varvec{x}}_i^t\) is the sub-vector of \({\varvec{x}}_i\) corresponding to decisions made in stage t, and \(\underline{{\varvec{v}}_i}^t\) is the sub-vector of \({\varvec{v}}_i\) corresponding to uncertainties up to stage t. Then, the key component for our construction is LP (3)

$$\begin{aligned} Z_{LP}({\hat{{\mathcal {U}}}}) = \min _{{\varvec{x}}_0,\dots , {\varvec{x}}_{m}}&z \end{aligned}$$
(3a)
$$\begin{aligned} \text {s.t.} \quad&z \ge {\varvec{c}}^\intercal {\varvec{x}}_i&\forall i \in \{0,\dots , m\} \end{aligned}$$
(3b)
$$\begin{aligned}&{\varvec{A}}{\varvec{x}}_i \ge {\varvec{D}}{\varvec{v}}_i + {\varvec{d}}&\forall i \in \{0,\dots , m\} \end{aligned}$$
(3c)
$$\begin{aligned}&{\varvec{x}}_i^t = {\varvec{x}}_j^t&\hspace{-.5cm}\forall i,j \in \{0,\dots , m\}, t\in \{1,\dots , T\}, \underline{{\varvec{v}}_i}^t = \underline{{\varvec{v}}_j}^t . \end{aligned}$$
(3d)

Intuitively, the Objective (3a) together with Constraints (3b) minimize the maximal cost over all vertex solutions \({\varvec{x}}_i\). Constraints (3c) ensure that each \({\varvec{x}}_i\) is a feasible solution for the respective uncertainty vertex \({\varvec{v}}_i\) of \({\hat{{\mathcal {U}}}}\). Finally, Constraints (3d) ensure nonanticipativity by forcing vertex solutions to be equal unless different uncertainties were observed. With these constraints, we construct LP (3) such that it is sufficient to find an optimal solution for \(Z_{LP}({\hat{{\mathcal {U}}}})\) in order to find an optimal solution for \(Z_{AR}({\hat{{\mathcal {U}}}})\).
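A minimal sketch of LP (3) for an assumed toy instance: two stages with one uncertain parameter each (\(m=2\)), \({\varvec{A}}={\varvec{D}}={\varvec{I}}\), \({\varvec{d}}={\varvec{0}}\), \({\varvec{c}}=(1,1)\), and simplex vertices \({\varvec{v}}_0={\varvec{0}}\), \({\varvec{v}}_i={\varvec{e}}_i\). All instance data are our own assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 1.0])
A = np.eye(2)
D = np.eye(2)
d = np.zeros(2)
V = [np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0])]

# Variables: [z, x_0 (2), x_1 (2), x_2 (2)] -> 7 LP variables.
nv = 1 + 3 * 2
obj = np.zeros(nv); obj[0] = 1.0            # (3a): minimize z

A_ub, b_ub = [], []
for i, v in enumerate(V):
    # (3b): c^T x_i - z <= 0
    row = np.zeros(nv); row[0] = -1.0; row[1 + 2*i:3 + 2*i] = c
    A_ub.append(row); b_ub.append(0.0)
    # (3c): -A x_i <= -(D v_i + d)
    blk = np.zeros((2, nv)); blk[:, 1 + 2*i:3 + 2*i] = -A
    A_ub.extend(blk); b_ub.extend(-(D @ v + d))

# (3d) nonanticipativity: v_0 and v_2 share the stage-1 realization (0),
# so their stage-1 decisions must coincide: x_0[0] - x_2[0] = 0.
A_eq = np.zeros((1, nv)); A_eq[0, 1] = 1.0; A_eq[0, 5] = -1.0

res = linprog(obj, A_ub=np.vstack(A_ub), b_ub=b_ub,
              A_eq=A_eq, b_eq=[0.0], bounds=[(None, None)] * nv)
print(res.fun)  # worst-case cost Z_LP over the three vertex solutions
```

For this instance the worst-case vertex cost is 1, attained at both \({\varvec{v}}_1\) and \({\varvec{v}}_2\); the nonanticipativity constraint is non-binding here but becomes essential when vertices share longer uncertainty prefixes.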

Lemma 2

Let \({\hat{{\mathcal {U}}}}\) be a dominating set as described in (2), \(Z_{LP}({\hat{{\mathcal {U}}}})\) be the optimal value of LP (3), and \(Z_{AR}({\hat{{\mathcal {U}}}})\) be the optimal value of Problem (1). Then the LP solution \(({\varvec{x}}_i)\) on the vertices of \({\hat{{\mathcal {U}}}}\) can be extended to a solution on the full set \({\hat{{\mathcal {U}}}}\) and we find:

$$\begin{aligned} Z_{LP}({\hat{{\mathcal {U}}}}) = Z_{AR}({\hat{{\mathcal {U}}}}). \end{aligned}$$

We present the proof for Lemma 2 in “Appendix C”.

2.2 Construction of the domination function

In the previous section, we showed how to construct dominating sets \({\hat{{\mathcal {U}}}}\) such that \(Z_{AR}({\hat{{\mathcal {U}}}})\) can be solved efficiently. In order for \({\hat{{\mathcal {U}}}}\) to be a valid dominating set for some uncertainty set \({\mathcal {U}}\), we additionally have to construct a nonanticipative domination function \({\varvec{h}}:{\mathcal {U}}\rightarrow {\hat{{\mathcal {U}}}}\) according to Definition 1. Specifically, we use

$$\begin{aligned} {\varvec{h}}(\varvec{\xi }):= \left( \varvec{\xi }- {\varvec{v}}_0\right) _+ + {\varvec{v}}_0 \end{aligned}$$
(4)

where \((\cdot )_+\) is the element-wise maximum with 0 and \({\varvec{v}}_0\) is the base vertex from the construction in (2). By construction, \({\varvec{h}}\) maps each uncertainty realization \(\varvec{\xi }\) to its element-wise maximum with \({\varvec{v}}_0\). It directly follows that \({\varvec{h}}\) is nonanticipative, as each element in \({\varvec{h}}(\varvec{\xi })\) solely depends on the corresponding element in \(\varvec{\xi }\).
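Since \(\left( \varvec{\xi }- {\varvec{v}}_0\right) _+ + {\varvec{v}}_0\) is simply the element-wise maximum of \(\varvec{\xi }\) and \({\varvec{v}}_0\), the domination function reduces to a one-liner. The following sketch (function name ours) illustrates Eq. (4):

```python
def h(xi, v0):
    # Eq. (4): (xi - v0)_+ + v0 equals the element-wise maximum of xi and v0;
    # component i of the output depends only on component i of xi, so h is
    # nonanticipative by construction
    return [max(x, v) for x, v in zip(xi, v0)]

assert h([0.2, 0.8], [0.5, 0.5]) == [0.5, 0.8]
```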

Finally, we have to ensure that \({\varvec{h}}(\varvec{\xi })\in {\hat{{\mathcal {U}}}}\) for all \(\varvec{\xi }\in {\mathcal {U}}\). We do so by choosing the base vertex \({\varvec{v}}_0\) and parameters \(\rho _1,\dots , \rho _{m}\) during the construction of \({\hat{{\mathcal {U}}}}\) appropriately. Using \(\lambda _i(\varvec{\xi }):= \frac{\left( \left( \varvec{\xi }- {\varvec{v}}_0 \right) _+\right) _i}{\rho _i}\) with the convention \(\frac{0}{0}=0\), we find

$$\begin{aligned} {\varvec{h}}(\varvec{\xi }) = \sum _{i=1}^{m} \lambda _i(\varvec{\xi }) {\varvec{v}}_i + \left( 1-\sum _{i=1}^{m} \lambda _i(\varvec{\xi })\right) {\varvec{v}}_0. \end{aligned}$$

By definition, any convex combination of \({\varvec{v}}_i\) is contained in \({\hat{{\mathcal {U}}}}\). Thus, \({\varvec{h}}\) is a valid domination function if and only if

$$\begin{aligned} \max _{\varvec{\xi }\in {\mathcal {U}}}\sum _{i=1}^{m} \lambda _i(\varvec{\xi }) \le 1. \end{aligned}$$
(5)

Condition (5) gives a compact criterion to check the validity of dominating sets. By doing so, it lays the basis for our optimal selection of the base vertex \({\varvec{v}}_0\) and parameters \(\rho _1, \dots , \rho _m\). Checking Condition (5) generally requires solving a convex optimization problem. However, in Sect. 3 we show that for many commonly used special uncertainty sets, this problem can be significantly simplified, leading to low-dimensional unconstrained minimization problems or even analytical solutions.

Recall that we showed how to extend a solution \(({\varvec{x}}_0,\dots ,{\varvec{x}}_m)\) of LP (3) to the full set \({\hat{{\mathcal {U}}}}\) in the proof of Lemma 2. Combining this with \({\varvec{h}}\) and using Theorem 1 we get a piecewise affine solution for \(Z_{AR}({\mathcal {U}})\) by

$$\begin{aligned} {\varvec{x}}(\varvec{\xi }) = \sum _{i=1}^{m} \lambda _i(\varvec{\xi }) {\varvec{x}}_i + \left( 1-\sum _{i=1}^{m} \lambda _i(\varvec{\xi })\right) {\varvec{x}}_0 \end{aligned}$$
(6)

that has an optimality bound of \(\beta \).
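A minimal sketch of how the weights \(\lambda _i\) and the policy (6) could be evaluated (plain Python with hypothetical toy data; function names are our own illustration):

```python
def lambdas(xi, v0, rho):
    # lambda_i(xi) = ((xi - v0)_+)_i / rho_i, with the convention 0/0 = 0
    return [max(x - v, 0.0) / r if r > 0 else 0.0
            for x, v, r in zip(xi, v0, rho)]

def policy(xi, xs, v0, rho):
    # Eq. (6): x(xi) = sum_i lambda_i x_i + (1 - sum_i lambda_i) x_0,
    # where xs = [x_0, x_1, ..., x_m] are the vertex solutions of LP (3)
    lam = lambdas(xi, v0, rho)
    return [xs[0][c] + sum(l * (x_i[c] - xs[0][c]) for l, x_i in zip(lam, xs[1:]))
            for c in range(len(xs[0]))]

# toy data: base vertex (0.5, 0.5), rho = (1, 1), vertex solutions x_0, x_1, x_2
xs = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
v0, rho = [0.5, 0.5], [1.0, 1.0]
assert policy([1.5, 0.5], xs, v0, rho) == [2.0, 0.0]   # xi = v_1 recovers x_1
```

Evaluating the policy at \(\varvec{\xi }={\varvec{v}}_0\) returns \({\varvec{x}}_0\), and at \(\varvec{\xi }={\varvec{v}}_i\) it recovers \({\varvec{x}}_i\), matching the vertex interpolation behind Eq. (6).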

2.3 Limitations

While our policies overcome the nontrivial challenge of nonanticipativity on extreme point solutions, they still rely on the ability to form convex combinations. As integrality is not preserved under convex combinations, there is no natural way to extend our approach to integer or binary recourse decisions \({\varvec{x}}\). However, including non-adjustable integer or binary first-stage decisions in our framework is straightforward. It is also not straightforward to incorporate an uncertain recourse matrix, i.e., dependence of \({\varvec{A}}\) on \(\varvec{\xi }\), into our approach, as worst-case realizations for problems with uncertain recourse are not necessarily extreme points of \({\mathcal {U}}\), see, e.g., [3, 33].

3 Optimality bounds for different uncertainty sets

In the previous section, we demonstrated how to construct nonanticipative piecewise affine policies for the multi-stage Problem (1). On this basis, proving approximation bounds mainly depends on geometric properties of the uncertainty sets \({\mathcal {U}}\). We first show approximation bounds for some commonly used permutation invariant uncertainty sets. On these sets, the dominating sets are permutation invariant and we give closed-form constructions. We then give approximation bounds for our piecewise affine policies on general uncertainty sets. Finally, we demonstrate how the bounds for an uncertainty set \({\mathcal {U}}\) generalize to transformations of that set. While asymptotic bounds are of primary theoretical interest, constant factors matter in practice as well. Thus, we always state exact as well as asymptotic bounds. Table 1 gives an overview of all bounds that are explicitly proven in Propositions 5, 7, 9, 10 and 11 of this section. We compare all our results against the results for the two-stage setting in Ben-Tal et al. [14] and show constant-factor as well as asymptotic improvements. For budgeted and hypersphere uncertainty sets, we further compare the theoretical performance of our piecewise affine policies with affine adjustable policies.

Table 1 Performance bounds of the piecewise affine policy for different uncertainty sets

For permutation invariant uncertainty sets there exists an optimal choice of \({\hat{{\mathcal {U}}}}\) that is also permutation invariant. More specifically, the vertices simplify to \({\varvec{v}}_0 = \mu {\varvec{e}}\) and \({{\varvec{v}}_i = {\varvec{v}}_0+\rho {\varvec{e}}_i}\) for some \(\mu , \rho \).

Lemma 3

Let \({\mathcal {U}}\) be a permutation invariant uncertainty set. Then there exist \(\mu , \rho \) such that for the dominating uncertainty set \({\hat{{\mathcal {U}}}}\) spanned by \({\varvec{v}}_0 = \mu {\varvec{e}}\) and \({{\varvec{v}}_i = {\varvec{v}}_0+\rho {\varvec{e}}_i}\) for \(i\in \{1,\dots ,m\}\), no other dominating set \({\hat{{\mathcal {U}}}}'\) constructed as in (2) achieves a smaller approximation factor \(\beta \).

We present the proof for Lemma 3 in “Appendix D”.

With these simplifications, Condition (5) becomes

$$\begin{aligned} \frac{1}{\rho }\max _{\varvec{\xi }\in {\mathcal {U}}}\sum _{i=1}^m\left( \xi _i-\mu \right) _+ \le 1. \end{aligned}$$
(7)

Using the permutation invariance of the problem, Ben-Tal et al. [14] show that for any \(\mu \) there exists a \(j\le m\) such that the maximization problem in (7) has a solution that is constant on the first j components and zero on all other components.

Lemma 4

(Lemma 4 in Ben-Tal et al. [14]) Let \(\gamma (j)\) be the maximal average value of the first j components of any \(\varvec{\xi }\in {\mathcal {U}}\)

$$\begin{aligned} \gamma (j):= \frac{1}{j}\max _{\varvec{\xi }\in {\mathcal {U}}}\sum _{i=1}^j \xi _i. \end{aligned}$$

Then for each \(\mu \) there exists an optimal solution \(\varvec{\xi }^*\) for the maximization problem in Eq. (7) that has the form

$$\begin{aligned} \varvec{\xi }^* = \sum _{i=1}^j \gamma (j){\varvec{e}}_i, \end{aligned}$$

for some \(j\le m\).

Hypersphere uncertainty: We first use Lemma 4 to find a new dominating set for hypersphere uncertainty. By doing so, we obtain a new approximation bound that improves the bound of \(\root 4 \of {m}\) provided in Ben-Tal et al. [14] by a factor of \(\sqrt{\frac{\sqrt{m}+1}{2\sqrt{m}}}\), which for large \(m\) converges towards \(\frac{1}{\sqrt{2}}\). While this improvement is irrelevant for the asymptotic complexity of the problem, the new formulation of \({\hat{{\mathcal {U}}}}\) does make a difference in practice. In Fig. 4 we illustrate the improvement of our dominating set \({\hat{{\mathcal {U}}}}\) over the dominating set \({\hat{{\mathcal {U}}}}_{\text {BT}}\) proposed in Ben-Tal et al. [14] for hypersphere uncertainty sets in two and three uncertainty dimensions. Note that our sets \({\hat{{\mathcal {U}}}}\) are fully contained in the sets \({\hat{{\mathcal {U}}}}_{\text {BT}}\), and all extreme points of \({\hat{{\mathcal {U}}}}_{\text {BT}}\) are located outside of \({\hat{{\mathcal {U}}}}\). This implies \(Z_{AR}({\hat{{\mathcal {U}}}})\le Z_{AR}({\hat{{\mathcal {U}}}}_{\text {BT}})\) for hypersphere uncertainty. The formal proof follows from straightforward convex containment and is omitted for brevity.

Fig. 4

Comparison of our dominating set \({\hat{{\mathcal {U}}}}\) (blue, solid frame) and the dominating set \({\hat{{\mathcal {U}}}}_\text {BT}\) proposed in Ben-Tal et al. [14] (green, dashed frame) for the hypersphere uncertainty set \({\mathcal {U}}\) in \(m= 2\) (a) and \(m=3\) (b), (c) uncertainty dimensions (color figure online)

Proposition 5

(Hypersphere) Consider the hypersphere uncertainty set \({\mathcal {U}}=\left\{ \varvec{\xi }\in {\mathbb {R}}_+^{m} \Big \vert \left\Vert \varvec{\xi }\right\Vert _2^2\le 1\right\} \). Then a solution for \(Z_{AR}({\hat{{\mathcal {U}}}})\) where \({\hat{{\mathcal {U}}}}\) is constructed using Criterion (7) with

$$\begin{aligned} \mu&= \frac{1}{2\root 4 \of {m}},&\rho&= \frac{\root 4 \of {m}}{2}, \end{aligned}$$

gives a \(\beta = \sqrt{\frac{\sqrt{m}+1}{2}}\) approximation for problem \(Z_{AR}({\mathcal {U}})\).

We present the proof for Proposition 5 in “Appendix E”.
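The choice of \(\mu \) and \(\rho \) in Proposition 5 can be sanity-checked numerically: by Lemma 4, the left-hand side of Condition (7) for the hypersphere is \(\max _j j\left( \frac{1}{\sqrt{j}}-\mu \right) _+\). The following sketch (our own check, not from the paper) verifies Condition (7) and that each vertex \({\varvec{v}}_i = {\varvec{v}}_0 + \rho {\varvec{e}}_i\) lies on the \(\beta \)-scaled unit sphere:

```python
import math

def hypersphere_params(m):
    # Proposition 5: mu = 1/(2 m^{1/4}), rho = m^{1/4}/2, beta = sqrt((sqrt(m)+1)/2)
    mu = 1.0 / (2.0 * m ** 0.25)
    rho = m ** 0.25 / 2.0
    beta = math.sqrt((math.sqrt(m) + 1.0) / 2.0)
    return mu, rho, beta

def condition7_lhs(gamma, m, mu):
    # left-hand side of Condition (7), evaluated via Lemma 4:
    # max over j of j * (gamma(j) - mu)_+
    return max(j * max(gamma(j) - mu, 0.0) for j in range(1, m + 1))

m = 10000
mu, rho, beta = hypersphere_params(m)
# Condition (7): the worst case over the hypersphere does not exceed rho
assert condition7_lhs(lambda j: 1.0 / math.sqrt(j), m, mu) <= rho + 1e-9
# each vertex v_i = v_0 + rho * e_i lies on the beta-scaled unit sphere
assert abs(math.sqrt((m - 1) * mu ** 2 + (mu + rho) ** 2) - beta) < 1e-9
```

Here \(\gamma (j)=\frac{1}{\sqrt{j}}\) is the maximal average of the first j components over the unit hypersphere; the second assertion reflects that \(\frac{1}{\beta }{\varvec{v}}_i\) sits exactly on the boundary of \({\mathcal {U}}\).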

We can also use this improved performance bound to show that affine adjustable policies cannot yield better bounds than piecewise affine adjustable policies for \(m\ge 153\). This is because there are instances of Problem (1) with hypersphere uncertainty where affine adjustable policies perform worse than an optimal policy by at least a factor of \(\frac{4}{5}\left( \root 4 \of {m} - \frac{1}{\root 4 \of {m}}\right) \). We formalize these results in Proposition 6. Note that these better bounds do not imply that piecewise affine adjustable policies always yield better results than affine adjustable policies for hypersphere uncertainty.

Proposition 6

Affine adjustable policies cannot achieve better performance bounds than \(\frac{4}{5}\left( \root 4 \of {m} - \frac{1}{\root 4 \of {m}}\right) \) for Problem (1) with hypersphere uncertainty, even for \({\varvec{c}}, {\varvec{x}}, {\varvec{A}}\) being non-negative and \({\varvec{A}}\) being a 0,1-matrix.

We present the proof for Proposition 6 in “Appendix F”.

Budgeted uncertainty: Next, we tighten the bounds for budgeted uncertainty sets. Proposition 7 shows that our new bound is given by \(\beta = \frac{k(m-1)}{m+k(k-2)}\). Using

$$\begin{aligned} \frac{\beta }{k}&= \frac{k(m-1)}{k(m+k(k-2))} = \frac{m-1}{m-1+k^2-2k+1} = \frac{m-1}{m-1+(k-1)^2} \le 1 \quad \text {and}\\ \frac{\beta }{\frac{m}{k}}&= \frac{k^2(m-1)}{m(m+k(k-2))} = \frac{k^2(m-1)}{k^2(m-1) + k^2 + m^2 - 2 k m} = \frac{k^2(m-1)}{k^2(m-1) + (m-k)^2} \le 1, \end{aligned}$$

we show \(\beta \le \min (k,\frac{m}{k})\), which matches the bound for the two-stage problem variant in Ben-Tal et al. [14]. As \(\frac{\beta }{k}\) is decreasing in k and \(\frac{\beta }{\frac{m}{k}}\) is increasing in k, we obtain a maximum improvement for \(k=\frac{m}{k} \Leftrightarrow k=\sqrt{m}\). At this point the improvement of the bound reaches a factor of \(\frac{1}{2}\).

Proposition 7

(Budget) Consider the budgeted uncertainty set \({\mathcal {U}}=\left\{ \varvec{\xi }\in [0,1]^{m} \Big \vert \left\Vert \varvec{\xi }\right\Vert _1 \le k\right\} \) for some \(k\in \{1,\dots , m\}\). Then a solution for \(Z_{AR}({\hat{{\mathcal {U}}}})\) where \({\hat{{\mathcal {U}}}}\) is constructed using Criterion (7) with

$$\begin{aligned} \mu&= \frac{k(k-1)}{m+k(k-2)},&\rho&= \frac{k(m-k)}{m+k(k-2)}, \end{aligned}$$

gives a \(\beta = \frac{k(m-1)}{m+k(k-2)}\) approximation for problem \(Z_{AR}({\mathcal {U}})\).

We present the proof for Proposition 7 in “Appendix G”.
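The parameters of Proposition 7 can be checked in the same way as for the hypersphere. For the budgeted set, the maximal average of the first j components is \(\gamma (j)=\min (1, \frac{k}{j})\), so Condition (7) reduces to a one-dimensional maximization. The sketch below (our own check, not from the paper) verifies Condition (7) and the relation \(\beta \le \min (k,\frac{m}{k})\):

```python
def budget_params(m, k):
    # Proposition 7: mu = k(k-1)/(m+k(k-2)), rho = k(m-k)/(m+k(k-2)),
    # beta = k(m-1)/(m+k(k-2))
    d = m + k * (k - 2)
    return k * (k - 1) / d, k * (m - k) / d, k * (m - 1) / d

def condition7_lhs(gamma, m, mu):
    # left-hand side of Condition (7), evaluated via Lemma 4
    return max(j * max(gamma(j) - mu, 0.0) for j in range(1, m + 1))

m, k = 100, 10
mu, rho, beta = budget_params(m, k)
assert beta <= min(k, m / k)            # never worse than the two-stage bound
# gamma(j) = min(1, k/j) is the maximal average of the first j components
assert condition7_lhs(lambda j: min(1.0, k / j), m, mu) <= rho + 1e-9
```

For \(m=100, k=10\) the maximum of the left-hand side is attained at \(j=k\) and equals \(\rho \), so the condition is tight.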

Note that there is no result analogous to Proposition 6 for budgeted uncertainty, as Housni and Goyal [40] showed that affine policies achieve an approximation factor of \(O\left( \frac{\log (m)}{\log \log (m)}\right) \) for two-stage problems with non-negative \({\varvec{c}}, {\varvec{x}}, {\varvec{A}}\). Furthermore, our piecewise affine policies are dominated by affine policies for integer budgeted uncertainty.

Proposition 8

Consider Problem (1) with budgeted uncertainty and an integer budget. Let \(Z_{PAP}\) be the optimal value found by our piecewise affine policy and \(Z_{AFF}\) be the optimal value found by an affine policy. Then

$$\begin{aligned} Z_{AFF} \le Z_{PAP}. \end{aligned}$$

We present the proof for Proposition 8 in “Appendix H”.

Norm ball uncertainty: In a similar manner as before, we construct new dominating sets for p-norm ball uncertainty and tighten the bound in Ben-Tal et al. [14] by a factor of \(2^{-1+\frac{1}{p}-\frac{1}{p^2}}p^{\frac{1}{p}}(p-1)^{\frac{1}{p^2}-\frac{1}{p}}\) for sufficiently large \(m\). This factor is always smaller than one and converges to \(\frac{1}{2}\) for large p.

Proposition 9

(p-norm ball) Consider the p-norm ball uncertainty set \({\mathcal {U}}=\left\{ \varvec{\xi }\in {\mathbb {R}}_+^{m} \Big \vert \left\Vert \varvec{\xi }\right\Vert _p\le 1\right\} \) with \(p > 1\). Then a solution for \(Z_{AR}({\hat{{\mathcal {U}}}})\) where \({\hat{{\mathcal {U}}}}\) is constructed using Criterion (7) with

$$\begin{aligned} \mu&= 2^{\frac{1}{p}}\left( 2(m-1)+2^p\right) ^{-\frac{1}{p^2}}p^{-1}\left( p-1\right) ^{\frac{1}{p} + \left( 1-\frac{1}{p}\right) ^2}, \\ \rho&= 2^{\frac{1}{p}-1}\left( 2(m-1)+2^p\right) ^{\frac{1}{p}-\frac{1}{p^2}}p^{-1}\left( p-1\right) ^{\left( 1-\frac{1}{p}\right) ^2} \end{aligned}$$

gives a

$$\begin{aligned} \beta \le&\left( 2(m-1) + 2^p\right) ^{\frac{1}{p}-\frac{1}{p^2}}p^{\frac{1}{p}-1}\left( p-1\right) ^{\left( 1-\frac{1}{p}\right) ^2} \\ \le&\left( (2m)^{\frac{1}{p}-\frac{1}{p^2}} + 2^{1-\frac{1}{p}}\right) p^{\frac{1}{p}-1}\left( p-1\right) ^{\left( 1-\frac{1}{p}\right) ^2} = O\left( m^{\frac{1}{p}-\frac{1}{p^2}}\right) \end{aligned}$$

approximation for problem \(Z_{AR}({\mathcal {U}})\).

We present the proof for Proposition 9 in “Appendix I”.

Ellipsoid uncertainty: For the permutation invariant ellipsoid uncertainty set \(\left\{ \varvec{\xi }\in {\mathbb {R}}_+^{m} \Big \vert \varvec{\xi }^\intercal \varvec{\Sigma }\varvec{\xi }\le 1\right\} \) with \(m> 1\), \(\varvec{\Sigma }:={\varvec{1}}+a({\varvec{J}}-{\varvec{1}})\), \(a\in [0,1]\), \({\varvec{1}}\) being the identity matrix, and \({\varvec{J}}\) being the matrix of all ones, we construct dominating sets via a case distinction on the size of a. While for large a a scaled simplex already gives a good approximation, we construct the dominating set for small a more carefully. By doing so, we improve the previously best known asymptotic bound for the two-stage problem variant of \(O(m^\frac{2}{5})\) [14] to \(O(m^\frac{1}{3})\). Note that for \(a=0\) our bounds converge to the bounds of hypersphere uncertainty in Proposition 5 and for \(a=1\) towards an exact representation.

Proposition 10

(Ellipsoid) Consider the ellipsoid uncertainty set \({\mathcal {U}}=\left\{ \varvec{\xi }\in {\mathbb {R}}_+^{m} \Big \vert \varvec{\xi }^\intercal \varvec{\Sigma }\varvec{\xi }\le 1\right\} \) with \(m> 1\) and \(\varvec{\Sigma }:={\varvec{1}}+a({\varvec{J}}-{\varvec{1}})\) for \(a\in [0,1]\). Here \({\varvec{1}}\) is the identity matrix and \({\varvec{J}}\) is the matrix of all ones. Then a solution for \(Z_{AR}({\hat{{\mathcal {U}}}})\) where \({\hat{{\mathcal {U}}}}\) is constructed using Criterion (7) with

$$\begin{aligned} \mu&= \frac{1}{2\root 4 \of {(1-a)^3m+ (1-a)^2am^2}},&\rho&= \frac{1}{4(1-a)\mu }&\text {if } a \le m^{-\frac{2}{3}},\\ \mu&= 0,&\rho&= \frac{1}{\sqrt{a}}&\text {if } a > m^{-\frac{2}{3}},\\ \end{aligned}$$

gives a

$$\begin{aligned}\beta = {\left\{ \begin{array}{ll} \sqrt{\frac{1}{2}\left( 1 + \frac{1}{1-a}\left( am+ \sqrt{(1-a)m+am^2}\right) \right) } &{} \text {if } a \le m^{-\frac{2}{3}} \\ \frac{1}{\sqrt{a}} &{} \text {if } a > m^{-\frac{2}{3}} \end{array}\right. } = O(m^{\frac{1}{3}}) \end{aligned}$$

approximation for problem \(Z_{AR}({\mathcal {U}})\).

We present the proof for Proposition 10 in “Appendix J”.

General uncertainty sets: After having shown specific bounds for some commonly used permutation invariant uncertainty sets, we now give a general bound that holds for all uncertainty sets that fulfill the assumptions of Problem (1). We show that any uncertainty set can be dominated within an approximation factor of \(\beta =2\sqrt{m}+1\), which improves the bound in Ben-Tal et al. [14] by a factor of \(\frac{1}{2}\). As shown in Ben-Tal et al. [14], this approximation bound is asymptotically tight when using pure domination techniques. More precisely, the budgeted uncertainty set with \(k=\sqrt{m}\) cannot be dominated with a factor \(\beta \) better than \(O(\sqrt{m})\) by any dominating set with a polynomial number of vertices.

Proposition 11

(General uncertainty) Consider any uncertainty set \({\mathcal {U}}\subseteq [0,1]^m\) that is convex, full-dimensional with \({\varvec{e}}_i\in {\mathcal {U}}\) for all \(i\in \{1,\dots ,m\}\), and down-monotone. Then, there always exists a dominating uncertainty set \({\hat{{\mathcal {U}}}}\) of the form in (2) that dominates \({\mathcal {U}}\) by at most a factor of \(\beta = 2\sqrt{m}+1\).

We present the proof for Proposition 11 in “Appendix K”.

Stagewise uncertainty sets: In general, our policies do not require stagewise uncertainty. However, the existence of such a structure can be utilized in the construction of dominating uncertainty sets, leading to approximation bounds that depend linearly on the stagewise approximation bounds.

Proposition 12

(Stagewise) Let \({\mathcal {U}}= {\mathcal {U}}_1\times \dots \times {\mathcal {U}}_T\) be a stagewise independent uncertainty set and for each \({\mathcal {U}}_t\), let \({\hat{{\mathcal {U}}}}_t\) be a dominating set constructed as in (2). Let \(\beta _t\) be the approximation factor for \({\hat{{\mathcal {U}}}}_t\), and let \(\beta '_t = \min \{\beta ':\frac{1}{\beta '}{\varvec{e}}\in {\mathcal {U}}_t\}\) be the constant approximation factor for set \({\mathcal {U}}_t\). Then for any partition \({\mathcal {T}}_1\cup {\mathcal {T}}_2 = \{1,\dots ,T\}\), \({\mathcal {T}}_1\cap {\mathcal {T}}_2 = \emptyset \) of the stages, there exists a dominating set \({\hat{{\mathcal {U}}}}\) for \({\mathcal {U}}\) with approximation factor

$$\begin{aligned} \beta \le \max \left( \sum _{t\in {\mathcal {T}}_1} \beta _t ,\; \max _{t\in {\mathcal {T}}_2}\beta '_t\right) . \end{aligned}$$

We present the proof for Proposition 12 in “Appendix L”.

Transformed uncertainty sets: Via the right-hand side \({\varvec{D}}\varvec{\xi }+ {\varvec{d}}\) of Problem (1), any positive affine transformation of an uncertainty set \({\mathcal {U}}\) can be dominated by the same affine transformation of the dominating set \({\hat{{\mathcal {U}}}}\). As the approximation bounds do not depend on \({\varvec{D}}\) and \({\varvec{d}}\), the bounds for the transformed set are the same as for the original set. One well-known uncertainty type covered by these transformations is scaled ellipsoidal uncertainty \(\left\{ \varvec{\xi }\Big \vert \sum _{i=1}^mw_i \xi _i^2 \le 1\right\} \), which was first proposed by Ben-Tal and Nemirovski [7]. These sets can be constructed via transformations of hypersphere uncertainty sets with a diagonal matrix \({\varvec{D}}\) with \(D_{ii} = \frac{1}{\sqrt{w_i}}\). Scaled ellipsoidal uncertainty has been applied to many robust optimization problems, including portfolio optimization [7], supply chain contracting [10], network design [47], and facility location [5].

Another widely used class that is partially covered by these positive affine transformations are factor-based uncertainties given by sets \({{\mathcal {U}}=\left\{ {\varvec{D}}{\varvec{z}} + {\varvec{d}}\Big \vert {\varvec{z}}\in {\mathcal {U}}^z\right\} }\). In these sets, uncertainties depend affinely on a set of factors \({\varvec{z}}\) that are drawn from a factor uncertainty set \({\mathcal {U}}^z\). Problems that were solved using such uncertainty sets include, among others, portfolio optimization [34, 35] and multi-period inventory management [2, 57]. In contrast to general factor sets, which place no restrictions on \({\varvec{D}}\) and \({\varvec{d}}\), our approach is restricted to positive factor matrices, which allow only for positive correlations between uncertainties. Nevertheless, even this subset of factor-based uncertainties is of wide practical use. As an intuitive example, consider component demands whose factors are the demands for finished products.

4 Re-scaling expensive vertices

On some instances, uncertainties do not have a uniform impact on the objective. While accounting for high values in some of the uncertainty dimensions might drastically influence the objective, accounting for high values in others might barely have an impact. This effect can be crucial in our problem setting, because the creation of dominating sets may, by design, overemphasize single uncertainty dimensions by up to a factor of \(\beta \). On instances where a few critical uncertainties cause almost all of the cost, it might thus be beneficial to dominate these uncertainty dimensions more carefully. In this context, we show that it is possible to shrink critical vertices \({\varvec{v}}_i\) at the cost of slightly shifting all other vertices towards their direction.

We illustrate the re-scaling process following from Lemma 13 in Fig. 5. In the depicted example we shrink the critical vertex \({\varvec{v}}_1\) and shift the two remaining vertices \({\varvec{v}}_0, {\varvec{v}}_2\) towards uncertainty dimension \(\xi _1\). As a consequence, the cost of the vertex solution \({\varvec{x}}_1\) decreases, while the costs of the solutions \({\varvec{x}}_0, {\varvec{x}}_2\) increase. The cost reduction on the critical vertex \({\varvec{v}}_1\) leads to a reduction of the worst case vertex cost z which by the construction of LP (3) corresponds to an overall improvement of the objective function. Note, that the dominating set \({\hat{{\mathcal {U}}}}\) used in the example is not an optimal choice for the hypersphere uncertainty set depicted. However, the effect would barely be visible in two dimensions without this sub-optimal choice.

Fig. 5

Re-scaling of the expensive vertex \({\varvec{v}}_1\) in a dominating set \({\hat{{\mathcal {U}}}}\) for uncertainty set \({\mathcal {U}}\). a Shows the change of dominating set \({\hat{{\mathcal {U}}}}\) (blue, dashed) and its vertices \({\varvec{v}}_0, {\varvec{v}}_1, {\varvec{v}}_2\) to the new re-scaled dominating set \({\hat{{\mathcal {U}}}}'\) (green, solid) with vertices \({\varvec{v}}'_0, {\varvec{v}}'_1, {\varvec{v}}'_2\) for \(s_1=0.5, s_2=0\). b Shows the costs for the vertex solutions \({\varvec{x}}_i\) with maximal cost z (blue, dashed) compared to the costs for the re-scaled vertex solutions \({\varvec{x}}'_i\) with maximal cost \(z'\) (green, solid) (color figure online)

Lemma 13

Let \({\hat{{\mathcal {U}}}}:={{\,\textrm{conv}\,}}({\varvec{v}}_0,\dots , {\varvec{v}}_m)\) be a dominating set for \({\mathcal {U}}\). Let \({\varvec{s}}\in [0,1]^m\) be a vector of scales. Then \({\hat{{\mathcal {U}}}}':={{\,\textrm{conv}\,}}({\varvec{v}}'_0,\dots , {\varvec{v}}'_m)\) with \(v'_{ij}:= s_j(1-v_{ij}) + v_{ij}\) is also a dominating set for \({\mathcal {U}}\).

We present the proof for Lemma 13 in “Appendix M”.

The two extreme cases for the modified dominating sets from Lemma 13 are given by \({\varvec{s}}={\varvec{0}}\) and \({\varvec{s}}={\varvec{e}}\). While the dominating set does not change for \({\varvec{s}}={\varvec{0}}\), for \({\varvec{s}}={\varvec{e}}\) all vertices collapse into the all-ones vector. Intuitively, increasing \(s_i\) shifts the ith component of every vertex towards one. Given the way we constructed our dominating sets in (2), this shift of the ith component towards one increases the value for all \({\varvec{v}}_j\) with \(j\ne i\) and decreases it for \(j=i\). Note that this lemma is not limited to uncertainty sets constructed as described in (2), but holds for any dominating set created as a convex combination of vertices.
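Lemma 13's transformation is a one-line map. The following sketch (function name and toy vertices are our own illustration) applies it and checks the two extreme cases \({\varvec{s}}={\varvec{0}}\) and \({\varvec{s}}={\varvec{e}}\):

```python
def rescale(vertices, s):
    # Lemma 13: v'_{ij} = s_j * (1 - v_{ij}) + v_{ij};
    # s_j in [0, 1] shifts component j of every vertex towards 1
    return [[sj * (1.0 - vij) + vij for vij, sj in zip(v, s)] for v in vertices]

V = [[0.5, 0.5], [1.0, 0.5], [0.5, 1.0]]
assert rescale(V, [0.0, 0.0]) == V                     # s = 0: set unchanged
assert rescale(V, [1.0, 1.0]) == [[1.0, 1.0]] * 3      # s = e: all vertices collapse
```

With \(s_1=0.5, s_2=0\), as in the example of Fig. 5, only the first component of each vertex is shifted towards one.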

The transformation used in Lemma 13 is linear, which allows us to add \({\varvec{s}}\) as a further decision variable to Constraints (3c) of LP (3) for a given \({\hat{{\mathcal {U}}}}\). As \({\varvec{s}}={\varvec{0}}\) gives the original dominating set, any optimal solution found with these additional decision variables is at least as good as a non-modified solution. Thus, all performance bounds shown in Sect. 3 also hold for these re-scaled piecewise affine policies. Note that Proposition 8 also extends to the re-scaled uncertainty set; thus, all re-scaled piecewise affine policies are dominated by affine policies for integer budgeted uncertainty.

Adding \({\varvec{s}}\) to the LP increases its size, which in practice will often lead to an increase in solution times. To limit the increase in model size, it is possible to add only those \(s_i\) for which one expects \(s_i>0\), as not adding a variable \(s_i\) is equivalent to fixing \(s_i = 0\). Those \(s_i\) with \(s_i>0\) correspond to the critical uncertainty dimensions, and an experienced decision maker with sufficient knowledge of the problem might be able to identify them a priori.

5 Piecewise affine policies via liftings

Georghiou et al. [31] propose piecewise affine policies via liftings. In this section, we strengthen these policies by using the insights from the policies constructed in Sects. 2 and 3.

To construct piecewise affine policies via liftings, in the context of Assumption 1, we first choose \(r_i-1\) breakpoints

$$\begin{aligned} 0< z_1^i< z_2^i< \dots< z_{r_i-1}^i < 1, \end{aligned}$$

for each uncertainty dimension \(i\in \{1,2, \dots , m\}\). For ease of notation, let \(z_0^i:=0\), \(z_{r_i}^i:=1\). With these breakpoints, we define the lifting operator \(L:{{\mathbb {R}}^{m} \rightarrow {\mathbb {R}}^{m^L}}\), where \(m^L:=\sum _{i=1}^mr_i\), componentwise by

$$\begin{aligned} L_{i,j}(\varvec{\xi }):= \left( \min (\xi _i - z^i_{j-1}, z^i_j - z^i_{j-1})\right) _+. \end{aligned}$$

Further, we define the linear retraction operator \(R:{{\mathbb {R}}^{m^L}\rightarrow {\mathbb {R}}^{m}}\) componentwise by

$$\begin{aligned} R_i(\varvec{\xi }^L):= \sum _{j=1}^{r_i} \xi ^L_{i,j}. \end{aligned}$$

Here \(\xi ^L_{i, j}\) are the components of the \(m^L\) dimensional vector

$$\begin{aligned} \varvec{\xi }^L = (\xi _{1,1}^L, \xi _{1,2}^L, \dots ,\xi _{1, r_1}^L, \xi _{2, 1}^L, \dots , \xi _{m, 1}^L, \dots , \xi _{m, r_m}^L)^\intercal \in {\mathbb {R}}^{m^L}. \end{aligned}$$
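The lifting and retraction operators can be sketched directly from their definitions (plain Python; function names and breakpoint data are our own illustration, with `z[i]` holding the breakpoints of dimension i including the endpoints \(z_0^i=0\) and \(z_{r_i}^i=1\)):

```python
def lift(xi, z):
    # lifting operator L: component (i, j) is (min(xi_i - z_{j-1}^i,
    # z_j^i - z_{j-1}^i))_+, listed dimension by dimension
    return [max(min(x - lo, hi - lo), 0.0)
            for x, zi in zip(xi, z)
            for lo, hi in zip(zi[:-1], zi[1:])]

def retract(xi_L, z):
    # retraction operator R: sum the lifted components of each dimension
    out, pos = [], 0
    for zi in z:
        r = len(zi) - 1                  # r_i pieces for dimension i
        out.append(sum(xi_L[pos:pos + r]))
        pos += r
    return out

z = [[0.0, 0.25, 1.0], [0.0, 0.5, 0.75, 1.0]]   # r_1 = 2 and r_2 = 3 pieces
xi = [0.75, 0.625]
assert lift(xi, z) == [0.25, 0.5, 0.5, 0.125, 0.0]
assert retract(lift(xi, z), z) == xi             # R composed with L is the identity
```

The final assertion checks on this example that \(R\circ L\) recovers the original realization.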

Note that \(R\circ L :{\mathcal {U}}\rightarrow {\mathcal {U}}\) is the identity. Finally, Georghiou et al. [31] construct a lifted uncertainty set \({\mathcal {U}}^L\) via

$$\begin{aligned} {\mathcal {U}}^L:= \left\{ \begin{aligned}&\varvec{\xi }^L\in {\mathbb {R}}^{m^L}_+ :R(\varvec{\xi }^L) \in {\mathcal {U}}, \\&\frac{\xi ^L_{i, j+1}}{z^i_{j+1}-z^i_{j}} \le \frac{\xi ^L_{i, j}}{z^i_{j}-z^i_{j-1}}&\forall i\in \{1,\dots , m\}, j\in \{1, \dots , r_i-1\}\\&\xi ^L_{i, 1} \le z^i_1&\forall i\in \{1,\dots , m\} \end{aligned} \right\} . \end{aligned}$$
(8)

Uncertainty set (8) is an outer approximation of the lifting, \(L({\mathcal {U}})\subseteq {\mathcal {U}}^L\), and satisfies \(R({\mathcal {U}}^L) = {\mathcal {U}}\). Replacing the uncertainty in Problem (1) with this lifted uncertainty set yields the lifted adjustable problem

$$\begin{aligned} \begin{aligned} Z^L_{AR}({\mathcal {U}}^L) =&\min _{{\varvec{x}}(\varvec{\xi }^L)} \max _{\varvec{\xi }^L\in {\mathcal {U}}^L}{} & {} {\varvec{c}}^\intercal {\varvec{x}}(\varvec{\xi }^L)\\&{{\,\mathrm{\text {s.t.}}\,}}{} & {} {\varvec{A}}{\varvec{x}}(\varvec{\xi }^L) \ge {\varvec{D}}R(\varvec{\xi }^L) + {\varvec{d}}&\forall \varvec{\xi }^L\in {\mathcal {U}}^L. \end{aligned} \end{aligned}$$
(9)

Limiting \({\varvec{x}}\) to affine policies in the lifted space yields piecewise affine policies in the original space, which give tighter approximations than affine policies in the original space, i.e., \(Z^L_{AFF}({\mathcal {U}}^L)\le Z_{AFF}({\mathcal {U}})\) [31]. However, \({\mathcal {U}}^L\) is not a tight outer approximation of \(L({\mathcal {U}})\), leading to little or no improvement over affine policies on some instances [17, 31]. In fact, we show that in the framework of ARO, the piecewise affine policies induced by lifted uncertainty (8) are equivalent to classical affine policies, in the sense that for any optimal lifted affine policy, there is an affine policy with the same objective value and vice versa.

Proposition 14

Let \(Z_{AFF}\) be the optimal objective for affine policies on \(Z_{AR}({\mathcal {U}})\) and let \(Z_{LIFT}\) be the optimal objective value for lifted affine policies on \(Z^L_{AR}({\mathcal {U}}^L)\). Then

$$\begin{aligned} Z_{AFF} = Z_{LIFT}. \end{aligned}$$

We present the proof for Proposition 14 in “Appendix N”.

We overcome this shortcoming in the construction of lifted affine policies using our results on dominating uncertainty sets from Sect. 2. Consider the lifting with one breakpoint \(v_{0i}\) per uncertainty dimension. Here \(v_{0i}\) is the ith component of the base vector \({\varvec{v}}_0\) from Sect. 2. Then the lifting operator L becomes

$$\begin{aligned} L_{ij}(\varvec{\xi }) = {\left\{ \begin{array}{ll} \min (\xi _i, v_{0i}) &{}\text { if } j=1\\ \left( {\varvec{h}}(\varvec{\xi }) - {\varvec{v}}_0\right) _i &{}\text { if } j=2. \end{array}\right. } \end{aligned}$$

With this construction of L it is easy to verify that by Condition (5) each \(\varvec{\xi }^L\in L({\mathcal {U}})\) satisfies \( {\sum _{i=1}^m\frac{\xi ^L_{i,2}}{\rho _i} \le 1}. \) Accordingly, we can tighten the lifted uncertainty set \({\mathcal {U}}^L\) and get the new lifted uncertainty set

$$\begin{aligned} {\hat{{\mathcal {U}}}}^L:= \left\{ \varvec{\xi }^L \in {\mathcal {U}}^L, \sum _{i=1}^m\frac{\xi ^L_{i,2}}{\rho _i} \le 1 \right\} . \end{aligned}$$
(10)

By construction, \({\hat{{\mathcal {U}}}}^L\) is an outer approximation of \(L({\mathcal {U}})\) and we have \(L({\mathcal {U}})\subseteq {\hat{{\mathcal {U}}}}^L \subseteq {\mathcal {U}}^L\) and \(R({\hat{{\mathcal {U}}}}^L) = {\mathcal {U}}\). Thus, affine policies on the lifted problem with uncertainty set \({\hat{{\mathcal {U}}}}^L\) yield valid piecewise affine policies for the original problem. Furthermore, the construction of \({\hat{{\mathcal {U}}}}^L\) guarantees that the lifted policies yield tighter approximations than both our piecewise affine policies via domination and classical lifted policies with the same breakpoints. Consequently, all approximation bounds for piecewise affine policies via domination also hold for the strengthened piecewise affine policies via lifting.

Proposition 15

Let \(Z_{LIFT}\) be the optimal objective value found by the lifting policies with breakpoints \({\varvec{v}}_0\) and lifted uncertainty set \({\mathcal {U}}^L\) defined in (8), \(Z_{TLIFT}\) be the optimal objective value found by the lifting policies with breakpoints \({\varvec{v}}_0\) and tightened lifted uncertainty set \({\hat{{\mathcal {U}}}}^L\) defined in (10), and \(Z_{SPAP}\) be the optimal objective value found by the piecewise affine policies with re-scaling described in Sect. 4. Then

$$\begin{aligned} Z_{TLIFT}&\le Z_{LIFT},&Z_{TLIFT}&\le Z_{SPAP}. \end{aligned}$$

We present the proof for Proposition 15 in “Appendix O”.

6 Numerical experiments

In this section, we present two numerical experiments to compare the performance of our piecewise affine policies for the different constructions of \({\hat{{\mathcal {U}}}}\) and our tightened piecewise affine policies via lifting with the performance of other policies from the literature. We compare the performance in terms of both objective value and computational time.

We run both of the following tests with hypersphere uncertainty sets and budgeted uncertainty sets. In the experiments of Ben-Tal et al. [14], piecewise affine policies performed particularly well compared to affine adjustable policies for hypersphere uncertainty and relatively poorly for budgeted uncertainty. Accordingly, considering these two uncertainty types gives a good impression of the benefits and limitations of piecewise affine policies. Additionally, this experimental design allows us to analyze whether the new formulations with the tighter bounds presented in Propositions 5 and 7 have a significant impact in practice.

In our studies, we compare the following policies: the affine policies described in Ben-Tal et al. [9] (AFF), the constant policies resulting from a dominating set \({\hat{{\mathcal {U}}}}=\{{\varvec{e}}\}\) with only a single point, which by down-monotonicity corresponds to a box (BOX), the near-optimal piecewise affine policies with two pieces proposed in Bertsimas and Georghiou [17] (BG), our piecewise affine policies constructed as described in Propositions 5 and 7 (PAP), the piecewise affine policies constructed as described in Propositions 1 and 5 in Ben-Tal et al. [14] (PAPBT), our piecewise affine policies with the vertex re-scaling heuristic described in Sect. 4 (SPAP), and our tightened piecewise affine policies via lifting described in Sect. 5 (TLIFT). Note that the piecewise affine policies via lifting from Georghiou et al. [31] are implicitly included in the comparison by Proposition 14. In Table 2 we give an overview of all policies compared in our experiments.

Table 2 Overview of policies compared in the experiments

For all studies, we used Gurobi Version 9.5 on a 6-core 3.70 GHz i7-8700K processor, using a single core per instance.

6.1 Gaussian instances

We base our first set of benchmark instances on the experiments of Ben-Tal et al. [14] and Housni and Goyal [40]. Accordingly, we generate instances of Problem (1) by choosing \(m= l= n\), \({\varvec{d}}= {\varvec{0}}\), \({\varvec{D}}= {\varvec{1}}_{m}\) and generate \({\varvec{c}}, {\varvec{A}}\) randomly as

$$\begin{aligned} {\varvec{c}}&= {\varvec{e}}+ \alpha {\varvec{g}},\\ {\varvec{A}}&= {\varvec{1}}+ {\varvec{G}}. \end{aligned}$$

Here, \({\varvec{e}}\) is the vector of all ones, \({\varvec{1}}\) is the identity matrix, \({\varvec{g}}\) and \({\varvec{G}}\) are randomly generated from independent and identically distributed (i.i.d.) standard Gaussians, and \(\alpha \) is a parameter that increases the asymmetry of the problem. More specifically, \({\varvec{G}}\) is given by \(G_{ij} = |Y_{ij} |/\sqrt{m}\) and \({\varvec{g}}\) is given by \(g_i = |y_i |\), where \(Y_{ij}\) and \(y_i\) are i.i.d. standard Gaussians. Uncertainties \(\varvec{\xi }\) and decision variables \({\varvec{x}}\) are split into \(\lfloor \sqrt{m}\rfloor \) stages, where the ith decision always belongs to the same stage as the ith uncertainty. For the budgeted uncertainty sets we use a budget of \(k=\sqrt{m}\). We consider values of \(m= i^2\) for \(i\in \{2,\dots ,10\}\) and values of \(\alpha \) in \(\{0,0.1,0.5,1,5\}\). For each pair of \(m, \alpha \), we consider 30 instances. To make the results more comparable, we scale all presented objective values by the constant policy results (i.e., \(Z_\cdot / Z_{BOX}\)) and report averages over all solved instances. For each parameter pair \(m, \alpha \), we only consider those policies that found solutions on at least \(75\%\) of instances within a hard solution time limit of 4 hours. Additionally, we present all results on a logarithmic scale and artificially lower bound the scale for solution times at 0.01 s to make the effects on higher solution times more visible.
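The instance generator described above can be sketched as follows. This is a minimal numpy sketch; the function name and the seeding interface are our own choices.

```python
import numpy as np

def gaussian_instance(m, alpha, seed=None):
    """Sample c and A as in Sect. 6.1:
    c = e + alpha * g with g_i = |y_i|,
    A = I + G  with G_ij = |Y_ij| / sqrt(m),
    for i.i.d. standard Gaussians y_i, Y_ij."""
    rng = np.random.default_rng(seed)
    g = np.abs(rng.standard_normal(m))
    G = np.abs(rng.standard_normal((m, m))) / np.sqrt(m)
    c = np.ones(m) + alpha * g            # cost vector, asymmetry grows with alpha
    A = np.eye(m) + G                     # identity plus non-negative noise
    return c, A
```

For \(\alpha = 0\) the cost vector reduces to the all-ones vector, recovering the symmetric setting of Ben-Tal et al. [14].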

Fig. 6
figure 6

Relative objective values (a) and solution times (b) for different policies on Gaussian instances with hypersphere uncertainty

Figure 6 shows the performance and solution time results on hypersphere uncertainty sets for the different policies. First, note that BG only finds solutions within the time limit for the smallest instances, yielding objective values comparable to or marginally better than those of TLIFT. For the other policies, we observe that piecewise adjustable policies perform significantly better than affine adjustable policies for small values of \(\alpha \). The improvement increases for larger values of m, reaching almost a factor of 2 for \(m=100\) for our policies PAP, SPAP, and TLIFT. As expected, the performance of PAP and SPAP for small values of \(\alpha \) is almost indistinguishable due to the construction. Additionally, we find that TLIFT only yields marginal improvements over PAP and SPAP for small values of \(\alpha \). For larger values of \(\alpha \), the improvements of the piecewise affine adjustable policies vanish and TLIFT starts to improve over SPAP. The two policies without re-scaling (PAP and PAPBT) perform even worse than AFF for \(\alpha =5\). More severely, PAPBT even performs worse than BOX, which already is a worst-case policy. Only SPAP and TLIFT achieve better results than affine adjustable policies for all values of \(\alpha \).

While solution times for all policies except BOX grow exponentially in the instance size, domination-based piecewise affine adjustable policies are orders of magnitude faster than classical affine adjustable policies (AFF) and piecewise affine policies via lifting (TLIFT). These solution time improvements exceed a factor of 100 for the piecewise affine policies PAP and SPAP and a factor of 1000 for PAPBT. Also, solution times of domination-based piecewise affine adjustable policies are barely influenced by the value of \(\alpha \). This is not the case for AFF and TLIFT, which take longer to solve for increasing \(\alpha \). While this effect is not easily visible in Fig. 6 due to the logarithmic scale, the solution time difference for AFF and TLIFT between \(\alpha =0\) and \(\alpha =5\) reaches up to a factor of two on large instances.

Figure 7 shows the performance and solution time results on budgeted uncertainty sets. We observe that for budgeted uncertainty sets domination-based piecewise affine policies perform slightly worse than affine policies throughout all instances. This observation nicely demonstrates that our theoretical and experimental results are aligned, as the worse performance is perfectly explained by Proposition 8, which shows that affine policies strictly dominate our piecewise affine policies. Again, we observe that for higher values of \(\alpha \), PAP and PAPBT perform even worse than BOX. However, the solution values for SPAP stay within \(5\%\) of the affine solution values throughout all instances. We further observe that TLIFT yields the same objective values as AFF throughout all instances. Only BG yields slightly better solutions than AFF on some instances. Solution times behave similarly to those on hypersphere uncertainty, confirming that piecewise affine policies are found orders of magnitude faster across different instances and uncertainty types. Only for BG do solution times improve significantly, suggesting that BG is highly dependent on the shape of the uncertainty sets.

Fig. 7
figure 7

Relative objective values (a) and solution times (b) for different policies on Gaussian instances with budgeted uncertainty

Solution time and performance results for \(\alpha =0\) align with the results found by Ben-Tal et al. [14] for the two-stage problem variant. This demonstrates that our generalized piecewise affine policies not only extend all theoretical performance bounds but also achieve comparable numerical results in a multi-stage setting. However, by breaking the symmetry through increasing \(\alpha \), we show that pure domination-based piecewise adjustable policies perform poorly on highly asymmetric instances, and that re-scaling (SPAP) or tightened lifting (TLIFT) constitute good techniques to overcome this shortcoming.

6.2 Demand covering instances

For the second set of test instances, we consider the robust demand covering problem with non-consumed resources and uncertain demands. The problem has various applications, among others in the domains of appointment scheduling, production planning, and dispatching, and is especially relevant for the optimization of service levels. Our instances consist of \(m^l\) locations, \(m^p\) planning periods, and \(m^e\) execution periods per planning period. In each execution period t, an uncertain demand \(\xi _{lt}\) has to be covered at each location l. To do so, the decision maker can buy a fixed number of resources R at a unit cost of \(c^R\) in the first stage and then distribute these R resources among the locations at the beginning of each planning period. If a demand cannot be met with the resources assigned to a location, the decision maker will either delay the demand to the next period or redirect it to another location. In either case a fraction \(q^d_{tl}\in [0,1]\) or \(q^r_{tll'}\in [0,1]\) of the demand is lost. Each unit of lost demand causes costs of \(c^D\). Mathematically, the robust demand covering problem with non-consumed resources and uncertain demands is given by the robust LP (11), where parameters and variables are summarized in Table 3.

$$\begin{aligned} \min \quad&c^R R + c^D\left( \sum _{t\in {\mathcal {T}},l\in {\mathcal {L}}} q^d_{t} s^d_{tl} + \sum _{t\in {\mathcal {T}},l\ne l'\in {\mathcal {L}}} q^r_{ll'}s^r_{tll'}\right) \end{aligned}$$
(11a)
$$\begin{aligned} \text {s.t.}\quad&r_{p(t)l} + s^d_{tl} - (1-q^d_{t-1})s^d_{(t-1) l} \nonumber \\&+ \sum _{l'\in {\mathcal {L}},l'\ne l}\left( s^r_{tll'} - (1-q^r_{ll'})s^r_{tl'l}\right) \ge \xi _{tl}&\forall \varvec{\xi }\in {\mathcal {U}}, \forall t\in {\mathcal {T}}, l\in {\mathcal {L}} \end{aligned}$$
(11b)
$$\begin{aligned}&\sum _{l\in {\mathcal {L}}} r_{pl} \le R&\forall p\in {\mathcal {P}} \end{aligned}$$
(11c)
$$\begin{aligned}&R, {\varvec{r}}, {\varvec{s}}^r, {\varvec{s}}^d \ge 0 \end{aligned}$$
(11d)
Table 3 Notation for the robust demand covering problem with non-consumed resources and uncertain demands

Here Objective (11a) minimizes the sum of resource costs and lost demand costs due to delay and relocation. Constraints (11b) ensure that all demands are fulfilled, delayed, or relocated, and Constraints (11c) upper bound the allocated resources in each planning period by the total number of available resources R.

In most real-world applications of the demand covering problem, some of the demand will be revealed before the actual demand occurs, e.g. due to already existing contracts, sign-ups, orders, or due to forecasting. To incorporate the increase in knowledge over time, we assume an uncertainty vector of the form \(\varvec{\xi }= \varvec{\xi }^c+\varvec{\xi }^p+\varvec{\xi }^e\). Here, \(\varvec{\xi }^c\) is constant and known before the first-stage decision, \(\varvec{\xi }^p\) is revealed before each planning period, and \(\varvec{\xi }^e\) accounts for the short-term uncertainties revealed before each execution period. Specifically, we assume that demands are given by

$$\begin{aligned} \xi _{lt} = d_{lt}\left( 1+\xi ^p_{lt}+\frac{1}{2}\xi ^e_{lt}\right) , \end{aligned}$$
(12)

where \(d_{lt}\) is the base demand for location l in execution period t. Here, the uncertainty vector \((\varvec{\xi }^p,\varvec{\xi }^e)\) is taken from one base uncertainty set \({\mathcal {U}}^{B}\) of dimension \(m=2m^lm^pm^e\). Note that short-term uncertainties have a less severe effect on demands than uncertainties known in advance. The resulting demand uncertainty set

$$\begin{aligned} {\mathcal {U}}=\left\{ \varvec{\xi }\,\Big \vert \, \xi _{lt} = d_{lt}\left( 1+\xi ^p_{lt}+\frac{1}{2}\xi ^e_{lt}\right) , (\varvec{\xi }^p,\varvec{\xi }^e)\in {\mathcal {U}}^{B} \right\} \end{aligned}$$

is \(m/2\)-dimensional, where in our experiments \({\mathcal {U}}^B\) is either a hypersphere uncertainty set or a budgeted uncertainty set with budget \(\sqrt{m}\).
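The mapping (12) from a base-set realization to demands is a one-liner. In this sketch, the array shapes and the function name are our own choices.

```python
import numpy as np

def demand_realization(d, xi_p, xi_e):
    """Map a base-set realization (xi^p, xi^e) to demands via Eq. (12):
    xi_lt = d_lt * (1 + xi^p_lt + xi^e_lt / 2).

    d, xi_p, xi_e are arrays of shape (m_l, m_p * m_e); note that the
    short-term component xi^e enters with half the weight of xi^p."""
    return d * (1.0 + xi_p + 0.5 * xi_e)
```

Applying this map to every point of \({\mathcal {U}}^B\) yields exactly the demand uncertainty set \({\mathcal {U}}\) defined above.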

To construct our instances, we draw \(m^l\in \{2,4,6,8,10\}\) locations at uniform random integer positions in the square \(\left[ 0, 2\lfloor \sqrt{m^l}\rfloor + 1\right] ^2\). In each of the \(m^p\in \{1,3,5,7\}\) planning periods, we consider \(m^e=8\) execution periods corresponding to the hours in a working day. We assume that a fraction \(q^d_{tl}=0.1\) of the demand is lost when deferred to a later execution period and consider a doubled loss rate (\(q^d_{tl}=0.2\)) when demand is deferred to another planning period. Similarly, a fraction of the demand is lost when assigned to another location. We assume this fraction to be correlated with the distance and given by \(q^r_{ll'}:=\min \left( 1, 0.02\cdot \text {dist}(l,l')\right) \). We draw the base demands \(d_{lt}\) independently from the normal distribution \({\mathcal {N}}(10,4)\). Finally, we set \({c^R=1}\) and choose \(c^D\in \{0.1,0.25,0.5\}\). For each combination of \(m^l, m^p\), we consider 45 instances, using each possible value of \(c^D\) in a third of these instances.
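The geometric part of this instance generator can be sketched as follows. This is a hedged numpy sketch: the function name is ours, and we read the second parameter of \({\mathcal {N}}(10,4)\) as the standard deviation, which the text leaves ambiguous.

```python
import numpy as np

def covering_instance(m_l, m_p, m_e=8, seed=None):
    """Sketch of the Sect. 6.2 instance geometry: integer locations in
    [0, 2*floor(sqrt(m_l)) + 1]^2, relocation loss rates
    q^r_{ll'} = min(1, 0.02 * dist(l, l')), and base demands from N(10, 4)."""
    rng = np.random.default_rng(seed)
    hi = 2 * int(np.sqrt(m_l)) + 1
    pos = rng.integers(0, hi + 1, size=(m_l, 2)).astype(float)
    # pairwise Euclidean distances between locations
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    q_r = np.minimum(1.0, 0.02 * dist)
    # base demands; N(10, 4) with 4 taken as the standard deviation here
    d = rng.normal(10.0, 4.0, size=(m_l, m_p * m_e))
    return pos, q_r, d
```

Note that the relocation loss matrix is symmetric with a zero diagonal, so redirecting demand to the same location is free by construction.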

To analyze practical expected objectives, we also report, in addition to the robust objective value, a simulated average objective that the respective policies achieved on 500 randomly drawn uncertainty realizations \(\varvec{\xi }\). We give a detailed description of how uniform uncertainty realizations can be sampled efficiently from the budgeted and hypersphere uncertainty sets in “Appendix P”. We again scale the results by those achieved by constant policies and use logarithmic scales. For each instance size, we only consider those policies that found solutions on at least \(75\%\) of instances within a hard solution time limit of 2 hours.
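Appendix P gives the exact sampling procedures for the two uncertainty set types; as background, the classical recipe for sampling uniformly from an \(m\)-dimensional unit ball can be sketched in a few lines (the function name is ours, and this covers only the plain ball, not the paper's specific sets):

```python
import numpy as np

def sample_unit_ball(m, n_samples, seed=None):
    """Draw n_samples points uniformly from the m-dimensional unit ball:
    a Gaussian vector normalized to the sphere gives a uniform direction,
    and scaling by U^(1/m) makes the radius volume-uniform."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, m))
    x /= np.linalg.norm(x, axis=1, keepdims=True)      # uniform direction
    r = rng.uniform(size=(n_samples, 1)) ** (1.0 / m)  # uniform in volume
    return r * x
```

The \(U^{1/m}\) exponent compensates for the fact that most of a high-dimensional ball's volume lies near its surface; naive uniform radii would oversample the center.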

First, we observe that BG did not solve any instance within the time limit, which can be attributed to the fact that our demand covering instances are significantly larger than our Gaussian instances.

Fig. 8
figure 8

Relative objective values (a) and solution times (b) for different policies on demand covering instances with hypersphere uncertainty

For the remaining policies, Fig. 8 shows the performance and solution time results on demand covering instances with hypersphere uncertainty. Compared to our previous experiment (see Sect. 6.1), we no longer observe the strong objective improvements of piecewise affine policies over affine adjustable policies. Still, our piecewise affine formulations yield results similar to those of affine policies, and we observe small improvements on instances with a larger number of planning periods, with TLIFT yielding strict improvements for \(m^p\ge 3\). On the simulated realizations, improvements of PAP and SPAP over affine adjustable policies can already be seen for \(m^p\ge 3\), which might be of interest for a decision maker with practical interests beyond worst-case solutions.

For the solution times, we observe improvements similar to those on the Gaussian instances. Still, all domination-based piecewise affine policies can be found orders of magnitude faster than affine policies and TLIFT. Interestingly, PAP solves about as fast as PAPBT on these instances, while still achieving up to \(15\%\) better objective values on all instances. The largest instance that could be solved by affine adjustable policies within two hours consisted of 320 uncertainty variables and 700 decision variables, while the largest instance solved by PAPs was more than three times larger, with 1,120 uncertainty variables and 6,230 decision variables.

Fig. 9
figure 9

Relative objective values (a) and solution times (b) for different policies on demand covering instances with budgeted uncertainty

Figure 9 shows the results on demand covering instances with budgeted uncertainty sets. For budgeted uncertainty sets, domination-based piecewise affine policies perform worse than affine adjustable policies throughout all instances, which again can be explained by Proposition 8. Notably, PAPBT even performs worse than constant policies on most instances. In this setting, domination-based piecewise affine policies remain better only from a solution time perspective, as they still solve by orders of magnitude faster than affine policies. Also, TLIFT does not yield any improvements over AFF.

While in this set of experiments piecewise affine policies do not show the same improvements in the objective over affine adjustable policies, they still perform slightly better with hypersphere uncertainty on larger instances. Also, they still solve orders of magnitude faster, which makes them an attractive alternative for large-scale optimization in practice.

6.3 Discussion

In the experiments presented in Sects. 6.1 and 6.2, some results, e.g. the solution time improvements of piecewise affine policies over affine policies, are consistent throughout all instances. However, other results strongly depend on the set of benchmark instances used. In the following, we give an explanation for the strong solution time improvements, discuss two of the main deviations that we observe between our results on Gaussian instances and demand covering instances, and give intuition for why these differences occur.

Size of robust counterparts: Throughout all experiments, we see strong solution time improvements of piecewise affine policies over affine policies, including affine policies on lifted uncertainty (TLIFT). These can be explained by their respective robust counterparts. The robust counterpart for piecewise affine policies is given by LP (3); for the robust counterparts of affine policies, we refer to Ben-Tal et al. [9]. Counterparts of piecewise affine policies have \(O(nm)\) variables compared to \(O(nm+ lm)\) variables for affine policies, and both counterparts have \(O(lm)\) constraints. More critically, constraints for piecewise affine policies contain at most \(O(n)\) variables each and feature a block structure, which solvers exploit to significantly speed up the solution process. The blocks are connected only by the nonanticipativity constraints. In the robust counterpart of affine policies, on the other hand, \(O(l)\) constraints have up to \(O(nm)\) variables, resulting in a denser constraint matrix and the lack of a block structure. Moreover, the robust counterpart for affine policies on hypersphere uncertainty sets is no longer linear. Instead, a quadratic program has to be solved, which tends to be computationally more challenging.

Solution times of PAPBT: In the experiments, we see that PAPBT solves by a factor of 10 to 50 faster than PAP on Gaussian instances, but both find solutions similarly fast on demand covering instances. This can be explained by the construction of PAPBT and the structure of the instances’ constraints. On the Gaussian instances, \({\varvec{A}}\) and \({\varvec{x}}\) are non-negative and \({\varvec{D}}\) is the unit matrix. In the construction of dominating sets \({\hat{{\mathcal {U}}}}\) for PAPBT, most of the vertices are chosen to be scaled unit vectors. As a consequence, most Constraints (3c) in LP (3) have a zero right-hand side, such that they trivially hold. Consequently, these constraints can be eliminated, which reduces the total number of constraints by a factor of \(O(m)\). On demand covering instances, however, \({\varvec{A}}\) contains negative entries. Thus, no constraints trivially hold and none can be removed. For a decision maker who is primarily interested in fast policies, this gives a good criterion for when PAPBT can improve solution times and when no such improvements can be expected.

Performance differences between gaussian and demand covering instances: We observe that the strong performance improvements of piecewise affine policies over affine policies on gaussian instances with hypersphere uncertainty do not transfer to our demand covering instances. This suggests that the relative performance between piecewise affine policies and affine policies significantly depends on the structure of the problem at hand. An intuitive explanation for this lies in the policies’ construction.

Recall that piecewise affine policies derive solutions by finding vertex solutions \({\varvec{x}}_i\) that can be extended to a full solution. Here, \({\varvec{x}}_0\) focuses on finding a good solution for uncertainty realizations where all uncertainties take equal values, and each \({\varvec{x}}_i\) focuses on finding a good recourse to uncertainty \(\xi _i\). Thus, good results can be expected when there are (a) synergy effects that can be utilized by \({\varvec{x}}_0\), and (b) good universal recourse decisions for each uncertainty \(\xi _i\) that do not depend on the realization of other uncertainty dimensions and can be exploited by \({\varvec{x}}_i\). On the other hand, affine policies directly find solutions on the original uncertainty set. In doing so, they do not depend as strongly on good universal recourse decisions as piecewise affine policies do. However, they also lack the ability to use synergy effects in the way vertex solutions \({\varvec{x}}_0\) do.

The Gaussian instances used fulfill both of the properties that are favorable for piecewise affine policies. Being based on an identity matrix, \({\varvec{A}}\) has relatively large values along the diagonal, leading to the existence of good universal recourse decisions. Additionally, the relatively small non-negative entries off the diagonal lead to synergy effects for uncertainty realizations with many small values. Demand covering instances, however, do not fulfill these properties. The question of how to redirect demand optimally heavily depends on the demand observed at other locations. Also, the only synergistic effects arise when multiple demands occur at the same location in a single planning period.

On general instances in practice, we would thus not expect to see the same performance improvements that could be observed on our Gaussian benchmark instances. Still, piecewise affine policies find solutions orders of magnitude faster than affine policies and achieve good results throughout all benchmark instances with hypersphere uncertainty. Additionally, Properties (a) and (b) give intuitive criteria for when strong objective improvements over affine policies can be expected.

7 Conclusion

In this work, we presented piecewise affine policies for multi-stage adjustable robust optimization. We construct these policies by carefully approximating uncertainty sets with a dominating polytope, which yields a new problem that we efficiently solve with a linear program. By making use of the problem’s structure, we then extend solutions for the new problem with approximated uncertainty to solutions for the original problem. We show strong approximation bounds for our policies that extend many previously best-known bounds for two-stage ARO to its multi-stage counterpart. By doing so, we contribute towards closing the gap between the state of the art for two-stage and multi-stage ARO. To the best of our knowledge, the bounds we give are the first shown for the general multi-stage ARO problem. Furthermore, our bounds yield constant-factor as well as asymptotic improvements over the state-of-the-art bounds for the two-stage problem variant.

In two numerical experiments, we find that our policies find solutions by a factor of 10 to 1000 faster than affine adjustable policies, while mostly yielding similar or even better results. Especially for hypersphere uncertainty sets, our new policies perform well and sometimes even outperform affine adjustable policies by up to a factor of two. We observe particularly high improvements on instances that exhibit certain synergistic effects and allow for universal recourse decisions. However, on some instances where few uncertainty dimensions have a high impact on the objective, pure piecewise affine policies perform particularly badly by design, sometimes even worse than constant policies. To mitigate this shortcoming, we present an improvement heuristic that significantly improves the solution quality by re-scaling the critical uncertainty dimension. Furthermore, we construct new tightened piecewise affine policies via lifting that integrate the two frameworks of piecewise affine policies via domination and piecewise affine policies via lifting and combine their approximative power.

While this work extends most of the best-known approximation results for a relatively general class of ARO problems from the two-stage to the multi-stage setting, it remains an open question whether other strong two-stage ARO results can be generalized to multi-stage ARO in a similar manner. Answering this question remains an interesting area for further research. In this context, binary and uncertain recourse decisions remain particularly relevant challenges. Our analysis in Sect. 2.3 has shown that the extension of our policies to encompass these recourse decision types is not straightforward. Nevertheless, exploring the integration of our methodology into the established approaches of piecewise constant policies and k-adaptability, which have proven to be effective in these cases, appears to be a promising starting point for future work. Another interesting area for future research is the extension of piecewise affine policies and the concept of domination to adjustable data-driven and distributionally robust optimization. More specifically, we believe that one can obtain tractable data-driven policies by directly fitting the polyhedral uncertainty sets used to construct our policies from data.