1 Introduction

Consider a decision-maker who wants to find optimal decisions y in an environment plagued by some uncertainty represented by parameters \(\xi \). Assume that the decision-maker cares about worst-case performance, either because she is risk-averse or because the uncertainty realization is picked by an adversary. If the sequence of events is fixed and known, i.e., the order of decision commitments and uncertain parameter realizations can be specified, then one can formulate this problem as a dynamic robust optimization model. Otherwise, even setting up an optimization model becomes challenging. How should the decision-maker analyze this ill-posed problem?

It turns out that something can still be said by just looking at the two “extreme” cases of the problem, represented by the following two optimization models,

$$\begin{aligned} \text {(RO)~ } \text{ I }:\quad \min _{y \in \bigcap \limits _{\xi \in \varXi }{\mathcal {Y}}_\xi }\max _{\xi \in \varXi }f(\xi , y), \quad \quad \text {(FARO)~ } \text{ II }:\quad \min _{y(\cdot ) \in \prod \limits _{\xi \in \varXi }{\mathcal {Y}}_\xi } \max _{\xi \in \varXi }f(\xi , y(\xi )), \end{aligned}$$

where \({\mathcal {X}},{\mathcal {Y}}\) are two Euclidean spaces, \(\varXi \subseteq {\mathcal {X}}\) is the uncertainty set, and \({\mathcal {Y}}_\xi \subseteq {\mathcal {Y}}\) is the feasible decision space of y depending on the fixed set of parameters \(\xi \). Then, \(\bigcap _{\xi \in \varXi } {\mathcal {Y}}_\xi \) consists of solutions that are feasible for every realization of \(\xi \in \varXi \), while \(\prod _{\xi \in \varXi } {\mathcal {Y}}_\xi \), called the policy space, contains every function \(y:{\mathcal {X}} \rightarrow {\mathcal {Y}}\) that satisfies \(y(\xi )\in {\mathcal {Y}}_\xi \) for all \(\xi \in \varXi \) (the product sign is commonly used for such sets of dependent functions). Problems I and II are recognized as (static) robust optimization (RO) and fully adjustable robust optimization (FARO) [10] models. In the former case, all decisions have to be made prior to any uncertainty realization, while in the latter, all decisions are made afterward.

The main research focus of this paper is to understand the difference between these two problems. More precisely, we ask two questions: (i) Can we identify conditions under which the values of the two problems, denoted by \(z(\text{ I})\) and \(z(\text{ II})\), are equal? (ii) When \(z(\text{ I}) \ne z(\text{ II})\), can we measure the corresponding gap?

The answers to these two questions will provide insights for multiple associated problems and concepts, such as the conservativeness of static robust solutions and the performance of various policy families in adjustable robust optimization (ARO), among others. We provide a detailed overview of these concepts in Appendix A for interested readers. Moreover, a quantified gap between \(\text{ I }\) and \(\text{ II }\) can directly inform the decision-maker’s actions. If this gap is large, the decision-maker is inclined to postpone her decisions and invest more in revealing the uncertainty and reducing the decision lead time. Otherwise, committing to decisions early could be a viable option.

In this paper, we answer both questions positively for problems with a linear objective and linear constraints, providing theoretical characterizations and algorithmic tools to verify the conditions under which the two values are equal and to quantify the corresponding gap, termed adjustability.

1.1 Literature review

Robust optimization (RO) is a modeling approach to address parameter uncertainty in various decision problems. This method has been extensively studied [8, 10, 15, 16] and has gained popularity in many application areas such as inventory theory [2, 17], supply chain management [6, 54], queuing theory [7], scheduling and transportation [3, 42, 53], portfolio optimization [31], healthcare [38], and Markov decision processes [47], among others [26, 58]. The core idea of RO is to produce a worst-case optimal solution that is feasible for all possible realizations of the uncertain parameters.

Sometimes, robust optimization is considered too conservative. Consequently, adjustable robust optimization (ARO) has been developed to address this concern [9]. In ARO, decision-makers have the flexibility to adjust their solutions based on various types of policies to accommodate uncertainty realizations. Notably, the constant-policy counterpart is equivalent to the robust optimization problem itself, so the performance of constant policies also quantifies the adjustability gap and ratio. In this paragraph, we focus on reviewing constant-policy results, as they are particularly relevant to the development of this paper. When first introducing linear ARO with nonnegative fixed recourse, Ben-Tal et al. [9] showed that the constant policy is optimal if the uncertainty set \(\varXi \) is a box set (i.e., a hyperrectangle). Marandi and Den Hertog [43] generalized this result to ARO with constraints that are convex-concave on the product space of the uncertainty set and the decision space. These results depend heavily on the constraint-wise separability provided by the box uncertainty set. For non-box uncertainty sets, Bertsimas and Goyal [12] derived several constant-policy approximation ratios for uncertainty sets with special properties. Later, Bertsimas et al. [19] generalized this method to provide tighter bounds for various uncertainty sets using an upper bound called the stochastic gap. For a particular class of variable-recourse ARO problems, Bertsimas et al. [21] proved that the constant policy is optimal if the constraint set satisfies certain convexity conditions, and that a non-convexity measure can bound the approximation ratio. Awasthi et al. [4] studied the optimality gap of constant policies in a two-stage adjustable robust packing linear optimization problem where the uncertainty set is column-wise and constraint-wise. For these non-box results, the uncertainty set under consideration is assumed to lie entirely within the nonnegative orthant. Recently, Iancu et al. [38] showed that in a multiperiod problem, the constant policy is optimal if the objective has certain monotonicity properties and the uncertainty set has certain ordering (e.g., lattice) properties.

Besides the aforementioned constant policies [21], other common policy families include affine [18], piecewise constant (also called K-adaptability [35]), piecewise affine [25], and polynomial policies [5]. A central question in ARO concerns the optimality criteria and gaps of these policy families. In particular, the affine policy family has attracted considerable attention [2, 13, 18, 24, 28, 32, 36, 37, 54] in recent years due to its balanced trade-off between computational tractability and approximation quality. Notably, two directions have brought fruitful discussions: one concerns the optimality conditions of affine policies, and the other their optimality gap. Iancu et al. [37] is generally considered to give the most general affine optimality result—affine policies are as good as the most general decision rules when the uncertainty set is a lattice and the objective function satisfies certain convexity and discrete convexity properties. Chen and Zhang [24] showed that affine policies can be optimal if they are functions in a lifted uncertainty space. Simchi-Levi et al. [54] showed that affine policies can be optimal in a network design and flow problem when the uncertainty set is a generalized budgeted set and the network is a tree. On the optimality gaps of affine policies, El Housni and Goyal [28] showed the surprising result that affine policies provide a near-optimal approximation for their problem setup. Embedded in their proof technique is the interesting observation that sparse affine policies can also work well, in the sense that a small number of adjustable variables are affine in the uncertain parameters while the remaining adjustable variables are held static.

At the other end of the spectrum is the most flexible model, the fully adjustable robust optimization (FARO) problem. However, solving FARO exactly is intractable in general [9, 10]. Thus, one either has to use heuristic/approximation methods, such as various policy families [5, 18, 21, 25, 35] and scenario generation [34], or adopt a column/row generation method [60].

For a more detailed review of adjustable robust optimization, we refer interested readers to the comprehensive survey of Yanıkoǧlu et al. [59].

1.2 Contributions

Most results in the RO literature regarding adjustability rely on restrictive assumptions on the uncertainty set and/or the optimization model. Specifically, all the aforementioned works on adjustability impose one or more of the assumptions below,

  • Constraint-wise and/or column-wise separability [4, 9, 43];

  • Non-negativity conditions on uncertainty sets and/or input parameters [4, 12, 19, 21];

  • Convexity on uncertainty sets [4, 9, 21, 43].

These assumptions restrict the application scope of the corresponding results. In particular, they preclude modeling problems with certain natural types of constraints, such as capacity constraints in network optimization, budget constraints in resource-limited settings, or equality constraints with a nonzero right-hand-side vector. To fill these gaps, we develop a theoretical framework that relaxes all such assumptions for problems with a linear objective and linear constraints. Moreover, this framework provides new perspectives, understandings, and algorithms for analyzing adjustability. A summary of the main contributions follows.

Theory:

We characterize a necessary and sufficient condition for adjustability to be zero using the relationship between certain direction vectors in the problem and some facets of the polyhedral uncertainty set, which provides a geometrically intuitive interpretation in the form of a theorem-of-the-alternatives. Based on this, we further derive a constructive approach to approximate adjustability for problems that can have general-shaped uncertainty sets. It turns out that the optimality criteria and gap derived using our framework are more general and tighter than existing results in the literature.

Algorithms:

We design two algorithmic procedures to analyze adjustability. The first one is based on a mixed-integer linear program that can verify whether the adjustability gap is zero. The second (poly-time) algorithm computes a tight bound for adjustability using a type of geometric object called the anchor cone. We conduct extensive numerical experiments to analyze the efficacy of the proposed algorithms and the accuracy of the resulting bounds. We demonstrate that both algorithms are quite efficient. Moreover, the algorithm based on the anchor cone can often produce a tight bound for adjustability.

The presentation of our results goes as follows. Section 2 introduces the notation and gives a precise problem definition. In Sect. 3, we derive several equivalent reformulations of the RO and FARO problems and an algebraic property that bridges them. In Sect. 4, we derive a necessary and sufficient condition for the adjustability gap to be zero. This result leads to an exact zero-adjustability verification procedure based on a mixed-integer program formulation. In Sect. 5, we introduce a constructive approach to characterize and efficiently approximate the adjustability ratio. Finally, in Sect. 6, we conduct numerical experiments to analyze the computational efficiency and approximation accuracy of the proposed algorithms. To streamline the exposition, we include proofs only for the key results in the main sections and defer the others to Appendix B.

2 Preliminaries

2.1 Notation

For an optimization problem \(\varPi \), we use \(z(\varPi )\) to denote the associated optimal objective value. Given \(n \in \mathbb {N}\), [n] is defined as the set \(\{1,2,\ldots , n\}\). For any subset \(S\subseteq \mathbb {R}^n\), \(\text {int}(S)\) denotes the interior of S, and we use \(\text {conv}(S)\) and \(\text {cone}(S)\) to represent the convex hull and conic hull, respectively, formed by the elements of S. For a scalar \(r \in \mathbb {R}\), we use rS to denote the scaled set \(\{r\xi \mid \xi \in S\}\). Given any two sets of vectors \(S_1\) and \(S_2\), the Minkowski sum is defined as \(S_1 + S_2:= \{v_1 + v_2 \mid v_1 \in S_1, v_2 \in S_2\}\). For a given polyhedron \(\varXi \), \(\text {ext}(\varXi )\) and \(\text {eray}(\varXi )\) are the sets of its extreme points and extreme rays. Slightly abusing notation, we use \(\text {conv}(\varXi ):=\text {conv}(\text {ext}(\varXi ))\) and \(\text {cone}(\varXi ):=\text {cone}(\text {eray}(\varXi ))\) to represent the polytope part and cone part of \(\varXi \). It is well known that every polyhedron \(\varXi \) can be decomposed as \(\varXi = \text {conv}(\varXi ) + \text {cone}(\varXi )\).

We use upper and lower case letters to denote matrices and vectors, respectively. For a matrix A, we take \(a_i\) and \(a_{ij}\) as the ith row and (ij)th entry of A. We adopt the convention that all vectors without the transpose sign are column vectors. For instance, the ith row \(a_i\) of matrix A or any explicitly constructed vector \((v_1, v_2)\) are all considered as column vectors, of which the row vector counterparts are denoted as \(a_i^\intercal \) and \((v_1,v_2)^\intercal = (v_1^\intercal ,v_2^\intercal )\). Given \(v_1,v_2\) with the same size, we use the notation \([v_1, v_2]\) (separated by a comma) to horizontally concatenate them into a two-column matrix. Similarly, vertical stacking is done by \([v_1^\intercal ; v_2^\intercal ]\) (separated by semi-colon). These two stacking operations naturally extend to multiple matrices and/or vectors with compatible shapes. We also view a matrix as the set of its rows, i.e., \(a \in A\) is some row in A (viewed as a column vector).

We define the inner product \(\langle \cdot , \cdot \rangle :\mathbb {R}^n \times \mathbb {R}^n \rightarrow \mathbb {R}\) in the usual sense, i.e., \(\langle x_1,x_2\rangle \) (or \(\langle X_1, X_2\rangle \) for two matrices) is the sum of the entrywise products. We have the identities \(\langle v_1, Av_2\rangle = \langle A^\intercal v_1, v_2\rangle \) and \(\langle A, BC\rangle = \langle B^\intercal A, C\rangle = \langle AC^\intercal , B\rangle \).

2.2 Problem setting

In this paper, we focus on games with a linear payoff function and linear coupling constraints. Then, with the input parameters \((\varXi , a, A, c, C)\), the two general formulations \(\text{ I }\) and \(\text{ II }\) can be specified in the following forms,

$$\begin{aligned} \varPi :\quad \max _{\xi \in \varXi } ~ \min _{y \in \mathbb {R}^m}&~ \langle c, \xi \rangle + \langle a, y\rangle \end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t.}&~~ C\xi -Ay \le 0, \end{aligned}$$
(1b)
$$\begin{aligned} \bar{\varPi } :\quad \min _{y \in \mathbb {R}^m}&~ \max _{\xi \in \varXi } ~ \langle c, \xi \rangle + \langle a, y\rangle \end{aligned}$$
(2a)
$$\begin{aligned} \text {s.t.}&~~ C\xi -Ay \le 0,\quad \forall \xi \in \varXi . \end{aligned}$$
(2b)

where \(\varXi \) is an uncertainty set embedded in \(\mathbb {R}^n\). When \(\varXi \) is a polyhedron, we represent it as \(\varXi :=\{\xi \in {\mathcal {X}} \mid B\xi \le b\}\). Also, we use k to indicate the number of constraints and define the augmented matrix \(\bar{C}:=[c^\intercal ; C]\). Thus, \(c_0\) is sometimes used to refer to the input vector c. Throughout the paper, we assume \(C\ne 0\), \(\varXi \) is a closed set, and both \({\bar{\varPi }}\) and \(\varPi \) are feasible and bounded. This directly leads to the following proposition.

Proposition 1

The feasibility assumption implies (i) for every \(\xi \in \varXi \) and \(u \in \mathbb {R}^k_{+}\), if \(A^\intercal u = 0\) then \(\langle C\xi , u\rangle \le 0\); (ii) for every \(\xi \in \text {cone}(\varXi )\), \(\langle c_i, \xi \rangle \le 0\). The boundedness assumption entails (i) there exists some \(u \in \mathbb {R}^k_{+}\) such that \(A^\intercal u = a\); (ii) for every \(\xi \in \text {cone}(\varXi )\), \(\langle c, \xi \rangle \le 0\).

This proposition shows that there always exists an optimal solution \(\xi \) from \(\text {conv}(\varXi )\), even though we do not require \(\varXi \) to be bounded. In this case, we say \(\varXi \) is effectively compact. It also implies that the set \(\{ u \in \mathbb {R}^k_{+} \mid A^\intercal u = a\}\) is nonempty. We define this polyhedron formally as,

Definition 1

(Dual Polyhedron) Given \(\varPi \), the dual polyhedron is defined as

$$\begin{aligned} {\mathcal {U}}:= \{u \in \mathbb {R}_+^{k} \mid A^\intercal u = a\}. \end{aligned}$$

Thus, the input parameters \((\varXi , a, A, c, C)\) can also be represented as \((\varXi , {\mathcal {U}}, {\bar{C}})\). We will use the notation \({\bar{\varPi }}(\varXi , {\mathcal {U}}, {\bar{C}})\) or \(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) to indicate the two formulations with the specific input. In this setting, we define adjustability as follows.

Definition 2

(Adjustability) Given \(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) and the corresponding \(\bar{\varPi }\), we define the following two metrics,

$$\begin{aligned} \begin{array}{ll} \text {Adjustability gap } \delta _{\text {abs}}(\varPi ) &{}:= z(\bar{\varPi }) - z(\varPi ).\\ \text {Adjustability ratio } \delta _{\text {rel}}(\varPi ) &{}:= |z(\bar{\varPi })|/|z(\varPi )|, \text { given } z(\varPi )> 0 \text { or } z(\bar{\varPi }) < 0. \end{array} \end{aligned}$$

When \(\delta _{\text {abs}}(\varPi )=0\), we say \(\varPi \) is zero-adjustable.
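
To make Definition 2 concrete, the following minimal sketch (ours, not part of the formal development; it assumes SciPy is available) evaluates \(z(\varPi )\), \(z({\bar{\varPi }})\), and both adjustability metrics on a small hypothetical instance with a simplex uncertainty set, enumerating \(\text {ext}(\varXi )\) to handle the maximization over \(\varXi \).

```python
# Minimal sketch (not from the paper), assuming scipy is available.
# Hypothetical instance: n = k = m = 2, c = 0, C = A = I, a = (1, 1),
# Xi = {xi >= 0 : xi_1 + xi_2 <= 1}, so ext(Xi) = {0, e_1, e_2}.
import numpy as np
from scipy.optimize import linprog

a = np.array([1.0, 1.0])          # objective vector for y
C = np.eye(2); A = np.eye(2)      # constraint data in (1b)/(2b)
ext_Xi = [np.zeros(2), np.array([1.0, 0.0]), np.array([0.0, 1.0])]

def inner_min(rhs):
    """min <a, y>  s.t.  A y >= rhs, y free (the inner problem of (1))."""
    res = linprog(a, A_ub=-A, b_ub=-rhs, bounds=[(None, None)] * 2)
    return res.fun

# z(Pi): the inner value is convex in xi (it is a max of linear functions
# by LP duality), so the outer max over Xi is attained at an extreme point.
z_Pi = max(inner_min(C @ xi) for xi in ext_Xi)            # c = 0 here

# z(Pi_bar): enforce C xi <= A y at every extreme point of Xi.
stacked_A = -np.vstack([A] * len(ext_Xi))
stacked_b = -np.concatenate([C @ xi for xi in ext_Xi])
res = linprog(a, A_ub=stacked_A, b_ub=stacked_b, bounds=[(None, None)] * 2)
z_Pi_bar = res.fun

print(z_Pi, z_Pi_bar)                    # 1.0, 2.0
print(z_Pi_bar - z_Pi, z_Pi_bar / z_Pi)  # gap delta_abs = 1, ratio delta_rel = 2
```

On this instance the gap is 1 and the ratio is 2; the simplex is exactly the worst case for \(n=2\), as Example 2 in Sect. 5 will show.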

The focus of this paper is to study the conditions for zero-adjustability and estimate a tight bound for adjustability ratio when these conditions are violated. In the next section, we will derive a general criterion.

3 Symmetry gap and symmetric optimality

Our main goal is to quantify the gaps \(\delta _{\text {abs}}\) and \(\delta _{\text {rel}}\). In this section, we focus on polyhedral uncertainty set \(\varXi \) and will convert \(\delta _{\text {abs}}\) and \(\delta _{\text {rel}}\) to the equivalent metrics defined in the following problem.

Definition 3

(Bidual Problem) Given \({\bar{\varPi }}\) with a polyhedral uncertainty set \(\varXi \), the corresponding bidual and symmetric bidual problems are defined as

$$\begin{aligned} {\bar{\Delta }}:\quad \max _{\xi \in \varXi , u \in {\mathcal {U}}, V}&\quad \langle c,\xi \rangle + \langle C,V\rangle \end{aligned}$$
(3a)
$$\begin{aligned} \text {s.t.}&\quad BV^\intercal \le bu^\intercal , \end{aligned}$$
(3b)
$$\begin{aligned} {\bar{\Delta }}^{*}:\quad \max _{\xi \in \varXi , u \in {\mathcal {U}}, V}&~ \langle c,\xi \rangle + \langle C,V\rangle \end{aligned}$$
(4a)
$$\begin{aligned} \text {s.t.}&~ BV^\intercal \le bu^\intercal , \end{aligned}$$
(4b)
$$\begin{aligned}&~ V = u\xi ^\intercal , \end{aligned}$$
(4c)

where the constraint set \(V=u\xi ^\intercal \) is called the symmetry constraint.

Definition 4

(Symmetric Optimality) A feasible solution \((\xi , u, V)\) of the bidual \({\bar{\Delta }}\) is said to be symmetric if \(V = u\xi ^\intercal \). The symmetry gap and symmetry ratio are defined as \(\delta ^*_{\text {abs}}({\bar{\Delta }}):=z({\bar{\Delta }}) - z({\bar{\Delta }}^{*})\) and \(\delta ^*_{\text {rel}}({\bar{\Delta }}):=|z({\bar{\Delta }})|/|z({\bar{\Delta }}^{*})|\), respectively. We say \({\bar{\Delta }}\) is symmetrically optimal if \(\delta ^*_{\text {abs}}({\bar{\Delta }})=0\).

The requirement \(V=u\xi ^\intercal \) implies that V has rank at most one, yet the converse is false. The bidual problem \({\bar{\Delta }}\) is obtained from \({\bar{\varPi }}\) using the standard bidualization reformulation technique. First, we dualize each of the constraints. Then, we swap the minimization and maximization in \({\bar{\varPi }}\), citing the classical minimax theorem. Finally, we dualize the inner problem for a fixed \(\xi \). Thus, we have the following lemma.

Lemma 1

Given \({\bar{\varPi }}\) and its bidual \({\bar{\Delta }}\), we have \(z({\bar{\varPi }})=z({\bar{\Delta }})\).
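
As a numerical sanity check of Lemma 1 (ours; same hypothetical instance and SciPy assumption as in the sketch of Sect. 2), the bidual LP (3) can be assembled explicitly and solved:

```python
# Sketch (ours): the bidual LP (3) for the toy instance of Sect. 2,
# with B = [[1,1],[-1,0],[0,-1]], b = (1,0,0) describing the simplex Xi.
import numpy as np
from scipy.optimize import linprog

B = np.array([[1.0, 1], [-1, 0], [0, -1]]); b = np.array([1.0, 0, 0])
c = np.zeros(2); C = np.eye(2); a = np.array([1.0, 1]); A = np.eye(2)
n, k, m = 2, 2, 2
dim = n + k + k * n          # variables: (xi, u, V) with V flattened row-wise
obj = np.zeros(dim)
obj[:n] = -c                 # maximize <c, xi> + <C, V>  (linprog minimizes)
obj[n + k:] = -C.ravel()

rows, rhs = [], []
for i in range(len(b)):      # B xi <= b
    r = np.zeros(dim); r[:n] = B[i]; rows.append(r); rhs.append(b[i])
for i in range(len(b)):      # (B V^T)_{ij} <= b_i u_j, i.e., constraint (3b)
    for j in range(k):
        r = np.zeros(dim)
        r[n + k + j * n: n + k + (j + 1) * n] = B[i]   # acts on row v_j of V
        r[n + j] = -b[i]
        rows.append(r); rhs.append(0.0)

A_eq = np.zeros((m, dim)); A_eq[:, n:n + k] = A.T      # u in U: A^T u = a
bounds = [(None, None)] * n + [(0, None)] * k + [(None, None)] * (k * n)
res = linprog(obj, A_ub=np.array(rows), b_ub=np.array(rhs),
              A_eq=A_eq, b_eq=a, bounds=bounds)
print(-res.fun)   # 2.0 = z(Pi_bar), matching Lemma 1
```

The optimal V here stacks two different vertices of \(\varXi \) and is not of the form \(u\xi ^\intercal \); this asymmetry is precisely what Theorem 1 below converts into the adjustability gap.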

Using the properties of \({\bar{\Delta }}\) and \({\bar{\Delta }}^*\), we derive the following main result of this section.

Theorem 1

Given \(\varPi \) with polyhedral uncertainty set, let \({\bar{\Delta }}\) be its bidual, we have \(\delta _{\text {abs}}(\varPi )=\delta ^*_{\text {abs}}({\bar{\Delta }})\) and \(\delta _{\text {rel}}(\varPi )=\delta ^*_{\text {rel}}({\bar{\Delta }})\).

Proof

First, we show that the dual of \(\varPi \), denoted by \(\Delta \), is equivalent to the symmetric bidual, i.e., \(\Delta = {\bar{\Delta }}^{*}\). The formulation for \(\varPi \) is (1). We fix the uncertainty variables at \(\xi \) and let u be the vector of dual variables for all the constraints. Then, dualizing the inner problem gives the following,

$$\begin{aligned} \Delta :\quad \max _{\xi \in \varXi ,u \in {\mathcal {U}}}&~ \langle c, \xi \rangle + \langle C\xi , u\rangle . \end{aligned}$$

On the other hand, after substituting the symmetry constraint (4c), Formulation (4) for \({\bar{\Delta }}^*\) can be written as

$$\begin{aligned} \max _{\xi \in \varXi ,u \in {\mathcal {U}}}&~ \langle c, \xi \rangle + \langle C, u\xi ^\intercal \rangle \\ \text {s.t.}&~ (B\xi - b)u^\intercal \le 0. \end{aligned}$$

Notice that the constraint set \((B\xi - b)u^\intercal \le 0\) is redundant since \(\xi \) and u are chosen from \(\varXi \) and \({\mathcal {U}}\), respectively. Hence, \({\bar{\Delta }}^{*}=\Delta \). Then, we have the following chain of relations,

$$\begin{aligned} z(\varPi )=z(\Delta )=z({\bar{\Delta }}^{*}) \le z({\bar{\Delta }})=z({\bar{\varPi }}). \end{aligned}$$

The first equality is due to strong duality, the second is by \(\Delta ={\bar{\Delta }}^{*}\) we have just shown, the inequality is because \({\bar{\Delta }}^{*}\) has a more restricted feasible region than \({\bar{\Delta }}\), and the last equality is by Lemma 1. Therefore, the gap and ratio between \(z(\varPi )\) and \(z({\bar{\varPi }})\) are entirely captured by the symmetry gap and ratio of \({\bar{\Delta }}\). \(\square \)

Remark 1

The dualization of \(\varPi \) is done by first fixing some \(\xi \in \varXi \). Thus, we still have \(z(\varPi )=z(\Delta )\) even when \(\varXi \) is not a polyhedron or not a convex set.

The following optimality criterion is a direct consequence of Theorem 1, so we omit its proof.

Corollary 1

\(\varPi \) is zero-adjustable if and only if there exists an optimal solution of the bidual \({\bar{\Delta }}\) that is also symmetric.

We will call such solutions symmetric-optimal hereafter. The next corollary allows us to restrict our attention to the extreme points of \({\mathcal {U}}\) and \(\varXi \).

Corollary 2

\(\varPi \) is zero-adjustable if and only if there exists a symmetric-optimal solution \((\xi ^\star , u^\star , u^\star \xi ^{\star \intercal })\) of the bidual \({\bar{\Delta }}\) such that \(\xi ^\star \in \text {ext}(\varXi )\) and \(u^\star \in \text {ext}({\mathcal {U}})\).

Theorem 1 translates the adjustability gap \(\delta _{\text {abs}}\) into the symmetry gap \(\delta ^*_{\text {abs}}\) that is defined upon the bidual problem \({\bar{\Delta }}\). Compared to the former, the latter gap is more advantageous for analytic purposes since it reveals an interesting structure, the symmetry constraint \(V=u\xi ^\intercal \), that dictates the adjustability gap. In later sections, we use this characterization to derive specific zero-adjustability criteria and adjustability ratio bounds.

4 When is adjustability gap zero?

In this section, given a polyhedral uncertainty set \(\varXi \), we study the conditions under which \(\delta _{\text {abs}}=0\), i.e., the zero-adjustability criteria of \(\varPi \). In particular, we provide an exact characterization in the form of a theorem-of-the-alternatives in Sect. 4.1, analyze affine transformations that preserve zero-adjustability in Sect. 4.2, and then develop an exact algorithmic verifier based on a mixed-integer optimization formulation in Sect. 4.3. For notational convenience, throughout the paper, we will use \(\xi _i\) to denote the ith entry of \(\xi \in \varXi \) and \(\xi ^i\) to denote some vector from \(\varXi \) labeled by the index i.

One interesting observation about Formulation (3) is that the constraint set (3b) is similar to the definition of \(\varXi \). The following proposition formalizes this observation, which provides a geometric interpretation of (3b).

Proposition 2

In bidual \({\bar{\Delta }}\), constraint set (3b) is equivalent to the following,

$$\begin{aligned} v_i \in u_i\text {conv}(\varXi ) + \text {cone}(\varXi ), \ \forall i\in [k]. \end{aligned}$$
(5)

Geometrically, this constraint says that the feasible region of the ith row of matrix V is the scaled polyhedron \(u_i\text {conv}(\varXi ) + \text {cone}(\varXi )\). Thus, constraint set (5) can be viewed as a constraint propagator that propagates each constraint of \(\varXi \) to the feasible space of each \(v_i\) with a scaling factor \(u_i\).

This result allows us to represent each row \(v_i\) of matrix V as \(u_i\xi ^i + \xi ^{i'}\) for some \(\xi ^i \in \text {conv}(\varXi )\) and \(\xi ^{i'} \in \text {cone}(\varXi )\). Then, the bidual formulation (3) has the following alternative form,

$$\begin{aligned} z({\bar{\Delta }}) =&\ \max _{\xi ' \in \text {cone}(\varXi )}\langle c, \xi '\rangle +\max _{\xi \in \text {conv}(\varXi )}\langle c, \xi \rangle \\&\ + \sum _{i \in [k]}\max _{\xi ^{i'} \in \text {cone}(\varXi )}\langle c_i, \xi ^{i'}\rangle + \max _{u \in {\mathcal {U}}} \sum _{i \in [k]}u_i\left( \max _{\xi ^i \in \text {conv}(\varXi )} \langle c_i, \xi ^i\rangle \right) . \end{aligned}$$

Note that, by Proposition 1, the first and third terms in the above formulation are equal to zero, which gives

$$\begin{aligned} z({\bar{\Delta }})&=\max _{\xi \in \text {conv}(\varXi )}\langle c, \xi \rangle + \max _{u \in {\mathcal {U}}} \sum _{i \in [k]}u_i\left( \max _{\xi ^i \in \text {conv}(\varXi )} \langle c_i, \xi ^i\rangle \right) . \end{aligned}$$
(6)

Thus, we can further replace (5) with \( v_i \in u_i \text {conv}(\varXi )\) in the bidual \({\bar{\Delta }}\). With such an intuitive interpretation of the constraints, it is expected that the feasible region of (3), denoted by \(\mathfrak {P}\), has the following nice properties.

Proposition 3

Any solution \((\xi , u, V) \in \mathfrak {P}\) is an extreme point if and only if \(\xi \in \text {ext}(\varXi )\), \(u \in \text {ext}({\mathcal {U}})\), and for each row \(v_i=u_i\xi ^i\) of matrix V, either \(u_i=0\) or \(\xi ^i \in \text {ext}(\varXi )\).

4.1 Zero-adjustability criteria

We use \({\bar{\Delta }}(u)\) to denote Formulation (6) with a fixed u. Then, \(\mathcal U^\star :=\arg \max _{u \in {\mathcal {U}}}{\bar{\Delta }}(u)\) is the set of optimal solutions in \({\mathcal {U}}\). According to (6), for a fixed u, \({\bar{\Delta }}(u)\) can be viewed as \(k+1\) independent optimization problems, each of which is the support function [49, p. 28] over \(\text {conv}(\varXi )\) with cost vector \(c\) or \(c_i\). Based on this observation, we can use the following definition to characterize the zero-adjustability criterion.

Definition 5

(Normal Cone [49, p. 15]) Vector \(c'\) is normal to a closed convex set \(\varXi \) at \(\xi \in \varXi \) if \(\langle c', \xi ' - \xi \rangle \le 0\) for all \(\xi ' \in \varXi \). The set of all such vectors, \(N_{\varXi }(\xi )\), is the normal cone to \(\varXi \) at \(\xi \).

When a set of cost vectors \(\{c_i\}_{i \in L}\) belongs to the same normal cone \(N_\varXi (\xi ^\star )\) for some \(\xi ^\star \in \varXi \), the optimization problems in \(\{\max _{\xi \in \varXi }\langle c_i, \xi \rangle \}_{i \in L}\) share the same optimal solution \(\xi ^\star \). In this case, we call this family of problems co-optimal. It is clear that a family of maximization problems over the same \(\varXi \) is co-optimal if and only if the cost vectors belong to the same normal cone. With these definitions, we present the zero-adjustability theorem as follows.

Theorem 2

\(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) is zero-adjustable if and only if there exists some \(\xi \in \varXi \) and \(u \in {\mathcal {U}}^\star \cap \text {ext}({\mathcal {U}})\) such that \(c \in N_{\varXi }(\xi )\) and, for every \(i \in [k]\), one of the following is true:

  1. \(u_i = 0\),

  2. \(c_i \in N_{\varXi }(\xi )\).

Proof

By Proposition 2, Formulation (6) is equivalent to the bidual formulation (3) with the identities \(v_i = u_i\xi ^i\) for all \(i \in [k]\). Then, every symmetric solution of (3) corresponds to a feasible solution \((\xi , u, \{\xi ^i\}_{i \in [k]})\) of Formulation (6) that satisfies \(\xi =\xi ^i\) for every index i such that \(u_i >0\). For sufficiency, we take u and \(\xi \) that satisfy the premise, and construct the solution \((\xi , u,\{\xi ^i=\xi \}_{i \in [k]})\). This solution is optimal for (6) by the choice of u and \(\xi \), and thus corresponds to an optimal and symmetric solution of (3) since \(\xi ^i = \xi \) for all i. Then, by Corollary 1, \(\varPi \) is zero-adjustable. For necessity, according to Corollary 2, \(\varPi \) being zero-adjustable implies there exists a symmetric-optimal solution \((\xi ^\star , u^\star , u^\star \xi ^{\star \intercal })\) for (3) where \(\xi ^\star \in \text {ext}(\varXi )\) and \(u^\star \in \text {ext}({\mathcal {U}})\). Therefore, \((\xi ^\star , u^\star , \{\xi ^i=\xi ^\star \}_{i \in [k]})\) is an optimal solution for (6). This means \(\xi ^\star \) is an optimal solution of the problems \(\max _{\xi \in \varXi }\langle c,\xi \rangle \) and \(\max _{\xi ^i \in \varXi }\langle c_i,\xi ^i\rangle \) for every index i such that \(u^\star _i >0\), i.e., these problems are co-optimal. Thus, \(c\) and \(\{c_i\}_{i \in L}\) belong to the same normal cone \(N_\varXi (\xi ^\star )\), where L labels all the nonzero entries of \(u^\star \). \(\square \)

Interpreting Theorem 2. This theorem-of-the-alternatives bears some resemblance to complementary slackness in the KKT optimality conditions, which state that either a dual variable \(u_i\) is zero or the associated constraint is tight. In our case, u can be considered as the vector of dual variables associated with (2b). Then, Theorem 2 reveals the co-optimality requirement for the \(c_i\)’s corresponding to the nonzero entries of u. In other words, Theorem 2 broadens the understanding of equilibria in minimax problems. Classical minimax equilibria are characterized by saddle points. We show that with (coupling) constraints, equilibria are, in addition, characterized by the orthogonality of certain directions in the problem and the relative boundary of the feasible region (i.e., normal cones). This is reminiscent of how some additional orthogonality ingredients are needed to go from unconstrained, first-order optimality conditions to the KKT conditions of constrained optimization. The following corollary provides two sufficient criteria.

Corollary 3

\(\varPi (\varXi ,{\mathcal {U}}, {\bar{C}})\) is zero-adjustable if it satisfies either of the following conditions:

  1. \(\{c\}\cup \{c_i\}_{i \in [k]}\subseteq N_\varXi (\xi )\) for some \(\xi \in \text {ext}(\varXi )\);

  2. for every \(u \in \text {ext}({{\mathcal {U}}})\) with nonzero entries labeled by \(L_{u}\), we have \(\{c\} \cup \{c_i\}_{i \in L_{u}} \subseteq N_\varXi (\xi ^u)\) for some \(\xi ^u \in \text {ext}(\varXi )\).

Figure 1 illustrates the relation between normal cones and zero-adjustability. If all the rows in \({\bar{C}}\) are in the same normal cone, then the adjustability gap is zero according to the first condition in Corollary 3. Two trivial special cases follow directly: \(\varXi \) is a box set and the rows in \({\bar{C}}\) are in the same orthant; \(\varXi \) is an \(L_1\) ball and \({\bar{C}}\) has a dominant column, i.e., the largest absolute value of each row in \({\bar{C}}\) is at the same entry and of the same sign. Thus, in both of these cases, the corresponding problem \(\varPi \) is zero-adjustable.

Fig. 1: By Corollary 3, if all the rows in \({\bar{C}}\) are in the normal cone \(N_\varXi (\xi )\), then \(\varPi \) is zero-adjustable
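
For a polytope given by its vertices, the first condition of Corollary 3 can be tested directly from Definition 5, since \(c' \in N_\varXi (\xi )\) if and only if \(\langle c', \xi ' - \xi \rangle \le 0\) for every vertex \(\xi '\). A small sketch (ours; the helper names are hypothetical):

```python
# Sketch (ours, assuming a vertex description of the polytope Xi):
# check condition 1 of Corollary 3 by enumerating ext(Xi).
import numpy as np

def in_normal_cone(c, xi, ext_Xi, tol=1e-9):
    """Definition 5 for a polytope: <c, xi' - xi> <= 0 at every vertex."""
    return all(np.dot(c, v - xi) <= tol for v in ext_Xi)

def common_normal_vertex(costs, ext_Xi):
    """Return a vertex whose normal cone contains every cost vector, if any."""
    for xi in ext_Xi:
        if all(in_normal_cone(c, xi, ext_Xi) for c in costs):
            return xi
    return None

ext_Xi = [np.zeros(2), np.array([1.0, 0]), np.array([0.0, 1])]   # simplex
print(common_normal_vertex([np.array([2.0, 1]), np.array([3.0, 2])], ext_Xi))
# -> [1. 0.]: zero-adjustable by Corollary 3.
print(common_normal_vertex([np.array([1.0, 0]), np.array([0.0, 1])], ext_Xi))
# -> None: for the toy instance of Sect. 2 (rows e_1, e_2), no common
#    vertex exists, and indeed its adjustability gap is 1.
```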

For specific problems where the exact descriptions of both \(\text {ext}(\varXi )\) and \(\text {ext}({\mathcal {U}})\) are accessible, Theorem 2 can often be used to produce more interesting optimality conditions for the problem at hand. In the next subsection, we extend the analytic capability of Theorem 2 and Corollary 3 using affine transformations.

4.2 Zero-adjustability under affine transformations

Zero-adjustability results can be preserved under certain affine transformations. Using this idea, for instance, we can extend the aforementioned examples to additional zero-adjustability cases, such as parallelotope uncertainty sets.

Our main result for zero-adjustability under affine transformation is the following corollary. Recall that we also view a matrix as the set of its rows; thus, \(\text {cone}({\bar{D}})\) is the conic hull of the rows in \({\bar{D}}\).

Corollary 4

Suppose \(\varPi (\varXi , {\mathcal {U}}, \bar{C})\) is zero-adjustable and two matrices \({\bar{D}}\) and R satisfy \({\bar{D}}R \subseteq \text {cone}({\bar{C}})\). Let \(\phi \) be the affine transformation \(\phi (\xi )=R\xi + \beta \) for an arbitrary \(\beta \). Then, the transformed problem \(\varPi '(\phi (\varXi ),{\mathcal {U}}, {\bar{D}})\) is also zero-adjustable if any of the following is satisfied:

  1. \(\varPi \) satisfies Corollary 3;

  2. there exists a symmetric-optimal solution \([\xi , u, u\xi ^\intercal ]\) of \({\bar{\Delta }}\) such that \(u > 0\);

  3. \(\beta = 0\) and \(D R = \lambda C\) for some \(\lambda \ge 0\), where C and D are \(\bar{C}\) and \({\bar{D}}\) without the first row.

The proof is in Appendix C, where we also use this corollary to study zero-adjustability with respect to other uncertainty sets, such as parallelotopes and simplices.

Remark 2

Since these are special cases where zero-adjustability can be preserved, it is expected that the converse is not true. A counterexample can be constructed as follows. We construct \({\mathcal {U}}\) such that \(\text {ext}({\mathcal {U}})\) contains at least three distinct extreme points \(u,u',u''\) with their nonzero entries \(L, L', L''\) being mutually disjoint. Then, we design a surjective linear map \({\bar{C}}\) and a polytope \(\varXi \) such that rows in \(C_L\) lead to the same extreme point \(\xi \in \varXi \), rows in \(C_{L'}\) lead to the same extreme point \(\xi ' \in \varXi \) with \(\xi \ne \xi '\), and rows in \(C_{L''}\) are not in the same normal cone. Notice that for any polytope \(\varXi \), translating it with an arbitrary direction \(\beta \) will not affect the normal cones of \(\varXi \) (that is, \(N_\varXi (\xi ) = N_{\varXi +\beta }(\xi +\beta )\) for every \(\xi \in \varXi \)), but it allows us to choose an arbitrary \(u \in \text {ext}({\mathcal {U}})\) to be the optimal solution given that C is surjective (see (6)). Hence, without loss of generality, we can assume that under the original \(\varXi \), there is a unique optimal \(u^\star =u\). Clearly, by this construction, problem \(\varPi \) is zero-adjustable as rows in \(C_L\) are in the same normal cone. Now, let \(R = I\), \({\bar{D}} = {\bar{C}}\), and properly select \(\beta \ne 0\) so that the optimal \(u^\star \) shifts from u to \(u'\). Then, this transformed problem is again zero-adjustable as rows in \(C_{L'}\) are in the same normal cone. However, this construction does not satisfy any of the three conditions: it does not satisfy Corollary 3 by our construction of C; it does not satisfy the second condition by the construction of \({\mathcal {U}}\); it does not satisfy the last condition as \(\beta \ne 0\) alters the optimal \(u^\star \).

Remark 3

In particular, for any scalar \(\lambda > 0\), both \((\lambda I) \bar{C}\) and \(\bar{C} (\lambda I)\) (viewed as the set of their rows) are contained in \(\text {cone}(\bar{C})\). Thus, when any of the three conditions is satisfied for a zero-adjustable problem \(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\), the problem \(\varPi '(\phi '(\varXi ), {\mathcal {U}}, \lambda I)\) with \(\phi '(\xi )=\bar{C} \xi + \beta \) and \(\varPi ''(\phi ''(\varXi ), {\mathcal {U}}, {\bar{C}})\) with \(\phi ''(\xi )=\lambda \xi + \beta \) are also zero-adjustable for any translation vector \(\beta \). The former implies “absorbing” \({\bar{C}}\) into the definition of the uncertainty set will not affect the zero-adjustability, while the latter says zero-adjustability is preserved under any scaling and translation on the uncertainty set \(\varXi \).

4.3 MIP verifiers for zero-adjustability

Using Theorem 2, Corollary 3, and Corollary 4, we can verify the zero-adjustability of a class of problems whose input \((\varXi , {\mathcal {U}}, {\bar{C}})\) has certain special properties. In this subsection, we introduce an exact verifier for general inputs.

We assume that the values \({\overline{\omega }}_i:= \max _{\xi \in \varXi } \langle c_i, \xi \rangle \) and \({\underline{\omega }}_i:= \min _{\xi \in \varXi } \langle c_i, \xi \rangle \) can be efficiently computed, and use \({\overline{\omega }}\) and \({\underline{\omega }}\) to denote the corresponding vectors. Then, Formulation (6) can be equivalently written as

$$\begin{aligned} z({\bar{\Delta }})&= \max _{u \in {\mathcal {U}}} \langle {\overline{\omega }}, u\rangle + {\overline{\omega }}_0, \end{aligned}$$
(7)

where \({\overline{\omega }}_0:= \max _{\xi \in \varXi } \langle c, \xi \rangle \) and \({\mathcal {U}}\) is the dual polyhedron \(\{u \ge 0 \mid A^\intercal u = a\}\). Adding some extra constraints into (7), we obtain the following formulation.

$$\begin{aligned} {\bar{\Delta }}' :\quad \max _{u \in {\mathcal {U}}, \xi \in \varXi , v}&~ \langle {\overline{\omega }}, u\rangle + {\overline{\omega }}_0 \end{aligned}$$
(8a)
$$\begin{aligned} \text {s.t.}&~ Mv \ge u \end{aligned}$$
(8b)
$$\begin{aligned}&~ \langle c, \xi \rangle \ge {\bar{\omega }}_0 \end{aligned}$$
(8c)
$$\begin{aligned}&~ \langle c_i, \xi \rangle \ge {\underline{\omega }}_i + ({\bar{\omega }}_i-{\underline{\omega }}_i)v_i, \quad \forall i \in [k] \end{aligned}$$
(8d)
$$\begin{aligned}&~ v \in \{0, 1\}^{k}, \end{aligned}$$
(8e)

where M is a sufficiently large scalar.

Remark 4

The existence of such a big M is guaranteed by Proposition 1 since, by definition, we have

$$\begin{aligned} \text {cone}({\mathcal {U}})= \{u \ge 0 \mid A^\intercal u = 0\}. \end{aligned}$$

Then, statement (i) entailed by feasibility in Proposition 1 implies that for every ray \(u_0 \in \text {cone}({\mathcal {U}})\), the contribution \(\langle {\bar{\omega }}, u_0\rangle \) to (8a) is non-positive. Thus, given that the problem is feasible and bounded, there must exist an optimal u located within the compact region \(\text {conv}({\mathcal {U}})\).

Formulation (8) serves as a zero-adjustability verifier according to the following theorem.

Theorem 3

\(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) is zero-adjustable if and only if \(z({\bar{\Delta }}) = z({\bar{\Delta }}')\).

Proof

Constraint (8b) ensures that \(v_i=1\) for every nonzero entry \(u_i\). Constraints (8c) and (8d) enforce that all \(c_i\)’s corresponding to nonzero \(u_i\)’s, along with c, are in the same normal cone \(N_\varXi (\xi )\). Let \({u'}^\star \) be the optimal solution obtained by solving \({\bar{\Delta }}'\), and use \({\mathcal {U}}^\star \) and \(\text {ext}({\mathcal {U}})\) to denote the optimal solutions and extreme points of \({\mathcal {U}}\) in problem \({\bar{\Delta }}\), respectively. Suppose \(z({\bar{\Delta }}) \ne z({\bar{\Delta }}')\); then either \({\bar{\Delta }}'\) is infeasible or \({u'}^\star \notin {\mathcal {U}}^\star \). In both cases, the condition of Theorem 2 cannot be met, which implies \(\varPi \) is not zero-adjustable. On the other hand, suppose \(z({\bar{\Delta }}) = z({\bar{\Delta }}')\); then \({u'}^\star \in {\mathcal {U}}^\star \). If \({u'}^\star \) is also an extreme point of \({\mathcal {U}}\), we are done by Theorem 2. Otherwise, \({u'}^\star \) is contained in the convex combination of some solutions \(\{u^\star _i\} \subseteq {\mathcal {U}}^\star \cap \text {ext}({\mathcal {U}})\). Select an arbitrary \(u^\star \in \{u^\star _i\}\); it should be clear that the nonzero entries of \(u^\star \) form a subset of those of \({u'}^\star \) since \({\mathcal {U}}\) is nonnegative. Thus, the feasibility of \({u'}^\star \) in \({\bar{\Delta }}'\) also implies the feasibility of \(u^\star \) in \({\bar{\Delta }}'\), i.e., c and the \(c_i\)’s that correspond to nonzero entries of \(u^\star \) are still in the normal cone \(N_\varXi (\xi )\). Thus, the claim is true by Theorem 2. \(\square \)

Therefore, given an arbitrary input \(\varPi (\varXi , {\mathcal {U}}, \bar{C})\), comparing \(z({\bar{\Delta }})\) and \(z({\bar{\Delta }}')\) verifies the zero-adjustability of \(\varPi \). We note that \({\bar{\Delta }}\) is a reformulation of \({\bar{\varPi }}\), yet \({\bar{\Delta }}'\) is not a reformulation of \(\varPi \). Instead, it is derived from Theorem 2. According to the following corollary, we in fact have \(z({\bar{\Delta }}') < z(\varPi )\) whenever zero-adjustability fails (we follow the convention \(z({\bar{\Delta }}') = -\infty \) if the maximization \({\bar{\Delta }}'\) is infeasible).

Corollary 5

\(\varPi (\varXi ,{\mathcal {U}}, \bar{C})\) is not zero-adjustable if and only if \(z({\bar{\Delta }}') < z(\varPi )\).

According to this corollary, \({\bar{\Delta }}'\) provides an underestimate of \(z(\varPi )\) and of the associated dualized problem \({\bar{\Delta }}^{*}\).
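
The following sketch (ours) instantiates the verifier (8) on the toy instance used earlier, assuming SciPy ≥ 1.9 for its milp interface; the support-function values \({\overline{\omega }}\), \({\underline{\omega }}\), and \({\overline{\omega }}_0\) are assumed to be precomputed by linear programming.

```python
# Sketch (ours): the verifier (8) on the toy instance of Sect. 2.
# Here w_bar = (1,1), w_low = (0,0), w_bar0 = 0 are the support values.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

B = np.array([[1.0, 1], [-1, 0], [0, -1]]); b = np.array([1.0, 0, 0])
c = np.zeros(2); C = np.eye(2); a = np.array([1.0, 1]); A = np.eye(2)
n = k = m = 2; M = 1e3
w_bar = np.array([1.0, 1]); w_low = np.zeros(2); w_bar0 = 0.0

dim = k + n + k                     # variable order: (u, xi, v), v binary
obj = np.zeros(dim); obj[:k] = -w_bar        # maximize <w_bar, u> in (8a)
cons = []
Aeq = np.zeros((m, dim)); Aeq[:, :k] = A.T
cons.append(LinearConstraint(Aeq, a, a))                 # u in U
Axi = np.zeros((B.shape[0], dim)); Axi[:, k:k + n] = B
cons.append(LinearConstraint(Axi, -np.inf, b))           # xi in Xi
Av = np.zeros((k, dim)); Av[:, :k] = np.eye(k); Av[:, k + n:] = -M * np.eye(k)
cons.append(LinearConstraint(Av, -np.inf, 0))            # u <= M v      (8b)
A0 = np.zeros((1, dim)); A0[0, k:k + n] = c
cons.append(LinearConstraint(A0, w_bar0, np.inf))        # <c, xi> >= w_bar0 (8c)
Ai = np.zeros((k, dim)); Ai[:, k:k + n] = C
Ai[:, k + n:] = -np.diag(w_bar - w_low)
cons.append(LinearConstraint(Ai, w_low, np.inf))         # (8d)

integrality = np.r_[np.zeros(k + n), np.ones(k)]         # v in {0,1}^k   (8e)
lb = np.r_[np.zeros(k), -np.inf * np.ones(n), np.zeros(k)]
ub = np.r_[np.inf * np.ones(k + n), np.ones(k)]
res = milp(obj, constraints=cons, integrality=integrality, bounds=Bounds(lb, ub))
print(res.status)   # 2 = infeasible: z(Delta') = -inf < z(Delta) = 2, so the
                    # toy problem is not zero-adjustable (Theorem 3, Corollary 5)
```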

Notice that Constraints (8b)–(8d) state that c and \(c_i\)’s corresponding to the nonzero entries in u need to be in the same normal cone. For certain given \(\varXi \) and \(\bar{C}\), we can efficiently merge \(c_i\)’s into groups by their co-optimality. In this case, an equivalent MIP reformulation can be constructed with the following definition.

Definition 6

(Co-optimal Index Cover) Given \(\varPi (\varXi ,{\mathcal {U}}, \bar{C})\), we define

$$\begin{aligned} {\mathfrak {L}}':= \{L \subseteq [k] \mid \{c\} \cup \{c_i\}_{i \in L} \text { are co-optimal}\}, \end{aligned}$$

and collect the maximal subsets in \({\mathfrak {L}}'\) to form the co-optimal index cover

$$\begin{aligned} {\mathfrak {L}}=\{L_j\}_{j \in {\mathcal {J}}}, \end{aligned}$$

where we use \({\mathcal {J}}\) to index all these maximal subsets. Furthermore, for every \(j \in {\mathcal {J}}\), we use \(\eta _j\) to denote the indicator vector of \(L_j\). That is, for every \(i \in [k]\), the ith entry of the binary vector \(\eta _j\) is one if and only if \(i \in L_j\).

Using this definition, we have the following corollary.

Corollary 6

Given \(\varPi (\varXi ,{\mathcal {U}}, \bar{C})\) with the co-optimal index cover \({\mathfrak {L}}=\{L_j\}_{j \in {\mathcal {J}}}\), then \({\bar{\Delta }}'\) is equivalent to the MIP \({\bar{\Delta }}''\) defined as follows,

$$\begin{aligned} {\bar{\Delta }}'' :\quad \max _{u \in {\mathcal {U}}, z}&~ \langle {\overline{\omega }}, u\rangle + {\overline{\omega }}_0 \end{aligned}$$
(9a)
$$\begin{aligned} \text {s.t.}&~ u \le M(1 -z_j(1-\eta _{j})), \quad \forall j \in {\mathcal {J}} \end{aligned}$$
(9b)
$$\begin{aligned}&~ \sum _{j \in {\mathcal {J}}} z_j = 1 \end{aligned}$$
(9c)
$$\begin{aligned}&~ z \in \{0, 1\}^{{\mathcal {J}}}, \end{aligned}$$
(9d)

where M is a sufficiently large number.

Compared to (8), MIP (9) eliminates the variables and constraints associated with \(\xi \in \varXi \). Moreover, since only one binary variable \(z_j\) equals one due to (9c), we can address (9) by solving each linear program resulting from fixing \(z_j=1\), for each different \(j \in {\mathcal {J}}\). Hence, when \({\mathfrak {L}}\) can be efficiently generated, (9) is a zero-adjustability verifier with complexity \(O(|{\mathcal {J}}|\Lambda )\) where \(\Lambda \) is the time to solve each associated LP. We provide the following example for budgeted uncertainty sets with a nonnegative \({\bar{C}}\).

Example 1

(Co-optimal Index Cover Generation for Budgeted Sets with \({\bar{C}} \ge 0\)) A budgeted set \(\varXi \) is defined as

$$\begin{aligned} \varXi _\beta := \{\xi \in [0, 1]^n \mid \Vert \xi \Vert _1 \le \beta \} \text { for some } \beta \in [n]. \end{aligned}$$

The corresponding extreme points are

$$\begin{aligned} \text {ext}(\varXi _\beta ) = \{\xi \in \{0, 1\}^n \mid \Vert \xi \Vert _1 \le \beta \}. \end{aligned}$$

Then, we have the following proposition regarding co-optimality.

Proposition 4

Given a budgeted set \(\varXi _\beta \) for some constant \(\beta \in [n]\), a set of vectors \(\{c^i\}_{i \in I} \subseteq \mathbb {R}_+^n\) are co-optimal if and only if there exists some index subset \(S \subseteq [n]\) with \(|S| = \beta \) such that, for every \(i \in I\), \(\{c^i_{j}\}_{j \in S}\) are the largest \(\beta \) entries of \(c^i\).

This proposition leads to the following procedure to generate the corresponding co-optimal index cover.

  • Generate all index subsets S with \(|S| = \beta \) that label the largest \(\beta \) entries of c (there are at most \(\binom{n}{\beta }\) such subsets).

  • For each S,

    • Initialize an empty set \(L_S = \varnothing \),

    • For each \(i \in [k]\), add i into \(L_S\) if S labels the largest \(\beta \) entries of \(c_i\),

    • Add \(L_S\) to the co-optimal index cover \({\mathfrak {L}}\) if \(L_S \ne \varnothing \).

  • Return \({\mathfrak {L}}\).

It is clear that this procedure produces exactly the cover \({\mathfrak {L}}\) of Definition 6 by Proposition 4. Moreover, this generation algorithm is poly-time for a fixed \(\beta \) since there are at most \(O(n^\beta )\) iterations, each of which checks k vectors for their largest \(\beta \) entries. Also, the size of \({\mathfrak {L}}\) is of order \(O(n^\beta )\). As mentioned before, fixing each \(L \in {\mathfrak {L}}\) makes Formulation (9) a linear program, and these LPs can be solved in parallel since there is no dependence between different L’s in \({\mathfrak {L}}\). In conclusion, given \({\bar{C}} \ge 0\), for the budgeted set \(\varXi _\beta \) with a fixed constant \(\beta \in [n]\), problem (9) can be solved in poly-time with a parallel implementation. Note that the algorithm is not poly-time in terms of \(\beta \) since \(\binom{n}{\beta }\) grows exponentially with \(\beta \).
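
The following sketch (ours; the helper names are hypothetical) implements the procedure above and then evaluates (9) by solving one LP per cover element, as just described.

```python
# Sketch (ours): cover generation for a budgeted set with C_bar >= 0
# (Proposition 4), then one LP per cover element for (9). Assumes scipy.
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def top_beta_supports(v, beta, tol=1e-12):
    """All index sets S with |S| = beta labeling the largest beta entries of v."""
    order = np.argsort(-v)
    thresh = v[order[beta - 1]]                    # beta-th largest value
    fixed = [i for i in range(len(v)) if v[i] > thresh + tol]
    ties = [i for i in range(len(v)) if abs(v[i] - thresh) <= tol]
    return [frozenset(fixed) | frozenset(s)
            for s in combinations(ties, beta - len(fixed))]

def co_optimal_cover(c, C, beta):
    cover = []
    for S in top_beta_supports(c, beta):           # candidate supports from c
        L = {i for i in range(C.shape[0]) if S in top_beta_supports(C[i], beta)}
        if L:
            cover.append(L)
    return cover

def z_restricted(A, a, w_bar, L):
    """One LP of (9): max <w_bar, u> over U with u_i = 0 outside L."""
    bounds = [(0, None) if i in L else (0, 0) for i in range(A.shape[0])]
    res = linprog(-w_bar, A_eq=A.T, b_eq=a, bounds=bounds)
    return -res.fun if res.status == 0 else -np.inf

c = np.array([3.0, 2, 1])                          # objective row c_0
C = np.array([[5.0, 4, 1], [1.0, 6, 2], [2.0, 3, 9]])
cover = co_optimal_cover(c, C, beta=2)
print(cover)                                       # [{0}]: only c_1 shares {0,1}

A = np.ones((3, 1)); a = np.array([1.0])           # hypothetical U (a simplex)
w_bar = np.array([9.0, 8, 12])                     # top-2 sums of the rows of C
print(max(z_restricted(A, a, w_bar, L) for L in cover))   # 9.0
print(z_restricted(A, a, w_bar, set(range(3))))           # 12.0 = unrestricted
# 9 < 12, so this instance is not zero-adjustable by Theorem 3
```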

To conclude the section, we note that algorithms designed to address general bilinear programs can also be employed to verify the zero-adjustability by directly solving the problem \(\Delta \) (in the proof of Theorem 1). Several such methods are available in the literature:

  • McCormick envelopes with a branch-and-bound (B&B) implementation [45].

  • The bilinear reformulation-and-linearization technique with a B&B implementation [52].

  • The digitized MIP reformulation [27] that can approximate a given bilinear program with an arbitrary tolerance \(\epsilon > 0\).

  • Bilinear program approximation methods based on scenario sampling [34].

Without deviating from the main discussion, we have relegated the detailed comparison between these methods and the proposed MIP (8) to Appendix 1. Many commercial solvers, including Gurobi, CPLEX, and SCIP, have incorporated the first method as a primary component of their implementations for solving bilinear problems. The experiments in Sect. 6 demonstrate that the two MIPs (8) and (9) significantly outperform the latest bilinear solver implemented in Gurobi.

5 Adjustability ratio

In this section, we examine the cases where the adjustability gap may be non-zero, i.e., the adjustability ratio may not be one. In Sect. 5.1, we provide a constructive approach to characterize a bound on the adjustability ratio. In Sects. 5.2 and 5.3, we present an algorithmic procedure to estimate the tightest bound under this approach. To the authors’ knowledge, this is the first characterization of, and algorithmic approach to quantifying, the adjustability ratio in such a general setting.

Recall that in the proof of Theorem 1, we have shown \(z(\varPi )=z(\Delta ) \le z({\bar{\Delta }})=z({\bar{\varPi }})\). Thus, the objective value of the original problem \(z(\varPi )\) can be lower bounded by solving \(\Delta \) restricted to some subset of \(\varXi \) as the uncertainty set; the value of the constant policy problem \(z({\bar{\varPi }})\) can be upper bounded by solving \({\bar{\Delta }}_{\varXi '}\) with some polyhedron \(\varXi '\supseteq \varXi \). When this superset is properly constructed to satisfy the condition of Theorem 2 or Corollary 3, it becomes possible to estimate the adjustability ratio \(\delta _{\text {rel}}(\varPi )\). In this section, \(\varXi \) is only assumed to be a closed set.

5.1 Bound on the adjustability ratio

The following result provides a constructive way to bound \(\delta _{\text {rel}}(\varPi )\).

Theorem 4

Given \(\varPi (\varXi , {\mathcal {U}}, \bar{C})\) where \(z(\varPi ) > 0\) (or \(z(\varPi )<0\)), if there exists some polyhedron \(\varXi '\supseteq \varXi \), some \(\xi ' \in \varXi '\), and a scalar \(K \ge 1\) (or \(0<K \le 1\)) that satisfy \(\bar{C} \subseteq N_{\varXi '}(\xi ')\) and \(\xi '/K \in \varXi \), then, we have the bound \(1 \le \delta _{\text {rel}}(\varPi ) \le K (\text { or } 1 \ge \delta _{\text {rel}}(\varPi ) \ge K).\)

Proof

Let \({\bar{\varPi }}'\) and \(\varPi '\) be the RO and FARO formulations with input parameters \((\varXi ', {\mathcal {U}}, \bar{C})\). Clearly, we have \(z({\bar{\varPi }}') \ge z({\bar{\varPi }}) \ge z(\varPi )\), where the first inequality is due to \(\varXi \subseteq \varXi '\). The premise also states that all rows of \(\bar{C}\) belong to the same normal cone of \(\varXi '\); then, according to Corollary 3, \(\varPi '\) is zero-adjustable. That is, the corresponding bidual is symmetrically optimal, and the optimal value is \(z(\varPi ') = z({\bar{\varPi }}') = \max _{u \in {\mathcal {U}}}~ \langle c,\xi '\rangle + \langle C, u \xi '^\intercal \rangle \). On the other hand, we have \(\xi '/K \in \varXi \). Thus, for any \(u \in {\mathcal {U}}\), the solution \((\xi '/K, u)\) is feasible to \(\Delta \)—the dual of \(\varPi \). Hence, we have the following lower bound,

$$\begin{aligned} z(\varPi ) = z(\Delta ) \ge \max _{u \in {\mathcal {U}}}~ \langle c, \xi '/K\rangle + \langle C, u \xi '^\intercal /K\rangle = z({\bar{\varPi }}')/K. \end{aligned}$$

The first equality has been shown in the proof of Theorem 1 and is true for any closed set \(\varXi \) by Remark 1. Combining all these inequalities and \(K > 0\), we get \(Kz(\varPi ) \ge z(\bar{\varPi }') \ge z(\bar{\varPi }) \ge z(\varPi ).\) Finally, when \(z(\varPi )>0\) and \(K \ge 1\), we have \(K z(\varPi ) \ge z(\bar{\varPi }) \ge z(\varPi ) > 0\); when \(z(\varPi )<0\) and \(0 < K \le 1\), we also have \(z(\varPi ) \le z(\bar{\varPi }) \le K z(\varPi ) < 0\). In both cases, the adjustability ratio \(\delta _{\text {rel}}(\varPi )\) is well-defined and can be computed directly, which gives the desired bound K. \(\square \)

Remark 5

Theorem 4 does not require \(\varXi \) to be a polyhedron. In fact, \(\varXi \) can be an arbitrary closed set, such as a discrete, scenario-based uncertainty set, which can arise from practical/data-driven robust optimization problems [22, 50].

Fig. 2: Bound on the adjustability ratio for a non-convex \(\varXi \), where O is the origin and \(\varXi '\) is a constructed polyhedron. A better bound can be obtained by carefully constructing a tighter \(\varXi '\) (see Sect. 5.2)

Figure 2 provides the intuition of Theorem 4. Given a non-convex set \(\varXi \) and \({\bar{C}}\) consisting of vectors \(c, c_1, c_2\), we estimate an upper bound on \(\delta _{\text {rel}}\) by constructing a polyhedron \(\varXi '\) such that the rows in \({\bar{C}}\) are in the same normal cone \(N_{\varXi '}(\xi ')\). Next, we scale and translate \(\varXi '\) to contain the original set \(\varXi \). Then, any scalar \(K\) that satisfies \(\xi '/K \in \varXi \) provides a valid upper bound on \(\delta _{\text {rel}}\). In this example, 4/3 is the smallest upper bound under the constructed polyhedron \(\varXi '\). However, a tighter bound can be obtained by delicately constructing a different \(\varXi '\). The “optimal” type of polyhedron is called the anchor cone, and will be introduced in Sect. 5.2.

In the following three examples, we derive closed-form upper bounds of \(\delta _{\text {rel}}\) to illustrate the usage of Theorem 4. In these examples, we always assume \(z(\varPi ) > 0\) so that the ratio is well-defined.

Example 2

(Convex and Compact \(\varXi \subseteq \mathbb {R}^n_+\); \(\bar{C}\ge 0\)) In this case, the adjustability ratio \(\delta _{\text {rel}}(\varPi )\) is bounded by n. To derive this, we will first use Theorem 4 to show a more general result. We define \(X^\star :=\prod _{i \in [n]}\arg \max _{\xi \in \varXi }\xi _i\) where \(\xi _i\) is the ith entry of \(\xi \). Thus, each element \(x = (x_i)_{i \in [n]} \in X^\star \) is a tuple of vectors whose ith component \(x_i\) is a maximizer along the ith axis. We use \(\mu (x)\) to denote the number of unique vectors in x, i.e., the cardinality of the set \(\{x_i\}_{i \in [n]}\). Then, we can establish the following.

Corollary 7

Given \(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) where \(\bar{C}\ge 0\) and \(\varXi \subseteq \mathbb {R}^n_+\) is convex, we have \(\delta _{\text {rel}}(\varPi ) \le \mu (x)\) for every \(x \in X^\star \).

This directly gives \(\delta _{\text {rel}}(\varPi ) \le n\) since n is a trivial upper bound on \(\mu (x)\). This result improves upon the previous bound of O(n) from the literature [12]. It is also easy to show that n is tight, since a simplex \(\varXi \) achieves this bound exactly. More generally, the class of budget sets \(\varXi :=\{\xi \in [0,1]^n \mid \langle 1, \xi \rangle \le \beta n\}\) for \(\beta \in [0, 1]\) serves as a natural transition between a scaled simplex (when \(\beta \le 1/n\)) and a box set (when \(\beta =1\)). Thus, the corresponding upper bound on the adjustability ratio also decreases continuously from n to 1 as \(\beta \) slides from 0 to 1. \(\square \)
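
The quantity \(\mu (x)\) can be estimated directly from a vertex description of \(\varXi \). The greedy sketch below (ours) selects per-axis maximizers so as to reuse vertices; it returns a valid, though not necessarily minimal, value of \(\mu (x)\) for some \(x \in X^\star \), which still bounds \(\delta _{\text {rel}}\) by Corollary 7.

```python
# Sketch (ours, numpy only): a greedy upper bound on min mu(x) over X*.
import numpy as np

def mu_bound(ext_Xi, tol=1e-9):
    """Greedily pick one maximizer per axis, reusing vertices when possible."""
    V = np.array(ext_Xi)
    best = V.max(axis=0)                     # per-axis maximum values
    remaining = set(range(V.shape[1]))
    count = 0
    while remaining:
        # pick the vertex that maximizes the most remaining axes
        scores = [(sum(1 for i in remaining if V[j, i] >= best[i] - tol), j)
                  for j in range(len(V))]
        _, j = max(scores)
        remaining -= {i for i in remaining if V[j, i] >= best[i] - tol}
        count += 1
    return count

simplex = [np.zeros(3), *np.eye(3)]
box = [np.array(v, dtype=float) for v in np.ndindex(2, 2, 2)]
print(mu_bound(simplex), mu_bound(box))      # 3 and 1, matching Example 2
```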

Example 3

(Convex and Compact Lattice \(\varXi \subseteq \mathbb {R}^n_+\); \(\bar{C}\ge 0\)) With the extra lattice property, i.e., \(\xi ^1 \vee \xi ^2 \in \varXi \) for every \(\xi ^1, \xi ^2 \in \varXi \) where \(\vee \) is the entry-wise maximum operator, the adjustability bound can be further tightened to \(\min (\dim (\varXi ) + 1, n)\). The n part has been established in the previous example. Suppose \(\dim (\varXi ) + 1 < n\); the following corollary shows that for every \(x \in X^\star \) such that \(\dim (\varXi ) + 1 < \mu (x) \le n\), we can construct another \(x' \in X^\star \) with \(\mu (x') < \mu (x)\).

Corollary 8

Given \(\varPi (\varXi , {\mathcal {U}}, {\bar{C}})\) where \(\bar{C}\ge 0\) and \(\varXi \subseteq \mathbb {R}^n_+\) is a convex lattice, suppose \(\dim (\varXi ) + 1 < n\), then for every \(x \in X^\star \) such that \(\mu (x) > \dim (\varXi ) + 1\), there exists some \(x' \in X^\star \) such that \(\mu (x') < \mu (x)\).

The proof can be found in Appendix B. This result, along with Corollary 7, establishes the bound \(\min (\dim (\varXi ) + 1, n)\). \(\square \)

When we know the specific description of the uncertainty set \(\varXi \), tighter bounds can be derived using Theorem 4.

Example 4

(Ellipsoidal \(\varXi \); \(\bar{C}\ge 0\)) An ellipsoidal uncertainty set can be defined as \(\varXi = \{\xi \mid \sum _{i \in [n]} \xi _i^2/l_i^2 \le 1\}\) for some \(l = (l_i)_{i \in [n]} > 0\). We take \(\varXi '\) as the box set that circumscribes \(\varXi \), i.e., \(\varXi ' = \prod _{i \in [n]}[-l_i, l_i]\). Because \(\bar{C} \ge 0\), all the row vectors of \(\bar{C}\) belong to the normal cone \(N_{\varXi '}(l)\). Then, the intersection point between the line segment [0, l] and the boundary \(\partial \varXi = \{\xi \mid \sum _{i \in [n]}\xi _i^2/l_i^2 = 1\}\) can be directly computed as \(l/\sqrt{n}\). Applying Theorem 4, we get \(\delta _{\text {rel}}(\varPi ) \le \sqrt{n}\) for ellipsoids that are centered at the origin.

The same technique can be used to derive closed-form bounds for translated and/or rotated ellipsoidal uncertainty sets. For instance, for a given ellipsoidal set \(\varXi = \{\xi \mid \sum _{i \in [n]} \xi _i^2/l_i^2 \le 1\}\), let x be the intersection point of the line segment [0, l] and the boundary of \(\varXi \), we can easily derive a tight bound of \(\delta _{\text {rel}}\) for the translated ellipsoidal set \(\varXi ' = \{\xi \mid \sum _{i \in [n]} (\xi _i - \lambda l_i)^2/l_i^2 \le 1\}\) for some \(\lambda \ge 0\). We construct the tightest box set whose maximal point is \(\lambda l + l\) to enclose \(\varXi '\). Then, we get the following bound

$$\begin{aligned} \delta _{\text {rel}}\le \frac{\Vert \lambda l\Vert _2 + \Vert l\Vert _2}{\Vert \lambda l\Vert _2 + \Vert x\Vert _2} \le \frac{\Vert \lambda l\Vert _2 + \Vert l\Vert _2}{\Vert \lambda l\Vert _2 + \frac{\Vert l\Vert _2}{\sqrt{n}}}=\frac{(\lambda + 1)\sqrt{n}}{\lambda \sqrt{n}+ 1}= 1 + \frac{\sqrt{n}- 1}{\lambda \sqrt{n}+ 1}, \end{aligned}$$

where the first inequality is obtained by direct computation (notice \(\lambda l\), l, and x are in the same direction), and the second is due to the result \(\Vert l\Vert _2/\Vert x\Vert _2 \le \sqrt{n}\) from the above example. This upper bound yields some interesting observations. When \(\lambda = 0\), i.e., there is no translation, it recovers the previous upper bound \(\sqrt{n}\). However, for any \(\lambda > 0\), this upper bound is strictly less than the constant \(1 + 1/\lambda \). Moreover, the bound decreases as the translation coefficient \(\lambda \) increases. \(\square \)
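As a quick numerical illustration of this closed-form bound, the following sketch (our own; `translated_ellipsoid_bound` is a hypothetical helper) evaluates \(1 + (\sqrt{n}-1)/(\lambda \sqrt{n}+1)\) and confirms that \(\lambda = 0\) recovers \(\sqrt{n}\), while any \(\lambda > 0\) stays below \(1 + 1/\lambda \).

```python
import numpy as np

def translated_ellipsoid_bound(n: int, lam: float) -> float:
    """Closed-form upper bound on the adjustability ratio for the translated
    ellipsoid of Example 4: 1 + (sqrt(n) - 1) / (lam * sqrt(n) + 1)."""
    return 1.0 + (np.sqrt(n) - 1.0) / (lam * np.sqrt(n) + 1.0)

n = 9
for lam in [0.0, 0.5, 1.0, 2.0, 10.0]:
    b = translated_ellipsoid_bound(n, lam)
    # lam = 0 recovers sqrt(n) = 3; for lam > 0 the bound stays below 1 + 1/lam
    print(f"lam={lam:5.1f}  bound={b:.4f}",
          "" if lam == 0 else f"(< 1 + 1/lam = {1 + 1/lam:.4f})")
```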

In Appendix E, we also apply Theorem 4 to other uncertainty sets and input matrices \(\bar{C}\). In particular, when \(\varXi \) is ellipsoidal and the jth column of \(\bar{C}\) is a dominant column, \(\delta _{\text {rel}}\) is bounded by \(\Vert l\Vert _2/(2l_j)\); when \(\varXi \) is the budgeted set \(\{\xi \in [-1,1]^n \mid \Vert \xi \Vert _1 \le \Gamma \}\), \(\delta _{\text {rel}}\) is bounded by \(n/\Gamma \) given \(\bar{C}\ge 0\), and is bounded by \(\Gamma \) given that \(\bar{C}\) has a dominant column.

5.2 Formulation for adjustability ratio estimation

We have shown that Theorem 4 can be used to analytically derive bounds for the adjustability ratio \(\delta _{\text {rel}}\). In this subsection, we formalize this idea into a mathematical formulation, called the anchor cone formulation, that produces the tightest such bound. Throughout this subsection, we assume that the problem \(\max _{\xi \in \varXi }\langle c, \xi \rangle \) can be solved efficiently for every possible vector c, or that an efficient oracle for it is available.

Anchor cones form a special class of polyhedra that we use to construct \(\varXi '\) in Theorem 4 to produce a valid bound. We define them as follows.

Definition 7

(Anchor Cone) Given a finite set of vectors \({\mathcal {C}} = \{c_i\}_{i \in L}\) and a point \(x_0\), we define the corresponding anchor cone as

$$\begin{aligned} \mathfrak {A}_{{\mathcal {C}}, x_0}:=\left\{ x \mid \langle c_i, x\rangle \le \langle c_i, x_0\rangle ,\ \forall i \in L \right\} \end{aligned}$$

where \(x_0\) is called the anchor of \(\mathfrak {A}_{\mathcal C, x_0}\).

By this definition, an anchor cone is a convex set constructed by anchoring a cone at \(x_0\). This is a more liberal use of the concept of a cone than is conventional, since our “cone” need not be anchored at the origin; we nonetheless keep the name for its geometric intuition. Anchor cones have several properties that will be used later. For \(\text {cone}({\mathcal {C}})\), we use \(\text {cone}^*({\mathcal {C}})\) and \(\text {cone}^\circ ({\mathcal {C}})\) to denote the corresponding dual and polar cones.

Proposition 5

Every anchor cone \(\mathfrak {A}_{{\mathcal {C}}, x_0}\) has the following properties: (i) \(\mathfrak {A}_{{\mathcal {C}}, x_0} = \{x_0\} + \text {cone}^\circ ({\mathcal {C}})\); (ii) \(N_{\mathfrak {A}_{{\mathcal {C}}, x_0}}(x_0)=\text {cone}({\mathcal {C}})\); (iii) constraints of \(\mathfrak {A}_{{\mathcal {C}}, x_0}\) that correspond to vectors in \(\text {eray}({\mathcal {C}})\) are sufficient to define \(\mathfrak {A}_{{\mathcal {C}}, x_0}\).
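For concreteness, here is a minimal numpy sketch of Definition 7 (our own illustration; `in_anchor_cone` is a hypothetical helper). With \({\mathcal {C}} = \{e_1, \dots , e_n\}\), the anchor cone is \(\{x \mid x \le x_0\}\), i.e., \(\{x_0\} + \text {cone}^\circ ({\mathcal {C}})\) with \(\text {cone}^\circ ({\mathcal {C}}) = \mathbb {R}^n_-\), consistent with property (i).

```python
import numpy as np

def in_anchor_cone(C: np.ndarray, x0: np.ndarray, x: np.ndarray,
                   tol: float = 1e-9) -> bool:
    """Membership test for the anchor cone of Definition 7:
    x belongs to the cone iff <c_i, x> <= <c_i, x0> for every row c_i of C."""
    return bool(np.all(C @ x <= C @ x0 + tol))

# Toy data (hypothetical): C = I spans the nonnegative orthant, anchor at x0.
C = np.eye(3)
x0 = np.array([1.0, 2.0, 3.0])
print(in_anchor_cone(C, x0, x0))               # the anchor itself: True
print(in_anchor_cone(C, x0, x0 - np.ones(3)))  # x0 plus a polar-cone direction: True
print(in_anchor_cone(C, x0, x0 + np.ones(3)))  # outside the cone: False
```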

Let J be the index set of \(\text {eray}(\bar{C})\) and let \({\bar{\omega }}_j:= \max _{\xi \in \varXi } \langle c_j, \xi \rangle \) for every \(j \in J\). The anchor cone formulation is then defined as

$$\begin{aligned} \Lambda := \min _{\gamma , \xi \in \varXi }(\text {or } \max )&\quad \gamma&\text {(10a)}\\ \text {s.t.}&\quad \langle c_j, \gamma \xi \rangle \ge {\bar{\omega }}_j, \quad \forall j \in J,&\text {(10b)}\\ &\quad \gamma \ge 1\ (\text {or } \gamma \le 1).&\text {(10c)} \end{aligned}$$

The minimization with \(\gamma \ge 1\) and the maximization with \(\gamma \le 1\) are designed for the two cases \(z(\varPi )>0\) and \(z(\varPi )<0\), respectively. The main idea of this formulation is to search for an element \(\xi \in \varXi \) and an optimized positive scalar \(\gamma \) such that the anchor cone \(\mathfrak {A}_{\bar{C}, \gamma \xi }\) contains \(\varXi \). Notice that each constraint in (10b) can be equivalently written as \(\langle c_j, \gamma \xi \rangle \ge \max _{\xi ' \in \varXi }\langle c_j, \xi '\rangle \), which implies that \(\varXi \) is contained in the half-space of \(\mathfrak {A}_{{\bar{C}},\gamma \xi }\) associated with \(c_j\). The correctness of (10) then follows from the third statement of Proposition 5.

This formulation can also be considered a scenario generation scheme, a common method to evaluate the performance of various policy families [34]. Indeed, according to Theorem 4, the scenario \(\xi \) selected by this formulation provides an underestimation of \(z(\varPi )\). Furthermore, the next two theorems indicate that this choice of \(\xi \) induces the polyhedron that produces the tightest bound on \(\delta _{\text {rel}}\) among all polyhedra enclosing \(\varXi \).

Theorem 5

Given \(\varPi (\varXi , {\mathcal {U}}, \bar{C})\), let \(\gamma \) be any feasible solution of the corresponding anchor cone formulation. We have \(\delta _{\text {rel}}(\varPi ) \le \gamma \) when \(z(\varPi )>0\) and \(\delta _{\text {rel}}(\varPi ) \ge \gamma \) when \(z(\varPi )<0\).

Proof

When \(\bar{C}\) is full-rank, the anchor cone \(\mathfrak {A}_{\bar{C}, \gamma \xi }\) has the unique extreme point \(\gamma \xi \). By the second property in Proposition 5, all vectors in \(\bar{C}\) lead to \(\gamma \xi \). Then, a direct application of Theorem 4 proves the claim. When \(\bar{C}\) is not full-rank, the anchor cone \(\mathfrak {A}_{\bar{C}, \gamma \xi }\) does not have any extreme point. However, the uncertainty sets of both \(\varPi \) and \({\bar{\varPi }}'\) can be projected without changing the corresponding optimal objective values. That is, \(z(\varPi )\) and \(z({\bar{\varPi }}')\) remain unchanged if we replace \(\varXi \) and \(\varXi ' = \mathfrak {A}_{{\bar{C}}, \gamma \xi }\) with \(\text {proj}_{\bar{C}}(\varXi )\) and \(\text {proj}_{\bar{C}}(\varXi ')\), where \(\text {proj}_{\bar{C}}(\cdot )\) projects the input set onto the subspace spanned by \(\bar{C}\). Then, \(\text {proj}_{\bar{C}}(\varXi ')\) has the unique extreme point \(\gamma \xi \), to which all the vectors in \({\bar{C}}\) lead. Thus, we can still apply Theorem 4 to obtain the bound \(\gamma \). \(\square \)

Theorem 6

Given a policy problem \(\varPi (\varXi , {\mathcal {U}}, \bar{C})\), let \(\gamma \) be the optimal value of (10) and let K be any bound calculated using Theorem 4 with some polyhedron \(\varXi '\). Then \(\gamma \) is at least as tight a bound as K.

Proof

We will show that every such \(\varXi '\) corresponds to a feasible solution \((\gamma _0, \xi )\) of Formulation (10). Let \(\xi '\in \varXi '\) be the extreme point to which all vectors in \(\bar{C}\) lead. According to Theorem 4, \(\xi '/K \in \varXi \). Setting \(\gamma _0 = K\) and \(\xi = \xi '/K\) gives \(\xi ' = \gamma _0 \xi \). Since all vectors in \({\bar{C}}\) are in the normal cone \(N_{\varXi '}(\gamma _0\xi )\), constraint set (10b) is trivially satisfied. Thus, \((K, \xi '/K)\) is a feasible solution of Formulation (10), which concludes the proof. \(\square \)

The following two propositions provide the feasibility/infeasibility criteria for Formulation (10).

Proposition 6

Given \(\varXi \) is bounded, Formulation (10) is feasible if either \(\varXi \cap \text {int}(\text {cone}^*({\bar{C}})) \ne \varnothing \) or \(\varXi \subseteq \text {int}(\text {cone}^\circ ({\bar{C}}))\).

Implicitly, the first condition is associated with the case \(z(\varPi )>0\), while the second is for \(z(\varPi )<0\). These two conditions are sufficient but not necessary: even if neither holds, Formulation (10) may still be feasible, in which case it still produces a valid bound for the adjustability ratio.
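When \(\varXi \) is polyhedral, the first condition can be checked with a single LP: maximize t subject to \(\xi \in \varXi \) and \(\langle c_i, \xi \rangle \ge t\) for all i; the intersection with the interior of the dual cone is nonempty exactly when the optimal t is positive. A minimal sketch follows, assuming \(\varXi = \{\xi \mid B\xi \le b\}\) is bounded (the helper name is ours).

```python
import numpy as np
from scipy.optimize import linprog

def meets_first_condition(C, B, b, tol=1e-9):
    """LP check of the first criterion in Proposition 6: Xi = {xi : B xi <= b}
    intersects int(cone*(C)) iff some xi in Xi has <c_i, xi> > 0 for all rows
    c_i, i.e., iff max t s.t. C xi >= t * 1, xi in Xi has a positive optimum."""
    m, n = C.shape
    cost = np.zeros(n + 1); cost[-1] = -1.0  # variables z = (xi, t); minimize -t
    A_ub = np.block([[B, np.zeros((B.shape[0], 1))],
                     [-C, np.ones((m, 1))]])
    b_ub = np.concatenate([b, np.zeros(m)])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return res.status == 0 and -res.fun > tol

# Demo: Xi = [1,2]^2 with C = I; Xi sits inside int(R^2_+), so the check passes.
B = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([2.0, 2.0, -1.0, -1.0])
print(meets_first_condition(np.eye(2), B, b))   # True
```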

Proposition 7

Formulation (10) is infeasible if

$$\begin{aligned} \dim (\varXi ) + \dim (-\text {cone}({\bar{C}}) \cap \text {cone}({\bar{C}})) > n. \end{aligned}$$

The main tool in the proof is the equality \(\dim (\text {cone}^\circ (\bar{C})) = n - \dim (\hat{{\mathcal {C}}})\), where \(\hat{{\mathcal {C}}}\) denotes \(-\text {cone}({\bar{C}}) \cap \text {cone}({\bar{C}})\). Thus, when the vectors in \({\bar{C}}\) do not all lie in a common halfspace, we have \(\text {cone}^\circ (\bar{C}) = \{0\}\), which means the anchor cone degenerates to a single point. On the other hand, when the vectors in \(\bar{C}\) lie in the interior of some halfspace, \(\hat{{\mathcal {C}}}=\{0\}\), which implies the anchor cone is full-dimensional.

Remark 6

For general robust linear optimization problems with fixed recourse, one classic approach to estimate the adjustability ratio (i.e., the performance of constant policies) is based on two geometric properties of the uncertainty set \(\varXi \) called symmetry s and translation factor \(\rho \) [12, 14, 19]. The anchor cone method can be considered a generalization of this classic method and produces a strictly tighter bound for the following reasons. First, despite several variations on the definition of symmetry, the classic analysis eventually leads to the following type of relation.

$$\begin{aligned} \kappa (s, \rho ) \xi ^0 \ge \xi , \quad \forall \xi \in \varXi , \end{aligned}$$

where \(\xi ^0\) is some relative interior point of \(\varXi \) (called the point of symmetry) and \(\kappa (s, \rho )\) is a constant depending on s and/or \(\rho \). This relation has a specific interpretation within the anchor cone framework: the box set determined by the minimum and maximum points 0 and \(\kappa (s, \rho ) \xi ^0\) contains the uncertainty set \(\varXi \). Hence, by Theorem 4, \(\kappa (s, \rho )\) is indeed an upper bound on the adjustability ratio. However, since the point of symmetry \(\xi ^0\) is designed to be a relative interior point of \(\varXi \), in non-trivial cases the anchor cone method will find a point on the relative boundary of \(\varXi \) and thus provide a strictly tighter bound. In particular, the bounds computed in Examples 1, 2, and 3 are strictly tighter than those computed using the classic method. In addition, the anchor cone method is more general in the following aspects: (i) the classic method requires \(\varXi \) to be located inside the first orthant, while the anchor cone method works for general \(\varXi \); (ii) the classic method (implicitly) uses a box set to enclose \(\varXi \), while the anchor cone method explicitly optimizes the shape of the anchor cone to minimize the upper bound. These observations echo the conclusions of Theorems 5 and 6, which state that the anchor cone method provides the tightest bounds among all methods based on enclosing \(\varXi \) with polyhedra.

Algorithm 1
Anchor Cone Algorithm with a Poly-Time Implementation.

5.3 A poly-time implementation

Notice that Formulation (10) is nonlinear due to the term \(\gamma \xi \). However, we can still solve it efficiently using a binary line search on \(\gamma \) (see Algorithm 1), where each iteration only requires a feasibility check. The binary line search is justified because, given \(z(\varPi )>0\) (respectively, \(z(\varPi )<0\)), the anchor cone \(\mathfrak {A}_{{\bar{C}}, \gamma \xi }\) is increasing (decreasing) in \(\gamma \) under the inclusion relation \(\subseteq \). Therefore, the complexity of solving Formulation (10) is \(O(\Lambda _\gamma \log _2 \frac{{\bar{\gamma }} - 1}{\epsilon })\), where \({\bar{\gamma }}\) is some known upper bound on \(\gamma \), the scalar \(\epsilon > 0\) is a given accuracy tolerance, and \(\Lambda _\gamma \) is the complexity of solving the anchor cone formulation \(\Lambda \) with \(\gamma \) fixed. For instance, when \(\varXi \) is a polyhedron, \(\Lambda _\gamma \) is the time complexity of solving a linear program with \(O(n\times |\text {eray}({\bar{C}})|)\) variables and \(O(|\text {eray}({\bar{C}})|^2)\) constraints; when \(\varXi \) is convex, it is the complexity of solving a convex optimization problem of the same size. Moreover, this implementation can be easily parallelized into a \((p+1)\)-ary search on a machine with p computational cores, which improves the complexity to \(O(\Lambda _\gamma \log _{p+1} \frac{{\bar{\gamma }} - 1}{\epsilon })\).
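A minimal self-contained sketch of this binary line search for the case \(z(\varPi )>0\) follows (our own illustration; `anchor_cone_bound` is a hypothetical name). It assumes a polyhedral \(\varXi = \{\xi \mid B\xi \le b\}\) and uses an LP solver as the feasibility oracle; for simplicity it uses all rows of \(\bar{C}\) rather than only \(\text {eray}(\bar{C})\), which merely adds redundant constraints by Proposition 5(iii).

```python
import numpy as np
from scipy.optimize import linprog

def anchor_cone_bound(C, B, b, gamma_max, eps=1e-3):
    """Sketch of Algorithm 1 for z(Pi) > 0: binary line search for the
    smallest gamma in [1, gamma_max] such that some xi in Xi = {xi : B xi <= b}
    satisfies gamma * <c_j, xi> >= max_{xi' in Xi} <c_j, xi'> for every row c_j.
    Assumes Xi is a bounded polyhedron."""
    w = np.array([-linprog(-c, A_ub=B, b_ub=b, bounds=(None, None)).fun
                  for c in C])  # w_j = max_{xi in Xi} <c_j, xi>

    def feasible(gamma):
        # for fixed gamma, constraint (10b) is linear in xi: a pure feasibility LP
        res = linprog(np.zeros(B.shape[1]),
                      A_ub=np.vstack([B, -gamma * C]),
                      b_ub=np.concatenate([b, -w]),
                      bounds=(None, None))
        return res.status == 0  # status 0 = optimal, i.e., the LP is feasible

    lo, hi = 1.0, gamma_max
    if not feasible(hi):
        return None  # infeasible up to gamma_max; no bound produced
    while hi - lo > eps:  # monotonicity in gamma justifies bisection
        mid = (lo + hi) / 2
        lo, hi = (lo, mid) if feasible(mid) else (mid, hi)
    return hi  # feasible for (10), hence a valid upper bound by Theorem 5
```

By Theorem 5, any \(\gamma \) the sketch returns is a valid upper bound on \(\delta _{\text {rel}}\), and it is within \(\epsilon \) of the optimum of (10) restricted to \([1, {\bar{\gamma }}]\).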

Finally, we finish the section with a special case where the anchor cone formulation has an analytical solution.

Example 5

(\(\text {cone}(\bar{C})=\mathbb {R}_+^n\) and \(z(\varPi )>0\)) In this case, \(\text {cone}({\bar{C}})\) is self-dual. Thus, for any \(\xi \ge 0\), the anchor cone \(\mathfrak {A}_{\bar{C}, \xi }\) is simply obtained by removing all the lower bounds from a box set, leaving \(\xi \) as its only extreme point. Then, Formulation (10) reduces to

$$\begin{aligned} \min _{\gamma \ge 1, \xi \in \varXi }&~ \gamma&\text {(11a)}\\ \text {s.t.}&~ \gamma \xi _j \ge {\bar{\xi }}_j, \quad \forall j \in J,&\text {(11b)} \end{aligned}$$

where \({\bar{\xi }}_j = \max _{\xi \in \varXi } \xi _j\) is the maximum value of the jth entry of \(\xi \). Provided \(\varXi \) contains some element \(\xi > 0\), this formulation is feasible and bounded, and can be further reduced to

$$\begin{aligned} \min _{\xi \in \varXi }\max _{j} \ \frac{{\bar{\xi }}_j}{\xi _j} \Longleftrightarrow \left( \max _{\xi \in \varXi }\min _{j} \ \frac{\xi _j}{{\bar{\xi }}_j}\right) ^{-1}. \end{aligned}$$

Depending on the uncertainty set \(\varXi \), this can be computed directly to derive an analytical expression of the adjustability ratio bound. \(\square \)
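For a polyhedral \(\varXi \), this reduction is itself a single LP: maximize t subject to \(\xi \in \varXi \) and \(\xi _j \ge t {\bar{\xi }}_j\), then invert the optimum. A minimal sketch follows (our own; `example5_bound` is a hypothetical helper), applied to the budget sets discussed after Corollary 7, where a direct computation gives the bound \(\min (1/\beta , n)\), consistent with the simplex-to-box transition noted there.

```python
import numpy as np
from scipy.optimize import linprog

def example5_bound(B, b):
    """Example 5 reduction for polyhedral Xi = {xi : B xi <= b} containing
    some xi > 0: the bound equals (max_{xi in Xi} min_j xi_j / xibar_j)^(-1)."""
    n = B.shape[1]
    xibar = np.array([-linprog(-np.eye(n)[j], A_ub=B, b_ub=b,
                               bounds=(None, None)).fun for j in range(n)])
    # variables (xi, t): maximize t  s.t.  t * xibar_j <= xi_j  and  B xi <= b
    cost = np.zeros(n + 1); cost[-1] = -1.0
    A_ub = np.block([[-np.eye(n), xibar.reshape(-1, 1)],
                     [B, np.zeros((B.shape[0], 1))]])
    b_ub = np.concatenate([np.zeros(n), b])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    return 1.0 / (-res.fun)

# Budget set {xi in [0,1]^n : <1, xi> <= beta*n}; the bound is min(1/beta, n).
n, beta = 5, 0.4
B = np.vstack([np.eye(n), -np.eye(n), np.ones((1, n))])
b = np.concatenate([np.ones(n), np.zeros(n), [beta * n]])
print(example5_bound(B, b))   # about 2.5 = 1/beta here
```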

6 Computational experiments

The experiments reported in this section were conducted on a MacBook Pro (2023) equipped with the Apple M2 Max chip and 64 GB of RAM, running macOS Ventura 13.3. All formulations and algorithms were implemented in Python 3.9 and solved using the commercial optimizer Gurobi 10.0.1. Each instance was solved under a time limit of 3,600 seconds and an optimality gap tolerance of \(10^{-3}\).

We design two experiments, namely EXPT-I and EXPT-II, to evaluate the performance of the zero-adjustability verifiers and the anchor cone algorithm, respectively. In EXPT-I, we compare the time efficiency of MIP1 (Formulation (8)) and MIP2 (Formulation (9)) with the bilinear formulation (3) (referred to as BL) solved using Gurobi. We also investigate the differences between the objective values of the formulations and numerically confirm Corollary 5. In EXPT-II, we evaluate both the time efficiency and the approximation tightness of the anchor cone (AC) algorithm as compared to the BD algorithm, i.e., the direct computation of the two bidual formulations \({\bar{\Delta }}\) and \({\bar{\Delta }}^*\) using Gurobi. It is worth noting that Gurobi 10.0.1 uses a state-of-the-art branch-and-bound algorithm based on McCormick envelopes [45] to solve bilinear programs.

6.1 Test instances

From the two bidual formulations, it is clear that the size of the polytope \(\varXi \) and the magnitude of the rows of \(\bar{C}\) are inessential for adjustability, as setting \(\varXi ':= \varXi / K_1\) and \(\bar{C}':= \bar{C} / K_2\) for any \(K_1, K_2 > 0\) leads to the same ratio. Thus, we generate random polytopes \(\varXi \) within the hypercube \([-1, 1]^n\) and rows of \(\bar{C}\) with norms at most one. To guarantee \(z(\varPi ) > 0\), so that the adjustability ratio \(\gamma \ge 1\), we ensure that (i) each polytope contains at least one positive vector in its relative interior, (ii) \(\langle c_i, 1\rangle > 0\) for every row vector \(c_i\) of \({\bar{C}}\), and (iii) \({\mathcal {U}}\) is the product space of a set of simplices, i.e., \(a = 1\) and A is a block-diagonal matrix where each diagonal block is an all-one vector of varied length.

More specifically, to generate a random uncertainty set \(\varXi \), we use n to denote the ambient dimension of \(\varXi \) and l to denote the number of constraints that define \(\varXi \). First, we add the 2n constraints that define the hypercube \([-1, 1]^n\). Then, we randomly generate \(l-2n\) vectors \(b_i\) within the hypercube and include the associated half-spaces \(\langle b_i, \xi \rangle \le \Vert b_i\Vert _2\) as constraints of \(\varXi \). For \({\mathcal {U}}\), we use m to denote the number of simplices (i.e., the ambient dimension of \({\mathcal {Y}}\)), then generate each simplex j with a random dimension \(d_j\). Hence, \(k:= \sum _{j \in [m]} d_j\) is the number of rows in A. This entire process produces a valid input \((\varXi , {\mathcal {U}}, {\bar{C}})\).
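The following Python sketch mirrors this generation scheme (our own rendering; the function name, seed, number of rows of \(\bar{C}\), and the orientation of A are our assumptions, and the positivity checks in requirements (i) and (ii) above are omitted for brevity).

```python
import numpy as np
from scipy.linalg import block_diag

def random_instance(n, l, m, max_dim=5, seed=0):
    """Hedged sketch of the generator in Sect. 6.1. Returns (B, b) with
    Xi = {xi : B xi <= b} inside [-1,1]^n, the block-diagonal all-ones
    matrix A describing the product of m simplices, and a nonnegative
    matrix Cbar whose rows have norm at most one (the S1 setting)."""
    rng = np.random.default_rng(seed)
    # 2n hypercube constraints defining [-1, 1]^n
    B = np.vstack([np.eye(n), -np.eye(n)])
    b = np.ones(2 * n)
    # l - 2n random halfspaces <v_i, xi> <= ||v_i||_2 with v_i in the hypercube
    V = rng.uniform(-1.0, 1.0, size=(l - 2 * n, n))
    B, b = np.vstack([B, V]), np.concatenate([b, np.linalg.norm(V, axis=1)])
    # m simplices with random dimensions d_j; k = sum_j d_j (orientation of A
    # follows our reading of the block structure)
    dims = rng.integers(1, max_dim + 1, size=m)
    A = block_diag(*[np.ones((1, d)) for d in dims])
    # rows of Cbar drawn from R^n_+, rescaled so each row has norm <= 1
    Cbar = rng.uniform(0.0, 1.0, size=(int(dims.sum()), n))
    Cbar /= np.maximum(np.linalg.norm(Cbar, axis=1, keepdims=True), 1.0)
    return B, b, A, Cbar
```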

Based on this generation scheme, we design three sets of instances to test various aspects of the algorithms under comparison.

  • S1: n ranges over \(\{5, 10, \dots , 25\} \cup \{30, 40, \dots , 60\}\) and m/n ranges over \(\{1, 2\}\). For each configuration (n, m), we randomly generate 5 instances, where l/n is randomly selected from \(\{3, \dots , 9\}\) and each simplex has a random dimension from \(\{1, \dots , 5\}\). The rows of \({\bar{C}}\) are randomly generated from \(\mathbb {R}_+^n\).

  • S2: \(\varXi \) has an extra budget constraint \(\langle 1, \xi \rangle \le \beta n\), where \(\beta \in (0,1)\) controls the tightness of the budget. n ranges over \(\{5, 10, \dots , 25\} \cup \{30, 40, 50\}\), m is fixed at \(\lfloor 1.5n \rfloor \), and \(\beta \) ranges over \(\{0.1, 0.5, 0.9\}\). We generate three instances under each configuration. Again, the rows of \({\bar{C}}\) are randomly generated from \(\mathbb {R}_+^n\).

  • S3: \(\varXi \) is fixed as the simplex budget set \(\{\xi \ge 0 \mid \Vert \xi \Vert _1 \le 1\}\) with three types (denoted by \(\tau \)) of \({\bar{C}}\) matrices: Type-0, with all rows of \({\bar{C}}\) leading to the same extreme point of \(\varXi \); Type-1, with \({\bar{C}} \ge 0\); and Type-2, with each row vector \(c_i\) of \({\bar{C}}\) satisfying \(\langle c_i, 1\rangle > 0\). We test three instances under each configuration.

In S1, we evaluate the performance of the candidate algorithms on randomly generated instances. In S2, we investigate instances with a random polyhedral uncertainty set that has a single budget constraint and examine how the stringency of this budget constraint affects the performance of our candidate algorithms. In S3, we analyze the impact of various types of \({\bar{C}}\) on algorithm performance. This instance set is designed to be particularly challenging for the anchor cone algorithm for two reasons: first, simplex uncertainty sets can generate the largest adjustability ratio among all convex sets, as proven in Corollary 7; and second, for Type-2 \({\bar{C}}\), Proposition 6 suggests that a wide \(\text {cone}({\bar{C}})\) will produce a narrow anchor cone that may need to be scaled significantly to contain \(\varXi \), which could result in a loose upper bound for the adjustability ratio.

6.2 Discussion of EXPT-I

Table 1 presents the performance of the MIP1 and BL formulations for verifying zero-adjustability on the two instance sets S1 and S2. The table shows that MIP1 yields lower objective values than BL for all instances, which confirms the findings of Corollary 5. Moreover, all instances except (5, 5) are infeasible under MIP1, as indicated by the objective value \(-\infty \). Even for (5, 5), the objective value of 8.89 obtained from MIP1 is strictly less than the value of 9.61 obtained from BL. Therefore, based on Theorem 3 and Corollary 5, none of the instances in S1 and S2 are zero-adjustable. This result is unsurprising: when \({\bar{C}}\) and \(\varXi \) are randomly generated, it is highly unlikely that rows of \({\bar{C}}\) are co-optimal with respect to \(\varXi \) and correspond to the nonzero entries of some optimal \(u \in {\mathcal {U}}\).

Regarding the runtime, MIP1 performs significantly faster than BL on both instance sets. The average runtime of MIP1 is under one second for all configurations, while BL exceeds the one-hour time limit even for moderate-sized instances such as (20, 30, 0.1) or (25, 37, 0.1). The effect of instance size on the runtime of BL is substantial: the average runtime increases dramatically from 0.05 to 2,339.55 seconds as the instance size grows from (5, 5) to (60, 120). This effect is negligible for MIP1. From the results on S2, we observe that a tighter budget parameter \(\beta \) generally results in slower performance for the BL algorithm; specifically, for each fixed (n, m), the average runtime decreases as \(\beta \) increases. The standard deviation of the runtime is consistent with these findings: BL has a large runtime variance for large instances with small \(\beta \), while MIP1 has a small variance under all configurations.

Table 1 Results of EXPT-I on instance sets S1 and S2
Table 2 Results of EXPT-I on instance set S3

Table 2 displays the results of experiments conducted on instance set S3, where the uncertainty set \(\varXi \) is the simplex \(\{\xi \ge 0 \mid \Vert \xi \Vert _1 \le 1\}\). In addition to MIP1 and BL, we also evaluate two implementations of Formulation (9), namely MIP2 and LP. MIP2 computes (9) directly using Gurobi, whereas LP solves a linear program for each fixed assignment \(z_j=1\). Both MIP2 and LP require pre-computing the co-optimal index cover (Definition 6), which is straightforward when \(\varXi \) is a simplex. As per Corollary 6, MIP1, MIP2, and LP are equivalent reformulations that produce the same objective values.

We denote the type of matrix \(\bar{C}\) by \(\tau \), where \(\tau =0\), 1, and 2 correspond to the cases where all rows in \(\bar{C}\) are co-optimal, \(\bar{C} \ge 0\), and \(\langle c_i, 1\rangle > 0\) for every row vector \(c_i\) in \(\bar{C}\), respectively. Using this design, all instances with \(\tau = 0\) are zero-adjustable. Indeed, for all such instances, the values obtained from MIP1/MIP2/LP are the same as those obtained from BL, indicating that these instances are zero-adjustable according to Theorem 3 and Corollary 5. All other instances are infeasible under MIP1/MIP2/LP, consistent with the previous results in Table 1.

Regarding the runtime, BL is much slower than the other three algorithms. Furthermore, generating rows of \(\bar{C}\) from a more relaxed space (\(\tau = 2\)) generally increases the runtime of BL. Among MIP1, MIP2, and LP, the latter two are significantly faster than MIP1, reflecting their smaller formulation sizes. In this experiment, MIP2 is slightly more efficient than LP, but this could change for larger instances, as LP is a poly-time implementation and can be easily parallelized.

From the table, we see that the value of MIP1 is quite loose (often infeasible) as a lower bound for BL, except for zero-adjustable instances. Therefore, while MIP1 is an efficient way to verify zero-adjustability, it is not suitable for estimating bounds for the adjustability ratio. Thus, the anchor cone formulation and EXPT-II complement MIP1.

6.3 Discussion of EXPT-II

In EXPT-II, we aim to compare the performance of the anchor cone (AC) algorithm with the BD algorithm, which solves two bidual formulations directly using Gurobi. Specifically, we measure and record the average adjustability ratio \(\gamma \), average runtime t, and runtime standard deviation \(\sigma \) of both algorithms, denoted with subscripts ac and bd, respectively. The ratio \(\gamma _{ac} / \gamma _{bd}\) provides insights into the approximation accuracy of the AC algorithm relative to the BD algorithm. A smaller ratio indicates a tighter bound obtained from AC.

When solving the bilinear program \({\bar{\Delta }}^*\) as a component of the BD algorithm, we also keep track of the value and attainment time of incumbent solutions, denoted as \(z'_t({\bar{\Delta }}^*)\). Using this information, we can calculate an upper bound for the adjustability ratio, denoted as \(\gamma '_t:= z({\bar{\Delta }}) / z'_t({\bar{\Delta }}^*)\). To provide a fair comparison between the two algorithms and make the performance criteria more stringent for the AC algorithm, we also record the time

$$\begin{aligned} t'_{bd}:= \min _{\gamma '_t \le \gamma _{ac}} t \end{aligned}$$

which is the earliest time that BD retrieves a better bound than AC.
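Operationally, this is a one-line scan over the incumbent log. The following hypothetical sketch assumes the log is recorded as a list of \((t, \gamma '_t)\) pairs; the helper name is ours.

```python
def earliest_beat_time(incumbent_log, gamma_ac):
    """Compute t'_bd: the earliest time at which the BD incumbent bound
    gamma'_t matches or beats the AC bound, or None if it never does.
    `incumbent_log` is assumed to be a list of (t, gamma_t) pairs."""
    times = [t for t, g in incumbent_log if g <= gamma_ac]
    return min(times) if times else None

# e.g., earliest_beat_time([(1.2, 3.0), (40.0, 1.8), (900.0, 1.4)], 1.9) -> 40.0
```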

Table 3 Results of EXPT-II on instance sets S1 and S2

Table 3 presents the experiment results for instance sets S1 and S2. The adjustability ratio bounds obtained by the AC algorithm are generally tight: the ratio \(\gamma _{ac}/\gamma _{bd}\) falls below 1.5 for the majority of configurations, and only one configuration, (5, 7, 0.1), has a ratio above 3.0. From the results on S2, we observe that a more stringent budget parameter \(\beta \) leads to higher values of both \(\gamma _{ac}\) and \(\gamma _{bd}\), resulting in a better approximation ratio \(\gamma _{ac}/\gamma _{bd}\) for larger \(\beta \); indeed, for each fixed (n, m) in S2, this ratio decreases as \(\beta \) increases.

Table 4 Results of EXPT-II on instance set S3

Regarding the average runtime, AC outperforms BD significantly. The average solution time of the AC algorithm is below 0.1 seconds for all configurations, whereas both times \(t_{bd}\) and \(t'_{bd}\) increase substantially with the instance size. Additionally, a smaller \(\beta \) often results in a longer computation time \(t_{bd}\) for solving the BD algorithm exactly, though this effect is less pronounced in \(t'_{bd}\). The trend in runtime standard deviation is consistent with these observations: \(\sigma _{ac}\) is small across all configurations, whereas \(\sigma _{bd}\) is significant for large instances with a stringent budget. Overall, for instance sets S1 and S2, the AC algorithm efficiently obtains a tight upper bound for the adjustability ratio.

Table 4 presents the results for instance set S3, which mainly examines the impact of different types of \(\bar{C}\) and the simplex uncertainty set \(\varXi \) on the performance of the AC and BD algorithms. Intuitively, the values of \(\tau \) indicate three levels of width that \(\text {cone}({\bar{C}})\) can have: \(\tau =0\) means \(\text {cone}({\bar{C}})\) is relatively narrow, since all rows of \({\bar{C}}\) need to approach the same extreme point of the simplex \(\varXi \); \(\tau =1\) implies rows of \({\bar{C}}\) can be generated from the nonnegative orthant \({\mathbb {R}}_+^{n}\); \(\tau =2\) allows rows of \({\bar{C}}\) to be generated from the interior of the halfspace \(\{c \mid \langle 1, c\rangle > 0\}\), which is the most relaxed condition for AC to be feasible, as suggested by Proposition 6.

The results in Table 4 indicate that the type of \({\bar{C}}\) has a significant impact on the adjustability ratio bounds. Indeed, all three values \(\gamma _{ac}\), \(\gamma _{bd}\), and the ratio \(\gamma _{ac}/\gamma _{bd}\) increase with \(\tau \). Notably, for \(\tau =0\), the associated instances are all zero-adjustable, and both algorithms recover the exact adjustability ratio of 1.0. For \(\tau = 1\), both \(\gamma _{ac}\) and \(\gamma _{bd}\) increase, but the ratio \(\gamma _{ac} / \gamma _{bd}\) remains below 2.0 across all configurations, indicating solid approximation performance from the AC algorithm. For \(\tau = 2\), this ratio becomes significantly larger and increases with the instance size, suggesting that scaling the anchor cone to contain the uncertainty set \(\varXi \) is more challenging: pairs of row vectors in \(\bar{C}\) with wide angles between them induce a narrow anchor cone, and the expected number of such pairs increases with the number of rows in \(\bar{C}\), which explains the observed trend.

Regarding the average runtime, as before, the AC algorithm is significantly faster than the BD algorithm in all configurations. Also, as \(\tau \) increases, both algorithms require more time to find a solution, but this impact is more pronounced for the BD algorithm. Interestingly, the BD algorithm performs more efficiently on instances in S3 than on those in S1 and S2 of comparable size, possibly due to the special structure of the simplex uncertainty set \(\varXi \).

7 Conclusion and future extensions

We set out to answer two questions: can we identify the zero-adjustability conditions? And can we quantify the adjustability ratio?

For dynamic robust optimization problems, these questions are fundamental for understanding the conservativeness of RO solutions and the approximation performance of various policy families. From a game-theoretic perspective, these questions are posed in the same spirit in which von Neumann studied the minimax (in)equality in two-person zero-sum games. The classical minimax setting does not allow strategy dependence; in this sense, our results extend the classical ones by modeling strategy dependence, at least in a linear setting. The practical motivation for this setup comes from a plethora of interdiction and defender-attacker games.

Previous literature regarding adjustability provided intriguing but restrictive findings. The most general work to date relied on some key assumptions (see the discussions in Sects. 1.1 and 1.2), for example, requiring the right-hand side to be positive and the constraints to have only the “greater than or equal to” direction. These assumptions precluded the study of many problems, such as network optimization with arc capacities and resource allocation problems with budget constraints.

In this work, we dropped these assumptions and developed a general framework to analyze and quantify adjustability. This framework, including a unified theory and two algorithms, provides a set of tools for decision-makers to study the adjustability of the specific problem at hand. For instance, the theorem of alternatives (Theorem 2) gives a geometrically intuitive characterization of zero-adjustability; Theorem 4 introduces a constructive procedure for bounding the adjustability ratio based on an “outer-approximation” idea; and the anchor cone formulation (10) provides an efficient method to either derive analytical expressions or conduct numerical computations for adjustability ratio bounds.

For future work, it would be interesting to examine whether Theorem 4 is, in general, the “optimal” way to bound the adjustability ratio for an arbitrary problem setup. If the answer is affirmative, one could then examine more specialized settings to derive context-specific managerial and policy insights. In addition, the theoretical framework and new proof techniques may be of interest to researchers working in the areas mentioned in Appendix A.