Interplay of non-convex quadratically constrained problems with adjustable robust optimization

In this paper we explore convex reformulation strategies for non-convex quadratically constrained optimization problems (QCQPs). First we investigate such reformulations using Pataki’s rank theorem iteratively. We show that the result can be used in conjunction with conic optimization duality in order to obtain a geometric condition for the S-procedure to be exact. Based upon known results on the S-procedure, this approach allows for some insight into the geometry of the joint numerical range of the quadratic forms. Then we investigate a reformulation strategy introduced in recent literature for bilinear optimization problems which is based on adjustable robust optimization theory. We show that, via a similar strategy, one can leverage exact reformulation results of QCQPs in order to derive lower bounds for more complicated quadratic optimization problems. Finally, we investigate the use of reformulation strategies in order to derive characterizations of set-copositive matrix cones. Empirical evidence based upon first numerical experiments shows encouraging results.


Introduction and outline
In this text we explore the connections between robust optimization and quadratically constrained quadratic optimization. For the sake of this introduction we briefly review the most important concepts in these areas and then give an outline on the aims of this text.

Quadratically constrained quadratic optimization
A quadratically constrained quadratic optimization problem (QCQP) consists of minimizing a quadratic function subject to quadratic constraints, formally given by $\inf_{x \in \mathcal{K}} \{ x^T Q x + q^T x + \omega : x^T A_i x + a_i^T x \le b_i,\ i \in [1:m] \}$, where $Q, A_i$ are real, symmetric matrices of order $n$, $\mathcal{K} \subseteq \mathbb{R}^n$ is a cone, $q, a_i$ are real vectors, $b_i$ are real numbers and $[1:k] := \{1, \dots, k\}$. General QCQPs are hard, and form a large and quite versatile class of optimization problems. Neither the objective function nor the feasible set of a QCQP needs to be convex; the latter may even be disconnected. One way to (approximately) solve QCQPs is to look for convex relaxations, which in the best case yield exact reformulations. An important strategy for achieving such reformulations is to lift the space of variables, thereby linearizing the quadratic terms and convexifying the feasible set, such that its extreme points correspond to rank-one matrices, which decompose into vectors feasible for the original problem:

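Theorem 1 Let $\mathcal{F} \subseteq \mathbb{R}^n$ be a feasible set of a QCQP and denote by $\mathcal{G}(\mathcal{F}) := \mathrm{clconv}\{(x, xx^T) : x \in \mathcal{F}\}$,
where $\mathrm{clconv}(\mathcal{A})$ stands for the closure of the convex hull of a set $\mathcal{A}$.
Then for any $Q \in \mathcal{S}^n$ and $q \in \mathbb{R}^n$ we have $\inf_{x \in \mathcal{F}} x^T Q x + q^T x + \omega = \inf_{(x,X) \in \mathcal{G}(\mathcal{F})} \mathrm{trace}(QX) + q^T x + \omega$.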
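The lifting in Theorem 1 rests on the identity $x^T Q x = \mathrm{trace}(Q\, xx^T)$, which makes the objective linear in the lifted pair $(x, X)$ with $X = xx^T$. The following minimal Python sketch checks this identity on hypothetical data; the values of `Q`, `q`, `omega` and `x` are illustrative assumptions and are not taken from the paper.

```python
def quad_form(Q, x):
    # Evaluates x^T Q x for a square matrix Q given as a list of rows.
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))

def outer(x):
    # Rank-one lifting X = x x^T.
    return [[xi * xj for xj in x] for xi in x]

def trace_prod(Q, X):
    # Evaluates trace(Q X).
    n = len(Q)
    return sum(Q[i][j] * X[j][i] for i in range(n) for j in range(n))

# Hypothetical problem data (illustration only).
Q = [[1, -2], [-2, 3]]
q = [4, -1]
omega = 5
x = [2, -3]

linear_part = sum(qi * xi for qi, xi in zip(q, x))
direct = quad_form(Q, x) + linear_part + omega
lifted = trace_prod(Q, outer(x)) + linear_part + omega
print(direct, lifted)
```

Both evaluations agree for every $x$; the difficulty addressed by Theorem 1 lies in describing the convex hull of all such lifted points, not in the identity itself.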
The central object in the above theorem is $\mathcal{G}(\mathcal{F})$. Its characterization is the major challenge when employing the reformulation strategy depicted in Theorem 1, and a general workable description of $\mathcal{G}(\mathcal{F})$ is not known. There are, however, characterizations for specific instances of $\mathcal{F}$. In practice such characterizations are regularly given by a conic intersection involving some matrix cone $\mathcal{C}$ and appropriate linear constraints parametrized by matrices $\bar{A}_i$ and real numbers $b_i$, $i \in [1:m]$. Thus, typically the problem over $(x, X) \in \mathcal{G}(\mathcal{F})$ is reformulated as a linear conic problem over $\mathcal{C}$, where $M$, $\bar{A}_i$ and $b_i$ depend on the problem data. Important examples for such reformulations can be found in Anstreicher and Burer (2010), Burer (2009), Burer and Dong (2012), Burer and Yang (2015), Eichfelder and Povh (2013), Yang et al. (2016), some of which we will discuss in more detail later in the text.

The S-procedure
Another strand of the literature, concerned with a dual perspective on QCQPs, is the theory of the S-procedure. The central question is when, for a given set of matrices $\{Q, A_1, \dots, A_m\}$, the following two statements are equivalent:
i) the following system of inequalities has no solution $x \in \mathbb{R}^n$: $x^T Q x < 0$ and $x^T A_i x \le 0$ for all $i \in [1:m]$;
ii) the following conic inequality has a solution $\lambda \in \mathbb{R}^m_+$: $Q + \sum_{i=1}^m \lambda_i A_i \in \mathcal{S}^n_+$.
If equivalence between (i) and (ii) can be established, we say that the S-procedure is exact for this set of matrices. Actually, if equivalence holds for arbitrary $Q$, then copositivity over $\mathcal{K} := \{x \in \mathbb{R}^n : x^T A_i x \le 0,\ i \in [1:m]\}$ (see the notations and preliminaries section for the definition) is characterized by $\mathrm{COP}(\mathcal{K}) = \{Q : Q + \sum_{i=1}^m \lambda_i A_i \in \mathcal{S}^n_+ \text{ for some } \lambda \in \mathbb{R}^m_+\}$. Note that '⊇' always holds since ii) always implies i). The first such exactness result, known as the S-Lemma, was established in Yakubovich (1971) for the case $m = 1$ under the condition that $x_0^T A_1 x_0 < 0$ holds for some $x_0 \in \mathbb{R}^n$. The result is obtained by invoking a convexity result derived in Dines (1941) concerning the so-called joint numerical range of quadratic functions, defined as $\mathcal{J}(Q, A_1, \dots, A_m) := \{(x^T Q x, x^T A_1 x, \dots, x^T A_m x) : x \in \mathbb{R}^n\}$. It was shown that $\mathcal{J}(Q, A_1)$ is a convex cone. From this convexity result it is deduced that whenever $\mathcal{J}$ is disjoint from the negative quadrant, i.e., (i) holds, these two sets can be separated by a hyperplane. The coefficients of that hyperplane can further be shown to be two nonnegative numbers, where the first is actually positive, by using the assumption on $x_0$, which eventually implies (ii).
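To make the two statements of the S-procedure concrete for $m = 1$, the following Python sketch uses a hypothetical pair of $2 \times 2$ matrices (our own choice, not from the paper). It checks on a finite integer grid that no point satisfies $x^T Q x < 0$ and $x^T A_1 x \le 0$, and verifies that $\lambda = 1$ is a certificate in the sense of ii). The grid test is only a sanity check, not a proof of i).

```python
def quad(M, x):
    # Evaluates x^T M x for a 2x2 matrix M.
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

def is_psd_2x2(M):
    # A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are nonnegative.
    a, b, c = M[0][0], M[0][1], M[1][1]
    return a >= 0 and c >= 0 and a * c - b * b >= 0

# Hypothetical data: Q is indefinite, A1 satisfies the regularity condition at x0 = (1, 0).
Q = [[2, 0], [0, -1]]
A1 = [[-1, 0], [0, 1]]
x0 = (1, 0)

# Statement i): look for grid points with x^T Q x < 0 and x^T A1 x <= 0.
grid = [(s, t) for s in range(-5, 6) for t in range(-5, 6)]
violations = [x for x in grid if quad(Q, x) < 0 and quad(A1, x) <= 0]

# Statement ii): lambda = 1 yields Q + lambda * A1 = diag(1, 0), which is PSD.
lam = 1
certificate = [[Q[i][j] + lam * A1[i][j] for j in range(2)] for i in range(2)]
print(violations, is_psd_2x2(certificate))
```

Here $Q$ itself is not positive semidefinite, so the multiplier is genuinely needed to certify nonnegativity of $x^T Q x$ on the set described by $A_1$.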

Robust and adjustable robust optimization
Robust optimization aims at making a decision under uncertainty by looking for the decision with the best worst-case performance among those decisions that will be feasible for all realizations of the uncertain data (see Ben-Tal et al. 2009; Gorissen et al. 2015 and references therein). Formally, the robust counterpart of an uncertain optimization problem is problem (1). The parameters of the functions $f_i$ are uncertain and governed by the uncertainty parameter $u$ that lives in an uncertainty set $\mathcal{U}$. This set encompasses all possible realizations of $u$. The set $\mathcal{X} \subseteq \mathbb{R}^n$ is some feasible set for the decision vector that is not affected by uncertainty. The solutions obtained by this approach can be overly conservative, a fact that motivated substantial research into redeeming this shortcoming.
One solution to this problem is known as adjustable robust optimization (ARO). In this setting, the decision variables are grouped into two categories. The first stage decision variables represent decisions to be made "here and now", i.e. at a point in time when there is uncertainty in some of the relevant problem data. The second stage decision variables represent decisions that can be delayed until uncertainty is removed. Of course, one could in the first stage make a decision on all of the variables and require the solution to be feasible for any realization of the uncertainty parameter. But then potential flexibility is not harnessed since this is equivalent to the classical robust approach, i.e. the solution will be unnecessarily conservative. The idea is that, rather than requiring that the decision on all variables is feasible in any case, we merely require our first stage decision to allow for a second stage adjustment that renders the overall solution feasible in any case. Said differently, we look for the best among those first stage decisions on $x \in \mathcal{X}$ for which there exists a function $y(u)$ that, given any realization of the uncertainty parameter $u$, maps to a vector for the second stage decision that in total gives a feasible solution to the optimization problem. ARO was introduced in Ben-Tal et al. (2004) and has received much attention. For a detailed survey see (Yanikoglu et al. 2019). Generically, ARO can be written, in a way slightly differing from (1), as problem (2). The first stage decision $x \in \mathcal{X}$ is again vector valued, but the second stage variable $y(u)$ is allowed to adapt to the uncertainty and is thus a function of $u$. Since the space of all functions is intractable, so is (2), and thus it is much harder to solve in practice than (1). However, there are many powerful approaches to (approximately) solve it, see (Yanikoglu et al. 2019) and references therein. Recently, (Zhen et al. 2019) proposed an adjustable robust approach to disjoint bilinear optimization where the problem $\inf_{(x,y) \in \mathcal{X} \times \mathcal{Y}} x^T Q y$ was reformulated into an instance of ARO. The resulting problem can then be tackled by the numerous techniques that are usually applied in the adjustable robust framework.
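The difference between a static robust decision and an adjustable one can be seen on a toy instance. The following Python sketch is purely illustrative: the constraints $x + y \ge u$ and $x - y \ge -u$, the uncertainty set $\{0, 1\}$ and the search grids are our own assumptions and do not appear in the paper. The static problem fixes $y$ before $u$ is known, whereas the adjustable problem lets $y$ depend on $u$.

```python
from fractions import Fraction

# Hypothetical toy instance: minimize x subject to x + y >= u and x - y >= -u for all u in U.
U = [0, 1]
xs = [Fraction(i, 20) for i in range(-20, 41)]   # candidate first stage decisions
ys = [Fraction(j, 20) for j in range(-40, 41)]   # candidate second stage decisions

def feasible(x, y, u):
    return x + y >= u and x - y >= -u

# Static robust: a single y must work for every realization u.
static_value = min(
    x for x in xs if any(all(feasible(x, y, u) for u in U) for y in ys)
)

# Adjustable robust: for every realization u some y(u) must work.
adjustable_value = min(
    x for x in xs if all(any(feasible(x, y, u) for y in ys) for u in U)
)
print(static_value, adjustable_value)
```

On this instance the static value is 1/2 (with $y = 1/2$), while the adjustable value is 0 (with $y(u) = u$), which illustrates the conservatism of the static approach.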

Contribution
• In Sect. 3 we discuss a strategy by which convex reformulations of QCQPs have been used in literature for the purpose of reformulating semi-infinite constraints with quadratic index. The strategy is not new, but it has so far not been discussed in a general manner. We think this generic perspective is instructive and helps familiarize the two fields with each other, and also allows for a discussion of the critical elements of that approach. We give small illustrative examples, also to highlight the important role of dual attainability in the reformulation.
• In Sect. 4 we give an example of a class of homogeneous QCQPs for which we can close the relaxation gap for the so-called Shor relaxation (see Theorem 3). We show that the result can be transferred to a class of non-homogeneous QCQPs (see Theorem 5) which are special cases of QCQPs for which the tightness of the Shor relaxation is known.
• Also in Sect. 4 we show that our reformulation result implies a sufficient condition on quadratic forms for the S-procedure to be exact (see Theorem 7). Our sufficient condition can be shown to relate to another known sufficient condition in a way that allows us to infer some information on the geometry of the joint numerical range (see Proposition 11).
• In Sect. 5 we give an alternative, copositive perspective on the general reformulation strategy for semi-infinite constraints. The reformulation so obtained can be used to characterize the gap that occurs if the underlying QCQP-reformulation fails to exhibit dual attainability (see Theorem 15).
• Finally, in Sect. 6, we show that a class of QCQPs can be reformulated as adjustable robust optimization problems (see Theorem 18) which themselves allow for a bound based on the general reformulation strategy for semi-infinite constraints (see Theorem 19). The resulting lower bound is tested against known lower bounds found in literature. Empirical evidence suggests that for some range of the problem parameters our lower bound can be beneficial in terms of computation time and solution quality.

Notation and preliminaries
Throughout the paper matrices are denoted by sans-serif capital letters (e.g. the $n \times n$ identity matrix will be denoted by $E_n$, while $O$ will denote the zero matrix), vectors by boldface lower case letters (e.g. $e_i$ will denote the $i$-th column of $E_n$, and $o$ will denote the zero vector) and scalars (real numbers) by simple lower case letters. We use $x_i$ to denote the $i$-th entry of a vector $x$. The analogous holds between matrices and vectors, and between matrices and numbers with double indices. For example, for a matrix $X$, the $i$-th column or row vector (which will be pointed out as we go along) is given by $x_i$, and the $j$-th entry in the $i$-th row will be denoted $x_{ij}$. Sets will be denoted using calligraphic letters, e.g., cones will often be denoted by $\mathcal{K}$. We use $\mathcal{S}^n$ to indicate the set of symmetric matrices and $\mathcal{S}^n_{++}/\mathcal{S}^n_{--}$, $\mathcal{S}^n_+/\mathcal{S}^n_-$ for the sets of positive-/negative-definite and the sets of positive-/negative-semidefinite symmetric matrices, respectively. Moreover, we use $\mathcal{N}^n$ to denote the set of entrywise nonnegative, symmetric matrices. For notational convenience, we will define the block matrix where A, B, C are matrices of appropriate size. Also, for a vector $v \in \mathbb{R}^n$ we will denote by $\mathrm{Diag}(v) \in \mathcal{S}^n$ the diagonal matrix whose diagonal entries are the respective entries of $v$. For a given set $\mathcal{A}$ we denote its interior, closure, relative interior, relative boundary, convex hull and conic hull by $\mathrm{int}(\mathcal{A})$, $\mathrm{cl}(\mathcal{A})$, $\mathrm{ri}(\mathcal{A})$, $\mathrm{bd}(\mathcal{A})$, $\mathrm{conv}(\mathcal{A})$, $\mathrm{cone}(\mathcal{A})$, where for the latter operation we stick to the convention that $\mathrm{cone}(\mathcal{A}) := \{\lambda x : x \in \mathcal{A},\ \lambda \ge 0\}$ for notational ease (note that by this definition $\mathrm{cone}(\mathcal{A})$ is the union of $\{o\}$ and the smallest cone containing $\mathcal{A}$).
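Let $\mathcal{V}$ be either $\mathbb{R}^n$ or $\mathcal{S}^n$, whereby we use the standard inner product in $\mathbb{R}^n$, while in $\mathcal{S}^n$ the inner product is given by the Frobenius product $A \bullet B = \mathrm{trace}(AB)$. For an arbitrary cone $\mathcal{K} \subseteq \mathcal{V}$ we denote the dual cone by $\mathcal{K}^* \subseteq \mathcal{V}$, where $\mathcal{K}^* := \{y \in \mathcal{V} : \langle x, y \rangle \ge 0 \text{ for all } x \in \mathcal{K}\}$. It is well known that $\mathcal{K}^{**} = \mathrm{clconv}(\mathcal{K})$. The closure and the convex-hull operations can be omitted if $\mathcal{K}$ is closed and/or convex, respectively. We say a closed convex cone $\mathcal{K}$ is pointed if it does not contain a line or, equivalently, if $\mathrm{int}(\mathcal{K}^*) \neq \emptyset$.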
Throughout the text, we will make use of conic linear optimization tools. A conic linear optimization problem is given by (P), which is just a linear optimization problem with an extra constraint that restricts the decision variable $x$ to lie in a closed, convex cone $\mathcal{K}$. The dual problem is given by (D). Slater's condition says that a relative interior feasible point of (P) guarantees $p^* = d^*$ and existence of an optimal solution to (D) if it is feasible, and likewise with (D) and (P) interchanged. A large portion of our discussion will revolve around set-copositive and set-completely positive matrix cones, which for arbitrary cones $\mathcal{K} \subseteq \mathbb{R}^n$ are defined in the following way, putting $k = \binom{n+1}{2} = \dim \mathcal{S}^n$: $\mathrm{COP}_n(\mathcal{K}) := \{Q \in \mathcal{S}^n : x^T Q x \ge 0 \text{ for all } x \in \mathcal{K}\}$ and $\mathrm{CPP}_n(\mathcal{K}) := \{\sum_{i=1}^k x_i x_i^T : x_i \in \mathcal{K},\ i \in [1:k]\}$. The index $n$ will henceforth be suppressed if the dimension is clear from the context. The above matrix cones are convex cones that are dual to each other if $\mathcal{K}$ is closed since in general $\mathrm{COP}(\mathcal{K})^* = \mathrm{cl}[\mathrm{CPP}(\mathcal{K})] = \mathrm{CPP}(\mathrm{cl}\,\mathcal{K})$ holds (Sturm and Zhang 2003, Proposition 1 and Lemma 1). In fact, an easy continuity argument shows that $\mathrm{COP}(\mathcal{K})$ is always closed even if $\mathcal{K}$ is not. Since $\mathcal{S}^n_+ = \mathrm{COP}(\mathbb{R}^n) = \mathrm{CPP}(\mathbb{R}^n)$, this also shows that the cone of positive-semidefinite matrices is self-dual, i.e. $(\mathcal{S}^n_+)^* = \mathcal{S}^n_+$. The concept of copositivity was first introduced in Motzkin (1952) for the case $\mathcal{K} = \mathbb{R}^n_+$. To determine whether a given matrix is an element of either $\mathrm{COP}(\mathbb{R}^n_+)$ or $\mathrm{CPP}(\mathbb{R}^n_+)$ is NP-hard. This leaves the impression that using these cones in optimization yields a complication rather than a solution of a problem. However, both cones can be approximated by tractable matrix cones, and recent literature shows that even simple approximations yield good bounds for several kinds of optimization problems. We will not go into further detail but instead refer to Bomze (2012), Dür (2010), Hiriart-Urruty and Seeger (2010).
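Here, let us just mention the most elementary examples of tractable approximations given by the inclusions $\mathrm{NND} := \mathcal{S}^n_+ + \mathcal{N}^n \subseteq \mathrm{COP}(\mathbb{R}^n_+)$ and $\mathrm{CPP}(\mathbb{R}^n_+) \subseteq \mathrm{DNN} := \mathcal{S}^n_+ \cap \mathcal{N}^n$ (4), where the mnemonics NND, DNN abbreviate non-negative-decomposable and doubly-non-negative, respectively. In fact, the inclusions in (4) hold with equality in case $n < 5$. We also have the useful equality $\mathrm{CPP}(\mathbb{R}^n \times \mathbb{R}_+) = \mathcal{S}^{n+1}_+$; the equality $\mathrm{COP}(\mathbb{R}^n \times \mathbb{R}_+) = \mathcal{S}^{n+1}_+$ then follows from self-duality of $\mathcal{S}^{n+1}_+$ and the fact that $\mathbb{R}^n \times \mathbb{R}_+$ is a closed, convex cone and hence $\mathrm{COP}(\mathbb{R}^n \times \mathbb{R}_+) = \mathrm{CPP}(\mathbb{R}^n \times \mathbb{R}_+)^*$.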
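For $n = 2$ copositivity over the nonnegative orthant can be tested in closed form, which gives a feeling for how the copositive cone is larger than the semidefinite cone. The Python sketch below uses the classical criterion for symmetric $2 \times 2$ matrices ($a_{11} \ge 0$, $a_{22} \ge 0$, $a_{12} + \sqrt{a_{11} a_{22}} \ge 0$); this criterion and the three test matrices are not taken from the paper and serve only as an illustration. A brute-force evaluation on a grid of nonnegative vectors is included as a sanity check.

```python
import math

def is_psd_2x2(A):
    a, b, c = A[0][0], A[0][1], A[1][1]
    return a >= 0 and c >= 0 and a * c - b * b >= 0

def is_copositive_2x2(A):
    # Classical closed-form test for symmetric 2x2 matrices over the nonnegative orthant.
    a, b, c = A[0][0], A[0][1], A[1][1]
    return a >= 0 and c >= 0 and b + math.sqrt(a * c) >= 0

def min_on_grid(A, steps=50):
    # Smallest value of x^T A x over integer vectors in [0, steps]^2 (sanity check only).
    a, b, c = A[0][0], A[0][1], A[1][1]
    return min(a * i * i + 2 * b * i * j + c * j * j
               for i in range(steps + 1) for j in range(steps + 1))

N = [[0, 1], [1, 0]]     # entrywise nonnegative, not PSD, but copositive
P = [[1, -1], [-1, 1]]   # PSD, hence copositive
B = [[1, -2], [-2, 1]]   # not copositive: x = (1, 1) gives x^T B x = -2

for name, A in (("N", N), ("P", P), ("B", B)):
    print(name, is_psd_2x2(A), is_copositive_2x2(A), min_on_grid(A))
```

The matrix `N` lies in the non-negative-decomposable cone without being positive semidefinite, which is the simplest instance of the inner approximation mentioned above.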

Convex reformulations of QCQPs and robust optimization
We will now proceed in describing the general strategy by which these reformulation results can be leveraged in order to cope with cases where the semi-infinite constraints present in robust optimization depend quadratically on the uncertainty vector , ω(x) are matrix-, vector-and scalar-valued functions of the decision vector x. For ease of notation we refer to them as Q, q and ω, suppressing the dependence on x. We thus consider the semi-infinite constraint and its (possibly non-smooth) single-constraint equivalent Now suppose that for generic Q, q and ω the minimization problem in (5) has an exact convex, conic reformulation of the following form, using an appropriate matrix cone C and appropriate matrices A i ∈ S n+1 , real numbers b i ∈ R, i ∈ [1 : m], so that, using notation (3), Further, assume that for this reformulation and its dual we can establish zero duality gap and dual attainability (e.g., if the primal satisfies Slater's condition), so that Since dual attainability guarantees the existence of the dual maximizers, we can enforce the semi-infinite constraint in (5) by demanding that In summary, the strategy is to bypass the need to directly dualize the implicit QCQP in (5) by providing a linear conic reformulation, a dual of which can be formed by invoking linear conic duality, which is a very well developed and understood subject. Many results in recent literature have harnessed this general strategy in order to cope with cases where quadratic terms in the uncertainty vector appear (see e.g. Mittal et al. 2019;Xu and Hanasusanto 2019). We want to highlight that the critical ingredients for the above strategy are: • closing the relaxation gap, • closing the duality gap for the conic reformulation and • guaranteeing dual attainability.
While the first obstacle seems to be the most challenging, duality also comes with some subtleties attached. In Sect. 5 we will make an effort to understand the reformulation gap, that may occur in case of a failure of dual attainability, while in Sect. 4 we will discuss relaxation and duality gaps and their relation to each other. For now, we will give some examples that illustrate the above strategy and also demonstrate that the gap can be infinite if dual attainability does not hold.
If, out of the infinitely many constraints, we enforced merely $0.5^2 x + 0.5^2 y \ge 1$, we would obtain a relaxed problem with an optimal value of 4, attained at $x = y = 2$. This solution is also feasible for the original problem since $\min\{2u_1^2 + 2u_2^2 : u_1 + u_2 = 1,\ (u_1, u_2) \in \mathbb{R}^2_+\} = 1$, and thus it is also optimal. We will now show that the general reformulation strategy yields an equivalent problem with finitely many constraints.
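The feasibility claim for $x = y = 2$ can be checked numerically. The sketch below assumes that the semi-infinite constraint of the example reads $u_1^2 x + u_2^2 y \ge 1$ for all $u$ in the unit simplex, which is our reading of the two expressions quoted above; the grid resolution is an arbitrary choice.

```python
from fractions import Fraction

x, y = 2, 2      # candidate solution of the relaxed problem
steps = 1000     # grid resolution on the simplex u1 + u2 = 1, u >= 0

values = []
for i in range(steps + 1):
    u1 = Fraction(i, steps)
    u2 = 1 - u1
    # Left-hand side of the assumed constraint u1^2 * x + u2^2 * y >= 1.
    values.append((x * u1 ** 2 + y * u2 ** 2, u1))

worst_value, worst_u1 = min(values)
print(worst_value, worst_u1)
```

The worst case is attained at $u_1 = u_2 = 1/2$ with value 1, so the single enforced constraint is exactly the binding one.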

Example 2
To illustrate the issues of a failure of dual attainability for the conic reformulation we consider the robust optimization problem Note that the semi-infinite constraint is equivalent to x + y ≥ 0, so that the constraints overall imply x + y = 0 and the minimum is 0. We claim that for any a ∈ R we have q(a) := min au 2 1 − u 2 u 3 : The claim follows easily after considering that U ∈ S 3 + and u 33 = 0 imply u 23 = 0. Note, that this reformulation is not based on a characterization of G(F), but is coincidental, and chosen here in order to illustrate a point about dual attainability. Also, we see that q(a) = a if a < 0 and zero otherwise. The conic reformulation does not have a Slater point, however its dual, given by does have a Slater point, where λ i , i = 1, 2, 3 are sufficiently small. Thus the conic reformulation and its dual have the same optimal value, but dual attainability is not guaranteed. In fact, some elementary analysis reveals that the dual does not attain its optimum. We thus merely have the following implications The final constraint implies x + y > 0 since in case of x + y = 0 we had λ 1 = 0 and λ 2 can neither be zero nor positive. Thus if we replaced the semi-infinite constraint in (9) by the latter constraint, the problem would become infeasible. This exemplifies that closing the relaxation gap as well as the duality gap is in general not enough to guarantee an exact reformulation.

Exactness of the Shor relaxation and the S-Lemma
As stressed in Burer (2015), the key problem when employing Theorem 1 is to characterize the set $\mathcal{G}(\mathcal{F})$, which of course depends heavily on the choice of $\mathcal{F}$ (note that no convexity assumptions are imposed on $\mathcal{F}$). A natural starting point is given by the so-called Shor relaxation, which is constructed in the following manner. For notational convenience we define $Y(X, x) := M(X, 2x, 1)$ as well as We will refer to the latter problem as Shor relaxation since the core idea was introduced in Shor (1987), albeit for the special case of $\mathcal{K} = \mathbb{R}^n$, in which case $\mathrm{CPP}(\mathbb{R}^n \times \mathbb{R}_+) = \mathcal{S}^{n+1}_+$ (see the notations and preliminaries section). The above derivation makes it apparent that the Shor relaxation is not necessarily tight, since its feasible set $\mathcal{F}_{\mathrm{Shor}}$ can have extreme points that are not rank-one matrices. However, if we find that the optimal solution of the Shor relaxation is a rank-one matrix, then we have solved the original QCQP. If we can show that a rank-one solution exists for any choice of coefficients $(Q, q, \omega)$, we have indeed shown that $\mathcal{F}_{\mathrm{Shor}} = \mathcal{G}(\mathcal{F})$ (see e.g. Yang et al. 2016). But in general $\mathcal{F}_{\mathrm{Shor}} \supseteq \mathcal{G}(\mathcal{F})$, so that one has to find a strengthening of the Shor relaxation if the relaxation gap is to be closed (e.g. see Burer 2009, 2012; Eichfelder and Povh 2013). However, another way of establishing exactness of the Shor relaxation stems from the fact that the conic dual of the Shor relaxation and the Semi-Lagrangian dual of the underlying QCQP (see Bomze 2015; Faye and Roupin 2007) take the same form. To see this consider the following derivation of the dual of a QCQP, using the abbreviations In the second to last equality we used the fact that for any quadratic form $q(x)$ and any cone K we have Here, we speak of a Semi-Lagrangian dual, since the constraints $(x, x_0) \in \mathcal{K} \times \mathbb{R}_+$ are not dualized in that we do not introduce respective dual variables.
The final equality follows from the definition of COP(·) and in fact yields the conic dual of the Shor relaxation after a small reformulation, Duality results for QCQPs are regularly obtained by proving that the joint numerical range of the quadratic forms involved has some convexity property (either being convex itself or becoming convex after adding the positive orthant in Minkowski's set-sum sense). For examples, see (Chieu et al. 2019;Jeyakumar and Li 2014) and references therein. If for a fixed feasible set F the duality result holds for an arbitrary choice of the objective function coefficients, the feasible set of the Shor relaxation is in fact G(F).
We will now proceed to give some examples that illustrate the interplay between relaxation gaps, duality gaps and exact strengthenings of QCQPs and their relaxations.
Example 3 (Zero duality gap for the QCQP, hence the Shor relaxation is tight) Consider the optimization problem min x∈R + −x 2 : x 2 ≤ 1 . We will now give an argument for full strong duality based on the joint numerical range. In this simple special case the difficulties of such an approach are avoided while the core idea of the argument is maintained. The following equalities are easily checked: In order to close the duality gap we have to show that the condition is equivalent to Clearly, (12) implies (11), since in case (11) fails, any member of the respective set is an x ∈ R + that invalidates (12). The reverse implication can be demonstrated by appealing to the geometry of the joint numerical range J (−1, 1, In fact, in this simple case we have which is a half line and thus convex (which also illustrates the classical result in Dines (1941)). Now assume that (11) holds, then J (−1, 1, R + ) − te 1 − e 2 does not meet int(R 2 − ). Since both sets are convex, they can be separated by a hyperplane, i.e. it holds that where the nonnegativity of the multipliers follows from the fact that we separate from the interior of the negative orthant and the int(·)-operator can be dropped by continuity. For the latter condition to hold, it must be the case that α > 0, otherwise we had β > 0 and x = 0 would take the remaining term to be −β < 0. We thus arrive at (12) with λ = β/α. Finally, we have that for any fixed t ∈ R (12) is equivalent to Thus it follows that the Shor relaxation is tight.
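The zero duality gap in Example 3 can also be verified directly. For the problem $\min_{x \ge 0} \{-x^2 : x^2 \le 1\}$ the Lagrangian dual function is $g(\lambda) = \inf_{x \ge 0} (\lambda - 1)x^2 - \lambda$, which equals $-\lambda$ for $\lambda \ge 1$ and $-\infty$ otherwise. The Python sketch below evaluates both sides; the closed form of `dual_function` and the grids are our own working, offered as an illustration rather than as the argument used in the text.

```python
import math

def dual_function(lam):
    # g(lam) = inf over x >= 0 of -x^2 + lam * (x^2 - 1) = (lam - 1) * x^2 - lam.
    if lam < 1:
        return -math.inf   # the quadratic term is unbounded below as x grows
    return -lam            # the infimum is attained at x = 0

# Primal: the feasible set is the interval [0, 1].
primal_value = min(-(i / 1000) ** 2 for i in range(1001))

# Dual: maximize g over a grid of nonnegative multipliers.
multipliers = [j / 100 for j in range(501)]
dual_value = max(dual_function(lam) for lam in multipliers)
print(primal_value, dual_value)
```

Both values equal $-1$, with the dual optimum attained at $\lambda = 1$.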
Example 4 (Positive QCQP-duality gap, inexact Shor relaxation with exact tightening, both with zero duality gap) Consider the dual of $\min\{x^2 - y^2 + x : x + y = 1,\ x, y \ge 0\}$ given by The primal QCQP has an optimal value of $-1$ while the Lagrangian dual is infeasible since any matrix in $\mathrm{COP}(\mathbb{R}^3_+)$ has nonnegative diagonal entries. The Shor relaxation given by is unbounded, where for the improving ray we have $x = X = Z = 0$, $y = 1$ and $Y \ge 1$. Thus the duality gap for the Shor relaxation is actually zero since the QCQP and its Shor relaxation share the same Lagrangian dual. We can strengthen the Shor relaxation by introducing the additional constraint $X + 2Z + Y = 1$, which is the relaxation of the redundant constraint $(x + y)^2 = 1$. This yields an exact reformulation by the main results in Burer (2009), Burer (2012). After applying the simplification steps outlined there, we obtain another exact reformulation, which has a Slater point at $X = Y = 0.4$, $Z = 0.1$, so that it exhibits no duality gap.
Example 5 (Exact Shor reformulation with finite duality gap) Let us consider $\min\{x_1^2 + x_2^2 : x_1^2 = 0,\ x_1 x_3 - x_2^2 = -1,\ x_2 \ge 0\}$, whose optimal value of 1 is attained at $x_1 = 0$, $x_2 = 1$. The Shor relaxation $\min\{X_{11} + X_{22} : X_{11} = 0,\ X_{13} - X_{22} = -1,\ X \in \mathrm{CPP}(\mathbb{R} \times \mathbb{R}_+ \times \mathbb{R}) = \mathcal{S}^3_+\}$ has the same optimal value, since $X_{11} = 0$ together with $X \in \mathcal{S}^3_+$ forces $X_{1i} = 0$, $i = 2, 3$. Now the Shor relaxation is in fact a known example from (Pataki 2018) for an SDP with finite positive duality gap.
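The primal value stated in Example 4 follows from substituting $y = 1 - x$, which turns the objective $x^2 - y^2 + x$ into $3x - 1$ on $[0, 1]$. The short Python sketch below confirms the optimal value on a grid and checks that the quoted point $X = Y = 0.4$, $Z = 0.1$ satisfies the added linear constraint $X + 2Z + Y = 1$; the grid resolution is an arbitrary choice.

```python
from fractions import Fraction

def objective(x, y):
    return x * x - y * y + x

steps = 1000
candidates = []
for i in range(steps + 1):
    x = Fraction(i, steps)
    y = 1 - x                       # enforces x + y = 1 with x, y >= 0
    candidates.append((objective(x, y), x))

primal_value, x_opt = min(candidates)

# Point quoted in Example 4 for the tightened relaxation.
X, Y, Z = Fraction(2, 5), Fraction(2, 5), Fraction(1, 10)
lifted_constraint = X + 2 * Z + Y
print(primal_value, x_opt, lifted_constraint)
```

The minimum $-1$ is attained at $x = 0$, $y = 1$, in line with the value stated in the example.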
In summary we have discussed two routes for achieving a convex reformulation of a QCQP, either via closing a relaxation gap or via closing a duality gap, and we have illustrated the following facts: • If a QCQP enjoys strong duality (e.g. if the joint numerical range is a convex cone), we also have an exact Shor reformulation. • If the Shor relaxation is exact and we can close its duality gap we also close the duality gap between the underlying QCQP and its dual. • If the Shor relaxation is not exact but we can find an exact tightening, the dual of that reformulation can be used as an alternative to the Lagrangian dual of the QCQP.
For some background on these facts see for example (Bomze 2015;Chieu et al. 2019) who discuss the case K = R n + . To the best of our knowledge, these facts have so far not been discussed for general convex K, but the generalization is immediate, as shown above. Typically, in literature, one of these paths, closing the relaxation gap or closing the duality gap, is chosen with no regard for the implications of an obtained result for the constitution of the other route. We will now proceed with a demonstration of the former path which is a nice example in two regards. First of all, the derivation is simple and instructive as it is accessible via some geometric intuition. Second, the conditions under which we close the relaxation gap will not only give rise to new conditions for a generalized version of the S-Lemma to hold, but also allows some insight into the joint numerical range of the quadratic forms which fulfil these conditions.

Harnessing Pataki's rank result
In this section we provide the proof of the exactness of the Shor relaxation under a geometric condition by iterating an application of Pataki's rank theorem. We follow a proof strategy originally considered in Burer and Anstreicher (2013), where the claim of Theorem 3 was proved for the special case of m = 2.

Theorem 2 Consider a feasible set of a semidefinite optimization problem in blockstandard form
Let $X_1, \dots, X_p$ be an extreme point of $\mathcal{T}$ and let $r_j := \mathrm{rank}(X_j)$. Then it holds that $\sum_{j=1}^p r_j (r_j + 1) \le 2k$.
Proof See (Pataki 1998, Theorem 2.2), which is more general than stated here. The required specialization is, however, immediate.
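To see what the bound in Theorem 2 implies in the simplest setting, consider a single semidefinite block, so that the inequality reads $r(r+1) \le 2k$. The Python sketch below computes the largest admissible rank; it assumes, as in Pataki's theorem, that $k$ counts the linear equality constraints, and the function name `max_extreme_rank` is a hypothetical helper introduced here.

```python
def max_extreme_rank(k):
    # Largest integer r with r * (r + 1) <= 2 * k (single-block case of the rank bound).
    r = 0
    while (r + 1) * (r + 2) <= 2 * k:
        r += 1
    return r

for k in (1, 2, 3, 5, 6):
    print(k, max_extreme_rank(k))
```

For $k \le 2$ the bound forces extreme points to have rank at most one, which is the mechanism exploited in the specialization that follows.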
We consider a further specialisation, namely Proof After introducing slack variables, T is a projection of the set and thus rank(X) ≤ 1.
The following geometric condition will allow us to leverage Corollary 1 to cases where more than two linear inequalities are present.

Condition 1 For a collection of matrices
The condition requires that, for any $X \in \mathcal{S}^n$ that is feasible in the above sense, at most two constraints can be binding at the same time. Note that, if F := X ∈ S n + : is bounded (as assumed in Theorem 3), one can check Condition 1 by solving $(m^3 - 3m^2 + 2m)/6$ semidefinite optimization problems of the form For Condition 1 to hold, all the optimal values must be strictly smaller than 0.
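The number of semidefinite problems quoted above equals the binomial coefficient $\binom{m}{3}$, since $(m^3 - 3m^2 + 2m)/6 = m(m-1)(m-2)/6$. This suggests one problem per unordered triple of constraints, which is our reading and not spelled out in the text. The Python sketch below verifies the identity and enumerates the triples; both function names are hypothetical helpers.

```python
from itertools import combinations
from math import comb

def num_condition_checks(m):
    # Count quoted in the text: (m^3 - 3 m^2 + 2 m) / 6.
    return (m ** 3 - 3 * m ** 2 + 2 * m) // 6

def constraint_triples(m):
    # All unordered triples of constraint indices from [1 : m].
    return list(combinations(range(1, m + 1), 3))

for m in range(3, 7):
    print(m, num_condition_checks(m), comb(m, 3), len(constraint_triples(m)))
```

For four inequalities, as in a small instance, this amounts to four problems, so the check is cheap for moderate $m$ but grows cubically.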
We are now able to prove the following result for the homogeneous problem:
Theorem 3 Suppose that Condition 1 holds for the matrices $A_i \in \mathcal{S}^n$ and real numbers $b_i \in \mathbb{R}$, $i \in [1:m]$. Further, suppose that the set F := X∈S n + :
Proof It is clear that "≥" has to hold as the SDP is a relaxation of the QCQP. Since the former has a linear objective and $\mathcal{F}$ is bounded, its optimal value will be attained at extreme points of $\mathcal{F}$. Let $X^*$ be one such point. By Condition 1 at most two inequalities, say $i$ and $j$, are binding at $X^*$. If fewer are binding, then one of these indices or both can be chosen arbitrarily. It follows that $X^*$ is also extremal in the set F i, j := X ∈ S n + : To see this, note that by the strict inequalities in Condition 1 there is a ball $B_\varepsilon(X^*)$ centered at $X^*$ with radius $\varepsilon > 0$ such that If there were $X_1, X_2 \in \mathcal{F}_{i,j}$ such that $X^*$ is a convex combination of those, then such points would have to exist in $\mathcal{F}$ since the line between $X_1$ and $X_2$ would penetrate $B_\varepsilon(X^*) \cap \mathcal{F}$. By Corollary 1 and extremality of $X^*$ in $\mathcal{F}_{i,j}$ we see that $X^* = x^* (x^*)^T$ so that $x^*$ is feasible for the QCQP, and
Example 6 Consider the following quadratic program and its Shor relaxation inf x∈R 2 s.t. : $2x_{11} + x_{22} \le 12$, $x_{11} + 2x_{22} \le 12$, $4x_{11} + x_{22} \ge 4$, The feasible sets of these problems are depicted in Fig. 1. For the lifted feasible set one can see that no more than two inequalities can be binding at the same time within the semidefinite cone. Also, since all matrices at the boundary of $\mathcal{S}^2_+$ are rank-one matrices, we see that the extreme points of the lifted feasible set have rank one, so that the conclusion of Corollary 1 is quite obvious in this case. By inspection one can clearly see that for $q_{12} = 0$, $q_{11} = q_{22} > 0$ the optimal set of the QCQP is at the innermost corners, where the third and fourth inequalities are binding. These are the points (x 1 , For the Shor relaxation we can also easily see that the third and fourth inequalities would be binding in the optimum and so would be the semidefiniteness constraint, so that $x_{11} x_{22} = x_{12}^2$ for the optimal solution. From $4x_{11} + x_{22} = 4$ and $x_{11} + 4x_{22} = 4$ we get $x_{11} = x_{22} = 4/5$ so that $x_{12} = \pm 4/5$. We have as claimed in the theorem. Of course, there are more solutions to the Shor relaxation, but these are convex combinations of the two presented here. They can be seen in Fig. 1 as the line connecting the two lower corners in the lifted feasible set.
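The numbers in Example 6 can be checked with exact arithmetic. The Python sketch below assumes that the fourth inequality of the Shor relaxation is $x_{11} + 4x_{22} \ge 4$, which is how we read the equation $x_{11} + 4x_{22} = 4$ used in the discussion; the helper `is_feasible` is hypothetical.

```python
from fractions import Fraction

def is_feasible(x11, x22):
    # Linear constraints of the lifted problem; the fourth one is an assumption (see lead-in).
    return (2 * x11 + x22 <= 12 and x11 + 2 * x22 <= 12
            and 4 * x11 + x22 >= 4 and x11 + 4 * x22 >= 4)

x11 = x22 = Fraction(4, 5)
solutions = [(x11, x12, x22) for x12 in (Fraction(4, 5), Fraction(-4, 5))]

# A PSD 2x2 matrix with positive trace has rank one exactly when its determinant vanishes.
determinants = [a * c - b * b for a, b, c in solutions]

# The midpoint of the two rank-one solutions is optimal as well but has rank two.
midpoint_det = x11 * x22 - Fraction(0) ** 2
print(is_feasible(x11, x22), determinants, midpoint_det)
```

The two rank-one matrices are the extreme optimal solutions, while their midpoint has a positive determinant and therefore rank two, matching the remark that further optimal solutions arise only as convex combinations.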
The optimization problem in Theorem 3 does not involve linear terms, which raises the question whether the result has implications for that case as well. In fact, the theorem can be transferred to the case where the quadratic functions are non homogeneous by applying the following simple lemma:

Lemma 4
The extreme points of T := {X ∈ S n + : Proof Assume X ∈ ext(T ), the set of extreme points of T , and assume further that there are X 1 , X 2 ∈ T such that X = λX 1 + (1 − λ)X 2 for some λ ∈ (0, 1). We have A 1 • X j ≤ b 1 for j = 1, 2. If A 1 • X j < b 1 for at least one j ∈ {1, 2}, then , a contradiction. Now, assume X ∈ ext T and that A 1 • X = b 1 , so clearly X ∈ T . Again, if there was a pair X 1 , X 2 ∈ T such that X = λX 1 + (1 − λ)X 2 for some λ ∈ (0, 1) there would be such a pair in T ⊇ T , so that we arrive at an analogous contradiction.

Theorem 5 Assume
Proof After reformulating the original problem as we see that its Shor relaxation is given by where we again make use of the fact that CPP(R n × R + ) = S n+1 + . Note that after resolving the inequality we see that the feasible set of the latter problem is exactly F shor . Proving that all extreme points are rank one is done in a manner analogous to Theorem 3, where the fact that we have one equality instead of an inequality does not interfere due to Lemma 4. In the representation of the reformulation we also combined the constraints A m+1 • M(X, 2x, x 0 ) = 1 and M(X, 2x, x 0 ) ∈ S n+1 + into M(X, 2x, 1) ∈ S n+1 + and also resolved the equality.
Theorem 5 is related to the main result in Yang et al. (2016) where the authors proved the following result.
Theorem 6 Assume F ⊆ R n is the feasible set of a QCQP and assume that the for the setF := x ∈ F : The non-intersection condition is similar to the condition of Theorem 5. To see this, note that in the Shor relaxation in Theorem 5 the equality associated with A m+1 consumes one of two inequalities which can be binding at any time. Thus, the remaining inequalities cannot have intersections within the semidefinite cone, or else Condition 1 fails. Also note, that under the Condition 1 the set F Shor is bounded if and only if it is bounded by merely one of the m inequalities in its description. From this we see that we can interpret Theorem 5 as a special case of Theorem 6 where F is given by a single quadratic equality and the non-intersection condition is with respect to the lifted space, i.e. a strictly stronger condition than in Theorem 6. It is an open question whether there is a direct way to deduce Theorem 6 from Theorem 3. However, Theorem 3 cannot be deduced from Theorem 6, and only the former will allow us to derive a generalization of the S-Lemma under geometric condition that is a derivative of Condition 1, as well as allow for some insight in the joint numerical range of the quadratic forms that jointly fulfil said condition.

Generalized S-lemma and geometry of the joint numerical range
We are now ready to use Theorem 3 in order to establish a generalization of the S-Lemma in the form of a characterization of set-copositivity over a cone given by homogeneous quadratic inequalities. The key to this result will be the fact that a strictly feasible point of a quadratic program guarantees a Slater point in its Shor relaxation, as proved in (Tunçel 2001, Theorem 5.1).

Theorem 7 Let
Then
Proof Observe that Q ∈ COP(K) if and only if q * := inf x {x T Qx : x T E n x ≤ 1, x ∈ K} ≥ 0, where the latter QCQP has a strictly feasible point x 0 /(2‖x 0 ‖). The matrices A i , i ∈ [1 : m] fulfil (13) and thus also fulfil Condition 1 together with the pair E n and 1. Thus, by Theorem 3 we have q * = inf X∈S n + {Q • X : E n • X ≤ 1, A i • X ≤ 0, i ∈ [1 : m]}, and the latter set is bounded (due to the constraint E n • X ≤ 1) and has a Slater point by (Tunçel 2001, Theorem 5.1) since the QCQP has a strictly feasible point. Thus by strong duality q * = sup λ∈R m
Note that the above theorem is stated in terms of a characterization of copositivity over a cone described by homogeneous quadratic functions. Another way of stating the result would be a theorem of the alternative, which is the usual form when discussing the S-procedure, as discussed in the introduction. Such theorems are often derived from results on the joint numerical range of the quadratic forms involved. Specifically, if the joint numerical range can be shown to be a convex cone, an S-Lemma type result can be derived in a straightforward manner (see e.g. Pólik and Terlaky 2007). An important example of such a result was proved in Polyak (1998), where it is shown that the S-procedure is exact for the case m = 2, n ≥ 3 (under a condition discussed shortly) by invoking a convexity result regarding the joint numerical range of three quadratic functions, also provided in Polyak (1998), namely:
Theorem 8 Let A i ∈ S n , i = 0, 1, 2. For n ≥ 3 the following assertions are equivalent.

The joint numerical range
is a convex and pointed cone.
The theorem guarantees that whenever 1. holds and J is disjoint from the interior of the negative orthant, it can be separated from the latter set via a hyperplane. Thus, given that the first condition in Theorem 8 and Slater's condition hold, one can prove exactness of the S-procedure, that is, one can prove the following theorem as shown in Polyak (1998):
Theorem 9 Let n ≥ 3 and A i ∈ S n , i = 0, 1, 2. Assume that there exists μ ∈ R 3 such that μ 0 A 0 + μ 1 A 1 + μ 2 A 2 ∈ S n ++ and that there is an x 0 ∈ R n such that x T 0 A i x 0 < 0, i = 1, 2. Then the following two statements are equivalent: i) the following system of inequalities has no solution x ∈ R n : x T A 0 x < 0 and x T A i x ≤ 0, i = 1, 2; ii) the following conic inequality has a solution λ ∈ R 2 + :
Obviously, the condition ∃μ ∈ R 3 : μ 0 A 0 + μ 1 A 1 + μ 2 A 2 ∈ S n ++ is readily fulfilled if for at least one pair i ≠ j there exists μ ∈ R 2 with μ 1 A i + μ 2 A j ∈ S n ++ , and we will refer to the latter condition as Condition 1'. Thus, if we fix A 1 and A 2 such that Condition 1' is fulfilled, we can characterize copositivity over the cone
The copositivity condition derived in Theorem 7 is, however, not obtained courtesy of any argument involving the joint numerical range, but rather as a consequence of the exactness of the Shor relaxation and the existence of a Slater point in conjunction with strong conic duality under the assumptions of the theorem. An immediate question is therefore whether we can learn anything about the joint numerical range of the quadratic forms that fulfil said assumptions. This question is interesting in its own right since the joint numerical range is in general a very ill-behaved object and geometrical results are typically hard to obtain. We will gain at least some insight by proving that Condition 1' is implied by (13) and that, under additional assumptions, both coincide.
Lemma 10 Let K p := {(A 1 • X, A 2 • X) : X ∈ S n + } and K d := {y ∈ R 2 : y 1 A 1 + y 2 A 2 ∈ S n + }. Then K d is a closed convex cone, K p is a convex cone, and K * p = K d so that
Proof The claims are immediate from the fact that S n + is a self-dual cone (therefore closed and convex).
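The duality K * p = K d rests on the identity y 1 (A 1 • X) + y 2 (A 2 • X) = (y 1 A 1 + y 2 A 2 ) • X. The following minimal sketch checks this identity numerically on rank-one matrices X = xx T . The matrices A1, A2 and the vector y are illustrative choices (assumptions, not data from the text), picked so that y 1 A 1 + y 2 A 2 is positive semidefinite, i.e. y ∈ K d .

```python
def frob(A, B):
    """Frobenius inner product A • B of two square matrices (nested lists)."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def outer(x):
    """Rank-one positive semidefinite matrix x x^T."""
    return [[xi * xj for xj in x] for xi in x]

def pairing(y, A1, A2, X):
    """Return both sides of y^T (A1•X, A2•X) = (y1*A1 + y2*A2) • X."""
    n = len(A1)
    lhs = y[0] * frob(A1, X) + y[1] * frob(A2, X)
    combo = [[y[0] * A1[i][j] + y[1] * A2[i][j] for j in range(n)] for i in range(n)]
    return lhs, frob(combo, X)

# Illustrative data (assumption): y1*A1 + y2*A2 = [[1, 1], [1, 1]] is PSD,
# so y lies in K_d and must pair non-negatively with every point of K_p.
A1 = [[1.0, 0.0], [0.0, -1.0]]
A2 = [[0.0, 1.0], [1.0, 2.0]]
y = (1.0, 1.0)

for x in ([1.0, -2.0], [0.5, 0.5], [3.0, 1.0]):
    lhs, rhs = pairing(y, A1, A2, outer(x))
    print(x, lhs, rhs)
```

Since every X ∈ S n + is a sum of rank-one matrices, non-negativity on the samples extends by linearity to all of K p for this particular y.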

Proposition 11
For matrices A 1 , A 2 ∈ S n , assume that ∃x 0 ∈ R n such that x T 0 A i x 0 < 0 for i ∈ {1, 2}. Consider the following two statements: 1. There exists μ ∈ R 2 such that 2. For any matrix X ∈ S n + \ {O} and i, j ∈ {1, 2} with i ≠ j we have Then, 2. ⇒ 1., and if we in addition assume that in the description of the set S := {X ∈ S n + : A i • X ≤ 0 , i = 1, 2} none of the linear inequalities is redundant, then also 1. ⇒ 2.
Proof For the first claim, assume that property 1. does not hold. We will show that condition 2. then has to fail. Under this hypothesis the linear subspace M := {μ 1 A 1 + μ 2 A 2 : μ ∈ R 2 } is disjoint from S n ++ . Thus, by (Rockafellar 2015, Theorem 11.2) we have a hyperplane containing M, and hence S n ++ lies entirely in one of its associated open halfspaces. Since M contains the origin, so does said hyperplane; it actually supports S n + . Thus, one of its normal vectors, say X, is in S n + by self-duality of the latter matrix-cone. But then for this X ∈ S n + we have A 1 • X = A 2 • X = 0, i.e. property 2. fails to hold. For the other direction, assume that property 1. and the additional assumption about S hold. Consider the cones K p := {(A 1 • X, A 2 • X) : X ∈ S n + } and K d := {y ∈ R 2 : y 1 A 1 + y 2 A 2 ∈ S n + }. By Lemma 10 both are convex, K d is closed and K * p = K d . Since the cone of positive definite matrices has nonempty interior, property 1. is sufficient to guarantee that K d has nonempty interior. Note that K p has to meet the interior of the negative orthant, because of x T 0 A i x 0 < 0, i = 1, 2. Also, K p has to meet the interior of the second orthant, since otherwise there is no X ∈ S n + such that The closures of the two sets would coincide, and by (Rockafellar 2015, Theorem 7 in contrast to the assumption. By an analogous argument we have nonempty intersection of K p and the interior of the fourth orthant. Now assume that condition 2. fails, i.e. there is an X ∈ S n + such that A i • X = 0 and A j • X ≥ 0 for i ≠ j. If in fact A j • X = 0 then X is a normal vector to a hyperplane that contains the linear combinations of A 1 , A 2 and which supports S n + by self-duality. This clearly contradicts 1. as such a hyperplane is disjoint from S n ++ . Thus assume A j • X > 0. 
Then (A 1 • X, A 2 • X) together with three more points from the intersection of K p and the interior of the second, third and fourth orthant respectively, give four vectors whose positive linear combinations span R 2 . This span is contained in K p so K d = {0} = (R 2 ) * , which also contradicts property 1.
By this proposition, we could replace (13) with Condition 1' in Theorem 7 provided there are no redundant constraints. This gives some insight into the geometry of the joint numerical range J (Q, A 1 , . . . , A m ) of m + 1 matrices that satisfy the assumption of Theorem 7. If n ≥ 3, every projection of J (Q, A 1 , . . . , A m ) onto three coordinates is in fact a convex, pointed cone by Theorem 8. We do not know whether these geometric properties of J (Q, A 1 , . . . , A m ) are enough to ensure linear separability. However, Theorem 7 is not based on a convexity result concerning J , but is a consequence of Pataki's theorem.
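Condition 1' asks for some μ ∈ R 2 with μ 1 A 1 + μ 2 A 2 positive definite. Because positive definiteness is invariant under positive scaling of μ, a crude numerical check can scan μ over the unit circle. The sketch below does this in plain Python with a hand-written Cholesky test. The function names, the grid resolution and the example matrices are illustrative assumptions, and a grid search can miss very thin feasible arcs.

```python
import math

def is_positive_definite(M, tol=1e-12):
    """Cholesky attempt: all pivots > tol means M is (numerically) positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def find_definite_combination(A1, A2, steps=3600):
    """Return the first mu = (cos t, sin t) on the grid with mu1*A1 + mu2*A2
    positive definite, or None if no grid point works."""
    n = len(A1)
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        mu = (math.cos(t), math.sin(t))
        M = [[mu[0] * A1[i][j] + mu[1] * A2[i][j] for j in range(n)] for i in range(n)]
        if is_positive_definite(M):
            return mu
    return None

def diag(*d):
    return [[d[i] if i == j else 0.0 for j in range(len(d))] for i in range(len(d))]

# Illustrative pair (assumption): x0 = (1, 1, 1) gives x0^T A_i x0 < 0 for both i,
# and mu = (-2, -1) yields the identity matrix.
A1, A2 = diag(1.0, -1.0, -1.0), diag(-3.0, 1.0, 1.0)
# Illustrative pair for which no definite combination exists.
B1, B2 = diag(1.0, -1.0, 0.0), diag(0.0, -1.0, 1.0)

print(find_definite_combination(A1, A2))
print(find_definite_combination(B1, B2))
```

In the first example any feasible μ has two negative components, which reflects the fact that a strictly feasible point x 0 rules out positive definite combinations with non-negative weights.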

An alternative perspective via set-copositivity
In the following section we want to discuss the fact that the semi-infinite constraints, which arise in robust and adjustable robust optimization, can be interpreted as set-copositivity constraints. Careful analysis of these constraints will reveal that they can be decomposed into linear and copositivity constraints of simpler structure. This decomposition will allow for some interesting insights in the following sense. We have seen so far that we can use quadratic optimization tools in conjunction with conic duality in order to reformulate semi-infinite constraints that depend quadratically on an index. However, if we can merely close the duality gap without at the same time guaranteeing dual attainability, the feasible set described by the reformulation shrinks. The decomposition will reveal more details on the nature of this shrinkage. We will discuss this phenomenon in the context of using results in Burer (2009), Eichfelder and Povh (2013) for the reformulation of semi-infinite constraints. Let us reconsider the semi-infinite constraints we have encountered in (5): The quadratic inequality can be homogenized in order to obtain a more compact formulation, i.e., By the definition of set-copositivity the above constraint is equivalent to We thus see that the copositivity constraint (15) can also be enforced by (6). Further, from cone Thus, as a side note, we can handle unions of sets U = ∪ i U i in (5) whenever we can characterize COP (cone [U i × {1}]) for each individual U i . However, the complicated structure of the set over which we require the quadratic form to be non-negative seems to be prohibitive. Nevertheless, this issue can be resolved quite easily. Assume that U is the compact intersection of a (not necessarily convex) cone K with a hyperplane: We will prove a result which states that the seemingly complicated copositivity constraint in (15) can be replaced by a linear constraint and a copositivity constraint over K. 
The key idea is that cone[(K ∩ H) × {μ}] is actually K × {0} under a linear transformation. To this end, we need some auxiliary results.

Lemma 12
Let M := {u ∈ R n : Au = b} be an affine subspace where A ∈ R m×n and b ∈ R m \ {o}. Let K ⊂ R n be a closed cone with pointed convex hull K := conv(K) such that K ∩ M is compact. Then the following two statements hold.
a) There are α ∈ int(K * ) and b 0 > 0 such that
Proof Let M 0 := {u ∈ R n : Au = o} = ker A denote the recession cone of M. Then, by compactness, we have M 0 ∩ K = {o}. We look for a supporting hyperplane of K which meets this cone only at o (i.e., whose normal vector is interior to K * ) and which contains M 0 (i.e., it is also a supporting hyperplane of M 0 ). Such a hyperplane exists exactly if int(K * ) ∩ M * 0 ≠ ∅, which we will prove to be the case. Assume that int(K * ) ∩ M * 0 = ∅, then by (Rockafellar 2015, Theorem 11.2) there is a hyperplane H which contains for some appropriate λ ∈ R m and α T u = λ T Au = λ T b is redundant for u ∈ M. Assume w.l.o.g. that b 1 > 0. Clearly if m = 1 we have a 1 ∈ int(K * ) so that for all u ∈ K we have a T 1 u > 0 and thus there is a λ ≥ 0 such that a T 1 (λu) = b 1 . This proves cone(M ∩ K) ⊇ K since o is in both cones. The converse is trivial.

Remark
The following procedure can be used to determine α and b 0 explicitly, given A ∈ R m×n , b ∈ R m \{o} and K ⊂ R n as in the existence result in Lemma 12. We first calculate (λ * , ε * ) = arg max (λ,ε)∈R m+1 ε : where a is a known element of int(K * ) (e.g. a = e if K is the non-negative orthant or a = e 1 if K is the second order cone). Note that Lemma 12 guarantees the existence of a feasible solution (λ, ε) such that A T λ ∈ int(K * ) = {y : v T y > 0 ∀v ∈ K \ {o}} and ε > 0. The maximization problem is equivalent to max (λ,ε,w)∈R m+2 {ε : w ≥ ε, A T λ − wa ∈ K * } by a standard duality argument (the dual variable w can be eliminated, which we avoided in order to preserve clarity). Finally we can set α = A T λ * and b 0 = α T x for any x ∈ M.
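The final step of this procedure can be illustrated on a toy instance. The sketch below takes K to be the non-negative orthant in R 3 and uses a hand-picked multiplier λ instead of solving the maximization problem, so λ, the data A, b and the helper names are all assumptions made for illustration. It then forms α = A T λ and b 0 = λ T b and checks that α T u = b 0 holds on sample points of M ∩ K.

```python
def transpose_times(A, lam):
    """Compute A^T lam for a matrix A stored as a list of rows."""
    return [sum(A[i][j] * lam[i] for i in range(len(A))) for j in range(len(A[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Toy data (assumption): M = {u : u1 + u2 = 1, u2 + u3 = 1}, K = non-negative orthant.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]

# Hand-picked multiplier (assumption); the text obtains lambda* from an
# optimization problem, which is not solved here.
lam = [1.0, 1.0]

alpha = transpose_times(A, lam)  # A^T lam
b0 = dot(lam, b)                 # equals alpha^T x for every x in M

# For the orthant, int(K*) is the set of strictly positive vectors.
alpha_is_interior = all(a > 0.0 for a in alpha)

# Points of M ∩ K have the form (t, 1 - t, t) with 0 <= t <= 1.
points = [[t, 1.0 - t, t] for t in (0.0, 0.25, 0.5, 1.0)]
print(alpha, b0, alpha_is_interior, [dot(alpha, u) for u in points])
```

The check confirms that the added equation α T u = b 0 is redundant on M while its normal vector lies in the interior of the dual cone.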
Finally, for any cone K ⊆ R n observe that and for any nonsingular A ∈ R n×n Now we are ready to prove the main result in this section.

Theorem 14
Let μ ∈ R, let K ⊆ R n be a closed cone with pointed convex hull and let H = {u : a T u = b} ⊂ R n be a hyperplane such that conv(K) ∩ H is compact and nontrivial (neither empty nor a singleton). Then there is a nonsingular matrix A of order n + 1 such that cone((K ∩ H) × {μ}) = A(K × {0}) and with by Equation (17)
The above theorem allows us to harness other results in quadratic optimization in order to characterize or at least approximate set-copositive constraints in (15). As an example we use the main result in Eichfelder and Povh (2013), which is a generalization of the celebrated result in Burer (2009), in the following theorem.
Theorem 15 Let U 0 := {u 0 ∈ K 0 : A 0 u 0 = b 0 , (u 0 ) j ∈ {0, 1} for all j ∈ I} ≠ {o} be a closed, bounded and nonempty set, where K 0 ⊆ R n 0 is a closed, convex and pointed cone with nonempty interior, A 0 ∈ R k 0 ×n 0 , b 0 ∈ R k 0 and I ⊆ [1 : n 0 ]. Further, let K := cone(U 0 ). Define for some appropriate matrices A, E and Q, vectors b, b 2 , α and numbers n y , n z and k (which can be formed with minimal effort). Then, the following inclusions hold:
Proof In order to apply (Eichfelder and Povh 2013, Theorem 10) we have to guarantee that a version of Burer's key condition holds for U 0 , namely Assume (u 0 ) j ≤ 1 is not implied for n y different j ∈ I and 0 ≤ (u 0 ) j is not implied for n z different j ∈ I (i.e. in the worst case n y = n z = |I|, in the best case both are zero). This forces us to introduce n y constraints (u 0 ) j + y j = 1, and n z constraints (u 0 ) j − z j = 0, where y ∈ R n y + and z ∈ R n z + are slack variables, which are necessary since all linear constraints have to be equality constraints. Note that all these additional constraints are redundant. Let n := n 0 + n y + n z and k := k 0 + n y + n z . We define b := (b 0 ; e; o) ∈ R k , where e ∈ R n y is the all-ones vector and o ∈ R n z . Further, u := (u 0 ; y; z) ∈ K × R n y +n z + and Q := (Q 0 , O; O, O) ∈ S n . Also we abbreviate by b 2 the vector whose i-th entry is given by b 2 i . Now let A ∈ R k×n be such that Au = b gives line by line the equalities A 0 u 0 = b 0 , (u 0 ) j + y j = 1 and (u 0 ) j − z j = 0. It is easy to check that for U := {u ∈ K 0 × R n y +n z + : Au = b , u j ∈ {0, 1} for all j ∈ I} we have inf u∈U u T Qu = inf u 0 ∈U 0 u T 0 Q 0 u 0 , that U is a compact set, and that K 0 × R n y +n z + is a closed, convex, pointed cone. Thus, by Lemma 12, we can add a redundant constraint α T u = β where α ∈ int((K 0 × R n y +n z + ) * ) to the definition of U. 
By redundancy and since y T α > 0 for all y ∈ K 0 × R n y +n z + \ {0}, we see that β > 0 so that after a scaling, the constraint can be written as α T u = 1. Then (Eichfelder and Povh 2013, Theorem 10) along with a straightforward adaptation of the reduction step outlined in (Burer 2012, Theorem 2) gives us the equivalence between the following two optimization problems where C := CPP(K 0 × R n y +n z + ), so that the latter problem has the following dual, abbreviating E = E |I| : where the dual variables (λ, μ, ν, δ) ∈ R 2k+|I|+1 and where λ and ν were subject to a change of variables from λ ⇒ 2λ and ν ⇒ 2ν to avoid fractions in the expression. Now, since α ∈ int (K 0 × R n y +n z + ) * we have αα T ∈ int COP(K 0 × R n y +n z + ), so that the dual problem has an obvious Slater point and indeed d * = q * . We are now ready to prove the inclusions in the theorem. Note that o ∉ U under the hypothesis of the theorem, hence p * > 0 if and only if Q ∈ int COP(K). In this case d * = p * > 0 implies that Q ∈ COP pr x (K). If Q ∈ COP pr x (K) then p * = d * ≥ 0 which implies that Q ∈ COP(K).
The argument in the proof is analogous to the general strategy outlined in Sect. 3, the difference being that the objective in the minimization problem is homogeneous. Also, we do not assume dual attainability, which is why the reverse inclusions cannot be established. The theorem highlights the fact that in the absence of dual attainability the feasible set shrinks, and it locates the points that are lost at the boundary of the set-copositive cone we sought to characterize. The reformulation results in Burer (2009), Eichfelder and Povh (2013) have been used in Mittal et al. (2019), Xu and Hanasusanto (2019) for the sake of reformulating semi-infinite constraints with quadratic dependency on the uncertainty vector. We want to point out that in these papers the problem of dual attainability was not sufficiently considered when taking uncertain constraints into account. In addition, these papers do not harness the reduction step, which by Lemma 12 is always applicable if the uncertainty set U is compact. The reduction step reduces the dimension of the set-copositive constraint in the copositive reformulation and at least opens the possibility for a primal Slater point to exist, which would never be the case if that reduction step was omitted, as was pointed out in (Burer 2012, Sect. 3). To better understand the situation, consider the example given there more carefully. It states that The first equality is due to the main result in Burer (2012) and the second one is the reduction step outlined in Sect. 3 of that paper. In fact, equality of the first and the last problem was already established at the origins of copositive programming in Bomze et al. (2002). Take any interior point X ∈ intCPP(R n + ). Since ee T ∈ intCOP(R n + ) one can rescale X such that e T Xe = 1 holds, and we see that the last minimization problem indeed has a primal Slater point. 
In fact, the same line of reasoning remains valid if we replace R n + by any convex cone K 0 and e by any α ∈ K * 0 and 1 by any positive number b. The only difference is that we then have to appeal to the results in Burer (2012) or Eichfelder and Povh (2013) for the reformulation step. In such cases we would have K = cone{x ∈ K 0 : α T x = b} = K 0 , i.e. X 0 would be a base of K 0 . The question is whether there are any other cases where we end up with a primal Slater point after applying the reduction step. Proposition 16 unfortunately gives a negative answer to this question: it states that a primal Slater point occurs exactly in the case where there is just a single equality constraint. Therefore, Theorem 15 is only an approximate result and does not give rise to an exact copositivity characterization in general. This unfortunate circumstance is at least partly redeemed by the fact that we lose at most some points at the boundary of the set.
(19)
Proof The "if" part is clear since then all constraints but α T Uα = 1 are redundant, αα T ∈ int CPP(K 0 ) * and 1 > 0 so that any interior point of CPP(K 0 ) can be scaled to fulfil the one remaining constraint. So assume there is a Slater point U ∈ int CPP(K 0 ) ⊆ int S n + . Then U 1/2 exists and is invertible. Set ã i := U 1/2 a i and α̃ := U 1/2 α; then the constraints read α̃ T α̃ = 1, ã T i ã i = b 2 i and α̃ T ã i = b i . The first and the second one imply ‖α̃‖ = 1 and ‖ã i ‖ = |b i |, but then, by the equality case of Cauchy-Schwarz, the third constraint implies ã i = β i α̃ for some β i ; we actually have b i = β i ‖α̃‖ 2 = β i and finally, by invertibility of U 1/2 , a i = b i α.

Adjustable robust reformulation of disjoint convex-non-convex quadratic optimization
So far, we have discussed how to use reformulations of QCQPs for the purpose of reformulating robust optimization problems. However, Zhen et al. (2019) outlined a way to use an adjustable robust approach in order to tackle disjoint bilinear optimization problems, which are special cases of QCQPs, thus demonstrating that the usefulness runs both ways. In this section we will show that the techniques we discussed in this text can be used to further their approach. The problem considered in Zhen et al. (2019) is given by:
The following theorem proved in that article establishes a connection between QCQPs and adjustable robust optimization.
Theorem 17 Let Y := {y ∈ R n y + : A y y ≥ b y }, where A y ∈ R m y ×n y and b y ∈ R m y . Then (21) has the same optimal value as
We will expand this approach to the case where in addition a possibly nonconvex quadratic term in x and a convex quadratic term in y are present in the objective. We will also slightly deviate from the definition of Y in that we will assume equality constraints. As a consequence, the function z(x) will not have a sign restriction, which will be useful in the proof of Theorem 19.
The following argument is a straightforward but tedious generalization of Theorem 17. For the readers' convenience we provide a detailed derivation.

Theorem 18
Let Q x ∈ S n 1 , Q xy ∈ R n 1 ×n 2 , F ∈ R k×n 2 and G ∈ R r ×n 2 . Further, assume X ⊆ R n 1 is a compact set and Y := {y ∈ R n 2 + : Fy = d} ⊆ R n 2 has a Slater point and let Z(x) := {(z, w) : F T z + G T w ≤ Q T xy x}. Then inf x∈X ,y∈Y where z : R n 1 → R k and w : R n 1 → R r are function-valued decision variables.
for all x ∈ X and for all y ∈ Y .
Note that the semi-infinite constraint in the last problem implies a minimization problem of the constraint function with respect to y explicitly given by inf y∈R n 2 + {x T Q xy y + ‖Gy‖ 2 : Fy = d}. By assumption there is a y ∈ int(R n 2 + ) such that Fy = d, thus (24) fulfils Slater's condition, so that zero duality gap and dual attainability are guaranteed for the dual given by The dual feasible set is given by Z(x). The following equations hold: We end up with an adjustable robust optimization problem with second stage variables z(x) and w(x) and uncertainty set X , i.e. the decision vector x takes the role of the uncertainty parameter.
The fact that z(x) and w(x) are function-valued variables is a major complication since the space of all functions is an intractable search space. The standard way to deal with this issue is to contract the search space so as to encompass only affine functions, i.e. z(x) = Zx + z and w(x) = Wx + w. The new decision variables then become the coefficients that identify the affine functions. If after the contraction step the robust constraints become linear in the uncertainty parameter, one can employ standard reformulation techniques based on linear conic duality in order to obtain a tractable reformulation of the semi-infinite constraints. However, at least one of the adjustable robust constraints in (23) depends quadratically on x rather than linearly. Thus, approximations based on linear duality cannot be employed here.
However, the general reformulation strategy allows us not only to deal with the terms x T Q x x in the adjustable robust reformulation (23), it also allows us to employ a quadratic decision rule for the adjustable variable z(x), i.e. (z(x)) j = x T Z j x+x T z j +z j , j ∈ [1 : k], a class of decision rules which contains affine decision rules as a special case. Thus, it can potentially yield tighter approximations than the ones obtainable by employing affine decision rules. On the other hand, w(x) is subject to a (convex) quadratic constraint. Consequently, an affine policy has to be employed so that w(x) = Wx + w, lest terms of order larger than 2 emerge. Since employing such decision rules is a restriction of the maximization problem, we will get a lower bound on the original minimization problem.
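To make the relation between the two classes of decision rules concrete, the following minimal sketch evaluates one component of a quadratic rule and shows that it collapses to an affine rule when the quadratic coefficient matrix is zero. The function names and the separate symbol z0 for the constant term are illustrative assumptions introduced here for readability.

```python
def quadratic_rule(Z, z, z0, x):
    """Evaluate one component z_j(x) = x^T Z x + z^T x + z0 of a quadratic decision rule."""
    n = len(x)
    quad = sum(x[i] * Z[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(zi * xi for zi, xi in zip(z, x))
    return quad + lin + z0

def affine_rule(z, z0, x):
    """Evaluate one component z_j(x) = z^T x + z0 of an affine decision rule."""
    return sum(zi * xi for zi, xi in zip(z, x)) + z0

x = [2.0, 3.0]
zero = [[0.0, 0.0], [0.0, 0.0]]

# With Z = 0 the quadratic rule is exactly the affine rule.
print(quadratic_rule(zero, [1.0, -1.0], 0.5, x), affine_rule([1.0, -1.0], 0.5, x))

# A nonzero Z adds curvature in x that no affine rule can reproduce.
print(quadratic_rule([[1.0, 0.0], [0.0, 0.0]], [1.0, -1.0], 0.5, x))
```

Because every affine rule is reachable by setting Z j = O, optimizing over quadratic rules can never give a worse bound than optimizing over affine ones.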

Theorem 19
Let the problem data be given as in Theorem 18. Assume that for any Q ∈ S n 1 , q ∈ R n 1 and ω ∈ R n it holds that where the inequality holds due to (30). But since for any pair of matrices A, B ∈ S n we have that x T Bx ≥ 0 ∀x ∈ X and A − B ∈ S n + imply x T Ax ≥ 0 ∀x ∈ X for any set X , we have shown that (28) holds as well.
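The implication used in the last step follows from a one-line decomposition, written out here for convenience (the symbols are those of the sentence above):

```latex
x^{\mathsf T} A x
  \;=\; \underbrace{x^{\mathsf T} B x}_{\ge 0 \ \text{for } x \in \mathcal{X}}
  \;+\; \underbrace{x^{\mathsf T} (A - B)\, x}_{\ge 0 \ \text{since } A - B \in \mathcal{S}^n_+}
  \;\ge\; 0
  \qquad \forall x \in \mathcal{X}.
```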

Experimental evidence
In the following we provide results from extensive numerical experiments, where we tried to assess the quality of the lower bound from Theorem 19 for the QCQP (22). We will pursue two main types of experiments. First, we will compare the quality of our proposed lower bound with lower bounds obtained from typical relaxations of the (set-)copositive reformulations established in Burer (2009), Burer (2012), Eichfelder and Povh (2013). Second, for the case where only the bilinear term is present in the objective function we will compare our lower bound to the one derived in Zhen et al. (2019). The main difference between the two approaches is that we, by means of the strategy outlined in Sect. 3, are able to employ quadratic decision rules while Zhen et al. (2019) employ affine policies. It should be noted that Xu and Hanasusanto (2019) provided empirical evidence for the advantage of quadratic over affine decision rules. However, we believe that the disjoint bilinear optimization problem is an interesting special case due to its applications in game theory, and that it is significantly different from the problems considered in Xu and Hanasusanto (2019), so that particular consideration is warranted. All experiments were implemented using the YALMIP interface. The semidefinite optimization problems were solved using SDPT3, while linear problems were solved using Gurobi. For the purpose of generating feasible solutions, and thus, upper bounds, we employed fmincon as a global solver. The experiments were run on a system with Intel Core i7-4510U CPU and 8GB RAM.

Comparison with lower bounds from copositive programming
When specifying X := {x ∈ K : Bx = c} ⊆ R n 1 for some convex cone K ⊆ R n 1 , B ∈ R k 1 ×n 1 , c ∈ R k 1 and Y := {y ∈ R n 2 + : Fy = d} as in Theorem 18, we obtain a QCQP that is amenable to copositive reformulations as demonstrated in Burer (2009). We will only consider the case where X and Y are bounded so that by Lemma 12 we can always add a redundant linear constraint α T (x T , y T ) T = 1 with α = (α T 1 , α T 2 ) T ∈ int(K * × R n 2 + ), which enables us to apply the simplification step outlined in (Burer 2012, Sect. 2.3). The resulting completely positive reformulation involves a conic constraint that restricts the decision matrix to CPP(K × R n 2 + ). This constraint is generally difficult (unless, for example, n 2 = 0 and K is the second order cone). For the purpose of our experiments we will resort to approximations which are standard in the literature. In case K = R n 1 + we will replace CPP(R n 1 +n 2 + ) by DNN n 1 +n 2 := S n 1 +n 2 + ∩ N n 1 +n 2 . In case K is the second order cone we will resort to which is an approximation described in Xu and Burer (2018), where the dual of O uter was used to approximate COP(K × R n + ). Note that by the S-Lemma, CPP(K) = {C ∈ S n + : C • J ≤ 0} where J = Diag(−1, 1, . . . , 1) ∈ S n and thus is a tractable matrix cone.
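Both tractable cones mentioned above admit simple numerical membership tests. The sketch below implements them in plain Python: the doubly non-negative cone as the intersection of the positive semidefinite and entrywise non-negative matrices, and the cone {C ∈ S n + : C • J ≤ 0} with J = Diag(−1, 1, . . . , 1). The function names and the tolerance-shifted Cholesky test for semidefiniteness are illustrative assumptions, not the implementation used in the experiments.

```python
import math

def is_psd(M, tol=1e-9):
    """Numerical PSD test: M + tol*I must admit a Cholesky factorization."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            shift = tol if i == j else 0.0
            s = M[i][j] + shift - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def in_dnn(M, tol=1e-9):
    """Doubly non-negative cone: positive semidefinite and entrywise non-negative."""
    return is_psd(M, tol) and all(v >= -tol for row in M for v in row)

def in_cpp_soc(C, tol=1e-9):
    """Membership in {C PSD : C • J <= 0} with J = Diag(-1, 1, ..., 1)."""
    c_dot_j = -C[0][0] + sum(C[i][i] for i in range(1, len(C)))
    return is_psd(C, tol) and c_dot_j <= tol

# Illustrative matrix (assumption): x x^T + 0.1*I with x = (2, 1, 1) in the second order cone.
C = [[4.1, 2.0, 2.0],
     [2.0, 1.1, 1.0],
     [2.0, 1.0, 1.1]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(in_dnn([[2.0, 1.0], [1.0, 2.0]]), in_dnn([[1.0, 2.0], [2.0, 1.0]]))
print(in_cpp_soc(C), in_cpp_soc(identity))
```

In an SDP model these conditions are imposed as constraints on the matrix variable rather than checked after the fact; the functions above only serve to verify candidate solutions.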
As for the lower bound in Theorem 19, the main assumption is fulfilled since for the implicit minimization problem inf x∈X x T Qx + q T x + ω we can again invoke Burer (2012). In consequence the matrix cone C * is given by COP(K), which is exactly characterized by COP(K) = {C : C + τ J ∈ S n + , τ ≥ 0} in case K is the second order cone. For each instance type, characterized by the size of n 1 , n 2 , m 1 , m 2 and the choice for K ∈ {R n 1 + , SOC}, which are indicated by the tuples (K, n 1 , n 2 , m 1 , m 2 ) in the first column, we randomly generated 100 instances. In Table 1 we give average values for the relative optimality gap (avg. gap), the maximum relative optimality gap (max. gap), the number of times a model gave a better solution (# best) and their respective average computation time (avg. time). We see that our model can outperform the copositive bound in cases where the dimension of Y is significantly larger than that of X and the number of constraints in the former is not too small. In these cases computation time seems to be lower for our model. In some of those cases the solution quality is also slightly better. For both models the average optimality gap was almost vanishing, indicating that random generation is not a good way to create challenging instances. However, finding special structure that guarantees hardness is a major task in and of itself, and we do not pursue it in this paper.
Remark It should be noted that for CPP(K × R n 2 + ), K ∈ {R n 1 + , SOC}, we could have also employed approximation hierarchies such as the ones described in Zuluaga et al. (2006), where a sequence of cones is provided that approximate CPP(K × R n 2 + ) with increasing precision and where the computational cost of the respective bounds increases accordingly. One could argue that a fairer comparison between the two lower bounds would have been one where we use those approximations for CPP(K × R n 2 + ) whose computational effort is similar to that of our proposed bound. However, these hierarchies are notoriously hard to work with and their computational complexity quickly explodes with precision. It is questionable whether we would have succeeded in fairly balancing computational effort.

Comparison with linear policy for the bilinear case
As shown in Zhen et al. (2019) we have inf (x,y)∈X ×Y x T Q xy y = sup τ,z(x) {τ : τ ≤ d T z(x), F T z(x) ≤ Q T xy x ∀x ∈ X } (32) where z(x) is a function-valued decision variable and Y is defined as above. If further X is defined as in Sect. 7.1 and we employ an affine decision rule, i.e. z(x) = Zx+z, we can reformulate the latter maximization problem via linear conic duality in a standard manner (see e.g. Ben-Tal et al. 2004) so as to obtain a lower bound. In contrast, in Theorem 19 we employ a quadratic decision rule for z(x) (w(x) does not appear if the convex quadratic term is dropped), which potentially tightens the bound. However, moving from a linear programming formulation or second order cone formulation to an SDP comes at a significant computational cost. In our experiment we aimed to quantify the benefit in order to see whether the trade-off is justified. Table 2 summarizes the results of our experiments. We again present the average relative optimality gap (avg. gap) to an upper bound computed using fmincon, the maximum relative optimality gap (max. gap) to that upper bound, the number of times our model outperformed the linear policy (# improved) and the average CPU-time (avg. time). All averages are taken across 100 replications per instance. We see that the linear policy is outperformed regularly, where the biggest advantages are achieved if m 2 is small. This makes sense, since the number of coefficients of the linear and the quadratic policy respectively depends on m 2 which in turn impacts the flexibility of these decision rules. Since the quadratic decision rule is inherently more flexible than the linear one, that impact is more pronounced for the latter. The computational cost for the benefit is however substantial, which is not surprising as SDPs are known to scale poorly unless special structure is exploited. Such special structure is not present in the generic framework under which we operate here. To explore cases where that would be the case is an interesting topic for future research.
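The structure of the reformulation (32) can be traced by hand on a toy instance. In the sketch below both X and Y are taken to be unit simplices (an assumption made purely for illustration, so F = e T and d = 1), in which case the constraint F T z(x) ≤ Q T xy x reads z(x) ≤ (Q T xy x) j for all j. The linear rule z(x) = w T x with w k = min j Q kj is then feasible, and the resulting bound coincides with the true optimum, which for a bilinear form over two simplices is the smallest entry of Q xy . All names are hypothetical.

```python
def affine_rule_bound(Q):
    """Bound from the rule z(x) = w^T x with w_k = min_j Q[k][j].
    Over the simplex, tau <= z(x) for all x gives tau = min_k w_k."""
    w = [min(row) for row in Q]
    return w, min(w)

def true_optimum(Q):
    """A bilinear form over two simplices attains its minimum at a pair of vertices."""
    return min(min(row) for row in Q)

def rule_is_feasible(Q, w, x, tol=1e-12):
    """Check the robust constraint z(x) <= (Q^T x)_j for every j at a given x."""
    zx = sum(wk * xk for wk, xk in zip(w, x))
    qtx = [sum(Q[k][j] * x[k] for k in range(len(x))) for j in range(len(Q[0]))]
    return all(zx <= v + tol for v in qtx)

# Toy data (assumption): X is the simplex in R^2, Y the simplex in R^3.
Q = [[3.0, -1.0, 2.0],
     [0.0, 4.0, 1.0]]

w, bound = affine_rule_bound(Q)
print(w, bound, true_optimum(Q))
print([rule_is_feasible(Q, w, x) for x in ([1.0, 0.0], [0.0, 1.0], [0.3, 0.7])])
```

On this instance the linear policy is already tight, so it says nothing about the gaps reported in Table 2; it only illustrates how a decision rule turns the inner dual into a robust constraint in x.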

Conclusion
In this paper we have outlined a general strategy by which robust optimization problems with quadratic uncertainty can be reformulated using convex reformulation results from quadratic optimization. These convex reformulations of QCQPs can be used as an alternative way to establish duality for quadratic optimization problems by providing a convex conic reformulation first and then invoking conic optimization duality. This is an alternative to the classical approach of establishing duality for QCQPs via the S-Lemma. We introduced a new result on these QCQP-reformulations and explored its connection to existing S-Lemma type results. We also explored a copositive perspective on the general strategy, which enabled us to investigate the effect of a failure of full strong duality, where we have zero duality gap but dual attainability is not guaranteed. We then introduced a new application of the general strategy, where we reformulate a special type of QCQP as an adjustable robust optimization problem, which after introduction of a quadratic policy becomes amenable to reformulations based on the general strategy. In numerical experiments we evaluated the merits of our model, finding that it may outperform existing approaches in some cases.