Abstract
We consider a quadratic programming problem with quadratic cone constraints and an additional geometric constraint. Under suitable assumptions, we establish necessary and sufficient conditions for optimality of a KKT point and, in particular, we characterize optimality by using strong duality as a regularity condition. We consider in detail the case where the feasible set is defined by two quadratic equality constraints and, finally, we analyse simultaneously diagonalizable quadratic problems, where the Hessian matrices of the involved quadratic functions are all diagonalizable by means of the same orthonormal matrix.
1 Introduction
We analyse a quadratic programming problem with general quadratic cone constraints and an additional geometric constraint. This problem has received attention in the literature in recent decades (see, e.g. [5, 7, 18, 20]), since it contains as particular cases several classic optimization problems, such as trust region problems, the standard quadratic problem and the max cut problem; moreover, it has many applications in robust optimization under matrix norm data uncertainty and in the fields of biology and economics [12].
In this paper, we are interested in establishing necessary or sufficient global optimality conditions for a point that fulfils the Karush–Kuhn–Tucker (KKT) conditions, or under the assumption of strong duality for the given problem. The general formulation of the considered quadratic programming problem allows us to treat simultaneously quadratic problems with one or more quadratic equality or inequality constraints and possibly additional constraints that can be included in the geometric one. This makes the analysis very general, particularly as regards the possibility of providing equivalent formulations and associating a dual problem with the given one. Our approach allows us to recover or generalize several known results in the literature [13, 14, 20].
The paper is organized as follows. In Sect. 2 we recall the main definitions and preliminary results that will be used throughout the paper. In Sect. 3 we characterize global optimality for a KKT point or in the presence of the property of strong duality on the given problem, and in Sect. 4 we consider in detail the case where the feasible set is defined by two quadratic equality constraints. In Sect. 5 we analyse a simultaneously diagonalizable quadratic problem (SDQP), where the Hessian matrices of the involved quadratic functions are all diagonalizable by means of the same orthonormal matrix S. The analysis previously developed allows us to provide suitable conditions that guarantee the existence of a convex reformulation of SDQP, improving some results stated in [15] in the presence of two quadratic inequality constraints.
2 Preliminary Results
Let us recall the basic notations and preliminary results that will be used throughout the paper. Given \(C\subseteq \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), \(\mathrm{co}~C\), \(\mathrm{int}~ C\), \(\mathrm{ri}~ C\), \(\mathrm{cl}~{C}\), \(\mathrm{span}~{C}\), denote the convex hull of C, the topological interior of C, the relative interior, the closure of C and the smallest vector linear subspace containing C, respectively. C is said to be a cone if \(tC\subseteq C\), \(\forall ~t\ge 0\). A convex cone C is called pointed if \(C\cap (-C)=\{0\}\). We define \( \mathrm {cone}~ C:=\bigcup _{t\ge 0}tC\).
We set \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m_+:= \{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m:x\ge 0\}\). If C is a convex set, the normal cone to C at \({\bar{x}}\in C\) is defined by \(N_C({\bar{x}}):=\{\xi \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:\langle \xi ,x-\bar{x}\rangle \le 0,~~\forall ~x\in C\}\).
The positive polar of a set \(C\subseteq \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) is defined by \(C^*:=\{y^*\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:\langle y^*,x\rangle \ge 0, \ \forall x\in C\}.\) It is well known that
\(C^\perp :=\{v\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~v^\top x=0,~\forall ~x\in C\}\) is the orthogonal subspace to the set C.
The contingent cone \(T(C;\bar{x})\) of C at \(\bar{x}\in C\) is the set of all \(v \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that there exist sequences \((x_k,t_k)\in C\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits _+\) with \(x_k\rightarrow \bar{x}\) and \(t_k(x_k-\bar{x})\rightarrow v\).
Let \(P\subseteq \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\) be a convex cone and \(C\subseteq \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) a convex set. A function \(f:\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\rightarrow \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\) is said to be P-convex on C if for every \(x_1,x_2\in C\) and for every \(\lambda \in [0,1]\),
\[f(\lambda x_1+(1-\lambda )x_2)\in \lambda f(x_1)+(1-\lambda )f(x_2)-P.\]
For \(m=1\) and \(P=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits _+\), we recover the classic definition of a convex function. It is known that if f is P-convex on C, then the set \(f(C)+P\) is convex.
In the paper we will use the following preliminary results.
Let \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:g(x)=0\}\), where \(g:\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\rightarrow \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \). Then, we get [6]
and so \([T(C;\bar{x})]^*=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \nabla g(\bar{x})\); whereas if \(g(x)\doteq \dfrac{1}{2}x^\top Bx+b^\top x+\beta \) is a quadratic function, with B being a real symmetric matrix of order n, \(b\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) and \(\beta \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \), then
A symmetric matrix B is positive semidefinite on C, if \(x^\top Bx\ge 0,~\forall ~x\in C.\)
Lemma 2.1
( [18, Lemma 3.10]) Assume that B is an indefinite real symmetric matrix and set \(Z:=\{v\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~v^\top Bv=0\}\). Then
3 The General Case with Cone Quadratic Constraints
Let us consider the problem
where P is a convex cone in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\), \(g(x):= ( g_1(x),\ldots ,g_m(x))\) and \(f,g_i:\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\rightarrow \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ,~i=1,\ldots ,m\) are quadratic functions, \(C\subseteq \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\),
\[f(x):= \dfrac{1}{2}x^\top Ax+a^\top x+\alpha ,\qquad g_i(x):= \dfrac{1}{2}x^\top B_ix+b_i^\top x+\beta _i,\qquad (5)\]
with \(A,~B_i\) being real symmetric matrices; \(a,~b_i\) being vectors in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) and \(\alpha ,~\beta _i\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) for \(i=1,\ldots ,m\). \( K:= \{x\in C:~g(x)\in -P\}\) is the feasible set of (4). We associate with (4) the Lagrangian function \( L(\lambda , x)\doteq f(x)+\displaystyle \sum _{i=1}^m\lambda _ig_i(x)\) and its dual problem
We say that strong duality holds for (4), if there exists \(\lambda ^*\in P^*\) such that
In case (4) admits an optimal solution \({\bar{x}}\in K\), then the previous condition is equivalent to
Under suitable assumptions on the cone \(T(C;{\bar{x}})\), we first establish three general results: the first and the second consider the case where \({\bar{x}}\) is a KKT point and provide a sufficient optimality condition and a characterization of its optimality in the case where \(P=\{0\}^m\), respectively, while the third one characterizes optimality under the assumption of strong duality.
Proposition 3.1
Let f, \(g_1,\ldots ,g_m\) be quadratic functions as above. Assume that \({\bar{x}}\in K\) is a KKT point for (4), i.e. there exists \(\lambda ^*\in P^*\) such that
and, additionally, \((K-{\bar{x}})\subseteq \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\). Then the following assertion holds.
If \(\nabla _x^2L(\lambda ^*,\bar{x})\) is positive semidefinite on \(K-\bar{x}\), then \(\bar{x}\) is a (global) optimal solution for problem (4).
Proof
By (8), \(\nabla _xL(\lambda ^*,{\bar{x}})^\top v\ge 0\), for every \(v\in T(C;{\bar{x}})\), and by (1) we obtain
The assumptions imply that
We note that, since the involved functions are quadratic, the following equality holds:
Exploiting (10) and (9), for every \(x\in K\), we get
By the previous inequalities, the assertion follows. \(\square \)
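The proof above relies on the fact that a quadratic Lagrangian coincides with its second-order expansion around \({\bar{x}}\). The following is a minimal numerical sketch of that identity, assuming functions of the form \(\frac{1}{2}x^\top Mx+c^\top x+k\); the helper `quad`, the random data and the multiplier value `lam` are hypothetical and serve only as an illustration.

```python
import numpy as np

def quad(M, c, k):
    """Return q(x) = 0.5 x^T M x + c^T x + k, its gradient and its constant Hessian."""
    M = 0.5 * (M + M.T)  # symmetrize the (hypothetical) random matrix
    value = lambda x: 0.5 * x @ M @ x + c @ x + k
    grad = lambda x: M @ x + c
    return value, grad, M

rng = np.random.default_rng(0)
n = 4
f, grad_f, A = quad(rng.standard_normal((n, n)), rng.standard_normal(n), 0.3)
g, grad_g, B = quad(rng.standard_normal((n, n)), rng.standard_normal(n), -1.0)
lam = 0.7  # illustrative multiplier, not derived from any KKT system

L = lambda x: f(x) + lam * g(x)               # Lagrangian with one constraint
grad_L = lambda x: grad_f(x) + lam * grad_g(x)
hess_L = A + lam * B

x_bar = rng.standard_normal(n)
x = rng.standard_normal(n)
v = x - x_bar

# For quadratic data the second-order expansion is exact (no remainder term).
lhs = L(x) - L(x_bar)
rhs = grad_L(x_bar) @ v + 0.5 * v @ hess_L @ v
gap = abs(lhs - rhs)
```

Since the expansion has no remainder, the sign of \(L(\lambda ^*,x)-L(\lambda ^*,{\bar{x}})\) is determined by the first-order term and by the quadratic form of the Hessian on \(x-{\bar{x}}\), which is the mechanism used in the proof.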
Remark 3.2
Proposition 3.1 is related to Theorem 2.1 in [4] when applied to a quadratic problem. Indeed, Theorem 2.1 in [4] requires that K is a convex set and \(C:=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), which guarantees that the condition \((K-{\bar{x}})\subseteq \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\) is fulfilled.
Proposition 3.3
Let f, \(g_1,\ldots ,g_m\) be quadratic functions as above, let \(P:=\{0\}^m\) and \({\bar{x}}\in K\). Assume that
and that \({\bar{x}}\) is a KKT point for (4), i.e. there exists \(\lambda ^*\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\) such that
Then the following conditions are equivalent:
- (a) \(\bar{x}\) is an optimal solution for the problem (4);
- (b) \(\nabla _x^2L(\lambda ^*,\bar{x})\) is positive semidefinite on \(K-\bar{x}\) and so on \(\mathrm{cl}~\mathrm{cone}(K-{\bar{x}})\).
Proof
By (12), \(\nabla _xL(\lambda ^*,{\bar{x}})^\top v\ge 0\), for every \(v\in T(C;{\bar{x}})\) and by (1) we get,
The second inclusion in (11) implies that
and, by the first inclusion in (11), \(\nabla _x L(\lambda ^*,{\bar{x}})^\top v = 0\), for every \(v\in (K-{\bar{x}})\).
By (10) and (13), for every \(x\in K\), we get
By the previous equalities, the equivalence between (a) and (b) follows. \(\square \)
Remark 3.4
Note that the second inclusion in assumption (11) is not needed for proving that (b) implies (a), as shown by Proposition 3.1.
In the following proposition we characterize optimality under the strong duality property, which can be regarded as a regularity condition guaranteeing the fulfilment of the KKT conditions.
Proposition 3.5
Let f, \(g_1,\ldots ,g_m\) be quadratic functions as above, let \({\bar{x}}\in K\), and assume that
Then the following assertions are equivalent:
- (a) \(\bar{x}\) is an optimal solution for the problem (4) and strong duality holds;
- (b) there exists \(\lambda ^*\in P^*\) such that (8) is fulfilled and \(\nabla _x^2L(\lambda ^*,\bar{x})\) is positive semidefinite on \(C-\bar{x}\).
Proof
Assume that (a) holds, or equivalently there exists \(\lambda ^*\in P^*\) such that (7) is fulfilled. Then,
\( L(\lambda ^*,{\bar{x}})\le L(\lambda ^*, x)\), for every \(x\in C\), implies that \(\nabla _xL(\lambda ^*,{\bar{x}})^\top v\ge 0\), for every \( v\in T(C;{\bar{x}})\) and, consequently,
The assumption (15) yields \(\nabla _x L(\lambda ^*,{\bar{x}})^\top v= 0\), for every \( v\in \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\) and, in turn,
From (10) we have
and (b) follows.
Conversely, if (b) holds, then (8) implies (16) and, consequently, (17).
From (10) we have
and, taking into account that \(\langle \lambda ^*,g({\bar{x}})\rangle =0\), (a) follows. \(\square \)
Remark 3.6
We note that, for the implication \((b)\Rightarrow (a)\) in Proposition 3.5, the second inclusion in (15) is not needed: indeed, by (8) we have \(\nabla _x L(\lambda ^*,{\bar{x}})^\top (x-{\bar{x}}) \ge 0, \ \forall x\in C\) and \(\langle \lambda ^*,g({\bar{x}})\rangle =0\), so that (10) allows us to prove (a).
Remark 3.7
Condition (15) is fulfilled under the following circumstances:
- (i) \({\bar{x}}\in \mathrm{int}\,C\);
- (ii) C is defined by linear equalities, i.e. \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:H x=d\}\), \(H\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^{p\times n}\), \(d\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^p\);
- (iii) \( C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:h(x)=0\}\), where h is a quadratic function with \(\nabla h({\bar{x}})=0\) and \(H:=\nabla ^2 h({\bar{x}})\) is indefinite. In this case \(T(C;{\bar{x}})=C-{\bar{x}}=\{v\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~v^\top Hv=0\}\), which is a consequence of Lemma 3.1, proved below. By Lemma 2.1, \(\mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\).
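As a small illustration of case (iii), one may take \(n=2\), \(h(x):=\frac{1}{2}(x_1^2-x_2^2)\) and \({\bar{x}}:=0\); these data are hypothetical and chosen only for the example. The sketch below checks by hand that the convex hull of \(T(C;{\bar{x}})\) is the whole space.

```latex
H=\begin{pmatrix}1&0\\0&-1\end{pmatrix},\qquad
T(C;\bar{x})=\{v\in\mathbb{R}^2:~v_1^2=v_2^2\}
            =\mathbb{R}(1,1)^\top\cup\mathbb{R}(1,-1)^\top .
% Each signed unit vector is a midpoint of two elements of T(C;\bar{x}):
\pm e_1=\tfrac12\bigl[\pm(1,1)^\top\pm(1,-1)^\top\bigr],\qquad
\pm e_2=\tfrac12\bigl[\pm(1,1)^\top\mp(1,-1)^\top\bigr].
% T(C;\bar{x}) is a cone, so its convex hull is a convex cone containing \pm e_1, \pm e_2:
\mathrm{co}~T(C;\bar{x})=\mathbb{R}^2 .
```

In this instance the set \(C-{\bar{x}}\) consists of two lines only, yet its convex hull fills \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2\), so both inclusions required in (15) hold trivially.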
Lemma 3.1
Let \(g_i\) be defined as in (5), for \(i=1,\ldots ,m\). Assume that \(\bar{x}\in A:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:g_i(x)=0, i=1,\ldots ,m\}\) and set \(Z_i(\bar{x}):=\{v\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~\nabla g_i(\bar{x})^\top v + \dfrac{1}{2}v^\top B_iv=0\}\), for \( i=1,\ldots ,m\). Then,
Proof
Let \(i\in \{1,\ldots ,m\}\). Let \(v\in Z_i(\bar{x})\), then \(g_i(v+\bar{x})=g_i(\bar{x})+\nabla g_i(\bar{x})^\top v+\dfrac{1}{2}v^\top B_iv=0\), proving that \(v+\bar{x}\in \{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:g_i(x)=0\}\). Therefore, \( \bigcap _{i=1}^mZ_i(\bar{x})\subseteq A-{\bar{x}}\).
For the other inclusion, take any \(x\in A\). Then
which implies \(x-{\bar{x}}\in \bigcap _{i=1}^mZ_i(\bar{x})\) . \(\square \)
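The two inclusions of Lemma 3.1 can be checked numerically on a small instance. The sketch below assumes hypothetical data in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^3\): \(g_1(x)=x^\top x-1\) (unit sphere) and \(g_2(x)=x_3\) (a plane), so that the feasible set is the equator; the function names are illustrative.

```python
import numpy as np

# Constraints in the form g_i(x) = 0.5 x^T B_i x + b_i^T x + beta_i (hypothetical data).
B = [2.0 * np.eye(3), np.zeros((3, 3))]          # g_1(x) = x^T x - 1, g_2(x) = x_3
b = [np.zeros(3), np.array([0.0, 0.0, 1.0])]
beta = [-1.0, 0.0]

def g(i, x):
    return 0.5 * x @ B[i] @ x + b[i] @ x + beta[i]

def z_residual(i, x_bar, v):
    """Left-hand side of the equation defining Z_i(x_bar)."""
    grad = B[i] @ x_bar + b[i]
    return grad @ v + 0.5 * v @ B[i] @ v

x_bar = np.array([1.0, 0.0, 0.0])                # feasible: g_1 = g_2 = 0

# Feasible set minus x_bar is contained in every Z_i: sample points on the equator.
angles = np.linspace(0.0, 2.0 * np.pi, 50)
max_res = max(
    abs(z_residual(i, x_bar, np.array([np.cos(t), np.sin(t), 0.0]) - x_bar))
    for t in angles for i in range(2)
)

# Conversely, v = (-1, 1, 0) lies in Z_1 and Z_2, and x_bar + v is feasible.
v = np.array([-1.0, 1.0, 0.0])
in_Z = all(abs(z_residual(i, x_bar, v)) < 1e-12 for i in range(2))
feasible = all(abs(g(i, x_bar + v)) < 1e-12 for i in range(2))
```

The check is only a sanity test on sampled points; the lemma itself follows from the exact expansion of each \(g_i\) around \({\bar{x}}\).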
Remark 3.6 leads to the following result.
Corollary 3.8
Let f, \(g_1,\ldots ,g_m\) be quadratic functions as above, let \({\bar{x}}\in K\), and assume that \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:g_i(x)\le 0, i=m+1,\ldots ,p\}\), where \(g_i\) are convex functions, for \(i=m+1,\ldots ,p\).
If there exists \(\lambda ^*\in P^*\) such that (8) is fulfilled and \(\nabla _x^2L(\lambda ^*,\bar{x})\) is positive semidefinite on \(C-\bar{x}\), then \(\bar{x}\) is an optimal solution for the problem (4) and strong duality holds.
Proof
By Proposition 3.5 and taking into account Remark 3.6, it is enough to prove that \((C- {\bar{x}})\subseteq \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\). The convexity of the functions \(g_i\), \(i=m+1,\ldots ,p\), yields that C is convex.
Since C is convex then \(T(C;{\bar{x}})=\mathrm{cl}~\mathrm{cone}(C-{\bar{x}})\) which implies \((C- {\bar{x}})\subseteq \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\) (see, e.g. [2]). \(\square \)
All the results obtained so far generalize optimality conditions for classical quadratic programming to a quadratic problem with cone constraints and a geometric constraint set. We now present suitable particular cases where our results allow us to recover and generalize known optimality conditions.
We first consider the quadratic programming problem with bivalent constraints (QP1) defined by
where \(K:= \{x\in C: g_i(x):= x^\top B_ix+2b_i^\top x+\beta _i =0,~i=1,\ldots ,m, \ g_{m+j}(x):= x^\top E_{m+j}x -1 =0,~j=1,\ldots ,n\}\), \(E_{m+j}=\mathrm{diag}(e_j)\) and \(e_j\) is the vector in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) whose \(j\)th element is equal to 1 and all the other entries are equal to 0.
Let \(L(\lambda ,\gamma ,x):= f(x)+\sum _{i=1}^m\lambda _ig_i(x)+\sum _{j=1}^n\gamma _jg_{m+j}(x),\) be the Lagrangian function associated with (QP1).
By Proposition 3.3 and Lemma 3.1 we recover Lemma 3.1 of [14] which can be stated as follows.
Proposition 3.9
Let \(C:=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) and \({\bar{x}}\in K\). Assume that there exist \(\lambda \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\) and \(\gamma \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that \(\nabla _xL(\lambda ,\gamma ,\bar{x})=0\) . Then \({\bar{x}}\) is an optimal solution for (QP1) if and only if \(\nabla _x^2L(\lambda ,\gamma ,\bar{x})\) is positive semidefinite on \(Z({\bar{x}})\) defined by (18).
Proof
It is enough to notice that since \(C=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), then, by Lemma 3.1, \(Z({\bar{x}})= K-{\bar{x}}\) and, moreover, (11) is fulfilled. Proposition 3.3 allows us to complete the proof. \(\square \)
The next result, which follows from Proposition 3.5, is inspired by Theorem 3.1 of [14] and provides a characterization and a sufficient condition for strong duality for (QP1).
Proposition 3.10
Let \({\bar{x}}\in K\) with \(C:=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\). Consider the following assertions:
- (a) \(\bar{x}\) is an optimal solution for (QP1) and strong duality holds;
- (b) there exist \(\lambda \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\) and \(\gamma \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that \(\nabla _xL(\lambda ,\gamma ,\bar{x})=0\) and \(\nabla _x^2L(\lambda ,\gamma ,\bar{x})\) is positive semidefinite;
- (c) \(A-\mathrm{diag}({\bar{X}}A{\bar{x}} +{\bar{X}}a)\) is positive semidefinite, where \({\bar{X}}:= \mathrm{diag}({\bar{x}}_1,\ldots ,{\bar{x}}_n)\).
Then \((c)\Rightarrow (b)\Leftrightarrow (a)\).
Proof
\((b)\Leftrightarrow (a)\): it follows from Proposition 3.5 with \(C:=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), \(K:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:g_i(x)=0, i=1,\ldots ,m+n\}\), \(P:=\{0\}^{m+n}\).
\((c)\Rightarrow (b)\); in the proof of Theorem 3.1 of [14] it is shown that, for any feasible point \({\bar{x}}\), the condition \( \nabla _xL(\lambda ,\gamma ,\bar{x})=0\) is fulfilled with \(\lambda := (0,\ldots ,0)^\top \) and \(\gamma := ({\bar{X}}A{\bar{x}} +{\bar{X}}a)\) and, moreover, for such \(\lambda \) and \(\gamma \), \(\nabla _x^2L(\lambda ,\gamma ,\bar{x})=A-\mathrm{diag}(\bar{X}A{\bar{x}} +{\bar{X}}a)\). Therefore, if (c) holds, then (b) is fulfilled and so is (a), by the previous part of the proof. \(\square \)
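Condition (c) is easy to test numerically. The following minimal sketch assumes an objective of the form \(f(x)=\frac{1}{2}x^\top Ax+a^\top x\) and only the bivalent constraints \(x_j^2=1\) (that is, \(m=0\)); the function names and the \(2\times 2\) data are hypothetical. It compares the certificate with a brute-force enumeration of \(\{-1,1\}^n\).

```python
import itertools
import numpy as np

def bivalent_certificate(A, a, x_bar, tol=1e-9):
    """Check whether A - diag(X_bar A x_bar + X_bar a) is positive semidefinite."""
    d = x_bar * (A @ x_bar + a)          # diagonal entries of X_bar (A x_bar + a)
    M = A - np.diag(d)
    return np.linalg.eigvalsh(M).min() >= -tol

def brute_force_min(A, a):
    """Minimize 0.5 x^T A x + a^T x over {-1, 1}^n by enumeration (small n only)."""
    n = len(a)
    f = lambda x: 0.5 * x @ A @ x + a @ x
    return min(f(np.array(x)) for x in itertools.product([-1.0, 1.0], repeat=n))

A = np.array([[0.0, -1.0], [-1.0, 0.0]])   # f(x) = -x_1 x_2
a = np.zeros(2)
x_bar = np.array([1.0, 1.0])

certified = bivalent_certificate(A, a, x_bar)          # condition (c) at x_bar
f_bar = 0.5 * x_bar @ A @ x_bar + a @ x_bar
f_min = brute_force_min(A, a)

# At the non-optimal point (1, -1) the certificate fails, as expected.
rejected = not bivalent_certificate(A, a, np.array([1.0, -1.0]))
```

The certificate is only sufficient: a feasible point can be optimal without satisfying (c), in which case the check returns `False` although the point is a minimizer.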
Conditions (11) and (15) in general are not fulfilled for a problem with bivalent constraints.
Example 3.11
Let \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2:x_1^2=1\}\), \(K:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2:x_1^2=1, x_2^2=1\}\), \({\bar{x}}=(1,1)\in K\). Then, \(T(C;{\bar{x}})= \{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2:x_1=0\}= \mathrm{cl\, co}~T(C;\bar{x})\),
This also implies that \(C-{\bar{x}} \not \subseteq \mathrm{cl}~\mathrm{co}~T(C;{\bar{x}})\) so that Propositions 3.3 and 3.5 in general cannot be applied to problem (QP1).
Let us make some further comparison with the literature; until the end of this section we assume that \(f,g_i,~i=1,\ldots ,m,\) are quadratic functions defined as in (5). According to Remark 3.7, the following results are all particular cases of Proposition 3.5.
Corollary 3.12
([13] Theorem 2.1, [20] Theorem 1) Consider the problem
where \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n: H x=d\}\), H is a \((p\times n) \) matrix, and let \({\bar{x}}\) be feasible for (20).
The following assertions are equivalent:
- (a) \({\bar{x}}\) is an optimal solution and strong duality holds for (20);
- (b) there exists \(\lambda ^*\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m_+\) such that \(\nabla _x L(\bar{x},\lambda ^*)\in H^\top (\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^p)\), \(\lambda ^*_ig_i(\bar{x})=0,~i=1,\ldots ,m\), and \(\nabla ^2_x L(\bar{x},\lambda ^*)\) is positive semidefinite on \(\mathrm{Ker}~H\).
Consequently, when \(C:= \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), then (b) reduces to the following:
- \((b')\) there exists \(\lambda ^*\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m_+\) such that \(\nabla _x L(\bar{x},\lambda ^*)=0\), \(\lambda ^*_i g_i(\bar{x})=0,~i=1,\ldots ,m\) and \(\nabla ^2_x L(\bar{x},\lambda ^*)\) is positive semidefinite.
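Condition \((b')\) can be illustrated on a trust-region-type instance with a nonconvex objective. The sketch below assumes \(C=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2\), one inequality constraint \(g(x)=\frac{1}{2}x^\top x-\frac{1}{2}\le 0\) and hypothetical data `A`, `a`; the candidate point and multiplier were computed by hand for this example.

```python
import numpy as np

A = np.diag([-2.0, 1.0])                 # indefinite Hessian of f
a = np.array([1.0, 0.0])
f = lambda x: 0.5 * x @ A @ x + a @ x
g = lambda x: 0.5 * (x @ x) - 0.5        # g(x) <= 0 is the unit disc; Hessian is I

x_bar = np.array([-1.0, 0.0])            # candidate point on the boundary
lam = 3.0                                # candidate multiplier, lam >= 0

# The three requirements of (b'): stationarity, complementarity, PSD Hessian.
stationary = np.allclose((A + lam * np.eye(2)) @ x_bar + a, 0.0)
complementary = abs(lam * g(x_bar)) < 1e-12
hessian_psd = np.linalg.eigvalsh(A + lam * np.eye(2)).min() >= 0.0

# Sanity check of global optimality on random feasible points.
rng = np.random.default_rng(1)
pts = rng.uniform(-1.0, 1.0, size=(20000, 2))
pts = pts[np.einsum("ij,ij->i", pts, pts) <= 1.0]
sample_min = min(f(p) for p in pts)
```

Although \(f\) is not convex, the Hessian of the Lagrangian \(A+\lambda ^*I\) is positive semidefinite, which is what certifies global optimality of \({\bar{x}}\) here; the sampling step is only an empirical confirmation.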
4 The Case with Two Quadratic Equality Constraints
In this section we analyse in detail a quadratic problem with two quadratic equality constraints defined by
where \(f,g_i,~i=1,2\) are quadratic functions defined as in (5).
Let \(K:=\{x\in {\mathbb {R}}^n:~g_1(x)=0, ~g_2(x)=0\}\).
The standard Lagrangian associated with (21) \(L_S:\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\longmapsto {\mathbb {R}}\) is given by
The following result is a consequence of Proposition 3.3.
Proposition 4.1
Let f, \(g_1,g_2\) be defined as above, let \({\bar{x}}\in K\) be a KKT point for (21), i.e. there exist \(\lambda _1, \lambda _2\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) such that \(\nabla f({\bar{x}}) +\lambda _1\nabla g_1({\bar{x}}) + \lambda _2\nabla g_2 ({\bar{x}}) =0.\)
Then the following conditions are equivalent:
- (a) \(\bar{x}\) is an optimal solution for (21);
- (b) \(A+\lambda _1B_1 +\lambda _2B_2\) is positive semidefinite on \(K-\bar{x}\).
If, additionally, \(\nabla g_2({\bar{x}})=0\) then (b) is equivalent to:
- (b1) \(A+\lambda _1B_1\) is positive semidefinite on \(K-\bar{x}\).
Proof
The equivalence between (a) and (b) follows from Proposition 3.3 where we set \(C:={\mathbb {R}}^n\). Assume now that \(\nabla g_2({\bar{x}})=0\). The equality \(g_2(x)-g_2({\bar{x}})= \nabla g_2({\bar{x}})^\top (x-{\bar{x}})+ \dfrac{1}{2}(x-\bar{x})^\top \nabla ^2 g_2({\bar{x}})(x-\bar{x})\) yields \((x-\bar{x})^\top B_2(x-\bar{x})=0\), \(\forall x\in K\). Therefore, \(\nabla _x^2L_S(\lambda _1,\lambda _2,\bar{x})=A+\lambda _1B_1+\lambda _2B_2\) is positive semidefinite on \(K-\bar{x}\) if and only if (b1) holds. \(\square \)
In the following we set \(C:=\{x\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~g_2(x)=0\}\), so that \(K=\{x\in C:~g_1(x)=0\}\). The dual problem and the standard dual problem associated with (21) are, respectively, defined by:
We say that standard strong duality (SSD) holds for problem (21) if \(\mu =\nu _S\) and problem (23) admits a solution. It is easy to check that \(\nu _S\le \nu \le \mu .\)
Theorem 4.1
Let \(\bar{x}\in K\) be feasible for (21) and suppose that \(\mu \in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \).
- (a) Assume that \(\nabla g_2(\bar{x})\ne 0\). Then the following assertions are equivalent:
  - (a1) \(\bar{x}\) is an optimal solution and strong duality holds for problem (21);
  - (a2) \(\exists ~\lambda _1,~\lambda _2\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) such that \(\nabla _x L_S(\lambda _1,\lambda _2,{\bar{x}})=0\) and \(A+\lambda _1B_1 +\lambda _2B_2\) is positive semidefinite on \(C-\bar{x}\) (and so on \(\mathrm{cl \, cone}(C-{\bar{x}})\)).
- (b) Assume that \(\nabla g_2(\bar{x})=0\) and that \(B_2\) is positive (or negative) semidefinite. Then, (a1) is equivalent to
  - (b1) \(\exists ~ \lambda _1\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) and \(\exists y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) s.t. \(\nabla f({\bar{x}})+\lambda _1\nabla g_1({\bar{x}})+B_2y=0\) and \(A+\lambda _1B_1\) is positive semidefinite on \(\mathrm{ker}~B_2\).
- (c) Assume that \(\nabla g_2(\bar{x})=0\) and that \(B_2\) is indefinite. Then, (a1) is equivalent to
  - (c1) \(\exists ~ \lambda _1\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) s.t. \(\nabla f({\bar{x}})+\lambda _1\nabla g_1({\bar{x}})=0\) and \(A+\lambda _1B_1\) is positive semidefinite on \(C-{\bar{x}}\) (and so on \(\mathrm{cl \, cone}(C-{\bar{x}})\)).
Proof
(a): \((a1)\Rightarrow (a2)\). By assumption there exists \(\lambda _1\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) such that
Thus, \(\nabla f(\bar{x})+\lambda _1\nabla g_1(\bar{x})\in [T(C;\bar{x})]^*\). Since \(\nabla g_2({\bar{x}})\not =0\), by (2) we get \([T(C;{\bar{x}})]^*=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \nabla g_2({\bar{x}})\). Hence, there exists \(\lambda _2\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) satisfying \(\nabla _xL_S(\lambda _1,\lambda _2,{\bar{x}})=0\). Then, for every \(x\in C\),
This proves our claim. The previous equalities also show \((a2)\Rightarrow (a1)\).
(b): \((a1)\Rightarrow (b1)\). By assumption there exists \(\lambda _1\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) such that (24) holds. Thus, \(\nabla f(\bar{x})+\lambda _1\nabla g_1(\bar{x})\in [T(C;\bar{x})]^*\). Since \(\nabla g_2({\bar{x}})=0\), then (3) yields \(T(C;\bar{x})=\{v\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n:~v^\top B_2v=0\}\) and, since \(B_2\) is positive or negative semidefinite, then \(T(C;{\bar{x}})=\mathrm{ker}~B_2=Z_2({\bar{x}})=C-\bar{x}\), where the last equality is due to Lemma 3.1. Thus we can choose \(y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that
Then, from (24) and for all \(x\in C\) (which means \(g_2(x)=0\)), it follows that
Notice that \((B_2y)^\top (x-{\bar{x}})=0\), since \(\mathrm{ker}~B_2=C-\bar{x}\). These chains of equalities also show that \((b1)\Rightarrow (a1)\).
(c): \((a1)\Rightarrow (c1)\). By the above discussion, \(T(C;\bar{x})=Z_2({\bar{x}})\). Lemma 2.1 yields \([T(C;{\bar{x}})]^*=(\mathrm{co}~Z_2({\bar{x}}))^* =\{0\}\), which implies that \(\nabla f(\bar{x})+\lambda _1\nabla g_1(\bar{x})=0\). By using the relation (25), one concludes that \(A+\lambda _1B_1\) is positive semidefinite on \(C-{\bar{x}}\). The same relation allows us to prove that \((c1)\Rightarrow (a1)\). \(\square \)
Necessary or sufficient optimality conditions for a quadratic problem with two quadratic inequality constraints have been obtained in [1, 18]. To the best of our knowledge, Theorem 4.1 is a new characterization of strong duality for a quadratic problem with two quadratic equality constraints.
5 Simultaneously Diagonalizable Quadratic Problems
In this section we characterize strong duality for a simultaneously diagonalizable quadratic problem with quadratic cone constraints, providing conditions that guarantee the existence of a convex reformulation. Our results generalize those obtained in [15] where two quadratic inequality constraints are considered under the assumption that the classic Slater condition is fulfilled.
Consider problem (4) and assume that the matrices A and \(B_i\), \(i=1,\ldots ,m\), are simultaneously diagonalizable, i.e. there exists an orthonormal matrix S of order n such that \(S^\top AS=D_0\), \(S^\top B_iS=D_i\), \(S^\top S=I\), where \(D_i\) are diagonal; we set \(D_i=\mathrm{diag}(\gamma _i)\), \(\gamma _i:=(\gamma _{i1},\ldots ,\gamma _{in})^\top \), \(i=0,1,\ldots ,m\).
We refer to [3] for an extensive description of the applications of this problem.
Setting \(y=S^\top x\), problem (4) can be written as follows:
where P is a closed and convex cone in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\), \(\tilde{g}(y):= ({\tilde{g}}_1(y),\ldots ,{\tilde{g}}_m(y))\) and \(\tilde{f},{\tilde{g}}_i:\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\rightarrow \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ,~i=1,\ldots ,m\) are quadratic functions,
We assume that \(\alpha =0\) and \(C=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\). Now, setting \(\dfrac{1}{2}y_j^2=z_j\), \(j=1,\ldots ,n\), we have \(\dfrac{1}{2}y^\top D_iy=\gamma _i^\top z\), \(i=0,1,\ldots ,m\), and (26) can be rewritten as follows:
where \({\hat{g}}_i(y,z):= \gamma _i^\top z +b_i^\top Sy+\beta _i,~i=1,\ldots ,m.\)
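The change of variables \(y=S^\top x\), \(z_j=\frac{1}{2}y_j^2\) can be sanity-checked numerically. The sketch below builds hypothetical simultaneously diagonalizable matrices from a random orthonormal `S` (obtained via a QR factorization) and verifies that the quadratic terms become linear in \(z\); all data and variable names are assumptions made for the illustration, with \(\alpha =0\).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
S, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal columns: S^T S = I
gamma0 = rng.standard_normal(n)                    # diagonal of D_0
gamma1 = rng.standard_normal(n)                    # diagonal of D_1
A = S @ np.diag(gamma0) @ S.T                      # so that S^T A S = D_0
B1 = S @ np.diag(gamma1) @ S.T                     # so that S^T B_1 S = D_1
a = rng.standard_normal(n)

x = rng.standard_normal(n)
y = S.T @ x                                        # rotated variable
z = 0.5 * y**2                                     # lifted variable z_j = y_j^2 / 2

# Objective in the original and in the (y, z) coordinates.
f_x = 0.5 * x @ A @ x + a @ x
f_yz = gamma0 @ z + a @ S @ y

# Quadratic part of a constraint in both coordinate systems.
q_x = 0.5 * x @ B1 @ x
q_yz = gamma1 @ z
```

In the \((y,z)\) variables the only nonlinearity left is the coupling \(z_j=\frac{1}{2}y_j^2\), which is the part of the problem that is relaxed to an inequality afterwards.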
Replacing the last n equality constraints with the corresponding inequalities, we obtain the following relaxation of (27) (and therefore of (26)):
Let \(L:{\mathbb {R}}^m\times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) be defined by \(L(\lambda ,y):= \dfrac{1}{2}y^\top D_0y+a^\top Sy +\sum _{i=1}^m\lambda _i{\tilde{g}}_i(y) \) as the Lagrangian function associated with (26) and let \(\sup _{\lambda \in P^*}\inf _{y\in C} L(\lambda ,y)\) be the related dual problem. Similarly, let \(L_R:{\mathbb {R}}^m\times {\mathbb {R}}^n \times {\mathbb {R}}^n \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) be defined by
as the Lagrangian function associated with (28) and let
be the corresponding dual problem.
Proposition 5.1
The dual problems associated with (26) and (28) are equivalent, i.e.
Moreover, if the supremum in the right-hand side of (29) is attained at \((\lambda ^*,\mu ^*)\), then the supremum in the left-hand side is attained at \(\lambda ^*\).
Proof
Let us compute \(\psi (\lambda ,\mu ):= \displaystyle \inf _{\begin{array}{c} y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n \\ z\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n \end{array}} L_R(\lambda ,\mu ,y,z)\). Note that
Then, \(\psi (\lambda ,\mu )= \inf _{y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n}[a^\top Sy +\sum _{i=1}^m\lambda _i(b_i^\top Sy+\beta _i ) +\sum _{j=1}^n \frac{1}{2}\mu _j y_j^2], \)
By eliminating the variables \(\mu _j\), we obtain:
Now, observe that \(L(\lambda ,y)= \dfrac{1}{2}y^\top D_0y+a^\top Sy +\sum _{i=1}^m\lambda _i[ \frac{1}{2}y^\top D_i y+b_i^\top Sy+\beta _i ] \)
Therefore,
and
provided that
Notice that, if (31) does not hold, then \(\sup _{\lambda \in P^*}\inf _{y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n} L(\lambda ,y)=-\infty ,\) which yields (29).
The final assertion follows from (30). \(\square \)
Consider problem (28) and let
Assuming that \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \), following the image space approach introduced by Giannessi [10, 11], we define the extended image associated with (28) by:
It is possible to show that, since \({\hat{f}}\) and \({\hat{g}}\) are linear and \({\hat{h}}\) is convex, \(\mathcal{E}\) is a convex set; in fact, F turns out to be a \((\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits _+\times P\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n_+)\)-convex function. Many remarkable properties of a constrained extremum problem can be characterized (see [11]) by means of the set \(\mathcal{E}\), as in the next result.
Proposition 5.2
Assume that \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) and
Then, \(\tau =\tau _R\) if and only if \(\tau =\sup _{\lambda \in P^*}\inf _{y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n} L(\lambda ,y)\), i.e. the duality gap is zero for (26).
Proof
It is known that condition (32) is equivalent to the fact that the duality gap is zero for (28) (see [17], Theorem 4.2, for a proof under the assumption that the infimum \(\tau _R\) of (28) is attained; the argument remains valid if merely \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \)). Then, by Proposition 5.1, the following relations hold:
The proof is now straightforward. \(\square \)
Condition (32) is not easy to check: the next result, based on a well-known constraint qualification, provides the connection with strong duality for (26).
Proposition 5.3
Assume that \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \) and that the following condition holds for (28):
Then, \(\tau =\tau _R\) if and only if strong duality holds for (26).
Proof
We first prove that (34) implies that strong duality holds for (28): to this aim we will apply Theorem 3.6 of [8], where (34) is required as one of the assumptions. The other one is given by the following condition:
where \(\mathcal{E}\) is the extended image associated with (28). We now prove that (35) is fulfilled.
We have already observed that \(\mathcal{E}\) is a convex set; we claim that
Let us prove our claim. Notice that, since F is a continuous function then \(0\in \mathrm{cl}~{\mathcal{E}}\) and since \(\mathcal{E}\) is convex so is \(\mathrm{cl}~{\mathcal{E}}\), so that
The reverse inclusion is obvious, so that \(\mathrm{cl}~\mathrm{co}(\mathcal{E} \cup \{0\})= \mathrm{cl}~{\mathcal{E}}\); our claim then follows from Theorem 6.3 of [19]. Now, since \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \), by Proposition 3.1 of [8] we have
which implies
or, equivalently,
This proves that (35) is fulfilled and that strong duality holds for (28).
Finally, Proposition 5.1 leads to the following relations:
Assume that \(\tau =\tau _R\); then the first inequality in (36) is fulfilled as equality and because of the second equality, the supremum is attained (see Proposition 5.1), i.e. strong duality holds for (26).
Conversely, if strong duality holds for (26), then \(\tau = \max _{\lambda \in P^*}\inf _{y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n} L(\lambda ,y)\) and (36) yields \(\tau =\tau _R\). \(\square \)
We note that, when \(\mathrm{int}\,P\ne \emptyset \), condition (34) reduces to the classic Slater condition.
Corollary 5.4
Assume that \(\tau _R\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \), (34) holds and \( {\bar{y}}\) is an optimal solution of (26). Then \(\tau =\tau _R\) if and only if there exist \(\lambda ^*_i\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits _+\), \(i=1,\ldots ,m\), such that:
- (i) \(D_0 {\bar{y}}+S^\top a+\displaystyle \sum _{i=1}^m\lambda ^*_i(S^\top b_i+D_i {\bar{y}})=0\);
- (ii) \(D_0+\displaystyle \sum _{i=1}^m\lambda ^*_iD_i\) is positive semidefinite.
Proof
It is a direct consequence of Proposition 5.3 and Proposition 3.5. \(\square \)
Proposition 5.5
Assume that \(({\bar{y}},{\bar{z}})\) is a KKT point for (28) with \(({\bar{\lambda }},{\bar{\mu }})\) the associated multipliers. If \({\bar{\mu }} >0\), then \({\bar{y}}\) is an optimal solution and strong duality holds for (26).
Proof
We first note that, since (28) is a convex problem, the KKT conditions guarantee the optimality of \(({\bar{y}},{\bar{z}})\), and \(({\bar{\lambda }},{\bar{\mu }},{\bar{y}},{\bar{z}})\) is a saddle point of the Lagrangian function \(L_R\). Moreover, if \({\bar{\mu }}>0\), then, by complementarity, the constraints \(\frac{1}{2}y_j^2 - z_j\le 0\) are active at \(({\bar{y}},{\bar{z}})\) for \(j=1,\ldots ,n\). Hence \({\bar{y}}\) is feasible for (27) and therefore for (26), which proves that \(\tau =\tau _R\) and that \({\bar{y}}\) is a global optimal solution of (26).
By Proposition 5.1 the following relations hold:
where the last two equalities follow from the fact that \(({\bar{\lambda }},{\bar{\mu }},{\bar{y}},{\bar{z}})\) is a saddle point of \(L_R\). Since \(\tau =\tau _R\) then
where the last equality is due to Proposition 5.1, which proves that strong duality holds for (26). \(\square \)
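The role of the active constraints \(\frac{1}{2}y_j^2 - z_j\le 0\) in the argument above can be illustrated numerically. The following Python sketch assumes that the objective of (26) has the form \(\frac{1}{2}y^\top D_0y+a^\top Sy\) with \(D_0=\mathrm{diag}(\gamma _0)\), and that the objective of (28) is \(\gamma _0^\top z+a^\top Sy\); these forms are inferred from the gradients used in this section, and the function names and data are hypothetical. It shows that the two objectives coincide when \(z_j=\frac{1}{2}y_j^2\) for all \(j\), and may differ otherwise.

```python
def f_tilde(gamma0, Sa, y):
    """Assumed objective of (26): 0.5 * y' D_0 y + (S a)' y with D_0 = diag(gamma0)."""
    return sum(0.5 * g * yj * yj + s * yj for g, s, yj in zip(gamma0, Sa, y))


def f_hat(gamma0, Sa, y, z):
    """Assumed objective of (28): gamma0' z + (S a)' y."""
    return sum(g * zj + s * yj for g, s, yj, zj in zip(gamma0, Sa, y, z))


def lift(y):
    """Return z with z_j = 0.5 * y_j^2, i.e. all constraints h_j active."""
    return [0.5 * yj * yj for yj in y]


gamma0 = [-1.0, 2.0]
Sa = [1.0, 0.0]
y = [-1.0, 2.0]

z_active = lift(y)        # [0.5, 2.0]: every constraint 0.5*y_j^2 - z_j <= 0 is active
z_slack = [1.0, 3.0]      # feasible for the relaxation, but no constraint is active

print(f_tilde(gamma0, Sa, y), f_hat(gamma0, Sa, y, z_active))  # 2.5 2.5
print(f_hat(gamma0, Sa, y, z_slack))                           # 4.0
```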
We provide a sufficient condition for (34) to be fulfilled.
Proposition 5.6
Assume that
(i) \(\mathrm{cl~cone}({\hat{g}}(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n)+P)=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\);
(ii) there exists \(({\hat{y}},{\hat{z}}) \) such that \({\hat{g}}({\hat{y}},{\hat{z}})\in -P\) and \(\displaystyle \frac{1}{2}{\hat{y}}_j^2 - {\hat{z}}_j < 0\), \(j=1,\ldots ,n\).
Then (34) is fulfilled.
Proof
Assume that (34) does not hold, i.e. \(0\not \in \mathrm{ri }(G(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n)+(P\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n_+)).\)
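Since \(G(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n)+(P\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n_+)\) is a convex set, by the separation theorem for convex sets (see, e.g. [19]), there exists \((\lambda ^*,\mu ^*)\in (\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n)\setminus \{(0,0)\}\) such that
\[\langle \lambda ^*,{\hat{g}}(y,z)+v\rangle +\sum _{j=1}^n\mu ^*_j\Big (\frac{1}{2}y_j^2-z_j+w_j\Big )\le 0,\quad \forall (y,z)\in {\mathbb {R}}^n\times {\mathbb {R}}^n,\ \forall v\in P,\ \forall w\in {\mathbb {R}}^n_+,\qquad (38)\]
where \(w:=(w_1,..,w_n)\).
Note that, since (38) must be fulfilled for every \(v\in P\) and \( w\ge 0\), it follows that \(\lambda ^*\in -P^*\) and \(\mu ^*\le 0\). Moreover, by condition (i), we can easily prove that \(\mu ^*\ne 0\). Indeed, if \(\mu ^*=0\), then \(\lambda ^*\ne 0\) and (38) becomes
\[\langle \lambda ^*,{\hat{g}}(y,z)+v\rangle \le 0,\quad \forall (y,z)\in {\mathbb {R}}^n\times {\mathbb {R}}^n,\ \forall v\in P,\]
which implies
\[\langle \lambda ^*,u\rangle \le 0,\quad \forall u\in \mathrm{cone}({\hat{g}}({\mathbb {R}}^n\times {\mathbb {R}}^n)+P),\]
i.e.
\[\langle \lambda ^*,u\rangle \le 0,\quad \forall u\in \mathrm{cl~cone}({\hat{g}}({\mathbb {R}}^n\times {\mathbb {R}}^n)+P),\]
but the previous inequality cannot hold, since \(\lambda ^*\ne 0\) and \(\mathrm{cl~cone}({\hat{g}}(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n)+P)=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^m\).
Finally, because of condition (ii), setting \(y:={\hat{y}}\), \(z:={\hat{z}}\), \(v:=0\), \(w:=0\) in (38) yields
\[0<\langle \lambda ^*,{\hat{g}}({\hat{y}},{\hat{z}})\rangle +\sum _{j=1}^n\mu ^*_j\Big (\frac{1}{2}{\hat{y}}_j^2-{\hat{z}}_j\Big )\le 0,\]
where the strict inequality holds because \(\lambda ^*\in -P^*\), \({\hat{g}}({\hat{y}},{\hat{z}})\in -P\), \(\mu ^*\le 0\) and \(\mu ^*\ne 0\). This is a contradiction, which completes the proof. \(\square \)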
In the particular case where the feasible set of (28) is defined by explicit equality and inequality constraints, i.e. \(P:=\{0\}_s\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^{m-s}_+\), for \(0\le s\le m\), we obtain a refinement of Proposition 5.3.
Proposition 5.7
Let \(P:=\{0\}_s\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^{m-s}_+\), let \(({\bar{y}},{\bar{z}})\) be an optimal solution of (28), and set \(I({\bar{y}},{\bar{z}}):=\{i\in \{s+1,\ldots ,m\}:{\hat{g}}_i({\bar{y}},{\bar{z}})=0\} \), \(J({\bar{y}},{\bar{z}}):=\{i\in \{1,\ldots ,n\}:{\hat{h}}_i({\bar{y}},{\bar{z}})=0\} \).
Assume that there exists \(d\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^{n}\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that
(i) \(\nabla {\hat{g}}_i({\bar{y}},{\bar{z}})^\top d=0, i=1,\ldots ,s\), \(\nabla {\hat{g}}_i({\bar{y}},{\bar{z}})^\top d\le 0, i\in I({\bar{y}},{\bar{z}})\);
(ii) \(\nabla {\hat{h}}_i({\bar{y}},{\bar{z}})^\top d < 0, i\in J({\bar{y}},{\bar{z}})\).
Then, \(\tau =\tau _R\) if and only if strong duality holds for (26).
Proof
We first prove that there exist \((\lambda ^*,\mu ^*)\in P^*\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n_+\) such that
Denote by Q the feasible set of (28) and set \(w:= (y,z)\). Since \({\bar{w}}= ({\bar{y}},{\bar{z}})\) is an optimal solution of (28), then \(\langle c,d\rangle \ge 0,\ \forall d\in T(Q;{\bar{w}}),\) where \(c^\top := \nabla {\hat{f}}({\bar{y}},\bar{z})=(a^\top S,\gamma _0^\top )\). Consider the set
Note that, since \(\varGamma \ne \emptyset \), then
We show that \(\mathrm{cl}~\varGamma =T(Q;\bar{w})\). We first prove that \(\varGamma \subseteq T(Q;{\bar{w}})\). Let \(d\in \varGamma \), \(\{\alpha _k\} >0, \ \alpha _k \downarrow 0\), then
The third relation may be written as
Since \(\nabla {\hat{h}}_i({\bar{w}})^\top d < 0, i\in J({\bar{w}})\), then \({\hat{h}}_i({\bar{w}}+\alpha _kd)<0\), for k sufficiently large. Therefore, \(w_k:= {\bar{w}}+\alpha _kd\in Q\), for k sufficiently large, \(w_k\rightarrow {\bar{w}}\) and \(\frac{1}{\alpha _k}[w_k-{\bar{w}}]=d, \forall k\), which implies that \(d\in T(Q;{\bar{w}})\). Since \(T(Q;{\bar{w}})\) is closed, then \(\mathrm{cl}~\varGamma \subseteq T(Q;{\bar{w}})\).
We now prove that \( T(Q;\bar{w})\subseteq \mathrm{cl}~\varGamma \). Let \(d\in T(Q;{\bar{w}})\); then there exist \(\alpha _k >0\) and \(w_k\in Q\) such that \(w_k\rightarrow {\bar{w}}\) and \(\alpha _k(w_k -{\bar{w}})\rightarrow d\). Then, recalling that \({\hat{g}}\) is linear, we have
where the last inequality is due to the convexity of \({\hat{h}}\). Multiplying the previous relations by \(\alpha _k\) and taking the limit for \(k\rightarrow \infty \) yields \(d\in \mathrm{cl}~\varGamma \), which proves that \( T(Q;{\bar{w}})\subseteq \mathrm{cl}~\varGamma \). Since \( T(Q;\bar{w})=\mathrm{cl}~\varGamma \) and \({\bar{w}}\) is an optimal solution of (28), then the following system is impossible:
Applying Motzkin's theorem of the alternative (see, e.g. [16]), we obtain that there exists a solution \((\lambda ^*,\mu ^*)\in P^*\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n_+\) of the following system:
Finally, note that \(L_R(\lambda ^*,\mu ^*,y,z)\) is a convex function of \((y,z)\) such that \(\nabla L_R(\lambda ^*,\mu ^*,{\bar{y}}, \bar{z})=0\) because of (41), where we recall that \({\bar{w}}=(\bar{y},{\bar{z}})\). This implies that \(({\bar{y}},{\bar{z}})\) is a global minimum point of \(L_R(\lambda ^*,\mu ^*,y,z)\) on \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\), which proves (39).
Since (39) and the complementarity conditions in (41) are fulfilled, then strong duality holds for (28). With the same arguments used in Proposition 5.5, we have that Proposition 5.1 leads to the relations:
Assume that \(\tau =\tau _R\); then the first inequality in (42) holds as an equality and, because of the second equality, the supremum is attained at \(\lambda ^*\) (see Proposition 5.1), i.e. strong duality holds for (26).
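Conversely, if strong duality holds for (26), then \(\tau = \max _{\lambda \in P^*}\inf _{y\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n} L(\lambda ,y)\) and (42) yields \(\tau =\tau _R\). The proof is complete. \(\square \)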
Remark 5.8
Computing the gradients of \({\hat{g}}\) and \({\hat{h}}\) explicitly, conditions (i) and (ii) of Proposition 5.7 can be written as
(i’) \((b_i^\top S,\gamma _i^\top ) d=0, i=1,\ldots ,s\), \((b_i^\top S,\gamma _i^\top ) d\le 0, i\in I({\bar{y}},{\bar{z}})\);
(ii’) \(({{\bar{y}}}_ie^\top _i, -e_i^\top ) d < 0, i\in J({\bar{y}},{\bar{z}})\), where \(e_i\) denotes the i-th unit vector in \(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\).
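Since the conditions of Remark 5.8 are linear in \(d\), they can be tested directly once the gradient rows are assembled. The Python sketch below does this for a hypothetical instance with \(n=2\), \(s=1\), \(m=2\) and \(S\) equal to the identity, in which all constraints are active at the chosen point. The helper names `grad_g`, `grad_h` and `direction_ok`, as well as the data, are illustrative assumptions and do not come from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grad_g(Sb_i, gamma_i):
    """Gradient row (b_i' S, gamma_i') of the affine constraint g_i."""
    return list(Sb_i) + list(gamma_i)


def grad_h(y_bar, i):
    """Gradient row (y_bar_i e_i', -e_i') of h_i(y, z) = 0.5*y_i^2 - z_i."""
    n = len(y_bar)
    row = [0.0] * (2 * n)
    row[i] = y_bar[i]
    row[n + i] = -1.0
    return row


def direction_ok(d, eq_rows, active_ineq_rows, active_h_rows, tol=1e-12):
    """Test the equality, inequality and strict inequality conditions on d."""
    return (
        all(abs(dot(r, d)) <= tol for r in eq_rows)
        and all(dot(r, d) <= tol for r in active_ineq_rows)
        and all(dot(r, d) < 0 for r in active_h_rows)
    )


# Hypothetical data: g_1(y, z) = z_1 + z_2 - 2.5 (equality),
# g_2(y, z) = y_1 + 1 (inequality), evaluated at y_bar = (-1, 2), z_bar = (0.5, 2).
y_bar = [-1.0, 2.0]
eq_rows = [grad_g([0.0, 0.0], [1.0, 1.0])]
ineq_rows = [grad_g([1.0, 0.0], [0.0, 0.0])]
h_rows = [grad_h(y_bar, 0), grad_h(y_bar, 1)]

d = [0.0, -1.0, 1.0, -1.0]   # ordered as (d_y1, d_y2, d_z1, d_z2)
print(direction_ok(d, eq_rows, ineq_rows, h_rows))          # True
print(direction_ok([0.0] * 4, eq_rows, ineq_rows, h_rows))  # False
```

The zero direction fails only the strict inequalities, which is why a genuinely descending direction for the active \({\hat{h}}_i\) is needed.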
The next result relates condition (ii) of Proposition 5.6 to the assumptions of the previous proposition.
Proposition 5.9
Let \(P:=\{0\}_s\times {\mathbb {R}}^{m-s}_+\). If there exists \(({\hat{y}},{\hat{z}})\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^n\) such that \({\hat{g}}({\hat{y}},{\hat{z}})\in -P\) and \({\hat{h}}({\hat{y}},{\hat{z}})<0\), then the assumptions (i) and (ii) of Proposition 5.7 are fulfilled.
Proof
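Set \(d:= ({\hat{y}},{\hat{z}})-({\bar{y}},{\bar{z}})\). Since \({\hat{g}}\) is an affine function, then
\[\nabla {\hat{g}}_i({\bar{y}},{\bar{z}})^\top d={\hat{g}}_i({\hat{y}},{\hat{z}})-{\hat{g}}_i({\bar{y}},{\bar{z}}),\quad i=1,\ldots ,m,\]
which yields (i), because \({\hat{g}}({\hat{y}},{\hat{z}})\in -P\) and \({\hat{g}}_i({\bar{y}},{\bar{z}})=0\) for \(i=1,\ldots ,s\) and for \(i\in I({\bar{y}},{\bar{z}})\). Moreover, since \({\hat{h}}_i\) is convex, then
\[\nabla {\hat{h}}_i({\bar{y}},{\bar{z}})^\top d\le {\hat{h}}_i({\hat{y}},{\hat{z}})-{\hat{h}}_i({\bar{y}},{\bar{z}})={\hat{h}}_i({\hat{y}},{\hat{z}})<0,\quad i\in J({\bar{y}},{\bar{z}}),\]
and (ii) follows. \(\square \)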
The next example shows that the conditions of the previous proposition are weaker than (34).
Example 5.10
Set \(n:= 1\), \(P:=\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2_+\), \({\hat{g}}_1(y,z):= -y-z\), \({\hat{g}}_2(y,z):= y+z\), \({\hat{h}}(y,z):= \frac{1}{2}y^2-z\). Then,
\(G(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits )+\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^3_+= \{(u,v,w)\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^3:u\ge -y-z, \ v\ge y+z, \ w\ge \frac{1}{2}y^2-z \ \text{for some}\ (y,z)\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^2\}.\)
Every point of this set satisfies \(u+v\ge 0\), which implies that \((0,0,0)\not \in \mathrm{int}[G(\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits \times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits )+\mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^3_+]\), i.e. (34) is not fulfilled.
Nevertheless, the assumptions of Proposition 5.9 are fulfilled. Indeed, \((y^*,z^*):= (-1,1)\) fulfils the inequalities:
\[-y^*-z^*=0\le 0,\quad y^*+z^*=0\le 0,\quad \frac{1}{2}(y^*)^2-z^*=-\frac{1}{2}<0.\]
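The two claims of Example 5.10 can be checked with a few lines of Python. The sketch below is only illustrative: the function names and the sample grid are choices made here. It verifies that \((y^*,z^*)=(-1,1)\) satisfies \({\hat{g}}(y^*,z^*)\in -P\) and \({\hat{h}}(y^*,z^*)<0\), and it samples the generating points of the set \(G({\mathbb {R}}\times {\mathbb {R}})+{\mathbb {R}}^3_+\) to illustrate that \(u+v\ge 0\) on the whole set, so that the origin cannot be an interior point.

```python
def g1(y, z):
    return -y - z


def g2(y, z):
    return y + z


def h(y, z):
    return 0.5 * y * y - z


# Slater-type point required by Proposition 5.9 (here P is the nonnegative orthant).
y_star, z_star = -1.0, 1.0
slater_like = (
    g1(y_star, z_star) <= 0
    and g2(y_star, z_star) <= 0
    and h(y_star, z_star) < 0
)

# Any (u, v, w) in the set has u >= g1(y, z) and v >= g2(y, z) for some (y, z),
# hence u + v >= g1(y, z) + g2(y, z), and this lower bound is identically zero.
grid = [k / 4.0 for k in range(-20, 21)]
min_sum = min(g1(y, z) + g2(y, z) for y in grid for z in grid)

print(slater_like)
print(min_sum == 0.0)
```

Because \(u+v\ge 0\) on the set, no point of the form \((-\varepsilon ,-\varepsilon ,w)\) with \(\varepsilon >0\) belongs to it, which is the reason why the origin is not interior.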
We note that in [15] the Slater-type condition (34) has been considered as a blanket assumption. Finally, we provide a refinement of Corollary 5.4.
Corollary 5.11
Let \(P:=\{0\}_s\times \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits ^{m-s}_+\), let \({\bar{y}}\) be an optimal solution of (26), \({\bar{z}}:=(\frac{1}{2}{\bar{y}}_1^2,\ldots ,\frac{1}{2}{\bar{y}}_n^2)\) and assume that the assumptions (i) and (ii) of Proposition 5.7 hold. Then \(\tau =\tau _R\) if and only if there exist \(\lambda ^*_i\in \mathop {{\mathrm{I}}{\mathrm{R}}}\nolimits _+\), \(i=1,\ldots ,m\), such that:
(i) \(D_0{\bar{y}}+Sa+\displaystyle \sum _{i=1}^m\lambda ^*_i(Sb_i+D_i{\bar{y}})=0\);
(ii) \(D_0+\displaystyle \sum _{i=1}^m\lambda ^*_iD_i\) is positive semidefinite.
Proof
Assume that \(\tau =\tau _R\). Let us prove that \(({\bar{y}},{\bar{z}})\) is an optimal solution of (28). Indeed, \({\tilde{f}}({\bar{y}})=\tau =\tau _R\) and \(({\bar{y}},\bar{z})\) is an optimal solution of (27). Since \(({\bar{y}},\bar{z})\) is feasible for (28) and \(\gamma _0^\top {\bar{z}}+a^\top S{\bar{y}}=\tau =\tau _R\), then \((\bar{y},{\bar{z}})\) is an optimal solution of (28). By Proposition 5.7, strong duality holds for (26). Conversely, if strong duality holds for (26), then \(\tau =\tau _R\), as proved in Proposition 5.3. Recalling that here \(C={\mathbb {R}}^n\), applying Proposition 3.5 we complete the proof. \(\square \)
6 Conclusions
We have considered a quadratic programming problem with general quadratic cone constraints and an additional geometric constraint. We have established necessary and sufficient conditions for the global optimality of a KKT point, or in the presence of strong duality, considering in detail the case where the feasible set is defined by two quadratic equality constraints. As a further application, we have obtained conditions that guarantee the existence of a convex reformulation of a simultaneously diagonalizable quadratic problem.
Change history
26 July 2022
Missing Open Access funding information has been added in the Funding Note
References
1. Ai, W., Zhang, S.: Strong duality for the CDT subproblem: a necessary and sufficient condition. SIAM J. Optim. 19, 1735–1756 (2009)
2. Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear programming, theory and algorithms. Wiley, New Jersey (2006)
3. Ben-Tal, A., den Hertog, D.: Hidden conic quadratic representation of some nonconvex quadratic optimization problems. Math. Program. 143, 1–29 (2014)
4. Bomze, I.M.: Copositivity for second-order optimality conditions in general smooth optimization problems. Optimization 65(4), 779–795 (2015)
5. Bomze, I., Jeyakumar, V., Li, G.: Extended trust-region problems with one or two balls: exact copositive and Lagrangian relaxations. J. Glob. Optim. 56(2), 551–569 (2018)
6. Di, S., Poliquin, R.: Contingent cone to a set defined by equality and inequality constraints at a Fréchet differentiable point. J. Optim. Theory Appl. 81(3), 469–478 (1994)
7. Flores-Bazán, F., Cárcamo, G.: Strong duality and KKT conditions in nonconvex optimization with a single equality constraint and geometric constraints. Math. Program. 168, 369–400 (2018)
8. Flores-Bazán, F., Mastroeni, G.: Strong duality in cone constrained nonconvex optimization. SIAM J. Optim. 23, 153–169 (2013)
9. Flores-Bazán, F., Mastroeni, G.: Characterizing FJ and KKT points in nonconvex mathematical programming with applications. SIAM J. Optim. 25, 647–676 (2015)
10. Giannessi, F.: Theorems of the alternative and optimality conditions. J. Optim. Theory Appl. 42, 331–365 (1984)
11. Giannessi, F.: Constrained Optimization and Image Space Analysis. Springer, Berlin (2005)
12. Horst, R., Pardalos, P.M. (eds.): Handbook of Global Optimization, Nonconvex Optimization and Its Applications. Kluwer Academic, Dordrecht (1995)
13. Jeyakumar, V., Li, G.: Regularized Lagrangian duality for linearly constrained quadratic optimization and trust-region problems. J. Glob. Optim. 49, 1–14 (2011)
14. Li, G.: Global quadratic optimization over bivalent constraints: necessary and sufficient global optimality conditions. J. Optim. Theory Appl. 152, 710–726 (2012)
15. Locatelli, M.: Some results for quadratic problems with one or two quadratic constraints. Oper. Res. Lett. 43, 126–131 (2015)
16. Mangasarian, O.: Nonlinear programming. Classics in Applied Mathematics. SIAM, Philadelphia (1994)
17. Mastroeni, G.: Some applications of the image space analysis to the duality theory for constrained extremum problems. J. Glob. Optim. 46, 603–614 (2010)
18. Peng, J.M., Yuan, Y.X.: Optimality conditions for the minimization of a quadratic with two quadratic constraints. SIAM J. Optim. 7, 579–594 (1997)
19. Rockafellar, R.T.: Convex analysis. Princeton University Press, Princeton (1970)
20. Zheng, X.J., Sun, X.L., Li, D., Xu, Y.F.: On zero duality gap in nonconvex quadratic programming problems. J. Glob. Optim. 52, 229–242 (2011)
Acknowledgements
The research, for the first author, was supported in part by ANID-Chile through FONDECYT 1212004, ACE210010 and Basal FB210005.
Funding
Open access funding provided by Università di Pisa within the CRUI-CARE Agreement.
Communicated by Massimo Pappalardo.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Flores-Bazán, F., Mastroeni, G. First- and Second-Order Optimality Conditions for Quadratically Constrained Quadratic Programming Problems. J Optim Theory Appl 193, 118–138 (2022). https://doi.org/10.1007/s10957-022-02022-1