Mathematical Programming, Volume 136, Issue 2, pp 233–251

On convex relaxations for quadratically constrained quadratic programming

Authors

    • K. M. Anstreicher, Department of Management Sciences, University of Iowa

Full Length Paper, Series B

DOI: 10.1007/s10107-012-0602-3

Cite this article as:
Anstreicher, K.M. Math. Program. (2012) 136: 233. doi:10.1007/s10107-012-0602-3

Abstract

We consider convex relaxations for the problem of minimizing a (possibly nonconvex) quadratic objective subject to linear and (possibly nonconvex) quadratic constraints. Let \(\mathcal{F }\) denote the feasible region for the linear constraints. We first show that replacing the quadratic objective and constraint functions with their convex lower envelopes on \(\mathcal{F }\) is dominated by an alternative methodology based on convexifying the range of the quadratic form \(\genfrac(){0.0pt}{}{1}{x}\genfrac(){0.0pt}{}{1}{x}^T\) for \(x\in \mathcal{F }\). We next show that the use of “\(\alpha \)BB” underestimators as computable estimates of convex lower envelopes is dominated by a relaxation of the convex hull of the quadratic form that imposes semidefiniteness and linear constraints on diagonal terms. Finally, we show that the use of a large class of D.C. (“difference of convex”) underestimators is dominated by a relaxation that combines semidefiniteness with RLT constraints.

Keywords

Quadratically constrained quadratic programming · Convex envelope · Semidefinite programming · Reformulation-linearization technique

Mathematics Subject Classification

90C26 · 90C22

1 Introduction

In this paper we consider a quadratically constrained quadratic programming (QCQP) problem of the form
$$\begin{aligned} \text{(QCQP)}\quad z^{*}&= \min \quad f_0(x) \\&\text{ s.t.}\quad f_i(x) \le d_i,\quad i=1,\ldots ,q\\&\quad \quad \; x\ge 0,\quad Ax\le b, \end{aligned}$$
where \(f_i(x)=x^TQ_i x+ c_i^Tx,\; i=0,1,\ldots ,q\), each \(Q_i\) is an \(n\times n\) symmetric matrix, and \(A\) is an \(m\times n\) matrix. In the case that \(Q_i\succeq 0\) for each \(i\), QCQP is a convex programming problem that can be solved in polynomial time, but in general the problem is NP-Hard. QCQP is a fundamental problem that has been extensively studied in the global optimization literature; see for example [10, 22] and references therein.
A common approach to obtaining a lower bound for a nonconvex instance of QCQP is to somehow convexify the problem. In this paper we compare several different convexification techniques. Let \(\mathcal{F }=\{x\ge 0\,:\,Ax\le b\}\) denote the feasible set for the linear constraints of QCQP. We assume throughout that \(\mathcal{F }\) is bounded. One methodology is to replace each function \(f_i(\cdot )\) with its convex lower envelope\(^{1}\) \(\hat{f}_i(\cdot )\) on \(\mathcal{F }\). We refer to the resulting convex relaxation of QCQP as \(\widehat{\mathrm{QCQP}}\). In Sect. 2 we compare \(\widehat{\mathrm{QCQP}}\) with an alternative relaxation \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) based on the convex set
$$\begin{aligned} \mathcal{C }= {\text{ Co}}\left\{ \genfrac(){0.0pt}{}{1}{x}\genfrac(){0.0pt}{}{1}{x}^T\,:\,x\in \mathcal{F }\right\} , \end{aligned}$$
(1)
where \({\text{ Co}}\{ \ \}\) denotes the convex hull. We prove that \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) dominates \(\widehat{\mathrm{QCQP}}\), although in general neither of these problems is computationally tractable.

In Sect. 3 we compare two computable relaxations that can be viewed as tractable approximations of the problems \(\widehat{\mathrm{QCQP}}\) and \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\). One relaxation utilizes “\({\alpha {\mathrm{BB}}}\)” underestimators [1] for the nonconvex quadratic functions of QCQP, and the other applies semidefinite and diagonal constraints that must hold for matrices in \(\mathcal{C }\). We prove that the latter convexification dominates the former, regardless of the choice of the parameters used to define the underestimators. In Sect. 4 we consider a more general D.C. (for “difference of convex”) underestimation procedure suggested in [22], and a strengthened approximation of \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) that combines semidefiniteness with linear constraints from the reformulation-linearization technique (RLT). We again show that the second approach dominates the first, regardless of the parameters used to create the underestimators.

In Sect. 5 we consider particular instances of QCQP that were used as computational examples in [2]. The first of these are indefinite box-constrained QPs, corresponding to QCQP with \(q=0\) and \(\mathcal{F }=\{ x\,:\,0\le x\le e \}\). For these problems we obtain excellent computational results by further strengthening the approximation of \(\mathcal{C }\) through the addition of triangle inequalities related to the Boolean Quadric Polytope [12]. For the second class of QCQP problems, corresponding to planar circle-packing (or equivalently point-packing) problems, we prove an interesting theoretical result that relates convex lower envelopes for reverse convex constraints to the use of RLT constraints for \(\mathcal{C }\).

Notation

We use \(X\succeq 0\) to denote that a symmetric matrix \(X\) is positive semidefinite. For \(n\times n\) matrices \(X\) and \(Y\), \(X \bullet Y\) denotes the matrix inner product \(X \bullet Y=\sum _{i,j=1}^n X_{ij}Y_{ij}\). For an \(n\times n\) matrix \(X\), \({\text{ diag}}(X)\) is the vector \(x\) with \(x_i=X_{ii}\), \(i=1,\ldots , n\), and \({\text{ Diag}}(x)\) is the diagonal matrix with \({\text{ diag}}({\text{ Diag}}(x))=x\). We use \(e\) to denote a vector with each component equal to one, and \(e_j\) to denote a vector with all components equal to zero, except the \(j\)th component which is equal to one.

2 Two convex relaxations for QCQP

Let \(\mathcal{F }\subset \mathfrak R ^n\) be a compact, convex set, and \(f(\cdot ):\mathcal{F }\rightarrow \mathfrak R \). The convex lower envelope\(^{2}\) of \(f(\cdot )\) on \(\mathcal{F }\) [9, 14], denoted \(\hat{f}_{\mathcal{F }}(\cdot )\), is the pointwise maximum of all convex underestimators of \(f(\cdot )\) on \(\mathcal{F }\):
$$\begin{aligned} \hat{f}_{\mathcal{F }}(x)=\max \{ g(x)\,:\,g(\cdot )\ \text{is convex on}\ \mathcal{F }\ \text{and}\ g(y)\le f(y)\ \forall y\in \mathcal{F } \},\quad x\in \mathcal{F }. \end{aligned}$$
An important property of \(\hat{f}_{\mathcal{F }}(\cdot )\) that we will repeatedly use is that if \(g(\cdot )\) is any convex underestimator of \(f(\cdot )\) on \(\mathcal{F }\), then \(g(x)\le \hat{f}_{\mathcal{F }}(x)\) for every \(x\in \mathcal{F }\). When the domain \(\mathcal{F }\) is clear from context, we will write \(\hat{f}(\cdot )\) in place of \(\hat{f}_\mathcal{F }(\cdot )\) to reduce notation.

As in Sect. 1, let \(\mathcal{F }=\{x\ge 0\,:\,Ax\le b\}\) denote the feasible set for the linear constraints of QCQP, and let \(\widehat{\mathrm{QCQP}}\) denote the problem where each function \(f_i(\cdot )\) in QCQP is replaced by \(\hat{f}_i(\cdot )\), its convex lower envelope on \(\mathcal{F }\). Let \(\hat{z}\) denote the solution value in \(\widehat{\mathrm{QCQP}}\). Note that although \(\hat{z}\) is well-defined, in practice \(\hat{z}\) may not be computable because the required convex lower envelopes \(\hat{f}_i(\cdot )\) may be impossible to obtain.

We will compare \(\widehat{\mathrm{QCQP}}\) with an alternative convexification that is based on linearizing the problem by adding additional variables. Let \(X\) denote a symmetric \(n\times n\) matrix. Then QCQP can be written as
$$\begin{aligned} \text{(QCQP)}\quad z^{*}=&\min Q_0 \bullet X+c_0^Tx \\&\text{ s.t.} Q_i \bullet X +c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&x\ge 0,\ Ax\le b,\ X=xx^T. \end{aligned}$$
Written in the above form, QCQP is a linear problem except for the quadratic equality constraints \(X=xx^T\). A convexification of the problem can then be given in terms of the set \(\mathcal{C }\) defined in (1). Using \(\mathcal{C }\), we obtain a convex relaxation
$$\begin{aligned} \left(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\right)\quad \tilde{z}=&\min Q_0 \bullet X+c_0^Tx \\&\mathrm{s.t.} Q_i \bullet X+c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&Y(x,X)\in \mathcal{C }, \end{aligned}$$
where
$$\begin{aligned} Y(x,X)=\begin{pmatrix} 1&\quad x^T \\ x&\quad X \end{pmatrix}. \end{aligned}$$
In this section we will demonstrate that the convex relaxation \(\widehat{\mathrm{QCQP}}\) cannot be tighter than \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\); in other words, it is always true that \(\hat{z}\le \tilde{z}\). To do this we will show that there is a simple relationship between the convex lower envelopes used in \(\widehat{\mathrm{QCQP}}\) and the linearized representations of the objective and constraint functions used in \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\).

Theorem 1

For \(x\in \mathcal{F }\), let \(f(x)=x^TQx+c^Tx\), and let \(\hat{f}(\cdot )\) be the convex lower envelope of \(f(\cdot )\) on \(\mathcal{F }\). Then \(\hat{f}(x)=c^Tx+{\displaystyle \min \nolimits _X}\{Q \bullet X\,:\,Y(x,X)\in \mathcal{C }\}\).

Proof

For \(x\in \mathcal{F }\), let \(g(x)=c^Tx+{\displaystyle \min \nolimits _X}\{Q\bullet X\,:\,Y(x,X)\in \mathcal{C }\}\). Our goal is to show that \(\hat{f}(x)=g(x)\). To do this we first show that \(g(\cdot )\) is a convex function with \(g(x)\le f(x)\), \(x\in \mathcal{F }\), implying that \(g(x)\le \hat{f}(x)\).

Assume that for \(i\!\in \!\{1,2\}\), \(x^i\!\in \!\mathcal{F }\) and \(g(x^i)\!=\!Q\bullet X^i +c^Tx^i\), where \(Y(x^i,X^i)\!\in \!\mathcal{C }\). For \(0\le \lambda \le 1\), let
$$\begin{aligned} x(\lambda )=\lambda x^1 + (1-\lambda )x^2,\quad X(\lambda )=\lambda X^1 + (1-\lambda )X^2. \end{aligned}$$
Then \(Y(x(\lambda ),X(\lambda ))=\lambda Y(x^1,X^1)+(1-\lambda ) Y(x^2,X^2)\in \mathcal{C }\), since \(\mathcal{C }\) is convex. It follows that
$$\begin{aligned} g(x(\lambda ))\le Q \bullet X(\lambda )+c^Tx(\lambda )=\lambda g(x^1)+(1-\lambda ) g(x^2), \end{aligned}$$
proving that \(g(\cdot )\) is convex on \(\mathcal{F }\). The fact that \(g(x)\le f(x)\) follows immediately from \(Y(x,xx^T)\in \mathcal{C }\) and \(Q \bullet xx^T+c^Tx=f(x)\).
It remains to show that \(\hat{f}(x)\le g(x)\). Assume that \(g(x)=Q \bullet X +c^Tx\), where \(Y(x,X)\in \mathcal{C }\). From the definition of \(\mathcal{C }\), there exist \(x^i\in \mathcal{F }\) and \(\lambda _i\ge 0\), \(i=1,\ldots ,k\), \(\sum _{i=1}^k \lambda _i=1\) such that
$$\begin{aligned} \sum _{i=1}^k \lambda _i x^i=x,\quad \sum _{i=1}^k \lambda _i x^i(x^i)^T=X. \end{aligned}$$
It follows that
$$\begin{aligned} g(x)&= Q \bullet X+c^Tx\\&= Q\bullet \left(\sum _{i=1}^k \lambda _i x^i(x^i)^T\right) +c^T\left(\sum _{i=1}^k \lambda _i x^i\right)\\&= \sum _{i=1}^k \lambda _i f(x^i). \end{aligned}$$
But \(\hat{f}(\cdot )\) is convex on \(\mathcal{F }\), and \(\hat{f}(x)\le f(x)\) for all \(x\in \mathcal{F }\), so
$$\begin{aligned} \hat{f}(x)=\hat{f}\left(\sum _{i=1}^k \lambda _i x^i\right)\le \sum _{i=1}^k \lambda _i \hat{f}(x^i)\le \sum _{i=1}^k \lambda _i f(x^i)=g(x). \end{aligned}$$
\(\square \)
The claimed relationship between \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) and \(\widehat{\mathrm{QCQP}}\) is an immediate consequence of Theorem 1. In particular, using Theorem 1, \(\widehat{\mathrm{QCQP}}\) could be rewritten in the form
$$\begin{aligned} \left(\widehat{\mathrm{QCQP}}\right)\quad \hat{z}=&\min Q_0 \bullet X_0 +c_0^Tx \\&\mathrm{s.t.} Q_i \bullet X_i +c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&Y(x,X_i)\in \mathcal{C },\quad i=0,1,\ldots ,q, \end{aligned}$$
so that \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) corresponds to \(\widehat{\mathrm{QCQP}}\) with the added constraints \(X_0=X_1=\ldots =X_q\).

Corollary 1

Let \(\hat{z}\) and \(\tilde{z}\) denote the solution values in the convex relaxations \(\widehat{\mathrm{QCQP}}\) and \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\), respectively. Then \(\hat{z}\le \tilde{z}\).

Corollary 1 indicates that the approach to convexifying QCQP taken in \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) has theoretical advantages over the underestimation methodology used in \(\widehat{\mathrm{QCQP}}\). However, it is important to recognize that both of these approaches have practical limitations. In particular, both the problem of computing an exact convex lower envelope \(\hat{f}(\cdot )\) for a quadratic function \(f(\cdot )\), and the problem of characterizing \(\mathcal{C }\), are intractable. It is, however, known that \(\mathcal{C }\) can be exactly represented using the cone of completely positive matrices. To describe this representation it is convenient to define
$$\begin{aligned} Y^+(x,X)=\begin{pmatrix} 1&\quad x^T&\quad s(x)^T \\ x&\quad X&\quad Z(x,X) \\ s(x)&\quad Z(x,X)^T&\quad S(x,X) \end{pmatrix}, \end{aligned}$$
(2)
where
$$\begin{aligned} s(x)&= b-Ax,\nonumber \\ S(x,X)&= bb^T-Axb^T-bx^TA^T+AXA^T,\\ Z(x,X)&= xb^T-XA^T.\nonumber \end{aligned}$$
(3)
The matrices \(S(x,X)\) and \(Z(x,X)\) relax \(s(x)s(x)^T\) and \(xs(x)^T\), respectively. It can then be shown [5] that
$$\begin{aligned} \mathcal{C }=\left\{ Y(x,X)\,:\,Y^+(x,X)\in \mathcal{CP }_{m+n+1}\right\} , \end{aligned}$$
(4)
where \(\mathcal{CP }_k\) is the cone of \(k\times k\) completely positive matrices (that is, matrices that can be written in the form \(VV^T\) where \(V\) is a nonnegative \(k\times p\) matrix). Unfortunately, for \(k\ge 5\) there is no known complete description for \(\mathcal{CP }_k\).
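The block structure in (2) and (3) can be checked numerically. The following Python sketch is only an illustration, assuming NumPy is available; the function name `y_plus` and the random data are hypothetical and not part of the paper. It assembles \(Y^+(x,X)\) and confirms that for \(X=xx^T\) with \(x\in \mathcal{F }\) the result is the nonnegative rank-one matrix \(ww^T\) with \(w=(1,x,s(x))\), which is the simplest instance of a completely positive matrix.

```python
import numpy as np

def y_plus(x, X, A, b):
    """Assemble Y^+(x, X) as in (2)-(3) for F = {x >= 0 : A x <= b}."""
    x = np.asarray(x, dtype=float)
    b = np.asarray(b, dtype=float)
    Ax = A @ x
    s = b - Ax                                           # s(x) = b - A x
    S = np.outer(b, b) - np.outer(Ax, b) - np.outer(b, Ax) + A @ X @ A.T
    Z = np.outer(x, b) - X @ A.T                         # relaxes x s(x)^T
    top = np.concatenate(([1.0], x, s))[None, :]
    mid = np.hstack((x[:, None], X, Z))
    bot = np.hstack((s[:, None], Z.T, S))
    return np.vstack((top, mid, bot))

# Hypothetical data: a small random polytope and a feasible point.
rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.uniform(0.0, 1.0, size=(m, n))
x = rng.uniform(0.0, 0.2, size=n)
b = A @ x + 1.0                                          # x is feasible with slack 1

Yp = y_plus(x, np.outer(x, x), A, b)
w = np.concatenate(([1.0], x, b - A @ x))
rank_one_gap = np.max(np.abs(Yp - np.outer(w, w)))       # should be at rounding level
```

The same function can be applied to any pair \((x,X)\); for \(X\ne xx^T\) the blocks \(S\) and \(Z\) are no longer outer products, which is exactly the sense in which they relax \(s(x)s(x)^T\) and \(xs(x)^T\).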
We close this section with an example that illustrates that the distinction between \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) and \(\widehat{\mathrm{QCQP}}\) is already sharp for \(m=n=q=1\). Consider the problem
$$\begin{aligned}&\min x_1^2 \\&\mathrm{s.t.} x_1^2 \ge \frac{1}{2}\\&0\le x_1\le 1. \end{aligned}$$
Written in the form of QCQP, the constraint \(x_1^2\ge \frac{1}{2}\) is \(-x_1^2\le -\frac{1}{2}\), and it is easy to see that the convex lower envelope of \(-x_1^2\) on \([0,1]\) is \(-x_1\), because \(-x_1^2\) is concave. The relaxation \(\widehat{\mathrm{QCQP}}\) is then
$$\begin{aligned}&\min x_1^2 \\&\mathrm{s.t.} -x_1 \le -\frac{1}{2}\\&0\le x_1\le 1, \end{aligned}$$
with solution value \(\hat{z}=\frac{1}{4}\). The solution value for \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) is \(\tilde{z}=z^{*}=\frac{1}{2}\). The set \(\mathcal{C }\) is depicted in Fig. 1. Note that for \(x_1=\frac{1}{2}\), \(Y(x_1, x_{11})\in \mathcal{C }\) for \(x_{11}\in [\frac{1}{4},\frac{1}{2}]\). The solution of \(\widehat{\mathrm{QCQP}}\) then corresponds to using \(x_1=\frac{1}{2}\) along with \(x_{11}=\frac{1}{4}\) for the objective, and \(x_{11}=\frac{1}{2}\) for the single nonlinear constraint.
Fig. 1 Set \(\mathcal{C }\) for example
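The values in this example can be reproduced with a crude grid search. The sketch below is illustrative only and assumes NumPy; it also relies on the fact that for \(n=1\) and \(\mathcal{F }=[0,1]\) the set \(\mathcal{C }\) consists of the pairs \((x_1,x_{11})\) with \(x_1^2\le x_{11}\le x_1\) (the region between the parabola and its chord). The variable names are hypothetical.

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 1001)       # discretisation of F = [0, 1]
tol = 1e-12

# Envelope relaxation: min x^2  s.t.  -x <= -1/2,  0 <= x <= 1.
z_hat = np.min(grid[grid >= 0.5 - tol] ** 2)

# Lifted relaxation: min X  s.t.  -X <= -1/2,  x^2 <= X <= x.
# For fixed x the smallest admissible X is max(x^2, 1/2); it is feasible iff <= x.
lower = np.maximum(grid ** 2, 0.5)
z_tilde = np.min(lower[lower <= grid + tol])

# Original problem: min x^2  s.t.  x^2 >= 1/2 (exact only up to grid resolution).
z_star = np.min(grid[grid ** 2 >= 0.5] ** 2)
```

Up to the grid resolution, `z_hat` matches \(\hat{z}=\frac{1}{4}\) while `z_tilde` and `z_star` match \(\tilde{z}=z^{*}=\frac{1}{2}\).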

3 Two computable relaxations

As mentioned above, in general both \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) and \(\widehat{\mathrm{QCQP}}\) are intractable problems due to the complexity of computing a convex lower envelope \(\hat{f}(\cdot )\), or the convex hull \(\mathcal{C }\). In this section we consider the important special case where \(\mathcal{F }\) is the box \(0\le x\le e\), and describe two further relaxations that are computable approximations of \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) and \(\widehat{\mathrm{QCQP}}\).

For a quadratic function \(f(x)=x^T Qx+c^Tx\) defined on \(\mathcal{F }=\{x\,:\,0\le x\le e\}\), the well-known “\({\alpha {\mathrm{BB}}}\)” underestimator [1] is
$$\begin{aligned} f_\alpha (x)=x^T (Q+{\text{ Diag}}(\alpha ))x+(c-\alpha )^Tx, \end{aligned}$$
where \(\alpha \in \mathfrak R ^n_+\) is chosen so that \(Q+{\text{ Diag}}(\alpha )\succeq 0\). It is worthwhile to note that although here we restrict our attention to the convexification of quadratic functions, the \({\alpha {\mathrm{BB}}}\) underestimator applies to more general nonlinear functions. The same convexification procedure for the quadratic case has appeared numerous times elsewhere in the literature; see for example [4, 13].
Since \(f_\alpha (\cdot )\) is convex, it is immediate that \(f_\alpha (x)\le \hat{f}(x)\), \(0\le x\le e\). A further relaxation of \(\widehat{\mathrm{QCQP}}\) is then given by the problem
$$\begin{aligned} (\mathrm{QCQP}_{\alpha \mathrm{BB}})\quad z_{{\alpha {\mathrm{BB}}}}=&\min x^T(Q_0+{\text{ Diag}}(\alpha _0))x+(c_0-\alpha _0)^Tx \\&\mathrm{s.t.} x^T(Q_i+{\text{ Diag}}(\alpha _i))x+(c_i-\alpha _i)^Tx \le d_i,\quad i=1,\ldots ,q\\&0\le x\le e, \end{aligned}$$
where each \(\alpha _i\) is chosen so that \(Q_i+{\text{ Diag}}(\alpha _i)\succeq 0\).
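As a small illustration of the \({\alpha {\mathrm{BB}}}\) construction, the following Python sketch (assuming NumPy; the helper name `alpha_bb` and the random instance are hypothetical) builds \(f_\alpha \) with the uniform shift \(\alpha =\max \{0,-\lambda _{\min }(Q)\}\,e\), which is one simple valid choice and not necessarily the best one, and checks on random points of the box that \(f_\alpha (x)\le f(x)\).

```python
import numpy as np

def alpha_bb(Q, c, alpha):
    """Return f_alpha(x) = x^T (Q + Diag(alpha)) x + (c - alpha)^T x as a callable."""
    Qa = Q + np.diag(alpha)
    ca = c - alpha
    return lambda x: x @ Qa @ x + ca @ x

# Hypothetical random indefinite instance on the box 0 <= x <= e.
rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
Q = (M + M.T) / 2
c = rng.standard_normal(n)

# Uniform shift by the most negative eigenvalue makes Q + Diag(alpha) PSD.
alpha = np.full(n, max(0.0, -np.linalg.eigvalsh(Q)[0]))
f = lambda x: x @ Q @ x + c @ x
f_alpha = alpha_bb(Q, c, alpha)

pts = rng.uniform(0.0, 1.0, size=(1000, n))
max_excess = max(f_alpha(x) - f(x) for x in pts)          # should be <= 0
min_eig_shifted = np.linalg.eigvalsh(Q + np.diag(alpha))[0]
```

The underestimation follows from \(f_\alpha (x)-f(x)=\sum _i \alpha _i(x_i^2-x_i)\le 0\) on the box, with equality at the vertices.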
For the case of \(\mathcal{F }=\{x\,:\,0\le x\le e\}\), there are a variety of known constraints that are valid for \(Y(x,X)\in \mathcal{C }\). These include:
  1. The constraints from the Reformulation-Linearization Technique (RLT) [16],
    $$\begin{aligned} x_{ij} \ge 0,\quad x_{ij} \ge x_i+x_j-1,\quad x_{ij} \le x_i,\quad x_{ij} \le x_j. \end{aligned}$$
    (5)
  2. The semidefinite programming (SDP) constraint \(Y(x,X)\succeq 0\) [19].
  3. Constraints on the off-diagonal components of \(Y(x,X)\) coming from the Boolean Quadric Polytope (BQP) [6, 21]; for example, the triangle inequalities for \(i\ne j\ne k\),
    $$\begin{aligned} x_i+x_j+x_k&\le x_{ij}+x_{ik}+x_{jk}+1, \\ x_{ij}+x_{ik}&\le x_i+x_{jk}, \\ x_{ij}+x_{jk}&\le x_j+x_{ik}, \\ x_{ik}+x_{jk}&\le x_k+x_{ij}. \end{aligned}$$
The relationship between the SDP and RLT constraints is discussed in [2]. In fact for \(n=2\), the SDP and RLT constraints together give a full characterization of \(\mathcal{C }\) [3]. For \(n=3\) the triangle inequalities and RLT constraints fully characterize the BQP, but these constraints combined with the SDP constraint do not give a complete characterization of \(\mathcal{C }\) [6]. For \(n=3\), an “extended-variable” description of \(\mathcal{C }\) obtained via a triangulation of the 3-cube is given in [3].
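The RLT constraints (5) and the triangle inequalities listed above are straightforward to test for a given pair \((x,X)\). The Python sketch below is an illustration under the assumption that NumPy is available; the function names are hypothetical. It returns the largest violation of each family, so a nonpositive value means that all inequalities of that family hold.

```python
import itertools
import numpy as np

def rlt_violation(x, X):
    """Largest violation of the RLT constraints (5); <= 0 means all hold."""
    n = len(x)
    worst = -np.inf
    for i in range(n):
        for j in range(n):
            worst = max(worst,
                        -X[i, j],                       # x_ij >= 0
                        x[i] + x[j] - 1.0 - X[i, j],    # x_ij >= x_i + x_j - 1
                        X[i, j] - x[i],                 # x_ij <= x_i
                        X[i, j] - x[j])                 # x_ij <= x_j
    return worst

def triangle_violation(x, X):
    """Largest violation of the four triangle inequalities over all triples."""
    worst = -np.inf
    for i, j, k in itertools.combinations(range(len(x)), 3):
        worst = max(worst,
                    x[i] + x[j] + x[k] - X[i, j] - X[i, k] - X[j, k] - 1.0,
                    X[i, j] + X[i, k] - x[i] - X[j, k],
                    X[i, j] + X[j, k] - x[j] - X[i, k],
                    X[i, k] + X[j, k] - x[k] - X[i, j])
    return worst

# Any X = x x^T with 0 <= x <= e satisfies both families.
rng = np.random.default_rng(4)
x = rng.uniform(0.0, 1.0, size=5)
X = np.outer(x, x)
rlt_ok, tri_ok = rlt_violation(x, X), triangle_violation(x, X)

# A point that satisfies (5) but is cut off by the first triangle inequality.
x_bad = np.full(3, 0.5)
X_bad = 0.5 * np.eye(3)
rlt_bad, tri_bad = rlt_violation(x_bad, X_bad), triangle_violation(x_bad, X_bad)
```

The second test point shows that the triangle inequalities are not implied by (5) alone: with \(x=\frac{1}{2}e\) and zero off-diagonal entries, the first triangle inequality is violated by \(\frac{1}{2}\).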
We will compare \({\text{ QCQP}}_{\alpha {\mathrm{BB}}}\) with an approximation of \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) that imposes some of the above constraints on \(\mathcal{C }\). In particular, we will apply the semidefiniteness condition \(Y(x,X)\succeq 0\) together with the diagonal RLT constraints \({\text{ diag}}(X)\le x\). Note that these conditions together imply the original bound constraints \(0\le x\le e\). The resulting relaxation is
$$\begin{aligned} (\mathrm{QCQP}_\mathrm{SDP})\quad z_\mathrm{SDP}=&\min Q_0 \bullet X +c_0^Tx \\&\mathrm{s.t.} Q_i \bullet X +c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&Y(x,X)\succeq 0,\quad {\text{ diag}}(X)\le x. \end{aligned}$$
The following theorem shows that there is a simple relationship between the convexifications used to construct \({\text{ QCQP}}_{\alpha {\mathrm{BB}}}\) and \({\text{ QCQP}}_\mathrm{SDP}\).

Theorem 2

For \(0\le x\le e\), let \(f_\alpha (x)=x^T(Q+\mathrm{Diag}(\alpha ))x+(c-\alpha )^Tx\), where \(\alpha \ge 0\) and \(Q+\mathrm{Diag}(\alpha )\succeq 0\). Assume that \(Y(x,X)\succeq 0\), \(\mathrm{diag}(X)\le x\). Then \(f_\alpha (x)\le Q \bullet X+c^Tx\).

Proof

Let \(Q(\alpha )=Q+{\text{ Diag}}(\alpha )\). Since \(Q(\alpha )\succeq 0\),
$$\begin{aligned} f_\alpha (x)&= (c-\alpha )^Tx+\min _X\left\{ Q(\alpha ) \bullet X\,:\,X\succeq xx^T\right\} \\&= (c-\alpha )^Tx+\min _X\{ Q(\alpha ) \bullet X\,:\,Y(x,X)\succeq 0,\ {\text{ diag}}(X)\le x\}, \end{aligned}$$
the last because \({\text{ diag}}(X)\le x\) holds automatically for \(X=xx^T\), \(0\le x\le e\). But then \(Y(x,X)\succeq 0\) and \({\text{ diag}}(X)\le x\) imply that
$$\begin{aligned} f_\alpha (x)&\le Q(\alpha ) \bullet X+(c-\alpha )^Tx\\&= Q \bullet X+c^Tx +\alpha ^T({\text{ diag}}(X)-x)\\&\le Q \bullet X +c^Tx. \end{aligned}$$
\(\square \)
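Theorem 2 can be sanity-checked numerically without an SDP solver. In the Python sketch below (illustrative only, assuming NumPy; the data and the particular family of test matrices are hypothetical choices), each \(X\) has the form \(xx^T+{\text{ Diag}}(d)\) with \(0\le d_i\le x_i-x_i^2\), so that \(X-xx^T\succeq 0\), equivalently \(Y(x,X)\succeq 0\), and \({\text{ diag}}(X)\le x\) hold by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.standard_normal((n, n))
Q = (M + M.T) / 2                       # hypothetical indefinite Q
c = rng.standard_normal(n)
alpha = np.full(n, max(0.0, -np.linalg.eigvalsh(Q)[0]))   # Q + Diag(alpha) is PSD

gaps = []
for _ in range(200):
    x = rng.uniform(0.0, 1.0, n)
    d = rng.uniform(0.0, 1.0, n) * (x - x ** 2)           # 0 <= d_i <= x_i - x_i^2
    X = np.outer(x, x) + np.diag(d)                       # Y(x,X) PSD, diag(X) <= x
    f_alpha = x @ (Q + np.diag(alpha)) @ x + (c - alpha) @ x
    lifted = np.sum(Q * X) + c @ x                        # Q . X + c^T x
    gaps.append(lifted - f_alpha)

min_gap = min(gaps)                     # Theorem 2 predicts a nonnegative value
```

This only samples a special family of feasible \((x,X)\); it illustrates the inequality of Theorem 2 but is no substitute for the proof.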

The following immediate corollary of Theorem 2 confirms a relationship between \({\text{ QCQP}}_{\alpha {\mathrm{BB}}}\) and \({\text{ QCQP}}_\mathrm{SDP}\) first conjectured by Jeff Linderoth (private communication).

Corollary 2

Let \(z_{\alpha {\mathrm{BB}}}\) and \(z_\mathrm{SDP}\) denote the solution values in the convex relaxations \(\mathrm{QCQP}_{\alpha {\mathrm{BB}}}\) and \(\mathrm{QCQP}_\mathrm{SDP}\), respectively. Then \(z_{\alpha {\mathrm{BB}}}\le z_\mathrm{SDP}\).

Note that the example at the end of Sect. 2 has \(\mathcal{F }=\{ x_1\,:\, 0\le x_1\le 1\}\), \(q=1\). For this problem \((\alpha _1-1)x_1^2\) is convex for \(\alpha _1\ge 1\). Using \(\alpha _1=1\), the problem \({\text{ QCQP}}_{\alpha {\mathrm{BB}}}\) is identical to \(\widehat{\mathrm{QCQP}}\) and has solution value \(z_{\alpha {\mathrm{BB}}}=\hat{z}=\frac{1}{4}\). The problem \({\text{ QCQP}}_\mathrm{SDP}\) is identical to \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\), and has solution value \(z_\mathrm{SDP}=z^{*}=\frac{1}{2}\).

4 Two stronger relaxations

In this section we consider a convexification procedure for QCQP suggested in [22] that generalizes the \({\alpha {\mathrm{BB}}}\) procedure described in the previous section. Consider a quadratic function \(f(x)=x^TQ x+ c^Tx\), and let \(v_j\in \mathfrak R ^n\), \(j=1,\ldots ,k\). Let \(\mathcal{F }=\{x\ge 0\,:\,Ax\le b\}\), and assume that for \(x\in \mathcal{F }\) we have \(l_j\le v_j^Tx \le u_j\). It follows that for \(x\in \mathcal{F }\), \((v_j^Tx-l_j)(v_j^Tx-u_j)\le 0\), or \((v_j^Tx)^2-(l_j+u_j)v_j^Tx +l_ju_j \le 0\). For \(\alpha \in \mathfrak R ^k_+\), define
$$\begin{aligned} Q(\alpha )&= Q+\sum _{j=1}^k \alpha _j v_jv_j^T, \\ c(\alpha )&= c-\sum _{j=1}^k \alpha _j(l_j+u_j)v_j, \\ p(\alpha )&= \sum _{j=1}^k \alpha _j l_j u_j, \end{aligned}$$
and let \(f_\alpha (x)=x^TQ(\alpha ) x + c(\alpha )^Tx + p(\alpha )\). Then if \(Q(\alpha )\succeq 0,\; f_\alpha (\cdot )\) is a convex underestimator for \(f(\cdot )\) on \(\mathcal{F }\). In [22], functions of the form \(f_\alpha (\cdot )\) are referred to as D.C. (for “difference of convex”) underestimators, and are applied to convexify the objective in QCQP problems with linear and convex quadratic constraints. Note that the \({\alpha {\mathrm{BB}}}\) underestimator on \(0\le x\le e\) from the previous section corresponds to the case of \(v_j=e_j,\; l_j=0,\; u_j=1,\; j=1,\ldots ,n\). Additional possibilities for \(v_j\) suggested in [22] include eigenvectors corresponding to negative eigenvalues of \(Q\), and transposed rows of the constraint matrix \(A\). Using underestimators of the form \(f_\alpha (\cdot )\), we obtain a convex relaxation
$$\begin{aligned} (\mathrm{QCQP}_\mathrm{DC})\quad z_\mathrm{DC}=&\min x^TQ_0(\alpha _0)x+c_0(\alpha _0)^Tx+p(\alpha _0) \\&\mathrm{s.t.} x^TQ_i(\alpha _i)x+c_i(\alpha _i)^Tx +p(\alpha _i) \le d_i,\quad i=1,\ldots ,q\\&x\ge 0,\ Ax\le b, \end{aligned}$$
where each \(\alpha _i\in \mathfrak R ^k_+\) is chosen so that \(Q_i(\alpha _i)\succeq 0\).
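The construction of \(Q(\alpha )\), \(c(\alpha )\) and \(p(\alpha )\) can be sketched in a few lines of Python. The example is an illustration only and makes several assumptions that are not taken from the paper: NumPy is available, \(\mathcal{F }\) is the unit box, the \(v_j\) are eigenvectors for the negative eigenvalues of \(Q\) (one of the choices mentioned above), \(\alpha _j=-\lambda _j\), and the bounds \(l_j,u_j\) are the sums of the negative and positive entries of \(v_j\). The function name `dc_underestimator` is hypothetical.

```python
import numpy as np

def dc_underestimator(Q, c, V, l, u, alpha):
    """Return (Q(alpha), c(alpha), p(alpha)); column j of V is the vector v_j."""
    Qa = Q + (V * alpha) @ V.T            # Q + sum_j alpha_j v_j v_j^T
    ca = c - V @ (alpha * (l + u))        # c - sum_j alpha_j (l_j + u_j) v_j
    pa = float(alpha @ (l * u))           # sum_j alpha_j l_j u_j
    return Qa, ca, pa

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n))
Q = (M + M.T) / 2
c = rng.standard_normal(n)

lam, vecs = np.linalg.eigh(Q)
neg = lam < 0
V = vecs[:, neg]                          # eigenvectors for negative eigenvalues
alpha = -lam[neg]                         # cancels the negative part of Q
l = np.minimum(V, 0.0).sum(axis=0)        # min of v_j^T x over the unit box
u = np.maximum(V, 0.0).sum(axis=0)        # max of v_j^T x over the unit box

Qa, ca, pa = dc_underestimator(Q, c, V, l, u, alpha)
pts = rng.uniform(0.0, 1.0, size=(1000, n))
max_excess = max((x @ Qa @ x + ca @ x + pa) - (x @ Q @ x + c @ x) for x in pts)
min_eig = np.linalg.eigvalsh(Qa)[0]
```

With these choices \(Q(\alpha )\) is the positive semidefinite part of \(Q\), and \(f_\alpha (x)-f(x)=\sum _j \alpha _j(v_j^Tx-l_j)(v_j^Tx-u_j)\le 0\) on the box.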
We will compare \({\text{ QCQP}}_\mathrm{DC}\) to a relaxation of QCQP that combines the semidefiniteness condition \(Y(x,X)\succeq 0\) with the RLT constraints on \((x,X)\) that can be obtained from the original linear constraints \(x\ge 0,\; Ax\le b\). The RLT constraints can be described very succinctly using the matrix \(Y^+(x,X)\) from (2); in fact it is easy to see that these constraints correspond exactly to \(X\ge 0,\; S(x,X)\ge 0,\; Z(x,X)\ge 0\), where \(S(\cdot ,\cdot )\) and \(Z(\cdot ,\cdot )\) are given in (3). It follows that the RLT constraints and the condition that \(Y(x,X)\succeq 0\) together are equivalent to \(Y^+(x,X)\) being a doubly nonnegative (DNN) matrix, that is, a symmetric matrix that is both positive semidefinite and componentwise nonnegative. We therefore define the relaxation
$$\begin{aligned} (\mathrm{QCQP}_\mathrm{DNN})\quad z_\mathrm{DNN}=&\min Q_0 \bullet X +c_0^Tx \\&\mathrm{s.t.} Q_i \bullet X +c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&Y^+(x,X)\in {\mathcal{DNN }}_{m+n+1}, \end{aligned}$$
where \({\mathcal{DNN }}_k\) is the cone of \(k\times k\) doubly nonnegative matrices. Note that the relaxation \({\text{ QCQP}}_\mathrm{DNN}\) is entirely determined by the data from the original problem QCQP; in particular, \({\text{ QCQP}}_\mathrm{DNN}\) does not involve the vectors \(v_j\) and bounds \((l_j,u_j)\) used to construct the convexifications in \({\text{ QCQP}}_\mathrm{DC}\). Also, from (4) it is clear that \({\text{ QCQP}}_\mathrm{DNN}\) can be viewed as a relaxation obtained from \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) by replacing the constraint \(Y^+(x,X)\in \mathcal{CP }_{m+n+1}\) with the weaker condition \(Y^+(x,X)\in {\mathcal{DNN }}_{m+n+1}\).
In order to compare \({\text{ QCQP}}_\mathrm{DC}\) and \({\text{ QCQP}}_\mathrm{DNN}\) we require a generalization of Theorem 2 that applies to the convexification \(f_\alpha (\cdot )\) used in this section. This result naturally involves the RLT constraints
$$\begin{aligned} v_j^TXv_j-(l_j+u_j)v_j^T x \le -l_ju_j,\quad j=1,\ldots ,k, \end{aligned}$$
(6)
that are obtained from \(l_j\le v_j^Tx\le u_j\), \(j=1,\ldots ,k\).

Theorem 3

For \(x\in \mathcal{F }\), let \(f_\alpha (x)=x^TQ(\alpha )x+c(\alpha )^Tx+p(\alpha )\), where \(\alpha \ge 0\) and \(Q(\alpha )\succeq 0\). Assume that \(Y(x,X)\succeq 0\) and \((x,X)\) satisfy (6). Then \(f_\alpha (x)\le Q \bullet X+c^Tx\).

Proof

The proof is similar to that of Theorem 2. Since \(Q(\alpha )\succeq 0\),
$$\begin{aligned} f_\alpha (x)&= c(\alpha )^Tx+p(\alpha ) +\min _X\left\{ Q(\alpha ) \bullet X\,:\,X\succeq xx^T\right\} \\&= c(\alpha )^Tx+p(\alpha )+\min _X\{ Q(\alpha ) \bullet X\,:\,Y(x,X)\succeq 0,\ (x,X)\ \text{ satisfy } (6) \}, \end{aligned}$$
the last because (6) are satisfied for any \(X=xx^T\), \(x\in \mathcal{F }\). But then if \(Y(x,X)\succeq 0\) and \((x,X)\) satisfy (6),
$$\begin{aligned} f_\alpha (x)&\le p(\alpha )+Q(\alpha ) \bullet X+c(\alpha )^Tx\\&\le p(\alpha ) + Q \bullet X +c^Tx -p(\alpha )\\&= Q \bullet X +c^Tx, \end{aligned}$$
where the second inequality uses (6). \(\square \)
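For completeness, the second inequality in the proof of Theorem 3 can be written out as follows. This is only an expansion of the definitions of \(Q(\alpha )\), \(c(\alpha )\) and \(p(\alpha )\), using \(v_jv_j^T\bullet X=v_j^TXv_j\), \(\alpha \ge 0\) and (6):
$$\begin{aligned} Q(\alpha ) \bullet X+c(\alpha )^Tx&= Q \bullet X+c^Tx+\sum _{j=1}^k \alpha _j\left(v_j^TXv_j-(l_j+u_j)v_j^Tx\right)\\&\le Q \bullet X+c^Tx-\sum _{j=1}^k \alpha _j l_ju_j\\&= Q \bullet X+c^Tx-p(\alpha ). \end{aligned}$$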

Theorem 4

Let \(z_\mathrm{DC}\) and \(z_\mathrm{DNN}\) denote the solution values in the convex relaxations \(\mathrm{QCQP}_\mathrm{DC}\) and \(\mathrm{QCQP}_\mathrm{DNN}\), respectively. Then \(z_\mathrm{DC}\le z_\mathrm{DNN}\).

Proof

Consider the convex relaxation
$$\begin{aligned} z_V=&\min Q_0 \bullet X +c_0^Tx \\&\mathrm{s.t.} Q_i \bullet X +c_i^Tx \le d_i,\quad i=1,\ldots ,q\\&v_j^TXv_j-(l_j+u_j)v_j^T x \le -l_ju_j,\quad j=1,\ldots ,k\\&x\ge 0,\quad Ax\le b,\quad Y(x,X)\succeq 0. \end{aligned}$$
By Theorem 3 we immediately have \(z_\mathrm{DC}\le z_V\). However, the constraints \(l_j\le v_j^Tx\le u_j\) are implied by the original constraints \(x\ge 0\), \(Ax\le b\), and therefore by [16, Proposition 8.2], the constraints (6) are implied by the RLT constraints \(X\ge 0\), \(S(x,X)\ge 0\), \(Z(x,X)\ge 0\). It follows that \(z_V\le z_\mathrm{DNN}\). \(\square \)

In [22] it is shown that if all of the quadratic constraints of QCQP are convex, then for a given set of \(\{ v_j \}_{j=1}^k\) the problem of choosing the vector \(\alpha _0\) that gives the best value of \(z_\mathrm{DC}\) can be formulated as a semidefinite programming problem. Theorem 4 states that regardless of the vectors \(\{v_j\}_{j=1}^k\) and \(\{\alpha _i\}_{i=0}^q\) used to construct the convexifications in \({\text{ QCQP}}_\mathrm{DC}\), the resulting lower bound \(z_\mathrm{DC}\) cannot be better than the bound \(z_\mathrm{DNN}\) obtained from \({\text{ QCQP}}_\mathrm{DNN}\) when the upper and lower bounds \(l\) and \(u\) correspond to the feasible set for the linear constraints \(\mathcal{F }\). However, in the presence of convex quadratic constraints, better values of \(l_j\) and/or \(u_j\) can be obtained by minimizing or maximizing \(v_j^Tx\) over the set \(\mathcal{S }\) corresponding to the feasible region for the linear and convex quadratic constraints, as suggested in [22], and in this case Theorem 4 would no longer apply unless the strengthened linear inequalities were explicitly added to \(\mathcal{F }\). Of course obtaining such improved bounds could entail substantial auxiliary computation. A different approach for utilizing convex quadratic constraints to obtain improved RLT bounds based on the second-order cone representation of the constraints is suggested in [7, Section 2.3].

5 Applications

In this section we describe applications of the convexifications described above to two particular classes of QCQP problems considered in [2]. The first application is to box-constrained indefinite QP problems of the form
$$\begin{aligned} \mathrm{(QPB)}\ z^{*}=&\min x^TQ_0x+c_0^Tx\\&\mathrm{s.t.} 0\le x\le e, \end{aligned}$$
corresponding to the general QCQP problem of Sect. 1 with \(q=0\) and \(\mathcal{F }=\{ x\,:\,0\le x\le e \}\). Let \(\hat{z}\) and \(\tilde{z}\) be solution values for the corresponding problems \(\widehat{\mathrm{QCQP}}\) and \(\mathrm{Q}\widetilde{\mathrm{CQ}}\mathrm{P}\) described in Sect. 2. It is then obvious from the definition of \(\mathcal{C }\) that \(\tilde{z}=z^{*}\), and \(\hat{z}=\tilde{z}\) follows immediately from Theorem 1, so a full description of either the convex lower envelope \(\hat{f}_0(\cdot )\) or the convex hull \(\mathcal{C }\) would provide an exact solution of QPB. Several valid classes of constraints for \(\mathcal{C }\) for the case that \(\mathcal{F }=\{ x\,:\,0\le x\le e \}\) were described in Sect. 3. The relaxation \({\text{ QCQP}}_\mathrm{SDP}\), corresponding to imposing the semidefiniteness condition \(Y(x,X)\succeq 0\) along with the diagonal RLT constraints \({\text{ diag}}(X)\le x\), was computationally evaluated on a set of 15 QPB test problems with \(n=30\) in [2]. The results of [2] show that the bound \(z_\mathrm{SDP}\) on these problems is much better than a bound based on imposing the RLT constraints (5) on \(Y(x,X)\), and the bound \(z_\mathrm{DNN}\) based on imposing both semidefiniteness and the RLT constraints is much better still. (For the 15 problems considered, using semidefiniteness and the RLT constraints together closed the gap to zero, up to numerical tolerances, on 8 problems and left an average gap of 0.88 % on the remaining 7 problems.)
The QPB test problems used in [2] are from a larger set of 54 problems with \(n=20\), 30, 40, 50 and 60 that were solved using the finite branch-and-bound algorithm of [8]; 50 of these problems were previously solved using the finite branch-and-bound algorithm of [20]. (The computational results in [20] omit the problems 50-050-1/2/3, and the problem 40-100-3 was unsolved.) In Table 1 we report the results of applying several increasingly tight approximations of \(\mathcal{C }\) on the full set of 54 problems. The column labeled “SDP” gives the gap to optimality for the bound \(z_\mathrm{SDP}\), and the column labeled “SDP+RLT” gives the gap for the bound \(z_\mathrm{DNN}\) that imposes both semidefiniteness and the RLT constraints on \(Y(x,X)\). For 29 of the 54 problems, the SDP+RLT bound is exact up to the numerical tolerances used by the SeDuMi solver [17]; for these problems the solution matrix \(Y(x,X)\) is numerically rank-one. For the remaining 25 problems we consider adding triangle inequalities coming from the Boolean Quadric Polytope [6, 21]. For 24 of these 25 problems, adding triangle inequalities closes the gap to zero up to numerical tolerances; a positive gap remains only for problem 50-050-1, with a gap of 0.144 %.\(^{3}\) In the “Cuts Added” columns we report the number of RLT cuts required for problems solved to optimality using only added RLT constraints, or the number of RLT cuts and triangle (TRI) inequalities added for problems that could not be solved using RLT cuts alone. In both cases, violated constraints were added in several “rounds” with a decreasing infeasibility tolerance to avoid adding a large number of redundant inequalities, which would substantially degrade the performance of the solver.
Table 1 Comparison of bounds for indefinite QPB

| Problem | Optimum | Cuts added: RLT | Cuts added: TRI | Relative gap to optimum: SDP (%) | Relative gap to optimum: SDP + RLT (%) | Relative gap to optimum: SDP + RLT + TRI (%) |
|---|---|---|---|---|---|---|
| 20-100-1 | 706.50 | 197 | 55 | 4.655 | 0.002 | 0.000 |
| 20-100-2 | 856.50 | 184 | 172 | 5.102 | 0.171 | 0.000 |
| 20-100-3 | 772.00 | 168 |  | 1.750 | 0.000 |  |
| 30-060-1 | 706.00 | 371 | 777 | 8.799 | 1.229 | 0.000 |
| 30-060-2 | 1,377.17 | 381 |  | 3.614 | 0.000 |  |
| 30-060-3 | 1,293.50 | 394 | 288 | 5.924 | 0.368 | 0.000 |
| 30-070-1 | 654.00 | 369 | 784 | 14.133 | 3.058 | 0.000 |
| 30-070-2 | 1,313.00 | 449 |  | 4.727 | 0.000 |  |
| 30-070-3 | 1,657.40 | 452 | 442 | 3.763 | 0.010 | 0.000 |
| 30-080-1 | 952.73 | 365 | 718 | 10.290 | 1.315 | 0.000 |
| 30-080-2 | 1,597.00 | 376 |  | 1.616 | 0.000 |  |
| 30-080-3 | 1,809.78 | 317 |  | 1.492 | 0.000 |  |
| 30-090-1 | 1,296.50 | 370 |  | 4.009 | 0.000 |  |
| 30-090-2 | 1,466.84 | 344 |  | 4.160 | 0.000 |  |
| 30-090-3 | 1,494.00 | 420 |  | 1.527 | 0.000 |  |
| 30-100-1 | 1,227.13 | 356 |  | 4.777 | 0.000 |  |
| 30-100-2 | 1,260.50 | 427 | 465 | 8.316 | 0.048 | 0.000 |
| 30-100-3 | 1,511.05 | 377 | 265 | 6.622 | 0.139 | 0.000 |
| 40-030-1 | 839.50 | 656 |  | 4.419 | 0.000 |  |
| 40-030-2 | 1,429.00 | 889 |  | 4.747 | 0.000 |  |
| 40-030-3 | 1,086.00 | 705 |  | 6.494 | 0.000 |  |
| 40-040-1 | 837.00 | 710 | 1,966 | 14.228 | 3.117 | 0.000 |
| 40-040-2 | 1,428.00 | 600 |  | 1.718 | 0.000 |  |
| 40-040-3 | 1,173.50 | 745 | 1,427 | 8.209 | 0.626 | 0.000 |
| 40-050-1 | 1,154.50 | 797 | 1,608 | 10.592 | 0.515 | 0.000 |
| 40-050-2 | 1,430.98 | 788 | 961 | 6.047 | 0.354 | 0.000 |
| 40-050-3 | 1,653.63 | 680 |  | 5.665 | 0.000 |  |
| 40-060-1 | 1,322.67 | 696 | 1,722 | 12.043 | 2.287 | 0.000 |
| 40-060-2 | 2,004.23 | 739 |  | 4.758 | 0.000 |  |
| 40-060-3 | 2,454.50 | 701 |  | 2.207 | 0.000 |  |
| 40-070-1 | 1,605.00 | 584 |  | 3.675 | 0.000 |  |
| 40-070-2 | 1,867.50 | 650 |  | 3.418 | 0.000 |  |
| 40-070-3 | 2,436.50 | 828 |  | 3.538 | 0.000 |  |
| 40-080-1 | 1,838.50 | 615 |  | 5.312 | 0.000 |  |
| 40-080-2 | 1,952.50 | 639 |  | 3.094 | 0.000 |  |
| 40-080-3 | 2,545.50 | 755 | 742 | 3.647 | 0.015 | 0.000 |
| 40-090-1 | 2,135.50 | 763 |  | 5.948 | 0.000 |  |
| 40-090-2 | 2,113.00 | 731 | 336 | 7.376 | 0.035 | 0.000 |
| 40-090-3 | 2,535.00 | 598 |  | 2.338 | 0.000 |  |
| 40-100-1 | 2,476.38 | 673 |  | 3.265 | 0.000 |  |
| 40-100-2 | 2,102.50 | 707 | 1,251 | 5.428 | 0.184 | 0.000 |

40-100-3

1,866.07

664

1,732

9.176

2.257

0.000

50-030-1

1,324.50

903

 

4.877

0.000

 

50-030-2

1,668.00

831

233

5.257

0.200

0.000

50-030-3

1,453.61

830

180

7.715

0.087

0.000

50-040-1

1,411.00

1,017

 

5.103

0.000

 

50-040-2

1,745.76

868

509

7.766

0.212

0.000

50-040-3

2,094.50

1,081

 

3.938

0.000

 

50-050-1

1,198.41

723

1,531

18.304

8.664

0.144

50-050-2

1,776.00

867

667

9.377

0.765

0.000

50-050-3

2,106.10

937

933

7.689

0.752

0.000

60-020-1

1,212.00

1,199

 

7.048

0.000

 

60-020-2

1,925.50

1,319

 

4.418

0.000

 

60-020-3

1,483.00

1,040

735

8.200

0.543

0.000

Average

 

 

 

5.969

0.499

 

The results reported in Table 1 suggest that on QPB problems of these dimensions, the approach based on approximating \(\mathcal{C }\) is highly competitive with other methodologies. The solution process for individual problems in [20] required the solution of up to approximately 28,000 linear programs, with a total of up to approximately 500,000 cuts generated. (The root gaps, after the addition of cuts, ranged from 12.0 % to 168.5 %, with a mean of 66.2 %.) The SDP relaxations used in [8] substantially reduce the amount of enumeration compared to the algorithm of [20], but still required up to \(10^4\) CPU seconds on a 2.7 GHz Linux-based computer to solve individual problems. Results for the general-purpose global optimization solver BARON [15] on these problems were also reported in [20]. Of the 51 problems considered, BARON was unable to solve 21 problems within 4,000 CPU seconds on a 1.8 GHz Linux-based computer, and the problems that were solved required approximately 20 times more computation than that required using the algorithm of [20] running on a slower machine. Good results using a methodology similar to that applied here for indefinite QPB problems of similar dimensions were previously reported in [21]. Yajima and Fujie [21] consider additional valid inequalities for the BQP beyond the triangle inequalities, but only approximate the semidefiniteness condition \(Y(x,X)\succeq 0\) by adding linear inequalities. The advantage of using linear inequalities is that the resulting relaxations are ordinary linear programming (LP) problems that can be solved using an LP solver, as opposed to the conic solver required when \(Y(x,X)\succeq 0\) is directly imposed.

The second example of QCQP that we consider is a circle-packing problem in the plane: for a given \(n\ge 2\), find the maximum radius of \(n\) non-overlapping circles that all lie in the unit box \(0\le x_i\le 1,\; 0\le y_i\le 1\), \(i= 1,\ldots ,n\). This geometric problem has been extensively studied in the global optimization literature [11, 18]. Via a well-known transformation the problem is equivalent to the “point packing” problem
$$\begin{aligned} \mathrm{PP:}\quad&\max \;\theta \\&\mathrm{s.t.} (x_i-x_j)^2+(y_i-y_j)^2 \ge \theta ,\quad 1\le i<j\le n\\&0\le x\le e,\quad 0\le y\le e. \end{aligned}$$
Obviously PP corresponds to an instance of QCQP with a linear objective and constraints of the form \(f_{ij}(x,y,\theta )\le 0\), where
$$\begin{aligned} f_{ij}(x,y,\theta )=-(x_i-x_j)^2-(y_i-y_j)^2 +\theta ,\quad 1\le i<j\le n. \end{aligned}$$
Note that these are all “reverse convex” constraints; i.e. each \(f_{ij}(\cdot ,\cdot ,\cdot )\) is a concave quadratic function. The variable \(\theta \) represents the minimum squared distance separating \(n\) points in the unit square; the corresponding radius for \(n\) circles that can be packed into the unit square is \(\sqrt{\theta }/[2(1+\sqrt{\theta })]\).
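The conversion between point packing and circle packing can be checked numerically. The sketch below is illustrative only; the function names are ours and do not come from [2] or from the sources cited above. It evaluates \(\theta \) for a given point configuration and converts it to the corresponding circle radius.

```python
import math
from itertools import combinations


def min_squared_distance(points):
    """Value of theta for a configuration: the smallest squared pairwise distance."""
    return min(
        (xi - xj) ** 2 + (yi - yj) ** 2
        for (xi, yi), (xj, yj) in combinations(points, 2)
    )


def radius_from_theta(theta):
    """Radius of n equal circles in the unit square given the point-packing value theta."""
    d = math.sqrt(theta)  # minimum distance between points
    return d / (2.0 * (1.0 + d))


def pp_constraint(p, q, theta):
    """f_ij(x, y, theta) for points p and q; the point pair is feasible when this is <= 0."""
    return -(p[0] - q[0]) ** 2 - (p[1] - q[1]) ** 2 + theta
```

For example, four points at the corners of the unit square give \(\theta =1\) and a radius of \(1/4\), which is the familiar packing of four circles in a square.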

In [2], bounds for the solution value of PP were computed using several combinations of semidefiniteness and RLT constraints. Note that since PP involves no terms of the form \(x_iy_j\), all SDP and RLT constraints can be based on matrices \(X\) and \(Y\) relaxing \(xx^T\) and \(yy^T\), respectively. In addition, it is clear that by symmetry one can assume that \(.5\le x_i\le 1,\; i=1,\ldots , n_x\) and \(.5\le y_i\le 1,\; i=1,\ldots , n_y\) where \(n_x=\lceil n/2 \rceil \) and \(n_y=\lceil n_x/2 \rceil \). We use “SYM” to refer to any problem formulation that uses these more restricted bounds. (Section 5 of [2] considers more elaborate symmetry-breaking using order constraints, but we omit discussion of this topic here.) The computational results obtained in [2] using the SDP, RLT and SYM conditions are summarized in Conjecture 1. (As in Sect. 3, the SDP relaxation includes the diagonal constraints \({\text{ diag}}(X)\le x\) and \({\text{ diag}}(Y)\le y\).) As described in [2], these findings are stated as a conjecture since the solution values given were numerically obtained for instances of size \(n\le 50\).
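As a small illustration of the SYM bounds, the following sketch builds the variable bounds for a given \(n\). The function name and the return layout are hypothetical; only the formulas \(n_x=\lceil n/2 \rceil \) and \(n_y=\lceil n_x/2 \rceil \) are taken from the text.

```python
def sym_bounds(n):
    """Return (n_x, n_y, x_bounds, y_bounds) for the SYM formulation of PP.

    The first n_x x-coordinates and the first n_y y-coordinates are
    restricted to [0.5, 1]; all other coordinates keep the bounds [0, 1].
    """
    n_x = (n + 1) // 2    # ceil(n / 2) in integer arithmetic
    n_y = (n_x + 1) // 2  # ceil(n_x / 2)
    x_bounds = [(0.5, 1.0) if i < n_x else (0.0, 1.0) for i in range(n)]
    y_bounds = [(0.5, 1.0) if i < n_y else (0.0, 1.0) for i in range(n)]
    return n_x, n_y, x_bounds, y_bounds
```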

Conjecture 1

[2] For \(n\ge 2\) consider the RLT and SDP relaxations of PP. Then:
  1. The optimal value for the RLT relaxation is 2.

  2. The optimal value for the SDP relaxation is \(1+\frac{1}{n-1}\) and adding the RLT constraints does not change this value.

  3. For \(n\ge 5\) the optimal value for the RLT+SYM relaxation is \(\frac{1}{2}\).

  4. For \(n\ge 5\) the optimal value for the SDP+SYM relaxation is \(\frac{1}{4}\left(1+\frac{1}{\lfloor (n-1)/4 \rfloor }\right)\).

Note that the RLT bound of 2.0 is “worst possible” in that this is the maximum squared distance between two points in the unit square. In Fig. 2 we illustrate the various bounds described in Conjecture 1 for \(2\le n\le 30\). (Fig. 2 gives the square roots of the solution values for the various relaxations, corresponding to bounds on the minimum distance between two points.) The “MAX” values correspond to high-precision estimates for the exact optimal values of PP obtained by verified computing techniques [11].
https://static-content.springer.com/image/art%3A10.1007%2Fs10107-012-0602-3/MediaObjects/10107_2012_602_Fig2_HTML.gif
Fig. 2 Bounds on distance from relaxations of PP
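The closed-form values in Conjecture 1 are easy to tabulate. The sketch below is a convenience for reproducing the relaxation curves of the kind plotted in Fig. 2, not code from [2]; the function names are ours. The values are conjectured, having been observed numerically for \(n\le 50\), and the “MAX” values are not computed here.

```python
import math


def conjectured_bounds(n):
    """Conjectured optimal values (squared distances) of the relaxations of PP."""
    if n < 2:
        raise ValueError("Conjecture 1 is stated for n >= 2")
    bounds = {
        "RLT": 2.0,
        "SDP": 1.0 + 1.0 / (n - 1),
    }
    if n >= 5:
        # The SYM values are only conjectured for n >= 5.
        bounds["RLT+SYM"] = 0.5
        bounds["SDP+SYM"] = 0.25 * (1.0 + 1.0 / ((n - 1) // 4))
    return bounds


def as_distances(bounds):
    """Square roots of the bounds, i.e. bounds on the minimum distance."""
    return {name: math.sqrt(value) for name, value in bounds.items()}
```

For \(n=9\), for instance, the conjectured SDP value is \(1.125\) and the conjectured SDP+SYM value is \(0.375\).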

Our interest here is to demonstrate a relationship between the bounds described in Conjecture 1 and bounds that correspond to replacing the quadratic constraints \(f_{ij}(x,y,\theta )\le 0\) with their convex lower envelopes. To do this we will utilize a specialization of Theorem 1 that applies when \(\mathcal{F }=\{ x\,:\,0\le x\le e \}\) and \(f(\cdot )\) is concave.

Following the notation of [6], let \({\mathcal{BQP }}_n\) denote the Boolean Quadric Polytope [12]
$$\begin{aligned} {\mathcal{BQP }}_n={\text{ Co}}\{ (x,\{ y_{ij} \}_{1\le i<j\le n}) \,:\,x\in \{ 0,1 \}^n, y_{ij}=x_ix_j, 1\le i<j\le n \}. \end{aligned}$$
The definition of \({\mathcal{BQP }}_n\) avoids duplication of variables due to the symmetry of \(xx^T\) and the fact that \({\text{ diag}}(xx^T)=x\) for binary \(x\). For \(x\in \mathfrak R ^n\), \(X\in \mathfrak R ^{n\times n}\) it is then convenient to define the projection operator
$$\begin{aligned} {\text{ proj}}(x,X)=(x,\{ x_{ij} \}_{1\le i<j\le n}) \end{aligned}$$
that deletes the components of \(X\) on and below the diagonal. Finally, define the convex set
$$\begin{aligned} \mathcal{B }_n=\{ (x,X)\,:\,{\text{ proj}}(x,X)\in {\mathcal{BQP }}_n,\ 0\le {\text{ diag}}(X)\le x \}. \end{aligned}$$
We remark that the lower bounds \(0\le {\text{ diag}}(X)\) are not actually required in the sequel, but we prefer to include them so as to make \(\mathcal{B }_n\) bounded.

Theorem 5

Let \(\mathcal{F }=\{ x\,:\,0\le x\le e \}\). For \(x\in \mathcal{F }\), let \(f(x)=x^TQx+c^Tx\), where \(\mathrm{diag}(Q)\le 0\), and let \(\hat{f}(\cdot )\) be the convex lower envelope of \(f(\cdot )\) on \(\mathcal{F }\). Then \(\hat{f}(x)=c^Tx+{\displaystyle \min \nolimits _X}\{Q \bullet X\,:\,(x,X)\in \mathcal{B }_n\}\).

Proof

The proof is similar to that of Theorem 1, but since several steps require modifications we include the details. For \(x\in \mathcal{F }\), let \(g(x)=c^Tx+{\displaystyle \min \nolimits _X}\{Q \bullet X\,:\,(x,X)\in \mathcal{B }_n\}\). Our goal is to show that \(\hat{f}(x)=g(x)\). To do this we first show that \(g(\cdot )\) is a convex function with \(g(x)\le f(x)\), \(x\in \mathcal{F }\), implying that \(g(x)\le \hat{f}(x)\).

Assume that for \(i\in \{1,2\}\), \(x^i\in \mathcal{F }\) and \(g(x^i)=Q \bullet X^i +c^Tx^i\), where \((x^i,X^i)\in \mathcal{B }_n\). For \(0\le \lambda \le 1\), let
$$\begin{aligned} x(\lambda )=\lambda x^1 + (1-\lambda )x^2,\quad X(\lambda )=\lambda X^1 + (1-\lambda )X^2. \end{aligned}$$
Then \((x(\lambda ),X(\lambda ))\in \mathcal{B }_n\), since \(\mathcal{B }_n\) is convex. It follows that
$$\begin{aligned} g(x(\lambda ))\le Q \bullet X(\lambda )+c^Tx(\lambda )=\lambda g(x^1)+(1-\lambda ) g(x^2), \end{aligned}$$
proving that \(g(\cdot )\) is convex on \(\mathcal{F }\). It is shown in [6, Proposition 5] that if \(x\in \mathcal{F }\), then \({\text{ proj}}(x,xx^T)\in {\mathcal{BQP }}_n\), and \(0\le {\text{ diag}}(xx^T)\le x\) for \(x\in \mathcal{F }\). It follows that \((x,xx^T)\in \mathcal{B }_n\) for any \(x\in \mathcal{F }\), and therefore \(g(x)\le Q \bullet xx^T+c^Tx=f(x)\).
It remains to show that \(\hat{f}(x)\le g(x)\). Assume that \(g(x)=Q\bullet X +c^Tx\), where \((x,X)\in \mathcal{B }_n\). From the definition of \(\mathcal{B }_n\), there exist \(x^i\in \{ 0,1 \}^n\), and \(\lambda _i\ge 0\), \(i=1,\ldots ,k\), \(\sum _{i=1}^k \lambda _i=1\) such that
$$\begin{aligned} {\text{ proj}}\bigg (\sum _{i=1}^k \lambda _i x^i,\sum _{i=1}^k \lambda _i x^i(x^i)^T\bigg ) ={\text{ proj}}(x,X). \end{aligned}$$
Define
$$\begin{aligned} \bar{X}=\sum _{i=1}^k \lambda _i x^i(x^i)^T. \end{aligned}$$
From the definition of \(\mathcal{B }_n\) we then have
$$\begin{aligned} X_{ij}&= \bar{X}_{ij},\quad \quad \quad \,\;i\ne j\\ 0\le X_{ii}&\le \bar{X}_{ii}=x_i,\quad i=1,\ldots , n. \end{aligned}$$
Therefore
$$\begin{aligned} g(x)&= Q \bullet X+c^Tx\\&= Q \bullet \bar{X}+c^Tx +\sum _{i=1}^n q_{ii}(X_{ii}-x_i)\\&\ge Q \bullet \bar{X}+c^Tx\\&= \sum _{i=1}^k \lambda _i f(x^i). \end{aligned}$$
But \(\hat{f}(\cdot )\) is convex on \(\mathcal{F }\), and \(\hat{f}(x)\le f(x)\) for all \(x\in \mathcal{F }\), so
$$\begin{aligned} \hat{f}(x)=\hat{f}\left(\sum _{i=1}^k \lambda _i x^i\right)\le \sum _{i=1}^k \lambda _i \hat{f}(x^i)\le \sum _{i=1}^k \lambda _i f(x^i)\le g(x). \end{aligned}$$
\(\square \)
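As a one-dimensional illustration of Theorem 5 (a worked example added here, not part of the original argument), take \(n=1\) and \(f(x)=qx^2+cx\) with \(q\le 0\). There are no off-diagonal terms, so \({\mathcal{BQP }}_1=[0,1]\) and \(\mathcal{B }_1=\{ (x,X)\,:\,0\le x\le 1,\ 0\le X\le x \}\). Since \(q\le 0\), the minimum of \(qX\) over \(0\le X\le x\) is attained at \(X=x\), and Theorem 5 gives
$$\begin{aligned} \hat{f}(x)=cx+\min _X\{ qX\,:\,0\le X\le x \}=(c+q)x,\quad 0\le x\le 1, \end{aligned}$$
which is the chord joining \((0,f(0))\) and \((1,f(1))\), as expected for the convex lower envelope of a concave function on an interval.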
See [6, Proposition 9] for a result closely related to Theorem 5. Using Theorem 5 we can prove an interesting relationship between bounds for PP obtained using convex lower envelopes of the constraints versus bounds obtained using RLT constraints on \((x,X)\) and \((y,Y)\). Slightly abusing notation, we can write the constraint functions for PP in the form
$$\begin{aligned} f_{ij}(x,y,\theta )=f_{ij}(x)+f_{ij}(y)+\theta , \end{aligned}$$
where \(f_{ij}(x)=-(x_i-x_j)^2\), and therefore the convex lower envelope can be written in the form
$$\begin{aligned} \hat{f}_{ij}(x,y,\theta )=\hat{f}_{ij}(x)+\hat{f}_{ij}(y)+\theta . \end{aligned}$$

Theorem 6

For \(\mathcal{F }=\{ (x,y)\,:\,0\le x\le e,\, 0\le y\le e \}\), let \(\hat{z}\) be the solution value for the relaxation of PP obtained by replacing the constraint functions with their convex lower envelopes on \(\mathcal{F }\), and let \(z_\mathrm{RLT}\) be the solution value for the relaxation that imposes the RLT constraints on \((x,X)\) and \((y,Y)\). Then \(\hat{z}\ge z_\mathrm{RLT}\). Moreover this relationship continues to hold if \(\mathcal{F }\) is replaced by the tighter SYM bounds.

Proof

By Theorem 5,
$$\begin{aligned} \hat{f}_{ij}(x)=\min _X\{ 2x_{ij}-x_{ii}-x_{jj}\,:\,((x_i,x_j),X_{[i,j]}) \in \mathcal{B }_2 \}, \end{aligned}$$
where \(X_{[i,j]}\) is the principal submatrix of \(X\) corresponding to row and column indices \(i\) and \(j\). However, \({\mathcal{BQP }}_2\) is completely characterized by the RLT inequalities on \(x_{ij}\) [12], and the additional constraints \(0\le x_{ii}\le x_i\), \(0\le x_{jj}\le x_j\) of \(\mathcal{B }_2\) are RLT constraints on the diagonal elements of \(X\). The result immediately follows. When applying the tighter SYM bounds, we can apply an affine transformation to the variables to re-write the problem in terms of transformed variables \((x^\prime ,y^\prime )\) with \(0\le x^\prime \le e\), \(0\le y^\prime \le e\), and use the fact that the convex lower envelopes and RLT constraints [16, Proposition 8.4] are both invariant with respect to affine transformations of the variables. \(\square \)
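The minimization over \(\mathcal{B }_2\) in the proof can be solved by inspection, assuming the RLT description of \({\mathcal{BQP }}_2\) is the standard one, \(\max \{ 0,x_i+x_j-1 \}\le x_{ij}\le \min \{ x_i,x_j \}\). The objective \(2x_{ij}-x_{ii}-x_{jj}\) is smallest when \(x_{ij}\) sits at its lower bound and \(x_{ii}=x_i\), \(x_{jj}=x_j\), which suggests \(\hat{f}_{ij}(x)=2\max \{ 0,x_i+x_j-1 \}-x_i-x_j\) on the unit box. The following Python sketch (function names are ours) evaluates this expression and checks on a grid that it underestimates \(f_{ij}(x)=-(x_i-x_j)^2\).

```python
def f_pair(xi, xj):
    """Concave term f_ij(x) = -(x_i - x_j)^2 from the PP constraints."""
    return -(xi - xj) ** 2


def envelope_pair(xi, xj):
    """Convex lower envelope of f_pair on [0, 1]^2 via the minimization over B_2.

    Assumes the standard RLT bounds on x_ij; the minimizer takes x_ij at its
    lower bound and the diagonal entries at their upper bounds x_i and x_j.
    """
    x_ij = max(0.0, xi + xj - 1.0)
    return 2.0 * x_ij - xi - xj


def check_underestimator(steps=20, tol=1e-12):
    """True if envelope_pair <= f_pair at every point of a uniform grid."""
    grid = [k / steps for k in range(steps + 1)]
    return all(
        envelope_pair(a, b) <= f_pair(a, b) + tol for a in grid for b in grid
    )
```

At the center of the box the envelope equals \(-1\) while \(f_{ij}=0\), which shows how weak the envelope of a single reverse convex constraint can be away from the vertices.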

Since the RLT constraints on \((x,X)\) and \((y,Y)\) are already sufficient to characterize the convex lower envelopes of the quadratic constraints in PP, it would be natural to speculate that adding the semidefiniteness conditions \(X\succeq xx^T\) and \(Y\succeq yy^T\) would have no effect on bounds for the solution value. The values given in Conjecture 1 show that this is not the case. Note, however, that each convex lower envelope \(\hat{f}_{ij}(x)\) requires only values of the variables \(X_{[i,j]}\), and the semidefiniteness condition \(Y(x,X)\succeq 0\) is stronger than the condition that all principal submatrices of \(Y(x,X)\) corresponding to two variables are semidefinite.

Footnotes
1 The convex lower envelope is defined at the beginning of the next section.

2 In the literature, the convex lower envelope of \(f(\cdot )\) is sometimes called simply the convex envelope of \(f(\cdot )\). We prefer to include the word “lower” as a reminder that the convex (lower) envelope is an underestimator of \(f(\cdot )\).

3 The problem 50-050-1 is structurally similar to the other problems considered. One possibility to attempt to reduce the gap on this problem, which we have not attempted, would be to impose additional valid constraints from the BQP as considered in [21].

Acknowledgments

I am grateful to two anonymous referees for corrections and suggestions that have improved the paper.

Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2012