Global solutions of nonconvex standard quadratic programs via mixed integer linear programming reformulations

A standard quadratic program is an optimization problem that consists of minimizing a (nonconvex) quadratic form over the unit simplex. We focus on reformulating a standard quadratic program as a mixed integer linear programming problem. We propose two alternative formulations. Our first formulation is based on casting a standard quadratic program as a linear program with complementarity constraints. We then employ binary variables to linearize the complementarity constraints. For the second formulation, we first derive an overestimating function of the objective function and establish its tightness at any global minimizer. We then linearize the overestimating function using binary variables and obtain our second formulation. For both formulations, we propose a set of valid inequalities. Our extensive computational results illustrate that the proposed mixed integer linear programming reformulations significantly outperform other global solution approaches. On larger instances, we usually observe improvements of several orders of magnitude.


Introduction
A standard quadratic program is an optimization problem in which a (nonconvex) homogeneous quadratic function is minimized over the unit simplex.An instance of a standard quadratic program is given by where Q ∈ S n and S n denotes the set of n × n real symmetric matrices, x ∈ R n , and ∆ n denotes the unit simplex in R n given by where e ∈ R n is the vector of all ones and R n + denotes the nonnegative orthant in R n .We remark that having a quadratic form in the objective function is not restrictive since the problem of minimizing a nonhomogeneous quadratic function over the unit simplex can be reformulated in the form of (StQP) using the following identity: x T Qx + 2c T x = x T (Q + ec T + ce T )x, for each x ∈ ∆ n .
Standard quadratic programs arise in a variety of applications ranging from the classical portfolio optimization problem [27] to population genetics [24]; from quadratic resource allocation [21] to selection replicator dynamics and evolutionary game theory [5].For a given matrix Q ∈ S n , Q is copositive if and only if ν(Q) ≥ 0. Therefore, standard quadratic programs can be used to check if a matrix is copositive.We refer the reader to the paper [3] for other applications, in which the term standard quadratic program was coined.The maximum stable set problem in graph theory [28] and its weighted version [17] can be formulated as instances of (StQP), which implies that (StQP) is, in general, NP-hard.
There is an extensive amount of literature on standard quadratic programs.In this paper, we are concerned with computing a global solution of (StQP).We therefore restrict our literature review to the exact solution approaches in the literature.All of these approaches, in general, are based on a branch-and-bound scheme and differ only in terms of the subroutines used for computing upper and lower bounds, and subdividing the feasible region.
For instance, a DC (difference of two convex functions) programming approach is employed in [4] to compute a lower bound and a local optimization method is used to find an upper bound.Using a relation between global solutions of (StQP) and the set of cliques in an associated graph, referred to as the convexity graph (see Section 2.3), a branch-and-bound method based on an implicit enumeration of cliques is proposed in [30].More recently, another branch-and-bound method was proposed in [25], in which both convex envelope estimators and polyhedral underestimators are employed for computing lower bounds and an implicit enumeration of the KKT points using the relation with the set of cliques in the convexity graph is utilized.In another recent paper [10], a set of cutting planes is proposed in the context of a spatial branch-and-bound scheme.
Standard quadratic programs can also be solved by finite branch-and-bound methods proposed for solving more general nonconvex quadratic programming problems (see, e.g., [12,13]).These approaches are based on an implicit enumeration of the complementarity constraints in the KKT conditions.The resulting subproblems are approximated by semidefinite relaxations or by polyhedral semidefinite relaxations.By a simple manipulation of the KKT conditions, a general quadratic program can be formulated as a linear program with complementarity constraints (LPCC) (see, e.g., [20] and the references therein) and the resulting LPCC can be solved by an enumerative scheme such as branch-and-bound.An LPCC can also be formulated as a mixed integer linear programming (MILP) problem and can be solved using Benders decomposition [20] or by branch-and-cut [33].A similar MILP formulation is proposed in [32] under the assumption of a bounded feasible region.Alternatively, using the completely positive reformulation of (StQP) (see, e.g., [7]), adaptive inner and outer polyhedral approximations of completely positive programs can be employed [11].Clearly, one can also use general purpose nonlinear programming solvers such as BARON [31] and COUENNE [2].
In this paper, we propose globally solving a standard quadratic program by reformulating it as a mixed integer linear programming (MILP) problem.We choose MILP reformulations due to the existence of powerful state-of-the-art MILP solvers such as CPLEX [22] and GUROBI [18].We propose two different MILP reformulations.Our first formulation is based on reformulating (StQP) as a linear program with complementarity constraints and linearizing the complementarity constraints by using binary variables and big-M constraints.
We discuss how to obtain valid bounds for the big-M parameters by exploiting the structure of (StQP).The second formulation is obtained by replacing the quadratic objective function in (StQP) by an overestimating function given by the maximum of a finite number of linear functions associated with the positive components of a feasible solution, referred to as the support.We show that the overestimating function is exact at all KKT points of (StQP), which leads to the second MILP formulation by introducing binary variables for modeling the support of a feasible solution.We further show that our second MILP formulation is, in fact, an exact relaxation of the first one.Furthermore, using the relation between the support of a global minimizer of (StQP) and the set of cliques in the convexity graph, we propose a set of valid inequalities for both of our MILP formulations.We conduct extensive computational experiments to assess the performances of our MILP formulations in comparison with several other global solution approaches.The computational results indicate that the proposed MILP formulations consistently outperform other global solution approaches.Furthermore, on especially large instances, we observe improvements of orders of magnitude.
Our work is related to the previous work on reformulations of a quadratic program as an instance of a linear program with complementarity constraints (LPCC) [20,32].For a general quadratic program, the paper [20] proposes a two-stage LPCC approach.In the first stage, an LPCC is solved to determine if the quadratic program is bounded below, in which case a second LPCC is formulated to compute a global solution.The resulting complementarity problems are formulated as MILP problems and solved using a parameter-free approach via branch-and-cut approach).In contrast, the paper [32] explicitly uses big-M parameters.By using a Hoffman type error bound, the authors show that there exists a valid upper bound for the big-M parameters under the assumption of a bounded feasible region.They give a closed form expression of this bound for (StQP).Our first MILP formulation, which is based on a similar approach as in these previous MILP formulations, also employs big-M parameters as in [32].In contrast with their approach, we exploit the specific structure of (StQP) in an attempt to obtain much tighter bounds for big-M parameters.Furthermore, our second MILP formulation is based on specifically taking advantage of the particular structure of (StQP).Therefore, in contrast with the previous approaches in the literature, we propose stronger MILP formulations for a more specific class of quadratic programs.This paper is organized as follows.In Section 1.1, we briefly review our notation.Section 2 discusses several useful properties of standard quadratic programs.We present our MILP formulations as well as a set of valid inequalities in Section 3. Section 4 is devoted to the results of our computational experiments.We conclude the paper in Section 5.

Notation
We use R n , R n + , and S n to denote the n-dimensional Euclidean space, the nonnegative orthant, and the space of n × n real symmetric matrices, respectively.For u ∈ R n , we denote its jth component by u j , j = 1, . . ., n.Similarly, U ij denotes the (i, j) entry of a matrix U ∈ S n , i = 1, . . ., n; j = 1, . . ., n.We denote the unit simplex in R n by ∆ n .For any U ∈ S n and V ∈ S n , the trace inner product of U and V is given by U, V := The unit vectors in R n are denoted by e j , j = 1, . . ., n.We reserve e ∈ R n and E = ee T ∈ S n for the vector of all ones and the matrix of all ones, respectively.We use 0 to denote the real number zero, the vector of all zeroes as well as the matrix of all zeroes in the appropriate dimension, which will always be clear from the context.We use conv(•) to denote the convex hull.We define the following convex cones in S n : namely, the cone of component-wise nonnegative matrices, the cone of positive semidefinite matrices, the cone of copositive matrices, the cone of completely positive matrices, and the cone of doubly nonnegative matrices, respectively.The following relations easily follow from these definitions:

Preliminaries
In this section, we review several basic properties of standard quadratic programs that will be useful in the subsequent sections.We remark that these results can be found in the literature (see, e.g., [3,8]).We include proofs of some of these results for the sake of completeness.

Optimality Conditions
Since a standard quadratic program has linear constraints, constraint qualification is satisfied at every feasible solution.Given an instance of (StQP), if x ∈ ∆ n is an optimal solution, then there exist s ∈ R n and λ ∈ R such that the following KKT conditions are satisfied: x x j s j = 0, j = 1, . . ., n.
For an instance of (StQP), x ∈ ∆ n is said to be a KKT point if there exist s ∈ R n and λ ∈ R such that the conditions ( 8) -( 12) are satisfied.

Properties of ν(Q)
The following lemma presents several useful properties about the optimal value function ν(•).
, and γ ∈ R, the following relations are satisfied: (iii) If Q is a diagonal matrix with strictly positive diagonal entries Q 11 , . . ., Q nn , then Proof.
(i) For any Q ∈ S n and any γ ∈ R, we have from which the assertion follows.
(iii) If Q is a diagonal matrix with strictly positive diagonal entries Q 11 , . . ., Q nn , then a simple manipulation of the KKT conditions ( 8) -( 12) reveals that the unique KKT point satisfies Substituting this solution in the objective function yields the result. (iv

Properties of An Optimal Solution
In this section, we present a useful relation between an optimal solution of a standard quadratic program and a related graph.
Given an instance of (StQP) with Q ∈ S n , we can associate with it an undirected graph G = (V, E), called the convexity graph of Q, where V = {1, 2, . . ., n} with node j corresponding to the vertex of the unit simplex e j , j = 1, . . ., n.There is an edge between node i and node j if the restriction of the quadratic form x T Qx to the edge of the unit simplex between the vertices e i and e j is strictly convex, i.e., For x ∈ ∆ n , we introduce the following index sets: The indices in P(x) are referred to as the support of x.For a given undirected graph G = (V, E), a set C ⊆ V of nodes is called a clique if each pair of nodes is connected by an edge.Similarly, a set S ⊆ V of nodes is called a stable set if no two nodes in S are connected by an edge.The following theorem [30] establishes a useful connection between the support of a global solution of (StQP) and the convexity graph G = (V, E).
Theorem 2.1 (Scozzari and Tardella, 2008) Given an instance of (StQP), let G = (V, E) denote the convexity graph of Q.Then, there exists a globally optimal solution x * ∈ ∆ n of (StQP) such that the nodes corresponding to the indices in P(x * ) (i.e., the support of x * ) in G form a clique (or, equivalently, a stable set in the complement of G).

Lower Bounds on ν(Q)
In this section, given an instance of (StQP), we review two lower bounds on ν(Q).

A Simple Lower Bound
We start with a simple lower bound on ν(Q).By Lemma 2.1(iv), with equality if there exists k ∈ {1, . . ., n} such that This lower bound can be slightly sharpened if the minimum entry of Q is not on the main diagonal, i.e. if γ 0 < γ 1 .In this case, we have Q − γ 0 ee T ∈ N n with strictly positive diagonal elements, which can be decomposed as Q − γ 0 ee T = D + F , where D ∈ N n and F ∈ N n are such that D is a diagonal matrix with strictly positive entries D kk = Q kk − γ 0 , k = 1, . . ., n along the main diagonal, and all diagonal entries of F are equal to zero.Since it follows by Lemma 2.1(i), (ii), and (iii) that .

Lower Bound from Doubly Nonnegative Relaxation
In this section, we present another lower bound on ν(Q) using an alternative formulation of (StQP).
A standard quadratic program can be equivalently reformulated as the following instance of a linear optimization problem over the convex cone of completely positive matrices [7]: where X ∈ S n and CP n is given by (5).Despite the fact that (CPP) is a convex reformulation of (StQP), it remains NP-hard since the membership problem for the cone of completely positive matrices is intractable (see, e.g., [14]).
By (7), one can replace the intractable cone of completely positive matrices in (CPP) by the larger but tractable cone of doubly nonnegative matrices so as to obtain the following doubly nonnegative relaxation of (CPP): where DN N n is given by (6).
Therefore, another lower bound on ν(Q) is given by By [8, Theorem 13], we have the following relation: i.e., the lower bound 2 (Q) is at least as tight as 1 (Q).By Lemma 2.1(iv) and ( 15), both lower bounds are exact if the minimum entry of Q lies along the diagonal.However, as illustrated by our computational results in Section 4, 2 (Q) is, in general, much tighter than Note that 1 (Q) can be computed in O(n 2 ) time whereas 2 (Q) requires solving a computationally expensive semidefinite program.We remark that there exist other lower bounds in the literature (see, e.g., [1,6,8,29], and [8] for a comparison).Usually, there is a trade-off between the quality of the lower bound and its computational cost.

Mixed Integer Linear Programming Formulations
In this section, we present two different mixed integer linear programming (MILP) reformulations of standard quadratic programs.We then propose a set of inequalities that are valid for both formulations.

A Formulation Based on KKT Conditions
Our first MILP formulation is obtained by exploiting the KKT conditions.We discuss how the nonlinear complementarity constraints can be linearized by employing binary variables.
We also discuss how to obtain valid upper bounds for the big-M parameters that arise from this linearization.
Given an instance of (StQP), if x ∈ ∆ n is a KKT point of (StQP), then ν(Q) = x T Qx = λ by ( 8), (9), and (12) (see also [16]).Based on this simple observation, (StQP) can be equivalently formulated as the following linear program with complementarity constraints: We can linearize the nonconvex complementarity constraints in (LPCC1) by using binary variables and big-M constraints, which gives rise to the following MILP reformulation of (StQP): x j ≤ y j , j = 1, . . ., n, x ≥ 0, ( 21) Note that, by ( 19) and ( 20), the binary variable y j ensures that x j and s j cannot simultaneously be positive for each j = 1, . . ., n.In particular, if y j = 1, then s j = 0 by ( 20) and x j is allowed to be positive.Since x ∈ ∆ n , we have 0 ≤ x j ≤ 1, which implies that (19) yields a valid upper bound on x j .On the other hand, if y j = 0, we have x j = 0 by (19), in which case we need valid upper bounds on the variable s j in (20).
Next, we discuss how to obtain bounds for the big-M parameters employed in (MILP1).
By (17), We can obtain an upper bound on s j by deriving an upper bound for each of the terms on the right-hand side of (24).For the first term, since x ∈ ∆ n , we have In order to bound the second term from above, any lower bound on λ can be employed.
Since λ ≥ ν(Q) for any feasible solution of (MILP1), it follows that any lower bound on ν(Q) can be used to obtain an upper bound on s j , j = 1, . . ., n.Indeed, let denote an arbitrary lower bound on ν(Q).For any feasible solution (x, y, s, λ) of (MILP1), it follows from ( 24) and ( 25) that which implies that would be a valid choice in (MILP1).In particular, we use

An Alternative Formulation
In this section, we present an alternative MILP formulation.Given an instance of (StQP), we first derive an underestimating function and an overestimating function for the objective function.We then establish useful relations of these two functions, which form the basis of our second formulation.
We start with a lemma that presents an underestimating and an overestimating function for the objective function of (StQP), both of which depend on the support of a feasible solution given by (13).
Lemma 3.1 For any Q ∈ S n and x ∈ ∆ n , we have where P(x), given by (13), denotes the support of x.Furthermore, if x ∈ ∆ n is a KKT point of (StQP), then Proof.For any Q ∈ S n and x ∈ ∆ n , x j e T j Qx = n j=1 x j e T j Qx = j∈P(x) x j e T j Qx , i.e., x T Qx is a convex combination of e T j Qx for j ∈ P(x), from which (27) follows.
Using Lemma 3.1, we next present an alternative characterization of ν(Q).
Proposition 3.1 Given an instance of (StQP), Proof.For any x ∈ ∆ n , we have e T j Qx by Lemma 3.1, which implies that e T j Qx.Conversely, let x * ∈ ∆ n be an optimal solution of (StQP).Then, x j ≤ y j , j = 1, . . ., n, x ≥ 0, (35) Note that the auxiliary variable α is introduced and used in (30) and (31) to linearize the maximum function on the right-hand side of ( 29) and the binary variables y j are employed in (34) to ensure that the maximum is restricted only to the linear functions corresponding to the support of x in (31).Indeed, if j ∈ P(x), then x j > 0, which forces y j = 1 by ( 33) and z j = 0 by (34).Otherwise, ( 31) is a redundant constraint since z j can then take a positive value.Note that we again rely on big-M parameters U j in (34).The next proposition presents a valid bound for these parameters.
Proposition 3.2 Given an instance of (StQP), (MILP2) is an equivalent reformulation of (StQP) if where M j is defined as in (26) and is any lower bound on ν(Q).
Proof.Let x ∈ ∆ n .Then, for each j ∈ P(x), we have y j = 1 by (33), which implies that z j = 0 by (34) and α ≥ max e T j Qx ≥ x T Qx by (31) and by Lemma 3.1.Since the objective function minimizes α, the best choice of α would be given by α = max j∈P(x) e T j Qx.Consider an index j ∈ P(x).Let us define z j = max{0, e T j Qx − α} ≥ 0.Then, if e T j Qx − α < 0, we have z j = 0 ≤ M j , which satisfies the constraints of (MILP2).Otherwise, where is any lower bound on ν(Q).It follows that for each x ∈ ∆ n , we can construct y ∈ R n , z ∈ R n , and α ∈ R such that We close this section by a brief comparison of (MILP1) and (MILP2).Note that the constraint set (31) in (MILP2) can be rewritten as Identifying the variables z with s and α with λ, a comparison with (17) in (MILP1) reveals that (MILP2) is in fact a relaxation of (MILP1).It follows that, for any feasible solution (x, y, s, λ) of (MILP1), we can define z = s and α = λ so that (x, y, z, α) is a feasible solution of (MILP2).On the other hand, while each feasible solution (x, y, s, λ) of (MILP1) corresponds to a KKT point of (StQP), we can construct a feasible solution (x, y, z, α) for any x ∈ ∆ n such that α ≥ x T Qx, with equality if x is a KKT point of (StQP).Therefore, (MILP2) can be viewed as an exact relaxation of (MILP1).

Valid Inequalities
In this section, we present a set of inequalities that are valid for both formulations (MILP1) and (MILP2).
Given an instance of (StQP), Theorem 2.1 presents a relation between the support of an optimal solution and the convexity graph of Q.This relation gives rise to the following theorem.
Theorem 3.1 The following inequalities are valid for both formulations (MILP1) and (MILP2): Proof.In both formulations (MILP1) and (MILP2), the binary variables y j are equal to one if j ∈ P(x) for any feasible solution x ∈ ∆ n .By Theorem 2.1, there exists a global solution of (StQP) whose support set forms a stable set in the complement of the convexity graph G = (V, E) of Q.The assertion follows.

Computational Results
We report the results of our computational experiments in this section.We first describe the set of instances used in our experiments.Then, we explain our experimental setup in detail.Finally, we report performances of the proposed MILP formulations in comparison with several other global solution approaches from the literature.

Set of Instances
In an attempt to accurately assess the performances of the MILP formulations (MILP1) and (MILP2), we conducted extensive experiments on the following set of instances from the literature: (i) BLST Instances Denoting the density of the convexity graph by δ ∈ [0, 1], it follows that the number of cliques in G tends to increase as δ increases.Therefore, instances of (StQP) with larger values of δ contain a larger number of possible support sets for a global minimizer.Indeed, this difficulty is also reflected in earlier computational experiments (see, e.g.[25,30]).It is also worth mentioning that the number of valid inequalities (38) is given by (1−δ)n(n−1)/2.
Therefore, for fixed n, the number of valid inequalities decreases as δ increases.For each set of instances, we therefore report the range of the parameter δ.
Recall that, for an instance of (StQP), if the minimum entry of Q lies along the main diagonal, then ν(Q) equals that entry by Lemma 2.1(iv).Therefore, we use this criterion as a preprocessing step in order to eliminate trivial instances.Apart from this, we do not use any other preprocessing procedure.After eliminating such trivial instances, we obtain a test bed that consists of a total of 376 instances.We summarize the statistics on the set of instances in Table 1. with varying sizes and characteristics.In particular, we note the higher density of the convexity graphs associated with the hard BSU instances.Indeed, 14 instances in this set have a density of 1; 5 instances have densities between 0.92 and 0.95; and only one instance has a density of 0.6, supporting our previous observation regarding the correlation between the difficulty of an instance and the density of the associated convexity graph.

Experimental Setup
Note that we propose two MILP formulations (MILP1) and (MILP2).For each formulation, one can use the lower bound 1 (Q) or 2 (Q).Finally, for each choice of the lower bound, we have the option of adding the valid inequalities (38) or not, which implies that we have a total of 8 variants.For each variant with (MILP1), we add the following bound constraints on the variable λ: where the upper bound follows from Lemma 2.1(iv).Similarly, the corresponding constraints are added for each variant with (MILP2): We compare the performances of our MILP formulations with three other global solution approaches, namely, the MILP formulation of [32], which is publicly available at However, we noticed that the wall clock time and the CPU time reported by BARON were virtually identical on all instances, suggesting that BARON did not take advantage of the multiple threads.Therefore, we caution the reader about the interpretation of the run times in our experiments with BARON.
In CPLEX and BARON, we set the optimality gap tolerance to 10 −6 , which is given by The default settings were employed for all the other parameters of CPLEX, BARON, and MOSEK.
We denote by MILP1-L1 and MILP1-L1-VI the MILP formulation (MILP1) using the lower bound 1 (Q) with and without the set of valid inequalities (38), respectively.We replace the suffix L1 by L2 for the lower bound 2 (Q).We use a similar convention for (MILP2).The MILP formulation of [32] is denoted by QP-IP, whereas CPLEX QP and BARON refer to the QP formulations solved by CPLEX and BARON, respectively.The doubly nonnegative relaxation (DNN) is denoted by DNN.For our MILP formulations with the lower bound 2 (Q), the solution times exclude the computational effort for solving the corresponding doubly nonnegative relaxation (DNN) required to compute this lower bound, which is reported separately.Finally, for a solution x ∈ ∆ n reported by a solver, an index j ∈ {1, . . ., n} is considered to be in the support of x (i.e., j ∈ P(x)) if x j > 10 −8 .

BLST Instances
In this section, we report the performances on the BLST instances [10].Recall that BLST30 consists of 134 instances with n = 30 and BLST50 is comprised of 138 instances with n = 50.We report the summary of the results in Tables 2 and 3 for the BLST30 and BLST50 instances, respectively.In each table, we have a row for each of the eight variants using (MILP1) and (MILP2), and a row for each of QP-IP, CPLEX QP, BARON, and DNN.
The total time taken by each approach (in seconds) over all instances in each set is reported in the second column.The third column reports the number of instances solved to optimality within the time limit of 3600 seconds.We present the number of instances on which the corresponding approach hits the time limit and the average optimality gap defined as in (41) over these instances in the fourth and fifth columns, respectively.Recall that we do not impose any time limit for solving the DNN relaxation.On each of the BLST30 and BLST50 instances, the lower bound 2 (Q) (i.e., the optimal value of (DNN)) is either equal to ν(Q) or the difference between the two is at most 10 −6 , whereas the simple lower bound 1 (Q) is always smaller than ν(Q).Therefore, 2 (Q) is significantly tighter than 1 (Q) over all instances.The support sizes of optimal solutions are in the range [1,10].
As illustrated by Tables 2 and 3, our MILP formulations as well as QP-IP can solve each instance to optimality in a fraction of a second on average.On the other hand, each of CPLEX-QP and BARON hits the time limit on some of the instances, with CPLEX QP  exhibiting a better performance than BARON.Note that the average gaps of BARON are usually considerably higher than those reported by CPLEX-QP.While the total computational effort for solving the DNN relaxation is quite negligible on BLST30, it increases significantly on BLST50, even surpassing the total time required for solving each MILP formulation.On each set, the total time taken by each of CPLEX QP and BARON is significantly larger than that required for each MILP formulation.
Each of the eight variants of (MILP1) and (MILP2) slightly outperforms QP-IP on BLST30.The improvement is more significant on BLST50.Furthermore, while the performance of each of the eight variants is similar on BLST30, (MILP2) seems to outperform (MILP1) on BLST50, especially when used with the lower bound 2 (Q).Note that the valid inequalities do not seem to have a significant effect on these instances.Finally, we remark that the total time of each of our MILP formulations on BLST50 increases only modestly in comparison with BLST30.

ST Instances
In this section, we report our computational results on ST instances [30], which consists of 24 instances with n = 100 (ST100), 18 instances with n = 200 (ST200), 11 instances with n = 500 (ST500), and one instance with n = 1000 (ST1000).We report our results in Tables 4, 5, 6, and 7 for ST100, ST200, ST500, and ST1000, respectively, each of which is organized similarly to Table 2.Note that the DNN relaxation could not be solved for any instance in ST500 and ST1000.We therefore omit the rows corresponding to the variants of our MILP formulations with the lower bound 2 (Q) and the DNN relaxation in Tables 6   and 7 accordingly.We first focus on the instances in ST100 and ST200.On each of these instances, the difference between the lower bound 2 (Q) and ν(Q) is less than 10 −5 .On the other hand, the simple lower bound 1 (Q) is considerably smaller than ν(Q) on all instances.The support sizes of optimal solutions vary between 3 and 7.
A close examination of Tables 4 and 5 reveals that MILP formulations significantly outperform each of CPLEX QP and BARON on ST100 and ST200.In particular, all of the instances in these two sets can be solved to optimality by each of our MILP formulations and by QP-IP, whereas CPLEX QP can solve only 6 instances out of 24 in ST100 to optimality and BARON can solve none within the time limit.Furthermore, BARON generally reports considerably higher optimality gaps in comparison with CPLEX QP.Note again that the total computational effort required for each MILP formulation is significantly smaller than that for CPLEX QP and BARON.The total computational effort for solving the DNN relaxation similarly exceeds the total effort required for each MILP formulation.In particular, the average solution time of DNN increases from 16.22 seconds on ST100 to 429.43 seconds on ST200, an increase of more than 25-fold despite the fact that n only increases by a factor of 2.
Each of the eight variants of our MILP formulations outperforms QP-IP on both ST100 and ST200.MILP2-L2 stands out as a clear winner among the other MILP formulations.
Note, in particular, that MILP-L2 is about 13 times faster than QP-IP on ST100 and more than 30 times faster on ST200.The lower bound 2 (Q) seems to have a significantly positive effect on the performance of (MILP2), both with and without valid inequalities and, to a lesser extent, on the performance of (MILP1) on ST200.The inclusion of valid inequalities has a mixed effect on (MILP1) and (MILP2).It is worth mentioning that instances with sparse convexity graphs give rise to a larger number of valid inequalities, which may adversely affect the overall performance of the variants with valid inequalities.
We now focus on the larger instances ST500 and ST1000.On each of these instances,   Tables 6 and 7 report the results on ST500 and ST1000, respectively.Recall that the DNN relaxation cannot be solved on any of these instances.Therefore, we present the results only for the variants of our MILP formulations with the simple lower bound 1 (Q).On ST500, while each of our formulations can solve all instances to optimality, QP-IP hits the time limit on 5 of the 11 instances.Furthermore, both CPLEX QP and BARON also hit the time limit on all of the instances.ST1000 consists of a single instance with n = 1000.We include this instance in our experiments to assess the performances of different global approaches on a very large sample instance.This instance indeed turns out to be very challenging for each approach except MILP2-L1, which manages to solve it to optimality within the time limit.We believe that this instance provides a remarkable computational evidence about the effectiveness of (MILP2), even when used with the simple and fairly loose lower bound 1 (Q).
We also point out the extremely large optimality gaps reported by the QP solvers, which illustrates that instances of this scale are well beyond the reach of current state-of-the-art QP and NLP solvers.
On ST500, a comparison of total computational effort indicates that our MILP formulations are far more effective than the other approaches.On ST1000, only MILP1-L1-VI reports a larger gap than that of QP-IP.Note that the valid inequalities do not seem to help on these two sets since the total number of such inequalities can be fairly large.For instance, the number of valid inequalities is about 375000 on the single instance in ST1000.On both of these sets, MILP2-L1 clearly outperforms all the other MILP formulations.

DIMACS Instances
In this section, we report our computational results on DIMACS instances, consisting of 8 instances in DIMACS1 with n ∈ [28, 171], and 22 instances in DIMACS2 with n ∈ [200, 300].
Each of these instances is obtained from the reformulation of the maximum stable set problem as an instance of (StQP) [28].We first review this formulation.
Let G = (V, E) be a simple, undirected graph, where V = {1, . . ., n}.Recall that a set S ⊆ V is a stable set if no two nodes in S are connected by an edge.The maximum stable set problem is concerned with computing the largest stable set in G, whose size is denoted by α(G).
By defining a binary variable y j ∈ {0, 1} for each j ∈ V to indicate whether node j belongs to a maximum stable set, the maximum stable set problem can be formulated as an integer linear programming problem as follows: Motzkin and Straus [28] proposed the following formulation in the form of (StQP): where A G ∈ S n denotes the adjacency matrix of G. Furthermore, if S * ⊆ V denotes a maximum stable set of G, then an optimal solution of (MS-StQP) is given by x j = 1/|S * | if j ∈ S * , and x j = 0 otherwise.
While we think that reformulating a combinatorial optimization problem as an instance of (StQP) and then reformulating it as an MILP problem may not necessarily be the best solution approach, we still include this set of instances in our experiments due to the existence of a global optimal solution of (MS-StQP) with a support of size α(G), which can be fairly large for certain classes of graphs.Therefore, these instances can be particularly challenging for MILP formulations of (StQP).
Given an instance of (MS-StQP) corresponding to a graph G = (V, E), it is easy to verify that the convexity graph of 3) is precisely given by the complement of G. Since the valid inequalities (38) are defined for each edge of the complement of the convexity graph, which coincides with G, it follows that the valid inequalities in our MILP formulations are given by which are precisely the same constraints as in (ILP).Therefore, our MILP formulations with valid inequalities can in some sense be viewed as extended formulations of (ILP).
We report our computational results on DIMACS1 and DIMACS2 in Tables 8 and 9, respectively.For comparison purposes, we also include the performance of the integer linear programming formulation (ILP) on these instances, denoted by ILP.
In contrast with BLST instances and ST instances, the lower bound 2 (Q) is almost tight on 5 out of 8 instances in DIMACS1 and only on 6 out of 22 instances in DIMACS2.The lower bound 1 (Q) is quite loose across all instances.The support size of an optimal solution is in the range of [4,44] and [8,128] in DIMACS1 and DIMACS2, respectively.
Tables 8 and 9 illustrate that these instances are indeed challenging for each of the global solution approaches.We discuss the computational results on DIMACS1 and DIMACS2 separately.
On DIMACS1, as illustrated by Table 8, each variant of our MILP formulations without valid inequalities can solve 7 of the 8 instances to optimality within the time limit.The addition of the valid inequalities not only helps to close the gap on the remaining instance but also significantly improves the overall performance of both (MILP1) and (MILP2) regardless of the particular lower bound employed.Similarly, QP-IP can solve 7 out of 8 instances to optimality.Each of CPLEX QP and BARON hits the time limit on each of the 8 instances, with BARON reporting a higher average optimality gap compared to CPLEX QP.The total computational effort for solving the DNN relaxation is larger than the total solution time for each variant of our MILP formulations with valid inequalities.Finally, ILP can solve all of these instances to optimality fairly quickly.We conjecture that this favorable outcome can be attributed to CPLEX's capability of identifying the particular structure of this formulation and adding other well-known inequalities and cuts.
Overall, each of our eight MILP variants requires less computational effort than QP-IP, either achieving significantly smaller solution times or terminating with smaller optimality gaps.While the performances of MILP1-L1-VI and MILP2-L1-VI are similar, MILP2-L2-VI exhibits a significantly better performance than MILP1-L2-VI.
On DIMACS2, which consists of larger instances, Table 9 reveals that each approach, including ILP, hits the time limit on various subsets of instances, illustrating the challenging structure of these instances.Similar to DIMACS1, ILP achieves the best performance both in terms of the total time and the number of instances solved to optimality.In particular, ILP is terminated due to the time limit only on one instance out of 22. MILP1-L1-VI ranks in the second place, with the second smallest total time and the second largest number of instances solved to optimality, followed by MILP2-L1-VI and MILP1-L2-VI, respectively.Once again, QP-IP is outperformed by each of our eight MILP variants in terms of each of total time, number of instances solved to optimality, and average optimality gap.Both CPLEX QP and BARON can solve only one instance to optimality and hit the time limit on each of the remaining instances, with CPLEX QP reporting a slightly lower average gap compared to BARON.The total computational effort required for solving the DNN relaxation is quite significant, exceeding, for instance, the total solution time of MILP1-L1-VI.
We again observe significant improvements due to the addition of valid inequalities for both (MILP1) and (MILP2).Furthermore, it is worth noting that both formulations (MILP1) and (MILP2) significantly benefit in terms of average optimality gaps when used with the better lower bound 2 (Q).

BSU Instances
In this section, we report our computational results on the BSU instances.This set consists of one instance for each value of n in the range [5,24] and is specifically constructed to have an exponential number of strict local minimizers.We report our results in  The lower bound 2 (Q) is tight only on one instance in this set, namely for n = 6.On the other hand, the lower bound 1 (Q) is considerably weaker than 2 (Q) on all instances.
The sizes of the support of an optimal solution range between 2 and 13.
Table 10 reveals that each instance in this set can be solved by each MILP formulation in a fraction of a second on average, illustrating that the existence of an exponential number of strict local minimizers does not seem to hinder the performances of MILP formulations.
In particular, each of our eight MILP variants slightly outperforms QP-IP on this set.On the other hand, each of CPLEX QP and BARON can solve only the five smallest instances to optimality within the time limit, each hitting the time limit on the remaining set of 15 instances.Therefore, these instances seem to be particularly challenging for QP and NLP solvers despite the small values of n.Note that DNN requires a negligible total computational effort on this set.
This set of instances provides strong computational evidence that MILP formulations are less likely to be affected by the possibility of an exponential number of local minimizers, which, in general, poses a great challenge for general purpose QP and NLP solvers.These results illustrate that MILP formulations can be much more effective for solving standard quadratic programs in comparison with general purpose QP and NLP solvers.

Overall Comparison
In this section, we present an overall comparison of each of the global solution approaches on all of the instances in our test bed.In an attempt to make a fair comparison, we first divide the set of instances into two subsets.IS1 consists of all instances on which each global solution approach is attempted, i.e., the set of all instances excluding the sets ST500 and ST1000.We collect these two sets of larger instances in the set IS2. Recall that the DNN relaxation cannot be solved on IS2.
Our first comparison is based on the summary of the performances of all of the global solution approaches on our test bed.In Table 11, we report the total time required by each approach as well as the number of instances solved to optimality and the number of instances terminated due to the time limit.We organize the results similarly as in Table 2, except that we have a separate set of columns for each of IS1 and IS2.Furthermore, we do not include the average optimality gaps due to the high variability.
We first focus on instances in IS1.addition of these inequalities almost always leads to smaller optimality gaps on instances terminated due to the time limit.Therefore, valid inequalities may potentially help to close the optimality gap faster if a larger time limit is imposed.
Note that IS2 consists of 12 larger instances.On this set, MILP2-L1 seems to outperform all other approaches in terms of both performance metrics, which is followed, in turn, by MILP2-L1-VI, MILP1-L1, and MILP1-LI-VI, respectively.Once again, each of our MILP variants exhibits better performance than QP-IP on both metrics.Each of CPLEX QP and BARON hits the time limit on each instance in this set.Note that valid inequalities seem to degrade the performance, possibly due to the large number of such inequalities.
In an attempt to shed more light on the comparison of the performances of different approaches, we report performance profiles [15], which are frequently used for benchmarking purposes in optimization.For a given problem instance, the performance ratio of a particular approach is defined as the ratio of the solution time of that approach to the best solution time among all approaches on that particular instance, which is defined to be +∞ if the approach fails to solve the instance.Then, for each approach a, a cumulative distribution function P a (τ ) is defined to be the percentage of the number of instances that can be solved by that approach within a factor of τ of the solution time of the best approach.Note that only the instances that can be solved by at least one approach within the time limit are included in the comparison.In Figures 1 and 2, we report two performance profiles for the instance set IS1. Out of 364 instances in this set, the figures compare the performance ratios on 361 of them, which can be solved by at least one approach.Figure 1 includes all of the approaches whereas CPLEX QP and BARON are excluded in Figure 2 to facilitate an easier comparison among MILP formulations only.Note that performance ratios are reported in logarithmic scale.
Furthermore, the markers are included for every 20th performance ratio.
Both Figures 1 and 2 indicate the superior performance of the proposed MILP formulations over the other global solution approaches on the instance set IS1.In particular, it is worth noticing that each of our 8 MILP variants outperforms QP-IP, which, in turn, exhibits better performance than CPLEX-QP, followed by BARON.In terms of the best solution time, MILP2-L2 exhibits the best performance, followed by MILP1-L2-VI, and MILP2-L2-VI, respectively.Note, however, that each of MILP2-L2-VI and MILP1-L2-VI manages to solve a higher percentage of instances within the time limit than MILP2-L2.We present the performance profile for the larger instance set IS2 in Figure 3.We do not report CPLEX QP and BARON since each of them hits the time limit on each instance in this set.Note again the logarithmic scale for the performance ratios.These results illustrate that our MILP formulations constitute an effective approach for solving standard quadratic programs.The relaxed formulation (MILP2) seems to achieve the best solution times most frequently, especially when used with the improved lower bound 2 (Q) whenever this bound can be computed.We remark, however, that the computational effort required for computing this tighter bound can be excessive on larger instances.The addition of valid inequalities may improve or degrade the performance depending on the instance and the number of such inequalities.However, for both (MILP1) and (MILP2), a larger number of instances can be solved to optimality within the time limit by the addition of valid inequalities.In summary, as illustrated by our computational results, both of our formulations are quite competitive, even when used with the simple lower bound 1 (Q) and without valid inequalities.

Concluding Remarks
In this paper, we propose solving standard quadratic programs by using two alternative MILP reformulations.The first MILP formulation arises from a simple manipulation of the KKT conditions.The second MILP formulation is obtained by exploiting the specific structure of a standard quadratic program and is in fact a relaxation of the first formulation.Both of our MILP formulations involve big-M parameters.We derive bounds on these parameters by taking advantage of the particular structure of (StQP).We show that these bounds are functions of a lower bound on the optimal value.We consider two different lower bounds, which differ in terms of the computational effort and tightness.Our extensive computational experiments illustrate that the proposed MILP formulations outperform another recently proposed MILP approach in [32] and are much more effective than general purpose quadratic programming and nonlinear programming solvers.
Since the tighter bound 2 (Q) requires a considerable computational effort for large in-stances, one can use cheaper methods for computing or approximating this bound.For instance, the dual problem of (DNN) can be solved using a combination of proximal gradient method and binary search (see, e.g., [23]), which may be cheaper than using a semidefinite programming solver.Alternatively, based on the encouraging computational results reported in [25], a tight linear programming based lower bound can be employed in lieu of 2 (Q).We leave these problems for future work.
Another interesting research direction is the investigation of decomposition approaches for the proposed MILP formulations arising from large-scale standard quadratic programs.
In addition, encouraged by our computational results, we intend to identify other classes of nonconvex quadratic programs that would be amenable to a similar effective MILP formulation.
j∈P(x * ) e T j Qx * by Lemma 3.1, which establishes the reverse inequality.We are now in a position to propose an alternative MILP formulation based on the characterization (29) stated in Proposition 3.1.(MILP2) min α (30) s.t.
[10]: This set consists of 150 instances1 with n = 30 (BLST30) and 150 instances with n = 50 (BLST50).Each entry of Q is randomly generated from a triangular distribution with parameters a < c < b, where a and c denote the minimum and maximum values and c is the mode of the distribution.(ii)ST Instances[30]: This set consists of 24 instances 2 with n = 100 (ST100), 18 instances with n = 200 (ST200), 11 instances with n = 500 (ST500), and 1 instance with n = 1000 (ST1000).For each of these instances, the matrix Q is randomly generated so that its convexity graph G = (V, E) has a prespecified density.(iii)DIMACS Instances: It is well-known that the maximum stable set problem in graph theory can be formulated as an instance of (StQP)[28].This set consists of (StQP) instances obtained from the complements of the 30 instances of the maximum clique problem 3 from the Second DIMACS Implementation Challenge with n ∈ [28, 300].These instances are divided into two groups based on the number of vertices.DIMACS1 consists of 8 instances with n ∈ [28, 171] and DIMACS2 is comprised of 22 instances with n ∈ [200, 300].(iv)BSU Instances[9]: This set consists of 20 "hard" instances with n ∈[5,24].Each of these instances is specifically constructed to harbor an exponential number of strict local minimizers.In particular, the number of strict local minimizers varies between 1.38 n and 1.49 n .Note that Theorem 2.1 establishes a relation between the support of a global minimizer of an instance of (StQP) and the set of cliques of the associated convexity graph G = (V, E).

Table 1 :
Summary of InstancesAs illustrated by Table1, our test bed encompasses a large number of instances of (StQP) [26]) was called from GAMS (version 24.8.5) using the MATLAB interface.The computation of 2 (Q) requires a semidefinite programming solver.For that purpose, we employed MOSEK (version 8.1.0.49) using the YALMIP interface (version R20180413)[26].We measured the running times in terms of wall clock time.In our experiments with CPLEX and BARON, we imposed a time limit of 3600 seconds for each MILP and QP problem.No time limit was imposed on MOSEK for the computation of 2 (Q).Our computational experiments were carried out on a 64-bit HP workstation with 24 threads (2 sockets, 6 cores per socket, 2 threads per core) running Ubuntu Linux with 48 GB of RAM and Intel Xeon CPU E5-2667 processors with a clock speed of 2.90 GHz.In our experiments with CPLEX, we chose the deterministic parallel mode by setting cplex.parallelmode= 1 in order to have reproducible results.We remark that both CPLEX and MOSEK can take advantage of multiple threads.Similarly, we employed option threads = 24 in GAMS.
https://github.com/xiawei918/quadprogIP, the quadratic programming (QP) solver of CPLEX, and the nonlinear programming (NLP) solver BARON.We solved all MILP formulations in MATLAB (version R2017b) using CPLEX (version 12.8.0)with the Cplex Class API provided in CPLEX for MATLAB Toolbox.Similarly, the QP solver of CPLEX was called in MATLAB using the same API.The NLP solver BARON (version 17

Table 10 ,
which is organized similarly to Table2.