Tight compact extended relaxations for nonconvex quadratic programming problems with box constraints

Cutting planes from the Boolean Quadric Polytope can be used to reduce the optimality gap of the NP\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathcal {NP}$$\end{document}-hard nonconvex quadratic program with box constraints (BoxQP). It is known that all cuts of the Chvátal–Gomory closure of the Boolean Quadric Polytope are A-odd cycle inequalities. We obtain a compact extended relaxation of allA-odd cycle inequalities, which allows to optimize over the Chvátal–Gomory closure without repeated calls to separation algorithms and has less inequalities than the formulation provided by Boros et al. (SIAM J Discrete Math 5(2):163–177, 1992) for sparse matrices. In a computational study, we confirm the strength of this relaxation and show that we can provide very strong bounds for the BoxQP, even with a plain linear program. The resulting bounds are significantly stronger than these from Bonami et al. (Math Program Comput 10(3):333–382, 2018), which arise from separating A-odd cycle inequalities heuristically.


Introduction
Bonami et al. [3] show how to effectively solve the N P-hard nonconvex quadratic program with box constraints, i.e., with Q ∈ R n×n symmetric, via linear programming techniques. Without loss of generality we assume l = 0 and u = 1, because we assume that l and u are finite. They first obtain a B Bernd Perscheid perscheid@uni-trier.de Sven de Vries devries@uni-trier.de 1 Operations Research, FB IV -Mathematics, Trier University, 54286 Trier, Germany convex relaxation of the BoxQP with a linear objective function. This linearization induces nonlinear constraints, which are replaced by the so-called McCormick inequalities [9]. We denote the resulting weak linear relaxation of the BoxQP by LP M , see Sect. 2. However, cutting planes from the Boolean Quadric Polytope can be used to turn it into a very strong relaxation of the BoxQP. For given A ∈ Z m×n , b ∈ Z m , and the problem Ax ≥ b with x ∈ Z n , these cuts are Chvátal-Gomory cuts: for any α ∈ R m + with α T A ∈ Z n . Furthermore, it is shown that all Chvátal-Gomory cuts for the Boolean Quadric Polytope are 0 − 1 2 −Chvátal-Gomory cuts (i.e., α ∈ {0, 1 2 } m ). Caprara and Fischetti [7] show that separating 0 − 1 2 −Chvátal-Gomory cuts is N P-hard in general. However, Koster et al. [8] study ways to separate them effectively in practice. Fortunately for our purposes, all 0 − 1 2 −Chvátal-Gomory cuts of the Boolean Quadric Polytope are dominated by A-odd cycle inequalities, see [3], which can be separated in polynomial time [2].
Boros et al. [4] present a system of inequalities that describes the polyhedron given by the linear relaxation of the Boolean Quadric Polytope with all variables together with all 0 − 1 2 -Chvátal-Gomory cuts. Their formulation has O(n 2 ) variables and contains O(n 3 ) inequalities in addition to the O(n 2 ) inequalities from the linear relaxation of the Boolean Quadric Polytope. We present an extended formulation which has also O(n 2 ) variables but only adds O(|E|n) inequalities, where |E| is the number of non-zeros in Q above its diagonal. This extended formulation is much more compact for sparse matrices Q, since we have |E| n 2 in this case.

Previous work and notation
This section is basically a summary of some results by Bonami et al. [3] and forms the basis for everything novel we derive in Sect. 3. We slightly deviate from the notation adopted in [3]. Let N denote the set {1, . . . , n} and E := i j | {i, j} ⊆ N , i = j, Q i j = 0 be the set of undirected edges. Notice that E is well-defined since Q i j = 0 if and only if Q ji = 0 by symmetry of Q.
A weak linear relaxation of the BoxQP can be obtained by replacing the nonlinear constraints of an obvious convex relaxation of the BoxQP by the so-called McCormick inequalities, see [9]. The resulting linear program is given by Notice that the constraint x i ≥ X i j additionally encompasses x j ≥ X i j when applying it to ji = i j ∈ E. Furthermore, this relaxation is strengthened to the Boolean Quadric Polytope can be defined as The inequalities of BQP L P imply box-constraints 0 ≤ x i ≤ 1 for all i ∈ N , since 0 ≤ X i j ≤ x i and x i + x j − 1 ≤ x j for all i j ∈ E, cf. [9]. Moreover, if we combine the McCormick inequalities X i j ≥ 0 and X i j ≥ x i + x j − 1, we have Analogously, adding up x i ≥ X i j and x j ≥ X ji , while using i j = ji, yields For inequalities (A i j ) and (B i j ), we define their slacks To obtain additional cuts for BQP, certain combinations of (A i j )and (B i j )-inequalities are useful. Let E A ⊆ E be the set of all edges i j for which we use inequality (A i j ) and E B ⊆ E be the set of all edges i j for which we use inequality (B i j ). Let We combine (A i j )and (B i j )-inequalities such that |E A | is odd and E A∪ E B is a simple cycle of arbitrary parity, where∪ denotes the disjoint union.

Remark 2.2 Adding inequality (
Adding up inequalities for a simple cycle E A∪ E B yields Subtracting |E A | and dividing by 2 yields In the case of |E A | odd, we can strengthen this inequality, cf. [3], to which yields after another transformation That is equivalent to and substituting by w A i j and w B i j yields We call inequality (3) an A-odd cycle inequality. Notice that this inequality is sometimes just called odd cycle inequality in literature, but we want to avoid confusion with cycles of odd cardinality. Thus, we prefix A to emphasize that |E A | must be odd and the parity of the cycle itself is irrelevant.

Separation and extended formulation
The following construction is similar to the separation algorithm for the cut polytope presented by Barahona and Mahjoub [2]. Nevertheless, we give a detailed explanation on how the separation problem for the A-odd cycle inequalities of BQP can be solved, since our extended formulation depends on the separation technique. Let us mention that, as for every arc ((i, r ), ( j, s)) in H the anti-parallel arc (( j, s), (i, r )) is present too, the separation algorithm would also work with an undirected graph instead of F, that has two vertices connected by an edge and a loop at both vertices. However, this does not make a difference to our extended formulation (if, for example, some variable f i j is involved for edge i j, then it simultaneously produces an inequality for variable f ji ). Also, directed edges are better suited for network-flow formulations such as (A-OC), which will be defined later. Now we assign arc variable w A i j to the arcs ((i, r ), ( j, 1 − r )) ∈ A G×F and w B i j to the arcs ((i, r ), ( j, r )) ∈ A G×F for all r ∈ {0, 1}. Figure 1 shows the structure of H for a single edge i j ∈ E G .
Whenever we use an (A i j )-inequality for an edge i j ∈ E G , an arc with arc variable w A i j in the product graph H is used and the second index of a vertex in H changes from 0 to 1 or from 1 to 0. Otherwise, using a (B i j )-inequality for an edge i j ∈ E G corresponds to an arc with arc variable w B i j in H and the second index does not change. Definition 3.2 A u-v-path/walk in G together with an assignment of arc variables, either w A i j or w B i j , to every edge i j of the respective path/walk is called A-odd if the total number of assigned arc variables w A i j is odd. If the total number is even, the u-v-path/walk is called A-even.

Lemma 3.4 Let (x,X ) be fixed. Thew-weight of a shortest A-odd cycle in G is equal to thē
Proof Notice first that thew-weight of a shortest (i, 0)-(i, 1)-path in H is equal to thewweight of a shortest (i, 1)-(i, 0)-path in H because for every arc there is an anti-parallel arc of equal weight. Let P be a shortest (i, 0)-(i, 1)-path of all (i, 0)-(i, 1)-paths in H with i ∈ N . If the first index of all vertices except (i, 0) and (i, 1), which serve as start and end point, on P is different, then there is nothing to show. Otherwise, if for some j both vertices ( j, 0) and ( j, 1) lie on P, thew-weight of the subpath between ( j, 0) and ( j, 1) cannot exceed thew-weight of P as we do not have negative arc weights. Conversely, the subpath between ( j, 0) and ( j, 1) cannot have lessw-weight than P by the assumption of P being one of the shortest of all (i, 0)-(i, 1)-paths in H with i ∈ N . Without loss of generality we can replace P by aw-shortest ( j, 0)-( j, 1)-path with fewer arcs. Successively, we end up in the first case.

Lemma 3.5
Given (x,X ) ∈ BQP L P . Then (x,X ) violates an A-odd cycle inequality if and only if there exists a path from (i, 0) to (i, 1) in H for some i ∈ N ofw-weight less than 1.
Proof Let E A∪ E B be the edge set of a simple cycle C in G whose A-odd cycle inequality is violated by (x,X ). Then (3) and because |E A | is odd, there exists a path from (i, 0) to (i, 1), and one from (i, 1) to (i, 0), in H for all i ∈ C that add up the samew A i j andw B i j as given above. Thus, the weight of this path is less than 1.
For the converse, consider a path from (i, 0) to (i, 1) in H with weight less than 1. Analogously, there exists an A-odd cycle in G of equal weight. Theorem 3.1 allows us to solve the separation problem for the A-odd cycle inequalities of BQP with a linear program. Ahuja et al. [1,Chapter 9.4] find a shortest s-t-path by solving a minimum cost flow problem, i.e., by sending one unit of flow from vertex s to vertex t through the network. We apply their technique of using duality to our graph H and consider for fixed i ∈ N the linear program where the f -variables are indexed by two vertices in H , hence f is jt is shorthand for f (i,s),( j,t) and relates to a lower bound for the weight of paths from (i, s) to ( j, t). Moreover, (i, 0) serves as the start vertex and (i, 1) as the target vertex of a path. Notice that we partition the inequalities into two types. Both of them bound the variables f i0 js , and hence the weight of a shortest (i, 0)-( j, s)-path in H , from above by the weight of any (i, 0)-( j, s)-path in H where the last arc is fixed. The first type of inequalities ensures that the last arc on the path has arc weightw A k j = 2X k j −x k −x j + 1 whenever s = t in order to arrive at vertex ( j, s), i.e., when the last inequality on the path is an (A k j )-inequality. For s = t, that is, if the last inequality is a (B k j )-inequality, we have to use arcs with arc weightw B k j = −2X k j +x k +x j . Thus, the above linear program increases variable f i0i1 until it is equal to the weight of a shortest (i, 0)-(i, 1)-path in H and therefore it does not exceed the weight of any (i, 0)-(i, 1)-path in H . Obviously, this is also true for every variable f i0 js if vertex ( j, s) is part of a shortest (i, 0)-(i, 1)-path.
If the objective value of a solution of this linear program is greater or equal than 1 for every i ∈ N , then (x,X ) fulfills all A-odd cycle inequalities of BQP.
Using this idea, we obtain the following compact extended formulation that enforces all A-odd cycle inequalities. Notice that x and X as well as w are now variables in contrast to the separation linear program (A-OC).

Theorem 3.6
The linear system together with Eqs. (1) and (2) is an extended formulation of the (potentially exponentially many) A-odd cycle inequalities of BQP and therefore provides a relaxation for BQP.
Proof Let (x,X ) ∈ BQP L P . Then the weightsw A andw B are explicitly given by equations (1) and (2). We first show that if inequalities (3) are fulfilled by (x,X ), then for every pair (i, r ) and ( j, s) in V G×F there existsf ir js such that (x,X ,f ) is feasible for inequalities (4)-(7). Definē f ir js as the weight of a shortest (i, r )-( j, s)-path in H if such a path exists. Otherwise, assign a large value tof ir js . Inequalities (4) are obviously fulfilled, as shortest paths from a vertex to itself have weight 0 in digraphs where all arc weights are nonnegative. Inequalities (5) and (6) express that the weight of a shortest (i, r )-( j, s)-path cannot exceed the weight of an (i, r )-( j, s)-path where the last arc is fixed, which is always true. Finally, Lemma 3.5 ensures that inequalities (7) are fulfilled.
Conversely, let (x,X ,f ) be feasible for inequalities (4)- (7). Thenf irir = 0 for every i ∈ N and r ∈ {0, 1} by Eq. (4), which is equal to the weight of a shortest path from vertex (i, r ) in H to itself. Consider the case (k, t) = (i, r ) in inequalities (5) and (6). For every i j ∈ E, variables f ir js are bounded from above by arc variables w A i j if r = s. Moreover, variables f ir js are bounded from above by arc variables w B i j if r = s. Taking those cases for inequalities (5) and (6) into account, where (k, t) = (i, r ), every variable f ir js is bounded from above by the weight of a shortest path from (i, r ) to ( j, s) in H . Thus, for every vertex pair (i, r ) and ( j, s) in H , the valuef ir js is lower or equal than the weight of a shortest path from (i, r ) to ( j, s) in H . Sincef i0i1 is lower or equal than the weight of a shortest (i, 0)-(i, 1)-path in H andf i0i1 ≥ 1 for i ∈ N , every shortest (i, 0)-(i, 1)-path in H has weight at least 1. This holds for every i ∈ N and therefore all A-odd cycle inequalities (3) are fulfilled by (x,X ), see Lemma 3.5.

Remark 3.7
The extended A-odd cycle formulation in Theorem 3.6 requires 4n 2 additional variables f , whereas the w A -and w B -defining equations can be replaced by their definition in terms of x and X . In total, 16|E|n + n inequalities are added. Notice that f irir for all i ∈ N and r ∈ {0, 1} are just constant numbers and every edge k j ∈ E produces two inequalities for fixed i ∈ N and r , s, t ∈ {0, 1}.
The extended formulation of Boros et al. [4] includes four different inequalities for every subset {i, j, k} of three different vertices of N , that all involve variables X i j , X ik , and X jk , independent of the sparsity of Q. Each of these triangle inequalities is equivalent to one A-odd cycle inequality for a cycle of length three in a complete graph and vice versa. The number of inequalities (5) and (6) in our extended formulation depends on the cardinality of N and E. The latter represents the density of Q and hence our formulation is much more compact if Q is sparse.

Numerical experiments
In a computational study, Bonami et al. [3] add 0 − 1 2 −Chvátal-Gomory cuts heuristically to the basic relaxations LP M and QP M 2 in CPLEX. They apply this method to the 99 BoxQP test instances of Nemhauser and Vandenbussche [11], Burer and Vandenbussche [6], and Burer [5]. Our contribution is to compute the bounds that arise from exact A-odd cycle separation for the pure linear program LP M . To this end, we add the extended relaxation from Theorem 3.6 to the constraint set of LP M and solve the resulting LP with CPLEX v. 12.8.0.0 on the same benchmark set. The computation takes a few seconds on an 8-core Intel Core i7-6700 CPU machine running at 3.40 GHz for each of the smaller instances, whereas a few minutes of running time where consumed when solving the larger ones. The largest instances required up to two hours of running time, but our focus here is more on the strength of the relaxation than on raw speed.
Adding the extended relaxation from Theorem 3.6 to QP M 2 gives a convex quadratic program with a large number of variables and inequalities. Although we could by far not solve all of these QPs in reasonable running time, we compute the solutions for a subset of all test instances, i.e., all instances where n ≤ 40 whose density is not too high. Even for some of these small instances, the computation took several hours.
The optimality gap is defined as where z BoxQP is the optimal objective value of the BoxQP and z is the optimal objective value of the considered relaxation. We denote the bounds arising from LP M and QP M 2 , respectively, by z M and z M 2 . The notation LP • M is used when 0 − 1 2 −Chvátal-Gomory cuts are added heuristically to LP M via CPLEX (see [3]) and LP • M is used for LP M strengthened with all inequalities (4)-(7) together with Eqs. (1) and (2). We define QP • M 2 and QP • M 2 analogously. The bounds arising from optimal solutions of LP respectively. Let d be the percentage of non-zeros in Q. An instance is called sparse, medium, or dense, if d ≤ 40%, 40% < d ≤ 60%, or d > 60%, respectively. Moreover, we divide these classes further into small (n ∈ {20, 30, 40}), medium (n ∈ {50, 60, 70}), large (n ∈ {80, 90}), and jumbo (n ∈ {100, 125}).
The set of small test instances contains 6 sparse, 9 medium, and 27 dense instances. Table 1 shows how much of the optimality gap of LP M and QP M 2 is closed by the A-odd cycle inequalities when adding them heuristically and when separating all of them.
The average optimality gap left by LP M , QP M 2 , LP • M , QP • M 2 , and LP • M , respectively, for all instances with n ≥ 50 is visualized in Figs. 2 and 3. We obtain that the pure linear  relaxation LP M together with all or at least some A-odd cycle inequalities closes by far more of the optimality gap than the quadratic relaxation QP M 2 without the A-odd cycle inequalities, regardless of the choice of the size n or the density d. Moreover, the impact of the A-odd cycle inequalities increases when decreasing the density of Q. Especially on sparse instances, the relaxation LP • M is much stronger than LP • M . Furthermore, LP • M is on average stronger than QP • M 2 , except for dense instance, where the difference regarding the optimality gap is very small. For all test instances, bounds and optimality gaps are listed in the appendix.

Conclusion
We showed how to construct a tight compact extended relaxation for nonconvex QP with box constraints by enforcing the A-odd cycle inequalities for the Boolean Quadric Polytope. Therefore, we avoid running multiple rounds of a separation algorithm. On a large benchmark set, our computational results illustrate how efficient it is to strengthen the weak linear relaxation LP M . Since our strengthened relaxation remains linear, it is applicable in practice. We think that our extended formulation might be most useful in problems that have a small quadratic programming substructure within larger structures, so that the quadratic subproblem can be strengthened in one step.
Acknowledgements The authors would like to thank the reviewers and the editor for their insightful comments that significantly improved this manuscript. They also like to thank Jeff Linderoth for his fine presentation on MINLP at the ALOP summer school 2018 in Trier initiating this research and for very helpful comments on early versions of this manuscript. Both authors were supported by the Research Training Group 2126 Algorithmic Optimization (ALOP), funded by the German Research Foundation DFG. The results presented here are part of Bernd Perscheid's Ph.D. thesis [10].
Funding Open Access funding enabled and organized by Projekt DEAL.
Availability of data, material, and code Data, material, and code concerning the results presented here are available upon request.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Table 2 extends Table 9 in [3] by adding the bounds z • M and z • M 2 that arise from exact separation. Notice that the bounds z • M 2 are at least as strong as the bounds z • M . However, computing z • M 2 was only possible in reasonable time for instances with n ∈ {20, 30} and for half of the instances with n = 40.

Appendix A. Numerical results
The first number in the name of the test instance is equal to n, i.e., the number of variables of the BoxQP. The second number expresses the percentage of non-zeros in Q and the third number enumerates different test instances that have the same parameters.
In Table 3, we list the optimality gap for every relaxation LP M , Table 2, except for those QP • M 2 that were not solved.      Table 3 continued