Enhancements of Discretization Approaches for Non-Convex Mixed-Integer Quadratically Constraint Quadratic Programming: Part II

This is Part II of a study on mixed-integer programming (MIP) relaxation techniques for the solution of non-convex mixed-integer quadratically constrained quadratic programs (MIQCQPs). We set the focus on MIP relaxation methods for non-convex continuous variable products and extend the well-known MIP relaxation normalized multi-parametric disaggregation technique (NMDT), applying a sophisticated discretization to both variables. We refer to this approach as doubly discretized normalized multiparametric disaggregation technique (D-NMDT). In a comprehensive theoretical analysis, we underline the theoretical advantages of the enhanced method D-NMDT compared to NMDT. Furthermore, we perform a broad computational study to demonstrate its effectiveness in terms of producing tight dual bounds for MIQCQPs. Finally, we compare D-NMDT to the separable MIP relaxations from Part I and a state-of-the-art MIQCQP solver.


Introduction
In this work, we study relaxations of general mixed-integer quadratically constrained quadratic programs (MIQCQPs). More precisely, we consider discretization techniques for non-convex MIQCQPs that allow for relaxations of the set of feasible solutions based on mixed-integer programming (MIP) formulations.
To this end, we study a number of MIP formulations that form relaxations of the quadratic equations z " x 2 and z " xy. These MIP relaxations can then be applied to MIQCQPs by introducing auxiliary variables and constraints for each quadratic term to form a relaxation of the overall problem. In particular, we consider the strength of various MIP relaxations applied directly to a given problem, which is the simplest approach to enable the solution of MIQCQPs via an MIP solver. Our focus here is to analyze these approaches both theoretically and computationally with respect to the quality of the dual bound they deliver for MIQCQPs. Dual bounds give a lower bound for the optimal value in a minimization problem. The term comes from the so-called dual program, which can also be used to determine such bounds. Background MIQCQPs naturally arise in the solution of many real-world optimization problems, stemming e.g. from the contexts of power supply systems ( [2]), gas networks ( [19,27]), water management ( [23]) or pooling/mixing ( [6,11,15,30,31]). See [25,37] and the references therein for more examples. For the solution of such problems, there are a number of different approaches, which differ in case the problems are convex or non-convex. Within this work, we focus on the most general case, i.e. non-convex MIQCQPs, and only require finite upper and lower bounds on the variables.
In the literature, a variety of solution techniques for non-convex MIQCQPs exists. The most prominent class among them are McCormick -based techniques, see e.g. [12,13,14,16,35,36]. For quadratic programs, in particular, convexification can be applied to bivariate monomials xy by introducing a new variable z " xy and constructing the convex hull over the bounds on x and y. This yields the so-called McCormick relaxation, which is the smallest convex set containing the feasible set of the equation z " xy for given finite bounds on x and y. This relaxation is known to be a polytope described by four linear inequalities (see [34]), and it is tighter the smaller the a priori known bounds on x and y are. Hence, one standard solution approach is spatial branch-and-bound, where the key idea is to split the domain recursively into two subregions. For instance, one can choose the two subregions where x ďx and x ěx, respectively, for some valuex. By branching on subregions, we can improve the convexification of the feasible region by adding valid inequalities to the subproblems. Thus, applying spatial branch-and-bound in conjunction with convexification (such as McCormick Relaxations) sequentially tightens the relaxation of the problem.
Alternatively, similar effects can be achieved through some kind of binarization. This is a general term that describes the conversion of continuous or integer variables into binary variables. By branching on these new binary variables, we also partition the space into subproblems in a way that simulates spatial branch-and-bound. The binarization of the partition makes the resulting prob-lem a piecewise linear (p.w.l. ) relaxation of the original problem with binary auxiliary variables. McCormick-based methods can differ in the way the partition and the binarization are performed. The partition can be performed purely on one variable or on both variables, equidistantly or non-equidistantly. The binarization can be done linearly or logarithmically in the number of partition elements, see [40,32]. In a broader sense, (axial-)spatial branching for bilinear terms can also be seen as a piecewise McCormick linearization approach. Here, the partition is not performed a priori, but rather an initial partition is refined via branching on continuous variables. An overview of spatial-branching techniques can be found in [8].
Another common idea for linearizing variable products is to use quadratic convex reformulations as in [9,26,22,21,7]. This technique transforms the nonconvex parts of the problem into univariate terms via reformulations. In [7], the authors apply diagonal perturbation to convexify the quadratic matrices. The resulting univariate quadratic correction terms are then linearized by introducing new variables and constraints of the form z i " x 2 i , which are then approximated by p.w.l. functions. The binarization of the univariate p.w.l. functions is done logarithmically by using the so-called sawtooth function, introduced in [42]. An advantage of this approach is that only linearly many expressions of the form z i " x 2 i have to be linearized instead of quadratically many equations of the form z ij " x i x j , with respect to the dimension of the original quadratic matrix. This approach yields a convex MIQCQP relaxation instead of the MIP relaxation obtained via direct modeling using bilinear terms. See also [1] that adapts the branch and bound approach αBB [3] to general twice differentiable objectives by providing convex reformulations via perturbations.
A further set of approaches relies on separable reformulations of the nonconvex variable products, as done e.g. in [5]. Here, each term of the form xy is reformulated as a sum of separable univariate terms, for example using the equivalent reformulation xy " 1 {2px 2`y2´p x´yq 2 q " 1 {2pr`s´tq with r " x 2 , s " y 2 , and t " px´yq 2 as described by [4]. The univariate constraints, here equations of the form r " x 2 , s " y 2 , and t " px´yq 2 , are then relaxed. Again, this approach can be combined with a logarithmic encoding of the univariate linear segments, as in [22,7]. In [5], the authors analyze the following possible reformulations: Bin2 : xy " 1 {2`px`yq 2´x2´y2˘, Bin3 : xy " 1 {2`x 2`y2´p x´yq 2˘.
They prove that MIP-based approximations of each of these univariate reformulations require fewer binary variables than a bivariate MIP-based approximation that guarantees the same maximal approximation error, if this prescribed error is small enough. However, this comes at the cost of weaker linear programming (LP) relaxations. Alternatively, one can also obtain an MIP relaxation of xy directly via a bivariate p.w.l. relaxation, see e.g. [5,10,27,40]. One way to do this is to perform a triangulation of the domain, which defines a p.w.l. approximation of the variable product. This p.w.l. approximation can then easily be converted into a relaxation of the feasible set by axis-parallel shifting, which yields a p.w.l. underestimator and overestimator. Bivariate p.w.l. approximations can also be binarized using (logarithmically-many) binary variables, see e.g. [27,40,32].
Contribution We compare different MIP relaxation approaches, both known ones, and a new one, in terms of the dual bound, they impose for non-convex MIQCQPs. We extend the separable approximation approaches Bin2 and Bin3 from [5] to MIP relaxations for z " xy. Additionally, we introduce a novel MIP relaxation for z " xy called hybrid separable (HybS) that is based on a sophisticated combination of Bin2 and Bin3 that allows us to relax only linearly-many univariate quadratic terms (in the dimension of the quadratic matrix). In a theoretical analysis, we show that HybS has theoretical advantages, such as fewer binary variables and better LP relaxations compared to Bin2 and Bin3. We combine HybS, Bin2, and Bin3 with an MIP relaxation, called sawtooth relaxation, for z " x 2 that requires only logarithmically-many binary variables with respect to the relaxation error. Thus, we can obtain MIP relaxations for MIQC-QPs. The sawtooth relaxation is an extension of the sawtooth approximation from [7], which has the strong property of hereditary sharpness. The hereditary sharpness of an MIP formulation means that the formulation is tight in the space of the original variables, even after branching on integer variables. We can show that the sawtooth relaxation is also hereditary sharp.
Finally, we perform an extensive numerical study where we generate MIP relaxations of non-convex MIQCQPs. Foremost, we test the different relaxation techniques in their ability to generate tight dual bounds for the original quadratic problems. We will see that HybS has a clear advantage over its predecessors Bin2 and Bin3. This effect becomes even more apparent on dense instances.
We present Part II of this work in a separate paper, where we study MIP relaxations that are distinctly different and are extensions of the normalized multiparametric disaggregation technique (NMDT) [13]. We provide further theoretical and computational analyses. The NMDT uses a combination of McCormick envelopes and selective discretization of variables; it was useful in some applications to chemical engineering. In addition, we perform a comparison of HybS with NMDT-based methods and Gurobi as an MIQCQP solver.
Outline We proceed as follows. In Section 2, we introduce several useful concepts and notations used throughout the work. In Section 3, we present core formulations used repeatedly in our linear relaxations of quadratic terms. In Section 4, we introduce the new MIP relaxation HybS for equations of the form z " xy. In Section 5, we prove various properties about the strengths of this MIP relaxation focusing on volume, sharpness, and optimal choice of breakpoints. In Appendix B we prove that the sawtooth relaxation is hereditarily sharp. In Section 6, we present our computational study.

MIP Formulations
In this work, we study relaxations of general mixed-integer quadratically constrained quadratic programs (MIQCQPs), which are defined as for Q 0 , Q j P R nˆn , c 0 , c j P R n and b j P R, j " 1, . . . m.
Throughout this article, we use the following convenient notation: for any two integers i ď j, we define i, j :" ti, i`1, . . . , ju, and for an integer i ě 1 we define i :" 1, i . We will denote sets using capital letters but also use capital letters for matrices, some functions, and the number of layers L. We typically denote variables using lowercase letters and vectors of variables using boldface. For a vector u " pu 1 , . . . , u n q and some index set I Ď n , we write u I :" pu i q iPI . Thus, e.g. u i " pu 1 , . . . , u i q. Furthermore, we introduce the following notation: for a function F : X Ñ R and a subset B Ď X, let gra B pF q, epi B pF q and hyp B pF q denote the graph, epigraph and hypograph of the function F over the set B, respectively. That is, gra B pF q :" tpu, zq P BˆR : z " F puqu, epi B pF q :" tpu, zq P BˆR : z ě F puqu, hyp B pF q :" tpu, zq P BˆR : z ď F puqu.
In the following, we introduce the concept of MIP formulations as well as properties regarding MIP formulations which will be used later on.
We will study mixed-integer linear sets, so-called mixed-integer programming (MIP) formulations, of the form P IP :" tpu, v, zq P R d`1ˆr 0, 1s pˆt 0, 1u q : Apu, v, zq ď bu for some matrix A and vector b of suitable dimensions. The linear programming (LP) relaxation or continuous relaxation P LP of P IP is given by P LP :" tpu, v, zq P R d`1ˆr 0, 1s pˆr 0, 1s q : Apu, v, zq ď bu.
We will often focus on the projections of these sets onto the variables u, i.e. proj u pP IP q :" tu P R d`1 : Dpv, zq P r0, 1s pˆt 0, 1u q s.t. pu, v, zq P P IP u. (2) The corresponding projected linear relaxation proj u pP LP q onto the u-space is defined accordingly.
In order to assess the quality of an MIP formulation, we will work with several possible measures of formulation strength. First, we define notions of sharpness, as in [7,29]. These relate to the tightness of the LP relaxation of an MIP formulation. Whereas properties such as total unimodularity guarantee an LP relaxation to be a complete description for the mixed-integer points in the full space, we are interested here in LP relaxations that are tight description of the mixed-integer points in the projected space.
Definition 1 (Sharpness). We say that the MIP formulation P IP is sharp if proj u pP LP q " convpproj u pP IP qq holds. Further, we call it hereditarily sharp if, for all I Ď q andẑ P t0, 1u |I| , we have proj u pP LP | z I "ẑ q " conv`proj u pP IP | z I "ẑ q˘.
Sharpness expresses a tightness at the root node of a branch-and-bound tree.
Hereditarily sharp means that fixing any subset of binary variables to 0 or 1 preserves sharpness, and therefore this means sharpness is preserved throughout a branch-and-bound tree.
In this article, we study certain non-polyhedral sets U Ď R d`1 and will develop MIP formulations P IP to form relaxations of U in the projected space, as defined in the following.
We now define several quantities to measure the error of an MIP relaxation.

Definition 3 (Error).
For an MIP relaxation P IP of a set U Ď R d`1 , let u P proj u pP IP q. We then define the pointwise error ofū as We next define the following two error measures for P IP w.r.t. U : 1. The maximum error of P IP w.r.t. U is defined as 2. The average error of P IP w.r.t. U is defined as E avg pproj u pP IP q, U q :" volpP IP zU q.
Via integral calculus, the second, volume-based error measure can be interpreted as the average pointwise error of all points u P proj u pP IP q. Note that whenever the volume of U is zero (i.e. it is a lower-dimensional set), the average error just reduces to the volume of P IP . Both of the defined error quantities for an MIP relaxation P IP can also be used to measure the tightness of the corresponding LP relaxation P LP . In Section 5.3.2, we use these to compare formulations when P LP is not sharp.

Core Relaxations
In the definition of the MIP relaxations studied in this work, we repeatedly make use of several "core" formulations for specific sets of feasible points. They are introduced in the following.
For our relaxations of MIQCQPs, we will frequently need to consider terms of the form z " xy for continuous or integer variables x and y within certain bounds D x :" rx,xs and D y :" rȳ,ȳs, respectively. To this end, we introduce the function F : D Ñ R, F px, yq " xy, D :" D xˆDy , and refer to the set of feasible solutions to the equation z " xy via the graph of F , i.e. gra D pF q " tpx, y, zq P DˆR : z " xyu. In order to simplify the exposition, we will, for example, often write gra D pxyq or refer to a relaxation of the equation z " xy instead of gra D pF q. We will do this similarly for the epigraph and hypograph of F as well as for the univariate function f : D x Ñ R, f pxq " x 2 and equations of the form z " x 2 , for example.

McCormick Envelopes
The convex hull of the equation z " xy for px, yq P D is given by a set of linear equations known as the McCormick envelope. See [34].
x¨y`x¨ȳ´x¨ȳ ď z ďx¨y`x¨ȳ´x¨ȳ, x¨y`x¨ȳ´x¨ȳ ď z ď x¨y`x¨ȳ´x¨ȳ.

Sawtooth-Based MIP Formulations
We next recall an MIP formulation for approximating equations of the form z " x 2 that requires only logarithmically-many binary variables in the number of linear segments. It makes use of an elegant p.w.l. formulation for gra r0,1s px 2 q from [42] using the recursively defined sawtooth function presented in [39] to formulate the approximation of gra r0,1s px 2 q, as described in [7]. Let L be an positive integer and let F L be the piecewise linear interpolation of x 2 at uniformly spaced breakpoints i 2 L for i " 0, 1, . . . , 2 L ; see Figure 1. This function has a convenient recursive definition [42,39]. To this end, define the "tooth" function G : r0, 1s Ñ r0, 1s, Gpxq " mint2x, 2p1´xqu. Subsequently, we define compositions of the tooth function Under this notation, we can formally define the function F L : r0, 1s Ñ r0, 1s, We summarize useful information from [42,7] about the approximation F L . These properties will be used in our analysis of the models that we propose.
Proposition 1 ([42,7]). The function F L satisfies the following properties: 1. The function F L is the piecewise linear interpolation of x 2 at uniformly spaced breakpoints i 2 L for i " 0, 1, . . . , 2 L ; see Figure 1. The shifted piecewise linear function F L´2´2L´2 has each affine part being the tangent to x 2 at the midpoint i 2 L`1 2 L`1 ; see Figure 2.
The function F L is convex on the interval r0, 1s.
The successive piecewise linear approximations of x 2 shifted down to be underestimators. The markers indicate the places where the underestimators coincide with x 2 and in fact, show that the affine segments are tangent lines to the function. The inequality z ě F L pxq´2´2 L´2 in fact creates 2 L tangent lower bounds.
Following [7], we create an MIP formulation to encode this piecewise linear function. We create variables g j to represent the output of a "sawtooth" function of x and binary variables α P t0, 1u L that represent decision in Gpxq that either 2x ď 2p1´xq or 2p1´xq ď 2x. In particular, we design the formulation such when α P t0, 1u L , the relationship between g j and g j´1 is g j " mint2g j´1 , 2p1´g j´1 qu for j " 1, . . . , L, To this end, we define a formulation parameterized by the depth L P N: g 0 " x, 2pg j´1´αj q ď g j ď 2g j´1 j " 1, . . . , L, 2pα j´gj´1 q ď g j ď 2p1´g j´1 q j " 1, . . . , L.
Using the relationships (5) and (6) between x and g, any constraint of the form z " x 2 can be approximated via the function for an integer L ě 0. We use the above definitions to give an MIP formulation that approximates equations of the form z " x 2 . Fig. 3. The sawtooth relaxation from Definition 5 at depths L " 0, 1, 2. The shaded region is the relaxation. Some additional inequalities are plotted to help visualize the inequalities with respect to the functions F j .
Based on the sawtooth approximation, we can now present the sawtooth relaxation for z " x 2 from [7], illustrated in Figure 3, which arises by shifting each approximating function F j , j " 0, . . . , L, down by its maximum error 2´2 j´2 (established in Proposition 1, Item 2) and then adding additional outer-approximation cuts to x 2 at x " 0 and x " 1.

(12)
Remark 1 (Transformation to General Bounds). To this point, the sawtooth MIP formulations were presented for x 2 with x P r0, 1s. However, all sawtoothbased MIP formulations can be extended to general intervals x P rx,xs by mapping rx,xs to r0, 1s via the substitutionx " x´x x´x P r0, 1s and applying the sawtooth formulation to model the equation Thus, for general intervals, we first apply the approximation toẑ "x 2 , then add the equationsx In our computational study in Section 6, these constraints are implemented as defining expressions forx andẑ, and the MIP formulations are constructed forx andẑ then. See Appendix A for the generalized MIP formulations under this transformation.N ow, we consider the LP relaxation of S L , where each variable α j is relaxed to the interval r0, 1s. Then, via the constraints (8), we see that the weakest lower bounds on each g j w.r.t. g j´1 can be attained via setting α j " g j´1 , yielding a lower bound of 0. Thus, after projecting out α, the LP relaxation of S L in terms of just x and g can be stated as The sawtooth relaxation (11) is sharp by Theorem 1 (proved later in this work), which follows in much the same way as the sharpness of the sawtooth approximation (10), as established in [7,Theorem 1]. Thus, the LP relaxation of the sawtooth relaxation (11) yields the same lower bound on z as the MIP version due to sharpness and the convexity of F L . This allows us to define an LP outer approximation for inequalities of the form z ě x 2 : Fig. 4. The tightened sawtooth relaxations R L,L 1 from Definition 7 for the pairs pL, L1q " p0, 1q, p0, 2q, p1, 2q. By increasing L1 beyond L, we tighten the lower bound by creating more inequalities. This is done by only adding linearly-many variables and inequalities in the extended formulation to gain exponentially-many equally spaced cuts in the projection.
We will prove in Proposition 2 that the maximum error for the sawtooth epigraph relaxation is 2´2 L´4 .
Finally, we combine the depth-L sawtooth relaxation (11) with the depth-L 1 sawtooth epigraph relaxation (14) for some L 1 ě L to obtain a sawtooth relaxation which is stronger in the lower bound, but uses the same number of binary variables.
Definition 7 (Tightened Sawtooth Relaxation, TSR). Given some L, L 1 P N with L 1 ě L, the tightened sawtooth relaxation for z " x 2 on the interval x P r0, 1s with upper-bounding depth L and lower-bounding depth L 1 is given by R L,L1 :" tpx, zq P r0, 1sˆR : Dpg, αq P r0, 1s L1`1ˆt 0, 1u L : (17)u. (16) z ď f L px, g 0,L q, px, g 0,L , αq P S L , (17b) We connect the last constraints with a brace since there are all defining constraints for Q L1 . Since we define Q L1 in the projection space px, zq, we cannot simply write px, gq P Q L1 since we need the same α and g to apply to the other constraints as well.
We will prove in Theorem 1 that the tightened sawtooth relaxation is also sharp, and in Theorem 2 that it is hereditarily sharp.
for which all terms of the form x 2 i and x i x j appear in either the objective or in some constraint. The novel formulation HybS presented herein is an extension of existing formulations Bin2 and Bin3, designed to significantly reduce the number of binary variables required to reach the same level of relaxation accuracy compared to its original predecessors Bin2 and Bin3 for completely dense MIQCQPs, which will also be introduced in the following.

Separable MIP Relaxations
We present three MIP relaxations based on separable reformulations. A separable reformulation turns a multivariate expression into a sum of univariate functions. To this end, we make use of the reformulation approaches Bin2 and Bin3, given via Bin2: xy " 1 2 ppx`yq 2´x2´y2 q, Bin3: xy " 1 2 px 2`y2´p x´yq 2 q, see e.g. [5], and combine them with the sawtooth relaxation (16) to derive MIP relaxations for the occurring equations of the form z " xy. While the following MIP relaxations on Bin2 and Bin3 are natural extensions of the MIP approximations studied in [5] to MIP relaxations, we will also combine both reformulations to a new formulation in which the MIP relaxation requires significantly less binary variables if it is used to solve problems of the form (1) As a reminder, in the definitions below, the notation M is used to describe the McCormick envelope.
Remark 2. In [5], Bin1: xy " p 1 {2px`yqq 2´p 1 {2px´yqq 2 is also discussed as a possible separable reformulation. However, for completely dense MIQCQPs, Bin1 requires a number of binary variables that is by a factor of roughly 2 greater than that required for Bin2 and Bin3. This is due to the fact that for each bivariate product x i x j , we need to discretize both p 1 {2px i`xj qq 2 and p 1 {2px i´xj qq 2 instead of only one of the two squares for Bin2 and Bin3. Therefore, we omit Bin1 in the following.D efinition 8 (Bin2). The MIP relaxation Bin2 of z " xy, x, y P r0, 1s 2 , with a lower-bounding depth L 1 P N and an upper-bounding depth L P N, is defined as follows: p " x`y z " 1 {2pz p´zx´zy q px, y, zq P Mpx, yq px, z x q, py, z y q, pp, z p q P R L,L1 x, y P r0, 1s, p P r0, 2s.
Definition 9 (Bin3). The MIP relaxation Bin3 of z " xy, x, y P r0, 1s 2 , with a lower-bounding depth L 1 P N and an upper-bounding depth of L P N, is defined as follows: p " x´y z " 1 {2pz x`zy´zp q px, y, zq P Mpx, yq px, z x q, py, z y q, pp, z p q P R L,L1 x, y P r0, 1s, p P r´1, 1s.
Note that we apply the tightened sawtooth relaxation R L,L1 , defined in (16), not only to x, y P r0, 1s, but also to the variable p, where the domain is either r0, 2s or r´1, 1s. This is done by following the transformation in Remark 1 to map p and z p to the interval r0, 1s and then applying (16) to the transformed variables.
We now combine Bin2 and Bin3 to derive an MIP relaxation for z " xy based on bounding z in the following two ways: and then replacing each right-hand side with proper upper and lower bounds. We choose this setting so that we only have to model lower bounds for the px´yq 2and px`yq 2 -terms and can thus apply the sawtooth epigraph relaxation (14) to circumvent the use of binary variables for these terms. To this end, we introduce the continuous auxiliary variables p 1 , p 2 , z x , z y , z p1 , z p2 and z to obtain an equivalent relaxation for z " xy: Finally, we replace x 2 and y 2 in the non-convex constraints (20b) with a sawtooth relaxation (17a) of depth L and p 2 1 and p 2 2 in the convex constraints (20c) by a sawtooth epigraph relaxation (17f) with depth L 1 to obtain a relaxation of z " xy in (20d). The resulting model is especially interesting as, in contrast to Bin2 and Bin3, it does not require binary variables to model equations of the form p 2 1 " px`yq 2 and p 2 2 " px´yq 2 , since we only need to incorporate lower bounds as used in Q L .
Definition 10 (Hybrid Separable HybS). Let x, y P r0, 1s, and let L, L 1 P N. The following MIP relaxation for z " xy, which combines the relaxations Bin2 and Bin3, is called the hybrid separable MIP relaxation, in short HybS, with a lower-bounding depth of L 1 and an upper-bounding depth of L: p 1 " x`y, p 2 " x´y px, z x q, py, z y q P R L,L1 pp 1 , z p1 q, pp 2 , z p2 q P Q L1 1 {2pz p1´zx´zy q ď z ď 1 {2pz x`zy´zp2 q px, y, zq P Mpx, yq x, y P r0, 1s, p 1 P r0, 2s, p 2 P r´1, 1s.
As Q L1 in (21) is originally defined for variables in r0, 1s, we again use the transformation from Remark 1 to extend it to other domains. Note that, when some constraint of an MIQCQP has a completely dense quadratic matrix, the number of (20c)-type constraints is quadratic in the dimension of x. Thus, the number of binary variables for Bin2 and Bin3 is in Opn 2 Lq, while the formulation HybS requires only nL binary variables. As we will show in Section 5, the formulation HybS also has a strictly tighter LP relaxation than that of either formulation Bin2 or Bin3. This implies a smaller volume of the projected LP relaxation as well. We also note, however, that the MIP relaxation is not strictly tighter. For example, let L " L 1 " 1 and consider the point px, yq " p 1 4 , 3 4 q. The upper bound on z " xy produced by the MIP relaxation Bin2 at this point is z ď 3 16 , i.e. the exact value. The MIP relaxation HybS (as well as Bin3), however, has a weaker upper bound of z ď 1 4 at this point.
When we apply any of the separable formulations Bin2, Bin3 and HybS to compute dual bounds for MIQCQPs in Section 6, all original univariate quadratic terms of the form x 2 i (i.e. those not resulting from any reformulations) are modeled via the tightened sawtooth relaxation (16).

Remark 3.
We can alternatively obtain a convex mixed-integer quadratic relaxation of z " xy by directly incorporating the convex quadratic constraints z x ď x 2 , z y ď y 2 , z p1 ě p 2 1 and z p2 ě p 2 2 in (20) exactly instead of using p.w.l. relaxations. This variation could be implemented using a convex solver instead of a linear solver.R emark 4 (Binary Variables and Dense MIQCQPs). When modeling Problem (1) using the MIP relaxations Bin2 and Bin3 at depth L, we have L binary variables created whenever the tightened sawtooth relaxation R L,L1 is used. For Bin2, we need the relaxations px i , z xi q P R L,L1 and pp ij , z pij q P R L,L1 for all pairs i ‰ j, where p ij " x i`xj . Note that p ij " p ji . Thus, we need pn`1 2 pn´1q 2 qL " 1 2 pn 2`1 qL binary variables. We have the same result for Bin3, where instead we have p ij " x i´xj for all pairs i ‰ j. Although this means p ij ‰ p ji , we still have p 2 ij " p 2 ji . Thus, a careful implementation also has 1 2 pn 2`1 qL binary variables. HybS uses significantly fewer binary variables as it only requires px i , z xi q P R L,L1 for each i. Hence, there are only nL binary variables. Surprisingly, this relaxation halves the error bound from Bin2 and Bin3. The strength in this approach is gained without quadratically-many binary variables by using the tightening set Q L1 with the p 1 -and p 2 -variables.˛

Theoretical Analysis
In this section, we give a theoretical analysis of the presented MIP relaxations for the equation z " xy over x, y P r0, 1s as well as the equation z " x 2 over x P r0, 1s, respectively, in order to allow for a comparison of structural properties between them. In particular, we will analyze their maximum and average errors, formulation strengths, i.e. (hereditary) sharpness and LP relaxation volumes, as well as the optimal placement of breakpoints to minimize average errors. The results we will arrive at are summarized in Table 1. Table 1. A summary of characteristics of the different MIP relaxations. Binary variables and constraints are given in the worst-case, in which every possible quadratic term must be modeled, for example if some matrix Qi is completely dense. The average error for HybS, Bin2 and Bin3 with respect to gra r0,1s 2 pxyq is calculated for L1 Ñ 8 and without the McCormick envelopes added. Finally, the average errors for Bin2 and Bin3 apply only to L ě 1; the corresponding volumes are 7 12 for L " 0. Finite L1 leads to slightly increased error bounds for the methods Bin2, Bin3 and HybS.

Maximum Error
We start the error analysis by discussing the maximum errors of the presented MIP relaxations.

Core Formulations
First, we discuss the maximum errors of the core formulations from Section 3.1. For the sawtooth approximation (10), the maximum error is an overestimation by 2´2 L´2 , see [7]. The maximum error of the sawtooth epigraph relaxation is 2´2 L´4 , which we prove in the following. The tightened sawtooth relaxation stated in (16) uses the sawtooth approximation for overestimation while the lower bound, which is incident with the sawtooth epigraph relaxation (14), gains an extra layer of accuracy, with a maximum error of 2´2 L´4 . Due to the overestimator, the (tightened) sawtooth relaxation has the same maximum error of 2´2 L´2 as the sawtooth approximation.
Proposition 2 (Error of the sawtooth epigraph relaxation). The maximum error of the sawtooth epigraph relaxation Q L for z ě x 2 with x P r0, 1s defined in (14) is 2´2 L´4 .  (18). In the left column, we show the case L " L1 " 1. In the right column, we show L " 1 and L1 Ñ 8.
Proof. The lower-bounding inequalities on z induced by the px, zq-projection of the sawtooth epigraph relaxation, i.e. proj x,z pQ L q, are exactly the supporting valid linear inequalities to z ě x 2 at the points x k :" k 2 L`1 , k " 0, . . . , 2 L ; see Proposition 1. The maximum error is attained at the intersection of two consecutive linear segments on the boundary of the feasible region defined by these inequalities, i.e. at px k , z k q :" p x k`xk`1 2 , x k x k`1 q " ppk`1 2 q2´L´1, kpk1 q2´2 L´2 q. Thus, the maximum error is given by independent of the choice of k.
Bivariate Products in MICQCP's. Second, for bivariate products, i.e., z " xy, in MIQCQPs, we use a different separable reformulation in each approach.
In the following, we derive upper bounds, purely depending on L, and lower bounds, depending on L and L 1 , on the maximum errors for variable products.
Depending on the reformulation, we have to address two different maximum error scenarios in the bounds on z.
We start with the maximum error in the relaxations for z in which x 2 and y 2 are overestimated and p 2 is underestimated. This applies to the upper and lower bound on z in HybS, the lower bound on z in Bin2, and the upper bound on z in Bin3. In each of these cases, the maximum overestimation of both z x " x 2 and z y " y 2 with the sawtooth relaxation is 2´2 L´2 , occurring at the grid centers x k " y k " pk`1 2 q2´L, k " 0, . . . , 2 L´1 . If we combine these points, x k and y k , with a point on the graph of p 2 , i.e. z p " p 2 , this point has an approximation error 0 and we obtain a lower bound for the maximum error in the relaxation of z " xy. Namely, if P IP L,L1 denotes either of the MIP relaxations Bin2, Bin3 or HybS of gra r0,1s 2 pxyq with depths L, L 1 , we have independent of the choice of k. This yields the following proposition.
Proposition 3. The maximum error in the MIP relaxations Bin2, Bin3 and HybS for z " xy with x, y P r0, 1s is at least 2´2 L´2 .
Furthermore, the maximum underestimation of p 2 is 2´2 L1´2 (twice the domain width, which means the error quadruples). This means we have an upper bound of on the maximum error in the lower bound on z in Bin2, the upper bound on z in Bin3 and both the upper and lower bound on z in HybS. We can use this observation to give an upper bound on the maximum error in the MIP relaxation HybS for z " xy. See Figure 6 for the maximum over-and underestimation of the HybS MIP relaxation.
Proposition 4. The maximum error in the MIP relaxation HybS for z " xy with x, y P r0, 1s is at most 2´2 L´2`2´2L1´3 .
Next, we consider the upper bound on z in Bin2 and the lower bound on z in Bin3. Here, we are interested in the overestimation of p 2 and the underestimation of x 2 and y 2 . The maximum overestimation of p 2 is 2´2 L (again, doubling the domain width quadruples the error). Combined with the maximum underestimation of the sawtooth relaxation for x 2 and y 2 of 2´2 L1´4 , this yields an upper bound on the maximum error on z of in terms of overestimation in Bin2 and underestimation in Bin3. Thus, we obtain the following upper bound for the maximum error in Bin2 and Bin3. See Figure 5 for the maximum over-and underestimation of the Bin2 MIP relaxation.
Proposition 5. The maximum error in the MIP relaxations Bin2 and Bin3 for z " xy with x, y P r0, 1s is at most 2´2 L´1`2´2L1´3 .
In summary, we have the same lower bound for the maximum error of 2´2 L´2 in Bin2, Bin3 and HybS. However, the known upper bound 2´2 L´1`2´2L1´4 in HybS is slightly better than that of Bin2 and Bin3 with 2´2 L´1`2´2L1´3 .
Remark 5. In the MIP relaxations Bin2, Bin3, and HybS, increasing L 1 does not introduce any new binary variables. Therefore, we note that in our computations in Section 6 we choose L 1 to be significantly larger than L, such that the maximum error depends primarily on L. As L 1 increases to infinity, the maximum errors in all three MIP relaxations converge to 2´2 L´2 .5

.2 Average Error and Minimizing the Average Error
In this section, we will study the average error of an MIP relaxation by computing the volume enclosed by the projected MIP relaxation as an additional measure of its relaxation quality.
First, we compute the volumes of all presented MIP relaxations. Then we prove that the uniform discretizations, which are used by definition in each MIP formulation in this article, are indeed optimal in terms of minimizing the volume of the projected MIP relaxation if the number of discretization points is fixed (i.e. if L and L 1 are fixed).
In all separable formulations, we use the sawtooth relaxation (11) for equations of the form z " x 2 . In [7,Propostion 6], the authors show that the volume of this relaxation R L,L is 3 {16¨2´2 L . Furthermore, from [7, Proposition 5] it follows that for any fixed number of breakpoints a uniform discretization minimizes the volume of the sawtooth epigraph relaxation.
Next, we consider the volumes for the MIP relaxations of z " xy. We start by showing that Bin2, Bin3 and HybS induce a grid structure in terms of relaxation error and have constant volumes over the resulting grid pieces. While the grid structure for HybS is obvious, we have yet to show it for Bin2 and Bin3. From [5, Table 4], we further know that for L, L 1 Ñ 8 the z-values in the projected LP relaxation of Bin2 (18) are bounded from below by the convex function C L 2 : rx,xsˆrȳ,ȳs Ñ R and from above by the concave func- The same holds for Bin3 (19) and the convex and concave functions C L 3 : rx,xsr ȳ,ȳs Ñ R and C U 3 : rx,xsˆrȳ,ȳs Ñ R, As the upper bound on the z-value in HybS is the same as that for Bin2 and the lower bound is the same as that for Bin3, the respective projected LP relaxations P LP L,L1 in the limit for Bin2, Bin3 and HybS are pproj x,y,z pP LP L,L1 qq " tpx, y, zq P r0, 1s 2ˆR : [Bin3]: lim L,L1Ñ8 pproj x,y,z pP LP L,L1 qq " tpx, y, zq P r0, 1s 2ˆR : [HybS]: lim L,L1Ñ8 pproj x,y,z pP LP L,L1 qq " tpx, y, zq P r0, 1s 2ˆR : In the following discussion, we will let L 1 Ñ 8 in all three formulations. This simplifies the proofs considerably and is relevant in so far as in our computations we use a relatively high value of L 1 " 10, which has a resulting maximum error below the standard accuracy of state-of-the-art MIP solvers (10´6) and yet has no influence on the number of binary variables and uses only OpL 1 q constraints. Although for different values of L 1 the volumes are different, the hierarchy of MIP relaxations that we establish is independent of this choice. We start with the volume of the MIP relaxation HybS. Proposition 6. Let P IP pLx,Lyq,L1 be the MIP relaxation HybS from (21) without the McCormick inequalities, where we now allow for independent discretization depths L x and L y to overestimate x 2 and y 2 , respectively (i.e. with px, z x q P R Lx,L1 and py, z y q P R Ly,L1 ),i.e.
Then the volume of P IP pLx,Lyq,L1 converges to the same value over each grid piece of the form rk x 2´L x , pk x`1 q2´L x sˆrk y 2´L y , pk y`1 q2´L y s, where k x P 0, 2 Lx and k y P 0, 2 Ly for L 1 Ñ 8. Furthermore, for the total volume of P IP pLx,Lyq,L1 , we have lim L1Ñ8 vol´proj x,y,z pP IP pLx,Lyq,L1 q¯" 1 6 p2´2 Lx`2´2Ly q.
Proof. Since F L1 Ñ x 2 uniformly over r0, 1s as L 1 Ñ 8, we have lim L1Ñ8 tpp, z p q P r0, 1sˆR : pp, z p q P Q L1 u " tpp, z p q P r0, 1sˆR : pp, z p q P epi r0,1s pp 2 qu under Hausdorff distance. In HybS, we have pp 1 , z p1 q, pp 2 , z p2 q P Q L1 (via the transformation in Remark 1) as well as p 1 " x`y and p 2 " x´y. Thus, we have in the limit, as L 1 Ñ 8: Furthermore, since F L pxq ě x 2 for all x P r0, 1s, L P tL x , L y u, and px, z x q P R Lx,L1 , py, z y q P R Ly,L1 , we obtain z x ď F Lx pxq and z y ď F Ly pyq.
Therefore, the inequality from (21) implies the following in the limit: Now we apply these inequalities to grid pieces of the form rx,xsˆrȳ,ȳs. Let x :" k x 2´L x ,x :" pk x`1 q2´L x , ȳ :" k y 2´L y andȳ :" pk y`1 q2´L y , and define w x :"x´x " 2´L x as well as w y :"ȳ´ȳ " 2´L y . Then, as F Lx pxq " px`xqx`xx for x P rx,xs and F Lx pyq "´pȳ`ȳqy`ȳȳ for y P rȳ,ȳs, the above bounds on z are exactly the envelopes C L 2 px, yq for the lower bound and C U 3 px, yq for the upper bound, respectively. Thus, by Proposition 11, which is proved later, the volume of proj x,y,z pP IP pLx,Lyq,L1 q over the grid piece is Lx`2´2Ly q in the limit. Note that this does not depend on the choice of k x and k y (and thus the choice of grid piece).
Since we have 2 LxLy grid pieces overall, the total volume in the limit is then given by [ \ The following proposition establishes the volumes of the MIP relaxations and grid structure for the MIP relaxations Bin2 and Bin3. As this derivation is extensive, we prove it in Appendix C.
Proposition 7. Let P IP L,L1 be either the MIP relaxation Bin2 from (18) or Bin3 from (19). Then the volume of P IP L,L1 converges to the same value over each grid piece of the form rk2´p L´1q , pk`1q2´p L´1q sˆrk2´p L´1q , pk`1q2´p L´1q s, where k P 0, 2 L . Furthermore, for the total volume we have Now that we have calculated the average error, i.e. the volume of the MIP relaxations, for uniform breakpoints, we show that among all possible breakpoint choices, uniform placement of breakpoints minimizes the average error. For z " x 2 and the sawtooth functions, this has already been shown in [7]; for equations z " xy it still has to be shown. We prove average error minimization for uniform breakpoint placement in HybS and do not consider the formulations Bin2 and Bin3 here, as they are hard to analyze in this respect, which is also mentioned in [5] for approximations. In Proposition 6, we have shown that HybS has a grid structure where on each grid piece, the average error is where w x and w y are the widths of the grid piece in x-and y-direction respectively. In the following, we consider a piecewise relaxation defined via these grid pieces and show that the total average error is minimized by a uniform breakpoint placement, as is the result of HybS. Proposition 8. Let 0 " x 0 ă x 1 ă . . . ă x n " 1 and 0 " y 0 ă y 1 ă . . . ă y m " 1 be sets of breakpoints. For each grid piece rx i´1 , x i sˆry j´1 , y j s, consider a relaxation of gra r0,1s 2 pxyq with average error 1 6 pw xi w 3 yj`wyj w 3 xi q, where w xi :" x i´xi´1 and w yj :" y j´yj´1 are the widths of the grid piece with i P n and j P m . Then a uniform spacing of these breakpoints minimizes the average error overall piecewise relaxations of this form.
Proof. The problem of minimizing the average error of a piecewise relaxation of this form can be formulated as The objective function in (29) sums the average errors over the single grid pieces while the constraints ensure that all single grid lengths sum up to 1 and are greater than or equal to 0. The objective function can be rewritten to Thus, (29) decomposes into two independent problems where the respective optimal solutions x˚and y˚, can be composed to create px˚, y˚q, which is optimal for the original problem (29). The subproblems are and These are exactly the sawtooth-area optimization problems from [7, Proposition 5], such that a uniform placement of the breakpoints where each w xi " 1 n is optimal for (30), and w yj " 1 m is optimal for (31). Consequently, a uniform placement of grid points is optimal for (29) and the total volume is 1 6 p 1 m 2`1 n 2 q.
[ \ Remark 6. Let P IP L,L be a depth-L HybS MIP relaxation of gra r0,1s 2 pxyq from (21), with L " L 1 . Since P IP L,L satisfies the uniform spacing of breakpoints discussed in Proposition 8, we see that P IP L,L is an optimal piecewise relaxation in the sense of minimizing the average error, attaining the average error of E avg pP IP L,L , gra r0,1s 2 pxyqq " 1 3 2´2 L .5

.3 Formulation Strength
In the previous section, we discussed the maximum and average errors incurred from using certain discretizations. We will now consider the strength of the resulting MIP relaxations by analyzing their LP relaxation. First, we will check for sharpness and later compare them via the volume of the projected LP relaxation. Sharpness means that the projected LP relaxation equals the convex hull of the set to be formulated. If we now consider the volume of a projected LP relaxation, it can minimally be the volume of the convex hull, which precisely holds if the formulation is sharp. If a formulation is not sharp, the volume of the projected LP relaxation measures how much a formulation deviates from sharpness. The volume of LP relaxation as a measure of formulation strength was previously used in [5].

Sharpness
We start with the core formulations from Section 3. It is well known that the McCormick relaxation yields the convex hull of the feasible set of z " xy over box domains. Therefore, it is obviously sharp. In [7], it is shown that the sawtooth approximation for z " x 2 is sharp. We use this result to prove that sharpness also holds for the tightened sawtooth relaxation (16). See Figure 4 for examples of this relaxation under different parameter choices.
Then P IP L,L1 is sharp if and only if both P IPL ,L1 and P IPĹ ,L1 are sharp. This simplification holds since P IP L,L1 " P IPL ,L1 XP IPĹ ,L1 and since the upper bound P IPL ,L1 strictly overestimates x 2 , while the lower bound, P IPĹ ,L1 strictly underestimates x 2 , such that sharpness of the two can be considered separately. Now, the sharpness of P IPL ,L1 follows directly from the sharpness of the sawtooth approximation (10), which holds by [7,Theorem 1]. For the sharpness of P IPĹ ,L1 , the proof closely follows the proof of sharpness in [7, Theorem 1], except that, after choosing some fixed x P r0, 1s, we frame the contradiction as follows: 1. Choose g˚as in [7, Theorem 1], and choose the minimum possible value of z˚given g˚, such that z˚attains one of its lower bounds. 2. Observe that the chosen solution admits a feasible solution in P IP L,L1 , such that if it is minimal in the LP, then we are done. 3. Suppose for a contradiction that there exists a better z-minimal solution pẑ,ĝq than the proposed solution pz˚, g˚q, such that some incident lower bound must have been improved. 4. Observe that the improved incident lower bound must be of the form z ě f j px, g˚q´2´2 L´2 for some j ě 0, as the lower bounds 0 and 2x´1 do not change with the choice of g˚. Thus, f j px, g˚q´2´2 L´2 ě f j px,ĝq´2´2 L´2 5. Show that f j px,ĝq´f j px, g˚q ă 0, a contradiction on the choice of pŷ,ĝq.
Thus, the solution (z˚, g˚) was optimal to begin with, and therefore sharpness must hold.
The proof that f j px, g˚q´f j px, g˚q ă 0 follows in exactly the same manner as [7, Theorem 1] and is thus omitted here.
[ \ In [7], besides sharpness, it is further shown that the sawtooth approximation is also hereditarily sharp. The following theorem states that the same is true for the tightened sawtooth relaxation (16) and z " x 2 .
Theorem 2. The tightened sawtooth relaxation for z " x 2 is hereditarily sharp.
As the proof of Theorem 2 takes up a significant amount of space, we moved it to Appendix B.
Next, we show that neither of the MIP relaxations Bin2, Bin3 nor HybS for z " xy are sharp. That is, their projected LP relaxation does not equal Mpx, yq We now show`lim L,L1Ñ8 proj x,y,z pP LP L,L1 q˘z convpproj x,y,z pP IP 1,1 qq ‰ H, which implies proj x,y,z pP LP L,L1 qz convpproj x,y,z pP IP L,L1 qq ‰ H, such that P IP L,L1 is not sharp for any L, L 1 P N. The argument works in the following manner: proj x,y,z pP LP L,L1 qz convpproj x,y,z pP IP L,L1 qq Ěˆlim L,L1Ñ8 proj x,y,z pP LP L,L1 q˙z convpproj x,y,z pP IP 1,1 qq ‰ H ñ proj x,y,z pP LP L,L1 qz convpproj x,y,z pP IP L,L1 qq ‰ H ñ proj x,y,z pP LP L,L1 q ‰ convpproj x,y,z pP IP L,L1 qq.
To this end, we show that there exist points px, y, zq P lim L,L1Ñ8 proj x,y,z pP LP L,L1 q with px, y, zq R proj x,y,z pP IP 1,1 q. Observe that, for any L, the point px, xq is feasible within the LP relaxation of the tightened sawtooth relaxation (16) for x 2 , with α i " g i´1 , g i " 0. Thus, for all L, L 1 ě 0 and for allx,ŷ P r0, 1s 2 , we have that P LP L,L1 , and thus also its limit lim L,L1Ñ8 proj x,y,z pP LP L,L1 q, admits the values z x "x, z y "ŷ and z p1 " px`ŷq 2 . Therefore, for px, yq " p0, 1 4 q, we obtain z " 1 2 ppx`yq 2´x´y q "´3 16 , such that p0, 1 4 ,´3 16 q P P LP 8,8 . Next, in order to prove p0, 1 4 ,´3 16 q R convpproj x,y,z pP IP 1,1 qq, we show mintz : py, zq P proj y,z pP IP 1,1 | x"0 qu "´1 8 . If this holds, then we have mintz : py, zq P convpproj y,z pP IP 1,1 | x"0 qqu "´1 8 , such that p0, 1 4 ,´3 16 q R convpproj x,y,z pP IP 1,1 qq. We derive a representation of proj y,z pP IP 1,1 | x"0 q that becomes an LP after branching spatially at y " 1 2 to resolve the upper bound on z y . We then minimize z over both branches via solving an MIP.
Note that the two pieces of the upper bound on z y meet at y " 1 2 . Using this to separately minimize z over the above set, once over y P r0, 1 2 s and once over y P r 1 2 , 1s, e.g. using an MIQCQP solver, we obtain two globally minimizing solutions with z "´1 8 , namely at y " 1 4 and at y " 3 4 . Thus, we conclude that p0, 1 4 ,´3 16 q R convpproj x,y,z pP IP 1,1 qq, such that P IP L,L1 is not sharp for any 1 ď L ď L 1 .
[ \ Proposition 10. Let P IP L,L1 be either of the two MIP relaxations Bin2 (18) or Bin3 (19). Then, without the inequalities from the McCormick envelope Mpx, yq, P IP L,L1 is not sharp for any L, L 1 P N.
Proof. Since Bin2 (18) has the same lower-bounding constraints as HybS, the proof follows directly from Proposition 9. Moreover, for Bin3 (19), the proof follows in exactly the same way as the proof of Proposition 9, except for the upper-bounding version of the same point, px, y, zq " p0, 1 4 , 3 8 q, and acting on the upper-bounding constraints from (21) and maximizing z instead. As the proof is very similar, with the corresponding upper bound z " 1 8 on proj y,z pP IP 1,1 | x"0 q, we omit it here.

LP Relaxation Volume
Having proved that none of the separable MIP relaxations is sharp, which implies that they are also not hereditarily sharp, we now turn to consider the volume of projected LP relaxations.
For L " L 1 , the volume for the tightened sawtooth formulation (7) is 3 16 2´2 L , which has been shown in [7]. For general L 1 , by integrating over the overapproximation and underapproximation errors separately with the same analysis as in [7], we can derive a general volume of 1 6 2´2 L`1 48 2´2 L1 . We omit the precise calculation here.
In our analysis of the separable MIP relaxations, we only consider the limits for L, L 1 Ñ 8. This allows us to evaluate the volumes independently of the underlying discretizations. For the additional volumes resulting from discretization errors, we refer to [7,Appendix], where the volume over the error function of the sawtooth approximation is given. We start with HybS.
Proposition 11. Let P LP L,L1 be the LP relaxation of the MIP relaxation HybS stated in (21) over the general domain rx ,xsˆrȳ ,ȳs. Without the McCormick envelope constraints, the volume of the limit of the projected LP relaxation lim L,L1Ñ8 proj x,y,z pP LP L,L1 q is 1 6 pw x w 3 y`wy w 3 x q, where w x "x´x and w y "ȳ´ȳ .
Proof. The z-values in the projected LP relaxation of (21) are bounded by the convex function C L 2 and the concave function C U 3 , which are stated above in (22) and (25), respectively. The volume of the projected LP relaxation (21) is then calculated via integration: [ \ Proposition 12. Let P LP L,L1 be the LP relaxation of either the MIP relaxation Bin2 or Bin3 stated in (18) and (19) over the domain rx ,xsˆrȳ ,ȳs. Without the McCormick envelope constraints, the volume of the limit of the projected LP relaxation is lim L,L1Ñ8 volpproj x,y,z pP LP L,L1 qq " where w x "x´x and w y "ȳ´ȳ .
Proof. The z-values in the projected LP relaxation of (18) and (19) are bounded by the convex function C L 2 and the concave function C U 3 , which are stated above in (22) and (25), respectively. The volume calculation is then done via integration: [ \ We use Proposition 11 and Proposition 12 to prove that HybS yields strictly tighter LP relaxations than Bin2 and Bin3.
Proposition 13. Without the McCormick envelope constraints, the LP relaxation of the MIP relaxation HybS in the limit as L, L 1 Ñ 8 is strictly tighter than that of Bin2 or Bin3. Moreover, the volume of the projected LP relaxation of formulation HybS in the limit as L, L 1 Ñ 8 is smaller by 1 4 w 2 x w 2 y .
Proof. In [5, Appendix, Proposition 2] it has been shown that C L 2 is a tighter convex underestimator than C L 3 and that C U 3 is a tighter concave overestimator than C U 2 for z " xy. Thus, since the HybS approach converges to C L 2 as an underestimator and C L 3 as an overestimator, it is strictly tighter than either of Bin2 or Bin3. The volume calculation can again be done via integration: [ \

Computational Results
In the previous sections, we have shown the theoretical advantages of HybS compared to Bin2 and Bin3, most importantly that it requires fewer binary variables to model MIP relaxations of variable products with the same accuracy. As the density of quadratic matrices in MIQCQPs increases, this advantage becomes larger, leading to a maximum of Opnq binary variables for HybS and Opn 2 q binary variables for Bin2 and Bin3; see Table 1. In general, the number of binary variables of an MIP relaxation is crucial for its solution time. Hence, the theoretical results suggest that the HybS formulation yields MIP relaxations that are faster to solve than the Bin2 and Bin3 relaxations. Consequently, shorter run times or better primal and dual bounds after certain run time limits can be expected. To analyze these MIP relaxations for z " xy, it is preferable to use a model for the x 2 terms that requires as few binaries as possible. Otherwise, the impact of fewer binaries for HybS might not be that noticeable, since the difficulty of the various MIP models might then be more determined by the MIP formulations of the x 2 terms. The sawtooth relaxation does exactly that with its logarithmic number of binary variables. Furthermore, we proved that it is also a hereditary sharp formulation. In the computational study, we first compare both run times and dual bounds of the MIP relaxations. MIP relaxations are primarily used to deliver dual bounds for the MIQCQPs. The best dual bound of an MIP relaxation is then a valid dual bound for the MIQCQP. However, with increasing accuracy of the relaxations, the solution times also increase. Therefore, both the run time (for coarser relaxations) and the best dual bounds (for finer relaxations) are important measures if we want to compare different MIP relaxations with the same accuracy.
Complementary to this, in a second part of the study we investigate to what extent the MIP solutions can serve as a starting point to find feasible solutions to the MIQCQP. A common heuristic approach is to fix any integer variables from the original problem according to the MIP solution and solve the resulting QCQP to local optimality. The starting points of the continuous variables of the original problem again correspond to the values of the MIP solution. As before, our theoretical results imply that the HybS relaxations are generally more likely to find MIP solutions after certain run time limits due to the smaller number of binary variables. Presumably, this translates to a higher probability of finding feasible solutions to the MIQCQP using the heuristic approach. In detail, we solve MIP relaxations using either HybS, Bin2, or Bin3 in combination with the sawtooth relaxation using Gurobi [28] and a callback function that uses the nonlinear programming (NLP) solver IPOPT [41] to find local optimal solutions for the QCQP.
All instances were solved in Python 3.8.3, via Gurobi 9.5.1 and IPOPT 3.12.13 on the 'Woody' cluster, using the "Kaby Lake" nodes with two Xeon E3-1240 v6 chips (4 cores, HT disabled), running at 3.7 GHZ with 32 GB of RAM. For more information, see the Woody Cluster Website of Friedrich-Alexander-Universität Erlangen-Nürnberg. The global relative optimality tolerance in Gurobi was set to the default value of 0.01%, for all MIPs and MIQCQPs.

Study Design
In the following, we explain the design of our study and go into detail regarding the instance set as well as the various parameter configurations.
Instances. We consider a three-part benchmark set of 60 instances: 20 nonconvex boxQP instances from [22,7,17] and earlier works, 20 AC optimal power flow (ACOPF) instances from the NESTA benchmark set (v0.7.0) (see [18]), previously used in [2], and 20 MIQCQP instances from the QPLIB [24]. In Appendix D links that contain download options and detailed descriptions of the instances can be found. For an overview of the IDs of all instances, see Table 8.
The benchmark set is equally divided into 30 sparse and 30 dense instances. We call an instance dense if either the objective function and/or at least one quadratic function in the constraint set is of the form x J Qx, where x P R n are all variables of the problem and Q P R n,n is a matrix with at least 25% of its entries being nonzero.
Parameters. For each instance, we solve the resulting MIP relaxation of each method from Section 4 using various approximation depths of L P t1, 2, 4, 6u and a time limit of 8 hours. In Section 6.1 , we have listed the maximum errors associated with each L, which are derived from the values in Table 1. All sawtooth and separable MIP relaxations are solved once with L 1 " L and once with a tightened underestimator version for univariate quadratic terms where L 1 " maxt2, 1.5Lu. This tightening is done as described in Definition 7 by adding linear cuts and without introducing further binary variables. In the separable methods HybS, Bin2, and Bin3 this leads to a tightening of the relaxation of z " xy terms as well as of z " x 2 terms in the original MIQCQP. We refer to the tightened MIP relaxations as T-HybS, T-Bin2, and T-Bin3. Table 2 gives an overview of the different parameters in our study. In total, we have 24 parameter configurations for 60 original problems, which means that we solve 1440 MIP instances. Table 2. In the study, we consider the parameters cuts, depth, and formulation on 60 MIQCQP instances and thus solve p2¨4q¨3¨60 " 1440 MIP relaxations. 5e-04 L = 6 2e-05 3e-05 Table 3. Maximum error for different values of L Callback function. Solving all MIP relaxations, we use a callback function with the local NLP solver IPOPT that works as follows: given any MIP-feasible solution, the callback function fixes any integer variables from the original problem (before applying any of the discretization techniques from this work) according to this solution and then solves the resulting QCCP, the original MIQCQP with fixed binaries, locally via IPOPT in an attempt to find a feasible solution for the original MIQCQP problem.

Number of Binaries
In advance of the results of the study, we provide another table that shows, how many binary variables can be saved relatively with HybS compared to Bin2 and Bin3. In Table 4 we specify how many variables occur on average with each method in the MIP relaxation models. Apart from a few original variables of the MIQCQPs, the main part of the binary variables comes from the MIP relaxations of quadratic terms. Since Bin2 and Bin3 require exactly the same number of binary variables for each univariate or bivariate MIP relaxation, only Bin2 is listed in Table 4. The table shows that HybS requires close to two-thirds of the binary variables on the sparse instances. The difference is much greater on the dense instances, where HybS requires only nearly 6% of the binary variables of Bin2 and Bin3. Both numbers are in line with our theoretical findings. Assuming, we had an MIQCQP instance with only one variable product x i x j and we would set L " 1, then there would be three binary variables each for Bin2 and Bin3, while we would need only two for HybS. The fact that this effect is significantly stronger for dense instances stems from the quadratic increase of binary variables in dense matrices for Bin2 and Bin3 compared to the linear increase for HybS.

Results
In the following, we present the results of our study at a detailed level. In particular, we aim to answer the following questions regarding run times, dual bounds, and the ability to find feasible solutions for the MIQCQPs: -Is our enhanced method HybS computationally superior to its predecessors Bin2 or Bin3?
-Is it beneficial to use tightened versions of the MIP relaxations HybS, Bin2, and Bin3, i.e., to choose L 1 ą L?
We point out that in Part II of this work, we also present a more detailed comparison with different MIP relaxation methods and the state-of-art MIQCQP solver Gurobi.

Run Times
We start with a discussion on the run times for the different methods. Here, we use the shifted geometric mean, which is a common measure for comparing two different MIP-based solution approaches. The shifted geometric mean of n numbers t 1 , . . . , t n with shift s is defined as`ś It has the advantage that it is neither affected by very large outliers (in contrast to the arithmetic mean) nor by very small outliers (in contrast to the geometric mean). We use a typical shift s " 10. Moreover, we only include those instances in the computation of the shifted geometric mean, where at least one solution method delivered an optimal solution within the run time limit of 8 hours.
In Table 5, the shifted geometric mean values of the run times for solving the separable MIP relaxations on all instances are given. Here, HybS clearly outperforms all other methods, including its tightened variant T-HybS. HybS is at least a factor of two faster than (T-)Bin2 and (T-)Bin3. Tightening HybS, Bin2, and Bin3 results in comparable but slightly higher run times for Bin2 and Bin3 and partially in notably higher run times for HybS, e.g. by a factor of more than two in case of L " 4.
For sparse instances, the same picture emerges, although the benefit of HybS is not as great as before, see the second block in Table 5. Conversely, the advantage of HybS increases dramatically for dense instances. Here, HybS is at least a factor of five faster than (T-)Bin2 and (T-)Bin3, see the third block Table 5. Tightening the three methods again leads to mostly slightly higher run times for Bin2 and Bin3 and to considerably higher run times for HybS.

Dual Bounds
As mentioned before, MIP relaxations are primarily used to deliver (tight) dual bounds for MIQCQPs. Thus, we now compare the tightness of the dual bounds provided by the various methods. To this end, we compute relative optimality gaps g p,s :" |d p,s´bp |{|b p | for all methods s (with a certain L value) and instances p of the benchmark set, where d p,s is the corresponding best dual bound found by method s and b p is the best-known primal bound for instance p. Table 6 shows the arithmetic and geometric means of the relative optimality gaps for all 60 instances. Please note that we rounded each gap below 0.0001 to avoid multiplications by 0 for the geometric mean. First, the arithmetic mean decreases with higher L values but then starts to increase again. This pattern indicates the presence of more outliers with higher L values, leading to inconsistencies in the arithmetic mean. On the other hand, the geometric mean shows a tendency that with higher L values, we can expect tighter dual bounds for the considered instances. This trend is more consistent and reflects a more balanced view of overall performance. HybS often achieves the lowest geometric mean values, which indicates its superior performance. In summary, the geometric means in Table 6 emphasize the effectiveness of higher L values for tighter dual bounds, with HybS standing out as a particularly strong method based on the considered data. Comparing the tightened versions (T-Bin2, T-Bin3, and T-HybS) with their non-tightened counterparts, the results are mixed. The tightened versions yield similar optimality gaps, with some showing slightly better and others slightly worse performance depending on different L values. However, there is no clear trend, suggesting that there is generally no advantage to tightening the methods. Dividing the benchmark set into sparse and dense instances, gives a similar picture for dense instances as on the full benchmark set, see the third block in Table 6. However, a different trend can be seen for sparse instances in Table 6. Here, for higher L values, both the arithmetic and geometric means consistently decrease, while HybS again outperforms Bin2 and Bin3. In contrast to the full benchmark set, the tightening is now slightly beneficial for all three methods.
Additionally, we provide performance profile plots as proposed by Dolan and More [20] to illustrate the scaling of the dual bounds, see Figure 7 - Figure 9. The intention here is to obtain a more sophisticated picture of how the various methods perform if we allow the dual bounds to lie within a given factor of the best overall dual bound. The performance profiles work as follows: Let d p,s again be the best dual bound obtained by MIP relaxation s for instance p after a certain time limit. With the performance ratio r p,s :" d p,s { min s d p,s , the performance profile function value P pτ q is the percentage of problems solved by approach s such that the ratios r p,s are within a factor τ P R of the best possible ratios. All performance profiles are generated with the help of Perprof-py by Siqueira et al. [38]. In addition to the performance profiles across all instances, we also show performance profiles for the dense and sparse subsets of the instance set. Please note that in minimization problems, the higher the value of a dual bound, the better it is. Since lower values are considered better in performance profiles, we simply take the inverse of the dual bound as the value to be compared.
In Figure 7 the performance profiles of the separable MIP relaxations with regard to dual bounds using all instances can be seen. Starting with L " 2, the newly introduced methods HybS and T-HybS deliver significantly better dual bounds. Except for L " 2, where T-HybS dominates HybS, we do not obtain better dual bounds by tightening the separable MIP relaxations. With L " 4 and L " 6, HybS yields dual bounds that are within a factor 1.05 of the overall best bounds among separable MIP relaxations for nearly all instances. The other methods require a corresponding factor of at least 1.2. In Figure 8 and Figure 9, we divide the benchmark set into sparse and dense instances again to obtain a more in-depth look at the benefits of HybS. For sparse instances, using HybS and T-HybS has no clear advantage, as Figure 8 shows. However, with L " 1 and L " 2, the tightened variants deliver notably better dual bounds. For L " 1, the dual bounds computed with T-Bin2 and T-Bin3 are in almost all cases the overall best-found bounds. Their counterparts Bin2 and Bin3 are only able to provide the overall best bounds for about 50% of the instances. For L " 2, we  see a similar picture. T-Bin2 and T-Bin3 deliver the best bounds for roughly 80% of the instances, while Bin2 and Bin3 achieve this only in 40% of the cases.
For dense instances, the picture is much clearer. Here, HybS and T-HybS are considerably better than Bin2, Bin3, and their tightened variants, particularly from L " 2 to L " 6; see Figure 9. With L " 2, HybS and T-HybS are able to compute dual bounds that are within a factor 1.05 of the overall best bounds for nearly all instances. All other methods require a corresponding factor of more than 1.2. For L " 4 and L " 6, we obtain by HybS the best overall bounds for roughly 90% of all instances, while all other approaches provide the best bounds for less than 50% of the instances. With the exception of L " 2, where tightening HybS results in slightly better dual bounds, the tightened versions of the separable MIP relaxations attain significantly weaker dual bounds than their corresponding counterparts.

Feasible Solutions
Finally, we highlight some important results on primal bounds. Table 7 gives the number of feasible solutions that the separable MIP methods were able to find in combination with IPOPT as the local QCQP solver. The quality of the corresponding solutions is computed in terms of relative  optimality gaps, where we used the best-known dual bounds from the literature or computed them elsewhere using Gurobi and our methods. Regarding the ability to find feasible solutions, all separable methods perform quite similarly and find more feasible solutions with higher L values. With L " 6, HybS in combination with IPOPT is able to compute feasible solutions to the original MIQCQP for 51 out of 60 benchmark instances, 43 of which have a relative optimality gap below 1% and 40 of which are even globally optimal, i.e., which have a gap below 0.01%. All in all, HybS offers a slight advantage in terms of finding feasible solutions when coupled with IPOPT. Table 7. Number of feasible solutions found with different relative optimality gaps. The first number corresponds to a gap of less than 0.01%, the second to a gap of less than 1% and the third number indicates the number of feasible solutions. Bin2 T-Bin2 Bin3

Discussion
All in all, the clear winner among the separable methods is HybS. For large L values, HybS provides the best bounds, the shortest run times, and finds in combination with IPOPT the most and best feasible solutions for the original MIQCQP instances. This advantage is especially noticeable on dense instances and consistent with the theoretical findings from Section 5. While in HybS the number of binary variables increases linearly in the number of variable products, it increases quadratically in Bin2 and Bin3. On the one hand, this results in short run times for the HybS models or better bounds after certain run time limits. On the other hand, with significantly fewer binaries we are more likely to find feasible solutions for the MIP relaxations. As the accuracy increases, the MIP relaxations lead to solutions with smaller and smaller MIQCQP feasibility violations. Therefore, at higher L values, we are more likely to find an MIQCQP feasible solution using the heuristic IPOPT approach, which coincides with Table 7. Furthermore, based on the computational results, a tightening of the separable methods is not advisable, except for sparse instances with small L values. This is most likely due to the large number of additional constraints that are needed to underestimate p 2 1 and p 2 2 ; see Table 1. In Part II of this work, we revisit the idea of tightening MIP relaxations for the normalized multiparametric disaggregation technique (NMDT) introduced in [13]. In addition, we perform a comparison of HybS with NMDT-based methods and Gurobi as an MIQCQP solver. To this end, we reuse the results of HybS from Part I.

Conclusion
We introduced an enhanced MIP relaxation for non-convex quadratic products of the form z " xy, called hybrid separable (HybS). We showed that HybS has clear theoretical advantages over its predecessors Bin2 and Bin3, all based on separable reformulation of xy to univariate quadratic terms. Most importantly, HybS requires a significantly lower number of binary variables and has a tighter linear programming relaxation. In addition to this enhanced MIP relaxation for z " xy, we introduced a hereditary sharp MIP relaxation called sawtooth relaxation for z " x 2 terms, which requires only a logarithmic number of binary variables with respect to the relaxation error. We combined the sawtooth relaxation and HybS to obtain MIP relaxations for MIQCQPs.
In a broad computational study, we compared HybS against its predecessors from the literature, which we again combined with the sawtooth relaxation for univariate quadratic terms. We showed that HybS determines far better dual bounds, while also exhibiting shorter run times. Finally, HybS is also able to find high-quality solutions to the original quadratic problems when used in conjunction with a primal solution callback function and a local non-linear programming solver.

A MIP Relaxations on General Intervals
In this section, we generalize the MIP relaxations for gra r0,1s 2 pxyq and gra 2 r0,1s px 2 q discussed in this article to general box domains px, yq P rx,xsˆP rȳ,ȳs and x P rx,xs, where x ăx, ȳ ăȳ and x,x, ȳ,ȳ P R. by giving explicit formulations for general bounds on x and y.

A.1 MIP Relaxations for Bivariate Quadratic Equations
First, we consider MIP relaxations for z " xy and give an explicit model of HybS for general box domains. We omit the formulation of Bin2 and Bin3 here, as these work analogously to HybS.
In the HybS MIP relaxation, in addition to the variables x and y, we must also transform the variables p 1 " x`y and p 2 " x´y and their respective bounds. In the following, the sawtooth modeling px, z x q P R L,L1 , py, z y q P R L,L1 , pp 1 , z p1 q P Q L1 , pp 2 , z p2 q P Q L1 is performed according to Remark 1. HybS (21) for general box domains then reads as follows: wy , g y q`ȳp2y´ȳq z ě 1 2 pz p1´zx´zy q z ď 1 2 pz x`zy´zp2 q px, y, zq P Mpx, yq x P rx,xs y P rȳ,ȳs p 1 P rx`ȳ,x`ȳs p 2 P rx´ȳ,x´ȳs. (32)

A.2 MIP Relaxations for Univariate Quadratic Equations
In order to MIP relaxations for z " x 2 where x P rx,xs with x ăx and x,x P R, we introduce the auxiliary variablex P r0, 1s and apply each original MIP relaxation to modelẑ "x 2 . In addition, we mapx andẑ back to r0, 1s, yieldinĝ , with x P rx,xs, cf. Remark 1. With this transformation, we are able to formulate the tightened sawtooth relaxation for x P rx,xs. The tightened sawtooth relaxation (16) for general box domains then reads tpx, zq P rx,xsˆR : Dpx,ẑ, g, αq P r0, 1sˆRˆr0, 1s L1`1ˆt 0, 1u L : (34)u, (33) where the constraints arex We note that generalizing the sawtooth epigraph relaxation (14) works analogously.

B Proof of Theorem 2: Hereditary Sharpness of the Tightened Sawtooth Relaxation
This section is devoted to proving Theorem 2 which states that the tightened sawtooth relaxation (16) for z " x 2 is hereditarily sharp. This is a similar, albeit, more difficult result than the related one in [7] regarding the original sawtooth approximation. It is not clear how to obtain the former as a corollary of the latter. Furthermore, we use the result of [7] to shorten the work needed here. Before we begin the proof, we first introduce some required notation and restate several helpful results from [7]. For integers L 1 ě L ě 0, let P IP L,L1 be the tightened sawtooth relaxation from (16) in the space of px, z, g, αq and let P LP L,L1 be its LP relaxation, where in the latter all α-variables are relaxed to the interval r0, 1s. For convenience, and to avoid the variable redundancy g 0 " x throughout this section, we will omit the use of g 0 and use the abbreviated notation g " g 1,L1 . To further simplify the notation, we omit the subscript L, L 1 when the context is clear and simply write P IP and P LP instead of P IP L,L1 and P LP L,L1 . Now let I Ď L be the index set of the binary variables α which are fixed to given values ᾱ P t0, 1u I . This can be thought of as considering the branch in a branch-and-bound tree where α " ᾱ holds. Then we wish to show that at this node in the tree, sharpness also holds. More precisely, the goal is to show that P IP is sharp under the restriction α I " ᾱ , where α I " rα i1 , . . . , α i |I| s J and I " ti 1 , . . . , i |I| u. Hereditary sharpness of P IP then means convpproj x,z pP IP | α I "ᾱ qq " proj x,z pP LP | α I "ᾱ q.
Next, we define the lower-bounding functionsf j : r0, 1sˆr0, 1s L1`1 Ñ r0, 1s, Note thatf´1 andf´2 do not actually depend on g. Further, note that there is a slight abuse of the notation above, since technically f j has the domain r0, 1sˆr0, 1s j`1 ; however, we assume the reader will interpret the functional expressions as f j px, g j q instead. We also define the lower-bounding functionš F j : r0, 1s Ñ r0, 1s,F j pxq " F j pxq´2´2 j´2 j " 0, . . . , L 1 , F´1pxq " 2x´1, F´2pxq " 0 (38) in terms of only x, based on the functions F L from (6), as the j-th p.w.l. underestimator to z " x 2 in the construction of the sawtooth relaxation, as defined in Section 3.2. Further, definef : r0, 1sˆr0, 1s L Ñ r0, 1s andF : r0, 1s Ñ r0, 1s witȟ Observation 2 The functionF is convex as it is the maximum of a finite set of convex functions.
This applies analogously toP LP,ᾱ . We now state some important results from [7] that establish bounds on each variable g i withinP LP,ᾱ px,g,αq and a closed-form optimal solution for g when minimizing z withinP IP,ᾱ or anyP IP,ᾱ j . Lemma 1 (Bounds in Projection, Lemma 3 from [7]). For all i P 0, L , we have proj gi pP LP,ᾱ px,g,αq q " convpproj gi pP IP,ᾱ px,g,αq qq ": ra i , b i s ‰ H. Furthermore, it holds that ra L , b L s " r0, 1s, and ra i´1 , b i´1 s can be computed from ra i , b i s as Note that in the last case, a i´1 ď 1 2 and b i´1 ě 1 2 hold. Note that Lemma 1 with i " 0 and g 0 " x yieldsX LP " convpX IP q, viǎ which has also been used in [7]. Next, we adapt Lemma 5 from [7], which establishes that, when minimizing or maximizing z within P LP L,L | α I "ᾱ given a fixed value forx, each g i can directly be computed from g i´1 and the bounds established in Lemma 1. In particular, for the sawtooth relaxation (i.e. I " H), when minimizing z over the MIPfeasible points with a fixed x, we find that g i " mint2g i´1 , 1´2g i´1 u. That is, the g-variables take one of the two upper bounds that restrict them. However, in this section, we have fixed several of the α-variables and have thus changed the feasible domain for each g-variable. Now, it could be that b i becomes an additional upper bound.
Lemma 2 (Adapted from Lemma 5 from [7]). Let a i and b i be defined as in Lemma 1 for all i P L 1 and letx P ra 0 , b 0 s. Further, define g˚as where, for i P I, it holds G i pg i´1 q " 2g i´1 if α i " 0, and G i pg i"1 q " 2p1´g i´1 q otherwise. Then we have g˚P argmintz : pz, gq P proj z,g pP LP,ᾱ | x"x qu, g˚P argmintz : pz, gq P proj z,g pP LP,ᾱ j | x"x qu @j P ´2, L 1 .
That is, each g i with unfixed α i can take on one of its upper bounds w.r.t. g i´1 when minimizing z withinP LP,ᾱ | x"x andP LP,ᾱ j | x"x . Furthermore, this choice is unique for all i ď j, i.e.
Finally, there exists some j P ´2, L 1 for whicȟ f j px, g˚q " mintz : pz, gq P proj z,g pP LP,ᾱ q| x"x qu. (43) Proof. The proofs of the optimality results (42a) and (42b) on g˚for j ě 1 closely follow the structure of the proof of Theorem 1, with the same underlying reasoning as in the proof of [7,Lemma 5]. In fact, the uniqueness of the optimizer also follows from the proof. Thus, the details are omitted here. To establish the optimality results for j ď 0, we observe that in this casef j is purely a function of x, such that the choice of g has no effect onf j , and g˚is thus still optimal. Finally, to fulfil (43), let j max P ´2, L 1 be chosen such that max jP ´2,L1 f j px, g˚q "f jmax px, g˚q.
Then we have mintz : pz, gq P proj z,g pP LP,ᾱ q| x"x qu " max jP ´2,L1 f j px, g˚q "f jmax px, g˚q " mintz : pz, gq P proj z,g pP LP,ᾱ jmax q| x"x qu ď mintz : pz, gq P proj z,g pP LP,ᾱ q| x"x qu, as required.
[ \ The next auxiliary result we need is a lemma concerning reflections over x " 1 2 inP IP,ᾱ px,g,αq andP LP,ᾱ px,g,αq for the case where α 1 is not fixed.
That is, the maximum errors from the lower bounds coincide. Similarly, x 2´f´1 px, g˚q " p1´xq 2´f´2 p1´x, g˚q, where g˚is defined on Lemma 2. Lastly, Proof. Recall thatP IP,ᾱ px,g,αq is formed from the constraints in S L and T L1 , along with fixing binary variables α I " ᾱ . It is easy to check that px, g, αq PP IP,ᾱ px,g,αq if an only if p1´x, g,ᾱq PP IP,ᾱ px,g,αq , whereᾱ 1 :" 1´α 1 andᾱ i :" α i for i P Izt1u. Thus, (44) holds due to this correspondence.
For j P 0, L 1 , we have Thus (48) holds. Similarly, (47) holds as Lastly, (46) holds by considering the substitutionx Ð 1´x from (47). The same secondary result holds iff j px, gq is replaced withf px, gq. This follows since each constituting function (for the pair j "´1, j "´2) is symmetric about x " 1 2 w.r.t. the maximum error; the pointwise maximum over the functions retains the same symmetry. Similarly, the same result holds if I " H, such thatX " r0, 1s.
[ \ The following lemma formalizes the convex hull of convex functions whose domain is a finite union of closed and bounded intervals. By gaps, we refer to the open intervals in the convex hull of the domain but do not intersect the domain. Lemma 4. Let X Ď R be a finite union of compact intervals, and let F : convpXq Ñ R be a convex function. For anyx P convpXqzX, definē x´:" maxtx P X : x ăxu andx`:" mintx P X : x ąxu.
(49) Then we have convpepi X pF qq " epi convpXq pF X q.
This lemma is proved in Appendix C. We are now ready to prove Theorem 2.
We denote the boundary of the set X by BX.
In particular, note that at the boundary points BX IP " t0, 2 8 , 6 8 , 1u, the tight lowerbounding inequalities are z ě 0, z ě 2x´1 and z ě F 1´2´4 . Thus, on the gap p 2 8 , 6 8 q the functionsF 2 ,F 3 are not needed to describe the convex hull of the MIP.

Proof (of Theorem 2).
As discussed before, we only need to show thatP IP,ᾱ L,L1 is sharp to conclude that P IP L,L1 is hereditarily sharp. In particular, we need to show that convpproj x,z pP IP,ᾱ L,L1 qq " proj x,z pP LP,ᾱ L,L1 q.
Reduction to L 1 " L: Recall that L 1 ě L holds by definition.
Claim. We claim that it suffices to reduce L 1 to L to conclude hereditary sharpness of P IP L,L1 .
Claim proof: Assume that L 1 ą L holds. To constructP IP,ᾱ L,L1 fromP IP,ᾱ L,L1´1 , we simply maintain the same fixing α I " ᾱ , then add a new variable g L1 ě 0, together with the new constraints (from (17d)) We then note the following: 1. It holdsP IP,ᾱ L,L1 ĎP IP,ᾱ L,L1´1 , since L 1 ą L 1´1 , and thus there are more inequalities used to defineP IP,ᾱ L,L1 . 2. We haveP IP,ᾱ L,L1 | xPBX IP "P IP,ᾱ L,L1´1 | xPBX IP . To see this, first notice that BX IP Ď t i 2 L : i P 2 L u, since I Ď L . Thus, for L 1 ą L, the inequality z ě x´ř is not tight at any of these points in BX IP ; see Proposition 1, Item 3. 3. It follows from the previous equation that for anyx P BX IP , we have proj x,z pP IP,ᾱ L,L1´1 | x"x q " proj x,z pP IP,ᾱ L,L1 | x"x q " tpx, zq : z ěF pxq, x "xu.
4. When we restrict to the domain convpX IP qzX IP and consider the convex hulls, we have equality as we reduce L 1 , i.e. convpproj x,z pP IP,ᾱ L,L1´1 q| xPconvpX IP qzX IP q " convpproj x,z pP IP,ᾱ L,L1 q| xPconvpX IP qzX IP q.
This is due to Item 2, the convexity ofF and Lemma 4.
Thus, the convex hull remains unchanged across the gaps inX IP , and since the LP relaxation does not weaken, sharpness in lower bound is maintained; see Figure 10. This implies thatP IP,ᾱ L,L1 is sharp ifP IP,ᾱ L,L1´1 is sharp. The claim then holds by induction.W e now proceed to prove sharpness ofP IP,ᾱ L,L by induction on L. Base case: If L " 0, then there are no binary variables and, hence, nothing to branch on; therefore, the result holds trivially. Induction on L: For the inductive step, we assume thatP IP,ᾱ L´1,L´1 is hereditarily sharp for all possible fixings of α-variables, and show thatP IP,ᾱ L,L is hereditarily sharp.
We begin by observing that proj x,z´P IP,ᾱ L,L¯" epiXIPpF q.
By Lemma 4, it follows that convpepiXIPpF qq " epi convpX IP q`FX IP˘, whereFXIP is defined as in Lemma 4. Thus, proving Theorem 2 is equivalent to proving that proj x,z pP LP,ᾱ L,L q " epi convpX IP q pFXIPq. In particular, it suffices to show that for anyx P convpX IP q, we havě which we do in the following. Case I:x PX IP . By Theorem 1, P IP L,L is sharp (i.e. when I " H). Thus, the LP lower bounds on z coincide with the MIP lower bounds for MIP-feasible points x PX IP , such that we have proj x,z pP LP,ᾱ L,L q| xPX IP " epiXIP pF q| xPX IP . This implies (51). Case II:x P convpX IP qzX IP . Letx´,x`PX IP as defined in Lemma 4. Since x RX IP , it follows thatx´,x`P BX IP . Case II.A: 1 R I. Assume 1 R I. Case II.A.1: rx´,x`s Ď BX IP X r0, 1{2s. We make use of the induction hypothesis here. To this end, we will work with L´1 layers. We will decorate variables and parameters from the smaller set using "˜".
Thus, we have that Φ z pzq ěf j pΦ x pxq, Φ g pgqq for all j ‰ 1, where the absence off´1px, gq is due to the fact thatf´1px, gq ď 0 for x P r0, 1 2 s, such that that the corresponding bound is inactive on Φ x pX LP q.
Note that the above calculations also imply that, for allj P ´2, L´1 and for all px,gq P proj x,g pP LP px,g,αq q, we have for some j P ´2, L that Φ z pfjpx,gqq " f j pΦ x pxq, Φ g pgqq. Further, since eachj maps to a unique j (with only the inactive j "´1 skipped), this implies that Φ z pf px,gqq "f pΦ x pxq, Φ g pgqq. Thus, we can conclude that (17d) and (17f) hold.
Next, we argue that px, g, αq P proj x,g,α 2,L pP LP,ᾱ px,g,αq q. This implies in particular that (17b) as well as (17c) hold and that we have α I " ᾱ I .
Since g 1 "x " 2x, we observe thatP LP,ᾱ px,g,αq can be written as the set of points px, g, αq P r0, 1sˆr0, 1s Lˆr 0, 1s L such that In this form, it is straightforward to confirm px, g, αq PP LP,ᾱ px,g,αq from the corresponding form forP LP px,g,αq : since the indices for both the map on g and on the shift fromĨ to I are shifted by 1 in the same direction, with the same choice of ᾱ , all equality constraints on g i , i PĨ, are preserved through the mapping. Further, the relationship between each g i and g i´1 is likewise preserved, as the corresponding α i is the same, and finally the choice of g 1 is feasible given x. Thus, all constraints are satisfied, such that px, g, αq PP LP,ᾱ px,g,αq , yielding for the choice of z above that px, z, g, αq PP LP,ᾱ L,L | xPconvpX IP Xr0, 1 {2sq . Further, from the form forP LP,ᾱ px,g,αq above, we observe that Φ x pX IP q "X IP X r0, 1 2 s and Φ x pX LP q " convpX IP X r0, 1 2 sq. To show the first part, we have already shown that Φ x pX IP q ĎX IP X r0, 1 2 s. To prove the other direction, we simply reverse the map for any px, g, αq PP IP,ᾱ px,g,αq | xPr0, 1 {2s , ignoring α 1 : lettingx " g 1 " x 2 ,g " g 2,L andα " α 2,L , it is easy to confirm px,g,αq PP px,g,αq .
Consequently, Φ z pFXIPpxqq "FXIPpΦ x pxqq holds onX. Now, by Lemma 4, across any gapx´,x`PX for which px´,x`q XX " H andx P rx´,x`s, we have thatFXIP pxq is on the line between the points px´,f px´qq and px`,f px`qq. Thus, sincex :" Φ x pxq, and since Φ is linear in x and z,f pxq lies on the line between the points pΦ x px´q,f px´qqq and Φ x ppx`q,f px`qq. Now, observe that, since Φ x pXq " X X r0, 1 2 s, we have that px´, x`q :" pΦ x px´q, Φ x px`qq is a gap in X, with x´, x`P X and px´, x`q X X " H. Furthermore, as x`, x´P X, we have thatFXIP pxq " Φ x pFXIP px´qq, and similarly for x`. Then, by Lemma 4, we have for x P px`, x´q thatFXIPpΦ x pxqq " FXIPpxq " Φ x pFXIPpxqq, as required. Case II.A.2: rx´,x`s Ď convpX IP X r1{2, 1sq. Applying Lemma 3 toP IP,ᾱ , we immediately recover sharpness on 1´Φ x pX LP q " convpX IP X r 1 2 , 1sq. To see this, let x P Φ x pX LP q. Then, via Lemma 3, we obtain exactly the same feasible regions for g, α with x " 1´x as with x "x, i.e. proj g,α 2,L pP IP,ᾱ px,g,αq | x"x q " proj g,α 2,L pP IP,ᾱ px,g,αq | x"1´x q, and moreover, similar to Lemma 3, it is not hard to show that we havex 2´F pxq " p1´xq 2´F p1´xq. Thus, we have that botȟ F p1´xq and min gPP LP,ᾱ L,L | x"x pf p1´x, gqq maintain the same distance below p1´xq 2 asFXIP pxq and min gPP LP,ᾱ L,L | x"x pf px, gqq, respectively. Since the second pair coincides, so must the first pair, such thať FXIP p1´xq " min gPP LP,ᾱ | x"x pf p1´x, gqq, and therefore sharpness holds on 1´Φ x pX LP q. Case II.A.3: 1 2 P rx´,x`s. Since we showed sharpness on both convpX IP Xr0, 1 2 sq and convpX IP Xr 1 2 , 1sq, we only have to show sharpness on the gap px´,x`q inX IP . Note, in this case, 1 2 RX IP . We wish to show that min gPP LP,ᾱ | x"x pf px, gqq coincides with the line between px´,f px´qq and px`,f px`qq.
To show this, we first note that both endpoints coincide withf jmax px, g˚q for some j max , and by Lemma 3, both this value of j and the corresponding solution g˚must be the same for both gap endpoints. Further, sincex´,xà re the endpoints of a gap, we have thatf px´q "x 2 andf px`q "x 2 . This can be seen as follows: first, by [7, Lemma 6], we have that eachf j , j ě 0, is incident with x 2 exactly at the points x " k 2 j`1 2 j`1 , k " 0, . . . , 2 j´1 . Furthermore, the points at which the α-vector changes, and thus the possible gaps inX IP , are exactly the pointsx " k2´L, which must take the form above for some j P 0, L´1 , so thatF j´1 pxq " x 2 for x P tx´,x`u. Since each otheř f j pxq ď x 2 at these points, this yieldsf pxq " x 2 for x P tx´,x`u. Now, let ra 1 , b 1 s be the bounds on g 1 from Lemma 1. Then we have g1 " b 1 : through the mapping Φ, we have g1 "x "b 0 at bothx´andx`, whereb 0 is defined in the manner of Lemma 1. Thus, since g 1 is subject to every constraint inP IP,ᾱ px,g,αq thatx is inP IP,ᾱ px,g,αq , we have that b 1 ďb 0 " g1 ď b 1 , such that g1 " b 1 .
Furthermore, by the convexity of proj x,g´P LP,ᾱ px,g,αq¯, since px´, g˚q, px`, g˚q P proj x,g´P LP,ᾱ px,g,αq¯, we have that px, g˚q P proj x,g´P LP,ᾱ px,g,αq¯f or allx P px´,x`q. Thus, we have for any suchx that g1 " b 1 ě minp2x, 2p1´xq, b 1 q ě g1 , yielding by Lemma 2 that g˚P argmintz : pz, gq P proj z,g pP LP,ᾱ L,L q| x"x u. Thus, we havef px, g˚q " min gPP LP,ᾱ L,L | x"x pf px, gqq "f pxq is linear inx across the gap rx´,x`s and coincides withf pxq at the endpoints, as required. Therefore, we have thatP LP,ᾱ L,L is sharp across the gap. We have now established sharpness ofP LP,ᾱ L,L over all of convpP px,g,αq q, and thus the proof is complete for 1 R I. Case II.B: 1 P I. Finally, to recover sharpness if 1 P I, we only have to observe that inserting 1 into I, thereby restricting α 1 " 1 or α 1 " 0, simply restrictš P IP,ᾱ L,L to either x P Φ x pX IP q or x P 1´Φ x pX IP q, on which sharpness holds exactly as the sharpness result on the image of Φ (or its reflection) with 1 P I, with one difference: we define Φ so that α 1 "α 1 . However, this difference has no effect on the z-minimal solutions for g1 withinX LP , and thus no effect on sharpness.

C Auxiliary Results and Proofs
In this section of the appendix, we give the proofs of Lemma 4 and Proposition 7 which we have moved here for better readability.

C.1 Epigraphs Over Non-Contiguous Domains
Here we present the proof of Lemma 4.
Proof (Lemma 4). We first note that we have F X pxq ě F pxq for all x P convpXq: for all x P convpXq, we have that either F X pxq " F pxq or that F X pxq is the line between two points on the graph of f , which must lie above the graph of f by the convexity of f . Further, we have that F X is convex, as it is a maximum between the convex function F and some of its secant lines, which are also convex. Now, trivially, by the convexity of F X , we have convpepi X pF qq " convpepi X pF X qq Ď convpepi convpXq pF X qq " epi convpXq pF X q To show that epi convpXq pF X q Ď convpepi X pF qq, let px, yq P epi convpXq pF X q. Then if x P X, y ě F X pxq " F pxq, such that px, yq P epi X pF q Ď convpepi X pF qq.
On the other hand, if x P convpXqzX, then by definition of F X we have that there exist some λ P r0, 1s and x 1 , x 2 P X such that x " λx 1`p 1´λqx 2 and F X pxq " λF px 1 q`p1´λqF px 2 q. Then we have that px, yq is a convex combination of the points px 1 , f px 1 q`py´F X pxqqq and px 2 , F px 2 q`py´F X pxqqq, which are in epi X pF q (since y´F X pxq ě 0), yielding px, yq P convpepi X pF qq as required.

C.2 Volume Proof for Bin2 and Bin3
Now we prove Proposition 7.
Proof (Proposition 7). Let P IP L,L1 be the MIP relaxation Bin2, where F L is the sawtooth approximation of z x " x 2 and z y " y 2 that consists of secant lines to x 2 between consecutive breakpoints x k " k2´L and y k " k2´L for k P 0, 2 L . Further, for L 1 Ñ 8 we have lim L1Ñ8 tpp, z p q P r0, 1sˆR : pp, z p q P Q L1 u " tpp, z p q P r0, 1sˆR : pp, z p q P epi r0,1s pp 2 qu under Hausdorff distance. As a result, we obtain lim L,L1Ñ8 pproj x,y,z pP IP L,L1 qq " tpx, y, zq P r0, 1s 2ˆR : 2´y2˘u .
Now let and w x " w y " 2´p L´1q be the distance between any two consecutive breakpoints x k , x k´1 and y k , y k´1 , respectively, and consider the volume of proj x,y,z pP IP L,L1 q over the grid piece rx k´1 , x k sˆry k´1 , y k s: L`x`y 2˘´p x`y 2 q 2˘d ydx.
The first two integrals are each the overapproximation volumes for the sawtooth approximation over two consecutive univariate domain segments, each of which has an area of 1 6 2´3 L , see [7, Appendix A]. Thus, since w x " w y " 2˚2´L, we have that the first two integrals add up to 2 3 2´4 L .
To process the third integral, we apply the two substitutions u " px´x k´1 q`py´y k´1 q 2 and v " px´x k´1 q´py´y k´1 q 2 . The integral then becomes 2 ż x k x k´1 ż y k y k´1`F L`x`y 2˘´p x`y 2 q 2˘d ydx The steps J1 and J2 rely on the observation that F L is the secant line to x 2 across the intervals r L , x k´1`yk´1 2`22´2 L s, due to the positions of x k´1 and y k´1 . In addition, for somex P rx k´1 , x k s, the error between and x 2 and the secant line to x 2 at points x k´1 and x k is given by px´x k´1 qpx k´x q -the product of distances to each endpoint. Thus, for u P r0, 2´Ls, we have F L´u`xk´1`yk´1 2¯´p u`x k´1`yk´1 2 q 2 " up2 L´u q, yielding the validity of step J2. On the other hand, to show that step J1 is valid, we observe for u P r0, 2´Ls that F L´u`xk´1`yk´1 2¯´p u`x k´1`yk´1 2 q 2 " pu´2 L qp2´2 L´u q holds, such that the second integral becomes the first integral under the substitutionũ " 2´L´u, since the secant-error portion of the integrand is symmetric about u " 2´L. Thus, the volume related to the second integral is 4 3 2´4 L . The volume of P IP L,L1 over each grid piece converges to 2¨2´4 L , yielding a total volume convergence of lim L1Ñ8 volpproj x,y,z pP IP L,L1 qq " 2 2pL´1q p2¨2´4 L q " 1 2 2´2 L .
The proof for Bin3 is similar and therefore omitted here.

D Instance set
In Table 8 we show a listing of all instances of the computational study from Section 6. The boxQP instances are publicly available at https://github.com /joehuchette/quadratic-relaxation-experiments. The ACOPF instances are also publicly available at https://github.com/robburlacu/acopflib. The QPLIB instances are available at https://qplib.zib.de/. In total, we have 60 instances, of which 30 are dense and 30 are sparse. Table 8. IDs of all 60 instances used in the computational study. In bold are the IDs of the instances that are dense.