1 Introduction

In this work, we study relaxations of general mixed-integer quadratically constrained quadratic programs (MIQCQPs). More precisely, we consider discretization techniques for non-convex MIQCQPs that allow for relaxations of the set of feasible solutions based on mixed-integer programming (MIP) formulations.

We enhance the normalized multiparametric disaggregation technique (NMDT) introduced in [7]. NMDT is a McCormick-relaxation-based MIP relaxation approach, which is applied to form relaxations of the quadratic equations \( z= x^2 \) and \( z= xy \). The McCormick relaxation is a set of four inequalities that describes the convex hull of the feasible points of the equation \(z=xy\) over the box defined by finite lower and upper bounds on x and y, see [16]. We extend NMDT by applying a discretization to both variables and refer to the latter as doubly discretized NMDT (D-NMDT). Both MIP formulations, NMDT and D-NMDT, can be applied to MIQCQPs to form an MIP relaxation by introducing auxiliary variables and one such quadratic equation for each quadratic term in the MIQCQP. Such an MIP relaxation can then be solved with a standard MIP solver. We analyze these MIP relaxation approaches theoretically and computationally with respect to the quality of the dual bound they deliver for MIQCQPs.

For a thorough discussion of background on discretization and piecewise linear techniques in MIQCQPs, please refer to Part I [3].

Contribution We extend NMDT by a discretization of both variables, called D-NMDT. We analyze both MIP relaxations in terms of the dual bound they provide for non-convex MIQCQPs. In a theoretical analysis, we show that D-NMDT requires fewer binary variables and yields better linear programming (LP) relaxations at identical relaxation errors compared to NMDT. Finally, we perform an extensive numerical study where we use NMDT and D-NMDT to generate MIP relaxations of non-convex MIQCQPs. We show that D-NMDT has clear advantages, such as tighter dual bounds, shorter runtimes, and more feasible solutions found for the original MIQCQPs when combined with a callback function that uses the non-linear programming (NLP) solver IPOPT [19]. These effects become even more apparent in dense instances with many variable products. Moreover, we combine NMDT and D-NMDT with the tightened sawtooth epigraph relaxation from Part I [3] to obtain even tighter relaxations for \(z=x^2\) terms in MIQCQPs. This tightening leads to improved results in the computational study.

Outline In Sect. 2.1 and Sect. 2.2, we review several useful concepts, notations, and core formulations from Part I [3]. In Sect. 3, we recall the NMDT MIP relaxation and introduce the new MIP relaxation D-NMDT. In Sect. 4, we prove various properties regarding the strength of the MIP relaxations, focusing on volume, sharpness, and the optimal choice of breakpoints. In Sect. 5, we present our computational study.

2 Preliminaries

2.1 MIP formulations

We follow Part I [3] for the notation used in this work and include this section to keep the article self-contained. We study relaxations of general mixed-integer quadratically constrained quadratic programs (MIQCQPs), which are defined as

$$\begin{aligned} \begin{array}{rll} \displaystyle \min _{\varvec{x}, \varvec{y}} &{} \varvec{x}^\top Q_0 \varvec{x} + \varvec{c}_0^\top \varvec{x} + \varvec{d}_0^\top \varvec{y} &{}\\ \mathrm {s.t.} &{} \varvec{x}^\top Q_j \varvec{x} + \varvec{c}_j^\top \varvec{x} + \varvec{d}_j^\top \varvec{y} \le b_j &{} j = 1, \ldots , m\\ &{} \varvec{x} \in [\underline{\varvec{x}}, \bar{\varvec{x}}] \subseteq \mathbb {R}^n, \ \varvec{y} \in \mathbb {Z}^k &{} \end{array} \end{aligned}$$
(1)

for \( Q_0, Q_j \in \mathbb {R}^{n \times n} \), \( c_0, c_j \in \mathbb {R}^n \), \( d_0, d_j \in \mathbb {R}^k \) and \( b_j \in \mathbb {R}\), \( j = 1, \ldots , m \). Throughout this article, we use the following convenient notation: for any two integers \( i \le j \), we define \( \llbracket i, j \rrbracket {:}{=}\{i, i + 1, \ldots , j\} \), and for an integer \( i \ge 1 \) we define \( \llbracket i \rrbracket {:}{=}\llbracket 1, i \rrbracket \). We will denote sets using capital letters, but also use capital letters for matrices, some functions, and the number of layers L. We typically denote variables using lower case letters and vectors of variables using bold face. For a vector \( \varvec{u} = (u_1, \ldots , u_n) \) and some index set \( I \subseteq \llbracket n \rrbracket \), we write \( \varvec{u}_I {:}{=}(u_i)_{i \in I} \). Thus, e.g. \( \varvec{u}_{\llbracket i \rrbracket } = (u_1, \ldots , u_i) \). Furthermore, we introduce the following notation: for a function \( F:X \rightarrow \mathbb {R}\) and a subset \( B \subseteq X \), let \( {{\,\textrm{gra}\,}}_B(F) \), \( {{\,\textrm{epi}\,}}_B(F) \) and \( {{\,\textrm{hyp}\,}}_B(F) \) denote the graph, epigraph and hypograph of the function \(F\) over the set B, respectively. That is,

$$\begin{aligned}&{{\,\textrm{gra}\,}}_B(F) {:}{=}\{(\varvec{u},z) \in B \times \mathbb {R}: z= F(\varvec{u})\},\ \ \\&{{\,\textrm{epi}\,}}_B(F) {:}{=}\{(\varvec{u},z) \in B \times \mathbb {R}: z\ge F(\varvec{u})\},\\&{{\,\textrm{hyp}\,}}_B(F) {:}{=}\{(\varvec{u},z) \in B \times \mathbb {R}: z\le F(\varvec{u})\}. \end{aligned}$$

In the following, we introduce MIP formulations as we will use them to represent these sets as well as the different notions of the strength of an MIP formulation explored in this work.

We will study mixed-integer linear sets, so-called mixed-integer programming (MIP) formulations, of the form

$$\begin{aligned} P^{{\text { IP}}}{:}{=}\{(\varvec{u}, \varvec{v}, \varvec{z}) \in \mathbb {R}^{d + 1} \times [0, 1]^p \times \{0, 1\}^q: A (\varvec{u}, \varvec{v}, \varvec{z}) \le b\} \end{aligned}$$

for some matrix A and vector b of suitable dimensions. The linear programming (LP) relaxation or continuous relaxation \( P^{{\text { LP}}}\) of \( P^{{\text { IP}}}\) is given by

$$\begin{aligned} P^{{\text { LP}}}{:}{=}\{(\varvec{u}, \varvec{v}, \varvec{z}) \in \mathbb {R}^{d + 1} \times [0, 1]^p \times [0, 1]^q: A (\varvec{u}, \varvec{v}, \varvec{z}) \le b\}. \end{aligned}$$

We will often focus on the projections of these sets onto the variables \( \varvec{u} \), i.e.

$$\begin{aligned} {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}) {:}{=}\{\varvec{u} \in \mathbb {R}^{d + 1}: \exists (\varvec{v}, \varvec{z}) \in [0, 1]^p \times \{0, 1\}^q \quad \mathrm {s.t.}\quad (\varvec{u}, \varvec{v}, \varvec{z}) \in P^{{\text { IP}}}\}. \end{aligned}$$
(2)

The corresponding projected linear relaxation \( {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { LP}}}) \) onto the \( \varvec{u} \)-space is defined accordingly.

In order to assess the quality of an MIP formulation, we will work with several possible measures of formulation strength. First, we define notions of sharpness, as in [5, 14]. These relate to the tightness of the LP relaxation of an MIP formulation. Whereas properties such as total unimodularity guarantee an LP relaxation to be a complete description for the mixed-integer points in the full space, we are interested here in LP relaxations that are tight descriptions of the mixed-integer points in the projected space. In the following \({{\,\textrm{conv}\,}}(S)\) denotes the convex hull of a set S.

Definition 1

We say that the MIP formulation \( P^{{\text { IP}}}\) is sharp if

$$\begin{aligned} {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { LP}}}) = {{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}})) \end{aligned}$$

holds.

Sharpness expresses a tightness at the root node of a branch-and-bound tree.

In this article, we study certain non-polyhedral sets \( U \subseteq \mathbb {R}^{d + 1} \) and will develop MIP formulations \( P^{{\text { IP}}}\) to form relaxations of U in the projected space, as defined in the following.

Definition 2

For a set \( U \subseteq \mathbb {R}^{d + 1} \) we say that an MIP formulation \( P^{{\text { IP}}}\) is an MIP relaxation of U if

$$\begin{aligned} U \subseteq {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}). \end{aligned}$$

Given a function \( F:[0, 1]^d \rightarrow \mathbb {R}\), we will mostly consider

$$\begin{aligned} U = {{\,\textrm{gra}\,}}_{[0, 1]^d}(F) \subseteq \mathbb {R}^{d + 1}. \end{aligned}$$

In particular, we will focus on either

$$\begin{aligned} U = \{(x, z) \in [0, 1]^2: z= x^2\} \quad \text {or} \quad U = \{(x, y, z) \in [0, 1]^3: z= xy\}. \end{aligned}$$

We now define several quantities to measure the error of an MIP relaxation.

Definition 3

For an MIP relaxation \( P^{{\text { IP}}}\) of a set \( U \subseteq \mathbb {R}^{d + 1} \), let \( \bar{\varvec{u}} \in {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}) \). We then define the pointwise error of \( \bar{\varvec{u}} \) as

$$\begin{aligned} {\mathcal {E}}(\bar{\varvec{u}}, U) {:}{=}\min \{|\varvec{u}_{d + 1} - \bar{\varvec{u}}_{d + 1}|: \varvec{u} \in U, {\varvec{u}}_{\llbracket d \rrbracket } = \bar{\varvec{u}}_{\llbracket d \rrbracket }\}. \end{aligned}$$

This enables us to define the following two error measures for \( P^{{\text { IP}}}\) w.r.t. U:

  1. 1.

    The maximum error of \( P^{{\text { IP}}}\) w.r.t. U is defined as

    $$\begin{aligned} {\mathcal {E}}^{\max }(P^{{\text { IP}}}, U) {:}{=}\max _{\bar{\varvec{u}} \in {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}})} {\mathcal {E}}(\bar{\varvec{u}}, U). \end{aligned}$$
  2. 2.

    The average error of \( P^{{\text { IP}}}\) w.r.t. U is defined as

    $$\begin{aligned} {\mathcal {E}}^{\text {avg}}(P^{{\text { IP}}}, U) {:}{=}{{\,\textrm{vol}\,}}({{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}) \setminus U). \end{aligned}$$

Via integral calculus, the second, volume-based error measure can be interpreted as the average pointwise error width of all points \( \varvec{u} \in {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}) \). Note that whenever the volume of U is zero (i.e. it is a lower-dimensional set), the average error width just reduces to the volume of \( {{\,\textrm{proj}\,}}_{\varvec{u}}(P^{{\text { IP}}}) \). Both of the defined error quantities for an MIP relaxation \( P^{{\text { IP}}}\) can also be used to measure the tightness of the corresponding LP relaxation \( P^{{\text { LP}}}\). The volume of the LP relaxation as a measure of MIP relaxation strength was previously used in [2].
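For instance, for the McCormick relaxation of \( z= xy \) over \( [0, 1]^2 \) recalled in Sect. 2.2.1, the maximum error is \( \tfrac{1}{4} \), attained at \( x = y = \tfrac{1}{2} \), while the average error equals the volume of the enclosed polytope, which is \( \tfrac{1}{6} \); see also Sect. 4.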

2.2 Core relaxations

In the definition of the MIP relaxations studied in this work, we will frequently consider equations of the form \(z=xy\) for continuous or integer variables x and y within certain bounds \( D_x \) and \( D_y \), respectively. To this end, we will often use the function \( F:D \rightarrow \mathbb {R},\, F(x, y) = xy \), \( D {:}{=}D_x \times D_y \), and refer to the set of feasible solutions to the equation \( z= xy \) via the graph of F, i.e. \( {{\,\textrm{gra}\,}}_D(F) = \{(x, y, z) \in D \times \mathbb {R}: z= xy\} \). In order to simplify the exposition, we will, for example, often write \( {{\,\textrm{gra}\,}}_D(xy) \) or refer to a relaxation of the equation \( z= xy \) instead of \( {{\,\textrm{gra}\,}}_D(F) \). We will do this similarly for the univariate function \( f:D_x \rightarrow \mathbb {R},\, f(x) = x^2 \) and equations of the form \( z= x^2 \), for example. For inequalities, like \(z\ge xy\) or \(z\ge x^2\), we can use the epigraph.

Furthermore, we repeatedly make use of several “core” formulations for specific sets of feasible points. They are introduced in the following.

2.2.1 McCormick envelopes

The convex hull of the equation \( z= xy \) for \( (x, y) \in D \) is given by a set of linear inequalities known as the McCormick envelope, see [16]:

$$\begin{aligned} {\mathcal {M}}(x, y) {:}{=}\{(x, y, z) \in D \times \mathbb {R}:\;&z\ge \underline{x} y+ \underline{y} x- \underline{x} \underline{y},\quad z\ge \bar{x} y+ \bar{y} x- \bar{x} \bar{y},\\&z\le \underline{x} y+ \bar{y} x- \underline{x} \bar{y},\quad z\le \bar{x} y+ \underline{y} x- \bar{x} \underline{y}\}, \end{aligned}$$

where \( D_x = [\underline{x}, \bar{x}] \) and \( D_y = [\underline{y}, \bar{y}] \).

When one of the variables, here \( \beta \), is binary, the McCormick envelope of \( z= x \beta \) simplifies to

$$\begin{aligned} {\mathcal {M}}(x, \beta ) = \left\{ (x, \beta , z):\ z\ge \underline{x} \beta ,\ \ z\ge x- \bar{x} (1 - \beta ),\ \ z\le \bar{x} \beta ,\ \ z\le x- \underline{x} (1 - \beta ) \right\} . \end{aligned}$$

For univariate continuous quadratic equations \( z= x^2 \), it simplifies to

$$\begin{aligned} {\mathcal {M}}(x, x) = \left\{ (x, z):\ z\ge 2 \underline{x} x- \underline{x}^2,\ \ z\ge 2 \bar{x} x- \bar{x}^2,\ \ z\le (\underline{x} + \bar{x}) x- \underline{x} \bar{x} \right\} . \end{aligned}$$
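To illustrate how these envelopes enter an MIP model in practice, the following minimal Python sketch adds the four inequalities of \( {\mathcal {M}}(x, y) \) to a gurobipy model, the solver interface also used in Sect. 5. The helper name add_mccormick and the toy model around it are our own illustration and are not taken from the implementation used later:

import gurobipy as gp
from gurobipy import GRB

def add_mccormick(model, x, y, z, xl, xu, yl, yu):
    # The four McCormick inequalities for z = x*y with x in [xl, xu] and y in [yl, yu].
    model.addConstr(z >= xl * y + yl * x - xl * yl)
    model.addConstr(z >= xu * y + yu * x - xu * yu)
    model.addConstr(z <= xl * y + yu * x - xl * yu)
    model.addConstr(z <= xu * y + yl * x - xu * yl)

model = gp.Model("mccormick_example")
x = model.addVar(lb=0.0, ub=1.0, name="x")
y = model.addVar(lb=0.0, ub=1.0, name="y")
z = model.addVar(lb=-GRB.INFINITY, name="z")
add_mccormick(model, x, y, z, 0.0, 1.0, 0.0, 1.0)

For a binary variable, calling the same helper with bounds [0, 1] for that variable yields exactly the simplified envelope stated above.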

2.2.2 Sawtooth-based MIP formulations

Next, we state an MIP relaxation for equations of the form \( z\ge x^2 \) that requires only logarithmically-many auxiliary variables and constraints in the number of linear segments. It makes use of an elegant piecewise linear (pwl) formulation for \( {{\,\textrm{gra}\,}}_{[0, 1]}(x^2) \) from [20] using the recursively defined sawtooth function presented in [18] to formulate the approximation of \( {{\,\textrm{gra}\,}}_{[0, 1]}(x^2) \), as described in [5]. We will use this formulation to further strengthen the relaxation of \(z=x^2\) by NMDT or D-NMDT. To this end, we define a formulation parameterized by the depth \( L \in \mathbb {N}\):

figure d

Note that, by construction in [5, 20], \( S^L \) is defined such that when \( \varvec{\alpha }\in \{0, 1\}^L \), the relationship between \( g_j\) and \( g_{j - 1} \) is \( g_j = \min \{2g_{j - 1}, 2(1 - g_{j - 1})\} \) for \( j = 1, \ldots , L \), which means that it is given by the “tooth” function \( G:[0, 1] \rightarrow [0, 1],\, G(x) = \min \{2x, 2(1 - x)\} \). Therefore, each \( g_j \) represents the output of a “sawtooth” function of x, as described in [18, 20], i.e. when \( \varvec{\alpha }\in \{0, 1\}^L \), we have

$$\begin{aligned} g_j = G^j(x) \quad \text {for } G^j {:}{=}\underbrace{G \circ G \circ \cdots \circ G}_j. \end{aligned}$$
(11)

Now, we define the function \( F^L :[0, 1] \rightarrow [0, 1]\),

$$\begin{aligned} F^L(x) {:}{=}x - \sum _{j = 1}^L 2^{-2j} G^j(x), \end{aligned}$$
(12)

which is a close approximation to \( x^2 \).
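As a quick numerical sanity check (our own script, not taken from [3, 5, 20]), the following evaluates \( F^L \) on a fine grid and confirms that it overestimates \( x^2 \) by at most \( 2^{-2L - 2} \):

def G(x):
    # the "tooth" function G(x) = min(2x, 2(1 - x))
    return min(2.0 * x, 2.0 * (1.0 - x))

def F(L, x):
    # F^L(x) = x - sum_{j=1}^L 2^(-2j) * G^j(x), see (12)
    value, g = x, x
    for j in range(1, L + 1):
        g = G(g)
        value -= 4.0 ** (-j) * g
    return value

for L in range(1, 6):
    grid = [i / 10000.0 for i in range(10001)]
    max_dev = max(F(L, x) - x * x for x in grid)
    assert abs(max_dev - 2.0 ** (-2 * L - 2)) < 1e-6
    print(L, max_dev)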

Using the relationships (11) and (12) between x and \( \varvec{g} \), any constraint of the form \( z= x^2 \) can be approximated via the function

figure e

Now, we consider the LP relaxation of \( S^L \), where each variable \( \alpha _j \) is relaxed to the interval [0, 1] . Then, via the constraints (10), we see that the weakest lower bounds on each \( g_j \) w.r.t. \( g_{j - 1} \) can be attained via setting \( \alpha _j = g_{j - 1} \), yielding a lower bound of 0. Thus, after projecting out \( \varvec{\alpha }\), the LP relaxation of \( S^L \) in terms of just x and \( \varvec{g} \) can be stated as

figure f

The LP relaxation \(T^L\) is sharp by [3, Theorem 1]. Thus, \(T^L\) yields the same lower bound on \( z\) as the MIP formulation \(S^L\) due to sharpness and the convexity of \( F^L \). This allows us to define an LP outer approximation for inequalities of the form \( z\ge x^2 \):

Definition 4

(Sawtooth Epigraph Relaxation, SER) Given some \( L \in \mathbb {N}\), the depth-L sawtooth epigraph relaxation for \( z\ge x^2 \) on the interval \( x \in [0, 1] \) is given by

figure g
Fig. 1

The sawtooth epigraph relaxations \(Q^L\) for \(L=1\) and \(L = 2\). By increasing L, we tighten the lower bound by creating more inequalities. This is done by only adding linearly many variables and inequalities in the extended formulation to gain exponentially many equally spaced cuts in the projection

In [3] it is shown that the maximum error for the sawtooth epigraph relaxation is \( 2^{-2L - 4} \) (Fig. 1).

3 MIP relaxations for non-convex MIQCQPs

In this section, we present MIP relaxations for bivariate equations of the form \(z=xy\) and univariate equations of the form \(z= x^2\). For convenience, we define a completely dense MIQCQP as an MIQCQP for which all terms of the form \( x_i^2 \) and \( x_i x_j \) appear in either the objective or in some constraint.

We proceed as follows. First, we recall the well-known MIP relaxation technique NMDT. Then, we introduce an enhanced version of it, called D-NMDT, which is designed to reduce the number of binary variables required to reach the same level of approximation accuracy compared to NMDT for completely dense MIQCQPs. Finally, we define the two tightened variants of NMDT and D-NMDT, for which we also incorporate the sawtooth epigraph relaxation (15) for all \(z=x_i^2\) terms. We call these methods T-NMDT and T-D-NMDT, respectively. We will mention the corresponding maximum errors of the presented MIP relaxations and derive them in detail in Sect. 4.1.

3.1 Base-2 NMDT

The Normalized Multiparametric Disaggregation Technique (NMDT) was introduced by Castro [7]. Later it was used in [4, 5] along with its univariate form (see [5, Appendix A]). While in [7] a base of 10 was chosen for the discretization, in [4, 5] NMDT is described with a base of 2. We use the latter and provide both the bivariate and the univariate definition of base-2 NMDT according to [5] here.

In NMDT, the key idea for relaxing \( z= xy \) is to discretize one variable, e.g. x, using binary variables \( \varvec{\beta }\in \{0, 1\}^L \) and a residual term \( {\Delta _{x}^L} \) and then relaxing the resulting products \( \beta _i y \) and \( {\Delta _{x}^L} y \) using McCormick envelopes. The following derivation of NMDT can be transferred one-to-one to bases other than 2. We start with the base-2 discretization of the variable x:

$$\begin{aligned} x = \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j + {\Delta _{x}^L}. \end{aligned}$$
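For example, for \( L = 2 \), the point \( x = 0.7 \) is represented as \( 0.7 = 2^{-1} \cdot 1 + 2^{-2} \cdot 0 + 0.2 \), i.e. \( \varvec{\beta }= (1, 0) \) and \( {\Delta _{x}^L} = 0.2 \in [0, 2^{-2}] \).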

Then we multiply by y to obtain the exact representation

$$\begin{aligned} \begin{array}{rll} x &{}= \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j + {\Delta _{x}^L},\, z= \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j y + {\Delta _{x}^L} y\\ {\Delta _{x}^L} &{}\in [0, 2^{-L}],\, \varvec{\beta }\in \{0, 1\}^L. \end{array} \end{aligned}$$
(17)

Next, we use McCormick envelopes to model all remaining product terms, \( \beta _i y \) and \( {\Delta _{x}^L} \cdot y \), to obtain the final formulation.

Definition 5

(NMDT, [7]) The MIP relaxation NMDT of \( z= xy \) with \( x \in [0, 1] \), \( y \in [0, 1] \) and a depth of \( L \in \mathbb {N}\) is defined as follows:

$$\begin{aligned} \begin{array}{rll} x &{}= \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j + {\Delta _{x}^L}, &{} z= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L},\\ (y, \beta _j, u_j) &{}\in {\mathcal {M}}(y, \beta _j) &{} j = 1, \ldots , L,\\ ({\Delta _{x}^L}, y, {\Delta _{z}^L}) &{}\in {\mathcal {M}}({\Delta _{x}^L}, y),\\ {\Delta _{x}^L} &{}\in [0, 2^{-L}], &{} \varvec{\beta }\in \{0, 1\}^L. \end{array} \end{aligned}$$
(18)

Since McCormick envelopes are exact reformulations of the variable products if at least one of the variables is required to be binary, the maximum error of NMDT with respect to \( z= xy \) is purely due to the McCormick relaxation of \( {\Delta _{z}^L} = {\Delta _{x}^L} \cdot y \), with a value of \( 2^{-L - 2} \).
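To make the construction concrete, the following gurobipy sketch builds the depth-\(L\) NMDT relaxation of \( z= xy \) on \( [0, 1]^2 \) along the lines of Definition 5. It is our own illustration, reusing the McCormick helper from the Sect. 2.2.1 sketch, and is not the implementation used in Sect. 5:

import gurobipy as gp
from gurobipy import GRB

def add_mccormick(model, a, b, z, al, au, bl, bu):
    # McCormick envelope of z = a*b over [al, au] x [bl, bu], as in Sect. 2.2.1.
    model.addConstr(z >= al * b + bl * a - al * bl)
    model.addConstr(z >= au * b + bu * a - au * bu)
    model.addConstr(z <= al * b + bu * a - al * bu)
    model.addConstr(z <= au * b + bl * a - au * bl)

def add_nmdt(model, x, y, z, L):
    # Depth-L base-2 NMDT relaxation of z = x*y with x, y in [0, 1].
    J = range(1, L + 1)
    beta = model.addVars(J, vtype=GRB.BINARY, name="beta")   # binary digits of x
    u = model.addVars(J, lb=0.0, ub=1.0, name="u")            # u_j relaxes beta_j * y
    dx = model.addVar(lb=0.0, ub=2.0 ** (-L), name="dx")      # residual Delta_x^L
    dz = model.addVar(lb=0.0, ub=2.0 ** (-L), name="dz")      # relaxes Delta_x^L * y
    model.addConstr(x == gp.quicksum(2.0 ** (-j) * beta[j] for j in J) + dx)
    model.addConstr(z == gp.quicksum(2.0 ** (-j) * u[j] for j in J) + dz)
    for j in J:
        # exact, since beta_j is binary
        add_mccormick(model, y, beta[j], u[j], 0.0, 1.0, 0.0, 1.0)
    # the only inexact step, causing the maximum error of 2^(-L-2)
    add_mccormick(model, dx, y, dz, 0.0, 2.0 ** (-L), 0.0, 1.0)

model = gp.Model("nmdt")
x = model.addVar(lb=0.0, ub=1.0, name="x")
y = model.addVar(lb=0.0, ub=1.0, name="y")
z = model.addVar(lb=0.0, ub=1.0, name="z")
add_nmdt(model, x, y, z, L=3)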

An advantage of the NMDT approach compared to the separable formulations from Part I is that it requires fewer binary variables to reach the desired level of accuracy for bipartite MIQCQPs, for which the quadratic part in each constraint is of the form \( \varvec{x}^T Q \varvec{y} \). This is due to the fact that one has only to discretize either \( \varvec{x} \in \mathbb {R}^n \) or \( \varvec{y} \in \mathbb {R}^m \). Thus, to reach a maximum error of \( 2^{-2L - 2} \) for each bilinear term, NMDT requires only \( 2L\min \{m, n\} \) binary variables instead of the \( L(m + n) \) variables required by the approaches D-NMDT (see Sect. 3.2) or HybS (from Part I). In contrast, NMDT requires twice the number of binary variables to reach the same level of accuracy if all quadratic terms \( x_i x_k \) and \( x_l^2 \) with \( k = 1, \ldots , n \) and \( l = 1, \ldots , m \) must be modelled, for example if Q is dense, see Table 1.

Next, we show how to model univariate quadratic equations \( z= x^2 \) with the NMDT technique:

Definition 6

(Univariate NMDT ([7])) The MIP relaxation NMDT of \( z= x^2 \) with \( x \in [0, 1] \) and a depth of \( L \in \mathbb {N}\) is defined as follows:

$$\begin{aligned} \begin{array}{rll} x &{}= \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j + {\Delta _{x}^L}, &{} z= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L},\\ (x, \beta _j, u_j) &{}\in {\mathcal {M}}(x, \beta _j) &{} j = 1, \ldots , L,\\ ({\Delta _{x}^L}, x, {\Delta _{z}^L}) &{}\in {\mathcal {M}}({\Delta _{x}^L}, x),\\ {\Delta _{x}^L} &{}\in [0, 2^{-L}], &{} \varvec{\beta }\in \{0, 1\}^L. \end{array} \end{aligned}$$
(19)

Note that for any depth L, the univariate formulation NMDT yields a maximum error of slightly less than \( 2^{-L - 2} \) instead of the \( 2^{-2L - 2} \) in the sawtooth relaxation from [3]. Further, the formulation NMDT is not sharp. For example at \( x = \tfrac{1}{2} \), its LP relaxation admits the solution \( \beta _j = \tfrac{1}{2} \) for all \( j \in \llbracket L \rrbracket \), \( {\Delta _{x}^L} = 2^{-L - 1} \), \( u_j = 0 \) for all \( j \in \llbracket L \rrbracket \), \( {\Delta _{z}^L}= 0 \) and \( z= 0 \), which is not in the convex hull of the MIP formulation Univariate NMDT stated in (19).

However, we can tighten the lower bound on \(z\) in (19) by adding the sawtooth epigraph relaxation (15) of depth \( L_1 \) (with \( L_1 \ge L\)), i.e. \( (x, z) \in Q^{L_1}\). We refer to NMDT with this lower-bound tightening for univariate quadratic terms as T-NMDT. Note that univariate T-NMDT is a sharp MIP formulation, which we discuss in Sect. 4.3.

Definition 7

(Univariate T-NMDT) The MIP relaxation T-NMDT of \( z= x^2 \) with \( x \in [0, 1] \) and a depth of \( L, L_1 \in \mathbb {N}\) with \( L_1 \ge L \) is defined as follows:

figure j

3.2 Doubly discretized NMDT

The key idea behind the novel MIP relaxation Doubly Discretized NMDT (D-NMDT) for \( z= xy \) is to further increase the accuracy of NMDT by discretizing the second variable y as well, which leads to a double NMDT substitution, namely in the \( {\Delta _{x}^L} y \)-term. In this way, for problems where NMDT would require discretizing all \( x_i \)-variables, e.g. if we have some dense constraint, we can double the accuracy of the relaxation for the equations \( z_{ij} = x_i x_j \) without adding additional binary variables by taking advantage of the fact that both variables are discretized anyway. In NMDT, we could choose to discretize either x or y for each equation of the form \( z= xy \). For D-NMDT, we consider both options of discretization, and then, by introducing a parameter \( \lambda \in [0, 1] \), we can model a hybrid version of the two resulting MIP relaxations. Namely, we write

$$\begin{aligned} xy = \lambda xy + (1 - \lambda ) xy, \end{aligned}$$

then discretize y first in the relaxation of \( \lambda xy \) and x first in the relaxation of \( (1 - \lambda ) xy \). Finally, the complete MIP relaxation D-NMDT is obtained by relaxing the resulting products via McCormick envelopes (see Appendix A for the detailed derivation).
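For illustration, carrying out these two substitutions, with \( x = \sum _{j = 1}^L 2^{-j} \beta ^x_j + {\Delta _{x}^L} \) and \( y = \sum _{j = 1}^L 2^{-j} \beta ^y_j + {\Delta _{y}^L} \), and collecting the terms that multiply each binary variable gives

$$\begin{aligned} xy = \sum _{j = 1}^L 2^{-j} \beta ^y_j \left( \lambda x+ (1 - \lambda ) {\Delta _{x}^L}\right) + \sum _{j = 1}^L 2^{-j} \beta ^x_j \left( (1 - \lambda ) y+ \lambda {\Delta _{y}^L}\right) + {\Delta _{x}^L} {\Delta _{y}^L}. \end{aligned}$$

The products that remain to be relaxed via McCormick envelopes are thus \( \beta ^y_j (\lambda x+ (1 - \lambda ) {\Delta _{x}^L}) \), \( \beta ^x_j ((1 - \lambda ) y+ \lambda {\Delta _{y}^L}) \) and \( {\Delta _{x}^L} {\Delta _{y}^L} \); the bounds of the two composite continuous factors are discussed after Definition 8.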

Definition 8

(D-NMDT) The MIP relaxation D-NMDT of \( z= xy \) with \( x, y \in [0, 1] \), a depth of \( L \in \mathbb {N}\) and the parameter \( \lambda \in [0, 1] \) is defined as follows:

figure k

As McCormick envelopes are exact reformulations of bilinear products if one of the variables is binary, we only make an error in the relaxation of the continuous variable product \( {\Delta _{x}^L} {\Delta _{y}^L} \). This yields a maximum error of \( 2^{-2L - 2} \) for D-NMDT. For bounds on the terms \( (1 - \lambda ) {\Delta _{x}^L} + \lambda x \) and \( \lambda {\Delta _{y}^L} + (1 - \lambda ) y \), see Appendix B.

Remark 1

For our implementation of the D-NMDT technique used in Sect. 5, we set \(\lambda = \tfrac{1}{2} \) for the sake of formulation symmetry in x and y.

To model the univariate quadratic terms with this method, we set \( y = x \) in \( z= xy \) and obtain an MIP relaxation for \( z= x^2 \). The resulting MIP relaxation is stronger than the univariate NMDT approach from Definition 6, which we will prove later.

Definition 9

(Univariate D-NMDT) The MIP relaxation D-NMDT of \( z= x^2 \) with \( x \in [0, 1] \) and a depth of \( L \in \mathbb {N}\) is defined as follows:

figure l

Again, as McCormick envelopes are exact reformulations of bilinear products if one of the variables is required to be binary, we only make an error in the relaxation of the continuous variable product \( {\Delta _{x}^L} {\Delta _{x}^L} \). This yields a maximum error of \( 2^{-2L - 2} \) for univariate D-NMDT. Note that the upper bound of this formulation is formed by exactly the same pwl approximation for \( z= x^2 \) as the sawtooth formulations. Unfortunately, the univariate D-NMDT is not sharp; for example, at \( x = \tfrac{1}{2}\), its LP relaxation admits the solution \( \beta _j = \tfrac{1}{2}~\) for all \( j \in \llbracket L \rrbracket \), \( {\Delta _{x}^L} = 2^{-L - 1} \), \( {\Delta _{z}^L}= 0 \), \( u_j = 0 \) for all \( j \in \llbracket L \rrbracket \) and \( z= 0 \), which is not in the convex hull of \( {{\,\textrm{gra}\,}}_{[0, 1]}(x^2) \).

To formulate a tightened version of D-NMDT, we tighten the lower bound on \(z\) in (22) by removing all McCormick lower bounds and adding the sawtooth epigraph relaxation (15) of depth \( L_1 \) (with \( L_1 \ge L\)). Note that univariate T-D-NMDT is a sharp MIP formulation, which we will prove in Sect. 4.3.

Definition 10

(Univariate T-D-NMDT) The MIP relaxation T-D-NMDT of \( z= x^2 \) with \( x \in [0, 1] \) and depths \( L, L_1 \in \mathbb {N}\) with \( L_1 \ge L \) is defined as follows:

figure m

In Table 1 in Sect. 4, we give a summary of the number of binary variables and constraints as well as the accuracy of each MIP relaxation when applied to a dense MIQCQP of the form (1).

Remark 2

(Binary Variables and Dense MIQCQPs) When modelling Problem (1) using the MIP relaxations NMDT and D-NMDT, for each variable \(x_i\), we will need a discretization of the form \( x_i = \sum _{j = 1}^L 2^{-j}\beta _j + \Delta ^L_{x_i} \) with \( \beta \in \{0, 1\}^L \). Thus, both of these formulations use nL binary variables in the case of a dense MIQCQP. However, with the same number of binary variables, the improved binarization in D-NMDT reduces the maximum error per quadratic term from \( 2^{-L - 2} \) to \( 2^{-2L - 2} \) compared to NMDT.
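For instance, for a completely dense instance with \( n = 50 \) and \( L = 4 \), both NMDT and D-NMDT introduce 200 binary variables, while the maximum relaxation error per quadratic term is \( 2^{-6} \) for NMDT and \( 2^{-10} \) for D-NMDT.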

Note that some preprocessing or reformulation, such as a convex quadratic reformulation, may reduce the number of binary variables needed. We do not use such reformulations in this work, but focus on applying our MIP relaxations as is.

4 Theoretical analysis

In this section, we give a theoretical analysis of the presented MIP relaxations for the equation \( z= xy \) over \( x, y \in [0, 1] \) as well as the equation \( z= x^2 \) over \( x \in [0, 1] \), respectively, in order to allow for a comparison of structural properties between them. In particular, we analyze their maximum error, average errors, formulation strengths, i.e. sharpness, as well as the optimal placement of breakpoints to minimize average errors. Our results are summarized in Table 1, which also includes the results for the separable methods HybS, Bin2, and Bin3 from Part I [3].

Table 1 A summary of characteristics of the different MIP relaxations for \(z=xy\)

4.1 Maximum error

We start by discussing the maximum errors. We will derive the maximum errors of the NMDT-based formulations by reducing the error calculations to the error of a single McCormick relaxation per grid piece. In general, for the equation \( z= xy \) over a grid piece \( [\underline{x}, \bar{x}] \times [\underline{y}, \bar{y}] \), the maximum under- and overestimation is \( \tfrac{1}{4} (\bar{x} - \underline{x})(\bar{y} - \underline{y}) \), attained at the midpoint \( \left( \tfrac{\underline{x} + \bar{x}}{2}, \tfrac{\underline{y} + \bar{y}}{2}\right) \), see e.g. [15, page 23].

For NMDT, to show that the maximum error can be computed from a single McCormick relaxation, we fix \( \varvec{\beta }\in \{0, 1\}^L \) in (18) and observe two facts: (1) we get \(x = k 2^{-L} + {\Delta _{x}^L}\) for some integer k and therefore x varies only with \( \Delta _x^L \in [0, 2^{-L}]\), and (2) the McCormick relaxation \( (y, \beta _i, u_i) \in {\mathcal {M}}(y, \beta _i) \) is exact for each \( i = 1, \ldots , L \), i.e. , the relaxation equals \( u_i = y \beta _i \). These two facts imply that the only error incurred on this small interval stems from the single McCormick relaxation \( ({\Delta _{x}^L}, y, {\Delta _{z}^L}) \in {\mathcal {M}}(\Delta _x^L, y) \) over regions of the form \(({\Delta _{x}^L}, y) \in [0, 2^{-L}] \times [0,1]\). This yields a maximum error of \(\tfrac{1}{4}(2^{-L} \cdot 1) = 2^{-L-2}\). Similarly, for D-NMDT and univariate NMDT and D-NMDT, one can also show that all errors come from the McCormick relaxations of the continuous error terms. The maximum errors of the different MIP relaxations are listed in the following propositions.

Proposition 1

The maximum error in the NMDT MIP relaxation for \( z= xy \) with \( x, y \in [0, 1] \) is \(\tfrac{1}{4}(2^{-L} \cdot 1) = 2^{-L-2}\).

Likewise, for D-NMDT, the maximum error in \( z= xy \) is purely in the McCormick relaxation of the term \(\left( {\Delta _{x}^L}, {\Delta _{y}^L}, {\Delta _{z}^L}\right) \in {\mathcal {M}}\left( {\Delta _{x}^L}, {\Delta _{y}^L}\right) \) over the region \(({\Delta _{x}^L}, {\Delta _{y}^L}) \in [0, 2^{-L}]\times [0, 2^{-L}]\), yielding a maximum error of \(\tfrac{1}{4}(2^{-L} \cdot 2^{-L}) = 2^{-2L-2}\).

Proposition 2

The maximum error in the D-NMDT MIP relaxation for \( z= xy \) with \( x, y \in [0, 1] \) is \(\tfrac{1}{4}(2^{-L} \cdot 2^{-L}) = 2^{-2L-2}\).

For univariate D-NMDT, the maximum error in \(z=x^2\) arises from the McCormick relaxation \(({\Delta _{x}^L}, {\Delta _{z}^L}) \in {\mathcal {M}}({\Delta _{x}^L}, {\Delta _{x}^L})\) over the interval \({\Delta _{x}^L} \in [0, 2^{-L}]\), yielding a maximum error of \( 2^{-2L - 2}\).

Proposition 3

The maximum error in the univariate D-NMDT MIP relaxation for \( z= x^2 \) with \( x \in [0, 1] \) is \( 2^{-2L - 2}\).

Finally, for univariate NMDT, the error is incurred by the McCormick relaxation \(({\Delta _{x}^L}, x, {\Delta _{z}^L}) \in {\mathcal {M}}({\Delta _{x}^L}, x)\) over the box \(({\Delta _{x}^L}, x) \in [0, 2^{-L}] \times [0,1]\) with \(x = k 2^{-L} + {\Delta _{x}^L}\) for some \(k \in \{0, \dots , 2^{L}-1\}\). Over this box, the error-maximizing point \( (x, {\Delta _{x}^L}) = (\tfrac{1}{2}, 2^{-L-1}) \) derived in [15] is not feasible, as \( x = \tfrac{1}{2} \) implies \( {\Delta _{x}^L} = 0 \). In fact, we can show that the maximum error is slightly less than the expected \( 2^{-L - 2} \).

To prove this, we focus on the maximum error of the underestimating part of the McCormick envelope with respect to \(x{\Delta _{x}^L}\) and skip the overestimating part as it works analogously. By (4), the McCormick relaxation underestimator over the box \(({\Delta _{x}^L}, x) \in [0, 2^{-L}] \times [0,1]\) is given as

$$\begin{aligned} \max \left\{ 0,\, {\Delta _{x}^L} - 2^{-L}(1-x)\right\} , \quad \text {where } x=k2^{-L}+ {\Delta _{x}^L},\ {\Delta _{x}^L} \in [0, 2^{-L}],\ k\in \{0,\ldots , 2^{L}-1\}. \end{aligned}$$

The underestimator is zero at points in the domain where

$$\begin{aligned} {\Delta _{x}^L} \le -2^{-L}x+2^{-L}=2^{-L}(1-2^{-L}k- {\Delta _{x}^L}) \end{aligned}$$
(24)

holds and \({\Delta _{x}^L} - 2^{-L}(1-2^{-L}k- {\Delta _{x}^L})\) at the rest of the domain. The maximum error of the McCormick underestimation is

$$\begin{aligned}&\max _{\begin{array}{c} {\Delta _{x}^L} \in [0, 2^{-L}],\\ k\in \{0,\ldots , 2^{L}-1\} \end{array}} \{ x {\Delta _{x}^L} - \max \{0, {\Delta _{x}^L} - 2^{-L}(1-x)\} | x=k2^{-L}+ {\Delta _{x}^L} \}\\ =&\max _{\begin{array}{c} {\Delta _{x}^L} \in [0, 2^{-L}],\\ k\in \{0,\ldots , 2^{L}-1\} \end{array}} \{ 2^{-L}k {\Delta _{x}^L} + ({\Delta _{x}^L})^2 - \max \{0, {\Delta _{x}^L} - 2^{-L}(1-{\Delta _{x}^L} - k2^{-L})\} \}. \end{aligned}$$

First, we determine the maximum error on the piece where the McCormick underestimator is the zero function. In the \(({\Delta _{x}^L}, k)\) space the region described by the inequality (24) equals \({\Delta _{x}^L} \le \frac{2^L-k}{2^L+4^L}\). Now suppose we are at some point in this region, then we can increase the error function \(2^{-L}k {\Delta _{x}^L} + ({\Delta _{x}^L})^2 -0\) by increasing either k or \({\Delta _{x}^L}\). Consequently, the maximum error is attained if \({\Delta _{x}^L} = \frac{2^L-k}{2^L+4^L}\). The error at these points can be purely expressed as a quadratic function in k:

$$\begin{aligned} x {\Delta _{x}^L} - 0 = (2^{-L}k+ {\Delta _{x}^L}) {\Delta _{x}^L} = \left( 2^{-L}k+ \frac{2^L-k}{2^L+4^L}\right) \left( \frac{2^L-k}{2^L+4^L}\right) . \end{aligned}$$

It is maximized and symmetric at \(k^*=\frac{1}{2}(2^{L}-1)= 2^{L-1} - \frac{1}{2}\). Since \(k^* \not \in \mathbb {N}\) for any \(L\ge 1\), the maximum error is attained at \(k_1=2^{L-1}-1\) and \(k_2=2^{L-1}\). It has a value of \(2^{-L-2} - 2^{-3L-2} (1+2^{-L})^{-2}\). We can use the same reasoning for the region \({\Delta _{x}^L} \ge \frac{2^L-k}{2^L+4^L}\), where the error function increases when decreasing either k or \({\Delta _{x}^L}\), and obtain the same maximum error at the same points. The values \(k_1\) and \(k_2\) correspond to

$$\begin{aligned} ({\Delta _{x}^L}, x)= \left( \tfrac{1}{2(2^L + 1)}, \tfrac{1}{2} \pm \tfrac{1}{2(2^L + 1)}\right) . \end{aligned}$$

The maximum overestimation error with the McCormick envelope, where the proof works very similarly, is obtained at \(({\Delta _{x}^L}, x)=(\frac{1}{4},\frac{1}{4})\) and \(({\Delta _{x}^L}, x)=(\frac{1}{4},\frac{3}{4})\) with a value of \(2^{-4}\) if \(L=1\). However, for \(L \ge 2\) the value is somewhat lower, namely \(2^{-L-2} - 2^{-3\,L-2} (1-2^{-L})^{-2}\) attained at

$$\begin{aligned} ({\Delta _{x}^L}, x) = \left( \tfrac{1}{2(2^L - 1)},\tfrac{1}{2} \pm \tfrac{1}{2(2^L - 1)}\right) \text { if } L \ge 2. \end{aligned}$$

The maximum error is therefore set by the underestimation. We summarize these findings in the following proposition.

Proposition 4

The maximum error in the univariate NMDT relaxation for \( z= x^2 \) with \( x \in [0, 1] \) is \(2^{-L-2} - 2^{-3L-2} (1+2^{-L})^{-2}\).
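The closed-form value of Proposition 4 can be checked numerically. The following brute-force script (our own, not part of the analysis above) evaluates the McCormick underestimator on a fine grid of each residual interval and reproduces the bound up to the grid resolution:

def max_underestimation_error(L, points_per_piece=20000):
    # For fixed binaries, x = k*2^-L + d with d in [0, 2^-L]; the McCormick
    # underestimator of x*d over [0, 2^-L] x [0, 1] is max(0, d - 2^-L * (1 - x)).
    h = 2.0 ** (-L)
    worst = 0.0
    for k in range(2 ** L):
        for i in range(points_per_piece + 1):
            d = h * i / points_per_piece
            x = k * h + d
            underestimator = max(0.0, d - h * (1.0 - x))
            worst = max(worst, x * d - underestimator)
    return worst

for L in range(1, 5):
    closed_form = 2.0 ** (-L - 2) - 2.0 ** (-3 * L - 2) / (1.0 + 2.0 ** (-L)) ** 2
    assert abs(max_underestimation_error(L) - closed_form) < 1e-4
    print(L, closed_form)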

A summary of the maximum error analysis results can be found in Table 1. It should be noted that for a fixed depth L, HybS and D-NMDT provide the smallest maximum errors among the considered MIP relaxations in our study.

4.2 Average error and minimizing the average error

In this section, we will study the average error of the considered MIP relaxations. In Definition 3, the average error is defined as the volume enclosed by the projected MIP relaxation. We consider it to be an additional measure of the quality of an MIP relaxation besides the maximum error.

For equations of the form \( z= x^2 \), univariate D-NMDT gives piecewise McCormick relaxations. In [5, Proposition 5], it is shown that a uniform discretization is optimal for a fixed number of breakpoints. However, for univariate NMDT the calculation of the volume is much more complicated, so we omit it here.

Next, we compute the average errors of NMDT and D-NMDT for the equation \(z=xy\). Then we prove that the uniform discretizations, which are used in the definition of NMDT and D-NMDT, are indeed optimal in terms of minimizing the volume of the projected MIP relaxation if the number of discretization points is fixed (i.e. if L and \( L_1 \) are fixed).

Proposition 5

Let \( P^{{\text { IP}}}_{\text {NMDT}} \) and \( P^{{\text { IP}}}_{\text {D-NMDT}} \) be the MIP relaxations of NMDT and D-NMDT for \( z= xy \) for some \( L \ge 0 \) as defined in (18) and (21), respectively. Their respective average errors are

$$\begin{aligned} {\mathcal {E}}^{\text {avg}}(P^{{\text { IP}}}_{\text {NMDT}}, {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy)) = \tfrac{1}{6} 2^{-L} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {E}}^{\text {avg}}(P^{{\text { IP}}}_{\text {D-NMDT}}, {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy)) = \tfrac{1}{6} 2^{-2L}. \end{aligned}$$

Proof

Note that the discretization in NMDT and D-NMDT yields piecewise McCormick relaxations over a uniformly spaced grid, where each grid piece corresponds to some fixed integer solution \( \varvec{\beta }^x, \varvec{\beta }^y \in \{0, 1\}^L\), \( {\Delta _{x}^L}, {\Delta _{y}^L} \in [0, 2^{-L}] \). The volume of the McCormick envelope over a single grid piece is \( \tfrac{1}{6} w_{x}^2 w_{y}^2 \), where \( w_{x}\) is its x-width and \( w_{y}\) is its y-width (see e.g. [15, page 22]). The average error is then the sum over all grid piece volumes. Now, for NMDT we have \( 2^L \) grid pieces with \( w_{y}= 1 \) and \( w_{x}= 2^{-L} \), yielding a volume per grid piece of \( \tfrac{1}{6} 2^{-2L} \) and thus a total volume of \( \tfrac{1}{6} 2^{-L} \). Similarly, for D-NMDT we have \( 2^{2L} \) grid pieces with \( w_{x}= w_{y}=2^{-L} \), which yields a volume per grid piece of \( \tfrac{1}{6} 2^{-4L} \) and thus a total volume of \( \tfrac{1}{6} 2^{-2L} \). \(\square \)

When applied to \( {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy) \), NMDT and D-NMDT are both piecewise McCormick relaxations, defined as

$$\begin{aligned} \bigcup _{i \in \llbracket n \rrbracket , j \in \llbracket m \rrbracket } {\mathcal {M}}([x_{i - 1}, x_i], [y_{j - 1}, y_j]), \end{aligned}$$

where we use the notation \( {\mathcal {M}}([x_{i - 1}, x_i], [y_{j - 1}, y_j]) \) to mean the McCormick envelope \( {\mathcal {M}}(x, y) \) with \( x \in [x_{i - 1}, x_i] \) and \( y \in [y_{j - 1}, y_j] \), for \( 0 = x_0< x_1< \cdots < x_n = 1 \) and \( 0 = y_0< y_1< \cdots < y_m = 1 \).

We now prove that a uniform placement of breakpoints minimizes the average error in a piecewise McCormick relaxation. For \( n = 2^L \) and \( m = 1 \), this yields precisely the NMDT relaxation of depth L, and if \( n = m = 2^L \), then this yields precisely the D-NMDT relaxation of depth L. Hence, they are optimal discretizations. The average error in NMDT is \( \frac{1}{6n} = \frac{1}{6}2^{-L} \), and \( \frac{1}{6n^2} = \frac{1}{6}2^{-2\,L} \) in D-NMDT. This follows from the proof below.

Theorem 1

Let \( 0 = x_0< x_1< \cdots < x_n = 1 \) and \( 0 = y_0< y_1< \cdots < y_m = 1 \) be sets of breakpoints. Then a uniform spacing of these breakpoints minimizes the average error over all piecewise McCormick relaxations of \( {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy) \).

Proof

Let \( w_{x_i}{:}{=}x_i - x_{i - 1} \) and \( w_{y_j}{:}{=}y_j - y_{j - 1} \) with \( i \in \llbracket n \rrbracket \) and \( j \in \llbracket m \rrbracket \) be the widths of the grid pieces \( [x_{i - 1}, x_i] \times [y_{j - 1}, y_j] \). The volume of the McCormick envelope \( {\mathcal {M}}([x_{i - 1}, x_i], [y_{j - 1}, y_j]) \) over a single grid piece is \( \tfrac{1}{6} w_{x_i}^2 w_{y_j}^2 \), see [15, page 22]. Therefore, the problem of minimizing the average error of a piecewise McCormick relaxation can be formulated as

$$\begin{aligned} \begin{array}{rll} \displaystyle \tfrac{1}{6} \min &{} \sum _{i = 1}^n \sum _{j = 1}^m w_{x_i}^2 w_{y_j}^2\\ \text {s.t.} &{} \sum _{i = 1}^n w_{x_i}=1\\ &{} \sum _{j = 1}^m w_{y_j}=1\\ &{} w_{x_i}\ge 0 &{} i = 1, \ldots , n\\ &{} w_{y_j}\ge 0 &{} j = 1, \ldots , m. \end{array} \end{aligned}$$
(25)

The objective function in (25) sums the average errors over the single grid pieces while the constraints ensure that all single grid widths sum up to 1 and are greater than or equal to 0. Rewriting it to

$$\begin{aligned} \begin{array}{rll} \displaystyle \tfrac{1}{6} \min &{} (\sum _{i = 1}^n w_{x_i}^2)\cdot (\sum _{j = 1}^m w_{y_j}^2)\\ \text {s.t.} &{} \sum _{i = 1}^n w_{x_i}=1\\ &{} \sum _{j = 1}^m w_{y_j}= 1\\ &{} w_{x_i}\ge 0 &{} i = 1, \ldots , n\\ &{} w_{y_j}\ge 0 &{} j = 1, \ldots , m. \end{array} \end{aligned}$$
(26)

lets (26) decompose into the two independent convex subproblems

$$\begin{aligned}{} & {} \begin{array}{rll} \displaystyle \tfrac{1}{6} \min &{} \sum _{i = 1}^n w_{x_i}^2\\ \text {s.t.} &{} \sum _{i = 1}^n w_{x_i}=1\\ &{} w_{x_i}\ge 0 &{} i = 1, \ldots , n,\\ \end{array} \end{aligned}$$
(27)
$$\begin{aligned}{} & {} \begin{array}{rll} \displaystyle \tfrac{1}{6} \min &{} \sum _{j = 1}^m w_{y_j}^2\\ \text {s.t.} &{} \sum _{j = 1}^m w_{y_j}=1\\ &{} w_{y_j}\ge 0 &{} j = 1, \ldots , m. \end{array} \end{aligned}$$
(28)

Applying the KKT conditions to (27) and (28), which are sufficient for global optimality here, directly shows that a uniform placement of the breakpoints with \( w_{x_i}= \tfrac{1}{n} \) and \( w_{y_j}= \tfrac{1}{m} \) is optimal for (25). The total average error is then \( \tfrac{1}{6nm} \). \(\square \)

Corollary 1

Let \( 0 = x_0< x_1< \cdots < x_n = 1 \) and \( 0 = y_0 < y_1 = 1 \) be sets of breakpoints with \( n = 2^L \) and \( P^{{\text { IP}}}_L \) a depth-L NMDT MIP relaxation of \( {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy) \) from (18). Then \( P^{{\text { IP}}}_L \) is an optimal piecewise McCormick relaxation with an average error of \( {\mathcal {E}}^{\text {avg}}(P^{{\text { IP}}}_L, {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy)) = \frac{1}{6}2^{-L} \).

Corollary 2

Let \( 0 = x_0< x_1< \cdots < x_n = 1 \) and \( 0 = y_0< y_1< \cdots < y_m = 1 \) be sets of breakpoints with \( n = m = 2^L \) and \( P^{{\text { IP}}}_L \) a depth-L D-NMDT MIP relaxation of \( {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy) \) from (21). Then \( P^{{\text { IP}}}_L \) is an optimal piecewise McCormick relaxation with an average error of \( {\mathcal {E}}^{\text {avg}}(P^{{\text { IP}}}_L, {{\,\textrm{gra}\,}}_{[0, 1]^2}(xy)) = \frac{1}{6}2^{-2L} \).

We summarize the key results of Sect. 4.2 in the remark below and in Table 1.

Remark 3

(Tightness of MIP Relaxations) For an equation \( z= x^2 \) and a fixed depth L, the tightened sawtooth relaxation [3, Definition 7], and the separable formulations from Part I that employ it, have the smallest volume in the projected MIP relaxation among all studied formulations: they are equivalent in upper bound, with a tightened lower bound, compared to univariate NMDT and D-NMDT. For \( z= xy \), D-NMDT is the tightest formulation, as it yields the convex hull of \( {{\,\textrm{gra}\,}}_D(xy) \) on each grid piece \( D = [k^x 2^{-L}, (k^x + 1) 2^{-L}] \times [k^y 2^{-L}, (k^y + 1) 2^{-L}] \), \( k^x, k^y \in \llbracket 0, 2^L - 1 \rrbracket \). Combining these facts, T-D-NMDT is the tightest relaxation presented for the full MIQCQP.\(\diamond \)

4.3 Formulation strength: sharpness and LP relaxations

In the previous section, we discussed the maximum and average errors incurred by using certain discretizations. We will now consider the strength of the resulting MIP relaxations by analyzing their LP relaxations, i.e. we will check for sharpness. Sharpness means that the projected LP relaxation equals the convex hull of the projected MIP relaxation.

We start with the core formulations from Sect. 2.2. It is well known that the McCormick relaxation yields the convex hull of the feasible set of \( z= xy \) over box domains \( D = [\underline{x}, \bar{x}] \times [\underline{y}, \bar{y}] \). Therefore, it is obviously sharp. The volume is \( \tfrac{1}{6} (\bar{x} - \underline{x})^2 (\bar{y} - \underline{y})^2 \). In [3] it is further shown that the sawtooth epigraph relaxation is also sharp. Since the epigraph of f is an unbounded set, we do not discuss volume here.

Next, we look at the formulations from Sect. 3. The LP relaxations of NMDT and D-NMDT for \(z= xy\) yield the McCormick envelope over D, and thus they are sharp. The LP relaxation volumes of NMDT and D-NMDT for \(z= xy\) are thus \( \tfrac{1}{6} \) (for \( x, y \in [0, 1] \)) and independent of the choice of L. In Sect. 3 we showed that univariate NMDT as well as univariate D-NMDT are not sharp by giving points that are feasible for the LP relaxation but are not in the convex hull of the MIP relaxations. Finally, we consider the two tightened formulations univariate T-NMDT and univariate T-D-NMDT for \(z=x^2\). We show that both formulations are sharp for any \(L_1\) with \(L_1\ge L\). A graphical illustration of how tightening leads to sharp MIP formulations in the univariate cases can be seen in Fig. 2 for D-NMDT and Fig. 3 for NMDT. We begin with a lemma about the structure of (non-tightened) univariate D-NMDT MIP relaxations.

Fig. 2

Feasible set of the univariate MIP relaxation D-NMDT and its LP relaxation with \(L=2\). In three of the plots, we display the lower bounds obtained from tightening to show how this affects the MIP relaxation

Fig. 3

Feasible set of the univariate MIP relaxation NMDT and its LP relaxation with \(L=2\). In three of the plots we display the lower bounds obtained from tightening to show how this affects the MIP relaxation

Lemma 1

Let \(P^{{\text { IP}}}_L\) be the univariate D-NMDT MIP relaxation with depth \(L \ge 1\) for \(z= x^2\) as defined in (22). Then the projection of \(P^{{\text { IP}}}_L\) in the \((x,z)\)-space gives a series of McCormick envelopes, i.e.

$$\begin{aligned} {{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { IP}}}_L) = \bigcup _{i \in \llbracket 0, 2^L - 1 \rrbracket } {\mathcal {M}}_i \end{aligned}$$
(29)

where \({\mathcal {M}}_i\) is the McCormick relaxation \({\mathcal {M}}(x,x)\) with \(x \in [i 2^{-L}, (i+1)2^{-L}]\) and \(i=0,\ldots , 2^{L}-1\).

The proof of Lemma 1 is stated in Appendix “Piecewise McCormick relaxations of univariate DNMDT”. We use Lemma 1 to prove sharpness of the tightened version univariate T-D-NMDT.

Theorem 2

The univariate T-D-NMDT MIP relaxation for \(z=x^2\) is sharp.

Proof

Let \(P^{{\text { IP}}}_{L,L_1}\) be a univariate T-D-NMDT MIP relaxation with \(L_1\ge L\) and let \(P^{{\text { LP}}}_{L,L_1}\) be the corresponding LP relaxation. To prove sharpness, we have to show that

$$\begin{aligned} {{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}) = {{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{ (x,z)}(P^{{\text { IP}}}_{L,L_1})). \end{aligned}$$

As \({{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}) \supseteq {{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{ (x,z)}(P^{{\text { IP}}}_{L,L_1})) \) is obvious for any LP relaxation, we have to show

$$\begin{aligned} {{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}) \subseteq {{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{ (x,z)}(P^{{\text { IP}}}_{L,L_1})). \end{aligned}$$

To do that we analyze the minimum and maximum values of \(z\) in \({{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1})\) and \({{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{ (x,z)}(P^{{\text { IP}}}_{L,L_1}))\).

In Lemma 1, we showed that the projected MIP relaxation of univariate D-NMDT is a series of small McCormick relaxations. As the MIP relaxation contains the points (0, 0) and (1, 1) it follows that its convex hull contains the line connecting these points and thus

$$\begin{aligned} \max _{z\in [0,1]}{{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { IP}}}_{L,L_1}|_{x=\tilde{x}}))= \tilde{x}, \, \forall \, \tilde{x} \in [0,1] \end{aligned}$$

holds.

Next, we show that the same inequality bounds the maximum value of \(z\) in the LP relaxation, i.e. we prove that

$$\begin{aligned} \max _{z\in [0,1]}{{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}|_{x=\tilde{x}})\le \tilde{x}, \, \forall \, \tilde{x} \in [0,1]. \end{aligned}$$

The McCormick relaxations in (22) give the following inequalities

$$\begin{aligned} {\Delta _{z}^L} {\mathop {\le }\limits ^{(*)}} 2^{-L}{\Delta _{x}^L},\,\, u_j{\mathop {\le }\limits ^{(**)}} x+ {\Delta _{x}^L}, \text { and } u_j{\mathop {\le }\limits ^{(***)}} (1+2^{-L})\beta _j \quad \forall j= 1, \ldots , L. \end{aligned}$$

Thus in the LP relaxation, z is bounded as follows

$$\begin{aligned} \begin{aligned} z&= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L}{\mathop {\le }\limits ^{(*),(**)}} (x + {\Delta _{x}^L})\left( \displaystyle \sum _{j = 1}^L 2^{-j}\right) +2^{-L}{\Delta _{x}^L}\\&= (x + {\Delta _{x}^L})(1-2^{-L})+2^{-L}{\Delta _{x}^L} = x + {\Delta _{x}^L} - 2^{-L}x. \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} z&= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L}{\mathop {\le }\limits ^{(*),(***)}} (1+2^{-L})(\displaystyle \sum _{j = 1}^L 2^{-j} \beta _j)+2^{-L}{\Delta _{x}^L}\\&=(1+2^{-L})(x-{\Delta _{x}^L})+2^{-L}{\Delta _{x}^L} = x + 2^{-L}x - {\Delta _{x}^L}. \end{aligned} \end{aligned}$$

Combining both inequalities we have

$$\begin{aligned} z \le {\left\{ \begin{array}{ll} x + ({\Delta _{x}^L} - 2^{-L}x),\\ x + (2^{-L}x - {\Delta _{x}^L}), \end{array}\right. } \end{aligned}$$

which implies \(z\le x\) and therefore

$$\begin{aligned} \max _{z\in [0,1]}{{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}|_{x=\tilde{x}})\le \tilde{x}, \, \forall \, \tilde{x} \in [0,1]. \end{aligned}$$

It remains to show that

$$\begin{aligned} \min _{z\in [0,1]}{{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}|_{x=\tilde{x}})\ge \min _{z\in [0,1]}{{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { IP}}}_{L,L_1}|_{x=\tilde{x}})), \, \forall \, \tilde{x} \in [0,1].\nonumber \\ \end{aligned}$$
(30)

We start with the linear cuts given by the tightened sawtooth relaxation. The sawtooth relaxation is bounded from below by the recursively defined function \(F^L\). From [3, Proposition 1] in Part I it follows that for each L and every breakpoint \(x_i {:}{=}\frac{i}{2^L}\) with \(i=1,\ldots , 2^{L}-1\), there is a function \(F^j-2^{-2j-2}\) with \(j < L\) that lies tangent to \(x^2\) at \(x_i\). These cuts are exactly the McCormick underestimators of the MIP relaxation derived in Lemma 1, \(z\ge 2x(i 2^{-L}) -(i 2^{-L})^2 \text { at } x_i {:}{=}\frac{i}{2^L} \text { for } i=1,\ldots , 2^{L}-1\). As the additional sawtooth cuts for \(L_1\ge L \) tighten both the MIP and LP relaxations, we have

$$\begin{aligned} \min _{z\in [0,1]}{{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { LP}}}_{L,L_1}|_{x=\tilde{x}})\ge&\min _{z\in [0,1]}{{\,\textrm{conv}\,}}({{\,\textrm{proj}\,}}_{(x,z)}(P^{{\text { IP}}}_{L,L_1}|_{x=\tilde{x}})),\\ {}&\quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \forall \, \tilde{x} \in [2^{-L-1},1-2^{-L-1}]. \end{aligned}$$

Finally, consider the boundary regions \([0,2^{-L-1}]\) and \([1- 2^{-L-1},1]\). We show that both the MIP and LP relaxations yield the same minimum value. As established in Lemma 1, D-NMDT provides a piecewise McCormick relaxation. Consequently, for \(x \in [0,2^{-L-1}]\), the minimum value is \(z= 0\). Meanwhile, for \(x \in [1- 2^{-L-1},1]\), the minimum value is \(z= 2x-1\). We further assert that \(z\ge 0\) and \(z\ge 2x -1\) are valid cuts for the LP relaxation. Based on (22), we can deduce the following McCormick cuts for the LP relaxation:

$$\begin{aligned} u_j&{\mathop {\ge }\limits ^{(*)}} x + {\Delta _{x}^L}-(1+2^{-L})(1-\beta _j),&u_j&{\mathop {\ge }\limits ^{(**)}} 0&\forall j=1,\ldots ,L \\ {\Delta _{z}^L}&{\mathop {\ge }\limits ^{(')}} 2\cdot 2^{-L}{\Delta _{x}^L} -2^{-2L},&{\Delta _{z}^L}&{\mathop {\ge }\limits ^{('')}} 0. \end{aligned}$$

From the above, we can estimate \( z= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L} {\mathop {\ge }\limits ^{(**),('')}} 0\) and

$$\begin{aligned} z&= \displaystyle \sum _{j = 1}^L 2^{-j} u_j + {\Delta _{z}^L} {\mathop {\ge }\limits ^{(*)}} (1-2^{-L})(x+{\Delta _{x}^L}-(1+2^{-L})) + (1+2^{-L}) \left( \displaystyle \sum _{j = 1}^L 2^{-j} \beta _j\right) + {\Delta _{z}^L} \\&{\mathop {\ge }\limits ^{(')}} 2x-2\cdot 2^{-L}{\Delta _{x}^L} - 1 + 2^{-2L}+2\cdot 2^{-L}{\Delta _{x}^L} - 2^{-2L} = 2x-1. \end{aligned}$$

This concludes our proof. \(\square \)

Lastly, we state that the univariate T-NMDT is also sharp.

Theorem 3

The univariate T-NMDT MIP relaxation is sharp.

The proof of Theorem 3 works by showing that the projection of D-NMDT is a subset of the projection of NMDT and is stated in Appendix “Sharpness of univariate NMDT”.

5 Computational results

In order to test the MIP relaxations from Sect. 3 with respect to their ability to determine dual bounds, we now perform an indicative computational study. More precisely, we will derive MIP relaxations of non-convex MIQCQP instances. The MIP relaxations are then solved using Gurobi [13] as an MIP solver to determine dual bounds and a callback function that uses the non-linear programming (NLP) solver IPOPT [19] to find a feasible solution for the MIQCQP. The MIP relaxation methods are tested for several discretization depths. To compare the considered methods to state-of-the-art spatial branching based solvers, we also run Gurobi as an MIQCQP solver.

All instances were solved in Python 3.8.3, via Gurobi 9.5.1 and IPOPT 3.12.13 on the ‘Woody’ cluster, using the “Kaby Lake” nodes with two Xeon E3-1240 v6 chips (4 cores, HT disabled), running at 3.7 GHz with 32 GB of RAM. For more information, see the Woody Cluster Website of Friedrich-Alexander-Universität Erlangen-Nürnberg. The global relative optimality tolerance in Gurobi was set to the default value of 0.01% for all MIPs and MIQCQPs.

5.1 Study design

In the following, we explain the design of our study and go into detail regarding the instance set as well as the various parameter configurations.

Instances We consider a three-part benchmark set of 60 instances: 20 non-convex boxQP instances from [5, 8, 11] and earlier works, 20 AC optimal power flow (ACOPF) instances from the NESTA benchmark set (v0.7.0) (see [9]), previously used in [1], and 20 MIQCQP instances from the QPLIB [12]. Appendix D contains links with download options and detailed descriptions of the instances. For an overview of the IDs of all instances, see Table 9. The benchmark set is equally divided into 30 sparse and 30 dense instances. We call an instance dense if the objective function and/or at least one quadratic function in the constraint set is of the form \(x^\top Q x\), where \(x \in \mathbb {R}^n\) comprises all variables of the problem and \(Q \in \mathbb {R}^{n \times n}\) is a matrix with at least 25% of its entries being nonzero.

Parameters For each instance, we solve the resulting MIP relaxation of each method from Sect. 3 using various approximation depths \( L \in \{1, 2, 4, 6\} \) and a time limit of 8 h. All MIP relaxations are solved twice: once in the standard version from Sect. 3 and once in a version with tightened underestimators for univariate quadratic terms, where \( L_1 = \max \{2, 1.5L\} \). Note that the tightened MIP relaxations T-NMDT and T-D-NMDT are equivalent to the non-tightened MIP relaxations NMDT and D-NMDT when applied to bilinear terms of the form \(z=xy\). However, they differ from them in that all lower-bounding McCormick constraints in the univariate quadratic terms of the form \(z=x^2\) are replaced by a tighter sawtooth epigraph relaxation (15), as described in Sects. 3.1 and 3.2. Furthermore, we include HybS, the most performant separable MIP relaxation from Part I, in the study. However, we do not apply tightening to HybS, as it was shown in Part I that this does not result in computational improvements.

In Table 2, one can see an overview of the different parameters in our study. In total, we have 24 parameter configurations for 60 original problems. However, as we do not apply tightening to HybS we end up with 1200 MIP instances. For the comparison with Gurobi as a state-of-the-art MIQCQP solver, we solve an additional 480 MIP instances and 120 MIQCQP instances. These additional MIP instances arise from disabling the cuts in Gurobi for the winner of the NMDT-based methods and HybS. The 120 MIQCQP instances are built by solving all 60 benchmark problems once with cuts enabled and once with cuts disabled.

Table 2 In the study, we consider the parameters cuts, depth, and formulation to create MIP relaxations for 60 MIQCQP instances

See Sect. 5.2.2 for more details on the latter.

Callback function When solving the MIP relaxations, we use a callback function with the local NLP solver IPOPT that works as follows: given any MIP-feasible solution, the callback function fixes any integer variables from the original problem (before applying any of the discretization techniques from this work) according to this solution and then solves the resulting NLP locally via IPOPT in an attempt to find a feasible solution for the original MIQCQP problem.
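A minimal sketch of such a callback in gurobipy is given below. It is our own illustration; solve_fixed_nlp_with_ipopt stands for the routine (not shown) that fixes the original integer variables and solves the remaining NLP locally with IPOPT:

from gurobipy import GRB

def make_primal_callback(original_integer_vars, solve_fixed_nlp_with_ipopt):
    # original_integer_vars: Gurobi variables corresponding to the integer variables
    # of the original MIQCQP (before discretization)
    def callback(model, where):
        if where == GRB.Callback.MIPSOL:
            values = model.cbGetSolution(original_integer_vars)
            fixing = {v.VarName: round(val) for v, val in zip(original_integer_vars, values)}
            solve_fixed_nlp_with_ipopt(fixing)  # may yield a feasible MIQCQP solution
    return callback

# usage: model.optimize(make_primal_callback(integer_vars, solve_fixed_nlp_with_ipopt))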

5.2 Results

In the following, we present the results of our study. In particular, we aim to answer the following questions regarding dual bounds:

  • Is our enhanced method D-NMDT computationally superior to its predecessor NMDT?

  • Is it beneficial to use tightened versions of the NMDT and D-NMDT, i.e., to choose \(L_1>L\)?

  • How do the studied methods compare to the state-of-the-art MIQCQP solver Gurobi?

We provide performance profile plots as proposed by Dolan and Moré [10] to illustrate the results of the computational study regarding the dual bounds, see Figs. 4, 5, 6, 7, 8 and 9. The performance profiles work as follows: Let \(d_{p,s}\) be the best dual bound obtained by MIP relaxation or MIQCQP solver s for instance p after a certain time limit. With the performance ratio \(r_{p,s} {:}{=}d_{p,s} / \min _s d_{p,s}\), the performance profile function value \(P(\tau )\) is the percentage of problems solved by approach s such that the ratios \(r_{p,s}\) are within a factor \(\tau \in \mathbb {R}\) of the best possible ratios. All performance profiles are generated with the help of Perprof-py by Siqueira et al. [17]. The plots are divided into two blocks, one for NMDT-based methods and one for the comparison against HybS and Gurobi as an MIQCQP solver. In addition to the performance profiles across all instances, we also show performance profiles for the dense and sparse subsets of the instance set. Please note that in minimization problems, the higher the value of a dual bound, the better it is. Since lower values are considered better in performance profiles, we simply take the inverse of the dual bound as the value to be compared.

Although the main criterion of the study is the dual bound, we also discuss run times. Here, we use the shifted geometric mean, which is a common measure for comparing different MIP-based solution approaches. The shifted geometric mean of n numbers \(t_1,\ldots , t_n\) with shift s is defined as \(\big (\prod _{i=1}^n (t_i+s)\big )^{1/n} - s\). It has the advantage that it is neither affected by very large outliers (in contrast to the arithmetic mean) nor by very small outliers (in contrast to the geometric mean). We use a typical shift of \(s = 10\). Moreover, we only include those instances in the computation of the shifted geometric mean for which at least one solution method delivered an optimal solution within the run time limit of 8 h.
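For reference, the shifted geometric mean can be computed as in the following small helper (our own sketch of the definition above):

import math

def shifted_geometric_mean(run_times, shift=10.0):
    # (prod_i (t_i + shift))^(1/n) - shift, computed via logarithms for numerical stability
    n = len(run_times)
    return math.exp(sum(math.log(t + shift) for t in run_times) / n) - shift

print(shifted_geometric_mean([2.0, 40.0, 28800.0]))  # one run hitting the 8 h time limit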

Finally, we will highlight some important results regarding primal bounds in the comparison of our methods with Gurobi [13] as an MIQCQP solver.

5.2.1 NMDT-based MIP relaxations

We start our analysis of the results by looking at the dual bounds, run times and feasible solutions found by the NMDT-based MIP relaxations.

Dual bounds As mentioned before, MIP relaxations are primarily used to deliver (tight) dual bounds for MIQCQPs. Thus, we now compare the tightness of the dual bounds provided by the various methods. To this end, we compute relative optimality gaps \(g_{p,s} {:}{=}|d_{p,s} - b_{p}| / |b_{p}|\) for all methods s (with a certain L value) and instances p of the benchmark set, where \(d_{p,s}\) is the corresponding best dual bound found by method s and \(b_{p}\) is the best known primal bound for instance p. Looking at Table 3, which displays the arithmetic (left) and geometric (right) means of the relative optimality gaps for all 60 instances of the benchmark set for the NMDT-based methods, several observations can be made. Across all instances, the gap generally decreases with increasing \( L \) values, although there are exceptions. While the outcomes for the arithmetic mean, in which outliers play a greater role, are mixed, the outcomes for the geometric mean are clear. Here, T-D-NMDT is the winner, exhibiting the best geometric means for all \( L \) values.

Additionally, in Fig. 4 we show performance profiles for the dual bounds obtained by the different NMDT-based MIP relaxations. Starting from \(L=2\), we can see that both D-NMDT and T-D-NMDT deliver notably tighter bounds within the run time limit of 8 h. The largest difference is at \(L=4\), where D-NMDT and T-D-NMDT are able to find dual bounds that are within a factor of 1.05 of the overall best bounds for nearly all instances. In contrast, NMDT and T-NMDT require a corresponding factor of more than 1.1. In addition, the tightened versions perform somewhat better than their untightened counterparts, especially for \(L=4\).

Table 3 Arithmetic (left) and geometric (right) mean of relative optimality gaps (in %) on all instances for NMDT-based methods
Fig. 4 Performance profiles for dual bounds of NMDT-based methods on all instances

To gain a deeper insight into the benefits of D-NMDT and the tightening of NMDT-based relaxations, we divide the benchmark set into sparse and dense instances.

For sparse instances, the advantage of the new methods is rather small; see the performance profiles in Fig. 5. Here, T-D-NMDT provides marginally better bounds than the other methods for \(L=4\) and \(L=6\). For \(L=1\) and \(L=2\), however, T-NMDT dominates all other approaches. Moreover, the tightened versions outperform their counterparts for all depths L. Regarding the arithmetic and geometric means, the gaps consistently decrease with increasing \( L \) values for all methods, indicating improved performance. T-NMDT and T-D-NMDT are generally competitive, with T-D-NMDT having the best geometric mean in all cases.

For dense instances, D-NMDT and T-D-NMDT are clearly superior to NMDT and T-NMDT; see Fig. 6. Regardless of the relaxation depth, the new methods yield the tightest dual bounds, with T-D-NMDT being superior to D-NMDT only for \(L=2\), where the tightened version T-D-NMDT is able to find the best dual bound for roughly 10% more instances than D-NMDT. Tightening the NMDT method does not deliver better bounds; in fact, T-NMDT is surpassed by NMDT for \(L=1\). Regarding the arithmetic and geometric means, there is no clear trend of improvement with increasing \( L \) values. NMDT has the best arithmetic means for \( L=1 \), \( L=2 \) and \( L=4 \), while other methods perform best at the remaining \( L \) values. In summary, the performance of the NMDT-based methods varies depending on the \( L \) value and the density of the instances. For sparse instances, the performance consistently improves with increasing \( L \) values, while for dense instances, no clear trend is discernible. Overall, T-D-NMDT shows the best performance in most scenarios. Tightening leads to an improvement in dual bounds across all instances, but this is mainly due to the sparse instances. For dense instances, we assume that the large number of additional cuts in the tightened variants leads to slower computations and thus worse bounds.

Fig. 5 Performance profiles for dual bounds of NMDT-based methods on sparse instances

Fig. 6 Performance profiles for dual bounds of NMDT-based methods on dense instances

Run times Table 4 shows the shifted geometric means of the run times. Across all instances, T-D-NMDT consistently outperforms the other methods, indicating its efficiency, especially at \( L=1 \) and \( L=2 \). As \( L \) increases, the run times generally rise for all methods, but T-D-NMDT remains the fastest. For sparse instances, T-D-NMDT retains its edge in efficiency, which is especially evident at \( L=6 \). However, the run times of the other methods, particularly NMDT, increase significantly. For dense instances, T-D-NMDT is the fastest for small values of L, while D-NMDT takes the lead at \( L=4 \) and \( L=6 \). This observation is in line with the increasing gaps for the tightened variants. Despite these fluctuations, T-D-NMDT remains the most efficient method across most scenarios.

Table 4 Shifted geometric mean for run times on all 60 instances in NMDT-based MIP relaxations

Feasible solutions In Table 5, we can see that the QP heuristic (IPOPT) mentioned at the beginning of this section delivers high-quality feasible solutions for the original (MIQC-)QP instances. With increasing L values, IPOPT is able to find more feasible solutions, with all NMDT-based methods performing quite similarly. For \(L=6\), T-D-NMDT combined with IPOPT yields feasible solutions for 50 out of 60 benchmark instances, 47 of which have a relative optimality gap below 1% and 46 of which are even globally optimal, i.e., have a gap below 0.01%.

Table 5 Number of instances with feasible solutions found with different relative optimality gaps.

In summary, both T-D-NMDT and D-NMDT are clearly superior to the previously known NMDT approach. The double discretization and the associated reduction in the number of binary variables at the same relaxation error are most likely the reason for this. Surprisingly, the tightening of the lower bounds in the univariate quadratic terms and the resulting introduction of new constraints do not lead to higher run times; thus, tightening is recommended. Moreover, T-D-NMDT is slightly ahead of the other methods in computing good solutions for the MIP relaxations, which are used by the NLP solver IPOPT to find feasible solutions for the original MIQCQP instances. Altogether, we consider T-D-NMDT to be the winner among the NMDT-based methods.

5.2.2 Comparison with the state-of-the-art MIQCQP solver Gurobi

Finally, we compare T-D-NMDT and HybS, the respective winners among the NMDT-based methods and the separable methods from Part I, with the state-of-the-art MIQCQP solver Gurobi 9.5.1. We perform the comparison in two ways: firstly, with Gurobi's default settings, and secondly, with cuts disabled, i.e., with the parameter "Cuts = 0". The reason for running Gurobi again with cuts turned off is that cuts are one of the most important components of MIQCQP/MIP solvers that exploit the structure of the problem. When constructing the MIP relaxations with T-D-NMDT and HybS, the original problem is transformed in such a way that Gurobi can no longer recognize its original quadratic structure. However, many cuts, for instance RLT and PSD cuts, would still be valid and applicable in the MIP relaxations.
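For reference, disabling cuts corresponds to the following minimal gurobipy sketch; the model file name is a placeholder and the snippet is not taken from the study's code.

import gurobipy as gp

model = gp.read("mip_relaxation.lp")   # a previously exported MIP relaxation (placeholder name)
model.setParam("TimeLimit", 8 * 3600)  # 8-hour run time limit as in the study
model.optimize()                       # default run, all Gurobi cuts enabled

model.reset()
model.setParam("Cuts", 0)              # "no cuts" run, corresponding to the -NC variants
model.optimize()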

Dual bounds We start our comparison by showing performance profiles for Gurobi, T-D-NMDT, HybS, and their variants without cuts ("-NC") on all instances in Fig. 7. As expected, Gurobi performs best for all L values, followed by its variant without cuts in second place. However, as the depth L increases, the MIP relaxations provide gradually tighter dual bounds. For \(L=6\), T-D-NMDT and HybS are able to find the best dual bounds in more than 50% of the cases, while Gurobi delivers the best bounds for roughly 90% and its variant without cuts for about 70% of the cases. Surprisingly, in contrast to T-D-NMDT, disabling cuts in the case of HybS has little effect on the quality of the dual bounds. In Table 6, we display the arithmetic and geometric means of the relative optimality gaps for the various methods and their "no cuts" (NC) versions. The findings are in line with those from the performance profiles.

As before, we divide the benchmark set into sparse and dense instances. For sparse instances, the dual bounds computed by T-D-NMDT and HybS become progressively tighter with increasing L; see Fig. 8. For \(L=4\) and \(L=6\), T-D-NMDT and HybS are able to find the best dual bounds in about 60% of the instances, while Gurobi delivers the best bounds for roughly 80%. Compared to Gurobi-NC, our new methods T-D-NMDT, HybS, and most notably HybS-NC perform almost equally well. Regarding the means in Table 6, the cuts contribute minimally to improved gaps. The differences between the versions with and without cuts are marginal. For HybS, the results even indicate that the cuts can be counterproductive.

Fig. 7 Performance profiles for dual bounds of best MIP relaxation compared to Gurobi as MIQCQP solver, with and without cuts, on all 60 instances

Table 6 Arithmetic (left) and geometric (right) mean of relative optimality gaps (in %) on all instances

In the case of dense instances, a different picture emerges; see Fig. 9. Again, Gurobi and also Gurobi-NC are dominant for all approximation depths, and thanks to the cuts, Gurobi can solve all instances to a gap of 0%, see Table 6. However, for \(L=1\), T-D-NMDT delivers dual bounds that are within a factor of 1.1 of the dual bounds provided by the variant of Gurobi without cuts. With higher L values, T-D-NMDT, HybS, and HybS-NC compute the best bounds in about 40% of the cases, while Gurobi yields the best bounds in all cases and Gurobi-NC in roughly 70% of the instances. When comparing D-NMDT and HybS, D-NMDT demonstrates advantages for smaller \( L \) values, whereas HybS delivers better gaps for larger \( L \) values.

Table 7 Shifted geometric mean for run times on all instances for best MIP relaxation compared to Gurobi as MIQCQP solver with cuts and without cuts (-NC)
Fig. 8 Performance profiles for dual bounds of best MIP relaxation compared to Gurobi as MIQCQP solver, with and without cuts, on sparse instances

Run times In Table 7, we show the shifted geometric means of the run times for solving all instances with Gurobi and with the corresponding MIP relaxations constructed with T-D-NMDT and HybS. The variants of Gurobi, T-D-NMDT, and HybS without cuts are also included. Gurobi has significantly shorter run times than all other approaches. However, for \(L=1\) and \(L=2\), T-D-NMDT, HybS, T-D-NMDT-NC and HybS-NC are somewhat faster than Gurobi-NC.

Remark 4

Note that for calculating the shifted geometric mean, only those instances are used for which at least one method computed the optimal solution within the run time limit of 8 h. Since the complexity of the MIP relaxations increases with higher L values, fewer instances are solved to optimality by T-D-NMDT and HybS, so the set of instances entering the mean shrinks to the easier ones. Therefore, the shifted geometric mean decreases for Gurobi and Gurobi-NC with higher L values. This inherent property of the shifted geometric mean computation is also the reason why Tables 7 and 4 show different values for the same methods.\(\diamond \)

Fig. 9 Performance profiles for dual bounds of best MIP relaxation compared to Gurobi as MIQCQP solver, with and without cuts, on dense instances. See Fig. 8 for the legend

Feasible solutions In combination with IPOPT as a QP heuristic, T-D-NMDT, HybS, and their variants without cuts are competitive with Gurobi for high L values when it comes to finding feasible solutions, as Table 8 shows. HybS-NC with IPOPT is able to find feasible solutions with a relative optimality gap below 1% for 48 out of 60 benchmark instances, while Gurobi finds 50 feasible solutions with a gap below 1%. T-D-NMDT computes 46 solutions that are globally optimal, whereas Gurobi achieves this for 50 instances. Surprisingly, the variant of HybS without cuts delivers more feasible solutions than the variant with cuts enabled. Finally, we note that some MIQCQP instances have been solved to global optimality by the MIP relaxation methods, while Gurobi reached the run time limit of 8 h. For instance, T-D-NMDT with IPOPT is able to solve the QPLIB instance "QPLIB_0698" to global optimality for \(L \in \{2,4,6\}\) with a run time below 5 min, while Gurobi has a relative optimality gap of more than 5% after a run time of 8 h.

Overall, the comparison with Gurobi as a state-of-the-art MIQCQP solver has shown that the new methods T-D-NMDT and HybS can be relevant for practical applications. For sparse instances, the dual bounds provided by T-D-NMDT and HybS are of similar quality to those provided by Gurobi. In terms of MIQCQP-feasible solutions, the two methods are able to find very high-quality solutions for most instances in combination with IPOPT as an NLP solver.

Moreover, there is still plenty of room for improvement. First, previous numerical studies have shown that an adaptive refinement of nonlinearities drastically decreases run times for solving MINLPs via piecewise linear MIP relaxations; see [6] for example. Hence, an approach with an adaptive refinement of the approximation depth L is even more promising. Second, HybS and its variant without cuts, HybS-NC, have performed very similarly in our computational study. In addition, HybS-NC was relatively close to Gurobi-NC in both solution quality and dual bounds for the MIQCQPs. Since most MIQCQP-specific cuts can still be integrated into the HybS approach, we believe that HybS can be further improved by embedding it in a branch-and-cut solution framework that is able to add MIQCQP-specific cuts, such as BQP and PSD cuts, to the MIP relaxations. In this way, we would obtain both tighter dual bounds and MIP relaxation solutions that are more likely to yield feasible solutions for the MIQCQP in combination with IPOPT.

Table 8 Number of feasible solutions found with different relative optimality gaps

6 Conclusion

We introduced an enhanced mixed-integer programming (MIP) relaxation technique for non-convex mixed-integer quadratically constrained quadratic programs (MIQCQPs), called the doubly discretized normalized multiparametric disaggregation technique (D-NMDT). We showed that it has clear theoretical advantages over its predecessor NMDT; in particular, it requires significantly fewer binary variables to achieve the same accuracy. In addition, we combined both D-NMDT and NMDT with the sawtooth epigraph relaxation from Part I [3] to further strengthen the relaxations for univariate quadratic terms.

In a two-part computational study, we first compared D-NMDT to NMDT. We showed that D-NMDT delivers far better dual bounds than NMDT and also has shorter run times. Furthermore, we were able to show that our tightening of both methods led to better dual bounds while simultaneously shortening the computation time. In the second part of the computational study, we compared the tightened D-NMDT (T-D-NMDT) against Hybrid Separable (HybS), the best-performing MIP relaxation from Part I. We showed that HybS performs slightly better in terms of dual bounds. However, both new methods were able to find high-quality solutions to the original quadratic problems when used in conjunction with a primal solution callback function and a local non-linear programming solver. Furthermore, we showed that both methods can partially compete with the state-of-the-art MIQCQP solver Gurobi.

Finally, we gave some indications on how to further improve the new approaches. Two of the most promising directions in this context are employing adaptivity and adding MIQCQP-specific cuts that are valid but not recognized by the MIP solvers. This is the subject of future work.