Multivariate McCormick relaxations

McCormick (Math Prog 10(1):147–175, 1976) provides the framework for convex/concave relaxations of factorable functions, via rules for the product of functions and compositions of the form F∘f\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F\circ f$$\end{document}, where F\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F$$\end{document} is a univariate function. Herein, the composition theorem is generalized to allow multivariate outer functions F\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$F$$\end{document}, and theory for the propagation of subgradients is presented. The generalization interprets the McCormick relaxation approach as a decomposition method for the auxiliary variable method. In addition to extending the framework, the new result provides a tool for the proof of relaxations of specific functions. Moreover, a direct consequence is an improved relaxation for the product of two functions, at least as tight as McCormick’s result, and often tighter. The result also allows the direct relaxation of multilinear products of functions. Furthermore, the composition result is applied to obtain improved convex underestimators for the minimum/maximum and the division of two functions for which current relaxations are often weak. These cases can be extended to allow composition of a variety of functions for which relaxations have been proposed.

objective function and the constraints. In principle it would be desirable to construct the several generalizations. The generalization of McCormick relaxations presented here, makes the relationship of the two approaches explicit, yielding a, to the best of our knowledge previously unknown, interpretation of McCormick Relaxation's as a decomposition method to solve the relaxed NLP constructed by AVM. We note that such decomposition methods for AVM have not been implemented by the global optimization community.
The proposed generalization allows a more direct relaxation of the product of functions, which proves to be at least as tight and in some cases tighter than McCormick's product rule. It also allows the direct relaxation of multilinear product of functions, i.e., without resorting to recursive application of the bilinear rule. Similarly, the proposed theorem results in at least as tight and often tighter relaxations for the minimum/maximum and the division of two functions.
The rest of the paper is organized as follows. In Sect. 2 we review McCormick's Composition Theorem and we give its generalization to multivariate outer functions, while in Sect. 3 we provide a way to propagate subgradient information. In Sect. 4 we discuss the relationship with AVM. We apply our results to compute relaxations of the product of two functions in Sect. 5, the minimum/maximum of two functions in Sect. 6 and the division of two functions in Sect. 7. We conclude and discuss future directions in Sect. 8.

Convex underestimator theorems
Theorem 1 is the main result in McCormick [23] and constructs convex/concave relaxations of composite functions where the outer function is univariate. Therein, mid(α, β, γ ) gives the median of three real numbers; in the trivial case that α = β = γ we have mid(α, β, γ ) = α; otherwise it is the numerical value that is smaller than the maximum and/or larger than the minimum. [23]) Let Z ⊂ R n and X ⊂ R be nonempty compact convex sets. Consider the composite function g = F • f (·) where f : Z → R, F : X → R and let f (Z ) ⊂ X . Suppose that convex/concave relaxations f cv , f cc : Z → R of f on Z are known. Let F cv : X → R be a convex relaxation of F on X and let x min ∈ X be a point where F cv attains its minimum on X . Thenḡ cv : Z → R,

Theorem 1 (McCormick composition theorem
is a convex relaxation of g on Z .
A similar theorem exists for the concave relaxation. Below we give an equivalent, yet more convenient to generalize, definition of the McCormick relaxation.
For the functionḡ cv defined by (1) there holds g cv (z) = g cv (z) for all z ∈ Z.
Proof Note that we clearly have f cv (z) ≤ f cc (z) for all z ∈ Z and that since f (Z ) ⊂ X there holds [ f cv (z), f cc (z)] ∩ X = ∅ for all z ∈ Z . Furthermore, let x min be the minimum of F cv in X .
We consider all three cases. If mid{ f cv (z), f cc (z), x min } = x min then we have If on the other hand mid{ f cv (z), f cc (z), x min } = f cv (z) we note that f cv (z) ≤ f cc (z) and thus x min ≤ f cv (z) ≤ f cc (z). Since F cv is convex it must be nondecreasing for x ≥ x min [30]. In addition we have x min ≤ f cv (z) ≤ f (z) and therefore f cv (z) ∈ X . Thus, we have Similarly, if mid{ f cv (z), f cc (z), x min } = f cc (z) we have f cv (z) ≤ f cc (z) ≤ x min . Since F cv is convex it must be nonincreasing for x ≤ x min .
In addition we have f (z) ≤ f cc (z) ≤ x min and therefore f cc (z) ∈ X . Thus, we have Theorem 2 gives a generalization of Theorem 1 for multivariate outer functions. Its proof makes use of Lemma 1, which we will also use in the development of subgradient propagation in Sect. 3. Note that ∂g(x) denotes the subdifferential of g at x, i.e., the set of all subgradients.
Lemma 1 [17] Let f 1 , . . . , f m be m convex functions from R n → R and let F be a convex and non-decreasing function from R m → R. Then g(x) = F( f 1 (x), . . . , f m (x)) is a convex function. Furthermore Theorem 2 Let Z ⊂ R n and X ⊂ R m be nonempty compact convex sets. Consider the composite function g = F( f 1 (z), . . . , f m (z)), where F : X → R and for i ∈ I = {1, . . . , m}, f i : Z → R are continuous functions, and let Suppose that convex relaxations f cv i : Z → R and concave relaxations f cc i : Z → R of f i on Z are known for every i ∈ I . Let F cv : X → R be a convex relaxation of F on X and F cc : X → R be a concave relaxation of F on X . Then g cv : Z → R, is a convex relaxation of g on Z and g cc : Z → R, is a concave relaxation of g on Z .
Proof First we prove that g cv underestimates g on Z . Using (3) and the fact that f cv . . , f m (z)) = g(z).
Next we prove that g cv is convex. Consider the function h defined on X × X by From convexity of F cv , the function h is increasing and convex as a perturbation function of a convex problem [8]. Observing that . . , − f cc m (z) and applying Lemma 1 we obtain convexity of g cv . Note, that the negative sign at the right hand side of the second constraint in the definition of h, was chosen conveniently to negate the concave terms f cc i and decompose g cv to a convex function of convex functions. The proof for g cc is analogous.
We note that g cv /g cc is not in general the convex/concave envelope of g even if F cv , f cv are the convex envelopes and F cc , f cc the concave envelopes of F, f respectively, see e.g., Fig. 1.
The definitions of g cv /g cc at a point z involve the minimization/maximization of the convex/concave relaxation F cv , F cc of F, where the convex/concave relaxations are computed over X and the optimization is over X ∩ B where B is a box defined by f cv i (z), f cc i (z). This is typically a relatively easy convex problem to solve as F cv , F cc are usually simple functions. In many cases, including the binary product of functions (Sect. 5), the solution can be described as a function of z in closed form.
Similarly to McCormick's relaxations, nested functions can be handled by recursive application of the theorem and do not present any difficulty. The only requirement is the availability of closed form solutions or reliable algorithms to solve the convex problems.
For the rest of the paper, unless otherwise stated, we assume that Note that this is without loss of generality as we can always takē More specifically, if we assume that X is a box defined as f L Corollary 3 gives a simplified version of Theorem 2 in the case of monotonicity, which we also utilize to compute convex/concave relaxations for the minimum/maximum of two functions, Sect. 6.

Corollary 3 If in addition to the assumptions of Theorem 2, Assumption (4) holds and 1. F cv is monotonic increasing then
is a convex relaxation of g. 2. F cv is monotonic decreasing then is a convex relaxation of g. 3. F cc is monotonic increasing then is a concave relaxation of g. 4. F cc is monotonic decreasing then is a concave relaxation of g.
The convexity of g cv and concavity of g cc in this case is well known, e.g., [8].

Subgradient propagation
Theorem (2) allows the evaluation of the convex/concave relaxation of an arbitrary composite function at a point z, provided that convex/concave relaxations of the intrinsic functions are available. As demonstrated in Mitsos et al. [30] the calculation of subgradients of the convex/concave relaxations is useful. In this section the results of Mitsos et al. [30] are generalized to multivariate outer-functions and to the entire subdifferential.

Lemma 2 [Adapted from strong duality theorem.] Consider the problem
with F cv convex. Then u = λ cv λ cc is an optimal solution of the dual problem D(χ cv ,χ cc ) given by Proof The proof is based on the proof of the strong duality theorem in [14]. If u = λ cv λ cc ≥ 0 solves the dual then min x∈X F cv (x) + u T −x +χ cv x +χ cc = h(χ cv ,χ cc ).
For the converse assume u = λ cv λ cc ∈ ∂h(χ cv ,χ cc ). Noting that h is non-decreasing, let e i denote the ith unit vector and observe for all i, and thus u ≥ 0 and u is dual feasible. For anyx ∈ X letχ = x −x . We have or rearranging and therefore On the other hand, weak duality yields the opposite inequality and thus equality holds and u is optimal.

Theorem 4
The subdifferential of g cv atẑ is given by Proof The subdifferential of the convex function − f cc i is given by the result follows from Lemmata 1,2.
We note that in some cases, including fractional (Sect. 7) and multilinear (Sect. 5) terms, a tighter convex relaxation F cv of F than the ones available in closed form, can be calculated through a convex optimization problem of the form The convex underestimator of g will be given by g cv (z) = min x∈X,w∈W where the defining problem has Lagrangian (6) We can calculate the subgradient of g cv (z) using Theorem 4 using the Lagrangian multipliers associated with the constraints f cv . This is formalized in Proposition 2.

Proposition 2
If strong duality holds for (5) and x,ŵ, λ cv ,λ cc ,μ is an optimal primal dual pair of (5) then x, λ cv ,λ cc is an optimal primal dual pair for the problem with Lagrangian where the constraints defining the box X are not dualized.
Keeping in mind (12) and (10) to show that x, λ cv ,λ cc is an optimal point with its corresponding Lagrangian multipliers of (7) we only need Assume to the contrary that there exist anx ∈ X with which is a contradiction since (x,ŵ) ∈ arg min x∈X,w∈WL x, w,λ cv ,λ cc ,μ .

McCormick relaxations and the auxiliary variable method
In this section we revisit the relationship between McCormick relaxations and the AVM [41,42]. AVM lies at the heart of the state of the art software BARON [35,36], and handles composite functions implicitly by a substitution of argument functions with auxiliary variables.
While it is well known that both methods provide lower bounding mechanisms for factorable functions, the restatement of McCormick Relaxations in Theorem 1 and the subsequent generalization makes the relationship between the two approaches explicit and the occasional gap in relaxations smaller.
As mentioned in the introduction, an advantage of AVM compared to McCormick's approach is the potentially tighter bounds due to repeated terms. While multivariate McCormick can provide better bounds than univariate McCormick it can still be weaker than AVM due to the same reasons. A case where multivariate McCormick can provide tighter bounds than AVM, is if tighter convex relaxations can be made practically available through optimization problems as is the case for fractional terms discussed in Sect. 7.
McCormick's approach allows for optimization of the bounding problem in the original space. While there is no general rule dictating that a smaller number of variables will lead to superior performance, it has been demonstrated that in a class of problems with few variables and complex expressions, operating in the original space can give a drastic improvement of CPU time [30].
To illustrate the relationship of the two methodologies, consider the functions f 1 : Note that the univariate McCormick theorem cannot handle this directly.
To solve {min z∈Z g(z)} AVM could formulate the problem in two different ways, depending on whether it would recognize the common term f 3 (z).

Formulation 1 Formulation 2 min
. (13) with corresponding convex relaxations (14) Formulation 2 is tighter and will likely give a better bound. It is not hard to see that multivariate McCormick will give the same bound with Formulation 1 by solving the problem Equation (15) can be interpreted as a decomposition method of the first formulation of (14). In general all inner problems will be easy to solve analytically and a numerical algorithm will only be needed to minimize the resulting relaxation with respect to the original variable z.
In the multivariate McCormick's Framework it is possible to introduce just sufficiently mny artificial variables to improve the resulted relaxation and match the AVM. In our example this would yield the problem (16) yielding the same bound with AVM while increasing the optimization space by a single variable.
It is instructive to consider only one level of composition, i.e., a function F( f (z)), to compare the two methodologies: as it turns out, the use of a cutting plane algorithm to minimize McCormick relaxations using the subgradient propagation mechanism of Sect. 3, is strongly related to applying generalized benders decomposition [15], on the lower bounding problem defined by AVM.
To minimize F( f (z)) AVM would formulate the problem min x∈X,z∈Z If we apply (generalized) Benders decomposition on (17) treating z as the "complicating" variables, the master problem is for all z.
The restricted master V r (z) employs a subset of multipliers. Aẑ obtained by solving the restricted master will be suboptimal if The cut obtained is where λ cv ,λ cc ∈ arg max cutting offẑ. The generalized-Benders cut (19) can be further relaxed by linearization around z, yielding, It can be seen that the linearized cut is equivalent with a subgradient inequality for g cv as obtained by Theorem 4. Note, that the generalized Benders' subproblem (18) generating the cut is identical with (the dual of) the problem solved to provide a function evaluation and generate the subgradient. Therefore, in the single level composition, applying generalized Benders decomposition to AVM is equivalent to minimizing g(z) through a first-order algorithm which at iteration k + 1 chooses point z k+1 for evaluation by solving the linear relaxation where g(z i ), s i are the function evaluation and subgradient returned by the oracle at iteration i. This is not a very efficient algorithm to minimize g(z) and here we use it just to illustrate the equivalence. More efficient first-order methods can be found, for example, in [31]. We note, that in the (univariate and multivariate) McCormick Relaxation framework, it is straightforward to apply Theorem 4 recursively to generate subgradients for nested compositions of functions, This is not to say that it would be impossible to construct equivalent nested decomposition schemes for AVM, which has a staircase structure. In the context of stochastic programming, Birge [6] explores nested decomposition schemes for LPs. In the presence of NLPs with a staircase structure, O'Neill [32] proposes a decomposition framework combining primal and dual decomposition ideas, which is however rather involved.
Note also that the advantage of retaining the original variable space is important only in the case of such complex expressions that would need an introduction of a great number of variables to construct the AVM equivalent. Thus, the above observations on the equivalence of the two approaches for a single composition are mainly of theoretical interest.

Product rule
An interesting example of multivariate composition are products of functions. Note that bilinear terms and bilinear products of functions are very important in applications, see for instance the recent articles [19,28]. Let mult(x 1 , x 2 ) = x 1 x 2 . As given in [3,23], the convex/concave envelopes of mult(·, ·) on Theorem 2 directly gives convex/concave relaxations for the product of two functions.
is a convex relaxation of g on Z and g cc (z) = max is a concave relaxation of g on Z .
The convex relaxation for g that McCormick proposed [23] is where where . Proposition 3 shows that the proposed relaxations g cv /g cc are always at least as tight as McCormick's ruleḡ cv /ḡ cc , while Fig. 1 shows that they can be tighter. Proposition 3 g cv (z) ≥ḡ cv (z) for all z ∈ Z and g cc (z) ≤ḡ cc (z) for all z ∈ Z.
Proof Using the well-known fact, e.g., [51], that for any function φ(x, y) defined on X × Y there holds by interchanging the minimization and maximization operators we obtain , , (20) is tighter than McCormick's relaxationḡ cv given by Eq. (22) The proof that g cc (z) ≤ḡ cc (z) for all z ∈ Z is similar and is omitted for brevity.
Note that the first inequality in (24) can be strict only if f L that is, only if Assumption 4 does not hold. Scott and Barton [38] have observed thatḡ cv can be tightened by intersecting with the interval bounds. However, from the definition of g cv we have that g cv is at least as tight and in some cases tighter than the result by Scott and Barton. If f U 1 = f L 1 or f U 2 = f L 2 , at least one of the functions is constant and the computation of the convex and concave envelopes of their product is trivial.
The convex relaxations obtained by Eqs. (20), (21) can be represented in closed form. If f U 1 > f L 1 and f U 2 > f L 2 , g cv (z) can be shown to be given by Similarly, if f U 1 > f L 1 and f U 2 > f L 2 then g cc (z) is given by In addition to bilinear products of functions, often multilinear products of functions are used in applications. The class of functions considered herein can be summarized as G(z) = t∈T c t i∈I t f i (z), where T and I t ⊂ I are index sets and c t are constants. Such functions can be handled by recursive application of McCormick's product rule and these approaches give weaker than possible relaxations, compare for instance [4,5,9,25]. In contrast, Theorem 2 provides the framework to directly handle such terms and provide tighter relaxations. Herein, only the convex relaxations are discussed; the concave relaxations are analogous.
Rikun [34] considers F : R n → R, F(x) = t∈T c t i∈I t x i on a hypercube X = He proves that the convex envelope F cv,env at a point x can be evaluated by the following optimization problem where x k denote the vertices of X . Note that this is a LP, albeit of size m. Note also that explicit representations exist for subclasses of this function such as the explicit facets of trilinear terms by Meyer and Floudas [25]. By Theorem 2 a convex relaxation of G on Z can be constructed as Noting that min y min w h(y, w) = min y,w h(y, w) we obtain which is still an LP of similar size. By Proposition 2, Theorem 4 can still be used for the computation of subgradients if we take into account in the construction of σ cv only the Lagrangian multipliers associated with the constraints f cv

Convex/concave envelopes and relaxations of min/max operators
The operators min and max often arise in engineering optimization formulations. It is wellknown that the minimum of concave functions is a concave function, but the same does not hold in general for convex functions. To the authors' best knowledge relaxations for such functions are not available in literature or in most numerical codes. For instance, the operators min / max are currently not handled by the state-of-the-art general-purpose solvers BARON [36,49] and ANTIGONE [27][28][29], while in MC++ [10] they are handled using the well-known reformulation and applying the univariate McCormick composition theorem to the negative absolute value. However, the constructed relaxations are not as tight as the ones proposed here. Calculating interval enclosures for the function min( f 1 (x), f 2 (x)) given interval enclosures for f 1 and f 2 is straightforward and is also done in MC++ [10].

Proposition 4
Consider Z ∈ R n and f 1 , f 2 : Z → R. Suppose that interval enclosures are given for f 1 and f 2 on Z , i.e., bounds f L Then we have It is noteworthy that these bounds are not exact, as shown in the next example We can utilize Corollary 3 to compute convex/concave relaxations for the minimum/maximum of two functions. The computation of convex/concave relaxations of the minimum/maximum of two functions by Theorem 2 requires the convex/concave envelopes of min(x 1 , x 2 )/ max(x 1 , x 2 ) on an arbitrary rectangle, which is easy to derive to the authors' best knowledge is not explicitly available in the literature. x 2 ). The convex envelope of min(x 1 , x 2 ) on Z is given by min cv : Z → R, min cv (x 1 , x 2 ) = max min cv,1 (x 1 , x 2 ), min cv,2 (x 1 , x 2 ) with and the concave envelope of max(x 1 , x 2 ) on Z is given by max cc : Z → R, The proof is in the Appendix.
A convex relaxation of the maximum of two functions is trivially given by the maximum of the convex relaxations of the two functions and a concave relaxation of the minimum of two functions as the minimum of the concave relaxations of the two functions.

Proposition 5
Consider Z ∈ R n and g 1 , f 2 (z)). Suppose that interval enclosures are given for f 1 and f 2 on Z , i.e., bounds f L and convex and concave relaxations such that

Recall that Proposition 4 gives interval enclosures for f on Z . The following procedure defines a convex relaxation g cv
Furthermore, the following procedure defines a concave relaxation g cc Proof Since min(·, ·) and max(·, ·) are monotonic increasing the result follows by Corollary 3.
Note that there is no guarantee that the proposed relaxation is the envelope even if the estimators of the factors are, as shown in Fig. 2.
Reformulating min(·, ·) and max(·, ·) operators using the absolute value of the difference, results in weak natural interval extensions and also weaker McCormick relaxations as shown in Proposition 6. Figure 2 shows that the inequality in Proposition 6 can be strict.

Proposition 6
Consider Z ∈ R n and f 1 , Suppose that interval enclosures are given for f 1 and f 2 on Z , i.e., bounds f L and convex/concave relaxations such that Proof The proof is given in the Appendix Note that both f 1 and f 2 have range [0, 1] and that both are convex and as such f cv 1 = f 1 and f cv 2 = f 2 are the convex envelopes. We have g 1 (z) = z 2 and this is its convex envelope. It follows from Proposition 5 that g cv 1 : Z → R, g cv 1 (z) = max(0, z 2 + z − 1) is a convex relaxation of g 1 on Z . Note that it does not give the envelope although envelopes are used for the factors. The reformulation via the absolute value furnish a valid but less tight relaxationḡ cv,abs 1 Relaxations of min ( f 1 , . . . , f m ) can be computed either recursively, or by direct application of Theorem 2 if an envelope/relaxation of min (x 1 , . . . , x m ) is available on the appropriate domain. In [0, 1] m , for example, it can be shown that the convex envelope of min (x 1 , . . . , x m ) is max(0, i x i − n + 1). If the relaxation for the multiterm operator is the envelope, direct application of the multivariate composition will result in at least as tight relaxations. If in contrast the relaxations for the multiterm operator are weak, it may be advisable to use the bivariate composition recursively.

Fractional terms
Fractional terms f 1 (z)/ f 2 (z) often arise in engineering optimization formulations. In McCormick relaxation framework, e.g., in MC++ [10] they are handled rigorously using the presentation as f 1 (z) × ( f 2 (z)) −1 , i.e., as a bilinear product with the inverse function embedded. The multivariate composition theorem can handle the fractional terms more naturally and yields at least as tight and often tighter relaxations. For the rest of this section we assume that f L 2 > 0 or f U 2 < 0, so that the division is well defined. Consider the fractional term x 1 x U 2 which we will denote via the division function div(·, ·). Tawarmalani and Sahinidis [47] discuss convex relaxations and the envelope for the positive orthant, i.e., for x L 1 > 0, x L 2 > 0. One relaxation by Zamora and Grossmann [53,54] is given by The function div cv,Z &G is the convex envelope when x L 2 → 0 and x U 2 → ∞. A piecewise linear relaxation of div [33,47] is given by Another method to obtain a valid convex relaxation of div(·, ·) on X 1 × X 2 , assuming that either x U 2 < 0 or x L 2 > 0 is to apply the product rule of McCormick [23] (defined in Eq. (22)) using the representation x 1 × Inv(x 2 ) where Inv(·) = (·) −1 . Let Inv L , Inv U denote the implied bounds, and Inv cv (x 2 ), Inv cc (x 2 ) the convex and concave relaxations of Inv on X 2 . It is easy to verify that the result is div cv,mc (x 1 , which for the positive orthant reduces to div cv,mc, as computed by Quesada and Grossman [33] following the same procedure. It is shown in [33] that div cv,lin is a linearization of div cv,mc,+ at x L 2 , x U 2 and thus div cv,lin (x 1 , The concave envelope for the positive orthant is computed in [46] to be div cc,mc,+ (x 1 , Finally, Tawarmalani and Sahinidis [46,47] prove that the convex envelope at a point can be evaluated by solving an optimization problem div cv,env (x 1 , x 2 ) = min which can be reformulated as a semi-definite program.
In MC++ [10] a relaxation for G(z) = f 1 (z) The multivariate McCormick relaxations (by Theorem 6) are the same when the outer function is relaxed via div cv,Z&G (by Zamora and Grossmann [53]), or via div cv,env (by Tawarmalani and Sahinidis [47]) ; the convex relaxation obtained is the tightest among the relaxations considered. The univariate McCormick relaxation g cv,MC++ (by Theorem 1) is the same as the multivariate using the bilinear relaxation div cv,mc++ and is weaker than the previous ones. The loosest relaxation is obtained when the linear relaxation div cv,lin (by Tawarmalani and Sahinidis [47]) is used for the outer function in the multivariate McCormick relaxations which for a given z can be written a semi-definite program. Similarly to the discussion of multilinear products we can obtain subgradients by using the Lagrange multipliers associated to the constraints f cv . If the envelope is not used, one can easily take the maximum of div cv,Z&K and div cv,mc+ . Tawarmalani and Sahinidis [47] show it can be beneficial compared to any of the two terms.

Concluding remarks
We presented a multivariate generalization of McCormick's composition theorem [23].
McCormick's results for the relaxation of composite functions with univariate outer function are the basis of so-called McCormick-relaxations, which are one of the key ideas in constructing convex relaxations in deterministic global optimization. Our generalization to multivariate outer functions results in tighter relaxations for important classes of functions including binary product of functions, the division of functions and the minimum/maximum of functions. Similarly to McCormick's composition and product theorem, the multivariate composition can be applied recursively and in fact the implementation of our result is very similar to McCormick's relaxations; many of our improvements have been implemented in both MC++ [10] and modMC [13]. In contrast to the univariate McCormick's relaxations, our result also enables the direct relaxation of classes of functions such as multilinear products of functions. This is particularly important since in recent years many relaxations have been proposed for relatively complicated expressions and it has been shown that using this is advantageous compared to recursive application of simple rules. For instance, an important class of functions are the so-called edge-concave functions treated in [26,44,45]; the work presented herein can be used to obtain tight relaxations for functions that are a composition of an edge-concave outer function and an arbitrary inner function; the relaxation can be achieved via a similar reasoning to our theorems for relaxations of bilinear, multilinear and fractional terms. It would be very useful to collect all these rules and implement them in the proposed multivariate McCormick relaxations and then perform a thorough computational comparison of the advances obtained. Moreover, it would be interesting to consider other important functions found in applications, such as | f 1 (z) − f 2 (z)| and ( f 1 (z) − f 2 (z)) 2 which are found for instance in parameter estimation. Also, it would be interesting to consider discontinuous functions as done in [52].
Similarly to univariate McCormick relaxations, our result is also applicable to functions calculated by algorithms [30]. It is well-known that univariate McCormick relaxations are nonsmooth and recently subgradient propagation has been proposed [30]. For the proposed multivariate framework it is also possible to propagate subgradients and in fact, we provide the framework to obtain, at least in principle, the entire subdifferential.
An alternative to McCormick relaxations is the AVM. Our reformulation and generalization of McCormick's composition theorem makes the connection with this method more explicit. In particular, it illustrates that the McCormick relaxation framework can be interpreted as a decomposition method for AVM. It would be of interest to indeed utilize such decomposition methods in the AVM. Moreover, we discussed the tightness of relaxations of the AVM compared to the multivariate McCormick relaxations. In cases that common subexpressions are recognized in the AVM this can result in tighter relaxations than the McCormick relaxations [48]; the same holds for the simple recursive application of the proposed multivariate McCormick relaxations. In some cases, it is possible to introduce just enough auxiliary variables to close this gap, and it would be interesting to explore this opportunity computationally. Moreover, the proposed multivariate relaxations can result in tighter relaxations in specific cases by enabling the use of complicated but tight relaxations of some functions. It would be interesting to computationally compare the two methods.
Also, if S 1 is the simplex defined by the points and since min cv,1 (x 1 , x 2 ) = min (x 1 , x 2 ) for all vertices of S 1 , it follows, see for example [18], that min cv,1 (x 1 , x 2 ) is the convex envelope of min(x 1 , and since min cv,2 (x 1 , x 2 ) = min(x 1 , x 2 ) for all vertices of S 2 , it follows that min cv,2 (x 1 , x 2 ) is the convex envelope of min(x 1 , From Lemma 4, if F cv Z is the convex envelope of min(x 1 , x 2 ) on Z , then we have Thus F cv Z (z) ≤ max min cv,1 (z), min cv,2 (z) for all z ∈ S 1 ∪ S 2 = Z and min cv is the convex envelope of min(x 1 , x 2 ) on Z .
The proof for the concave envelope of max(x 1 , x 2 ) is similar and is omitted.

Proof of Proposition 6
Proof Since the negative absolute value is concave and piecewise affine linear, its envelope is the secant. Thus, application of McCormick's composition Theorem 1 as reformulated in (2) givesḡ cv,abs 1 (z) = min w min cv,abs (z, w) On the other hand, Theorem 2 gives a convex relaxation for g 1 on Z g cv 1 (z) = min x min cv (x) where x L i = f L i , x U i = f U i and min cv is the convex envelope of min(·, ·) on X = x L 1 , x U 1 × x L 2 , x U 2 . We will show that (36) has an optimal value smaller or equal to the optimal value of (37) for an arbitrary but fixed z, and thus min cv,abs (z) ≤ min cv (z). To do so, first we reformulate (36) by introducing two new variables x 1 , x 2 with w = x 1 − x 2 and eliminate w obtainingḡ cv,abs 1 (z) = min x 1 ,x 2 min cv,abs (z, which is equivalent with (36). Next we show that the optimization problem at the right hand of (38) is a relaxation of the optimization problem at the right hand of (37) and thus has a smaller optimal value.
First we will show that any feasible point of (37) is also feasible in (38). Indeed, take any feasible (x 1 , x 2 ). By feasibility we have Similarly by feasibility and thus and (x 1 , x 2 ) is also feasible in (38). It remains to show that the objective function of (38) is an underestimate of (37). Take any (x 1 , x 2 ) which is feasible in (37). By construction of min cv,abs we have min cv,abs (z, x 1 − x 2 ) ≤ 0.5 f cv 1 (z) + f cv 2 (z) − |x 1 − x 2 | .
By feasibility of (x 1 , x 2 ) in (36) we also have and thus combining the last two inequalities we also obtain min cv,abs (z, x 1 − x 2 ) ≤ x 1 + x 2 − |x 1 − x 2 | = min(x 1 , x 2 ) or min cv,abs (z, ·) underestimates min(·, ·) on X . Moreover, min cv,abs (z, ·) is affine linear and thus also convex on X . Since min cv is the convex envelope of min(·, ·) on X we directly obtain min cv,abs (z, w = x 1 − x 2 ) ≤ min cv (x 1 , x 2 ), and the result follows. The result for the concave envelope is analogous and so are the results for the max.
We treat separately the cases f L 1 ≥ 0 versus f L 1 < 0. If f L 1 ≥ 0 since both Inv cv , Inv cc are decreasing monotonically we have ζ 2 (z) = min f L 1 Inv cv ( f cc 2 (z)), f L 1 Inv cc ( f cc 2 (z)) and therefore .
If on the other hand f L 1 < 0 since both −Inv cv , −Inv cc are increasing monotonically we have Inv cc ( f cv 2 (z)).
In this case due to negativity of f L 1 there holds and we obtain again Therefore, we have established that Eq. (41) holds independently of the sign of f L 1 . By a similar reasoning we deduce that ζ 4 (z) = min f U 1 Inv cv ( f cc 2 (z)) f U 1 Inv cc ( f cv 2 (z)) .
If the available relaxations of div(·, ·) on X 1 × X 2 are tighter than div cv,mc and div cc,mc the resulting relaxation has to be even tighter.