Abstract
In this paper we exploit a slight variant of a result previously proved in Locatelli and Schoen (Math Program 144:65–91, 2014) to define a procedure which delivers the convex envelope of some bivariate functions over polytopes. The procedure is based on the solution of a KKT system and simplifies the derivation of the convex envelope with respect to previously proposed techniques. The procedure is applied to derive the convex envelope of the bilinear function xy over any polytope, and the convex envelope of functions \(x^n y^m\) over boxes.
References
Al-Khayyal, F.A., Falk, J.E.: Jointly constrained biconvex programming. Math. Oper. Res. 8, 273–286 (1983)
Anstreicher, K.M., Burer, S.: Computable representations for convex hulls of low-dimensional quadratic forms. Math. Program. B 124, 33–43 (2010)
Anstreicher, K.M.: On convex relaxations for quadratically constrained quadratic programming. Math. Program. 136, 233–251 (2012)
Benson, H.P.: On the construction of convex and concave envelope formulas for bilinear and fractional functions on quadrilaterals. Comput. Optim. Appl. 27, 5–22 (2004)
Crama, Y.: Concave extensions for nonlinear 0–1 maximization problems. Math. Program. 61, 53–60 (1993)
Kuno, T.: A branch-and-bound algorithm for maximizing the sum of several linear ratios. J. Global Optim. 22, 155–174 (2002)
Jach, M., Michaels, D., Weismantel, R.: The convex envelope of (\(n\)-1)-convex functions. SIAM J. Optim. 19(3), 1451–1466 (2008)
Khajavirad, A., Sahinidis, N.V.: Convex envelopes of products of convex and component-wise concave functions. J. Global Optim. 51, 391–409 (2012)
Khajavirad, A., Sahinidis, N.V.: Convex envelopes generated from finitely many compact convex sets. Math. Program. 137, 371–408 (2013)
Laraki, R., Lasserre, J.B.: Computing uniform convex approximations for convex envelopes and convex hulls. J. Conv. Anal. 15(3), 635–654 (2008)
Linderoth, J.: A simplicial branch-and-bound algorithm for solving quadratically constrained quadratic programs. Math. Program. 103, 251–282 (2005)
Locatelli, M., Schoen, F.: On convex envelopes for bivariate functions over polytopes. Math. Program. 144, 65–91 (2014)
Locatelli, M.: Polyhedral subdivisions and functional forms for the convex envelopes of bilinear, fractional and other bivariate functions over general polytopes. J. Global Optim. 66, 629–668 (2016)
Locatelli, M.: A technique to derive the analytical form of convex envelopes for some bivariate functions. J. Global Optim. 59, 477–501 (2014)
Locatelli, M.: Convex envelopes of some quadratic functions over the \(n\)-dimensional unit simplex. SIAM J. Optim. 25, 589–621 (2015)
Locatelli, M.: On the computation of convex envelopes for bivariate functions through KKT conditions. Optimization Online. http://www.optimization-online.org/DB_FILE/2016/01/5280.pdf. Accessed 2016
McCormick, G.P.: Computability of global solutions to factorable nonconvex programs: Part I—convex underestimating problems. Math. Program. 10, 147–175 (1976)
Meyer, C.A., Floudas, C.A.: Convex envelopes for edge-concave functions. Math. Program. 103, 207–224 (2005)
Mitchell, J.E., Pang, J.-S., Yu, B.: Convex quadratic relaxations of nonconvex quadratically constrained quadratic programs. Optim. Methods Softw. 29(1), 120–136 (2014)
Rikun, A.: A convex envelope formula for multilinear functions. J. Global Optim. 10, 425–437 (1997)
Ryoo, H.S., Sahinidis, N.V.: Analysis of bounds for multilinear functions. J. Global Optim. 19, 403–424 (2001)
Sherali, H.D., Alameddine, A.: An explicit characterization of the convex envelope of a bivariate bilinear function over special polytopes. Ann. Oper. Res. 27, 197–210 (1992)
Sherali, H.D.: Convex envelopes of multilinear functions over a unit hypercube and over special discrete sets. Acta Math. Vietnam. 22, 245–270 (1997)
Tawarmalani, M., Sahinidis, N.V.: Semidefinite relaxations of fractional programs via novel convexification techniques. J. Global Optim. 20, 137–158 (2001)
Tardella, F.: On the existence of polyhedral convex envelopes. In: Floudas, C.A., Pardalos, P.M. (eds.) Frontiers in Global Optimization, pp. 563–574. Kluwer, Dordrecht (2003)
Tardella, F.: Existence and sum decomposition of vertex polyhedral convex envelopes. Optim. Lett. 2, 363–375 (2008)
Tawarmalani, M., Richard, J.-P.P., Xiong, C.: Explicit convex and concave envelopes through polyhedral subdivisions. Math. Program. 138, 531–577 (2013)
Zamora, J.M., Grossmann, I.E.: A Branch and Contract algorithm for problems with concave univariate, bilinear and linear fractional terms. J. Global Optim. 14, 217–249 (1999)
Acknowledgements
The author is grateful to an anonymous reviewer whose suggestions have been quite useful to improve the presentation.
Appendices
Appendix A: Convex envelope and polyhedral subdivision for the bilinear function
We discuss separately each of the three cases considered in Sect. 4.
A.1 \(J=\{i,j,k\}\) and \(\Omega _i, \Omega _j,\Omega _k\in V(P)\)
Let \(\mathbf{v}^i, \mathbf{v}^j, \mathbf{v}^k\in V(P)\) be the three vertices \(\Omega _i, \Omega _j,\Omega _k\) such that no pair of these vertices lies along a line with positive slope. In this case the computation of the convex envelope is rather simple: the convex envelope is the affine function interpolating \(xy\) at the three vertices, i.e.,
$$\begin{aligned} conv_{f,P}(\mathbf{x})=\mathbf{p}^T \mathbf{x}+p_0 \quad \forall \mathbf{x}\in \Gamma _J, \end{aligned}$$
where \(\mathbf{p}^T \mathbf{v}^h +p_0= v_{x}^h v_{y}^h\) for each \(h\in J\). Moreover, the set \(\Gamma _J\) is defined in the following observation.
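The interpolation conditions \(\mathbf{p}^T \mathbf{v}^h +p_0= v_{x}^h v_{y}^h\), \(h\in J\), form a \(3\times 3\) linear system in \((p_x,p_y,p_0)\). The following Python sketch solves it exactly by Cramer's rule; the function name `affine_interpolant` is hypothetical and the code is an illustration rather than part of the original derivation. It does not check the positive-slope condition on the vertices, nor membership in \(\Gamma _J\).

```python
from fractions import Fraction


def affine_interpolant(vertices):
    """Return (p_x, p_y, p_0) such that p_x*x + p_y*y + p_0 equals x*y
    at each of the three given vertices."""
    (x1, y1), (x2, y2), (x3, y3) = [(Fraction(x), Fraction(y)) for x, y in vertices]
    f1, f2, f3 = x1 * y1, x2 * y2, x3 * y3
    # Determinant of the 2x2 system obtained by subtracting the first
    # interpolation condition from the other two; zero means collinear vertices.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("the three vertices are collinear")
    # Cramer's rule for the slopes, then back-substitution for the intercept.
    p_x = ((f2 - f1) * (y3 - y1) - (f3 - f1) * (y2 - y1)) / det
    p_y = ((x2 - x1) * (f3 - f1) - (x3 - x1) * (f2 - f1)) / det
    p_0 = f1 - p_x * x1 - p_y * y1
    return p_x, p_y, p_0
```

For instance, for the vertices \((0,2)\), \((1,0)\), \((3,-1)\), whose pairwise slopes are all negative, the call `affine_interpolant([(0, 2), (1, 0), (3, -1)])` returns the coefficients of \(-2x-y+2\).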
Observation A.1
If \(f(\mathbf{x})=xy\) and \(J=\{i,j,k\}\), then
A.2 \(J=\{i,j\}\) and \(\Omega _i\in V(P)\), \(\Omega _j\in \bar{E}(P)\)
We denote by \(\mathbf{v}=(v_x,v_y)\) the vertex \(\Omega _i\in V(P)\). In what follows we omit the dependence of \(x_j\) on \(\varvec{\alpha }\). Recalling (22), system (20) is equivalent to
The parametric solution of this system is easy to derive. It follows from (21) that \(a=2 m_j x_j + q_j - m_j b\). Then, the system can be rewritten as follows
The first two equations lead to
so that
Then
The convex envelope is equal to
where \(\varvec{\alpha }_J(\mathbf{x})=(a(\mathbf{x}), b(\mathbf{x}))\), over the set
Note that, according to definition (9), the condition \(\varvec{\alpha }_J(\mathbf{x})\in D_j\) can also be written as \(x_j^1\le x_j \le x_j^2\). In the following observation we derive a simplified definition of the set \(\Gamma _J\). It will turn out that the restrictions \(\eta _r(\varvec{\alpha }_J(\mathbf{x}))\ge \eta _k(\varvec{\alpha }_J(\mathbf{x})),\ \forall r\not \in J,\ k\in J\), can be replaced by simple restrictions on the values of \(x_j\). We assume that \(\Omega _j\in \bar{E}^u(P)\) (the analysis for the case \(\Omega _j\in \bar{E}^{\ell }(P)\) is analogous).
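The parametric solution of the vertex-edge system can also be evaluated numerically. The following Python sketch is not taken from the original derivation. It assumes that system (20) consists of the convex-combination conditions \(\mathbf{x}=\lambda \mathbf{v}+(1-\lambda )(x_j,y_j)\) together with \(\eta _i(\varvec{\alpha })=\eta _j(\varvec{\alpha })\), where \(\eta _i(\varvec{\alpha })=v_xv_y-av_x-bv_y\) and \(\eta _j(\varvec{\alpha })=-m_jx_j^2-bq_j\), and it uses the relation \(a=2 m_j x_j + q_j - m_j b\) quoted above. The function name and the dictionary keys are hypothetical, and membership of \(\mathbf{x}\) in \(\Gamma _J\) is not checked.

```python
from fractions import Fraction


def vertex_edge_envelope(x, y, v, m, q):
    """Evaluate at (x, y) the candidate envelope of x*y generated by the
    vertex v and the edge lying on the line y = m*x + q (assumed forms of
    eta_i and eta_j; Gamma_J membership is NOT checked)."""
    x, y, m, q = map(Fraction, (x, y, m, q))
    vx, vy = Fraction(v[0]), Fraction(v[1])
    # Edge point (xj, yj) collinear with v and (x, y).
    den = (y - vy) - m * (x - vx)
    if den == 0:
        raise ValueError("no point of the edge line is collinear with v and (x, y)")
    xj = (vx * (y - vy) + (q - vy) * (x - vx)) / den
    yj = m * xj + q
    # Weight lam of the vertex in (x, y) = lam * v + (1 - lam) * (xj, yj).
    lam = 1 - ((x - vx) / (xj - vx) if xj != vx else (y - vy) / (yj - vy))
    # eta_i = eta_j determines b; a then follows from a = 2*m*xj + q - m*b.
    b = ((2 * m * xj + q) * vx - vx * vy - m * xj**2) / (m * vx + q - vy)
    a = 2 * m * xj + q - m * b
    eta = -m * xj**2 - b * q
    return {"xj": xj, "lam": lam, "a": a, "b": b, "value": a * x + b * y + eta}
```

As a consistency check, the returned value should coincide with the interpolation \(\lambda v_xv_y+(1-\lambda )x_jy_j\) of the bilinear function between the vertex and the edge point.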
Observation A.2
Let \(\Omega _j\in \bar{E}^u(P)\). Then, the following holds.
- If \(P\cap \{\varvec{\xi }\ :\ \xi _x>v_x,\xi _y<v_y\}\ne \emptyset \), then \(\Gamma _J=\emptyset \);
- If \(P\subseteq \{\varvec{\xi }\ :\ \xi _x\le v_x,\xi _y\ge v_y\}\), then
$$\begin{aligned} \Gamma _J=\left\{ \mathbf{x}\in P\ :\ \lambda ^J(\mathbf{x})\in [0,1],\ \ x_j^1\le x_j \le x_j^2\right\} . \end{aligned}\tag{30}$$
- If \(P\cap \{\varvec{\xi }\ :\ \xi _x>v_x,\xi _y\ge v_y\}\ne \emptyset \), then we need to add the restriction
$$\begin{aligned} x_j \le \min \left\{ x_j^2,-\frac{q_j}{m_j+\sqrt{m_j m_k}}\right\} , \end{aligned}$$
in the definition (30) of \(\Gamma _J\);
- If \(P\cap \{\varvec{\xi }\ :\ \xi _x<v_x,\ \xi _y\le v_y\}\ne \emptyset \), then we need to add the restriction
$$\begin{aligned} x_j \ge \max \left\{ x_j^1,-\frac{q_j}{m_j+\sqrt{m_j m_h}}\right\} , \end{aligned}$$
in the definition (30) of \(\Gamma _J\).
Proof
We will assume in what follows that \(v_x=v_y=0\). This is without loss of generality since it can always be made true by a translation. Since \(\Omega _j\in \bar{E}^u(P)\) and \((v_x,v_y)=(0,0)\), it holds that \(q_j>0\) and
Moreover, \(x_j\le 0, y_j=m_jx_j+q_j\ge 0\), otherwise the bilinear function is strictly convex along the segment between (0, 0) and \((x_j,y_j)\), which cannot hold in view of Corollary 3.2. Thus, we can impose \(-\frac{q_j}{m_j}\le x_j \le 0\). Next, we need to impose that for each \(\varvec{\xi }\in P\),
We remark that we could restrict the attention to \(\varvec{\xi }\in \Omega _k\), for all \(\Omega _k\in G(P)\), but this would not simplify the following analysis. The inequality (31) can be rewritten as follows
If we consider the above inequality as a quadratic inequality with respect to \(x_j\), then its discriminant is
This is \(< 0\) for \(\xi _x\xi _y<0\) and \(\varvec{\xi }\not \in \Omega _j\). Thus, if \(\xi _x\xi _y< 0\) and \(\xi _y-m_j \xi _x<0\), i.e., \(\xi _x\xi _y< 0,\ \xi _y< 0,\ \xi _x> 0\), for some \(\varvec{\xi }\in P\), then \(\Gamma _J=\emptyset \). Otherwise, if \(\xi _x\xi _y\le 0\) and \(\xi _y-m_j \xi _x\ge 0\), i.e., \(\xi _x\xi _y\le 0,\ \xi _y\ge 0,\ \xi _x\le 0\) for all \(\varvec{\xi }\in P\), then
Next, let us assume that \(\xi _x>0,\ \xi _y\ge 0\) for some \(\varvec{\xi }\in P\). In this case the origin is the vertex of an edge of P lying along a line \(y=m_k x\) for some \(m_k\ge 0\). Moreover, \(P\subseteq \{\mathbf{x} :\ y\ge m_k x\}\). Thus, since \(b(\mathbf{x})\le 0\), we have
Taking into account the definitions of \(a(\mathbf{x})\) and \(b(\mathbf{x})\) we have that, in view of \(\xi _x>0\), (31) is satisfied if
Since \(\xi _y\ge 0\), \(q_j>0\), \(x_j\le 0\), and \(y_j=m_jx_j+q_j\ge 0\), the above inequality is satisfied if
or, equivalently
which, combined with \(x_j\le x_j^2\) proves the result.
Finally, let us assume that \(\xi _x<0,\ \xi _y\le 0\) for some \(\varvec{\xi }\in P\). In this case the origin is the vertex of an edge of P lying along a line \(y=m_h x\) for some \(m_h\ge 0\). Moreover, \(P\subseteq \{\mathbf{x}\ :\ y\ge m_h x\}\). Thus, since \(b(\mathbf{x})\le 0\), we have
Taking into account the definitions of \(a(\mathbf{x})\) and \(b(\mathbf{x})\) we have that, in view of \(\xi _x<0\), (31) is satisfied if
Since \(\xi _y\le 0\), \(q_j>0\), \(x_j\le 0\), and \(y_j=m_jx_j+q_j\ge 0\), the above inequality is satisfied if
or, equivalently
which, combined with \(x_j\ge x_j^1\) proves the result. \(\square \)
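Under the normalization \(\mathbf{v}=(0,0)\) the quantities used in the proof admit a short explicit form. The display below is a reconstruction and is not copied from the original: it assumes that \(\eta _i(\varvec{\alpha })=0\) at the origin, that \(\eta _j(\varvec{\alpha })=-m_jx_j^2-bq_j\), that (31) reads \(\xi _x\xi _y-a\xi _x-b\xi _y\ge 0\), and that \(m_j>0\). The first line follows from \(\eta _i=\eta _j\) and \(a=2m_jx_j+q_j-m_jb\); the second line is (31) multiplied by \(q_j>0\) and ordered as a quadratic in \(x_j\); the third line is its discriminant; the last line is the sufficient condition used in the case \(\xi _x>0\), \(\xi _y\ge m_k\xi _x\).

$$\begin{aligned} b&=-\frac{m_jx_j^2}{q_j},\qquad a=\frac{(m_jx_j+q_j)^2}{q_j}=\frac{y_j^2}{q_j},\\ 0&\le m_j(\xi _y-m_j\xi _x)\,x_j^2-2m_jq_j\xi _x\,x_j+q_j\xi _x(\xi _y-q_j),\\ \Delta &=4m_jq_j\,\xi _x\xi _y\,(m_j\xi _x+q_j-\xi _y),\\ a+m_kb\le 0\ &\Longleftrightarrow \ y_j^2\le m_jm_kx_j^2\ \Longleftrightarrow \ x_j\le -\frac{q_j}{m_j+\sqrt{m_jm_k}}. \end{aligned}$$

Since \(\xi _y\le m_j\xi _x+q_j\) over \(P\), the sign of \(\Delta \) is the sign of \(\xi _x\xi _y\) away from the line containing \(\Omega _j\), which agrees with the case distinction made in the proof. The last equivalence uses \(x_j\le 0\le y_j\).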
A.3 \(J=\{i,j\}\) and \(\Omega _i,\Omega _j \in \bar{E}(P)\)
We will assume that \(\Omega _j \in \bar{E}^{\ell }(P)\), \(\Omega _i \in \bar{E}^{u}(P)\), so that
System (20) becomes (once again we omit the dependence of \(x_i\) and \(x_j\) on \(\varvec{\alpha }\))
The case \(m_i=m_j\) is a simpler one for which a solution of the system is easily derived. Indeed, in this case it follows from (21) that
The first two equations lead to
so that
The third equation reduces to
while (21) implies
If \(m_i\ne m_j\), the solution of the system is somewhat more cumbersome, although it still relies on standard computations. It follows from (21) that
The first two equations lead to
The system can be rewritten as follows
In order to solve it parametrically with respect to \(\mathbf{x}\), it is worthwhile to make the following change of variables
After that, the system is rewritten as follows
It is immediately seen that the possible solutions of the second equation are
or
If (35) holds, then, after a few computations, it can be seen that the solutions of the first equation are
where
Solution \(W_2\) can be discarded. Indeed, in this case neither Z nor W depends on x, y and
This is only possible if \((x_j,y_j)\equiv (x_i,y_i)\) and the point is the intersection of the two lines \(y=m_jx+q_j\) and \(y=m_ix+q_i\), so that we can discard this case. If we consider the solution \(W_1\), we have
Next, let us assume that (36) holds. In this case we have the two solutions
where
Once again, \(Z_2\) can be discarded. If we consider \(Z_1\) we end up with
However, in this case (33) implies that, for each \(\mathbf{x}\in P\setminus [\Omega _i\cup \Omega _j]\), either \(x_i,x_j< x\) (if \(m_i>m_j\)) or \(x_i,x_j> x\) (if \(m_i<m_j\)). Over \(\Omega _i\) and over \(\Omega _j\) the convex envelope is equal to the restriction of the bilinear function to that edge, so these points need not be considered. Since \(x\) must be a convex combination of \(x_i\) and \(x_j\), it follows that \(\lambda (\mathbf{x})\not \in [0,1]\), so that we can discard this solution. In conclusion, the only acceptable solution of the system is (37). Then, we can derive \(\varvec{\alpha }_J(\mathbf{x})=(a(\mathbf{x}), b(\mathbf{x}))\) from (34). We have
over the set
In fact, the following observation gives a simplified definition of the set \(\Gamma _J\), showing that we can omit \(\eta _r(\varvec{\alpha }_J(\mathbf{x}))\ge \eta _k(\varvec{\alpha }_J(\mathbf{x} )),\ \forall r\not \in J,\ k\in J\).
Observation A.3
We have that
Proof
Let \(P'=chull(\Omega _i\cup \Omega _j)\subseteq P\). It turns out that \(conv_{f,P'}(\mathbf{x})\) is equal to the function g defined in (38) over the set \(\Gamma _J\subseteq P'\) defined in (39). This is an immediate consequence of the fact that in this case \(r\not \in J\) implies that \(\Omega _r\) is a vertex of one of the two edges \(\Omega _i\) and \(\Omega _j\) of \(P'\), and \(\eta _r(\varvec{\alpha }_J(\mathbf{x}))\ge \eta _i(\varvec{\alpha }_J(\mathbf{x}))\) for all \(r\not \in J\) follows from \(\varvec{\alpha }_J(\mathbf{x})\in D_i\cap D_j\). What we will prove now is that g is a convex underestimator of f over the whole polytope P. Then, by definition of convex envelope as the largest convex underestimator of f over P and observing that \(P'\subseteq P\) implies \(conv_{f,P'}(\mathbf{x})\ge conv_{f,P}(\mathbf{x}) \, \forall \mathbf{x}\in P'\), we must have \(conv_{f,P}(\mathbf{x})=conv_{f,P'}(\mathbf{x})=g(\mathbf{x})\) over \(\Gamma _J\).
In what follows we omit the dependence of \(\varvec{\alpha }=(a,b)\) on J. In order to see that g is a convex underestimator of f over the whole polytope P, we need to check whether
We rewrite this as
or, equivalently
Since, by (21), \(a(\mathbf{x})+ m_j b(\mathbf{x})=2 m_j x_j + q_j\), and after adding and subtracting \(m_j x^2\), we end up, after a few elementary computations, with
By definition of \(x_j\) and \(b(\mathbf{x})\) the inequality reduces to
which always holds over P. \(\square \)
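For the simpler case \(m_i=m_j=m\) discussed at the beginning of this subsection, the solution of the system can be written down and checked numerically. The following Python sketch is a reconstruction under stated assumptions, not the original formulas: it assumes \(\eta _k(\varvec{\alpha })=-m x_k^2-bq_k\) for \(k\in \{i,j\}\), the relation \(a=2mx_k+q_k-mb\) on both edges, and the convex-combination conditions \(\mathbf{x}=\lambda (x_i,y_i)+(1-\lambda )(x_j,y_j)\). Under these assumptions \(x_i-x_j=(q_j-q_i)/(2m)\) and \(b=(x_i+x_j)/2\). The function name is hypothetical, and the code does not check that \(x_i\) and \(x_j\) fall within the two edges.

```python
from fractions import Fraction


def parallel_edges_envelope(x, y, m, q_i, q_j):
    """Evaluate at (x, y) the candidate envelope of x*y generated by two
    edges on the parallel lines y = m*x + q_i and y = m*x + q_j
    (assumed system; edge-range conditions are NOT checked)."""
    x, y, m, q_i, q_j = map(Fraction, (x, y, m, q_i, q_j))
    if m == 0 or q_i == q_j:
        raise ValueError("need m != 0 and two distinct parallel lines")
    # a = 2*m*x_k + q_k - m*b on both edges fixes the offset d = x_i - x_j.
    d = (q_j - q_i) / (2 * m)
    # Weight of the point on edge i, read off from the y-coordinate.
    lam = (y - m * x - q_j) / (q_i - q_j)
    x_j = x - lam * d
    x_i = x_j + d
    # eta_i = eta_j gives b; a follows from the relation on edge j.
    b = (x_i + x_j) / 2
    a = 2 * m * x_j + q_j - m * b
    eta = -m * x_j**2 - b * q_j
    return {"x_i": x_i, "x_j": x_j, "lam": lam, "a": a, "b": b,
            "value": a * x + b * y + eta}
```

With \(m=1\), \(q_i=1\), \(q_j=0\) and \(\mathbf{x}=(1,3/2)\), the sketch gives \(\lambda =1/2\), \(x_i=3/4\), \(x_j=5/4\) and the value \(23/16\), which equals \(\lambda x_iy_i+(1-\lambda )x_jy_j\) and lies below \(xy=3/2\).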
A.4 An example
In order to illustrate the results of this section we consider the following example taken from [19]. Let
i.e., P is the polytope with vertices \(\mathbf{v}_1=(0,0)\), \(\mathbf{v}_2=(5,0)\), \(\mathbf{v}_3=(0,1)\), and \(\mathbf{v}_4=(5,6)\). We set \(\Omega _i=\{\mathbf{v}_i\}\), \(i=1,\ldots ,4\). The set \(\bar{E}^u(P)\) consists of the single edge \(\Omega _5=[\mathbf{v}_3,\mathbf{v}_4]\), while \(\bar{E}^\ell (P)=\emptyset \). The only set J with cardinality three which needs to be considered is \(J=\{1,2,3\}\), from which we have
The only set with cardinality two which needs to be considered is \(J=\{2,5\}\). Following the development of Sect. A.2, we have from (27) that
from (28) that
so that it follows from (29) that
over \(\Gamma _J\). In order to define \(\Gamma _J\), we notice that: (i) \(\lambda (\mathbf{x})\in [0,1] \, \forall \mathbf{x}\in P\); (ii) after translating \(\mathbf{v}_2\) into the origin, we remark that we are in the second subcase of Observation A.2. Thus,
Recalling the definition (40) of \(x_j\), and observing that \(y-x+5\ge 0\) over P, we conclude that
Figure 4 reports polytope P and the corresponding polyhedral subdivision.
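The example can be checked numerically. The Python sketch below uses a closed form that is not copied from the original: it is obtained by assuming that the envelope vanishes on \(chull\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}\) (the affine interpolant of \(xy\) at three vertices where \(xy=0\)) and that, elsewhere in \(P\), it interpolates \(xy\) linearly between \(\mathbf{v}_2\) and the point \((x_j,x_j+1)\) of \(\Omega _5\) collinear with \(\mathbf{v}_2\) and \(\mathbf{x}\), with \(x_j=(x+5y-5)/(y-x+5)\). Under these assumptions the interpolated value simplifies to \(y\,x_j\). The function names are hypothetical.

```python
from fractions import Fraction


def in_polytope(x, y):
    # P has vertices (0,0), (5,0), (0,1), (5,6): 0 <= x <= 5, 0 <= y <= x + 1.
    return 0 <= x <= 5 and 0 <= y <= x + 1


def envelope_example(x, y):
    """Candidate convex envelope of x*y over P (assumed closed form)."""
    x, y = Fraction(x), Fraction(y)
    if not in_polytope(x, y):
        raise ValueError("point outside P")
    s = x + 5 * y - 5
    if s <= 0:
        # Region chull{v1, v2, v3}: the affine interpolant is identically zero.
        return Fraction(0)
    # Region generated by v2 = (5, 0) and the edge [v3, v4] on y = x + 1.
    # Here y - x + 5 > 0, because it vanishes on P only at v2, where s = 0.
    x_j = s / (y - x + 5)
    return y * x_j
```

The underestimation property can be verified directly: for \(y>0\), the inequality \(y\,x_j\le xy\) is equivalent to \((x-5)(x+1-y)\le 0\), which holds over \(P\) since \(x\le 5\) and \(y\le x+1\).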
Appendix B: Convex envelope and polyhedral subdivision for \(x^n y^m\) over a box
As already mentioned in Sect. 5, we restrict attention to \(\mathbf{x}\in \mathcal{B}^u\) since the case \(\mathbf{x}\in \mathcal{B}^{\ell }\) is analogous. We remark that, according to Observation 3.1,
and, equivalently
Now, let us consider the two triples \(\{1,5,7\}\) and \(\{2,5,7\}\). In both cases \(\eta _5(\varvec{\alpha })=\eta _7(\varvec{\alpha })\), so that
If \(J=\{1,5,7\}\), then
while if \(J=\{2,5,7\}\), then
Due to (41), (42) and (43) can hold at the same time only if \(\eta _1(\varvec{\alpha })=\eta _2(\varvec{\alpha })=\eta _5(\varvec{\alpha })=\eta _7(\varvec{\alpha })\). In all the other cases, only one of the two triples is acceptable. In particular, if
then, we can restrict our attention to \(J=\{1,5,7\}\), while if
then, we can restrict our attention to \(J=\{2,5,7\}\). In fact, the two conditions (44) and (45) allow us to restrict attention to a collection \(\mathcal{J}\) consisting of only four sets. For instance, if (44) holds, then \(J=\{2,7\}\) can be removed. Indeed, in that case
while
The left-hand side is an increasing function with respect to a, so that the above equation implies that b is an increasing function with respect to a. Moreover, for \(b=\bar{b}\) the equation reduces to
which, in view of (44) implies \(a>\bar{a}\), but that is not possible. In a similar way it can be seen that if (44) holds, then we can remove the set \(J=\{1,2,7\}\). In conclusion, if (44) holds, then we can restrict our attention to the four sets
while in an analogous way it can be seen that if (45) holds, we can restrict our attention to the four sets
The case
is a rather peculiar one where
while over \(chull\{B,K,L\}\) the convex envelope can be computed by solving the system associated to the set \(J=\{1,2\}\).
In what follows we only discuss the case where (44) holds, since the other cases are analogous. Figure 5 displays the polyhedral subdivision induced by the convex envelope in this case.
B.1 Set \(\{1,5,7\}\)
In this case
The solution of the system (19) is
We should also check whether \(\eta _3(\varvec{\alpha }),\eta _4(\varvec{\alpha }),\eta _8(\varvec{\alpha }) \ge \eta _5(\varvec{\alpha })\). It is enough to observe that \(a_1>0\) and that \(b_1>0\). The latter follows by observing that
In view of the definition (25) of \(\bar{x}_1^1\), this is equivalent to proving that
or, equivalently
which certainly holds. Then, \(a_1\in D_3^+\) and \(b_1\in D_4^+\). Indeed, \(D_3\) and \(D_4\) only contain negative a and b values, respectively. Thus, for these a and b values it holds that \(\eta _8(\varvec{\alpha })>\eta _3(\varvec{\alpha })=\eta _4(\varvec{\alpha })= \eta _5(\varvec{\alpha })=\eta _7(\varvec{\alpha })\).
In conclusion, we have \(\Gamma _J=chull\{A,C,K\}\), and
B.2 Set \(\{1,2,5\}\)
In this case
and the solution of the system (19) is
Then, after defining \(M=(x_1(a_2),u_y)\), we have \(\Gamma _J=chull\{A,M,L\}\), and
Note that also in this case we should check whether \(\eta _3(\varvec{\alpha }),\eta _4(\varvec{\alpha }),\eta _8(\varvec{\alpha }) \ge \eta _5(\varvec{\alpha })\), but the proof is analogous to the one in the previous case.
B.3 Set \(\{1,5\}\)
In this case (20) is
whose solution (parametric with respect to \(\mathbf{x}\)) is
Then, \(\Gamma _J=chull\{A,M,K\}\), and
(we omit the proof that \(\eta _3(\varvec{\alpha }),\eta _4(\varvec{\alpha }),\eta _8(\varvec{\alpha }) \ge \eta _5(\varvec{\alpha })\)).
B.4 Set \(\{1,2\}\)
In this case (20) is
If we denote by \(\lambda _4(\mathbf{x}), a_4(\mathbf{x}), b_4(\mathbf{x})\) the parametric solution of the system, then \(\Gamma _J=chull\{B,M,L\}\), and \(\forall \mathbf{x}\in \Gamma _J\)
(again, we omit the proof that \(\eta _3(\varvec{\alpha }),\eta _4(\varvec{\alpha }),\eta _8(\varvec{\alpha }) \ge \eta _1(\varvec{\alpha })\)). If \(m\ne n\), we are not able to derive a closed-form expression for the parametric solution of the system (46). Thus, \(a_4, b_4\) are implicitly defined as solutions of the system (46). If \(m=n\) (the case already discussed in [14]), then it turns out that the last equation in (46) is equivalent to
Thus, the first two equations become
or, equivalently
and, consequently
In view of the definition of \(x_1(a)\) and \(y_2(b)\) we also have
Cite this article
Locatelli, M. Convex envelopes of bivariate functions through the solution of KKT systems. J Glob Optim 72, 277–303 (2018). https://doi.org/10.1007/s10898-018-0626-1
DOI: https://doi.org/10.1007/s10898-018-0626-1