Abstract
We propose the formulation of convex Generalized Disjunctive Programming (GDP) problems using conic inequalities, leading to conic GDP problems. We then show the reformulation of conic GDPs into Mixed-Integer Conic Programming (MICP) problems through both the big-M and hull reformulations. These reformulations have the advantage that they are representable using the same cones as the original conic GDP and, in the case of the hull reformulation, require no approximation of the perspective function. Moreover, the resulting MICP problems can be solved by specialized conic solvers and offer a natural extended formulation amenable to both conic and gradient-based solvers. We present the closed form of several convex functions and their respective perspectives in conic sets, allowing users to formulate their conic GDP problems easily. Finally, we implement a large set of conic GDP examples and solve them via the scalar nonlinear and conic mixed-integer reformulations. These examples include applications from Process Systems Engineering, machine learning, and randomly generated instances. Our results show that the conic structure can be exploited to solve these challenging MICP problems more efficiently. Our main contribution is providing the reformulations, examples, and computational results that support the claim that taking advantage of conic formulations of convex GDP problems, instead of their nonlinear algebraic descriptions, can lead to a more efficient solution of these problems.
Data Availability
The data generated and analyzed during the current study is available in the following GitHub repository, https://github.com/bernalde/conic_disjunctive.
References
Liberti, L.: Undecidability and hardness in mixed-integer nonlinear programming. RAIRO-Oper. Res. 53(1), 81–109 (2019)
Trespalacios, F., Grossmann, I.E.: Review of mixed-integer nonlinear and generalized disjunctive programming methods. Chem. Ing. Tec. 86(7), 991–1012 (2014)
Lee, J., Leyffer, S.: Mixed Integer Nonlinear Programming, vol. 154. Springer Science & Business Media, Berlin (2011)
Liberti, L.: Mathematical Programming (Ecole Polytechnique, Paris, 2017). https://www.lix.polytechnique.fr/~liberti/teaching/dix/inf580-15/mathprog.pdf
Fletcher, R., Leyffer, S.: Solving mixed integer nonlinear programs by outer approximation. Math. Program. 66(1), 327–349 (1994)
Kronqvist, J., Bernal, D.E., Lundell, A., Grossmann, I.E.: A review and comparison of solvers for convex MINLP. Optim. Eng. 20(2), 397–455 (2019)
Dakin, R.J.: A tree-search algorithm for mixed integer programming problems. Comput. J. 8(3), 250–255 (1965)
Geoffrion, A.M.: Generalized Benders decomposition. J. Optim. Theory Appl. 10(4), 237–260 (1972)
Westerlund, T., Skrifvars, H., Harjunkoski, I., Pörn, R.: An extended cutting plane method for a class of non-convex MINLP problems. Comput. Chem. Eng. 22(3), 357–365 (1998)
Duran, M.A., Grossmann, I.E.: An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Math. Program. 36(3), 307–339 (1986)
Ben-Tal, A., Nemirovski, A.: Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM, Philadelphia (2001)
Kılınç-Karzan, F.: On minimal valid inequalities for mixed integer conic programs. Math. Oper. Res. 41(2), 477–510 (2016)
Lubin, M., Zadik, I., Vielma, J.P.: Mixed-integer convex representability. In: International Conference on Integer Programming and Combinatorial Optimization, pp. 392–404. Springer (2017)
Friberg, H.A.: CBLIB 2014: a benchmark library for conic mixed-integer and continuous optimization. Math. Program. Comput. 8(2), 191–214 (2016)
MOSEK ApS: MOSEK Modeling Cookbook (2018)
Domahidi, A., Chu, E., Boyd, S.: ECOS: an SOCP solver for embedded systems. In: 2013 European Control Conference (ECC), pp. 3071–3076. IEEE (2013)
Coey, C., Kapelevich, L., Vielma, J.P.: Solving natural conic formulations with Hypatia. INFORMS J. Comput. 34, 2686–2699 (2022)
Vanderbei, R.J., Yurttan, H.: Using LOQO to solve second-order cone programming problems. Constraints 1, 2 (1998)
Zverovich, V., Fourer, R.: Talk at the INFORMS Computing Society Conference (2015). http://ampl.com/MEETINGS/TALKS/2015_01_Richmond_2E.2.pdf
Erickson, J., Fourer, R.: Detection and transformation of second-order cone programming problems in a general-purpose algebraic modeling language. Optim. Online (2019)
Grant, M., Boyd, S., Ye, Y.: Global Optimization, pp. 155–210. Springer, Berlin (2006)
Lubin, M., Yamangil, E., Bent, R., Vielma, J.P.: Extended formulations in mixed-integer convex programming. In: Louveaux, Q., Skutella, M. (eds.) Integer Programming and Combinatorial Optimization: 18th International Conference, IPCO 2016, pp. 102–113. Springer International Publishing (2016)
Lubin, M., Yamangil, E., Bent, R., Vielma, J.P.: Polyhedral approximation in mixed-integer convex optimization. Math. Program. 172(1–2), 139–168 (2018)
Coey, C., Lubin, M., Vielma, J.P.: Outer approximation with conic certificates for mixed-integer convex problems. Math. Program. Comput. 12, 249–293 (2020)
Vigerske, S.: Decomposition in multistage stochastic programming and a constraint integer programming approach to mixed-integer nonlinear programming. Ph.D. thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II (2013)
Bestuzheva, K., Gleixner, A., Vigerske, S.: A Computational Study of Perspective Cuts (2021). arXiv:2103.09573
Khajavirad, A., Sahinidis, N.V.: A hybrid LP/NLP paradigm for global optimization relaxations. Math. Program. Comput. 10(3), 383–421 (2018)
Günlük, O., Linderoth, J.: Perspective reformulations of mixed integer nonlinear programs with indicator variables. Math. Program. 124(1), 183–205 (2010)
Waltz, R., Plantenga, T.: KNITRO user's manual (2017)
Belotti, P., Berthold, T., Neves, K.: In: 2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), pp. 1–5. IEEE (2016)
Gurobi Optimization, Inc.: Gurobi Optimizer Reference Manual (2016). http://www.gurobi.com
IBM Corp.: IBM ILOG CPLEX V20.1: User's Manual for CPLEX. International Business Machines Corporation (2020). https://www.ibm.com/docs/en/icos/20.1.0?topic=cplex
Conforti, M., Cornuéjols, G., Zambelli, G.: Integer Programming. Graduate Texts in Mathematics, vol. 271. Springer (2014)
Balas, E.: Disjunctive Programming. Springer, Berlin (2018)
Çezik, M.T., Iyengar, G.: Cuts for mixed 0–1 conic programming. Math. Program. 104(1), 179–202 (2005)
Belotti, P., Góez, J.C., Pólik, I., Ralphs, T.K., Terlaky, T.: A conic representation of the convex hull of disjunctive sets and conic cuts for integer second order cone optimization. In: Numerical Analysis and Optimization, pp. 1–35. Springer (2015)
Lodi, A., Tanneau, M., Vielma, J.P.: Disjunctive cuts in mixed-integer conic optimization (2019). arXiv:1912.03166
Bonami, P., Lodi, A., Tramontani, A., Wiese, S.: On mathematical programming with indicator constraints. Math. Program. 151(1), 191–223 (2015)
Grossmann, I.E., Lee, S.: Generalized convex disjunctive programming: nonlinear convex hull relaxation. Comput. Optim. Appl. 26(1), 83–100 (2003)
Chen, Q., Johnson, E.S., Bernal, D.E., Valentin, R., Kale, S., Bates, J., Siirola, J.D., Grossmann, I.E.: Pyomo.GDP: an ecosystem for logic based modeling and optimization development. Optim. Eng. 1–36 (2021)
Vielma, J.P.: Small and strong formulations for unions of convex sets from the Cayley embedding. Math. Program. 177(1), 21–53 (2019)
Günlük, O., Linderoth, J.: Perspective reformulation and applications. In: Mixed Integer Nonlinear Programming, pp. 61–89. Springer (2012)
Furman, K.C., Sawaya, N.W., Grossmann, I.E.: A computationally useful algebraic representation of nonlinear disjunctive convex sets using the perspective function. Comput. Optim. Appl. 1–26 (2020)
Hijazi, H., Bonami, P., Cornuéjols, G., Ouorou, A.: Mixed-integer nonlinear programs featuring “on/off’’ constraints. Comput. Optim. Appl. 52(2), 537–558 (2012)
Lee, S., Grossmann, I.E.: New algorithms for nonlinear generalized disjunctive programming. Comput. Chem. Eng. 24(9–10), 2125–2141 (2000)
Stubbs, R.A., Mehrotra, S.: A branch-and-cut method for 0–1 mixed convex programming. Math. Program. 86(3), 515–532 (1999)
Frangioni, A., Gentile, C.: Perspective cuts for a class of convex 0–1 mixed integer programs. Math. Program. 106(2), 225–236 (2006)
Aktürk, M.S., Atamtürk, A., Gürel, S.: A strong conic quadratic reformulation for machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3), 187–191 (2009)
Raman, R., Grossmann, I.E.: Modelling and computational techniques for logic based integer programming. Comput. Chem. Eng. 18(7), 563–578 (1994)
Grossmann, I.E., Ruiz, J.P.: Generalized disjunctive programming: a framework for formulation and alternative algorithms for MINLP optimization. In: Mixed Integer Nonlinear Programming, pp. 93–115. Springer, Berlin (2012)
Sawaya, N.: Reformulations, relaxations and cutting planes for generalized disjunctive programming. Ph.D. Thesis, Carnegie Mellon University (2006)
Sawaya, N., Grossmann, I.: A hierarchy of relaxations for linear generalized disjunctive programming. Eur. J. Oper. Res. 216(1), 70–82 (2012)
Ruiz, J.P., Grossmann, I.E.: A hierarchy of relaxations for nonlinear convex generalized disjunctive programming. Eur. J. Oper. Res. 218(1), 38–47 (2012)
Williams, H.P.: Model Building in Mathematical Programming. Wiley, Hoboken (2013)
Ceria, S., Soares, J.: Convex programming for disjunctive convex optimization. Math. Program. 86(3), 595–614 (1999)
Ruiz, J.P., Grossmann, I.E.: Global optimization of non-convex generalized disjunctive programs: a review on reformulations and relaxation techniques. J. Glob. Optim. 67(1–2), 43–58 (2017)
Trespalacios, F., Grossmann, I.E.: Cutting plane algorithm for convex generalized disjunctive programs. INFORMS J. Comput. 28(2), 209–222 (2016)
Atamtürk, A., Gómez, A.: Strong formulations for quadratic optimization with M-matrices and indicator variables. Math. Program. 170(1), 141–176 (2018)
Tawarmalani, M., Sahinidis, N.V.: A polyhedral branch-and-cut approach to global optimization. Math. Program. 103(2), 225–249 (2005)
Balas, E.: Disjunctive programming. In: Annals of Discrete Mathematics, vol. 5, pp. 3–51. Elsevier, Amsterdam (1979)
Chandrasekaran, V., Shah, P.: Relative entropy optimization and its applications. Math. Program. 161(1–2), 1–32 (2017)
El Ghaoui, L., Lebret, H.: Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. Appl. 18(4), 1035–1064 (1997)
Bussieck, M.R., Meeraus, A.: Modeling Languages in Mathematical Optimization, pp. 137–157. Springer, Berlin (2004)
Bussieck, M.R., Drud, A.: SBB: a new solver for mixed integer nonlinear programming. Talk at OR 2001 (2001)
Bussieck, M.R., Drud, A.S., Meeraus, A.: MINLPLib-a collection of test models for mixed-integer nonlinear programming. INFORMS J. Comput. 15(1), 114–119 (2003)
Grossmann, I., Lee, J.: CMU-IBM cyber-infrastructure for MINLP (2021). https://www.minlp.org/index.php
Bonami, P., Biegler, L.T., Conn, A.R., Cornuéjols, G., Grossmann, I.E., Laird, C.D., Lee, J., Lodi, A., Margot, F., Sawaya, N., Wächter, A.: An algorithmic framework for convex mixed integer nonlinear programs. Discret. Optim. 5(2), 186–204 (2008). https://doi.org/10.1016/j.disopt.2006.10.011
Bernal, D.E., Vigerske, S., Trespalacios, F., Grossmann, I.E.: Improving the performance of DICOPT in convex MINLP problems using a feasibility pump. Optim. Methods Softw. 35(1), 171–190 (2020)
Flaherty, P., Wiratchotisatian, P., Lee, J.A., Tang, Z., Trapp, A.C.: MAP Clustering under the gaussian mixture model via mixed integer nonlinear optimization (2019). arXiv:1911.04285
Chen, Y., Gupta, M.R.: EM Demystified: An Expectation-Maximization Tutorial. Technical Report Number UWEETR-2010-0002 (2010)
Li, Y.F., Tsang, I.W., Kwok, J.T., Zhou, Z.H.: Convex and scalable weakly labeled SVMs. J. Mach. Learn. Res. 14(7) (2013)
Papageorgiou, D.J., Trespalacios, F.: Pseudo basic steps: bound improvement guarantees from Lagrangian decomposition in convex disjunctive programming. EURO J. Comput. Optim. 6(1), 55–83 (2018)
Kronqvist, J., Misener, R., Tsay, C.: Between steps: intermediate relaxations between big-M and convex hull formulations (2021). arXiv:2101.12708
Bussieck, M.R., Dirkse, S.P., Vigerske, S.: PAVER 2.0: an open source environment for automated performance analysis of benchmarking data. J. Glob. Optim. 59(2), 259–275 (2014)
Trespalacios, F., Grossmann, I.E.: Improved Big-M reformulation for generalized disjunctive programs. Comput. Chem. Eng. 76, 98–103 (2015)
Jackson, J.R., Grossmann, I.E.: High-level optimization model for the retrofit planning of process networks. Ind. Eng. Chem. Res. 41(16), 3762–3770 (2002)
De Maesschalck, R., Jouan-Rimbaud, D., Massart, D.L.: The Mahalanobis distance. Chemom. Intell. Lab. Syst. 50(1), 1–18 (2000)
Vecchietti, A., Grossmann, I.E.: LOGMIP: a disjunctive 0–1 non-linear optimizer for process system models. Comput. Chem. Eng. 23(4–5), 555–565 (1999)
Mahajan, A., Munson, T.: Exploiting second-order cone structure for global optimization. Argonne Nat. Lab., Lemont, IL, USA, Tech. Rep. ANL/MCS-P1801-1010 (2010)
Legat, B., Dowson, O., Dias Garcia, J., Lubin, M.: MathOptInterface: a data structure for mathematical optimization problems. INFORMS J. Comput. 34(2), 672–689 (2021). https://doi.org/10.1287/ijoc.2021.1067
Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. 95(1), 3–51 (2003)
Chares, R.: Cones and interior-point algorithms for structured convex optimization involving powers and exponentials. Ph.D. Thesis, UCL-Université Catholique de Louvain, Louvain-la-Neuve, Belgium (2009)
Coey, C., Kapelevich, L., Vielma, J.P.: Conic optimization with spectral functions on Euclidean Jordan algebras. Math. Oper. Res. (2022)
Benson, H.Y., Vanderbei, R.J.: Solving problems with semidefinite and related constraints using interior-point methods for nonlinear programming. Math. Program. 95(2), 279–302 (2003)
Hiriart-Urruty, J.B., Lemaréchal, C.: Fundamentals of Convex Analysis. Springer Science & Business Media, Berlin (2004)
Parikh, N., Boyd, S., et al.: Proximal algorithms. Found. Trends® Optim. 1(3), 127–239 (2014)
Balas, E.: Disjunctive programming and a hierarchy of relaxations for discrete optimization problems. SIAM J. Algebraic Discrete Methods 6(3), 466–486 (1985)
Balas, E.: Disjunctive programming: properties of the convex hull of feasible points. Discret. Appl. Math. 89(1–3), 3–44 (1998)
Hijazi, H., Bonami, P., Ouorou, A.: An outer-inner approximation for separable mixed-integer nonlinear programs. INFORMS J. Comput. 26(1), 31–44 (2014)
Kronqvist, J., Lundell, A., Westerlund, T.: Reformulations for utilizing separability when solving convex MINLP problems. J. Glob. Optim. 71(3), 571–592 (2018)
Jeroslow, R.G.: Representability in mixed integer programming, I: characterization results. Discret. Appl. Math. 17(3), 223–243 (1987)
Acknowledgements
The authors gratefully acknowledge financial support from the Center of Advanced Process Decision-making and from the US Department of Energy, Office of Fossil Energy’s Crosscutting Research, Simulation Based Engineering Program through the Institute for the Design of Advanced Energy Systems (IDAES).
Funding
This work was supported by the Center of Advanced Process Decision-making and by the US Department of Energy, Office of Fossil Energy's Crosscutting Research, Simulation Based Engineering Program, through the Institute for the Design of Advanced Energy Systems (IDAES).
Author information
Contributions
Both authors contributed to the study’s conception and design. Material preparation, data collection, and analysis were performed by David E. Bernal Neira. The first draft of the manuscript was written by David E. Bernal Neira as part of his Ph.D. Thesis and both authors commented on previous versions of the manuscript. Both authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Background
In this manuscript, we use notation similar to that of Ben-Tal and Nemirovski [11] and Alizadeh and Goldfarb [81]. We use lowercase boldface letters, e.g., \(\textbf{x,c}\), to denote column vectors, and uppercase boldface letters, e.g., \({\textbf{A}}, {\textbf{X}}\), to denote matrices. Sets are denoted with uppercase calligraphic letters, e.g., \({\mathcal {S}}, {\mathcal {K}}\). A subscripted vector \(\mathbf {x_i}\) denotes the ith block of \({\textbf{x}}\). The jth components of the vectors \({\textbf{x}}\) and \(\mathbf {x_i}\) are indicated as \(x_j\) and \(x_{ij}\), respectively. The set \(\{1,\ldots ,J\}\) is represented by the symbol \(\llbracket J \rrbracket \). Moreover, the subscript \(\llbracket J \rrbracket \) of a vector \({\textbf{x}}\) is used to define the set \({\textbf{x}}_{\llbracket J \rrbracket }:= \{{\textbf{x}}_1,\ldots ,{\textbf{x}}_J\}\). We use \({\textbf{0}}\) and \({\textbf{1}}\) for the all-zeros and all-ones vectors, respectively, and 0 and I for the zero and identity matrices, respectively. The vector \(e_j\) is the vector with a single 1 in position j and 0 elsewhere. The dimensions of matrices and vectors will be clear from the context. We use \({\mathbb {R}}^k\) to denote the set of real vectors of dimension k, and for a set \({\mathcal {S}} \subseteq {\mathbb {R}}^k\), we use \(\text {cl}({\mathcal {S}})\) and \(\text {conv}({\mathcal {S}})\) to denote the closure and convex hull of \({\mathcal {S}}\), respectively.
For concatenated vectors, we use the notation that “,” is row concatenation of vectors and matrices, and “;” is column concatenation. For vectors, \(\textbf{x,y}\) and \({\textbf{z}}\), the following are equivalent.
The projection of a set \({\mathcal {S}} \subseteq {\mathbb {R}}^k\) onto the vector \({\textbf{x}} \in X \subseteq {\mathbb {R}}^n\), with \(n \le k\) is denoted as \(\text {proj}_{{\textbf{x}}}({\mathcal {S}}):= \{ {\textbf{x}} \in X: \exists {\textbf{y}}: ({\textbf{x}};{\textbf{y}}) \in {\mathcal {S}} \}\).
If \({\mathcal {A}} \subseteq {\mathbb {R}}^k\) and \({\mathcal {B}} \subseteq {\mathbb {R}}^l\) we denote their Cartesian product as \({\mathcal {A}} \times {\mathcal {B}}:= \{({\textbf{x}};{\textbf{y}}): {\textbf{x}} \in {\mathcal {A}}, {\textbf{y}} \in {\mathcal {B}} \}\).
For \({\mathcal {A}}_1, {\mathcal {A}}_2 \subseteq {\mathbb {R}}^k\) we define the Minkowski sum of the two sets as \({\mathcal {A}}_1 + {\mathcal {A}}_2 = \{ {\textbf{u}} + {\textbf{v}}: {\textbf{u}} \in {\mathcal {A}}_1, {\textbf{v}} \in {\mathcal {A}}_2 \}\).
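These set operations can be illustrated on small finite point sets (a sketch for intuition only; the definitions above are stated for general subsets of \({\mathbb {R}}^k\)):

```python
from itertools import product

# Minkowski sum A1 + A2 = {u + v : u in A1, v in A2}, shown on 1-D sets.
def minkowski_sum(A1, A2):
    return {u + v for u, v in product(A1, A2)}

# Projection of a finite set S of (x, y) pairs onto the x-coordinate,
# mirroring proj_x(S) = {x : exists y with (x; y) in S}.
def proj_x(S):
    return {x for (x, y) in S}

assert minkowski_sum({0, 1}, {0, 10}) == {0, 1, 10, 11}
assert proj_x({(0, 5), (0, 7), (3, 1)}) == {0, 3}
```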
1.1 Cones
For a thorough discussion about convex optimization and conic programming, we refer the reader to [11]. The following definitions are required for the remainder of the manuscript.
The set \({\mathcal {K}} \subseteq {\mathbb {R}}^k\) is a cone if \(\forall ({\textbf{z}},\lambda ) \in {\mathcal {K}} \times {\mathbb {R}}_+, \lambda {\textbf{z}} \in {\mathcal {K}}\). The dual cone of \({\mathcal {K}} \subseteq {\mathbb {R}}^k\) is
and it is self-dual if \({\mathcal {K}} = {\mathcal {K}}^*\). The cone is pointed if \({\mathcal {K}} \cap (-{\mathcal {K}}) = \{ {\textbf{0}} \}\). A cone is proper if it is closed, convex, pointed, and with a non-empty interior. If \({\mathcal {K}}\) is proper, then its dual \({\mathcal {K}}^*\) is proper too. \({\mathcal {K}}\) induces a partial order on \({\mathbb {R}}^k\):
which allows us to define a conic inequality as
where \({\textbf{A}} \in {\mathbb {R}}^{m \times k}\), \({\textbf{b}} \in {\mathbb {R}}^m\), and \({\mathcal {K}}\) a cone.
When using a cone that represents the cartesian product of others, i.e., \({\mathcal {K}} = {\mathcal {K}}_{n_1} \times \cdots \times {\mathcal {K}}_{n_r}\) with each cone \({\mathcal {K}}_{n_i} \subseteq {\mathbb {R}}^{n_i}\), its corresponding vectors and matrices are partitioned conformally, i.e.,
Furthermore, if each cone \({\mathcal {K}}_{n_i} \subseteq {\mathbb {R}}^{n_i}\) is proper, then \({\mathcal {K}}\) is proper too.
A Conic Programming (CP) problem is then defined as:
Examples of proper cones are:
-
The nonnegative orthant
$$\begin{aligned} {\mathbb {R}}_+^k = \left\{ {\textbf{z}} \in {\mathbb {R}}^k: {\textbf{z}} \ge {\textbf{0}} \right\} . \end{aligned}$$(A6) -
The positive semi-definite cone
$$\begin{aligned} {\mathbb {S}}_+^k = \left\{ Z \in {\mathbb {R}}^{k \times k}: Z = Z^T, \lambda _{min}(Z) \ge 0 \right\} , \end{aligned}$$(A7)where \(\lambda _{min}(Z)\) denotes the smallest eigenvalue of Z.
-
The second-order cone, Euclidean norm cone, or Lorentz cone
$$\begin{aligned} {\mathcal {Q}}^k = \left\{ {\textbf{z}} \in {\mathbb {R}}^{k}: z_1 \ge \sqrt{\sum _{i=2}^k z_i^2} \right\} . \end{aligned}$$(A8) -
The exponential cone [82]
$$\begin{aligned} \begin{aligned} {\mathcal {K}}_{exp}&= \text {cl} \left\{ (z_1,z_2,z_3) \in {\mathbb {R}}^3: z_1 \ge z_2 e^{z_3 / z_2}, z_2> 0 \right\} \\&= \left\{ (z_1,z_2,z_3) \in {\mathbb {R}}^3: z_1 \ge z_2 e^{z_3 / z_2}, z_2 > 0 \right\} \bigcup \left( {\mathbb {R}}_+ \times \{ 0 \} \times (-{\mathbb {R}}_+) \right) . \end{aligned} \end{aligned}$$(A9)
Of these cones, the only one that is not self-dual, and hence not symmetric, is the exponential cone.
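As a quick numeric illustration of these definitions (a sketch, using a small tolerance for floating-point comparisons), membership in the nonnegative orthant, the second-order cone, and the exponential cone can be checked directly:

```python
import math

TOL = 1e-9

def in_nonneg_orthant(z):
    # z in R_+^k  <=>  every component is nonnegative
    return all(zi >= -TOL for zi in z)

def in_soc(z):
    # z in Q^k  <=>  z[0] >= ||z[1:]||_2
    return z[0] + TOL >= math.hypot(*z[1:])

def in_exp_cone(z):
    # Interior part {z1 >= z2*exp(z3/z2), z2 > 0} plus the
    # boundary piece R_+ x {0} x (-R_+) added by the closure.
    z1, z2, z3 = z
    if z2 > TOL:
        return z1 + TOL >= z2 * math.exp(z3 / z2)
    return abs(z2) <= TOL and z1 >= -TOL and z3 <= TOL

assert in_soc((5.0, 3.0, 4.0)) and not in_soc((4.9, 3.0, 4.0))
assert in_exp_cone((math.e, 1.0, 1.0)) and not in_exp_cone((2.0, 1.0, 1.0))
assert in_exp_cone((1.0, 0.0, -1.0))  # boundary point from the closure
```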
Other cones that are useful in practice are
-
The rotated second-order cone or Euclidean norm-squared cone
$$\begin{aligned} {\mathcal {Q}}_r^k = \left\{ {\textbf{z}} \in {\mathbb {R}}^{k}: 2z_1z_2 \ge \sum _{i=3}^k z_i^2, \ z_1, z_2 \ge 0 \right\} . \end{aligned}$$(A10) This cone can be written as a rotation of the second-order cone, i.e., \({\textbf{z}} \in {\mathcal {Q}}^k \iff R_k{\textbf{z}} \in {\mathcal {Q}}_r^k\) with \(R_k:= \begin{bmatrix} \sqrt{2}/2 &{} \sqrt{2}/2 &{} 0 \\ \sqrt{2}/2 &{} -\sqrt{2}/2 &{} 0 \\ 0 &{} 0 &{} I_{k-2} \end{bmatrix}\) [15].
-
The power cone, with \(l < k, \sum _{i \in \llbracket l \rrbracket } \alpha _i = 1, \alpha _{i \in \llbracket l \rrbracket } > 0\),
$$\begin{aligned} {\mathcal {P}}_k^{\alpha _1, \ldots , \alpha _l} = \left\{ {\textbf{z}} \in {\mathbb {R}}^k: \prod _{i = 1}^l z_i^{\alpha _i} \ge \sqrt{\sum _{i = l+1}^k z_i^2}, \quad z_i \ge 0 \quad i \in \llbracket l \rrbracket \right\} . \end{aligned}$$(A11)This cone can be decomposed using a second-order cone and \(l-1\) three-dimensional power cones
$$\begin{aligned} {\mathcal {P}}_3^{\alpha } = \left\{ (z_1,z_2,z_3) \in {\mathbb {R}}^3: z_1^\alpha z_2^{1-\alpha } \ge |z_3 |, \quad z_1, z_2 \ge 0 \right\} , \end{aligned}$$(A12)through \(l - 1\) additional variables \((u,v_{1},\ldots ,v_{l-2})\),
$$\begin{aligned} {\textbf{z}} \in {\mathcal {P}}_k^{\alpha _1, \ldots , \alpha _l} \iff {\left\{ \begin{array}{ll} (u,z_{l+1},\ldots ,z_{k}) \in {\mathcal {Q}}^{k-l+1},\\ (z_1,v_1,u) \in {\mathcal {P}}_3^{\alpha _1}, \\ (z_i,v_i,v_{i-1}) \in {\mathcal {P}}_3^{{\bar{\alpha }}_i}, \quad i=2,\ldots ,l-1, \\ (z_{l-1},z_l,v_{l-2}) \in {\mathcal {P}}_3^{{\bar{\alpha }}_{l-1}}, \end{array}\right. } \end{aligned}$$(A13) where \({\bar{\alpha }}_i=\alpha _i/(\alpha _i+\cdots +\alpha _l)\) for \(i=2,\ldots ,l-1\) [15]. \({\mathcal {P}}_3^{\alpha }\) can alternatively be represented using linear and exponential cone constraints; conversely, the exponential cone arises as a limit of power cones, i.e., \((z_1,z_2,z_2+\alpha z_3) \in {\mathcal {P}}_3^{\alpha }\) recovers \((z_1,z_2,z_3) \in {\mathcal {K}}_{exp}\) as \(\alpha \rightarrow 0^+\).
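The relation between \({\mathcal {Q}}^k\) and \({\mathcal {Q}}_r^k\) can be sanity-checked numerically. The sketch below uses the standard form \(2z_1z_2 \ge \sum _{i \ge 3} z_i^2\) with \(z_1, z_2 \ge 0\) and a rotation whose second row carries a sign flip; it is an illustration we supply, not code from the paper:

```python
import math

TOL = 1e-9

def in_soc(z):
    # z in Q^k  <=>  z[0] >= ||z[1:]||_2
    return z[0] + TOL >= math.hypot(*z[1:])

def in_rsoc(z):
    # z in Q_r^k  <=>  2*z1*z2 >= sum of squares of remaining entries, z1, z2 >= 0
    return (z[0] >= -TOL and z[1] >= -TOL
            and 2 * z[0] * z[1] + TOL >= sum(zi ** 2 for zi in z[2:]))

def rotate(z):
    # First two rows of R_k: [c, c, 0...], [c, -c, 0...]; identity below.
    c = math.sqrt(2) / 2
    return [c * (z[0] + z[1]), c * (z[0] - z[1])] + list(z[2:])

# Membership should agree before and after rotation.
for z in [(5.0, 3.0, 4.0), (5.0, 4.0, 4.0), (1.0, 0.99, 0.2)]:
    assert in_soc(z) == in_rsoc(rotate(z))
```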
Most, if not all, convex optimization problems arising in applications can be represented by conic extended formulations using these standard cones [15], i.e., in problem CP, the cone \({\mathcal {K}}\) is a product \({\mathcal {K}}_1 \times \cdots \times {\mathcal {K}}_r\), where each \({\mathcal {K}}_i\) is one of the recognized cones mentioned above. Equivalent conic formulations of more exotic convex sets can also be posed using specialized cones, with potential advantages in solution performance [17, 83].
As mentioned in the introduction, an alternative to a convex optimization problem’s algebraic description as in problem MINLP is the following Mixed-Integer Conic Programming (MICP) problem:
where \({\mathcal {K}}\) is a closed convex cone.
Without loss of generality, integer variables need not be restricted to cones, given that corresponding continuous variables can be introduced via equality constraints. Notice that for an arbitrary convex function \(f: {\mathbb {R}}^k \rightarrow {\mathbb {R}} \cup \{ \infty \}\), one can define a closed convex cone using its recession,
where the function \(\tilde{f}({\textbf{z}},\lambda )\) is the perspective function of function \(f({\textbf{z}})\), and whose algebraic representation is a central piece of this work. Closed convex cones can also be defined as the recession of convex sets. On the other hand, a conic constraint can be equivalent to a convex inequality,
Although certain cones are non-smooth at some points, e.g., second-order cones at the origin, these can be reformulated using appropriately chosen smooth convex functions \({\textbf{g}}({\textbf{x}})\) [35, 84].
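As concrete instances of such equivalences (standard textbook reformulations, not equations reproduced from this paper): \(t \ge x^2\) corresponds to \((t+1, t-1, 2x) \in {\mathcal {Q}}^3\), since \((t+1)^2 - (t-1)^2 = 4t\), and \(t \ge e^x\) corresponds to \((t, 1, x) \in {\mathcal {K}}_{exp}\). A numeric sketch:

```python
import math

TOL = 1e-9

def in_soc(z):
    return z[0] + TOL >= math.hypot(*z[1:])

def in_exp_cone(z):
    z1, z2, z3 = z
    if z2 > TOL:
        return z1 + TOL >= z2 * math.exp(z3 / z2)
    return abs(z2) <= TOL and z1 >= -TOL and z3 <= TOL

# t >= x^2 written as a second-order cone membership: (t+1, t-1, 2x) in Q^3.
def quad_epigraph(x, t):
    return in_soc((t + 1, t - 1, 2 * x))

# t >= e^x written as an exponential cone membership: (t, 1, x) in K_exp.
def exp_epigraph(x, t):
    return in_exp_cone((t, 1.0, x))

assert quad_epigraph(2.0, 4.0) and not quad_epigraph(2.0, 3.9)
assert exp_epigraph(1.0, math.e) and not exp_epigraph(1.0, 2.7)
```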
We can therefore reformulate problem MINLP in the following parsimonious manner [22]:
where copies of the original variables \({\textbf{x}}\) and \({\textbf{y}}\) are introduced for the objective function and each constraint, \(\mathbf {x_f},\mathbf {y_f},\mathbf {x_j},\mathbf {y_j}, j \in \llbracket J \rrbracket \), such that each belongs to the recession cone of each constraint defined as in (A14). Each conic set requires the introduction of an epigraph variable t and a recession variable \(\lambda \). The epigraph variable from the objective function, \(t_f\), is used in the new objective, and the ones corresponding to the constraints are set as nonnegative slack variables \(s_j\). The recession variables \(\lambda \) in (A14) are fixed to one in all cases.
Notice that problem MINLP-Cone is in MICP form with \({\mathcal {K}} = {\mathbb {R}}_+^{n_x + J} \times {\mathcal {K}}_f \times {\mathcal {K}}_{g_1} \times \cdots \times {\mathcal {K}}_{g_J}\). As mentioned above, the case where \({\mathcal {K}} = {\mathcal {K}}_1 \times \cdots \times {\mathcal {K}}_r\) with each \({\mathcal {K}}_i\) a recognized cone is the more useful one for practical purposes. Lubin et al. [22] showed that all the convex MINLP instances in the benchmark library MINLPLib [65] can be represented with nonnegative, second-order, and exponential cones.
1.2 Perspective function
For a convex function \(h({\textbf{x}}): {\mathbb {R}}^n \rightarrow {\mathbb {R}} \cup \{ \infty \}\) its perspective function \(\tilde{h}({\textbf{x}},\lambda ): {\mathbb {R}}^{n+1} \rightarrow {\mathbb {R}} \cup \{ \infty \}\) is defined as
The perspective of a convex function is convex but not necessarily closed. Hence, consider the closure of the perspective function \((\text {cl}~\tilde{h})({\textbf{x}},\lambda )\) defined as
where \(h'_\infty ({\textbf{x}})\) is the recession function of \(h({\textbf{x}})\) [85, Section B, Proposition 2.2.2], which in general does not have a closed form.
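A minimal worked example, for \(h(x) = x^2\): the perspective is \(\lambda \, h(x/\lambda ) = x^2/\lambda \) for \(\lambda > 0\), and the closure assigns the recession-function value at \(\lambda = 0\), which for \(x^2\) is 0 at the origin and \(+\infty \) elsewhere:

```python
import math

# Closed perspective of h(x) = x^2: lam*h(x/lam) = x^2/lam for lam > 0,
# extended to lam = 0 by the recession function of h (0 at x = 0, +inf otherwise),
# and +inf for lam < 0.
def persp_closure_sq(x, lam):
    if lam > 0:
        return x * x / lam
    if lam == 0:
        return 0.0 if x == 0 else math.inf
    return math.inf

assert persp_closure_sq(3.0, 1.0) == 9.0       # lam = 1 recovers h
assert persp_closure_sq(3.0, 0.5) == 18.0      # lam * h(x/lam) = 0.5 * 36
assert persp_closure_sq(0.0, 0.0) == 0.0       # closure value at the origin
assert persp_closure_sq(1.0, 0.0) == math.inf  # recession: +inf off the origin
```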
The closure of the perspective function of a convex function is relevant for convex MINLP on two ends. On the one hand, it appears when describing the closure of the convex hull of disjunctive sets. On the other hand, as seen above, it can be used to define closed convex cones \({\mathcal {K}}\) that determine the feasible region of conic programs. Relying on the amenable properties of convex cones, conic programs can be addressed with specialized algorithms, allowing for more efficient solution methods.
The closure of the perspective function presents a challenge when implementing it for nonlinear optimization models, given that it is not defined at \(\lambda =0\). As seen below, modeling this function becomes necessary when writing the convex hull of the union of convex sets. Several authors have addressed this difficulty in the literature through \(\varepsilon \)-approximations. The first proposal was made by Lee and Grossmann [45], where
This approximation is exact when \(\varepsilon \rightarrow 0\). However, it requires values of \(\varepsilon \) small enough that they become numerically challenging when used in a solution algorithm.
Furman et al. [43] propose another approximation for the perspective function such that
which is exact for values of \(\lambda = 0\) and \(\lambda = 1\), is convex for \(h({\textbf{x}})\) convex, and is exact when \(\varepsilon \rightarrow 0\) as long as \(h({\textbf{0}})\) is defined. Using this approximation in the set describing the system of equations of the closed convex hull of a disjunctive set also has properties beneficial for mathematical programming.
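Since the equation itself is not reproduced in this excerpt, the sketch below transcribes the Furman et al. approximation in the form commonly given in the literature and checks its exactness at \(\lambda \in \{0,1\}\) numerically (our transcription; consult [43] for the authoritative statement):

```python
# Furman-Sawaya-Grossmann epsilon-approximation of the closed perspective,
# in the form commonly stated in the literature:
#   phi_eps(x, lam) = ((1-eps)*lam + eps) * h(x / ((1-eps)*lam + eps))
#                     - eps * h(0) * (1 - lam)
def fsg_approx(h, x, lam, eps):
    m = (1 - eps) * lam + eps
    return m * h(x / m) - eps * h(0.0) * (1.0 - lam)

h = lambda x: x * x + 1.0  # convex, with h(0) = 1 so the correction term matters

# Exact at lam = 1 (recovers h(x)) and at lam = 0 with x = 0 (value 0),
# for any eps in (0, 1).
for eps in (1e-1, 1e-4):
    assert abs(fsg_approx(h, 2.0, 1.0, eps) - h(2.0)) < 1e-12
    assert abs(fsg_approx(h, 0.0, 0.0, eps)) < 1e-12
```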
This approximation is used in software implementations when reformulating a disjunctive set using its hull relaxation [40, 78]. Notice that, even with its desirable properties, the approximation introduces some error for values \(\varepsilon > 0\); hence, it is desirable to circumvent its usage. As shown in [42] and in Sect. 4, using a conic constraint to model the perspective function allows for a more efficient solution of convex MINLP problems.
1.3 Disjunctive programming
Optimization over disjunctive sets is denoted Disjunctive Programming [34, 60]. A disjunctive set is given by systems of inequalities joined by the logical operators of conjunction (\(\wedge \), “and”) and disjunction (\(\vee \), “or”). These sets are non-convex and usually represent the union of convex sets. The main reference on Disjunctive Programming is the book by Balas [34].
Consider the following disjunctive set
where \(|I |\) is finite. Each set \({\mathcal {C}}_i:= \{ {\textbf{x}} \in {\mathbb {R}}^n \mid {\textbf{h}}_i({\textbf{x}}) \le {\textbf{0}} \}\) is a convex, bounded, and nonempty set defined by a vector-valued function \({\textbf{h}}_i: {\mathbb {R}}^n \rightarrow \left( {\mathbb {R}} \cup \{ \infty \} \right) ^{J_i}\). Notice that it is sufficient for \({\mathcal {C}}_i\) to be convex that each component of \({\textbf{h}}_i\), \(h_{i \llbracket J_i \rrbracket }\), is a proper closed convex function, although this is not a necessary condition. A proper closed convex function is one whose epigraph is a nonempty closed convex set [86].
Ceria and Soares [55] characterize the closure of the convex hull of \({\mathcal {C}}\), \(\text {cl conv}({\mathcal {C}})\), with the following result.
Theorem 2
[55] Let \({\mathcal {C}}_i = \{ {\textbf{x}} \in {\mathbb {R}}^n \mid {\textbf{h}}_i({\textbf{x}}) \le {\textbf{0}} \} \ne \emptyset \), assume that each component of \({\textbf{h}}_i\), \(h_{i\llbracket J_i \rrbracket }\), is a proper closed convex function, and let
Then \(\text {cl conv}(\bigcup _{i \in I} {\mathcal {C}}_i) = \text {proj}_{{\textbf{x}}}({\mathcal {H}})\).
Proof
See [55, Theorem 1] and [38, Theorem 1]. \(\square \)
Theorem 2 provides a description of \(\text {cl conv}({\mathcal {C}})\) in a higher-dimensional space, an extended formulation. This theorem generalizes the results in [34, 60, 87, 88], where all the convex sets \({\mathcal {C}}_i\) are polyhedral. Even though extended formulations induce growth in the size of the optimization problem, some have been shown to be amenable to MINLP solution algorithms [22, 59, 89, 90].
A similar formulation was derived by Stubbs and Mehrotra [46] in the context of a branch-and-cut method for mixed-binary convex programs. These authors note that the extended formulation might not be computationally practical; hence, they derive linear inequalities or cuts from this formulation to be integrated later into the solution procedure. Similar ideas have been explored in the literature [47]. In particular cases, the dimension of the extended formulation can be reduced to the original size of the problem, e.g., when there are only two terms in the disjunction, i.e., \(|I |=2\), and one of the convex sets \({\mathcal {C}}_i\) is a point [42]. A description in the original space of variables has also been given for the case when one set \({\mathcal {C}}_1\) is a box and the constraints defining the other set \({\mathcal {C}}_2\) are determined by the same bounds as the box together with isotone nonlinear constraints [44]. This has been extended even further by Bonami et al. [38] to complementary disjunctions, where the activation of one disjunct implies that the other is deactivated, in the case that the functions defining each set, \({\textbf{h}}_{\{1,2\}}\), are isotone and share the same indices on which they are non-decreasing. The last two cases present the formulation in the original space of variables at the price of exponentially many constraints required to represent \(\text {cl conv}({\mathcal {C}})\).
In the case that \({\mathcal {C}}_i\) is compact, its recession cone is the origin, i.e., \({\mathcal {C}}_{i\infty }= \{ {\textbf{x}} \in {\mathbb {R}}^n \mid {\textbf{h}}'_{i\infty }({\textbf{x}}) \le {\textbf{0}} \} = \{ {\textbf{0}} \}\) [85, Section A, Proposition 2.2.3]. This fact, together with (A17) and Theorem 2, implies that for a compact \({\mathcal {C}}_i\), a value of \(\lambda _i=0\) forces \({\textbf{v}}_i={\textbf{0}}\). This observation has been used to propose mixed-integer programming formulations expressing the disjunctive choice between convex sets by requiring the interpolation variables to be binary, \(\lambda _i \in \{0,1\}, i \in I\) [45, 91], i.e.,
An interesting observation is that using the approximation of the closure of the perspective function from Furman et al. [43], for any value of \(\varepsilon \in (0,1)\), \(\text {proj}_{\textbf{x}}({\mathcal {H}}_{\{0,1\}})={\mathcal {C}}\) when \({\textbf{h}}_i({\textbf{0}})\) is defined \(\forall i \in I\) and
see [43, Proposition 1].
The condition in (A23) is required to ensure that if \(\lambda _i=1\), then \({\textbf{v}}_{i'}={\textbf{0}}, \forall i' \in I {\setminus } \{ i \}\). This condition does not hold in general for a disjunctive set \({\mathcal {C}}\), but it is sufficient that \({\textbf{x}}\) has a bounded range on each \({\mathcal {C}}_i, i \in I\). Moreover, when these conditions are satisfied, \({\mathcal {C}} \subseteq \text {proj}_{\textbf{x}}({\mathcal {H}})\) using the approximation in (A19) for \(\varepsilon \in (0,1)\), with \(\text {cl conv} ({\mathcal {C}}) = \text {proj}_{\textbf{x}}({\mathcal {H}})\) in the limit when \(\varepsilon \rightarrow 0\) [43, Proposition 3].
The problem formulation HR is derived by replacing each disjunction with set \({\mathcal {H}}_{\{0,1\}}\) (A22). Notice that to guarantee the validity of the formulation, the condition on (A23) is enforced implicitly by having the bounds over \({\textbf{x}}\) included in each disjunct, leading to constraint \({\textbf{x}}^{l}y_{ik} \le {\textbf{v}}_{ik} \le {\textbf{x}}^{u}y_{ik}\).
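A minimal numeric sketch of how these hull constraints work, on a linear example of our own (not from the paper): for the disjunction \([0 \le x \le 1] \vee [2 \le x \le 3]\), the disaggregated constraints with \(\lambda \) relaxed to \([0,1]\) describe \(\text {conv}([0,1] \cup [2,3]) = [0,3]\):

```python
# Hull reformulation of [0 <= x <= 1] v [2 <= x <= 3]: disaggregated
# variables v1, v2 with x = v1 + v2, bound constraints scaled by the
# (relaxed) indicator variables y1, y2, and y1 + y2 = 1.
def hull_feasible(x, v1, v2, y1, y2, tol=1e-9):
    return (abs(x - (v1 + v2)) <= tol
            and abs(y1 + y2 - 1.0) <= tol
            and -tol <= y1 <= 1 + tol and -tol <= y2 <= 1 + tol
            and 0 * y1 - tol <= v1 <= 1 * y1 + tol    # x^l*y <= v <= x^u*y, disjunct 1
            and 2 * y2 - tol <= v2 <= 3 * y2 + tol)   # x^l*y <= v <= x^u*y, disjunct 2

# x = 1.5 lies outside both disjuncts but inside the hull (take y1 = y2 = 0.5):
assert hull_feasible(1.5, 0.25, 1.25, 0.5, 0.5)
# x = 3.5 lies outside the hull: v1 <= y1 and v2 <= 3*y2 force x <= 3
# for every choice of (v1, y1) on a coarse grid.
assert not any(
    hull_feasible(3.5, v1, 3.5 - v1, y1, 1.0 - y1)
    for v1 in [i / 10 for i in range(11)]
    for y1 in [i / 10 for i in range(11)]
)
```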
Appendix B: Detailed computational results
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Bernal Neira, D.E., Grossmann, I.E. Convex mixed-integer nonlinear programs derived from generalized disjunctive programming using cones. Comput Optim Appl 88, 251–312 (2024). https://doi.org/10.1007/s10589-024-00557-9