A review and comparison of solvers for convex MINLP

In this paper, we present a review of deterministic software for solving convex MINLP problems as well as a comprehensive comparison of a large selection of commonly available solvers. As a test set, we have used all MINLP instances classified as convex in the problem library MINLPLib, resulting in a test set of 335 convex MINLP instances. A summary of the most common methods for solving convex MINLP problems is given to better highlight the differences between the solvers. To show how the solvers perform on problems with different properties, we have divided the test set into subsets based on the continuous relaxation gap, the degree of nonlinearity, and the relative number of discrete variables. The results also provide guidelines on how well suited a specific solver or method is for particular types of MINLP problems.


Introduction
Mixed-integer nonlinear programming (MINLP) combines the modeling capabilities of mixed-integer linear programming (MILP) and nonlinear programming (NLP) into a versatile modeling framework. By using integer variables, it is possible to incorporate discrete decisions, e.g., to choose between some specific options, into the optimization model. Furthermore, by using both linear and nonlinear functions it is possible to accurately model a variety of different phenomena, such as chemical reactions, separations, and material flow through a production facility. The versatile modeling capabilities of MINLP means there are a wide variety of real-world optimization problems that can be modeled as MINLP problems, e.g., block layout design problems (Castillo et al. 2005), cancer treatment planning (Cao and Lim 2011), design of water distribution networks (Bragalli et al. 2012), portfolio optimization (Bonami and Lejeune 2009), nuclear reactor core fuel reloading (Quist et al. 1999), process synthesis (Grossmann et al. 1999), pooling problems in the petrochemical industry (Misener and Floudas 2009), and production planning (Sahinidis and Grossmann 1991). More of MINLP applications are described by, e.g., Floudas (1995), Biegler and Grossmann (2004), Boukouvala et al. (2016) and Trespalacios and Grossmann (2014). For a recent review on MINLP methods see D' Ambrosio and Lodi (2013) and Bonami et al. (2012).
MINLP is often considered as a "difficult" class of optimization problems. However, there has been significant progress in the field during the last twenty years and there are several good solvers for MINLP problems available today (Bussieck and Vigerske 2010). Here we will focus on convex MINLP, which is a specific subclass with some desirable properties, e.g., it is possible to decompose a convex MINLP problem into a finite sequence of tractable subproblems. An MINLP problem is often considered as convex when its continuous relaxation yields a convex NLP problem. In recent years there has been significant progress within the field of MILP and NLP (Achterberg and Wunderling 2013;Bazaraa et al. 2013) which is also reflected onto the field of MINLP since decomposition techniques for MINLP problems often rely on solving these types of subproblems. It is also possible to solve certain classes of nonconvex MINLP problems, such as problems with signomial or general twice-differentiable constraints, by reformulating them into convex MINLP problems (Pörn et al. 1999;Lundell et al. 2009;Nowak et al. 2018), further motivating the study of efficient methods for convex MINLP.
The intention of this paper is to give an overview of commonly available deterministic solvers for convex MINLP problems and to present a thorough numerical comparison of the most common solvers. Most optimization solvers are connected to one or more of the well-established modeling environments for MINLP optimization, such as, AIMMS (Bisschop 2006), AMPL (Fourer et al. 1993), and GAMS (Brook et al. 1988). In recent years, there has also been a growing interest in optimization modeling in Python and Julia (Bezanson et al. 2012); JuMP is a modeling environment for optimization embedded in Julia (Dunning et al. 2017), and Pyomo is a similar environment in Python (Hart et al. 2012). Several MINLP solvers also offer interfaces to MATLAB, and through OPTI Toolbox it is also possible to access several solvers in MATLAB (Currie et al. 2012).
The solvers considered in the numerical comparison are AlphaECP, Antigone, AOA, BARON, BONMIN, Couenne, DICOPT, Juniper, KNITRO, LINDO, Minotaur, Muriqui, Pavito, SBB, SCIP, and SHOT. These were chosen based on criteria like availability, active development, and support for a file format available in MIN-LPLib (MINLPLib 2018). Some of these are global solvers and therefore not limited to convex problems. However, most of the global solvers have convexity identification techniques or manual strategy settings that can be set by the user to allow them to more efficiently deal with convex problems. The convex solvers can also often be used as heuristic methods without guarantee for finding the optimal solution for nonconvex MINLP problems.
In Sect. 2, the convex MINLP problem is defined and a general overview of the most common algorithms for such problems are given in Sect. 3. Most solvers in the comparison utilize one or more of these solution methods, as described in Sect. 4, which contains a summary of the solvers considered. Section 5 describes the benchmark in detail, and the numerical results are, finally, presented in Sect. 6.

Convex MINLP problem formulation
A convex MINLP problem can, without loss of generality, be written as where the sets N, L and Y are given by and L ∩ Y is assumed to be a compact set. The upper and lower bounds on the integer variables y i are denoted as y i and y i . To ensures convergence of methods such as outer approximation, it is assumed that all the integer variables are bounded either by the variables bounds or by the linear constraints, since unbounded variables can, e.g., cause some of the subproblems to be unbounded. Most solvers do not require the variables to be bounded, however, to avoid such issues some solvers automatically assigns large bounds to unbounded variables. Generally, problem (P-MINLP) is considered as convex if all the nonlinear functions g j are convex in the variables and the relaxed integer variables . There has recently been an interest in nonsmooth convex MINLP, and some solution techniques have been presented see e.g., Eronen et al. (2017) and Eronen et al. (2014). However, most of the commonly available solvers only have guaranteed convergence for smooth problems and therefore we limit this study to problems where the nonlinear functions g j are continuously differentiable.

Methods
This section describes the most commonly used algorithms for convex MINLP. The methods described are branch and bound, extended cutting plane, extended supporting hyperplane, outer approximation, generalized Benders decomposition, and LP/ NLP-based branch and bound. This summary is not intended to give an in-depth analysis of the algorithms, but to better exemplify the differences between the solvers. For a more detailed discussion about the algorithms see, e.g., D'Ambrosio and Lodi (2013), Belotti et al. (2013), Grossmann (2002), and Floudas (1995).

Branch and bound
Branch and bound (BB) was first presented as a technique for solving MILP problems by Land and Doig (1960). A few years later it was noted by Dakin (1965) that MINLP problems can be solved with a similar branch and bound approach, although the paper focused on linear problems. Solving convex MINLP problems with a BB approach was also studied by Gupta and Ravindran (1985).
In the basic form, BB solves the MINLP problem by relaxing the integer restrictions of the original problem and solving continuous (convex) NLP relaxations. Solving a continuous relaxation of problem (P-MINLP) results in a solution ( k , k ) , which provides a valid lower bound. If all components of k take on integer values, then it is also an optimal solution to the MINLP problem. Otherwise, the continuous relaxation is divided (branched) into two new NLP subproblems by adding the constraints y i ≤ ⌊y k i ⌋ and y i ≥ ⌈y k i ⌉ to the relaxed problem. The branching variable y i is a variable that takes on a fractional value and usually chosen based on some criteria, e.g., the variable furthest away from an integer value. A new lower bound can be obtained by solving the new subproblems (child nodes), and in case one of the subproblems returns an integer solution it also provides a valid upper bound. The search procedure is often represented by a tree, where the nodes are connected to their parent node and represent the subproblems. If one of the nodes does not provide an integer solution, then it is branched into two new nodes creating two new subproblems. In case one of the nodes obtains an optimum worse than the upper bound or in case the subproblem is infeasible, then the node can be pruned since an optimal solution cannot exist in that part of the search space. This approach of solving convex NLP problems in each node is often referred to as NLP-based branch and bound (NLP-BB).
Obtaining a tight continuous relaxation is of great importance within BB to avoid large search trees. Stubbs and Mehrotra (1999) presented a branch and cut method for convex MINLP problems that uses cutting planes to strengthen the continuous relaxation. Several techniques have been proposed for obtaining cuts to strengthen the continuous relaxation for MINLP problems, e.g., lift-and-project cuts (Kılınç et al. 2017;Zhu and Kuno 2006;Balas et al. 1993), Gomory cuts (Çezik and Iyengar 2005;Gomory et al. 1958), and perspective cuts (Frangioni and Gentile 2006).
Compared to BB techniques for MILP problems, NLP-BB involves computationally more demanding subproblems; it is often not unusual to explore more than 100,000 nodes for a modest-sized problem! Techniques to efficiently integrate the NLP solver and not solving all subproblems to optimality have also been proposed by Borchers and Mitchell (1994) and Leyffer (2001). Another BB approach is to solve LP relaxations in the nodes and construct a polyhedral approximation of the nonlinear constraints. A polyhedral branch and cut technique, solving LP relaxations in the nodes, was presented by Tawarmalani and Sahinidis (2005).
Many important details on BB such as branching strategies have been left out for the sake of brevity. For more details on BB see, e.g., Bonami et al. (2011) andFloudas (2000).

3
A review and comparison of solvers for convex MINLP

Extended cutting plane
The extended cutting plane (ECP) algorithm was first presented by Westerlund and Petterson (1995), and can be seen as an extension of Kelley's cutting plane method for convex NLP problems presented by Kelley (1960). In its original form the ECP method is intended for convex MINLP problems, and by some modifications, given the name generalized alpha ECP (GAECP), it can be applied to pseudoconvex problems as shown by Westerlund and Pörn (2002).
The ECP algorithm uses linearization of the nonlinear constraints to construct an iteratively improving polyhedral outer approximation of the set N. The trial solutions are obtained by solving the following MILP subproblems, where the set N k is given by Here A i is an index set containing the indices of either the most violated or all violated constraints in iteration i. Set N k is, thus, a polyhedral approximation of set N, constructed by first-order Taylor series expansions of the nonlinear constraints generated at the trial solutions ( k , k ) . The linearizations defining N k is usually referred to as cutting planes since they cut off parts of the search space that cannot contain the optimal solution. Due to convexity, N ⊆N k and therefore, the solution of problem (MILP-k) provides a valid lower bound of problem (P-MINLP).
In the first iteration, the set N 0 can simply be defined as ℝ n+m . New trial solutions are then obtained by solving subproblem (MILP-k), and the procedure is repeated until a trial solution satisfies all the constraints within a given tolerance. Once a trial solution satisfies all nonlinear constraints it is also the optimal solution, since the solution was obtained by minimizing the objective within a set containing the entire feasible region. For more details on the ECP algorithm see, e.g., Westerlund and Petterson (1995) or Westerlund and Pörn (2002).

Extended supporting hyperplane
The extended supporting hyperplane (ESH) algorithm was presented by Kronqvist et al. (2016) as an algorithm for solving convex MINLP problems. The ESH algorithm uses the same technique as the ECP algorithm for obtaining trial solutions, but uses a different technique for generating the polyhedral outer approximation N k . It has been observed that the cutting planes used to construct the polyhedral outer approximation in the ECP algorithm are, in general, not as tight as possible, see Kronqvist et al. (2016). By using a one dimensional root search, the ESH algorithm is able to obtain supporting hyperplanes to the set N at each iteration, and use these to construct a polyhedral outer approximation N k .

3
First, a strict interior point int , int is obtained by solving the following convex NLP problem The interior point should preferably be as deep as possible within the interior of N, which is here approximated by minimizing the l ∞ -norm of the nonlinear constraints.
Similar to the ECP algorithm, the trial solutions k MILP , k MILP are obtained by solving problem (MILP-k). These solutions provide a valid lower bound on the optimal solution of problem (P-MINLP). However, they will not be directly used to construct the set N k as in the ECP method. To construct the polyhedral outer approximation, we define a new function F as the point-wise maximum of the nonlinear constraints, according to A new sequence of points k , k is now defined as where the interpolation parameters k are chosen such that F( k , k ) = 0 . The interpolation parameters k can be obtained by a simple one-dimensional root search. The points k , k are now located on the boundary of the feasible region, and linearizing the active nonlinear constraints at this point results in supporting hyperplanes to the set N. The set N k is, thus, constructed according to Eq. (2) using the points k , k .
The ESH algorithm also uses a preprocessing step to obtain supporting hyperplanes of the set N by solving linear programming (LP) relaxations. The procedure of solving MILP subproblems and generating supporting hyperplanes is repeated until a trial solution satisfies all nonlinear constraints. The tighter polyhedral outer approximation usually gives the ESH algorithm an advantage over the ECP algorithm. It has been shown in Eronen et al. (2017), that the ESH algorithm can also be successfully applied to nonsmooth MINLP problems with pseudoconvex constraint functions.

Outer approximation
The outer approximation (OA) method was first presented by Duran and Grossmann (1986), with additional properties for convex MINLP problems described in Fletcher and Leyffer (1994). Some modifications of the OA method have been presented to handle nonconvex problems more efficiently, see, e.g., Kocis and Grossmann (1988) and Viswanathan and Grossmann (1990). For more details on the basic convex approach discussed in this paper see, e.g., Grossmann (2002).
OA is a decomposition technique, which obtains the optimal solution of the original problem by solving a sequence of MILP and NLP subproblems. Similar to both ECP and ESH, OA also constructs an iteratively improving polyhedral outer approximation N k of the nonlinear feasible region defined by the set N. However, OA only uses the polyhedral approximation for choosing the integer combination k , while the corresponding continuous variables k are chosen by solving a convex NLP subproblem.
In each iteration, the polyhedral outer approximation is used to construct problem (MILP-k), referred to as the MILP master problem. A new integer combination k is then obtained by solving problem (MILP-k). Once the integer combination k is obtained, the following NLP subproblem is formed If problem (NLP-fixed) is feasible, a valid upper bound can be obtained from the solution ( k , k ) , and the solution is used to improve the polyhedral approximation according to Eq. (2). The polyhedral outer approximation is updated by either linearizing all constraints or only the active constraints.
Problem (NLP-fixed) may also be infeasible in some iteration. If k is an infeasible integer combination, the corresponding continuous variables can be obtained by solving the following convex subproblem which minimizes the constraint violation with respect to the l ∞ -norm. The solution to problem (NLP-feasibility) does not provide a lower bound. However, using the infeasible solution ( k , k ) to update the polyhedral outer approximation according to Eq. (2), ensures that the infeasible integer combination k cannot be obtained again by the MILP master problem, cf. Fletcher and Leyffer (1994).
The OA algorithm is usually initiated by solving a continuous relaxation of the MINLP problem, giving an initial lower bound and a solution that can be used to construct the polyhedral approximation N 0 (Viswanathan and Grossmann 1990). It is also possible to use integer cuts to exclude specific integer combinations, as suggested by Duran and Grossmann (1986). Solving the MILP master problems (MILPk) provides a lower bound on the optimum, and the procedure is repeated until the upper and lower bound is within a given tolerance.
In general, OA results in tighter polyhedral outer approximations than the ECP algorithm, and may, therefore, require fewer iterations. For a feasible integer combination, OA will in general result in a tighter polyhedral outer approximation than ESH, but for an infeasible integer combination, ESH can give a tighter approximation. OA may thus require fewer iterations than both ESH and ECP to solve certain problems. However, since each iteration is somewhat more computationally demanding, the methods are difficult to compare directly. (NLP-fixed) (NLP-feasibility) g j ( , ) ≤ r ∀j = 1, 2, … , l, 1 3

Generalized Benders decomposition
Generalized Benders decomposition (GBD) was first presented by Geoffrion (1972) and is a generalization of Benders decomposition, a partitioning procedure for solving MILP problems (Benders 1962). As noted by Quesada and Grossmann (1992), GBD is closely related to OA and the main difference is the derivation of the master problem.
In GBD, the master problem is projected onto the space defined by the integer variables and the master problem is, thus, only expressed in the integer variables. Here we will not present the full derivation of GBD, but use the same approach as Grossmann (2002) to derive the master problem. For more details on GBD see, e.g., Grossmann and Kravanja (1997) or Floudas (1995). Given an integer combination k , the corresponding continuous variables can be obtained by solving either one of the problems (NLP-fixed) or (NLP-feasibility). If problem (NLP-fixed) is feasible, it provides a valid upper bound, as well as values for the continuous variables k and the optimal Lagrangean multipliers k and k . A valid cut is then given by where ∇ denotes the gradient with respect to the integer variables. Here, is a new auxiliary variable used for describing the objective function of the MILP subproblems. Note that the left-hand side of Eq. (6) is a first order Taylor series expansion of the Lagrangean function of problem (NLP-fixed) at the point ( k , k , k , k ) with respect to the and variables, and that the gradient with respect to the variables will be zero. The cut in Eq. (6) can be shown to be a surrogate constraint of the linearization in Eq. (2) in which the continuous variables are projected out, cf. Quesada and Grossmann (1992) or Grossmann (2002).
If problem (NLP-fixed) is infeasible with the integer combination k , problem (NLPfeasibility) is solved to obtain the continuous variables k as well as the optimal multipliers k and k . A valid cut in the integer space is then given by, For more details on the cuts see, e.g., Quesada and Grossmann (1992). The master problem for obtaining new integer combinations, is then given by, where K f contains the indices of all iterations where problem (NLP-fixed) is feasible and the index set K contains all iterations. Solving the master problem provides a lower bound on the optimal solution and gives a new integer combination k+1 . The procedure is repeated until the upper and lower bounds are within the desired tolerance. Since the cuts obtained by equations (6) and (7) can be viewed as surrogate cuts of the linear constraints included in the OA master problem, GBD generates weaker cuts than OA at each iteration and usually requires more iterations to solve a given problem. However, the master problems in GBD may be easier to solve since they contain fewer variables compared to OA and only one cut is added in each iteration.
A compromise between OA and GBD has been proposed by Quesada and Grossmann (1992), where the continuous variables are classified into linear or nonlinear based on how they are involved in the original MINLP problem. By projecting out the nonlinear continuous variables, one can derive a Lagrangean cut in a similary way as with GBD while still retaining the linear constraints involving continuous variables in the master problem. The given method has been coined as partial surrogate cuts (PSC), and as proved in Quesada and Grossmann (1992), it results in a tighter linear relaxation compared to GBD while still only adding one cut per iteration.

LP/NLP-based branch and bound
When solving a convex MINLP problem with either ECP, ESH, GBD or OA, most of the total solution time is usually spent on solving the MILP subproblems. The MILP problems are also quite similar in consecutive iterations since they only differ by a few linear constraints. To avoid constructing many similar MILP branch and bound trees, Quesada and Grossmann (1992) presented a method which integrates OA within BB, called LP/NLP-based branch and bound (LP/NLP-BB). The main idea is to only construct one branch and bound tree, where the MILP master problem is dynamically updated.
An initial polyhedral outer approximation is constructed by solving a continuous relaxation and linearizing the constraints at the relaxed solution, as in OA. The polyhedral outer approximation is used to construct the first MILP master problem and the branch and bound procedure, where an LP relaxation is solved in each node, is initiated. Once an integer solution is obtained at a given node, the integer combination is used as in OA. If the NLP problem (NLP-fixed) with the given integer combination is feasible, it provides an upper bound and new linearizations are generated. If it is infeasible, new linearizations can be obtained by solving the feasibility problem (NLP-feasibility). The new linearizations are then added to all open nodes, and the LP relaxation is resolved for the node which returned the integer combination. The branch and bound procedure continues normally by solving LP relaxations, which now give a more accurate approximation of the nonlinear constraints. Here, the search must continue down each node until either the LP relaxation returns an integer solution that satisfies all nonlinear constraints, the LP relaxation obtains an objective value worse than the upper bound, or until the LP relaxation becomes infeasible. As in normal BB, a lower bound is provided by the lowest optimal solution of the LP relaxations in all open nodes, and the search continues until the upper and lower bounds are within a given tolerance. The LP/NLP-BB procedure, thus, only generates a single branch and bound tree, and is sometimes referred to as a single-tree OA.
Numerical results have shown that LP/NLP-BB technique can result in significantly fewer nodes than the total number of nodes explored in the multiple MILP master problems in OA (Duran and Grossmann 1986;Leyffer 1993). Implementations of the LP/NLP-BB algorithm have shown promising results, cf. Abhishek et al. (2010), Bonami et al. (2008) or Mahajan et al. (2017).

Solver enhancement techniques
Most solvers are not based on a single algorithm but combines several techniques to improve its performance. In this section, we briefly describe some preprocessing techniques and primal heuristics that can be integrated with all the previously mentioned methods to improve the practical performance. Other important techniques to improve the performance include various cutting planes as well as different branching rules. For more details on cutting planes for convex MINLP see, e.g., Belotti et al. (2013), Bonami (2011) and (Kılınç et al. 2017). Overviews of branching rules are given by Linderoth and Savelsbergh (1999) and Achterberg et al. (2005).

Preprocessing
Preprocessing includes various techniques for modifying the problem into a form more favorable for the actual solver. The preprocessing procedures can result in tighter relaxations or reduce the problem size. Belotti et al. (2013) classified MINLP presolving techniques into two major categories: housekeeping and reformulations. Housekeeping includes techniques such as bound tightening and removal of redundant constraints, while reformulations can include techniques such as improvement of coefficients in the constraints and disaggregation of constraints.
There are two main approaches for tightening the variable bounds, feasibilitybased bound tightening (Shectman and Sahinidis 1998;Belotti et al. 2010), and optimization-based bound tightening (Liberti and Maculan 2006). Feasibility-based bound tightening analyzes the constraints sequentially to improve the variable bounds, whereas optimization-based bound tightening solves a sequence of relaxed problems where each individual variable is maximized and minimized to obtain optimal bounds. By reformulating the original problem, it is in some cases possible to obtain significantly tighter relaxations. Within MILP it is well known that different problem formulations can result in a tighter or weaker continuous relaxations; the uncapacitated facility location problem is a good example of when disaggregation of some constraints leads to a tighter continuous relaxation (Wolsey 1998). Similar techniques can also be used to obtain tighter relaxations for MINLP problems. Some types of nonlinear constraints can also be disaggregated to obtain a lifted reformulation of the problem, where the nonlinear constraint is split into several constraints by the introduction of new variables. Such lifted reformulations were proposed by Tawarmalani and Sahinidis (2005), where it was shown that a lifted reformulation results in tighter polyhedral outer approximations. In a recent paper by Kronqvist et al. (2018b), it was shown that the performance of several MINLP solvers, based on ECP, ESH, and OA, could be drastically improved by utilizing a reformulation technique based on lifting. Lifted reformulations of MINLP problems have also been studied by Hijazi et al. (2013), and Lubin et al. (2016). Some further reformulation techniques for MINLP problems are also presented in Liberti (2009).

Primal heuristics
Primal heuristics is a common term for algorithms and techniques intended to obtain good feasible solutions with relatively little computational effort compared to solving the original problem. The use of primal heuristics began in the field of MILP, and for instance Fischetti and Lodi (2011) claimed that primal heuristics were one of the most important improvements in MILP solvers within the last decade. In recent years, there has also been an interest in primal heuristics for MINLP problems and several algorithms have been proposed for this task. Such algorithms are, e.g., undercover (Berthold and Gleixner 2014), feasibility pumps ), rounding heuristics (Berthold 2014b), and the center-cut algorithm (Kronqvist et al. 2018a). Another technique for obtaining feasible solutions in solvers based on ECP, ESH or OA, is to check the alternative solutions in the solution pool provided by the MILP solver (Kronqvist et al. 2016). A detailed summary of several primal heuristics for MINLP problems is given by Berthold (2014a) andD'Ambrosio et al. (2012).
Finding a good feasible solution to an MINLP problem can improve the performance of MINLP solvers, as shown by the numerical results in Berthold (2014a) and Bernal et al. (2017). Having a good feasible solution can, e.g., reduce the size of the search tree in BB-based solvers and provide a tight upper bound. Obtaining a tight upper bound is especially important in solvers based on the ECP or ESH algorithm, because neither of the algorithms will in their basic form obtain a feasible solution before the very last iteration.

Solvers
This section is intended as an introduction to commonly available MINLP solvers, and to describe their main properties. Most of the solvers are not based on a single "pure" algorithm but they combine several techniques and ideas to improve their performance. On top of this, MINLP solver technology has evolved from the more mature NLP and MILP fields, and most MINLP solvers rely heavily on such subsolvers. Among the MILP solvers, the most recognized commercial solvers are CPLEX (IBM ILOG CPLEX Optimization Studio 2017), Gurobi (Gurobi 2018), and XPRESS (FICO 2017). The solvers GLPK (Makhorin 2008) and Cbc (Forrest 2005), the latter is a part of the COIN-OR initiative (Lougee-Heimer 2003), and are among the most recognized open-source solvers for MILP. All of these solvers implement an arsenal of methods within a branch and cut framework. In the NLP case, solvers like CONOPT (Drud 1994), Knitro (Byrd et al. 2006), Mosek (Andersen and Andersen 2000), and SNOPT (Gill et al. 2005) are well-known commercial options, and IPOPT (Wächter and Biegler 2006) is a well-known opensource solver (also part of the COIN-OR initiative). There exists more variability in the algorithms behind NLP solvers, e.g., CONOPT implements a Generalized Reduced Gradient (GRG) method, while IPOPT, Knitro, and Mosek use an interior-point method, and SNOPT uses a sequential quadratic programming (SQP) approach; see Biegler (2010) for a review in NLP.
Besides convexity, some of the solvers mentioned here also require an algebraic formulation of the problem. By analyzing the problem structure and applying different reformulations for instance it is, e.g., possible to obtain tighter relaxations. Furthermore, some of the NLP solvers also require the nonlinear functions to be twice continuously differentiable to guarantee convergence, which in turn imposes additional restrictions on some of the MINLP solvers.
In this section, we only mention the main features of the solvers, and for more details see the references given in the solver sections. A summary of solvers and software for MINLP problems was previously also given by Bussieck and Vigerske (2010). The solvers are implemented in a variety of programming languages, either available as standalone executables or libraries accessible from algebraic modeling software like GAMS, AMPL, and AIMMS. Other solvers have been implemented directly in the same programming languages as their modeling systems, e.g., MAT-LAB, Python-Pyomo, Julia-JuMP. The solvers used in the numerical comparison are listed in alphabetical order below.

AlphaECP
License type: Commercial Interfaces: GAMS, NEOS URL: www.gams.com/lates t/docs/S_ALPHA ECP.html AlphaECP (Alpha Extended Cutting Plane) is a solver based on the ECP algorithm developed by T. Westerlund's research group at Åbo Akademi University and implemented in GAMS by T. Lastusilta. By using the GAECP algorithm (Westerlund and Pörn 2002) the solver also has guaranteed convergence for pseudoconvex MINLP problems. AlphaECP mainly solves a sequence of MILP subproblems to generate a polyhedral outer approximation through cutting planes, but to speed-up convergence, it occasionally solves NLP subproblems with fixed integer variables as well.
To improve the capabilities of handling nonconvex problems, the algorithm also employs some heuristic techniques described in Lastusilta (2011). An important feature of AlphaECP is the technique to initially only solve MILP problems to feasibility Westerlund and Pörn (2002). This often results in a significant reduction in total solution time since fewer MILP subproblems are solved to optimality. AlphaECP can use all the NLP and MILP subsolvers available in GAMS.

ANTIGONE
License type: Commercial Interfaces: GAMS, NEOS URL: www.gams.com/lates t/docs/S_ANTIG ONE.html ANTIGONE (Algorithms for coNTinuous / Integer Global Optimization of Nonlinear Equations) is a global optimization solver developed by R. Misener and C.
A. Floudas at Princeton University. As a global solver, ANTIGONE is not limited to only convex problems but is also able to solve a variety of nonconvex problems. It uses reformulations and decomposes nonlinear functions into constant, linear, quadratic, signomial, linear fractional, exponential, and other general nonconvex terms. Convex relaxations are then generated for the decomposed nonconvex terms and the relaxations are solved in a branch and cut framework Floudas 2014, 2013). ANTIGONE uses the local solvers CONOPT or SNOPT for finding feasible solutions and CPLEX for lower bounding MILP relaxations. The solver also uses both feasibility-and optimality-based bound tightening to reduce the search space and obtain tighter relaxations.

AOA
License type: Commercial (source code available) Interfaces: AIMMS URL: www.aimms .com/engli sh/devel opers /resou rces/solve rs/aoa AOA (AIMMS Outer Approximation) is a module implemented in the AIMMS language (Hunting 2011). As the name suggests, the solver is based on OA and implements both normal OA and the LP/NLP-BB methods. The latter is the recommended one for convex problems and generates linearizations as lazy constraints utilizing MILP solver callbacks. To improve the performance and its capabilities for solving nonconvex problems, AOA may use nonlinear preprocessing and a multi-start technique. The source code of AOA is included in AIMMS, so the user can fully customize the algorithm (Roelofs and Bisschop 2018 (Ryoo and Sahinidis 1996;Tawarmalani and Sahinidis 2005). The solver uses a polyhedral branch and bound technique, and thus, solves LP relaxations in the BB nodes. However, BARON also uses MILP relaxations as described by Zhou et al. (2018) and Kılınç and Sahinidis (2018) and nonlinear relaxations (Khajavirad and Sahinidis 2018). Nonconvex problems are handled by generating convex underestimators and concave overestimators in combination with a spatial branch and bound technique. The solver utilizes automatic reformulations and convexity identification to decompose nonconvex functions into simpler functions with known convex or concave relaxations. The reformulations can also result in tighter lifted polyhedral outer approximations as shown by Tawarmalani and Sahinidis (2005). BARON also uses advanced bound tightening and range reduction techniques to reduce the search space and uses local search techniques as primal heuristics. BARON includes IPOPT, FilterSD or FilterSQP for solving NLP subproblems, and it can also utilize any available NLP solver in GAMS. Cbc, CPLEX or Xpress can be used as LP and MILP subsolvers.  (Bonami et al. 2008). The solver implements several algorithms, and the user is able to choose between NLP-BB, LP/NLP-BB, OA, feasibility pump, OA-based branch and cut, and a hybrid approach. Some computational results, as well as detailed descriptions of the main algorithms, are given by Bonami et al. (2008) and Bonami and Lee (2007). As subsolvers, BONMIN uses IPOPT for NLP and Cbc or CPLEX for MILP problems.  (Viswanathan and Grossmann 1990). In the equality relaxation, the nonlinear equality constraints are relaxed as inequalities using the signs of the corresponding Lagrangean multipliers, given by one of the NLP subproblems. The augmented penalty method relaxes the linearizations with slack variables which are penalized in the objective of the MILP master problem of OA. Both methods are intended as heuristics for nonconvex MINLP problems, although if the equality constraints relax as convex inequalities the methods become rigorous. A feasibility pump algorithm is implemented as a primal heuristic to improve the solver's performance (Bernal et al. 2017). DICOPT can use any available MILP and NLP subsolvers available in GAMS.

Juniper
License type: Open-source (MIT) Interfaces: JuMP URL: www.githu b.com/lanl-ansi/junip er.jl Juniper is an open-source MINLP solver implemented in Julia. It is developed by O. Kröger, C. Coffrin, H. Hijazi, and H. Nagarajan at Los Alamos National Laboratory. The solver implements an NLP-BB method with branching heuristics and primal heuristics, such as the feasibility pump (Kröger et al. 2018). The solver also uses parallelization capabilities available in Julia to solve multiple NLP subproblems in parallel. For nonconvex problems, it acts as a heuristic. It can use any NLP solver available in JuMP for solving subproblems, and it can optionally use a MIP solver for its feasibility pump.

Knitro
License type: Commercial Interfaces: AIMMS, AMPL, C++, C#, Fortran, Java, JuMP, GAMS, NEOS, Pyomo, Python, YALMIP URL: www.artel ys.com/knitr o Knitro is a commercial optimization software currently developed by Artelys (Byrd et al. 2006). Knitro includes several algorithms for dealing with continuous problems, such as interior-point and active-set algorithms. The solver uses BB techniques for problems with discrete variables, and the solver has three methods for dealing with MINLP problems. The first method uses an NLP-BB algorithm, and the second method is based on the LP/NLP-BB algorithm. The third method, based on a sequential quadratic programming approach, is mainly intended for problems with expensive function evaluations and can handle MINLP problems where the discrete variables are not relaxable, e.g., functions given by black-box simulators (Artelys 2018). By using default settings the solver will automatically choose which method to use.  Melo et al. (2018b). The solver has several algorithms implemented, e.g., ECP, ESH, OA, LP/NLP-BB, and NLP-BB, as well as some heuristics approaches (Melo et al. 2018a). The solver provides a platform for using the most common algorithms for convex MINLP with several customizable parameters for the end user. For solving the resulting MILP and NLP subproblems, Muriqui can use CPLEX, Gurobi, Xpress (FICO 2017), Mosek, Glpk (Makhorin 2008), IPOPT and Knitro. Muriqui also utilizes callback functionality and so-called lazy constraints in CPLEX and Gurobi to perform the single-tree search in LP/NLP-BB and in the hybrid algorithm.

Pavito
License type: Open-source (MPL 2.0) Interfaces: JuMP URL: www.githu b.com/julia opt/pavit o.jl Pavito is an open-source solver for convex MINLP implemented in Julia by C. Coey, M. Lubin and J. P. Vielma. Its functionality was previously part of the Pajarito solver for conic MINLP, but the NLP functionality was recently moved into the Pavito solver (Coey et al. 2018). Contrary to the other solvers presented in this manuscript, Pajarito uses a conic problem formulation and is based on a conic outer approximation algorithm (Lubin et al. 2016). Conic MINLP formulations can be built using the Disciplined Convex Programming (DCP) modeling paradigm, which may require a reformulation of the test instances. Because of this, Pajarito is left out of the numerical comparison and only Pavito is included. Pavito is based on an OA algorithm and has the functionality to perform a single-tree search similar to LP/NLP-BB by utilizing callbacks to the MILP subsolver. Pavito can use any MILP and NLP subsolver available in JuMP for solving subproblems.

SBB
License type: Commercial Interfaces: GAMS, NEOS URL: www.gams.com/lates t/docs/S_SBB.html SBB (Simple Branch and Bound) is a solver in GAMS based on NLP-BB, developed by ARKI Consulting and Development A/S. SBB is based on a BB method that solves nonlinear relaxations in the form of NLP problems at each node. For nonconvex MINLP problems it works as a heuristic approach without convergence guarantees. For improved robustness the solver has functionality for dealing with NLP solver failures, by changing either subsolver or subsolver parameters. The solver also uses primal heuristics through the GAMS Branch-Cut-and-Heuristic facility (GAMS 2018). SBB can use any of the available NLP solvers in GAMS for solving the relaxed subproblems. However, it works best with solvers that take advantage of a near-optimal starting point such as CONOPT, Minos and SNOPT.

SCIP
License type: Free for academic use (ZIB academic license); commercial Interfaces: Standalone; AMPL, C, GAMS, JuMP, MATLAB, NEOS, Java, Pyomo, Python URL: scip.zib.de SCIP (Solving Constraint Integer Programs) was originally developed by T. Achterberg at the Zuse Institute Berlin in cooperation with TU Darmstadt, RWTH Aachen, and University of Erlangen-Nürnberg, as a general framework based on branching for constraint integer and mixed-integer programming using branch-cut-and-price, cf. Achterberg (2009). The solver is intended to be modular and it utilizes plugins to make it easy to modify . SCIP was extended by Vigerske and Gleixner (2018) to solve convex and nonconvex MINLP problem by utilizing polyhedral outer approximations and a spatial branch and bound technique. The solver uses LP relaxations and cutting planes to provide strong dual bounds, while using Constraint Programming to handle arbitrary (non-linear) constraints and propagation to tighten domains of variables. A variety of primal heuristics and bound tightening techniques are also utilized in the solver. SCIP includes SoPlex for solving the LP subproblems, but can also utilize, CLP, CPLEX, Gurobi, Mosek or XPress if available. Furthermore, the solver uses IPOPT for solving NLP subproblems in the nonlinear strategy. The supporting hyperplanes are then dynamically added by utilizing callbacks and lazy constraints, enabling the MILP solver to continue without rebuilding the branch and bound tree. If Cbc is used as MILP subsolver, a multi-tree strategy is used. The tight integration with the MILP solvers enables SHOT to fully benefit from their cut generating procedures, advanced node selection, and branching techniques. SHOT also includes the functionality to solve MIQP subproblems with CPLEX and Gurobi. The NLP problems are solved with either IPOPT or any of the applicable solvers in GAMS. A version of SHOT with reduced functionality, e.g., only utilizing the multi-tree approach, is also available for Wolfram Mathematica ).

Other MINLP solvers
Besides the solvers mentioned above, there are a few others solvers capable of solving convex MINLP problems that the authors are aware of. It should be noted that the solvers left out of the numerical comparison are not necessarily inferior compared to the other solvers. These solvers have been left out of the comparison due to one of the following reasons: not publicly available, not maintained within the last years, or not able to read the problem formats available in MINLPlib. bnb is a MATLAB implementation of NLP-BB by K. Kuipers at the University of Groningen. The solver uses the fmincon routine in MATLAB's Optimization Toolbox for solving the integer relaxed subproblems. The MATLAB code for the solver can be downloaded from www.mathw orks.com/matla bcent ral/filee xchan ge/95-bnb.
FICO Xpress-SLP is a solver currently developed by FICO (FICO 2017) and is available as both standalone binaries and a FICO Xpress-MOSEL module. The solver has an interface to Python and can be used in several other programming environments through the BCL Builder Component Library (FICO 2017). Xpress-SLP is a local solver designed for large scale nonconvex problems, and global optimality is only guaranteed for convex problems. For general MINLP problems, the solver uses a mixed integer successive linear programming (MISLP) approach. With the MISLP approach, it is, e.g., possible to use an NLP-BB technique where the NLP subproblem at each node is solved using a successive linear programming (SLP) technique. For certain types of problems, such as convex MIQP, MIQCQCP and MISOCP, the solver detects convexity and automatically reverts to FICO Xpress's purpose written solvers. The solver also includes some heuristic approaches for quickly obtaining solutions. More information about FICO and its solvers can be found at www.fico.com/en/produ cts/ fico-xpres s-optim izati on.
FilMINT is an MINLP solver developed by K. Abhishek, S. Leyffer and J. Linderoth based on the NLP/LP-BB algorithm (Abhishek et al. 2010). The solver is built on top of the MILP solver MINTO (Nemhauser et al. 1994) and uses filterSQP (Fletcher and Leyffer 1998) for solving NLP relaxations. By utilizing functionality in MINTO, FilMINT is able to combine the NLP/LP-BB algorithm with features frequently used by MILP solvers, such as cut generation procedures, primal heuristics, and enhanced branching and node selection rules. There is an AMPL interface available for FilMINT and the solver can also be used through the NEOS server. For more details, we refer to Abhishek et al. (2010).
fminconset is an implementation of NLP-BB in MATLAB by I. Solberg. The NLP subproblems are solved with MATLAB's fmincon routine in the Optimization Toolbox. The solver is available to download from www.mathw orks.com/matla bcent ral/filee xchan ge/96-fminc onset .
GAECP (Generalized Alpha Extended Cutting Plane) is a solver based on the GAECP algorithm (Westerlund and Pörn 2002) developed by T. Westerlund. The solver also uses supporting hyperplanes as in the ESH algorithm and is able to guarantee convergence for MINLP problems with nonsmooth pseudoconvex functions. The solver is described in detail in Westerlund (2018).
MILANO (Mixed-Integer Linear and Nonlinear Optimizer) is a MATLABbased MINLP solver developed by H. Y. Benson at Drexel University. There are two versions of the solver available; one uses an NLP-BB technique and the other is based on OA. The NLP-BB technique version uses an interior point NLP solver with warm-starting capabilities described in Benson (2011). The solver can be downloaded from www.pages .drexe l.edu/~hvb22 /milan o.

MindtPy (Mixed-Integer Nonlinear Decomposition Toolbox in Pyomo)
is an open-source software framework implemented in Python for Pyomo by the research group of I. Grossmann at Carnegie Mellon University. This toolbox implements the ECP, GBD, and OA algorithms, together with primal heuristics. It relies on Pyomo to handle the resulting MILP and NLP subproblems, allowing any of the solvers compatible with Pyomo to be used with MindtPy ). The toolbox is available in the following repository www.githu b.com/berna lde/pyomo /tree/mindt py.
MINLP_BB was developed by S. Leyffer and R. Fletcher as a general solver for MINLP problems (Leyffer 1999). The solver is based on NLP-BB and uses filterSQP for solving the continuous relaxations. There are interfaces to AMPL and Fortran. Furthermore, the solver can also be used in MATLAB through the TOMLAB optimization environment (Holmström 1999). More information about the solver is available at wiki.mcs.anl.gov/leyff er.
MINOPT (Mixed Integer Nonlinearl Optimizer) was developed by C.A. Schweiger and C.A. Floudas as a modelling language for a wide range of optimization problems (Schweiger and Floudas 1998). For MINLP problems MINOPT offered several algorithms, such as variants of OA and GBD.
MISQP (Mixed-Integer Sequential Quadratic Programming) is a solver based on a modified sequential quadratic programming (SQP) algorithm for MINLP problems presented by Exler and Schittkowski (2007). The solver is developed by K. Schittkowski's research group and the University of Bayreuth. MISQP is intended for problems where function evaluations may be expensive, e.g., where some function values are obtained by running a simulation. Unlike some of the other solvers, MISQP does not need to evaluate functions at fractional values for integer variables which can be an important property, e.g., for simulation-based optimization tasks. There is an interface in MATLAB through TOMLAB as well as a standalone Fortran interface. A more detailed description of the solver is available at tomwi ki.com/MISQP .  Nowak et al. (2002)). For more details on nonconvex MINLP see, e.g., Tawarmalani and Sahinidis (2002), Liberti and Maculan (2006) and Floudas (2000).

Benchmark details
The objective of the forthcoming two sections is to compare some of the convex MINLP solvers mentioned in the previous section by applying them on a comprehensive set of test problems. There are some benchmarks available in literature, e.g., Kronqvist et al. (2016), Bonami et al. (2012), Lastusilta (2011) and Abhishek et al. (2010). However, these are limited to only a few of the solvers considered here or used a smaller set of test problems. The goal here is to give a comprehensive up-to-date comparison of both open-source and commercial solvers available in different environments. The main interest has been to study how the solvers perform on a desktop computer, and all the benchmarks were performed on a Linux-based PC with an Intel Xeon 3.6 GHz processor with four physical cores (able to process eight threads at once) and 32 GB memory.
We have allowed the solvers to use a maximum of eight threads to replicate a real-world situation where one tries to solve the problems with all the available resources. However, it is worth mentioning that the solvers and subsolvers utilize parallelism to different extents.
In the comparison, we have included three versions of BONMIN: BONMIN-OA based on OA, BONMIN-BB based on NLP-BB, and BONMIN-HYB, which is a variant of the LP/NLP-BB algorithm. We have also included two versions of Minotaur: Minotaur-QG based on the LP/NLP-BB algorithm, and Minotaur-BB based on NLP-BB. Two version of Knitro where considered: Knitro-QG based on LP/NLP-BB and Knitro-BB based on NLP-BB. These different versions of the same solver were included since they represent different approaches for solving the MINLP problems and their performance vary significantly. The solvers in the comparison are implemented in and used from different environments (GAMS, AIMMS, and Julia/JuMP), and the subsolvers available may vary. Where possible, we have tried to use CONOPT and IPOPT as the NLP solver, and CPLEX as the (MI)LP solver. The linear solver used in IPOPT was MA27 (HSL 2018) which is the default if available. For all applicable solvers, the latest version of CPLEX (12.8) was used; the exception is Minotaur which we only got working with version 12.6.3. Couenne warns against using another LP solver than CLP, which we respected since it actually improved performance significantly. For Minotaur, filterSQP was used as NLP subsolver as it is recommended over IPOPT, and overall performed better. A full list of solvers and subsolvers used are given in Table 1.
The termination criteria used with the solvers is the relative objective gap between the upper and lower objective bounds, where we used a tolerance of 0.1% with all the solvers. To make sure that the solvers did not terminate prematurely due to other built-in termination criteria and to avoid clear solver failures, some specific solver options were given; these are listed in Appendix B. Except for these, default settings were used for all solvers. Furthermore, a wall clock time limit of 900 s was also used with all solvers. Even with the 15-min time limit per problem, the total running time for the experiments was more than two weeks.

Problem sets
The problems considered here are from the problem library MINLPLib (MINLPLib 2018), which as of July 2018 consists of 1534 instances. These instances originate from many different sources and applications as indicated in the library. Out of the instances, we have chosen all problems that satisfy the following criteria: classified as convex, containing at least one discrete variable and some nonlinearity (either in the objective function or in the constraints). We also excluded the instance meanvarxsc utilizing semicontinuous variables not supported by all of the solvers. The smallinvSNPr* instances were also excluded, since they recently lost their convex classification in MINLPLib due to rounding errors in the problem format that actually made them slightly nonconvex. In total, there were 335 instances that satisfied the given criteria, and these constitute our master benchmark test set. Some statistics of the problems are available in Table 2 The problems selected represent a variety of different types of optimization problems with different properties such as number of variables, number of discrete variables, and number of nonlinear terms. Some of the problems also represent different formulations of the same problems, e.g., both big-M and convex hull formulation of disjunctions. It is therefore of interest to compare the solvers not only on the entire test set but also on smaller subsets with specific properties. We have partitioned the test set into groups, representing both integer and nonlinear properties, to compare both the solvers and algorithms for the different types of problems. The following criteria were used to partition the test problems into subsets:

Continuous relaxation gap
By solving a continuous relaxation of the MINLP problem and comparing the optimal objective value of the relaxed problem with the actual optimal objective value, we are able to determine the continuous relaxation gap. To avoid differences due to scaling, we use a relative value calculated as (8) Relative continuous relaxation gap = |z * −z| max {|z * |, 0.001} × 100%, where z * denotes the optimal objective value and z denotes to optimum of the continuous relaxation. The continuous relaxation gap varies significantly for the test problems: some instances have a gap larger than 1000% and for some instances it is smaller than 1%. Based on the gap, given by Eq. (8), we have divided the test problems into two subsets: problems with a large gap ( ≥ 50% ) and problems with a small gap ( < 50% ). According to this classification, there are 151 problems with a large gap (average gap 188%) and 184 with a small gap (average gap 7.2%).

Nonlinearity
Some of the test problems are almost linear with only a few nonlinear terms, whereas some test problems are nonlinear in each variable. The test problems are here classified based on the following nonlinearity measure where n nonlin is the number of variables involved in a nonlinear term and n tot is the total number of variables. The test problems are divided into the following two categories: problems with high degree of nonlinearity ( ≥ 50% ), and problems with low degree of nonlinearity ( < 50% ). The set with high degree of nonlinearity contains 103 problems with an average nonlinearity measure of 89%, while the set with low degree of nonlinearity contains 232 problems with an average nonlinearity measure of 14%.

Discrete density
The number of discrete variables also varies significantly in the test problems. Some problems contain only a few discrete variables, while others contain only discrete variables. To avoid a division based mainly on the problem size, we have chosen to divide the problems based on the following measure (9) Degree of nonlinearity = n nonlin n tot × 100%, Here n int and n bin are the number of integer and binary variables, and n tot is the total number of variables. Again the test problems are divided into two subsets: problems with a high discrete density ( ≥ 50% ) and problems with a low discrete density ( < 50% ). The first category contains 120 problems with an average discrete density of 81%, and the second category contains 215 problems with an average density of 27%.
A list of the problems in each category is given in "Appendix A", which also shows the continuous relaxation gap, degree of nonlinearity, and discrete density for each test problem. Scatter plots are also presented in "Appendix A" that show there is little to no correlation between the problem categories.

Reporting
All the results were analyzed using PAVER (Bussieck et al. 2014), which is a tool for comparing the performance of optimization solvers and analyzing the quality of the obtained solutions. The reports generated by PAVER, as well as all the results obtained by the individual solvers are available at andre aslun dell.githu b.io/ minlp bench marks . The parameters used for generating the reports are also available within the reports.
A comment must be made regarding the choice of the parameter gaptol in PAVER, which was set to the value 1.002 × 10 −3 instead of the value used as termination criteria ( 1 × 10 −3 = 0.1% ). The small perturbation is needed due to differences in how the relative gap is calculated by the solvers. Some of the solvers calculate the relative gap by dividing the gap by the lower bound, whereas others divide by the smallest absolute value of either the upper or lower bound. For example, BARON and ANTIGONE would, without the small perturbation, seem to terminate prematurely on a large number of instances and these would all be marked as failed by PAVER.
PAVER also calculates so-called virtual best and virtual worst solvers. The virtual best solver is the best (in our graphs the fastest) successful solver selected for each individual problem instance, and the virtual worst is then the slowest for each instance. These virtual solvers provide a good comparison for how good or bad an individual solver is compared to all the solvers.
Since MINLPLib also provides a list of known optimal objective values, as well as upper and lower objective bounds, PAVER is able to compare the obtained solutions by the known bounds in MINLPLib. PAVER is, thus, also able to calculate the so-called primal gap, i.e., the difference between the obtained solution and the best-known integer solution, which can be used to analyze the quality of the obtained solutions. For example, there are cases where the solver returns the optimal solution, but it has not been able to verify optimality within the time limit. PAVER also uses known objective bounds available in MINLPLib to check whether the solvers obtained correct solutions and bounds for the test problems.

Results
The results are presented using solution profiles showing the number of individual problems that a solver is able to solve as a function of time. Note that the profiles do not represent the cumulative solution time, but shows how many individual problems the solvers can solve within a specific time. We have not used performance profiles where the time is normalized with respect to best solver (Dolan and Moré 2002) since these are not necessarily good for comparing several solvers as noted by Gould and Scott (2016). In all solution profiles in this section, we have chosen to divide the solvers into two categories to make the solution profiles more easily readable. The solvers are divided into MILP decomposition-based solvers and BB-based solvers. The division is not completely straightforward since some of the solvers could fit into both categories. However, the division is only intended to make it easier to read the results. The solvers classified as MILP decomposition-based solvers are AlphaECP, AOA, BONMIN-OA, DICOPT, Knitro-QG, Minotaur-QG, Muriqui, Pavito, and SHOT. Solvers classified as BB-based solvers are ANTIGONE, BARON, BONMIN-BB, BONMIN-HYB, Couenne, Juniper, Knitro-BB, LINDO, Minotaur-BB, SBB, and SCIP. The time scales are also divided into two parts to better highlight differences between the solvers, and it is linear in the first 10 s, and logarithmic between 10 and 900 s. In each plot, the solvers in the nonactive group are indicated with thin gray lines, while the others are as shown in the respective legends. The same line style is used for a specific solver in all figures. If there are several different strategies used with the same solvers, different line types (solid, dashed, dotted) are used while the color remain the same. In the right margin of each profile, the solvers are ranked according to the number of solved problems (as indicated within parenthesis). The virtual best and virtual worst solvers are shown in the figures as the top and bottom thick gray lines, and the region between them is shaded. Figures 1 and 2 show the solution profiles when applying the solvers on the complete set of test problems. As mentioned, the solution profiles indicate the number of problems that have been solved by the individual solvers as a function of time. A problem is defined as solved in this set of experiments if the relative objective gap, as calculated by PAVER, is ≤ 0.1002% (see note above). To better examplify the differences between the solvers, the same data is used for generating Table 3 which shows "snapshots" of the solution profile for different levels of the number of solved problems.
Out of the BB-based solvers, BARON is able to solve the most instances (306) within the time limit followed by SCIP (295) and Minotaur-BB (258). SHOT is able to solve the most problems out of the MILP decomposition-based solvers (312), closely followed by AOA (310) and Muriqui (301). SHOT and AOA are overall the fastest solvers for the test set. The virtual best solver is able to solve 326 of the problems while the virtual worst only manages to solve 39. The virtual best and worst solvers indicate a huge spread in the solvers' performance for different problems and highlight the importance of choosing a solver well suited for the problem type. The virtual best solver also shows that a large portion of the test problems are manageable for at least one of the solvers, while judging by the virtual worst solver, it is clear the test set is challenging. Figure 3 presents statistics regarding the termination of the solvers, e.g., how many errors and timeouts occurred. These values are as reported by the solver, but 1 3 A review and comparison of solvers for convex MINLP  Table 3  The table shows  also include solver crashes where no solution was returned. PAVER also verifies if the solver runs were completed successfully, e.g., by comparing the objective values returned to known values or bounds; if there is a discrepancy, these instances are given the status failed. Statistics on instances marked as failed are shown in Fig. 4. Figure 5 shows the number of problems where the solver was able to obtain a solution within 0.1% and 1% of the best-known solution, but not necessarily able to verify optimality. The figure shows that none of the solvers was able to obtain a solution within 1% of the best-known solution for all of the problems, given the 900 s time limit. For example, BARON was able to obtain a solution within 1% of the optimum for 317 problems, and SHOT obtained such a solution for 320 problems.
The number of instances solved to a relative objective gap, i.e., difference between the upper and lower objective bound, of 0.1%, 1% and 10%, per solver is shown in Fig. 6. By comparing Figs. 5 and 6, it can be observed that some of the solvers are able to obtain a solution within 0.1% of the optimum to significantly more problems than they are able to verify as optimal. For example, AlphaECP and ANTIGONE seems to be struggling with obtaining a tight lower bound for some of the problems, since they are able to obtain solutions within 0.1% for 300 (AlphaECP) and 275 (ANTIGONE) problems, while only verifying optimality for 276 and 236 instances respectively. Since ANTIGONE is a global solver without a user-selected convex strategy, it might fail to recognize some of the problems as convex, and therefore, generate weaker relaxations. This would explain the difference between the number of optimal solutions found and number of solutions verified as optimal.
Since it may be difficult to draw more detailed conclusions from the results in Figs. 1 and 2, the next sections consider subsets of test problems with specific properties. A summary of the results for the different subsets is given in Sect. 6.4. Fig. 3 The solution status returned from the solvers

Impact of the continuous relaxation gap
In this section, we consider problems with a large continuous relaxation gap and problems with a small continuous relaxation gap. Figure 7 shows the solution profiles of the solvers for the problems with a large gap, and Fig. 8 shows the solution Fig. 4 The number of solutions per solver flagged as failed by PAVER. Most often, the cause is that the returned solution is not within the bounds provided in MINLPLib   Fig. 5 The number of instances in the benchmark where the solvers found a a solution within 0.1%, 1% and 10% of the best known objective value profiles for those with a small gap. By comparing the figures, there is a clear difference for the solvers based on a BB approach, as these clearly perform better on the problems with a small gap compared to those with a large continuous relaxation gap. For example, BONMIN-BB is one of the most efficient solvers for the problems with a small gap, both in terms of speed and number of solved problems, while it is clearly out-performed by several solvers for the problems with a large gap.
The NLP-BB-based solvers, BONMIN-BB, Juniper, Knitro-BB, Minotaur-BB, and SBB solve significantly fewer problems with a large gap than the solvers based on either an ECP, ESH or OA (AlphaECP, BONMIN-OA, DICOPT, and SHOT). For problems with a large continuous relaxation gap, the BB trees may become larger and the more expensive subproblems in each node may make the NLP-BBbased solvers suffer performance-wise. Using a polyhedral approximation within a BB framework, BARON and SCIP are not as strongly affected by the relaxation gap; this could partially be due to having simpler subproblems at each node.
Overall, the MILP decomposition-based solvers seem to be less affected by the continuous relaxation gap then the BB-based ones. Several of the MILP decomposition-based solvers, such as AOA and SHOT, are closely integrated with the MILP subsolver (CPLEX in this case), and rely on it for handling the integer requirements. This close integration enables the usage of several advanced features from the more mature MILP solvers, while NLP-BB-based solvers often need to manage branching, handling the BB tree, cut generation, and other techniques on their own. One can expect this advantage to be more important for problems that are challenging due to the integer requirements, which is often a trait of problems with large continuous relaxation gaps.
As an example of the impact on the performance of a MILP decomposition-based solver that handle the integer requirements itself, consider Minotaur-QG (as it only uses CPLEX as a LP subsolver): The performance difference between Minotaur-QG and AOA are significant, even though they utilize the same basic algorithm.

Impact of nonlinearity
In this section, problem types with a high and low degree of nonlinearity are compared, and the results are shown in Figs. 9 and 10. Several of the solvers use linearizations to approximate the nonlinear functions in some steps of the solution procedure, whereas solvers using an NLP-BB approach directly treats the nonlinearity in each node of the BB tree. As expected, most of the solvers utilizing linearizations perform significantly better on the problems with a low degree of nonlinearity, since BONMIN-OA, DICOPT, and SCIP are among the most efficient solvers in terms of both speed and number of problems solved. However, for problems with a high degree of nonlinearity they are outperformed by the NLP-BB-based solvers BON-MIN-BB, Knitro-BB, Minotaur-BB, and SBB.
SHOT and the LP/NLP-BB-based solvers AOA, Minotaur-QG and Muriqui have quite similar behavior for both types of problems and perform well in both categories. These solvers rely on linearizations of the nonlinear constraints and one would, thus, expect them to be negatively affected by the degree of nonlinearity. However they all use a quite similar single-tree approach where NLP subproblems are solved in some of the nodes, which may help them to cope with problems with a high degree of nonlinearity. The LP/NLP-BB-based solver Knitro-QG also performs quite well for problems with a high degree of nonlinearity.
Most affected by the degree of nonlinearity seem to be the NLP-BB-based solvers. For problems with a high degree of nonlinearity they performed overall well, with several of the NLP-BB-based solvers being among the most efficient ones. Thus, in this case there seems to be a clear advantage of directly treating the nonlinearities. For the problems with a low degree of nonlinearity, however, the NLP-BBbased solvers did not perform as well in comparison with the other solvers.

Impact of discrete density
Finally, we compare how the solvers are affected by the relative number of discrete variables, i.e., integer and binary variables. Figures 11 and 12 show how the solvers perform for problems with high and low discrete density.
The MILP decomposition-based solvers perform similarly for both types of problems, and no obvious conclusions can be drawn from the results. However, again there is a clear difference for the NLP-BB-based solvers, and surprisingly many of these solvers performed better than the other ones on the high discrete density set of problems. One could expect a correlation between the discrete density and continuous relaxation gap, and that a high discrete density would result in a high continuous relaxation gap. However, as shown in Fig. 13 in "Appendix A", there is basically no correlation between the two for this set of test problems. Thus, by analyzing the results there is no clear reason for why the NLP-BB-based solvers perform better for the problems with a high discrete density, but one should keep in mind that the test set is somewhat limited.
1 3 Fig. 9 The solution profiles for problem instances with a high level of nonlinear variables as indicated in "Appendix A"

3
A review and comparison of solvers for convex MINLP Fig. 10 The solution profiles for problem instances with a low level of nonlinear variables as indicated in "Appendix A" 1 3 Fig. 11 The solution profiles for problem instances with a high level of discrete variables as indicated in "Appendix A"

3
A review and comparison of solvers for convex MINLP Fig. 12 The solution profiles for problem instances with a low level of discrete variables as indicated in "Appendix A" Both BARON and SHOT perform well on both sets of test problems, while they perform somewhat better on problems with a low discrete density. The OA approach seems to be well suited for the problems with a low discrete density, where DICOPT is one of the most efficient solvers and BONMIN-OA also manages to solve a large portion of the problems. AOA and Muriqui, both integrating the LP/NLP-BB method through callbacks and lazy constraints with the MILP solver, also perfoms well on both categories.

Summary of the results
How the solvers are affected by the continuous relaxation gap, degree of nonlinearity and discrete density is summarized in Table 4. The table shows the number of problems solved within each category as well as an indicator of how the solvers' performances were affected by the specific properties. The performance indicator tries to show how the performance of a solver is affected by the problem properties with respect to the other solvers. If a solver clearly performed better in a category, with respect to speed and number of solved problems, it is indicated by '+', and similarly '−' indicates that solver performed worse for that category of problems. If the performance is similar within both categories it is indicated by ' ∼ '. Each performance indicator has been chosen by comparing how a specific solver performed in both the high and low category with respect to the other solvers. Note that a '-' sign does not necessarily indicate that the solver performed poorly, it simply states that the solver did not perform as well as in the other category. For example, for the problems with a low degree of nonlinearity DICOPT is overall one of the fastest solvers. For the problems with a high degree of nonlinearity DICOPT also performs well, but not as well as in the other category, and this is indicated by a '+' and '−' sign in Table 4. That there are no significant changes for the performance of AOA with respect to both categories is indicated by ' ∼ ' sign.
These indicators were obtained by carefully analyzing the performance profiles, and are not intended as a grade of the solver but to show how it is affected by different problem properties. The results presented in Table 4 indicates that BB-based solvers seem to be more affected by the problem properties considered here compared to the MILP decomposition-based solvers. One possible explanation for the differences between the two types of solvers is that several of the MILP decomposition-based solvers rely heavily on the MILP subsolver, and benefits from several advanced features from the MILP solver.
Comparing the global solvers (ANTIGONE, BARON, Couenne, LINDO, and SCIP) with the convex solvers is not completely fair since the global solvers are able to solve a wider class of problems. In the numerical comparison, one should keep in mind that these global solvers are intended for a different (more general) type of problems. Some of these solvers do not have a convex option, and thus, they have access to less information about the problem and might treat it as nonconvex. For example, the performance difference of ANTIGONE and Couenne compared to BARON and SCIP, may be explained with the solvers treating some of the convex functions as nonconvex, and therefore, generate unnecessarily weak relaxations. BARON seems to be very efficient at identifying convex problems since it is able to deal with the problems in the benchmark set in such an efficient manner. Even if it is a global solver, capable of handling a variety of nonconvex problems, it is also one of the most efficient solvers for convex problems.
Furthermore, one should not draw any conclusions on how the solvers perform on nonconvex problems based solely on the results presented in this paper. For example, the convex strategies in AOA and SHOT, the two most efficient solvers for this set of problems, do not necessarily work well for nonconvex problems, and in fact there is another strategy in AOA intended as an heuristic for nonconvex problems. Some of the nonglobal solvers may actually work quite well as heuristics for nonconvex problems, of course without any guarantee of finding the global or even a feasible solution.

Conclusions
The comparisons presented in this paper are mainly intended to help the readers make informed decisions about which tools to use when dealing with different types of convex MINLP problems. In the previous sections, we have shown how 16 different Table 4 The table shows how the solvers are affected by the problem properties described in Sect. 5.1 If a specific solver performs better for one of the categories it is indicated by a '+' sign, and a '−' sign indicates that the solver performs worse on that specific category. If the solver performs similarly on both categories it is indicated by ' ∼ '. Furthermore, the number in each row shows the total number of problems that the solver was able to solve within a relative objective gap of 0.1% within 900 s solvers performed on a test set containing 335 MINLP instances. By comparing the solvers on MINLP instances with different properties we noticed significant differences in the solvers' performance. For example, the solvers based on NLP-BB were strongly affected by both the continuous relaxation gap and the degree of nonlinearity. Several of the solvers are based on the same main algorithms, although they differ significantly in terms of speed and number of problems solved. The differences are mainly due to different degrees of preprocessing, primal heuristics, cut generation procedures, and different strategies used by the solvers. The performance differences highlight the importance of such techniques for an efficient solver implementation. For the test set considered here, SHOT and AOA were the overall fastest solvers. Both of the solvers are based on a single-tree approach closely integrated with the MILP solver by utilizing callbacks to add the linearizations as lazy constraints. The results show the benefits of such a solution technique and support the strong belief in the single-tree approach by Abhishek et al. (2010) and Belotti et al. (2013). The close integration with the MILP solver allows AOA and SHOT to benefit from different techniques integrated within the MILP solver, such as branching heuristics, cut generation procedures, and bound tightening.
Overall, several of the solvers performed well on the test set and were able to solve a large portion of the problems. The most instances any solver could solve within the time limit was 312 instances, and by combining all the solvers we where able solve 326 of the 335 MINLP problems to a 0.1% guaranteed optimality gap. However, it should be noted that many of the test instances are quite small and simple compared to industry-relevant problems. Still today, real-world problems must often be simplified and reduced in size to obtain tractable formulations, in the process limiting the practical benefits of MINLP. Thus, in order to fully benefit from convex MINLP as a tool for design and decision-making, both further algorithmic research and solver software development are required. We also hope that this paper encourages MINLP users to submit their problems to the instances libraries, e.g., MINLPLib and www. minlp .org, to benefit both MINLP solver developers and end users. 3 Fig. 13 As can be seen from these scatter plots, there is little to no correlation between the three categories integer relaxation gap, nonlinearity and discrete density. Note that some outliers are missing

Appendix B
The options provided to the solvers (and subsolvers) in the benchmark are listed below. All other settings are the default values as provided by the individual solvers. Note that we have not tried to fine-tune any of the solvers; however, if there is a convex strategy, or recommended convex parameters we have used those. We have also modified limits and other parameters when it is apparent that implementation issues are the cause of, e.g., premature termination of a solver on some problem instances or numerical instability. For example, without adding the CONOPT option specified below, DICOPT and SBB fails to solve all the smallinv-instances in MINLPLib. Therefore, we believe that it is motivated to use these, since this problem occurs in the subsolver.