A disjunctive cut strengthening technique for convex MINLP

Generating polyhedral outer approximations and solving mixed-integer linear relaxations remains one of the main approaches for solving convex mixed-integer nonlinear programming (MINLP) problems. There are several algorithms based on this concept, and the efficiency is greatly affected by the tightness of the outer approximation. In this paper, we present a new framework for strengthening cutting planes of nonlinear convex constraints, to obtain tighter outer approximations. The strengthened cuts can give a tighter continuous relaxation and an overall tighter representation of the nonlinear constraints. The cuts are strengthened by analyzing disjunctive structures in the MINLP problem, and we present two types of strengthened cuts. The first type of cut is obtained by reducing the right-hand side value of the original cut, such that it forms the tightest generally valid inequality for a chosen disjunction. The second type of cut effectively uses individual right-hand side values for each term of the disjunction. We prove that both types of cuts are valid and that the second type of cut can dominate both the first type and the original cut. We use the cut strengthening in conjunction with the extended supporting hyperplane algorithm, and numerical results show that the strengthening can significantly reduce both the number of iterations and the time needed to solve convex MINLP problems.

Convex MINLP represents a highly successful subclass of optimization problems, e.g. algorithm developers often develop convex approximations of nonconvex engineering relationships (Geiler et al. 2015) or decompose their optimization problems into a series of convex MINLP problems (Lundell and Westerlund 2018;Nowak et al. 2018). A wide range of efficient solver software is developed specifically for convex MINLP (Grossmann et al. 2002;Bonami et al. 2008;Lastusilta 2011;Bernal et al. 2020;Lundell et al. 2020;Mahajan et al. 2017;Kröger et al. 2018;Melo et al. 2020). The success of convex MINLP derives from the seminal work of Duran and Grossmann (1986b) in developing the outer approximation (OA) algorithm. The work by Duran and Grossmann (1986b) became pivotal in solving convex MINLP problems because of the algorithm's strong convergence properties for a wide range of problem classes (Quesada and Grossmann 1992;Fletcher and Leyffer 1994) and its speed in solving practical problems (Bonami et al. 2008). In a recent benchmark by Kronqvist et al. (2019) it was shown that several of the most efficient convex MINLP solvers are based on the OA algorithm.
The concept of using an outer approximation of the nonlinear constraints for MINLP problems, developed by Duran and Grossmann (1986b) and Geoffrion (1972), forms the core of several other convex MINLP algorithms, e.g., extended cutting plane (ECP) (Westerlund and Petterson 1995; Westerlund and Pörn 2002), feasibility pump (Bonami and Gonçalves 2012), extended supporting hyperplane (ESH), and the center-cut algorithm (Kronqvist et al. 2018a). Further developments of the OA algorithm, incorporating quadratic approximations and regularization, have been presented by Su et al. (2018) and Kronqvist et al. (2018b). These algorithms could commonly be referred to as outer approximation type algorithms, although this classification is seldom used.
This paper focuses on deriving strong cutting planes for convex MINLP problems, resulting in tight outer approximations, by exploiting disjunctive structures in the problem. We use cuts obtained by the ESH algorithm as a basis, and we develop a framework for strengthening the cuts by considering the integer restrictions. The cut strengthening technique is not unique to the ESH algorithm and could also be used with an OA, ECP or generalized Benders decomposition (Geoffrion 1972) framework. The main motivation behind using the ESH algorithm is that the algorithm tends to generate a single strong cut per iteration. The ESH cuts are actually as tight as possible with regard to the nonlinear constraints, but they do not in general form supporting hyperplanes to the convex hull of all integer feasible solutions. Here we develop a framework for strengthening the ESH cuts, which results in two new types of cuts that are always as tight as or tighter than the ESH cut. The new cuts can give both a tighter representation of the nonlinear constraints and a tighter continuous relaxation. By obtaining a tighter outer approximation of the nonlinear constraints, we can reduce both the number of iterations and the time needed to solve problems.
Cutting planes that strengthen the continuous relaxation are nowadays an essential part of an efficient mixed-integer linear programming (MILP) solver (Achterberg and Wunderling 2013; Linderoth and Lodi 2011), and there is an active interest in developing similar cuts for convex MINLP. Disjunctive cutting planes for convex MINLP originate from the fundamental contributions of Ceria and Soares (1999) and Stubbs and Mehrotra (1999), and further developments are presented by Trespalacios and Grossmann (2016). Lift-and-project cuts were first introduced in MILP by Balas et al. (1993), and this technique has later been adopted within convex MINLP. By linearizing the constraints, a polyhedral outer approximation can be used to derive lift-and-project cuts through a cut generating LP (Zhu and Kuno 2006; Bonami 2011; Kılınç et al. 2017; Serra 2020). An alternative approach is presented by Lodi et al. (2019), where they obtain cuts directly by solving cut generating conic programs. Other types of cuts used within MINLP include different types of mixed-integer rounding cuts (Gomory 1960; Atamtürk and Narayanan 2010), reformulation linearization technique (RLT) based cuts (Sherali and Adams 2013; Misener et al. 2015), and split cuts (Modaresi et al. 2015).
The cut strengthening techniques presented here can be viewed as an alternative approach to the previously mentioned lift-and-project and disjunctive cuts. However, our cut strengthening procedure is more focused on obtaining a tight MILP relaxation, than getting the best improvement for the continuous relaxation. The cuts are generated by selecting a disjunction of the MINLP problem and strengthening an ESH cut over the convex hull of the selected disjunction. Trespalacios and Grossmann (2016) use a somewhat similar idea, where they derive a supporting hyperplane for a nonlinear disjunction by solving a separation problem. Instead of solving a separation problem, we strengthen the ESH cut by deriving the smallest possible right-hand side values to the ESH cut that are still valid for each term of the disjunction. This enables us to effectively use individual right-hand side values for each term of the disjunction, making the cut tight for each disjunct. A similar approach is used by Trespalacios and Grossmann (2015) to construct tighter big-M reformulations of generalized disjunctive programs. We determine right-hand side values of the cuts by solving independent convex NLP problems in the original variable space and do not rely on the convex hull formulation of the disjunctions. By doing so, numerical difficulties associated with the perspective function are avoided and instead of solving a larger (lifted) problem, we solve several smaller independent (parallelizable) problems. This approach also enables us to identify some infeasible integer assignments and to handle numerical tolerances in a straightforward fashion. To the authors' best knowledge, this is a novel cut strengthening technique for convex MINLP.
The paper is organized as follows. Section 2 gives a short description of the ESH algorithm, along with the necessary assumptions on the MINLP problems. Section 3 presents the theory and techniques used for the cut strengthening, and a cut strengthening algorithm is presented in Sect. 4. Section 5 presents an algorithm for solving convex MINLP problems that combines the ESH algorithm with the cut strengthening techniques. Finally, some numerical results are presented in Sect. 6.

Background
First, we define the class of problems considered within the paper and state the assumptions needed to guarantee convergence of the ESH algorithm. The disjunctive structure that the cut strengthening technique builds upon is also presented in this section. The second part of this section briefly describes the ESH algorithm, which is later used to generate cuts and forms the basis of the convex MINLP algorithm in Sect. 5.

Problem statement
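The most commonly used, and most practical, definition of a convex MINLP problem is that all of the nonlinear constraints and the objective are given by convex functions (Gupta and Ravindran 1985; Quesada and Grossmann 1992; Westerlund and Petterson 1995). Throughout the paper, we use this definition of convexity. Without loss of generality, we only consider convex MINLP problems with the following structure
$$\begin{aligned} \min\ & \mathbf{c}^\top \mathbf{x} \\ \text{s.t.}\ & g_j(\mathbf{x}) \le 0 \quad \forall j = 1, \dots, l, \\ & \mathbf{A}\mathbf{x} \le \mathbf{a}, \\ & x_i \in \mathbb{Z} \quad \forall i \in I_{\mathbb{Z}}, \quad \mathbf{x} \in \mathbb{R}^n, \end{aligned} \tag{MINLP}$$
where $g_j : \mathbb{R}^n \to \mathbb{R}$ are convex continuously differentiable functions. Here, $I_{\mathbb{Z}}$ is a set containing the indices of all the integer variables. To clarify the notation, $x_i$ refers to the i-th element of the variable vector $\mathbf{x}$. The feasible set defined by the nonlinear constraints will be referred to as the nonlinear feasible set, and it is given by
$$N = \{\mathbf{x} \in \mathbb{R}^n \mid g_j(\mathbf{x}) \le 0 \quad \forall j = 1, \dots, l\}.$$
To simplify the notation, we will also introduce a set L defined by the linear constraints and a set Y given by the variable domains
$$L = \{\mathbf{x} \in \mathbb{R}^n \mid \mathbf{A}\mathbf{x} \le \mathbf{a}\}, \qquad Y = \{\mathbf{x} \in \mathbb{R}^n \mid x_i \in \mathbb{Z} \quad \forall i \in I_{\mathbb{Z}}\}.$$
To ensure convergence of the ESH algorithm, we need to make the following assumptions on problem (MINLP).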

Assumption 1
The linear constraints form a compact set.
For the cut strengthening procedure, we make the following assumption on the problem structure.

Assumption 3
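The MINLP problem contains at least one exclusive selection constraint of binary variables, i.e., $\exists\, I_D \subset I_{\mathbb{Z}} : x_i \in \{0, 1\}\ \forall i \in I_D$, and either one of the constraints
$$\sum_{i \in I_D} x_i = 1, \tag{2}$$
$$\sum_{i \in I_D} x_i \le 1, \tag{3}$$
appears in the problem.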
For the sake of simplicity and clarity, we will throughout the paper only focus on the exclusive selection constraint (2). The second type of exclusive selection constraint (3), can trivially be converted into the first type by introducing a slack binary variable and can be handled by the same approach.
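The conversion mentioned above can be written out as a short worked equivalence. This is only a sketch: it takes constraint (3) to be the inequality form of the exclusive selection constraint, and the slack variable name $s$ is an assumption introduced here for illustration.

```latex
\sum_{i \in I_D} x_i \le 1,\quad x_i \in \{0,1\}\ \ \forall i \in I_D
\qquad\Longleftrightarrow\qquad
\sum_{i \in I_D} x_i + s = 1,\quad x_i \in \{0,1\}\ \ \forall i \in I_D,\quad s \in \{0,1\}.
```

Here $s = 1$ corresponds to the case where none of the original binary variables is selected, so the index of $s$ can simply be added to $I_D$ and treated as one more term of the selection.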
The exclusive selection constraints arise, for example, from the representation of disjunctive constraints through the so-called big-M or convex hull formulation (Balas 1979;Raman and Grossmann 1994;Trespalacios and Grossmann 2014). Note that we do not restrict all of the integer variables to be binary variables, nor do we assume the problems to have disjunctive constraints of a specific type. The cut strengthening simply requires that the problem contains at least one exclusive selection constraint, which is used for strengthening the cut. However, the cut strengthening is most powerful in case the problem contains the big-M constraints, resulting in a weak continuous relaxation. Therefore, we focus on problems containing big-M constraints.
For the cut strengthening to be computationally efficient, the number of elements in $I_D$ should be smaller than the number of elements in $I_{\mathbb{Z}}$. Throughout the paper, we also assume that the main challenges in solving problem (MINLP) arise from the integer restrictions. Consequently, we assume that a continuous relaxation of the problem is significantly easier to solve than the MILP relaxations used by OA, ECP, and ESH. This is often the case for convex MINLP problems, as shown, for example, by the numerical results in Muts et al. (2020) and Su et al. (2015).

The extended supporting hyperplane algorithm
The ESH algorithm was presented by Kronqvist et al. (2016) as a method for solving convex MINLP problems, and it builds upon ideas presented by Veinott Jr (1967). It was proven by Eronen et al. (2017) that the ESH algorithm is directly applicable to nonsmooth MINLP problems with constraints given by pseudoconvex functions. Properties of the ESH algorithm have also been further analyzed by Serrano et al. (2019). The ESH algorithm constructs a tight polyhedral outer approximation of the nonlinear feasible set N by generating supporting hyperplanes to the set. The polyhedral outer approximation at iteration k is given by
$$N^k = \{\mathbf{x} \in \mathbb{R}^n \mid g_j(\bar{\mathbf{x}}^i) + \nabla g_j(\bar{\mathbf{x}}^i)^\top (\mathbf{x} - \bar{\mathbf{x}}^i) \le 0 \quad \forall i = 1, \dots, k,\ \forall j \in A_i\},$$
where $\bar{\mathbf{x}}^i$ are points on the boundary of N and $A_i$ contains the indices of all constraints active at $\bar{\mathbf{x}}^i$. From convexity it directly follows that $N \subseteq N^k$, and $N^k$ is commonly referred to as an outer approximation of N.
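A new trial solution $\mathbf{x}^{k+1}$ is obtained by solving the following MILP relaxation
$$\mathbf{x}^{k+1} \in \arg\min_{\mathbf{x} \in N^k \cap L \cap Y} \mathbf{c}^\top \mathbf{x}. \tag{MILP-r}$$
A lower bound on the optimal objective value of problem (MINLP) is given by $\mathbf{c}^\top \mathbf{x}^{k+1}$, where $\mathbf{x}^{k+1}$ is an optimal solution to the MILP relaxation. The trial solutions obtained by solving problem (MILP-r) will all be outside of the nonlinear feasible set N before the very last iteration. Therefore, linearizing the nonlinear constraints at the trial solutions $\mathbf{x}^k$ would, in general, not form supporting hyperplanes to N and would result in weaker cuts. To obtain supporting hyperplanes, ESH performs an approximative projection of the trial solution $\mathbf{x}^k$ onto N ∩ L. A point in the interior of N ∩ L is needed for the projection, and such a point is obtained by solving the convex continuous problem
$$\mathbf{x}_{\text{int}} \in \arg\min_{\mathbf{x} \in L,\ \mu \in \mathbb{R}} \mu \quad \text{s.t.} \quad g_j(\mathbf{x}) \le \mu \quad \forall j = 1, \dots, l.$$
For the approximative projection of $\mathbf{x}^k$, we define the one-dimensional function
$$F^k(\lambda) = \max_{j} \left\{ g_j\left(\lambda \mathbf{x}^k + (1 - \lambda)\mathbf{x}_{\text{int}}\right) \right\}$$
for $\lambda \in [0, 1]$. Using a simple root-search algorithm we can obtain a $\lambda^k$ such that $F^k(\lambda^k) = 0$. The approximative projection of $\mathbf{x}^k$ onto N ∩ L is then given by
$$\bar{\mathbf{x}}^k = \lambda^k \mathbf{x}^k + (1 - \lambda^k)\mathbf{x}_{\text{int}}.$$
Now, if the active constraints are linearized at $\bar{\mathbf{x}}^k$ we obtain the following cuts
$$g_j(\bar{\mathbf{x}}^k) + \nabla g_j(\bar{\mathbf{x}}^k)^\top (\mathbf{x} - \bar{\mathbf{x}}^k) \le 0 \quad \forall j \in A_k,$$
which form supporting hyperplanes to N ∩ L. The supporting hyperplanes are then added to the current polyhedral outer approximation to form $N^{k+1}$, which ensures that $\mathbf{x}^k \notin N^{k+1}$.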
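The approximative projection and the resulting supporting hyperplane can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the segment between the interior point and the trial point is parametrized so that the parameter 0 gives the interior point and 1 gives the trial point, the root search is plain bisection, and the function names (`project_to_boundary`, `supporting_hyperplane`) and the unit-disc toy constraint are hypothetical.

```python
def max_constraint(gs, x):
    """F(x) = max_j g_j(x); x lies in the nonlinear feasible set iff F(x) <= 0."""
    return max(g(x) for g in gs)


def project_to_boundary(gs, x_int, x_trial, tol=1e-10):
    """Bisection along the segment from x_int (F < 0) to x_trial (F > 0).

    Returns a feasible point whose segment parameter is within tol of the root.
    """
    def point(lam):
        return [(1.0 - lam) * a + lam * b for a, b in zip(x_int, x_trial)]

    lo, hi = 0.0, 1.0  # invariant: F(point(lo)) <= 0 < F(point(hi))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_constraint(gs, point(mid)) <= 0.0:
            lo = mid
        else:
            hi = mid
    return point(lo)


def supporting_hyperplane(g, grad, x_bar):
    """Rewrite g(x_bar) + grad(x_bar)^T (x - x_bar) <= 0 as alpha^T x <= beta."""
    alpha = grad(x_bar)
    beta = sum(a * xb for a, xb in zip(alpha, x_bar)) - g(x_bar)
    return alpha, beta


# Hypothetical toy constraint: the unit disc, g(x) = x1^2 + x2^2 - 1 <= 0.
g = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
grad = lambda x: [2.0 * x[0], 2.0 * x[1]]

x_bar = project_to_boundary([g], x_int=[0.0, 0.0], x_trial=[2.0, 0.0])
alpha, beta = supporting_hyperplane(g, grad, x_bar)
```

In this toy case the cut is approximately $2x_1 \le 2$, which touches the disc at the projected point and cuts off the trial point $(2, 0)$. Keeping the feasible end of the bisection interval means the returned point never lies outside the nonlinear feasible set.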
The ESH algorithm repeats the procedure of solving (MILP-r) and improving the outer approximation by generating supporting hyperplanes. To improve the computational performance, the algorithm starts by further relaxing (MILP-r) and solving LP relaxations to quickly generate an outer approximation. For more details and computational enhancements on the ESH algorithm see Lundell et al. (2018).
The cuts generated by the ESH algorithm are as tight as possible with regards to N ∩ L . However, there is no guarantee that the algorithm generates supporting hyperplanes to the convex hull of N ∩ L ∩ Y . Therefore, it can be possible to further strengthen the cuts by considering the integrality restrictions. To illustrate the possible strengthening of the cuts, consider the following example The example contains the disjunctive constraint that the (x 1 , x 2 )-variables must be within one of three circles, which is represented by the big-M formulation. The value 29.944 is, in this case, the tightest common value for the big-M coefficients. A stronger problem formulation could simply be obtained by using individual M values for each constraint, which can easily be determined as described in the Appendix. We only use the weaker formulation in order to better highlight differences between the cuts. Figure 1 shows the feasible set of problem (EX1) along with the continuously relaxed feasible set projected down onto the (x 1 , x 2 )-space.
In the first iteration, the ESH algorithm will generate the cut given by Eq. (8), which forms a supporting hyperplane to N ∩ L but not a supporting hyperplane to the convex hull of N ∩ L ∩ Y. From Fig. 1, it is clear that the cut given by Eq. (8) is not as tight as possible when considering the integer properties. In the next section, we present a technique to further tighten the cut by utilizing the disjunctive structures of the MINLP problem.

Cut strengthening
From the example in the previous section, it can be observed that the ESH cut can be tightened by simply reducing the right-hand side and still remain valid for the integer feasible set, i.e., N ∩ L ∩ Y. To reduce the right-hand side, we will consider an exclusive selection constraint, see Assumption 3, and determine the smallest right-hand side values for each selection. This enables us to strengthen the cut by reducing the right-hand side alone or to further strengthen the cut by assigning individual right-hand side values for each assignment of the exclusive selection constraint.
First, we select an index set $I_D \subset I_{\mathbb{Z}}$ that contains the indices of all the binary variables included in an exclusive selection constraint of the MINLP problem. By using the ESH algorithm we obtain the cut
$$\boldsymbol{\alpha}^\top \mathbf{x} \le \beta, \tag{9}$$
which forms a tight valid inequality for N ∩ L. To tighten cut (9), consider the following disjunctive programming (DP) problem
$$\begin{aligned} z^* = \max\ & \boldsymbol{\alpha}^\top \mathbf{x} \\ \text{s.t.}\ & \mathbf{x} \in N \cap L, \\ & \bigvee_{i \in I_D} \left[ x_i = 1,\ x_j = 0 \ \ \forall j \in I_D \setminus i \right]. \end{aligned} \tag{10}$$
This DP problem can be solved as a convex NLP through the convex hull formulation (Ceria and Soares 1999; Stubbs and Mehrotra 1999; Lee and Grossmann 2000). Formulating problem (10) as a convex NLP through a convex hull formulation can cause numerical difficulties, such as division by zero and non-smoothness (Sawaya and Grossmann 2007), and the problem will contain $|I_D|$ copies of the variables. Instead of solving (10) as a single large problem, we solve it as smaller individual convex problems by considering the following alternative formulation of problem (10)
$$z^* = \max_{i \in I_D} \left\{ \max\ \boldsymbol{\alpha}^\top \mathbf{x} \ \ \text{s.t.} \ \ \mathbf{x} \in N \cap L,\ x_i = 1,\ x_j = 0 \ \ \forall j \in I_D \setminus i \right\}. \tag{11}$$
By solving each inner problem of (11) separately, we can determine $z^*$ as the largest of the optimal values $b_i$ of the inner problems. This approach requires $|I_D|$ independent convex NLP problems to be solved, but computationally it can be more efficient than solving a single problem with $|I_D|$ copies of the variables. Using $z^*$ as the new right-hand side value of cut (9), we form the tightened cut
$$\boldsymbol{\alpha}^\top \mathbf{x} \le z^*. \tag{12}$$
Fig. 1 The dark circles show the feasible set of problem (EX1) projected onto the (x 1 , x 2 )-space. The light gray area in the left figure shows the feasible set of the continuous relaxation. The right figure also shows the projection of the outer approximation obtained by the first iteration of the ESH algorithm. Note that a supporting hyperplane to N ∩ L does not necessarily form a supporting hyperplane in a projected space, as shown in the figure

Proposition 1 The cut given by Eq. (12) forms a valid inequality for N ∩ L ∩ Y, and is at least as tight as the cut given by Eq. (9).
Proof From optimality of problem (10) it directly follows that cut (12) forms a supporting hyperplane to the feasible set of problem (10), which contains N ∩ L ∩ Y. Since the feasible set of problem (10) is contained within N ∩ L, it follows that $z^* \le \beta$. ◻
Solving (10) as smaller individual convex problems also enables us to further tighten the cut. To further strengthen the cut, we consider each term of the disjunction in problem (10) and form a convex NLP problem for each $i \in I_D$
$$\begin{aligned} b_i = \max\ & \boldsymbol{\alpha}^\top \mathbf{x} \\ \text{s.t.}\ & \mathbf{x} \in N \cap L, \\ & x_i = 1,\ x_j = 0 \ \ \forall j \in I_D \setminus i. \end{aligned} \tag{NLP-i}$$
Note that each problem (NLP-i) is a subproblem of problem (11). To simplify the derivation and analysis, we first assume that all $i \in I_D$ result in a feasible problem (NLP-i). Solving problem (NLP-i) for each $i \in I_D$ gives the values $b_i$ that can be used as individual right-hand side values for each integer assignment of the exclusive selection constraint (2). A new strengthened cut is then given by
$$\boldsymbol{\alpha}^\top \mathbf{x} \le \sum_{i \in I_D} b_i x_i, \tag{13}$$
and the properties of the new cut are presented in the following two theorems.

Theorem 1 The cut given by Eq. (13) forms a valid inequality for N ∩ L ∩ Y.
Proof The theorem is easily proven by contradiction. First, assume that there exists a point that violates cut (13), i.e.,
$$\exists\, \bar{\mathbf{x}} \in N \cap L \cap Y : \ \boldsymbol{\alpha}^\top \bar{\mathbf{x}} > \sum_{i \in I_D} b_i \bar{x}_i. \tag{14}$$
Due to the exclusive selection constraint, one and only one of the binary variables $\bar{x}_i$, $i \in I_D$, can be nonzero. Let j be the index of the nonzero binary variable, and the strict inequality (14) can now be written as
$$\boldsymbol{\alpha}^\top \bar{\mathbf{x}} > b_j. \tag{15}$$
By assumption, $\bar{\mathbf{x}}$ must satisfy all constraints of problem (NLP-i) with i = j. This implies that $b_j$ cannot be the optimal objective value of problem (NLP-i) with i = j, and this leads to a contradiction. ◻
Before analyzing the tightness of the cuts, we first describe our definition of a tighter cut. Here we consider cut (13) to be tighter than cut (12) in the sense that any $\mathbf{x}$ satisfying Eq. (13) will satisfy Eq. (12), but not vice versa. In integer programming, this tightness relation is commonly referred to as cut (13) strictly dominating cut (12), e.g., see Balas and Margot (2013).

Theorem 2 The cut given by Eq. (13) is always as tight as or tighter than the cut given by Eq. (12).
Proof Since $z^*$ is chosen as the maximum of the left-hand side $\boldsymbol{\alpha}^\top \mathbf{x}$ over all integer assignments of the exclusive selection constraint intersected with N ∩ L, it follows that $z^* = \max_{i \in I_D} \{b_i\}$. Therefore, each $b_i$ can be split into two parts $b_i = z^* - \delta_i$, where each $\delta_i \ge 0$. The cut given by Eq. (13) can now be written as
$$\boldsymbol{\alpha}^\top \mathbf{x} \le \sum_{i \in I_D} (z^* - \delta_i)\, x_i = z^* - \sum_{i \in I_D} \delta_i x_i, \tag{16}$$
where the equality follows from the exclusive selection constraint (2), proving that the cut is always as tight as cut (12). Furthermore, if a single $\delta_i > 0$, then the cut given by (13) will strictly dominate cut (12). ◻
Earlier we assumed that all $i \in I_D$ result in a feasible problem (NLP-i), which is not a necessary assumption for the cut strengthening. Finding such infeasible integer assignments enables us to remove the corresponding binary variable, as further described in the following proposition.

Proposition 2 If some $i \in I_D$ results in an infeasible problem (NLP-i), then the binary variable $x_i$ can be eliminated by permanently fixing the variable to zero.
Proof In problem (NLP-i) all variables, except those included in the exclusive selection constraint, are relaxed to continuous variables and they are only restricted by the original constraints. Variable $x_i$ is fixed to one, which automatically fixes the other variables in the exclusive selection constraint to zero. Therefore, the only case where problem (NLP-i) can be infeasible is when $x_i = 1$ is an infeasible partial integer assignment to the MINLP problem. ◻
To illustrate the difference between the two cuts, we again consider problem (EX1). By applying the cut strengthening technique to the cut given by the ESH algorithm, we can generate the following two cuts
$$5.920 x_1 + 4.536 x_2 + 29.944 x_3 \le 52.029, \tag{17}$$
$$5.920 x_1 + 4.536 x_2 \le (52.029 - 29.944)\, x_3 + 41.192 x_4 + 35.451 x_5. \tag{18}$$
The outer approximations given by the two different cuts are shown in Fig. 2. The figure shows a clear advantage of the second cut, resulting in a significantly tighter linear relaxation of the MINLP problem. However, comparing Figs. 1 and 2 shows that both cuts are significantly stronger than the standard ESH cut.
Fig. 2 The figures show the true feasible set of problem (EX1) and the continuously relaxed feasible set projected onto the (x 1 , x 2 )-space. The left figure shows the outer approximation given by cut (17) and the right figure shows the outer approximation given by cut (18)
In an outer approximation type algorithm, it is not only important to obtain a tight continuous relaxation, but also to obtain a tight MILP relaxation, i.e., a tight linear relaxation for given integer assignments. The two are obviously related, but it is possible to have a tight MILP relaxation with a weak continuous relaxation. To further illustrate the differences between the two types of cuts, we analyze how the feasible region of the cuts to problem (EX1) varies with the feasible integer assignments. Figure 3 shows the feasible region of the cut given by Eq. (17) for each feasible integer assignment. The figure shows that cut (17) is tight for one of the feasible integer assignments, but not as tight as possible for the other two. Figure 4 shows that the cut given by Eq. (18) forms a supporting hyperplane to the feasible set of each term of the disjunction in problem (EX1), i.e., for each feasible integer assignment the cut is as tight as possible. The example highlights the fact that the individually tightened cuts, i.e., cuts formed by Eq. (13), can give both significantly tighter continuous and MILP relaxations than the cut given by Eq. (12) and the original ESH cut.
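The difference between the single right-hand side of Eq. (12) and the individual right-hand sides of Eq. (13) can be reproduced on a small toy disjunction. The sketch below is an illustration under stated assumptions and is not problem (EX1): each term of the disjunction is a hypothetical disc in the plane, the cut has hypothetical coefficients (1, 1) on the continuous variables only, and the convex NLP for each term is replaced by the closed-form maximum of a linear function over a disc, i.e. the value at the center plus the radius times the Euclidean norm of the coefficient vector.

```python
import math


def disc_rhs(alpha, center, radius):
    """Closed-form max of alpha^T x over the disc ||x - center|| <= radius."""
    norm = math.hypot(alpha[0], alpha[1])
    return alpha[0] * center[0] + alpha[1] * center[1] + radius * norm


def strengthen(alpha, discs):
    """Return the per-term values b_i and the common value z* = max_i b_i."""
    b = [disc_rhs(alpha, center, radius) for center, radius in discs]
    return b, max(b)


def satisfies_single(alpha, z_star, x, tol=1e-9):
    """Check the cut with one common right-hand side: alpha^T x <= z*."""
    return alpha[0] * x[0] + alpha[1] * x[1] <= z_star + tol


def satisfies_multi(alpha, b, x, y, tol=1e-9):
    """Check the cut with individual right-hand sides: alpha^T x <= sum_i b_i y_i."""
    lhs = alpha[0] * x[0] + alpha[1] * x[1]
    return lhs <= sum(bi * yi for bi, yi in zip(b, y)) + tol


# Hypothetical data: three discs of radius 1, selected by binaries y = (y1, y2, y3).
alpha = (1.0, 1.0)
discs = [((0.0, 0.0), 1.0), ((3.0, 0.0), 1.0), ((0.0, 4.0), 1.0)]
b, z_star = strengthen(alpha, discs)

# A point outside the second disc, paired with the assignment that selects it.
x_out, y_second = (3.5, 1.5), (0, 1, 0)
single_ok = satisfies_single(alpha, z_star, x_out)      # not cut off by the common rhs
multi_ok = satisfies_multi(alpha, b, x_out, y_second)   # cut off by the individual rhs
```

With these numbers the common right-hand side is set by the third disc, so the single-value cut is loose for the other two assignments, while the multi-value cut is tight for each of them.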
In this section, we have presented a framework for strengthening cuts obtained by the ESH algorithm. However, the same approach can also be used to strengthen cuts obtained by a similar algorithm, such as ECP, OA or generalized Benders decomposition. The next section will focus more on the computational aspects, and how to practically utilize the cut strengthening framework within a solver.

A cut strengthening algorithm
This section focuses on the computational aspects and how to utilize the cut strengthening techniques from the previous section in an algorithm. We present a simple strategy for selecting one out of multiple exclusive selection constraints, and describe some computational enhancements along with a discussion on how to deal with tolerances.
Fig. 4 The feasible region of cut (18) for each feasible integer assignment in the (x 1 , x 2 )-space
The cut strengthening techniques in the previous section utilize the exclusive selection constraint (2) to tighten cuts of the type given by Eq. (9). However, MINLP problems can contain multiple exclusive selection constraints, e.g., originating from multiple disjunctive constraints. Given a cut, there is a choice of which exclusive selection constraint and the corresponding variables to choose for the tightening procedure. Ideally one wants to choose the exclusive selection constraint with the binary variables $x_i$ for $i \in I_D$ such that the coefficients $b_i$ obtained by solving (NLP-i) are as small as possible. However, such an optimal choice cannot trivially be determined, and instead, we will make the choice based on the variable connections.
Suppose that we have obtained cut (9), which is given by linearizing the nonlinear constraint $g_j(\mathbf{x}) \le 0$. To compare the different exclusive selection constraints, and their corresponding variables $x_i$ for $i \in I_D$, we check the connections of the variables $x_i$ for $i \in I_D$ to the constraint $g_j(\mathbf{x}) \le 0$. Here we consider two types of connections, direct connections and step-one connections. Variable $x_i$ is directly connected to $g_j(\mathbf{x}) \le 0$ if the variable is included in the constraint. In a step-one connection, the variable $x_i$ is included in another constraint (linear or nonlinear) that has at least one variable in common with $g_j(\mathbf{x}) \le 0$. The number of direct connections in an exclusive selection constraint is given by the number of variables in $I_D$ that are directly connected to the nonlinear constraint $g_j(\mathbf{x}) \le 0$, and similarly for the step-one connections. Here, we use the following heuristic rule for selecting an exclusive selection constraint.
A feasible solution $\hat{\mathbf{x}}$ to the MINLP problem can also be utilized within the cut strengthening procedure. This is done by simply including the objective reduction constraint $\mathbf{c}^\top \mathbf{x} \le \mathbf{c}^\top \hat{\mathbf{x}}$ as a constraint in problem (NLP-i). Including the objective reduction constraint can further reduce the coefficients $b_i$, resulting in stronger cuts. Furthermore, including the objective reduction constraint can enforce infeasibility on some partial integer assignments, and cause assignments in problem (NLP-i) to be infeasible. As mentioned earlier, the only way problem (NLP-i) can be infeasible is if the partial assignment, i.e., $x_i = 1$ for some $i \in I_D$ and $x_j = 0\ \forall j \in I_D \setminus i$, is infeasible for the MINLP problem. Finding such infeasibilities is desirable since it allows us to eliminate a variable from the MINLP problem by fixing it to zero.
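The direct and step-one connection counts described above can be computed from a simple constraint-to-variable incidence map. The following Python sketch is illustrative only: the function name `connection_counts`, the dictionary layout and the toy constraint names are assumptions, and the sketch only produces the two counts without ranking the exclusive selection constraints. The step-one definition is applied literally, so a variable that is directly connected can also be counted as step-one connected.

```python
def connection_counts(selection_vars, cut_constraint, constraints):
    """Count direct and step-one connections of a selection constraint's variables.

    selection_vars: names of the binary variables in one exclusive selection constraint
    cut_constraint: name of the nonlinear constraint that was linearized
    constraints:    dict mapping constraint name -> set of variable names it contains
    """
    cut_vars = constraints[cut_constraint]
    selection = set(selection_vars)

    # Direct: the binary variable appears in the linearized constraint itself.
    direct = selection & cut_vars

    # Step-one: the binary variable appears in another constraint that shares
    # at least one variable with the linearized constraint.
    step_one = set()
    for name, variables in constraints.items():
        if name != cut_constraint and variables & cut_vars:
            step_one |= selection & variables

    return len(direct), len(step_one)


# Hypothetical incidence data: y1, y2, y3 belong to one exclusive selection constraint.
constraints = {
    "g1": {"x1", "x2", "y1"},   # nonlinear constraint that produced the cut
    "lin1": {"x1", "y2"},       # shares x1 with g1, so y2 is step-one connected
    "lin2": {"x3", "y3"},       # shares nothing with g1
}
counts = connection_counts(["y1", "y2", "y3"], "g1", constraints)
```

Computing the counts once per generated cut is cheap, since it only involves set intersections over the constraint structure and no optimization.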
Including the previously tightened cuts into problem (NLP-i) can also improve performance by tightening the continuous relaxation. Obtaining a tighter continuous relaxation in problem (NLP-i) can further strengthen the cut and infer infeasibilities. In the numerical results presented in Sect. 6, it was noticed that including the tightened cuts and an objective reduction constraint can greatly help in identifying infeasible or non-optimal partial integer assignments. The ability to identify and eliminate these from the search space can result in fewer iterations but can also reduce the complexity of the MILP relaxations, used by algorithms such as ESH, ECP, and OA.
The cut strengthening techniques are summarized as pseudo-code in Algorithm 1. In the algorithm, the two different cuts from the previous section are considered as different strategies. The cut given by Eq. (13) is referred to as a Multi Tightening (MT) strategy, since it effectively uses multiple values for the right-hand side. Similarly, the cut given by Eq. (12) is referred to as a Single Tightening (ST) strategy.

Computational comments
When solving an optimization problem to generate a cut, it is important to take the solver tolerances into consideration. The tolerances are especially important when dealing with nonlinear problems, where it is rare that a solver returns an exact optimal solution. In the cut strengthening procedure, presented in the previous section, the solver tolerance will only affect the coefficients $b_i$. If we can ensure that the solution of problem (NLP-i) is within an $\varepsilon$-tolerance from the true optimal objective value, then the suboptimality can easily be handled by relaxing the cut, i.e., adding $\varepsilon$ to the right-hand side.
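The effect of relaxing the right-hand side can be checked with a short worked inequality. This is a sketch under stated assumptions: $\tilde{b}_i$ is a symbol introduced here for the value returned by the solver for problem (NLP-i), $b_i$ is the true optimal value, $\varepsilon$ is the tolerance, and $\boldsymbol{\alpha}^\top \mathbf{x}$ denotes the left-hand side of the cut.

```latex
|\tilde{b}_i - b_i| \le \varepsilon
\;\Longrightarrow\;
b_i \le \tilde{b}_i + \varepsilon
\;\Longrightarrow\;
\boldsymbol{\alpha}^\top \mathbf{x}
\le \sum_{i \in I_D} b_i x_i
\le \sum_{i \in I_D} (\tilde{b}_i + \varepsilon)\, x_i
= \sum_{i \in I_D} \tilde{b}_i x_i + \varepsilon .
```

The second inequality uses $x_i \ge 0$ and the final equality uses the exclusive selection constraint, so the relaxed cut stays valid for every feasible integer assignment and is at most $\varepsilon$ weaker than the exact one.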
As a comparison, some other techniques to obtain strong cuts for convex MINLP problems use the minimum distance (separation) problem to generate cuts (Stubbs and Mehrotra 1999; Trespalacios and Grossmann 2016). In these approaches, the minimizer of an NLP subproblem forms the coefficients of both the left- and right-hand side of the cut. For these cuts, it is important to obtain a high optimality accuracy in the variable space, since it affects both the angle and level of the cut. Issues with numerical tolerances can be reduced or effectively eliminated, e.g., by post-processing the cut and optimizing over each term in the disjunction to determine a valid right-hand side, but this comes at a significant computational expense. However, since both the coefficients on the left- and right-hand side are optimized, this approach is not limited to a specific cut but can basically generate any supporting hyperplane to the convex hull of the disjunction. Generating cuts by solving the separation problem can, therefore, result in stronger cuts than the cut strengthening procedure, which is limited by the structure of the original cut.
In the cut strengthening procedure, we optimize over each term of a disjunction in problem (10) separately. This allows us to obtain stronger cuts and identify infeasible partial integer assignments, as described in Sect. 3. In an efficient implementation, the individual problems given by (NLP-i) can be solved in parallel since they are completely independent. This approach also has computational advantages, since the convex hull formulation, and the perspective function in particular, come with numerical challenges. There are formulations to avoid division by zero (Sawaya 2006) and for some types of problems, the convex hull is second-order cone representable, which can be handled more efficiently (Ben-Tal and Nemirovski 2001). However, if some of the partial integer assignments are infeasible it can cause difficulties for solvers since the convex hull of problems (NLP-i) will then have an empty interior even though it is feasible. Such issues can be eliminated by analyzing each term of the disjunction in a pre-processing step and eliminating infeasible terms, but this also comes at a significant computational expense.
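Because the per-term problems are independent, they map naturally onto a worker pool, and an infeasible term can be reported back so that the corresponding binary variable is fixed to zero. The Python sketch below illustrates this under loud assumptions: the convex NLP solve is replaced by a hypothetical one-dimensional stand-in (maximizing a linear function over the intersection of a term interval and a global interval), and the names `solve_term` and `strengthen_parallel` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def solve_term(args):
    """Stand-in for one per-term subproblem.

    Maximizes a * x over the intersection of the term interval and the global
    interval; returns None when the intersection is empty (infeasible term).
    """
    a, (lo, hi), (glo, ghi) = args
    lo, hi = max(lo, glo), min(hi, ghi)
    if lo > hi:
        return None
    return max(a * lo, a * hi)


def strengthen_parallel(a, terms, global_bounds):
    """Solve all per-term problems concurrently.

    Returns a dict {term index: b_i} for feasible terms and the list of term
    indices whose binary variable can be fixed to zero.
    """
    jobs = [(a, term, global_bounds) for term in terms]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(solve_term, jobs))  # map preserves input order
    b = {i: r for i, r in enumerate(results) if r is not None}
    fixed_to_zero = [i for i, r in enumerate(results) if r is None]
    return b, fixed_to_zero


# Hypothetical data: three terms, the last one lies outside the global bounds.
b, fixed_to_zero = strengthen_parallel(
    2.0, terms=[(0.0, 1.0), (2.0, 3.0), (5.0, 6.0)], global_bounds=(0.0, 4.0)
)
```

A thread pool is used here only to keep the example self-contained; with a real NLP solver, separate processes or solver-level parallelism may be a better fit.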
As previously mentioned, our cut strengthening approach is limited to a specific cut and, therefore, it may result in a weaker cut compared to generating the cut from solving a separation problem. The main advantage of our cut strengthening approach is that the cut is obtained by solving several smaller independent convex problems, compared to solving the larger separation problem. Therefore, the trade-off of our cut strengthening approach is a reduced computational complexity at the expense of a possibly weaker cut.

Computational setup
To compare the cuts and to show the advantage of the cut strengthening, we have included a numerical study where we compare the ESH algorithm with and without the cut strengthening technique. These are preliminary results and are mainly intended as a proof of concept. To focus on the effects of cut strengthening, we apply it to a basic implementation of the ESH algorithm. As shown by Lundell et al. (2016, 2020), several other techniques can be combined with the algorithm to improve the computational performance, such as early MILP termination and multiple cut generation strategies. Before presenting the results, we will give a more detailed description of the computational setup.

A Convex MINLP algorithm
To solve the MINLP problems we will use the ESH algorithm, which was briefly presented in Sect. 2. In each iteration, we use the cut strengthening algorithm from Sect. 4 to strengthen the cut generated by the ESH algorithm. It is known that the basic ESH algorithm tends to only generate a single cut per iteration (Lundell et al. 2017). However, in some iterations the root-search can result in a point where multiple constraints are active, resulting in multiple cuts.
Here, we will only strengthen one cut per iteration. If we obtain multiple cuts in an iteration, then we randomly pick one of them for the strengthening procedure. We do not use the LP-preprocessing from , which simplifies the algorithm and allows us to better focus on the effect of the cut strengthening.
Besides the basic ESH algorithm, we only include two simple primal heuristics that have proven to be effective within this framework (Lundell et al. 2018). Without any primal heuristics the ESH algorithm will generally not obtain feasible solutions during the solution procedure, making it difficult to terminate based on the optimality gap. Therefore, the primal heuristics are an important enhancement to the ESH algorithm and practically needed within a solver. From the numerical tests, we also noticed that feasible solutions improve the cuts and help to identify non-optimal partial integer assignments. The primal heuristics we use here are checking the alternative solutions in the MILP solver's solution pool, and fixing the integer assignments in the MINLP and solving the resulting convex NLP problem in every fourth iteration. The primal heuristics are summarized as pseudocode in Algorithm 2.
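One possible reading of the two heuristics is sketched below; it is an illustration under stated assumptions and not the code used for the experiments. The callbacks is_feasible, solve_fixed_nlp and objective are hypothetical placeholders for a MINLP feasibility check, the convex NLP with fixed integer variables, and the linear objective, and the pool is assumed to be a plain list of candidate points.

```python
def primal_heuristic(pool, x_k, k, incumbent, is_feasible, solve_fixed_nlp, objective):
    """Return the best feasible point known after applying both heuristics.

    pool            : alternative MILP solutions (assumed to be a list of points)
    x_k             : MILP solution of the current iteration
    k               : main iteration counter
    incumbent       : best feasible point so far, or None
    is_feasible     : hypothetical callback, True if a point satisfies the MINLP
    solve_fixed_nlp : hypothetical callback, returns the NLP solution for the
                      integer part of its argument, or None if infeasible
    objective       : hypothetical callback returning the objective value
    """
    best = incumbent

    def improves(candidate):
        return best is None or objective(candidate) < objective(best)

    # Heuristic 1: check the alternative solutions in the solution pool.
    for candidate in pool:
        if is_feasible(candidate) and improves(candidate):
            best = candidate

    # Heuristic 2: every fourth main iteration, fix the integers and solve the NLP.
    if k % 4 == 0:
        candidate = solve_fixed_nlp(x_k)
        if candidate is not None and improves(candidate):
            best = candidate

    return best
```

Passing the solver-specific parts as callbacks keeps the sketch independent of any particular MILP or NLP interface.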
Algorithm 2 Primal heuristics: procedure PrimalHeuristic(c, N, L, Y, SolutionPool, k, x, $x^k$). The procedure loops over the solutions in SolutionPool and, in every fourth main iteration (k mod 4 = 0), fixes the integer assignment and, if the MINLP is feasible for that assignment, computes $\arg\min_{x \in \hat{Y}} c^\top x$; the result replaces the best found solution when its objective value is lower.

As a termination criterion we use the relative optimality gap defined as

$$\mathrm{gap} = \frac{ub - lb}{|ub| + 10^{-10}}, \tag{20}$$

where ub and lb are upper and lower bounds on the optimal objective value of the MINLP problem. Here, ub is given by the best found feasible solution and lb is given by $c^\top x^k$, where $x^k$ is given by problem (MILP-r). We consider the MINLP problem as solved when the relative gap is reduced to $10^{-3}$, thus proving that the best found solution is within 0.1% of the global optimum. The method used for solving the MINLP problems is summarized as pseudocode in Algorithm 3.

Implementation and hardware
For the numerical comparison, we use a simple implementation of the ESH algorithm utilizing IPOPT 3.12.9 (Wächter and Biegler 2006) and Gurobi 8.1 (Gurobi 2019) as subsolvers for NLP and MILP subproblems. For reading and parsing the MINLP problems, we use the open-source MATLAB toolbox OPTI Toolbox (Currie and Wilson 2012). In the current implementation, we are not able to run the cut strengthening NLP subproblems in parallel, which could significantly speed up the cut strengthening. However, the computational results in the following section still clearly show an advantage of the cut strengthening, both in terms of total computational time and in number of iterations.
The numerical comparisons are performed on a basic desktop computer with an Intel i7-7700k processor, 16 GB RAM, and Windows 10. For the subsolvers, we use default settings except for allowing Gurobi to run on 8 threads. By running Gurobi on multiple threads, the MILP subproblems are solved faster, and this is simply done by changing the Threads-parameter. The root-search, in the approximative projection, is done to a tolerance of $10^{-16}$ in the root-search variable.
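The termination test based on the relative optimality gap can be written in a few lines. The sketch below only restates the gap with the $10^{-10}$ safeguard in the denominator and the $10^{-3}$ tolerance mentioned in the text; the function names are hypothetical.

```python
def relative_gap(ub, lb, eps=1e-10):
    """Relative optimality gap (ub - lb) / (|ub| + eps)."""
    return (ub - lb) / (abs(ub) + eps)


def is_solved(ub, lb, tol=1e-3):
    """True when the relative gap has been reduced to the tolerance."""
    return relative_gap(ub, lb) <= tol
```

The small constant in the denominator only guards against division by zero when the upper bound is zero.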

Numerical results
To test the efficiency of the cut strengthening, we apply the simple implementation of the ESH algorithm, described in Algorithm 3, to a set of test problems. The ESH algorithm forms the baseline for the numerical comparison, and we compare how the cut strengthening techniques affect the number of iterations and solution times. As already mentioned, the results are mainly intended as a proof of concept to show the impact of the cut strengthening. By using techniques such as early stopping (Lundell et al. 2018) and running the cut tightening NLP problems in parallel it would be possible to significantly reduce the solution times.
For the numerical test we have chosen convex MINLP instances from MINLPLib (MINLPLib 2020) containing at least one exclusive selection constraint. The cut strengthening is mainly intended to strengthen the linear approximation of the nonlinear constraints, and it is expected to be most beneficial for problems containing the big-M formulation of disjunctions containing nonlinear constraints. The disjunctions are identified through the exclusive selection constraints, and the cuts are strengthened through a tighter representation of the disjunctions. If nonlinear disjunctions are represented by the convex hull formulation, our approach will not necessarily be able to tighten the relaxation. For example, if a nonlinear disjunction is represented by the convex hull, then the ESH algorithm can generate supporting hyperplanes to the convex hull of the disjunction and the ST-strategy will not be able to change such cuts. The MT-strategy could still give a tighter approximation for integer feasible solutions, as it may cut off parts of the convex hull for some integer values as shown in Fig. 4. Since the cut strengthening is an expensive operation, it is better suited for problems with the big-M formulation as the impact will be more significant and the subproblems are smaller. Therefore, we focus on problems where disjunctions, with either linear or nonlinear constraints, are represented by the big-M formulation. The problems we consider from MINLPLib are different versions of the problems clay, flay, slay, sssd, and tls. These problems represent optimization tasks such as trimloss problems (Harjunkoski et al. 1998), optimal placement tasks (Sawaya 2006) and service systems design (Elhedhli 2006). We also consider a problem called stockcycle (Silver and Moon 1999), which is known to be difficult to solve without any reformulations (Kronqvist et al. 2018c). Furthermore, we also consider a class of test problems called p_ball, that are described in the Appendix.
The p_ball instances contain several relatively large nonlinear disjunctions, and are designed to be challenging due to both the nonlinearity and the combinatorial aspects. We use a 2 h time limit for all the problems except for stockcycle, where we use a time limit of 96 h.
The results are presented in Table 1, showing both the number of iterations and the time needed to solve each problem. The table shows that both cut strengthening techniques can significantly reduce the number of iterations needed to solve the problems. However, the table shows a clear advantage of the multi tightening (MT) strategy. This result aligns well with the theory since cut (13) used in the MT strategy can dominate the cuts used by the single tightening (ST) strategy. On average, the ST strategy reduces the number of iterations by a factor of 1.5, and the MT strategy gives a further reduction by a factor of 2.9 on average. Both cut strengthening strategies give a significant reduction in solution times, but the MT strategy has a clear advantage and is faster by a factor of 2 compared to the ST strategy. The performance in terms of speed for the strategies is illustrated in Fig. 5, which shows the performance profiles of the different strategies. From the figure, it can be observed that the MT strategy gives a great advantage for the more challenging problems.
Table 1 The strategies include a simple implementation of the ESH algorithm and the ESH algorithm combined with two new cut strengthening techniques. The sign ">" indicates that the time limit was exceeded before the search could be terminated. The fewest iterations and the fastest times are in bold for each problem. All of the instances in this table use a big-M formulation.
Fig. 5 The test set has 43 instances (clay*, flay*, p_ball*, slay*, sssd*, stockcycle*). An instance is considered solved when it reaches a relative gap of less than 0.1%.
As shown in Table 1, the cut strengthening is especially powerful for the clay and p_ball problems. These problems contain nonlinear disjunctions that are represented by the big-M formulations, giving weak continuous relaxations that can be efficiently strengthened by the cut strengthening technique. For the p_ball problems, the MT cut strengthening reduces the number of iterations by a factor of 11.7 on average. Without the cut strengthening the larger p_ball problems are practically intractable with the ESH algorithm, and the optimality gap remained large after 2 h.
As previously mentioned, the cut tightening comes at a computational cost of solving convex NLP subproblems. For example, problem p_ball_40b_5p_4d contains nonlinear disjunctions of size 40, which results in 40 subproblems for the cut tightening per iteration. Solving these subproblems accumulates to about 35% of the total solution time. However, this is well compensated for by the great reduction in the number of iterations.
It is worth mentioning that the cut strengthening techniques do not necessarily result in computationally more demanding iterations. For example, the average iteration time for slay10m is 10.6 seconds with the ESH strategy and less than 2 seconds with both the ST and MT strategies. There are two reasons behind the significantly faster iterations. First, the strengthened cuts can result in a tighter continuous relaxation, making the MILP relaxations easier to solve. But more importantly, the cut strengthening procedure can sometimes identify infeasible or non-optimal integer assignments during the solution procedure, see Sect. 3 for details. For slay10m, the cut strengthening is able to fix 53 of the binary variables to zero. Similarly, the cut strengthening eliminates 299 of the 432 binary variables in stockcycle. By further studying these problems, we found that the binary variables fixed by the cut tightening cannot trivially be removed, e.g., by performing LP-based bounds tightening. Some of the integer assignments immediately rendered problem (NLP-i) infeasible in the cut tightening, and some became infeasible due to bounds on the objective and accumulation of strengthened cuts. The ability to identify the infeasible or non-optimal integer assignments can greatly reduce the complexity of the MINLP problems and comes as a desirable side effect of the cut tightening.
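The variable fixing described above can be illustrated with a small sketch. It is a simplification under stated assumptions: each term of the disjunction is replaced by a box, and each accumulated cut a^T x <= b (an objective bound c^T x <= ub is one such cut) is tested separately, which gives a sufficient but not necessary condition for infeasibility. In the actual procedure, infeasibility is detected by the convex NLP subproblems; all names below are hypothetical.

```python
def box_min(a, lo, hi):
    """Minimum of a^T x over the box lo <= x <= hi."""
    return sum(aj * (l if aj >= 0 else h) for aj, l, h in zip(a, lo, hi))


def terms_to_fix(boxes, cuts):
    """Indices of terms that violate some cut everywhere on their box.

    boxes : list of (lo, hi) bounds, one per term (illustrative assumption)
    cuts  : list of (a, b) representing inequalities a^T x <= b
    The binary variable of every returned term can be fixed to zero.
    """
    fixed = []
    for i, (lo, hi) in enumerate(boxes):
        if any(box_min(a, lo, hi) > b for a, b in cuts):
            fixed.append(i)
    return fixed
```

With boxes [0, 1]^2 and [5, 6]^2 and the single cut x_1 + x_2 <= 4, only the second term is reported, since even its smallest value of x_1 + x_2 is 10.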
MINLPLib also contains a large number of problems called syn and rsyn (Türkay and Grossmann 1996;Sawaya 2006). These problems do have a disjunctive structure, although mainly involving linear constraints. These problems are all easy to solve, and on average they require less than 10 iterations and 3 seconds with the ESH algorithm. There are in total 24 rsyn and 24 syn problems with the big-M formulation. For these problems, the cut strengthening did not provide any significant advantages. The average times and number of iterations for these two problem types are shown in Table 2.
For the rsyn and syn instances the cut tightening procedure has little effect on the cuts, and does not result in fewer iterations. In these problems the nonlinear constraints only contain three variables, and there is only a single nonlinear variable in each constraint. It is possible that these constraints are tight to begin with, which would explain why the strengthening does not have much effect for these specific problems.
The cut strengthening seems to be most efficient for problems that contain disjunctions with nonlinear constraints, e.g., clay and p_ball problems. Some aspects of why the multi tightening strategy works particularly well for these problems are described in the next section. For problems with nonlinear disjunctions, the choice of which exclusive selection constraint to perform the strengthening on is also straightforward since binary variables will be present in the nonlinear constraints. The cut strengthening also performed well on the problems slay, sssd, and stockcycle, where there are only disjunctions with linear constraints.

Comparing strong problem formulations and cut strengthening
These results show that the strengthening procedure can give a great advantage for problems where disjunctions are represented by big-M constraints. To further analyze the cut strengthening procedure, we compare the cut strengthening procedure with applying the basic ESH algorithm on the same MINLP instances in a convex hull form, where all or some disjunctions are represented by the convex hull formulation. For this test, we use all problems from the previous section that are available in both a big-M and convex hull form. The results are presented in Table 3. Here we only use the multi strengthening technique, since it results in stronger cuts than the single tightening at the same computational cost.
For nonlinear disjunctions represented by a convex hull formulation, the ESH algorithm can generate supporting hyperplanes to the convex hull of the disjunction. Therefore, applying the ESH algorithm to MINLP instances where nonlinear disjunctions are represented by the convex hull can result in significantly tighter cuts compared to cuts obtained from big-M constraints. This can be seen from the results in Table 3, which shows that the ESH algorithm requires fewer iterations for most of the problems in the convex hull form. The ESH algorithm is still faster on some of the problems in big-M form, which is most likely due to the smaller subproblems.
It is important to notice that the cut strengthening procedure will not necessarily result in similar cuts as applying the ESH algorithm to the convex hull formulation of the problem. This is well illustrated by problem (EX1), where the single tightening strategy does not result in a supporting hyperplane to the convex hull of the disjunction as illustrated in Fig. 2. The multi tightening strategy forms a supporting hyperplane to the convex hull of the disjunction, but it still behaves differently compared to a cut obtained by applying the ESH algorithm to the convex hull form of the problem. Figure 4 shows that the multi tightened cut not only forms a supporting hyperplane to the convex hull of the disjunction, but for each feasible integer assignment it also forms a supporting hyperplane to the corresponding term of the disjunction. For problem (EX1), a single multi tightened cut effectively acts as a supporting hyperplane to three different nonlinear constraints, one for each feasible integer assignment. The multi tightened cuts behave similarly for the p_ball instances, where each disjunction corresponds to assigning a point to one of the balls. For a feasible integer assignment, a cut obtained by multi tightening will then act as a supporting hyperplane to each ball for one of the points. For example, for the problem p_ball_40b_5p_3d a multi tightened cut effectively behaves as a tight cut for 40 different nonlinear constraints. This behaviour can make the multi tightened cuts especially powerful for problems with nonlinear disjunctions, which is also shown by the results in Table 3.
Only the p_ball and clay instances contain nonlinear disjunctions, and for most of these problems the multi tightening strategy significantly reduces both solution times and number of iterations. For problems with only linear disjunctions, the multi tightening strategy does not necessarily give the same advantage. However, the multi tightening strategy also performed well on the test problems with only linear disjunctions. On average the multi tightening strategy reduces the number of iterations by a factor of 7.2 compared to the ESH algorithm with the big-M formulation and by a factor of 1.5 compared to the ESH algorithm with the convex hull formulation of the problems. In terms of total solution time, the multi tightening strategy gives a reduction by more than a factor of 3 on average compared to the other two approaches.

Conclusions
In this paper, we have presented a new framework for strengthening cuts to obtain tighter outer approximations for convex MINLP. The cut strengthening is based on analyzing disjunctive structures in the MINLP problem, and it either strengthens the cut for the entire disjunction or separately for each term of the disjunction. We have proven that the strengthening results in valid cuts that can dominate the original cut. The numerical results show that the strengthening can greatly reduce the number of iterations and time needed to solve convex MINLP problems. We have focused on strengthening cuts derived from the ESH algorithm, but the same techniques can just as well be used to strengthen cuts obtained by OA, ECP or generalized Benders decomposition.


Appendix: New nonlinear disjunctive test problems
To further test the cut strengthening for problems with different sized disjunctions containing nonlinearities of varying difficulty, we have generated 12 new test problems. The underlying optimization task is simple: select n points in m-dimensional balls, such that the $\ell_1$-distance between all points is minimized. Only one point can be assigned to each ball, and in total there are l balls with radius one. The problem has a clear disjunctive programming structure, where the disjunctions arise from the assignment of each point to one of the balls. In total we get n disjunctions of size l, i.e., one disjunction per point and one disjunctive term per ball. Even if this optimization task can be represented as a binary quadratic problem, it is a challenging problem for OA-type algorithms. Without any reformulations, OA-type algorithms will require a large number of iterations due to the difficulties of accurately approximating an n-dimensional ball with hyperplanes (Hijazi et al. 2013). Higher-dimensional balls render the outer approximation task more difficult, and the number of nonlinear constraints is given by the number of balls times the number of points. There is a clear combinatorial structure to the problem, and the complexity increases with the number of points and balls as the number of possible discrete configurations drastically increases. The seemingly simple optimization problem is, thus, challenging both due to the combinatorial nature and the nonlinearity.
Before presenting the MINLP formulation, we briefly describe the notation and some details of the problem formulation. Here, $c^i \in \mathbb{R}^m$ denotes the center of ball i and $c^i_1$ refers to the first coordinate of the center. Similarly, $p^i \in \mathbb{R}^m$ refers to point i and $p^i_1$ is the first coordinate of the point. To simplify the notation, we introduce the sets $D = \{1, 2, \dots, m\}$, $P = \{1, 2, \dots, n\}$, $P_i = \{i + 1, i + 2, \dots, n\}$, and $B = \{1, 2, \dots, l\}$. The $\ell_1$-distance can be represented by linear constraints by introducing auxiliary variables $d^{i,j} \in \mathbb{R}^m$. As before, $d^{i,j}_k$ refers to the k-th component of the vector $d^{i,j}$. To act as the absolute value, we use the following constraints and the distance between the points i and j is now given by $\sum_{k=1}^{m} d^{i,j}_k$. For the test problems, we randomly chose each coordinate of the balls' centers between 0 and 10, which also limits each variable $d^{i,j}_k$ to the interval [0, 10]. We use a binary variable $b_{i,r}$ for selecting if point i is assigned to ball r. There are several identical solutions to these optimization problems due to symmetries, e.g., you can switch places of the first and second point to obtain another equally good solution. To eliminate some of the symmetrical solutions we include an ordering of the points along the first coordinate. The ordering is enforced by including the constraints that the first point must be closer to the origin along the first coordinate than the second point, and similarly for the following points. Using the big-M formulation the problem can be written as where $M_r$ are sufficiently large constants. For these problems the smallest valid $M_r$ is simply given by Here, $M_r$ are based on the largest squared Euclidean distance between the center of a ball and the point furthest away in any other ball. This gives the smallest valid M constants, resulting in a tight big-M formulation.
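The verbal description of the big-M constants can be turned into a short computation. The sketch below is an illustration under an explicit assumption: the big-M constraint for assigning point i to ball r is taken to have the form ||p^i - c^r||_2^2 <= 1 + M_r (1 - b_{i,r}). Under that assumed form, the point furthest from c^r in a ball centered at c^j lies at distance ||c^r - c^j||_2 + 1, so the constant equals the largest such squared distance minus the squared radius. The function name is hypothetical, and the exact expression used for the test problems may be written differently.

```python
import math


def big_m_constants(centers, radius=1.0):
    """Big-M constants for the assumed constraint form
    ||p - c_r||^2 <= radius^2 + M_r * (1 - b).

    centers : list of ball centers, each a tuple of coordinates
    Returns one constant per ball.
    """
    constants = []
    for c_r in centers:
        # Largest distance from c_r to any point in any ball. Including the
        # ball itself is harmless, since it contributes only the radius.
        farthest = max(math.dist(c_r, c_j) + radius for c_j in centers)
        constants.append(farthest ** 2 - radius ** 2)
    return constants
```

For two unit balls centered at (0, 0) and (3, 4), the centers are 5 apart, the furthest point is at distance 6, and both constants equal 36 - 1 = 35.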
It could be possible to obtain a stronger formulation, e.g., by eliminating further symmetries or by the techniques presented by Trespalacios and Grossmann (2015). However, the goal here is not to derive an optimal problem formulation, but simply to generate a few test problems of different size and difficulty. We have generated 12 random test instances, where the centers of the unit balls are chosen randomly. The test problems are of different size, and the main attributes are summarized in Table 4. The problems range in size from 50 to 210 binary variables and from 30 to 85 continuous variables.
It is also possible to represent the assignment of points to balls with the convex hull formulation to obtain a tighter continuous relaxation. For each point $i \in P$, we need to make l copies of the variables $p^i$, one for each ball $r \in B$, which gives new variables in $\mathbb{R}^m$. It would be possible to represent the convex hull by second-order cones, but to fit within the framework of this paper we use the formulation presented by Sawaya and Grossmann (2007). The problem can then be written with the variable domains $d^{i,j} \in [0, 10]^m$ for all $i \in P$, $j \in P_i$, $p^i \in [0, 10]^m$ for all $i \in P$, and the copied variables in $[0, 10]^m$ for all $i \in P$, $r \in B$. Numerical results are presented in Sect. 6, which show that the test problems are challenging for the ESH algorithm with both problem formulations. Finally, all the test problems can be downloaded from https://github.com/jkronqvi/points_in_circles.