Towards the Solution of Mixed-Integer Nonlinear Optimization Problems using Simultaneous Convexification

Solving mixed-integer nonlinear optimization problems (MINLPs) to global optimality is extremely challenging. An important step towards enabling their solution is the design of convex relaxations of the feasible set. Known solution approaches based on spatial branch-and-bound become more effective the tighter the relaxations used are. Relaxations are commonly established by convex underestimators, where each constraint function is considered separately. Instead, a considerably tighter relaxation can be found via so-called simultaneous convexification, where convex underestimators are derived for more than one constraint at a time. In this work, we present a global solution approach for solving mixed-integer nonlinear problems that uses simultaneous convexification. We introduce a separation method for the convex hull of constrained sets. It relies on determining the convex envelope of linear combinations of the constraints and on solving a nonsmooth convex problem. In particular, we apply the method to quadratic absolute value functions and derive their convex envelopes. The practicality of the proposed solution approach is demonstrated on several test instances from gas network optimization, where the method outperforms standard approaches that use separate convex relaxations.


Introduction
In this work, we develop a global solution approach for solving mixed-integer nonlinear problems (MINLPs). Such optimization problems are among the most challenging optimization tasks known today, because they combine integer decision variables with nonlinear and nonconvex constraint functions. In this work, we focus on handling the latter property. MINLPs abound in many real-world applications such as energy and distribution networks, for example gas networks (e.g., Martin et al. [2006]), and chemical process design (e.g., Grossmann et al. [1999]). See [Belotti et al., 2013] for a detailed overview.
The most commonly used solution strategy for general MINLPs is the Branch and Bound algorithm (see, e.g., the textbook by Locatelli and Schoen [2013]). It is implemented and enhanced in several state-of-the-art software packages like Antigone (Misener and Floudas [2014]), Baron (Tawarmalani and Sahinidis [2005]) or SCIP (Gleixner et al. [2017]). We refer to Bussieck and Vigerske [2014] for an extensive survey on MINLP solvers.
The main idea of the Branch and Bound algorithm is to generate a tree of subproblems that arise from a subdivision of the feasible set. For each subproblem, a relaxation of the feasible set is generated by dropping the integrality requirements and by using convex underestimators of the nonconvex constraint functions. These relaxed subproblems are convex and yield lower bounds for the original subproblem. The quality of these bounds depends on the tightness of the underestimators. Hence, a broad field of research is devoted to finding the tightest possible convex underestimators (the so-called convex envelopes) for different types of relevant functions. An explicit description of the convex envelope was obtained for several specific classes of functions, e.g., for multilinear functions (Rikun [1997]), for fractional terms (Tawarmalani and Sahinidis [2001]) or for odd monomials (Liberti and Pantelides [2003]). Further results handle functions with specific curvature properties such as edge-convexity/concavity and indefiniteness (Meyer and Floudas [2005]; Jach et al. [2008]; Khajavirad and Sahinidis [2013]; Locatelli and Schoen [2014]). See also [Boukouvala et al., 2016] for a list of publications on this subject.
Most of these results have in common that they analyze the convex envelope of a single real-valued function. However, the convex hull of a feasible set defined by multiple constraint functions is not completely described by the convex envelope of every single constraint. As a consequence, the standard relaxation of the feasible set can be significantly tightened by considering the interaction between multiple constraint functions. This interaction was already studied in [Tawarmalani, 2010]. In the following, we adopt that nomenclature and refer to taking the convex hull of a set given by multiple constraints as simultaneous convexification. The Reformulation-Linearization Technique (RLT, Sherali and Alameddine [1992]) can be interpreted as an early result in this context. In this method, suitable functions are multiplied in order to derive additional constraints for the relaxation. The RLT constraints are further integrated in [Anstreicher and Burer, 2010] in order to derive the simultaneous convexification of a set of bivariate quadratic functions. In [Ballerstein, 2013], a more general characterization is given: the convex hull of a feasible set defined by a vector-valued function can be described by the convex envelopes of all linear combinations of its components.
In this work, we make use of that result to derive a refinement of the standard relaxation of relatively general MINLPs. We present a global optimization method that incorporates the refinement into an algorithmic framework by using a cutting plane approach; see [Kelley, 1960] for an early publication on this strategy. The solution of a relaxed problem is separated by a linear inequality in order to iteratively converge to the optimum of the original problem. In our case, we separate from the convex hull of the feasible set by solving a convex optimization problem. The problem is not continuously differentiable and relies on an algorithmically utilizable representation of the convex envelope of linear combinations of the constraint functions.
The solution of general MINLPs is often beyond what is currently possible in practice due to their computational difficulty. Therefore, the abstract framework presented here is rolled out for a general class of bivariate quadratic absolute value functions. We first derive the simultaneous convexifications and explain the details of the solution algorithm. We then show experimental results for an application in the operation of gas networks. The results indicate that the developed separation strategy is able to derive significantly tighter bounds than the standard relaxation.
The remainder of this paper is structured as follows. In Section 2, we briefly discuss the considered problem class and solution strategy. We motivate how standard relaxations of the feasible set are obtained and why there is room for improvement. In Section 3, we recap the main result on simultaneous convexification and use it to derive a separation problem as well as cutting planes for the convex hull of the feasible set based on the solution of this problem. We further present some definitions and basic results concerning the convex envelope, generating sets and minimizing simplices. Section 4 introduces quadratic absolute value functions in the way they are used in gas network optimization. For these specific functions, we derive the convex envelopes of their linear combinations in order to apply our optimization method. The practical impact of our work is exemplarily evaluated in Section 5.
The results presented in this paper are also given in the dissertation by Mertens [2019]. Preliminary considerations and computations are already introduced in the dissertation by Merkert [2017].

Solving Mixed-Integer Nonlinear Optimization Problems
We assume mixed-integer nonlinear optimization problems (MINLPs) given in the form

min c^T (x, z)  s.t.  (x, z) ∈ X := {(x, z) ∈ D × R^m : g(x) = z}   (OP)

with cost vector c ∈ R^{n+m}, a set D ⊆ R^n and a continuous function g : D → R^m. In general, the constraint function g is nonconvex, the feasible set X is nonconvex and Problem (OP) has multiple locally optimal solutions. Potential integrality constraints on certain entries of x are implied by D.
As this work aims at tackling the problems arising from nonlinearities in MINLPs, we omit the integrality restrictions in the following and assume that D is compact and convex. This assumption is usually satisfied by considering the continuous relaxation of a given mixed-integer problem. However, even without the integrality constraints, not all MINLPs can be formulated as Problem (OP). Only certain types of dependencies are allowed, and bounds on z are only given implicitly by x. This is a relevant restriction for general MINLPs. Nevertheless, the proposed structure is well suited for demonstrating the effect of simultaneous convexification. Furthermore, a substructure of the form X is present in almost any MINLP. The developed strategies may therefore still be applied to more general feasible sets.
A common strategy for solving Problem (OP) is the so-called spatial Branch and Bound method. Therein, a convex superset X̄ ⊇ X of the feasible set and the corresponding relaxed problem

min c^T (x, z)  s.t.  (x, z) ∈ X̄   (RP)

are considered. Problem (RP) is convex and yields a lower bound for Problem (OP). It may therefore be used as a tool to evaluate the quality of solutions and, by integrating this approach into a branch and bound framework, to achieve convergence to global optimality. For a detailed introduction to global mixed-integer nonlinear optimization, we refer to [Locatelli and Schoen, 2013] and [Belotti et al., 2013].
We briefly discuss different approaches to constructing a relaxed feasible set X̄. Obviously, the choice of X̄ is crucial for the quality of the resulting lower bound given by Problem (RP). Problem (RP) gives the best possible lower bound for Problem (OP) if X̄ is chosen as small as possible. The smallest possible convex superset of X is conv(X). In fact, the optimal objective values of both problems are equal in this case because of their identical linear objective function.
As conv(X) is hard to determine in general, it is common to make use of convex underestimators.
A standard approach is now given by X̄ = X_0 := {(x, z) ∈ D × R^m : g^lo(x) ≤ z ≤ g^up(x)} with a convex underestimator g^lo and a concave overestimator g^up of g. Tighter estimators in general lead to a tighter relaxed feasible set X_0. The tightest possible estimators are the convex/concave envelopes, so we define X* := {(x, z) ∈ D × R^m : vex_D[g](x) ≤ z ≤ cave_D[g](x)}. Note that conv(X) ⊆ X* ⊆ X_0 holds and that the first inclusion is strict in general. Hence, the resulting lower bound obtained by the respective Problem (RP) with X̄ = X_0, or even X̄ = X*, is sub-optimal.
This observation motivates the remainder of this work. In the next section, we introduce a strategy that generates additional constraints in order to tighten the relaxed feasible set X̄ if it is not equal to conv(X). Afterwards, we exemplarily evaluate this strategy for a special type of constraint functions.

A Separation Method Using Simultaneous Convexification
As motivated above, the convex envelopes of the single constraint functions are not sufficient to describe the convex hull of the feasible set of Problem (OP). Instead, a result in Ballerstein [2013] gives an exact characterization. We aim to integrate this result into an algorithmic framework via a separation strategy.
For a given point x̄ ∈ X̄, we either confirm that x̄ ∈ conv(X) holds, or compute a linear inequality that is valid for conv(X) but not for x̄. This allows us to design a cutting plane method ([Kelley, 1960]). First, the relaxed Problem (RP) is solved with an arbitrary X̄ ⊇ conv(X). Second, a linear inequality that separates the optimal solution from conv(X) is added to the description of X̄. This procedure can be iterated in order to reduce the size of the relaxed feasible set in every step. It can also be integrated into a Branch and Bound framework as an additional step. This combination is also called Branch and Cut.
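The loop just described can be sketched in a few lines of Python. The routine names `solve_relaxation` and `separate` are placeholders for the relaxed problem (RP) and the separation step; they are assumptions for illustration, not part of the paper's implementation:

```python
def cutting_plane_loop(solve_relaxation, separate, max_iter=50):
    """Iteratively tighten the relaxation of (OP) with separating cuts.

    solve_relaxation(cuts) -> (point, lower_bound) solves (RP) with all
    cuts added so far; separate(point) returns a cut violated by the
    point, or None if the point already lies in conv(X). Both interfaces
    are assumed here.
    """
    cuts = []
    point, lower_bound = solve_relaxation(cuts)
    for _ in range(max_iter):
        cut = separate(point)
        if cut is None:            # point lies in conv(X): stop
            break
        cuts.append(cut)           # add the separating linear inequality
        point, lower_bound = solve_relaxation(cuts)
    return point, lower_bound
```

Each iteration either certifies membership in conv(X) or strictly shrinks the relaxed feasible set, which is exactly the Branch and Cut building block described above.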
In the first part of this section, we present the result in Ballerstein [2013] that characterizes the convex hull of the feasible set as the intersection of infinitely many sets. We further design an optimization problem that identifies one of the infinitely many sets that is suitable for separating a given point. Afterwards, we translate the solution of this problem into a linear inequality. The problem relies on the availability of the convex envelope of linear combinations of the constraint functions. We therefore discuss some well-known results on the structure of the convex envelope in the last part of this section.

Separation Problem
We aim to derive a separation problem for the convex hull of the feasible set of Problem (OP). We denote it by Y := conv(X). A result in Ballerstein [2013] states that the convex hull of the feasible set defined by a vector of continuous functions on a compact, convex domain can be described using the convex envelopes of all possible linear combinations of the components.
Proposition 1 (Ballerstein [2013]) Given a compact, convex domain D ⊆ R^n and a continuous function g : D → R^m, it holds that

Y = ⋂_{α ∈ R^m} {(x, z) ∈ D × R^m : α^T z ≥ vex_D[α^T g](x)}.

In the following, we use this representation to derive a convex optimization problem that provides suitable α for separating points from the convex hull of the feasible set of Problem (OP). To be more precise, we derive an algorithmic framework for the following separation task.

Separation Task 1
Input: A set D ⊆ R^n compact and convex, a continuous function g : D → R^m and a point (x̄, z̄) ∈ D × R^m. Task: Decide whether (x̄, z̄) ∈ Y and, if not, return a vector α ∈ R^m with α^T z̄ < vex_D[α^T g](x̄).

Let a point (x̄, z̄) ∈ D × R^m be given. According to Proposition 1, we have (x̄, z̄) ∈ Y if and only if α^T z̄ ≥ vex_D[α^T g](x̄) holds for all α ∈ R^m. Observe that, due to scaling, it suffices to consider only linear multipliers α ∈ R^m from the unit ball B^m := {α ∈ R^m : ‖α‖ ≤ 1}. This leads to the separation problem

min h(α) := α^T z̄ − vex_D[α^T g](x̄)  s.t.  α ∈ B^m.   (SP)

The function α ↦ vex_D[α^T g](x̄) is concave, as it is a pointwise infimum of functions that are linear in α. Thus, the objective function h of Problem (SP) is convex on any open superset of B^m and therefore also continuous on B^m.
Note that, in the case of (x̄, z̄) ∉ Y, it is not necessary to solve Problem (SP) to optimality in order to fulfill Separation Task 1. In fact, it suffices to find a point α ∈ B^m with objective value h(α) < 0 to derive (x̄, z̄) ∉ Y. From a practical point of view, it is important to mention that the efficient solvability of Problem (SP) heavily relies on the availability of an algorithmically utilizable representation of the convex envelope of α^T g for every α ∈ B^m. Moreover, note that the function vex_D[α^T g](x̄) is in general not continuously differentiable in α.
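To make Problem (SP) concrete, the following sketch evaluates h and minimizes it by sampling the unit ball for a toy instance with m = 1 and g(x) = x² on D = [−1, 1], where the envelope vex_D[αg] is known in closed form (αx² for α ≥ 0, the constant chord value α for α < 0). The instance and all names are illustrative assumptions, not taken from the paper:

```python
def vex_alpha_g(alpha, x):
    # Convex envelope of alpha*g for g(x) = x**2 on D = [-1, 1]:
    # the function itself for alpha >= 0 (convex), the chord value alpha otherwise.
    return alpha * x * x if alpha >= 0 else alpha

def h(alpha, x_bar, z_bar):
    # Objective of (SP): h(alpha) = alpha*z_bar - vex_D[alpha*g](x_bar).
    return alpha * z_bar - vex_alpha_g(alpha, x_bar)

def separate_sp(x_bar, z_bar, n_grid=201):
    # Sample B^1 = [-1, 1]; any alpha with h(alpha) < 0 certifies (x,z) not in Y.
    alphas = [-1.0 + 2.0 * k / (n_grid - 1) for k in range(n_grid)]
    best = min(alphas, key=lambda a: h(a, x_bar, z_bar))
    val = h(best, x_bar, z_bar)
    return (best, val) if val < -1e-12 else None
```

For the point (0, −0.5), which lies below the parabola and hence outside Y, the sampled minimum is h(1) = −0.5, so α = 1 separates; for the point (0, 0.5) no negative value exists. In higher dimensions a nonsmooth convex solver (e.g., a subgradient method) would replace the grid.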

Deriving Linear Inequalities
We assume that a solution to Problem (SP) can be computed. In order to algorithmically utilize this solution in a cutting plane approach, we construct a linear inequality that separates the point (x̄, z̄) from Y based on this solution. We establish the following notation for our analysis.
Definition 2 Let Z ⊆ R^n be a closed convex set. Furthermore, for β ∈ R^n and β_0 ∈ R, we consider the linear inequality β^T z ≤ β_0 and the corresponding hyperplane H(β, β_0) := {z ∈ R^n : β^T z = β_0}. We call H(β, β_0) a cutting plane of Z for z̄ if β^T z ≤ β_0 is valid for Z but not valid for {z̄}.
The respective separation task using cutting planes is defined as follows.

Separation Task 2
Input: A set D ⊆ R^n compact and convex, a continuous function g : D → R^m and a point (x̄, z̄) ∈ D × R^m. Task: Decide whether (x̄, z̄) ∈ Y and, if not, return a cutting plane of Y for (x̄, z̄).

Our analysis and algorithmic results are summarized in the following proposition.
Proposition 3 Let D ⊆ R^n be compact and convex and g : D → R^m continuous. Let (x̄, z̄) ∉ Y and let α* be an optimal solution to Problem (SP). Then, a valid linear inequality for Y is given by (3).
Proof 1. First, we define the half-space given by (3) intersected with the box D as M_g(α*). For any point (x, z) ∈ M_g(α*), the linear inequality is valid. As H(β, β_0) is a supporting hyperplane at (x̄, vex_D[α*^T g](x̄)), the first statement follows. 2. We follow the proof of 1. and derive the analogous inequalities. Combining these, we obtain (x̄, z̄) ∉ N_{g,x̄}(α*, β, β_0), and the statement follows.
A direct consequence of Proposition 3 is that we can find an exact representation of Y based on supporting hyperplanes of the epigraphs of the convex envelopes for different α. For this, let (β(x̄, α), β_0(x̄, α)) ∈ R^{(n+1)+1} with β_{n+1}(x̄, α) < 0 define an arbitrary supporting hyperplane of the epigraph of vex_D[α^T g] at x̄. Summarizing these results, we are able to design cutting planes for the set Y by solving a convex nonlinear optimization problem. We further need to effectively construct vex_D[α^T g](x̄) as well as a supporting hyperplane at a given point x̄ for arbitrary α ∈ B^m. The next subsection deals with these latter two tasks.

Structure of the Convex Envelope
We present some well-known results concerning the structure of the convex envelope of a given function (e.g., see [Locatelli and Schoen, 2013]). They are used in Section 4 in order to exemplarily apply the separation strategy developed above.
The value of the convex envelope of g : R^n → R at a certain point x ∈ D ⊆ R^n can be determined by the following nonconvex optimization problem:

vex_D[g](x) = min Σ_{i=1}^{k} λ_i g(x_i)  s.t.  Σ_{i=1}^{k} λ_i x_i = x,  Σ_{i=1}^{k} λ_i = 1,  λ_i ≥ 0 and x_i ∈ G for i = 1, …, k   (EP)

with a suitably chosen G ⊆ D. Note that k is bounded by n + 1, which is a consequence of Carathéodory's theorem.
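For a univariate function, Problem (EP) can be approximated directly: by Carathéodory's theorem at most two points are needed, so a search over grid segments suffices. The following self-contained sketch is an illustration of (EP), not the paper's method; the discretization and names are our assumptions:

```python
def envelope_value(g, x, lo, hi, n_grid=401):
    # Approximate vex_D[g](x) on D = [lo, hi] via (EP): in one dimension
    # Caratheodory's theorem bounds the number of points by n + 1 = 2, so
    # we minimize the chord value over all grid segments [x1, x2] containing x.
    pts = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    best = g(x)  # trivial convex combination: x1 = x2 = x
    for x1 in pts:
        for x2 in pts:
            if x1 < x2 and x1 <= x <= x2:
                lam = (x2 - x) / (x2 - x1)  # weight on x1
                best = min(best, lam * g(x1) + (1.0 - lam) * g(x2))
    return best
```

For the convex function g(t) = t² the envelope equals g itself, while for the concave g(t) = −t² on [−1, 1] the segment search recovers the chord value −1.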
The most general choice for G in Problem (EP) is G = D. In the following, we will, however, discuss some other valid choices that result in an equivalent problem. For instance, the set G can be chosen as a proper subset of D by using the concept of generating sets. Thereby, we can often strengthen the problem formulation significantly.
Definition 3 (Tawarmalani and Sahinidis [2002]) For a continuous function g : D → R on a compact convex domain D ⊆ R^n, we denote the generating set of g on D by G[g, D]. By definition of the generating set, we can equivalently formulate Problem (EP) by setting G = G[g, D]. Next, we provide a necessary condition that allows us to exclude feasible points from the generating set.
Definition 4 Let a continuous function g : R^n → R be given. The set of concave directions of g at x ∈ R^n is denoted by δ[g, x]; it contains the directions along which g is locally strictly concave at x. In the case of g being twice continuously differentiable at x, the set of concave directions at x ∈ R^n is given by δ[g, x] = {d ∈ R^n : d^T H_g(x) d < 0}, where H_g(x) denotes the Hessian matrix of g at x.
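For twice continuously differentiable bivariate functions, checking whether δ[g, x] is empty reduces to an eigenvalue test on the Hessian. A minimal sketch for the 2 × 2 case (the helper name is ours):

```python
def has_concave_direction(H):
    # delta[g, x] = {d : d^T H d < 0} is nonempty iff the symmetric
    # Hessian H = [[a, b], [b, c]] has a negative eigenvalue.
    a, b, c = H[0][0], H[0][1], H[1][1]
    tr = a + c
    det = a * c - b * b
    lam_min = (tr - (tr * tr - 4.0 * det) ** 0.5) / 2.0  # smaller eigenvalue
    return lam_min < 0
```

For example, the indefinite Hessian [[1, 2], [2, 1]] has eigenvalues −1 and 3 and hence admits concave directions, while a positive definite Hessian does not.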
A necessary condition for an interior point x ∈ int(D) to be an element of the generating set is that the function g is strictly locally convex at x. This leads to the following observation (see [Tawarmalani and Sahinidis, 2002, Cor. 5]).
Throughout our analysis, it is often useful to not set G = G[g, D], but to use a superset of the generating set instead. In certain cases, this will especially allow us to find an optimal solution {λ*; x*_1, …, x*_{n+1}} of Problem (EP) with a reduced number of nonzero components of the vector λ* ∈ R^{n+1}. To indicate the nonzero components of a vector λ*, we define I(λ*) := {i ∈ {1, …, n+1} : λ*_i ≠ 0}.

Definition 5 For D ⊆ R^n, g : D → R, x ∈ D and some G ⊆ D, let {λ*; x*_1, …, x*_{n+1}} denote an optimal solution to Problem (EP) such that the cardinality |I(λ*)| is minimal. Then, we call S_{g,G}(x) := conv({x*_i : i ∈ I(λ*)}) a minimizing simplex for x w.r.t. g and G. If |I(λ*)| ≤ 2, then we also use the term minimizing segment for S_{g,G}(x).
In order to determine the convex envelope of a specific function, it is advantageous to know the dimension of the minimizing simplices beforehand. If there exists, for some G ⊆ D and for every x ∈ D, a minimizing segment w.r.t. g and G, then we say that the convex envelope of g (on D) consists of minimizing segments w.r.t. G. Note also that in this case the convex envelope of g also consists of minimizing segments with respect to every superset Ḡ ⊇ G.
Furthermore, we can exclude points and pairs of points from being part of minimizing simplices by again using the concept of concave directions.
Observation 2 Let D ⊆ R^n be convex and g : D → R be continuous. Let conv({x_i | i ∈ {1, …, m}}) be a minimizing simplex for a given x w.r.t. g and D.
1. If x_i ∈ int(D) holds for some i ∈ {1, …, m}, then δ[g, x_i] = ∅.
2. For every pair x_i, x_j, i ≠ j, i, j ∈ {1, …, m}, there exists x̃ ∈ conv({x_i, x_j}) with x_j − x_i ∈ δ[g, x̃].

Convex Envelope of Quadratic Absolute Value Functions

The proposed separation strategy relies on the availability of an algorithmically utilizable representation of the convex envelope of linear combinations of the constraint functions. As deriving the convex envelope of arbitrary functions is beyond the capability of the current state of research, it is common to only consider a specific function class. In the following, we exemplarily restrict ourselves to bivariate quadratic absolute value functions. We derive the convex envelope for this kind of function, which allows us to apply the separation strategy to the corresponding constraint sets. As quadratic absolute value functions are used in the challenging field of gas network operation, we are also able to evaluate the impact of our strategy on a real-world application.
Fig. 1 A single junction in a gas network.
This is done in Section 5, where we derive stronger convex relaxations for some small examples of gas network optimization problems.
In this section, we briefly describe the gas network setting and the resulting constraint structure. We present some analytic tools and concepts that help in deriving the convex envelope in general, and apply them to the given function class. The concrete representation of the envelopes is mostly deferred to the appendix.

Constraint Structure in Gas Networks
A gas network in its simplest form consists of a system of connected pipes. In our setting, we neglect all other components like compressor stations or control valves. We consider the stationary case without dependency on time. Gas flows through the pipes based on the pressure differences at the respective end points. Mathematically, the network is modeled as a graph with arcs representing pipes and nodes representing coupling points. We focus on analyzing a single junction in a gas network consisting of four nodes and three arcs. Every node has a corresponding squared pressure variable π_i (i = 1, 2, 3, 4) and every arc has a corresponding flow variable q_j (j = 1, 2, 3). See Figure 1 for a visualization. The relevant constraints connecting these values are given by (4). Note that the direction of the flow is given by the sign of each flow variable. The respective flow is directed as shown in Figure 1 for a positive sign and the other way around for a negative sign. We reformulate (4) into a vector-valued constraint function ḡ. As motivated in Section 3, we derive the convex envelope of ᾱ^T ḡ for arbitrary ᾱ ∈ B^3. Note that ḡ is separable and that the parameter c is simply a scaling factor. Further, we identify x_1 = q_1, x_2 = q_2, reduce our analysis to the convex envelope of g_α := α^T g, and further reduce the complexity by fixing the direction of flow in the considered junction.
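The pipe constraints couple squared pressures to a flow term of the form q|q|. Since the exact form of (4) is not fully reproduced here, the following Python fragment is only an assumed sketch of the pressure-loss term and of the effect of fixing the flow direction:

```python
def pressure_drop(q, c):
    # Pressure-loss term of a pipe: difference of squared pressures equals
    # c * q * |q| with a pipe constant c > 0 (assumed form of constraint (4)).
    # The sign of q encodes the flow direction.
    return c * q * abs(q)
```

Once the sign of q is fixed, say q ≥ 0, the term c·q·|q| reduces to the quadratic c·q², which is exactly the reduction exploited below.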
When all flow directions are fixed, we can assume that the variables x_1 and x_2 are nonnegative. This implies that the underlying functions reduce to quadratic ones. In this case, a complete description of Y is given in [Anstreicher and Burer, 2010].
The first case not covered by the literature is therefore defined by two fixed flow directions and one variable flow direction. Without loss of generality, the only variable with unfixed sign is x_1. We assume that x_2 ≥ 0 and x_1 + x_2 ≥ 0 hold, i.e., the corresponding absolute value terms reduce to squares. Thus, we are interested in determining the convex envelope of g_α. The Hessian matrix of g_α depends on the sign of x_1. Observe that due to scaling we can assume that α_1 ∈ {−1, 0, 1}. In the case of α_1 = 0, g_α reduces to a quadratic function again. For the remainder of the section, we restrict ourselves to α_1 = 1, as the case α_1 = −1 is similar and can be obtained by symmetric considerations.
There are nine remaining cases that need to be discussed, depending on the specific values of the parameters α_2 and α_3. They define the curvature properties of g_α. In order to distinguish the different cases, we use the following definition.
Definition 6 Let g : R^n → R be continuous and D ⊆ R^n convex.
Using this notation, the nine different cases can be distinguished as listed in Table 1. The first column denotes the (sub)cases and the second one lists the conditions on α for the respective case. Columns three to five give the curvature of g_α with respect to both components and in general. Concave-convex means that g_α is direction-wise concave for x < 0 and direction-wise convex for x ≥ 0, and indefinite-convex means that g_α is indefinite for x < 0 and convex for x ≥ 0. Concave/indefinite indicates that g_α is either concave or indefinite. We do not elaborate the analysis for all of these cases. Instead, we discuss interesting results and properties exemplarily for the two most complicated (sub)cases. The remaining ones are either trivial, are obtained by results from the literature, or can be derived using arguments similar to those presented here. For the sake of completeness, they can still be found in the appendix. Subsection 4.2 considers Case (2.b.iii) and uses the concept of (n−1)-convex functions (see Jach et al. [2008]) in order to show that the convex envelope of g_α consists of minimizing segments. In Subsection 4.3, we introduce the direction-wise convex envelope and reduce Case (3) to Case (2).

Reduction to Minimizing Segments
We consider Case (2.b.iii). The function g_α is direction-wise convex w.r.t. components 1 and 2. Furthermore, it is convex for x_1 ≥ 0 and indefinite for x_1 < 0. We show that the convex envelope consists of minimizing segments and derive them for any given point x ∈ D.
For this, we make use of the concept of (n − 1)-convex functions as introduced in [Jach et al., 2008].
Definition 7 Let g : R^n → R be a twice differentiable function. g is said to be (strictly) (n − 1)-convex if the function g|_{x_i = x̄_i} : R^{n−1} → R is (strictly) convex for each fixed value x̄_i ∈ R and for all i = 1, …, n.
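Definition 7 can be checked numerically for a bivariate function by testing the second difference quotient of every one-dimensional restriction on a grid. A rough sketch (names, grid and tolerances are our assumptions):

```python
def restriction_convex(g, i, box, n=21, h=1e-4, tol=1e-9):
    # Tests whether every restriction of the bivariate g with coordinate i
    # varying (the other coordinate fixed) is convex on the box, i.e. the
    # central second difference along coordinate i is nonnegative.
    (l1, u1), (l2, u2) = box
    grid1 = [l1 + (u1 - l1) * k / (n - 1) for k in range(n)]
    grid2 = [l2 + (u2 - l2) * k / (n - 1) for k in range(n)]
    for a in grid1:
        for b in grid2:
            x = [a, b]
            xp, xm = list(x), list(x)
            xp[i] += h
            xm[i] -= h
            if g(xp) - 2.0 * g(x) + g(xm) < -tol:
                return False
    return True
```

The bilinear function g(x) = x_1 x_2 is indefinite yet (n − 1)-convex, since every restriction is linear, whereas g(x) = −x_1² fails the test along the first coordinate.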
For indefinite functions with this property, the authors give a statement on the structure of the concave directions.
Lemma 1 [Jach et al., 2008, Lemma 3.2] Let g : D = [l, u] ⊆ R^n → R be a twice differentiable function, and let the collection {O_1, …, O_{2^n}} be the system of open orthants of the space R^n. Then, the function g is (n − 1)-convex and indefinite if and only if δ[g, x] is nonempty for each x ∈ D and there exists an index i ∈ {1, …, 2^n} such that δ[g, x] ⊆ O_i ∪ (−O_i) holds for all x ∈ D.

This statement can be extended to the function g_α in Case (2.b.iii): g_α can be divided into an indefinite (n − 1)-convex function for negative x_1 and a convex function for positive x_1. The property stated in Lemma 1 therefore also holds for g_α, as shown in the following corollary.
Proof We divide the function g_α into two parts, a function g_α^− for x_1 ≤ 0 and a function g_α^+ for x_1 ≥ 0. The concave directions at a point x with x_1 = 0 need to be discussed separately.

We consider the direction d = (d_1, d_2) and distinguish the two cases d_1 = 0 and d_1 ≠ 0. For d_1 = 0, the function h_{g,x,d}(λ) is convex in λ, as g_α is direction-wise convex w.r.t. component 2. Therefore, any d with d_1 = 0 is not a concave direction. For d_1 ≠ 0, the function h_{g,x,d}(λ) with λ ∈ [−ε, ε] always has a domain that includes points whose first component attains values strictly greater than zero for every ε > 0. As g_α is convex for x_1 > 0, d is again not a concave direction. Summarizing these results, we obtain δ[g_α, x] = ∅ for every x ∈ D with x_1 = 0. As g_α^− is an (n − 1)-convex and indefinite function, we can apply Lemma 1 to conclude that there exists an index i ∈ {1, 2, 3, 4} such that for all x ∈ D the set of concave directions of g_α at x is contained in O_i ∪ (−O_i). This structure of the concave directions can be used to show the existence of minimizing segments for every point x ∈ D w.r.t. a set G (see Definition 5). As the existence of minimizing segments depends on the choice of G, we first define suitable subsets G_1, …, G_4 of D and set G := G_1 ∪ … ∪ G_4. See Figure 2(a) for a graphic representation of the subsets G_i. Using Observation 1, it is easy to see that G[g_α, D] ⊆ G holds. The existence of minimizing segments is then given by the following corollary. A similar case and basic ideas of the proof are already provided in [Jach et al., 2008, Theorem 3.1].
Corollary 2 Let α be as given in Case (2.b.iii), D = [l, u] with l_2 ≥ 0 and G as defined above. Then the convex envelope of g_α on D consists of minimizing segments w.r.t. G.
Proof Note the following preliminary considerations. Let E := D \ G. The proof of Corollary 1 already indicates that g_α^− is indefinite. The set of concave directions of g_α at x ∈ E is therefore nonempty.
For the proof, we show that the convex envelope of g_α on D consists of minimizing segments w.r.t. D. Using Observation 2, it is easy to see that there are no extreme points of any minimizing segment inside E. We conclude that the convex envelope of g_α on D also consists of minimizing segments w.r.t. G.
Assume that there exists a point x ∈ D with a minimizing simplex S_{g_α,D}(x) consisting of at least three different extreme points, i.e., pairwise different x_1, x_2, x_3 ∈ extr(S_{g_α,D}(x)). According to Observation 2, we have x_1, x_2, x_3 ∉ E, as δ[g_α, x] ≠ ∅ holds for all points x ∈ E. We conclude that x_1, x_2, x_3 ∈ G and we only need to distinguish two cases: 1. Two of the points x_1, x_2, x_3 are elements of the same subset G_i for some i ∈ {1, …, 4}. The function g_α is convex on G_i for every i ∈ {1, …, 4}. This leads to a contradiction based on Observation 2. 2. No two points are elements of the same subset G_i for any i ∈ {1, …, 4}.
For all possible combinations, one of the three vectors (x_1 − x_2), (x_2 − x_3) and (x_3 − x_1) is not an element of the same pair of open orthants O_i ∪ (−O_i) for all i = 1, …, 4. According to Corollary 1, g_α is convex on at least one of the three sets conv({x_1, x_2}), conv({x_2, x_3}) or conv({x_3, x_1}). This contradicts Observation 2 again.
Hence, the convex envelope of g_α on D consists of minimizing segments w.r.t. D. We conclude our statement by using Observation 2 as described above. Next, we construct a minimizing segment for any given point x ∈ D w.r.t. g_α and G. For this, we first analyze the structure of concave directions in order to apply Observation 2.
Either way, there exists a vector such that (7) holds for all x ∈ D. This property of the structure of concave directions will be used together with Observation 2 in the following analysis. By Corollary 2, there exists a minimizing segment S_{g_α,G}(x) for any given point x ∈ D. We denote the two extreme points of S_{g_α,G}(x) by p_1 and p_2. By definition of G, we have p_1 ∈ G_i and p_2 ∈ G_j for some i, j ∈ {1, …, 4}. Next, we classify possible minimizing segments for all combinations of i and j (exploiting symmetry). For this, consider the following "easy" cases first. i = j: As g_α is convex on G_k for k = 1, …, 4, we have p_1 = p_2 = x in this case. (i, j) = (1, 3): Using (7) and Observation 2, we derive that there are no minimizing segments with p_1 ∈ G_1 and p_2 ∈ G_3 except for p_1 = p_2 = x = (l_1, u_2). (i, j) = (2, 4): Using (7) and Observation 2 again, this leads to p_1 = p_2 = x = (0, l_2). (i, j) = (2, 3): See Case (2.a) in the appendix. (i, j) = (1, 2): See Case (2.b.i) in the appendix.
The first interesting combination is (i, j) = (1, 4). For every given point x ∈ D and every extreme point p_1 := (l_1, r) ∈ G_1 of a possible minimizing segment of x, we consider the ray L starting at p_1 in the direction of x. We determine the convex envelope of g_α restricted to L, and thereby detect the second extreme point p_2 := (s, t) ∈ G_4 (see Figure 2(b)). The point p_2 is given as the point with a directional derivative coinciding with the gradient of the line connecting (p_1, g_α(p_1)) and (p_2, g_α(p_2)). Note that s ≥ 0 holds because of p_2 ∈ G_4. We further introduce a new variable µ, which can be interpreted as the distance between p_1 and p_2 relative to the distance between p_1 and x. Combining the equations above, each of the variables s, t and µ depends on r. We insert this information into the problem used to derive the value of the convex envelope at a given point x (see (EP)). This results in the one-dimensional optimization problem (8) in the variable r, which has to be solved in order to determine the actual minimizing segment and the value of the convex envelope at x. Using basic transformations, we first reformulate the objective function into a form h(r) plus a constant c not depending on r, which can be omitted for the sake of optimization. We aim to apply the first-order optimality condition and consider the derivative of h(r) (with µ also depending on r). In order to determine the optimal solution, the roots of (1 − µ) do not have to be considered, as µ = 1 would result in p_2 = x. Hence, the optimal value of (8) is attained at the remaining root of the derivative or at the boundary of the interval [l_2, u_2]. The minimum of (8) cannot be attained at r = l_2, because this would result in a minimizing segment not including a concave direction (see (7) and Observation 2). The two remaining possible optimal solutions are therefore r_1 = x_2 + (α_2 / (α_2 + α_3)) (x_1 − l_1) and r_2 = u_2, and their respective minimizing segments.
For (i, j) = (3, 4), the possible minimizing segments can be derived in a similar way. Combining these results, we obtain possible minimizing segments for all combinations of i, j ∈ {1, . . ., 4}. As all combinations are considered, the actual minimizing segment for x has to be one of them. In order to determine it, we compute the values of the convex combinations induced by all different possible segments and take the lowest one.
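The selection rule just described can be sketched generically: evaluate, for each candidate segment, the convex combination it induces at x and keep the lowest value. The following Python sketch assumes a generic bivariate function g and a precomputed candidate list; the names `segment_value` and `envelope_at` are illustrative, not taken from the paper.

```python
def segment_value(g, p1, p2, x):
    """Value at x of the convex combination induced by the segment [p1, p2].

    x is assumed to lie on the segment; lam solves x = (1-lam)*p1 + lam*p2.
    """
    # Use the coordinate with the larger spread for numerical stability.
    d0, d1 = p2[0] - p1[0], p2[1] - p1[1]
    if abs(d0) >= abs(d1):
        lam = 0.0 if d0 == 0 else (x[0] - p1[0]) / d0
    else:
        lam = (x[1] - p1[1]) / d1
    return (1 - lam) * g(p1) + lam * g(p2)

def envelope_at(g, x, candidates):
    """Lowest value at x over a list of candidate minimizing segments."""
    return min(segment_value(g, p1, p2, x) for p1, p2 in candidates)
```

In the paper's setting, the candidate list for a point x consists of the segments derived case by case above; the minimum then yields the envelope value at x.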
Figure 3 shows, by way of example, the resulting structure of the minimizing segments for different x ∈ D. We use green lines for segments with (i, j) = (1, 2), magenta and red lines for segments with (i, j) = (1, 4), and blue lines for segments with (i, j) = (3, 4). Yellow lines show intermediate segments with one extreme point at (l 1 , u 2 ). Black dots indicate minimizing segments of dimension zero inside the set G 4 ; there, the function g α coincides with its convex envelope.

The Direction-Wise Convex Envelope
We consider Case (3) and all its subcases. In order to handle these cases, we introduce the concept of direction-wise convex envelopes and show how it can be used to reduce Case (3) to results from Case (2).
Here, α 2 ∈ (−1, 1) holds, so that, w.r.t. component 1, g α is direction-wise concave for x 1 ≤ 0 and direction-wise convex for x 1 ≥ 0. For a function g that is not direction-wise convex w.r.t. a certain coordinate i ∈ {1, . . ., n} on the whole set D, we can construct a function with this property by computing the convex envelope of g restricted to a line segment defined by fixing the value of x j for all j ≠ i.

Definition 8
The direction-wise convex envelope of g on D w.r.t. component i is defined as γ D,i [g](x) := vex D(x,i) [g](x), where D(x, i) := {y ∈ D : y j = x j for all j ≠ i} denotes the line segment through x in direction of coordinate i. In certain cases, this transformation preserves the direction-wise curvature with respect to the other coordinates. To be more precise, we require that the entries of coordinate i of the generating set of vex D(x,i) [g](x) are the same for all x ∈ D. We denote the respective entries by X i in the following result. Note that the entries of the remaining coordinates are already given by x.
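Numerically, the direction-wise convex envelope can be approximated by sampling g along the line segment through x in coordinate direction i and taking the lower convex hull of the sampled points. The following sketch uses Andrew's monotone-chain construction on a uniform grid; the names `directionwise_envelope` and `convex_envelope_1d` are illustrative, and the grid-based scheme is an approximation rather than the paper's analytic derivation.

```python
def convex_envelope_1d(ts, vals):
    """Lower convex hull of the points (ts[k], vals[k]); ts must be increasing.
    Returns the envelope values at the same abscissae."""
    hull = []  # indices of the hull vertices
    for k in range(len(ts)):
        while len(hull) >= 2:
            i, j = hull[-2], hull[-1]
            # Drop j if it lies on or above the chord from i to k.
            if (vals[j] - vals[i]) * (ts[k] - ts[j]) >= (vals[k] - vals[j]) * (ts[j] - ts[i]):
                hull.pop()
            else:
                break
        hull.append(k)
    # Interpolate the hull piecewise-linearly back onto the grid.
    env, h = [], 0
    for t in ts:
        while h + 1 < len(hull) and ts[hull[h + 1]] <= t:
            h += 1
        i = hull[h]
        j = hull[min(h + 1, len(hull) - 1)]
        if i == j or ts[j] == ts[i]:
            env.append(vals[i])
        else:
            lam = (t - ts[i]) / (ts[j] - ts[i])
            env.append((1 - lam) * vals[i] + lam * vals[j])
    return env

def directionwise_envelope(g, x, i, lo, hi, n=201):
    """Approximate gamma_{D,i}[g](x): envelope of g along coordinate i
    through x on the interval [lo, hi]."""
    ts = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    def on_line(t):
        y = list(x)
        y[i] = t  # fix all coordinates except i
        return g(y)
    vals = [on_line(t) for t in ts]
    env = convex_envelope_1d(ts, vals)
    k = min(range(n), key=lambda k: abs(ts[k] - x[i]))
    return env[k]
```

For a function that is concave along coordinate i, this returns the linear interpolation between the interval endpoints, as the definition requires.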
Lemma 2 Let g : R n → R be continuous and direction-wise (strictly) convex/concave w.r.t. component k ∈ {1, . . ., n}. Let D ⊆ R n be a compact convex set and let i ∈ {1, . . ., n}, i ≠ k, be given. If there is a set X i such that the entries of coordinate i of the generating set of vex D(x,i) [g](x) are given by X i for all x ∈ D, then γ D,i [g] is direction-wise (strictly) convex/concave w.r.t. component k.
Proof We discuss the proof only for the statement on direction-wise convexity.
The results for direction-wise concavity and strictness can be derived analogously.
Corollary 3 Let D ⊆ R n and g : D → R. Let γ D,i [g](x) be the direction-wise convex envelope of g on D w.r.t. some component i ∈ {1, . . ., n}. Then vex D [γ D,i [g]](x) = vex D [g](x) holds for all x ∈ D.
In order to make use of this result, we first derive the direction-wise convex envelope of g α w.r.t. component 1. The function γ D,1 [g α ](x) restricted to x 1 < s is twice differentiable, so we can derive its Hessian matrix; it shows that γ D,1 [g α ] is direction-wise convex w.r.t. component 1 and indefinite for x 1 < s. Note that, for all subcases, the direction-wise convexity/concavity w.r.t. component 2 is also preserved, as stated in Lemma 2. This holds because the value of s in the definition of γ D,1 [g α ] is independent of x 2 .
By applying these results, the convex envelope of γ D,1 [g α ] can be derived for all subcases of Case (3). Using the convex envelope of γ D,1 [g α ], the convex envelope of g α can then easily be derived by translating the minimizing segments of γ D,1 [g α ] into minimizing simplices of g α . We refer to the Appendix for more detailed information.

Computational Results
In this section, we evaluate the proposed separation strategy from Section 3 on the feasible sets arising from two test networks. We aim to show that the resulting cutting planes are well suited to tighten the convex relaxation of the feasible set provided by state-of-the-art software packages. Additionally, we show that the computation of the cutting planes is inexpensive compared to their benefits. We present the test setting in Section 5.1 and the strategy of our implementation in Section 5.2, and discuss the computational results in Section 5.3.

Test Setting
We consider two test networks. The first one is artificially designed and denoted by "Net1" (see Figure 4). It has 7 nodes, 4 of which are terminal nodes that may have nonzero demand. Furthermore, it has 9 arcs and 3 interior junctions. The topology of the second one is taken from a gas network library (GasLib-11, Schmidt et al. [2017], see Figure 5) and is denoted by "Net2". It has 11 nodes, 6 terminals, 11 arcs and 5 interior junctions.
For both networks we consider three different settings each, given by different bounds (box constraints) on the involved variables, including the demands at the terminal nodes.In order to work with reasonably tight variable bounds, all bounds have been tightened by applying the preprocessing routines implemented in the Lamatto++ software framework for gas network optimization, described in [Geißler, 2011, Chapter 7].
In order to evaluate the relaxed feasible set in multiple "directions", we further consider ten different objective functions for each setting, all of which are formally minimized in our computations. These objectives are given as linear combinations of the squared pressure and flow variables in the network. They are mostly inspired by applications and include finding minimum or maximum values for the demand of a specific terminal node (objectives 1, 2, 5 and 6), the (squared) pressure at a specific node (objectives 3 and 4), the squared pressure difference between two specific nodes (objective 10), and sparse linear combinations of flow variables (objectives 7, 8 and 9).
This yields 30 different combinations of bound setting and objective function for each network. We call each combination a scenario and denote it by the identifier i-j, where i specifies the bound setting and j the objective number.
Fig. 4 Visualization of Net1
Fig. 5 Visualization of Net2
The formulation of every interior junction in the considered networks is adapted according to Section 4.1 and therefore has the desired structure of Problem (OP). The bounds on the variables have been chosen in such a way that several flow directions are fixed automatically and at most one flow direction at every junction remains unfixed. This allows us to compute the convex envelope of all linear combinations of the constraint functions for every junction (see Section 4).
As a result, we are able to perform Separation Task 2 on the feasible set of every single junction. Note that the network as a whole does not have the desired structure of Problem (OP), as additional constraints are needed to describe the coupling between the junctions. Therefore, we are not able to separate from the convex hull of the feasible set of the whole network, but only from the convex hulls of subsets. The design of the implementation is described in the next section.

Implementation
The strategy of our computations is the following: We design and solve a linear relaxation of the network. For each interior junction in the network, we perform Separation Task 2 by solving Separation Problem (SP). This way, we either confirm the solution of the relaxation or derive a cutting plane that is added to the description of the relaxation. We iterate this procedure until no further cutting planes are found or until a maximum number of 100 iterations is reached.
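The loop just outlined can be sketched as follows; `solve_lp` and `separate` are placeholders for the linear-relaxation solver and the per-junction separation routine (SP), not the paper's actual implementation:

```python
def cutting_plane_loop(solve_lp, separate, junctions, max_iter=100):
    """Iterative separation: solve the LP relaxation, try to cut off its
    solution at every interior junction, repeat until no cut is found.

    solve_lp(cuts) -> candidate solution of the relaxation with the cuts added;
    separate(junction, sol) -> a violated valid cut, or None if sol lies in
    the convex hull of that junction's feasible set.
    """
    cuts = []
    for _ in range(max_iter):
        sol = solve_lp(cuts)
        new_cuts = [c for c in (separate(j, sol) for j in junctions)
                    if c is not None]
        if not new_cuts:
            return sol, cuts  # solution confirmed for every junction
        cuts.extend(new_cuts)
    return solve_lp(cuts), cuts  # iteration limit reached
```

The loop terminates either because the relaxation solution is confirmed at every junction or because the iteration cap (100 in the paper's experiments) is hit.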
The Separation Problem (SP) is solved by a simple subgradient method. We use an arbitrary value α from the unit sphere S 2 ⊆ R 3 as a starting point and compute the objective value and a subgradient according to Section 4. We use a diminishing step size and a stopping criterion based on the iteration count and the improvement of the objective function. We also apply several standard methods to avoid numerical issues, such as excluding cutting planes that are approximate conic combinations of other cutting planes, or safe rounding of coefficients. Note that this part of the implementation is not optimized in terms of computational efficiency, as the focus lies on the quality of the resulting cutting planes.
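A minimal sketch of such a subgradient scheme, assuming the separation objective is concave in α (as shown for (SP) in the paper) and that `value_and_subgrad` supplies its value and a subgradient at a given α; the stopping criterion is simplified here to a plain iteration limit, and the function name is illustrative:

```python
import math, random

def subgradient_on_sphere(value_and_subgrad, dim=3, max_iter=500):
    """Maximize a concave function over the unit sphere S^{dim-1} by
    projected subgradient ascent with diminishing step size 1/k.

    value_and_subgrad(alpha) must return (value, subgradient) at alpha.
    """
    # Random starting point on the unit sphere.
    alpha = [random.gauss(0.0, 1.0) for _ in range(dim)]
    nrm = math.sqrt(sum(a * a for a in alpha))
    alpha = [a / nrm for a in alpha]
    best_val, best_alpha = value_and_subgrad(alpha)[0], alpha[:]
    for k in range(1, max_iter + 1):
        val, sub = value_and_subgrad(alpha)
        if val > best_val:
            best_val, best_alpha = val, alpha[:]
        step = 1.0 / k  # diminishing step size
        alpha = [a + step * s for a, s in zip(alpha, sub)]
        nrm = math.sqrt(sum(a * a for a in alpha))
        if nrm > 0:
            alpha = [a / nrm for a in alpha]  # project back onto the sphere
    return best_alpha, best_val
```

In the paper's setting, a positive best value corresponds to a violated inequality, and the associated α yields the cutting plane; the numerical safeguards mentioned above (filtering near-redundant cuts, safe rounding) would be applied afterwards.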
If we only considered the progress of the objective value in the iterative linear relaxation outlined above, we would merely confirm that the cutting planes carry additional information compared to the linear relaxation. However, our aim is to show that the cutting planes also tighten the "standard" relaxation provided by a state-of-the-art solver for MINLPs and are beneficial for its overall solution process. We chose BARON 18.5.8 (Tawarmalani and Sahinidis [2005]) for this comparison. BARON does not allow the user to interfere with the solution process or to integrate custom optimization techniques. Therefore, we add the precomputed cutting planes to the model description and let BARON solve the problem with and without these additional constraints. We deactivate presolving routines and primal heuristics, and directly provide an optimal solution to the solver. This way, we are able to analyze the influence of our separation strategy on the quality of the convex relaxation and the resulting lower bounds.
We further deactivate the bound tightening strategies provided by BARON, as they are also not available for the iterative linear relaxations used to derive the cutting planes. Note that this slows down the solver substantially, to the extent that very easy instances, solved in less than a second by BARON with standard parameter settings, become challenging. Indeed, this is the case for most scenarios considered here.
However, without this modification the cutting planes would be applied on a different relaxed feasible set than the one they were constructed for.
Except for the points above, we use the default options of BARON. All computations are carried out on a 2.6 GHz Intel Xeon E5-2670 processor with a limit of 32 GB of memory for each run.

Results
We first discuss the results of the iterative linear relaxation for both networks. Note that, in our setting, the iterative linear relaxation is only used to derive the cutting planes. We are not interested in comparing the quality of the linear relaxation with BARON, but in analyzing the influence of the generated cuts on the quality of the lower bounds obtained by BARON as a stand-alone solver. Therefore, we omit any further information on the solution process of the linear relaxation.
In a second step, we present the results obtained by BARON (see Tables 2 and 3). In columns 2 and 3, we display the optimal value of the respective scenario and the lower bound at the root node obtained by BARON alone. Their difference is denoted as the gap. For all further settings, we display the respective lower bounds in terms of the percentage of this gap that was closed by the solver. Column 4 gives the lower bound at the root node obtained by BARON with the use of our cutting planes (w/ cuts). Columns 5 and 6 show the lower bounds after 10 minutes of the solution process: column 5 for BARON alone and column 6 for BARON with the additional use of our cutting planes. Note that several of our 30 scenarios are already solved in the root node. The solution process of these scenarios does not offer any information in terms of the improvement of lower bounds, so they are excluded from the presentation.
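The "percentage of gap closed" reported in the tables can be made precise with a small helper (a hypothetical convenience function, following the convention used in the tables of rounding down to the first decimal):

```python
import math

def gap_closed_percent(opt, lb_root, lb):
    """Percentage of the root gap (opt - lb_root) closed by the bound lb,
    rounded down to one decimal place."""
    gap = opt - lb_root
    if gap <= 0:
        return 100.0  # no gap at the root node
    closed = max(0.0, min(1.0, (lb - lb_root) / gap))
    return math.floor(1000.0 * closed) / 10.0
```

With this convention, 100 % gap closed means the lower bound has reached the optimal value, i.e., the scenario is solved.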
We discuss the artificially designed network Net1 first. 13 of the 30 scenarios are already solved in the root node and are not considered further. For the remaining scenarios, our iterative linear relaxation generates between 14 and 57 cutting planes, with an average of 32 cutting planes per scenario. This also implies that the iteration limit of 100 was never hit for these instances. The computational effort needed for the construction of the cuts is negligible: for every scenario, the computation of all cuts together takes less than a quarter of a second. The results obtained by BARON are given in Table 2. Our generated cuts clearly improve the quality of the lower bounds at the root node for almost all instances (see column 4). This improvement ranges between 0 % and 84 %, with a mean value of approximately 35 %. After 10 minutes of the solution process, the lower bounds obtained with the cuts are still significantly better than the ones obtained without them (see columns 5 and 6). The average values are 18 % and 70 % respectively, a difference of 52 percentage points. Note that the percentages in columns 4 to 6 are rounded down to the first decimal, so 100 % gap closed for scenario 1-5 means that it is solved to optimality thanks to the cutting planes. Scenario 3-1 is solved to optimality both with and without cuts, while the former version is significantly more efficient (5.87 s vs. 106.14 s until successful termination). We also tested giving the solver more time than 10 minutes. However, only extremely slow progress can be observed afterwards and, consequently, the key figures after 1 h are very similar to the ones given in columns 5 and 6 of Table 2. In particular, the positive effect of our cutting planes on the obtained lower bounds does not seem to diminish for larger time limits.
Next, we discuss the library network Net2. 11 scenarios are already solved in the root node. The number of generated cuts ranges from 13 to 52, with an average of 27 per scenario. Furthermore, the construction of the cuts is again performed in less than a quarter of a second. The results obtained by BARON are given in Table 3. Our observations for Net2 are similar to the ones for Net1. The improvement of the lower bounds at the root node has a large range and a mean value of 54 % (see column 4). After 10 minutes of the solution process, the average amount of gap closed is 17 % without the cuts and 69 % with the cuts (see columns 5 and 6). For three scenarios (3-2, 3-7 and 3-10), over 99 % of the gap has been closed within 10 minutes thanks to the cutting planes.
Again, after more than 10 minutes, very little solver progress can be observed, and our observations remain valid when considering the results after one hour of computation time. It is worth noting, though, that the use of our cutting planes led to scenario 3-7 being solved to optimality within 1 h.
We conclude that our separation method for this special application can be performed in a fraction of a second. This result is expected, as the value and a subgradient of our Separation Problem (SP) can be derived "mostly" analytically (see Section 4). Additionally, the designed cuts are well suited to improve the convex relaxation of the considered MINLP and the resulting lower bound. Furthermore, the amount of gap closed after 10 minutes is higher with the use of the cuts for every single scenario. This indicates that the growth of the model formulation caused by the additional constraints is insignificant compared to the benefits provided by the cuts. We expect the separation strategy to be even more efficient if it is integrated into an MINLP solver. In such a setting, cutting planes can not only be designed adaptively to cut off solutions of intermediate node relaxations, but can also profit from tightened variable bounds becoming available during the solution process, and be strengthened accordingly.

Conclusion
In this paper, we proposed an algorithmic framework for tightening convex relaxations of MINLPs as they are routinely constructed by MINLP solvers. The method is based on separating valid inequalities for the simultaneous convex hull of multiple constraint functions, which is generally much tighter than standard relaxations that are only based on convexifying individual constraint functions separately. A key ingredient for this strategy is a suitable representation of the (lower-dimensional) convex hull of all linear combinations of the considered constraint functions. Under this assumption, we showed that one can solve Separation Tasks 1 and 2 efficiently and thus determine a supporting hyperplane of the simultaneous convex hull cutting off any given point outside of it.
As an example application, we considered a constraint structure from gas network optimization, which can be modeled using quadratic absolute value functions. For this special case, we were able to determine the convex hull of arbitrary linear combinations of constraint functions belonging to a certain local substructure and to successfully test the proposed approach. Indeed, the computational results suggest that already a small number of cutting planes derived from a standard relaxation can significantly improve the performance of a state-of-the-art MINLP solver. The positive effect on the dual bound when using our cutting planes also persists when looking deeper into the solution process. Furthermore, it clearly outweighs the time needed for the separation. In our tests, the separation routine neither had access to possible variable bound improvements after the root relaxation, nor was it given the opportunity to cut off optimal solutions from node relaxations after branching. We therefore expect that it would be even more effective when incorporated directly into a branch-and-bound solver.
It would be interesting to see the presented approach being applied to other function classes and/or application areas in the future. However, computing convex envelopes of arbitrary linear combinations of constraint functions is very difficult and cannot be expected to be feasible in every case. Therefore, techniques producing tight approximations that are consistent in the sense that (SP) remains a convex problem would also be of great interest from a practical point of view.

Appendix

Case (1.a)
In this case, g α is direction-wise concave w.r.t. components 1 and 2. As our domain D is a box, g α (x) is also called edge-concave. Functions with this property and their convex envelopes are studied, for example, in [Meyer and Floudas, 2005].
In order to decide which of the two possible triangulations determines the convex envelope, we need to compare the respective values at the center point (l 1 + u 1 , l 2 + u 2 )/2 of D (e.g., see Meyer and Floudas [2005]). The corresponding possible minimizing segments for the center point are given by conv{(l 1 , l 2 ), (u 1 , u 2 )} and conv{(l 1 , u 2 ), (u 1 , l 2 )}. It turns out that the corresponding inequality holds for x 2 ≥ 0 and that, hence, triangulation T 1 determines the convex envelope. The resulting minimizing simplices are given by conv(G 1 ) (green region in Figure 6), conv(G 2 ) (blue region in Figure 6) and conv{(l 1 , l 2 ), (u 1 , u 2 )} (red line in Figure 6).
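The center-point test can be sketched directly: since the center is the midpoint of either diagonal, the two triangulation values there are just the averages of the corresponding corner values, and the smaller one wins. A hedged illustration (`pick_triangulation` is a hypothetical name; g must be edge-concave on the box for the result to characterize the convex envelope):

```python
def pick_triangulation(g, l1, u1, l2, u2):
    """Decide which diagonal triangulation of the box [l1,u1] x [l2,u2]
    yields the convex envelope of an edge-concave function g, by comparing
    the two diagonal interpolations at the center point of the box."""
    # T1 uses the diagonal (l1,l2)-(u1,u2); T2 uses (l1,u2)-(u1,l2).
    # At the center, each diagonal interpolation is the average of its
    # two corner values.
    val_t1 = 0.5 * (g((l1, l2)) + g((u1, u2)))
    val_t2 = 0.5 * (g((l1, u2)) + g((u1, l2)))
    return ("T1", val_t1) if val_t1 <= val_t2 else ("T2", val_t2)
```

For the bilinear function x 1 x 2 on the unit box, for example, the test selects the diagonal (l 1 , u 2 )-(u 1 , l 2 ), consistent with the well-known McCormick envelope.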

Case (1.b)
In this case, g α is direction-wise concave w.r.t. component 1 and direction-wise convex w.r.t. component 2. The generating set of g α consists of the two facets {l 1 } × [l 2 , u 2 ] and {u 1 } × [l 2 , u 2 ]. As g α (x) is convex on each of these facets, no minimizing simplex contains more than one point in each of them. Hence, the convex envelope w.r.t. G[g α , D] consists of minimizing segments of the form conv{(l 1 , y 1 ), (u 1 , y 2 )} with y 1 , y 2 ∈ [l 2 , u 2 ]. Functions of this type have already been studied in the literature (e.g., see Tawarmalani and Sahinidis [2001]; Jach et al. [2008]). For a given point x, the specific values of y 1 and y 2 are given as the unique minimizer of a univariate optimization problem. They need to satisfy
∂g α /∂x 2 (l 1 , y 1 ) = ∂g α /∂x 2 (u 1 , y 2 ) (11)
or either y 1 or y 2 needs to lie at the boundary of the interval [l 2 , u 2 ].
Thus, a minimizing segment is either parallel to the vector v := (1, −α 2 /(α 2 + α 3 )) (red lines in Figure 7), or is determined by y 1 = l 2 (blue lines in Figure 7) or by y 2 = u 2 (green lines in Figure 7).
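For a generic smooth g with a monotone partial derivative in x 2 , the matching condition (11) can be resolved numerically by bisection; the sketch below uses central finite differences in place of the analytic derivatives of g α , and the function names are illustrative:

```python
def d2(g, x1, x2, h=1e-6):
    """Central finite-difference approximation of dg/dx2 at (x1, x2)."""
    return (g(x1, x2 + h) - g(x1, x2 - h)) / (2.0 * h)

def match_y2(g, l1, u1, l2, u2, y1, iters=80):
    """Find y2 in [l2, u2] with dg/dx2(u1, y2) = dg/dx2(l1, y1), clipping
    to the boundary when no interior match exists (condition (11)).

    Assumes dg/dx2(u1, .) is monotone on [l2, u2]."""
    target = d2(g, l1, y1)
    lo_val, hi_val = d2(g, u1, l2), d2(g, u1, u2)
    if target <= min(lo_val, hi_val):
        return l2 if lo_val <= hi_val else u2
    if target >= max(lo_val, hi_val):
        return u2 if hi_val >= lo_val else l2
    lo, hi = l2, u2
    increasing = hi_val >= lo_val
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (d2(g, u1, mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The boundary cases of the sketch correspond to the y 1 = l 2 and y 2 = u 2 segments described above.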

Case (2.a)
In this case, g α is direction-wise convex w.r.t. component 1 and direction-wise concave w.r.t. component 2. The generating set of g α consists of the two facets [l 1 , u 1 ] × {l 2 } and [l 1 , u 1 ] × {u 2 }. In order to compute the minimizing segments, we use the same arguments as in Case (1.b) with inverted roles of the two coordinates (e.g., see Jach et al. [2008]).
As the second derivative of g α differs between the two half-spaces x 1 ≤ 0 and x 1 ≥ 0, we additionally distinguish three possibilities defined by the position of the minimizing segments with respect to these half-spaces. Minimizing segments containing only points with negative values of x 1 are parallel to the vector v 1 := (α 2 /(α 2 − α 1 ), −1) (yellow lines in Figure 8), or defined by the extreme point (l 1 , u 2 ) (green lines in Figure 8). Segments containing only points with a positive value of x 1 are parallel to the vector v 2 := (α 2 /(α 2 + α 1 ), −1) (red lines in Figure 8) or defined by the extreme point (u 1 , l 2 ) (blue lines in Figure 8). Minimizing segments containing points from both half-spaces are not parallel to each other. For an extreme point (y 1 , u 2 ), the second one is given by (y 1 + (α 2 (u 2 − l 2 ) − 2α 1 y 1 )/(α 1 + α 2 ), l 2 ) (magenta lines in Figure 8).

Case (2.b.i)
In this case, g α is indefinite and direction-wise convex w.r.t. components 1 and 2. We derive a result similar to the one in Section 4.2 for Case (2.b.iii).
Corollary 4 (Jach et al. [2008]) Let α be as defined in Case (2.b.i) and let G = bd(D) be the boundary of D. Then we have G[g α , D] ⊆ G, and the convex envelope of g α on D consists of minimizing segments w.r.t. G.
Proof Analogous to Corollary 2.

In this case, γ D,1 [g α ] is direction-wise convex w.r.t. component 1 and direction-wise concave w.r.t. component 2. We apply a similar approach as in Case (2.a) and again derive the structure of the minimizing segments by analyzing the derivatives of γ D,1 [g α ].
As a second step, Figure 11 displays the structure of the minimizing segments w.r.t. vex D [γ D,1 [g α ]]. Green, yellow and blue lines indicate minimizing segments that are determined by the extreme points (l 1 , u 2 ), (q, l 2 ) and (u 1 , l 2 ), respectively. The red lines indicate minimizing segments parallel to the vector v.
The vector v is determined analogously to the previous cases. Note that the analysis and visualization only hold for α 2 ≥ 0; the case α 2 < 0 is mostly symmetric and can be handled analogously. The function γ D,1 [g α ] is direction-wise convex w.r.t. components 1 and 2. Furthermore, it is indefinite for x 1 < s, as motivated in Section 4.3. We use the same strategy and the same subdivision of G as in Case (2.b.iii).
As a second step, Figure 13 displays the structure of the minimizing segments w.r.t. vex D [γ D,1 [g α ]](x). Magenta and red lines show segments connecting G 1 and G 4 , and yellow lines show segments defined by the extreme point (l 1 , u 2 ). Black dots indicate that the function g α coincides with its convex envelope. Note that the analysis and visualization only hold for α 2 ≥ 0; the case α 2 < 0 is mostly symmetric and can be handled analogously.

Fig. 2 (a) Visualization of the subdivision of G. (b) Visualization of L, p 1 and p 2 .

Fig. 3 Structure of the minimizing segments.
The feasible set B m is compact and convex, and α ⊤ z is linear in α. Moreover, observe that vex D [α ⊤ g](x) is concave in α. In fact, for arbitrary α, β ∈ R m and λ ∈ [0, 1], the pointwise inequality λα ⊤ g + (1 − λ)β ⊤ g ≥ λ vex D [α ⊤ g] + (1 − λ) vex D [β ⊤ g], together with the convexity of its right-hand side, yields vex D [(λα + (1 − λ)β) ⊤ g](x) ≥ λ vex D [α ⊤ g](x) + (1 − λ) vex D [β ⊤ g](x).

Table 1
Conditions and properties for all nine (sub)cases.

Table 2
Improvement of the lower bound for Net1, comparing BARON alone and BARON with the use of the cutting planes

Table 3
Improvement of the lower bound for Net2, comparing BARON alone and BARON with the use of the cutting planes