On decomposition and multiobjective-based column and disjunctive cut generation for MINLP

Most industrial optimization problems are sparse and can be formulated as block-separable mixed-integer nonlinear programming (MINLP) problems, in which low-dimensional sub-problems are linked by (linear) coupling constraints. This paper investigates the potential of using decomposition and a novel multiobjective-based column and cut generation approach for solving nonconvex block-separable MINLPs, based on the so-called resource-constrained reformulation. Building on this approach, two decomposition-based inner- and outer-refinement algorithms are presented, and preliminary numerical results with nonconvex MINLP instances are reported.


Introduction
Most real-world Mixed Integer Nonlinear Programming (MINLP) models are sparse, e.g. the instances of the MINLPLib (Vigerske 2018). These models can be reformulated as block-separable problems, defined by low-dimensional sub-problems which are coupled by a moderate number of linear global constraints. In this paper we investigate the potential of solving block-separable MINLPs using a decomposition multi-tree approach.
Our focus is on decomposition-based successive approximation, i.e. an approximation is improved by solving low-dimensional sub-problems. Decomposition is a very general approach that can be applied to convex optimization, as well as to nonconvex and discrete optimization. These methods divide a model into smaller sub-problems defined by local constraints, which can be solved in parallel. The results are then used to update a global master problem defined by the global constraints, which is typically an LP, MIP or QP problem. The sub-problems and the global master problem are solved alternately until some termination criterion is met. In the following, an overview of (deterministic) decomposition methods is given.

Lagrangian decomposition
Many MINLP decomposition methods are based on solving a Lagrangian relaxation. It has been proven that Lagrangian relaxation regarding the global constraints is equivalent to a convex relaxation regarding the local constraints of the original problem when we are dealing with a linear objective function (Nowak 2005). Lagrangian Decomposition (LD) (Feltenmark and Kiwiel 2000;Geoffrion 1974;Lemaréchal and Renaud 2001) solves the dual problem by exactly solving subproblems. Column Generation (CG) methods also solve the dual (Lübbecke and Desrosiers 2005). However, in contrast to LD, CG does not require solving subproblems exactly. It is sufficient to compute feasible points of sub-problems with negative reduced cost. Note that it is possible to solve the LP master problem inexactly, e.g. using subgradient or bundle methods. Other decomposition algorithms for solving a Lagrangian relaxation are cutting plane methods or Frank-Wolfe decomposition, see Nowak (2005) for an overview.
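As a minimal illustration of why Lagrangian relaxation pairs naturally with decomposition, the following sketch (our own toy data with finite sub-problem sets standing in for MINLP sub-problems; not the paper's implementation) evaluates the dual function blockwise: for a fixed multiplier, each block is minimized independently.

```python
# Toy sketch: the Lagrangian dual function of a block-separable problem
# with one global constraint  a1^T x1 + a2^T x2 <= b  decomposes into
# independent low-dimensional sub-problems, one per block.

# Each block k: a small finite feasible set X_k (stand-in for a MINLP
# sub-problem), a cost vector c_k and a constraint row a_k.
blocks = [
    {"X": [(0, 0), (1, 0), (0, 1), (1, 1)], "c": (-3, -2), "a": (2, 1)},
    {"X": [(0,), (1,), (2,)], "c": (-1,), "a": (1,)},
]
b = 3.0

def dual_value(lam):
    """L(lam) = -lam*b + sum_k min_{x in X_k} (c_k + lam*a_k)^T x."""
    total = -lam * b
    for blk in blocks:
        total += min(
            sum((ci + lam * ai) * xi
                for ci, ai, xi in zip(blk["c"], blk["a"], x))
            for x in blk["X"]
        )
    return total

print(dual_value(0.0), dual_value(1.0))
```

By weak duality, each evaluation yields a lower bound on the optimal value of the coupled problem, which is what the master problem exploits.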

Alternating direction methods
The Alternating Direction Method (ADM) computes local solutions of a MINLP by alternately solving a QP master problem and MINLP sub-problems. Originally, ADMs were developed for finite element problems (Gabay and Mercier 1976) and are based on Uzawa's algorithm (Uzawa 1958). A review of ADMs, including a convergence proof for convex problems with separable objective function, is given in Boyd et al. (2011). An ADM for solving MIPs is presented in Geissler et al. (2014) and for MINLPs in Nowak (2015).

Multi-tree decomposition methods
This type of decomposition strategy uses a MIP master problem. It is called multi-tree, because an individual branch-and-bound tree is built for each MIP instance.
Using one global master problem which is updated during the solution process, i.e. new constraints are added in order to improve the master problem, is called a single-tree strategy. Both approaches solve easier sub-problems in order to update the global master problem. More discussion on single-tree and multi-tree approaches can be found in Lundell et al. (2018). Rapid Branching is a CG-based multi-tree successive fixing heuristic, described in Borndörfer et al. (2013) and Nowak (2014). For MINLP problems with a small duality gap, like many transport planning problems, this method can be used to compute near globally optimal solutions of problems with millions of variables. The MOP-CG approach of Bodur et al. (2016) is a multi-tree decomposition algorithm for solving loosely coupled IPs by alternately solving CG lower and upper bounding master problems and Multi-Objective Programming (MOP) sub-problems. It is based on a resource-constrained reformulation of the MINLP. We investigate the potential of this approach by combining it with Decomposition-based Inner- and Outer-Refinement (DIOR), see Nowak et al. (2018).

Investigating potential of decomposition multi-tree approaches
To investigate the potential of decomposition in contrast to using one single search tree in MINLP, we develop Decomposition-based Inner- and Outer-Refinement (DIOR) algorithms. We also investigate the idea of dimension reduction following the concept of a resource-constrained program as introduced by Bodur et al. (2016).
To do so, we follow these steps:
1. Reformulate the MINLP as a resource-constrained program (RCP) in a reduced space.
2. Compute an LP approximation of the RCP regarding nondominated columns using a two-phase approach: (a) a subgradient method, (b) column generation.
3. Compute a MIP approximation of the RCP by adding disjunctive cuts, using a multiobjective-based line search.
Section 2 first focuses on the problem definition and the resource-constrained approach. Section 3 describes a column generation algorithm for computing an initial inner and outer LP approximation. Section 4 presents a DIOR algorithm for computing an outer MIP approximation. The convergence of this algorithm is shown in Sect. 4.6. A faster DIOR algorithm for computing an inner MIP approximation is presented in Sect. 5. The numerical evaluation in Sect. 6 shows the potential of the new decomposition-based approximation approach. We conclude in Sect. 7, outlining steps for future investigation.

Problem formulation and reformulations
We consider block-separable (or quasi-separable) MINLP problems of the form

min c^T x  s.t.  x ∈ P,  x_k ∈ X_k, k ∈ K,    (1)

with the global feasible set

P := {x ∈ [x̲, x̄] : a_i^T x ≤ b_i, i ∈ M_1,  a_i^T x = b_i, i ∈ M_2},  |M_1| + |M_2| = m,    (2)

and the local feasible sets

X_k := {y ∈ [x̲_k, x̄_k] ⊂ ℝ^{n_k1} × ℤ^{n_k2} : g_kj(y) ≤ 0, j ∈ [m_k]}.    (3)

The vector of variables x ∈ ℝ^n is partitioned into |K| disjoint blocks such that n = ∑_{k∈K} n_k, where n_k = n_k1 + n_k2 is the dimension of block k and x_k ∈ ℝ^{n_k} denotes the variables of block k. The vectors x̲, x̄ ∈ ℝ^n denote lower and upper bounds on the variables. The linear constraints defining the feasible set P are called global. The constraints defining the feasible sets X_k are called local. The set X_k is defined by linear and nonlinear local constraint functions g_kj : ℝ^{n_k} → ℝ, which are assumed to be bounded and continuously differentiable within the set [x̲_k, x̄_k]. The linear global constraints of P are defined by a_i ∈ ℝ^n, b_i ∈ ℝ, i ∈ [m]. The linear objective function is defined by c^T x := ∑_{k∈K} c_k^T x_k, c_k ∈ ℝ^{n_k}. The blocks k ∈ K can be computed based on connected components of the so-called 'sparsity graph' (Nowak 2005). The nodes and arcs of this graph correspond to variables and nonzero entries of the Hessian of constraint functions of the original MINLP, respectively. Interestingly, this procedure yields small blocks for most instances of the MINLPLib (Vigerske 2018). Note that it is possible to reformulate a MINLP with a given arbitrary maximum block-size n_k by adding new variables and copy-constraints (Nowak 2005; Tawarmalani and Sahinidis 2005; Vigerske 2012).

Resource-constrained reformulation
If the MINLP (1) has a huge number of variables, it can be difficult to solve in reasonable time. In particular, if the MINLP is defined by discretization of some infinite-dimensional variables, as in stochastic programming or in differential equations, and in addition the sub-problems are reformulated in a lifted space using auxiliary variables, the number of variables can be significantly higher than the number of global constraints connecting the sub-problems. For such problems, a resource-constrained view can be promising. In this approach, the original problem (1) in n-dimensional space is viewed from the m global constraints. Define the matrix

A_k := (c_k, a_{1k}, …, a_{mk})^T ∈ ℝ^{(m+1)×n_k},    (4)

where a_{ik} ∈ ℝ^{n_k} denotes the part of a_i belonging to block k, and consider the transformed feasible set

W_k := {A_k x_k : x_k ∈ X_k}.    (5)

The variables w_k := A_k x_k are called resources. They describe how the objective value and the constraint values a_i^T x are distributed among the blocks. Note that for sparse MINLPs the number of non-zero resources, for which a_{ik} ≠ 0, can be much smaller than m, see Sect. 2.3. Let

H := {w : ∑_{k∈K} w_ki ≤ b_i, i ∈ M_1,  ∑_{k∈K} w_ki = b_i, i ∈ M_2}    (6)

denote the global constraints in the resource space. We then define the resource-constrained formulation of (1) as

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ W_k, k ∈ K.    (7)

Proposition 1 Problems (1) and (7) are equivalent to the following two-level program:

min ∑_{k∈K} w*_k0  s.t.  w ∈ H,    (8)

where w*_k0 is the optimal value of the sub-problem RCP_k given by

w*_k0 := min c_k^T x_k  s.t.  x_k ∈ X_k,  a_{ik}^T x_k ≤ w_ki, i ∈ M_1,  a_{ik}^T x_k = w_ki, i ∈ M_2.    (9)

Proof Problem (1) can be formulated as

min ∑_{k∈K} c_k^T x_k  s.t.  w ∈ H,  A_k x_k = w_k,  x_k ∈ X_k, k ∈ K.    (10)

This shows that (1) and (7) are equivalent. For a given solution (w*, x*) of (10), it follows that x* fulfills (9). Hence, (8) is equivalent to (10). ◻
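The mapping from block variables to resources can be sketched as follows (toy data; the helper name `resources` is our own, not part of the paper's code). Row 0 of A_k carries the block's objective contribution, the remaining rows its contributions to the global constraints, so summing the block resources recovers the global objective and constraint values.

```python
# Illustrative sketch: w_k = A_k x_k stacks the block's contribution to the
# objective (row 0) and to each global linear constraint (rows 1..m).

def resources(c_k, A_rows_k, x_k):
    """Return w_k = (c_k^T x_k, a_{1k}^T x_k, ..., a_{mk}^T x_k)."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    return [dot(c_k, x_k)] + [dot(a, x_k) for a in A_rows_k]

# Two blocks, one global constraint a^T x <= b split as a = (a_1 | a_2).
w1 = resources(c_k=(1.0, 2.0), A_rows_k=[(1.0, 0.0)], x_k=(3.0, 1.0))
w2 = resources(c_k=(4.0,), A_rows_k=[(2.0,)], x_k=(0.5,))

# Summing block resources gives the global objective and constraint value.
total = [w1[i] + w2[i] for i in range(2)]
print(w1, w2, total)
```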

Multi-objective reformulation
The multi-objective (MO) view on (7) changes the focus from the complete image set W_k to the relevant set of Pareto-optimal points. A similar reformulation is presented in Bodur et al. (2016). Consider the following MO sub-problem of block k, called MOP_k, where we aim to minimize |M_1| + 1 resources simultaneously:

min_{x_k ∈ X_k} ( (A_k x_k)_i )_{i ∈ {0} ∪ M_1}.    (11)

The nondominated (Pareto-optimal or efficient) front of MO sub-problem (11) is defined to be the set of vectors w_k = A_k x_k with x_k ∈ X_k with the property that there does not exist any other feasible solution, v_k = A_k y_k with y_k ∈ X_k, which dominates w_k, i.e., for which v_ki ≤ w_ki for all i ∈ M_1 ∪ {0}, and v_ki < w_ki for at least one index i ∈ M_1 ∪ {0}. An element of the nondominated front is known as a nondominated point (NDP). In other words, an NDP is a feasible objective vector for which none of its components can be improved without making at least one of its other components worse. A feasible solution x_k ∈ X_k is efficient (or Pareto optimal) if its image w_k = A_k x_k is an NDP, i.e. it is nondominated. Define the set of NDPs of (11) by W*_k := {w_k ∈ W_k : w_k is nondominated} and, correspondingly, the set W* := {w : w_k ∈ W*_k, k ∈ K}. Problem (7) restricted to nondominated resources then reads

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ W*_k, k ∈ K.    (12)
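The dominance definition above translates directly into a filter over a finite image set. The following sketch (toy points, our own illustration) computes the nondominated front of a small set of resource vectors under componentwise minimization:

```python
# Minimal sketch: the nondominated front of a finite set of resource
# vectors (minimization in every component), matching the dominance
# definition used for MOP_k.

def dominates(v, w):
    """v dominates w: v <= w componentwise and strictly better somewhere."""
    return all(vi <= wi for vi, wi in zip(v, w)) and any(
        vi < wi for vi, wi in zip(v, w)
    )

def nondominated(points):
    return [w for w in points if not any(dominates(v, w) for v in points)]

pts = [(0, 3), (1, 1), (2, 2), (3, 0), (2, 1)]
print(nondominated(pts))
```

Here (2, 2) and (2, 1) are dominated by (1, 1), so only three of the five points are NDPs.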

Proposition 2
An optimal solution of problem (7) is attained at a point w* ∈ W*.
Proof Assume that a part w*_k ∉ W*_k of a solution w* is dominated. This means there exists ŵ_k ∈ W*_k which dominates w*_k, i.e. ŵ_ki ≤ w*_ki for i ∈ {0} ∪ M_1. Consider the point ŵ obtained from w* by replacing the part w*_k by ŵ_k. Then ŵ is feasible for the RCP, since ∑_{k∈K} ŵ_ki ≤ ∑_{k∈K} w*_ki ≤ b_i for i ∈ M_1, and its objective function value is at least as good as that of w*, since ŵ_k0 ≤ w*_k0. Repeating this argument for every dominated part shows that the optimum is attained at an NDP point ŵ ∈ W*. ◻

Reducing the dimension of the resources
In many practical problems, some of the blocks may not appear in a global constraint.
To exploit this characteristic for dimension reduction, consider the index sets of relevant resources

M_1k := {i ∈ M_1 : a_{ik} ≠ 0},  M_2k := {i ∈ M_2 : a_{ik} ≠ 0}.    (13)

Then RCP (7) is equivalent to the reduced formulation (14), obtained by replacing each resource vector w_k by its projection π_k(w_k), where the projection operator π_k : ℝ^{m+1} → ℝ^{|M_1k|+|M_2k|} keeps only the components i ∈ M_1k ∪ M_2k, together with the objective component w_k0. Similarly, following Proposition 2, (12) is equivalent to the corresponding reduced formulation (16) over nondominated points. Formulations (14) and (16) are of interest, because the number ∑_{k∈K} |M_1k| + |M_2k| of relevant resources is usually significantly smaller than the number n of variables in the original problem. This is the case for sparse optimization models, for which the model components are coupled by a moderate number of global constraints.

Supported nondominated points
A common method to compute an NDP is the weighted-sum method (Miettinen 1998), which solves an optimization problem with a single objective obtained as a positive (convex) combination of the objective functions of the multiobjective problem:

min_{x_k ∈ X_k} ∑_{i ∈ {0} ∪ M_1k} d_i (A_k x_k)_i.    (17)

For a positive weight vector d ∈ ℝ^{|M_1k|+1}_{>0}, any optimal solution of (17) is an efficient solution of (11), i.e., its image is nondominated. Such a solution and its image are called a supported efficient solution and a supported NDP, respectively. Thus, an efficient solution x_k is supported if there exists a positive vector d for which x_k is an optimal solution of (17); otherwise x_k is called unsupported.
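The distinction between supported and unsupported NDPs can be seen on a three-point toy image set (our own example, not from the paper): the point (2, 2) is nondominated, but it lies in a nonconvex "gap" of the front, so no positive weight vector ever selects it.

```python
# Sketch of the weighted-sum method on a finite image set: each positive
# weight vector d yields a supported NDP; the nondominated point (2, 2)
# is unsupported and is never found.

points = [(0, 3), (2, 2), (3, 0)]  # all three are nondominated

def weighted_sum_argmin(d, pts):
    return min(pts, key=lambda w: sum(di * wi for di, wi in zip(d, w)))

supported = set()
for i in range(1, 20):  # sample positive weights d = (t, 1 - t)
    t = i / 20.0
    supported.add(weighted_sum_argmin((t, 1.0 - t), points))
print(sorted(supported))
```

This is exactly why the MIP phase of the DIOR algorithm, which can also generate nonsupported NDPs, is needed on top of the weighted-sum (Lagrangian) phase.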

Column generation
In this section we describe a column generation algorithm for computing initial inner and outer LP approximations. We use the notation of Sect. 2.

Inner approximation
Column generation (CG) is a decomposition method for solving the following convex relaxation of (1):

min c^T x  s.t.  x ∈ P,  x_k ∈ conv(X_k), k ∈ K.    (18)

Note that the convex relaxation (18) of problem (1) is equivalent to the Lagrangian relaxation of problem (1) regarding the global constraints, see Lemma 3.7 of Nowak (2005) for the proof.
The quality of convex relaxation (18) of MINLP (1) depends strongly on the duality gap, defined as the difference between the optimal values of (1) and (18). Given finite sets S_k ⊂ X_k of feasible points, we have that

min c^T x  s.t.  x ∈ P,  x_k ∈ conv(S_k), k ∈ K    (19)

is an inner LP approximation of (18). The resource-constrained formulation of (19) is given by

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ conv(R_k), k ∈ K,    (20)

where R_k := {A_k y : y ∈ S_k} is called a set of columns r_kj ∈ ℝ^{m+1}. An LP formulation of the inner approximation (20), called LP-IA, is given by

min ∑_{k∈K} ∑_{j∈[|R_k|]} z_kj r_kj,0  s.t.  ∑_{k∈K} ∑_{j∈[|R_k|]} z_kj r_kj,i ≤ b_i, i ∈ M_1,  z_k ∈ Δ^{|R_k|}, k ∈ K,    (21)

where Δ^{|R_k|} := {z ∈ ℝ^{|R_k|}_+ : ∑_j z_j = 1} denotes the standard simplex. CG generates columns r_kj ∈ ℝ^{m+1} and adds them to the column set R_k, k ∈ K, by alternately computing a dual solution λ of (21) and solving the Lagrangian sub-problems

y_k(λ) ∈ argmin_{y ∈ X_k} (1, λ^T) A_k y,    (23)

adding the columns A_k y_k(λ) to R_k for k ∈ K. In general, sub-problem (23) is a nonconvex MINLP, since it is defined over the nonlinear feasible set X_k. However, it is smaller than the original problem (1) and its size depends on the block-size n_k, k ∈ K.
The computation of convex relaxation (18) is based on solving large but easy LP master problems (19) and small but difficult MINLP sub-problems (23). In order to be efficient, a fast sub-solver for Lagrangian sub-problem (23) is necessary. Examples of fast sub-solvers are (truncated) branch-and-cut, local search, dynamic-programming-based constrained shortest path (Engineer et al. 2008) or MIP-based OA (Nagarajan et al. 2019).

Initializing LP-IA
Algorithm 1 computes initial columns R_k, k ∈ K, by performing a subgradient method for maximizing the dual function of problem (1) regarding the global constraints,

L(λ) := −b^T λ + ∑_{k∈K} min_{y ∈ X_k} (c_k + ∑_{i∈[m]} λ_i a_{ik})^T y.    (24)

We compute the step length ρ_p by comparing the values of the dual function L(λ) defined in (24) at different iterations p of Algorithm 1, similarly as in Shor (1985). The method solvesubproblem(d) solves MINLP sub-problem (23) for a given search direction d ∈ ℝ^{m+1} and computes the reduced cost δ_k of the new point y_k. The columns w_k = A_k y_k are added to the column set R_k, k ∈ K. The reduced cost δ_k is computed by taking the difference between the cost of the new column w_k regarding direction d and the minimum cost of the existing columns R_k regarding direction d, i.e.

δ_k := d^T w_k − min_{r ∈ R_k} d^T r.    (25)

After computing minimizers y_k of the sub-problem (26) for d = (1, λ^T), one can easily compute the value of the dual function (24). Note that if y_k is a global minimizer and d is a nonnegative search direction, then w_k = A_k y_k is a supported NDP.
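The subgradient initialization can be sketched on toy data as follows (our own simplified illustration; the step-length rule is a plain diminishing step, simpler than Shor's scheme, and the finite sets stand in for MINLP sub-problems):

```python
# Hedged sketch of subgradient ascent on the dual of a toy
# block-separable problem with one coupling constraint
#   2*x1 + x2 <= 3,  x1 in {0,1,2}, x2 in {0,1,2},  min -2*x1 - x2.

X1, X2 = [0, 1, 2], [0, 1, 2]
b = 3.0

def solve_subproblems(lam):
    # Each block minimizes its Lagrangian cost independently.
    x1 = min(X1, key=lambda x: (-2.0 + 2.0 * lam) * x)
    x2 = min(X2, key=lambda x: (-1.0 + lam) * x)
    return x1, x2

def dual(lam):
    x1, x2 = solve_subproblems(lam)
    return (-2.0 + 2.0 * lam) * x1 + (-1.0 + lam) * x2 - lam * b

lam, best = 0.0, float("-inf")
for p in range(1, 50):
    x1, x2 = solve_subproblems(lam)
    g = 2.0 * x1 + x2 - b        # subgradient of the dual at lam
    lam = max(0.0, lam + g / p)  # diminishing step, projected onto lam >= 0
    best = max(best, dual(lam))
print(lam, best)
```

Each sub-problem minimizer also yields a column for R_k, which is how the algorithm seeds LP-IA.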

A column generation algorithm
Algorithm 2 describes a Column Generation algorithm for computing LP-IA (21).
In the beginning of the algorithm, the feasible set of LP-IA (21) may be empty. Therefore, the following master problem with slacks s ∈ ℝ^m_+ is solved:

min ∑_{k∈K} ∑_{j} z_kj r_kj,0 + μ ∑_{i∈[m]} s_i  s.t.  ∑_{k∈K} ∑_{j} z_kj r_kj,i ≤ b_i + s_i, i ∈ M_1,  z_k ∈ Δ^{|R_k|},  s ≥ 0,    (27)

where the penalty parameter μ > 0 is sufficiently large. If the slack variables are nonzero, i.e. s ≠ 0, the method slackdirections computes, in order to eliminate the nonzero slacks, new search directions d ∈ ℝ^m for the sub-problems (23) from the unit vectors e_i ∈ ℝ^m corresponding to the nonzero slacks s_i > 0, with e_i the vector with a one at coordinate i and zeros elsewhere.

A DIOR algorithm for computing an outer MIP approximation
In this section, we present a DIOR algorithm for computing an exact outer MIP approximation of the resource-constrained problem (7). It consists of an LP phase (using an LP master problem) and a MIP phase (using a MIP master problem). The LP phase generates supported NDPs and is intended to speed up the convergence of the algorithm. The MIP phase also generates nonsupported NDPs. For the sake of simplicity, we consider only global inequality constraints, i.e.

M_2 = ∅,  m = |M_1|.    (28)

Outer LP approximation
An outer LP approximation of (7), called LP-OA, is given by

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ P_k, k ∈ K,    (29)

with

P_k := {w ∈ ℝ^{m+1} : (1, λ^T) w ≥ (1, λ^T) A_k y_k(λ), λ ∈ M},    (30)

where y_k(λ) is the solution of the Lagrangian sub-problem (23) regarding a dual point λ, and M is a set of dual solution points, computed by Algorithm 2. Note that P_k consists of a set of supporting hyperplanes of the set W_k. In other words, P_k is defined by valid linear constraints, since they are constructed using an optimal solution point y_k(λ) of the Lagrangian sub-problem (23) regarding the dual direction λ ∈ M. Therefore, W_k ⊆ P_k.
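The validity of such a cut can be checked on a toy image set (our own finite stand-in for W_k): since y_k(λ) minimizes the weighted resources, the resulting hyperplane supports the whole set.

```python
# Small sketch: a Lagrangian sub-problem solution yields a valid
# supporting hyperplane  (1, lam)^T w >= rhs  for the whole image set
# W_k (toy finite set below).

Wk = [(-5.0, 3.0), (-3.0, 1.0), (0.0, 0.0), (-4.0, 2.0)]

def supporting_cut(lam):
    """Right-hand side of the cut generated at dual point lam."""
    d = (1.0, lam)
    return min(d[0] * w0 + d[1] * w1 for w0, w1 in Wk)

lam = 2.0
rhs = supporting_cut(lam)
# Validity: every point of W_k satisfies the cut.
print(all(w0 + lam * w1 >= rhs - 1e-12 for w0, w1 in Wk))
```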

Outer MIP approximation
We construct an outer approximation of W_k defined by polyhedral subdivision elements D_ku, called cells,

W_k ⊆ ⋃_{u ∈ U_k} D_ku,    (32)

where U_k is the index set of the cells. We define a cell D_ku, u ∈ U_k, k ∈ K, by

D_ku := {w ∈ ℝ^{m+1} : d_kj^T w ≤ δ_kj, j ∈ J_ku},    (33)

where J_ku denotes an index set of constraints defining the u-th cell. A nonconvex outer MIP approximation problem of (7) is given by

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ ⋃_{u ∈ U_k} D_ku, k ∈ K,    (34)

where binary variables select an index u ∈ U_k such that w_k ∈ D_ku. Note that (34) does not consider the integer constraints of the original problem (1), since it is defined in the transformed feasible set (5). Problem (34) contains only binary constraints, which indicate whether a cell D_ku, u ∈ U_k, is active or inactive. A MIP formulation of (34) is given by

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  d_kj^T w_k ≤ δ_kj + M (1 − t_ku), j ∈ J_ku, u ∈ U_k,  ∑_{u ∈ U_k} t_ku = 1,  t_ku ∈ {0, 1},    (35)

where d_kj and δ_kj, j ∈ J_ku, describe the polyhedral set corresponding to cell D_ku described by (33), which is selected when t_ku = 1. The "big" M > 0 should be sufficiently large. A restricted LP master problem regarding the selected cells D_ku_k, k ∈ K, is defined by

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ D_ku_k, k ∈ K.    (36)
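The big-M mechanism in (35) can be checked numerically on toy data (our own illustration; M must dominate the variable bounds for the relaxation to be exact):

```python
# Sketch of the big-M trick in MIP-OA: the inequalities of a cell are
# enforced only when its binary variable t_u = 1.

M = 1e4  # "big" M, assumed large enough for the variable bounds

def cell_constraints_hold(w, cuts, t):
    """cuts: list of (d_j, delta_j); check d_j^T w <= delta_j + M*(1-t)."""
    dot = lambda d, w: sum(di * wi for di, wi in zip(d, w))
    return all(dot(d, w) <= delta + M * (1.0 - t) for d, delta in cuts)

cell = [((1.0, 0.0), 2.0), ((0.0, 1.0), 1.0)]  # w0 <= 2, w1 <= 1
w = (5.0, 0.5)

print(cell_constraints_hold(w, cell, t=1))  # enforced: w0 <= 2 fails
print(cell_constraints_hold(w, cell, t=0))  # relaxed by M: holds trivially
```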

Disjunctive cuts
A p-disjunctive cut is defined by removing from a cell D_ku an open polyhedron C_k defined by p linear inequalities. We generate a cone cut with respect to a vertex v ∈ ℝ^{m+1}, which we call an |M_1k|-disjunctive cut, by removing the cone of the dominated area

C_k(v) := {w ∈ ℝ^{m+1} : w_i < v_i, i ∈ {0} ∪ M_1}.    (37)

In the outer approximation, we use this concept to remove a cone of the dominated area given an NDP v. This means that MIP approximation (35) is refined and cuts off parts of the MIP solution ŵ by removing cones C_k(v_k) that contain ŵ_k ∈ C_k(v_k) for those sub-problems k ∈ K where ŵ_k ∉ W_k. In order to remove a cone C_k(v_k), cell D_ku is divided into new overlapping cells

D_ku^i := D_ku ∩ {w : w_i ≥ v_ki},  i ∈ {0} ∪ M_1.    (38)

Pareto line search
We describe a line-search procedure for constructing a disjunctive cut that cuts off a solution (û, ŵ) of MIP-OA (35) by removing a cone C_k(v_k), defined in (37). Consider the ideal point w̲_k, defined by

w̲_ki := min {w_i : w ∈ W_k},  i ∈ {0} ∪ M_1.    (39)

We compute a disjunctive cut by removing the largest cone C_k(v_k) whose interior contains no point of W_k. The vertex v_k is computed by the line search

t_k := min {t ≥ 0 : ∃ y_k ∈ X_k with A_k y_k ≤ w̲_k + t (ŵ_k − w̲_k)},  v_k := w̲_k + t_k (ŵ_k − w̲_k).    (41)
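A line search of this kind can be sketched with a bisection over the segment from the ideal point through the outer solution (toy data and a finite membership oracle, our own illustration; the paper solves this step as an optimization sub-problem):

```python
# Bisection sketch of the Pareto line search: from the ideal point w_low
# towards (and beyond) the outer solution w_hat, find the smallest t >= 0
# such that w_low + t*(w_hat - w_low) is weakly dominated by some
# feasible image point.

Wk = [(0.0, 3.0), (1.0, 1.0), (3.0, 0.0)]  # toy feasible image points

def in_extended_set(w):
    """True iff some feasible point v satisfies v <= w componentwise."""
    return any(all(vi <= wi for vi, wi in zip(v, w)) for v in Wk)

def pareto_line_search(w_low, w_hat, t_max=4.0, tol=1e-9):
    point = lambda t: tuple(l + t * (h - l) for l, h in zip(w_low, w_hat))
    lo, hi = 0.0, t_max  # assume point(t_max) lies in the extended set
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_extended_set(point(mid)):
            hi = mid
        else:
            lo = mid
    return point(hi)

v = pareto_line_search(w_low=(0.0, 0.0), w_hat=(1.0, 1.0))
print(v)  # boundary point of the extended set on the search ray
```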

DIOR using Pareto line search
Algorithm 4 describes an inner-outer refinement algorithm for solving (1) by iteratively adding disjunctive cone cuts based on a Pareto line search. It maintains a set of sample points R_k and cells D_k on the local level, and a set M of dual vectors on the global level. In Sect. 6, we illustrate the algorithm numerically and also provide a sketch of the cone points, outer solutions and outer approximation in Fig. 2. It uses the following methods:

• initoa(R, M) for initializing D by D_k1 = P_k, k ∈ K, where P_k is defined by the local constraints of LP-OA (29).
• solveouterlp(D) for computing a solution (ŵ, λ̂) of LP-OA (29).
• idealpoint for computing the ideal point w̲_k defined in (39).
• paretolinesearch(w̲_k, ŵ_k) for computing a possibly nonsupported NDP v_k by solving (41).
• conesubdiv(u_k, v_k, D_k) for removing the cone C_k(v_k) from the set D_k by dividing D_ku_k into the new overlapping cells (38).
• solveoutermip(D) for computing a solution (u, ŵ) of MIP-OA (35).

Algorithm 4 DIOR for computing an outer MIP approximation
1: function dior1
2: (ẑ, u, D, R) ← initDior, ŵ ← w(ẑ)  # LP-IA refine
3: for k ∈ K do
4:   w̲_k ← idealPoint, add cuts w_k ≥ w̲_k to D_k
5: repeat  # MIP-OA refine
6:   for k ∈ K do
7:     v_k ← paretoLineSearch(w̲_k, ŵ_k)  # cone point
8:     D_k ← coneSubdiv(u_k, v_k, D_k)  # cone cut
9:   (u, ŵ) ← solveOuterMip(D)  # MIP-OA solution
10: until stopping criterion
11: return ∑_{k∈K} ŵ_k0

The algorithm starts by generating columns and a first outer approximation via Algorithm 3. A reduced version of Algorithm 4 is illustrated with instances having one global constraint.

Proof of convergence
In this section, we prove that Algorithm 4 computes an ε-global optimum of problem (1) in finitely many iterations, or generates a sequence converging to one. Note that we assume (28), i.e. m = |M_1|. Let W := {w : w_k ∈ W_k, k ∈ K} and f(w) := ∑_{k∈K} w_k0 denote the feasible resource set and the objective function, respectively. Denote by ŵ^p, λ^p and v^p the solution of MIP-OA (34), the dual solution of restricted LP-OA (36), and the solution of the Pareto line-search sub-problem (41) in iteration p, respectively. Furthermore, denote by D_k^p ⊃ W_k, k ∈ K, the outer approximation, which is refined by Algorithm 4 in iteration p by adding cone cuts. In particular, we have D_k^{p+1} ⊆ D_k^p. This process creates a sequence of enclosure sets Ŵ^p := {w ∈ H : w_k ∈ D_k^p, k ∈ K} with the property

W ∩ H ⊆ Ŵ^{p+1} ⊆ Ŵ^p for all p.    (43)

In fact, it is sufficient to have Ŵ^p enclose W*, since by Proposition 2 the optimum of (7) is attained at a point of W*.

Lemma 1 MIP-OA (34) is equivalent to the problem obtained by replacing each set ⋃_{u∈U_k} D_ku by its subset of nondominated points.

Proof This can be proved exactly as in Proposition 2. ◻

For the sequel of the proof, we introduce the extended resource set as the complement of the dominated area,

W̃_k := {w ∈ ℝ^{m+1} : ∃ v ∈ W_k with v_i ≤ w_i, i ∈ {0} ∪ M_1}.

The extended Pareto frontier is defined as the set of NDPs of W̃_k,

W̃*_k := {w ∈ W̃_k : there is no v ∈ W̃_k dominating w}.

Notice that W̃*_k not only includes the Pareto front W*_k, but also covers the gaps in the Pareto front, which are also sketched in Fig. 1.

Lemma 2 Problem

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ W̃_k, k ∈ K,

is equivalent to

min ∑_{k∈K} w_k0  s.t.  w ∈ H,  w_k ∈ W̃*_k, k ∈ K.

Proof This can be proven as in Proposition 2. Assume that w*_k ∉ W̃*_k for some k ∈ K of a solution w*. This means there exists ŵ_k ∈ W̃*_k which dominates w*_k, i.e. ŵ_ki ≤ w*_ki for i ∈ {0} ∪ M_1. Consider the point ŵ obtained from w* by replacing the part w*_k by ŵ_k. As in the proof of Proposition 2, it follows that the optimum is attained at an NDP point ŵ ∈ W̃*. ◻

Considering W̃_k is relevant when we have a look at the line search (41). Focusing on this step, notice that if t_k takes the value 1, the outer approximation sub-solution ŵ_k is feasible and, in that respect, optimal for this part of the master problem. So, if t_k = 1 for all sub-problems, we are done and the algorithm has converged.

Lemma 3 If after p < ∞ iterations of Algorithm 4, t^p_k = 1 for all k ∈ K, then ŵ^p is an optimal solution of problem (7).
Proof Since t^p_k = 1 for all k ∈ K, we have ŵ^p_k ∈ W̃_k for all k ∈ K, and ŵ^p is an optimal solution of MIP-OA master problem (34). From Lemma 2 it follows that ŵ^p minimizes the objective function within H ∩ W̃, where W̃ := {w : w_k ∈ W̃_k, k ∈ K}. Since ŵ^p ∈ H, it follows that it is also an optimal solution of (7). ◻

If the algorithm does not terminate finitely, the sequence {ŵ^p} contains a convergent subsequence (Lemma 5). Since a subsequence {ŵ^{p_j}}_{j=1}^∞ is convergent, it has a limit lim_{j→∞} ŵ^{p_j} = w*. In Lemmas 6 and 7, we show that w* is in the extended Pareto frontier W̃* and is therefore an optimal solution of (7).

Lemma 5 If Algorithm 4 does not terminate after finitely many iterations, then the sequence {ŵ^p}_{p=1}^∞ contains a convergent subsequence.

Proof Since the algorithm has not terminated, for all p = 1, 2, … there exists a k ∈ K such that t^p_k > 1. Therefore, all the points in the sequence {ŵ^p}_{p=1}^∞ are distinct, as shown in Lemma 4. Since {ŵ^p}_{p=1}^∞ contains an infinite number of different points, all lying in a compact set because MIP-OA is bounded, the sequence contains a convergent subsequence according to the Bolzano-Weierstrass theorem. ◻

Lemma 6
The limit w*_k of any convergent subsequence {ŵ^{p_j}}_{j=1}^∞ generated in Algorithm 4 belongs to W̃*_k.

Proof In each iteration p_j, the cone C_k(v_k^{p_j}) containing ŵ_k^{p_j} is removed. This is also the case for iterate p_j, such that for all future iterates there exists an index i with ŵ_{ki} ≥ v^{p_j}_{ki}. If the limit w*_k were not in W̃*_k, the distance between ŵ^{p_j}_k and the cone point v^{p_j}_k would be bounded below by a positive constant, so that consecutive iterates could not come arbitrarily close to each other, contradicting the convergence of the subsequence. Hence w*_k ∈ W̃*_k. ◻

Lemma 7
The limit point of a convergent subsequence is a global minimum point of (7).
Proof Because each set Ŵ^p is an outer approximation of the feasible set W, f(ŵ^{p_j}) gives a lower bound on the optimal value of the objective function. Due to property (43), the sequence {f(ŵ^{p_j})}_{j=1}^∞ is nondecreasing, and since the objective function is continuous, we get lim_{j→∞} f(ŵ^{p_j}) = f(w*). According to Lemma 6, the limit point satisfies w*_k ∈ W̃*_k for all k ∈ K. From Lemma 2 it follows that w* minimizes the objective function within H ∩ W̃. Because w* ∈ H, it is also an optimal solution of (7). ◻ Since Lemmas 6 and 7 apply to all convergent subsequences generated by solving the MIP-OA master problems (34), any limit point of such a sequence is a global optimum. We summarize the convergence results in the following theorem.
Theorem 1 Algorithm 4 either finds a global optimum of (7) in a finite number of iterations or generates a sequence {ŵ p j } ∞ j=1 converging to a global optimum.
Proof Suppose the algorithm stops in a finite number of iterations. Then the last solution of the MIP-OA master problem (34) satisfies all constraints, and according to Lemma 3 it is a global optimum of (7). In case the algorithm does not stop in a finite number of iterations, it generates a sequence converging to a global optimum of (7) according to Lemmas 5 and 7. ◻

A DIOR algorithm for computing an inner MIP approximation
Motivated by the Rapid Branching approach (Borndörfer et al. 2013), we present in this section a heuristic DIOR algorithm for computing an inner MIP approximation of the resource-constrained problem (7). The goal of the algorithm is to compute a good primal solution point using the resources obtained by the inner MIP approximation. The approach is based on iteratively cutting off low-dimensional faces of conv (R ∩ H) containing the solution ŵ of a MIP master problem. Note that the objective value of an inner MIP approximation does not provide a valid lower bound for problem (1).

Inner MIP approximation
Consider a partition of the relevant part of the resource space [w̲, w̄] using polyhedral partition elements D_ku, called cells, i.e.

⋃_{u ∈ U_k} D_ku = [w̲_k, w̄_k],  int(D_ku) ∩ int(D_kv) = ∅ for u ≠ v.
Implicitly, this partition also partitions the set of columns R_k into subsets R_ku := R_k ∩ D_ku. An inner MIP approximation with slacks, called MIP-IA, is

min ∑_{k∈K} w_k0 + μ ∑_{i∈M_1} s_i  s.t.  ∑_{k∈K} w_ki ≤ b_i + s_i, i ∈ M_1,  s ≥ 0,  w_k ∈ ⋃_{u∈U_k} conv(R_ku), k ∈ K.    (46)

A MIP formulation of (46), denoted (47), is obtained by introducing binary variables t_ku that select one cell per block and representing w_k as a convex combination of the columns of the selected cell, where [R_ku] ⊂ [R_k] denotes the indices of the columns of R_ku. Note that replacing conv(R_ku) in (46) by conv(W_k ∩ D_ku_k) defines a lower bounding program of the MINLP (1). By performing column generation regarding the cells D_ku, the optimal value of (46) converges to the optimal value of this lower bounding program. MIP-IA (47) is refined by adding an inner disjunctive cut, defined by subdividing a cell D_ku_k into sub-cells D_kv, v ∈ V_k(u_k), such that ⋃_{v∈V_k(u_k)} D_kv = D_ku_k, and replacing conv(R_ku_k) by ⋃_{v∈V_k(u_k)} conv(R_ku_k ∩ D_kv). In order to increase the optimal value of (46), it is necessary to cut off w_k(ẑ_k) for some k ∈ K, where ẑ is the solution of MIP-IA (47). This is equivalent to

w_k(ẑ_k) ∉ ⋃_{v∈V_k(u_k)} conv(R_ku_k ∩ D_kv).    (48)

Denote by R̆_k ⊆ R_k a set of supporting columns with w_k(ẑ_k) ∈ int(conv(R̆_k)). We define the sub-cells D_kv such that each eliminates one supporting column from the set R̆_k, i.e. R̆_k ⊄ D_kv for all v ∈ V_k(u_k). For that, we set the point w_k(ẑ) to be a vertex of D_kv for all v ∈ V_k(u_k). Since w_k(ẑ_k) ∈ int(conv(R̆_k)) and w_k(ẑ_k) ∈ vert(D_kv), v ∈ V_k(u_k), it cannot be expressed as a convex combination of points in R̆_k ∩ D_kv, i.e. (48) holds.

Decomposition in MINLP
Algorithm 5 Project-and-branch method for inner disjunctive cut generation and local optimization

Algorithm 5 describes a project-and-branch procedure for refining MIP-IA by adding disjunctive cuts. It iteratively subdivides a cell D_ku_k into sub-cells D_kv, v ∈ V_k(u_k). Denote the set of subdivided blocks by Ǩ ⊂ K. For each sub-path of sub-cells D_ku_k, k ∈ Ǩ, a lower bound ν(ẑ) := ∑_{k∈Ǩ} ŵ_k0(ẑ_k) is computed, where ẑ is the solution of the restricted LP-IA (54) regarding the sub-cells D_ku_k. In order to prevent a cell D_ku_k from being subdivided several times, the index set V_k = V_k(u_k) is stored, and D_ku_k is only subdivided if V_k = ∅. The path data (ẑ, u, Ǩ) is stored in a list L. In each iteration, the following steps are performed: 1. In the first step, a block k ∈ K⧵Ǩ is determined, which will later be subdivided.
Since the relative distance of w_k(ẑ_k) to a column r_kj ∈ R̆_k is related to 1 − ẑ_kj, a block is selected for which |max_{j∈[R_k]} ẑ_j − 0.5| is small, where R̆_k ⊆ R_k is a set of supporting columns (with positive ẑ_kj-value).
• The set R̆_k is computed by getsupportcolumns(u_k, ẑ_k, D_k, R_k) by first computing the p largest positive values ẑ_k1 ≥ ẑ_k2 ≥ ⋯ ≥ ẑ_kp and setting R̆_k accordingly. Then redundant columns r_kj ∈ R̆_k are removed, which can be represented as a convex combination of other columns of R̆_k, such that conv(R̆_k) = conv(R̆_k⧵{r_kj}) and |R̆_k| ≤ |M_1k| + |M_2k|.
• In order to check whether w_k(ẑ_k) is infeasible, the following resource-constrained projection sub-problem with slacks, similar to RCP_k (9), is solved using solveresprojectsubproblem(u_k, ẑ_k). If the resulting slack satisfies s_k ≠ 0, then w_k(ẑ_k) ∉ W_k. The point ỹ_k ∈ X_k is a partial solution estimate, which is used in Algorithm 6 for computing a solution candidate. 2. In the second step, D_ku_k is subdivided into the sub-cells D_kv_j defined in (51), for v_j ∈ V_k(u_k). The sub-cells D_kv are defined by |R̆_k| − 1 cut directions η_kji ∈ ℝ^{|R̆_k|} separating the columns r_kj ∈ R̆_k and fulfilling (48). They are computed using the method: • innersubdivcuts(ẑ_k, w_k, R̆_k) for computing the cut directions η_kji of the sub-cells D_kv_j defined in (51), by solving, for all j ∈ [R̆_k], i ∈ [R̆_k]⧵{j}, the system of equations (52). If η_kji^T (r_kj − w_k(ẑ_k)) ≥ 0, η_kji is multiplied by −1. Despite the removal of redundant columns in the method getsupportcolumns, there is no guarantee that system (52) is solvable. In order to always be able to compute a cut η_kji, we construct it by computing a basis of the null space of (52).
3. In the third step, lower bounds ν(ẑ) for the sub-paths D_ku_k, k ∈ Ǩ, are computed by performing a limited CG using the methods:
• getsearchdirection(w_k, r_kj_v) for setting the search direction (53).
• solvelagsubproblem(d_k, D_ku_k) for solving the Lagrangian sub-problem restricted to the cell D_ku_k.
• solverestrictia(u, D, R) for computing a primal and dual solution of the restricted LP-IA (54).

DIOR using an inner MIP approximation
Algorithm 6 describes an inner MIP refinement algorithm for computing a solution candidate of (1) by iteratively subdividing the feasible set and generating new columns using innerdisjunctcut(u, ẑ, D, R) for computing (V, D, R). If V = ∅, the algorithm stops, since no cells were subdivided and no columns were generated. The algorithm uses the methods: • solveinnermip(R, D) for solving (47). • localsolve(y) for computing a solution candidate x* of (1) by performing a local search starting from the point y.

Numerical results
Algorithms 4 and 6 were implemented with Pyomo (Hart et al. 2017), an algebraic modelling language in Python, as part of the parallel MINLP solver Decogo (Nowak et al. 2018). The implementation of Decogo is not finished; in particular, parallel solving of sub-problems has not been implemented yet. The solver utilizes SCIP 5.0.1 (Gleixner et al. 2017) for solving MINLP sub-problems, GUROBI 9.0.3 for solving MIP/LP master problems and IPOPT 3.12.13 (Wächter and Biegler 2006) for performing a local search starting from the projected solution of the last MIP master problem. Note that it is possible to use other suitable solvers which can interface with Pyomo. All computational experiments were performed using a computer with Intel Core i7-7820HQ 2.9 GHz CPU and 16 GB RAM.
Since most MINLP models are not given in a block-separable form, block structure identification of the original problem and its automatic reformulation into a block-separable form have been implemented. The block structure identification is based on the idea of connected components of a Hessian adjacency graph. Consider a MINLP problem defined by n variables and by |M| functions h_m, m ∈ M. Consider a Hessian adjacency graph G = (V, E) defined by the vertex set V = [n] and the edge set E = {(i, j) : ∂²h_m/∂x_i ∂x_j ≢ 0 for some m ∈ M}. In order to subdivide the set of variables into |K| blocks, we compute the connected components of G; each connected component yields the list of variables of one block.
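The connected-components step can be sketched with a small union-find over the Hessian edges (our own illustration with a hypothetical input format, not Decogo's code):

```python
# Sketch of block identification: build the Hessian adjacency graph from
# variable pairs (i, j) with a nonzero mixed second derivative in some
# function h_m, then take connected components as blocks via union-find.

def find(parent, i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path compression
        i = parent[i]
    return i

def blocks(n, hessian_pairs):
    """hessian_pairs: iterable of (i, j) variable pairs coupled nonlinearly."""
    parent = list(range(n))
    for i, j in hessian_pairs:
        parent[find(parent, i)] = find(parent, j)
    comp = {}
    for v in range(n):
        comp.setdefault(find(parent, v), []).append(v)
    return sorted(comp.values())

# Five variables; x0*x1 appears in one constraint, x2*x3 in another.
print(blocks(5, [(0, 1), (2, 3)]))  # -> [[0, 1], [2, 3], [4]]
```

Variables that never share a nonlinear term with others (like x4 above) form singleton blocks, which is consistent with the small block sizes observed for MINLPLib instances.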

Experiment with DIOR1
In this section, we illustrate the results of Algorithm 4 with Example 1 outlined in Sect. 2.4. The optimal value of the problem is −8.5 with the optimal resources (−4, 3.5) in space W_1 and (−4.5, 6.5) in space W_2. Note that we solve the Pareto line-search sub-problem (41) to global optimality, in order to guarantee that the cone point is an NDP. As a stopping criterion for the algorithm, we use a tolerance on the improvement of the MIP-OA objective value, set to ε_MIP = 10^{−5}. For the line search, the algorithm computes the ideal points w̲_1 = (−8, 0) and w̲_2 = (−6.2, 5). Algorithm 3 computes the initial OA points ŵ_1 = (−3.6, 3) and ŵ_2 = (−5, 7). Figure 2 shows that the optimal value of the MIP-OA (34) converges to the global optimum of (1) in 20 iterations after 12.5 s. It is interesting to notice that in space W_2, for almost all OA solution points, the corresponding cone point v_2 is identical, i.e. the OA solution ŵ_2 belongs to the feasible set.
For the instances with more than one constraint, the convergence is much slower. In each iteration, the algorithm generates m + 1 disjunctive cuts for the selected cell. These cuts are weak, since the algorithm first selects cells which do not improve the OA objective value; cells which do improve the OA objective value are selected only when the other resources cannot be improved anymore. Moreover, after several iterations the MIP-OA master problem becomes more difficult to solve due to the large number of generated disjunctive cone cuts.

Experiments with DIOR2
In this section, we present the results for CG Algorithm 2 and DIOR2 Algorithm 6. For testing purposes, we selected several instances from MINLPLib (Vigerske 2018). More detailed statistics on the selected instances are given in Table 1. Regarding the settings for Column Generation (Algorithm 2), note that even small MINLP sub-problems can be difficult to solve. Therefore, we set termination criteria for early stopping of SCIP when solving MINLP sub-problems, namely the maximum number of processed nodes after the last improvement of the primal bound and the relative gap tolerance; these parameters are set to 500 and 0.01, respectively. Furthermore, Algorithm 2 solves the problem over its convex relaxation. In order to check convergence, we repeat one iteration of Algorithm 2 without early termination of sub-problem solving, i.e. we do not apply any stopping criteria for the sub-solver. The disadvantage of this strategy is that Column Generation needs more iterations to converge, and therefore solves more MINLP and LP sub-problems.

Table 1 shows the results of Column Generation for the selected test set. The characteristics of each instance are reported, i.e. problem size n, number of blocks |K| and number of global constraints m. The performance measures are the number of solved LP master problems N LP , the number of solved MINLP sub-problems N sub and the number of generated columns |R|. Note that N LP also equals the number of iterations of Algorithm 2. We also compute the relative duality gap (ν* − ν LP )/|ν*|, where ν LP denotes the optimal value of the LP master problem (20) and ν* denotes the best known objective function value. Table 1 illustrates that the number of generated columns |R| in Algorithm 2 is smaller than the number of solved MINLP sub-problems N sub . This indicates that the MINLP sub-problems may generate the same column several times.
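The CG loop described above can be sketched as follows. This is a schematic view with hypothetical interfaces, not Algorithm 2 itself: `solve_lp_master` stands for solving the restricted LP master (20) and returning duals, and `solve_pricing` stands for (possibly early-stopped) solving of a block's MINLP pricing sub-problem, returning a feasible point together with its reduced cost.

```python
def column_generation(solve_lp_master, solve_pricing, blocks,
                      max_iter=100, tol=1e-6):
    """Schematic column generation loop: alternate between the LP
    master over the current columns and pricing sub-problems per block,
    adding feasible points with negative reduced cost as new columns."""
    columns = {k: [] for k in blocks}
    for _ in range(max_iter):
        duals = solve_lp_master(columns)      # restricted LP master
        new_column_found = False
        for k in blocks:
            # pricing: a feasible point of sub-problem k and its
            # reduced cost w.r.t. the current duals (early stopping is
            # allowed -- the point need not be a global minimizer)
            point, reduced_cost = solve_pricing(k, duals)
            if reduced_cost < -tol and point not in columns[k]:
                columns[k].append(point)
                new_column_found = True
        if not new_column_found:              # no improving column left
            break
    return columns

# toy run: one block whose pricing oracle keeps proposing the same
# point, so the loop adds it once and then terminates
cols = column_generation(lambda c: {},
                         lambda k, duals: (("x", k), -1.0),
                         [0])
```

The duplicate check in the sketch mirrors the observation that early-stopped pricing may propose the same column several times (|R| < N sub in Table 1).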
For some instances, relatively many LP master problems must be solved, but these master problems are relatively easy, as can be observed in Table 3.
For the experiments with DIOR2, the maximum number of supporting columns in (49) is set to p = 3 , and the maximum number of MIP iterations (the number of times MIP-IA master problem (47) is solved) to 20, i.e. N MIP ≤ 20 . Some MINLP sub-problems in steps 3-10 of Algorithm 6 are solved to optimality; for example, sub-problem (50) is solved to optimality in order to check whether the resources of MIP-IA are feasible. The other MINLP sub-problems are not solved to optimality, i.e. they are solved as in Algorithm 2.
Table 2 reports the results of DIOR2, where N sub and |R| include the number of solved MINLP sub-problems and the number of generated columns from Algorithm 2, respectively. The indicators are the number of MIP iterations N MIP (the number of times MIP-IA (47) is solved), the optimal value of MIP-IA (47) and the objective value at the primal solution point computed by the local solver. Table 2 illustrates that the new decomposition-based successive approximation approach is able to solve nonconvex MINLP models to global optimality. Notice that, as for CG, the number of solved MINLP sub-problems N sub is higher than the number of generated columns |R|. For Example 1, DIOR2 reduces the number of iterations from 20 to 2 compared to DIOR1.

Table 3 compares Algorithm 6 to the branch-and-bound-based solver SCIP 5.0.1 (Gleixner et al. 2017) in terms of computing time. Notice that computing time depends not only on the algorithm, but also on the skill of the programmer and the platform used; Python code is not inherently fast. All settings of SCIP were left at their defaults. For each instance, we compare the total solution time T of DIOR2 with the time T SCIP spent by SCIP. Note that T also includes the time T dec spent on decomposition. T LP , T MIP and T MINLP denote the time spent on solving LP master problems, MIP master problems and MINLP sub-problems, respectively. Table 3 shows that SCIP requires less total time than Algorithm 6. However, Algorithm 6 spends most of the running time T on solving MINLP sub-problems, which indicates potential for solving these sub-problems in parallel. T MIP depends on N MIP , i.e. more iterations require more time spent on solving MIP master problems; the MIP master problem becomes more difficult to solve when many cuts have been generated. Interestingly, T LP is relatively small and does not depend on the number of solved LP master problems. Note that for Example 1, DIOR2 reduces the solution time compared to DIOR1 from 12.5 to 1.7 s.

Conclusions
MINLP is a powerful paradigm for modelling practical optimization problems. We introduced multi-tree decomposition-based methods for solving MINLP models (1), which avoid building one big search tree for the whole problem. Instead, they solve smaller MINLP sub-problems, whose solutions are combined in LP and MIP master problems until convergence. Moreover, we investigated the potential of the so-called resource-constrained formulation of the MINLP problem.
Both algorithms, DIOR1 and DIOR2, use low-dimensional MINLP sub-problem solutions for refining resource-constrained LP and MIP master problems. DIOR1 is slow, but proven to converge; we have shown that DIOR2 is a faster heuristic procedure. On the other hand, an established, well-implemented branch-and-bound solver is faster than DIOR2. In the future, we will work on improving the DIOR algorithms, e.g. by implementing parallel solving of sub-problems and by reducing the number of sub-problems to be solved.
An advantage of multi-tree decomposition algorithms is the possibility to modify the optimization model during the solution process. An example of this is an airline transportation network that is extended during the solution process by adding new transport options, such as train or bus connections. Another example is a response surface or an artificial neural network approximating a black-box function of a sub-problem, which is iteratively improved with new sample/training points. The generated cuts and points can also be used for a warm start if the model has been changed slightly, e.g. in a dynamic optimization model with a moving horizon.
As in CG, it is possible to start a multi-tree algorithm with a restricted or simplified search space, which is then improved iteratively. Examples of simplifying the search space are: reducing scenario trees, restricting transport networks, or discretizing differential equations.
Such decomposition-based successive approximation methods may be used to solve large-scale optimization problems, such as complex multidisciplinary models. Moreover, the presented multi-tree algorithms can be extended to multi-objective optimization by modifying the master problems according to multiple objectives.