1 Introduction

A major problem arising in process synthesis is the simultaneous selection of process equipment and their optimization regarding design and operation. Superstructure optimization represents a suitable but computationally demanding approach to solve such problems. Typical formulations include discrete variables. However, in some cases the formulations are purely continuous (e.g., blending problem for fuels (Singh et al. 2000), process water networks (Ahmetović and Grossmann 2011)). Superstructures can be represented in several ways resulting in different problem formulations for which different solution algorithms exist (Grossmann 2002; Mencarelli et al. 2020). A common and intuitive modeling approach is their formulation as generalized disjunctive programming (GDP) problems consisting of algebraic constraints, disjunctions, Boolean variables, and logic propositions (Raman and Grossmann 1994). These GDP problems can be solved directly with dedicated solution algorithms (e.g., logic-based Outer Approximation (Lee and Grossmann 2000) or GDP branch and bound (Türkay and Grossmann 1996b)) that aim to exploit the disjunctions effectively (e.g., by directly branching on them or reducing the problem size by considering only the active disjuncts). Alternatively, they can be reformulated to mixed-integer nonlinear programming (MINLP) problems (Grossmann and Trespalacios 2013), in which integer variables replace Boolean variables and algebraic equations model the logic propositions.

Conventional reformulation approaches are the Big-M (Nemhauser and Wolsey 1988) and Convex Hull (Lee and Grossmann 2000) method. The reformulation of GDP to MINLP problems enables the use of powerful commercial MINLP solvers and thus the possibility to solve GDP problems containing nonconvex functions (Lee and Grossmann 2003; Ruiz and Grossmann 2010). In contrast, existing solvers dedicated to GDP problems are restricted to GDP problems containing only convex functions. It is also possible to avoid introducing Boolean or binary variables for modeling the disjunctions by using complementarity constraints resulting in mathematical programs with equilibrium constraints (MPEC) (Luo et al. 1996). This results in a special type of nonlinear programming (NLP) problem with only a few optimization variables. The NLP problem can be either solved directly or again reformulated (e.g., by using the Plus Function (Chen and Mangasarian 1996)), which often involves smoothing techniques. In summary, a number of modeling approaches and solution algorithms has been developed over the past decades. Their comparison has mainly been limited to the conventional approaches and problems with only linear or convex constraints representing the process models. For problems containing nonconvex functions, which require global optimization techniques, a comparative assessment of all approaches mentioned above is missing so far.

For global optimization, several techniques for improving computational tractability have been proposed. As superstructure optimization problems often have many variables and constraints, factorable reduced-space (RS) formulations can be beneficial (for an overview of these formulations, see Bongartz (2020), Chapter 3). In such RS formulations, equality constraints are used to eliminate optimization variables from the problem, while ensuring that the optimization problem remains factorable, i.e., all functions can still be expressed as compositions of simple so-called intrinsic functions (i.e., functions for which convex and concave relaxations are known). Compared to the more conventional (equation-oriented) full-space (FS) formulations of the problem, the RS formulations have fewer optimization variables and constraints. In the branch-and-bound (BaB) algorithms that are used for global optimization of nonconvex problems, RS formulations reduce the dimensionality of the space that needs to be partitioned via branching, and they reduce the size of the subproblems for computing lower and upper bounds on the optimal objective value. Depending on the method used for constructing relaxations, they may however result in weaker relaxations (see Bongartz (2020), p. 64). Factorable RS formulations have been used successfully for flowsheet optimization problems (Byrne and Bogle 2000; Bongartz and Mitsos 2017, 2019), parameter estimation problems involving ordinary differential equations (Mitsos et al. 2009), and problems with machine learning models embedded (Schweidtmann and Mitsos 2018; Schweidtmann et al. 2021). Its application to superstructure optimization problems has not been investigated yet. Even more variables and constraints can be eliminated from the problem by dropping the requirement that the functions remain factorable, and instead allowing the use of implicit functions defined by the constraints. Barton and co-workers have presented methods for handling such implicit functions in BaB algorithms (Scott et al. 2011; Stuber et al. 2014; Wechsung et al. 2015). However, these are beyond the scope of this work as corresponding implementations are not readily available yet.

In the present work, we compare different problem formulations for superstructure optimization problems comprising nonconvex functions. Within this comparison, we investigate whether the optimization can benefit from a RS formulation.

In Sect. 2, we introduce a simple and a more complex illustrative example problem. For the simple example problem, the conventional and unconventional problem formulations are developed in Sect. 3. The comparison of all problem formulations for the two example problems are presented in Sect. 4. In Sect. 5, the key results are confirmed by an optimization problem with a different structure (i.e., piecewise-defined cost function instead of unit selection), which can be modeled using the same reformulation approaches. Sect. 6 concludes our findings.

2 Problem definition

To introduce the different problem formulations in Sect. 3, we use a simple illustrative example problem (Sect. 2.1). This example problem and a more complex one with multiple disjunctions (Sect. 2.2) are then used to analyze each problem formulation in greater detail in Sect. 4.

2.1 Simple illustrative example problem

In this simple example problem, we consider only two mutually exclusive options (either Unit P or Unit S) to produce a desired amount of a product, \({\dot{n}}_\text {out}=1\) (considered as a parameter). This illustrative example problem is motivated by our work on power-to-X (Burre et al. 2020, 2021; Roh et al. 2020), in which decisions need to be taken on how to provide raw materials (e.g., hydrogen and carbon dioxide) and process them toward the final product (e.g., electricity-based fuels).

Fig. 1
figure 1

Schematic of the simple example problem with the choice of producing a desired amount of a product, \({\dot{n}}_\text {out}=1\), either via Unit P (\(Y_\text {P}=\text {True}\), \(Y_\text {S}=\text {False}\)) or Unit S (\(Y_\text {P}=\text {False}\), \(Y_\text {S}=\text {True}\))

Each option comes with operating and investment costs, both of which are dependent on molar flow rates and unit-specific cost parameters (Table 1). Operating costs \(C_{\text {op}}\) are quadratically dependent on the molar flow rate passing through the chosen unit (\(C_\text {op}^j=({\dot{n}}_\text {in}^j)^2e_j\)). Investment costs \(C_\text {inv}\) are dependent on the maximal rating given as the global feed flow rate into the superstructure, \({\dot{n}}_\text {in}\), and a constant cost parameter (\(C_\text {inv}^j=C_j + {\dot{n}}_\text {in}^{0.6}\)). The dependency on \({\dot{n}}_\text {in}\) (in addition to \({\dot{n}}_\text {in}^j\) for operating costs) puts more emphasis on the nonconvexity regarding multiple variables in typical process synthesis problems. This concave term can also be moved out of the disjunction directly into the objective function. Our simple numerical experimentation showed that this does not have any effect on the optimization. Also the consideration of a constant conversion parameter for each process unit does not have an influence on the overall results. To keep the illustrative example problem simple, we do not consider such a parameter.

Table 1 Cost parameters for the simple (GDP1) and the multiple-disjunction (GDP2) example problem. \(C_j\) are fixed investment costs and \(e_j\) are specific operating costs

The objective is to find the process with the lowest total cost C by solving the following optimization problem formulated as a GDP problem:

$$\begin{aligned} \text {min } \qquad&C = C_\text {op} + C_\text {inv}\\ \text {s.t.}\,\qquad&{\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}\\&{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {S} \\&\begin{bmatrix} Y_\text {P} \\ {\dot{n}}_\text {in}^\text {S}={\dot{n}}_\text {out}^\text {S}=0\\ {\dot{n}}_\text {out}^\text {P}={\dot{n}}_\text {in}^\text {P}\\ C_\text {op}=({\dot{n}}_\text {in}^\text {P})^2e_\text {P} \\ C_\text {inv}=C_\text {P} + {\dot{n}}_\text {in}^{0.6} \end{bmatrix} \vee \begin{bmatrix} Y_\text {S} \\ {\dot{n}}_\text {in}^\text {P}={\dot{n}}_\text {out}^\text {P}=0\\ {\dot{n}}_\text {out}^\text {S}={\dot{n}}_\text {in}^\text {S}\\ C_\text {op}=({\dot{n}}_\text {in}^\text {S})^2e_\text {S}\\ C_\text {inv}=C_\text {S} + {\dot{n}}_\text {in}^{0.6}\\ \end{bmatrix} \\&Y_\text {P} {\underline{\vee }} Y_\text {S}\\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \ge 0 \\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \in {\mathbb {R}} \\&Y_\text {P}, Y_\text {S} \in \{ \text {True}, \text {False} \} \end{aligned}$$
(GDP1)

There are alternative ways for modeling the system, e.g., by eliminating the global (i.e., independent from disjunctions) equality constraints and modifying the disjunctive constraints correspondingly. As Problem (GDP1) is the most direct representation of the superstructure shown in Fig. 1, the use of these global constraints is considered throughout this work. Additionally, we explicitly restrict all illustrative example problems to the exclusive choice between units (denoted by the logic exclusive “or"-operator \({\underline{\vee }}\)) to keep it as simple as possible.

2.2 Multiple-disjunction example problem

As the flowsheet structure of Problem (GDP1) is minimalist, we analyze a more complex example problem with two disjunctions taken from (Grossmann and Trespalacios 2013), where the outlet of Unit S needs to be further processed by either Unit F1 or F2 (Fig.2). To make the choice for one of these units economically viable and the case study interesting, we introduce a cost factor of 0.1 for Unit F1 and F2.

Fig. 2
figure 2

Schematic of the flowsheet structure with two disjunctions and the choice of producing a desired amount of a product, \({\dot{n}}_\text {out}=1\), either via Unit P (\(Y_\text {P}=1\), \(Y_\text {S}=0\)) or Unit S (\(Y_\text {P}=0\), \(Y_\text {S}=1\)). If Unit S is chosen, the product needs to be further processed by either Unit F1 (\(Y_\text {F1}=1\), \(Y_\text {F2}=0\)) or F2 (\(Y_\text {F1}=0\), \(Y_\text {F2}=1\))

The more complex flowsheet structure results in Problem (P2) (ESI Sect. 3), which is further reformulated to Problem (GDP2) to remove the nested structure within the disjunction and derive corresponding problem formulations as described in Sect. 3. These are summarized in ESI Sect. 3.

$$\begin{aligned} {\text {min } \qquad }&{C = C_\text {op} + C_\text {inv}} \\ {\text {s.t.}\,\qquad }&{ {\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}}\\&{{\dot{n}}_\text {out}^\text {S}={\dot{n}}_\text {in}^\text {F1} + {\dot{n}}_\text {in}^\text {F2}}\\&{{\dot{n}}_\text {out}^\text {F}={\dot{n}}_\text {out}^\text {F1} + {\dot{n}}_\text {out}^\text {F2}} \\&{{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {F}} \\&\begin{bmatrix} {Y_\text {P} }\\ {{\dot{n}}_\text {in}^\text {S}={\dot{n}}_\text {out}^\text {S}=0}\\ {{\dot{n}}_\text {out}^\text {P}={\dot{n}}_\text {in}^\text {P}}\\ {C_\text {op}=({\dot{n}}_\text {in}^\text {P})^2e_\text {P} }\\ {C_\text {inv}=C_\text {P} + {\dot{n}}_\text {in}^{0.6}} \end{bmatrix} {\vee } \begin{bmatrix} {Y_\text {S}} \\ {{\dot{n}}_\text {in}^\text {P}={\dot{n}}_\text {out}^\text {P}=0}\\ {{\dot{n}}_\text {out}^\text {S}={\dot{n}}_\text {in}^\text {S}}\\ {C_\text {op}=({\dot{n}}_\text {in}^\text {S})^2e_\text {S} + C_\text {op}^\text {F}}\\ {C_\text {inv}=C_\text {S} + {\dot{n}}_\text {in}^{0.6} + C_\text {inv}^\text {F}}\\ \end{bmatrix} \\&\begin{bmatrix} {Y_\text {F1}} \\ {{\dot{n}}_\text {in}^\text {F2}={\dot{n}}_\text {out}^\text {F2}=0}\\ {{\dot{n}}_\text {out}^\text {F1}={\dot{n}}_\text {in}^\text {F1}}\\ {C_\text {op}^\text {F} = ({\dot{n}}_\text {in}^\text {F1})^2e_\text {F1}}\\ {C_\text {inv}^\text {F} = C_\text {F1} + 0.1{\dot{n}}_\text {in}^{0.6}} \end{bmatrix} {\vee } \begin{bmatrix} {Y_\text {F2}} \\ {{\dot{n}}_\text {in}^\text {F1}={\dot{n}}_\text {out}^\text {F1}=0}\\ {{\dot{n}}_\text {out}^\text {F2}={\dot{n}}_\text {in}^\text {F2}}\\ {C_\text {op}^\text {F} = ({\dot{n}}_\text {in}^\text {F2})^2e_\text {F2}}\\ {C_\text {inv}^\text {F} = C_\text {F2} + 0.1{\dot{n}}_\text {in}^{0.6}} \end{bmatrix}\\& {\vee } \begin{bmatrix} {Y_\text {notF}} \\ {{\dot{n}}_\text {in}^\text {F1}={\dot{n}}_\text {in}^\text {F2}={\dot{n}}_\text {out}^\text {F1}={\dot{n}}_\text {out}^\text {F2}=0}\\ {C_\text {op}^\text {F} = 0}\\ {C_\text {inv}^\text {F} = 0} \end{bmatrix}\\&{Y_\text {P} {\underline{\vee }} Y_\text {S}}\\&{Y_\text {F1} {\underline{\vee }} Y_\text {F2} {\underline{\vee }} Y_\text {notF}}\\&{Y_\text {S} \Leftrightarrow Y_\text {F1} {\underline{\vee }} Y_\text {F2}}\\&{C_\text {op}, C_\text {inv}, C_\text {op}^\text {F}, C_\text {inv}^\text {F}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {in}^\text {F1}, {\dot{n}}_\text {in}^\text {F2}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S}, {\dot{n}}_\text {out}^\text {F1}, {\dot{n}}_\text {out}^\text {F2}, {\dot{n}}_\text {out}^\text {F} \ge 0} \\&{C_\text {op}, C_\text {inv}, C_\text {op}^\text {F}, C_\text {inv}^\text {F}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {in}^\text {F1}, {\dot{n}}_\text {in}^\text {F2}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S}, {\dot{n}}_\text {out}^\text {F1}, {\dot{n}}_\text {out}^\text {F2}, {\dot{n}}_\text {out}^\text {F} \in {\mathbb {R}}} \\&{Y_\text {P}, Y_\text {S}, Y_\text {F1}, Y_\text {F2}, Y_\text {notF} \in \{\text {True, False} \}} \end{aligned}$$
(GDP2)

3 Problem formulations

In this section, the most common problem formulations for superstructure optimization and less common ones are presented and applied to the simple illustrative example problem (GDP1). One of the most conventional methods to reformulate GDP into MINLP problems are the Big-M (Sect. 3.1, (Nemhauser and Wolsey 1988)) and the Convex Hull method (Sect. 3.2, (Lee and Grossmann 2000)). Big-M and Convex Hull transform Boolean variables \(Y_j\) into binary variables \(y_j\) to express the relationship between and within disjunctions. In contrast, a nonsmooth reformulation approach can be used to prevent the use of any additional variables completely by introducing complementarity constraints. The resulting MPECs can be either solved directly (Sect. 3.3, (Baumrucker et al. 2008)) or again reformulated using the Plus Function (Sect. 3.4, (Chen and Mangasarian 1996)). Another alternative problem formulation, hereafter called Direct MINLP, utilizes binary variables directly within the equality constraints of the process model to capture the existence or nonexistence of a unit or process stream within a disjunction (Sect. 3.5). As we also consider RS formulations for all reformulation approaches, Sect. 3.6 illustrates such a RS formulation exemplarily for the Direct MINLP reformulation approach. In addition to the aforementioned reformulation approaches, superstructures representing problems with piecewise-defined functions can also be handled by using step functions (cf. Sect. 5, (Wechsung and Barton 2013)).

3.1 Big-M

The Big-M method introduces the parameter M (a big number potentially individual for each constraint of disjunct j), which is used to make the constraints of unchosen choices (\(y_j=0\)) redundant. Several methods exist in literature that determine the optimal value for M and thus tighten relaxations (e.g., (Trespalacios and Grossmann 2015)). To maintain a manageable complexity, we do not consider such improved Big-M methods. Instead, we choose M to be equal to the global upper bound of the variable that is bounded by the corresponding constraint. Besides binary variables for each disjunct (the transformed Boolean variables), no additional variables need to be introduced. However, a poor choice for M can result in weak relaxations. Applied to Problem (GDP1), the Big-M method results in Problem (BM1):

$$\begin{aligned} \text {min } \qquad&C = C_\text {op} + C_\text {inv} \\ \text {s.t.}\,\qquad&{\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}\\&{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {S}\\&0 - M(1 - y_\text {P}) \le {\dot{n}}_k^\text {S} \le 0 + M(1 - y_\text {P}) \qquad \qquad \qquad \quad \; k \in \{\text {in}, \text {out}\}\\&0 - M(1 - y_\text {S}) \le {\dot{n}}_k^\text {P} \le 0 + M(1 - y_\text {S}) \qquad \qquad \qquad \quad \; k \in \{\text {in}, \text {out}\}\\&{\dot{n}}_\text {in}^j - M(1 - y_j) \le {\dot{n}}_\text {out}^j \le {\dot{n}}_\text {in}^j + M(1 - y_j) \qquad \qquad \qquad \, j \in \{\text {P}, \text {S}\}\\&({\dot{n}}_\text {in}^j)^2e_j - M(1 - y_j) \le C_\text {op} \le ({\dot{n}}_\text {in}^j)^2e_j + M(1 - y_j)\qquad \; \; \; j \in \{\text {P}, \text {S}\}\\&C_j + {\dot{n}}_\text {in}^{0.6} - M(1 - y_j) \le C_\text {inv} \le C_j + {\dot{n}}_\text {in}^{0.6} + M(1 - y_j)\quad \; \; \, j \in \{\text {P}, \text {S}\}\\&{y_\text {P} + y_\text {S} = 1}\\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \ge 0\\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \in {\mathbb {R}}\\&y_\text {P}, y_\text {S} \in \{0, 1\} \end{aligned}$$
(BM1)

3.2 Convex Hull

For the Convex Hull method, a new (disaggregated) variable, \(\nu _{j}\), needs to be introduced for each variable that is affected by a disjunction and for each choice within the respective disjunction. The bounds on these disaggregated variables can be either chosen to be the global variable bounds or tightened based on the disjunct. In our illustrative example problems, we use global variable bounds (i.e., \(0 \le {\dot{n}} \le 1\), \(0 \le C \le 20\)). Each constraint \(r_{j}(\nu _{j})\le 0\) of disjunct j is expressed by the closure of the perspective function (Ceria and Soares 1999):

$$\begin{aligned} y_{j}r_{j}\left( \frac{\nu _{j}}{y_{j}}\right) \le 0 \end{aligned}$$
(1)

To avoid singularities for \(y_{j}=0\), Lee and Grossmann (2000) proposed a modification of the original perspective function, which was again modified by Sawaya (2006) to improve numerical performance and accuracy:

$$\begin{aligned} ((1 - \epsilon )y_{j} + \epsilon )r_{j}\left( \frac{\nu _{j}}{(1 - \epsilon )y_{j} + \epsilon }\right) \le 0 \end{aligned}$$
(2)

For the illustrative example problems, parameter \(\epsilon\) only needs to be nonzero as the disaggregated variables of unchosen units (\(\nu _{j}=0\)) make Constraint (2) being fulfilled independently from the value of \(\epsilon\). In general, \(\epsilon\) needs to be chosen sufficiently small to maintain a high accuracy. As Constraints (2) represents the commonly used type of perspective function for global optimization, we use it for all Convex Hull formulations within this study where necessary. Other modifications exist and are summarized in Furman et al. (2020). In this study, they are not considered.

The introduction of disaggregated variables increases problem size but generally yields tighter relaxations compared to the Big-M formulation, even if the most suitable Big-M parameter is utilized (Grossmann and Lee 2003). The conversion of Problem (GDP1) with the Convex Hull method results in Problem (CH1):

$$\begin{aligned} \text {min } \qquad&C = C_\text {op} + C_\text {inv} \\ \text {s.t.}\,\qquad&{\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}\\&{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {S}\\&{\dot{n}}_\text {in}^j = {\dot{n}}_\text {in,P}^{j} + {\dot{n}}_\text {in,S}^{j} \qquad \qquad \quad \quad \; \; \: j \in \{\text {P}, \text {S}\}\\&{\dot{n}}_\text {out}^j = {\dot{n}}_\text {out,P}^{j} + {\dot{n}}_\text {out,S}^{j} \qquad \qquad \quad \; \; \; j \in \{\text {P}, \text {S}\}\\&C_\text {op} = C_\text {op}^\text {P} + C_\text {op}^\text {S}\\&C_\text {inv} = C_\text {inv}^\text {P} + C_\text {inv}^\text {S}\\&{\dot{n}}_{k,\text {P}}^\text {S}, {\dot{n}}_{k,\text {S}}^\text {P} \le 0 \qquad \qquad \qquad \quad \; \; \; \; \, k \in \{\text {in}, \text {out}\}\\&C_\text {op}^j\ge {\left( \frac{{\dot{n}}_{\text {in,}j}^j}{(1 - \epsilon )y_j + \epsilon }\right) ^2e_j((1 - \epsilon )y_j + \epsilon )} \qquad \qquad \; \; \, j \in \{\text {P}, \text {S}\} \\&C_\text {inv}^j\ge (C_j + {\dot{n}}_\text {in}^{0.6})y_j \qquad \qquad \qquad j \in \{\text {P}, \text {S}\} \\&{y_\text {P} + y_\text {S} = 1} \\&0 \le {\dot{n}}_{k,\text {P}}^\text {S} \le 1y_\text {P} \qquad \qquad \qquad \; \; \; \; \; \, \, k \in \{\text {in}, \text {out}\}\\&0 \le {\dot{n}}_{k,\text {S}}^\text {P} \le 1y_\text {S} \qquad \qquad \qquad \; \; \; \; \; \; k \in \{\text {in}, \text {out}\}\\&0 \le {\dot{n}}_{k,j}^j \le 1y_j \qquad \qquad \qquad \quad \; \; \; \, k \in \{\text {in}, \text {out}\} \qquad j \in \{\text {P}, \text {S}\}\\&0 \le C_{m}^j \le 20y_j\qquad \qquad \qquad \quad \; m \in \{\text {op}, \text {inv}\} \quad \; \; \, \, j \in \{\text {P}, \text {S}\}\\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \ge 0\\&C_\text {op}, C_\text {inv}, C_\text {op}^\text {P}, C_\text {op}^\text {S}, C_\text {inv}^\text {P}, C_\text {inv}^\text {S} \in {\mathbb {R}}\\&{\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {in,P}^\text {P}, {\dot{n}}_\text {in,P}^\text {S}, {\dot{n}}_\text {in,S}^\text {P}, {\dot{n}}_\text {in,S}^\text {S} \in {\mathbb {R}}\\&{\dot{n}}_\text {out,P}^\text {P}, {\dot{n}}_\text {out,P}^\text {S}, {\dot{n}}_\text {out,S}^\text {P}, {\dot{n}}_\text {out,S}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \in {\mathbb {R}}\\&y_\text {P}, y_\text {S} \in \{0, 1\} \end{aligned}$$
(CH1)

3.3 MPEC

In order to prevent the use of discrete variables and thus prevent solving a MINLP problem, superstructure optimization problems can be formulated as MPECs. This type of problem formulation introduces complementarity constraints to model the discrete choices in a process superstructure.

The MPEC problem formulation can be derived from Problem (GDP1) by considering all global constraints (to represent the part of the flowsheet that is not affected by disjunctions) and a subset of disjunctive constraints (to represent the part of the flowsheet that is affected by disjunctions). The subset of disjunctive constraints is chosen in such a way that (a) only those model equations of a disjunction are considered that correspond to the part of the flowsheet where \(Y_j=\text {True}\) (\({\dot{n}}_\text {out}^\text {P}={\dot{n}}_\text {in}^\text {P}\), \(C_\text {op}^\text {P}=({\dot{n}}_\text {in}^\text {P})^2e_\text {P}\), and \(C_\text {inv}^\text {P}=C_\text {P} + {\dot{n}}_\text {in}^{0.6}\) for the choice \(Y_\text {P}=\text {True}\); \({\dot{n}}_\text {out}^\text {S}={\dot{n}}_\text {in}^\text {S}\), \(C_\text {op}^\text {S}=({\dot{n}}_\text {in}^\text {S})^2e_\text {S}\), and \(C_\text {inv}^\text {S}=C_\text {S} + {\dot{n}}_\text {in}^{0.6}\) for the choice \(Y_\text {S}=\text {True}\)) and (b) zero values can be realized if the choice j within the disjunction is not active. However, this is not given for all types of disjunctive problems: In the simple illustrative example problem (GDP1), \(C_\text {inv}^j\) contains constant investment costs \(C_j\) for unit j and therefore need to be modified by a step function that is dependent on the material stream \({\dot{n}}_\text {in}^j\) passing through unit j. Such a step function introduces nonconvexities additionally to those that are inherently part of the modeled system into the problem. If tailored relaxations are implemented for such a step function (Wechsung and Barton 2013), the optimization may however not be necessarily affected negatively. For all problem formulations in this work, the tanh-function is used as a smoothed step function resulting in an error below the feasibility tolerance. By adding up the cost terms \(C_\text {op}^j\) and \(C_\text {inv}^j\) for each unit j, operating costs \(C_\text {op}\) and investment costs \(C_\text {inv}\) are then retrieved, respectively. The relationship between disjunctions and choices within each disjunction is represented by complementarity constraints instead of binary variables. For Problem (MPEC1), the molar flows passing each unit are multiplied by each other and set to zero. Therefore, either \({\dot{n}}_\text {in}^\text {P}\) or \({\dot{n}}_\text {in}^\text {S}\) need to become zero if the other one is nonzero. Due to the absence of any discrete variables, global NLP solvers can be used to find the global optimum. This can however be challenging. Although a considerably smaller problem size can be achieved in comparison to the other approaches, the introduced complementarity constraints can result in the problem violating constraint qualifications and hence cause problems for the local solvers used for upper bounding. Often, a regularization parameter (a small number \(\mu\)) need to be added to make the constraint qualifications hold again. The solution of the original problem is then obtained by sequentially reducing \(\mu\) to zero (Scholtes 2001). Such a regularization is however not required for the problems analyzed in this work, as the global solvers used herein do in practice not depend heavily on the performance of the NLP solver for upper bounding, for which the constraint qualifications need to hold. To improve performance, BARON detects complementary constraints automatically and treats them accordingly. In MAiNGO, there is no special algorithm to detect and treat complementarity constraints implemented yet.

The MPEC formulation of Problem (GDP1) using complementarity constraints results in Problem (MPEC1), in which parameter P (a big number) is used to approximate the step function more accurately.

$$\begin{aligned} \text {min } \qquad&C = C_\text {op} + C_\text {inv}\\ \text {s.t.}\,\qquad&{\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}\\&{\dot{n}}_\text {out}^j={\dot{n}}_\text {in}^j \qquad \qquad \qquad \; \; j \in \{\text {P}, \text {S}\}\\&{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {S}\\&C_\text {op} = \sum _{j \in J}({\dot{n}}_\text {in}^j)^2e_j \\&C_\text {inv}=\sum _{j \in J}\left[ \tanh (P{\dot{n}}_\text {in}^j)(C_j + {\dot{n}}_\text {in}^{0.6})\right] \\&0={\dot{n}}_\text {in}^\text {P} {\dot{n}}_\text {in}^\text {S}\\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \ge 0 \\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \in {\mathbb {R}} \end{aligned}$$
(MPEC1)

3.4 Plus Function

An alternative representation of the complementarity constraint (\(0={\dot{n}}_\text {in}^\text {P} {\dot{n}}_\text {in}^\text {S}\)) in Problem (MPEC1) can be achieved by using the Plus Function (Chen and Mangasarian 1996):

$$\begin{aligned} 0={\dot{n}}_\text {in}^\text {P} - \text {max}(0,{\dot{n}}_\text {in}^\text {P} - {\dot{n}}_\text {in}^\text {S}). \end{aligned}$$
(3)

This function is commonly used for modeling the nonsmooth behavior of flash units for vapor-liquid(-liquid) equilibrium calculations with vanishing phases (Gopal and Biegler 1999; Sahlodin et al. 2016), for which it has shown a promising computational performance. We refer to the resulting problem formulation as Problem (PLUS1) (ESI Section 2).

3.5 Direct MINLP

A problem formulation for superstructure optimization that is only barely used is the direct multiplication of binary variables with the continuous variables being present in the disjunctions. Similarly to the aforementioned problem formulations, it can be derived from GDP problem (GDP1). Boolean variables \(Y_j\) of the GDP problem transform to binary variables \(y_j\) and are (in contrast to the Big-M and Convex Hull method) multiplied with the continuous model variables (\({\dot{n}}_\text {in}^j=y_j{\dot{n}}_\text {in}\) and \(C_\text {inv}=\sum _{j \in J}y_j(C_j + {\dot{n}}_\text {in}^{0.6})\)) to represent the existence or nonexistence of option j within the process flowsheet. Instead of introducing additional inequality constraints modeling the disjunctions, the Direct MINLP formulation incorporates the disjunctions directly into the algebraic equality constraints of the model. Doing this, potential redundant global constraints may need to be disregarded (\({\dot{n}}_\text {in}={\dot{n}}_\text {in}^\text {P} + {\dot{n}}_\text {in}^\text {S}\)) and the variables in the objective function for each choice in the disjunction added up (\(C_\text {op} = \sum _{j \in J}({\dot{n}}_\text {in}^j)^2e_j\) and \(C_\text {inv}=\sum _{j \in J}y_j(C_j + {\dot{n}}_\text {in}^{0.6})\)). The model formulation results in Problem (MINLP1):

$$\begin{aligned} \text {min } \qquad&C = C_\text {op} + C_\text {inv}\\ \text {s.t.}\,\qquad&C_\text {op} = \sum _{j \in J}({\dot{n}}_\text {in}^j)^2e_j\\&C_\text {inv}=\sum _{j \in J}y_j(C_j + {\dot{n}}_\text {in}^{0.6}) \\&{\dot{n}}_\text {in}^j=y_j{\dot{n}}_\text {in} \qquad \qquad \qquad \; \; \, j \in \{\text {P}, \text {S}\} \\&{\dot{n}}_\text {out}^j={\dot{n}}_\text {in}^j \qquad \qquad \qquad \quad j \in \{\text {P}, \text {S}\}\\&{\dot{n}}_\text {out}={\dot{n}}_\text {out}^\text {P} + {\dot{n}}_\text {out}^\text {S} \\&y_\text {P} + y_\text {S} = 1 \\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \ge 0 \\&C_\text {op}, C_\text {inv}, {\dot{n}}_\text {in}, {\dot{n}}_\text {in}^\text {P}, {\dot{n}}_\text {in}^\text {S}, {\dot{n}}_\text {out}^\text {P}, {\dot{n}}_\text {out}^\text {S} \in {\mathbb {R}} \\&y_j \in \{0, 1\} \qquad \qquad \qquad \; \; \, j \in \{\text {P}, \text {S}\} \end{aligned}$$
(MINLP1)

In contrast to the conventional problem formulations, the Direct MINLP formulation introduces nonlinearities by the multiplication of the binary variables \(y_j\) with expressions of the continuous variable \({\dot{n}}_\text {in}\) as part of the algebraic equality constraints in the model. If the remaining model is linear or convex, the reformulation using the Big-M (cf. Sect. 3.1) or Convex Hull (cf. Sect. 3.2) method results in a mixed-integer linear programming (MILP) or convex MINLP problem, respectively, which can be efficiently solved with general-purpose solvers. Thus, for MILP and convex MINLP problems, the conventional reformulation approaches are typically superior to alternative approaches. If the remaining model is nonconvex anyway, e.g., in more detailed process engineering, where nonconvexities are usually inherently part of the (mechanistic) process model, it is not clear whether it is beneficial to reformulate it with the conventional approaches, as it does not result in a MILP or convex MINLP problem (Huster et al. 2020). Keeping bilinear terms according to the Direct MINLP formulation can result in smaller subproblems, which can be directly given to MINLP solvers.

From a modeling perspective, we may also view Problem (MINLP1) instead of Problem (GDP1) as a starting formulation, which may again be reformulated using the Big-M or Convex Hull approach. This results in slightly larger problems compared to Problem (BM1) and (CH1) and was therefore found to be less computationally efficient than the direct reformulation of Problem (GDP1).

3.6 Reduced-space formulation

The problem formulations stated in the preceding sections are given in their conventional FS formulation. As we also consider RS formulations (i.e., eliminating optimization variables and constraints), we exemplarily introduce the RS formulation of Problem (MINLP1) in such a way that variables y that depend on other model variables x are written as factorable functions \({\tilde{y}}(x)\). By doing so, the objective function becomes an explicit function of the degrees of freedom x only. This can be done for all problem formulations. The RS formulation for Problem (MINLP1) is

$$\begin{aligned} \min _{y_\text {P},{\dot{n}}_\text {in}} \qquad&{\tilde{C}}(y_\text {P},{\dot{n}}_\text {in})\\ \text {s.t.}\,\qquad&{\dot{n}}_\text {out} - \dot{{\tilde{n}}}_\text {out}^\text {P}(y_\text {P},{\dot{n}}_\text {in}) - \dot{{\tilde{n}}}_\text {out}^\text {S}(y_\text {P},{\dot{n}}_\text {in}) = 0,\\&{\dot{n}}_\text {in} \ge 0 \\&{\dot{n}}_\text {in} \in {\mathbb {R}} \\&y_\text {P} \in \{0, 1\}, \end{aligned}$$
(MINLP1 RS)

which has only two variables (\(y_\text {P},{\dot{n}}_\text {in}\)) and contains the following functions:

$$\begin{aligned}&{\tilde{C}}(y_\text {P},{\dot{n}}_\text {in}):={\tilde{C}}_\text {op}(y_\text {P},{\dot{n}}_\text {in}) + {\tilde{C}}_\text {inv}(y_\text {P},{\dot{n}}_\text {in})\\&{\tilde{C}}_\text {op}(y_\text {P},{\dot{n}}_\text {in}):=(\dot{{\tilde{n}}}_\text {in}^\text {P}(y_\text {P},{\dot{n}}_\text {in}))^2\cdot \text {e}_\text {P} + (\dot{{\tilde{n}}}_\text {in}^\text {S}(y_\text {P},{\dot{n}}_\text {in}))^2\cdot \text {e}_\text {S}\\&{\tilde{C}}_\text {inv}(y_\text {P},{\dot{n}}_\text {in}):=y_\text {P}\cdot (C_\text {P}+{\dot{n}}_\text {in}^{0.6}) + {\tilde{y}}_\text {S}(y_\text {P})\cdot (C_\text {S}+{\dot{n}}_\text {in}^{0.6})\\&\dot{{\tilde{n}}}_\text {in}^\text {P}(y_\text {P},{\dot{n}}_\text {in}):=y_\text {P} \cdot {\dot{n}}_\text {in} \\&\dot{{\tilde{n}}}_\text {in}^\text {S}(y_\text {P},{\dot{n}}_\text {in}) := {\tilde{y}}_\text {S}(y_\text {P}) \cdot {\dot{n}}_\text {in} \\&{\tilde{y}}_\text {S}(y_\text {P}) := 1 - y_\text {P} \\&\dot{{\tilde{n}}}_\text {out}^\text {P}(y_\text {P},{\dot{n}}_\text {in}) := \dot{{\tilde{n}}}_\text {in}^\text {P}(y_\text {P},{\dot{n}}_\text {in})\\&\dot{{\tilde{n}}}_\text {out}^\text {S}(y_\text {P},{\dot{n}}_\text {in}) := \dot{{\tilde{n}}}_\text {in}^\text {S}(y_\text {P},{\dot{n}}_\text {in}). \end{aligned}$$

Such a RS formulation with intermediate variables computed as a function of other variables can easily be implemented in procedural modeling environments, e.g., via the C++ or Python APIs of MAiNGO, which then uses the MC++ library (Chachuat et al. 2015) to obtain relaxations of these functions. The elimination of variables in such procedural modeling environments corresponds to the construction of a sequence of mathematical operations, for which a relaxation and their subgradients are constructed. This procedure can be considered as propagating the relaxations through the algorithm (Mitsos et al. 2009).

4 Results for the illustrative example problems

As we focus on superstructure optimization problems containing nonconvex functions, we do not consider dedicated solvers for GDP problems but rather reformulate the optimization problem either into a MINLP or a (nonsmooth) NLP problem (cf. Sect. 3). The resulting nonconvex optimization problems generally exhibit multiple suboptimal minima, such that a global solver needs to be used. We use our open-source deterministic global solver MAiNGO v0.5.0.2 (Bongartz et al. 2018), which employs a standard BaB algorithm with several bound tightening techniques and the multivariate McCormick method (McCormick 1976; Tsoukalas and Mitsos 2014) implemented in MC++ (Chachuat et al. 2015) to obtain relaxations. For a comparison with state-of-the-art deterministic global solvers, we perform each optimization also with the commercial solver BARON v19.3.24 (Kılınç and Sahinidis 2017) in the modeling system GAMS 27.0.0 using the automated generation of GAMS files by MAiNGO. In contrast to MAiNGO, BARON uses the auxiliary variable method (AVM) (Smith and Pantelides 1997; Tawarmalani and Sahinidis 2002), which replaces each nonlinear term by an auxiliary variable (AV) and a constraint. For such constraints, known relaxations are constructed. This method is also used by other state-of-the-art deterministic global solvers such as ANTIGONE (Misener and Floudas 2014) and SCIP (Achterberg 2009), the analysis of which is however beyond the scope of this work. All calculations are conducted on an Intel® Core™i3-6100 CPU with 3.7 GHz running Windows 10 and no other applications. For both global solvers, default settings are selected. The optimality and feasibility tolerance is \(10^{-3}\) and \(10^{-6}\), respectively. Each optimization has been performed 100 times to reduce the variations in solution time for small problems caused by system background processes. For all problems, the arithmetic mean solution time is reported in this study.

4.1 Simple illustrative example problem

For the simple illustrative example problem (GDP1), the problem size and numerical results of each problem formulation are summarized in Table 2. FS formulations treat all model variables as optimization variables, whereas RS formulations use model equality constraints to eliminate as many optimization variables as possible. One of the remaining equality constraints in the RS formulations makes sure that the sum of streams \({\dot{n}}_\text {out}^\text {P}\) and \({\dot{n}}_\text {out}^\text {S}\) (i.e., the exiting stream \({\dot{n}}_\text {out}\)) equals 1. The second remaining equality constraint in Problem (MPEC1) and (PLUS1) is the complementarity constraint modeling the logic proposition of Problem (GDP1). As the complementarity equality constraint does not eliminate a degree of freedom, the problem still has one degree of freedom despite having the same number of variables and equality constraints. The remaining inequality constraints in the RS formulations for Problem (BM1) and (CH1) result from the remaining disjunctive optimization variables. In the FS formulation, the Convex Hull formulation leads to much larger problems than its alternatives, whereas in the RS formulation, the corresponding problem is the same size as that resulting from the Big-M formulation. The unconventional reformulation approaches however still lead to smaller problem sizes, which gets more pronounced for more complex problems (cf. Sect. 4.2 and 5 ).

In overall, the RS formulations in MAiNGO yield significantly lower overall solution times than corresponding FS formulations and are the lowest for the unconventional reformulation approaches. The higher number of BaB nodes and the lower lower bound (LB) in the root node for Problem (BM1) compared to those of Problem (MPEC1), (PLUS1), and (MINLP1) in the FS formulation indicate weaker relaxations despite the additional nonconvexities introduced by the unconventional approaches. For Problem (CH1), the optimization in FS suffers from the high number of optimization variables (21, cf. Table 2). The RS formulation accelerates the optimization by about one third of the solution time in the FS formulation (0.20 s vs. 0.28 s) given its smaller problem size. Compared to the alternative problem formulations, the optimization still takes longer.

The solution time for Problem (MPEC1) in FS is similar to that of Problem (MINLP1) in FS. In RS, solution time reductions are again possible despite weaker relaxations. The same findings also apply to Problem (MINLP1). The Plus Function formulation of the complementarity constraints in Problem (PLUS1) seems to reduce computational effort even further.

Table 2 Problem size and numerical results for the problem formulations of Problem (GDP1) presented in Sect. 3. The optimization has been executed 100 times, of which the arithmetic mean value is shown. The optimal value is 11
Fig. 3
figure 3

Solution time for Problem (GDP1) using the problem formulations presented in Sect. 3. As BARON can not handle max-functions, there are no results for the Plus Function formulation. The error bars represent the standard deviation from the arithmetic mean value of the solution time from 100 optimization runs using MAiNGO

Compared to the commercial global solver BARON, optimization with MAiNGO for the unconventional problem formulations (MPEC1) and (MINLP1) is comparably fast for the FS formulations and even faster for the RS formulations, whereas BARON can handle the conventional problem formulations (BM1) and (CH1) better (especially for the FS formulations). This tendency was expected, as MAiNGO explicitly exploits the benefits of a smaller problem size in the lower bounding problem by using McCormick relaxations (Tsoukalas and Mitsos 2014) opposed to the AVM used in BARON.

If linear instead of the nonlinear cost functions are considered, the Big-M and Convex Hull method results in MILP problems (cf. Problem (BM1lin) and (CHlin), respectively, ESI Sect. 1), which can be solved using CPLEX for both the FS and RS formulation. In contrast, the alternative reformulation approaches (yielding Problem (MPEC1lin), (PLUS1lin), and (MINLP1lin), ESI Section 1) still result in MINLP problems with nonconvex functions, which need to be solved using a global solver. However, the overall results for these three problems considering linear cost functions (cf. ESI Table S1 and ESI Fig. S1) do not differ from those with nonconvex cost functions. This indicates that the nonconvexity of model equations seem to have only minor influence on the optimization of this simple illustrative example problem.

4.2 Multiple-disjunction example problem

Table 3 summarizes the problem size and numerical results of each problem formulation for Problem (GDP2) (each problem formulation is given in ESI Section 3). The newly introduced inequality constraint in the RS formulation of Problem (MINLP2) makes sure that the binary variable \(y_\text {F2}\) can not become negative. Beyond the equality constraint for the leaving stream \({\dot{n}}_\text {out}\), Problem (BM2) and (CH2) require an additional equality constraint for the RS formulation to ensure that exactly one choice is being made for the second disjunction (choice for either F1, F2, or none). Problem (MINLP2) does not need a binary for this third choice (neither F1 nor F2) as the algebraic equations for representing the disjunctions differ from those that result directly from reformulating Problem (GDP2) using the Big-M and Convex Hull approach (cf. ESI Problem (BM2) and (CH2)). As a result, no additional equality constraint is required. Similarly to the simple illustrative example problem (cf. Sect. 4.1), the RS formulations of Problem (MPEC2), (PLUS2), and (MINLP2) lead to the smallest problems.

As for the simple illustrative example problem (cf. Sect. 4.1), the RS formulations in MAiNGO always yield lower overall solution times than corresponding FS formulations. However, the differences between the conventional and unconventional reformulation approaches are for the RS formulations not as pronounced as in the previous example problem. In contrast, they are still well-marked for the FS formulations. In the FS formulation, Problem (MPEC2) and (PLUS2) perform similarly due to their same size and similar relaxation tightness (cf. Table 3). Despite Problem (MINLP2) having four additional binary variables, its solution time is similar to that of the aforementioned problem formulations. Problem (BM2) and (MINLP2) benefit from a significantly smaller problem size compared to Problem (CH2), such that their time consumed per BaB node is also significantly smaller. The complementarity-constrainted problems (MPEC2) and (PLUS2) do not exploit their small problem size so effectively. The prominent advantage of the Convex Hull method—achieving tight relaxations—does not seem to be significant here: The relaxations of the complementarity-constrained problem formulations (MPEC2) and (PLUS2) in FS seem to be even tighter (higher LB in root node) and the optimization requires fewer BaB nodes than Problem (CH2) (cf. Table 3). For Problem (BM2), MAiNGO seems to have considerable problems in the FS formulation. The solution time exceeds 30 minutes, which is most likely caused by the simplistic handling of integer variables in MAiNGO. There are no sophisticated heuristics for generating integer-feasible points implemented yet, which can result in poor performance. In RS formulations, this becomes less likely because of the much lower number of possible branches. However, relaxations generally tend to become weaker in the RS formulation, which is confirmed by the lower LB in the root node for RS compared to the FS formulations. The effect of the reduction in problem size on overall solution time is however bigger than that of the relaxation tightness, which is impressively demonstrated for Problem (CH2). All in all, the much simpler and potentially more intuitive problem formulations (MPEC2), (PLUS2), and (MINLP2) seem to benefit from their comparably small problem sizes both in the FS and RS formulation while maintaining comparatively tight relaxations. They always yield solution times as low as or lower than that of the more complex formulations (BM2) and (CH2).

The comparison of the results obtained by MAiNGO with those obtained by BARON confirms the findings from the simple illustrative example problem (cf. Sect. 4.1), yet to a smaller extent: BARON performs better than MAiNGO for FS formulations with only few nonconvex terms (Problem (BM2) and (CH2)), whereas it performs slightly worse for problems in the RS formulation with a higher number of nonconvex terms (Problem (MPEC2) and (MINLP2)).

Table 3 Problem size and numerical results for the problem formulations of Problem (GDP2) presented in Sect. 3. The optimization has been executed 100 times, of which the arithmetic mean value is shown The optimal value is 11.7
Fig. 4
figure 4

Solution time for Problem (GDP2) using the problem formulations presented in Sect. 3. As BARON can not handle max-functions, there are no results for the Plus Function formulation. The error bars represent the standard deviation from the arithmetic mean value of the solution time from 100 optimization runs using MAiNGO

4.3 Selective branching for FS formulations

As an alternative to reducing the problem size on the modeling level (cf. Sect. 1), the problem size can also be reduced on the algorithm level in the means of selective branching (Epperly and Pistikopoulos 1997; Dür and Horst 1997; Ben-Tal et al. 1994). The crucial difference between operating in a reduced space on the modeling level (RS formulation in MAiNGO) and on the algorithm level (selective branching) is that selective branching only reduces the dimensionality of the space that needs to be partitioned via branching, whereas the RS formulation in MAiNGO additionally reduces the dimension of subproblems to be solved in both the lower and upper bounding problem. In selective branching, the same (significant) reductions in the number of BaB nodes required to solve the optimization problem and the overall solution time have been reported (Epperly and Pistikopoulos 1997). This does however apply only to problems with a specific structure and if constraint propagation is used to remain a high convergence order (Kannan and Barton 2017).

Irrespective of the approach for reformulating the GDP, the RS formulation in MAiNGO always results in lower solution times than the corresponding FS formulation for both the simple and more complex flowsheet structure (Fig. 3 and 4 , respectively). In contrast, the LB in the root node is always lower and the number of BaB nodes required for global optimality is always larger for the RS formulation than for the FS formulation or they are equal, which indicates weaker relaxations for the RS formulations caused by the propagation of relaxations through the algorithm.

For comparison and to isolate the effect of a smaller problem size from that of the branching behavior, we perform selective branching on the RS optimization variables in the FS formulations. This selective branching should result in similar BaB trees between the FS and RS formulations in terms of selection of branching variables and points, while it still exploits the potentially tighter relaxations of the FS formulations. This investigation is performed for the more complex flowsheet structure (Problem (GDP2)) as the differences in solution times for the FS and RS formulations are most pronounced for this problem and their branching variables differ from each other.

Except for Problem (BM2), which seems to have an anomaly in the FS formulation, the results in Table 4 and Fig. 5 show that the solution time remains about the same if selective branching is applied to the FS formulations. This shows that the reduction in solution time is exclusively due to the smaller size of the subproblems. For Problem (MINLP2), the selective branching results in a lower number of BaB nodes compared to its FS formulation without selective branching. This has however only a negligible effect on the overall solution time compared to the solution times for RS formulations as the time consumed per BaB node for the RS formulation is considerably smaller. Selective branching in BARON via the specification of branching priorities has also only negligible influence on the solution time and the number of BaB nodes.

Table 4 Problem size and numerical results for the FS formulation of Problem (GDP2) if selective branching is applied. The optimization has been executed 100 times, of which the arithmetic mean value is shown. The optimal value is 11.7. For the FS formulation of Problem (BM2), the maximum solution time of 86,400 s has been reached
Fig. 5
figure 5

Solution time for the FS formulation of Problem (GDP2) if selective branching is applied compared to both the FS formulation without selective branching and the RS formulation. As BARON can not handle max-functions, there are no results for the Plus Function formulation. The error bars represent the standard deviation from the arithmetic mean value of the solution time from 100 optimization runs using MAiNGO

5 Application to problems with piecewise-defined functions

A commonly used case study for testing problem formulations with piecewise-defined functions is the solution of a heat exchanger network design problem, which can be formulated as a superstructure optimization problem. This is a special case of superstructure optimization problems as the alternative units can be described as one single unit with a piecewise-defined function, in this case, of the heat exchanger area \(A_i\). Thus, the disjuncts do not represent different units but rather different regions within the piecewise-defined cost function of each unit (heat exchanger). The case study is taken from Türkay and Grossmann (1996a) and is depicted in Fig. 6 with parameters given in Table 5. The corresponding formulation of Problem (HEXGDP) can be found in ESI Sect. 4.1.

Fig. 6
figure 6

Heat exchanger network based on the work of Türkay and Grossmann (1996a) with the choice between three different sizes for each heat exchanger 1–3

Table 5 Parameters for the heat exchanger network design problem taken from Türkay and Grossmann (1996a)

For the RS formulations, model equations had to be rearranged (e.g., for the elimination of \(T_1\), by a partial fraction decomposition) in order to tighten relaxations and prevent singularities. Since in this example the alternative units can also be described as a single unit containing a piecewise-defined (potentially discontinuous) function, we introduce a new problem formulation (Problem (HEXStepF), ESI Sect. 4), hereafter called Step Formulation. This formulation involves step functions, for which McCormick relaxations can be constructed (Wechsung and Barton 2013). The piecewise-defined function

$$\begin{aligned} \phi (x)={\left\{ \begin{array}{ll}\phi _1(x) \quad \text { if } x \le x_1 \\ \phi _2(x) \quad \text { otherwise} \end{array}\right. } \end{aligned}$$
(4)

is reformulated as follows:

$$\begin{aligned} \phi (x)=\pi (x-{x}_1) \phi _2(x) + [1 - \pi (x - {x}_1)] \phi _1(x), \end{aligned}$$
(5)

where

$$\begin{aligned} \pi (x)={\left\{ \begin{array}{ll}0 \quad \text { if } x \le 0 \\ 1 \quad \text { otherwise.} \end{array}\right. } \end{aligned}$$
(6)

The problem size and numerical results for each problem formulation (see ESI Sect.  4 for each formulation) are summarized in Table 6. For both global solvers, default settings are used.

The key findings from the previous case studies are confirmed by this discontinuous optimization problem. The RS formulations always result in shorter overall optimization times mainly resulting from the reduced problem size. The negative effect of the elimination of optimization variables on the tightness of the root node relaxation is particularly pronounced for this case study. However, for Problem (HEXBM), (HEXCH), and (HEXMINLP), this has only a minor negative influence on the number of BaB nodes required. For the other problems, the number of BaB nodes does even decrease much likely due to a more directed branching. Despite the lowest LB in the root node of Problem (HEXMINLP), it requires the fewest BaB nodes and thus outperforms the Big-M and Convex Hull formulation for the RS formulation in this case study.

Reformulation approaches avoiding the introduction of binary variables and thus leading to NLP problems (Problem (HEXMPEC), (HEXPLUSF, and (HEXStepF)) do not benefit in the FS formulation from their smaller problem size. Their subproblems are solved quickly, but a high number of BaB nodes is required despite yielding the tightest relaxations in the root node among all formulations. Moving to RS formulations reduces solution time considerably as significantly fewer BaB nodes are required. Since relaxations can never be tighter in RS than corresponding FS formulations, the reduced number of BaB nodes can mainly be attributed to a more directed branching. The combination of a more directed branching with a small problem size resulting from RS formulations for the reformulation approaches introducing additional nonconvexities yield the lowest computational effort. Also the overall trend, BARON performing better than MAiNGO for the conventional reformulations approaches and MAiNGO generally performing better than BARON for the unconventional approaches in the RS formulations (except for the MINLP formulation), is confirmed by this heat exchanger network design problem.

Table 6 Problem size and numerical results for Problem (HEXGDP) with problems formulations presented in Sect. 3 and the Step Formulation. The optimization has been executed 100 times, of which the arithmetic mean value is shown. The optimal value is 114,385. No results are given for the RS (AVM/McCormick hybrid) Step Formulation, as it does not contain repeated nonlinear terms that could be replaced by AVs
Fig. 7
figure 7

Solution time for Problem (HEXGDP) using the problem formulations presented in Sect. 3 and the Step Formulation. As BARON can not handle max-functions, there are no results for the Plus Function formulation. The solution times of the optimization with BARON and the MPEC and Step Formulation exceed the maximum solution time of 86400 s. No results are given for the RS (AVM/McCormick hybrid) Step Formulation, as it does not contain repeated nonlinear terms that could be replaced by AVs. The error bars represent the standard deviation from the arithmetic mean value of the solution time from 100 optimization runs using MAiNGO

We have demonstrated that RS formulations have the big advantage of a smaller problem size and a more directed branching, while they suffer from weaker relaxations than FS formulations. To combine the advantage of a small problem size from the RS formulations with the relaxation tightness of the FS formulations, we can selectively consider AVs for repeated nonlinear terms, thus achieving a hybrid (Najman et al. 2021) between McCormick relaxations and the AVM. Such a hybrid can be easily considered by MAiNGO using its automated identification and replacement of repeated nonlinear terms with AVs. In general, it is much more likely for RS than FS formulations to contain these repeated nonlinear terms (see Bongartz (2020), p.64ff.). In fact, for the heat exchanger network design problem, only the RS formulations contain repeated nonlinear terms. As illustrated in Table 6 and Fig. 7, the selected addition of AVs is beneficial especially for the complementarity-constrained problems (HEXMPEC) and (HEXPLUSF) despite an unchanged LB in the root node. However, the number of BaB nodes reduces considerably while the CPU time per BaB node remains about the same. In contrast, for the other problem formulations (except Problem (HEXBM)) the subproblems seem to become more difficult to solve, such that the benefit of the tighter relaxations resulting from adding selected AVs is almost outweighed. Problem (HEXStepF) does not exhibit repeated nonlinear terms in both the FS and RS formulations. The analysis shows: The consideration of selected AVs within the RS formulations has the potential to improve computational effort even further.

6 Conclusion

Superstructure optimization problems for process synthesis often contain nonconvex functions resulting in nonconvex MINLP problems, for which global solvers need to be used. Although the same problem formulations as those developed for superstructure optimization problems with only convex functions can be used, it remains unclear whether these are always the most computationally efficient ones for problems with nonconvex functions. In particular, for problems with only convex functions, preference is usually given to formulations such as Big-M or Convex Hull that avoid introducing nonconvexities, thus resulting in a convex MINLP or even MILP. We conjectured that for problems that contain nonconvex functions anyway, alternative formulations that do introduce additional nonconvexities but may result in smaller problems or allow tighter relaxations could be promising.

Our analysis shows that for problems containing nonconvex functions anyway, these additional nonconvexities do not necessarily have a considerable negative influence on the optimization. The resulting relaxations often seem to remain comparably tight despite these additional nonconvexities and the corresponding problem formulations contain fewer variables than the conventional formulations. Accordingly, the alternative formulations result in similar or even lower computational time than the conventional ones for most considered examples. As an additional approach to reduce problem size, we exploited reduced-space (RS) formulations, where we eliminate as many optimization variables as possible using the model equations. Despite weaker relaxations, the smaller problem size of RS formulations is beneficial for all problem formulations for the considered case studies.

The comparison of the results obtained with our open-source solver MAiNGO with those obtained with the state-of-the-art commercial solver BARON shows that neither the auxiliary variable method (AVM) employed in BARON, nor the propagation of McCormick relaxations employed in MAiNGO suffers significantly from the additional nonconvex terms resulting from the unconventional problem reformulation approaches. Yet, the reduction of the problem size in the RS formulations can be exploited more effectively by MAiNGO, resulting in the lowest overall solution times observed for the illustrative example problems. For the conventional reformulation approaches, BARON generally outperforms MAiNGO.

We have extended the comparison of model formulations by also considering a problem with piecewise-defined functions instead of unit selections, which can be formulated as a superstructure optimization problem. For this problem, the RS formulation for all reformulation approaches was again computationally more efficient than the FS formulation. A formulation that directly treated the discontinuous function with the help of step functions was particularly advantageous.

In summary, the unconventional reformulation approaches, which introduce nonconvex terms, are promising for nonconvex superstructure optimization problems, especially when combined with RS formulations. Bearing in mind that modeling using these approaches is rather intuitive and the resulting models simple, they represent an interesting alternative to established approaches such as the Big-M and Convex Hull method. To further validate this claim, a benchmark library with more complex superstructure optimization problems is required for such comparative analyses in future studies. In this work, only a few simple flowsheet optimization problems have been considered, as each problem formulation needed to be implemented manually for all reformulation approaches and for both the FS and RS formulation individually. To overcome the considerable manual effort and benefit from a flexible application of different reformulation approaches given specific problem characteristics, the automated generation of problem formulations (in the vein of, e.g., the open-source Python package for component-oriented modeling and optimization for nonlinear design and operation of integrated energy systems COMANDO (Langiu et al. 2021), the Pyomo.GDP package (Chen et al. 2018) that allows automated reformulation and solution of GDPs, or the Pyosyn framework (Chen et al. 2021) for superstructure modeling) based on a GDP problem would be beneficial.