Extensions to Generalized Disjunctive Programming: Hierarchical Structures and First-order Logic

Optimization problems with discrete-continuous decisions are traditionally modeled in algebraic form via (non)linear mixed-integer programming. A more systematic approach to modeling such systems is to use Generalized Disjunctive Programming (GDP), which extends the Disjunctive Programming paradigm proposed by Egon Balas to allow modeling systems from a logic-based level of abstraction that captures the fundamental rules governing such systems via algebraic constraints and logic. Although GDP provides a more general way of modeling systems, it warrants further generalization to encompass systems presenting a hierarchical structure. This work extends the GDP literature to address three major alternatives for modeling and solving systems with nested (hierarchical) disjunctions: explicit nested disjunctions, equivalent single-level disjunctions, and flattening via basic steps. We also provide theoretical proofs on the relaxation tightness of such alternatives, showing that explicitly modeling nested disjunctions is superior to the traditional approach discussed in literature for dealing with nested disjunctions.


Introduction
Discrete-continuous optimization is one of the main modeling approaches to address design, planning, and scheduling problems in Process Systems Engineering (PSE) (Grossmann, 2012). Raman and Grossmann (1994) present a powerful modeling paradigm that extends the work by Balas (1985) on disjunctive programming. This new paradigm, called Generalized Disjunctive Programming (GDP), has been further developed by others in the PSE community over the years to account for additional features, such as nonlinearities and nonconvexities in the problems encountered (Grossmann & Trespalacios, 2013). GDP relies on the intersection of disjunctions of convex sets to model the feasible space. Boolean variables are used as indicator variables for each convex set, meaning that if True, the constraints in the disjunct are enforced. Logic constraints are also included to describe the relationships between the Boolean indicator variables via propositional logic and constraint programming logic.
GDP is a powerful modeling abstraction for optimization problems for two main reasons. Firstly, modeling systems from the basis of their underlying logical relationships speeds up the development of optimization models by making them easier to interpret and by reducing the likelihood of modeling errors due to logical fallacies. Secondly, GDP makes available a broad array of solution methods, ranging from mixed-integer reformulations to logic-based search methods (Chen et al., 2022).
The present work extends the GDP theory to allow modeling hierarchical systems, which are commonly encountered in PSE, and more particularly in Enterprise-Wide Optimization (EWO) (Grossmann, 2012; van den Heever & Grossmann, 1999) and flowsheet superstructure optimization (Türkay & Grossmann, 1996a). Hierarchical systems involve multiple levels of decision making, which can be concisely modeled via nested disjunctions. However, traditional GDP does not consider such formulations. Existing GDP literature suggests reformulating nested disjunctions into equivalent single-level disjunctions (Vecchietti & Grossmann, 2000). Such an approach requires introducing additional Boolean variables and logical propositions. An alternate approach is used in the work by van den Heever and Grossmann (1999), in which a direct or inside-out reformulation to MI(N)LP is performed. We formalize these two approaches and provide theoretical proofs on the tightness of their continuous relaxations. We also present a third approach that produces a single-level GDP by applying basic steps to obtain the disjunctive normal form (DNF) of the nested disjunction. The model tightness and computational performance of the different approaches are compared. A series of examples is used to show the modeling and computational advantages obtained by explicitly modeling nested disjunctions.
The paper is organized as follows. Section 2 provides a background on the GDP modeling paradigm. Section 3 extends this formulation to account for hierarchical systems, and discusses the alternatives for modeling such systems. The equivalent mixed-integer programming reformulations for these alternatives are presented, along with two theorems on the tightness of the resulting models. Section 4 provides several numerical use cases for hierarchical GDPs. Section 5 presents concluding remarks.

Background: Generalized Disjunctive Programming (GDP)
The classical GDP formulation is given below (GDP), where x is the set of continuous variables (bounded between x^L and x^U), f(x) is the objective function, g(x) ≤ 0 is the set of global constraints, and r_ik(x) ≤ 0 is the set of constraints applied when the indicator Boolean Y_ik is True for disjunct i in disjunction k. The functions f(x), g(x), and r_ik(x) are assumed to be continuous and differentiable over x. Ω(Y) defines the set of logic constraints, which are described via propositional logic on a subset of Boolean variables. These constraints describe the relations between the Boolean variables via clauses that contain one or more of the following logic operators: AND (∧), OR (∨), implication (⇒), equivalence (⇔), and negation (¬). The set of logic constraints may also include cardinality clauses of the form choose exactly (or at least or at most) n Boolean variables from a subset of Booleans to be True (Yan & Hooker, 1999). We leverage predicate logic to extend the notation used by Yan and Hooker for cardinality clauses by defining the following predicates: Ξ(n, Y_i ∀i ∈ I) enforces that exactly n of the Boolean variables Y_i are True, Λ(n, Y_i ∀i ∈ I) enforces that at least n of the variables are True, and Γ(n, Y_i ∀i ∈ I) enforces that at most n are True. GDP models typically include a cardinality clause to enforce that exactly 1 disjunct is chosen in each disjunction, i.e., Ξ(1, Y_ik ∀i ∈ D_k) ∀k ∈ K. The GDP literature often uses the exclusive OR operator, ⊻, to define this constraint. However, such an operator is only correct for proper disjunctions (those with non-overlapping disjuncts). Thus, to avoid any ambiguity, we use the predicate logic notation, Ξ(1, Y), here instead.

To illustrate the elements of a GDP model, consider the model below (GDP-example). The projection of this model on the (x_1, x_2)-plane is given in Figure 2.1, where the quadratic objective function is shown in the colored contours, the global constraints are given by the region under the black curves (one linear and the other nonlinear), and the three disjuncts are given by the colored rectangles. The feasible space of such a system is given by the disjoint regions in the orange, blue, and green rectangles that satisfy the global constraints.

One of the main advantages of modeling discrete-continuous problems using GDP is the collection of methods that are available for optimizing such systems. These include: 1) reformulating to mixed-integer (non)linear models (MI(N)LP) via either Big-M (Trespalacios & Grossmann, 2015) or Hull reformulations (Agarwal, 2015; Bernal & Grossmann, 2021; Furman et al., 2020; Grossmann & Lee, 2003), 2) logic-based decomposition methods such as Logic-based Outer Approximation (LOA) (Türkay & Grossmann, 1996b), 3) disjunctive branch-and-bound (Lee & Grossmann, 2000), 4) basic steps (Ruiz & Grossmann, 2012), and 5) hybrid cutting planes (Sawaya & Grossmann, 2005; Trespalacios & Grossmann, 2016). The reader is referred to the above references for a detailed understanding of each of these solution methods.
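The cardinality predicates described above can be illustrated with a minimal Python sketch. The function names `exactly`, `at_least`, and `at_most` are hypothetical stand-ins for Ξ, Λ, and Γ and are not taken from any GDP package. The sketch enumerates the assignments of three indicator Booleans that satisfy Ξ(1, ·), i.e., the admissible selections in a disjunction with three disjuncts.

```python
from itertools import product


def exactly(n, booleans):
    """Stand-in for Xi(n, ...): exactly n of the Booleans are True."""
    return sum(booleans) == n


def at_least(n, booleans):
    """Stand-in for Lambda(n, ...): at least n of the Booleans are True."""
    return sum(booleans) >= n


def at_most(n, booleans):
    """Stand-in for Gamma(n, ...): at most n of the Booleans are True."""
    return sum(booleans) <= n


# Enumerate every assignment of three indicator Booleans (one disjunction
# with three disjuncts) and keep those that satisfy Xi(1, ...).
selections = [Y for Y in product((False, True), repeat=3) if exactly(1, Y)]
```

Under these assumptions, `selections` holds one assignment per disjunct, each with a single True entry.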

Extended formulation for multi-level hierarchies
Decision hierarchies are present in most decision-making applications. These include, for instance, supply chain and enterprise-wide optimization, where different levels of decision-making exist depending on the time scales considered: planning (months/years), scheduling (hours/days), and control (seconds/minutes). According to Brunaud and Grossmann (2017), integrating different decision levels enables better coordination and communication between functional areas, which increases agility in response to disturbances and makes it possible to attain benefits for the company that are not possible with a siloed approach. Figure 3.1 illustrates the notion of the synergistic benefits that can be obtained by an integrated approach, rather than siloed or aggregated approaches. Accounting for the relationships between different levels of decision-making can aid in finding the true optimum, which differs from that of the aggregated model (i.e., the model obtained by summing the siloed costs). Integrated approaches to hierarchical decision-making systems have been addressed in the literature. Some examples of these integrations are the integration between design and planning (operational and expansion) (van den Heever & Grossmann, 1999), planning and scheduling (Maravelias & Sung, 2009), and scheduling and control (Muñoz et al., 2011; Sokoler et al., 2017). The following subsections formalize how GDP can be used to model hierarchical systems, along with theoretical proofs on the differences between the approaches.

Hierarchical GDP
We propose extending the GDP paradigm to include multi-level decisions by means of nested disjunctions.
Although the notion of nesting disjunctions to represent hierarchical decisions is not new, the limitations in the traditional GDP notation have made it difficult to exploit the benefits of using such structures. One of the first references to nested disjunctions is found in the work by Vecchietti and Grossmann (2000), which describes the transformations required to conform to the current GDP notation. It is interesting to note that several works have relied on the nested GDP representation due to its compactness.
In one of these (Rodriguez & Vecchietti, 2009), the following statement is made, "Although the expressiveness of the hierarchical decisions by means of nested disjunctions, they cannot be implemented directly. These disjunctions must be transformed into GDP form. For that purpose, the disjunctions…must be rewritten as single disjunctions, and some additional constraints must also be included in the model." Therefore, from a model development point of view, the use of disjunction nesting is shown to add value. However, its implementation has often required breaking the explicit hierarchical structure. An exception is the work by van den Heever and Grossmann (1999), which does not transform the nested GDP into a logically equivalent single-level GDP, but rather suggests performing the Hull reformulation on the inner disjunction and then reformulating the outer disjunction. We now build upon this concept to formally extend the GDP notation for hierarchical systems in a way that generalizes to multi-disjunct disjunctions, rather than the on/off disjunctions used by van den Heever and Grossmann (1999). We also provide theoretical proofs on the advantages of modeling system hierarchies via nested disjunctions, and highlight the computational performance gains obtained using this explicit notation.
The proposed extension to the classical GDP notation for hierarchical systems is given below for a 2-Level Nested GDP (2L-GDP), where the upper-level decisions, Y, enforce the constraints r(x) ≤ 0, and the nested decisions, W, enforce the constraints h(x) ≤ 0. Here the cardinality clause of selecting exactly one disjunct from the upper-level decisions, Y, is expressed explicitly, along with a new set of cardinality rules that enforce selecting exactly one of the lower-level decisions, W, if and only if the upper-level decision has been selected, and selecting no lower-level decisions when the upper-level decision is not selected. This constraint is expressed as the conjunction of two cardinality rules, one for each of these two cases.
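The nested cardinality rules described above can be checked by brute force on a small instance. The following Python sketch assumes a hypothetical system with one upper-level disjunction of two disjuncts, the first of which contains a nested disjunction with two disjuncts. The function names are illustrative only.

```python
from itertools import product


def upper_rule(Y):
    """Exactly one upper-level disjunct is selected."""
    return sum(Y) == 1


def nested_rule(parent, W):
    """Exactly one nested disjunct if the parent is selected, none otherwise."""
    return sum(W) == (1 if parent else 0)


# Y = (Y1, Y2) are the upper-level Booleans; W = (W1, W2) are nested in Y1.
feasible = []
for Y in product((False, True), repeat=2):
    for W in product((False, True), repeat=2):
        if upper_rule(Y) and nested_rule(Y[0], W):
            feasible.append((Y, W))
```

For this assumed instance, three selections survive: the second upper-level disjunct with no nested selection, and the first upper-level disjunct paired with either of its two nested disjuncts.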

Equivalent Single-level GDP
Previous references to GDP with nested disjunctions in the literature have proposed transforming the 2L-GDP model into the Equivalent Single-Level GDP (2E-GDP) given below (Grossmann & Trespalacios, 2013; Vecchietti & Grossmann, 2000). Here, the nested disjunction is extracted and a dummy or "slack" disjunct is added to preserve feasibility. Thus, if none of the nested disjuncts is selected, the slack disjunct is selected, which contains the entire feasible set for x. The exclusive cardinality rule on the inner Boolean variables, W, is also augmented to include the slack Boolean variable, W_0. This slack variable is, however, not included in the linking logic constraint for the upper and lower-level decisions. This ensures that the nested decisions are only selected if their master Boolean is True. This method for transforming a nested disjunction can also be applied to the multi-level system ML-GDP. Although the above formulation allows modeling hierarchical systems in the standard GDP notation, it has two major drawbacks: 1) the explicit hierarchical structure is lost, and 2) although the Equivalent Single-Level GDP model is logically equivalent to the Nested GDP model, it requires introducing additional disjuncts and Boolean variables. Introducing "slack" disjuncts and "slack" Boolean variables results in models whose continuous relaxations are less tight, as described in the next section.
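The logical equivalence between the nested rule and the single-level rule with a slack Boolean can be verified by enumeration. The sketch below is illustrative only: it assumes the linking logic constraint takes the form Y ⇔ (W_1 ∨ W_2) and that the augmented cardinality rule is Ξ(1, W_0, W_1, W_2). Both forms are assumptions made for this example, since the formulation itself is not reproduced here.

```python
from itertools import product


def nested_ok(Y, W):
    """Nested rule: one nested Boolean is True iff the parent Y is True."""
    return sum(W) == (1 if Y else 0)


def single_level_ok(Y, W):
    """Single-level rule: assumed link Y <=> OR(W), plus exactly one of
    (W0, W1, ..., Wn) True for some value of the slack Boolean W0."""
    linked = Y == any(W)
    return linked and any(sum((W0,) + W) == 1 for W0 in (False, True))


# Compare both rules over every assignment of Y and two nested Booleans.
equivalent = all(
    nested_ok(Y, W) == single_level_ok(Y, W)
    for Y in (False, True)
    for W in product((False, True), repeat=2)
)
```

Under these assumptions the two rules accept the same assignments, but the single-level version needs the extra Boolean W_0, which is the source of the additional variables discussed above.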

Tightness of Continuous Relaxations
The following two theorems and their associated proofs establish the advantages of modeling multi-level decision problems via Nested GDP, rather than the Equivalent Single-Level GDP approach. The advantages are shown by discussing the tightness of the continuous relaxations of both the Hull reformulation (HR) and Big-M reformulation (BM) of these two GDP models.
Theorem 1. Let rML-GDP-HR denote the continuous relaxation of the mixed-integer program (MIP) obtained from a Multi-Level Nested GDP via the Hull reformulation, and let rME-GDP-HR denote the continuous relaxation of the MIP obtained from its respective Equivalent Single-Level GDP representation via the Hull reformulation. The feasible space of the former is contained within the feasible space of the latter, namely, rML-GDP-HR ⊆ rME-GDP-HR.
Proof. Without loss of generality, the above theorem is proved by establishing that the Hull reformulation of the 2-Level Nested GDP model (r2L-GDP-HR) is contained in the Hull reformulation of its Equivalent Single-Level GDP representation (r2E-GDP-HR). The Hull reformulation for 2L-GDP is given below, where the continuous variable x is disaggregated in each disjunct (x is disaggregated into one copy for each upper-level disjunct, and each of these copies is in turn disaggregated into one copy for each lower-level disjunct) and the Boolean variables are replaced by their corresponding binary variables (Y becomes y, and W becomes w). The logic constraints are mapped into their algebraic counterparts, written in terms of matrices and a vector of scalars, which are obtained after converting the logic propositions into conjunctive normal form (CNF) and transforming each clause into its equivalent algebraic constraint (Williams, 1985).
The Hull reformulation for 2E-GDP is given below, where x is disaggregated into one copy for each upper-level disjunct, and is also disaggregated into one copy for each of the lower-level disjuncts, which are extracted when transforming the model into an Equivalent Single-Level GDP.
The difference between 2L-GDP-HR and 2E-GDP-HR is in the highlighted constraints in the variable disaggregation and cardinality rules sections. The proof for the Hull reformulation case is given by applying Fourier-Motzkin elimination (Dantzig, 1972) to eliminate the slack binary variable (w_0) and its corresponding disaggregated variable from 2E-GDP-HR. We first combine the last two cardinality rules in 2E-GDP-HR to obtain (1).
Equating the two variable aggregation constraints in 2E-GDP-HR and solving for the slack disaggregated variable gives (2).
Summing the bounding constraint on the disaggregated variables over all disjuncts other than the one under consideration results in (5). Using the cardinality rule that the binary variables of the disjunction sum to 1, (5) can be written as given in (6), which has two parts, (6a) and (6b). Substituting these into (4) proves that (4) is a relaxation of the disaggregation constraint in 2L-GDP-HR (the lower-level disaggregated variables sum to the disaggregated variable of their parent disjunct), which can be written as … It should also be noted that the cardinality rule on the extracted lower-level decisions in …

Proof. Without loss of generality, the above theorem is proved by establishing that the Big-M reformulation of the 2-Level Nested GDP model (r2L-GDP-BM) is contained in the Big-M reformulation of its Equivalent Single-Level GDP representation (r2E-GDP-BM), when tight M values are used. The Big-M reformulation for the nested GDP model is given in 2L-GDP-BM, where M_ik is the Big-M value for the constraints in the i-th disjunct in disjunction k, and the nested constraints carry two further Big-M values, one associated with the upper-level decision and one associated with the lower-level decision. The Big-M reformulation for the Equivalent Single-Level GDP is given in 2E-GDP-BM, where M_ik is the same as in 2L-GDP-BM, and a separate Big-M value is associated with the extracted lower-level decisions.
Finding the tightest Big-M values requires solving multiple optimization problems to maximize the value of each constraint function over the complete model's feasible region, or over the corresponding feasible region of the disjunction (Grossmann & Trespalacios, 2013). For the proof we calculate tight Big-M values using only the global constraints, or the upper-level constraints in the case of the nested constraints. The following mathematical optimization problems are solved to obtain tight M values: (7), (8a), (8b), and (9), listed in the order in which the corresponding Big-M values were introduced above. It should be noted that the Big-M value from (8a) accounts for the upper-level constraints r_ik(x) ≤ 0, meaning it is localized to the parent disjunct that it belongs to. The Big-M value from (8b) subtracts the value from (8a) from the traditional Big-M value to ensure that when both upper and lower-level decisions are not selected (y = 0 and w = 0), the resulting Big-M value is equivalent to the global Big-M value for that constraint.
The proof lies in establishing that the feasible space of 2L-GDP-BM is contained in 2E-GDP-BM. The difference between these two models is shown in the highlighted constraints above. It was previously shown that the cardinality rule on the extracted lower-level decisions, w_0 + ∑_j w_j = 1, is redundant (see Theorem 1). Thus, the proof is given by establishing that the right-hand sides of the highlighted Big-M constraints satisfy (10), meaning that the Big-M constraint from 2L-GDP-BM is contained in the Big-M constraint from 2E-GDP-BM. Substituting (9) in (8b) results in (11). Substituting (11) in (10) and simplifying the resulting expression produces (12). From the cardinality constraint that the lower-level binary variables sum to their upper-level binary variable (∑_j w_j = y), it is clear that w_j ≤ y, meaning that the expressions in parentheses in (12) can be dropped without changing the sign of the inequality. Thus, the Big-M value from (8a) is no larger than the Big-M value from (9), which is true considering that (9) is a relaxation of (8a). Therefore, 2L-GDP-BM ⊆ 2E-GDP-BM. QED
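The role of the feasible region in computing a tight Big-M value can be illustrated with a minimal Python sketch. It assumes a single linear constraint a·x − b ≤ 0 and uses only variable bounds as the feasible region, in which case the maximization has a closed form. The function name, the coefficients, and the bounds are all hypothetical, and the parent disjunct is represented simply as a tighter upper bound on one variable.

```python
def tight_big_m(coeffs, rhs, lower, upper):
    """Largest value of a.x - rhs over the box lower <= x <= upper.

    Each term is maximized independently: positive coefficients take the
    upper bound, negative coefficients take the lower bound.
    """
    worst = sum(a * (u if a >= 0 else l) for a, l, u in zip(coeffs, lower, upper))
    return worst - rhs


# Hypothetical constraint 2*x1 - x2 - 4 <= 0 with global bounds 0 <= x <= 10.
M_global = tight_big_m([2, -1], 4, [0, 0], [10, 10])

# Same constraint, but maximized over a region localized to a parent
# disjunct that (by assumption) additionally enforces x1 <= 3.
M_local = tight_big_m([2, -1], 4, [0, 0], [3, 10])
```

Because the global region is a relaxation of the localized one, the localized value can never exceed the global value, which mirrors the argument used at the end of the proof.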

Flattening via Basic Steps
The third approach to modeling hierarchical GDP is to flatten the nested disjunctions by applying sufficiently many basic steps within each disjunction until the nested system is transformed into a system with single-level disjunctions. Consider the simple nested disjunction in (13). This disjunction constraint can be flattened by applying two basic steps to introduce r_1(x) ≤ 0 into the nested disjunctions, resulting in (14), where each new indicator Boolean is the conjunction of the upper-level Boolean with one of the nested Booleans (Y_1 ∧ W_1 and Y_1 ∧ W_2). For disjunctions with a single nested disjunction, applying a basic step is quite cheap. However, once there is more than one nested disjunction inside a single disjunct, the number of basic steps required to flatten the hierarchical GDP grows exponentially. Consider the disjunction with two nested disjunctions in (15). Flattening the disjunction is a set covering problem and requires eight basic steps (four, one for each combination of two nested disjuncts, and four more to introduce r_1(x) ≤ 0 in the resulting disjuncts) to obtain the equivalent disjunction in (16).
Generalizing this to the notation of 2L-GDP, a disjunct containing multiple nested disjunctions, each with its own set of disjuncts, requires the number of basic steps given in (17), where the notation (n choose 1) denotes the binomial coefficient (choose 1 from a group with n elements). The coefficient 2 accounts for introducing r_ik(x) ≤ 0 into each of the resulting disjuncts, and can be replaced by 1 + |r_ik(x)| if r_ik(x) represents a vector of functions, where |r_ik(x)| is the number of functions within r_ik(x). A hybrid approach is also possible, where some basic steps are performed and then the resulting nested disjunction is flattened as in the Equivalent Single-Level GDP approach. However, as the number of nested disjunctions increases, this hybrid approach results in many more disjunctions than those given in (17). Although flattening via basic steps may produce models that are tighter than the inside-out reformulation of the Nested GDP, the combinatorial growth of such systems makes this approach prohibitive for multilevel decision systems with multiple disjuncts in each nested disjunction.
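The combinatorial growth described above can be made concrete with a short Python sketch. It assumes that (17) reads as the coefficient (2, or 1 + |r(x)| for a vector of functions) times the product of the binomial coefficients over the nested disjunctions. This reading is an assumption, chosen because it reproduces the eight basic steps of the example with two nested disjunctions. The constraint labels are hypothetical placeholders.

```python
from itertools import product
from math import comb, prod


def flatten(parent, nested):
    """Flattened disjuncts: the parent constraint plus one pick from each
    nested disjunction (a Cartesian product of the nested disjuncts)."""
    return [(parent,) + combo for combo in product(*nested)]


def basic_steps(nested, n_parent_constraints=1):
    """Assumed reading of (17): (1 + |r|) * prod over nested disjunctions
    of C(number of disjuncts, 1)."""
    combos = prod(comb(len(disjunction), 1) for disjunction in nested)
    return (1 + n_parent_constraints) * combos


# One disjunct with constraint "r1" holding two nested disjunctions,
# each with two disjuncts (placeholder labels).
nested = [["h11", "h12"], ["h21", "h22"]]
flat = flatten("r1", nested)
steps = basic_steps(nested)
```

For this instance the sketch yields four flattened disjuncts and eight basic steps. Adding a third nested disjunction with two disjuncts doubles both counts, which illustrates why the approach becomes prohibitive as the nesting widens.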

Examples
Each of the examples in this section is implemented in the Julia programming language (version 1.8.3) (Bezanson et al., 2017) using various packages within the ecosystem. These include JuMP (version 1.6.0) (Dunning et al., 2017)

Graphical Example of Model Tightness
Consider the Nested GDP constraint system given in (18), which can be expressed as the Equivalent Single-Level GDP in (19), or the Flattened GDP (via basic steps) in (20). Each of these models is reformulated into a MIP using the Big-M reformulation, with both a loose (large) M value and a tight M value, and the Hull reformulation. Their continuous relaxations are then projected onto the (x_1, x_2) plane in Figure 4.1. The projections show that flattening via basic steps is advantageous when the Hull reformulation is performed, but not necessarily when the Big-M reformulation is performed with a tight M value. In the latter case, the lack of information regarding the system hierarchy results in a Big-M reformulation that is equivalent to taking the Big-M region over all three flattened disjuncts at once, which is worse than taking the Big-M region of the two upper-level disjuncts and intersecting it with the Big-M region of the two nested disjuncts.
Explicitly preserving the hierarchical relationship in the Nested GDP representation reduces the feasible region of the continuous relaxation more than when the Equivalent Single-Level GDP representation is used. This is observed in both the Tight-M (Big-M reformulation with a tight M) and Hull reformulation cases. Furthermore, in this example the Tight-M reformulation of the Nested GDP model produces the same relaxation as the Hull reformulation of the Equivalent Single-Level GDP model with only a fraction of the model size (see Table 4.1). It should also be noted that the convex hull of the system is obtained when the Hull reformulation is applied to either the Nested GDP or the Flattened GDP. As a result, the continuous relaxation of either formulation will yield the optimum.

Each of these has a maximum installed capacity of 100 kg.Up to one tank for each material in the system can be installed for storage with a maximum installed capacity of 300 kg.There are two candidate chemical processes to perform each material transformation step, giving a total of six processes in the process superstructure.
There are two potential technologies (catalysts) that can be used in each process, each with a unique cost and yield, giving a total of 12 candidate process-catalyst combinations in the system. The plant process and equipment superstructures are given in Figures 4.2.1 and 4.2.2, respectively. The former illustrates the candidate processes in the superstructure in the state-task network representation (Kondili et al., 1993). The latter depicts the equipment options (reactor type and units, and tanks) in the superstructure. The objective of the optimization problem is to maximize system profit over a 30-day schedule by making the following decisions:
• Which material storage tanks to install.
• How many shared reactors to install.
• Which processes to install for each material transformation step.
    o Which technologies (catalysts) to use in each of the selected processes.
    o Which reactor type to use in each of the selected processes.
        ▪ How many reactors to operate in each time period.
        ▪ How much to produce in each batch of material.
• How much material to purchase for A, B, and C in each time period.
The hierarchy of these decisions is indicated by the bullet indentation above. Thus, the technology and reactor type selections are second-level decisions, and the operating schedule and batch sizes are third-level decisions. For simplicity, changeover and setup times are not considered.

Model:
The model for this system consists of the following linear constraints. Resource balances are enforced around each resource at each timepoint with the global constraints in (21) and (22). The level of material in each tank is updated based on the material flowing in and out of the tank (material balance). The availability of each reactor is updated based on the reactor usage. A reactor unit is locked (unavailable) when it begins a processing task. Once the task duration has elapsed, the processing task ends and the reactor unit is released (becomes available). The task durations used are 5 days for tasks 1, 4, 5, and 6, 3 days for task 2, and 4 days for task 3. For greater detail on resource balances, the reader is referred to the review paper on the resource-task network by Perez et al. (2022).

The decision to install a resource (tank or reactor) is governed by the disjunctions in (23) and (24), where the decision is to determine how many units to install. In this example, the options are {0, 1, 2} units for each reactor type (at most 2 identical units can be installed for each reactor type) and {0, 1} units for each tank (at most 1 tank can be installed for each material). The installation cost is calculated as the sum of a fixed charge and a variable cost coefficient times the total resource capacity. If no units are installed, the installation cost and resource capacity drop to zero. Disjunctions (23) and (24) also set the initial conditions for the resource availability: if installed, tanks are full and all reactor units are available, respectively. Disjunction (23) also tracks the slack on the tank level at the final timepoint, which is penalized in the objective function to reduce the likelihood of depleting the inventory at the end of the scheduling horizon (see (47)). These constraints ensure that the schedule obtained is a feasible schedule for normal operation with monthly cycles. For startup operations, the optimal schedule can be obtained by fixing the design decisions and rerunning the model with the initial tank levels set to zero. The cardinality constraint (25) ensures that exactly one of the disjuncts is selected. The values for the cost coefficients are given in Table 4.2.1. Since the plant lifetime is greater than the scheduling horizon, resource installation cost coefficients have been scaled to the appropriate order of magnitude. Installation costs for pipelines between tanks and reactors are assumed to be negligible.

The multi-level disjunction in (26) represents the decision to install a process or not. When installed, the total batch size is equal to the flow entering the process at each time. There are two nested disjunctions if a process is installed. The first of these relates to which reactor type is assigned to the process. The second one pertains to which technology (catalyst) is used for that particular process. Once a reactor type is assigned, the per-unit batch size is bounded by the installed capacity of each unit, and the operating cost is proportional to the total batch size with a cost coefficient given in Table 4.2.2. The nested technology selection disjunction specifies the amount of material leaving the process when the batch is completed. This is governed by the yield, which is specific to the technology (given in Table 4.2.3).
There is then a third-level set of disjunctions inside the reactor type assignment disjunction, which determines the number of units that are used for a batch at each time. The number of units selected indicates the number of units that are locked at that time and is also used to determine the total batch size from the per-unit batch size. Note that for this system, it is assumed that if multiple units are used, their loads are equally distributed. Finally, when a process is not installed (its indicator Boolean is False), all pertinent variables are set to zero, and the reactor capacity is only bounded by the maximum allowed capacity. The cardinality rules in (27)-(29) are the linking constraints between the different levels of this multi-level disjunction.
An additional logic proposition must be included to ensure that if a process is triggered on a given reactor type at a given time with n units, that reactor type must have been installed with at least n units (there exists an installation option n′ ≥ n whose installation Boolean is True). For example, if a batch is triggered with one unit, then either the one-unit or the two-unit installation Boolean must be True (one or two units must have been installed when the plant was built). This condition is enforced with (30).
The variable bounds and domains are given in (31)-(33) and (35)-(46). The upper bounds on the resource capacities are 300 for each tank and 100 for each reactor. The initialization constraint in (34) is used to ensure that there is no flow leaving a reactor during the initial periods, up to the duration of the corresponding task, since it is assumed that all reactors are idle at the beginning of the scheduling horizon.
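The reactor availability balance described in the model can be sketched as a simple simulation. The following Python snippet is illustrative only: the function name and the schedule are hypothetical, and it assumes that a unit is unavailable from the timepoint at which its task starts until the task duration has elapsed. The task durations are the ones stated in the model description.

```python
# Task durations in days, as stated in the model description.
DURATION = {1: 5, 2: 3, 3: 4, 4: 5, 5: 5, 6: 5}


def reactor_availability(installed_units, starts, horizon):
    """Available units of one reactor type at timepoints 0..horizon.

    starts: list of (start_time, task, units) triples (hypothetical schedule).
    Units are locked at the start time and released once the task duration
    has elapsed (assumed convention).
    """
    available = [installed_units] * (horizon + 1)
    for t0, task, units in starts:
        for t in range(t0, min(t0 + DURATION[task], horizon + 1)):
            available[t] -= units
    if min(available) < 0:
        raise ValueError("schedule uses more units than were installed")
    return available


# Two installed units; task 2 runs on one unit from day 0,
# then task 3 runs on both units from day 3.
levels = reactor_availability(2, [(0, 2, 1), (3, 3, 2)], horizon=8)
```

The check on negative availability plays the role of the logic proposition that a batch cannot use more units than were installed.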
The objective of this optimization problem is to maximize profit, as given by (47), where the price of each material is −$1/kg for A, −$7/kg for B, −$8/kg for C, and $10/kg for D. The tank level slacks are penalized with a penalty coefficient equal to the absolute value of the material price.
The resulting model is the linear Nested GDP given in (21)-(47). This hierarchical model is reformulated into a mixed-integer linear program (MILP) using both Big-M (with both loose and tight M values) and Hull reformulations. The hierarchical GDP model is also transformed into its Equivalent Single-Level GDP and reformulated with both Big-M and Hull methods. The flattening via basic steps is not performed because the scheduling decisions are made up of one nested disjunction per time period in the third level of nesting, which makes the flattening procedure impractical due to the exponential growth in disjunctions.
The optimum solution yields a cumulative profit of $2,085. The process network and equipment network designs are given in … The optimal design requires the installation of Processes 1, 3-5; Tanks B and C; and both reactor types, each with two units available. Reactors of type 1 focus almost exclusively on Process 1 with Technology 2, with one batch of Process 3 (Technology 1). Reactors of type 2 are used for Processes 4 and 5, each using Technology 1. Procurement of A occurs every 5 days, with sales of D typically spaced out every 10 days. By the end of the scheduling horizon, both tank levels have been restored to their initial levels (full). The model sizes and computational statistics for each of the reformulated MILP models are given in Table 4.2.4. Each model is solved three times: 1) relaxing the integrality constraints (LP relaxation), 2) solving the MILP with presolve and heuristics disabled in CPLEX to compare the effects of the model formulation size and tightness on the computational performance, and 3) solving the MILP with presolve and heuristics enabled. As can be observed, all formulations, except the Hull reformulation of the Nested GDP, have a poor continuous relaxation with very large relaxation gaps. The Hull reformulated Nested GDP, on the other hand, has a tight relaxation with a 9% relative gap. When presolve and heuristics are disabled in CPLEX, neither the Big-M nor the Tight-M reformulations can solve the problem to optimality within the allotted time limit of 3,600 seconds. CPLEX is able to reduce the optimality gap more in the Tight-M models than in their Big-M counterparts, with a greater gap reduction when the Nested versions are used.
The Tight-M models result in solutions that are approximately 10% lower than the optimum. It is also observed that the Equivalent Single-Level models result in better feasible solutions despite the larger optimality gaps. The poor solution found with the Nested Big-M model is likely due in part to the fact that the nested constraints end up having two very large M values when reformulated, making them less tight than their Single-Level counterparts. Interestingly, when presolve and heuristics are enabled in CPLEX, it is able to solve the models with large M values to optimality, but not those with tighter M values.
The MILP model obtained by applying the Hull reformulation to the Nested GDP model outperforms the other models, finding the optimum in approximately half the time required by its Equivalent Single-Level counterpart, with fewer nodes explored and fewer cuts applied. This superior performance is likely due to the reduced model size and the LP relaxation tightness. The Hull reformulated Nested GDP results in a model that has fewer binary variables (25% and 4% fewer, before and after presolve, respectively), continuous variables (10% and 26% fewer, before and after presolve, respectively), and constraints (5% and 25% fewer, before and after presolve, respectively) than its Equivalent Single-Level counterpart.
Example 4.3 is based on the problem by van den Heever and Grossmann (1999), which consists of an integrated superstructure optimization problem with long-term operational and expansion planning. The problem has three potential processes (1, 2, and 3), each with its dedicated processing unit, and three materials (A, B, and C) as shown in Figure 4.3.1. Material C is the final product (price: $10,800/ton) and is produced from Material B in Process 1. Material B can be purchased externally (cost: $7,000/ton) or produced from Material A (cost: $1,800/ton) in either Process 2 or Process 3. It is assumed that each process includes any required separation steps, such that the respective exit streams are single-component streams containing the pure product of each process. The objective here is to minimize cost (maximize profit) by making the following decisions:
• Which processes should be used.
• Which processes to operate in each period.
• Which processes to undergo a capacity expansion in each period.
• How much new processing capacity to install in each period.
The hierarchical GDP model is given as follows. The material balance constraints at the two stream junction points are given in (48) and (49), written in terms of the flow (tons) in each stream in each period (periods are in years). The amounts of imported B and exported C are constrained by (50) and (51), respectively.
Additional logic constraints are given in (56)-(59). The cardinality clause in (56) allows installing at most 1 of Process 2 or Process 3. This is equivalent to the proposition ¬ 2 ∨ ¬ 3 used in the original paper (Process 2 is not installed or Process 3 is not installed), but generalizes to cases in which there are more than two potential processes in parallel. The implication in (57) ensures that Process 1 is installed if either Process 2 or Process 3 is installed. (58) and (59) enforce, respectively, that each process operates at least once if installed, and that at least one expansion event is scheduled between the beginning of the planning horizon (period 1) and each period in which the process is operated.
Γ(1, { 2 ,  3 })    (56)
The variable domains are given in (60)-(66), where    = 5  ∀ ∈ , 
There are some differences between this formulation and the one in the original paper by van den Heever and Grossmann (1999). The original formulation has the process capacity evolution constraint in the disjunct governed by  , . This requires specifying a new constraint,  , =  ,−1 , for the disjunct governed by ¬ , , which would also be required for the disjunct governed by ¬ , . This is avoided by moving the process capacity balance to the upper-level constraints in   . The same is true for the yield constraint, which we move from the  , disjunct to the   disjunct constraints. This requires that we only constrain the flow exiting the process in the secondary level disjunction, rather than both the entrance and exit flows. It is also more intuitive to specify the yield constraints when the processes are selected. Another major difference is that the original model does not use the cardinality constraints in (54) and (55). Instead, it uses the logic propositions (68) and (69). These propositions are contained in (54) and (55), but do not establish a proper hierarchical relationship since there is no link between ¬ , and   , and ¬ , and  , .
, ⇒   ∀ ∈ ,  ∈     (68)
 , ⇒  , ∀ ∈ ,  ∈     (69)
An important thing to note is that the model in Example 4.3 is an example of a type of hierarchical GDP that need not be hierarchical at all. This occurs when every disjunction has only two disjuncts, representing an on and an off state, where the off state has all relevant variables set to zero. When this occurs, (52) can actually be split into three sets of disjunctions without adding the "slack" disjunct observed in the Equivalent Single-Level GDP model. These three sets of disjunctions are given in (70)-(72). The cardinality constraints in (54)-(55) can be replaced by (68)-(69). The model composed of (48)-(51), (53), and (56)-(72) is referred to here as the Non-hierarchical formulation.
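The equivalence noted for the cardinality clause in (56) can be checked by enumeration. The sketch below is illustrative only: the helper names are hypothetical, and plain Python Booleans stand in for the indicator variables. It compares an "at most m" rule against propositional encodings for two and three Boolean variables.

```python
from itertools import combinations, product


def at_most(m, values):
    """Cardinality rule: at most m of the Boolean values are True."""
    return sum(values) <= m


def pairwise_exclusion(values):
    """Conjunction of (not a or not b) over every pair of values."""
    return all((not a) or (not b) for a, b in combinations(values, 2))


# Two variables: "at most 1" matches the single clause (not a) or (not b).
two_var_equivalent = all(
    at_most(1, v) == ((not v[0]) or (not v[1]))
    for v in product([False, True], repeat=2)
)

# Three variables: one clause per pair is needed to match "at most 1".
three_var_equivalent = all(
    at_most(1, v) == pairwise_exclusion(v)
    for v in product([False, True], repeat=3)
)

# A single clause over all three variables is not equivalent.
single_clause_equivalent = all(
    at_most(1, v) == any(not x for x in v)
    for v in product([False, True], repeat=3)
)

print(two_var_equivalent, three_var_equivalent, single_clause_equivalent)
```

For n parallel alternatives the pairwise encoding needs n(n-1)/2 clauses, whereas the cardinality rule stays a single statement, which is the generalization the text refers to.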
The Nested GDP model is compared against its Equivalent Single-Level formulation and the Non-hierarchical formulation by reformulating each of these into mixed-integer nonlinear programs (MINLPs) using the Hull reformulation, and solving them with BARON in three modes: 1) relaxing the integrality constraints (NLP relaxation), 2) solving the MINLP with local search and range reduction disabled in BARON, and 3) solving the MINLP with local search and range reduction enabled. Since the models are nonlinear, the perspective functions were reformulated using the ε-approximation from Furman et al. (2020), with ε = 10^-6, which is the default reformulation approach in the DisjunctiveProgramming code. The model statistics are given in Table 4.3.2.
The optimal expansion profile is given in Figure 4.3.2, where it can be seen that Process 2 is not installed, but Processes 1 and 3 are. The capacity of Process 1 increases to 1 ton/year by the third year, and that of Process 3 increases to 1.11 ton/year by the fourth year. The optimal system cost is -$95 thousand, meaning that the plant generates a profit.

Conclusions
Two main contributions are made in this paper to the generalized disjunctive programming (GDP) modeling framework. The first one is to add cardinality rules to the logic constraints to allow for constraints of the form "choose exactly m Boolean variables to be True" (or at least m, or at most m). For more than two Boolean variables, modeling these types of constraints via propositional logic (zeroth-order logic) is either cumbersome or not possible. Thus, introducing predicate logic (first-order logic) to express this new constraint form in GDP adds more expressiveness to logic-based models. The second contribution is to extend GDP for modeling hierarchical systems via nested disjunctions. Such an approach results in more intuitive models, but had not been formalized in the past, as classical GDP does not consider disjunction nesting. The notation and logic constraints for such structures are provided, along with theoretical proofs of the tightness of such models versus equivalent single-level GDP models. It is shown that mixed-integer programming reformulations of nested GDP models have tighter relaxations than the reformulations of their single-level counterparts in both the Hull reformulation and the Big-M reformulation when tight M values are used. However, when large M values are used, the reformulated nested models show worse performance due to the presence of multiple large M parameters in the nested constraints. Finding tight M values requires additional work, and can be done by applying interval arithmetic when the models are linear. However, for nonlinear models, a separate optimization model must be solved for each constraint to find the tightest M values. A discussion on using basic steps to flatten nested GDP models is also given, where flattening nested structures via basic steps improves performance in the Hull reformulations when few disjunctions are nested, but quickly becomes intractable as the number of nested disjunctions and disjuncts increases.
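As a minimal sketch of the interval-arithmetic idea for linear models, consider a disjunct constraint a·x ≤ b relaxed as a·x ≤ b + M(1 − y), with every variable confined to a finite box. The function name, the example coefficients, and the bounds below are assumptions made for illustration and do not come from the examples in this paper.

```python
def tight_big_m(coeffs, rhs, bounds):
    """Smallest valid M for a.x <= rhs + M*(1 - y) over a variable box.

    coeffs: coefficients a_i of the linear constraint.
    rhs:    right-hand side b.
    bounds: (lower, upper) bound pair for each variable x_i.
    """
    # Interval arithmetic: a.x is largest when each term takes its own
    # maximum, i.e. the upper bound for a_i >= 0 and the lower bound otherwise.
    max_lhs = sum(
        a * (ub if a >= 0 else lb) for a, (lb, ub) in zip(coeffs, bounds)
    )
    return max_lhs - rhs


# Hypothetical constraint 2*x1 - x2 <= 4 with 0 <= x1 <= 10 and 0 <= x2 <= 5.
print(tight_big_m([2, -1], 4, [(0, 10), (0, 5)]))
```

The bound is exact for a single linear constraint over a box because each term can be maximized independently; this separability is lost for nonlinear constraints, which is why those require a separate optimization problem per constraint.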
Three examples are presented to show the advantages of using nested structures. In the first example, the tightness of the continuous relaxations of nested linear models is compared geometrically with the relaxations of equivalent single-level models. The relaxations of models that preserve nested structures result in smaller feasible regions than their single-level counterparts, generally yielding significant computational savings. Example 4.2, a linear GDP, and Example 4.3, a nonlinear GDP, illustrate the computational advantages of nested GDP models for two classes of problems: the former integrates superstructure design, technology selection, and operations scheduling, while the latter integrates superstructure design, long-term operations planning, and capacity expansion planning. It is also shown that for systems with bi-disjunct constraints (disjunctions with only two disjuncts), where one disjunct represents an off state with all pertinent variables set to zero (e.g., zero flow), there is no advantage to modeling such systems as hierarchical, even when there may be several levels of decisions. Such systems can be modelled more simply with single-level disjunctions and the necessary linking constraints.
Future work includes investigating how explicit hierarchical structures can be exploited for informed model decomposition methods and branching strategies. Exploring applications of hierarchical GDP to other fields, such as decision trees and stochastic optimization with event constraints, is another potential area for development.
Figure 2.1. Sample GDP graphical representation for GDP-example model.

Figure 4.1. Projections of the continuous relaxations of Example 4.1 for different reformulations (Big-M = Big-M Reformulation, Tight-M = Big-M Reformulation with tight M values, Hull = Hull Reformulation; basic step = Flatten GDP via basic steps, equivalent = Equivalent Single-Level GDP, nested = 2-Level Nested GDP). Projection areas relative to Big-M are indicated in %.

Figure 4.2.3. Optimal process network design (edge thickness is proportional to the maximum flow on that line).

Figure 4.2.6. Plant operations schedule (text in each bar, i-j, indicates process number i and technology number j for that event).
∈   ,  ∈   . In the GDP literature, this constraint has been traditionally written as   ⇔ ∨ ∈    ∀ ∈ ,  ∈   ,  ∈   . However, such a logic proposition is incomplete because it would allow the following to occur:   =  and   =  for more than 1 index  ∈   (i.e., False ⇔ (True ∨ True) is valid because the exclusive OR makes the right-hand side

Table 4.1. Model sizes and projection areas for Example 4.1
Consider the superstructure optimization problem for a plant that is to produce and sell material D (see Figure 4.2.1). Material D can be produced from material C (reaction: C → D), which can be purchased from a third party or produced from material B (reaction: B → C), which can in turn be purchased or produced from material A (reaction: A → B). The plant has two types of multipurpose reactors, each with a backup unit, that can be used for the material transformation steps (see Figure 4.2.

Table 4.2.1. Fixed and variable cost coefficients for the installation cost of each resource.

The objective is to minimize the system cost, as given in (67), where the stream costs,   , are given in Table 4.3.1. The model for Example 4.3 is thus given by (48)-(67).

Table 4.3.2 shows that the Nested formulation has a CPU time that is one order of magnitude smaller than that of the Equivalent Single-Level formulation when range reduction and local search are disabled. When range reduction and local search are switched on, the Nested formulation is faster than the Equivalent Single-Level formulation by only a factor of 2.5. The continuous relaxations for the Nested and Non-hierarchical formulations are equal (29% gap) and tighter than that of the Equivalent Single-Level formulation (69% gap). The performance of the Nested formulation is comparable to that of the Non-hierarchical one, with the latter having fewer continuous variables and constraints, and taking less time to solve when local search and range reduction are enabled. It is clear that for this model structure, a hierarchical formulation is not required.
Table 4.3.2. Model sizes and computational results of the MINLP models resulting from the Hull reformulations of the Equivalent Single-Level, Nested, and Non-hierarchical GDP models.
a No local search; no range reduction