# Global optimization of general constrained grey-box models: new method and its application to constrained PDEs for pressure swing adsorption


## Abstract

This paper introduces a novel methodology for the global optimization of general constrained grey-box problems. A grey-box problem may contain a combination of black-box constraints and constraints with a known functional form. The novel features of this work include (i) the selection of initial samples through a subset selection optimization problem from a large number of faster low-fidelity model samples (when a low-fidelity model is available), (ii) the exploration of a diverse set of interpolating and non-interpolating functional forms for representing the objective function and each of the constraints, (iii) the global optimization of the parameter estimation of surrogate functions and the global optimization of the constrained grey-box formulation, and (iv) the updating of variable bounds based on a clustering technique. The performance of the algorithm is presented for a set of case studies representing an expensive non-linear algebraic partial differential equation simulation of a pressure swing adsorption system for \(\hbox {CO}_{2}\). We address three significant sources of variability and their effects on the consistency and reliability of the algorithm: (i) the initial sampling variability, (ii) the type of surrogate function, and (iii) global versus local optimization of the surrogate function parameter estimation and overall surrogate constrained grey-box problem. It is shown that globally optimizing the parameters in the parameter estimation model, and globally optimizing the constrained grey-box formulation has a significant impact on the performance. The effect of sampling variability is mitigated by a two-stage sampling approach which exploits information from reduced-order models. Finally, the proposed global optimization approach is compared to existing constrained derivative-free optimization algorithms.

## Keywords

Derivative-free optimization · Kriging · Quadratic · Constrained optimization · Sampling reduction · Global optimization

## 1 Introduction

A promising approach for optimizing grey-box problems is the development of surrogate approximation models for the explicitly unknown equations of the system, which aim to guide the search towards the true optimum of the original model [18, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36]. Surrogate models serve as analytical approximations to the underlying unknown equations, which allow for the use of derivative-based optimization. However, it has been found that the most efficient surrogate functional forms are multimodal non-convex functions; thus global optimization is necessary in order to solve grey-box optimization problems [37, 38]. Existing literature in grey-box modeling and optimization predominantly employs local optimization methods for the optimization of the formed surrogate formulations. Moreover, existing methods typically treat this class of problems as pure “black-boxes”, assuming that no information is available in analytical form, which often forces valuable information to be ignored or limits their applicability. Finally, despite the great interest that these methods have attracted in the literature, the majority of the existing methodologies have been developed for unconstrained, box-constrained, or known closed-form constrained problems [19, 21]. Handling of grey-box constraints is still an open question, and there is a scarcity of global optimization approaches for multidimensional general constrained grey-box problems. In this work, constrained grey-box optimization is treated as a compilation of deterministic global optimization subproblems stemming from sampling and design of experiments, parameter estimation, and global optimization of surrogate formulations. In particular, this work addresses the following questions:

- (a) How does the variability in the initial sampling set affect the consistency of the grey-box algorithm, and how can this be mitigated?
- (b) What is the importance of using deterministic global optimization to solve (i) the surrogate function parameter estimation problems and (ii) the overall constrained grey-box problem?
- (c) What is the effect of the surrogate function selection on the performance of the proposed approach?

## 2 Problem formulation and brief literature review

Grey-box optimization belongs to the category of Derivative-Free Optimization (DFO), a broad classification of methodologies that do not use derivative information of the original model [19]. DFO methods can be divided into subcategories based on whether they use surrogate models (model-based or direct-search) and on whether the search for samples is performed within the entire input space (global-search) or within a local subregion (local-search). The majority of the existing grey-box optimization methods have been developed for box-constrained problems, limiting their applicability to real-life applications. Extensions to constrained cases have been performed through penalty-type aggregated methods [41, 42, 43, 44], filters [45], complex statistical criteria [45, 46, 47], aggregated constraint satisfaction functions [25], or surrogate models [26, 31, 32, 48]. A comprehensive analysis and comparison of the existing box-constrained DFO methods available in the literature can be found in two recent reviews, Rios and Sahinidis [21] and Kolda et al. [22].

The *n* continuous independent inputs *x* have known finite bounds \(\left[ {x_i^L ,x_i^U } \right] \). The form of the objective function *f*(*x*) and the constraints in set \(M\), \(g_m (x)\), are not available explicitly.

Significant algorithmic developments were originally based on direct-search derivative-free concepts for box-constrained optimization [19, 49, 50, 51, 52], where the search is driven only by function evaluations. Local direct-search DFO methods have been extended to constrained problems using various techniques such as filter approaches and penalty or barrier methods [26, 42, 43, 45, 53]. However, these methods suffer from a high dependence on the initial point and entrapment in the local optimum closest to it, and they generally require a large number of samples, which may be prohibitive [19, 50]. Later, the idea of using fitted functions (surrogate models) based on the input–output data was found to expedite the search towards optimal solutions. Different methodologies have been proposed which use a surrogate model to approximate *f*(*x*) (i.e., kriging, quadratic, linear, or radial-basis functions), along with an iterative sampling criterion which leads to optimal solutions using fewer samples, either within a local trust region (local model-based methods) [19, 54, 55, 56, 57, 58, 59, 60, 61] or within the entire region bounded by the upper and lower bounds of the input variables (global model-based methods) [23, 24, 30, 33, 34, 35, 62].

In this work, we are interested in global-search methods, which consider the entire investigated space and do not depend on a single initial point. In all existing global-search surrogate-based optimization methods, asymptotic convergence is based on the theorem of Torn and Zilinskas [63], which states that an algorithm may converge to a global optimum of a continuous function within a compact set if “its sequence of iterates is everywhere dense” within the compact set [27, 28, 30, 60]. The key in these types of algorithms is the derivation of search criteria which reach good solutions faster by identifying promising regions, while retaining a balance between local and global search in order to avoid getting trapped in suboptimal regions. Even though global-search model-based methods have gained great popularity for optimizing expensive models, there is no guarantee that the final solution is even a stationary point [29, 30]. Extensions of global-search methods have been proposed using probabilistic criteria, extreme barrier methods, aggregated constraint satisfaction surrogates, and treatment of each constraint using individual surrogate functions [3, 31, 32, 46, 47, 48]. However, these approaches require the formulation of optimization problems with multimodal functions which are difficult to solve to global optimality. Extensive sampling methods such as Monte Carlo have been proposed to identify the optimal point of the new constrained statistical criteria, which introduces uncertainty into the proposed approaches [46]. The applicability and efficiency of introducing a more complex non-convex search criterion become questionable as the dimensionality and the number of constraints of the problem increase. The performance of all of the above developments has been tested on relatively small test problems, with a few recent exceptions such as the work of Regis et al. [31, 64], who locally optimize radial-basis-function formulations representing high-dimensional problems. The value of global optimization for black-box optimization was recognized by Jones et al. [30], who proposed a branch-and-bound algorithm to globally optimize the expected improvement function, which was tested on low-dimensional problems.

The work proposed here belongs to the global-search, model-based category, a combination with many advantages. Firstly, the use of smooth surrogate models allows interpolation between existing samples, which has been shown to reduce the overall sampling requirements. At the same time, the development of surrogate approximations enables the use of deterministic global optimization methods to optimize these non-convex surrogate functions. In fact, the use of advances from the deterministic global optimization literature in general constrained grey-box optimization is one of the least explored and least discussed topics. In the work of Regis et al., it is claimed that asymptotic convergence is assured even if the formed surrogate problem is not globally optimized [32, 60]. This is achieved by imposing rules which ensure that any new sample has a minimum distance from all existing samples. However, it is not discussed whether this affects the speed of convergence. This is a valid concern, since the effort invested in sophisticated sampling methods and in the development of complicated functional forms may be impaired if the globally optimal parameters are not used, or if a suboptimal solution of the constrained grey-box approximation model is used as the next promising sample.

Secondly, searching the entire investigated region is important because it reduces the high dependence on a single initial point. However, even if a set of initial samples is collected using space-filling techniques in order to build the initial grey-box models, the initial sampling set may still have a significant effect on the final solution. For this reason, most derivative-free optimization studies perform a number of tests starting from different initial sampling sets, in order to report the average and variance of the final solution. In a realistic application with significant computational requirements, it would be very important to mitigate the effect of the initial sampling, in order to guarantee consistent and reliable performance. Through this work we aim to investigate whether advances in global optimization can lead to improved algorithms with high consistency and reliability.

## 3 Motivating example

Benchmarking of constrained grey-box methods is extremely difficult due to the diverse nature of the applications, which may differ in computational cost, noise in their output values, dimensionality, number of constraints, and form of the feasible region. In the majority of the literature cited thus far, proposed methods are tested on relatively small test problems composed of smooth continuous functions. In a realistic case, where the input–output data depend on a complex simulation, the data may be extremely noisy, while the true underlying function or feasible region can be discontinuous. Another important aspect which can be underestimated when solving benchmark problems is the number of function calls. In most realistic applications, the affordable number of calls to the expensive simulation shifts the emphasis towards attaining better solutions with fewer samples, as opposed to asymptotic guarantees of convergence to a global optimum. For this reason, this work has been motivated by a realistic case study, which is a large Non-Linear Algebraic and Partial Differential Equations (NAPDE) system. NAPDE systems have applicability in all fields of engineering for the representation of complex geometries, multiphase flows, and reactions.

Specifically, this work has been developed for the optimization of an adsorption-based process for post-combustion \(\hbox {CO}_{2}\) capture from power plant flue gas, which is considered to be a binary mixture of \(\hbox {CO}_{2}\) (14 %) and \(\hbox {N}_{2}\) (86 %). The details of the full simulation model are provided in the “Appendix”, while the representation of the model as a constrained grey-box problem is provided in (P2). A cyclic PSA process with four steps (pressurization, adsorption, blowdown and evacuation) [14] is described by a complex system of NAPDE equations. The main equations of the simulated model are given in Table 11 of Appendix 1, while the notation, initial and boundary conditions used are provided in “Appendix 2”.

The NAPDE model allows for the detailed simulation of different operating conditions and designs (inputs) to obtain the process performance (outputs) at cyclic steady state. More specifically, the PSA process has seven significant input variables: column length (*L*), blowdown pressure (\(P_{bd}\)), evacuation pressure (\(P_{evac}\)), adsorption time (\(t_{ads}\)), blowdown time (\(t_{bd}\)), evacuation time (\(t_{evac}\)), and compression pressure (\(P_H\)). These input variables affect three outputs of interest: the total annual cost, and the purity and recovery of the outlet \(\hbox {CO}_{2}\) stream. Both the purity and recovery must be larger than a minimum specified value for the process to be feasible; for a typical \(\hbox {CO}_{2}\) capture process this is set to 90 %. Therefore, when designing a cost-effective PSA process, the objective is to minimize the total annual cost (usually per ton of \(\hbox {CO}_{2}\)), and the design constraints are purity \(\ge 0.90\) and recovery \(\ge 0.90\). There is an additional operating constraint to ensure that the blowdown pressure is always higher than the evacuation pressure. All of the design and operating constraints of the problem are given in Table 12 of Appendix 1.
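The design and operating constraints above can be collected into a simple feasibility check. The sketch below is a minimal illustration only: the input ordering follows the text, but the numeric example values and the function name are hypothetical stand-ins for outputs that the expensive NAPDE simulation would produce.

```python
# Hedged sketch: encoding the PSA design and operating constraints as a
# feasibility check. The purity/recovery values here are illustrative
# made-up numbers, not simulation results.

def is_feasible(x, purity, recovery, min_spec=0.90):
    """Check the PSA design constraints (purity, recovery) and the known
    operating constraint P_bd > P_evac for input vector
    x = (L, P_bd, P_evac, t_ads, t_bd, t_evac, P_H)."""
    L, P_bd, P_evac, t_ads, t_bd, t_evac, P_H = x
    return (purity >= min_spec          # purity >= 0.90
            and recovery >= min_spec    # recovery >= 0.90
            and P_bd > P_evac)          # blowdown above evacuation pressure

# Example with illustrative outputs:
x = (1.0, 0.5, 0.1, 50.0, 30.0, 40.0, 3.0)
print(is_feasible(x, purity=0.93, recovery=0.91))  # True
```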

where *Pu* is the purity (Eq. A10), *Re* is the recovery (Eq. A9), and **x** is the vector representing the seven input variables to the PSA process, with components \(x_{i}\,(i = 1,{\ldots }, 7)\) where \(x_{1}=L\), \(x_{2}=P_{bd}\), \(x_{3}=P_{evac}\), \(x_{4}=t_{ads}\), \(x_{5}=t_{bd}\), \(x_{6}=t_{evac}\), \(x_{7}=P_{H}\). As a result, the large NAPDE system shown in Tables 11–12 is summarized through (P2), with one grey-box objective, two grey-box constraints (\(M=2\)), one known constraint (\(K=1\)), and box constraints for variables \(x_{1}-x_{7}\).

## 4 General constrained grey-box global optimization algorithm

Firstly, it is important to formulate the problem in a form equivalent to formulation P1 (equivalently P2 for the motivating example), by identifying the objective, the constraints, and the significant input variables which affect them. Subsequently, a set of samples is required to build the input–output mappings for each of the unknown correlations. The selection of this initial set may be performed using different methods, such as an optimized Latin Hypercube Design (LHD) (Sampling Strategy 1), or through more sophisticated sorting and selection methods which require the availability of a larger prior database (Sampling Strategies 2 and 3).

Based on the obtained samples for which the full simulation is performed, the method requires the selection of the type of surrogate function used to represent each of the constraints and the objective function, in order to solve \(M+1\) parameter estimation problems (i.e., one for the surrogate of the objective function, and M for the surrogates of the M grey-box constraints). A series of different functional forms can be tested and selected, namely general quadratic, kriging (exponential), radial-basis functions, and signomial functions. The identification of the optimal parameters of each fitted function, by minimization of the sum of least-squares differences between the predictions and the observations, often constitutes a challenging optimization problem, which we aim to solve to global optimality.

Once all of the above have been completed, the formulated constrained surrogate optimization problem, including the surrogate objective \(\widetilde{f}(x)\), the surrogate constraints \(\widetilde{g}_m(x)\), the original known constraints \(g_k (x)\), and the variable bounds, must be solved to global optimality using deterministic global optimization methods [65, 66, 67]. The constrained surrogate formulation does not depend on the expensive simulation, so it can be solved rigorously with such methods. Since the formulation is expected to have multiple local solutions, a set of diverse local solutions along with the final global optimum are selected as promising future samples. Next, the full simulation is performed at the set of unique local and global solutions in order to obtain their true objective and constraint values. The new samples are incorporated within the sampling set, and the parameter estimation problems and the optimization of the overall constrained grey-box problem are solved again in order to update the model parameters and repeat the aforementioned steps. The procedure is repeated iteratively until certain convergence criteria are met.
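As a toy illustration of this iterative loop, the sketch below alternates surrogate parameter estimation and surrogate optimization on a one-dimensional stand-in for the expensive simulation. The quadratic surrogate and its closed-form minimizer are deliberate simplifications of the deterministic global optimization and constraint handling used in the actual algorithm; `expensive_simulation` is a hypothetical placeholder.

```python
import numpy as np

# Minimal sketch of the iterative surrogate loop on a toy 1-D "grey-box"
# objective. The real method also fits a surrogate per constraint and uses
# deterministic global optimization; here a convex quadratic surrogate is
# minimized in closed form for illustration only.

def expensive_simulation(x):          # stand-in for the NAPDE model
    return (x - 2.0) ** 2 + 1.0

lb, ub = 0.0, 5.0
X = list(np.linspace(lb, ub, 5))      # initial space-filling samples
Y = [expensive_simulation(x) for x in X]

for it in range(5):
    a, b, c = np.polyfit(X, Y, 2)     # surrogate parameter estimation
    x_new = np.clip(-b / (2 * a), lb, ub)  # minimizer of the surrogate
    X.append(float(x_new))            # "run the full simulation" there
    Y.append(expensive_simulation(x_new))

best = X[int(np.argmin(Y))]
print(round(best, 3))
```

Because the toy objective is itself quadratic, the surrogate recovers it exactly and the loop converges immediately to the true optimum at x = 2; on a real multimodal problem the surrogate is refit and re-optimized at every iteration as described above.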

During each iteration, a clustering procedure assigns each newly obtained sample to a cluster based on its **x**-space location, objective function value, and feasibility relative to the existing samples. The collected points often form clusters, since they are a diverse set of local solutions. The developed clustering technique identifies the total number of clusters, the average objective function value and average feasibility of each cluster, and the **x**-space bounds of the samples contained within each cluster. Once the algorithm converges, the formed clusters are analyzed to provide valuable information about regions of the feasible space which contain promising solutions. This information is used to update the bounds of the search space so that the clusters containing the best solutions are retained. Details about the clustering technique, as well as the analysis and bound updating, are described in Sect. 4.4.

Finally, the algorithm proceeds to a second stage during which the entire procedure is repeated within the space defined by the tighter bounds, until the same convergence criteria are met. During this second stage, new samples are collected and new surrogate models are developed for the constraints and the objective function. These models are guaranteed to be more detailed representations of the locally defined search space, while the excluded regions are no longer part of the model.
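The bound-updating idea can be sketched as follows. This is a simplified illustration, not the procedure of Sect. 4.4: the greedy distance-threshold clustering and the ranking of clusters by average objective alone (ignoring feasibility) are assumptions made for brevity.

```python
import numpy as np

# Hedged sketch of bound updating: group collected solutions into clusters
# by proximity, then tighten the variable bounds to the bounding box of the
# cluster with the best average objective value.

def greedy_clusters(points, tol=0.5):
    """Assign each point to the first cluster whose centroid is within tol."""
    clusters = []                          # list of lists of indices
    for i, p in enumerate(points):
        for c in clusters:
            centroid = points[c].mean(axis=0)
            if np.linalg.norm(p - centroid) <= tol:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def updated_bounds(points, objectives, tol=0.5):
    clusters = greedy_clusters(points, tol)
    best = min(clusters, key=lambda c: objectives[c].mean())
    box = points[best]                     # samples of the best cluster
    return box.min(axis=0), box.max(axis=0)

pts = np.array([[0.1, 0.1], [0.2, 0.15], [3.0, 3.0], [3.1, 2.9]])
obj = np.array([5.0, 4.8, 1.0, 1.2])
lo, hi = updated_bounds(pts, obj)
print(lo, hi)  # bounding box of the low-objective cluster near (3, 3)
```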

### 4.1 Sampling methods and sampling reduction

Selecting the samples for which the full simulation is performed, in order to build and update the surrogate function parameters, is a critical issue. Unbalanced designs may lead to low prediction accuracy in unexplored regions, as well as numerical issues during parameter estimation due to singularity of the correlation matrices. At the same time, it is desirable to keep the sampling requirements to a minimum, since the majority of applications of interest rely on simulations with a high computational cost. For the initial sampling designs, Latin Hypercube Designs are typically used [68], since they have good space-filling properties with fewer samples than full-factorial designs. The popular ‘10 times the number of input variables’ rule of thumb has been used as a baseline in most of the methods developed in the literature, and it is denoted as Sampling Strategy 1 (SS1) in this work. However, there is a significant amount of variability associated with the initial Latin Hypercube Design, observed and reported by other authors [59]. In other words, a slightly different set of 10*n* samples may lead to a significantly different final result. The variability introduced by the initial sampling depends on the problem itself as well as on the total number of samples collected. If one can afford a large number of samples, the effect on the final result is reduced; however, computational expense often limits the total number of collected samples. For example, in the presence of multiple constraints which significantly limit the feasible space, a simple LHD may not contain any feasible samples at all, adding more uncertainty to the end result. In addition, the complexity and smoothness of the underlying functions can increase the variability of the overall performance.
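Sampling Strategy 1 can be sketched directly with an off-the-shelf Latin Hypercube generator. The paper does not prescribe a particular implementation; SciPy's `qmc` module is used here as one option, and the variable bounds are placeholders.

```python
import numpy as np
from scipy.stats import qmc

# Sketch of SS1: an LHD with the '10 times the number of inputs' rule of
# thumb, scaled to the variable bounds. Bounds below are placeholders, not
# the PSA bounds of Table 12.

n = 7                                    # PSA problem: seven input variables
sampler = qmc.LatinHypercube(d=n, seed=0)
unit_samples = sampler.random(10 * n)    # 10n samples in [0, 1]^n

lb = np.zeros(n)                         # placeholder lower bounds
ub = np.full(n, 2.0)                     # placeholder upper bounds
X0 = qmc.scale(unit_samples, lb, ub)     # map to [lb, ub]

print(X0.shape)  # (70, 7)
```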

In grey-box optimization problems, where an expensive simulation is available but cannot be directly used for optimization, there are often multiple ways to reduce the complexity of the full simulation in order to obtain a faster model, which serves as another type of surrogate of the original [69, 70, 71, 72]. When this information is available, a possibility arises for collecting a large pool of samples which may reveal trends about the objective and feasible region of the full simulation. For example, in the motivating example, the discretization of the finite-element model is coarsened significantly and the number of cycles is reduced to a small number, resulting in a reduced-order simulation which is solved in 2–3 s as opposed to 15 min. Once the total set of reduced-order samples is collected, the most challenging aspect is the selection of an optimal subset of this large pool for which the full simulation is performed, for subsequent use in surrogate function parameter estimation.

Returning to the motivating example, the objective would be to select a subset \(S\,(S\subset N)\) of samples out of a superset of *N* reduced-order samples, where \(card(N)=N_{large}\), \(card(S)=10n\) and \(N_{large} \gg card(S)\). The reduced-order samples have approximate values for purity, recovery and cost provided by the short simulation described earlier, which we can afford to run in abundance. Sampling Strategy 2 (SS2) refers to a naive ranking of the fast-simulation samples based on purity and recovery, selecting the first \(card(S)\) samples which satisfy the constraints or have the minimum constraint violations. However, this method does not guarantee that the selected samples are balanced within the investigated space, as opposed to being clustered in several regions, leading to surrogate functions which may be highly inaccurate in insufficiently sampled regions. In other words, it is desired to select a subset of samples with promising feasibility and objective function values, which also has optimal space-filling properties in the *x*-space. These criteria may be conflicting in the selection of the sampling subset.

The above observations led to Sampling Strategy 3 (SS3), which employs Mixed Integer Linear Programming (MILP) to perform the optimal selection. Inspired by recent work of Li and Floudas [73], it is realized that sample selection in grey-box optimization is equivalent to scenario reduction in stochastic programming problems. In other words, the reduced-order samples of the grey-box model are treated as different scenarios of an uncertain problem. The full set of reduced-order simulations leads to a probability density function, which we want to accurately represent using a subset of the obtained samples. This subset constitutes the set of points for which the full simulation is performed.
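To make the subset-selection idea concrete without an MILP solver, the sketch below uses a greedy maximin heuristic. This is an assumption-laden simplification: the actual method of Li and Floudas [73] solves an MILP minimizing a Kantorovich-type probabilistic distance with output-domain terms, whereas the greedy rule below only approximates the space-filling part of that objective.

```python
import numpy as np

# Hedged sketch: greedy maximin (farthest-point) selection of a subset of
# the reduced-order sample pool, illustrating the space-filling goal of SS3.

def greedy_maximin_subset(X, k):
    """Pick k rows of X that greedily maximize the minimum distance
    from each new point to the already-selected subset."""
    chosen = [0]                               # start from an arbitrary sample
    dists = np.linalg.norm(X - X[0], axis=1)   # distance of every point to set
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))            # farthest from current subset
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return chosen

rng = np.random.default_rng(0)
pool = rng.random((5000, 7))                   # reduced-order sample pool
subset = greedy_maximin_subset(pool, 70)       # select 10n = 70 full runs
print(len(subset))  # 70
```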

The selection problem minimizes a probabilistic distance in the input (*x*) distribution domain, while simultaneously incorporating objectives in the output domain such as expected, worst, and best performance. The MILP formulation used for sampling selection is given in Problem 3:

where *K* is the number of input variables, *N* is the set of original scenarios, *card(S)* is the desired number of scenarios to be selected, \(p_{i}\) is the probability of each scenario, \(y_{i}\) is the binary variable denoting whether a scenario *i* is removed (\(y_{i}=1\)), \(d_{i}\) is the minimum Kantorovich distance from the selected scenarios to removed scenario *i*, \(\nu _{i,i^{\prime }}\) are continuous variables denoting whether scenario *i* is removed and assigned to scenario \(i^{\prime }\), \(f_{i,j}\) corresponds to the value of output *j* for each scenario *i*, and \(w_{j}\) are optional weights assigned to each output depending on its importance towards the final selection. Further details about formulation P3 are provided in [73].

The objective of the MILP model may be the minimization of the *x*-space probabilistic distance alone, in which case the second term of the objective in P3 is not present and the result is the optimal space-filling design. However, the objective may also be a weighted sum of the *x*-space distance and the objective and constraint function values, which aims to balance space-filling properties against promising samples. Formulating this selection procedure as an MILP problem has the advantage that the objective can be tuned based on the final goal in order to choose a different subset of samples. For instance, if feasibility is the most important aspect, a large weight may be placed on the contribution of the predicted constraint function values; alternatively, additional constraints may be added to the MILP problem to ensure that only feasible points are selected.

An example of sample selection is provided in Fig. 3 for the PSA case study. In this figure the light green points represent the locations of the initial 5000 samples obtained from the fast simulations in both the *x*- and the *y*-space. The 70 selected points are highlighted in red. The selected points clearly have a wide distribution in the *x*-space, while the majority are feasible in terms of recovery and purity and tend to have lower objective function values. Moreover, it is observed that the search space is reduced in only one input variable, the evacuation pressure (\(P_{evac}\)). This is a result of the known closed-form constraint of P2 (\(P_{bd} \ge P_{evac}\)), which is incorporated in the MILP formulation through prior selection of a candidate set which satisfies this constraint. Specifically, when known constraints are only a function of the input variables, the superset of initial samples can be filtered prior to performing the subset selection, simply by removing any samples which do not satisfy the known constraints. Subsequently, subset selection is performed on the filtered data set. At this point, care should be taken that the initial large design contains a sufficient number of feasible samples.
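The pre-filtering step described above is a one-line mask over the candidate pool. In this sketch the candidate samples are random placeholders, and columns 1 and 2 are assumed to hold \(P_{bd}\) and \(P_{evac}\) following the variable ordering of Sect. 3.

```python
import numpy as np

# Sketch of pre-filtering by a known input-only constraint: drop every
# candidate with P_bd < P_evac before subset selection. The pool below is
# a random placeholder for the reduced-order sample superset.

rng = np.random.default_rng(1)
pool = rng.random((5000, 7))            # candidate reduced-order samples
mask = pool[:, 1] >= pool[:, 2]         # known constraint: P_bd >= P_evac
filtered = pool[mask]                   # subset selection runs on this set

print(pool.shape, filtered.shape)
```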

### 4.2 Selection of functional form for the surrogate model

The proposed framework for general constrained grey-box models is not restricted to a single type of functional form for estimating the underlying unknown objective function and constraints. Several types of functional expressions have been tested, keeping in mind that the complexity and non-linearity of the form affect the solution times and solution quality of the subsequent global optimization problems. The functional forms which have been tested include linear, general quadratic, kriging, and signomial functions. Moreover, since each of the constraints and the objective function is approximated by a separate surrogate model, there is no restriction to using a single type of functional form for all \(M+1\) models; a combination of surrogate model types may be used to form the final surrogate optimization problem (P1 and P2). This selection may be made based on prior knowledge or cross-validation techniques.

A key feature of this work is the introduction of deterministic global optimization for the parameter estimation of each selected surrogate function. The objective of this optimization model is the identification of the mapping \(y=f(x)\) which connects a measured output *y* to a set of input variables of dimensionality *n*. In certain cases (i.e., linear, general quadratic), the parameter estimation may have an analytical solution or be a well-posed convex non-linear problem, under the assumption that the samples have good space-filling geometric properties and the training data is adequately scaled. In other cases, however, parameter estimation is a challenging global optimization problem, which has been overlooked in the literature so far. Especially for interpolating techniques such as kriging, where the model predictions equal the observed outputs at all sampled locations, there are usually multiple combinations of parameters for which this condition is satisfied, which means that multiple local/global solutions to the parameter estimation problem exist. In this case, validation techniques are used along with training in order to identify a solution which best describes the data without overfitting it.
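The general quadratic case mentioned above is the easy end of this spectrum: the surrogate is linear in its parameters, so the least-squares fit has an analytical solution. The sketch below illustrates this on a synthetic two-input mapping (the data and coefficients are made up for the example).

```python
import numpy as np

# Sketch of surrogate parameter estimation for the general quadratic case,
# where fitting reduces to linear least squares with a closed-form solution
# (in contrast to kriging, which requires global optimization).

def quadratic_design_matrix(X):
    """Columns: 1, x_i, and x_i*x_j (i <= j) for a full quadratic surrogate."""
    n = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] for i in range(n)]
    cols += [X[:, i] * X[:, j] for i in range(n) for j in range(i, n)]
    return np.column_stack(cols)

rng = np.random.default_rng(2)
X = rng.random((40, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 1]  # synthetic truth
beta, *_ = np.linalg.lstsq(quadratic_design_matrix(X), y, rcond=None)
print(np.round(beta, 3))  # recovers [1, 2, -1, 0, 0.5, 0]
```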

#### 4.2.1 Surrogate functions type 1: kriging functions

One of the most commonly used functional forms in surrogate-based optimization is kriging [74, 75], an interpolating technique with several advantages but also challenges. Global optimization of the kriging parameters is a demanding task: since kriging is by nature a purely interpolating technique, cross-validation procedures are usually employed in order to calculate a prediction error for the model. In addition, the functional form of kriging introduces non-linear and non-convex terms, and the estimation of the optimal kriging parameters requires matrix algebra for the inversion of the covariance matrix of the observations. Formulating the kriging parameter estimation as a global optimization problem leads to relatively large problems, whose numbers of variables and equations depend strongly on the number of training observations. As the dimensionality of the problem increases, it is advisable to collect a larger set of training samples in order to develop an accurate response surface; as more samples are collected, the size of the problem grows and globally optimizing the parameters becomes very challenging. By employing a simultaneous parameter estimation and validation technique, and careful tightening of the model parameter bounds, we are able to address this global optimization problem.

The sample set is partitioned into a training set *SMB* and a validation set *SCV*. This approach is used since the set *SMB* will be interpolated by the developed kriging function, and it is desirable to produce a function which interpolates the best observations. On the other hand, it is not desirable for the training set to be clustered solely around a promising region, since the predictive ability of the model will be very poor in regions outside this space. The remaining validation points of *SCV* are not purely interpolated; instead, they are used to calculate the objective function of the parameter estimation model, which is the sum of squared errors between the observed \(y_{SCV}\) and the values predicted by the kriging model. Based on ideas described in the previous section, OSCAR [73] is used for the selection of the *SMB* and *SCV* sets, using 80 % of the samples as interpolation points.
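The 80/20 split can be sketched as follows; a greedy maximin rule is used here as a simple, hypothetical stand-in for the MILP-based OSCAR selection (the function name and its arguments are illustrative):

```python
import numpy as np

def greedy_maximin_split(X, frac=0.8, seed=0):
    """Split samples into an interpolation set (SMB) and a validation
    set (SCV). Greedy maximin is a heuristic stand-in for OSCAR."""
    rng = np.random.default_rng(seed)
    n = len(X)
    k = int(round(frac * n))
    chosen = [int(rng.integers(n))]          # seed with a random point
    while len(chosen) < k:
        # distance of every point to its nearest already-chosen point
        d = np.min(np.linalg.norm(X[:, None] - X[chosen], axis=-1), axis=1)
        d[chosen] = -1.0                     # exclude selected points
        chosen.append(int(np.argmax(d)))     # add the farthest point
    smb = np.array(sorted(chosen))
    scv = np.setdiff1d(np.arange(n), smb)
    return smb, scv
```

The maximin rule keeps the interpolation set space-filling, which mirrors the stated goal of not clustering the training set around a single promising region.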

Intuitively, the correlation between two output observations is stronger when their distance in the **X** domain is smaller. This is exactly what the basis function of Eq. (1) aims to capture. The rate at which the change in distance between two points influences the change in the output *y* in each dimension *j* is a function of the fitted parameters \(\theta_j\). It has been observed that modeling this correlation between pairs of observations is so powerful that a new point can be predicted by Eq. (2) [76], where **R** is given in Eq. (1) while **r** is a vector which contains the correlations of form (1) between the existing samples and the new unknown **x**. The optimal values of \(\hat{\mu}\) and \(\theta_j\) are found from the maximization of the likelihood function of \(y_{SMB}\) subject to the already observed data **X**, which can be calculated in closed form. The derivation of Eq. (2) is based on the numerical solution of the maximization of the conditional likelihood of \(\hat{y}\left(\mathbf{x}^{(new)}\right)\) subject to the identified optimal parameters and the already observed data [18]. Based on the kriging properties, a closed-form expression of the associated uncertainty of each prediction can also be calculated; however, this is not used in this work.
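A minimal sketch of this machinery, assuming the standard Gaussian correlation commonly used for Eq. (1) and the ordinary kriging predictor form for Eq. (2) (the function names and test values are illustrative, not from the paper):

```python
import numpy as np

def corr_matrix(X, theta):
    # Gaussian correlation (the usual choice for Eq. (1)):
    # R_ij = exp(-sum_k theta_k (x_ik - x_jk)^2)
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.einsum('ijk,k->ij', diff**2, theta))

def kriging_predict(x_new, X, y, theta, nugget=0.0):
    # Ordinary kriging predictor (the form of Eq. (2)):
    # yhat = mu + r' R^{-1} (y - mu*1), with mu available in closed form.
    n = len(y)
    R = corr_matrix(X, theta) + nugget * np.eye(n)
    ones = np.ones(n)
    mu = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    r = np.exp(-((X - x_new)**2 @ theta))   # correlations to the new point
    return mu + r @ np.linalg.solve(R, y - mu * ones)
```

With `nugget=0`, the predictor interpolates: evaluating it at any training point returns the observed output exactly, which is the interpolation property discussed above.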

The parameter estimation model (P4) is formulated for **N** data points of dimension *n*, where **U** is the square symmetric covariance matrix with elements \(u_{i,j}\) defined in P4. Constraints 3–4 of P4 can be omitted, since P4 is uniquely defined without them; however, we have found that they are often beneficial for locating feasible solutions. The main limitation of formulating the kriging parameter estimation as a global optimization problem is the inversion of matrix **U**, which introduces a large number of intermediate variables and equality constraints. In addition, the non-linear form of the terms \(u_{i,j}\) significantly increases the number of non-linear terms in the problem. Finally, for global optimization, it is desirable to provide lower and upper bounds on all of the variables of the problem. The following bounds are provided:

The elements of **U** have a closed-form expression which is minimized when \(\hat{\theta}_n\) is maximum and maximized when \(\hat{\theta}_n\) is minimum. The bounds for **U** can be adaptively tightened by relaxing the value of parameter \(\hat{\theta}_n\) obtained from the parameter estimation solved in the prior iteration of the grey-box optimization approach. Providing all of the above bounds has been shown to improve the performance of the global optimization of the kriging parameters. However, as the problem size increases, the time required to reach global optimality using the recent global optimization solver ANTIGONE [67] increases significantly. In order to keep the overall computational cost of the optimization in balance, a time limit is imposed on the solution of each parameter estimation problem. As the iterative grey-box optimization framework proceeds, sampling points tend to cluster in small subregions, and it becomes difficult to locate feasible solutions. This is due to the fact that the off-diagonal elements \(u_{i,j}\) take values closer to 1, and matrix **U** becomes near-singular. In order to overcome this issue, it is suggested in the kriging literature to add a small positive number (nugget parameter) to the diagonal elements of **U**. This modification will not affect the solution of the problem; however, it will lead to a kriging model which is not an exact interpolator of the collected samples. In our work, the nugget is treated as a slack variable, which is a small positive number (\(\le \)0.1), and the sum of these slack variables is incorporated into the objective function.
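The near-singularity issue and the effect of the nugget can be illustrated with a small numerical sketch (the sample locations, \(\theta\) value, and nugget size below are illustrative):

```python
import numpy as np

# Two nearly coincident samples make the correlation matrix near-singular;
# a small nugget on the diagonal restores its conditioning.
X = np.array([[0.50], [0.50001], [0.90]])
theta = np.array([1.0])
diff = X[:, None, :] - X[None, :, :]
U = np.exp(-np.einsum('ijk,k->ij', diff**2, theta))  # Gaussian correlations

cond_plain = np.linalg.cond(U)                  # huge: two rows nearly equal
cond_nugget = np.linalg.cond(U + 1e-6 * np.eye(3))  # far better conditioned
print(cond_plain, cond_nugget)
```

The price of the nugget is exactly the one stated above: the model is no longer an exact interpolator at the clustered samples.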

#### 4.2.2 Surrogate functions type 2: general quadratic functions

#### 4.2.3 Surrogate functions type 3: signomial functions

### 4.3 Global optimization of overall constrained grey-box approximation problem and selection of next sampling design points

During this stage of the proposed framework, the overall constrained optimization problem (P2) is solved using the deterministic global optimization solver ANTIGONE [67]. In this formulation, the decision variables are the original variables of the problem. Any grey-box expression which depends on the expensive simulation has been replaced by its surrogate function, and any known constraint is incorporated as is. During each iteration, the accuracy of the identified globally optimal solution depends on the accuracy of the surrogate functions. This is perhaps one of the most significant steps of every grey-box algorithm, since satisfactory accuracy of the overall constrained surrogate formulation is desired only in promising regions. During this stage, diversity in choosing the next sample is an attribute which most of the competitive methods in the literature strive to achieve, in order to avoid entrapment in false local optima. In the literature, this is achieved by formulating complex multimodal expected improvement functions which balance prediction quality and uncertainty in order to identify the next sampling location. Optimization of this criterion is one of the disadvantages of these methods: even though such criteria are cheap to evaluate, they are highly nonlinear and multimodal. Moreover, these methods have been designed to select one sample at a time, while extensions which can identify multiple samples involve complex forms of multiple integrals or multiobjective optimization [18, 79].

In the proposed framework, multiple local solutions (which represent upper bounds to the overall constrained surrogate problem) and a global solution of the current formulation are collected during each iteration. The lower bound is obtained by solving the problem to global optimality using the global optimization solver ANTIGONE [67]. The upper bounds are obtained by optimizing the same problem with a local solver (e.g., CONOPT, DICOPT) starting from different initial solutions. This concept is based on the idea that the overall constrained surrogate problem may have multiple local solutions, which have a high probability of corresponding to regions where true optimal points of the original problem may lie. In other words, the heuristic methods or complex multimodal search criteria used in other grey-box methods are replaced by the collection of multiple local/global solutions provided by deterministic optimization algorithms. In addition, local optimization of the overall constrained grey-box problem does not significantly increase the computational cost of the overall method. One could solve as many local problems as the number of samples obtained up to the current iteration, by using each sample as the initial guess. However, as the number of samples increases, this would lead to a large number of optimization problems without adding new information to the solution pool, since newly collected samples tend to form clusters in promising regions. We perform this selection using OSCAR in order to ensure that we optimally select a diverse set of starting points for local optimization. The criterion for this MILP problem is to select 2*n* samples with maximum diversity in the x-space, in order to increase the probability of obtaining a diverse set of new design points.
Feasibility and objective value are also incorporated into the formulation, since it is important to select starting points with promising predictions. We have found this total number (2*n*) to be a good heuristic for balancing diversity and computational cost for this specific problem. However, the method can be modified to select any number of starting points, from a minimum of 1 (only the best point in the sampling set is selected) to a maximum of the entire set of samples.
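The collection of diverse local solutions can be sketched on a toy one-dimensional surrogate; the objective below is a hypothetical stand-in for problem P2, and `scipy.optimize.minimize` stands in for the local NLP solvers:

```python
import numpy as np
from scipy.optimize import minimize

def surrogate(x):
    # Hypothetical multimodal surrogate objective (stand-in for P2).
    return np.sin(3 * x[0]) + 0.1 * x[0]**2

def multistart_unique(starts, tol=1e-3):
    # Run a local solver from each diverse start and keep the unique
    # minimizers, mimicking the pool of upper bounds described above.
    sols = []
    for x0 in starts:
        res = minimize(surrogate, x0, bounds=[(-4, 4)])
        if res.success and not any(abs(res.x[0] - s) < tol for s in sols):
            sols.append(res.x[0])
    return sorted(sols)

starts = np.linspace(-4, 4, 6).reshape(-1, 1)   # diverse starting points
print(multistart_unique(starts))                # several distinct minimizers
```

Deduplicating within a tolerance is what keeps the solution pool from being flooded with repeats once starts converge to the same basin.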

Finally, once all of the local and global optimization problems are solved, all of the obtained unique solutions are collected and the full simulation is performed at those design points by fixing the input variable values to match the new samples. This approach allows a variable number of new points to be used at each iteration, based on the complexity of the formulated surrogate problem. After the new points are collected, the algorithm returns to the parameter estimation problem, and this procedure continues until convergence. The convergence criterion is related both to the accuracy of the surrogate model approximations, which should be high, and to the improvement of the actual final solution within a predefined number of consecutive optimization iterations (coi). A more detailed schematic description of the algorithmic steps is shown in Fig. 4.

### 4.4 Clustering of obtained samples and updating of x-space bounds

One of the potential disadvantages of surrogate functions is their inability to capture the underlying model accurately in the entire domain with a limited number of samples. However, accurate representation of the objective and constraints in the entire space is not necessary when the end goal is optimization with limited samples. Moreover, despite the diversity of the local and global solutions obtained during each iteration, samples eventually start to form clusters, and the change in the global parameters of the surrogate functions is no longer significant enough to achieve further improvement in the solution. Thus, it is proposed to repeat the steps of the proposed framework within a subspace of the original x-domain, defined by information obtained from the identified samples.

The challenge is the selection of the criteria based on which the optimization bounds will be redefined to form subspace(s) within which the optimization procedure will be restricted. These criteria should minimize the probability of discarding promising feasible regions. For this purpose, a clustering procedure is incorporated within the proposed framework, which assigns newly obtained samples to clusters during each iteration.

*Clustering procedure*

- 1. *Optimization iteration 1* (\(opt\_iter = 1\)): Set the number of clusters equal to the number of distinct obtained local and global solutions (\(c = l + 1\)). Initialize cluster centers and calculate all possible distances between each \(i\)–\(j\) pair of cluster centers \(d_{ij}^c\). Set the radius of influence of each cluster *i* equal to \(r_{in}^c = \frac{1}{a_c}(\mathop{\min}\limits_j d_{ij}^c),\; j = 1, \ldots, c \hbox{ and } j \ne i\). Parameter \(a_c\) is user-defined; a default of 2 is used in the proposed framework.
- 2. *Perform next optimization iteration* (\(opt\_iter = opt\_iter + 1\)): Collect a new set of \(l_{new} = l + 1\) local and global solutions.
- 3. *For* \(i = 1\) *to* \(l_{new}\): Calculate the Euclidean distance of sample *s* to each of the existing clusters and find the nearest cluster. If this distance is less than \(r_{in}^c\), place the sample in the existing cluster and update the cluster center. If not, create a new cluster (\(c = c + 1\)) and recalculate all distances and influence radii \(d_{ij}^c\) and \(r_{in}^c\). If all of the new samples belong to existing clusters, the existing set of clusters is updated and no new clusters are created; however, this does not necessarily imply convergence.
- 4. *If convergence is not met, go to Step 2; otherwise continue to Step 5.*
- 5. *Cluster fathoming*: Remove clusters with a low number of samples, clusters with no feasible points, or clusters with very high objective function values. During this step, the parameters for acceptance or rejection of a cluster must be defined by the user based on the nature of the problem. If locating feasible solutions is difficult, clusters with the lowest feasibility violation should perhaps be accepted.
- 6. *Kept cluster analysis*: For each of the remaining clusters, calculate the mean (\(\mu_c^{(n)}\)) and standard deviation (\(sd_c^{(n)}\)) of the samples in each of the *n* dimensions. Calculate the bounds of each cluster using the formula \(\mu_c^{(n)} - \beta_c sd_c^{(n)} \le x_n \le \mu_c^{(n)} + \beta_c sd_c^{(n)}\), where \(\beta_c\) is a user-selected parameter controlling the tightness of the updated bounds. In certain cases, one of the input variables is observed to have zero variability; it is then kept constant, leading to an optimization problem with fewer dimensions.
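Steps 1–3 and 6 can be sketched as follows; this is a minimal implementation under the defaults stated above (\(a_c = 2\)), with the problem-specific fathoming criteria of Step 5 omitted and the class name chosen for illustration:

```python
import numpy as np

class SolutionClusters:
    """Sketch of the clustering procedure (Steps 1-3 and 6)."""

    def __init__(self, solutions, a_c=2.0):
        # Step 1: one cluster per distinct local/global solution.
        self.a_c = a_c
        self.members = [[np.asarray(s, float)] for s in solutions]

    def centers(self):
        return np.array([np.mean(m, axis=0) for m in self.members])

    def radii(self):
        # r_in^c = (1/a_c) * distance to the nearest other cluster center.
        C = self.centers()
        d = np.linalg.norm(C[:, None] - C[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        return d.min(axis=1) / self.a_c

    def add(self, sample):
        # Step 3: join the nearest cluster if inside its influence radius
        # (the center updates automatically), else open a new cluster.
        s = np.asarray(sample, float)
        d = np.linalg.norm(self.centers() - s, axis=1)
        i = int(np.argmin(d))
        if d[i] <= self.radii()[i]:
            self.members[i].append(s)
        else:
            self.members.append([s])

    def bounds(self, i, beta_c=1.0):
        # Step 6: mean +/- beta_c * std per dimension for cluster i.
        M = np.array(self.members[i])
        mu, sd = M.mean(axis=0), M.std(axis=0)
        return mu - beta_c * sd, mu + beta_c * sd
```

The `bounds` method is what feeds the x-space bound update: each surviving cluster defines the tightened subregion in which the framework is restarted.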

## 5 Computational studies

The proposed global optimization framework is tested on a set of different instances of the motivating PSA case study. This allows us to form a large number of diverse problems of different complexity by modifying the zeolite used for \(\hbox {CO}_{2}\) capture inside the adsorption column. It has been found that different materials (i.e., zeolites) exhibit different performance in terms of their total cost and, more importantly, in terms of their feasibility [47]. For certain materials the feasible space is very limited compared to the total investigated operating space, which makes the case studies far more challenging. In addition, different materials lead to simulations of different computational complexity, ranging from 3 min to 15 min per computation. In this work, we consider eleven zeolites, namely 13X, AHT, MVY, WEI, ABW, ITW, LTF, NAB, OFF, TON and VNI, a collection of popular and promising materials for \(\hbox {CO}_{2}\) capture as screened by the most recent literature [40, 80]. 13X is the most popular zeolite for \(\hbox {CO}_{2}\) capture [14, 81, 82]. Hasan et al. [40] identified AHT and MVY to be the top two zeolites for \(\hbox {CO}_{2}\) capture based on cost, using a hierarchical computational framework that effectively combines material selection and process optimization. Their top ten materials included WEI, which was also listed as the material requiring the least parasitic energy for \(\hbox {CO}_{2}\) capture in a separate study by Lin et al. [80].

Out of these zeolites, 13X, AHT, MVY and WEI are selected to perform a series of tests on the efficiency of the various steps of the proposed approach and to answer the main questions posed at the beginning of the manuscript. These are: (a) identify the effect of the initial sampling strategy on the final quality and variability of the obtained solutions, (b) reveal the effect of global versus local optimization of the parameter estimation of the surrogate functions and of the overall constrained surrogate model, for different surrogate methods, on the quality of the final solution and the overall computational cost, (c) validate the expected importance of clustering and bound updating, and finally, (d) test the efficiency of employing a diverse set of surrogate models for the formulation of the surrogate approximation model (P3). All of the runs are performed as single-thread jobs on a Linux workstation containing four Intel Core2 2.83 GHz processors.

### 5.1 Effect of initial sampling strategy on the variability and quality of final solution

- (A) *Strategy 1 (SS1)*: 70-point Latin Hypercube Design using the MATLAB function *lhsdesign*. This strategy is the most commonly used approach in the literature, assuming that any model is a black-box.
- (B) *Strategy 2 (SS2)*: 5000-point Latin Hypercube Design using the MATLAB function *lhsdesign*, evaluation of the fast simulation at the 5000 points, and naive ranking and selection of the 70 samples with the highest purity and recovery values.
- (C) *Strategy 3 (SS3)*: 5000-point Latin Hypercube Design using the MATLAB function *lhsdesign*, evaluation of the fast simulations at the 5000 points, and selection of 70 samples using OSCAR [73], with a mixed objective of x-space diversity, minimum cost, and maximum purity and recovery.
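Strategy SS2 can be sketched on a toy two-dimensional design space; the low-fidelity "purity" and "recovery" expressions below are hypothetical stand-ins for the fast simulation:

```python
import numpy as np
from scipy.stats import qmc

# Draw a large Latin Hypercube, score each point with a cheap
# low-fidelity model, then keep the 70 best-ranked candidates.
sampler = qmc.LatinHypercube(d=2, seed=1)
X = qmc.scale(sampler.random(n=5000), [0.0, 0.0], [1.0, 10.0])

purity = 1.0 - (X[:, 0] - 0.7)**2            # hypothetical fast model
recovery = 1.0 - (X[:, 1] / 10.0 - 0.3)**2   # hypothetical fast model
score = purity + recovery                    # naive ranking criterion

keep = np.argsort(score)[-70:]               # select the 70 highest scores
X_init = X[keep]
print(X_init.shape)
```

SS3 replaces the naive ranking in the last step with the OSCAR subset-selection MILP, which additionally rewards x-space diversity.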

Performance of different sampling strategies

| Zeolite | Sampling strategy | Best cost | Average cost | SD | Average CPU time (h) |
|---|---|---|---|---|---|
| AHT | 1 | 21.21 | 23.50 | 1.9 | 7.28 |
| AHT | 2 | 21.01 | 21.90 | 0.66 | 7.40 |
| AHT | 3 | 20.91 | 21.02 | 0.56 | 8.34 |
| MVY | 1 | 21.59 | 22.96 | 1.54 | 6.91 |
| MVY | 2 | 20.75 | 21.38 | 0.37 | 6.92 |
| MVY | 3 | 20.72 | 21.21 | 0.36 | 8.30 |
| WEI | 1 | 21.94 | 23.87 | 2.13 | 6.77 |
| WEI | 2 | 21.57 | 22.74 | 1.48 | 7.87 |
| WEI | 3 | 21.39 | 22.57 | 0.68 | 8.97 |
| 13X | 1 | 30.57\(^\mathrm{a}\) | 66.63\(^\mathrm{a}\) | 59.79\(^\mathrm{a}\) | 34.26\(^\mathrm{a}\) |
| 13X | 2 | 28.05\(^\mathrm{b}\) | 29.16\(^\mathrm{b}\) | 0.94\(^\mathrm{b}\) | 34.85\(^\mathrm{b}\) |
| 13X | 3 | 27.16 | 28.83 | 1.46 | 37.77 |

A common observation across all case studies is that SS1 demonstrates inferior performance in terms of both consistency and quality of solution. Especially for 13X, the zeolite for which satisfying feasibility is most challenging, this sampling approach performs poorly: in three out of ten runs, the method failed to find any feasible solution. In conclusion, the incorporation of prior short-simulation data followed by sampling reduction is very beneficial in reducing the variability of initial sampling and locating feasible solutions. Sampling strategies 2 and 3 are more comparable in terms of average performance and standard deviation of the obtained solutions. With the exception of MVY, the best solution for every case study is identified when using SS3. The algorithm is least dependent on the variability of the initial sampling in the case of SS3, demonstrating that the rigorous selection strategy consistently identifies an optimal and balanced subset of points to use as the initial sampling set. Finally, in the most challenging case study of 13X, SS3 is the only strategy which never fails to find a solution in any of the performed runs, and it locates the best solution among all the methods.

From this analysis it can be concluded that SS3 demonstrates consistently superior performance when satisfying feasibility is demanding, while its performance is comparable to SS2 in cases where a larger pool of feasible solutions is available. Overall, SS3 is a rigorous method for selecting a subset of samples when a priori data are available.

### 5.2 Effect of local optimization versus global optimization

Local versus global optimization for quadratic and kriging models for AHT

| Sampling strategy | Model | Optimization | Cost | Samples | CPU (h) |
|---|---|---|---|---|---|
| SS1 | Quadratic | Local | 24.66 | 470 | 6.93 |
| SS1 | Quadratic | Global | 21.21 | 577 | 10.02 |
| SS1 | Kriging | Local | 22.45 | 578 | 11.12 |
| SS1 | Kriging | Global | 22.06 | 608 | 12.31 |
| SS2 | Quadratic | Local | 23.51 | 345 | 4.51 |
| SS2 | Quadratic | Global | 21.01 | 523 | 7.59 |
| SS2 | Kriging | Local | 21.00 | 638 | 11.00 |
| SS2 | Kriging | Global | 20.91 | 636 | 11.54 |
| SS3 | Quadratic | Local | 21.37 | 254 | 5.38 |
| SS3 | Quadratic | Global | 20.91 | 303 | 8.34 |
| SS3 | Kriging | Local | 20.93 | 624 | 11.15 |
| SS3 | Kriging | Global | 20.65 | 631 | 12.44 |

Local versus global optimization for quadratic and kriging models for MVY

| Sampling strategy | Model | Optimization | Cost | Samples | CPU (h) |
|---|---|---|---|---|---|
| SS1 | Quadratic | Local | 22.82 | 316 | 5.78 |
| SS1 | Quadratic | Global | 21.59 | 334 | 5.81 |
| SS1 | Kriging | Local | 21.15 | 627 | 10.63 |
| SS1 | Kriging | Global | 20.65 | 544 | 12.06 |
| SS2 | Quadratic | Local | 21.79 | 332 | 5.89 |
| SS2 | Quadratic | Global | 20.75 | 333 | 6.19 |
| SS2 | Kriging | Local | 21.23 | 635 | 10.11 |
| SS2 | Kriging | Global | 20.19 | 669 | 12.74 |
| SS3 | Quadratic | Local | 21.88 | 395 | 6.21 |
| SS3 | Quadratic | Global | 20.72 | 354 | 8.30 |
| SS3 | Kriging | Local | 20.48 | 637 | 12.90 |
| SS3 | Kriging | Global | 20.58 | 549 | 9.82 |

Local versus global optimization for quadratic, signomial and kriging models for WEI

| Sampling strategy | Model | Optimization | Cost | Samples | CPU (h) |
|---|---|---|---|---|---|
| SS1 | Quadratic | Local | 23.73 | 428 | 6.51 |
| SS1 | Quadratic | Global | 23.11 | 411 | 6.74 |
| SS1 | Kriging | Local | 21.38 | 627 | 11.11 |
| SS1 | Kriging | Global | 20.99 | 576 | 10.20 |
| SS2 | Quadratic | Local | 23.95 | 271 | 4.50 |
| SS2 | Quadratic | Global | 23.95 | 271 | 4.55 |
| SS2 | Kriging | Local | 20.86 | 614 | 10.83 |
| SS2 | Kriging | Global | 20.90 | 610 | 12.13 |
| SS3 | Quadratic | Local | 22.30 | 468 | 6.78 |
| SS3 | Quadratic | Global | 21.39 | 411 | 8.11 |
| SS3 | Kriging | Local | 21.36 | 559 | 8.50 |
| SS3 | Kriging | Global | 21.23 | 619 | 11.11 |

Local versus global optimization for quadratic and kriging models for 13X

| Sampling strategy | Model | Optimization | Cost | Samples | CPU (h) |
|---|---|---|---|---|---|
| SS1 | Quadratic | Local | 31.75 | 231 | 3.63 |
| SS1 | Quadratic | Global | 31.36 | 231 | 4.043 |
| SS1 | Kriging | Local | 29.60 | 567 | 12.21 |
| SS1 | Kriging | Global | 28.66 | 567 | 9.34 |
| SS2 | Quadratic | Local | 27.77 | 173 | 3.45 |
| SS2 | Quadratic | Global | 27.70 | 173 | 3.55 |
| SS2 | Kriging | Local | 28.05 | 283 | 5.78 |
| SS2 | Kriging | Global | 27.60 | 283 | 5.66 |
| SS3 | Quadratic | Local | 27.16 | 314 | 4.92 |
| SS3 | Quadratic | Global | 27.16 | 254 | 18.81 |
| SS3 | Kriging | Local | 27.36 | 561 | 9.80 |
| SS3 | Kriging | Global | 26.85 | 588 | 11.94 |

For these four case studies, signomial functions often failed to find feasible solutions, or led to significantly suboptimal solutions compared to those obtained using general quadratic and kriging functions. This result underscores the importance of selecting an appropriate functional form; the results for signomial functions are therefore not presented here.

Based on the results of Tables 2, 3, 4 and 5, it becomes evident that both the type of optimization (i.e., local vs. global) and the type of surrogate functions have an effect. Specifically, for all of the case studies, when using general quadratic functions the optimal cost is better with global optimization, and fewer samples are required for convergence. The only exceptions are 13X and WEI, where global quadratic optimization renders the same solution as local optimization, achieved with the same or fewer function calls. Moreover, there is a clear indication that, when starting from exactly the same initial sampling set, global optimization is meaningful for all cases and surrogate functions. There are only two exceptions, those of MVY and WEI, where the global optimization run of kriging does not locate a better solution than the local runs because it converges with a smaller number of samples. The superiority of the global optimization approach is associated with an increase in the total CPU time required to reach the final solution. The results verify, however, that this increase in CPU time is justified by the attainment of better solutions. More interestingly, there are several instances where the global optimization runs render improved solutions using the same or fewer function calls to the expensive simulation. Consequently, a trade-off emerges in which the additional computational cost of global optimization becomes negligible as the cost of the simulation increases.

Finally, comparing the performance of the proposed framework when using different types of surrogate functions, it is clear that kriging performs better for all of the case studies. On the other hand, kriging requires an increased number of samples and more computational time to reach convergence. This can be explained by the nature of kriging: it is purely interpolating, which usually leads to multiple local optima in each iteration. Global optimization using kriging functions is associated with high computational cost, since both the parameter estimation and the optimization of the overall constrained surrogate model (P2) are large non-convex problems with multiple exponential terms.

Comparison of different surrogate functions is useful in order to demonstrate the importance of selecting the best possible function to approximate each of the grey-box functions. Ideally, selecting the appropriate type of surrogate function for each of the unknown correlations individually, coupled with global optimization of the parameters and the surrogate formulation (P2), would be the most powerful approach.

### 5.3 Importance of clustering and bounds refinement

The performance of clustering depends on the location and the quantity of local solutions obtained throughout the iterations. Specifically, for this problem 2*n* (14) local problems and one global problem are solved in each iteration, leading to multiple new design points that are sampled in the next iteration. Out of these 15 total optimization problems, we observe that an average of 2–4 new distinct solutions are obtained when using quadratic functions, while 8–11 new solutions are obtained when using kriging functions. The diversity of solutions when using kriging leads to increased overall sampling requirements, but also to improved overall performance. Lastly, the multiplicity of local solutions when using kriging can be explained by the increased complexity of kriging-based surrogate models when compared to the smoother quadratic models.

In order to investigate the significance of bound refinement and of repeating the proposed framework within a smaller subregion, the best solution obtained at the end of the first stage is compared to the final obtained solution for all the test problems reported in the previous section. A typical performance of the algorithm during the two stages is shown in Fig. 5. This example is for zeolite 13X, using general quadratic functions, sampling strategy 3 and global optimization. In the first stage of the algorithm, no feasible solutions are obtained until iteration 10. Since there is no improvement in the incumbent solution in the following iterations, the first stage of the algorithm converges and the bounds are updated based on the clustering algorithm. Specifically, the bounds of variables \(\hbox {x}_{1}\) and \(\hbox {x}_{3}\) are significantly reduced, the bounds of variables \(\hbox {x}_{4}\) and \(\hbox {x}_{5}\) remain unchanged, and the remaining variable bounds are moderately altered. This refinement has a significant effect on the incumbent solution, which is reduced by 19 % when compared to the result obtained during the first stage.

Percentage of improvement between first stage and second stage

| Zeolite | Model | Average improvement (%) |
|---|---|---|
| AHT | Quadratic | 15 |
| AHT | Kriging | 12 |
| MVY | Quadratic | 25 |
| MVY | Kriging | 11 |
| WEI | Quadratic | 29 |
| WEI | Kriging | 13 |
| 13X | Quadratic | 50 |
| 13X | Kriging | 16 |

### 5.4 Comparison of proposed method with publicly available solvers

The performance of the developed method is compared with that of the NOMAD software, an in-house implementation of the EGO algorithm [30] for constrained problems following an extension proposed by Sasena et al. [47], the ssmGO algorithm [5, 84], and a version of COBYLA [61] implemented in [85]. The implementation of constrained EGO uses kriging for approximating the cost, purity and recovery of the problem, while new sampling locations are identified by maximizing the Expected Improvement function [30] subject to the kriging-based constraints. The only difference from solving formulation P3 is that the minimization of cost as an objective is replaced by the maximization of the expected improvement function. The only commercial version of constrained EGO can be found in TOMLAB [86]; however, for problems with expensive grey-box constraints, the only way to incorporate them is to augment the objective with the sum of constraint violations as a penalty term, which does not guarantee satisfaction of the grey-box constraints. ssmGO is a MATLAB-based scatter search algorithm originally developed for bioprocess optimization. It is a population-based metaheuristic approach which iterates among a series of steps: diversification generation, improvement, reference set update, subset generation, and solution combination. Constraints are handled using a static penalty function. NOMAD is a mesh adaptive direct search algorithm which uses surrogate models to assist the direct search, while COBYLA uses linear approximations of both the objective and constraints in a trust-region framework.

Performance of constrained ssmGO

| Zeolite | Best cost | Average cost | SD cost | Average samples | Average CPU (h) |
|---|---|---|---|---|---|
| ABW | 25.76 | 26.66 | 0.80 | 1490 | 25.94 |
| AHT | 20.75 | 21.77 | 0.85 | 1083 | 18.33 |
| ITW | 27.17 | 29.22 | 1.70 | 1088 | 18.65 |
| LTF\(^\mathrm{a}\) | 30.78 | 35.10 | 4.60 | 4890 | 59.39 |
| MVY | 20.75 | 21.55 | 0.74 | 1069 | 19.55 |
| NAB | 22.66 | 23.18 | 0.48 | 1066 | 18.85 |
| OFF | 26.86 | 30.67 | 3.02 | 1868 | 66.16 |
| TON | 25.58 | 27.19 | 1.40 | 1595 | 28.16 |
| VNI | 26.68 | 33.76 | 5.83 | 1146 | 40.12 |
| WEI | 21.30 | 21.95 | 0.58 | 1075 | 18.48 |
| 13X\(^\mathrm{b}\) | 27.98 | 29.28 | 1.84 | 4883 | 75.29 |

Performance of NOMAD

| Zeolite | Best cost | Average cost | SD cost | Average samples | Average CPU (h) |
|---|---|---|---|---|---|
| ABW | 29.41 | 30.94 | 2.17 | 700 | 25.79 |
| AHT | 20.64 | 22.48 | 2.60 | 700 | 27.82 |
| ITW | 24.87 | 26.34 | 1.86 | 700 | 28.59 |
| LTF | 30.81 | 34.23 | 3.87 | 700 | 32.81 |
| MVY | 20.55 | 21.08 | 0.52 | 700 | 26.68 |
| NAB\(^\mathrm{a}\) | 22.25 | 23.07 | 1.16 | 700 | 25.31 |
| OFF | 26.44 | 28.17 | 1.50 | 700 | 32.67 |
| TON | 27.69 | 28.42 | 0.67 | 700 | 28.36 |
| VNI | 26.65 | 30.41 | 4.85 | 700 | 30.49 |
| WEI | 20.60 | 21.08 | 0.48 | 700 | 23.21 |
| 13X\(^\mathrm{b}\) | 26.71 | 29.71 | 3.08 | 700 | 108.61 |

Performance of constrained conEGO

| Zeolite | Best cost | Average cost | SD cost | Average samples | Average CPU (h) |
|---|---|---|---|---|---|
| ABW | 24.05 | 25.1 | 1.05 | 149 | 5.59 |
| AHT | 20.98 | 21.84 | 0.9 | 268 | 6.93 |
| ITW | 25.01 | 25.55 | 0.48 | 177 | 4.92 |
| LTF | 29.3 | 34.61 | 8.61 | 132 | 4.73 |
| MVY | 20.67 | 21.71 | 1.29 | 491 | 12.98 |
| NAB | 21.97 | 23.07 | 0.99 | 495 | 12.69 |
| OFF | 26.56 | 27.89 | 1.44 | 219 | 5.51 |
| TON | 25.27 | 26.37 | 1.28 | 208 | 4.46 |
| VNI | 26.8 | 27.87 | 0.78 | 222 | 5.09 |
| WEI | 21.11 | 21.66 | 0.54 | 421 | 9.33 |
| 13X\(^\mathrm{a}\) | 26.17 | 28.55 | 2.3 | 172 | 10.07 |

Performance of the proposed framework for constrained global optimization for grey-box models

| Zeolite | Best cost | Average cost | SD cost | Average samples | Average CPU (h) |
|---|---|---|---|---|---|
| ABW | 23.11 | 24.11 | 0.6 | 513 | 11.42 |
| AHT | 20.6 | 20.92 | 0.29 | 533 | 11.63 |
| ITW | 24.7 | 25.37 | 0.31 | 506 | 12.00 |
| LTF | 28.11 | 28.68 | 0.47 | 547 | 11.09 |
| MVY | 20.19 | 20.53 | 0.27 | 534 | 12.55 |
| NAB | 21.67 | 22.16 | 0.23 | 527 | 13.03 |
| OFF | 26.2 | 26.79 | 0.33 | 374 | 12.39 |
| TON | 25.11 | 25.47 | 0.29 | 521 | 13.21 |
| VNI | 26.24 | 26.64 | 0.63 | 443 | 11.30 |
| WEI | 20.59 | 21.19 | 0.3 | 422 | 15.59 |
| 13X | 25.96 | 26.82 | 0.64 | 336 | 29.14 |

Finally, the performance of the proposed framework is provided in Table 10. It is shown that overall the algorithm results in improved solutions with increased consistency. The sampling requirements are lower than those of ssmGO and NOMAD, while the algorithm requires more samples than the conEGO implementation for convergence. The increased CPU time is attributed to the sampling requirements, the global optimization components, and the initial reduced-order sampling procedure. However, this cost does lead to improvements in the final solution and consistency in the performance of the method. Specifically for the case of 13X, the case study with the highest computational cost and a reduced feasible space, the pay-off of the proposed methodology becomes evident: the method always manages to find a feasible solution, while its CPU time is lower than that of methods which require more samples. Results for COBYLA are not presented here, since it was found to have significant difficulty locating feasible solutions for all case studies. The average performance and the variance of each method for all zeolites are shown in Fig. 6, where the edges of each box plot represent the 25th and 75th percentiles of the data, the middle red mark corresponds to the median, and the whiskers represent the minimum and maximum cost values obtained by each method. From Fig. 6 it becomes evident that the proposed framework for constrained global optimization of grey-box models has an overall consistent and reliable performance.

## 6 Conclusions

The proposed framework can use different surrogate functions, such as general quadratic, kriging, and signomial functions, to approximate the objective and each of the constraints of the original model. We have shown that the selection of the surrogate function plays an important role in the performance of the method; in particular, kriging functions lead to consistently improved solutions for the case studies presented in this work. We have also studied the importance of using deterministic global optimization both for the parameter estimation of the surrogate functions and for the optimization of the overall constrained grey-box model. The results presented in this work illustrate that solutions with a lower objective function value can be obtained when using global rather than local optimization, and in certain cases improved solutions are obtained with fewer function calls to the expensive simulation. Finally, we compared the performance of the proposed method with four available algorithms for constrained derivative-free optimization and demonstrated that the proposed framework is promising in terms of both consistency and the value of the optimal solution.
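The surrogate-selection step described above can be illustrated with a small sketch: fit two candidate surrogate forms, a general quadratic model and a Gaussian radial-basis interpolant (a simple relative of kriging), to samples of a toy black-box function, and compare their errors on held-out test points. Everything here is illustrative (the toy function, the sample sizes, the kernel width `gamma`); it is not the paper's implementation, which additionally fits the surrogate parameters by deterministic global optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_black_box(x):
    # Toy stand-in for an expensive simulation output (illustrative only).
    return np.sin(3 * x[:, 0]) + x[:, 1] ** 2

# Sampled design points and their (assumed expensive) evaluations.
X = rng.uniform(-1, 1, size=(40, 2))
y = expensive_black_box(X)

def fit_quadratic(X, y):
    # General quadratic surrogate: 1, x1, x2, x1^2, x2^2, x1*x2.
    def design(Z):
        return np.column_stack([np.ones(len(Z)), Z, Z ** 2, Z[:, :1] * Z[:, 1:]])
    beta, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Z: design(Z) @ beta

def fit_rbf(X, y, gamma=3.0, reg=1e-6):
    # Gaussian-kernel interpolant; 'reg' is a small ridge for conditioning.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    w = np.linalg.solve(np.exp(-gamma * d2) + reg * np.eye(len(X)), y)
    def predict(Z):
        d2n = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2n) @ w
    return predict

# Compare surrogate forms by root-mean-square error on test points.
Xtest = rng.uniform(-1, 1, size=(200, 2))
ytest = expensive_black_box(Xtest)
results = {}
for name, fit in [("quadratic", fit_quadratic), ("rbf", fit_rbf)]:
    pred = fit(X, y)
    results[name] = float(np.sqrt(np.mean((pred(Xtest) - ytest) ** 2)))
    print(name, results[name])
```

On this smooth but non-polynomial toy function the kernel-based surrogate captures the response more accurately than the quadratic, mirroring the observation above that interpolating kriging-type surrogates performed best on these case studies.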

The performance of the proposed methodology has been tested on several instances of the NAPDE system for \(\hbox {CO}_{2}\) capture, which have different characteristics and levels of complexity. The consideration of various types of surrogate functions according to optimal fitting criteria, together with the ability of the method to handle any number of known and unknown constraints, makes it applicable to any problem that follows the grey-box formulation described in this work. We expect the method to have a wide range of applicability in many different scientific fields. However, the sampling-reduction aspect, which is one of the reasons we can achieve improved performance, can only be applied if a reduced-order form of the model is available. We plan to further generalize our method and perform thorough testing on a large set of constrained optimization problems with increased dimensionality and an increased number of constraints.

## Notes

### Acknowledgments

The authors acknowledge financial support from the National Science Foundation (CBET-0827907, CBET-1263165).
