Abstract
This paper presents an efficient optimization procedure for solving the reliability-based design optimization (RBDO) problem of structures under aleatory uncertainty in material properties and external loads. To reduce the number of structural analysis calls during the optimization process, mixture models of Gaussian processes (MGPs) are constructed for predicting structural responses. The MGP extends the application of the Gaussian process model (GPM) to large training sets that cover the input variable space well, significantly reducing the training time and improving the overall accuracy of the regression models. A large training set of input variables and associated structural responses is first generated and split into independent subsets of similar training samples using the Gaussian mixture model clustering method. A GPM is then developed for each subset, producing a set of independent GPMs that together define the MGP as their weighted average. The weight vector computed for a specified input variable contains the probabilities that the input variable belongs to the projections of the subsets onto the input variable space. To calculate the failure probabilities and their inverses required when solving the RBDO problem, a novel saddlepoint approximation is proposed based on the first three cumulants of the random variables. The original RBDO problem is replaced by a sequential deterministic optimization (SDO) problem in which the MGPs serve as surrogates for the limit-state functions in the probabilistic constraints of the RBDO problem. The SDO problem is solved strategically to explore a promising region that may contain the optimal solution, improve the accuracy of the MGPs in that region, and produce a reliable solution. Two design examples of a truss and a steel frame demonstrate the efficiency of the proposed optimization procedure.
References
AISC 360 (2016) Specification for structural steel buildings. ANSI/AISC 360-16, Chicago
Anderson TV, Mattson CA (2012) Propagating skewness and kurtosis through engineering models for low-cost, meaningful, nondeterministic design. J Mech Des 134(10):100911. https://doi.org/10.1115/1.4007389
Aoues Y, Chateauneuf A (2010) Benchmark study of numerical methods for reliability-based design optimization. Struct Multidiscip Optim 41:277–294. https://doi.org/10.1007/s00158-009-0412-2
ASCE (2017) Minimum design loads and associated criteria for buildings and other structures. ASCE 7-16, Reston
Bartlett FM, Dexter RJ, Graeser MD, Jelinek JJ, Schmidt BJ, Galambos TV (2003) Updating standard shape material properties database for design and reliability. Eng J Am Inst Steel Constr 40:2–14
Bourinet J-M, Deheeger F, Lemaire M (2011) Assessing small failure probabilities by combined subset simulation and support vector machines. Struct Saf 33:343–353. https://doi.org/10.1016/j.strusafe.2011.06.001
Butler RW (2007) Saddlepoint approximations with applications. Cambridge University Press, Cambridge
CEN (2002) Eurocode - Basis of structural design. EN 1990, Brussels
Cheng G, Xu L, Jiang L (2006) A sequential approximate programming strategy for reliability-based structural optimization. Comput Struct 84:1353–1367. https://doi.org/10.1016/j.compstruc.2006.03.006
Chojaczyk AA, Teixeira AP, Neves LC, Cardosod JB, Soares CG (2015) Review and application of artificial neural networks models in reliability analysis of steel structures. Struct Saf 52:78–89. https://doi.org/10.1016/j.strusafe.2014.09.002
Deng J (2006) Structural reliability analysis for implicit performance function using radial basis function network. Int J Solids Struct 43:3255–3291. https://doi.org/10.1016/j.ijsolstr.2005.05.055
Deng J, Gu D, Li X, Yue ZQ (2005) Structural reliability analysis for implicit performance functions using artificial neural network. Struct Saf 27:25–48. https://doi.org/10.1016/j.strusafe.2004.03.004
Do B, Ohsaki M (2021) Gaussian mixture model for robust design optimization of planar steel frames. Struct Multidiscip Optim 63:137–160. https://doi.org/10.1007/s00158-020-02676-3
Du X, Chen W (2004) Sequential optimization and reliability assessment method for efficient probabilistic design. J Mech Des 126:225–233. https://doi.org/10.1115/1.1649968
Du X, Sudjianto A (2004) First order saddlepoint approximation for reliability analysis. AIAA J 42:1199–1207. https://doi.org/10.2514/1.3877
Dubourg V, Sudret B, Bourinet J-M (2011) Reliability-based design optimization using Kriging surrogates and subset simulation. Struct Multidiscip Optim 44:673–690. https://doi.org/10.1007/s00158-011-0653-8
Echard B, Gayton N, Lemaire M (2011) AK-MCS: an active learning reliability method combining Kriging and Monte Carlo simulation. Struct Saf 33:145–154. https://doi.org/10.1016/j.strusafe.2011.01.002
Echard B, Gayton N, Lemaire M, Relun N (2013) A combined importance sampling and Kriging reliability method for small failure probabilities with time-demanding numerical models. Reliab Eng Syst Saf 111:232–240. https://doi.org/10.1016/j.ress.2012.10.008
Forrester A, Keane A (2009) Recent advances in surrogate-based optimization. Prog Aerosp Sci 45:50–79. https://doi.org/10.1016/j.paerosci.2008.11.001
Forrester A, Sobester A, Keane A (2008) Engineering design via surrogate modelling: a practical guide. John Wiley & Sons, Chichester
Foschi RO, Li H, Zhang J (2002) Reliability and performance-based design: a computational approach and applications. Struct Saf 24:205–218. https://doi.org/10.1016/S0167-4730(02)00025-5
Gillespie CS, Renshaw E (2007) An improved saddlepoint approximation. Math Biosci 208:359–374. https://doi.org/10.1016/j.mbs.2006.08.026
Goswami S, Chakraborty S, Chowdhury R, Rabczuk T (2019) Threshold shift method for reliability-based design optimization. Struct Multidiscip Optim 60:2053–2072. https://doi.org/10.1007/s00158-019-02310-x
Goutis C, Casella G (1999) Explaining the Saddlepoint approximation. Am Stat 53:216–224. https://doi.org/10.1080/00031305.1999.10474463
Guo S (2014) An efficient third-moment saddlepoint approximation for probabilistic uncertainty analysis and reliability evaluation of structures. Appl Math Model 38:221–232. https://doi.org/10.1016/j.apm.2013.06.026
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media. https://doi.org/10.1007/978-0-387-84858-7
Hess PE, Bruchman D, Assakkaf IA, Ayyub BM (2002) Uncertainties in material and geometric strength and load variables. Nav Eng J 114:139–166. https://doi.org/10.1111/j.1559-3584.2002.tb00128.x
Huang B, Du X (2008) Probabilistic uncertainty analysis by mean-value first order Saddlepoint approximation. Reliab Eng Syst Saf 93:325–336. https://doi.org/10.1016/j.ress.2006.10.021
Jiang C, Lu GY, Han X, Liu LX (2012) A new reliability analysis method for uncertain structures with random and interval variables. Int J Mech Mater Des 8:169–182. https://doi.org/10.1007/s10999-012-9184-8
Lehký D, Slowik O, Novák D (2018) Reliability-based design: artificial neural networks and double-loop reliability-based optimization approaches. Adv Eng Softw 117:123–135. https://doi.org/10.1016/j.advengsoft.2017.06.013
Li X, Gong C, Gu L, Jing Z, Fang H, Gao R (2019) A reliability-based optimization method using sequential surrogate model and Monte Carlo simulation. Struct Multidiscip Optim 59:439–460. https://doi.org/10.1007/s00158-018-2075-3
Liu H, Ong Y-S, Shen X, Cai J (2020) When Gaussian process meets big data: a review of scalable GPs. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/tnnls.2019.2957109
Lophaven SN, Nielsen HB, Søndergaard J (2002) DACE-A Matlab Kriging toolbox, version 2.0. Informatics and mathematical modelling. Technical University of Denmark, DTU, Lyngby
Mahadevan S (2000) Probability, reliability, and statistical methods in engineering design. Wiley, New York
Masoudnia S, Ebrahimpour R (2014) Mixture of experts: a literature survey. Artif Intell Rev 42:275–293. https://doi.org/10.1007/s10462-012-9338-y
McLachlan GJ, Rathnayake S (2014) On the number of components in a Gaussian mixture model. WIREs Data Min Knowl Discov 4:341–355. https://doi.org/10.1002/widm.1135
Moustapha M, Sudret B (2019) Surrogate-assisted reliability-based design optimization: a survey and a unified modular framework. Struct Multidiscip Optim 60:2157–2176. https://doi.org/10.1007/s00158-019-02290-y
Moustapha M, Sudret B, Bourinet J-M, Guillaume B (2016) Quantile-based optimization under uncertainties using adaptive Kriging surrogate models. Struct Multidiscip Optim 54:1403–1421. https://doi.org/10.1007/s00158-016-1504-4
Papadimitriou DI, Mourelatos ZP (2018) Reliability-based topology optimization using mean-value second-order saddlepoint approximation. J Mech Des 140. https://doi.org/10.1115/1.4038645
Park C, Haftka RT, Kim NH (2017) Remarks on multi-fidelity surrogates. Struct Multidiscip Optim 55:1029–1050. https://doi.org/10.1007/s00158-016-1550-y
Rasmussen CE (2000) The infinite Gaussian mixture model. Adv Neural Inf Proces Syst 12:554–560
Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning. The MIT Press, Cambridge
Santner TJ, Williams BJ, Notz W (2018) The design and analysis of computer experiments, 2nd edn. Springer, New York
Soares RC, Mohamed A, Venturini WS, Lemaire M (2002) Reliability analysis of non-linear reinforced concrete frames using the response surface method. Reliab Eng Syst Saf 75:1–16. https://doi.org/10.1016/S0951-8320(01)00043-6
Valdebenito MA, Schuëller GI (2010) A survey on approaches for reliability-based optimization. Struct Multidiscip Optim 42:645–663. https://doi.org/10.1007/s00158-010-0518-6
Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16:645–678. https://doi.org/10.1109/TNN.2005.845141
Zhao Y-G, Ono T (1999) A general procedure for first/second-order reliability method (FORM/SORM). Struct Saf 21:95–112. https://doi.org/10.1016/S0167-4730(99)00008-9
Zhao Y-G, Ono T (2001) Moment methods for structural reliability. Struct Saf 23:47–75. https://doi.org/10.1016/S0167-4730(00)00027-8
Zhao W, Qiu Z (2013) An efficient response surface method and its application to structural reliability and reliability-based optimization. Finite Elem Anal Des 67:34–42. https://doi.org/10.1016/j.finel.2012.12.004
Acknowledgements
Financial support from the Japan International Cooperation Agency (JICA) for the first author and JSPS KAKENHI No. JP19H02286 for the second author is fully acknowledged.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Replication of results
Main source codes used for solving two design examples in Section 5 are available online at https://github.com/BachDo17/mixGP.
Additional information
Responsible Editor: Jianbin Du
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1 Gaussian process model
Based on the training data set \( \mathcal{D}=\left\{\mathbf{X},\mathbf{y}\right\}={\left\{{\mathbf{x}}_i,{y}_i\right\}}_{i=1}^N \), we seek to construct an input-output mapping y = f(x): ℝd → ℝ, where f(x) is an unknown regression function.
A semi-parametric GPM defines f(x) using the following probabilistic regression model (Rasmussen and Williams 2006):

\( f\left(\mathbf{x}\right)={\mathbf{h}}^T\left(\mathbf{x}\right)\boldsymbol{\upbeta} +Z\left(\mathbf{x}\right) \)

where h(x) = [h1(x), …, hq(x)]T is a q-dimensional vector of known basis functions of x; q is derived from d according to the form of the basis functions; β = [β1, …, βq]T is a vector of unknown coefficients; hT(x)β represents the mean function of f(x); and Z(x) is a zero-mean Gaussian process modeling the residual.
Since the prediction performance of a GPM is governed mainly by its covariance function (Rasmussen and Williams 2006), the basis function in the mean function is not the main focus of the method, and there is no universal criterion for choosing it. A zero basis function is often used to avoid expensive computations. However, there are several reasons for explicitly using a non-zero mean function, such as the interpretability of the regression model and the convenience of expressing prior information (Rasmussen and Williams 2006). For this purpose, it is convenient to describe the GP mean function using a few fixed basis functions, such as linear or quadratic functions. In this study, the basis function is a quadratic function of the input variables x.
The GPM assumes that the marginal likelihood p(y| X) is an N-variate Gaussian with mean vector \( \mathbf{f}={\left\{f\left({\mathbf{x}}_i\right)\right\}}_{i=1}^N \) and a covariance matrix determined from the following two observations. First, the (i, j)th element of the covariance matrix expresses the similarity between the two values f(xi) and f(xj), both of which are unknown and drawn from p(y| X). Second, the regression function f(x) is expected to be smooth, i.e., a small variation in x leads to a small variation in f(x). Therefore, the similarity between the two input vectors xi and xj, which are known in advance, can be used to characterize the similarity between f(xi) and f(xj), which are unknown.
Let the PDF p(Z| X) of the vector of residuals \( \mathbf{Z}={\left\{Z\left({\mathbf{x}}_i\right)\right\}}_{i=1}^N \) be an N-variate Gaussian with zero mean and a covariance matrix expressing the similarity between any two input vectors as

\( p\left(\mathbf{Z}|\mathbf{X}\right)=\mathcal{N}\left(\mathbf{Z}|\mathbf{0},\mathbf{K}\right) \) (A.2)

where K ∈ ℝN × N is the covariance matrix whose (i, j)th element is Kij = k(xi, xj), and k is a positive definite kernel function expressing the similarity between the two input vectors xi and xj. This study uses the squared exponential kernel as
where θ = {θy, θl} are unknown parameters of the kernel function.
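The exact form of the kernel is not reproduced above; the following NumPy sketch assumes the common parameterization \( k\left(\mathbf{x},{\mathbf{x}}^{\prime}\right)={\theta}_y^2\exp \left(-{\left\Vert \mathbf{x}-{\mathbf{x}}^{\prime}\right\Vert}^2/\left(2{\theta}_l^2\right)\right) \), with θy an output scale and θl a single shared length scale (the paper may instead use one length scale per input dimension):

```python
import numpy as np

def se_kernel(X1, X2, theta_y=1.0, theta_l=1.0):
    """Squared exponential kernel matrix:
    K[i, j] = theta_y**2 * exp(-||x_i - x_j||**2 / (2 * theta_l**2))."""
    # Pairwise squared Euclidean distances between the rows of X1 and X2
    d2 = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return theta_y**2 * np.exp(-np.maximum(d2, 0.0) / (2.0 * theta_l**2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = se_kernel(X, X)   # 3x3 symmetric positive definite covariance matrix
```

A separate length scale per input dimension (automatic relevance determination) is obtained by dividing each coordinate difference by its own length scale before summing the squares.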
Since the covariance matrix of p(y| X) is identical to that of p(Z| X) in (A.2), the marginal likelihood can be represented by

\( p\left(\mathbf{y}|\mathbf{X}\right)=\mathcal{N}\left(\mathbf{y}|\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta},\mathbf{K}\right) \)
where H(X) = [hT(x1), …, hT(xN)]T, and H(X)β is the mean vector of f.
The coefficient vector β is the least-squares solution as follows:
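Equation (A.5) is not reproduced above; for the Gaussian marginal likelihood with mean H(X)β and covariance K, the estimate referred to in the text presumably takes the standard generalized least-squares form:

```latex
\hat{\boldsymbol{\beta}}
  = \left(\mathbf{H}^{T}\mathbf{K}^{-1}\mathbf{H}\right)^{-1}
    \mathbf{H}^{T}\mathbf{K}^{-1}\mathbf{y},
\qquad \mathbf{H} = \mathbf{H}(\mathbf{X})
```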
To determine the kernel parameters θ, the marginal likelihood, or equivalently, its logarithm is maximized with respect to θ as (Rasmussen and Williams 2006)
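Equation (A.6) is likewise not reproduced; up to parameterization, the log marginal likelihood of the Gaussian model above has the standard form (with K depending on θ):

```latex
\mathcal{L}(\boldsymbol{\theta})
  = -\tfrac{1}{2}\left(\mathbf{y}-\mathbf{H}\boldsymbol{\beta}\right)^{T}
      \mathbf{K}^{-1}\left(\mathbf{y}-\mathbf{H}\boldsymbol{\beta}\right)
    -\tfrac{1}{2}\log\left|\mathbf{K}\right|
    -\tfrac{N}{2}\log 2\pi
```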
Substituting β in (A.5) into (A.6), \( \mathcal{L} \) becomes a function of θ and is maximized using an optimization algorithm. In this study, the DACE toolbox (Lophaven et al. 2002) is used to determine θ.
Once θ and β are determined, the responses \( {\mathbf{y}}^{\ast}={\left\{{y}_l^{\ast}\right\}}_{l=1}^M \) for a new test set of M input vectors \( {\mathbf{X}}^{\ast}={\left\{{\mathbf{x}}_l^{\ast}\right\}}_{l=1}^M \) are predicted by establishing the joint PDF of y ∣ X and y∗ ∣ X∗ as
where K∗ ∈ ℝN × M with \( {K}_{il}^{\ast }=k\left({\mathbf{x}}_i,{\mathbf{x}}_l^{\ast}\ \right) \), and K∗∗ ∈ ℝM × M with \( {K}_{lh}^{\ast \ast }=k\left({\mathbf{x}}_l^{\ast},{\mathbf{x}}_h^{\ast}\ \right) \).
Applying the rule of the posterior conditional to the joint PDF in (A.7), the conditional PDF used to predict y∗ for the test set X∗ is given by (Rasmussen and Williams 2006)

\( p\left({\mathbf{y}}^{\ast}|{\mathbf{X}}^{\ast},\mathcal{D}\right)=\mathcal{N}\left({\mathbf{y}}^{\ast}|{\boldsymbol{\upmu}}^{\ast},{\boldsymbol{\Sigma}}^{\ast}\right) \)

where

\( {\boldsymbol{\upmu}}^{\ast}=\mathbf{H}\left({\mathbf{X}}^{\ast}\right)\boldsymbol{\upbeta} +{\mathbf{K}}^{\ast T}{\mathbf{K}}^{-1}\left(\mathbf{y}-\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta} \right),\qquad {\boldsymbol{\Sigma}}^{\ast}={\mathbf{K}}^{\ast \ast }-{\mathbf{K}}^{\ast T}{\mathbf{K}}^{-1}{\mathbf{K}}^{\ast} \)
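These prediction formulas can be exercised numerically. The sketch below implements the standard semi-parametric GP posterior under the assumptions above; the helper names (`gp_predict`, `se`, `lin`) are illustrative, not from the paper:

```python
import numpy as np

def gp_predict(X, y, Xs, kernel, basis):
    """Posterior mean and covariance of a semi-parametric GP at test points Xs.

    Standard results (cf. Rasmussen and Williams 2006):
      beta = (H^T K^-1 H)^-1 H^T K^-1 y        (generalized least squares)
      mean = H* beta + K*^T K^-1 (y - H beta)
      cov  = K** - K*^T K^-1 K*
    """
    K = kernel(X, X) + 1e-10 * np.eye(len(X))   # jitter for numerical stability
    Ks, Kss = kernel(X, Xs), kernel(Xs, Xs)
    H, Hs = basis(X), basis(Xs)
    Kinv = np.linalg.inv(K)
    beta = np.linalg.solve(H.T @ Kinv @ H, H.T @ Kinv @ y)
    mean = Hs @ beta + Ks.T @ Kinv @ (y - H @ beta)
    cov = Kss - Ks.T @ Kinv @ Ks
    return mean, cov

# Noise-free demo: the posterior mean interpolates the training targets
se = lambda A, B: np.exp(-0.5 * (A[:, None, 0] - B[None, :, 0]) ** 2)
lin = lambda A: np.hstack([np.ones((len(A), 1)), A])   # linear basis h(x) = [1, x]
X = np.array([[0.0], [1.0], [2.0]])
y = np.array([0.0, 1.0, 4.0])
m, S = gp_predict(X, y, X, se, lin)
```

For noise-free training data, the posterior mean reproduces the training targets at the training points and the posterior variance there collapses to (near) zero, which is the interpolation property exploited by the surrogate.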
Appendix 2 Clustering training set using Gaussian mixture model
Clustering a training set aims at grouping similar samples into independent subsets whose members share a common property. The two fundamental steps of clustering are measuring the similarity of the samples and selecting a clustering algorithm; different similarity measures and clustering algorithms can be found in Xu and Wunsch (2005). Here, we assume that two training samples are similar if they emerge from the same PDF. Therefore, it is convenient to split the joint PDF of the input-output variables p(x, y) into different Gaussian components using the GMM (Hastie et al. 2009), treat each component as a subset, and then distribute the training samples into the subsets accordingly.
The GMM describes the joint PDF p(x, y) by a convex combination of Gaussians as follows:

\( p\left(\mathbf{x},y|\boldsymbol{\Theta} \right)={\sum}_{k=1}^K{\pi}_k\phi \left(\mathbf{x},y|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right) \)

where ϕ(x, y| μk, Σk) denotes the kth (d + 1)-variate Gaussian; K is the number of Gaussians; and \( \boldsymbol{\Theta} ={\left\{{\pi}_k,{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right\}}_{k=1}^K \) are the unknown parameters of the GMM, with πk, μk, and Σk representing the mixing proportion, mean vector, and covariance matrix of the kth Gaussian, respectively.
Let z = [z1, …, zN]T denote a latent random vector, where zi ∈ {1, …, K}, and let zik indicate whether the sample (xi, yi) belongs to the kth Gaussian (or the kth subset), i.e., zik = 1 if zi = k and zik = 0 otherwise.
Since z is unknown in advance, zik cannot be specified exactly. Instead, its expectation, denoted by \( \mathbbm{E}\left[{z}_{ik}\right] \), can be determined by using Bayes’ rule, in which p(zi = k| Θ) = πk is the prior and p(xi, yi| zi = k, Θ) = ϕ(xi, yi| μk, Σk) is the likelihood, giving

\( \mathbbm{E}\left[{z}_{ik}\right]=p\left({z}_i=k|{\mathbf{x}}_i,{y}_i,\boldsymbol{\Theta} \right)=\frac{\pi_k\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right)}{\sum_{j=1}^K{\pi}_j\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_j,{\boldsymbol{\Sigma}}_j\right)} \) (B.7)
To determine Θ, the following log-likelihood \( {\mathcal{L}}_c \) of the training set is maximized using an iterative expectation-maximization (EM) algorithm (Hastie et al. 2009).
The EM algorithm starts with initial parameters \( {\boldsymbol{\Theta}}^{(0)}={\left\{{\pi}_k^{(0)},{\boldsymbol{\upmu}}_k^{(0)},{\boldsymbol{\Sigma}}_k^{(0)}\right\}}_{k=1}^K \), computes \( \mathbbm{E}\left[{z}_{ik}\right] \) using (B.7), maximizes \( {\mathcal{L}}_c \) in (B.8) with respect to Θ to obtain a new value of Θ, and moves on to the next iteration with the new Θ. For a given K, the initial mixing proportions \( {\pi}_k^{(0)} \) are uniform, the initial mean vectors \( {\boldsymbol{\upmu}}_k^{(0)} \) are randomly selected from K vectors of the training set, and the initial covariance matrices \( {\boldsymbol{\Sigma}}_k^{(0)} \) are diagonal, where the ith diagonal element is the variance of the ith variable. The algorithm is guaranteed to converge because \( {\mathcal{L}}_c \) never decreases from one iteration to the next. Detailed derivations and convergence properties of the EM algorithm can be found in Hastie et al. (2009). Here, we summarize the two main steps of the EM algorithm as follows:
- E step: compute the expectations \( \mathbbm{E}\left[{z}_{ik}\right] \) for all i = 1, …, N and k = 1, …, K using (B.7) with the current parameters Θ(t).
- M step: update the parameters Θ(t + 1) by maximizing \( {\mathcal{L}}_c \) with the expectations held fixed.

where t denotes the iteration counter and \( {\mathbf{d}}_i={\left[{\mathbf{x}}_i^T,{y}_i\right]}^T \).
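The update equations of the two steps are not reproduced above; for a GMM they take the standard form, written here in the notation \( {\mathbf{d}}_i={\left[{\mathbf{x}}_i^T,{y}_i\right]}^T \) (a reconstruction, not copied from the paper):

```latex
% E step at iteration t: responsibilities
\mathbb{E}\big[z_{ik}\big]^{(t)}
  = \frac{\pi_k^{(t)}\,\phi\big(\mathbf{d}_i \,\big|\, \boldsymbol{\mu}_k^{(t)}, \boldsymbol{\Sigma}_k^{(t)}\big)}
         {\sum_{j=1}^{K} \pi_j^{(t)}\,\phi\big(\mathbf{d}_i \,\big|\, \boldsymbol{\mu}_j^{(t)}, \boldsymbol{\Sigma}_j^{(t)}\big)}

% M step: re-estimate the parameters with the responsibilities held fixed
N_k = \sum_{i=1}^{N} \mathbb{E}\big[z_{ik}\big]^{(t)}, \qquad
\pi_k^{(t+1)} = \frac{N_k}{N}, \qquad
\boldsymbol{\mu}_k^{(t+1)} = \frac{1}{N_k}\sum_{i=1}^{N} \mathbb{E}\big[z_{ik}\big]^{(t)}\,\mathbf{d}_i,

\boldsymbol{\Sigma}_k^{(t+1)}
  = \frac{1}{N_k}\sum_{i=1}^{N} \mathbb{E}\big[z_{ik}\big]^{(t)}
    \big(\mathbf{d}_i - \boldsymbol{\mu}_k^{(t+1)}\big)
    \big(\mathbf{d}_i - \boldsymbol{\mu}_k^{(t+1)}\big)^{T}
```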
After Θ is obtained, the probability that the training sample (xi, yi) belongs to the kth Gaussian, i.e., \( \mathbbm{E}\left[{z}_{ik}\right] \), is determined using (B.7), thereby producing K values of \( \mathbbm{E}\left[{z}_{ik}\right] \). The maximum value among these K values indicates which Gaussian (or subset) contains the sample (xi, yi).
By projecting all subsets in the (d + 1)-dimensional space onto the d-dimensional space of input variables, the probability that an input variable x emerges from the kth projected subset can be determined (without knowing the corresponding output variable y) by

\( p\left(z=k|\mathbf{x},\boldsymbol{\Theta} \right)=\frac{\pi_k\phi \left(\mathbf{x}|{\boldsymbol{\upmu}}_{X,k},{\boldsymbol{\Sigma}}_{XX,k}\right)}{\sum_{j=1}^K{\pi}_j\phi \left(\mathbf{x}|{\boldsymbol{\upmu}}_{X,j},{\boldsymbol{\Sigma}}_{XX,j}\right)} \)

where μX, k and ΣXX, k are derived from (B.3) and (B.4), respectively.
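A minimal NumPy sketch of this weight computation, assuming the input variables occupy the first d entries of the joint mean vectors and covariance matrices (marginalizing a Gaussian keeps the corresponding sub-vector and block); the function names are illustrative:

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Density of a multivariate Gaussian evaluated at x."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(Sigma, diff)
    return np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(Sigma))

def input_space_weights(x, pis, mus, Sigmas, d):
    """Probability that input x belongs to each subset projected onto input space."""
    dens = np.array([pi * gaussian_pdf(x, mu[:d], Sig[:d, :d])
                     for pi, mu, Sig in zip(pis, mus, Sigmas)])
    return dens / dens.sum()   # normalize so the weights sum to one

# Two well-separated components in (x1, x2, y) space
pis = [0.5, 0.5]
mus = [np.zeros(3), np.array([5.0, 5.0, 0.0])]
Sigmas = [np.eye(3), np.eye(3)]
w = input_space_weights(np.array([0.0, 0.0]), pis, mus, Sigmas, d=2)
```

A test input near the first component's projected mean receives nearly all of the weight, so the corresponding GPM dominates the MGP prediction there.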
As a model selection task, the number of Gaussians K should be determined before employing the EM algorithm to obtain Θ. Detailed discussions of criteria for selecting the best model among available GMMs can be found in McLachlan and Rathnayake (2014). Here, the Bayesian information criterion (BIC) is chosen because its effectiveness has been confirmed by many authors in the statistical learning field (McLachlan and Rathnayake 2014). To produce a set of GMM candidates for the model selection, we simply increase K step by step from 1 to 50. The best GMM among these candidates minimizes BIC as (Hastie et al. 2009)

\( \mathrm{BIC}=-2\mathcal{L}+{n}_p\log N \)
where \( \mathcal{L}={\sum}_{i=1}^N\log \left[{\sum}_{k=1}^K{\pi}_k\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right)\right] \) is the log-likelihood of the training set and np is the number of free parameters required for a total of K Gaussians.
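A minimal sketch of the K-selection loop using scikit-learn's `GaussianMixture`, whose `bic` method implements the same criterion; the candidate range is shortened to 1–7 here (the paper scans K from 1 to 50), and the data are synthetic:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic (x, y) training set drawn from two well-separated Gaussians
D = np.vstack([rng.normal(0.0, 1.0, size=(200, 3)),
               rng.normal(10.0, 1.0, size=(200, 3))])

# Increase K step by step and keep the GMM that minimizes BIC
fits = [GaussianMixture(n_components=k, random_state=0).fit(D)
        for k in range(1, 8)]
best = min(fits, key=lambda g: g.bic(D))
best_k = best.n_components
```

For this clearly bimodal data set, BIC selects two components; on real training sets the selected K balances fit against the number of free parameters np.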
Cite this article
Do, B., Ohsaki, M. & Yamakawa, M. Sequential mixture of Gaussian processes and saddlepoint approximation for reliability-based design optimization of structures. Struct Multidisc Optim 64, 625–648 (2021). https://doi.org/10.1007/s00158-021-02855-w