Sequential mixture of Gaussian processes and saddlepoint approximation for reliability-based design optimization of structures

  • Research Paper
  • Published in Structural and Multidisciplinary Optimization

Abstract

This paper presents an efficient optimization procedure for solving the reliability-based design optimization (RBDO) problem of structures under aleatory uncertainty in material properties and external loads. To reduce the number of structural analysis calls during the optimization process, mixture models of Gaussian processes (MGPs) are constructed for prediction of structural responses. The MGP extends the application of the Gaussian process model (GPM) to large training sets so that the input variable space is well covered, the training time is significantly reduced, and the overall accuracy of the regression models is improved. A large training set of the input variables and associated structural responses is first generated and split into independent subsets of similar training samples using the Gaussian mixture model clustering method. The GPM for each subset is then developed to produce a set of independent GPMs that together define the MGP as their weighted average. The weight vector computed for a specified input variable contains the probability that the input variable belongs to the projection of each subset onto the input variable space. To calculate the failure probabilities and their inverse values required during the process of solving the RBDO problem, a novel saddlepoint approximation is proposed based on the first three cumulants of random variables. The original RBDO problem is replaced by a sequential deterministic optimization (SDO) problem in which the MGPs serve as surrogates for the limit-state functions in the probabilistic constraints of the RBDO problem. The SDO problem is strategically solved to explore a promising region that may contain the optimal solution, improve the accuracy of the MGPs in that region, and produce a reliable solution. Two design examples of a truss and a steel frame demonstrate the efficiency of the proposed optimization procedure.


Acknowledgements

Financial support from the Japan International Cooperation Agency (JICA) for the first author and JSPS KAKENHI No. JP19H02286 for the second author is fully acknowledged.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bach Do.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Replication of results

Main source codes used for solving two design examples in Section 5 are available online at https://github.com/BachDo17/mixGP.

Additional information

Responsible Editor: Jianbin Du

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1 Gaussian process model

Based on the training data set \( \mathcal{D}=\left\{\mathbf{X},\mathbf{y}\right\}={\left\{{\mathbf{x}}_i,{y}_i\right\}}_{i=1}^N \), we seek to construct an input-output mapping y = f(x): ℝ^d → ℝ, where f(x) is an unknown regression function.

A semi-parametric GPM defines f(x) using the following probabilistic regression model (Rasmussen and Williams 2006):

$$ f\left(\mathbf{x}\right)=\sum \limits_{i=1}^q{h}_i\left(\mathbf{x}\right){\beta}_i+Z\left(\mathbf{x}\right)={\mathbf{h}}^T\left(\mathbf{x}\right)\boldsymbol{\upbeta} +Z\left(\mathbf{x}\right) $$
(A.1)

where h(x) = [h1(x), …, hq(x)]^T is a q-dimensional vector of known basis functions of x; q is derived from d according to the form of the basis functions; β = [β1, …, βq]^T is a vector of unknown coefficients; h^T(x)β represents the mean function of f(x); and Z(x) is a zero-mean Gaussian process modeling the residual.

Since the prediction performance of the GPM is governed mainly by the covariance function (Rasmussen and Williams 2006), the choice of basis functions in the mean function is not the main focus of the method, and there is no universal criterion for selecting them. A zero mean function is often used to avoid additional computations in the GPM. However, there are several reasons for explicitly using a non-zero mean function, such as interpretability of the regression model and convenience in expressing prior information (Rasmussen and Williams 2006). For this purpose, it is convenient to describe the GP mean function using a few fixed basis functions such as linear or quadratic functions. In this study, the basis functions are quadratic functions of the input variables x.
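For illustration, the following minimal Python sketch assembles such a quadratic basis vector h(x). The exact composition (in particular, whether cross terms x_i x_j are included) is not specified above, so the form below is an assumption rather than the paper's implementation.

```python
import numpy as np

def quadratic_basis(x):
    """Assemble a quadratic basis vector h(x) = [1, x_1..x_d, x_1^2..x_d^2].

    Note: this variant omits cross terms x_i * x_j; whether the paper's
    implementation includes them is not stated, so this is an assumption.
    """
    x = np.asarray(x, dtype=float)
    return np.concatenate(([1.0], x, x**2))  # length q = 2d + 1

# Example: d = 3 input variables -> q = 7 basis functions
h = quadratic_basis([0.5, -1.2, 2.0])
print(h.shape)  # (7,)
```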

The GPM assumes that the marginal likelihood, i.e., p(y| X), is an N-variate Gaussian with a mean vector \( \mathbf{f}={\left\{f\left({\mathbf{x}}_i\right)\right\}}_{i=1}^N \) and a covariance matrix determined from the following two considerations. First, the (i, j)th element of the covariance matrix expresses the similarity between the two function values f(xi) and f(xj), both of which are unknown and drawn from p(y| X). Second, the regression function f(x) is expected to be smooth, i.e., a small variation in x leads to a small variation in f(x), and vice versa. Therefore, the similarity between two input vectors xi and xj, which are known in advance, can be used to characterize the similarity between the two function values f(xi) and f(xj), which are unknown in advance.

Let the PDF p(Z| X) of the vector of residuals \( \mathbf{Z}={\left\{Z\left({\mathbf{x}}_i\right)\right\}}_{i=1}^N \) be an N-variate Gaussian with a zero mean and a covariance matrix explaining the similarity between any two input vectors as

$$ p\left(\mathbf{Z}|\mathbf{X}\right)=\mathcal{N}\left(\mathbf{Z}|\mathbf{0},\mathbf{K}\right) $$
(A.2)

where K ∈ ℝ^{N×N} is the covariance matrix whose (i, j)th element is Kij = k(xi, xj), and k(·, ·) is a positive definite kernel function quantifying the similarity between the two input vectors xi and xj. This study uses the squared exponential kernel as

$$ k\left({\mathbf{x}}_i,{\mathbf{x}}_j\ \right)={\theta}_{\mathrm{y}}^2\exp \left[-\frac{{\left({\mathbf{x}}_i-{\mathbf{x}}_j\right)}^T\left({\mathbf{x}}_{\mathrm{i}}-{\mathbf{x}}_j\right)}{2{\theta}_{\mathrm{l}}^2}\right] $$
(A.3)

where θ = {θy, θl} are unknown parameters of the kernel function.
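For illustration, a minimal sketch of the kernel in (A.3) and the resulting covariance matrix is given below; the helper name se_kernel and the vectorized layout are choices made here, not part of the paper's code.

```python
import numpy as np

def se_kernel(Xa, Xb, theta_y, theta_l):
    """Squared exponential kernel (A.3):
    k(x_i, x_j) = theta_y^2 * exp(-||x_i - x_j||^2 / (2 * theta_l^2)).

    Xa: (Na, d) array, Xb: (Nb, d) array; returns the (Na, Nb) kernel matrix.
    """
    sq_dists = np.sum((Xa[:, None, :] - Xb[None, :, :]) ** 2, axis=-1)
    return theta_y**2 * np.exp(-0.5 * sq_dists / theta_l**2)

# Example: covariance matrix K of N = 4 training inputs in d = 2 dimensions
X = np.random.rand(4, 2)
K = se_kernel(X, X, theta_y=1.0, theta_l=0.3)   # (4, 4), positive definite
```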

Since the covariance matrix of p(y| X) is identical to that of p(Z| X) in (A.2), the marginal likelihood can be represented by

$$ p\left(\mathbf{y}|\mathbf{X}\right)=\mathcal{N}\left(\mathbf{y}|\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta}, \mathbf{K}\right) $$
(A.4)

where H(X) = [hT(x1), …, hT(xN)]T, and H(X)β is the mean vector of f.

The coefficient vector β is obtained as the generalized least-squares solution as follows:

$$ \boldsymbol{\upbeta} \left(\boldsymbol{\uptheta} \right)={\left[{\mathbf{H}}^T\left(\mathbf{X}\right){\mathbf{K}}^{-1}\left(\boldsymbol{\uptheta} \right)\mathbf{H}\left(\mathbf{X}\right)\right]}^{-1}{\mathbf{H}}^T\left(\mathbf{X}\right){\mathbf{K}}^{-1}\left(\boldsymbol{\uptheta} \right)\mathbf{y} $$
(A.5)

To determine the kernel parameters θ, the marginal likelihood, or equivalently its logarithm, is maximized with respect to θ as (Rasmussen and Williams 2006)

$$ \mathcal{L}\left(\boldsymbol{\upbeta}, \boldsymbol{\uptheta} \right)=\log p\left(\mathbf{y}|\mathbf{X}\right)=-\frac{1}{2}{\left[\mathbf{y}-\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta} \right]}^T{\mathbf{K}}^{-1}\left(\boldsymbol{\uptheta} \right)\left[\mathbf{y}-\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta} \right]-\frac{1}{2}\log \left|\mathbf{K}\left(\boldsymbol{\uptheta} \right)\right|-\frac{1}{2}N\log 2\uppi $$
(A.6)

Substituting β in (A.5) into (A.6), \( \mathcal{L} \) becomes a function of θ and is maximized using an optimization algorithm. In this study, the DACE toolbox (Lophaven et al. 2002) is used to determine θ.
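For illustration, the following sketch evaluates β(θ) from (A.5) and the log marginal likelihood (A.6) for a given θ, reusing the se_kernel and quadratic_basis helpers from the sketches above. The small jitter added to K for numerical stability and the use of a generic optimizer are assumptions; the paper itself relies on the DACE toolbox.

```python
import numpy as np

def log_marginal_likelihood(theta, X, y, basis):
    """Evaluate beta(theta) from (A.5) and L(beta, theta) from (A.6).

    theta = (theta_y, theta_l); basis maps a single input x to h(x).
    Returns (log-likelihood, beta). The jitter on K is an assumption
    added here for numerical stability.
    """
    theta_y, theta_l = theta
    N = X.shape[0]
    H = np.vstack([basis(x) for x in X])                      # (N, q)
    K = se_kernel(X, X, theta_y, theta_l) + 1e-10 * np.eye(N)
    K_inv_H = np.linalg.solve(K, H)
    K_inv_y = np.linalg.solve(K, y)
    beta = np.linalg.solve(H.T @ K_inv_H, H.T @ K_inv_y)      # (A.5)
    r = y - H @ beta
    _, logdetK = np.linalg.slogdet(K)
    L = (-0.5 * r @ np.linalg.solve(K, r)
         - 0.5 * logdetK - 0.5 * N * np.log(2 * np.pi))       # (A.6)
    return L, beta

# theta can then be found by maximizing L, e.g. with scipy.optimize.minimize on -L.
```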

Once θ and β are determined, the responses \( {\mathbf{y}}^{\ast}={\left\{{y}_l^{\ast}\right\}}_{l=1}^M \) for a new test set of M input variables \( {\mathbf{X}}^{\ast}={\left\{{\mathbf{x}}_l^{\ast}\right\}}_{l=1}^M \) are predicted by establishing the joint PDF of y ∣ X and y∗ ∣ X∗ as

$$ p\left(\left[\begin{array}{c}\mathbf{y}\\ {}{\mathbf{y}}^{\ast }\end{array}\right]|\mathbf{X},{\mathbf{X}}^{\ast}\right)=\mathcal{N}\left(\left[\begin{array}{c}\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta} \\ {}\mathbf{H}\left({\mathbf{X}}^{\ast}\right)\boldsymbol{\upbeta} \end{array}\right],\left[\begin{array}{cc}\mathbf{K}& {\mathbf{K}}^{\ast }\\ {}{\mathbf{K}}^{\ast T}& {\mathbf{K}}^{\ast \ast }\end{array}\right]\right) $$
(A.7)

where K∗ ∈ ℝ^{N×M} with \( {K}_{il}^{\ast }=k\left({\mathbf{x}}_i,{\mathbf{x}}_l^{\ast}\right) \), and K∗∗ ∈ ℝ^{M×M} with \( {K}_{lh}^{\ast \ast }=k\left({\mathbf{x}}_l^{\ast},{\mathbf{x}}_h^{\ast}\right) \).

Applying the Gaussian conditioning rule to the joint PDF in (A.7), the conditional PDF used to predict y∗ for the test set X∗ is given by (Rasmussen and Williams 2006)

$$ p\left({\mathbf{y}}^{\ast }|\mathbf{X},\mathbf{y},{\mathbf{X}}^{\ast}\right)=\mathcal{N}\left({\mathbf{y}}^{\ast }|{\boldsymbol{\upmu}}_{{\mathbf{y}}^{\ast }},{\boldsymbol{\Sigma}}_{{\mathbf{y}}^{\ast }}\right) $$
(A.8)

where

$$ {\boldsymbol{\upmu}}_{{\mathbf{y}}^{\ast }}=\mathbf{H}\left({\mathbf{X}}^{\ast}\right)\boldsymbol{\upbeta} +{\mathbf{K}}^{\ast T}{\mathbf{K}}^{-1}\left[\mathbf{y}-\mathbf{H}\left(\mathbf{X}\right)\boldsymbol{\upbeta} \right] $$
(A.9)
$$ {\boldsymbol{\Sigma}}_{{\mathbf{y}}^{\ast }}={\mathbf{K}}^{\ast \ast }-{\mathbf{K}}^{\ast T}{\mathbf{K}}^{-1}{\mathbf{K}}^{\ast } $$
(A.10)
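For illustration, the predictive equations (A.9) and (A.10) may be evaluated as in the sketch below, again reusing the se_kernel and quadratic_basis helpers introduced above, with β and θ obtained as in (A.5)-(A.6); this is an assumption-level illustration rather than the DACE-based implementation used in the paper.

```python
import numpy as np

def gp_predict(X_test, X, y, beta, theta, basis):
    """Posterior predictive mean (A.9) and covariance (A.10)."""
    theta_y, theta_l = theta
    H = np.vstack([basis(x) for x in X])               # (N, q)
    H_star = np.vstack([basis(x) for x in X_test])     # (M, q)
    K = se_kernel(X, X, theta_y, theta_l) + 1e-10 * np.eye(len(X))
    K_star = se_kernel(X, X_test, theta_y, theta_l)    # (N, M)
    K_star2 = se_kernel(X_test, X_test, theta_y, theta_l)   # (M, M)
    resid = y - H @ beta
    mu = H_star @ beta + K_star.T @ np.linalg.solve(K, resid)   # (A.9)
    cov = K_star2 - K_star.T @ np.linalg.solve(K, K_star)       # (A.10)
    return mu, cov
```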

Appendix 2 Clustering the training set using the Gaussian mixture model

Clustering a training set aims at distributing similar samples into independent groups in which the samples share a general property. Clustering involves two fundamental steps: measuring the similarity of the samples and selecting a clustering algorithm. Different similarity measures and clustering algorithms can be found in Xu and Wunsch (2005). Here, we premise that two training samples are similar if they emerge from the same PDF. Therefore, it is convenient to split the joint PDF of the input-output variables p(x, y) into different Gaussian components using the GMM (Hastie et al. 2009), assign each component to a subset, and then distribute the training samples into the subsets accordingly.

The GMM describes the joint PDF p(x, y) by a convex combination of Gaussians as follows:

$$ p\left(\mathbf{x},y|\boldsymbol{\Theta} \right)=\sum \limits_{k=1}^K{\pi}_k\phi \left(\mathbf{x},y|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right) $$
(B.1)
$$ \sum \limits_{k=1}^K{\pi}_k=1,\kern0.75em 0\le {\pi}_k\le 1 $$
(B.2)
$$ {\boldsymbol{\upmu}}_k={\left[{\boldsymbol{\upmu}}_{\mathbf{X},k}^T,{\mu}_{y,k}\right]}^T $$
(B.3)
$$ {\boldsymbol{\Sigma}}_k=\left[\begin{array}{cc}{\boldsymbol{\Sigma}}_{\mathbf{X}\mathbf{X},k}& {\boldsymbol{\Sigma}}_{\mathbf{X}y,k}\\ {}{\boldsymbol{\Sigma}}_{y\mathbf{X},k}& {\boldsymbol{\Sigma}}_{yy,k}\end{array}\right] $$
(B.4)

where ϕ(x, y| μk, Σk) denotes the kth (d + 1)-variate Gaussian; K is the number of Gaussians; and \( \boldsymbol{\Theta} ={\left\{{\pi}_k,{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right\}}_{k=1}^K \) are the unknown parameters of the GMM, where πk, μk, and Σk represent the mixing proportion, mean vector, and covariance matrix of the kth Gaussian, respectively.
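For illustration, the mixture density (B.1) can be evaluated directly with standard routines; the sketch below uses scipy.stats.multivariate_normal, and the list-based parameter layout is a choice made here rather than part of the paper's code.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_pdf(d, pis, mus, Sigmas):
    """Evaluate the GMM density (B.1) at a joint point d = [x; y].

    pis: K mixing proportions; mus: K mean vectors as in (B.3);
    Sigmas: K covariance matrices as in (B.4).
    """
    return sum(pi_k * multivariate_normal.pdf(d, mean=mu_k, cov=Sig_k)
               for pi_k, mu_k, Sig_k in zip(pis, mus, Sigmas))
```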

Let z = [z1, …, zN]^T denote a latent random vector, where zi ∈ {1, …, K}, and let zik indicate whether the sample (xi, yi) belongs to the kth Gaussian (or the kth subset), i.e.,

$$ {z}_{ik}=\left\{\begin{array}{c}1,\kern0.75em \mathrm{if}\ {z}_i=k\\ {}0,\kern0.75em \mathrm{if}\ {z}_i\ne k\end{array}\right. $$
(B.5)

Since z is unknown in advance, zik cannot be specified exactly. Instead, its expectation, denoted by \( \mathbbm{E}\left[{z}_{ik}\right] \), can be determined by using Bayes’ rule as follows:

$$ \mathbbm{E}\left[{z}_{ik}\right]=\mathrm{\mathbb{P}}\left[{z}_i=k|{\mathbf{x}}_i,{y}_i,\boldsymbol{\Theta} \right]=\frac{p\left({z}_i=k|\boldsymbol{\Theta} \right)p\left({\mathbf{x}}_i,{y}_i|{z}_i=k,\boldsymbol{\Theta} \right)}{\sum_{h=1}^Kp\left({z}_i=h|\boldsymbol{\Theta} \right)p\left({\mathbf{x}}_i,{y}_i|{z}_i=h,\boldsymbol{\Theta} \right)} $$
(B.6)

where p(zi = k| Θ) = πk is the prior and p(xi, yi| zi = k, Θ) = ϕ(xi, yi| μk, Σk) is the likelihood. Thus, (B.6) can be rewritten as

$$ \mathbbm{E}\left[{z}_{ik}\right]=\mathrm{\mathbb{P}}\left[{z}_i=k|{\mathbf{x}}_i,{y}_i,\boldsymbol{\Theta} \right]=\frac{\pi_k\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right)}{\sum_{h=1}^K{\pi}_h\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_h,{\boldsymbol{\Sigma}}_h\right)} $$
(B.7)

To determine Θ, the following expected complete-data log-likelihood \( {\mathcal{L}}_c \) of the training set is maximized using an iterative expectation-maximization (EM) algorithm (Hastie et al. 2009).

$$ {\mathcal{L}}_c=\sum \limits_{i=1}^N\sum \limits_{k=1}^K\mathbbm{E}\left[{z}_{ik}\right]\log \left[{\pi}_k\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right)\right] $$
(B.8)

The EM algorithm starts with initial parameters \( {\boldsymbol{\Theta}}^{(0)}={\left\{{\pi}_k^{(0)},{\boldsymbol{\upmu}}_k^{(0)},{\boldsymbol{\Sigma}}_k^{(0)}\right\}}_{k=1}^K \), computes \( \mathbbm{E}\left[{z}_{ik}\right] \) using (B.7), maximizes \( {\mathcal{L}}_c \) in (B.8) with respect to Θ to obtain a new value of Θ, and moves on to the next iteration with the new Θ. For a given K, the initial mixing proportions \( {\pi}_k^{(0)} \) are uniform, the initial mean vectors \( {\boldsymbol{\upmu}}_k^{(0)} \) are randomly selected as K vectors from the training set, and the initial covariance matrices \( {\boldsymbol{\Sigma}}_k^{(0)} \) are diagonal, where the ith diagonal element is the sample variance of the ith variable of the training set. The algorithm is guaranteed to converge because \( {\mathcal{L}}_c \) never decreases from one iteration to the next. Detailed derivations and convergence properties of the EM algorithm can be found in Hastie et al. (2009). Here, we summarize the two main steps of the EM algorithm as follows (a brief illustrative sketch of one iteration is given after the update formulas):

  • E step:

$$ \mathbbm{E}\left[{z}_{ik}^{(t)}\right]=\frac{\pi_k^{(t)}\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k^{(t)},{\boldsymbol{\Sigma}}_k^{(t)}\right)}{\sum_{h=1}^K{\pi}_h^{(t)}\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_h^{(t)},{\boldsymbol{\Sigma}}_h^{(t)}\right)} $$
(B.9)
  • M step:

$$ {\pi}_k^{\left(t+1\right)}=\frac{\sum_{i=1}^N\mathbbm{E}\left[{z}_{ik}^{(t)}\right]}{N} $$
(B.10)
$$ {\boldsymbol{\upmu}}_k^{\left(t+1\right)}=\frac{\sum_{i=1}^N\mathbbm{E}\left[{z}_{ik}^{(t)}\right]{\mathbf{d}}_i}{\sum_{i=1}^N\mathbbm{E}\left[{z}_{ik}^{(t)}\right]} $$
(B.11)
$$ {\boldsymbol{\Sigma}}_k^{\left(t+1\right)}=\frac{\sum_{i=1}^N\mathbbm{E}\left[{z}_{ik}^{(t)}\right]\left({\mathbf{d}}_i-{\boldsymbol{\upmu}}_k^{\left(t+1\right)}\right){\left({\mathbf{d}}_i-{\boldsymbol{\upmu}}_k^{\left(t+1\right)}\right)}^T}{\sum_{i=1}^N\mathbbm{E}\left[{z}_{ik}^{(t)}\right]} $$
(B.12)

where t denotes the iteration counter and \( {\mathbf{d}}_i={\left[{\mathbf{x}}_i^T,{y}_i\right]}^T \).
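For illustration, the sketch below implements the initialization described above and one EM iteration, i.e., the E step (B.9) followed by the M step (B.10)-(B.12). The convergence test, the iteration limit, and any covariance regularization are omitted here; a full implementation would wrap em_step in a loop.

```python
import numpy as np
from scipy.stats import multivariate_normal

def init_gmm(D, K, rng=np.random.default_rng(0)):
    """Initialization described in the text: uniform mixing proportions,
    means drawn as K random rows of the training set D = [X, y], and
    diagonal covariances built from the per-variable sample variances."""
    N, _ = D.shape
    pis = np.full(K, 1.0 / K)
    mus = D[rng.choice(N, K, replace=False)]
    Sigmas = np.array([np.diag(D.var(axis=0)) for _ in range(K)])
    return pis, mus, Sigmas

def em_step(D, pis, mus, Sigmas):
    """One EM iteration: E step (B.9), then M step (B.10)-(B.12)."""
    N, K = D.shape[0], len(pis)
    # E step: responsibilities E[z_ik]
    dens = np.column_stack([pis[k] * multivariate_normal.pdf(D, mus[k], Sigmas[k])
                            for k in range(K)])                # (N, K)
    resp = dens / dens.sum(axis=1, keepdims=True)              # (B.9)
    # M step
    Nk = resp.sum(axis=0)                                      # effective counts
    pis_new = Nk / N                                           # (B.10)
    mus_new = (resp.T @ D) / Nk[:, None]                       # (B.11)
    Sigmas_new = np.empty_like(Sigmas)
    for k in range(K):
        diff = D - mus_new[k]
        Sigmas_new[k] = (resp[:, k, None] * diff).T @ diff / Nk[k]   # (B.12)
    return pis_new, mus_new, Sigmas_new
```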

After Θ is obtained, the probability that the training sample (xi, yi) belongs to the kth Gaussian, i.e., \( \mathbbm{E}\left[{z}_{ik}\right] \), is determined using (B.7), thereby producing K values of \( \mathbbm{E}\left[{z}_{ik}\right] \). The maximum value among these K values indicates which Gaussian (or subset) contains the sample (xi, yi).

By projecting all subsets in (d + 1)-dimensional space onto the d-dimensional space of input variables, the probability that an input variable x emerges from the kth projected subset can be determined (without knowing the corresponding output variable y) by

$$ \mathrm{\mathbb{P}}\left[{z}_i=k|\mathbf{x},\boldsymbol{\Theta} \right]=\frac{\pi_k\phi \left(\mathbf{x}|{\boldsymbol{\upmu}}_{\mathbf{X},k},{\boldsymbol{\Sigma}}_{\mathbf{X}\mathbf{X},k}\right)}{\sum_{h=1}^K{\pi}_h\phi \left(\mathbf{x}|{\boldsymbol{\upmu}}_{\mathbf{X},h},{\boldsymbol{\Sigma}}_{\mathbf{X}\mathbf{X},h}\right)} $$
(B.13)

where μX, k and ΣXX, k are derived from (B.3) and (B.4), respectively.
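For illustration, (B.13) can be computed as in the sketch below. The slicing of μk and Σk into their input blocks assumes that the output variable y occupies the last coordinate of the joint vector, consistent with (B.3) and (B.4); these weights are the ones used to combine the component GPMs into the weighted-average MGP prediction.

```python
import numpy as np
from scipy.stats import multivariate_normal

def input_membership_weights(x, pis, mus, Sigmas):
    """Weights P[z = k | x, Theta] from (B.13).

    mus, Sigmas hold the joint (d+1)-variate parameters; y is assumed to be
    the last coordinate, so mu_{X,k} = mus[k][:-1] and
    Sigma_{XX,k} = Sigmas[k][:-1, :-1].
    """
    K = len(pis)
    w = np.array([pis[k] * multivariate_normal.pdf(x, mus[k][:-1],
                                                   Sigmas[k][:-1, :-1])
                  for k in range(K)])
    return w / w.sum()
```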

As a model selection task, the number of Gaussians K should be determined before employing the EM algorithm for obtaining Θ. Detailed discussions on criteria used for selecting the best model among available GMMs can be found in McLachlan and Rathnayake (2014). Here, the Bayesian information criterion (BIC) is chosen since its effectiveness has been confirmed by many authors in the statistical learning field (McLachlan and Rathnayake 2014). To produce a set of GMM candidates for the model selection, we simply increase K step by step from 1 to 50. The best GMM among these candidates minimizes BIC as (Hastie et al. 2009)

$$ \mathrm{BIC}=-\mathcal{L}+\frac{1}{2}{n}_{\mathrm{p}}\log N $$
(B.14)

where \( \mathcal{L}={\sum}_{i=1}^N\log \left[{\sum}_{k=1}^K{\pi}_k\phi \left({\mathbf{x}}_i,{y}_i|{\boldsymbol{\upmu}}_k,{\boldsymbol{\Sigma}}_k\right)\right] \) is the log-likelihood of the training set and np is the number of free parameters required for a total of K Gaussians.
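For illustration, the sketch below evaluates (B.14) for a fitted candidate GMM; the free-parameter count n_p assumes full covariance matrices, i.e., K − 1 mixing proportions, K(d + 1) mean components, and K(d + 1)(d + 2)/2 distinct covariance entries.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_bic(D, pis, mus, Sigmas):
    """BIC from (B.14) for a fitted GMM with full covariance matrices."""
    N, p = D.shape                       # p = d + 1 joint variables
    K = len(pis)
    dens = np.column_stack([pis[k] * multivariate_normal.pdf(D, mus[k], Sigmas[k])
                            for k in range(K)])
    loglik = np.log(dens.sum(axis=1)).sum()
    # free parameters: (K - 1) mixing proportions + K*p means + K*p*(p+1)/2 covariances
    n_p = (K - 1) + K * p + K * p * (p + 1) // 2
    return -loglik + 0.5 * n_p * np.log(N)

# The best K in 1..50 is the one whose fitted GMM gives the smallest BIC.
```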

About this article

Cite this article

Do, B., Ohsaki, M. & Yamakawa, M. Sequential mixture of Gaussian processes and saddlepoint approximation for reliability-based design optimization of structures. Struct Multidisc Optim 64, 625–648 (2021). https://doi.org/10.1007/s00158-021-02855-w
