An adaptive sparse polynomial dimensional decomposition based on Bayesian compressive sensing and cross-entropy

Research Paper published in Structural and Multidisciplinary Optimization

Abstract

The polynomial dimensional decomposition (PDD) is a powerful tool for uncertainty quantification. When dealing with complex or high-dimensional problems, the computational cost of the PDD is key to its application. This study proposes a novel sparse PDD method integrated with the Bayesian LASSO (least absolute shrinkage and selection operator) method and an adaptive polynomial basis updating method. Firstly, we construct the sparse PDD metamodel using an analytical Bayesian LASSO method, which is developed based on the framework of sparse Bayesian learning. The analytical method employs an iteration formula to calculate the maximum posterior estimation instead of the time-consuming Markov chain Monte Carlo (MCMC) sampling, and therefore can be used for refining the PDD model repeatedly. Secondly, to improve the performance of the analytical Bayesian LASSO, this paper proposes a cross-entropy-based method for updating the sparse PDD model, which allows the sequential augmentation of the polynomial bases and adaptively determines the polynomial degree and the maximum order of the component functions. The cross-entropy-based method can guarantee a small number of polynomial bases when refining the sparse model. Accordingly, the computational accuracy can be improved with the same sample size. We verify the proposed method using three numerical benchmark examples, and apply it to solve one complex practical engineering problem. The results show that the proposed sparse PDD method is a good choice for uncertainty quantification.


Replication of results

As comprehensive implementation details are provided in the paper, we are confident that the methodology is reproducible; therefore, no additional data or code are appended. Readers who are interested in the methodology and need further help reproducing the results are welcome to contact the corresponding author by email.

Abbreviations

ANOVA:

Analysis of variance

c :

The PDD coefficients

c :

The vector of PDD coefficients of the standard response

c 0 :

The vector of PDD coefficients of the original response

CDF:

Cumulative distribution function

D :

Cross-entropy

g :

Function responses

g, g s :

Vectors of original and standard function responses

g t :

The failure threshold

\(g_{\text{s}}^{*}\) :

The prediction of standard function response of a sample

H :

Design matrix

HDMR:

High-dimensional model representation

j u :

Multi-index set

LASSO:

Least absolute shrinkage and selection operator

MAP:

Maximum a posteriori estimation

MCMC:

Markov chain Monte Carlo

MCS:

Monte Carlo simulation

N :

The dimensionality of input variables

p, q :

Discrete probability distributions

P :

The degree of a multivariable orthogonal polynomial

PCE:

Polynomial chaos expansion

PDD:

Polynomial dimensional decomposition

PDF:

Probability density function

P f :

Failure probability

P t :

The truncation degree of the PDD

r, r o :

Results of the MCS and other methods

S :

The maximum dimensionality of component functions of HDMR

SPCE:

Sparse polynomial chaos expansion

u, v :

Index vectors

U :

The set of the active polynomial bases

UQ:

Uncertainty quantification

w(x):

The joint PDF of the input variables

x = [x 1, x 2, …, x N]:

N-dimensional input variables

ε, ε :

Prediction errors

θ, θ :

Standard variables corresponding to x

λ :

Hyperparameter of the hierarchical Bayesian model

μ :

Mean value of the prediction distribution

μ 0 :

Mean value of the function responses

ρ :

Hyperparameter of the hierarchical Bayesian model

σ 2 :

Variance of the likelihood function

σ 0 :

Standard deviation of the likelihood function

\(\sigma_{{\text{g}}}^{2}\) :

The global accuracy parameter

Σ :

The covariance matrix

τ :

Hyperparameter of the hierarchical Bayesian model

Φ, ϕ :

The orthogonal polynomial bases

ω :

The vector composed of the hyperparameters in the analytical Bayesian LASSO

References

• Bouhlel MA, Bartoli N, Otsmane A, Morlier J (2016) Improving kriging surrogates of high-dimensional design models by Partial Least Squares dimension reduction. Struct Multidisc Optim 53(5):935–952
• Chatterjee T, Chowdhury R (2017) An efficient sparse Bayesian learning framework for stochastic response analysis. Struct Saf 68:1–14
• Cheng K, Lu Z (2020a) Active learning polynomial chaos expansion for reliability analysis by maximizing expected indicator function prediction error. Int J Numer Methods Eng 121(14):3159–3177
• Cheng K, Lu Z (2020b) Structural reliability analysis based on ensemble learning of surrogate models. Struct Saf 83:101905
• Cheng K, Lu Z, Zhang K (2019) Multivariate output global sensitivity analysis using multi-output support vector regression. Struct Multidisc Optim 59(6):2177–2187
• Crestaux T, Le Maître O, Martinez JM (2009) Polynomial chaos expansion for sensitivity analysis. Reliab Eng Syst Saf 94(7):1161–1172
• Diaz P, Doostan A, Hampton J (2018) Sparse polynomial chaos expansions via compressed sensing and D-optimal design. Comput Methods Appl Mech Eng 336:640–666
• Erfani SM, Rajasegarar S, Karunasekera S, Leckie C (2016) High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recogn 58:121–134
• Hans C (2009) Bayesian lasso regression. Biometrika 96(4):835–845
• He W, Zeng Y, Li G (2019) A novel structural reliability analysis method via improved maximum entropy method based on nonlinear mapping and sparse grid numerical integration. Mech Syst Signal Process 133:106247
• He W, Zeng Y, Li G (2020) An adaptive polynomial chaos expansion for high-dimensional reliability analysis. Struct Multidisc Optim 62(4):2051–2067
• Hoeffding W (1992) A class of statistics with asymptotically normal distribution. In: Breakthroughs in statistics. Springer, New York, pp 308–334
• Huang Y, Beck JL, Li H (2017) Hierarchical sparse Bayesian learning for structural damage detection: theory, computation and application. Struct Saf 64:37–53
• Huang Y, Shao C, Wu B, Beck JL, Li H (2019) State-of-the-art review on Bayesian inference in structural system identification and damage assessment. Adv Struct Eng 22(6):1329–1351
• Jahanbin R, Rahman S (2020) Stochastic isogeometric analysis in linear elasticity. Comput Methods Appl Mech Eng 364:112928
• Ji S, Xue Y, Carin L (2008) Bayesian compressive sensing. IEEE Trans Signal Process 56(6):2346–2356
• Jung Y, Cho H, Lee I (2019) MPP-based approximated DRM (ADRM) using simplified bivariate approximation with linear regression. Struct Multidisc Optim 59(5):1761–1773
• Karagiannis G, Lin G (2014) Selection of polynomial chaos bases via Bayesian model uncertainty methods with applications to sparse approximation of PDEs with stochastic inputs. J Comput Phys 259:114–134
• Karagiannis G, Konomi BA, Lin G (2015) A Bayesian mixed shrinkage prior procedure for spatial–stochastic basis selection and evaluation of gPC expansions: applications to elliptic SPDEs. J Comput Phys 284:528–546
• Li M, Wang Z (2020) Deep learning for high-dimensional reliability analysis. Mech Syst Signal Process 139:106399
• Li G, Wang SW, Rabitz H (2002) Practical approaches to construct RS-HDMR component functions. J Phys Chem A 106(37):8721–8733
• Li G, Nie Z, Zeng Y, Pan J, Guan Z (2020) New simplified dynamic modeling method of bolted flange joints of launch vehicle. J Vib Acoust. https://doi.org/10.1115/1.4045919
• Lykou A, Ntzoufras I (2013) On Bayesian lasso variable selection and the specification of the shrinkage parameter. Stat Comput 23(3):361–390
• Marelli S, Sudret B (2015) UQLab user manual – Polynomial chaos expansions. Chair of Risk, Safety & Uncertainty Quantification, ETH Zürich, 0.9-104 edition, pp 97–110
• Meng Z, Zhang Z, Zhang D, Yang D (2019) An active learning method combining Kriging and accelerated chaotic single loop approach (AK-ACSLA) for reliability-based design optimization. Comput Methods Appl Mech Eng 357:112570
• Morris MD (1991) Factorial sampling plans for preliminary computational experiments. Technometrics 33(2):161–174
• Nyeo SL, Ansari RR (2011) Sparse Bayesian learning for the Laplace transform inversion in dynamic light scattering. J Comput Appl Math 235(8):2861–2872
• Park T, Casella G (2008) The Bayesian lasso. J Am Stat Assoc 103(482):681–686
• Rabitz H, Aliş ÖF (1999) General foundations of high-dimensional model representations. J Math Chem 25(2):197–233
• Rahman S (2008) A polynomial dimensional decomposition for stochastic computing. Int J Numer Methods Eng 76(13):2091–2116
• Rahman S (2014) Approximation errors in truncated dimensional decompositions. Math Comput 83(290):2799–2819
• Rahman S (2018) Mathematical properties of polynomial dimensional decomposition. SIAM/ASA J Uncertain Quantif 6(2):816–844
• Rahman S (2019) Uncertainty quantification under dependent random variables by a generalized polynomial dimensional decomposition. Comput Methods Appl Mech Eng 344:910–937
• Rahman S (2020) A spline chaos expansion. SIAM/ASA J Uncertain Quantif 8(1):27–57
• Rahman S, Xu H (2004) A univariate dimension-reduction method for multi-dimensional integration in stochastic mechanics. Probab Eng Mech 19(4):393–408
• Rubinstein RY, Kroese DP (2013) The cross-entropy method: a unified approach to combinatorial optimization, Monte-Carlo simulation and machine learning. Springer, New York
• Shao Q, Younes A, Fahs M, Mara TA (2017) Bayesian sparse polynomial chaos expansion for global sensitivity analysis. Comput Methods Appl Mech Eng 318:474–496
• Shore J, Johnson R (1981) Properties of cross-entropy minimization. IEEE Trans Inf Theory 27(4):472–482
• Sobol' IM (2003) Theorems and examples on high dimensional model representation. Reliab Eng Syst Saf 79(2):187–193
• Sofi A, Muscolino G, Giunta F (2020) Propagation of uncertain structural properties described by imprecise Probability Density Functions via response surface method. Probab Eng Mech 60:103020
• Tang K, Congedo PM, Abgrall R (2016) Adaptive surrogate modeling by ANOVA and sparse polynomial dimensional decomposition for global sensitivity analysis in fluid simulation. J Comput Phys 314:557–589
• Tang K, Wang JM, Freund JB (2019) Adaptive sparse polynomial dimensional decomposition for derivative-based sensitivity. J Comput Phys 391:303–321
• Thapa M, Mulani SB, Walters RW (2020) Adaptive weighted least-squares polynomial chaos expansion with basis adaptivity and sequential adaptive sampling. Comput Methods Appl Mech Eng 360:112759
• Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58(1):267–288
• Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1(Jun):211–244
• Tsilifis P, Huan X, Safta C, Sargsyan K, Lacaze G, Oefelein JC, Najm HN, Ghanem RG (2019) Compressive sensing adaptation for polynomial chaos expansions. J Comput Phys 380:29–47
• Tunga MA, Demiralp M (2005) A factorized high dimensional model representation on the nodes of a finite hyperprismatic regular grid. Appl Math Comput 164(3):865–883
• Wan X, Karniadakis GE (2006) Multi-element generalized polynomial chaos for arbitrary probability measures. SIAM J Sci Comput 28(3):901–928
• Wang Z, Chen W (2017) Confidence-based adaptive extreme response surface for time-variant reliability analysis under random excitation. Struct Saf 64:76–86
• Wang Z, Song J (2016) Cross-entropy-based adaptive importance sampling using von Mises-Fisher mixture for high dimensional reliability analysis. Struct Saf 59:42–52
• Wang H, Yan Z, Xu X, He K (2018) Evaluating influence of variable renewable energy generation on islanded microgrid power flow. IEEE Access 6:71339–71349
• Weinan E, Han J, Jentzen A (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun Math Stat 5(4):349–380
• Williams CK, Rasmussen CE (2006) Gaussian processes for machine learning, vol. 2(3). MIT Press, Cambridge, p 4
• Xu J, Kong F (2018) A cubature collocation based sparse polynomial chaos expansion for efficient structural reliability analysis. Struct Saf 74:24–31
• Xu H, Rahman S (2004) A generalized dimension-reduction method for multidimensional integration in stochastic mechanics. Int J Numer Methods Eng 61(12):1992–2019
• Yadav V, Rahman S (2014) A hybrid polynomial dimensional decomposition for uncertainty quantification of high-dimensional complex systems. Probab Eng Mech 38:22–34
• Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc B 68(1):49–67
• Zhang D, Han X, Jiang C, Liu J, Li Q (2017a) Time-dependent reliability analysis through response surface method. J Mech Des. https://doi.org/10.1115/1.4035860
• Zhang K, Zuo W, Gu S, Zhang L (2017b) Learning deep CNN denoiser prior for image restoration. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3929–3938
• Zhang X, Wang L, Sørensen JD (2019) REIF: a novel active-learning function toward adaptive Kriging surrogate models for structural reliability analysis. Reliab Eng Syst Saf 185:440–454
• Zhang X, Wang L, Sørensen JD (2020) AKOIS: an adaptive Kriging oriented importance sampling method for structural system reliability analysis. Struct Saf 82:101876
• Zhou Y, Lu Z, Cheng K (2019a) A new surrogate modeling method combining polynomial chaos expansion and Gaussian kernel in a sparse Bayesian learning framework. Int J Numer Methods Eng 120(4):498–516
• Zhou Y, Lu Z, Cheng K, Ling C (2019b) An efficient and robust adaptive sampling method for polynomial chaos expansion in sparse Bayesian learning framework. Comput Methods Appl Mech Eng 352:654–674
• Zhou Y, Lu Z, Hu J, Hu Y (2020) Surrogate modeling of high-dimensional problems via data-driven polynomial chaos expansions and sparse partial least square. Comput Methods Appl Mech Eng 364:112906
• Zhou H, Ibrahim C, Zheng WX, Pan W (2021) Sparse Bayesian deep learning for dynamic system identification. arXiv preprint arXiv:2107.12910
• Zhu Y, Zabaras N, Koutsourelakis PS, Perdikaris P (2019) Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J Comput Phys 394:56–81


Acknowledgements

The support of the National Key Research and Development Program (Grant No. 2019YFA0706803) and the National Natural Science Foundation of China (Grant No. 11872142) is greatly appreciated. The lead author also thanks the China Scholarship Council for its financial support during his visit to the University of Waterloo.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Gang Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Responsible Editor: Palaniappan Ramu

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

A one-dimensional polynomial chaos basis of θ is defined through the following orthogonality condition:

$$\left\langle {\phi_{i} (\theta ),\;\phi_{j} (\theta )} \right\rangle = \int {\phi_{i} (\theta )\,\phi_{j} (\theta )\,f(\theta )\,{\text{d}}\theta } = \delta_{ij} ,$$
(47)

where δij is the Kronecker delta and f(θ) is the PDF of the random variable θ. For an arbitrary f(θ), the orthogonal polynomial basis ϕi can be derived by the Stieltjes procedure (Wan and Karniadakis 2006) with the following recurrence relation

$$\sqrt {\beta_{n + 1} } \phi_{n + 1} (\theta ) = (\theta - \alpha_{n} )\phi_{n} (\theta ) - \sqrt {\beta_{n} } \phi_{n - 1} (\theta )\quad \quad n = 1,\;2,\;\ldots$$
(48)

where αn and βn are given by the Christoffel–Darboux formulae as follows:

$$\begin{gathered} \alpha_{n} = \frac{{\left\langle {\theta \hat{\phi }_{n} ,\;\hat{\phi }_{n} } \right\rangle }}{{\left\langle {\hat{\phi }_{n} ,\;\hat{\phi }_{n} } \right\rangle }} \hfill \\ \beta_{n} = \frac{{\left\langle {\hat{\phi }_{n} ,\;\hat{\phi }_{n} } \right\rangle }}{{\left\langle {\hat{\phi }_{n - 1} ,\;\hat{\phi }_{n - 1} } \right\rangle }} \hfill \\ \phi_{n} = \frac{{\hat{\phi }_{n} }}{{\sqrt {\left\langle {\hat{\phi }_{n} ,\;\hat{\phi }_{n} } \right\rangle } }} \hfill \\ \end{gathered}$$
(49)

With Eqs. (48) and (49), polynomial chaos bases can be derived for a random variable with an arbitrary PDF. For ease of use, the polynomial chaos bases corresponding to common distributions of θ are listed in Table 1.
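To make the recurrence concrete, the following sketch (an illustration, not part of the paper) evaluates Eqs. (48) and (49) numerically: it works with the un-normalized polynomials \(\hat{\phi}_n\), computes the inner products by Gauss–Legendre quadrature on a bounded support, and normalizes at the end. The function name, the `pdf` callable, and the bounded-support assumption are illustrative choices.

```python
import numpy as np

def stieltjes_orthonormal_basis(pdf, support, degree, n_quad=200):
    """Illustrative sketch of the Stieltjes procedure, Eqs. (48)-(49):
    returns the first degree+1 orthonormal polynomials w.r.t. pdf,
    evaluated at the quadrature nodes."""
    a, b = support
    # Gauss-Legendre nodes/weights mapped to [a, b]; the PDF enters the weight
    x, w = np.polynomial.legendre.leggauss(n_quad)
    theta = 0.5 * (b - a) * x + 0.5 * (a + b)
    w = 0.5 * (b - a) * w * pdf(theta)

    def inner(f, g):
        return np.sum(w * f * g)

    phi_hat_prev = np.zeros_like(theta)   # \hat{phi}_{-1} = 0
    phi_hat = np.ones_like(theta)         # \hat{phi}_0 = 1
    norm_prev = inner(phi_hat, phi_hat)
    basis = [phi_hat / np.sqrt(norm_prev)]
    for n in range(degree):
        alpha = inner(theta * phi_hat, phi_hat) / inner(phi_hat, phi_hat)
        beta = inner(phi_hat, phi_hat) / norm_prev if n > 0 else 0.0
        # three-term recurrence for the un-normalized polynomials
        phi_hat_next = (theta - alpha) * phi_hat - beta * phi_hat_prev
        phi_hat_prev, phi_hat = phi_hat, phi_hat_next
        norm_prev = inner(phi_hat_prev, phi_hat_prev)
        basis.append(phi_hat / np.sqrt(inner(phi_hat, phi_hat)))
    return theta, np.array(basis)

# Example: a uniform PDF on [-1, 1] reproduces normalized Legendre polynomials
theta, basis = stieltjes_orthonormal_basis(lambda t: np.full_like(t, 0.5),
                                           (-1.0, 1.0), degree=4)
```

For the standard distributions listed in Table 1, this construction reduces, up to normalization, to the classical polynomial families.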

Appendix 2

Herein, the detailed derivation of Eq. (26) is provided. Because both p(gs|c, ωMAP) and p(c|ωMAP) are normal distributions, Eq. (25) can be rewritten as follows, dropping the terms that do not depend on c:

$$\begin{gathered} p\left( {{\mathbf{c}}{|}\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} ,\;{\mathbf{g}}_{{\text{s}}} } \right) \propto p\left( {{\mathbf{g}}_{{\text{s}}} {|}\;{\mathbf{c}},\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} } \right)p\left( {{\mathbf{c}}{|}\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} } \right) \hfill \\ \quad \quad \quad \quad \quad \;\;\; \propto \exp \left( { - \frac{1}{{2\sigma_{{{\text{MAP}}}}^{2} }}\left( {{\mathbf{g}}_{{\text{s}}} - {\mathbf{H}}^{{\text{T}}} {\mathbf{c}}} \right)^{{\text{T}}} \left( {{\mathbf{g}}_{{\text{s}}} - {\mathbf{H}}^{{\text{T}}} {\mathbf{c}}} \right) - \frac{1}{2}{\mathbf{c}}^{{\text{T}}} {\mathbf{Ac}}} \right) \hfill \\ \quad \quad \quad \quad \quad \;\;\; \propto \exp \left( { - \frac{1}{2}\left( {{\mathbf{c}}^{{\text{T}}} \left( {\frac{{{\mathbf{HH}}^{{\text{T}}} }}{{\sigma_{{{\text{MAP}}}}^{2} }} + {\mathbf{A}}} \right){\mathbf{c}} - \frac{2}{{\sigma_{{{\text{MAP}}}}^{2} }}{\mathbf{g}}_{{\text{s}}}^{{\text{T}}} {\mathbf{H}}^{{\text{T}}} {\mathbf{c}}} \right)} \right). \hfill \\ \end{gathered}$$
(50)

Equation (50) is precisely the kernel of a Gaussian distribution. The corresponding covariance matrix is the inverse of the coefficient of the quadratic term, and the mean vector follows from the ratio of the linear-term coefficient to the quadratic-term coefficient; that is,

$${\mathbf{c}}{|}\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} ,\;{\mathbf{g}}_{{\text{s}}} \sim {\text{N(}}{{\varvec{\upmu}}},\;{{\varvec{\Sigma}}}),$$
(51)

where

$$\begin{gathered} {{\varvec{\upmu}}} = \frac{1}{{\sigma_{{{\text{MAP}}}}^{2} }}{\mathbf{\Sigma H}}{\mathbf{g}}_{{\text{s}}} \hfill \\ {{\varvec{\Sigma}}} = \left( {\frac{1}{{\sigma_{{{\text{MAP}}}}^{2} }}{\mathbf{HH}}^{{\text{T}}} + {\mathbf{A}}} \right)^{ - 1} . \hfill \\ \end{gathered}$$
(52)
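As a brief, hedged illustration of Eqs. (51) and (52) (a sketch, not the authors' code): given the design matrix H arranged as in Eq. (50) so that the model predictions are H^T c, the standard responses g_s, the noise variance σ²MAP, and the prior precision matrix A from the hierarchical model, the coefficient posterior follows directly. The function and variable names are assumptions.

```python
import numpy as np

def coefficient_posterior(H, g_s, sigma2_map, A):
    """Gaussian posterior N(mu, Sigma) of the PDD coefficients, Eq. (52).
    H: (n_bases, n_samples) design matrix, so predictions are H.T @ c.
    A: (n_bases, n_bases) prior precision matrix of the coefficients."""
    Sigma = np.linalg.inv(H @ H.T / sigma2_map + A)  # posterior covariance
    mu = Sigma @ H @ g_s / sigma2_map                # posterior mean
    return mu, Sigma
```

In practice one would solve the linear system rather than form the explicit inverse, but the explicit form mirrors Eq. (52).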

Appendix 3

The derivation of Eq. (28) is based on the results of Appendix 2. Substituting Eqs. (18) and (51) into Eq. (24), we have

$$\begin{gathered} p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{g}}_{{\text{s}}} } \right) \approx \int {p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{c}},\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} ,\;{\mathbf{g}}_{{\text{s}}} } \right)p\left( {{\mathbf{c}}{|}\;{{\varvec{\upomega}}}_{{{\text{MAP}}}} ,\;{\mathbf{g}}_{{\text{s}}} } \right)} {\text{d}}{\mathbf{c}} \hfill \\ \quad \quad \quad \;\; \propto \int {\exp \left( { - \frac{1}{{2\sigma_{{{\text{MAP}}}}^{2} }}\left( {g_{{\text{s}}}^{*} - {{\varvec{\Phi}}}^{{\text{T}}} \left( {{\mathbf{x}}^{*} } \right){\mathbf{c}}} \right)^{2} } \right)\exp \left( { - \frac{1}{2}\left( {{\mathbf{c}} - {{\varvec{\upmu}}}} \right)^{{\text{T}}} {{\varvec{\Sigma}}}^{ - 1} \left( {{\mathbf{c}} - {{\varvec{\upmu}}}} \right)} \right)} {\text{d}}{\mathbf{c}}. \hfill \\ \end{gathered}$$
(53)

After some basic algebraic manipulation and dropping the terms that do not depend on \(g_{{\text{s}}}^{*}\) and c, Eq. (53) can be rewritten as

$$p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{g}}_{{\text{s}}} } \right)\; \propto \exp \left( { - \frac{1}{{2\sigma_{{{\text{MAP}}}}^{2} }}\left( {g_{{\text{s}}}^{*} } \right)^{2} } \right)\int {\exp \left( { - \frac{1}{2}\left( {{\mathbf{c}}^{{\text{T}}} {\mathbf{\Omega c}} - 2{\mathbf{c}}^{{\text{T}}} {{\varvec{\Gamma}}}} \right)} \right)} {\text{d}}{\mathbf{c}},$$
(54)

where

$$\begin{gathered} {{\varvec{\Omega}}} = \left( {\frac{{{{\varvec{\Phi}}}\left( {{\mathbf{x}}^{*} } \right){{\varvec{\Phi}}}^{{\text{T}}} \left( {{\mathbf{x}}^{*} } \right)}}{{\sigma_{{{\text{MAP}}}}^{2} }} + {{\varvec{\Sigma}}}^{ - 1} } \right) \hfill \\ {{\varvec{\Gamma}}} = \frac{{g_{{\text{s}}}^{*} {{\varvec{\Phi}}}\left( {{\mathbf{x}}^{*} } \right)}}{{\sigma_{{{\text{MAP}}}}^{2} }} + {{\varvec{\Sigma}}}^{ - 1} {{\varvec{\upmu}}}. \hfill \\ \end{gathered}$$
(55)

Thus, completing the square in c shows that the integrand of the second factor in Eq. (54) is a Gaussian kernel; that is,

$$p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{g}}_{{\text{s}}} } \right)\; \propto \exp \left( { - \frac{{\left( {g_{{\text{s}}}^{*} } \right)^{2} }}{{2\sigma_{{{\text{MAP}}}}^{2} }} + \frac{1}{2}{{\varvec{\Gamma}}}^{{\text{T}}} {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Gamma}}}} \right)\int {\exp \left( { - \frac{{({\mathbf{c}} - {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Gamma}}})^{{\text{T}}} {{\varvec{\Omega}}}({\mathbf{c}} - {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Gamma}}})}}{2}} \right)} {\text{d}}{\mathbf{c}}.$$
(56)

The integral in Eq. (56) is a constant because the integrand is a Gaussian kernel; therefore, we have

$$p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{g}}_{{\text{s}}} } \right)\; \propto \exp \left( { - \frac{1}{{2\sigma_{{{\text{MAP}}}}^{2} }}\left( {g_{{\text{s}}}^{*} } \right)^{2} + \frac{1}{2}{{\varvec{\Gamma}}}^{{\text{T}}} {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Gamma}}}} \right).$$
(57)

Combining Eqs. (55) and (57), one can confirm that Eq. (57) is also a Gaussian kernel; therefore, it can be written as follows:

$$p\left( {g_{{\text{s}}}^{*} {|}{\mathbf{g}}_{{\text{s}}} } \right)\; \propto \exp \left( { - \frac{1}{{2\sigma_{{\text{p}}}^{2} }}\left( {g_{{\text{s}}}^{*} - \mu_{{\text{p}}} } \right)^{2} } \right),$$
(58)

where

$$\begin{gathered} \mu_{{\text{p}}} = \frac{{\sigma_{{{\text{MAP}}}}^{2} {{\varvec{\Phi}}}^{{\text{T}}} ({\mathbf{x}}^{*} ){{\varvec{\Omega}}}^{ - 1} {{\varvec{\Sigma}}}^{ - 1} {{\varvec{\upmu}}}}}{{\sigma_{{{\text{MAP}}}}^{2} - {{\varvec{\Phi}}}({\mathbf{x}}^{*} )^{{\text{T}}} {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Phi}}}({\mathbf{x}}^{*} )}} \hfill \\ \sigma_{{\text{p}}}^{2} = \left( {\frac{1}{{\sigma_{{{\text{MAP}}}}^{2} }} - \frac{{{{\varvec{\Phi}}}({\mathbf{x}}^{*} )^{{\text{T}}} {{\varvec{\Omega}}}^{ - 1} {{\varvec{\Phi}}}({\mathbf{x}}^{*} )}}{{(\sigma_{{{\text{MAP}}}}^{2} )^{2} }}} \right)^{ - 1} . \hfill \\ \end{gathered}$$
(59)

According to the Sherman–Morrison formula, the following relationship can be obtained:

$${{\varvec{\Omega}}}^{ - 1} = {{\varvec{\Sigma}}} - \frac{{{\mathbf{\Sigma \Phi }}\left( {{\mathbf{x}}^{*} } \right){{\varvec{\Phi}}}^{{\text{T}}} \left( {{\mathbf{x}}^{*} } \right){{\varvec{\Sigma}}}}}{{\sigma_{{{\text{MAP}}}}^{2} + {{\varvec{\Phi}}}^{{\text{T}}} \left( {{\mathbf{x}}^{*} } \right){\mathbf{\Sigma \Phi }}\left( {{\mathbf{x}}^{*} } \right)}}.$$
(60)

Substituting Eq. (60) into Eq. (59), the final mean value and variance of the posterior predictive distribution can be derived as

$$\begin{gathered} \mu_{{\text{p}}} = {{\varvec{\upmu}}}^{{\text{T}}} {{\varvec{\Phi}}}({\mathbf{x}}^{*} ) \hfill \\ \sigma_{{\text{p}}}^{2} = \sigma_{{{\text{MAP}}}}^{2} + {{\varvec{\Phi}}}({\mathbf{x}}^{*} )^{{\text{T}}} {\mathbf{\Sigma \Phi }}({\mathbf{x}}^{*} ). \hfill \\ \end{gathered}$$
(61)
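Continuing the sketch above (again an illustration with assumed names), the predictive mean and variance of Eq. (61) at a new point x* follow from μ, Σ, and the vector of active bases Φ(x*):

```python
import numpy as np

def predictive_moments(phi_star, mu, Sigma, sigma2_map):
    """Mean and variance of the posterior predictive p(g_s^* | g_s), Eq. (61).
    phi_star: Phi(x*), the active polynomial bases evaluated at the new sample."""
    mean = mu @ phi_star                            # mu^T Phi(x*)
    var = sigma2_map + phi_star @ Sigma @ phi_star  # sigma_MAP^2 + Phi^T Sigma Phi
    return mean, var
```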

Appendix 4

The real function value at a new test sample x* is denoted by gs(x*), and its estimate is \(g_{{\text{s}}}^{{\text{e}}} ({\mathbf{x}}^{*} )\). Once the final posterior predictive distribution \(p(g_{{\text{s}}}^{*} |{\mathbf{g}}_{{\text{s}}} )\) is obtained, it can be regarded as the distribution of gs(x*), written p(gs|gs). Thus, a loss function representing the mean squared error between the real function value and the estimate can be defined as

$$L_{g} = \int {\left( {g_{{\text{s}}} \left( {{\mathbf{x}}^{*} } \right) - g_{{\text{s}}}^{{\text{e}}} \left( {{\mathbf{x}}^{*} } \right)} \right)^{2} p\left( {g_{\text{s}} |{\mathbf{g}}_{\text{s}} } \right)} {\text{d}}g_{\text{s}} .$$
(62)

The estimate \(g_{{\text{s}}}^{{\text{e}}} ({\mathbf{x}}^{*} )\) minimizing Lg is sought. To this end, the derivative of Lg with respect to \(g_{{\text{s}}}^{{\text{e}}}\) is set to zero to obtain the stationary point, which yields

$$\frac{{{\text{d}}L_{g} }}{{{\text{d}}g_{{\text{s}}}^{{\text{e}}} }} = 2\int {\left( {g_{{\text{s}}}^{{\text{e}}} \left( {{\mathbf{x}}^{*} } \right) - g_{{\text{s}}} \left( {{\mathbf{x}}^{*} } \right)} \right)p\left( {g_{\text{s}} |{\mathbf{g}}_{\text{s}} } \right)} {\text{d}}g_{\text{s}} = 0.$$
(63)

From Eq. (63), the expected \(g_{{\text{s}}}^{{\text{e}}} ({\mathbf{x}}^{*} )\) can be obtained as

$$g_{{\text{s}}}^{{\text{e}}} \left( {{\mathbf{x}}^{*} } \right) = \int {g_{{\text{s}}} \left( {{\mathbf{x}}^{*} } \right)p\left( {g_{\text{s}} |{\mathbf{g}}_{\text{s}} } \right)} {\text{d}}g_{\text{s}} .$$
(64)

Equation (64) is simply the mean of the posterior predictive distribution.
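This conclusion (the squared-error loss of Eq. (62) is minimized by the mean of the predictive distribution) can be checked with a quick Monte Carlo experiment; the numbers below are arbitrary and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_p, sigma_p = 1.3, 0.4                         # arbitrary predictive moments
g_samples = rng.normal(mu_p, sigma_p, 100_000)   # stand-in samples of p(g_s | g_s)

# Discretize the loss of Eq. (62) over candidate estimates and locate its minimizer
candidates = np.linspace(0.0, 3.0, 601)
loss = np.array([np.mean((g_samples - c) ** 2) for c in candidates])
print(candidates[np.argmin(loss)])  # approximately 1.3, i.e., the predictive mean
```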

Rights and permissions

Reprints and permissions

About this article


Cite this article

He, W., Li, G. & Nie, Z. An adaptive sparse polynomial dimensional decomposition based on Bayesian compressive sensing and cross-entropy. Struct Multidisc Optim 65, 26 (2022). https://doi.org/10.1007/s00158-021-03120-w

