On the Construction of Uncertain Time Series Surrogates Using Polynomial Chaos and Gaussian Processes

Mathematical Geosciences

Abstract

The analysis of time series is a fundamental task in many flow simulations, such as those of oceanic and atmospheric flows. A major challenge is the design of a faithful and accurate time-dependent surrogate built with a tractable sample set and a manageable number of degrees of freedom. Several techniques are implemented to handle the time-dependent aspect of the quantity of interest, including uncoupled approaches, low-rank approximations, auto-regressive models and global Bayesian emulators. These approaches rely on two popular methods for uncertainty quantification: polynomial chaos and Gaussian process regression. The different techniques are tested and compared on the uncertain evolution of the sea surface height forecast at two locations exhibiting contrasting levels of variance. Two ensemble sizes are considered, as well as two versions of polynomial chaos (ordinary least squares or ridge regression) and of Gaussian processes (squared exponential or Matérn covariance function), in order to assess their impact on the results. The conclusions focus on the advantages and drawbacks of the different techniques in terms of accuracy, flexibility and computational cost.

References

  • Alemazkoor N, Meidani H (2017) Divide and conquer: an incremental sparsity promoting compressive sampling approach for polynomial chaos expansions. Comput Methods Appl Mech Eng 318:937–956

  • Alexanderian A, Le Maître O, Najm H, Iskandarani M, Knio O (2012) Multiscale stochastic preconditioners in non-intrusive spectral projection. SIAM J Sci Comput 50(2):306–340

  • Blatman G, Sudret B (2011) Adaptive sparse polynomial chaos expansion based on least angle regression. J Comput Phys 230(6):2345–2367

  • Bleck R (2002) An oceanic general circulation model framed in hybrid isopycnic-cartesian coordinates. Ocean Model 4(1):55–88

  • Bowman A, Azzalini A (1997) Applied smoothing techniques for data analysis. Oxford University Press, New York

  • Cameron RH, Martin WT (1947) The orthogonal development of nonlinear functionals in series of Fourier–Hermite functionals. Ann Math 48:385–392

  • Chassignet E, Smith L, Halliwell G, Bleck R (2003) North Atlantic simulation with the hybrid coordinate ocean model (HYCOM): impact of the vertical coordinate choice, reference density, and thermobaricity. J Phys Oceanogr 33:2504–2526

  • Conrad PR, Marzouk YM (2013) Adaptive Smolyak pseudospectral approximations. SIAM J Sci Comput 35(6):A2643–A2670

  • Conti S, O’Hagan A (2010) Bayesian emulation of complex multi-output and dynamic computer models. J Stat Plan Infer 140(3):640–651

  • Doostan A, Owhadi H (2011) A non-adapted sparse approximation of PDEs with stochastic inputs. J Comput Phys 230(8):3015–3034

  • Ernst OG, Mugler A, Starkloff HJ, Ullmann E (2012) On the convergence of generalized polynomial chaos expansions. ESAIM Math Model Numer Anal 46:317–339

  • Gerritsma M, van der Steen JB, Vos P, Karniadakis G (2010) Time-dependent generalized polynomial chaos. J Comput Phys 229(22):8333–8363

  • Ghanem RG, Spanos PD (1991) Stochastic Finite Elements: a Spectral Approach. Springer, Berlin

  • Gibbs MN (1997) Bayesian Gaussian processes for regression and classification. Ph.D. thesis, Department of Physics, University of Cambridge

  • Greengard L, Rokhlin V (1987) A fast algorithm for particle simulations. J Comput Phys 73(2):325–348

  • Iskandarani M, Le Hénaff M, Thacker WC, Srinivasan A, Knio OM (2016a) Quantifying uncertainty in Gulf of Mexico forecasts stemming from uncertain initial conditions. J Geophys Res Oceans 121(7):4819–4832

  • Iskandarani M, Wang S, Srinivasan A, Thacker WC, Winokur J, Knio O (2016b) An overview of uncertainty quantification techniques with application to oceanic and oil-spill simulations. J Geophys Res Oceans 121(4):2789–2808

  • Kocijan J, Girard A, Banko B, Murray-Smith R (2005) Dynamic systems identification with Gaussian processes. Math Comput Model Dyn Syst 11(4):411–424

  • Le Gratiet L, Marelli S, Sudret B (2016) Metamodel-based sensitivity analysis: polynomial chaos expansions and Gaussian processes. Springer, Cham, pp 1–37 ISBN 978-3-319-11259-6

  • Le Maître O, Najm H, Ghanem R, Knio O (2004) Multi-resolution analysis of Wiener-type uncertainty propagation schemes. J Comput Phys 197(2):502–531

  • Le Maître OP, Knio OM (2010) Spectral methods for uncertainty quantification. Springer, Berlin

  • Li G, Iskandarani M, Le Hénaff M, Winokur J, Le Maître OP, Knio OM (2016) Quantifying initial and wind forcing uncertainties in the Gulf of Mexico. Comput Geosci 20(5):1133–1153

  • Lorenz EN (1956) Empirical orthogonal functions and statistical weather prediction. Scientific report / MIT, Statistical Forecasting Project, Massachusetts Institute of Technology, Department of Meteorology

  • Mai CV, Spiridonakos MD, Chatzi EN, Sudret B (2016) Surrogate modeling for stochastic dynamical systems by combining nonlinear autoregressive with exogenous input models and polynomial chaos expansions. Int J Uncertain Quant 6(4):313–339

  • Matheron G (1973) The intrinsic random functions and their applications. Adv Appl Probab 5(3):439–468

  • McLachlan GJ, Peel D (2000) Finite mixture models. Wiley, New York

  • Morokoff WJ, Caflisch RE (1995) Quasi-Monte Carlo integration. J Comput Phys 122(2):218–230

  • Neal RM (1996) Bayesian learning for neural networks. Springer, Berlin ISBN 0387947248

  • Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn. Springer, Berlin

  • Owen NE, Challenor P, Menon PP, Bennani S (2017) Comparison of surrogate-based uncertainty quantification methods for computationally expensive simulators. SIAM/ASA J Uncertain Quant 5(1):403–435

  • Pronzato L, Müller WG (2012) Design of computer experiments: space filling and beyond. Stat Comput 22(3):681–701

  • Rasmussen CE, Williams CKI (2005) Gaussian processes for machine learning. The MIT Press, Cambridge ISBN 026218253X

  • Roy PT, Moçayd NE, Ricci S, Jouhaud JC, Goutal N, De Loco M, Rochoux MC (2017) Comparison of polynomial chaos and Gaussian process surrogates for uncertainty quantification and correlation estimation of spatially distributed open-channel steady flows. Stoch Env Res Risk A ISSN 1436–3259

  • Sampson PD, Guttorp P (1992) Nonparametric estimation of nonstationary spatial covariance structure. J Am Stat Assoc 87(417):108–119

  • Santner TJ, Williams B, Notz W (2003) The design and analysis of computer experiments. Springer, Berlin

  • Seber GAF, Lee AJ (2003) Linear regression analysis. Wiley, New York ISBN 9780471722199

  • Smolyak S (1963) Quadrature and interpolation formulas for tensor products of certain classes of functions. Dokl Akad Nauk SSSR 4(240–243):123

  • Sochala P, De Martin F (2017) Surrogate combining harmonic decomposition and polynomial chaos for seismic shear waves in uncertain media. Comput Geosci 22(1):125–144

  • Spiridonakos MD, Chatzi EN (2015) Metamodeling of dynamic nonlinear structural systems through polynomial chaos NARX models. Comput Struct 157:99–113

  • Tikhonov AN, Arsenin VIA (1977) Solutions of ill-posed problems. Scripta series in mathematics, Winston ISBN 9780470991244

  • Wan X, Karniadakis G (2006) Multi-element generalized polynomial chaos for arbitrary probability measures. SIAM J Sci Comput 28(3):901–928

  • Wang S, Li G, Iskandarani M, Le Hénaff M, Knio OM (2018) Verifying and assessing the performance of the perturbation strategy in polynomial chaos ensemble forecasts of the circulation in the Gulf of Mexico. Ocean Model Rev 131:59–70

  • Winokur J, Conrad P, Sraj I, Knio O, Srinivasan A, Thacker WC, Marzouk Y, Iskandarani M (2013) A priori testing of sparse adaptive polynomial chaos expansions using an ocean general circulation model database. Comput Geosci 17(6):899–911

Acknowledgements

The work of P. Sochala is supported by funding from BRGM (French Geological Survey) through its Institut Carnot, sponsored by the ANR (French National Research Agency). This research was made possible in part by a grant from The Gulf of Mexico Research Initiative, and in part by NASA-NNX13AE30G and NSF1639722. Data are publicly available through the Gulf of Mexico Research Initiative Information & Data Cooperative (GRIIDC) at https://data.gulfresearchinitiative.org (https://doi.org/10.7266/n7-d8ga-6c22). The authors are grateful to S. Wang for having performed the HYCOM simulations and to O. Le Maître for fruitful discussions about the clustering approach in the high-variance case.

Author information

Corresponding author

Correspondence to Pierre Sochala.

Appendix A: Cross-Validation Technique

Cross-validation (CV) is a popular technique used in statistics and machine learning to assess the predictive capacity of a model. The principle of CV is to partition the data into two complementary subsets, then to build the model on one subset (the training one), and finally to test the model on the other subset (the validation one). This procedure is repeated several times with different partitions of the data. In the leave-one-out version of CV, the predicted residual \(e_{[i]}\) at \(\varvec{\xi }^{(i)}\) is defined as

$$\begin{aligned} e_{[i]} = u(\varvec{\xi }^{(i)}) - \tilde{u}_{[i]}(\varvec{\xi }^{(i)}), \end{aligned}$$

where \(u(\varvec{\xi }^{(i)})\) is the true value, and \(\tilde{u}_{[i]}(\varvec{\xi }^{(i)})\) denotes the value predicted by the model \(\tilde{u}_{[i]}\) built after removing the training point \(\varvec{\xi }^{(i)}\) from the training set. The leave-one-out error \(E_{\mathrm{loo}}\), also known (up to the normalization by \(N\)) as the predicted residual error sum of squares, is estimated by the empirical mean square of the predicted residuals,

$$\begin{aligned} E_{\mathrm{loo}} = \frac{1}{N}\sum _{i=1}^N e_{[i]}^2. \end{aligned}$$
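The definition above can be illustrated by a brute-force computation. The following is a minimal sketch, assuming a least-squares polynomial model; the function name `loo_error_brute_force`, the Legendre basis of degree 3 and the synthetic data are hypothetical choices made for illustration and are not taken from the article.

```python
import numpy as np

def loo_error_brute_force(P, u):
    """Leave-one-out error E_loo obtained by refitting N least-squares models.

    P : (N, p) design matrix, u : (N,) vector of true values.
    Returns the mean square of the predicted residuals and the residuals.
    """
    N = P.shape[0]
    residuals = np.empty(N)
    for i in range(N):
        keep = np.arange(N) != i                      # drop the i-th training point
        coef, *_ = np.linalg.lstsq(P[keep], u[keep], rcond=None)
        residuals[i] = u[i] - P[i] @ coef             # predicted residual e_[i]
    return np.mean(residuals**2), residuals

# Hypothetical data: a noisy cubic sampled at N = 20 points in [-1, 1]
rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, size=20)
u = 1.0 + 2.0 * xi - xi**3 + 0.05 * rng.standard_normal(20)
P = np.polynomial.legendre.legvander(xi, 3)           # Legendre basis, degrees 0..3
E_loo, e = loo_error_brute_force(P, u)
```

Each pass through the loop builds one model \(\tilde{u}_{[i]}\), so the cost grows with the number of training points.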

In the general case, CV can be an expensive technique due to the construction of the \(N\) models \(\tilde{u}_{[i]}\). However, a fast computation of \(E_{\mathrm{loo}}\) is possible for linear regression models (Seber and Lee 2003) by using the relation

$$\begin{aligned} e_{[i]} = \frac{u(\varvec{\xi }^{(i)}) - \tilde{u}(\varvec{\xi }^{(i)})}{1-h_i}, \end{aligned}$$

where \(\tilde{u}(\varvec{\xi }^{(i)})\) is the prediction of the single model built with all the training points, and \(h_i\) is the i-th diagonal term of the hat matrix \(H=P(P^{\top }P)^{-1}P^{\top }\), with P the design matrix of the linear regression. In practice, the vector \(\varvec{e}_{[]}\) of the predicted residuals can be computed directly from the model outputs as follows

$$\begin{aligned} \varvec{e}_{[]} = \left( (I-H)\,\varvec{u}\right) \oslash \left( \varvec{1}-\mathrm{diag}(H)\right) , \end{aligned}$$

where \(\varvec{u}=[u(\varvec{\xi }^{(i)})]\in \mathbb {R}^N\) is the vector of the true values at the training points, \(I=[\delta _{ij}]\in \mathbb {R}^{N,N}\) is the identity matrix, \(\oslash \) denotes the component-wise division, \(\varvec{1}=[1]^{\top }\in {\mathbb {R}^N}\) is the vector of ones, and \(\mathrm{diag}(H)\) is the vector of the diagonal terms of H.
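The hat-matrix relation can be checked numerically against an explicit refit. The sketch below is an illustration under stated assumptions: the function names, the Legendre basis of degree 4 and the synthetic data are hypothetical, and the diagonal of \(H\) is obtained from a thin QR factorization of P (so that \(H=QQ^{\top }\) when P has full column rank) rather than by forming \(P(P^{\top }P)^{-1}P^{\top }\) explicitly.

```python
import numpy as np

def loo_residuals_fast(P, u):
    """Predicted residuals from a single fit: (u - H u) / (1 - diag(H))."""
    Q, _ = np.linalg.qr(P)            # thin QR, Q has shape (N, p)
    h = np.sum(Q**2, axis=1)          # diagonal terms h_i of the hat matrix
    fitted = Q @ (Q.T @ u)            # H u, without forming the N x N matrix H
    return (u - fitted) / (1.0 - h)   # component-wise division

def loo_residuals_refit(P, u):
    """Reference values: refit the least-squares model N times."""
    N = P.shape[0]
    e = np.empty(N)
    for i in range(N):
        keep = np.arange(N) != i
        coef, *_ = np.linalg.lstsq(P[keep], u[keep], rcond=None)
        e[i] = u[i] - P[i] @ coef
    return e

# Hypothetical data: noisy exponential sampled at N = 30 points in [-1, 1]
rng = np.random.default_rng(1)
xi = rng.uniform(-1.0, 1.0, size=30)
u = np.exp(xi) + 0.01 * rng.standard_normal(30)
P = np.polynomial.legendre.legvander(xi, 4)   # Legendre basis, degrees 0..4

e_fast = loo_residuals_fast(P, u)
e_refit = loo_residuals_refit(P, u)
E_loo = np.mean(e_fast**2)
```

Under these assumptions the two residual vectors should agree to rounding error, while the fast version needs a single factorization instead of \(N\) separate fits.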

Cite this article

Sochala, P., Iskandarani, M. On the Construction of Uncertain Time Series Surrogates Using Polynomial Chaos and Gaussian Processes. Math Geosci 52, 285–309 (2020). https://doi.org/10.1007/s11004-019-09806-8

  • DOI: https://doi.org/10.1007/s11004-019-09806-8
