Uncertainty Quantification Using the Nearest Neighbor Gaussian Process

  • Hongxiang Shi
  • Emily L. Kang
  • Bledar A. Konomi
  • Kumar Vemaganti
  • Sandeep Madireddy
Chapter
Part of the ICSA Book Series in Statistics book series (ICSABSS)

Abstract

Gaussian processes are widely used in areas including geostatistics and uncertainty quantification because they offer a parsimonious yet flexible representation of a stochastic process. However, analyzing a large data set with a Gaussian process can be challenging because of its O(n^3) computational complexity, where n denotes the size of the data set. The recently proposed Nearest Neighbor Gaussian Process (NNGP) approximates a Gaussian process with a target covariance function by using a series of conditional distributions and then exploiting the resulting sparse precision matrices. We demonstrate that NNGP has the potential to be used for uncertainty quantification. We find that when NNGP is used to approximate a Gaussian process with strong smoothness, e.g., one with the squared-exponential covariance function, Bayesian inference needs to be carried out carefully, marginalizing over the random effects in NNGP. Using simulated and real data, we empirically investigate how well NNGP approximates a Gaussian process with the squared-exponential covariance function, as well as its ability to handle the change-of-support effect, a common phenomenon in geostatistics and uncertainty quantification when only data aggregated over space are available.
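As a rough illustration of the construction summarized in the abstract, the sketch below builds an NNGP-style precision matrix from a series of conditional distributions: each location is conditioned only on its m nearest previously ordered neighbors. This is not the authors' code. The function names (`sq_exp_cov`, `nngp_precision`), the ordering of locations by row index, the parameter values, and the small `jitter` term are all assumptions made for the example, and dense matrices are used for readability where a practical implementation would use sparse storage.

```python
import numpy as np


def sq_exp_cov(X1, X2, variance=1.0, length_scale=0.3):
    """Squared-exponential covariance between the rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def nngp_precision(X, m, variance=1.0, length_scale=0.3, jitter=1e-8):
    """Hypothetical helper: NNGP-style precision Q = (I - A)^T D^{-1} (I - A).

    Row i of A holds the weights of location i on its (at most) m nearest
    neighbors among locations 0..i-1; f holds the conditional variances
    that form the diagonal of D.
    """
    n = X.shape[0]
    A = np.zeros((n, n))
    f = np.full(n, float(variance))  # the first location has no neighbors
    for i in range(1, n):
        # Nearest neighbors are chosen among earlier locations only.
        dist = np.linalg.norm(X[:i] - X[i], axis=1)
        nbrs = np.argsort(dist)[:m]
        C_nn = sq_exp_cov(X[nbrs], X[nbrs], variance, length_scale)
        C_nn += jitter * np.eye(len(nbrs))  # assumed numerical stabilizer
        c_ni = sq_exp_cov(X[nbrs], X[i:i + 1], variance, length_scale)[:, 0]
        b = np.linalg.solve(C_nn, c_ni)  # conditional-mean weights
        A[i, nbrs] = b
        f[i] = variance - c_ni @ b  # conditional variance
    I_minus_A = np.eye(n) - A
    Q = I_minus_A.T @ (I_minus_A / f[:, None])
    return Q, A, f


# Example on a 5 x 5 grid in the unit square with m = 3 neighbors.
g = np.linspace(0.0, 1.0, 5)
X = np.array([[a, b] for a in g for b in g])
Q, A, f = nngp_precision(X, m=3)
print("largest number of nonzero weights in a row of A:",
      np.count_nonzero(A, axis=1).max())
```

Each row of A has at most m nonzero entries, which is the source of the sparsity in the precision matrix. Each step solves only an m-by-m system, so the cost of building A and D grows linearly in n for fixed m (ignoring the neighbor search), instead of the O(n^3) cost of factorizing the full covariance matrix.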

Notes

Acknowledgements

This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center (OSC 1987). Shi’s research was supported by the Taft Research Center at the University of Cincinnati. Kang’s research was partially supported by the Simons Foundation’s Collaboration Award (#317298) and the Taft Research Center at the University of Cincinnati. Vemaganti’s work was partially supported by the University of Cincinnati Simulation Center.

References

  1. Arendt, P. D., Apley, D. W., & Chen, W. (2012). Quantification of model uncertainty: Calibration, model discrepancy, and identifiability. Journal of Mechanical Design, 134, 100908-100908-12.
  2. Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2014). Hierarchical modeling and analysis for spatial data. Boca Raton: CRC Press.
  3. Banerjee, S., Gelfand, A. E., Finley, A. O., & Sang, H. (2008). Gaussian predictive process models for large spatial data sets. Journal of the Royal Statistical Society B, 70, 825–848.
  4. Berrocal, V. J., Gelfand, A. E., & Holland, D. M. (2010). A spatio-temporal downscaler for output from numerical models. Journal of Agricultural, Biological, and Environmental Statistics, 15, 176–197.
  5. Bush, A., Gibson, R., & Thomas, T. (1975). The elastic contact of a rough surface. Wear, 35, 87–111.
  6. Craig, P. S., Goldstein, M., Rougier, J. C., & Seheult, A. H. (2001). Bayesian forecasting for complex systems using computer simulators. Journal of the American Statistical Association, 96, 717–729.
  7. Cressie, N. (1993). Statistics for spatial data, revised ed. New York: Wiley.
  8. Cressie, N. (1996). Change of support and the modifiable areal unit problem. Geographical Systems, 3, 159–180.
  9. Cressie, N., & Johannesson, G. (2008). Fixed rank kriging for very large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70, 209–226.
  10. Cressie, N., Shi, T., & Kang, E. K. (2010). Fixed rank filtering for spatio-temporal data. Journal of Computational and Graphical Statistics, 19, 724–745.
  11. Crevillen-Garcia, D., Wilkinson, R. D., Shah, A. A., & Power, H. (2017). Gaussian process modelling for uncertainty quantification in convectively-enhanced dissolution processes in porous media. Advances in Water Resources, 99, 1–14.
  12. Currin, C., Mitchell, T., Morris, M., & Ylvisaker, D. (1988). A Bayesian approach to the design and analysis of computer experiments. Technical Report, ORNL498, Oak Ridge Laboratory.
  13. Datta, A., Banerjee, S., Finley, A. O., & Gelfand, A. E. (2016). Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111, 800–812.
  14. Emery, X. (2009). The kriging update equations and their application to the selection of neighboring data. Computational Geosciences, 13, 269–280.
  15. Furrer, R., Genton, M. G., & Nychka, D. (2006). Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics, 15, 502–523.
  16. Gneiting, T., Kleiber, W., & Schlather, M. (2010). Matérn cross-covariance functions for multivariate random fields. Journal of the American Statistical Association, 105, 1167–1177.
  17. Goulard, M., & Voltz, M. (1992). Linear coregionalization model: Tools for estimation and choice of cross-variogram matrix. Mathematical Geology, 24, 269–286.
  18. Gramacy, R. B., & Apley, D. W. (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24, 561–578.
  19. Gramacy, R. B., & Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103, 1119–1130.
  20. Greenwood, J. A., & Williamson, J. B. P. (1966). Contact of nominally flat surfaces. In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. The Royal Society (Vol. 295, pp. 300–319).
  21. Guttorp, P., & Gneiting, T. (2006). Studies in the history of probability and statistics XLIX: On the Matérn correlation family. Biometrika, 93, 989–995.
  22. Higdon, D., Nakhleh, C., Gattiker, J., & Williams, B. (2008). A Bayesian calibration approach to the thermal problem. Computer Methods in Applied Mechanics and Engineering, 1976, 2431–2441.
  23. Kaufman, C. G., & Shaby, B. A. (2013). The role of the range parameter for estimation and prediction in geostatistics. Biometrika, 100, 473–484.
  24. Kennedy, M. C., & O’Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87, 1–13.
  25. Kennedy, M. C., & O’Hagan, A. (2001). Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 425–464.
  26. Konomi, B., Sang, H., & Mallick, B. (2014). Adaptive Bayesian nonstationary modeling for large spatial datasets using covariance approximations. Journal of Computational and Graphical Statistics, 23, 802–829.
  27. Liu, F., Bayarri, M. J., & Berger, J. O. (2009). Modularization in Bayesian analysis, with emphasis on analysis of computer models. Bayesian Analysis, 4, 119–150.
  28. Nguyen, H., Cressie, N., & Braverman, A. (2012). Spatial statistical data fusion for remote sensing applications. Journal of the American Statistical Association, 107, 1004–1018.
  29. Ohio Supercomputer Center (OSC). (1987). Columbus OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/f5s1ph73
  30. Peng, C. Y., & Wu, J. (2004). On the choice of nugget in kriging modeling for deterministic computer experiments. Journal of Computational and Graphical Statistics, 23, 151–168.
  31. Perdikaris, P., Venturi, D., Royset, J. O., & Karniadakis, G. E. (2015). Multi-fidelity modelling via recursive co-kriging and Gaussian Markov random fields. Proceedings of the Royal Society of London A, 471, 20150018.
  32. Qian, P. Z. G., Wu, H., & Wu, C. F. J. (2008). Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics, 50, 383–396.
  33. Rue, H., & Held, L. (2005). Gaussian Markov random fields: Theory and applications. Boca Raton: Chapman and Hall.
  34. Sacks, J., Welch, W. J., Mitchell, T. J., & Wynn, H. P. (1989). Design and analysis of computer experiments. Statistical Science, 4, 409–423.
  35. Santner, T. J., Williams, B. J., & Notz, W. I. (2013). The design and analysis of computer experiments. New York: Springer Science & Business Media.
  36. Sista, B., & Vemaganti, K. (2014). Estimation of statistical parameters of rough surfaces suitable for developing micro-asperity friction models. Wear, 316, 6–18.
  37. Stein, M. L. (1999). Interpolation of spatial data: Some theory for kriging. New York: Springer.
  38. Tworzydlo, W. W., Cecot, W., Oden, J. T., & Yew, C. H. (1988). Computational micro-and macroscopic models of contact and friction: Formulation, approach and applications. Wear, 220, 113–140.
  39. Wackernagel, H. (2003). Multivariate geostatistics: An introduction with applications, 3rd ed. Berlin: Springer.
  40. Zaytsev, V., Biver, P., Wackernagel, H., & Allard, D. (2016). Change-of-support models on irregular grids for geostatistical simulation. Mathematical Geosciences, 48, 353–369.
  41. Zhou, Q., Qian, P. Z. G., & Zhou, S. (2011). A simple approach to emulation for computer models with qualitative and quantitative factors. Technometrics, 53, 266–273.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hongxiang Shi (1)
  • Emily L. Kang (1)
  • Bledar A. Konomi (1)
  • Kumar Vemaganti (2)
  • Sandeep Madireddy (3)
  1. Department of Mathematical Sciences, University of Cincinnati, Cincinnati, USA
  2. Department of Mechanical and Materials Engineering, University of Cincinnati, Cincinnati, USA
  3. Argonne National Laboratory, Lemont, USA
