Uncertainty Quantification Using the Nearest Neighbor Gaussian Process
Gaussian processes have been widely used in areas including geostatistics and uncertainty quantification because they offer a parsimonious yet flexible representation of a stochastic process. However, analyzing a large data set with a Gaussian process can be challenging due to its O(n^3) computational complexity, where n denotes the size of the data set. The recently proposed Nearest Neighbor Gaussian Process (NNGP) approximates a Gaussian process with a target covariance function by using a series of conditional distributions and then exploiting the resulting sparse precision matrices. We demonstrate that NNGP has the potential to be used for uncertainty quantification. We find that when NNGP is used to approximate a Gaussian process with strong smoothness, e.g., one with the squared-exponential covariance function, Bayesian inference must be carried out carefully, marginalizing over the random effects in NNGP. Using simulated and real data, we empirically investigate how well NNGP approximates the squared-exponential covariance function, as well as its ability to handle the change-of-support effect, a common phenomenon in geostatistics and uncertainty quantification that arises when only data aggregated over space are available.
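The construction summarized above (a series of conditional distributions that yields a sparse precision matrix) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional locations in their natural order, a squared-exponential covariance, and a small jitter term for numerical stability. The names `sq_exp_cov` and `nngp_precision` and all parameter values are hypothetical, and dense NumPy arrays are used for readability where a practical implementation would use sparse storage.

```python
import numpy as np


def sq_exp_cov(x, y, variance=1.0, length_scale=0.15):
    """Squared-exponential covariance between two sets of 1-D locations."""
    d = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def nngp_precision(locs, m, jitter=1e-8):
    """Nearest-neighbor approximation to the precision matrix of a GP.

    Each location i is conditioned on at most m nearest locations among
    those that precede it in the ordering. The result is
    Q = (I - A)^T F^{-1} (I - A), where row i of A holds the conditional
    regression weights and F = diag(f) holds the conditional variances.
    Illustrative sketch only: `jitter` and the ordering are assumptions.
    """
    n = len(locs)
    A = np.zeros((n, n))
    f = np.empty(n)
    for i in range(n):
        c_ii = sq_exp_cov(locs[i:i + 1], locs[i:i + 1])[0, 0] + jitter
        if i == 0:
            f[i] = c_ii  # the first location has no predecessors
            continue
        prev = np.arange(i)
        order = np.argsort(np.abs(locs[prev] - locs[i]), kind="stable")
        nbrs = prev[order[:m]]  # at most m nearest preceding locations
        C_nn = sq_exp_cov(locs[nbrs], locs[nbrs]) + jitter * np.eye(len(nbrs))
        c_ni = sq_exp_cov(locs[nbrs], locs[i:i + 1])[:, 0]
        b = np.linalg.solve(C_nn, c_ni)  # conditional regression weights
        A[i, nbrs] = b
        f[i] = c_ii - c_ni @ b  # conditional variance
    I_minus_A = np.eye(n) - A
    Q = I_minus_A.T @ (I_minus_A / f[:, None])
    return Q, A, f


locs = np.linspace(0.0, 1.0, 8)
Q_sparse, A_sparse, f_sparse = nngp_precision(locs, m=2)
Q_full, _, _ = nngp_precision(locs, m=len(locs) - 1)
```

With `m` equal to `n - 1`, every location is conditioned on all of its predecessors, so `Q_full` reproduces the exact inverse of the (jittered) covariance matrix. A small `m` keeps at most `m` nonzero weights per row of `A`, which is the source of the sparsity that NNGP exploits.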
This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center (OSC 1987). Shi’s research was supported by the Taft Research Center at the University of Cincinnati. Kang’s research was partially supported by the Simons Foundation’s Collaboration Award (#317298) and the Taft Research Center at the University of Cincinnati. Vemaganti’s work was partially supported by the University of Cincinnati Simulation Center.
- Arendt, P. D., Apley, D. W., & Chen, W. (2012). Quantification of model uncertainty: Calibration, model discrepancy, and identifiability. Journal of Mechanical Design, 134, 100908-100908-12.
- Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2014). Hierarchical modeling and analysis for spatial data. Boca Raton: CRC Press.
- Banerjee, S., Gelfand, A. E., Finley, A. O., & Sang, H. (2008). Gaussian predictive process models for large spatial data sets. Journal of the Royal Statistical Society B, 70, 825–848.
- Berrocal, V. J., Gelfand, A. E., & Holland, D. M. (2010). A spatio-temporal downscaler for output from numerical models. Journal of Agricultural, Biological, and Environmental Statistics, 15, 176–197.
- Bush, A., Gibson, R., & Thomas, T. (1975). The elastic contact of a rough surface. Wear, 35, 87–111.
- Craig, P. S., Goldstein, M., Rougier, J. C., & Seheult, A. H. (2001). Bayesian forecasting for complex systems using computer simulators. Journal of the American Statistical Association, 96, 717–729.
- Cressie, N. (1993). Statistics for spatial data, revised ed. New York: Wiley.
- Cressie, N. (1996). Change of support and the modifiable areal unit problem. Geographical Systems, 3, 159–180.
- Currin, C., Mitchell, T., Morris, M., & Ylvisaker, D. (1988). A Bayesian approach to the design and analysis of computer experiments. Technical Report, ORNL498, Oak Ridge Laboratory.
- Greenwood, J. A., & Williamson, J. B. P. (1966). Contact of nominally flat surfaces. In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. The Royal Society (Vol. 295, pp. 300–319).
- Ohio Supercomputer Center (OSC). (1987). Columbus OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/f5s1ph73