Bayes Linear Emulation, History Matching, and Forecasting for Complex Computer Simulators

Reference work entry in:
Handbook of Uncertainty Quantification

Abstract

Computer simulators are a useful tool for understanding complicated systems. However, any inferences made from them should recognize the inherent limitations and approximations in the simulator’s predictions for reality, the data used to run and calibrate the simulator, and the lack of knowledge about the best inputs to use for the simulator. This article describes the methods of emulation and history matching, where fast statistical approximations to the computer simulator (emulators) are constructed and used to reject implausible choices of input (history matching). Also described is a simple and tractable approach to estimating the discrepancy between simulator and reality induced by certain intrinsic limitations and uncertainties in the simulator and input data. Finally, a method for forecasting based on this approach is presented. The analysis is based on the Bayes linear approach to uncertainty quantification, which is similar in spirit to the standard Bayesian approach but takes expectation, rather than probability, as the primitive for the theory, with consequent simplifications in the prior uncertainty specification and analysis.


References

  1. Bastos, L.S., O’Hagan, A.: Diagnostics for Gaussian process emulators. Technometrics 51, 425–438 (2009)

  2. Clark, M.P., Slater, A.G., Rupp, D.E., Woods, R.A., Vrugt, J.A., Gupta, H.V., Wagener, T., Hay, L.E.: Framework for Understanding Structural Errors (FUSE): a modular framework to diagnose differences between hydrological models. Water Resour. Res. 44, W00B02 (2008)

  3. Craig, P.S., Goldstein, M., Seheult, A.H., Smith, J.A.: Pressure matching for hydrocarbon reservoirs: a case study in the use of Bayes linear strategies for large computer experiments (with discussion). In: Gastonis, C., et al. (eds.) Case Studies in Bayesian Statistics, vol. III, pp. 37–93. Springer, New York (1997)

  4. Craig, P.S., Goldstein, M., Rougier, J.C., Seheult, A.H.: Bayesian forecasting using large computer models. JASA 96, 717–729 (2001)

  5. Cumming, J., Goldstein, M.: Small sample Bayesian designs for complex high-dimensional models based on information gained using fast approximations. Technometrics 51, 377–388 (2009)

  6. de Finetti, B.: Theory of Probability, vols. 1 & 2. Wiley, New York (1974, 1975)

  7. Goldstein, M.: Subjective Bayesian analysis: principles and practice. Bayesian Anal. 1, 403–420 (2006)

  8. Goldstein, M., Rougier, J.C.: Bayes linear calibrated prediction for complex systems. JASA 101, 1132–1143 (2006)

  9. Goldstein, M., Rougier, J.C.: Reified Bayesian modelling and inference for physical systems (with discussion). JSPI 139, 1221–1239 (2009)

  10. Goldstein, M., Seheult, A., Vernon, I.: Assessing model adequacy. In: Wainwright, J., Mulligan, M. (eds.) Environmental Modelling: Finding Simplicity in Complexity, 2nd edn., pp. 435–449. Wiley, Chichester (2010)

  11. Goldstein, M., Wooff, D.A.: Bayes Linear Statistics: Theory and Methods. Wiley, Chichester/Hoboken (2007)

  12. Monteith, J.L.: Evaporation and environment. Symp. Soc. Exp. Biol. 19, 205–224 (1965)

  13. O’Hagan, A.: Bayesian analysis of computer code outputs: a tutorial. Reliab. Eng. Syst. Saf. 91, 1290–1300 (2006)

  14. Pukelsheim, F.: The three sigma rule. Am. Stat. 48, 88–91 (1994)

  15. Santner, T., Williams, B., Notz, W.: The Design and Analysis of Computer Experiments. Springer, New York (2003)

  16. Vernon, I., Goldstein, M., Bower, R.: Galaxy formation: a Bayesian uncertainty analysis (with discussion). Bayesian Anal. 5, 619–670 (2010)

  17. Williamson, D., Goldstein, M., Allison, L., Blaker, A., Challenor, P., Jackson, L., Yamazaki, K.: History matching for exploring and reducing climate model parameter space using observations and a large perturbed physics ensemble. Clim. Dyn. 41(7–8), 1703–1729 (2013)


Author information

Correspondence to Nathan Huntley.

Appendix: Internal Discrepancy Perturbations

In this appendix, a description of the internal discrepancy experiments is provided. The first step is to identify potential quantities to perturb. For FUSE, the obvious quantities to choose are the two input time series and the initial condition. The parameters could also be perturbed at every time step, as in principle could the state vectors, although the latter was not feasible in FUSE. Other possibilities not considered here include the time scale of the simulator and the accuracy of the numerical solver.

The next step is to informally assess the potential influence of each quantity. For example, increasing all rainfall by 10 % makes a large difference to the output, whereas increasing all evapotranspiration by 10 % makes a smaller but noticeable difference. Meanwhile, making large changes to the initial condition leads to extremely small changes away from the start of the simulation (recall that the quantities of interest are near the end of the simulation). From these initial explorations, each quantity can be categorized: if it has very little influence, it may not be worth perturbing; if it has a small influence, it may be worth including but not expending much effort on; if it has a large influence, it is worth carefully modeling. The outcome of this exploration for FUSE suggested that the initial condition was hardly relevant, the evapotranspiration was worth including, and the rainfall and parameter perturbations deserved more attention.

The final step is to consider how to generate perturbations of each quantity. The initial condition is ignored. For evapotranspiration, good estimates of observation uncertainty are lacking, but given the low influence of this quantity this is not too worrying: any plausible perturbation should suffice. Each evapotranspiration observation was multiplied by a perturbation drawn from a log-normal distribution, such that most observations were perturbed by no more than 10 %. Correlation between observations within 24 h was also included, so that if a particular observation receives a large perturbation, nearby observations tend to as well. This is motivated by the daily period of the evapotranspiration time series.
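The evapotranspiration scheme just described can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the smoothing window, the log-normal scale `sigma`, and the stand-in data are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def et_perturbations(n_hours, daily_period=24, sigma=0.05):
    """Correlated log-normal multipliers for an hourly ET series.

    Gaussian noise is smoothed with a 24 h moving window so that
    perturbations within a day are correlated, then exponentiated.
    sigma = 0.05 is an illustrative choice keeping roughly 95 % of
    multipliers within about +/-10 % (since exp(2 * 0.05) ~ 1.10).
    """
    z = rng.normal(size=n_hours + daily_period - 1)
    # Dividing by sqrt(daily_period) keeps the smoothed noise at unit variance.
    kernel = np.ones(daily_period) / np.sqrt(daily_period)
    smooth = np.convolve(z, kernel, mode="valid")  # length n_hours
    return np.exp(sigma * smooth)

et_observed = rng.uniform(0.1, 0.5, size=240)   # stand-in hourly ET series
et_perturbed = et_observed * et_perturbations(240)
```

The log-normal form guarantees positive multipliers, and the shared smoothing window is one simple way to induce the within-24 h correlation described above.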

Parameter perturbations are performed by multiplying each initial parameter by a random perturbation at each time step, with nearby multipliers being correlated. The size of the perturbations was chosen such that the parameters rarely changed by much more than 10 % over the course of a simulation. This creates collections of perturbations that cause the parameters to evolve slowly, without sudden large changes and without a large overall change. The parameter perturbations have a significant effect on the output, but expert opinion about how the parameters are likely to change over time, and by how much, is lacking. In principle, in such a situation one should make the correlation and the magnitude of the perturbations configurable, so as to understand their influence. For this example, however, this complication is avoided. An example of the evolution of a particular choice of x^(1) for a particular perturbation can be seen in Fig. 2.6.

Fig. 2.6 The evolution of parameter x^(1) for a particular parameter perturbation
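One way such a slowly evolving multiplicative perturbation could be generated is sketched below, assuming an AR(1) process for the increments on the log scale. The correlation `rho` and step size `step_sd` are hypothetical tuning choices, not the values used for FUSE.

```python
import numpy as np

rng = np.random.default_rng(1)

def parameter_path(theta0, n_steps, step_sd=1e-4, rho=0.95):
    """Evolve one parameter by correlated multiplicative perturbations.

    Increments on the log scale follow an AR(1) process, so the
    parameter drifts smoothly rather than jumping; step_sd is tuned
    so that over a few hundred steps the total change rarely exceeds
    roughly 10 %. All numerical choices here are illustrative.
    """
    log_mult = np.zeros(n_steps)
    eps = 0.0
    for t in range(1, n_steps):
        eps = rho * eps + rng.normal(scale=step_sd)  # correlated increment
        log_mult[t] = log_mult[t - 1] + eps
    return theta0 * np.exp(log_mult)

path = parameter_path(2.0, 240)  # hypothetical parameter starting at 2.0
```

Because successive increments are positively correlated, the resulting path wanders gradually, matching the description above of slow evolution without sudden large changes.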

Perturbing the rainfall also has a significant effect on the output. In this case, however, there is more guidance on the perturbations required. Uncertainty in the rainfall was attributed to three significant sources: the “local” gauge measurement error, the process of aggregating readings to the nearest hour, and the process of averaging over the catchment by kriging. Suitable perturbations from these errors were generated and combined.
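A sketch of how the three error sources might be combined as independent multiplicative terms follows. The error magnitudes and the day-block structure of the kriging term are illustrative assumptions only, not the study's estimates.

```python
import numpy as np

rng = np.random.default_rng(2)

def rainfall_perturbation(rain, gauge_sd=0.05, agg_sd=0.02, krig_sd=0.03):
    """Perturb an hourly rainfall series with three error sources.

    Illustrative model: independent per-hour gauge noise, independent
    per-hour aggregation error, and a kriging (catchment-averaging)
    error held constant over each day. All standard deviations are
    hypothetical placeholders.
    """
    n = rain.size
    gauge = rng.normal(0.0, gauge_sd, size=n)      # local gauge error
    agg = rng.normal(0.0, agg_sd, size=n)          # hourly aggregation error
    # One kriging error per day, repeated across that day's 24 hours.
    krig = np.repeat(rng.normal(0.0, krig_sd, size=n // 24 + 1), 24)[:n]
    return rain * np.exp(gauge + agg + krig)       # combined multiplier

rain_pert = rainfall_perturbation(np.full(240, 1.0))  # stand-in series
```

Summing the errors on the log scale before exponentiating combines the three sources while keeping the perturbed rainfall positive.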

The overall rainfall perturbations generated by this process typically display occasional noticeable differences but mostly small ones. This suggests that rainfall error could contribute significantly to the discrepancy for maximum stream flow, but less so to the discrepancy for average stream flow.

© 2017 Springer International Publishing Switzerland

Cite this entry

Goldstein, M., Huntley, N. (2017). Bayes Linear Emulation, History Matching, and Forecasting for Complex Computer Simulators. In: Ghanem, R., Higdon, D., Owhadi, H. (eds) Handbook of Uncertainty Quantification. Springer, Cham. https://doi.org/10.1007/978-3-319-12385-1_14