Multi-level emulation of complex climate model responses to boundary forcing data

Tran, Giang T.; Oliver, Kevin I. C.; Holden, Philip B.; Edwards, Neil R.; Sóbester, András; Challenor, Peter

doi:10.1007/s00382-018-4205-4

Published: 16 April 2018

Climate Dynamics, volume 52, pages 1505–1531 (2019)

Giang T. Tran¹ ², Kevin I. C. Oliver¹, Philip B. Holden³, Neil R. Edwards³, András Sóbester⁴ & Peter Challenor⁵


Abstract

Climate model components involve both high-dimensional input and output fields. It is desirable to efficiently generate spatio-temporal outputs of these models for applications in integrated assessment modelling or to assess the statistical relationship between such sets of inputs and outputs, for example, in uncertainty analysis. However, the need for efficiency often compromises the fidelity of output through the use of low-complexity models. Here, we develop a technique which combines statistical emulation with a dimensionality reduction technique to emulate a wide range of outputs from an atmospheric general circulation model, PLASIM, as functions of the boundary forcing prescribed by the ocean component of a lower-complexity climate model, GENIE-1. Although accurate and detailed spatial information on atmospheric variables such as precipitation and wind speed is well beyond the capability of GENIE-1’s energy-moisture balance model of the atmosphere, this study demonstrates that the output of this model is useful in predicting PLASIM’s spatio-temporal fields through multi-level emulation. Meaningful information from the fast model, GENIE-1, was extracted by utilising the correlation between variables of the same type in the two models and between variables of different types in PLASIM. We present here the construction and validation of several PLASIM variable emulators and discuss their potential use in developing a hybrid model with statistical components.

References

Barnett TP, Latif M, Graham N, Flugel M, Pazan S, White W (1993) ENSO and ENSO-related predictability. Part I: Prediction of equatorial Pacific Sea surface temperature with a hybrid coupled ocean–atmosphere model. J Clim 6(8):1545–1566
Barnett TP, Preisendorfer R (1987) Origins and levels of monthly and seasonal forecast skill for United States surface air temperature determined by canonical correlation analysis. Mon Weather Rev 115(9):1825–1850
Bastos LS, O’Hagan A (2009) Diagnostics for Gaussian process emulators. Technometrics 51(4):425–438
Bayarri MJ, Berger JO, Cafeo J, Garcia-Donato G, Liu F, Palomo J, Parthasarathy RJ, Paulo R, Sacks J, Walsh D (2007) Computer model validation with functional output. Ann Stat 35(5):1874–1906. https://doi.org/10.1214/009053607000000163
Boukouvalas A, Cornford D (2009) Dimension reduction for multivariate emulation. Tech. Rep. November 2009, Aston University, Birmingham
Bounceur N, Crucifix M, Wilkinson RD (2015) Global sensitivity analysis of the climate-vegetation system to astronomical forcing: an emulator-based approach. Earth Syst Dyn 6(1):205–224
Challenor PG, McNeall D, Gattiker J (2010) Assessing the probability of rare climate events. In: O’Hagan A, West M (eds) The Oxford handbook of applied bayesian analysis. Oxford University Press, New York, pp 403–430 chap 16
Cimatoribus AA, Drijfhout SS, Dijkstra HA (2012) A global hybrid coupled model based on atmosphere-SST feedbacks. Clim Dyn 38(3–4):745–760
Conti S, O’Hagan A (2010) Bayesian emulation of complex multi-output and dynamic computer models. J Stat Plann Inference 140(3):640–651
Cook RD, Nachtsheim CJ (1980) A comparison of algorithms for constructing exact D-optimal designs. Technometrics 22:315–324
Edwards NR, Marsh R (2005) Uncertainties due to transport-parameter sensitivity in an efficient 3-D ocean-climate model. Clim Dyn 24(4):415–433
Edwards NR, Cameron D, Rougier J (2011) Precalibrating an intermediate complexity climate model. Clim Dyn 37(7–8):1469–1482
Foley AM, Holden PB, Edwards NR, Mercure JF, Salas P, Pollitt H, Chewpreecha U (2016) Climate model emulation in an integrated assessment framework: a case study for mitigation policies in the electricity sector. Earth Syst Dyn 7(1):119–132
Forrester AI, Sóbester A, Keane AJ (2007) Multi-fidelity optimization via surrogate modelling. Proc R Soc A Math Phys Eng Sci 463(2088):3251–3269
Fraedrich K (2012) A suite of user-friendly global climate models: hysteresis experiments. Eur Phys J Plus 127(5):53
Fraedrich K, Jansen H, Kirk E, Luksch U, Lunkeit F (2005) The planet simulator: towards a user friendly model. Meteorol Z 14(3):299–304
Geil KL, Zeng X (2015) Quantitative characterization of spurious numerical oscillations in 48 CMIP5 models. Geophys Res Lett 42(12):5066–5073. https://doi.org/10.1002/2015GL063931
Haberkorn K, Sielmann F, Lunkeit F, Kirk E, Schneidereit A, Fraedrich K (2009) Planet simulator climate. Tech. rep., Meteorologisches Institut, Universität Hamburg
Hardoon DR, Szedmak S, Shawe-Taylor J (2004) Canonical correlation analysis; an overview with application to learning methods. Neural Comput 16(12):2639–2664
Higdon D, Gattiker J, Williams B, Rightley M (2008) Computer model calibration using high-dimensional output. J Am Stat Assoc 103(482):570–583
Holden PB, Edwards NR, Garthwaite PH, Wilkinson RD (2015) Emulation and interpretation of high-dimensional climate model outputs. J Appl Stat 42(9):2038–2055
Holden PB, Edwards NR (2010) Dimensionally reduced emulation of an AOGCM for application to integrated assessment modelling. Geophys Res Lett 37(21): L21707
Holden PB, Edwards NR, Oliver KIC, Lenton TM, Wilkinson RD (2010) A probabilistic calibration of climate sensitivity and terrestrial carbon change in GENIE-1. Clim Dyn 35(5):785–806
Holden PB, Edwards NR, Müller SA, Oliver KIC, Death RM, Ridgwell A (2013) Controls on the spatial distribution of oceanic d13CDIC. Biogeosciences 10(3):1815–1833. https://doi.org/10.5194/bg-10-1815-2013
Holden PB, Edwards NR, Garthwaite PH, Fraedrich K, Lunkeit F, Kirk E, Labriet M, Kanudia A, Babonneau F (2014) PLASIM-ENTSem v1.0: a spatio-temporal emulator of future climate change for impacts assessment. Geosci Model Dev 7(1):433–451
Holden PB, Edwards NR, Fraedrich K, Kirk E, Lunkeit F, Zhu X (2016) PLASIM GENIE v1.0: a new intermediate complexity AOGCM. Geosci Model Dev 9:3347–3361. https://doi.org/10.5194/gmd-9-3347-2016
Kennedy MC, O’Hagan A (2000) Predicting the output from a complex computer code when fast approximations are available. Biometrika 87(1):1–13
Kennedy MC, O’Hagan A (2001) Bayesian calibration of computer models. J R Stat Soc Ser B (Stat Methodol) 63(3):425–464
Lee LA, Carslaw KS, Pringle KJ, Mann GW (2012) Mapping the uncertainty in global CCN using emulation. Atmos Chem Phys 12(20):9739–9751
Labriet M, Joshi SR, Vielle M, Holden PB, Edwards NR, Kanudia A, Loulou R, Babonneau F (2015) Worldwide impacts of climate change on energy for heating and cooling. Mitig Adapt Strat Glob Chang 20(7):1111–1136
Lenton TM, Williamson MS, Edwards NR, Marsh R, Price AR, Ridgwell AJ, Shepherd JG, Cox SJ (2006) Millennial timescale carbon cycle and climate change in an efficient Earth system model. Clim Dyn 26(7–8):687–711
Liakka J, Nilsson J, Lofverstrom M (2012) Interactions between stationary waves and ice sheets: linear versus nonlinear atmospheric response. Clim Dyn 38(5–6):1249–1262
Loeppky JL, Sacks J, Welch WJ (2009) Choosing the sample size of a computer experiment: a practical guide. Technometrics 51(4):366–376
Lucarini V, Fraedrich K, Lunkeit F (2010) Thermodynamic analysis of snowball earth hysteresis experiment: efficiency, entropy production and irreversibility. Q J R Meteorol Soc 136(646):2–11. https://doi.org/10.1002/qj.543
Lunt DJ, Williamson MS, Valdes PJ, Lenton TM, Marsh R (2006) Comparing transient, accelerated, and equilibrium simulations of the last 30,000 years with the GENIE-1 model. Clim Past 2(2):221–235
Maniyar DM, Cornford D, Boukouvalas A (2007) Dimensionality reduction in the emulator setting. Tech. Rep. October 2007, Neural Computing research group. Aston University, Birmingham
Mardia KV, Marshall RJ (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71(1):135–146
Marsh R, Müller SA, Yool A, Edwards NR (2011) Incorporation of the C-GOLDSTEIN efficient climate model into the GENIE framework: “eb_go_gs” configurations of GENIE. Geosci Model Dev 4(4):957–992
Matthews HD, Caldeira K (2007) Transient climate-carbon simulations of planetary geoengineering. Proc Nat Acad Sci USA 104(24):9949–54
McNeall DJ (2008) Dimension reduction in the Bayesian analysis of a numerical climate model. PhD Thesis, University of Southampton
Morris MD, Mitchell TJ (1995) Exploratory designs for computational experiments. J Stat Plann Inference 43:381–402
Saenko OA, Schmittner A, Weaver AJ (2004) The Atlantic Pacific seesaw. J Clim 17(11):2033–2038. https://doi.org/10.1175/1520-0442(2004)017%3c2033:TAS%3e2.0.CO;2
Oakley JE, O’Hagan A (2004) Probabilistic sensitivity analysis of complex models: a Bayesian approach. J R Stat Soc Ser B (Stat Methodol) 66(3):751–769. https://doi.org/10.1111/j.1467-9868.2004.05304.x
Osborne JW, Costello AB (2009) Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Pan Pac Manag Rev 12(2):131–146
Peltier W (2004) Global glacial isostasy and the surface of the ice-age earth: the ICE-5G (VM2) model and GRACE. Annu Rev Earth Planet Sci 32(1):111–149
Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning. The MIT Press, Cambridge
Richman MB (1986) Rotation of principal components. J Climatol 6(3):293–335
Romanova V, Lohmann G, Grosfeld K, Butzin M (2006) The relative role of oceanic heat transport and orography on glacial climate. Quatern Sci Rev 25(7–8):832–845
Sacks J, Welch WJ, Mitchell TJ, Wynn HP (1989) Design and analysis of computer experiments. Stat Sci 4(4):409–423
Santner TJ, Williams BJ, Notz WI (2003) The design and analysis of computer experiments. Springer, New York
Schmittner A, Silva TAM, Fraedrich K, Kirk E, Lunkeit F (2011) Effects of mountains and ice sheets on global ocean circulation. J Clim 24(11):2814–2829
Stocker TF, Johnsen SJ (2003) A minimum thermodynamic model for the bipolar seesaw. Paleoceanography 18(4):1087. https://doi.org/10.1029/2003PA000920
Syu HH, Neelin DJ, Gutzler D (1995) Seasonal and interannual variability in a hybrid coupled GCM. J Clim 8:2121–2143
Thompson SL, Warren SG (1982) Parameterization of outgoing infrared radiation derived from detailed radiative calculations. J Atmos Sci 39:2667–2680
Tran GT (2017) Developing a multi-level gaussian process emulator of an atmospheric general circulation model for palaeoclimate modelling. PhD Thesis, University of Southampton
Tran GT, Oliver KIC, Sobester A, Toal DJJ, Holden PB, Marsh R, Challenor PG, Edwards NR (2016) Building a traceable climate model hierarchy with multi-level emulators. Adv Stat Climatol Meteorol Oceanogr 2(1):1–21
Weaver AJ, Eby M, Wiebe EC, Bitz CM, Duffy PB, Ewen TL, Fanning AF, Holland MM, Macfadyen A, Matthews HD, Meissner KJ, Saenko O, Schmittner A, Wang H, Masakazu Y (2001) The UVic earth system climate model: model description, climatology, and applications to past, present and future climates. Atmos Ocean 39(4):361–428
Wilkinson RD (2010) Bayesian calibration of expensive multivariate computer experiments. In: Biegler L, Biros G, Ghattas O, Heinkenschloss M, Keyes D, Mallick B, Marzouk Y, Tenorio L, Waanders BB, Willcox K (eds) Computational methods for large-scale inverse problems and quantification of uncertainty. Wiley, Chichester chap 10
Williamson M, Lenton T, Shepherd J, Edwards N (2006) An efficient numerical terrestrial scheme (ENTS) for Earth system modelling. Ecol Model 198(3–4):362–374
Williamson D, Blaker AT, Hampton C, Salter J (2015) Identifying and removing structural biases in climate models with history matching. Clim Dyn 45(5–6):1299–1324. https://doi.org/10.1007/s00382-014-2378-z

Author information

Giang T. Tran
Present address: GEOMAR Helmholtz Centre for Ocean Research Kiel, Düsternbrooker Weg 20, 24105, Kiel, Germany

Authors and Affiliations

Ocean and Earth Sciences, University of Southampton, Southampton, SO14 3ZH, UK
Giang T. Tran & Kevin I. C. Oliver
Environment, Earth and Ecosystems, The Open University, Milton Keynes, MK7 6AA, UK
Philip B. Holden & Neil R. Edwards
Engineering and the Environment, University of Southampton, Southampton, SO16 7QF, UK
András Sóbester
College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, EX4 4QE, UK
Peter Challenor

Corresponding author

Correspondence to Giang T. Tran.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 30863 KB)

Appendix: Gaussian process emulator

The climate model, $f(\cdot )$, can be viewed as a function of a set of inputs, ${\varvec{x}}=[x_1,\ldots ,x_d]$, where d is the number of perturbed model parameters. This number is commonly referred to as the number of dimensions of the emulator. The output of each model run is a scalar value y. Suppose we have n simulation runs, providing n realisations ${\varvec{y}}=[y_1=f({\varvec{x}}_1),\ldots ,y_n=f({\varvec{x}}_n)]$. These comprise the training set used to train an emulator.

First, the function $f(\cdot )$ is represented by a GP prior described by a mean function $m(\cdot )$ and a covariance function $V(\cdot ,\cdot )$

$$\begin{aligned} f(\cdot )|\varvec{\beta },\sigma ^2,\varvec{\theta } \sim {\mathcal {N}}(m(\cdot ),V(\cdot ,\cdot )). \end{aligned}$$

(12)

This GP is used as a prior for Bayesian inference. The prior does not depend on the training data but specifies the assumptions made about the function of interest. Then, the outputs from a selected number of simulations are incorporated, allowing us to update the prior to the posterior GP. This process is called training the GP model. Following Kennedy and O’Hagan (2001), $m(\cdot )$ and $V(\cdot ,\cdot )$ are modelled hierarchically, meaning that they are parameterised in terms of hyperparameters. The mean function is given by:

$$\begin{aligned} m({\varvec{x}})= {\varvec{h}}^T({\varvec{x}})\varvec{\beta }, \end{aligned}$$

(13)

where ${\varvec{h}}({\varvec{x}})$ is a vector of known regression functions of the inputs, describing a class of shapes of the function $f(\cdot )$. $\varvec{\beta }$ is an unknown vector of coefficients. In the case of ordinary kriging, ${\varvec{h}}(\cdot )={\mathbf {1}}$, making $\varvec{\beta }$ the unknown overall mean. A variation of kriging, called universal kriging, uses a linear mean function:

$$\begin{aligned} {\varvec{h}}(\cdot )=({\varvec{1}},{\varvec{x}}^T), \end{aligned}$$

(14)

where ${\varvec{h}}({\varvec{x}})^T$ is a $(s\times 1)$ vector with $s=d+1$.

The covariance function is given by:

$$\begin{aligned} V({\varvec{x}},\varvec{x'})=\sigma ^2 { {\Psi }}({\varvec{x}},\varvec{x'}), \end{aligned}$$

(15)

in which $\sigma ^2$ is an unknown variance of the GP and ${\Psi }(\cdot ,\cdot )$ is the assumed correlation function:

$$\begin{aligned} { {\Psi }}({\mathbf {x}},\mathbf {x'})=\text {exp}\left[ -\sum _{j=1}^{d}10^{\theta _j}\left| {x_j}-{x'_j}\right| ^{p_j}\right] . \end{aligned}$$

(16)

The function ${ {\Psi }}$ represents the correlation between pairs of points, which is assumed to be stationary and continuous, that is, it only depends on the distance between the pair of inputs, (${\varvec{x}}-\varvec{x'}$). This power exponential form of covariance structure is a popular choice due to its flexibility.

Both p and $\theta$ can be estimated for each dimension. For simplicity and to reduce computational cost, $p=2$ is assumed for all dimensions. An independent value of $\theta$ is obtained for each dimension by maximising the likelihood of ${\varvec{y}}$.
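The correlation function in Eq. (16) can be sketched in a few lines of NumPy. This is only an illustration: the function name `power_exp_corr` and the toy inputs are assumptions made here, not part of the original study.

```python
import numpy as np

def power_exp_corr(X1, X2, theta, p=2.0):
    """Correlation matrix Psi(X1, X2) following Eq. (16).

    X1: (n1, d) array, X2: (n2, d) array,
    theta: (d,) array of log10 scale parameters, p: common exponent.
    """
    X1 = np.atleast_2d(np.asarray(X1, dtype=float))
    X2 = np.atleast_2d(np.asarray(X2, dtype=float))
    theta = np.asarray(theta, dtype=float)
    # Pairwise |x_j - x'_j|^p for every dimension j: shape (n1, n2, d).
    diff = np.abs(X1[:, None, :] - X2[None, :, :]) ** p
    # Weight each dimension by 10**theta_j, sum over dimensions, exponentiate.
    return np.exp(-np.sum(10.0 ** theta * diff, axis=-1))

# Hypothetical two-dimensional design with three points.
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 1.0]])
theta = np.array([0.0, -1.0])
Psi = power_exp_corr(X, X, theta)
```

A larger $\theta_j$ makes the correlation decay faster along dimension j, so the emulated function is allowed to vary more rapidly with that input.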

The specified GP is used as a prior for Bayesian inference and is parameterised in terms of the hyperparameters $\varvec{\beta }$, $\sigma ^2$, $\varvec{\theta }$ and p. Because the prior is Gaussian, $\varvec{\beta }$ and $\sigma ^2$ can be marginalised analytically, which gives the marginal likelihood of the observed outputs at the n training points, ${\varvec{y}}$, as a function of $\varvec{\theta }$ and p alone. These remaining hyperparameters are then estimated by maximising that likelihood. A more detailed description of the derivations and formulations can be found in Mardia and Marshall (1984).
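As a rough illustration of likelihood-based estimation of $\theta$, the sketch below uses the simpler concentrated (profile) log-likelihood of a constant-mean GP, in which the mean and variance are replaced by their maximum-likelihood estimates, rather than the fully marginalised likelihood described above. The toy data, the coarse grid search, the small `nugget` added for numerical stability and all function names are assumptions for this example.

```python
import numpy as np

def corr(X1, X2, theta, p=2.0):
    """Power-exponential correlation of Eq. (16) between two point sets."""
    diff = np.abs(X1[:, None, :] - X2[None, :, :]) ** p
    return np.exp(-np.sum(10.0 ** np.asarray(theta, dtype=float) * diff, axis=-1))

def concentrated_loglik(theta, X, y, nugget=1e-8):
    """Concentrated log-likelihood of theta (constant mean), up to a constant."""
    n = len(y)
    A = corr(X, X, theta) + nugget * np.eye(n)
    L = np.linalg.cholesky(A)                 # A = L L^T
    a = np.linalg.solve(L, np.ones(n))        # L^{-1} 1
    b = np.linalg.solve(L, y)                 # L^{-1} y
    mu_hat = (a @ b) / (a @ a)                # generalised least-squares mean
    z = b - mu_hat * a                        # whitened residual
    sigma2_hat = (z @ z) / n                  # ML estimate of the GP variance
    log_det_A = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * n * np.log(sigma2_hat) - 0.5 * log_det_A

# Hypothetical one-dimensional training set.
X = np.linspace(0.0, 1.0, 8)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])

# Coarse grid search over theta; a numerical optimiser could be used instead.
grid = np.linspace(0.0, 3.0, 16)
loglik = np.array([concentrated_loglik([t], X, y) for t in grid])
theta_hat = float(grid[np.argmax(loglik)])
```

With d inputs the same function is maximised over a d-dimensional vector of $\theta_j$ values, one per input dimension.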

Prior beliefs about the model behaviour are combined with observations from training points to produce a posterior distribution for the model. Having obtained estimates for $\varvec{\theta }$ and p, the posterior distribution found can be used to make predictions about the model’s outputs at unsampled inputs. The predictive distribution is a Student’s t-distribution, with $n-s$ degrees of freedom

$$\begin{aligned} p(f({\varvec{x}})|{\mathbf {y}},\theta ) \sim t_{n-s}(m_1({\varvec{x}}),V_1({\varvec{x}},\varvec{x'})), \end{aligned}$$

(17)

with

$$\begin{aligned} m_1({\varvec{x}})={\varvec{h}}^T({\varvec{x}})\varvec{{\hat{\beta }}}+{\varvec{T}}({\varvec{x}})^T{\mathbf {A}}^{-1}({\varvec{y}}-{\mathbf {H}}\varvec{{\hat{\beta }}}), \end{aligned}$$

(18)

and

$$\begin{aligned} V_1({\varvec{x}},{\varvec{x}}')={\hat{\sigma }}^2 [{\Psi }({\varvec{x}},\varvec{x'})-{\varvec{T}}({\varvec{x}})^T{\mathbf {A}}^{-1}{\varvec{T}}(\varvec{x'})+{\mathbf {P}}({\varvec{x}})({\mathbf {H}}^T{\mathbf {A}}^{-1}{\mathbf {H}})^{-1}{\mathbf {P}}(\varvec{x'})^T], \end{aligned}$$

(19)

where ${\mathbf {H}}$ is the regression matrix of the design points, ${\mathbf {H}}= {\varvec{h}}({\varvec{x}})^T$, and ${\mathbf {A}}$ is the design points correlation matrix, ${\mathbf {A}}= {\Psi }({\varvec{x}},{\varvec{x}}')$; ${\varvec{T}}({\varvec{x}})$ is the correlation vector between ${\varvec{x}}$ and the training set, i.e. $({\varvec{T}}({\varvec{x}}))_i = {\Psi }({\varvec{x}},{\varvec{x}}_i)$, and ${\mathbf {P}}({\varvec{x}})={\varvec{h}}({\mathbf {x}})^T-{\varvec{T}}({\varvec{x}})^T{\mathbf {A}}^{-1}{\mathbf {H}}$. The estimated values of $\sigma ^2$ and $\varvec{\beta }$ are indicated as ${\hat{\sigma }}^2$ and $\varvec{{\hat{\beta }}}$, respectively:

$$\begin{aligned} \varvec{{\hat{\beta }}} = ({\mathbf {H}}^{\mathbf {T}}{\mathbf {A}}^{-1}{\mathbf {H}})^{-1}{\mathbf {H}}^T{\mathbf {A}}^{-1}{\varvec{y}} \end{aligned}$$

(20)

and

$$\begin{aligned} {\hat{\sigma }}^2 = \frac{{\varvec{y}}^T({\mathbf {A}}^{-1}-{\mathbf {A}}^{-1}{\mathbf {H}}({\mathbf {H}}^T{\mathbf {A}}^{-1}{\mathbf {H}})^{-1}{\mathbf {H}}^T{\mathbf {A}}^{-1}){\varvec{y}}}{n-s-2}. \end{aligned}$$

(21)

A full description of the derivation of the posterior distribution is available in Rasmussen and Williams (2006).
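The predictive equations (18)–(21) can be sketched as follows for universal kriging with a linear mean. This is an illustrative reading of the formulas, not the code used in the study: ${\varvec{T}}({\varvec{x}})$ is treated as a column vector, the denominator of Eq. (21) is taken to count the s regression functions, $\theta$ is fixed by hand, and the toy data, the `nugget` and the function names are assumptions.

```python
import numpy as np

def corr(X1, X2, theta, p=2.0):
    """Power-exponential correlation of Eq. (16) between two point sets."""
    diff = np.abs(X1[:, None, :] - X2[None, :, :]) ** p
    return np.exp(-np.sum(10.0 ** np.asarray(theta, dtype=float) * diff, axis=-1))

def krige(X, y, Xnew, theta, nugget=1e-10):
    """Posterior mean and scale matrix of Eqs. (18)-(19) with h(x) = (1, x)."""
    n = X.shape[0]
    H = np.hstack([np.ones((n, 1)), X])              # regression matrix, (n, s)
    s = H.shape[1]
    A = corr(X, X, theta) + nugget * np.eye(n)       # design correlation matrix
    Ainv_H = np.linalg.solve(A, H)
    G = H.T @ Ainv_H                                 # H^T A^{-1} H
    beta_hat = np.linalg.solve(G, H.T @ np.linalg.solve(A, y))   # Eq. (20)
    resid = y - H @ beta_hat
    Ainv_resid = np.linalg.solve(A, resid)
    # Equivalent to the quadratic form in Eq. (21), using n - s - 2.
    sigma2_hat = (resid @ Ainv_resid) / (n - s - 2)

    Hnew = np.hstack([np.ones((Xnew.shape[0], 1)), Xnew])
    T = corr(Xnew, X, theta)                         # row i holds T(x_i)^T
    mean = Hnew @ beta_hat + T @ Ainv_resid          # Eq. (18)
    P = Hnew - T @ Ainv_H
    scale = sigma2_hat * (corr(Xnew, Xnew, theta)
                          - T @ np.linalg.solve(A, T.T)
                          + P @ np.linalg.solve(G, P.T))          # Eq. (19)
    return mean, scale

# Hypothetical training data: 8 runs of a 1-D toy function.
X = np.linspace(0.0, 1.0, 8)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0]) + X[:, 0]
# Predict at the design points and at one unsampled input.
Xnew = np.vstack([X, [[1.0 / 14.0]]])
mean, scale = krige(X, y, Xnew, theta=[2.0])
```

At the design points the predictive mean reproduces the training outputs and the predictive scale collapses towards zero, while at the unsampled input the scale is positive, reflecting the emulator's uncertainty there.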

Co-kriging is an extension to this technique, which is applicable when a fast approximation of the primary simulator is available. For this method to work, the primary simulator and its approximation need to be correlated and contain information about one another.

When only a small number of expensive runs are available, it has been shown that by combining these with cheaper runs from a simplified code, an emulator of the expensive model can be built at a lower cost (Forrester et al. 2007).

We make a simplification that the expensive and cheap models, $f_e$ and $f_c$ respectively, can be represented by GP emulators with the same value of p. The cheap model is first emulated and then linked to the expensive one using the single multiplier approach:

$$\begin{aligned} f_e({\varvec{x}})=\rho f_c({\varvec{x}})+f_d({\varvec{x}}). \end{aligned}$$

(22)

The right-hand side of the equation consists of a cheap GP, $f_c$, multiplied by a scaling factor $\rho$ and a separate GP, $f_d$, modelling the stochastic residual of the expensive model (Kennedy and O’Hagan 2000; Forrester et al. 2007). Together these two terms describe the emulator of the expensive model. This approximation is chosen for its simplicity as well as the assumption that the main difference between the two models is largely a matter of scale. This assumption is made based on the fact that both EMBM and PLASIM are driven by the boundary conditions specified by GENIE-1’s ocean. They essentially share similar inputs but have the ability to respond differently.

Two sets of training points are required for the construction of a co-kriging emulator: a cheap set ${\varvec{y}}_c=f_c({\varvec{x}}_c)$, which finely samples the input space, and a small, sparse set ${\varvec{y}}_e=f_e({\varvec{x}}_e)$ of expensive points. When the number of PLASIM training points is small, such that a kriging emulator cannot be built with high accuracy, co-kriging employing a large additional number of training points from GENIE-1’s EMBM can be used instead. The number of points required depends on the size of the problem as well as the smoothness of the function being emulated. A general rule of thumb for the number of training points for kriging is 10 times the number of parameters (Loeppky et al. 2009). The inputs at which the expensive training set is obtained, ${\varvec{x}}_e$, are a subset of the cheap set, ${\varvec{x}}_c$. These expensive points are chosen using an exchange algorithm described by Cook and Nachtsheim (1980).
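The nested structure of the two designs can be illustrated with a short sketch. Note that it does not implement the Cook and Nachtsheim (1980) exchange algorithm; as a simple stand-in it picks the expensive subset with a greedy maximin (farthest-point) rule. The dimension `d`, the expensive budget `k`, the random cheap design and the function name are all hypothetical.

```python
import numpy as np

def greedy_maximin_subset(Xc, k, start=0):
    """Pick k row indices of Xc that are spread out (greedy farthest-point rule)."""
    chosen = [start]
    # Distance from every cheap point to its nearest already-chosen point.
    dmin = np.linalg.norm(Xc - Xc[start], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dmin))           # farthest point from the current subset
        chosen.append(nxt)
        dmin = np.minimum(dmin, np.linalg.norm(Xc - Xc[nxt], axis=1))
    return np.array(chosen)

d = 4                                        # hypothetical number of input parameters
n_cheap = 10 * d                             # rule of thumb: 10 points per parameter
rng = np.random.default_rng(0)
Xc = rng.random((n_cheap, d))                # stand-in for a space-filling cheap design

k = 8                                        # hypothetical budget of expensive runs
idx_e = greedy_maximin_subset(Xc, k)
Xe = Xc[idx_e]                               # expensive inputs are a subset of Xc
```

Because `Xe` is drawn from the rows of `Xc`, every expensive run has a matching cheap run at the same input, which is the nesting that the co-kriging formulation relies on.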

The covariance matrix for co-kriging, ${\Psi }_{ck}$, can be written in block form as

$$\begin{aligned} {\Psi }_{ck}= \begin{pmatrix} \sigma ^2_c{\mathbf {A}}_c({\varvec{x}}_c) &{} \quad \rho \sigma _c^2 {\mathbf {A}}_c({\varvec{x}}_c,{\varvec{x}}_e) \\ \rho \sigma _c^2 {\mathbf {A}}_c({\varvec{x}}_e,{\varvec{x}}_c) &{} \quad \rho ^2\sigma _c^2 {\mathbf {A}}_c({\varvec{x}}_e) + \sigma _e^2{\mathbf {A}}_d({\varvec{x}}_e) \\ \end{pmatrix}, \end{aligned}$$

(23)

with ${\mathbf {A}}_c= {\Psi }({\varvec{x}},\varvec{x'};\varvec{\theta }_c)$ and ${\mathbf {A}}_d= {\Psi }({\varvec{x}},\varvec{x'};\varvec{\theta }_d)$. This covariance matrix encompasses the correlation between cheap points (${\mathbf {A}}_c({\varvec{x}}_c)$), expensive points (${\mathbf {A}}_c({\varvec{x}}_e)$ and ${\mathbf {A}}_d({\varvec{x}}_e)$) and the cross-correlation between the cheap and expensive points (${\mathbf {A}}_c({\varvec{x}}_c,{\varvec{x}}_e)$ and ${\mathbf {A}}_c({\varvec{x}}_e,{\varvec{x}}_c)$). Details on the formulation and derivation of this equation can be found in Kennedy and O’Hagan (2000) and Forrester et al. (2007).
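A minimal sketch of assembling this block matrix is given below. It assumes that $f_c$ and $f_d$ in Eq. (22) are independent, as in Kennedy and O’Hagan (2000) and Forrester et al. (2007), in which case the expensive-expensive block carries a factor $\rho ^2$ because $f_c$ is scaled by $\rho$ in both arguments of the covariance. All numerical values and names are hypothetical; `sigma2_d` denotes the variance of the difference GP $f_d$, the quantity that multiplies ${\mathbf {A}}_d$ in Eq. (23).

```python
import numpy as np

def corr(X1, X2, theta, p=2.0):
    """Power-exponential correlation of Eq. (16) between two point sets."""
    diff = np.abs(X1[:, None, :] - X2[None, :, :]) ** p
    return np.exp(-np.sum(10.0 ** np.asarray(theta, dtype=float) * diff, axis=-1))

def cokriging_cov(Xc, Xe, theta_c, theta_d, sigma2_c, sigma2_d, rho):
    """Block covariance of the stacked outputs [y_c, y_e] under Eq. (22)."""
    C_cc = sigma2_c * corr(Xc, Xc, theta_c)               # cheap-cheap block
    C_ce = rho * sigma2_c * corr(Xc, Xe, theta_c)         # cheap-expensive block
    C_ee = (rho ** 2 * sigma2_c * corr(Xe, Xe, theta_c)   # scaled cheap GP
            + sigma2_d * corr(Xe, Xe, theta_d))           # difference GP f_d
    return np.block([[C_cc, C_ce], [C_ce.T, C_ee]])

# Hypothetical nested designs: 6 cheap points, 3 of them also run expensively.
Xc = np.linspace(0.0, 1.0, 6)[:, None]
Xe = Xc[[0, 2, 5]]
C = cokriging_cov(Xc, Xe, theta_c=[1.0], theta_d=[0.5],
                  sigma2_c=2.0, sigma2_d=0.5, rho=1.5)
```

The resulting matrix is symmetric and positive semi-definite, so it can take the place of the single-level correlation matrix in the kriging predictor once the hyperparameters of both levels have been estimated.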

About this article

Cite this article

Tran, G.T., Oliver, K.I.C., Holden, P.B. et al. Multi-level emulation of complex climate model responses to boundary forcing data. Clim Dyn 52, 1505–1531 (2019). https://doi.org/10.1007/s00382-018-4205-4

Received: 31 May 2017
Accepted: 06 April 2018
Published: 16 April 2018
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s00382-018-4205-4
