## Abstract

One of the goals of climate science is to characterize the statistics of extreme and potentially dangerous events in the present and future climate. Extreme events like heat waves, droughts, or floods due to persisting rains are characterized by large anomalies of the time average of an observable over a long time. The framework of Donsker–Varadhan large deviation theory could therefore be useful for their analysis. In this paper we discuss how concepts and numerical algorithms developed in the context of large deviation theory can be applied to study extreme, rare fluctuations of time averages of surface temperatures at regional scale with comprehensive numerical climate models. When performing this type of analysis, unless a rigorous study of the convergence to the large deviation limit is performed, it is easy to be misled into thinking that the asymptotic regime has been reached. We provide a systematic protocol to study the convergence of large deviation functions, tailored for applications to climate problems. Referring to the existing literature on the subject, we provide explicit formulas to compute large deviation functions directly from time series of a deterministic dynamical system, which can be applied to climate records, and we describe how to study the convergence. We show how a rare event algorithm applied to a numerical model can improve the efficiency of the computation of the large deviation functions. As a case study we consider the time averaged European surface temperature obtained with the numerical climate model Plasim. We show how a precise analysis of the convergence leads to the conclusion that the large deviation limit is not properly reached for time scales shorter than a few years, and is therefore of no practical interest for the study of midlatitude heat waves. Finally we show how, even in a case like this, rare event algorithms developed to study large deviation functions can be used to improve the statistics of events on time scales shorter than the one needed to reach the large deviation asymptotic regime.

## References

1. AghaKouchak, A.: Extremes in a Changing Climate Detection, Analysis and Uncertainty. Springer, Dordrecht (2012)
2. Ailliot, P., Allard, D., Monbet, V., Naveau, P.: Stochastic weather generators: an overview of weather type models. Journal de la Societe Francaise de Statistique **156**(1), 101–113 (2015)
3. Bouchet, F., Marston, J.B., Tangarife, T.: Fluctuations and large deviations of reynolds stresses in zonal jet dynamics. Phys. Fluids **30**(1), 015110 (2018). https://doi.org/10.1063/1.4990509
4. Bucklew, J.A.: An Introduction to Rare Event Simulation. Springer, New York (2004)
5. Coles, S.: An Introduction to Statistical Modeling of Extreme Values. Springer, New York (2001)
6. Del Moral, P.: Feynman–Kac Formulae Genealogical and Interacting Particle Systems with Applications. Springer, New York (2004)
7. Del Moral, P., Garnier, J.: Genealogical particle analysis of rare events. Ann. Appl. Prob. **15**(4), 2496–2534 (2005). https://doi.org/10.1214/105051605000000566
8. Dembo, A., Zeitouni, O.: Large Deviations and Applications. Handbook of stochastic analysis and application. CRC Press, Boca Raton (2001)
9. Donsker, M.D., Varadhan, S.S.: Asymptotic evaluation of certain markov process expectations for large time, I. Commun. Pure Appl. Math. **28**(1), 1–47 (1975)
10. Eliasen, E., Machenhauer, B., Rasmussen, E.: On a numerical method for integration of the hydrodynamical equations with a spectral representation of the horizontal fields. Københavns University, Copenhagen, Technical report, Inst. of Theor. Met. (1970)
11. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, New York (2007)
12. Fischer, E.M., Seneviratne, S.I., Vidale, P.L., Luthi, D., Schaer, C.: Soil moisture–atmosphere interactions during the 2003 european summer heat wave. J. Clim. **20**(20), 5081–5099 (2007). https://doi.org/10.1175/JCLI4288.1
13. Fraedrich, K., Jansen, H., Luksch, U., Lunkeit, F.: The planet simulator: towards a user friendly model. Meteorol. Z. **14**, 299–304 (2005)
14. Galfi, V.M., Lucarini, V., Wouters, J.: A large deviation theory-based analysis of heat waves and cold spells in a simplified model of the general circulation of the atmosphere. J. Stat. Mech. **2019**(3), 033404 (2019). https://doi.org/10.1088/1742-5468/ab02e8
15. Ghil, M., Yiou, P., Hallegatte, S., Malamud, B., Naveau, P., Soloviev, A., Friederichs, P., Keilis-Borok, V., Kondrashov, D., Kossobokov, V., Mestre, O., Nicolis, C., Rust, H., Shebalin, P., Vrac, M., Witt, A., Zaliapin, I.: Extreme events: dynamics, statistics and prediction. Nonlinear Process. Geophys. **18**(3), 295–350 (2011). https://doi.org/10.5194/npg-18-295-2011
16. Giardina, C., Kurchan, J., Peliti, L.: Direct evaluation of large-deviation functions. Phys. Rev. Lett. **96**(12), 120603 (2006). https://doi.org/10.1103/PhysRevLett.96.120603
17. Giardina, C., Kurchan, J., Lecomte, V., Tailleur, J.: Simulating rare events in dynamical processes. J. Stat. Phys. **145**(4), 787–811 (2011). https://doi.org/10.1007/s10955-011-0350-4
18. IPCC: Managing the risks of extreme events and disasters to advance climate change adaption: special report of the Intergovernmental Panel on Climate Change. Cambridge University Press, New York, NY (2012)
19. IPCC: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA (2013)
20. Kahn, H., Harris, T.E.: Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. **12**, 27–30 (1951)
21. Kifer, Y.: Large deviations in dynamical systems and stochastic processes. Trans. Am. Math. Soc. **321**(2), 505–524 (1990)
22. Kuo, H.L.: On formations and intensification of tropical cyclone through latent heat release by cumulus convection. J. Atmos. Sci. **22**, 40–63 (1965)
23. Kuo, H.L.: Further studies of the parameterization of the influence of cumulus convection on large-scale flow. J. Atmos. Sci. **31**(5), 1232–1240 (1974)
24. Lacis, A.A., Hansen, J.: A parameterization for the absorption of solar radiation in the Earth's atmosphere. J. Atmos. Sci. **31**(1), 118–133 (1974)
25. Laursen, L., Eliasen, E.: On the effects of the damping mechanisms in an atmospheric general circulation model. Tellus **41**, 385–400 (1989)
26. Lecomte, V., Tailleur, J.: A numerical approach to large deviations in continuous time. J. Stat. Mech. **2007**(03), P03004 (2007). https://doi.org/10.1088/1742-5468/2007/03/P03004
27. Lestang, T., Ragone, F., Brehier, C.E., Herbert, C., Bouchet, F.: Computing return times or return periods with rare event algorithms. J. Stat. Mech. **2018**(4), 043213 (2018). https://doi.org/10.1088/1742-5468/aab856
28. Lorenz, R., Jaeger, E.B., Seneviratne, S.I.: Persistence of heat waves and its link to soil moisture memory. Geophys. Res. Lett. **37**(9), L09703 (2010). https://doi.org/10.1029/2010GL042764
29. Louis, J.F.: A parametric model of vertical eddy fluxes in the atmosphere. Bound. Layer Meteorol. **17**(2), 187–202 (1979)
30. Louis, J.F., Tiedke, M., Geleyn, M.: A short history of the PBL parameterisation at ECMWF. In: Proceedings of the ECMWF Workshop on Planetary Boundary Layer Parameterization. pp. 59–80. Reading (1981)
31. Lucarini, V., Faranda, D., Freitas, A.C.G.M.M., Freitas, J.M.M., Holland, M., Kuna, T., Nicol, M., Todd, M., Vaienti, S.: Extremes and Recurrence in Dynamical Systems. Wiley, Hoboken (2016)
32. Orszag, S.A.: Transform method for the calculation of vector-coupled sums: application to the spectral form of the vorticity equation. J. Atmos. Sci. **27**(6), 890–895 (1970)
33. Pohorille, A., Jarzynski, C., Chipot, C.: Good practices in free-energy calculations. J. Phys. Chem. B **114**, 10235–10253 (2010)
34. Ragone, F., Wouters, J., Bouchet, F.: Computation of extreme heat waves in climate models using a large deviation algorithm. Proc. Natl. Acad. Sci. USA **115**(1), 24–29 (2018). https://doi.org/10.1073/pnas.1712645115
35. Roeckner, E., Arpe, K., Bengtsson, L., Brinkop, S., Dümenil, L., Esch, M., Kirk, E., Lunkeit, F., Ponater, M., Rockel, B., Sausen, R., Schlese, U., Schubert, S., Windelband, M.: Simulation of present day climate with the ECHAM model: impact of model physics and resolution. Technical Report, 93. Technical report, Max Planck Institut für Meteorologie, Hamburg, (1992)
36. Rohwer, C.M., Angeletti, F., Touchette, H.: Convergence of large-deviation estimators. Phys. Rev. E **92**, 052104 (2015). https://doi.org/10.1103/PhysRevE.92.052104
37. Rubino, G., Tuffin, B.: Rare Event Simulation Using Monte Carlo Methods. Wiley, Chichester (2009)
38. Sasamori, T.: The radiative cooling calculation for application to general circulation experiments. J. Appl. Meteorol. **7**(5), 721–729 (1968)
39. Semtner, A.J.: A model for the thermodynamic growth of sea ice in numerical investigations of climate. J. Phys. Oceanogr. **6**(3), 379–389 (1976)
40. Slingo, A., Slingo, J.M.: Response of the National Center for Atmospheric Research community climate model to improvements in the representation of clouds. J. Geophys. Res. **96**(D8), 15341 (1991)
41. Stefanon, M., D’Andrea, F., Drobinski, P.: Heatwave classification over Europe and the Mediterranean region. Environ. Res. Lett. **7**, 014023 (2012)
42. Stephens, G.L., Paltridge, G.W., Platt, C.M.R.: Radiation profiles in extended water clouds. III: observations. J. Atmos. Sci. **35**(11), 2133–2141 (1978)
43. Stephens, G.L., Ackerman, S., Smith, E.A.: A shortwave parameterization revised to improve cloud absorption. J. Atmos. Sci. **41**(4), 687–690 (1984)
44. Touchette, H.: The large deviation approach to statistical mechanics. Phys. Rep. **478**, 1–69 (2009)
45. Veneziano, D., Langousis, A., Lepore, C.: New asymptotic and preasymptotic results on rainfall maxima from multifractal theory. Water Resour. Res. **45**(11), W11421 (2009). https://doi.org/10.1029/2009WR008257
46. Welch, P.D.: The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. **15**(2), 70–73 (1967)
47. Wilks, D., Wilby, R.: The weather generation game: a review of stochastic weather models. Prog. Phys. Geogr. **23**(3), 329–357 (1999)
48. Young, L.S.: Large deviations in dynamical systems. Trans. Am. Math. Soc. **318**(2), 525–543 (1990)

## Acknowledgements

The authors thank the editor and two anonymous reviewers for their constructive criticism and suggestions. The research leading to these results has received funding from the European Research Council under the European Union’s seventh Framework Programme (FP7/2007-2013 Grant Agreement No. 616811). The simulations have been performed on the machines of the Pôle Scientifique de Modélisation Numérique (PSMN) and of the Centre Informatique National de l’Enseignement Supérieur (CINES).

## Ethics declarations

### Conflict of interest

The authors declare that they have no conflict of interest.

## Additional information

Communicated by Valerio Lucarini.

## Appendix: Convergence of Direct Estimate of Large Deviation Functions and Statistical Errors

Choosing the size of the time block \(\tau _{b}\) and testing the convergence to the large deviation limit both require computing the autocorrelation time of the process, which sets a lower bound on the values of the averaging time that it makes sense to consider. Figure 2b shows, for the first 50 days, the autocorrelation function *C*(*t*) of the average European surface temperature \(T_{s}\), computed from a 1000-year-long run. To a first approximation the function is well described by a double exponential, with a first decay time of about 4 days, compatible with the time scale of synoptic variability, followed by a slowly decaying tail that, at least in its first part, seems to decay exponentially on a time scale of 1 month. The longer time scales inducing the slow decay of the autocorrelation function could be related to the low-frequency variability of the atmospheric dynamics, and/or to time scales related to the water vapor cycle in the atmosphere and to land surface processes.

The integral autocorrelation time \(\tau _{c}\) is defined as the integral from time lag 0 to \(+\infty \) of the autocorrelation function \(C(t)=\mathbb {E}\left[ (A(X(t))-\mu )(A(X(0))-\mu )\right] /\sigma ^{2}\). An equivalent expression for \(\tau _{c}\) is [3]
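The display equation is not reproduced in this copy. A form consistent with the definition of \(C(t)\) above, written here as an assumption rather than as the exact expression used in [3], is

\[
\tau _{c}=\lim _{\tau \rightarrow \infty }\frac{1}{2\sigma ^{2}\tau }\,\mathbb {E}\left[ \left( \int _{0}^{\tau }\left( A(X(t))-\mu \right) \mathrm {d}t\right) ^{2}\right] .
\]

For a stationary process with an integrable autocorrelation function, expanding the square gives the double integral of \(\sigma ^{2}C(t-s)\) over \([0,\tau ]^{2}\), whose leading term for large \(\tau \) is \(2\sigma ^{2}\tau \int _{0}^{+\infty }C(t)\,\mathrm {d}t\), so that the limit recovers the integral of the autocorrelation function.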

Equation (21) gives a better estimator of the autocorrelation time than a simple time integration of the autocorrelation function. In practice we divide the time series into \(N_{b}\) blocks of length \(\tau _{b}\), compute the integrals in (21) in each block, and approximate the expectation value by the sample mean over the \(N_{b}\) blocks. Figure 9 shows the value of the estimate of the autocorrelation time as a function of \(\tau _{b}\). The shaded area represents the 95% confidence interval of the estimate, computed as two standard deviations of the sample of estimates over the \(N_{b}\) blocks. The estimate converges to a value of about 7.5 days, but a very large value of \(\tau _{b}\), of at least 3 years, is necessary in order to reach convergence.
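The block procedure described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes an evenly sampled series, estimates \(\mu \) and \(\sigma ^{2}\) from the whole series, and uses the block form \(\left( \int (A-\mu )\,\mathrm {d}t\right) ^{2}/(2\sigma ^{2}\tau _{b})\) in each block, which is itself an assumption about the missing display equation. The function name `autocorrelation_time` and its arguments are hypothetical.

```python
import numpy as np


def autocorrelation_time(a, dt, block_len):
    """Block estimate of the integral autocorrelation time.

    a         : 1-D array, evenly sampled observable A(X(t))
    dt        : sampling interval (same time unit as block_len)
    block_len : block length tau_b

    Returns (tau_c, half_width), where half_width is two standard
    deviations of the per-block estimates (assumed reading of the
    confidence band described in the text).
    """
    a = np.asarray(a, dtype=float)
    p = int(round(block_len / dt))           # samples per block
    n_blocks = a.size // p
    if p < 1 or n_blocks < 2:
        raise ValueError("need at least two full blocks")

    mu = a.mean()
    sigma2 = a.var()
    # Non-overlapping blocks; a trailing incomplete block is discarded.
    anomalies = (a[: n_blocks * p] - mu).reshape(n_blocks, p)
    # Riemann-sum approximation of the time integral in each block.
    integrals = anomalies.sum(axis=1) * dt
    per_block = integrals**2 / (2.0 * sigma2 * p * dt)

    tau_c = per_block.mean()
    half_width = 2.0 * per_block.std(ddof=1)
    return tau_c, half_width
```

Calling the function for increasing values of `block_len` and plotting the result against `block_len` gives a convergence check of the kind described for Figure 9.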

Once the autocorrelation time has been computed, the first step of the direct estimate is to compute the generating function

knowing that \(\tau _{b}\) will have to be much larger than \(\tau _{c}\). When we deal with a discrete time series as the output of a numerical model, where time is discretized in time steps of length \(\Delta t\), this means in practice computing

where \(p=\tau _{b}/\Delta t\). Following [3], a more sophisticated way that makes a better use of the available data would be to compute

In (24) the sample mean is computed on \(2N_{b}\) blocks overlapping by 50%, as suggested by the Welch estimator of the power spectrum of a random process [46]. Using (24) instead of (23) does not change the results of the estimate or the convergence region, but gives smaller statistical errors where they can be computed. In the following we keep the simpler notation (22) for ease of presentation.
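The two block estimators discussed above, with non-overlapping blocks and with blocks overlapping by 50%, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the block observable \(A_{\tau _{b}}^{j}\) is taken to be the time integral of the observable over block *j*, approximated by a Riemann sum, and the estimate is taken to be \(\hat{\lambda }=\frac{1}{\tau _{b}}\log \left( \frac{1}{N}\sum _{j}e^{kA_{\tau _{b}}^{j}}\right) \), with *N* the number of blocks actually used. The names `block_integrals` and `scgf_estimate` are hypothetical.

```python
import numpy as np


def block_integrals(a, dt, block_len, overlap=False):
    """Time integrals of the observable over blocks of length block_len.

    With overlap=True consecutive blocks are shifted by half a block
    (50% overlap, in the spirit of the Welch estimator).
    Returns (A, tau_b) where tau_b = p * dt is the block length used.
    """
    a = np.asarray(a, dtype=float)
    p = int(round(block_len / dt))            # samples per block
    if p < 1 or a.size < p:
        raise ValueError("block length incompatible with the series")
    step = max(1, p // 2) if overlap else p
    starts = np.arange(0, a.size - p + 1, step)
    A = np.array([a[s:s + p].sum() * dt for s in starts])
    return A, p * dt


def scgf_estimate(a, dt, block_len, k, overlap=False):
    """Estimate lambda(k) = log(mean(exp(k * A_j))) / tau_b for each k."""
    A, tau_b = block_integrals(a, dt, block_len, overlap)
    k = np.atleast_1d(np.asarray(k, dtype=float))
    exponents = k[:, None] * A[None, :]
    # Subtract the largest exponent before exponentiating to avoid overflow.
    m = exponents.max(axis=1, keepdims=True)
    log_G = m[:, 0] + np.log(np.exp(exponents - m).mean(axis=1))
    return log_G / tau_b
```

For a series of length *T* the overlapping variant uses about twice as many blocks as the non-overlapping one. The blocks are then no longer independent, which matters when computing error bars.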

As discussed in the main text, in a practical application one is constrained by the fixed length *T* of the time series, and the choice of \(\tau _{b}\) and \(N_{b}\) has to be considered carefully. The convergence of the estimators has been studied in [36]. In the case of unbounded variables, obtaining a correct estimate is limited by two problems: (1) the artificial linearization of the tails of the functions due to the finite size of the sample and (2) the non-uniform convergence for different values of *k*.

The linearization effect is an artefact in the estimate of \(\hat{G}(k,\tau _{b},N_{b})\) for large values of *k*, which causes the estimate of \(\hat{\lambda }(k,\tau _{b},N_{b})\) to become linear in *k* for any value of *k* whose modulus is large enough. This is due to the fact that a sum of exponentials over a finite sample, such as the one involved in (22), is dominated for large *k* by the largest value in the sample, so that \(\sum _{j=1}^{N_{b}}e^{kA_{\tau _{b}}^{j}}\approx e^{kA_{\tau _{b}}^{max}}\), with \(A_{\tau _{b}}^{max}=\max _{j}\{A_{\tau _{b}}^{j}\}\). Therefore, for a given pair of \(\tau _{b}\) and \(N_{b}\), for positive *k* there is an upper critical value \(k_{c}^{+}(\tau _{b},N_{b})>0\) such that \(\hat{\lambda }(k,\tau _{b},N_{b})\approx kA_{\tau _{b}}^{max}\) for \(k>k_{c}^{+}(\tau _{b},N_{b})\). Similarly, for negative *k* there is a lower critical value \(k_{c}^{-}(\tau _{b},N_{b})<0\) such that \(\hat{\lambda }(k,\tau _{b},N_{b})\approx kA_{\tau _{b}}^{min}\) for \(k<k_{c}^{-}(\tau _{b},N_{b})\). If an observable is bounded, the linear behavior is actually correct. For unbounded variables it is instead an artefact of the finite size of the sample.

Scaling arguments can be provided to estimate \(k_{c}^{+}(\tau _{b},N_{b})\) and \(k_{c}^{-}(\tau _{b},N_{b})\), as discussed in detail in [36]. However, the actual values depend on the underlying probability distribution of the process, and in complex applications they have to be estimated case by case by empirical analysis. A simple way to proceed is to compute the relative contribution of the largest value to the sample mean

\[
r(k,\tau _{b},N_{b})=\frac{e^{kA_{\tau _{b}}^{max}}}{\sum _{j=1}^{N_{b}}e^{kA_{\tau _{b}}^{j}}}.
\]

By fixing an arbitrary upper threshold for \(r(k,\tau _{b},N_{b})\), one finds an estimate for the value of \(k_{c}^{+}(\tau _{b},N_{b})\) (and an equivalent procedure gives a value for \(k_{c}^{-}(\tau _{b},N_{b})\)). Figure 10a shows \(r(k,\tau _{b},N_{b})\) as a function of *k* for different values of \(\tau _{b}\) for which there is actual convergence to the large deviation limit. Figure 10b shows the estimate of \(k_{c}^{+}(\tau _{b},N_{b})\) as a function of \(\tau _{b}\), obtained taking a threshold of 50% for \(r(k,\tau _{b},N_{b})\). There is a large difference in \(k_{c}^{+}(\tau _{b},N_{b})\) between a value of \(\tau _{b}\) of about 1 year and one of 3–4 years. Moreover, the estimate for lower values of \(\tau _{b}\) is extremely unstable, showing that when proper convergence in time is not reached, the convergence of the statistical estimator itself is not well behaved either. For \(\tau _{b}\) larger than 3 years the estimate of \(k_{c}^{+}(\tau _{b},N_{b})\) stabilizes around a value of \(5\,\mathrm {K}^{-1}\,\mathrm {years}^{-1}\). We have therefore taken \(\tau _{b}=3\) years and \(k_{c}^{+}(\tau _{b},N_{b})=5\,\mathrm {K}^{-1}\,\mathrm {years}^{-1}\). A similar analysis gives \(k_{c}^{-}(\tau _{b},N_{b})=-2.5\,\mathrm {K}^{-1}\,\mathrm {years}^{-1}\).
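The threshold procedure can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes that \(r\) is the ratio between the dominant exponential term and the sum of all the terms, which for positive *k* is the term of the largest block value and for negative *k* is the term of the smallest one. It also assumes that the grid of *k* values is scanned in order of increasing modulus. The names `dominant_term_fraction` and `critical_k` are hypothetical.

```python
import numpy as np


def dominant_term_fraction(A, k):
    """Fraction of sum_j exp(k * A_j) carried by its largest term.

    For k > 0 the largest term is exp(k * max(A)); for k < 0 it is
    exp(k * min(A)). At k = 0 every term contributes 1 / len(A).
    """
    A = np.asarray(A, dtype=float)
    k = np.atleast_1d(np.asarray(k, dtype=float))
    exponents = k[:, None] * A[None, :]
    # After subtracting the largest exponent the dominant term equals 1.
    m = exponents.max(axis=1, keepdims=True)
    return 1.0 / np.exp(exponents - m).sum(axis=1)


def critical_k(A, k_grid, threshold=0.5):
    """First k on the grid where the dominant term reaches the threshold.

    k_grid must have a single sign and be ordered by increasing |k|.
    Returns None if the threshold is never reached on the grid.
    """
    k_grid = np.asarray(k_grid, dtype=float)
    r = dominant_term_fraction(A, k_grid)
    reached = np.nonzero(r >= threshold)[0]
    return float(k_grid[reached[0]]) if reached.size else None
```

Passing a grid of positive values gives an estimate of \(k_{c}^{+}\), and passing the mirrored negative grid gives \(k_{c}^{-}\). The threshold of 0.5 mirrors the 50% used in the text, and the result depends on that arbitrary choice.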

Once the convergence region has been identified, one can compute statistical errors in half of it, following [36]. The error on the generating function can be naturally estimated as

where \(\mathrm {var}(\hat{G}(k,\tau _{b},N_{b}))\) is the empirical variance associated with the sample mean replacing the expectation value. An estimate of the associated error on \(\hat{\lambda }(k,\tau _{b},N_{b})\) can be computed by taking a Taylor expansion of the estimator [33, 36]

The statistical error on \(\hat{a}(k,\tau _{b},N_{b})\) can be estimated by

where \(\hat{H}(k,\tau _{b},N_{b})=\sum _{j=1}^{N_{b}}A_{\tau _{b}}^{j}e^{kA_{\tau _{b}}^{j}}\) and \({{\text {err}}}[\hat{H}(k,\tau _{b},N_{b})]\) is computed in the same way as \({{\text {err}}}[\hat{G}(k,\tau _{b},N_{b})]\). This formula is obtained assuming that \(\hat{H}(k,\tau _{b},N_{b})\) and \(\hat{G}(k,\tau _{b},N_{b})\) are independent [36]. The error on \(\hat{I}(\hat{a}(k,\tau _{b},N_{b}),\tau _{b},N_{b})\) can then be estimated as

again assuming independence between \(\hat{a}(k,\tau _{b},N_{b})\) and \(\hat{\lambda }(k,\tau _{b},N_{b})\).
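The chain of error estimates described above can be sketched as follows. This is a minimal illustration under explicit assumptions, not the authors' formulas, which are not reproduced in this copy. It assumes standard errors of sample means for \(\hat{G}\) and \(\hat{H}\), first-order (Taylor) propagation through the logarithm and through the ratio \(\hat{H}/\hat{G}\), independence where the text states it, block observables defined as time integrals, and the conventions \(\hat{\lambda }=\log \hat{G}/\tau _{b}\), \(\hat{a}=\hat{H}/(\tau _{b}\hat{G})\) and \(\hat{I}=k\hat{a}-\hat{\lambda }\). The function name `ldf_point_with_errors` is hypothetical.

```python
import numpy as np


def ldf_point_with_errors(A, k, tau_b):
    """Estimates of lambda(k), a(k), I(a(k)) with first-order error bars.

    A     : 1-D array of block integrals of the observable
    k     : scalar conjugate variable (inside the convergence region)
    tau_b : block length
    """
    A = np.asarray(A, dtype=float)
    n = A.size
    # Common rescaling exp(-shift) avoids overflow; ratios are unaffected.
    shift = np.max(k * A)
    w = np.exp(k * A - shift)

    G = w.mean()                                   # rescaled generating function
    H = (A * w).mean()                             # rescaled weighted mean of A
    err_G = w.std(ddof=1) / np.sqrt(n)             # standard error of the mean
    err_H = (A * w).std(ddof=1) / np.sqrt(n)

    lam = (shift + np.log(G)) / tau_b
    err_lam = err_G / (tau_b * G)                  # d(log G) = dG / G

    a = H / (tau_b * G)
    # Ratio H / G with H and G treated as independent (assumption).
    err_a = np.sqrt(err_H**2 + (H / G) ** 2 * err_G**2) / (tau_b * G)

    rate = k * a - lam
    # a and lambda treated as independent (assumption).
    err_rate = np.sqrt((k * err_a) ** 2 + err_lam**2)

    return {"lam": lam, "err_lam": err_lam, "a": a, "err_a": err_a,
            "rate": rate, "err_rate": err_rate}
```

Since \(\hat{H}\) and \(\hat{G}\) are computed from the same blocks they are correlated in practice, so the independence assumption should be read as a convenient approximation. A bootstrap over blocks would be one way to check it.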

## About this article

### Cite this article

Ragone, F., Bouchet, F. Computation of Extreme Values of Time Averaged Observables in Climate Models with Large Deviation Techniques. *J Stat Phys* **179**, 1637–1665 (2020). https://doi.org/10.1007/s10955-019-02429-7

### Keywords

- Large deviation theory
- Rare event algorithms
- Climate extremes
- Heat waves