
Using regional scaling for temperature forecasts with the Stochastic Seasonal to Interannual Prediction System (StocSIPS)


Over time scales between 10 days and 10–20 years—the macroweather regime—atmospheric fields, including the temperature, respect statistical scale symmetries, such as power-law correlations, that imply a huge memory in the system that can be exploited for long-term forecasts. The Stochastic Seasonal to Interannual Prediction System (StocSIPS) is a stochastic model that exploits these symmetries to perform long-term forecasts. It models the temperature as the high-frequency limit of the (fractional) energy balance equation, which governs radiative equilibrium processes when the relevant relaxation processes are power law rather than exponential. These are obtained when the order of the relaxation equation is fractional rather than integer, and the equations are solved as past-value problems rather than initial-value problems. StocSIPS was first developed for monthly and seasonal forecasts of the globally averaged temperature. In this paper, we extend it to the prediction of the spatially resolved temperature field by treating each grid point as an independent time series. Compared to traditional general circulation models (GCMs), StocSIPS has the advantage that its predictions converge to the real-world climate: it extracts the internal variability (weather noise) directly from past data and does not suffer from model drift. Here we apply StocSIPS to obtain monthly and seasonal predictions of the surface temperature and show some preliminary comparisons with multi-model ensemble (MME) GCM results. For 1 month lead time, our simple stochastic model shows similar, and somewhat higher, skill scores than the much more complex deterministic models.




References

1. Biagini F, Hu Y, Øksendal B, Zhang T (2008) Stochastic calculus for fractional Brownian motion and applications. Springer, London
2. Blender R, Fraedrich K, Hunt B (2006) Millennial climate variability: GCM-simulation and Greenland ice cores. Geophys Res Lett 33:L04710
3. Box GEP, Jenkins GM, Reinsel GC (2008) Time series analysis. Wiley, New York
4. Brockwell PJ, Davis RA (1991) Time series: theory and methods. Springer, New York
5. Christensen HM, Moroz IM, Palmer TN (2015) Evaluation of ensemble forecast uncertainty using a new proper score: application to medium-range and seasonal forecasts. Q J R Meteorol Soc 141:538–549
6. Christensen HM, Berner J, Coleman DRB, Palmer TN (2017) Stochastic parameterization and El Niño-Southern Oscillation. J Clim 30:17–38
7. Clarke DC, Richardson M (2021) The benefits of continuous local regression for quantifying global warming. Earth Sp Sci 8:e2020EA001082
8. Cleveland WS, Devlin SJ (1988) Locally weighted regression: an approach to regression analysis by local fitting. J Am Stat Assoc 83:596
9. Davini P, von Hardenberg J, Corti S et al (2017) Climate SPHINX: evaluating the impact of resolution and stochastic physics parameterisations in the EC-Earth global climate model. Geosci Model Dev 10:1383–1402
10. Delignières D (2015) Correlation properties of (discrete) fractional Gaussian noise and fractional Brownian motion. Math Probl Eng 2015:1–7
11. Del Rio Amador L, Lovejoy S (2019) Predicting the global temperature with the Stochastic Seasonal to Interannual Prediction System (StocSIPS). Clim Dyn 53:4373–4411
12. Del Rio Amador L, Lovejoy S (2021) Long-range forecasting as a past value problem: untangling correlations and causality with scaling. Geophys Res Lett
13. Fisher RA (1915) Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10:507
14. Franzke C (2012) Nonlinear trends, long-range dependence, and climate noise properties of surface temperature. J Clim 25:4172–4183
15. Franzke CLE, O'Kane TJ, Berner J et al (2015) Stochastic climate theory and modeling. Wiley Interdiscip Rev Clim Change 6:63–78
16. Gneiting T, Raftery AE, Westveld AH, Goldman T (2005) Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon Weather Rev 133:1098–1118
17. Gottwald GA, Crommelin DT, Franzke CLE (2017) Stochastic climate theory. In: Franzke CLE, O'Kane TJ (eds) Nonlinear and stochastic climate dynamics. Cambridge University Press, Cambridge, pp 209–240
18. Graham R, Yun W, Kim J et al (2011) Long-range forecasting and the Global Framework for Climate Services. Clim Res 47:47–55
19. Gripenberg G, Norros I (1996) On the prediction of fractional Brownian motion. J Appl Probab 33:400–410
20. Hasselmann K (1976) Stochastic climate models. Part I. Theory. Tellus 28:473–485
21. Hébert R, Lovejoy S (2018) Regional climate sensitivity- and historical-based projections to 2100. Geophys Res Lett 45:4248–4254
22. Hersbach H (2000) Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather Forecast 15:559–570
23. Hipel KW, McLeod AI (1994) Time series modelling of water resources and environmental systems. Elsevier, Amsterdam, pp 1–1013
24. Jolliffe IT, Stephenson DB (2011) Forecast verification: a practitioner's guide in atmospheric science, 2nd edn. Wiley, Hoboken
25. Kalnay E, Kanamitsu M, Kistler R et al (1996) The NCEP/NCAR 40-year reanalysis project. Bull Am Meteorol Soc 77:437–471
26. Keller JD, Hense A (2011) A new non-Gaussian evaluation method for ensemble forecasts based on analysis rank histograms. Meteorol Z 20:107–117
27. Kim G, Ahn J, Kryjov VN et al (2020) Assessment of MME methods for seasonal prediction using WMO LC-LRFMME hindcast dataset. Int J Climatol
28. Koscielny-Bunde E, Bunde A, Havlin S et al (1998) Indication of a universal persistence law governing atmospheric variability. Phys Rev Lett 81:729–732
29. Kryjov VN, Kang H-W, Nohara D et al (2006) Assessment of the climate forecasts produced by individual models and MME methods. APCC Technical Report 2006, APEC Climate Center, Busan, South Korea
30. Leutbecher M, Palmer TN (2008) Ensemble forecasting. J Comput Phys 227:3515–3539
31. Lovejoy S (2014) Scaling fluctuation analysis and statistical hypothesis testing of anthropogenic warming. Clim Dyn 42:2339–2351
32. Lovejoy S (2018) Spectra, intermittency, and extremes of weather, macroweather and climate. Sci Rep 8:12697
33. Lovejoy S (2019) Fractional relaxation noises, motions and the fractional energy balance equation. Nonlinear Process Geophys Discuss
34. Lovejoy S (2021a) The half-order energy balance equation, Part 1: the homogeneous HEBE and long memories. Earth Syst Dynam
35. Lovejoy S (2021b) The half-order energy balance equation, Part 2: the inhomogeneous HEBE and 2D energy balance models. Earth Syst Dynam Discuss
36. Lovejoy S, Schertzer D (1986) Scale invariance in climatological temperatures and the spectral plateau. Ann Geophys 4B:401–410
37. Lovejoy S, Schertzer D (2010) Towards a new synthesis for atmospheric dynamics: space–time cascades. Atmos Res 96:1–52
38. Lovejoy S, Schertzer D (2012a) Haar wavelets, fluctuations and structure functions: convenient choices for geophysics. Nonlinear Process Geophys 19:513–527
39. Lovejoy S, Schertzer D (2012b) Low-frequency weather and the emergence of the climate. In: Sharma AS, Bunde A, Dimri VP, Baker DN (eds) Extreme events and natural hazards: the complexity perspective
40. Lovejoy S, Schertzer D (2013) The weather and climate: emergent laws and multifractal cascades. Cambridge University Press, Cambridge
41. Lovejoy S, del Rio Amador L, Hébert R (2015) The ScaLIng Macroweather Model (SLIMM): using scaling to forecast global-scale macroweather from months to decades. Earth Syst Dyn 6:637–658
42. Lovejoy S, Procyk R, Hébert R, Del Rio Amador L (2021) The fractional energy balance equation. Q J R Meteorol Soc
43. Mandelbrot BB, Van Ness JW (1968) Fractional Brownian motions, fractional noises and applications. SIAM Rev 10:422–437
44. Meinshausen M, Smith SJ, Calvin K et al (2011) The RCP greenhouse gas concentrations and their extensions from 1765 to 2300. Clim Change 109:213–241
45. Mori H (1965) Transport, collective motion, and Brownian motion. Prog Theor Phys 33:423–455
46. Murphy AH (1988) Skill scores based on the mean square error and their relationships to the correlation coefficient. Mon Weather Rev 116:2417–2424
47. NCEP/NCAR (2020) NCEP/NCAR reanalysis 1. Accessed 3 Jan 2020
48. Newman M (2013) An empirical benchmark for decadal forecasts of global surface temperature anomalies. J Clim 26:5260–5269
49. Newman M, Sardeshmukh PD, Winkler CR, Whitaker JS (2003) A study of subseasonal predictability. Mon Weather Rev 131:1715–1732
50. Palma W (2007) Long-memory time series. Wiley, Hoboken
51. Palmer TN (2019) Stochastic weather and climate models. Nat Rev Phys 1:463–471
52. Palmer T, Buizza R, Hagedorn R et al (2006) Ensemble prediction: a pedagogical perspective. ECMWF Newsl 106:10–17
53. Pasternack A, Bhend J, Liniger MA et al (2018) Parametric decadal climate forecast recalibration (DeFoReSt 1.0). Geosci Model Dev 11:351–368
54. Penland C, Matrosova L (1994) A balance condition for stochastic numerical models with application to the El Niño-Southern Oscillation. J Clim 7:1352–1372
55. Penland C, Sardeshmukh PD (1995) The optimal growth of tropical sea surface temperature anomalies. J Clim 8:1999–2024
56. Procyk R, Lovejoy S, Hébert R (2020) The fractional energy balance equation for climate projections through 2100. Earth Syst Dynam Discuss
57. Rackow T, Juricke S (2020) Flow-dependent stochastic coupling for climate models with high ocean-to-atmosphere resolution ratio. Q J R Meteorol Soc 146:284–300
58. Rypdal K, Østvand L, Rypdal M (2013) Long-range memory in Earth's surface temperature on time scales from months to centuries. J Geophys Res Atmos 118:7046–7062
59. Sardeshmukh PD, Sura P (2009) Reconciling non-Gaussian climate statistics with linear dynamics. J Clim 22:1193–1207
60. Trenberth KE (1997) The definition of El Niño. Bull Am Meteorol Soc 78:2771–2777
61. Varotsos CA, Efstathiou MN, Cracknell AP (2013) On the scaling effect in global surface air temperature anomalies. Atmos Chem Phys 13:5243–5253
62. Williams PD (2012) Climatic impacts of stochastic fluctuations in air–sea fluxes. Geophys Res Lett
63. Winkler CR, Newman M, Sardeshmukh PD (2001) A linear model of wintertime low-frequency variability. Part I: Formulation and forecast skill. J Clim 14:4474–4494
64. WMO (2010a) Standardised verification system (SVS) for long-range forecasts (LRF). New attachment II-8 to the Manual on the GDPS, WMO-No. 485, vol 1. Geneva, Switzerland
65. WMO (2010b) Manual on the Global Data-processing and Forecasting System, Volume I (WMO-No. 485). Geneva, Switzerland
66. Wold H (1938) A study in the analysis of stationary time series. Almqvist & Wiksell, Uppsala
67. Wu Z, Huang NE (2009) Ensemble empirical mode decomposition: a noise-assisted data analysis method. Adv Adapt Data Anal 1:1–41
68. Yuan N, Fu Z, Liu S (2015) Extracting climate memory using Fractional Integrated Statistical Model: a new perspective on climate prediction. Sci Rep 4:6577
69. Zampieri L, Goessling HF, Jung T (2018) Bright prospects for Arctic sea ice prediction on subseasonal time scales. Geophys Res Lett 45:9731–9738
70. Zeiler A, Faltermeier R, Keck IR et al (2010) Empirical mode decomposition: an introduction. In: The 2010 international joint conference on neural networks (IJCNN). IEEE, pp 1–8
71. Zwanzig R (1973) Nonlinear generalized Langevin equations. J Stat Phys 9:215–220
72. Zwanzig R (2001) Nonequilibrium statistical mechanics, 1st edn. Oxford University Press, Oxford



This study was funded by Hydro-Québec (Bourse de doctorat Hydro-Québec en science-F213013R02) and Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2016-04796).

Author information



Corresponding author

Correspondence to Lenin Del Rio Amador.



Appendix 1: Basic theory for fGn processes

Continuous-in-time fGn

In DRAL, the stochastic natural variability component of the globally averaged temperature was represented as an fGn process. The main properties of fGn relevant for the present paper are summarized in the following.

An fGn process at resolution \(\tau\) (the scale at which the series is averaged) has the following integral representation:

$$T_{\tau } \left( t \right) = \frac{1}{\tau }\frac{{c_{H} \sigma_{T} }}{{\Gamma \left( {H + 3/2} \right)}}\left[ {\int\limits_{ - \infty }^{t} {\left( {t - t^{\prime}} \right)^{H + 1/2} \gamma \left( {t^{\prime}} \right)dt^{\prime}} - \int\limits_{ - \infty }^{t - \tau } {\left( {t - \tau - t^{\prime}} \right)^{H + 1/2} \gamma \left( {t^{\prime}} \right)dt^{\prime}} } \right],$$

where \(\gamma \left(t\right)\) is a unit Gaussian \(\delta\)-correlated white noise process with \(\langle \gamma \left(t\right)\rangle =0\) and \(\langle \gamma \left(t\right)\gamma \left(t^{\prime}\right)\rangle =\delta \left(t-t^{\prime}\right)\) [\(\delta \left(x\right)\) is the Dirac function], \(\Gamma \left(x\right)\) is the Euler gamma function, \({\sigma }_{T}\) is the ensemble standard deviation (for \(\tau =1\)) and

$$c_{H}^{2} = \frac{\pi }{{2\cos \left( {\pi H} \right)\Gamma \left( { - 2 - 2H} \right)}}.$$

This is the canonical value of the constant \({c}_{H}\), chosen to make the expressions for the statistics particularly simple. In particular, the variance is \(\langle {{T}_{\tau }\left(t\right)}^{2}\rangle ={{\sigma }_{T}}^{2}{\tau }^{2H}\) for all \(t\), where \(\langle \bullet \rangle\) denotes ensemble (infinite-realizations) averaging. The parameter \(H\), with \(-1<H<0\), is the fluctuation exponent of the corresponding fractional Gaussian noise process; it is related to the Hurst exponent by \({H}^{^{\prime}}=H+1\). Fluctuation exponents are used because of their wider generality: they are well defined even for strongly intermittent, non-Gaussian multifractal processes, and they can take any real value. For a discussion, see page 643 of Lovejoy et al. (2015).

Equation (15) can be interpreted as the smoothing of a fractional integral of a white noise process, or as a power-law weighted average of past innovations, \(\gamma \left(t\right)\). This power-law weighting accounts for the memory effects in the temperature series: the closer the fluctuation exponent is to zero, the larger the influence of past values on the current temperature. This is evidenced by the behaviour of the autocorrelation function:

$$R_{H} \left( {\Delta t} \right) = \frac{{\left\langle {T_{\tau } \left( t \right)T_{\tau } \left( {t + \Delta t} \right)} \right\rangle }}{{\left\langle {T_{\tau } \left( t \right)^{2} } \right\rangle }} = \frac{1}{2}\left( {\left| {\frac{\Delta t}{\tau } + 1} \right|^{2H + 2} + \left| {\frac{\Delta t}{\tau } - 1} \right|^{2H + 2} - 2\left| {\frac{\Delta t}{\tau }} \right|^{2H + 2} } \right),$$

for \(\left|\Delta t\right|\ge \tau\). In particular, for \(\Delta t\gg \tau\) we obtain:

$$R_{H} \left( {\Delta t} \right) \approx \left( {H + 1} \right)\left( {2H + 1} \right)\left( {\frac{\Delta t}{\tau }} \right)^{2H} ,$$

which has a power-law behaviour with the same exponent as the average squared fluctuation; by the Wiener–Khinchin theorem, this implies the spectral exponent \(\beta =1+2H\). For more details on fGn processes see Mandelbrot and Van Ness (1968), Gripenberg and Norros (1996) and Biagini et al. (2008).
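As a numerical sanity check, the exact autocorrelation above can be compared with its large-lag power-law approximation. The sketch below (Python with NumPy; the function names are ours, not from the paper) does this for a representative value of \(H\):

```python
import numpy as np

def fgn_acf(dt, H, tau=1.0):
    """Exact fGn autocorrelation R_H(dt) at resolution tau."""
    u = np.abs(np.asarray(dt, dtype=float)) / tau
    return 0.5 * (np.abs(u + 1) ** (2 * H + 2)
                  + np.abs(u - 1) ** (2 * H + 2)
                  - 2.0 * u ** (2 * H + 2))

def fgn_acf_asymptotic(dt, H, tau=1.0):
    """Large-lag approximation (H+1)(2H+1)(dt/tau)^(2H)."""
    u = np.abs(np.asarray(dt, dtype=float)) / tau
    return (H + 1) * (2 * H + 1) * u ** (2 * H)

H = -0.2
lags = np.array([10.0, 100.0, 1000.0])
ratio = fgn_acf(lags, H) / fgn_acf_asymptotic(lags, H)
# ratio approaches 1 as the lag grows, confirming the power-law tail
```

The slow, power-law decay of the ratio's deviation from 1 is the numerical signature of the long memory discussed above.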

Discrete-in-time fGn

A detailed explanation of the theory for modeling and predicting using the discrete version of fGn processes was presented in DRAL; the main results are summarized next. The analogue of Eq. (15) in the discrete case for a finite series, \({\left\{{T}_{t}\right\}}_{t=1,\dots ,N}\), with length \(N\) and zero mean is:

$$T_{t} = \sum\limits_{j = 1}^{t} {m_{tj} \gamma_{t + 1 - j} } = m_{t1} \gamma_{t} + \cdots + m_{tt} \gamma_{1} ,$$

for \(t=1,\dots ,N\), where \({\left\{{\gamma }_{t}\right\}}_{t=1,\dots ,N}\) is a discrete white noise process and the coefficients \({m}_{ij}\) are the elements of the lower triangular matrix \({\mathbf{M}}_{H,{\sigma }_{T}}^{N}\) given by the Cholesky decomposition of the autocovariance matrix, \({\mathbf{C}}_{H,{\sigma }_{T}}^{N}={{\sigma }_{T}}^{2}{\left[{R}_{H}\left(i-j\right)\right]}_{i,j=1,\dots ,N}\):

$${\mathbf{C}}_{{H,\sigma_{T} }}^{N} = {\mathbf{M}}_{{H,\sigma_{T} }}^{N} \left( {{\mathbf{M}}_{{H,\sigma_{T} }}^{N} } \right)^{T} ,$$

with \({m}_{ij}=0\) for \(j>i\) (we assume \(\tau =1\) is the smallest scale in our system). The superscript \(T\) denotes the transpose operation.

In vector form, Eq. (19) can be written as:

$${\mathbf{T}}_{N} = {\mathbf{M}}_{{H,\sigma_{T} }}^{N} {{\varvec{\upgamma}}}_{N}$$

Equations (19–21) can be used to create synthetic samples of fGn with a given length \(N\), autocorrelation function given by Eq. (17) and set of parameters \({\sigma }_{T}>0\) and \(-1<H<0\) (the mean of the series is always assumed equal to zero). Conversely, given an actual temperature series with vector \({\mathbf{T}}_{N}={\left[{T}_{1},\dots ,{T}_{N}\right]}^{T}\), we can estimate the parameters \({\sigma }_{T}\) and \(H\) using the maximum likelihood method (details are given in Appendix 1 of DRAL), and we can verify that the series is well approximated by an fGn model by inverting Eq. (21) to obtain the residual vector of innovations:

$${{\varvec{\upgamma}}}_{N} = \left( {{\mathbf{M}}_{{H,\sigma_{T} }}^{N} } \right)^{ - 1} {\mathbf{T}}_{N} .$$

If the model provides a good description of the data, the residual vector \({{\varvec{\upgamma}}}_{N}={\left[{\gamma }_{1},\dots ,{\gamma }_{N}\right]}^{T}\) is a white noise, i.e. the elements should be \(\text{NID}\left(\text{0,1}\right)\) with autocorrelation function \(\langle {\gamma }_{i}{\gamma }_{j}\rangle ={\delta }_{ij}\) (\({\delta }_{ij}\) is the Kronecker delta and \(\text{NID}\left(\text{0,1}\right)\) stands for Normally and Independently Distributed with mean \(0\) and variance 1). It is worth mentioning that a white noise process is a particular case of fGn with \(H=-1/2\).
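These relations translate directly into code. The following sketch (Python with NumPy; the function names are ours) generates a discrete fGn sample via a Cholesky factorization of the autocovariance matrix, as in Eqs. (19–21), then recovers the innovations by inverting Eq. (21):

```python
import numpy as np

def fgn_acf(lag, H):
    """Discrete fGn autocorrelation R_H(lag) with tau = 1 (Eq. 17)."""
    k = np.abs(np.asarray(lag, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H + 2)
                  + np.abs(k - 1) ** (2 * H + 2)
                  - 2.0 * k ** (2 * H + 2))

def fgn_sample(N, H, sigma_T=1.0, seed=None):
    """Synthesize an fGn series T = M gamma (Eq. 21)."""
    rng = np.random.default_rng(seed)
    i, j = np.indices((N, N))
    C = sigma_T ** 2 * fgn_acf(i - j, H)   # autocovariance matrix (Eq. 20)
    M = np.linalg.cholesky(C)              # lower-triangular factor
    gamma = rng.standard_normal(N)         # NID(0,1) innovations
    return M @ gamma, M

T, M = fgn_sample(500, H=-0.2, seed=0)
gamma_rec = np.linalg.solve(M, T)          # residual innovations (Eq. 22)
# if the model fits, gamma_rec is approximately NID(0,1) white noise
```

In practice this whitening step, applied to an observed series with estimated \(H\) and \({\sigma }_{T}\), is how one checks the adequacy of the fGn description.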

fRn correlation function for \(0<H<1\)

The fractional relaxation noise (fRn) process was introduced in Lovejoy (2019); it generalizes the fGn, fBm and Ornstein–Uhlenbeck processes. For short time scales (compared to some characteristic relaxation time, \({\tau }_{r}\)) and for exponents \(-1/2<H<0\), fRn is close to an fGn process. For fluctuation exponents in the range \(0<H<1\), the high-frequency approximation to fRn is no longer an fGn process. In this case, to leading order, the correlation function is:

$$R_{{{\text{fRn}}}} \left( {\Delta t} \right) = 1 - A_{H} \left( {\frac{\Delta t}{{\tau_{r} }}} \right)^{2H} + O\left( {\frac{\Delta t}{{\tau_{r} }}} \right)^{3H + 1/2} ;\quad \Delta t < \tau_{r} ;\quad 0 < H < 1,$$

where \({\tau }_{r}\) is the relaxation time and \({A}_{H}\) is an \(H\)-dependent numerical factor [see Lovejoy (2019)]. The same correlation function was obtained by Delignières (2015) as an approximation to short segments of the discrete-in-time fractional Brownian motion (fBm) process, which is the integral of an fGn process (with \(H\) increased by 1). This shows that, although fBm is nonstationary, short segments of it approximate the (stationary) fRn process: when \(0<H<1\), fBm is a high-frequency approximation to an fRn process.


In DRAL it was shown that, if \({\left\{{T}_{t}\right\}}_{t\le 0}\) is an fGn process, the optimal k-step predictor for \({T}_{k}\) (\(k>0\)), based on a finite number, \(m\) (the memory), of past values, is given by:

$$\hat{T}_{k} = \sum\limits_{j = - m}^{0} {\phi_{j} \left( k \right)T_{j} } = \phi_{ - m} \left( k \right)T_{ - m} + \cdots + \phi_{0} \left( k \right)T_{0} ,$$

where the vector of predictor coefficients, \({\varvec{\upphi}}\left(k\right)={\left[{\phi }_{-m}\left(k\right),\dots ,{\phi }_{0}\left(k\right)\right]}^{T}\), satisfies the Yule–Walker equations:

$${\mathbf{R}}_{H} {{\varvec\upphi}}\left( k \right) = {\mathbf{r}}_{H} \left( k \right),$$

with the vector \({\mathbf{r}}_{H}\left(k\right)={\left[{R}_{H}\left(k-i\right) \right]}_{i=-m,\dots ,0}^{T}={\left[{R}_{H}\left(m+k\right),\dots ,{R}_{H}\left(k\right) \right]}^{T}\) and \({\mathbf{R}}_{H}={\left[{R}_{H}\left(i-j\right)\right]}_{i,j=-m,\dots ,0}\) being the autocorrelation matrix (see Eq. 17). In regions where consecutive values are positively correlated (blue regions in Fig. 4a, with \(-1/2<H<0\), or the increments in the yellow regions, with \(1/2<H<1\)), the elements \({R}_{H}\left(\Delta t\right)\) are obtained from Eq. (17). In places where consecutive increments are negatively correlated, with \(0<H<1/2\) (red in Fig. 4a), instead of forecasting the fGn increments we forecast the fRn process directly and obtain the elements \({R}_{H}\left(\Delta t\right)\) from Eq. (23). To use this autocorrelation for fRn, we estimate the constant \({A}_{H}\) in Eq. (23) for each location by fitting the empirical autocorrelation function.
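A minimal sketch of this predictor for the fGn case (Python with NumPy; the function names are ours): the coefficients come from solving the Yule–Walker linear system, after which the forecast is the weighted sum of past values in Eq. (24).

```python
import numpy as np

def fgn_acf(lag, H):
    """fGn autocorrelation R_H(lag) with tau = 1 (Eq. 17)."""
    k = np.abs(np.asarray(lag, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H + 2)
                  + np.abs(k - 1) ** (2 * H + 2)
                  - 2.0 * k ** (2 * H + 2))

def predictor_coefficients(k, m, H):
    """Solve the Yule-Walker system R_H phi(k) = r_H(k) for lead time k,
    using the m + 1 past values T_{-m}, ..., T_0."""
    idx = np.arange(-m, 1)                          # past-time indices
    R = fgn_acf(idx[:, None] - idx[None, :], H)     # autocorrelation matrix
    r = fgn_acf(k - idx, H)                         # correlations with T_k
    return np.linalg.solve(R, r)

k, m, H = 1, 20, -0.2
phi = predictor_coefficients(k, m, H)
past = np.zeros(m + 1)                              # toy past values T_{-m..0}
past[-1] = 1.0
forecast = phi @ past                               # hat{T}_k from Eq. (24)
msss = phi @ fgn_acf(k - np.arange(-m, 1), H)       # skill r^T R^{-1} r
```

The quadratic form computed on the last line is the theoretical skill \({\mathbf{r}}_{H}(k)^{T}({\mathbf{R}}_{H})^{-1}{\mathbf{r}}_{H}(k)\), which lies between 0 and 1.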

The root mean square error (\(\text{RMSE}\)) for the predictor at a future time \(k\), using a memory of \(m\) values, is defined as:

$${\text{RMSE}}\left( {k,m} \right) = \sqrt {\left\langle {\left[ {T_{k} - \hat{T}_{k} \left( m \right)} \right]^{2} } \right\rangle } .$$

Following the results presented in DRAL, and using the fact that for positive \(H\) the fRn is the integral of the corresponding fGn process, we obtain the following analytical expression for the \(\text{RMSE}\) of the predictor of the natural variability component:

$${\text{RMSE}}_{{{\text{nat}}}}^{{{\text{theory}}}} \left( k \right) = {\text{RMSE}}\left( {k,m,\sigma_{T} ,H} \right) = \begin{cases} \sigma_{T} \sqrt {1 - {\mathbf{r}}_{H} \left( k \right)^{T} \left( {{\mathbf{R}}_{H} } \right)^{ - 1} {\mathbf{r}}_{H} \left( k \right)} ; & {\text{for}}\; - 1/2 < H < 0 \\ \sigma_{T} k^{H} \sqrt {1 - {\mathbf{r}}_{H - 1} \left( 1 \right)^{T} \left( {{\mathbf{R}}_{H - 1} } \right)^{ - 1} {\mathbf{r}}_{H - 1} \left( 1 \right)} ; & {\text{for}}\;0 < H < 1 \end{cases}.$$

For a given forecast horizon, \(k\), the \(\text{RMSE}\) depends only on the parameters \({\sigma }_{T}\) and \(H\) and the memory used, \(m\). In Fig. 3 of DRAL it was shown that only a few past data points are needed as memory to obtain an error within 5% of the asymptotic value corresponding to \(m=\infty\), for all possible values of \(H\).

The theoretical mean square skill score (\(\text{MSSS}\)) is defined as:

$${\text{MSSS}}\left( k \right) = 1 - \frac{{\left\langle {\left[ {T\left( k \right) - \hat{T}\left( k \right)} \right]^{2} } \right\rangle }}{{\left\langle {T\left( k \right)^{2} } \right\rangle }}$$

(the reference forecast is the mean of the series, assumed equal to zero here).

From the definition of the \(\text{RMSE}\), Eq. (26), we obtain the theoretical value for fGn:

$${\text{MSSS}}_{{{\text{nat}}}}^{{{\text{theory}}}} \left( k \right) = {\text{MSSS}}\left( {k,m,H} \right){ = 1} - \frac{{{\text{RMSE}}\left( {k,m,\sigma_{T} ,H} \right)^{2} }}{{\sigma_{T}^{2} }}$$

or, replacing Eq. (27) for \(-1/2<H<0\):

$${\text{MSSS}}\left( {k,m,H} \right) = {\mathbf{r}}_{H} \left( k \right)^{T} \left( {{\mathbf{R}}_{H} } \right)^{ - 1} {\mathbf{r}}_{H} \left( k \right) = {{\varvec\upphi}}\left( k \right) \cdot {\mathbf{r}}_{H} \left( k \right).$$

In Fig. 19 we show graphs of the theoretical \(\text{MSSS}\) as a function of \(H\) for different values of \(k\). A memory \(m=50\) was used for computing the \(\text{MSSS}\). As expected, the skill decreases as the forecast horizon increases. For \(H=-0.5\), the fGn process is a white noise process and \(\text{MSSS}=0\). The skill increases with \(H\) and (with infinite past data) the process becomes perfectly predictable when \(H\to 0\).

Fig. 19

Graphs of the theoretical \(\text{MSSS}\) (Eq. 46) as a function of \(H\) for different values of \(k\). A memory \(m=50\) was used for computing the \(\text{MSSS}\)

Appendix 2: Verification metrics


The verification metrics used in this paper were defined following the recommendations of the Standardized Verification System for Long-Range Forecasts (SVS-LRF), which gives the practical details for producing and exchanging appropriate verification scores (WMO 2010a, b). Let \({x}_{i}\left(t\right)\) and \({f}_{i}\left(t\right)\) (\(t=1,\dots ,N\)) denote time series of observations and forecasts, respectively, for a grid point i over the period of verification (POV) with \(N\) time steps. Then their averages over the POV, \({\stackrel{-}{x}}_{i}\) and \({\stackrel{-}{f}}_{i}\), and their sample variances, \({{s}_{{x}_{i}}}^{2}\) and \({{s}_{{f}_{i}}}^{2}\), are given by:

$$\begin{aligned} & \overline{x}_{i} = \frac{1}{N}\sum\limits_{t = 1}^{N} {x_{i} \left( t \right)} {,}\quad \overline{f}_{i} = \frac{1}{N}\sum\limits_{t = 1}^{N} {f_{i} \left( t \right)} \\ & s_{xi}^{2} = \frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {x_{i} \left( t \right) - \overline{x}_{i} } \right]^{2} } {,}\quad s_{fi}^{2} = \frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {f_{i} \left( t \right) - \overline{f}_{i} } \right]^{2} } . \\ \end{aligned}$$

The mean square error (\(\text{MSE}\)) of the forecast for grid point i is:

$${\text{MSE}}_{i} = \frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {f_{i} \left( t \right) - x_{i} \left( t \right)} \right]^{ \, 2} }$$

and the root mean square error (\(\text{RMSE}\)) is:

$${\text{RMSE}}_{i} = \sqrt {{\text{MSE}}_{i} } .$$

For leave-one-out cross-validated data in the POV (WMO 2010a), the \(\text{MSE}\) of climatology forecasts is:

$${\text{MSE}}_{{{\text{C}}i}} = \left( {\frac{N}{N - 1}} \right)^{ \, 2} s_{xi}^{ \, 2} .$$

The mean square skill score (\(\text{MSSS}\)) for grid point i, taking as reference the climatology forecast, is defined as:

$${\text{MSSS}}_{i} = 1 - \frac{{{\text{MSE}}_{i} }}{{{\text{MSE}}_{{{\text{C}}i}} }}.$$

The temporal correlation coefficient (\(\text{TCC}\)) is:

$${\text{TCC}}_{i} = \frac{{\frac{1}{N}\sum\nolimits_{t = 1}^{N} {\left[ {x_{i} \left( t \right) - \overline{x}_{i} } \right]\left[ {f_{i} \left( t \right) - \overline{f}_{i} } \right]} }}{{s_{xi} s_{fi} }}.$$

Both \({\text{MSE}}_{i}\) and \({\text{TCC}}_{i}\) are computed using temporal averages for a given location i. Conversely, the anomaly pattern correlation coefficient (\(\text{ACC}\)) (Jolliffe and Stephenson 2011) is defined using spatial averages for a given time t:

$${\text{ACC}}\left( t \right) = \frac{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} \left[ {x^{\prime}_{i} \left( t \right) - \left\langle {x^{\prime}\left( t \right)} \right\rangle } \right]\left[ {f^{\prime}_{i} \left( t \right) - \left\langle {f^{\prime}\left( t \right)} \right\rangle } \right]} }}{{\sqrt {\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} \left[ {x^{\prime}_{i} \left( t \right) - \left\langle {x^{\prime}\left( t \right)} \right\rangle } \right]^{ \, 2} } } \sqrt {\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} \left[ {f^{\prime}_{i} \left( t \right) - \left\langle {f^{\prime}\left( t \right)} \right\rangle } \right]^{ \, 2} } } }},$$

where \(n\) is the number of grid points, \({\theta }_{i}\) is the latitude at location i, \({x}_{i}^{^{\prime}}\left(t\right)\) and \({f}_{i}^{^{\prime}}\left(t\right)\) are the observation and forecast anomalies for the POV, respectively, and the spatial averages \(\left\langle {x^{\prime}\left( t \right)} \right\rangle\) and \(\left\langle {f^{\prime}\left( t \right)} \right\rangle\) are given by:

$$\left\langle {x^{\prime}\left( t \right)} \right\rangle = \frac{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} x^{\prime}_{i} \left( t \right)} }}{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} } }}{,}\quad \left\langle {f^{\prime}\left( t \right)} \right\rangle = \frac{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} f^{\prime}_{i} \left( t \right)} }}{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} } }}.$$
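These definitions are straightforward to implement. The sketch below (Python with NumPy; the variable names and toy data are ours) computes the MSE, MSSS and TCC for one grid point, and the cos-latitude-weighted ACC for one time step:

```python
import numpy as np

rng = np.random.default_rng(7)
N, n = 120, 50                                   # POV length, number of grid points

# --- temporal scores at one grid point i ---
x = rng.standard_normal(N)                       # observations
f = 0.7 * x + 0.5 * rng.standard_normal(N)       # correlated synthetic forecasts
mse = np.mean((f - x) ** 2)                      # mean square error
mse_clim = (N / (N - 1)) ** 2 * x.var()          # leave-one-out climatology MSE
msss = 1 - mse / mse_clim                        # mean square skill score
tcc = np.mean((x - x.mean()) * (f - f.mean())) / (x.std() * f.std())

# --- spatial score at one time t ---
lats = np.linspace(-80, 80, n)
w = np.cos(np.deg2rad(lats))                     # cos-latitude area weights
xa = rng.standard_normal(n)                      # observation anomalies
fa = xa + 0.5 * rng.standard_normal(n)           # forecast anomalies
xm = np.sum(w * xa) / np.sum(w)                  # weighted spatial means
fm = np.sum(w * fa) / np.sum(w)
xc, fc = xa - xm, fa - fm
acc = np.sum(w * xc * fc) / np.sqrt(np.sum(w * xc ** 2) * np.sum(w * fc ** 2))
```

Note that the sample variances use the 1/N convention, matching the definitions above.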

Averaged scores

To average nonlinear scores, they should first be transformed so that the corresponding variables are approximately Gaussian. The spatial average \(\text{RMSE}\) (with the area factor) is:

$$\left\langle {{\text{RMSE}}} \right\rangle = \sqrt {\frac{{\sum\nolimits_{i = 1}^{n} {{\text{MSE}}_{i} \cos \theta_{i} } }}{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} } }}} .$$

Similarly, the average \(\text{MSSS}\) is:

$$\left\langle {{\text{MSSS}}} \right\rangle = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {{\text{MSE}}_{i} \cos \theta_{i} } }}{{\sum\nolimits_{i = 1}^{n} {{\text{MSE}}_{{{\text{C}}i}} \cos \theta_{i} } }}.$$

For the correlation coefficients, the Fisher Z-transform must be taken first. This is defined as:

$$Z\left( r \right) = \frac{1}{2}\ln \left( {\frac{1 + r}{{1 - r}}} \right) = \tanh^{ - 1} r$$

The spatial average \(\text{TCC}\) is then defined as:

$$\left\langle {{\text{TCC}}} \right\rangle = Z^{ \, - 1} \left[ {\frac{{\sum\nolimits_{i = 1}^{n} {Z\left( {{\text{TCC}}_{i} } \right)\cos \theta_{i} } }}{{\sum\nolimits_{i = 1}^{n} {\cos \theta_{i} } }}} \right]$$

and the temporal average \(\text{ACC}\) is

$$\left\langle {{\text{ACC}}} \right\rangle = Z^{ \, - 1} \left\{ {\frac{1}{N}\sum\limits_{t = 1}^{N} {Z\left[ {{\text{ACC}}\left( t \right)} \right]} } \right\}.$$
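A sketch of this transform-average-back-transform recipe for the TCC (Python with NumPy; the function name is ours), assuming cos-latitude weights. Note that \(\tanh^{-1}\) diverges as a correlation approaches \(\pm 1\), so scores are clipped slightly away from the endpoints:

```python
import numpy as np

def average_tcc(tcc_values, lats_deg, eps=1e-6):
    """Cos-latitude-weighted spatial average of correlation scores,
    taken through the Fisher Z-transform Z(r) = arctanh(r)."""
    r = np.clip(np.asarray(tcc_values, dtype=float), -1 + eps, 1 - eps)
    w = np.cos(np.deg2rad(np.asarray(lats_deg, dtype=float)))
    z_mean = np.sum(w * np.arctanh(r)) / np.sum(w)
    return np.tanh(z_mean)                 # back-transform Z^{-1} = tanh

avg = average_tcc([0.2, 0.5, 0.8], [0.0, 30.0, 60.0])
```

Averaging in Z-space stabilizes the variance of the correlation estimates before combining them, which is why the transform is taken first.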

Orthogonality principle and MSSS decomposition

The \(\text{MSSS}\) (Eq. 35) can be expanded for leave-one-out cross-validated forecasts (Murphy 1988). Using Eqs. (31), (32), (34) and (36) in (35), we obtain:

$${\text{MSSS}}_{i} = \frac{2\dfrac{s_{fi}}{s_{xi}}{\text{TCC}}_{i} - \left(\dfrac{s_{fi}}{s_{xi}}\right)^{2} - \left(\dfrac{\overline{f}_{i} - \overline{x}_{i}}{s_{xi}}\right)^{2} + \dfrac{2N - 1}{\left(N - 1\right)^{2}}}{1 + \dfrac{2N - 1}{\left(N - 1\right)^{2}}}.$$

This equation gives a relation between the \(\text{MSSS}\) and the \(\text{TCC}\). For forecasts with the same variance as that of observations and no overall bias, the \(\text{MSSS}\) is only positive (\(\text{MSE}\) lower than for climatology) if the \(\text{TCC}\) is larger than approximately 0.5.
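The threshold can be read off Eq. (44) directly: for \(s_{fi} = s_{xi}\) and no bias it reduces to \(\text{MSSS}_i = (2\,\text{TCC}_i - 1 + \varepsilon)/(1 + \varepsilon)\) with \(\varepsilon = (2N-1)/(N-1)^2\), which crosses zero at \(\text{TCC}_i = (1-\varepsilon)/2\). A quick numerical check (using \(N = 828\), the number of months in the verification period discussed below):

```python
def msss_from_tcc(tcc, n):
    """Eq. (44) specialized to unit variance ratio (s_f = s_x) and zero bias."""
    eps = (2 * n - 1) / (n - 1) ** 2
    return (2 * tcc - 1 + eps) / (1 + eps)

n = 828
threshold = (1 - (2 * n - 1) / (n - 1) ** 2) / 2  # TCC at which MSSS = 0
print(round(threshold, 4))  # → 0.4988, i.e. just below 0.5 for large N
```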

A simpler relation can be obtained in our case for the prediction of the detrended anomalies (natural variability). As we mentioned in Appendix 1.4, the predictor (Eq. 24) is built in such a way that the coefficients satisfy the Yule-Walker equations, which are derived from the orthogonality principle (Wold 1938; Brockwell and Davis 1991; Hipel and McLeod 1994; Palma 2007; Box et al. 2008). This principle states that the error of the optimal predictor (in a mean square error sense), \({e}_{i}\left(t\right)={x}_{i}\left(t\right)-{f}_{i}\left(t\right)\), is orthogonal to any possible estimator:

$$\left\langle {e_{i} \left( t \right)f_{i} \left( t \right)} \right\rangle = 0.$$

From this ensemble average condition, we get the analytical expressions for the coefficients as a function of the fluctuation exponent, \(H\), for the fGn process. If the model realistically describes the actual temperature anomalies, then the condition Eq. (45) can be approximated by the temporal average in the POV:

$$\frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {e_{i} \left( t \right)f_{i} \left( t \right)} \right]} = 0,$$

that is,

$$\frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {x_{i} \left( t \right) - f_{i} \left( t \right)} \right]} f_{i} \left( t \right) = 0,$$

from which:

$$\frac{1}{N}\sum\limits_{t = 1}^{N} {\left[ {x_{i} \left( t \right)f_{i} \left( t \right)} \right]} = \frac{1}{N}\sum\limits_{t = 1}^{N} {f_{i} \left( t \right)^{2} } .$$

For \({\stackrel{-}{x}}_{i}={\stackrel{-}{f}}_{i}=0\), dividing by the product \({s}_{{x}_{i}}{s}_{{f}_{i}}\) and using Eqs. (31) and (36), we can rewrite Eq. (48) as:

$${\text{TCC}}_{i} = \frac{{s_{fi} }}{{s_{xi} }}.$$

Using this ratio in Eq. (44) we finally obtain:

$${\text{MSSS}}_{i} = \frac{{{\text{TCC}}_{i}^{2} + \frac{2N - 1}{{\left( {N - 1} \right)^{2} }}}}{{1 + \frac{2N - 1}{{\left( {N - 1} \right)^{2} }}}}.$$

A more detailed analysis gives the same expression under the weaker condition of overall unbiased estimates, \({\stackrel{-}{x}}_{i}-{\stackrel{-}{f}}_{i}=0\) (neither mean need vanish individually).

In our case, for the forecast of the detrended anomalies (natural variability) at monthly resolution in the POV 1951–2019 (\(N=828\) months), the N-dependent term in Eq. (50) is negligible:

$$\frac{2N - 1}{{\left( {N - 1} \right)^{2} }} \approx 0.0024,$$

so, to a good approximation, we obtain:

$${\text{MSSS}}_{i} \approx {\text{TCC}}_{i}^{2} .$$

The orthogonality principle, Eq. (47) (or equivalently, Eq. (49) or Eq. (52)), is the condition that maximizes the \(\text{MSSS}\). In our case, the autoregressive coefficients of the predictor are analytical functions of a single parameter (\(H\)); if Eq. (52) is satisfied, then the predictor is optimal in a mean square error sense and the model is suitable for describing the natural temperature variability.
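The chain from orthogonality to \(\text{MSSS}_i \approx \text{TCC}_i^2\) can be verified with a synthetic example, since any least-squares forecast satisfies the orthogonality condition Eq. (47) by construction. This is a minimal sketch with an illustrative single-predictor regression, not the StocSIPS fGn predictor itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean "observations" x and a correlated predictor p.
p = rng.standard_normal(1000)
x = 0.6 * p + rng.standard_normal(1000)
x -= x.mean()
p -= p.mean()

# Least-squares forecast: the projection of x onto p. By construction
# the error e = x - f is orthogonal to f (the orthogonality principle).
f = (np.dot(x, p) / np.dot(p, p)) * p
e = x - f

tcc = np.dot(x, f) / np.sqrt(np.dot(x, x) * np.dot(f, f))
msss = 1.0 - np.dot(e, e) / np.dot(x, x)  # climatology = zero-mean reference

print(abs(np.dot(e, f)) < 1e-8)    # → True: orthogonality holds
print(abs(msss - tcc**2) < 1e-10)  # → True: MSSS equals TCC^2
```

With the \(N\)-dependent term dropped (as justified above for \(N = 828\)), the equality is exact for any forecast that projects the observations onto its own span.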

Cite this article

Del Rio Amador, L., Lovejoy, S. Using regional scaling for temperature forecasts with the Stochastic Seasonal to Interannual Prediction System (StocSIPS). Clim Dyn 57, 727–756 (2021).
