
Predicting the global temperature with the Stochastic Seasonal to Interannual Prediction System (StocSIPS)


Abstract

Many atmospheric fields, in particular the temperature, respect statistical symmetries that characterize the macroweather regime, i.e. time scales between the \(\approx\) 10 day lifetime of planetary-sized structures and the (currently) 10–20 year scale at which the anthropogenic forcings begin to dominate the natural variability. The scale invariance and the low intermittency of the fluctuations imply the existence of a huge memory in the system that can be exploited for macroweather forecasts using well-established (Gaussian) techniques. The Stochastic Seasonal to Interannual Prediction System (StocSIPS) is a stochastic model that exploits these symmetries to perform long-term forecasts. StocSIPS includes the previous ScaLIng Macroweather Model (SLIMM) as a core model for the prediction of the natural variability component of the temperature field. Here we present the theory for improving SLIMM using discrete-in-time fractional Gaussian noise processes to obtain an optimal predictor as a linear combination of past data. We apply StocSIPS to the prediction of globally-averaged temperature and confirm the applicability of the model through statistical hypothesis testing and the good agreement between the hindcast skill scores and the theoretical predictions. Finally, we compare StocSIPS with the Canadian Seasonal to Interannual Prediction System. From a forecast point of view, GCMs can be seen as an initial value problem for generating many “stochastic” realizations of the state of the atmosphere, while StocSIPS is effectively a past value problem that estimates the most probable future state from long series of past data. The results validate StocSIPS as a good alternative and a complementary approach to conventional numerical models. Temperature forecasts using StocSIPS are published regularly on the website http://www.physics.mcgill.ca/StocSIPS/.





Correspondence to Lenin Del Rio Amador.


Appendices

Appendix 1: Simulation, parameter estimation, ergodicity and model adequacy

1.1 Simulation

When modeling real time series and testing numerical algorithms, it is often useful to obtain synthetic realizations of fGn processes. There are many methods for simulating approximate samples of fGn, e.g.: (1) type 1 (Mandelbrot and Wallis 1969), (2) type 2 (Mandelbrot and Wallis 1969), (3) fast fGn (Mandelbrot 1971), (4) filtered fGn (Matalas and Wallis 1971), (5) ARMA(1,1) (O’Connell 1974), (6) broken line (Garcia et al. 1972; Mejia et al. 1972; Rodriguez-Iturbe et al. 1972; Mandelbrot 1972), (7) ARMA-Markov models (Lettenmaier and Burges 1977) and some more efficient, approximate recent methods (Paxson 1997; Jeong et al. 2003). The choice among these methods depends on their respective strengths and weaknesses for the specific application at hand.

Nevertheless, instead of using short-memory approximations for simulating fGn, it is possible to generate exact realizations by applying the following procedure (Hipel and McLeod 1994; Palma 2007). In Eq. (20) we gave the MA representation of our series for any time, \(t\), based on the knowledge of an infinite past of innovations, \(\left\{ {\gamma_{t - j} } \right\}_{j = 1, \ldots ,\infty }\) with \(\gamma_{t} \sim {\text{NID}}\left( {0,1} \right)\) and \(\langle \gamma_{i} \gamma_{j} \rangle = \delta_{ij}\). If we want a series with specified length, \(N\), mean \(\mu\), variance \(\sigma_{T}^{2}\) and fluctuation exponent \(H\), we can proceed in a similar way as we did with the AR representation for obtaining the predictor. Replacing the coefficients, \(\varphi_{j}\), we can write instead the finite sum:

$$T_{t} = \mu + \sum\limits_{j = 1}^{t} {m_{tj} \gamma_{t + 1 - j} } = \mu + m_{t1} \gamma_{t} + \cdots + m_{tt} \gamma_{1} ,$$
(46)

for \(t = 1, \ldots ,N\), where the optimal coefficients \(m_{ij}\) are the elements of the lower triangular matrix \({\mathbf{M}}_{{H,\sigma_{T} }}^{N}\) given by the Cholesky decomposition of the autocovariance matrix, \({\mathbf{R}}_{{H,\sigma_{T} }}^{N} = \left[ {C_{{H,\sigma_{T} }} \left( {i - j} \right)} \right]_{i,j = 1, \ldots ,N}\); that is:

$${\mathbf{R}}_{{H,\sigma_{T} }}^{N} = {\mathbf{M}}_{{H,\sigma_{T} }}^{N} \left( {{\mathbf{M}}_{{H,\sigma_{T} }}^{N} } \right)^{T} ,$$
(47)

with \(m_{ij} = 0\) for \(j > i\). In summary, to obtain an fGn realization of length \(N\), we need to generate a white-noise process \(\left\{ {\gamma_{t} } \right\}_{t = 1, \ldots ,N}\) with an appropriate method, obtain the autocovariance matrix \({\mathbf{R}}_{{H,\sigma_{T} }}^{N}\) using Eq. (7.iii), get \({\mathbf{M}}_{{H,\sigma_{T} }}^{N}\) from the Cholesky decomposition of \({\mathbf{R}}_{{H,\sigma_{T} }}^{N}\), and finally apply Eq. (46) for every \(t\) to obtain our \(\left\{ {T_{t} } \right\}\) series. The variables \(T_{t}\) will be Gaussian with mean \(\mu\) and variance \(\sigma_{T}^{2}\) (correlated, not independent), and the process will have fluctuation exponent \(H\) in the interval \(\left( { - 1, 0} \right)\).
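For concreteness, the following minimal Python sketch implements this Cholesky procedure. It assumes the standard fGn autocovariance \(C_{H,\sigma_T}(k) = \tfrac{\sigma_T^2}{2}\left( \left| k+1 \right|^{2H'} - 2\left| k \right|^{2H'} + \left| k-1 \right|^{2H'} \right)\) with \(H' = H + 1\) as the form of Eq. (7.iii); the function names are illustrative, not part of StocSIPS.

```python
import numpy as np

def fgn_acf(k, H):
    # Autocovariance of unit-variance fGn at integer lags k;
    # H is the fluctuation exponent in (-1, 0), H' = H + 1.
    # Assumed standard form of Eq. (7.iii).
    Hp = H + 1.0
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1.0)**(2*Hp) - 2.0*k**(2*Hp) + np.abs(k - 1.0)**(2*Hp))

def fgn_simulate(N, H, mu=0.0, sigma_T=1.0, seed=None):
    # Exact fGn realization of length N via Eqs. (46)-(47).
    rng = np.random.default_rng(seed)
    lags = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
    R = sigma_T**2 * fgn_acf(lags, H)   # autocovariance matrix R
    M = np.linalg.cholesky(R)           # lower-triangular M with R = M M^T
    gamma = rng.standard_normal(N)      # white-noise innovations, NID(0, 1)
    return mu + M @ gamma               # Eq. (46)

T = fgn_simulate(1656, H=-0.1, seed=0)  # one synthetic "monthly" series
```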

1.2 Maximum likelihood estimation

If, instead of simulating an fGn process, we are interested in the opposite operation of finding the parameters that best fit a given time series, the most accurate method is based on maximizing the log-likelihood function (Hipel and McLeod 1994). Suppose that we have our vector \({\mathbf{T}}_{N} = \left[ {T_{1} , \ldots ,T_{N} } \right]^{T}\) that represents a stationary Gaussian process. Then the log-likelihood function of this process is given by:

$${\mathfrak{L}}\left( {\mu ,\sigma_{T} ,H} \right) = - \frac{1}{2}\log \left[ {\det \left( {{\mathbf{R}}_{{H,\sigma_{T} }}^{N} } \right)} \right] - \frac{1}{2}{\tilde{\mathbf{T}}}_{N,\mu }^{T} \left( {{\mathbf{R}}_{{H,\sigma_{T} }}^{N} } \right)^{ - 1} {\tilde{\mathbf{T}}}_{N,\mu }$$
(48)

where \({\tilde{\mathbf{T}}}_{N,\mu } = \left[ {T_{1} - \mu , \ldots ,T_{N} - \mu } \right]^{T}\) is a vector formed by our original series after removing the mean.

For fixed \(H\), the maximum likelihood estimators (MLE) of \(\mu\) and \(\sigma_{T}\) are:

$$\hat{\mu } = \frac{{{\mathbf{1}}_{N}^{T} \left( {{\tilde{\mathbf{R}}}_{H}^{N} } \right)^{ - 1} {\mathbf{T}}_{N} }}{{{\mathbf{1}}_{N}^{T} \left( {{\tilde{\mathbf{R}}}_{H}^{N} } \right)^{ - 1} {\mathbf{1}}_{N} }}$$
(49)

and

$$\hat{\sigma }_{T}^{2} = \frac{1}{N}{\tilde{\mathbf{T}}}_{{N,\hat{\mu }}}^{T} \left( {{\tilde{\mathbf{R}}}_{H}^{N} } \right)^{ - 1} {\tilde{\mathbf{T}}}_{{N,\hat{\mu }}} ,$$
(50)

where \({\mathbf{1}}_{N} = \left[ {1,1, \ldots ,1} \right]^{T}\) is an \(N \times 1\) vector with all the elements equal to 1 and \({\tilde{\mathbf{R}}}_{H}^{N} = {\mathbf{R}}_{{H,\sigma_{T} }}^{N} /\sigma_{T}^{2}\) is the autocorrelation matrix, which only depends on \(H\).

Substituting these values into Eq. (48), we obtain the maximized log-likelihood function of \(H\):

$${\mathfrak{L}}_{\text{max} } \left( H \right) = - \frac{1}{2}\log \left[ {\det \left( {{\tilde{\mathbf{R}}}_{H}^{N} } \right)} \right] - \frac{N}{2}\log \left[ {\frac{1}{N}{\tilde{\mathbf{T}}}_{{N,\hat{\mu }}}^{T} \left( {{\tilde{\mathbf{R}}}_{H}^{N} } \right)^{ - 1} {\tilde{\mathbf{T}}}_{{N,\hat{\mu }}} } \right] .$$
(51)

The estimate for the fluctuation exponent, \(\hat{H}_{l}\), is obtained by maximizing \({\mathfrak{L}}_{\text{max} } \left( H \right)\) and can then be used to obtain \(\hat{\mu }\) and \(\hat{\sigma }_{T}^{2}\) using Eqs. (49) and (50).
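As an illustration, the sketch below (reusing the fgn_acf helper from the simulation sketch above; the search bounds on \(H\) are our assumption) profiles out \(\mu\) and \(\sigma_T\) with Eqs. (49) and (50) and maximizes Eq. (51) numerically:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def profile_loglik(H, T):
    # L_max(H) of Eq. (51): mu and sigma_T replaced by their MLEs, Eqs. (49)-(50).
    N = len(T)
    idx = np.arange(N)
    R = fgn_acf(np.abs(np.subtract.outer(idx, idx)), H)  # autocorrelation matrix
    Rinv = np.linalg.inv(R)
    ones = np.ones(N)
    mu_hat = (ones @ Rinv @ T) / (ones @ Rinv @ ones)    # Eq. (49)
    d = T - mu_hat
    _, logdet = np.linalg.slogdet(R)
    return -0.5 * logdet - 0.5 * N * np.log((d @ Rinv @ d) / N)  # Eq. (51)

def mle_H(T):
    # Maximize L_max(H) over a plausible range of fluctuation exponents.
    res = minimize_scalar(lambda H: -profile_loglik(H, T),
                          bounds=(-0.49, -0.01), method="bounded")
    return res.x

H_hat = mle_H(T)   # T from the simulation sketch; expect H_hat near -0.1
```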

1.3 Ergodicity

It is worth noting here that \(\hat{\mu }\) and \(\hat{\sigma }_{T}^{2}\) are estimates of the ensemble mean \(\mu = \left\langle {T_{t} } \right\rangle\) and variance \(\sigma_{T}^{2} = \langle {\left( {T_{t} - \mu } \right)^{2} } \rangle\) of the fGn process, respectively (see Sect. 2.1). If we instead estimate these parameters from temporal averages of a single realization, some differences with the values obtained using Eqs. (49) and (50) may arise. To explain these differences, we briefly discuss some ergodic properties of fGn processes.

Let

$$\overline{T}_{N} = \frac{{\sum\nolimits_{t = 1}^{N} {T_{t} } }}{N}$$
(52)

and

$$SD_{T}^{2} = \frac{{\sum\nolimits_{t = 1}^{N} {\left( {T_{t} - \overline{T}_{N} } \right)^{2} } }}{N} = \overline{{\left( {T_{N} - \mu } \right)^{2} }} - \left( {\overline{T}_{N} - \mu } \right)^{2}$$
(53)

be the temporal average estimates of the mean and the variance of our process, respectively (the overbar indicates temporal averaging and \(N\) is considered large here); SD stands for “standard deviation”.

Using the relationship between fBm and fGn (Eq. (5)), we can write the temperature as:

$$T_{t} = \sigma_{T} \left[ {B_{{H^{\prime}}} \left( t \right) - B_{{H^{\prime}}} \left( {t - 1} \right)} \right].$$
(54)

The fBm process has the following properties:

$$\begin{aligned}&{\text{(i)}}\;B_{{H^{\prime}}} \left( t \right)\; {\text{is Gaussian with stationary increments}}; \\ &{\text{(ii)}}\;\left\langle B_{{H^{\prime}}} \left( t \right) \right\rangle = \mu t/\sigma_{T}\;{\text{for all}}\;t\;{\text{(the notation}}\;\left\langle \cdot \right\rangle\; {\text{denotes ensemble averaging)}} \\ &{\text{(iii)}}\;C_{{B_{{H^{\prime}}}}} \left( {t,s} \right) = \left\langle \left[ {B_{{H^{\prime}}} \left( t \right) - \mu t/\sigma_{T} } \right]\left[ {B_{{H^{\prime}}} \left( s \right) - \mu s/\sigma_{T} } \right]\right\rangle \\ &\qquad\qquad\quad\;\;\, = \left( {\left| t \right|^{{2H^{\prime}}} + \left| s \right|^{{2H^{\prime}}} - \left| {t - s} \right|^{{2H^{\prime}}} } \right)/2 \end{aligned}$$
(55)

Usually, the condition \(B_{{H^{\prime}}} \left( 0 \right) = 0\) is added to this definition. Using this and Eq. (54), the sum in Eq. (52) telescopes (all intermediate terms cancel, leaving only \(B_{{H^{\prime}}} \left( N \right)\)) and we obtain:

$$\overline{T}_{N} = \frac{1}{N}\sigma_{T} B_{{H^{\prime}}} \left( N \right).$$
(56)

Taking ensemble averages and using Eqs. (55) (ii) and (iii) we get:

$$\left\langle {\overline{T}_{N} } \right\rangle = \mu$$
(57)

and

$$\left\langle {\left( {\overline{T}_{N} - \mu } \right)^{2} } \right\rangle = \frac{1}{{N^{2} }}\sigma_{T}^{2} \left\langle {\left[ {B_{{H^{\prime}}} \left( N \right) - \mu N/\sigma_{T} } \right]^{2} } \right\rangle = \sigma_{T}^{2} N^{2H} ,$$
(58)

where we replaced \(H' = H + 1\).

Consequently, since the process \(B_{{H^{\prime}}} \left( t \right)\) is Gaussian, we conclude that the temporal average estimate of the mean satisfies:

$$\overline{T}_{N} \sim {\text{N}}\left( {\mu ,\sigma_{T}^{2} N^{2H} } \right).$$
(59)

Now, taking the ensemble average of Eq. (53), we get:

$$\left\langle {SD_{T}^{2} } \right\rangle = \left\langle {\overline{{\left( {T_{N} - \mu } \right)^{2} }} } \right\rangle - \left\langle {\left( {\overline{T}_{N} - \mu } \right)^{2} } \right\rangle .$$
(60)

The ensemble and the time averaging operations commute in the first term of the right-hand side of Eq. (60):

$$\left\langle {\overline{{\left( {T_{N} - \mu } \right)^{2} }} } \right\rangle = \overline{{\left\langle {\left( {T_{N} - \mu } \right)^{2} } \right\rangle }} = \sigma_{T}^{2} .$$
(61)

Using this and Eq. (58) for the last term in Eq. (60), we finally get:

$$\left\langle {SD_{T}^{2} } \right\rangle = \sigma_{T}^{2} \left( {1 - N^{2H} } \right),$$
(62)

meaning that the temporal estimate \(SD_{T}^{2}\) is a biased estimate of the variance of the process, \(\sigma_{T}^{2}\). An unbiased estimate would then be \(SD_{T}^{2} /\left( {1 - N^{2H} } \right)\). The variance of this estimator is more difficult to obtain. Its derivation, together with potential applications for treating climate series, will be presented in a future paper (currently in preparation).

In the limit \(N \to \infty\), as \(- 1 < H < 0\), we have \(SD_{T}^{2} \to \sigma_{T}^{2}\), meaning that the process is ergodic (the temporal average and the ensemble average coincide for infinitely long series). Nevertheless, for \(H \to 0\) this convergence is very slow, and a very long series would be needed to estimate the variance of the process from the sample variance without any correction. For example, for \(H = -\, 0.1\) and \(N = 1656\) months \(= 138\) years (realistic values for globally-averaged temperatures, see Sect. 3), we have \(SD_{T}^{2} /\sigma_{T}^{2} = \left( {1 - N^{2H} } \right) = 0.772\), i.e. a 23% difference between the two estimates. In the same sense, if we want to estimate \(\sigma_{T}^{2}\) from the sample variance with 95% accuracy, we would need a series with \(N = 3.2 \cdot 10^{6}\) (if \(N\) is in months, that would be \(N =\) 266,667 years!). The last three columns of Table 4 show the average estimates \(\hat{\sigma }_{T} = \sqrt {\hat{\sigma }_{T}^{2} }\) (Eq. (50)), \(SD_{T}\) (Eq. (53)) and the confirmation of their relationship (Eq. (62)), for simulations of fGn with length \(N = 1656\) and parameters \(\mu = 0\), \(\sigma_{T} = 1\) and values of \(H\) in the range \(\left( { -1/2, 0} \right)\). In each case, 200 realizations were analyzed, but only the average values of the estimates are shown. The standard deviations are always 2–7% of the respective mean values and are not reported. Notice that the difference between \(\hat{\sigma }_{T}\) and \(SD_{T}\) increases as \(H\) approaches zero and the memory effects become more important.
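These numbers are easy to reproduce; a minimal check of Eq. (62) with the values quoted above:

```python
import numpy as np

H, N = -0.1, 1656
print(1 - N**(2*H))             # ratio SD_T^2 / sigma_T^2 from Eq. (62): ~0.772

# Series length needed for the sample variance to reach 95% of sigma_T^2:
print((1 - 0.95)**(1 / (2*H)))  # invert 1 - N^{2H} = 0.95  ->  ~3.2e6 months
```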

Table 4 Average estimates of \(H\) for 200 realizations of simulated fGn with length \(N = 1656\) and parameters \(\mu = 0\), \(\sigma_{T} = 1\) and \(H\) corresponding to the values in the first column

Let us return now to the estimates \(\hat{\mu }\) and \(\hat{\sigma }_{T}^{2}\) given by Eqs. (49) and (50), respectively. These ensemble estimates are still obtained from the information of only one finite series, \({\mathbf{T}}_{N} = \left[ {T_{1} , \ldots ,T_{N} } \right]^{T}\), but the presence of the correlation matrix, \({\tilde{\mathbf{R}}}_{H}^{N}\), automatically includes all the information from the infinite unknown past. If we set \({\tilde{\mathbf{R}}}_{H}^{N} = {\mathbf{I}}_{N}\) (\({\mathbf{I}}_{N}\) is the \(N \times N\) identity matrix) in Eqs. (49) and (50) (or equivalently \(H = - 1/2\)), we obtain:

$$\hat{\mu } = \frac{{{\mathbf{1}}_{N}^{T} {\mathbf{T}}_{N} }}{{{\mathbf{1}}_{N}^{T} {\mathbf{1}}_{N} }} = \frac{{\sum\nolimits_{t = 1}^{N} {T_{t} } }}{N} = \overline{T}_{N}$$
(63)

and

$$\hat{\sigma }_{T}^{2} = \frac{1}{N}{\tilde{\mathbf{T}}}_{{N,\hat{\mu }}}^{T} {\tilde{\mathbf{T}}}_{{N,\hat{\mu }}} = \frac{{\sum\nolimits_{t = 1}^{N} {\left( {T_{t} - \hat{\mu }} \right)^{2} } }}{N} = SD_{T}^{2} .$$
(64)

This means that the temporal average estimates based on one realization of the process are only valid for uncorrelated processes, for which the ensemble and the sample averages are equal. When correlations and memory effects are present, this information must be taken into account. In the case of fGn processes, the memory effects are introduced by including the correlation matrix, which only depends on the fluctuation exponent \(H\). The value of this parameter can also be obtained from a single realization, as shown below.

1.4 Quasi-maximum-likelihood estimation for \(H\)

As we mentioned before, the MLE for the fluctuation exponent, \(\hat{H}_{l}\), is obtained by maximizing \({\mathfrak{L}}_{\text{max} } \left( H \right)\) (Eq. (51)). The optimization of \({\mathfrak{L}}_{\text{max} } \left( H \right)\) can easily become computationally expensive for large values of \(N\). To avoid this, many approximate methods have been developed. We can use Eq. (9) to obtain \(\hat{H}_{s} = \left( {\beta_{l} - 1} \right)/2\) from the spectral exponent at low frequencies. This method, as well as the Haar wavelet analysis used to obtain an estimate \(\hat{H}_{h}\) from the exponent of the Haar fluctuations, was used in Lovejoy and Schertzer (2013) and Lovejoy et al. (2015) to obtain estimates of \(H\) for average global and Northern Hemisphere anomalies. These two methods depend on the range selected for the linear regression and, when the graphs are noisy, this can result in poor estimates of the exponents. They nevertheless have the advantage of being more general; they yield \(H\) estimates even for highly non-Gaussian processes. In the present case, a more accurate approximation is based on quasi-maximum-likelihood estimates (QMLE) from autoregressive approximations (Palma 2007).

Suppose we have a series of \(N\) observations, \(\left\{ {T_{t} } \right\}_{t = 1, \ldots ,N}\). We can build the one-step predictor for \(T_{t}\), \(\hat{T}_{t}^{p} \left( 1 \right)\), from Eq. (22) using a memory of \(p\) steps in the past with \(p + 1 < t \le N\):

$$\hat{T}_{t}^{p} \left( 1 \right) = \sum\limits_{j = - p}^{0} {\phi_{p,j} \left( 1 \right)T_{t + j - 1} } = \phi_{p, - p} \left( 1 \right)T_{t - p - 1} + \cdots + \phi_{p,0} \left( 1 \right)T_{t - 1} .$$
(65)

Then, the approximate QMLE, \(\hat{H}_{q}\), is obtained by minimizing the function

$${\mathfrak{L}}_{1} \left( H \right) = \sum\limits_{t = p + 2}^{N} {\left[ {T_{t} - \hat{T}_{t}^{p} \left( 1 \right)} \right]^{ \, 2} } = \sum\limits_{t = p + 2}^{N} {\left[ {T_{t} - \phi_{p, - p} \left( 1 \right)T_{t - p - 1} - \cdots - \phi_{p,0} \left( 1 \right)T_{t - 1} } \right]^{ \, 2} } .$$
(66)

Remember that the coefficients \(\phi_{p,j}\) only depend on \(H\). An added advantage of this method is that, by construction, it is done as part of the verification process based on hindcasts. The actual mean square error (MSE) of our one-step predictor with memory \(p\) is \({\mathfrak{L}}_{1} \left( H \right)/\left( {N - p - 1} \right)\), so in practice we perform the one-step hindcasts for different values of \(H\) in the specified range and select the value that gives the minimum MSE. The computation of the coefficients \(\phi_{p,j}\) is fast, since we do not need to take very large values of \(p\) to achieve nearly the asymptotic skill, as we showed in Sect. 2.2.1.
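A sketch of this QMLE procedure follows, under the assumption that the coefficients of Eq. (22) are those of the best linear one-step predictor (normal equations); it reuses the fgn_acf helper and the simulated series T from the simulation sketch in Appendix 1.1:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def one_step_coeffs(H, p):
    # phi_{p,j}(1): best linear one-step predictor with memory p (normal
    # equations); phi[0] multiplies T_{t-1}, ..., phi[p] multiplies T_{t-p-1}.
    m = p + 1
    idx = np.arange(m)
    R = fgn_acf(np.abs(np.subtract.outer(idx, idx)), H)
    r = fgn_acf(np.arange(1, m + 1), H)
    return np.linalg.solve(R, r)

def hindcast_mse(H, T, p=20):
    # L1(H)/(N - p - 1): mean square error of one-step hindcasts, Eq. (66).
    phi = one_step_coeffs(H, p)
    m = p + 1
    X = np.column_stack([T[m - i - 1: len(T) - i - 1] for i in range(m)])  # lags 1..m
    return np.mean((T[m:] - X @ phi)**2)

def qmle_H(T, p=20):
    res = minimize_scalar(lambda H: hindcast_mse(H, T, p),
                          bounds=(-0.49, -0.01), method="bounded")
    return res.x

H_q = qmle_H(T)   # expect a value close to (slightly below) the MLE
```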

In order to compare these different estimation methods, we performed some numerical experiments. Using Eq. (46) for the exact method with parameters \(\mu = 0\) and \(\sigma_{T} = 1\), we generated fGn ensembles of one hundred members of length \(N = 1656\) (see Sect. 3) for each value of \(H \in \left\{ { -\, 0.45, - \,0.40, -\, 0.35, -\, 0.30, -\, 0.25, - \,0.20, -\, 0.15, -\, 0.10, -\, 0.05} \right\}\). Then, we estimated \(H\) with the four previously mentioned methods for each realization. The results are summarized in Table 4. The values in parentheses represent the standard deviations for each ensemble. The maximum likelihood, the Haar fluctuation and the spectral methods allow for direct ensemble estimates (shown with the subscript “ens” in Table 4) by considering the maximum likelihood of the vector process, the ensemble of all the fluctuations or the average of all the spectra, respectively, using all the realizations instead of each series independently. For example, \(\hat{H}_{s}\) is the mean of the estimates obtained from each realization’s spectrum, while \(\hat{H}_{{s,{\text{ens}}}}\) is the value obtained from the mean of all the spectra. This ensemble estimate reduces the error due to the dispersion of the individual ensemble members. For the QMLE, a memory \(p = 20\) was used.

As we can see from Table 4, for the MLE method, there is good agreement between the average of the estimates for each realization and the direct ensemble estimate. This is not the case for the less accurate methods of Haar fluctuation and spectral analysis in the member-by-member cases. Comparatively, the standard deviation of these two methods (without considering the estimation error for each specific realization) is much larger than for the MLE. Nevertheless, the ensemble estimates for the Haar are very accurate because the dispersion for the ensemble is much lower than for each individual graph. In practice, it is almost always the case that we only have a given time series to analyze instead of multiple realizations of an ensemble. In that sense, unless we have more theoretical or empirical justifications for the scaling, estimates based on these graphical methods should be considered cautiously.

A direct comparison of the second and third columns in Table 4 shows the accuracy of the QMLE method if we take the MLE as reference. The average values and the standard deviations for the two methods are very close for small values of \(H\), but as we move to values close to zero there is a systematic bias in the QMLE method towards slightly smaller values than those obtained with the MLE. Nevertheless, this bias is of little consequence from the point of view of forecasting and can be reduced by increasing the memory used. As we mentioned before, the QMLE method is based on minimizing the MSE or, equivalently, maximizing the MSSS obtained from hindcasts. Near the minimum, a small variation of the value of \(H\) used to perform the forecast produces almost no change in the MSSS obtained.

1.5 Model adequacy

The final step, after finding the parameters \(\mu\), \(\sigma_{T}^{2}\) and \(H\), is to check the adequacy of the fitted model to the data. Suppose we have a time series \(\left\{ {T_{t} } \right\}_{t = 1, \ldots ,N}\). The residuals of our fGn model are obtained by inverting Eq. (46) and calculating the vector

$${\mathbf{e}}_{N} = \left( {{\mathbf{M}}_{{H,\sigma_{T} }}^{N} } \right)^{ - 1} {\tilde{\mathbf{T}}}_{N,\mu } .$$
(67)

If the model provides a good description of the data, the elements of the residual vector \({\mathbf{e}}_{N} = \left[ {e_{1} , \ldots ,e_{N} } \right]^{T}\) should be white noise, i.e. they should be \({\text{NID}}\left( {0,1} \right)\) with autocorrelation function \(\left\langle e_{i} e_{j}\right\rangle = \delta_{ij}\). Many statistical tests for the whiteness of \(\left\{ {e_{i} } \right\}\) could be performed, the most descriptive being based on examination of the graph of the residual autocorrelation function (RACF). The RACF at lag \(l\) is calculated as:

$$r_{l} \left( {{\mathbf{e}}_{N} } \right) = \frac{{\sum\nolimits_{i = 1}^{N - l} {e_{i} e_{i + l} } }}{{\sum\nolimits_{i = 1}^{N} {e_{i}^{2} } }}.$$
(68)

Asymptotically, \(r_{l} \left( {{\mathbf{e}}_{N} } \right)\sim {\text{NID}}\left( {0,1/N} \right)\) for any lag \(l \ge 1\) and \(r_{0} \left( {{\mathbf{e}}_{N} } \right) = 1\). In the graph of \(r_{l} \left( {{\mathbf{e}}_{N} } \right)\) vs. \(l\), no point should lie significantly outside the 95% confidence interval given by the horizontal lines \(\pm 1.96/\sqrt N\), and the number of points outside this range should be around 5% of the total. As additional tests, we can verify that the estimates of the fluctuation exponent of \(\left\{ {e_{i} } \right\}\), using the previous graphical methods, are \(\hat{H}_{s} \approx \hat{H}_{h} \approx - 0.5\), the value for white noise as a particular case of fGn. The less important Gaussianity assumption can also be verified by comparing the empirical probability distribution against a normal distribution and checking for the presence of extremes.
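The following sketch performs this check on the simulated series T from the simulation sketch above (mu, sigma_T and H are the known simulation parameters there; for real data they would be the estimates from Appendix 1.2):

```python
import numpy as np
from scipy.linalg import solve_triangular

def residuals(T, mu, sigma_T, H):
    # e = M^{-1} (T - mu), Eq. (67); M is the Cholesky factor of R.
    N = len(T)
    idx = np.arange(N)
    R = sigma_T**2 * fgn_acf(np.abs(np.subtract.outer(idx, idx)), H)
    M = np.linalg.cholesky(R)
    return solve_triangular(M, T - mu, lower=True)

def racf(e, max_lag):
    # Residual autocorrelation function, Eq. (68), for lags 1..max_lag.
    denom = np.sum(e**2)
    return np.array([np.dot(e[:-l], e[l:]) / denom for l in range(1, max_lag + 1)])

e = residuals(T, mu=0.0, sigma_T=1.0, H=-0.1)
r = racf(e, max_lag=len(e) // 4)
band = 1.96 / np.sqrt(len(e))
print(np.mean(np.abs(r) > band))  # should be around 0.05 for white residuals
```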

Appendix 2: Checking the fGn model fit to global temperature data

In Table 5 we show the values of the parameters obtained for the ten datasets and the corresponding mean series for the globe and for land:

Table 5 Values of the parameters obtained for the ten datasets and the corresponding mean series for the globe and for land

As we can see in Table 5, there is relatively good agreement between the more robust estimates of the fluctuation exponent, \(\hat{H}_{l}\) and \(\hat{H}_{q}\) (see Appendix 1 for the notation), with a small bias of \(\hat{H}_{q}\) towards smaller values (we used a memory of \(p = 20\) months for estimating \(\hat{H}_{q}\)). The estimates \(\hat{H}_{h}\) and \(\hat{H}_{s}\), obtained using the more general methods, also roughly agree with the MLE and QMLE considering their relatively wide one-standard-deviation confidence intervals (given in parentheses in Table 5). Notice the difference between the parameter \(\hat{\sigma }_{T}\) and the amplitude of each series, \(SD_{T}\). The former is an unbiased maximum-likelihood estimate of the ensemble standard deviation, while the latter is a biased estimate, the bias arising from the finite length of the series and the autocorrelation of the samples (see Ergodicity in Appendix 1). We also include the values of \(SD_{T} /\sqrt {1 - N^{2H} }\) as confirmation of Eq. (25) (\(N = 1656\) months). The last two columns show the climate sensitivity, \(\lambda_{{2 \times {\text{CO}}_{2} {\text{eq}}}}\), and the parameter \(T_{0}\) (Eq. (27)) used to remove the anthropogenic trend in each global series. The value of \(T_{0}\) was chosen to obtain \(\bar{T}_{\text{nat}} = 0\) for each dataset, but this condition does not imply that \(\hat{\mu } = 0\) in Eq. (49), as the latter is an estimate of the ensemble mean. The values obtained for \(\hat{\mu }\) were very small compared to \(\hat{\sigma }_{T}\), so they were not included in Table 5.

With the parameters shown in Table 5 for the global temperature series, we can check the fit of the model to the data as described at the end of Appendix 1. As an example, in Fig. 22 we show the natural variability component for the Mean-G dataset, together with its corresponding series of residual innovations, \(\left\{ {e_{i} } \right\}\), obtained using Eq. (67). The first series should be Gaussian with standard deviation \(SD_{T}\), while the residuals should be white noise, i.e. they should be \({\text{NID}}\left( {0,1} \right)\) with autocorrelation function \(\left\langle e_{i} e_{j}\right\rangle = \delta_{ij}\). To verify the whiteness of the innovations, we should check that the residual autocorrelation function (RACF, Eq. (68)) satisfies \(r_{l} \left( {{\mathbf{e}}_{N} } \right)\sim {\text{NID}}\left( {0,1/N} \right)\) for any lag \(l \ge 1\) (for \(l = 0\), \(r_{0} \left( {{\mathbf{e}}_{N} } \right) = 1\)).

Fig. 22 Natural variability component for the Mean-G dataset, together with its corresponding series of residual innovations, \(\left\{ {e_{i} } \right\}\), obtained using Eq. (67). The units for the \(T_{\text{nat}}\) series are °C, while the innovations are unitless

The graph of the RACF for the innovations of the Mean-G dataset is shown in Fig. 23 for \(0 \le l \le N/4\), where \(N = 1656\) is the total number of points. The inset was obtained by dropping the point at zero lag and zooming in on the y-axis. The theoretical 95% confidence interval, given by the values \(\pm 1.96/\sqrt N\), is shown in dashed lines. Direct inspection shows that the number of points falling outside the band is small (consistent with the expected \(\approx\) 5%) and that the extreme values do not lie far beyond these thresholds.

Fig. 23 RACF for the innovations of the Mean-G dataset. The theoretical 95% confidence interval, given by the values \(\pm 1.96/\sqrt N\), is shown in dashed lines (\(N = 1656\) is the total number of points)

With the purpose of checking the Gaussianity hypothesis of the series represented in Figs. 22 and 23, a detailed statistical analysis was performed. Extremes in the natural variability of temperature are an important issue for the prediction of catastrophic events. Their presence would show up as fat tails in the distributions of the temperature anomalies and their corresponding innovations. If this were the case, the model could be adjusted by assuming white noise with a different distribution for the innovations (e.g. a Lévy distribution or one from a multifractal process). On the other hand, deviations from Gaussianity in the RACF distributions would imply a different correlation structure and would automatically invalidate the applicability of the fGn model.

As an example, in Fig. 24 we show, from top to bottom, the results of this analysis for the natural variability component of the Mean-G dataset, for its corresponding series of residual innovations and for the RACF. On the left, there is a visual comparison of the empirical cumulative distribution functions, CDFs (blue), with the respective fitted Gaussian distributions (red); on the right are the more enlightening probability graphs, where the empirical probabilities obtained from the graphs on the left are plotted against the theoretical probabilities (blue curve). The reference line shown in red corresponds to a perfect fit. The Kolmogorov–Smirnov (K–S) test quantifies the behavior seen in the probability graphs: its statistic is equivalent to the maximum vertical distance between a point in the plot and the reference line. The closer the points are to the reference line, the more likely it is that the data follow the fitted theoretical distribution.

Fig. 24 From top to bottom, graphs for the natural variability component of the Mean-G dataset, for the series of residual innovations and for the RACF. On the left, a comparison of the empirical CDFs (blue line with circles) with the respective fitted Gaussian distributions (red); on the right, the more detailed probability graphs, where the empirical probabilities obtained from the graphs on the left are plotted against the theoretical probabilities (blue line with circles). The reference line shown in red corresponds to a perfect fit

In Table 6 we summarize the standard deviations of the normal distributions obtained for the series of anomalies (\(SD_{T}\)), the series of residual innovations (\(SD_{\text{innov}}\)) and the RACF (\(SD_{\text{RACF}}\)) for each dataset. The mean values of the distributions were very small compared to the respective standard deviations and were omitted. The K–S test statistics with the corresponding p-values are also shown. More powerful statistical tests for normality could be performed, such as the Shapiro–Wilk or Anderson–Darling tests. However, these tests have their own disadvantages and, for the purpose of this work, the conclusions obtained from the K–S test are good enough to check the Gaussianity hypothesis of the original anomalies and the adequacy of the fGn fit.
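A minimal sketch of these K–S tests with scipy, reusing the residuals e and RACF values r from the model-adequacy sketch in Appendix 1 (for real data they would be computed from the dataset's innovations):

```python
import numpy as np
from scipy import stats

# Innovations should be NID(0, 1) under the fGn model:
ks_innov = stats.kstest(e, "norm", args=(0.0, 1.0))
print(ks_innov.statistic, ks_innov.pvalue)  # reject normality only if pvalue < 0.05

# RACF values should be NID(0, 1/N); rescale by sqrt(N) and test against N(0, 1):
ks_racf = stats.kstest(r * np.sqrt(len(e)), "norm")
print(ks_racf.statistic, ks_racf.pvalue)
```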

Table 6 Normality tests and standard deviations of the distributions obtained for the series of anomalies (\(SD_{T}\)), the series of residual innovations (\(SD_{\text{innov}}\)) and the RACF (\(SD_{\text{RACF}}\)) for each global dataset

The values of \(SD_{T}\) are the same as those shown previously in Table 5. As expected from the theory, \(SD_{\text{innov}} = 1\) for all datasets and the values obtained for \(SD_{\text{RACF}}\) are close to the theoretical value \(1/\sqrt N = 0.025\) (\(N = 1656\)). With the exception of the residual innovations of NOAA and HAD4 for the global datasets, the p-values are above 0.05, so there is not enough evidence to reject normality at that level. Moreover, the p-values obtained are, in general, larger than those obtained for series of the same length based on pseudorandom number generators [in a numerical experiment using 10,000 samples, the p-values were uniformly distributed in the range (0, 1)]. For the land surface datasets, the p-values for the temperature anomalies and the innovations are low, and a different distribution for the white-noise innovations could be proposed.

As we mentioned before, the normality of the innovations is less important for confirming the adequacy of the model than their whiteness, which is confirmed by the Gaussianity of the RACF in all cases (see the large p-values in the last column of Table 6). The main deviation from normal behavior is the existence of extremes in the original data. This “fat-tailed” property of the probability distributions was evidenced in Lovejoy (2014) in a paper on statistical hypothesis testing of anthropogenic warming. In the present work, it does not have major implications or compromise the applicability of the model to the global data.

Appendix 3: Forecast and validation for all datasets

Some results of the hindcast validation are summarized in Table 7 for the twelve datasets, including the mean series for the globe and the land surface. Only the error, \({\text{RMSE}}_{\text{nat}}\), and the \({\text{ACC}}_{\text{nat}}\) for the natural variability component are presented, for horizons \(k =\) 1, 3, 6 and 12 months. The values \({\text{MSSS}}_{\text{nat}}\) and \({\text{MSSS}}_{\text{raw}}\) can be obtained from Eq. (34) taking \({\text{MSE}} = {\text{RMSE}}^{2}\) and the respective \({\text{MSE}}_{\text{ref}} = SD_{T}^{2}\) or \({\text{MSE}}_{\text{ref}} = SD_{\text{raw}}^{2}\). Also, we can use the values of \({\text{ACC}}_{\text{nat}}\) to obtain very good approximations of \({\text{MSSS}}_{\text{nat}}\) for these horizons thanks to the relationship \({\text{MSSS}}_{\text{nat}} \approx {\text{ACC}}_{\text{nat}}^{2}\) (Eq. (38)). Only the spurious values of \({\text{ACC}}_{\text{raw}}\) cannot be obtained from this table, but it is worth mentioning that, even for \(k = 12\) months, they are higher than 0.75 for all datasets. Notice the large difference between the values of \(SD_{T}\) and \(SD_{\text{raw}}\), for the detrended and the raw anomalies respectively, due to the presence of the anthropogenic trend. The values of \(\hat{\sigma }_{T}\) were included for reference.
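For reference, the conversion between these scores works as in the sketch below; the RMSE and SD numbers are hypothetical placeholders, not entries of Table 7:

```python
import numpy as np

def msss(rmse, sd_ref):
    # Eq. (34) with MSE = RMSE^2 and MSE_ref = SD_ref^2.
    return 1.0 - (rmse / sd_ref)**2

# Hypothetical illustration: RMSE_nat = 0.09 C at k = 1 month, SD_T = 0.16 C.
m = msss(0.09, 0.16)
print(m)             # MSSS_nat ~ 0.68
print(np.sqrt(m))    # ACC_nat ~ 0.83 via MSSS_nat ~ ACC_nat^2, Eq. (38)
```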

Table 7 Skill scores \({\text{RMSE}}_{\text{nat}}\) and \({\text{ACC}}_{\text{nat}}\) for forecast horizons \(k =\) 1, 3, 6 and 12 months for the twelve datasets, including the mean series for the globe and the land surface


Cite this article

Del Rio Amador, L., Lovejoy, S. Predicting the global temperature with the Stochastic Seasonal to Interannual Prediction System (StocSIPS). Clim Dyn 53, 4373–4411 (2019). https://doi.org/10.1007/s00382-019-04791-4

