Differences in potential and actual skill in a decadal prediction experiment


Decadal prediction results are analyzed for the predictability and skill of annual mean temperature. Forecast skill is assessed in terms of correlation, mean square error (MSE) and mean square skill score. The predictability of the forecast system is assessed by calculating the corresponding “potential” skill measures based on the behaviour of the forecast ensemble. The expectation is that potential skill, where the model predicts its own evolution, will be greater than the actual skill, where the model predicts the evolution of the real system, and that the difference is an indication of the potential for forecast improvement. This will depend, however, on the agreement of the second order climate statistics of the forecasts with those of the climate system. In this study the forecast variance differs from the variance of the verifying observations over non-trivial parts of the globe. Observation-based values of variance from different sources also differ non-trivially. This is an area of difficulty independent of the forecasting system and also affects the comparison of actual and potential mean square error. It is possible to scale the forecast variance estimate to match that of the verifying data so as to avoid this consequence but a variance mismatch, whatever its source, remains a difficulty when considering forecast system improvements. Maps of actual and potential correlation indicate that over most of the globe potential correlation is greater than actual correlation, as expected, with the difference suggesting, but not demonstrating, that it might be possible to improve skill. There are exceptions, mainly over some land areas in the Northern Hemisphere and at later forecast ranges, where actual correlation can exceed potential correlation, and this behaviour is ascribed to excessive noise variance in the forecasts, at least as compared to the verifying data. 
Sampling error can also play a role, but significance testing suggests it is not sufficient to explain the results. Similar results are obtained for MSE but only after scaling the forecasts to match the variance of the verifying observations. It is immediately clear that the forecast system is deficient, independent of other considerations, if the actual correlation is greater than the potential correlation and/or the actual MSE is less than the potential MSE and this gives some indication of the nature of the deficiency in the forecasts in these regions. The predictable and noise components of an ensemble of forecasts can be estimated but this is not the case for the actual system. The degree to which the difference between actual and potential skill indicates the potential for improvement of the forecasting can only be judged indirectly. At a minimum the variances of the forecasts and of the verifying data should be in reasonable accord. If the potential skill is greater than the actual skill for a forecasting system based on a well behaved model it suggests, as a working hypothesis, that forecast skill can be improved so as to more closely approach potential skill.


  1. Baker LH, Shaffrey LC, Sutton RT, Weisheimer A, Scaife AA (2018) An intercomparison of skill and over/underconfidence of the wintertime North Atlantic Oscillation in multi-model seasonal forecasts. Geophys Res Lett.

  2. Boer GJ, Merryfield WJ, Kharin VV (2018) Relationships between potential, attainable, and actual skill in a decadal prediction experiment. Clim Dyn.

  3. Boer GJ, Kharin VV, Merryfield WJ (2013) Decadal predictability and forecast skill. Clim Dyn 41:1817–1833.

  4. Boer GJ (2009) Climate trends in a seasonal forecasting system. Atmos-Ocean 47:123–138.

  5. Boer GJ, Lambert SJ (2008) Multi-model decadal potential predictability of precipitation and temperature. Geophys Res Lett.

  6. Boer GJ (2004) Long time-scale potential predictability in an ensemble of coupled climate models. Clim Dyn 23:29–44

  7. Boer GJ, Lambert SJ (2001) Second order space-time climate difference statistics. Clim Dyn 17:213–218

  8. Dee DP (2011) The ERA-interim reanalysis: configuration and performance of the data assimilation system. Q J R Meteorol Soc 137:553–597

  9. Deque M (2003) Continuous variables. In: Jolliffe IT, Stephenson DB (eds) Forecast verification: a practitioner’s guide in atmospheric science. Wiley, Chichester, p 240

  10. Dunstone NJ, Smith DM, Scaife AA, Hermanson L, Eade R, Robinson N, Andrews M, Knight J (2016) Skilful predictions of the winter North Atlantic Oscillation one year ahead. Nat Geosci.

  11. Eade R, Smith D, Scaife A, Wallace E, Dunstone N, Hermanson L, Robinson N (2014) Do seasonal-to-decadal climate predictions under-estimate the predictability of the real world? Geophys Res Lett.

  12. Hansen J, Ruedy R, Sato M, Lo K (2010) Global surface temperature change. Rev Geophys 48:RG4004.

  13. Jin Y, Rong X, Liu Z (2017) Potential predictability and forecast skill in ensemble climate forecast: a skill-persistence rule. Clim Dyn.

  14. Kumar A, Peng P, Chen M (2014) Is there a relationship between potential and actual skill? Mon Weather Rev 142:2220–2227.

  15. Pitman JG (1939) A note on normal correlation. Biometrika 31:9–12

  16. Pohlmann H, Botzet M, Latif M, Roesch A, Wild M, Tschuck P (2004) Estimating the decadal predictability of a coupled AOGCM. J Clim 17:4463–4472.

  17. Scaife AA (2014) Skillful long-range prediction of European and North American winters. Geophys Res Lett 41:2514–2519.

  18. Scaife AA, Smith DM (2018) A signal-to-noise paradox in climate science. npj Clim Atmos Sci.

  19. Taylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res 106:7183–7192

  20. Merryfield WJ et al (2013) The Canadian seasonal to interannual prediction system. Part I: Models and initialization. Mon Weather Rev 141:2910–2945.

  21. Stockdale TN, Molteni F, Ferranti L (2015) Atmospheric initial conditions and the predictability of the Arctic Oscillation. Geophys Res Lett 42:1173–1179.

  22. Uppala SM (2005) The ERA-40 re-analysis. Q J R Meteorol Soc 131:2961–3012


We acknowledge the important contributions of many members of the CCCma team in developing the model and the forecasting system, and thank Woo-Sung Lee for her contribution in producing the forecasts.

Author information



Corresponding author

Correspondence to G. J. Boer.



Kumar et al. (2014), hereinafter KEA, consider the relationship between potential and actual skill for DJF seasonal predictions made with two forecasting systems. It may be worth considering their results in terms of the formalism developed here. The approaches are similar in some ways, although notation differs, and several explicit and implicit assumptions are made in KEA which strongly condition their conclusions.

In the Appendix of KEA two assumptions are introduced which considerably alter the relationships in Table 1. In the notation used here, the assumptions are that the predictable component of the forecast \(\psi\) is proportional to that of the observations \(\chi\), that the forecast and observation-based variances are the same and, implicitly, that the ensemble size m is large. That is

$$\begin{aligned} \psi &= \alpha \chi \\ \sigma _{Y}^{2} &= \sigma _{X}^{2}=\sigma ^{2} \end{aligned} \tag{8}$$

where \(\alpha\) is a constant. These assumptions imply that

$$\begin{aligned} R_{\chi \psi } &\Rightarrow 1 \\ \sigma _{\psi }^{2} &\Rightarrow \alpha ^{2}\sigma _{\chi }^{2} \\ q &\Rightarrow \alpha ^{2}p \end{aligned}$$

where the broad arrows indicate the result of imposing (8). KEA also consider lag 1 autocorrelations which, following the same reasoning as for the statistics in Table 1, give

$$\begin{aligned} A_{Y}=q{\mathcal {A}}_{\psi }\Rightarrow \alpha ^{2}p{\mathcal {A}}_{\psi }=\alpha ^{2}p{\mathcal {A}}_{\chi }=\alpha ^{2}A_{X} \end{aligned}$$

with \({\mathcal {A}}_{\psi }\Rightarrow {\mathcal {A}}_{\chi }\) for the lag 1 autocorrelations of the predictable components \(\psi\) and \(\chi\). We note in passing that equations (1) and (2) in KEA have misprints.
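The implications of (8) for the second order statistics can be checked in a few lines. The sketch below assumes that p and q in Table 1 denote the predictable fractions of the observed and forecast variance, \(p=\sigma _{\chi }^{2}/\sigma _{X}^{2}\) and \(q=\sigma _{\psi }^{2}/\sigma _{Y}^{2}\), that \(\alpha >0\), and that an overbar (notation introduced here for illustration only) denotes an average of products of anomalies:

$$\begin{aligned} R_{\chi \psi } &= \frac{\overline{\chi \psi }}{\sigma _{\chi }\sigma _{\psi }} = \frac{\alpha \sigma _{\chi }^{2}}{\sigma _{\chi }\,\alpha \sigma _{\chi }} = 1 \\ \sigma _{\psi }^{2} &= \overline{(\alpha \chi )^{2}} = \alpha ^{2}\sigma _{\chi }^{2} \\ q &= \frac{\sigma _{\psi }^{2}}{\sigma _{Y}^{2}} = \frac{\alpha ^{2}\sigma _{\chi }^{2}}{\sigma _{X}^{2}} = \alpha ^{2}p \end{aligned}$$

For \(\alpha <0\) the first line would instead give \(R_{\chi \psi }=-1\), so the positivity of \(\alpha\) is part of what is being assumed.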

KEA compare r and \(\rho\) for DJF forecasts from two different forecast models (their Figs. 1 and 2), differences in \(\rho\) and \(A_{Y}\) between the two models (Figs. 3 and 4), and the RMSE \(\sqrt{e}\) and \(\sqrt{\xi }\) for the models (Fig. 5). From Table 1

$$\begin{aligned} r=\left( \sqrt{\frac{p}{q}}R_{\chi \psi }\right) \rho \Rightarrow \frac{\rho }{\alpha } \end{aligned}$$

where the broad arrow is again the result of imposing assumptions (8). If assumptions (8) were to hold then r would be proportional to \(\rho\) and scatter plots of local values of r against \(\rho\), as in KEA Fig. 2, would fall on a straight line through the origin (the diagonal when \(\alpha =1\)). However, in the general case all of p, q and \(R_{\chi \psi }\) vary with location over the globe as well as with forecast range so that

$$\begin{aligned} \alpha =\frac{1}{R_{\chi \psi }}\sqrt{\frac{q}{p}} \end{aligned}$$

would not be expected to be constant and off-diagonal points in KEA Fig. 2 would be expected. It is certainly the case, as noted by KEA and discussed in detail above, that for local values of \(r>\rho\), i.e. for points below the diagonal in KEA Fig. 2, \(\rho\) is clearly an underestimate of potentially available skill. However, the converse is not the case, and \(\rho>r\) does not imply that \(\rho\) is an overestimate of potentially available skill.
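The proportionality between r and \(\rho\) under (8) can be illustrated with synthetic data. The following Python sketch is not part of the analysis reported here. It assumes Gaussian predictable and noise components, unit total variance for both forecasts and observations, \(p=\sigma _{\chi }^{2}/\sigma _{X}^{2}\), and hypothetical values of \(\alpha\), p, ensemble size and sample length. Potential correlation is estimated by correlating one member with the mean of the remaining members, which is only one of several possible estimators.

```python
import numpy as np


def simulate_skill(alpha, p, n_times=50_000, n_members=40, seed=0):
    """Return (r, rho) for synthetic forecasts with psi = alpha * chi.

    Illustrative assumptions: Gaussian components, sigma_X^2 = sigma_Y^2 = 1,
    p = sigma_chi^2 / sigma_X^2, hence q = alpha^2 * p.
    """
    q = alpha**2 * p
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p and alpha**2 * p must both lie strictly between 0 and 1")

    rng = np.random.default_rng(seed)

    # Observations: predictable component chi plus unpredictable noise x.
    chi = rng.normal(0.0, np.sqrt(p), n_times)
    obs = chi + rng.normal(0.0, np.sqrt(1.0 - p), n_times)

    # Forecast ensemble: shared predictable component psi plus member noise y.
    psi = alpha * chi
    members = psi + rng.normal(0.0, np.sqrt(1.0 - q), (n_members, n_times))

    # Actual correlation: ensemble mean against the observations.
    r = np.corrcoef(members.mean(axis=0), obs)[0, 1]

    # Potential correlation: mean of the other members against a left-out member.
    rho = np.corrcoef(members[1:].mean(axis=0), members[0])[0, 1]
    return r, rho


alpha = 0.8  # hypothetical value
r, rho = simulate_skill(alpha, p=0.5)
print(f"r = {r:.3f}, rho = {rho:.3f}, rho / alpha = {rho / alpha:.3f}")
```

With \(\alpha <1\) the forecasts carry too little predictable variance relative to their noise, and the sketch gives \(r>\rho\) with r close to \(\rho /\alpha\). Letting \(\alpha\) vary from point to point would scatter the points about any single line.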

The lag 1 autocorrelation can be represented in several ways in the notation used here

$$\begin{aligned} A_{Y}=q{\mathcal {A}}_{\psi }=\left( 1-\frac{\sigma _{y}^{2}}{\sigma _{Y}^{2}}\right) {\mathcal {A}}_{\psi }\Rightarrow \alpha ^{2}p{\mathcal {A}}_{\chi }=\alpha ^{2}A_{X} \end{aligned}$$

but from Table 1 the simplest relationship between \(A_{Y}\) and \(\rho\) is

$$\begin{aligned} A_{Y}=\rho ^{2}{\mathcal {A}}_{\psi } \end{aligned}$$

which is independent of (8) (presuming large m). Taking differentials to represent differences between the results from two different models, as plotted in KEA Figs. 3 and 4, gives

$$\begin{aligned} \frac{\delta \rho }{\rho } =\frac{1}{2} \left( \frac{\delta A_{Y}}{A_{Y}}-\frac{\delta {\mathcal {A}}_{\psi }}{{\mathcal {A}}_{\psi }}\right) \end{aligned} \tag{12}$$

so that a proportionally larger \(A_{Y}\) in one model compared to another contributes proportionally, but not linearly, to a larger \(\rho\), while the reverse is true for the difference in \({\mathcal {A}}_{\psi }\). The general expectation is that larger \(\delta A_{Y}\) will be associated with larger \(\delta \rho\), as broadly seen in KEA Fig. 4. However, this may be offset locally by differences \(\delta {\mathcal {A}}_{\psi }\). If (8) were strictly the case, including the constancy of \(\alpha\), then (12) would become

$$\begin{aligned} \frac{\delta r}{r} = \frac{1}{2}\left( \frac{\delta A_{X}}{A_{X}}-\frac{\delta {\mathcal {A}}_{\chi }}{{\mathcal {A}}_{\chi }}\right) \end{aligned}$$

which would be zero provided that the same verification data X were used in each case.
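The differential relation between \(\rho\), \(A_{Y}\) and \({\mathcal {A}}_{\psi }\) is a first-order statement obtained from \(A_{Y}=\rho ^{2}{\mathcal {A}}_{\psi }\), so it is approximate for finite differences between two models. The Python sketch below compares the linearized and exact relative changes in \(\rho\). The autocorrelation values are hypothetical and chosen only for illustration, and the function names are not taken from any existing package.

```python
import math


def rho_from_autocorrelations(a_y, a_psi):
    """Potential correlation implied by A_Y = rho**2 * A_psi (large-ensemble limit)."""
    if not 0.0 < a_y <= a_psi:
        raise ValueError("expects 0 < A_Y <= A_psi so that 0 < rho <= 1")
    return math.sqrt(a_y / a_psi)


def linearized_relative_change(a_y_1, a_psi_1, a_y_2, a_psi_2):
    """First-order estimate of delta(rho)/rho between model 1 and model 2."""
    rel_a_y = (a_y_2 - a_y_1) / a_y_1
    rel_a_psi = (a_psi_2 - a_psi_1) / a_psi_1
    return 0.5 * (rel_a_y - rel_a_psi)


# Hypothetical lag 1 autocorrelations for two models at one location.
a_y_1, a_psi_1 = 0.30, 0.60
a_y_2, a_psi_2 = 0.33, 0.62

rho_1 = rho_from_autocorrelations(a_y_1, a_psi_1)
rho_2 = rho_from_autocorrelations(a_y_2, a_psi_2)

exact = (rho_2 - rho_1) / rho_1
approx = linearized_relative_change(a_y_1, a_psi_1, a_y_2, a_psi_2)
print(f"exact = {exact:.4f}, linearized = {approx:.4f}")
```

In this example the larger \(A_{Y}\) of the second model is partly offset by its larger \({\mathcal {A}}_{\psi }\), which is the local compensation referred to above.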

From Table 1, differences in actual and potential MSE are

$$\begin{aligned} e-\xi =e_{\chi \psi } +\sigma _{x}^{2}-\sigma _{y}^{2}\Rightarrow (1-\alpha )^{2}\sigma _{\chi }^{2}+(\sigma _{x}^{2}-\sigma _{y}^{2}) \end{aligned} \tag{13}$$

and, as noted elsewhere, \(e>\xi\) in (13) does not necessarily mean that \(\sigma _{y}^{2}<\sigma _{x}^{2}\) and with assumptions (8) this would not be the case unless \(\alpha =1\) everywhere.
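A small numerical example may help with the first, general relation in (13). The Python sketch below assumes the large-ensemble limit, zero-mean anomalies, and noise components uncorrelated with the predictable components, so that \(e=e_{\chi \psi }+\sigma _{x}^{2}\) and \(\xi =\sigma _{y}^{2}\). These decompositions are assumptions made here for consistency with (13), and all variance values are hypothetical.

```python
import math


def mse_terms(var_chi, var_psi, corr_chi_psi, var_x, var_y):
    """Return (e, xi, e - xi) in the large-ensemble limit.

    Assumes zero-mean anomalies and noise uncorrelated with the
    predictable components chi and psi.
    """
    # Mean square difference between the predictable components chi and psi.
    e_chi_psi = var_chi + var_psi - 2.0 * corr_chi_psi * math.sqrt(var_chi * var_psi)
    e = e_chi_psi + var_x  # actual MSE: ensemble mean (~psi) against X = chi + x
    xi = var_y             # potential MSE: ensemble mean (~psi) against Y = psi + y
    return e, xi, e - xi


# Hypothetical case: forecast noise variance exceeds observed noise variance,
# but the predictable components agree poorly (R_chi_psi = 0.2).
e, xi, diff = mse_terms(var_chi=0.5, var_psi=0.5, corr_chi_psi=0.2,
                        var_x=0.4, var_y=0.6)
print(f"e = {e:.2f}, xi = {xi:.2f}, e - xi = {diff:.2f}")
```

Here \(e>\xi\) even though \(\sigma _{y}^{2}>\sigma _{x}^{2}\), because the term \(e_{\chi \psi }\) dominates the difference.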

These more general relationships offer interpretations of KEA’s results which differ from theirs to a greater or lesser extent. It is certainly the case that potential skill and other second order statistics must be interpreted with care, as discussed in KEA and at considerable length in Boer et al. (2018). The hope is that investigations of these kinds can provide new information on forecast system deficiencies and remedies.

Boer, G.J., Kharin, V.V. & Merryfield, W.J. Differences in potential and actual skill in a decadal prediction experiment. Clim Dyn 52, 6619–6631 (2019).


  • Decadal prediction
  • Predictability
  • Skill
  • Potential skill