Differences in potential and actual skill in a decadal prediction experiment


Decadal prediction results are analyzed for the predictability and skill of annual mean temperature. Forecast skill is assessed in terms of correlation, mean square error (MSE) and mean square skill score. The predictability of the forecast system is assessed by calculating the corresponding “potential” skill measures based on the behaviour of the forecast ensemble. The expectation is that potential skill, where the model predicts its own evolution, will be greater than the actual skill, where the model predicts the evolution of the real system, and that the difference is an indication of the potential for forecast improvement.

This will depend, however, on the agreement of the second order climate statistics of the forecasts with those of the climate system. In this study the forecast variance differs from the variance of the verifying observations over non-trivial parts of the globe. Observation-based values of variance from different sources also differ non-trivially. This is an area of difficulty independent of the forecasting system, and it also affects the comparison of actual and potential mean square error. It is possible to scale the forecast variance estimate to match that of the verifying data so as to avoid this consequence, but a variance mismatch, whatever its source, remains a difficulty when considering forecast system improvements.

Maps of actual and potential correlation indicate that over most of the globe potential correlation is greater than actual correlation, as expected, with the difference suggesting, but not demonstrating, that it might be possible to improve skill. There are exceptions, mainly over some land areas in the Northern Hemisphere and at later forecast ranges, where actual correlation can exceed potential correlation. This behaviour is ascribed to excessive noise variance in the forecasts, at least as compared to the verifying data. Sampling error can also play a role, but significance testing suggests it is not sufficient to explain the results. Similar results are obtained for MSE, but only after scaling the forecasts to match the variance of the verifying observations.

It is immediately clear that the forecast system is deficient, independent of other considerations, if the actual correlation is greater than the potential correlation and/or the actual MSE is less than the potential MSE, and this gives some indication of the nature of the deficiency in the forecasts in these regions. The predictable and noise components of an ensemble of forecasts can be estimated, but this is not the case for the actual system. The degree to which the difference between actual and potential skill indicates the potential for improvement of the forecasting can only be judged indirectly. At a minimum, the variances of the forecasts and of the verifying data should be in reasonable accord. If the potential skill is greater than the actual skill for a forecasting system based on a well-behaved model, it suggests, as a working hypothesis, that forecast skill can be improved so as to more closely approach potential skill.






Acknowledgements

We acknowledge the important contributions of many members of the CCCma team in developing the model and the forecasting system, and Woo-Sung Lee for her contribution in producing the forecasts.

Author information

Corresponding author

Correspondence to G. J. Boer.



Appendix

Kumar et al. (2014), hereinafter KEA, consider the relationship between potential and actual skill for DJF seasonal predictions made with two forecasting systems. It may be worth considering their results in terms of the formalism developed here. The approaches are similar in some ways, although the notation differs and several explicit and implicit assumptions are made in KEA which strongly condition their conclusions.

In the Appendix of KEA two assumptions are introduced which considerably alter the relationships in Table 1. In the notation used here, the assumptions are that the predictable component of the forecast \(\psi\) is proportional to that of the observations \(\chi\), that the forecast and observation-based variances are the same and, implicitly, that m is large. That is

$$\begin{aligned} \psi &= \alpha \chi \nonumber \\ \sigma _{Y}^{2} &= \sigma _{X}^{2}=\sigma ^{2} \end{aligned}$$

where \(\alpha\) is a constant. These assumptions imply that

$$\begin{aligned} R_{\chi \psi } &\Rightarrow 1\nonumber \\ \sigma _{\psi }^{2} &\Rightarrow \alpha ^{2}\sigma _{\chi }^{2}\nonumber \\ q &\Rightarrow \alpha ^{2}p \end{aligned}$$

where the broad arrows indicate the result of imposing (8). KEA also consider lag 1 autocorrelations which, following the same reasoning as for the statistics in Table 1, give

$$\begin{aligned} A_{Y}=q{\mathcal {A}}_{\psi }\Rightarrow \alpha ^{2}p{\mathcal {A}}_{\psi }=\alpha ^{2}p{\mathcal {A}}_{\chi }=\alpha ^{2}A_{X} \end{aligned}$$

with \({\mathcal {A}}_{\psi }\Rightarrow {\mathcal {A}}_{\chi }\) for the lag 1 autocorrelations of the predictable components \(\psi\) and \(\chi\). We note in passing that equations (1) and (2) in KEA have misprints.
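The relation \(A_{Y}=q{\mathcal {A}}_{\psi }\) requires only that the forecast noise be uncorrelated from one verification time to the next. It can be illustrated with a minimal Monte Carlo sketch in which the predictable component is an AR(1) process; all parameter values here are invented for illustration and are not taken from either forecast system:

```python
import numpy as np

rng = np.random.default_rng(2)
n, a, alpha = 200_000, 0.8, 0.7   # hypothetical AR(1) coefficient and amplitude

# AR(1) predictable component with unit variance: lag 1 autocorrelation A_chi = a
chi = np.empty(n)
chi[0] = rng.standard_normal()
eps = np.sqrt(1.0 - a**2) * rng.standard_normal(n)
for t in range(1, n):
    chi[t] = a * chi[t - 1] + eps[t]

psi = alpha * chi                        # predictable component of the forecast
Y = psi + rng.standard_normal(n)         # white forecast noise, sigma_y^2 = 1

q = np.var(psi) / np.var(Y)              # predictable variance fraction
A_Y = np.corrcoef(Y[:-1], Y[1:])[0, 1]   # lag 1 autocorrelation of the forecast
print(A_Y, q * a)                        # A_Y = q * A_psi, with A_psi = A_chi = a
```

Since \(\psi \propto \chi\), the autocorrelation of the predictable component is inherited, \({\mathcal {A}}_{\psi }={\mathcal {A}}_{\chi }\), while the white noise damps the autocorrelation of the full forecast by the factor q.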

KEA compare r and \(\rho\) for DJF forecasts from two different forecast models (their Figs. 1 and 2), differences in \(\rho\) and \(A_{Y}\) between the two models (Figs. 3 and 4), and the RMSE \(\sqrt{e}\) and \(\sqrt{\xi }\) for the models (Fig. 5). From Table 1

$$\begin{aligned} r=\left( \sqrt{\frac{p}{q}}R_{\chi \psi }\right) \rho \Rightarrow \frac{\rho }{\alpha } \end{aligned}$$

where the broad arrow is again the result of imposing assumptions (8). If assumptions (8) were to hold then r would be proportional to \(\rho\) and scatter plots of local values of r against \(\rho\), as in KEA Fig. 2, would fall on the diagonal. However, in the general case all of p, q and \(R_{\chi \psi }\) vary with location over the globe as well as with forecast range so that

$$\begin{aligned} \alpha =\frac{1}{R_{\chi \psi }}\sqrt{\frac{q}{p}} \end{aligned}$$

would not be expected to be constant and off-diagonal points in Fig. 2 would be expected. It is certainly the case, as noted by KEA and discussed in detail above, that for local values of \(r>\rho\), i.e. for points below the diagonal in Fig. 2, \(\rho\) is clearly an underestimate of potentially available skill. However, the converse is not the case, and \(\rho >r\) does not imply that \(\rho\) is an overestimate of potentially available skill.
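The general relation \(r=\sqrt{p/q}\,R_{\chi \psi }\,\rho\) and its reduction to \(r=\rho /\alpha\) under assumptions (8) can be checked with a minimal Monte Carlo sketch; the variances and the value of \(\alpha\) are invented for illustration and are not taken from either forecast system:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000      # verification sample size, large to suppress sampling error
alpha = 0.7      # hypothetical proportionality constant in psi = alpha * chi

chi = rng.standard_normal(n)            # predictable component of observations
x = rng.standard_normal(n)              # observation noise, sigma_x^2 = 1
psi = alpha * chi                       # assumption (8): psi = alpha * chi
# choose forecast-noise variance so that var(Y) = var(X), the second assumption
y = np.sqrt(2.0 - alpha**2) * rng.standard_normal(n)
X, Y = chi + x, psi + y

p = np.var(chi) / np.var(X)             # predictable variance fraction, observations
q = np.var(psi) / np.var(Y)             # predictable variance fraction, forecasts
rho = np.sqrt(q)                        # potential correlation (large-m limit)
r = np.corrcoef(psi, X)[0, 1]           # actual correlation of psi with observations

print(r, rho / alpha)                   # agree to within sampling error
print(q, alpha**2 * p)                  # q => alpha^2 p under (8)
```

Relaxing either assumption, for instance letting \(R_{\chi \psi }<1\) or \(\sigma _{Y}^{2}\ne \sigma _{X}^{2}\), moves r away from \(\rho /\alpha\), which is the general case discussed above.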

The lag 1 autocorrelation can be represented in several ways in the notation used here

$$\begin{aligned} A_{Y}=q{\mathcal {A}}_{\psi }=\left( 1-\frac{\sigma _{y}^{2}}{\sigma _{Y}^{2}}\right) {\mathcal {A}}_{\psi }\Rightarrow \alpha ^{2}p{\mathcal {A}}_{\chi }=\alpha ^{2}A_{X} \end{aligned}$$

but from Table 1 the simplest relationship between \(A_{Y}\) and \(\rho\) is

$$\begin{aligned} A_{Y}=\rho ^{2}{\mathcal {A}}_{\psi } \end{aligned}$$

which is independent of (8) (presuming large m). Taking differentials to represent differences between the results from two different models, as plotted in KEA Figs. 3 and 4, gives

$$\begin{aligned} \frac{\delta \rho }{\rho } =\frac{1}{2} \left( \frac{\delta A_{Y}}{A_{Y}}-\frac{\delta {\mathcal {A}}_{\psi }}{{\mathcal {A}}_{\psi }}\right) \end{aligned}$$

so that a proportionally larger \(A_{Y}\) in one model compared to another contributes proportionally, but not linearly, to a larger \(\rho\), while the reverse is true for the difference in \({\mathcal {A}}_{\psi }\). The general expectation is that larger \(\delta A_{Y}\) will be associated with larger \(\delta \rho\), as broadly seen in KEA Fig. 4. However, this may be offset locally by differences in \(\delta {\mathcal {A}}_{\psi }\). If (8) were strictly the case, including the constancy of \(\alpha\), then (12) would become

$$\begin{aligned} \frac{\delta r}{r} = \frac{1}{2}\left( \frac{\delta A_{X}}{A_{X}}-\frac{\delta {\mathcal {A}}_{\chi }}{{\mathcal {A}}_{\chi }}\right) \end{aligned}$$

which would be zero provided that the same verification data X were used in each case.
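The differential relation (12) is an approximation for small fractional differences. A quick numerical check, using invented values of \(\rho\) and \({\mathcal {A}}_{\psi }\) for the two models:

```python
# Hypothetical potential correlations and psi autocorrelations for two models
rho1, Apsi1 = 0.60, 0.50
rho2, Apsi2 = 0.63, 0.48

# A_Y = rho^2 * A_psi for each model
AY1 = rho1**2 * Apsi1
AY2 = rho2**2 * Apsi2

# exact fractional difference in rho versus the differential estimate (12)
exact = (rho2 - rho1) / rho1
approx = 0.5 * ((AY2 - AY1) / AY1 - (Apsi2 - Apsi1) / Apsi1)
print(exact, approx)   # close, since the fractional differences are small
```

The factor of one half means that a given fractional difference in \(A_{Y}\) translates into only half that fractional difference in \(\rho\), before any offset from the difference in \({\mathcal {A}}_{\psi }\).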

From Table 1, the difference between actual and potential MSE is

$$\begin{aligned} e-\xi =e_{\chi \psi } +\sigma _{x}^{2}-\sigma _{y}^{2}\Rightarrow (1-\alpha )^{2}\sigma _{\chi }^{2}+(\sigma _{x}^{2}-\sigma _{y}^{2}) \end{aligned}$$

and, as noted elsewhere, \(e>\xi\) in (13) does not necessarily mean that \(\sigma _{y}^{2}<\sigma _{x}^{2}\); with assumptions (8) this would not be the case unless \(\alpha =1\) everywhere.
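The decomposition (13) and this caution can be illustrated with a hypothetical Monte Carlo example, with invented values of \(\alpha\) and the noise variances, in which \(e>\xi\) even though \(\sigma _{y}^{2}>\sigma _{x}^{2}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha = 0.2                              # hypothetical, weak predictable signal

chi = rng.standard_normal(n)             # predictable part of observations
psi = alpha * chi                        # predictable part of the forecast
x = 0.8 * rng.standard_normal(n)         # observation noise, sigma_x^2 = 0.64
y = 1.0 * rng.standard_normal(n)         # forecast noise, sigma_y^2 = 1.0 > sigma_x^2
X, Y = chi + x, psi + y

e = np.mean((psi - X) ** 2)              # actual MSE of the (large-m) ensemble mean
xi = np.mean((psi - Y) ** 2)             # potential MSE, = sigma_y^2 in this limit
e_chipsi = np.mean((psi - chi) ** 2)     # signal-error term e_{chi psi}
sx2, sy2 = np.var(x), np.var(y)

# decomposition (13): e - xi = e_{chi psi} + sigma_x^2 - sigma_y^2
print(e - xi, e_chipsi + sx2 - sy2)
```

Here the large signal-error term \(e_{\chi \psi }\) makes the actual MSE exceed the potential MSE despite the forecast noise variance exceeding the observation noise variance.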

These more general relationships offer interpretations of KEA’s results which differ from theirs to a greater or lesser extent. It is certainly the case that potential skill and other second order statistics must be interpreted with care, as discussed in KEA and at considerable length in Boer et al. (2018). The hope is that investigations of these kinds can provide new information on forecast system deficiencies and remedies.


Cite this article

Boer, G.J., Kharin, V.V. & Merryfield, W.J. Differences in potential and actual skill in a decadal prediction experiment. Clim Dyn 52, 6619–6631 (2019).



Keywords

  • Decadal prediction
  • Predictability
  • Skill
  • Potential skill