Evaluation of forecast strategies for seasonal and decadal forecasts in presence of systematic model errors
This study discusses and compares three different strategies used to deal with model error in seasonal and decadal forecasts. The strategies discussed are the so-called full initialisation, anomaly initialisation and flux correction. In the full initialisation the coupled model is initialised to a state close to the real-world attractor and after initialisation the model drifts towards its own attractor, giving rise to model bias. The anomaly initialisation aims to initialise the model close to its own attractor, by initialising only the anomalies. The flux correction strategy aims to keep the model trajectory close to the real-world attractor by adding empirical corrections. These three strategies have been implemented in the ECMWF coupled model, and are evaluated at seasonal and decadal time-scales. The practical implications of the different strategies are also discussed. Results show that full initialisation results in a clear model drift towards a colder climate. The anomaly initialisation is able to reduce the drift, by initialising around the model mean state. However, the erroneous model mean state results in degraded seasonal forecast skill. The best results on the seasonal time-scale are obtained using momentum-flux correction, mainly because it avoids the positive feedback responsible for a strong cold bias in the tropical Pacific. It is likely that these results are model dependent: the coupled model used here shows a strong cold bias in the Central Pacific, resulting from a positive coupled feedback between winds and SST. At decadal time-scales it is difficult to determine whether any of the strategies is superior to the others.
KeywordsForecast strategies Seasonal forecasting Decadal forecasting Anomaly initialisation Flux correction
Systematic model error is a difficult problem for seasonal forecasting and climate predictions. Systematic model error means that the climatology of the model is different to observed climatology. The term “climatology” refers to the probability density function of the climate, which is often characterised by its mean (the mean of a variable over a long period) and the variability around this mean state. The climatologies could therefore differ either in the mean and/or the variability. We will use the term climatology as the subspace of the phase space covered by the model trajectories over a long period of time (sometimes referred to as the attractor of the system). In a non-linear system, the different moments of the probability distribution are linked, and errors in the mean state could affect the variability of the system.
Systematic model error leads to difficulties in the forecasting process. Particular problems happen when transferring information between observation space and model space, namely the initialisation and the issuing of the forecast. At the initialisation stage, information needs to be transferred from observations to model space. When issuing the forecast, the model output needs to be calibrated using reliable information about the real world. In numerical weather prediction (NWP) the forecast covers typically the range 1–15 days, and, because of the relatively short forecast time, the difference between model and observed climatologies can be ignored (i.e. the model error is neglected). At longer lead times (monthly, seasonal and decadal time-scales) the systematic model error cannot be ignored. Due to the systematic model errors, the state of the model will drift away from the real-world attractor towards its own attractor, leading to a biased state. In these cases a strategy for accounting for the model bias is needed. In this study we present and compare different forecast strategies to cope with model bias.
For anomaly initialisation (purple), the aim is to initialise the model on its own attractor by adding the observed anomalies relative to observed climate (estimated from a set of re-analyses) to the model climatology (estimated by multiyear forecasts). Then by construction, the model drift should be avoided. This strategy is popular to initialise decadal predictions (Smith et al. 2007 among others).
The third alternative to be discussed here is flux correction (blue). The aim of this strategy is to avoid (or limit) the model drift by adding a correction term to the model during the simulation that pushes the model solution towards the climatology of nature. Although flux correction was widely used in early work with coupled global circulation models, it has largely been considered “taboo” by the scientific community since the paper by Neelin and Dijkstra (1995), where they argue that flux corrections could lead to non-natural variablity patterns by disturbing the feedbacks operating in a free dynamical system. Indeed, flux correction should be avoided if the aim is the study of coupled feedbacks, and can be misleading for model development. In this report we will discuss the results from a forecasting perspective, i.e. whether flux correction can deliver an improved forecast, which is a pragmatic point of view.
In this comparative study we discuss the practical difficulties and advantages of the different forecast strategies, with particular focus on seasonal and decadal forecasts. These three methodologies are applied to the ECMWF coupled model. In a companion paper (Magnusson et al. 2012b), the ENSO variability and its dependence on the model state is discussed, which have implications for the choice of forecast strategy.
This paper is organised as follows: the model system and experimental setup is described in Sect. 2; the three different forecast strategies in Sect. 3; the results from the different strategies regarding model drift in Sect. 4 and in terms of quality scores in Sect. 5. Finally the findings are discussed in Sect. 6.
2 Model setup and experiments
The model used for this study is the ECMWF IFS model (model version 36r1) coupled with the NEMO ocean model version 3.0 (Madec 2008). The atmospheric resolution is T159 (corresponds to an horizontal resolution of 150 km) and 91 vertical levels. The ocean uses the ORCA1 grid, a tripolar grid with a horizontal resolution of about 1 degree at mid latitudes, with a finer meridional resolution (approximately 0.3 degrees) at the equator. There is no prognostic sea-ice model, instead the sea-ice concentration is prescribed from observed states, randomly selected from any of the 5 years previous to the forecast starting date [for details see Molteni et al. (2011)]. The model runs include increased greenhouse gases following observed values. Tropospheric and stratospheric aerosols are included in the model only as fixed climatologies, so no account is taken of volcanic eruptions and changes in anthropogenic “pollution”. The model has not been specifically tuned to perform (near-term) climate simulations, as is the case for EC-EARTH (Hazeleger et al. 2010), which uses a similar model system.
The initial conditions for the atmosphere are provided by the ERA-40 reanalysis (Uppala et al. 2005) for starting dates prior to 1989, after which the ERA Interim reanalysis (Dee et al. 2011) is used. The ocean initial conditions are from a reanalysis based on NEMOVAR (Balmaseda et al. 2010b) oceanic reanalysis. The ocean reanalysis uses fluxes from the ERA-reanalyses as well as sub-surface observations. The forecast ensemble is constructed by applying stochastic physics (SPPT scheme) to simulate model uncertainties in the atmosphere (Palmer et al. 2009).
Experiment configurations for model climate estimations and flux-correction calculations
Forecast length (years)
Number of initial dates
Configurations for hindcast experiments
Forecast length (months)
Number of initial dates
Heat and momentum
Heat and momentum
In order to evaluate the forecast strategies, one set of hindcast experiments (Table 2) has been run on a decadal time-scale while another set is run on an annual time-scale with an increased number of start dates and ensemble members in order to get more reliable forecast statistics.
In this section we will explain the different forecast strategies used in this paper. As a background to the motivation of the different strategies, we will start with highlighting some of the systematic errors in the model.
In the tropical Pacific the cold bias is pronounced with a too intense cold tongue. This systematic error is of importance for this study due to its connection to the ENSO. In the western part of the tropical Pacific a warm bias is present at depth, extending all across the Equatorial Pacific at the depth of the thermocline (not shown). The pattern of the errors (cold surface and warm subsurface) is indicative of too diffuse thermocline. The warm bias is stronger in the western Pacific, as if the thermocline in the coupled model is not only more diffuse, but also has a strong zonal gradient. This error in the slope of the thermocline is the fingerprint of too strong zonal winds.
Figure 3b shows the wind bias in the StrongRelax experiment, which is reduced with respect to the free coupled model. This result clearly shows that if the SST bias is removed the wind bias in the tropical Pacific is reduced. The impact of the SST bias is especially strong in the western part of the basin where the wind bias is reduced by 50 % in the StrongRelax experiment compared with the Control. Note, though, that substantial wind biases are present even in the absence of SST bias.
The diagnostics of the structure of the temperature bias in the tropical Pacific together with the wind bias suggest that at least a part of the cold bias in the region results from a positive coupled feedback between winds and SST: too strong winds lead to an excess of upwelling, producing colder SST, which in turn produces stronger zonal winds. Disrupting this positive feedback is an important motivation for using momentum-flux correction in this study (see below).
3.1 Full initialisation
In numerical weather prediction, the normal procedure is to initialise the model from an analysis performed via data assimilation. The analysis is a combination of the latest observations together with a short-range forecast. By continuously using the information from the observations the analysis state is kept close to the real-world attractor (although in poorly observed areas, a difference could still be present). During the model integration, the state of the model will diverge from the true state both due to the loss of predictability and the development of systematic errors. In NWP the systematic error is assumed small compared to non-systematic errors and often the model output is not calibrated. At longer lead times (monthly, seasonal and decadal time-scales), the model will drift away from the real-world attractor towards its own attractor. In these cases the model bias cannot be neglected and the strategy for accounting for the model bias is the a-posteriori removal of it.
The bias is corrected by applying a lead-time dependent bias correction when post-processing. The bias correction made is also dependent on the seasonal cycle. This is the strategy commonly used in monthly and seasonal forecasts (Stockdale 1997). For example, in an operational seasonal forecast issued every month with a typical lead-time of 7 months, the estimation of 7 × 12 = 84 bias correction terms is needed to account for all lead times and all starting dates. The robust estimation of this large number of bias fields requires a large data set of hindcasts (retrospective forecasts). This strategy will fail if the bias is non stationary, and can lead to sub-optimal forecast skill. The non-stationarity of the bias may be due to non-stationary errors on the initial conditions (Kumar et al. 2012), or to flow-dependent bias arising from the non-linear nature of the system (Balmaseda et al. 2009). Generally speaking, if the systematic error bias is large enough, the non-linear terms will become non-negligible and therefore a mere linear calibration process will be insufficient.
The full initialisation strategy may also give rise to so-called initialisation shock, a term referring to the rapid adjustment processes in the initial phases of the forecast, that can produce non-monotonic behaviour in the model drift. The consequence of the initialisation shock is that at short lead times the error can be larger than at longer lead times. The cause of the initialisation shock is an imbalance between the initial conditions and the dynamics of the model.
In this paper the experimentation using full initialisation but without any flux correction will be referred to as FullIni.
3.2 Anomaly initialisation
Due to the difference in mean climate of the analyses and the model, a forecast initialised from an analysis will drift torwards the model climate. The drift towards the model attractor could result in an initialisation shock or over-shooting of a model drift further away than the model climate, before stabilising at the model climate. The idea of using anomaly initialisation is to avoid the model drift and initialisation shock (but not the model error). The procedure of using anomaly initialisation is to calculate anomalies in the analysis with respect to the analysis’ climatology and add such anomalies to the climate of the model. The method has previously been used in several studies e.g. Schneider et al. (1999), Pierce et al. (2004) and Smith et al. (2007). The rationale is to avoid an initialisation shock due to an initial state being far from the model attractor. Strategies for initialising imperfect models are also discussed in Toth and Peña (2007) in a simple model framework.
It is often emphasised that the advantage of the anomaly initialisation is the avoidance of initialisation shock. This is by no means guaranteed, since the structure of the observed anomaly may not be consistent with the model mean state. For instance, the largest anomalies in the observations are associated with displacement of sharp fronts or gradients. If, for example, the systematic error of the model is highly correlated with the misplacement of these fronts/gradients, simply adding an anomaly where it is not supposed to be found is not the same that initialising the model around its attractor. A sharp inconsistency is found also when the placement of the anomalies is associated with vertical displacements of the Equatorial thermocline. Another clear example is the application of an observed sea-ice anomaly in regions where the model never has sea-ice.
A more interesting advantage of the anomaly initialisation, which is often not discussed, is the avoidance of model drift. By avoiding model drift, the a-posteriori correction of the forecast does not require the bias dependence on the forecast lead time (so typically only the 12-month climatology of the bias is required), and the bias estimators can be more robust. This is more relevant for decadal forecast ranges, when it is also more expensive to conduct the calibrating hindcasts. The procedure requires however a long integration to estimate the model climatology.
The procedure of the anomaly initialisation is not without problems. First of all, if the non-linearities are strong, the calibration of the forecasts will face the same problems as the full initialisation, or even stronger, since the mean error is fully developed during the coupled model integrations. It is not guaranteed that the initialisation shock is removed, since it depends on how the anomaly is assimilated into the model (as discussed above). The other drawback is that the estimation of the anomaly requires the knowledge of the observed climatology. This introduces two kind of difficulties. On the one hand, it is important that the sampling period used for the observed climatology is consistent with that used for the model climatology (for instance, a model climatology estimated for the pre-industrial era should not be used for the anomaly initialisation of decadal forecast post 1960’s, with an observed climatology estimated during the period 1970–2005). The other kind of problem is related with defining the climatology of new or sporadic observations. For instance, some regions like the Southern Ocean had not been observed prior to advent of Argo (Roemmich and Owens 2000). Most of the deep ocean has only been observed sporadically with cruise data, and there is not enough information to extract a long term climatology. To avoid this problem, the anomaly initialisation strategy of the ocean often uses gridded fields from existing ocean reanalysis. In this way, it turns an initial weakness into a good advantage, since it means that different coupled modelling groups can initialise their decadal forecasts with external ocean reanalysis, without the need of having to develop data assimilation systems for their own models.
In our experiments, the anomalies are added to the full model state vector instantaneously at initial time, instead of assimilating only temperature and salinity anomalies with a certain time-scale, usually by means of relaxation techniques, as in e.g. Pohlmann et al. (2009). One potential problem with using the instantaneous full state vector is that the new state could also not lie on the model attractor, creating instabilities in the adjustment process that may lead to the quick disappearance of the anomalies. But none of the results from these experiments suggest that this might be the case.
To obtain a model climate for the anomaly initialisation procedure, the Control experiment was conducted, consisting of a set of 25-year forecasts from 3 different initial dates and 3-ensemble member each (i.e. a total of 6 25-year forecasts initialised in 1965 and 1975 and 3 23-year forecasts initialised in 1985). The ensemble mean of these Control integrations was used to compute the model climate. The first 10 years of the simulations have been discarded in order to let the model drift to its own climatology. This may still be a short period for the drift in the deep ocean to be fully developed, but it is sufficient for obtaining a stable state in the upper ocean, which is the prime objective of this study. An observed ocean climatology has been calculated from ocean reanalysis, spanning the same time period used in the estimation of the model climate. Using the same period yields data sets with the same impact of greenhouse gases. The model and analysis climate is calculated for the actual day of the year used for the initialisation (November 1st in our case).
The forecasts using anomaly initialisation will be referred as AnoIni.
3.3 Flux correction
It is clear from a variety of studies that strong non-linear interactions (e.g the relationship between SST and atmospheric convection) between mean state and anomaly are at play in the coupled model forecasts. For instance, Balmaseda et al. (2010a) show that the atmospheric response to a given sea-ice anomaly depends on the atmospheric mean state, which in turn is conditioned by the underlying SST. In their study they show that correcting the SST in the North-Atlantic, where the model has errors due to the wrong position of the Gulf Stream, has large impact on the atmospheric mean state and in the atmospheric response to the Arctic sea-ice anomaly. Results along these lines are documented by Scaife et al. (2011) and Keeley et al. (2012).
Model improvement is the ultimate way of reducing model biases. However this is a slow process, especially if the systematic errors are related to model resolution (as in the case of the correct Gulf Stream). A temporary solution, until the problems in the model are detected and solved, is to compensate for the systematic errors by applying empirical corrections.
One specific correction is the so-called flux correction, applied only in the coupling between the atmosphere and the ocean. The aim of the strategy is to avoid (or limit) the model drift by adding a correction term to the model during the simulation that pushes the model solution towards the observed climatology. The aim is to keep the forecast errors as linear as possible, to be able to apply simple calibration techniques. In this strategy the empirical correction of the forecast is done during the model integration rather than only in the final calibration phase. Ideal candidates for this strategy are those situations where the model exhibits a very clear mean error, difficult to correct by improving the model formulation but relatively easily by applying empirical correction. In these cases flux correction may be a successful forecast strategy. The use of flux correction has recently been discussed in Spencer et al. (2007) and Manganello and Huang (2009).
In this study we will investigate two options, namely of using only momentum-flux correction (Ucorr) and a combination of momentum and heat-flux correction (UHcorr). The flux correction has been calculated in two steps. First, the strong SST relaxation simulations initialised in 1965 and 1975 have been used in order to calculate the wind stress errors when the SST is constrained to the reanalysis (see Fig. 3b). The momentum-flux correction has been calculated as the difference between the StrongRelax forecasts and the surface stresses from the reanalysis. The first 5 years of each SST-relaxation simulation have not been used in order to let the atmospheric model drift. In a second step, a set of 25-year forecasts (WeakRelax) has been run applying the momentum-flux correction and a weak SST-relaxation (40 W/K), in order to calculate the heat-flux correction required to correct the coupled SST when the momentum-flux correction applied. The heat-flux correction is calculated from the SST relaxation coefficients. This strategy yields a heat-flux correction suitable to be used together with momentum-flux correction and that partly accounts for the feedback effects.
The corrections are added to the fields passed from the atmosphere to the ocean: the wind stress seen by the ocean has been modified by the momentum-flux correction, and the heat-correction has been applied to the non-solar component of the heat flux received by the ocean.
The lower panels in Fig. 4 shows the heat-flux correction required with the momentum-flux correction applied, for January (Fig. 4c) and September (Fig. 4d). Positive correction means that heat is added to the ocean and the global annual mean is 3.9 W/m2. Regionally, positive corrections are needed for the winter hemispheres in the subtropics. Large heat-flux corrections are also needed in the vicinity of the western boundary currents in both the Atlantic and the Pacific. In the Southern Ocean a negative flux correction is needed due to the warm bias in the model. Regarding the tropical Pacific, the heat-flux correction is weak in January, while there is a strong positive correction during September, especially in the eastern part. Close the south-American coast the correction is negative for both January and September.
4 Model drift
In the previous section, the model climate and its systematic errors were introduced. In this section we will discuss the evolution of the systematic errors as a function of forecast lead-time, usually referred to as the model drift. We will discuss the global temperature drift as well as focusing on the drift in the tropical Pacific.
For the FullIni experiment (red), the SST drifts towards colder values as expected from the cold bias for the model climate. The drift has different time-scales; most of the drift appears during the first two years with a slower drift acting on the decadal time-scale. For the AnoIni experiment (pink), a substantial part of the model drift is avoided by initialising the model close to its attractor, which is the primarily aim of of the strategy. The AnoIni experiment has a slightly colder mean state than the FullIni even after 8 years, which is a sign of a drift acting on long time-scales.
The results for the Ucorr (green) experiment indicate a delayed drift compared to FullIni, which can be related to the tropical Pacific (see below). For the longest forecast range, the SST bias is comparable to FullIni. The UHcorr experiment shows a much improved model climate, which is an expected result from using heat-flux correction. The small difference between the reanalysis and the UHcorr experiment is believed to be due to the differences in the SST data set used to calculate the flux correction and the SST in the reanalysis, but it can also be an artifact of sampling errors and non-stationarity of the record.
In Fig. 5c, the ensemble mean of bias-corrected forecasts is plotted, using a lead-time dependent correction similar to the bias in Fig. 5b. After applying the bias correction, it is difficult from this figure to determine whether any forecast method is better or not. The forecast quality will be discussed in Sect. 5.
Studying the results for FullIni (Fig. 6a) and AnoIni (Fig. 6b), the general features are a cold bias in the upper-ocean and a warm bias at depth. For the FullIni experiment, the cold bias appears on a much faster time-scale than the warm bias at deeper levels. Magnusson et al. (2012a) argue that the fast development of the surface bias is believed to be related to the initial imbalance in the radiation at the top of the atmosphere together with a fast development of the enhanced cold tongue in the Tropical Pacific ocean surface. The slow component of the drift could be due to a too strong vertical mixing in the ocean. The location for the warm bias is generally in the extra-tropics. Because of the different time-scales of the cold and the warm bias, the integrated heat-content decreases during the first years of integration for FullIni, while the bias for AnoIni is much more stable.
For the AnoIni experiment, as expected, the initial conditions already show the bias structure of the coupled model, but the warm bias continues to develop. One possible explanation is that the model climate has been calculated before the coupled model reached equilibrium, and therefore the drift in the deep ocean continues after 25 years, which would be consistent with the slow time-scale of errors associated with vertical mixing in the ocean interior. Another possible explanation is the non-stationarity of the model error during the period considered: it may be possible that the response in the model to greenhouse radiative forcing is too strong, which, combined with the errors in the ocean vertical mixing, results in too strong warming of the deep ocean in the model. This would explain why the AnoIni decadal simulations exhibit a correct warming trend in SST, while the trend in the 700-m integrated heat content is stronger than in the ocean reanalysis (Fig. 7a, pink curve). This error is particularly apparent in the AnoIni decadal simulation, where at initial time the cold bias in the upper ocean is already developed. A third explanation is the poor avaliable sample of decadal forecast start dates, insufficient for a robust estimation of model drift, in particular in the presence of outliers such as the failure to predict the strong post-Pinatubo cooling of the ocean apparent in the ocean reanalysis. This third possiblity should not be discarded, especially since the verifying ocean reanalysis and radiative forcing can also be in error.
For the Ucorr experiment, the cold bias at the surface is similar in magnitude to FullIni, while the warm bias below is somewhat weaker. This makes the integrated heat content even colder than for the FullIni experiment, which has a stronger compensation of errors. Comparing FullIni and Ucorr, the impact of the momentum-flux correction is mainly in the Pacific (not shown). By changing the speed of the gyre, the Ucorr experiment has reduced biases at this depth, especially around the western boundary current in the northern Pacific.
The UHcorr experiment shows a reduced bias at the surface as expected from the flux correction design. However, the UHcorr experiment also shows a weaker warm bias at depth compared with FullIni, while a simple linear argument would have suggested the opposite since the heat-flux correction is adding heat to the ocean. This is a clear example of the non-linear processes at play in the coupled system. A warmer ocean surface would result in stronger stratification, then reduced mixing and less heat uptake. There is also the possibility that as a result of the warmer surface, more heat is released back into the atmosphere.
Figure 7c shows the forecast time-series of the 700-m heat content for the ensemble mean with a bias correction applied. Studying the bias-corrected forecasts (Fig. 7c), the general feature is a faster warming in the latter part of the time-series than in the early part. However, all forecasts miss the cooling of the oceans in 1992 and the fast warming started in year 2000.
The systematic model errors in the Tropical Pacific and its the effect on the variability was extensively discussed in Magnusson et al. (2012b). The main finding was that the ENSO variability is clearly suppressed by the systematic errors in the model and by using flux correction the variability was increased. In order to evaluate ENSO forecasts, the average SST for different areas are commonly used. In this study we will refer to Niño3 (150°W–90°W,5°N–5°S) in the eastern part of tropical Pacific; Niño3.4 (170°W–120°W,5°N–5°S) in the central part and Niño4 (160°E–150°W,5°N–5°S) in the central-western part of the basin.
For the Ucorr experiment, the bias during the first year is clearly reduced, compared to FullIni. The reduction is clearest in the eastern part where the applied momentum-flux correction is due to the change in the amount of upwelling cold water. In the western part, the additional heat-flux correction plays a role as Ucorr develops a stronger bias than UHcorr.
In order to evaluate the model drift on a longer time-scales, the model bias as a function of lead time has been calculated from the decadal simulations, using 3 ensemble members. Figure 8c and d shows the model drift for the first 5 years of these simulations for Niño3 and Niño4 respectively. Comparing with the 14-month simulations, the data from the decadal simulations is much noisier, due to a reduced number of initial dates and ensemble members. However, the same features stand out as in the 14-month data sets. The overshooting for FullIni for Niño3 is also present here and remains into the second year. For longer forecast ranges (year 2 and onwards), the sub-surface bias structure has developed in the FullIni experiment and the period of over-shooting is over. After that the AnoIni is somewhat colder than FullIni due to the generally colder ocean as seen in Fig. 5b.
For Ucorr and UHcorr the positive bias after one year is much more pronounced in the decadal simulations. For the UHcorr experiment the bias is about 2 Kelvin. For the following year a corresponding cold bias is present. One could speculate that it could be a sign that the flux correction is over-doing the correction for the first year, which can not be discarded since the flux correction was estimated from a long coupled integration (i.e, with errors fully developed). It can be also an artifact of poor sampling, the results being affected by some outliers (false alarms for El Niño event or a missed La Niña). For longer lead times the mean state for Ucorr is colder than UHcorr but still not as biased as FullIni.
5 Forecast quality
In this section, the forecast quality on seasonal and decadal time-scales will be discussed, starting with the predictability of ENSO on seasonal time-scales. The predictability of ENSO is a key for the predictability of other regions due to ENSO’s strong teleconnections. For all scores presented here the bias has been removed by applying a lead-time dependent bias correction. For the decadal time-scale, results from Ucorr will not be discussed due to only 3 available ensemble members. In the evaluation of the decadal experiments, the data has been detrended hence the main interest in this study is the effect of the forecast strategy and not the response of increased greenhouse gases.
For AnoIni, which has only a small model drift, the amplitude ratio is close to 1. This shows that the strategy implemented here for anomaly initialisation is able to retain the information for about 6–7 months, after which the amplitude of the anomaly converges to the same values as seen for FullIni, and probably for the same reasons (a too cold mean state favouring weak ENSO varibility). These results indicate that even with anomaly initialisation the features of the coupled model depend on the forecast lead time: although the mean state is relatively stable, and therefore there is not drift in the mean, there is a drift in the variability, which is larger at the initial time (when the anomalies are from the “real-world”), since the model attractor will not contain the same sort of anomalies.
For Ucorr and UHcorr, an overestimation of the amplitude is present around forecast month 6, but not as strong as in the FullIni: the seasonal momentum-flux correction will enforce the trade wind relaxation. For UHcorr, the amplitude ratio increases after one year. This manifests ifself in the overprediction of El Niño events at forecast ranges longer than 1 year.
For global temperatures (Fig. 12a), skill is present throughout the first year. This is mainly due to persistence of initialised anomalies and prediction of ENSO, which has an influence on the global mean temperature. For longer time-scales the effect of increased greenhouse gases plays an important role (Hawkins and Sutton 2009), and has been removed by linear detrending in the calculation of these scores.
The region with the highest predictability for month 2–4 is the tropical Pacific that benefits from the predictability of ENSO (see above). For the second half of the first year, UHcorr shows the worst performance for Niño3.4 (Fig. 12b). This could be related to this strategy producing too much ENSO variability at longer lead times, as indicated by a model drift and a high amplitude ratio (see above).
Figure 13 shows the ACC for forecasts of 2-m temperature on a multiyear time-scale averaged over the forecast period year 2–5. This could be seen as a measure of skill of decadal forecasts, even if the period is only four years. One area with enhanced predictability on this time-scale is the North Atlantic, in agreement with other studies (e.g. Pohlmann et al. 2004; Mochizuki et al. 2010; van Oldenborgh et al. 2012). Although the figures show that the UHcorr and FullIni perform somewhat better than AnoIni, it would be dangerous to draw conclusions without a better understanding of the causes. One could speculate that this is due to a more realistic strength of the MOC on this time-scale, but in general the MOC is not well represented in the coupled model, nor in the initial conditions. Some skill is surprisingly present in the southern Indian Ocean, where other coupled models show little skill after the linear detrending (e.g. van Oldenborgh et al. 2012). In fact, it seems that the positive skill extends to the south-eastern Pacific and southern Atlantic. These results would suggest that the higher skill at 2–5 years is not exclusive to the North Atlantic, but more generally of the mid latitudes. The decadal predictability in the Indian Ocean is further discussed in Guemas et al. (2012). Some skill for year 2–5 is also present in the Tropical Pacific for all experiments (the highest skill for AnoIni); the pattern of the positive skill (latitudinally broad horseshoe, with little amplitude in the Eastern Pacific) is reminiscent of the decadal ENSO signal (Power et al. 1999, 2006).
For year 6–9, the UHcorr experiment seems to have higher skill in the Niño3.4 area than FullIni and AnoIni. It is difficult to judge whether this signal is robust and a deeper investigation is needed to find the reason for this increased predictability.
6 Summary and discussion
In the presence of systematic model errors, a forecast strategy needs to be applied to deal with bias. This study discusses different forecasts strategies, and compares the results when applied to seasonal and decadal forecasting. The standard forecast strategy for numerical weather prediction, monthly and seasonal forecasts is full initialisation, where the model is initialised from a state close to the real-world attractor. In the presence of systematic model errors, the model, once initialised, will drift towards the model attractor. For short lead times, as for medium-range forecasts, the systematic error is usually ignored, since it is considered small compared to the error growth of the initial conditions. However, at longer lead times (monthly and seasonal) this is no longer the case. If the difference between the model and the real-world attractor is large, a different forecast strategy is needed: the direct model output needs to be calibrated in order to issue the forecasts. Initialisation and calibration can be considered two different aspects of the forecast strategy. In this study we assume the simplest calibration strategy (the removal of the mean bias a posteriori) and compare different forecast strategies, focusing on their effect on the forecasting quality.
We have compared full initialisation, anomaly initialisation, momentum-flux correction and heat and momentum-flux correction. While the full initialisation simulations are initialised close to the real-world attractor, anomaly initialisation aims to initialise the model on its own attractor by attaching observed anomalies to the model climate. The purpose of this strategy is to avoid the model drift and possible non-linear effects of the model drift such as over-shooting and initial shocks. Another strategy to avoid or reduce model drift is to apply flux correction in the coupling between the atmosphere and ocean. This will act as an artificial energy and/or momentum source or sink with a seasonal cycle.
The model system used for this study (ECMWF atmospheric model version 36r1 + NEMO ocean model version 3.0) shows a general cold bias. The bias is due to imbalance in the energy flux at the top of the atmosphere. A part of the atmospheric temperature bias can also be attributed to a strong uptake of heat by the ocean. One sign of this is a warm bias in the oceans below 200 m depth. The coupled model also has an enhanced cold bias in the tropical Pacific caused by too strong easterlies yielding a positive feedback to the Walker circulation (Bjerknes 1969). The cold bias has a strong influence on the ENSO variability, which is discussed for the present model system in Magnusson et al. (2012b).
Comparing the forecast quality for ENSO prediction (SST indices) on seasonal time-scales, the best results are obtained using momentum-flux correction. The worst results are found with the anomaly initialisation, especially in the western part of the tropical Pacific. For the second half of the first year, the worst scores are found for the heat and momentum-flux corrected experiment. This could be related to erroneously triggering of El Niño events, for which we have found evidence by studying the model drift of this experiment.
Comparing precipitation scores for the tropical Pacific, we see a clear disadvantage for the anomaly initialisation. By avoiding the model drift, the mean state for the anomaly initialisation experiment is always in a cold state. The SST bias leads to a strong supression of convective precipitation. Even with a warm SST anomaly, the SST is too cold to trigger convection. Here we have a clear advantage for the flux-corrected experiment, which is closer to the observed mean state and has a better precipitation response to warm events (c.f. discussion in Magnusson et al. 2012b).
Looking at decadal time-scales and comparing full initialisation, anomaly initialisation and heat and momentum-flux correction, even with a clear difference in mean climate, the differences in scores are small and uncertain due to a limited set of hindcasts. For the detrended data, some skill is present for the northern Atlantic, tropical Pacific and southern Indian Ocean for year 2–5. However, it is hard to verify any systematic differences between the experiments on this time-scale. This result is in line with the findings in Smith et al. (2012), where full initialisation and anomaly initialisation were compared on the decadal time-scale using a different model system and more initial dates for the hindcast.
All methods investigated in this report have advantages and disadvantages, both with regard to results and from a practical point of view. By using full initialisation the model will drift from the attractor of the analysis to the attractor of the model. During this drift the properties of the forecasts change with lead time, both in the sense of mean and variability. In order to correct for this one needs to apply a lead-time dependent bias correction. The development of the bias might well depend on the state (i.e. be conditional), and optimally one should account for this, although sampling considerations are an obstacle. In this study we found evidence of ”over-shooting” model drift in the eastern Pacific at certain forecast ranges.
The anomaly initialisation removes most of the model drift in the global SST and a large part of the model drift in the ocean heat content. The skill in the ENSO forecasts indicates that the anomalies are initialised correctly, altought the scores are worse than using full initialisation. By using the anomaly initialisation the model is always in an erroneous state. It severely affects interactions in the climate system, as seen for the ENSO variability in Magnusson et al. (2012b) and the precipitation scores for Niño3.4 in this report. For a practical point of view, long simulations are needed to obtain the model climate. The main problem is, however, the limited period for which the analysed climate can be defined, due to limitations in past ocean observations.
The momentum-flux correction is mainly aimed to compensate for the wind bias in the tropical Pacific, even if the correction is applied globally. For the mean climate and the skill scores for the tropical Pacific we see a clear improvement. This is also documented in Molteni et al. (2011), for a slightly different model configuration. Using the combined heat and momentum-flux correction, the mean climate as well as the variability is well improved. However, it is difficult to see improvement in the scores. The practical downside of flux correction is the estimation of the corrections. This is not straight-forward, especially for a combination of heat and momentum-flux correction.
In this study we carried out retrospective forecasts (hindcasts), covering the same period that the model climate for anomaly initialisation and the flux-corrections are estimated for. In a world where the climate is changing due to long time-scale variability and/or increase of greenhouse gases this approach is suboptimal. All three methods compared here are affected by the non-stationarities associated to a changing climate. It can be argued that in this assessment the flux-correction and anomaly initialization strategies have not been fully tested in cross-validation mode, since the flux correction, model climate and analysis climate were estimated using data included in the validation period. And even if the bias correction is run in cross-validation mode, a change in climate could affect the bias correction because the bias correction is based on the hindcast period. Therefore all three methods are affected by a non-stationary climate and it is difficult to evaluate which method is most sensitive to a changing climate.
In this study we have not found clear evidence for any universal and easy solution for forecasting in the presence of systematic error. Flux correction (especially momentum-flux correction) has a positive effect on the seasonal time-scale for the model system used here, but this result is highly dependent on the type of systematic errors in the model and may not hold true for other models. For the decadal time-scale only small differences are present between the strategies and no strategy shows a clear advantage. This could be related to the limited sample of starting dates and the limited scope for predictability on these scales, once the effect of increased greenhouse gases has been removed. To conclude, it is still an open question whether the choice of forecast strategy matters for the decadal time-scale.
We would like to acknowledge Kristian Mogensen for help during this project. We would also like to thank Rob Hine and Anabel Bowen for help with the preparation of the figures for this report. This work was supported by the EU-funded projects COMBINE (FP7/2007-2013) under grant agreement 226520, and THOR (FP7/2007-2013) under grant agreement 212643.
- Balmaseda M, Ferranti L, Molteni F (2009) Air-sea interaction in seasonal forecasts: some outstanding issues. ECMWF workshop on ocean-atmosphere interactionGoogle Scholar
- Balmaseda MA, Mogensen K, Molteni F, Weaver AT (2010b) The NEMOVAR-COMBINE ocean re-analysis. COMBINE Technical Report 1, FP7 COMBINE projectGoogle Scholar
- Guemas V, Corti S, Garcìa-Serrano J, Doblas-Reyes FJ, Balmaseda M, Magnusson L (2012) The Indian Ocean: the region of highest skill worldwide in decadal climate prediction. J Clim. doi:10.1175/JCLI-D-12-00049.1
- Hazeleger W et al. (2010) EC-Earth: a seamless earth-system prediction approach in action. Bull Am Meteor Soc 91:1357–1363Google Scholar
- Keeley SPE, Sutton RT, Shaffrey LC (2012) The impact of North Atlantic sea surface temperature errors on the simulation of North Atlantic European region climate. Q J R Met Soc 668:1774–1783Google Scholar
- Kumar A, Chen M, Zhang L, Wang W, Xue Y, Wen C, Marx L, Huang B (2012) An analysis of the non-stationarity in the bias of sea surface temperature forecasts for the NCEP climate forecast system (CFS), version 2. Mon Wea Rev 140:3003–3016. doi:10.1175/MWR-D-11-00335.1
- Madec G (2008) NEMO reference manual, ocean dynamics component: NEMO-OPA. Note du Pole de Modelisation 27, Institut Pierre-Simon Laplace (IPSL), FranceGoogle Scholar
- Magnusson L, Balmaseda MA, Corti S, Molteni F, Stockdale T (2012a) Evaluation of forecast strategies for seasonal and decadal forecasts in presence of systematic model errors. Technical Memorandum 676, ECMWFGoogle Scholar
- Magnusson L, Balmaseda, MA, Molteni F (2012b) On the dependence of ENSO simulation on the coupled model mean state. Clim Dyn. doi:10.1007/s00382-012-1574-y
- Molteni F et al. (2011) The new ECMWF seasonal forecast system (System 4). Technical Memorandum 656, ECMWFGoogle Scholar
- Palmer T, Buizza R, Doblas-Reyes F, Jung T, Leutbecher M, Shutts J, Steinheimer GM, Weisheimer A (2009) Stochastic parametrization and model uncertainty. Technical Memorandum 598, ECMWFGoogle Scholar
- Smith DM, Eade R, Pohlmann H (2012) A comparison of full-field and anomaly initialization for seasonal to decadal climate prediction. Submitted to Clim DynGoogle Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.