1 Introduction

Decadal predictability has become a topic of increasing interest as evidenced by the fact that the upcoming 5th IPCC Assessment report will include one chapter on predictability of natural climate variations on annual to decadal time scales and near term climate change. Natural variability in the upcoming 10–30 years is believed to be of similar magnitude to the anthropogenic warming signal during this period. Projections of future greenhouse gas concentrations do not show large deviations for the next one to three decades, which is in contrast to the second half of the twenty-first century. Thus, natural decadal variations might carry a large fraction of the uncertainties in climate change projections of the near future.

Observation-based data sets, proxy-based reconstructions as well as model simulations indicate the existence of climate variations on decadal to multi-decadal time scales and build the physical basis for decadal predictions (Meehl et al. 2006; Latif et al. 2006a; Hurrell et al. 2009).

A reconstruction by Mann and Jones (2003) for the last two millennia showed multi-decadal variations of global mean temperature. Most of these multi-decadal variations are probably due to internal processes within the ocean and ocean–atmosphere interactions. Analysis of a 300-year proxy based reconstruction and comparison to instrumental records and model simulations by Delworth and Mann (2000) showed that sea surface temperature (SST) over the North Atlantic is the primary carrier of the multi-decadal signal although a global but much weaker signal exists. Multi-decadal variations in the meridional overturning circulation (MOC; Cunningham et al. 2007; Bentsen et al. 2004; Baehr et al. 2009) are closely related to the SST-variations. The physical mechanism behind the multi-decadal variations is not clear yet. Results from Delworth and Greatbatch (2000) indicate that multi-decadal variations in the MOC can arise as oceanic response to stochastic surface flux forcing while Timmermann et al. (1998) and Latif et al. (2006b) emphasized the importance of low-frequency variations of the North Atlantic Oscillation (NAO). Vellinga and Wu (2004) highlighted the importance of coupled processes in the tropical Atlantic and Latif (2001) found some evidence for remote forcing by the tropical Pacific.

Since multi-decadal variations seem to be most pronounced in the North Atlantic, most decadal predictability studies have focused on this region (Pohlmann et al. 2004; Collins et al. 2006; Keenlyside et al. 2008). They all show that MOC and SST in the North Atlantic Ocean region are highly predictable for more than one decade.

Pohlmann et al. (2006) analyzed the impact of the MOC on European climate in a global coupled model and found a significant response, indicating the importance of the MOC for decadal predictions over Europe. However, potential decadal predictability experiments by Pohlmann et al. (2004) did not show any significant predictability over Europe. Boer (2004) obtained similar results and identified the North Atlantic Ocean and the Southern Ocean as the main regions with decadal predictability. Furthermore, some predictability of air temperature showed up over the North Pacific.

In this study, we analyze the potential decadal predictability in the global coupled model EC-Earth by performing perfect model experiments. EC-Earth will contribute to the decadal predictions in the upcoming IPCC-AR5. This study is idealized since no observations are used to initialize the experiments and only the predictability of the model climate, not of real-world conditions, is calculated. Under the assumption that the model realistically simulates real climate conditions, the potential predictability can be seen as an upper limit of predictability. Knowing this upper limit is important for realistically estimating what decadal predictions can achieve. Furthermore, it helps to diagnose regions where decadal predictions are not skilful although the potential is there, thus indicating whether it is meaningful to use different initialization or perturbation techniques for decadal predictions.

We also analyze the impact of changes in sea ice parameterizations on the potential predictability, thereby testing its robustness.

The article is organized as follows. Section 2 describes the model, the experiments and the method used to analyze the potential predictability. Section 3 shows the results of the predictability experiments. Section 4 analyzes the major mechanisms leading to predictability and possible processes that might lead to uncertainties in the predictability. Section 5 provides a summary and conclusions from this study.

2 Model, experiments and method

2.1 Model description

The model used in this study is the newly developed global coupled climate model EC-Earth (Hazeleger et al. 2010). It consists of the Integrated Forecast System (IFS) of the European Centre for Medium Range Weather Forecasts (ECMWF) as atmosphere component and the Nucleus for European Modelling of the Ocean (NEMO) developed by the Institute Pierre Simon Laplace (IPSL) as the ocean component (Madec 2008).

The atmosphere component is used in T159 resolution and has 62 vertical levels. It is based on development cycle 31r1 of IFS, but includes some improvements from later cycles. The most important ones are a new convection scheme (Bechtold et al. 2008), the new land surface scheme H-TESSEL (Balsamo et al. 2009), and a new snow scheme (Dutra et al. 2010).

The ocean component uses a tri-polar grid with poles over northern North America, Siberia and Antarctica with a resolution of about 1 degree and 42 vertical levels. It is based on NEMO version 2.0 and it includes the Louvain la Neuve sea ice model version 2 (LIM2, Fichefet and Morales Maqueda 1997; Bouillon et al. 2009), which is a dynamic-thermodynamic sea ice model.

The atmosphere and ocean/sea ice parts are coupled through the OASIS (Ocean, Atmosphere, Sea Ice, Soil) coupler (Valcke 2006).

In this study, we use version 2.1 of EC-Earth, which is almost identical to version 2.2, the version used to perform the IPCC-AR5 simulations. Changes made in 2.2 compared to the version used here are mainly related to requirements of the CMIP5 simulations, such as aerosol forcing data for 1850–2009 and time-dependent ozone forcing. Furthermore, version 2.2 includes some changes in the snow parameterization over land, which are similar to IFS-cycle 36.

2.2 Experiments

2.2.1 Experiment 1 (EXP1)

We performed a 350-year control integration with EC-Earth version 2.1 (CTRL1) using present day forcing, where greenhouse gas concentrations are kept constant at the year 2000 level. The ocean was initialized from a 500-year uncoupled spin-up run with the ocean model using restoring of sea surface temperature and sea surface salinity towards climatology. The years 151–350 of the coupled present day simulation are analyzed. Figure 1 shows the annual mean climate in CTRL1 compared to ERA-40 reanalysis. The large scale atmospheric sea level pressure (SLP) patterns are generally well simulated. However, compared to ERA-40, SLP over the Pacific Arctic sector is underestimated by up to 3 hPa and overestimated in the Central Arctic by up to 1 hPa. SLP in the northern hemispheric subtropical and mid-latitude regions is partly overestimated, particularly over the North Atlantic where the bias reaches up to 3 hPa. In the tropics, the discrepancies are mainly below 1.5 hPa but over India pressure is up to 3 hPa higher in the model compared to ERA-40. In most southern hemispheric subpolar regions, SLP is lower than ERA-40. Over Antarctica, SLP is overestimated by up to 4 hPa (although this might be due to elevation effects in either model or reanalysis).

Fig. 1 Annual mean SLP, T2m and precipitation in CTRL1 (a, c, e, mean of years 151–350) and differences between CTRL1 and ERA-40 (b, d, f). Bottom March (g) and September (h) sea ice thickness in CTRL1. The magenta line indicates the sea ice extent (>15% concentration)

The 2 m air temperature (T2m) in EC-Earth shows a cold bias of about 1–3°C in most of the tropics, reaching locally up to 5°C over India and along the west coasts of the continents. Also in the northern North Atlantic and along the west coasts of North and South America, it is several degrees too cold. Over the northern North Pacific and the Siberian Arctic, EC-Earth is warmer than ERA-40 by up to 3°C. A positive bias of up to 5°C exists over the Southern Ocean around Antarctica, which leads to too little sea ice in the southern hemisphere. The colder temperatures in the tropics lead to reduced precipitation there in EC-Earth, by up to 50% and locally even more. In the subtropics, precipitation is generally overestimated. The very large values of more than 100% occur in regions where almost no precipitation exists. In a few mid-latitude regions and particularly over Antarctica, precipitation is overestimated in the model by 10–50%, in Antarctica up to 100%.

Sea ice thickness seems to be somewhat overestimated in most of the Arctic. In the absence of comprehensive data, we rely here on the established view of recent climatological ice thickness, based on various observations and analysis methods (Belchansky et al. 2008; Rothrock et al. 2003). The ice in EC-Earth is thickest north of Greenland and the Canadian Archipelago with up to 6 m in late winter. The ice extent is comparatively well simulated with a partly ice-free Barents Sea but extends slightly too far south in the Greenland and Iceland Seas in winter. The summer sea ice extent in the European Arctic sector agrees well with satellite observations but too much ice is left at the Siberian coast. Here, the ice is too thick both in summer and winter, which is a common problem of coupled climate models and at least partly related to the reduced SLP-gradient across the Arctic (DeWeaver and Bitz 2006).

In addition to the control simulation, perfect ensemble experiments are performed. The ensembles are initialized from different years of the control simulation because the predictability skill may strongly depend on the initial state (Palmer 1993; Reichler and Roads 2003). In total four ensembles each consisting of six members (seven with the control simulation, thus 28 members total) are performed. All runs are initialized on 1st January and run for 10 years. In the following, we will call this ensemble EXP1.

The ensemble members are perturbed by a slight change in the vertical diffusivity coefficient of the ocean. We changed the respective member by ±1 × 10⁻⁴, 2 × 10⁻⁴, 3 × 10⁻⁴ % of the vertical diffusivity coefficient in the first month of the simulation. From the second month onwards all ensemble members used identical code. This very small change in the vertical diffusivity leads to a perturbation of the sea surface temperature and consequently the SLP. Already in the first month after initialization, SLP differs by up to 2 hPa from the control simulation; after 3 months, the differences in monthly mean values reach locally up to 10 hPa. We also tested direct perturbations of the initial SST and sea ice fields by adding randomly created small values (±0.01°C, ±1% concentration). However, all tested perturbation techniques resulted in perturbations of comparable size of SST and SLP in the first months. Apparently, it does not matter how the system is perturbed if the perturbation is sufficiently small. The advantage of perturbing the system by changing the vertical diffusivity for one month is that we do not change the initial energy and freshwater balances and need not worry about possible problems like reaching the freezing point in open water regions or sea ice concentrations below 0 or above 100%.
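The direct perturbation of the initial fields mentioned above, and the bookkeeping it requires, can be sketched as follows. This is a minimal illustration only: the function name, the uniform noise distribution, the freezing point of −1.8°C and the clipping strategy are assumptions made for this example and are not taken from the model setup.

```python
import numpy as np

def perturb_initial_state(sst, ice_conc, rng, sst_amp=0.01, conc_amp=1.0,
                          freezing_point=-1.8):
    """Add small random perturbations to SST (deg C) and sea ice concentration (%).

    Hypothetical helper: amplitudes follow the +/-0.01 deg C and +/-1 %
    quoted in the text; everything else is an illustrative assumption.
    """
    sst = np.asarray(sst, dtype=float)
    ice_conc = np.asarray(ice_conc, dtype=float)

    # Uniform noise within the stated amplitudes (distribution is assumed).
    sst_p = sst + rng.uniform(-sst_amp, sst_amp, size=sst.shape)
    conc_p = ice_conc + rng.uniform(-conc_amp, conc_amp, size=ice_conc.shape)

    # Extra checks that a direct perturbation makes necessary:
    # keep open water at or above the (assumed) freezing point ...
    sst_p = np.maximum(sst_p, freezing_point)
    # ... and keep the concentration within its physical range of 0-100 %.
    conc_p = np.clip(conc_p, 0.0, 100.0)
    return sst_p, conc_p
```

Clipping of this kind slightly alters the energy and freshwater content of the initial state, which is the drawback that the diffusivity-based perturbation avoids.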

Note that we do not aim to create realistic initial uncertainties as they exist in observations; instead, we aim for an unrealistically small perturbation of otherwise perfect initial conditions that spreads and amplifies in the climate system.

2.2.2 Experiment 2 (EXP2)

In order to analyze the robustness of the predictability to changes in the parameterizations, sea ice albedo of the solar radiation has been reduced by 0.03 for melting sea ice, freezing sea ice and freezing snow. We chose to change the sea ice albedo since sea ice has a rather long memory (Wrigglesworth et al. 2010; Koenigk and Mikolajewicz 2009). Furthermore, changes in sea ice have an important effect on the Arctic freshwater balance and thus on the North Atlantic deep water formation and potentially the MOC (Haak et al. 2003; Jungclaus et al. 2005; Vancoppenolle et al. 2008), which is assumed to be the major predictor on decadal time scales.

Starting from year 150 of CTRL1, a 250-year present day simulation has been performed using the sea ice albedo changes. The years 51–250 are used for the analyses (CTRL2). Figure 2 shows the differences between CTRL2 and CTRL1. The change in SLP is generally small. The largest changes occur in the northern North Pacific and southern South Pacific, where mean SLP is up to 0.7 hPa lower and 0.7 hPa higher in CTRL2 than in CTRL1, respectively. T2m shows a general warming in northern high latitudes with a maximum in the Barents Sea of 1.5 K. In the rest of the world, the temperature change is very small. The sea ice is thinner in the Arctic in CTRL2. From Greenland across the Central Arctic to the Siberian coast, the reduction is strongest and reaches 0.6–1 m. The ice reduction is somewhat higher in summer than in winter (not shown). Although sea ice observations (Rothrock et al. 2003; Johannessen et al. 2004) are coarse in time and space, it seems that the thinning of ice in CTRL2 leads to a more realistic sea ice thickness in the Arctic.

Fig. 2 Differences of annual mean SLP (a), T2m (b), sea ice thickness (c) and concentration (d) in CTRL2 and CTRL1

Sea ice concentration in the Arctic is reduced, particularly along the ice edges and coasts. In the Barents Sea, sea ice concentration is up to 15% lower in CTRL2 than in CTRL1. A small area in the Greenland Sea shows an increased ice concentration in CTRL2. A reduction of the convection depth by up to 200 m in this area might be responsible for the increased sea ice concentration. The reduced convection is caused by reduced surface salinity in the Greenland Sea due to enhanced sea ice melting in the East Greenland Current north of Iceland in CTRL2. Generally, the convection region in the Greenland Sea is slightly shifted to the north and the east in CTRL2. In the Labrador Sea, the convection area moves somewhat to the north with the ice edge.

In the Antarctic, sea ice thickness and concentration remain almost unchanged despite reduced sea ice albedo.

As for EXP1, perfect ensemble experiments are performed starting at 4 different times of CTRL2 and consisting of 6 members (+ control run) each. The same perturbations as in EXP1 are used. The ensemble simulations based on CTRL2 will be called EXP2 in the following.

2.3 Method

The predictability of the model climate is analyzed by calculating the prognostic potential predictability (PPP, Pohlmann et al. 2004). The PPP is a measure for the ensemble variance of a climate variable X at time t in relation to its variance in the control run and is defined as:

$$ {\text{PPP}}(t) = 1 - \frac{{\frac{1}{N(M - 1)}\sum\nolimits_{j = 1,\, N} {\sum\nolimits_{i = 1,\, M} {\left[ {X_{i,\, j} (t) - \overline{{X_{j} }} (t)} \right]^{2} } } }}{{\sigma^{2} }} $$

\( X_{i,j} \): member i of ensemble j, \( \overline{X_{j}} \): mean of ensemble j, N (M): number of ensembles (number of ensemble members), \( \sigma^{2} \): variance of the control run over time.
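The PPP definition above translates directly into a few array operations. The following is a minimal sketch, assuming the ensemble data for one variable at one location are stored in a NumPy array of shape (N, M, T) with T lead times; the function name and array layout are choices made for this illustration, not part of the original analysis code.

```python
import numpy as np

def ppp(ensembles, control_variance):
    """Prognostic potential predictability as a function of lead time.

    ensembles: array of shape (N, M, T) with N ensembles, M members each
               and T lead times (layout assumed for this sketch).
    control_variance: variance sigma^2 of the control run over time.
    Returns an array of shape (T,).
    """
    ensembles = np.asarray(ensembles, dtype=float)
    n_ens, n_mem, _ = ensembles.shape

    # Mean of each ensemble j at each lead time t.
    ens_mean = ensembles.mean(axis=1, keepdims=True)

    # Squared deviations of the members from their own ensemble mean,
    # summed over members and ensembles and normalised by N(M - 1).
    spread = ((ensembles - ens_mean) ** 2).sum(axis=(0, 1)) / (n_ens * (n_mem - 1))

    return 1.0 - spread / control_variance
```

For a gridded field, the same function could be applied point by point, with the control variance taken from the detrended control run at that point.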

Note that we analyze the potential predictability of climate in this particular model. The predictability shown in this study is based on having a perfect model and near-perfect knowledge of the initial conditions. Neither can be achieved in the real world. Real predictions suffer from substantially larger uncertainties in the initial conditions and from model errors. Furthermore, this study analyzes predictability of natural climate variability. External forcing such as increased greenhouse gas concentrations, which might increase the predictability, is not included in this study.

For both CTRL1 and CTRL2, we detrended the control simulations so that only the natural variability is captured. A PPP of 1 indicates perfect predictability while a value of 0 indicates no predictability at all. In the latter case, the ensemble spread (variance among ensemble members) is equal to the variance of the control integration. To calculate the significance of a PPP-value, we use an F-test with the null-hypothesis that the variances of the ensemble and the control simulations are the same. This hypothesis can be rejected with 95% probability if the ratio of the variances exceeds a certain value, which depends on the number of degrees of freedom. In our experiments this value is exceeded if the variance of the control simulation is about two times larger than the variance among the ensemble members. The exact PPP threshold varies between 0.45 and 0.56 because the de-correlation time and thus the number of degrees of freedom (Zwiers and von Storch 1995) depend on the variable and the location.
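The link between the F-test and the PPP threshold can be written out explicitly. As a sketch, let \( s^{2}(t) \) be shorthand (introduced here, not part of the original notation) for the pooled ensemble variance in the numerator of the PPP definition, and let \( F_{0.95} \) be the 95% quantile of the F-distribution for the relevant degrees of freedom:

$$ {\text{PPP}}(t) = 1 - \frac{s^{2}(t)}{\sigma^{2}}, \qquad \frac{\sigma^{2}}{s^{2}(t)} > F_{0.95} \;\Longleftrightarrow\; {\text{PPP}}(t) > 1 - \frac{1}{F_{0.95}} $$

A critical variance ratio of 2 therefore corresponds to a PPP threshold of 0.5, and a threshold range of 0.45 to 0.56 corresponds to critical ratios of roughly 1.8 to 2.3.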

In the following, for simplicity and because the 95% significance level is only an arbitrarily defined statistical value, we show all PPP-values exceeding 0.5 in colour to indicate that the predictability is significant at or around the 95% level.

3 Results from perfect ensemble experiments

In this section, we present the prognostic potential predictability of selected atmospheric and oceanic variables in the two different experiments. The physical mechanisms leading to the predictability and possible differences between EXP1 and EXP2 are discussed in Sect. 4 in more detail.

3.1 Predictability in the atmosphere

3.1.1 Annual means

Wind, temperature and precipitation are the atmospheric climate variables that affect society and economy most. Hence, the motivation to provide better future predictions of these parameters is particularly high.

The predictability of single years in the first decade, with the partial exception of the first year, is small since the response to the initial forcing is often too small to overcome the noise of unpredictable high frequency atmospheric variations. Figure 3 shows that PPP of SLP in year 1 is only high in the Arctic and in south eastern Asia. In year 2, almost no predictability is left.

Fig. 3 PPP of SLP (a, b) and T2m (c, d) in the first year (left) and the second year (right) after initialization in EXP1

T2m is highly predictable in most tropical regions and a number of extra-tropical ocean regions during the first year after initialization of the predictability experiments. But already in the first year, predictability over the extra-tropical continents is very limited and is below 0.5 in most regions. In year two, predictability is also strongly reduced in the tropics.

Koenigk and Mikolajewicz (2009) analyzed seasonal to inter-annual predictability of mid and high northern latitudes with a coarse resolution version of the global coupled climate model ECHAM5/MPI-OM. Similar to this study they found high predictability of T2m over the oceans but low values over land. They concluded that most of the first year predictability of T2m over land was due to high predictability in the first half of the year. In contrast to this study, ECHAM5/MPI-OM showed high PPP values over the North Pacific for several years for T2m. The predictability of SLP found in Koenigk and Mikolajewicz (2009) was generally small in mid and high northern latitudes.

Figure 4 shows the PPP of SLP for the mean of the first decade. In EXP1, the predictability is very limited in almost all regions of the world. We find some significant predictability only in the tropical and southern Pacific as well as in some areas of the Indian Ocean. In addition, PPP of SLP in the Labrador Sea is significant.

Fig. 4 Top PPP of decadal mean SLP in EXP1 (a) and EXP2 (b). Middle Decadal variance of SLP in CTRL1 (c) and CTRL2 (d). Bottom Variance among ensemble members of decadal mean SLP in EXP1 (e) and EXP2 (f)

In EXP2, we see a significant predictability over the entire Arctic Ocean. Otherwise, the predictability is low as well. Most of the small regions with a significant PPP are different from those of EXP1 indicating a low robustness of predictability in these regions.

In order to find out whether changes in the predictability between EXP1 and EXP2 are due to the variance in the control simulations (CTRL1 and CTRL2) or the variance among the ensemble members of EXP1 and EXP2, we analyzed both variances (Fig. 4 middle, bottom). This also gives an impression of the size of the decadal SLP variance. Generally, the variance increases from the equator to the poles and is slightly higher over the oceans than over the continents. The highest variances occur in the Barents Sea/Kara Sea area and both east and west of the Antarctic Peninsula in the Weddell, Bellingshausen and Amundsen Seas with up to 0.6 hPa². The strength and distribution of variance in CTRL1 and CTRL2 compare well but there is a slight tendency to enhanced variance in polar latitudes in CTRL2. The variances in the EXP1 and EXP2 ensembles also compare well except for high latitudes, where the ensemble spread is smaller in the EXP2 ensemble. This and the slightly enhanced decadal variability in CTRL2 lead to a higher PPP in the Arctic in EXP2.

The atmospheric circulation is the main driver for air temperature variations on short time scales but ocean and sea ice variations become increasingly important with increasing time scales. Thus, PPP of T2m can be skilful despite low predictability of SLP. The predictability of T2m is shown in Fig. 5. The main two regions with high PPP are the northern North Atlantic and the southern South Atlantic. In EXP1, the highest predictability occurs in the Labrador Sea and the northeastern North Atlantic. T2m in the Barents/Kara Sea region also shows high PPP. In EXP2, PPP is slightly smaller in both Labrador Sea and northeastern North Atlantic than in EXP1 but larger southeast of Greenland. Furthermore, the region with high T2m predictability in Barents Sea/Kara Sea is moved eastwards and the entire Central Arctic is significantly predictable. Thus, similar to SLP, T2m shows a higher potential predictability in the Arctic in EXP2.

Fig. 5 Top PPP of decadal mean T2m in EXP1 (a) and EXP2 (b). Middle Decadal variance of T2m in CTRL1 (c) and CTRL2 (d). Bottom Variance among ensemble members of decadal mean T2m in EXP1 (e) and EXP2 (f)

Most interesting for society is the predictability of T2m over continents. Over most land regions, PPP is rather small and not significant. However, the region of high predictability over the northeastern North Atlantic extends towards Scandinavia in EXP1 and even further south and east in EXP2. In EXP1, significant PPP occurs also at the east coast of North America. Over northern Siberia, some predictive skill exists, which is more pronounced in EXP2.

The predictability distribution of T2m in our study agrees relatively well with results of Pohlmann et al. (2004) and Boer (2004), who also analyzed the potential predictability in climate models. However, our results indicate somewhat higher predictability in a few land regions.

The decadal variance of T2m of the control runs (Fig. 5c, d) is similarly distributed as the SLP variance with small variations in the tropics and largest variability in high latitudes. The maximum variances occur along the ice edges, particularly in Barents and Labrador Sea but also in the Weddell Sea. In contrast to SLP, tropical and subtropical landmasses show a higher decadal T2m variance than the surrounding oceans.

The T2m variance in high northern latitudes, particularly in the Labrador Sea and Barents Sea, is smaller in CTRL2 than in CTRL1. In southern high latitudes, some shifts in the variances occur but no general increase or decrease can be seen in either experiment. The T2m ensemble spread in the EXP2 ensemble (Fig. 5f) is substantially smaller around the southern tip of Greenland and in the Barents Sea than in the EXP1 ensemble (Fig. 5e), and the variance is also slightly smaller in the EXP2 ensemble in most parts of the Arctic Ocean, Europe and northern Asia. Thus, it is mainly the reduced variance among ensemble members that leads to higher PPP in these areas in EXP2.

Table 1 shows the decadal T2m predictability for all continents and a number of regions that stand out due to particularly high predictability or particularly high decadal variability. Generally, most regions with a high decadal variability in the control simulation also show a rather high potential predictability of T2m. Regions without pronounced decadal variations normally also lack decadal processes that might be predictable. Thus, PPP is high in the Arctic and in Arctic sub-regions like the Barents/Kara Sea or the Labrador Sea. The North Atlantic also has a very high potential predictability and even PPP of T2m in Europe exceeds 0.7 in both experiments. T2m in southern Asia and Africa is significantly predictable in EXP1 but not in EXP2. T2m in northern Asia is significantly predictable in EXP2 and almost significant in EXP1, and T2m in North America is slightly below the 95% significance level in both experiments. T2m of South America, Australia and Antarctica is not predictable. The global mean T2m has a PPP-value of 0.85 in EXP1 but only 0.67 in EXP2. PPP values for regions with small decadal T2m variance vary much more between EXP1 and EXP2 than those for regions with high variances because a rather small change in the spread of the ensemble members or in the variance of the control run can have a relatively strong impact on the PPP. Despite some differences in the PPP between EXP1 and EXP2, the results for the regions with high variance seem to be relatively robust.
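The sensitivity of PPP in low-variance regions follows directly from its definition. As a sketch, with \( s^{2} \) denoting the variance among ensemble members (shorthand introduced here) and the control variance \( \sigma^{2} \) held fixed:

$$ \Delta {\text{PPP}} = - \frac{\Delta s^{2}}{\sigma^{2}} $$

With purely hypothetical numbers chosen for illustration, a change of 0.01 K² in the ensemble variance alters PPP by 0.1 in a region with a control variance of 0.1 K², but by only 0.01 where the control variance is 1 K².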

Table 1 PPP of 10-year running mean regionally averaged T2m and decadal variance of regionally averaged T2m in K²

Another insight we get from this table is that the T2m variance in CTRL2 is generally smaller than in CTRL1. As we will analyze in Sect. 4, this seems to be related to the thinner sea ice conditions in the Arctic due to reduced sea ice albedo in CTRL2.
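The regional quantities in Table 1 rest on two preprocessing steps, a regional average and a 10-year running mean. A minimal sketch of both is given below; it assumes a regular latitude-longitude grid with cosine-of-latitude area weights and a boolean region mask, neither of which is specified in the text, and the function names are illustrative.

```python
import numpy as np

def regional_mean(field, lat, region_mask):
    """Area-weighted regional average of an annual-mean field.

    field: array (time, lat, lon); lat: latitudes in degrees, shape (lat,);
    region_mask: boolean array (lat, lon), True inside the region.
    Cos(latitude) weighting assumes a regular lat-lon grid.
    """
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(lat))[:, None] * region_mask
    return (field * weights).sum(axis=(1, 2)) / weights.sum()

def running_mean(series, window=10):
    """Running mean over `window` consecutive years (no padding at the ends)."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")
```

The decadal variance of a region would then be the variance of `running_mean(regional_mean(...))` computed from the detrended control run.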

Figure 6 shows the PPP of precipitation. It is largest in the north-eastern North Atlantic and in the Barents Sea/Kara Sea region in both EXP1 and EXP2. Also in the Labrador Sea, PPP of precipitation is significant. In EXP1, PPP is slightly higher in these regions than in EXP2. Furthermore, some predictability is found in the southern South Atlantic in both experiments, in the Arctic in EXP2 and in the NINO3 region in EXP1. Most of the regions with significant PPP of precipitation also show a significant predictability of T2m (Fig. 5). Over ocean regions, higher (lower) SST normally leads to more (less) precipitation.

Fig. 6 PPP of decadal mean precipitation in EXP1 (a) and EXP2 (b)

In most other ocean regions and all land regions, the PPP pattern is very small-scale and appears random. Note that, due to noise alone, about 5% of the area can be expected to show up as significant when a 95% significance criterion is used.

3.1.2 Seasonal means

The potential predictability of decadal means of winter (January, February, March) and summer (June, July, August) in EXP1 and EXP2 is shown in Figs. 7 and 8. We defined winter as the mean over January, February and March since all our model simulations started in January and ended in December.

Fig. 7 PPP of decadal mean winter and summer SLP in EXP1 (a, c) and EXP2 (b, d)

Fig. 8 PPP of decadal mean winter and summer T2m in EXP1 (a, c) and EXP2 (b, d)

PPP of SLP in both winter and summer is small and not significant in most regions. In winter in EXP1, significant PPP occurs only in the Labrador Sea, in parts of mid-latitudes in the South Pacific and subtropical North Pacific, in the subtropical South Atlantic and over Australia. Most of these areas are also significantly predictable for decadal annual mean SLP (compare Fig. 4). The main areas with significant PPP of summer SLP in EXP1 are the Barents Sea, parts of the North Pacific, the southern part of North America and the mid-latitude southern Indian Ocean. Again, some of these regions also appear in the PPP pattern of decadal annual means. As for decadal annual means, the areas with significant seasonal PPP in EXP2 differ from those of EXP1. Winter SLP but not summer SLP shows some significant predictability over the Arctic in EXP2.

The large scale PPP patterns of winter and summer T2m are similar to each other and also similar to the annual PPP pattern (compare Fig. 5). However, in winter PPP is higher in the Arctic and lower in the northern North Atlantic than in summer, particularly in EXP1. In the Arctic, this is caused by the fact that surface temperature in summer is always near the freezing level, independent of sea ice thickness or ice concentration. During winter, ice variations matter for the insulation between the relatively warm ocean and the cold atmosphere. Thus, decadal variations in sea ice have a stronger impact on winter T2m predictability. In the northern North Atlantic, summer T2m is more predictable due to weaker winds, which reduce the impact of the highly unpredictable atmospheric circulation on T2m. The correlation between SST and T2m is significantly higher in the North Atlantic in summer than in winter. The same is true in the southern hemisphere and leads to slightly higher PPP in the Southern Ocean in the southern hemispheric summer (JFM). Also in the tropical Pacific, PPP is slightly higher in JFM.

Generally, the area of the earth with significant PPP is slightly smaller for seasonal decadal means compared to annual decadal means.

3.2 Predictability in sea ice and ocean

The predictability of sea surface temperature is shown in Fig. 9 (top). The PPP distribution of SST is similar to the T2m pattern but often shows somewhat higher PPP. This is expected and indicates that T2m over the oceans is largely governed by the SST. The correlation of decadal mean SST and T2m in CTRL1 exceeds 0.9 in most of the world, except for the equatorial regions, where the correlations vary between 0.6 and 0.8, and the Arctic, where SST and T2m are poorly correlated due to the ice cover that effectively reduces the heat exchange between ocean and atmosphere. This is why SST is also predictable in the Arctic in contrast to T2m. In EXP2, PPP of Arctic SST is slightly smaller than in EXP1. In contrast to T2m, SST in the Arctic stays close to the freezing level and thus exhibits very small variations in both the ensemble simulations and the control runs. It is thus open to discussion whether high predictability of SST under sea ice is of any value.

Fig. 9 PPP of decadal mean sea surface temperature in EXP1 (a) and EXP2 (b) and decadal mean surface salinity in EXP1 (c) and EXP2 (d)

PPP of ocean heat content down to a depth of 1,000 m has been analyzed as well (not shown). The PPP pattern of upper ocean heat content (0–100 m) compares well to the PPP pattern of SST. The heat content down to 500 m depth shows a similar pattern but higher PPP and extended areas with significant PPP. The heat content between 100 and 500 m depth shows a growing PPP in the tropical oceans. Below 500 m the distribution changes strongly, with increasing PPP values in most of the world oceans and highest PPP values in the tropical regions. The lowest predictability is found in mid and high southern latitudes and in the North Pacific.

The small initial perturbation in our experiments is rapidly amplified in the mixed layer by atmospheric noise but takes longer to affect the deep ocean.

The PPP pattern of sea surface salinity (Fig. 9, bottom) is similar to that of SST. However, the predictability is slightly higher, with significant PPP in most mid and high latitude regions. Nevertheless, we can assume that surface salinity and SST are mainly dominated by the same decadal scale processes. The impact of the rather unpredictable atmosphere is probably larger on SST than on salinity, which leads to the slightly better predictability of salinity.

In agreement with PPP of SST, PPP of salinity shows a slightly better predictability in the Arctic in EXP1 compared to EXP2. The same is true for the South Atlantic, while EXP2 shows higher PPP values in the North Pacific and southern Indian Ocean.

The predictability of decadal mean summer and winter values of surface salinity and SST is similar to that of the decadal mean annual values. In most regions with decadal predictability, the phase of the decadal signal is similar in summer and winter. The amplitude varies somewhat between the seasons, but it does so in both the ensemble simulations and the control simulations. Thus, the decadal predictability does not show any strong seasonal cycle.

A number of studies (Venegas and Mysak 2000; Schmith and Hansen 2003; Goosse et al. 2002) suggested decadal or longer scale variations in Arctic sea ice. However, we do not find high sea ice predictability in the Arctic in our model (Fig. 10). Note that the predictability of sea ice can be artificially high in regions (mainly along the ice edges) where sea ice exists in the control simulation but not in the ensemble members. To avoid this, we only calculated the predictability of sea ice for areas where sea ice concentration in the control simulation exceeds 10%.
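The masking step described above can be sketched as follows. This is a minimal illustration and not the code used in the study: it assumes the common variance-ratio form PPP = 1 − (ensemble variance / control variance) for a single ensemble, it assumes that the 10% threshold is applied to the time-mean concentration of the control run, and all array names and shapes are hypothetical.

```python
import numpy as np

def masked_ppp(ens_decadal, ctrl_decadal, ctrl_conc, threshold=0.10):
    """PPP per grid cell, kept only where control ice concentration > threshold.

    ens_decadal  : (n_members, ny, nx) decadal means of the ensemble members
    ctrl_decadal : (n_decades, ny, nx) decadal means from the control run
    ctrl_conc    : (ny, nx) time-mean control ice concentration as a fraction
    All names and shapes are hypothetical.
    """
    ens_var = ens_decadal.var(axis=0, ddof=1)    # spread among ensemble members
    ctrl_var = ctrl_decadal.var(axis=0, ddof=1)  # decadal variance of the control
    with np.errstate(divide="ignore", invalid="ignore"):
        ppp = 1.0 - ens_var / ctrl_var           # assumed variance-ratio PPP
    # Cells with little or no ice in the control run are set to NaN instead of
    # reporting a spuriously high PPP.
    return np.where(ctrl_conc > threshold, ppp, np.nan)
```

Returning NaN rather than zero keeps masked cells distinguishable from cells that are genuinely unpredictable.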

Fig. 10 Top: PPP of decadal mean sea ice thickness in EXP1 (a) and EXP2 (b). Middle: Decadal variance of sea ice thickness in CTRL1 (c) and CTRL2 (d). Bottom: Variance among ensemble members of decadal mean sea ice thickness in EXP1 (e) and EXP2 (f)

Significant predictability of sea ice thickness occurs mainly in the Labrador and Greenland Sea in EXP1 and Labrador and Barents/Kara Seas in EXP2. In the Central Arctic, no significant predictability occurs. The Arctic ice volume shows pronounced decadal variations in our simulations but the decadal variance of the Arctic ice volume differs strongly between CTRL1 (3.6 × 10²⁴ m⁶) and CTRL2 (9.8 × 10²³ m⁶). However, the Arctic ice volume shows no significant predictability in either simulation (0.32 in EXP1, 0.04 in EXP2). The reduced ice volume variance in EXP2 is probably due to the thinner ice. The thinner the ice, the less survives the summer and the less ice can accumulate to large anomalies.

Sea ice concentration (not shown) also shows the highest predictability in the Labrador, Greenland, Barents and Kara Seas. In EXP1, PPP of sea ice concentration is larger than PPP of ice thickness in the Barents/Kara Sea region. Additionally, both ensemble experiments show some significant areas in the Central Arctic. In contrast to the Arctic ice volume, the Arctic ice extent is highly predictable (PPP = 0.88 in EXP1, 0.74 in EXP2). This is because ice extent variations are dominated by the ice edges in the North Atlantic Arctic sector, where predictability is high, while ice volume variations are mainly dominated by the Central Arctic and the Siberian coast, where predictability is low.

The spatial distribution of ice thickness variance (Fig. 10, middle) clearly indicates a smaller decadal sea ice thickness variance in CTRL2 than in CTRL1, particularly in the Central Arctic and the Beaufort Gyre. Generally, the highest variations occur at the Siberian coast but high variances occur also in the East Greenland Current and in CTRL1 in the Central Arctic. The spread in the ensemble simulations is largest at the Siberian coast and the Central Arctic in both EXP1 and EXP2. North of Greenland and the Canadian Archipelago as well as in the entire North Atlantic section of the Arctic, the spread is small.

In the southern hemisphere, the largest sea ice thickness predictability occurs in the Weddell Sea (not shown). Here, PPP is somewhat larger in EXP1 than EXP2 but in EXP2 significant predictability occurs also in the Ross Sea and in the Indian polar regions.

3.3 Results of the combined ensemble simulations

It is difficult to judge whether CTRL1 or CTRL2 provides a more realistic climate although the sea ice representation seems to be slightly more realistic in CTRL2. Hence, the differences in the two experiments have to be seen as uncertainties of the potential predictability. Performing the same experiment with small changes in other parameterizations might provide slightly changed predictabilities again, at least if the changes affect the governing processes of decadal predictability like the MOC.

Figure 11 shows the PPP of our combined ensemble, consisting of the EXP1 and EXP2 ensembles, for T2m, SLP, precipitation and ice thickness. Note that, because of the larger number of ensemble members, PPP values exceeding about 0.4 are already significant at the 95% level. The colour scale used in Fig. 11 is thus different from that of the other figures. Generally, the PPP patterns of the combined ensemble are similar to those of EXP1 and EXP2. However, PPP of SLP is reduced because EXP1 and EXP2 partly cancel each other out, which indicates a low robustness of the SLP predictability. The total area of significant PPP of T2m is slightly increased in the combined ensemble compared to the single ensembles. For precipitation, the Nordic Seas are the main region of predictability in the combined ensemble. As for SLP, PPP in EXP1 and EXP2 often cancel each other out, particularly over land. Sea ice thickness shows high predictability in the Labrador Sea, the Greenland Sea, parts of the Barents and Kara Seas and the Canadian Archipelago.

Fig. 11 PPP of decadal mean SLP (a), T2m (b), precipitation (c) and sea ice thickness (d) from the combined ensemble of EXP1 and EXP2

4 Sources of predictability

4.1 Impact of MOC on predictability

Multi-decadal variations in the North Atlantic Ocean, and particularly in the MOC, are usually regarded as the major contributor to decadal predictability (Latif et al. 2006a). Our model also simulates pronounced multi-decadal variability of the MOC (Fig. 12). In the following, we always use the MOC at 30°N. The spectrum of annual mean values and the decadal variance of the MOC differ somewhat between CTRL1 and CTRL2. The MOC in CTRL1 has its major spectral peak at a period of about 60 years and a minor peak (not significant at the 95% level) at about 25 years. The MOC in CTRL2 shows significant peaks at about 10 and 80 years. Both time series also show a peak at about 3 years. However, we have to note that a 200-year simulation is short for identifying the exact frequency of multi-decadal variations. Generally, more power is concentrated at short periods in CTRL2 than in CTRL1. The decadal variance is much larger in CTRL1 (1.93 Sv²) than in CTRL2 (0.96 Sv²).
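The caveat about record length can be made concrete by listing the Fourier periods that a 200-year annual series resolves. The sketch below assumes a plain periodogram; the spectral estimator actually used for the MOC spectra is not stated in this section, and the function name is hypothetical.

```python
import numpy as np

n_years = 200                            # record length quoted in the text
freqs = np.fft.rfftfreq(n_years, d=1.0)  # cycles per year for annual sampling
periods = 1.0 / freqs[1:]                # skip the zero frequency

def periodogram(x):
    """Raw periodogram of an annual series after removing the mean."""
    x = np.asarray(x, dtype=float)
    spec = np.abs(np.fft.rfft(x - x.mean())) ** 2 / x.size
    return np.fft.rfftfreq(x.size, d=1.0), spec

print(np.round(periods[:5], 1))          # longest resolvable periods in years
```

The longest Fourier periods are 200, 100, 66.7, 50 and 40 years, so peaks near 60 and 80 years each fall between two neighbouring frequency bins and cannot be located precisely.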

Fig. 12 a 10-year running mean of MOC at 30°N in CTRL1 and CTRL2. Year 1 corresponds to year 251 of CTRL1 and year 51 of CTRL2. b and c Power spectra of annual mean MOC in CTRL1 and CTRL2

Our model results indicate a much higher variance of Fram Strait ice export in CTRL1 compared to CTRL2, mainly due to thinner sea ice and thus reduced decadal variations in sea ice thickness in CTRL2. This leads to a strong reduction of both salinity and temperature variability in the Labrador Sea and might be responsible for the reduced MOC variability in CTRL2. Consequently, decadal temperature variations in the northern hemisphere are reduced. This is in line with findings by Dau et al. (2005) and Meehl et al. (2006), who showed in model simulations that increased warming reduces not only the mean MOC but also interdecadal variations of the MOC. They showed that the mean MOC reduction is due to changes in ocean temperature and salinity. Häkkinen (1999), Haak et al. (2003) and Koenigk et al. (2006) showed an important impact of the freshwater export through Fram Strait on the Labrador Sea convection and the MOC. Jungclaus et al. (2005) analyzed a global coupled model simulation spanning several centuries and concluded that the freshwater export through Fram Strait is responsible for the sign change of multi-decadal MOC anomalies. Holland et al. (2001) also stressed the importance of variations in the Arctic ice export for the variability of the MOC.

The predictability of the decadal mean MOC is very high in both experiments and reaches 0.96 in CTRL1 and 0.85 in CTRL2. In the following, we investigate the impact of the MOC on the variables shown in Sect. 3 and analyze how much predictability is connected to the MOC. Furthermore, the importance of other decadal to multi-decadal processes for the predictability is investigated.

Figure 13 shows the correlation between MOC and T2m in CTRL1 and CTRL2. In CTRL1, the highest correlations occur in most regions when the MOC leads temperature by 0–4 years. However, at a lag of 10 years, correlations are still significant in the North Atlantic Ocean and parts of the Arctic. The correlation at lag 2 is largest over the North Atlantic, but we also see significant correlations over Europe, northern Asia, the Arctic except for the Pacific Arctic sector, north-eastern North America and parts of the North Pacific. In the southern hemisphere (SH), no significant correlation is found. This correlation pattern compares well to the PPP pattern of T2m in the northern hemisphere (NH) in EXP1. In all NH regions with significant predictability we find high correlations between MOC and T2m. Over central and southern Europe, most parts of the Arctic and Asia, the link between MOC and T2m seems to be too weak to yield high predictability.

Fig. 13 a Correlation between decadal mean MOC and T2m in CTRL1. MOC leads 2 years. b Correlation between decadal mean MOC and T2m in CTRL2. MOC leads 2 years. c and d Same as a and b but MOC leads 10 years

In CTRL2, highest correlations between MOC and T2m occur when MOC leads by about 2 years and by about 10 years. Although again high correlations occur in mid and northern high latitude regions, the correlation pattern differs from the pattern of CTRL1. Correlations are smaller in the northwestern North Atlantic region including Greenland, Labrador Sea and northwestern Canada compared to CTRL1. The correlations are furthermore generally slightly less extended to the south in CTRL2. Over Europe, the correlations are more pronounced over the north and northeastern part but less pronounced over the southern part. The PPP pattern of T2m in EXP2 reflects some but not all of the differences between the correlation patterns in the control runs.

Table 2 shows the correlation between 10-year running mean MOC and T2m averaged over the same regions as in Table 1. Obviously, the MOC is responsible for a large part of the predictability of regionally averaged temperatures. However, the MOC cannot explain the entire predictability, nor all differences in predictability between EXP1 and EXP2. Generally, the correlation between MOC and T2m is weaker in EXP2. This is related to a weaker decadal variability in both MOC and T2m in EXP2. Nevertheless, the predictability is not generally smaller in EXP2.

Table 2 Correlation between 10-year running means of MOC and regionally averaged T2m in CTRL1 and CTRL2 and time lag in years where the maximum correlation occurs

This indicates that processes other than the MOC contribute to the predictability of T2m in EXP2.
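The kind of lag analysis summarized in Table 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the input series are hypothetical annual means, only lags with the MOC leading are scanned, and the function names and the 20-year maximum lag are chosen here for illustration.

```python
import numpy as np

def running_mean(x, window=10):
    """Running mean over `window` years; only full windows are kept."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def lag_correlation(lead, follow, lag):
    """Pearson correlation with `lead` preceding `follow` by `lag` steps (lag >= 0)."""
    n = len(lead)
    return np.corrcoef(lead[: n - lag], follow[lag:])[0, 1]

def best_lag(moc, t2m, max_lag=20, window=10):
    """Return (lag, r) with the largest |r| for MOC leading T2m by 0..max_lag years."""
    moc_rm = running_mean(moc, window)
    t2m_rm = running_mean(t2m, window)
    corrs = [lag_correlation(moc_rm, t2m_rm, k) for k in range(max_lag + 1)]
    k = int(np.argmax(np.abs(corrs)))
    return k, corrs[k]
```

Because 10-year running means are strongly autocorrelated, the effective sample size is far smaller than the number of years, which any significance test on these correlations has to take into account.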

The processes leading to high PPP of T2m in the southern hemisphere are less clear. Latif et al. (2006b) stated that the multi-decadal variability of SST in the Southern Ocean (SO) is anti-correlated with SST in the North Atlantic. However, lag correlations between SO SST and MOC or North Atlantic SST are not very large in our model and reach at most −0.47 (SO SST leading by 15 years). The correlation between MOC and SST in the southern South Atlantic, where PPP is highest, is even weaker. Thus, we conclude that the MOC can explain at most a small part of the SO variations.

A common method to assess the benefit of performing predictions with a dynamical model is to compare the predictability from the ensemble experiments with the persistence in the control runs. Here, we define the persistence as the square of the autocorrelation of running 10-year means at a lag of 10 years. The predictability of the MOC due to persistence is 0.36 in both CTRL1 and CTRL2 and thus less than half of the PPP we found in our experiments. The decadal persistence of T2m contributes to the predictability mainly in the North Atlantic. Here, up to 50% of the predictability is due to persistence. In the South Atlantic, the persistence is much smaller and contributes at most 30% to the modelled predictability.
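The persistence measure defined above translates directly into a few lines of code. The sketch below is a minimal illustration with a hypothetical function name; the last lines only reproduce the arithmetic for the MOC values quoted in the text (persistence 0.36, PPP 0.96 and 0.85).

```python
import numpy as np

def decadal_persistence(x, window=10, lag=10):
    """Squared lag-`lag` autocorrelation of `window`-year running means."""
    kernel = np.ones(window) / window
    rm = np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")
    r = np.corrcoef(rm[:-lag], rm[lag:])[0, 1]
    return r ** 2

# Share of the MOC predictability attributable to persistence,
# using the values quoted in the text.
persistence = 0.36
ppp_moc = (0.96, 0.85)
shares = [persistence / p for p in ppp_moc]  # roughly 0.38 and 0.42
```

Both shares are below one half, consistent with the statement that persistence accounts for less than half of the PPP of the MOC.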

The MOC also influences decadal variations of sea ice conditions in parts of the Arctic (Fig. 14). In contrast to the correlation between MOC and T2m, the correlation between MOC and sea ice thickness shows its highest values at lag 0 in both control simulations. In the Labrador, Greenland, Barents and Kara Seas, ice thickness is highly negatively correlated with the MOC because of the larger than normal northward ocean heat transport during anomalously strong MOC. The correlation in the Labrador and Greenland Seas is higher in CTRL1 than in CTRL2, and this is also reflected in a higher PPP of ice thickness in EXP1 compared to EXP2. PPP around Franz-Josef Land is higher in EXP2 despite similar correlation values in CTRL1 and CTRL2.

Fig. 14 Correlation between decadal mean MOC and sea ice thickness in CTRL1 (left) and CTRL2 (right) at lag 0

4.2 Impact of other decadal scale processes on predictability

The MOC is responsible for a large part of the northern hemisphere predictability but cannot explain all of it. Other possible candidates are decadal climate modes in the Arctic, particularly since PPP of SLP and T2m is higher in the Arctic in EXP2. A number of studies describe Arctic climate modes at decadal time scales (Mysak and Venegas 1998; Polyakov and Johnson 2000; Koenigk et al. 2006).

CTRL1 indeed shows pronounced variations of Arctic ice volume (without Labrador and Nordic Seas) with a period of about 25 years. The sea ice volume is significantly correlated with air temperature over the Central Arctic, the Siberian Arctic and also Siberian and Northern European land areas. The correlation pattern in CTRL2 is similar, but the variations in the ice volume are much smaller and the decadal variations are much less regular than in CTRL1. The decadal ice volume variations are not well correlated with the MOC in CTRL1 but show a weak negative correlation in CTRL2 (r = −0.38, significant at the 90% level but not at the 95% level). However, the sea ice volume is predictable in neither EXP1 nor EXP2. Thus, Arctic ice volume variations can probably not directly explain the differences in PPP of SLP and T2m between EXP1 and EXP2.

Another candidate influencing Arctic climate on decadal scales is the NAO, with the NAO index defined here as the SLP difference between Iceland and the Azores. Although the NAO index shows decadal variations, no predictability is found in either experiment.

According to Mantua et al. (1997), the Pacific Decadal Oscillation (PDO) is defined as the first EOF of SST in the Pacific basin north of 20°N. Here, we used the mean SST of the region 32–48°N and 140–180°E as PDO index. In CTRL1 and CTRL2, this index is correlated with the time series of EOF1 of SST in the Pacific at −0.94 and −0.96, respectively. Both control simulations show decadal to multi-decadal variations of the PDO. However, the PPP of our PDO index differs substantially between the experiments, with 0.15 in EXP1 and 0.56 in EXP2. Thus, the PDO might play a role in explaining differences between EXP1 and EXP2. Latif and Barnett (1994) and Schneider and Cornuelle (2005) related the predictability in the North Pacific to long propagating Rossby waves, but a detailed analysis of the different behaviour of the variances in CTRL1 and CTRL2 is beyond the scope of this study.
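A box-mean index of this kind can be computed as sketched below. The sketch assumes a regular latitude-longitude grid, longitudes in degrees east from 0 to 360, NaN over land, and cos(latitude) area weighting; whether the study applied area weighting is not stated, and the function and argument names are hypothetical.

```python
import numpy as np

def box_mean_index(sst, lat, lon, lat_bounds=(32.0, 48.0), lon_bounds=(140.0, 180.0)):
    """Area-weighted mean SST over a latitude/longitude box.

    sst : (n_time, n_lat, n_lon) array, NaN over land
    lat : (n_lat,) degrees north
    lon : (n_lon,) degrees east in the range 0..360
    """
    lat_sel = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_sel = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    box = sst[:, lat_sel, :][:, :, lon_sel]
    # cos(latitude) approximates the relative grid-cell area on a regular grid
    w = np.cos(np.deg2rad(lat[lat_sel]))[None, :, None] * np.ones(box.shape)
    w = np.where(np.isnan(box), 0.0, w)  # land cells get zero weight
    return np.nansum(box * w, axis=(1, 2)) / w.sum(axis=(1, 2))
```

The sign of an EOF is arbitrary, so a strongly negative correlation between such a box index and the EOF1 time series indicates the same mode with opposite sign convention.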

The Southern Annular Mode, which can be defined as the difference of normalized SLP anomalies between 40°S and 65°S, shows only a small predictability (0.36 in EXP1, 0.32 in EXP2) and can thus explain only a small part of the high PPP of T2m in the South Atlantic.

PPP of SST in the NINO3 region is small (0.2 in EXP1 and 0.13 in EXP2). Thus, El Niño does not contribute to Southern Ocean decadal predictability in our model.
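The index definition given above can be written out as a short sketch. It assumes that the two inputs are time series of zonal-mean SLP at 40°S and 65°S and that the difference is taken as 40°S minus 65°S; neither detail is stated in this section, and the function name is hypothetical.

```python
import numpy as np

def sam_index(slp_40s, slp_65s):
    """Difference of normalized SLP anomalies, 40°S minus 65°S (assumed sign)."""
    def normalize(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / x.std(ddof=1)  # anomaly in units of its std
    return normalize(slp_40s) - normalize(slp_65s)
```

Normalizing each latitude separately gives both series equal weight in the index, even though SLP variability differs between mid and high latitudes.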

5 Summary and conclusions

The decadal potential predictability and its dependency on changes in sea ice albedo parameterization have been analyzed with a perfect model approach. Two control simulations using the EC-Earth standard formulation for sea ice albedo (CTRL1) and a formulation with reduced sea ice albedo (CTRL2) and two corresponding sets of ensemble experiments (EXP1 and EXP2, belonging to CTRL1 and CTRL2 respectively) have been performed with the global coupled climate model EC-Earth.

The decadal potential predictability of the atmospheric circulation is small. However, in EXP1, using the standard configuration of EC-Earth, some significant predictability has been found over parts of the tropical and southern hemisphere Pacific and Indian Oceans. EXP2, using slightly reduced sea ice albedo, shows significant predictability of sea level pressure over the Arctic Ocean.

The predictability of 2 m air temperature is particularly high over the northern and north-eastern North Atlantic and southern South Atlantic. Also sea surface salinity and sea surface temperature show a very high predictability skill in these regions but are in addition predictable in the Arctic and in larger parts of the mid and high southern latitudes.

Precipitation is predictable in the northeastern North Atlantic and parts of the Arctic in EXP2 as well as in the tropical Pacific in EXP1. Obviously, the decadal predictability of precipitation is mainly governed by decadal temperature variations.

The relatively high decadal predictabilities of 2 m air temperature over the oceans are interesting from a scientific point of view, but the benefit for society might be limited. Of more interest is the question of decadal predictability over populated land areas. Generally, the predictability is smaller over the continents than over the oceans because of shorter memory time scales. However, both experiments show a significant predictability of air temperature over north-western Europe. EXP1 indicates predictability along the east coast of North America and parts of tropical Africa. Air temperature in EXP2 is also predictable over central and eastern parts of Europe. Most of the high potential predictability areas over land are located close to areas of high predictability over sea.

Northwestern Europe seems to be the populated area where decadal predictions might have the highest potential for improvement compared to traditional scenario simulations. Compared to previous studies, our results indicate a slightly higher predictability over land regions. Whether the relatively high resolution of EC-Earth compared to the models used in most previous studies contributes to this remains to be analyzed.

The predictability of air temperature averaged over continental-size regions is higher than the predictability on the grid scale, particularly for northern hemispheric regions. However, this increase of predictability goes along with a loss of local information. Future studies should analyze whether an ‘optimum’ area size exists where predictability starts saturating but most of the local information is still available. The optimum might depend on the specific question to be addressed. Thus, it is a matter for the impact and adaptation community.

Sea ice is well predictable in the Atlantic sector of the Arctic, particularly in the Labrador, Greenland, Barents and Kara Seas, but not in the rest of the Arctic. Nevertheless, this implies that decadal predictions might provide improved information for planning ship traffic or the exploitation of natural resources in this region.

Generally, our results indicate that all regions with high predictability show pronounced decadal variability. On the other hand, not all regions with high decadal variances are predictable. Sea ice thickness in the Central Arctic and air temperature in polar regions show pronounced decadal variations but the predictability is obviously limited.

The meridional overturning circulation is the major driver of decadal predictability in the northern hemisphere and can explain a large part of the predictability of sea surface temperature, surface salinity, 2 m air temperature, precipitation and sea ice in the North Atlantic sector. About 40% of the next-decade predictability of the overturning circulation is due to persistence. Thus, using a dynamical approach leads to a considerable gain of predictability compared to persistence.

The results might depend on the individual model. Collins et al. (2006) analyzed the predictability of the meridional overturning circulation in different CMIP3 models and found relatively high predictabilities in most models. However, both amplitude and frequency of multi-decadal variations varied considerably among the models, which indicates that climate models might have problems simulating realistic long-term overturning variations and thus realistic long-term temperature variations. This problem is minor for the first decade but can limit the predictability thereafter.

The Southern Annular Mode seems to contribute to the high predictability of air temperature in the southern South Atlantic but can explain only a small part of the predictability. The correlation between North Atlantic sea surface temperature anomalies and southern hemispheric sea surface temperature anomalies is relatively small in our model and can explain only a small fraction of Southern Ocean air temperature predictability.

Another result from this study is that decadal variations are substantially reduced in CTRL2 compared to CTRL1. Even though the main patterns of predictability are similar in both simulations and similar to most existing studies, large uncertainties exist on regional scales. We hypothesize that thinner Arctic sea ice leads to reduced decadal scale variations in Arctic sea ice volume and Fram Strait ice export. This reduces sea surface temperature and salinity variations in the Labrador Sea and thus affects decadal variations of deep water formation and the overturning circulation. Reduced overturning variability in turn reduces temperature variations in most of the mid and high northern hemisphere regions.

Although the number of ensemble members is not sufficiently high to always distinguish between noise and the effect of the different albedo parameterization, it is obvious that rather small changes in the parameterization can lead to important changes in the results and might have a large influence on both amplitude and frequency of long-term variations.

Completely different models will particularly on a regional scale show larger differences than the same model with slightly changed parameterization. Thus, significant uncertainties exist in the simulation of long term natural variations in climate models. Model parameterization and therefore model error has a significant impact on the potential predictability and therefore real prediction will be dependent on model error and bias. Large ensembles will be necessary to quantify the uncertainties in the results in decadal predictions and improve the benefit for society.

The benefit of decadal predictions depends strongly on the ratio of decadal variance to warming trend. A high predictive skill is of limited practical use if the predicted natural variations are small compared to the trend.

Global mean decadal air temperature in our model is highly predictable but its standard deviation is small compared to the expected global mean warming as predicted by CMIP3 models. However, on continental or smaller scales, temperature variations can be of similar magnitude to the warming trend in the next 10–20 years. Hence, decadal predictions might have the potential to significantly reduce the uncertainties for future climate change in the next 10–20 years in some specific areas. These results compare well to a recent study by Boer (2010).