1 Introduction

For the past few years, the climate research community has faced a scientific challenge with the emergence of predictability studies at decadal timescales. The focus lies on the near-term future, over a 1- to 10-year horizon (Smith et al. 2007; Keenlyside et al. 2008; Hurrell et al. 2010; Meehl et al. 2009; Pohlmann et al. 2009; Mochizuki et al. 2010), complementing the traditional long-term climate projections based on the greenhouse gas (GHG) and aerosol emission scenarios on which the successive Intergovernmental Panel on Climate Change (IPCC) reports have been based. Climate predictability at decadal timescales may have significant social, economic and environmental implications. Hence, there is strong demand from decision makers, who need to make the best use of the information provided by climate forecasts in order to plan adaptation strategies for the areas most vulnerable and sensitive to low-frequency climate variability and climate change (Meehl et al. 2009; Hurrell et al. 2010; Means et al. 2010).

To advance in the science of decadal prediction, several coordinated exercises have been proposed at European level within the ENSEMBLES (Doblas-Reyes et al. 2011; Van Oldenborgh et al. 2012; Garcia Serrano and Doblas-Reyes 2012) and COMBINE (Bellucci et al. 2014) projects for instance. Recently, in a more international context, near-term future climate changes have been included in the 5th IPCC report (Chapter 11, Kirtman et al. 2013) based on simulations proposed within the 5th edition of the Coupled Model Intercomparison Project (CMIP5, Taylor et al. 2012). The coordinated experiments mostly rely on retrospective climate predictions (also called “hindcasts”) over the 1960–2005 period to evaluate the predictability of the climate system at decadal timescale. As an extension, Smith et al. (2013) have performed quasi-real time decadal forecasts in a multi-model framework using most of the climate prediction systems that participated in CMIP5.

The most relevant scientific challenge in decadal prediction is to evaluate, quantify and understand the sources of the forecast skill (Meehl et al. 2014). Skill may arise from (1) external forcings (GHGs, aerosols, volcanic eruptions and solar irradiance) and (2) natural climate variability, which is dominated by the slow components of the climate system. In the context of decadal predictability, the information contained in the ocean is the most important (Collins et al. 2006), and its correct initialization from observations is therefore crucial in the climate models used for forecasting. The three-dimensional knowledge of the ocean has become more accurate thanks to recent improvements in observational networks and the development of data assimilation systems that provide ocean reanalysis products of higher quality (Wijffels et al. 2008; Ishii and Kimoto 2009; Corre et al. 2012; Ferry et al. 2010; Balmaseda et al. 2010). The most recent studies from CMIP5 (see Kirtman et al. 2013; Meehl et al. 2014 for a review) confirm that a large fraction of the decadal predictability comes from the external forcings, either anthropogenic (worldwide) or natural (e.g. the role of volcanic radiative forcing over the Indian Ocean, Guémas et al. 2013), whatever the forecast leadtime. Ocean initialization adds value by accounting for the phase of the modes of natural variability such as the Atlantic Multidecadal Oscillation (AMO, Kerr 2000): it increases the regional forecast skill for Sea Surface Temperature (SST) compared with non-initialized experiments, in particular over the North Atlantic and western Pacific oceans, up to 8–9 year leadtimes (Mochizuki et al. 2010, 2012; Msadek et al. 2014; Bellucci et al. 2012; van Oldenborgh et al. 2012; Hazeleger et al. 2013a; Doblas-Reyes et al. 2013; Ham et al. 2014). Despite improved performance over these basins, the impact of ocean initialization on the predictive skill over land, even over the areas adjacent to the North Atlantic and Pacific oceans, is very limited (Goddard et al. 2012).

A major difficulty in climate prediction is to find the most appropriate method to initialize the model from ocean observations or from their estimation via reanalyses. Because the climate simulated by coupled models is imperfect, significant drifts occur throughout the forecast, which may alter the predictive skill. There are two classical initialization strategies: “full field initialization”, in which the raw ocean reanalysis is used as initial conditions for the coupled forecast model (Mochizuki et al. 2010; Garcia-Serrano and Doblas-Reyes 2012), and “anomaly initialization” (Schneider et al. 1999), in which anomalies from the reanalysis are first computed and then added to the model climatology (Smith et al. 2007; Keenlyside et al. 2008; Pohlmann et al. 2009; Smith et al. 2010) to produce the ocean initial conditions. The latter is viewed as a technique to minimize the strong model drift that occurs when the model is initialized close to observations in full field. In both cases, however, the model biases must be removed a posteriori in order to estimate the forecast skill. Some studies have compared the two methods using the same forecast system and concluded that both lead to a similar level of predictive skill (Smith et al. 2013; Magnusson et al. 2012; Hazeleger et al. 2013b). Hence, no consensus has been reached so far on best practice in model initialization. Beyond full-field versus anomaly strategies, a choice also lies between three-dimensional and surface-only initialization, as adopted by some groups (Swingedouw et al. 2012).

Beyond statistical predictability issues, the dynamical study of model drift and the associated bias adjustment is also crucial since, as pointed out by Meehl et al. (2014) and Hawkins et al. (2014), the rate and the spatial pattern of the bias development can provide useful information on the physical processes connected to the model systematic errors that potentially affect the skill scores. This can give clues for understanding the model behavior and provide guidance for model improvements. The systematic analysis of bias adjustment in hindcasts has appeared only recently, in a few studies: Vanniere et al. (2013) tracked back the origin of cold biases in the equatorial cold tongue of the Pacific in several seasonal forecast systems; Huang et al. (2015) examined the drift mechanism leading to a weakening of the Atlantic meridional overturning circulation (AMOC) in the CFSv2 decadal prediction system; Voldoire et al. (2014) analyzed the role of atmospheric systematic errors in initiating seasonal SST biases in the Tropical Atlantic in the CNRM-CM5 model; and Tonniazzo and Woolnough (2013) also studied the development of Tropical Atlantic errors, but based on multi-model decadal predictions from CMIP5. Lately, Hawkins et al. (2014) investigated the importance of the methodology used for removing model bias estimates for global temperature in decadal hindcasts, using a toy model and CMIP5 experiments.

In this study, we use the CNRM-CM5 coupled model (Voldoire et al. 2013) and perform a descriptive analysis of the model drift dynamics in full field initialized decadal hindcasts performed within CMIP5. We will pay attention to the physics of the model adjustment at both short leadtimes (from 1 season to 2–3 years) commonly referred to as initial shock and longer timescale, for which processes are intrinsically different. The goal is to understand the mechanisms leading to the model systematic errors or biases defined in this paper as the difference between non-initialized simulations of CNRM-CM5 (hereafter referred to as historical) and observational estimates. We will start with the analysis of the global heat balance and associated meridional heat transport simulated in the decadal hindcasts as a function of leadtime. Then we will focus on two specific regions, namely the tropical Pacific and the North Atlantic oceans, for which a detailed investigation of the relationship between the drift and some modes of variability such as El Niño Southern Oscillation (ENSO) and the North Atlantic Oscillation (NAO) is documented. The latter question is of particular relevance since one precisely wants to predict those modes. Lessons might be drawn in light of our results for the implementation and use of drift correction schemes that are mandatory to apply in any forecast system. Our study ultimately contributes to the ongoing research effort to reduce the model errors or, in other words, to minimize their drifts when initialized.

This paper is organized as follows: The CNRM-CM5 system is briefly described in Sect. 2, together with the initialization methodology adopted as a first attempt by the CNRM-CERFACS group to produce the initial conditions for CMIP5 decadal forecasts. Section 3 is devoted to the analysis of the model drift in terms of global heat balance. The physical mechanisms involved in the model adjustment in the Equatorial Pacific are presented in Sect. 4. In Sect. 5, we investigate the processes involving model drift in the North Atlantic. Finally, the summary and conclusions can be found in Sect. 6.

2 Description of model, methodology and experiments

2.1 The CNRM-CM5 coupled model

The CNRM-CM5 coupled model (Voldoire et al. 2013) has been developed jointly by the CNRM and the CERFACS institutes. The atmospheric part is the version 5.2 of the ARPEGE-Climat global spectral model (Déqué et al. 1994). This code is derived from the ARPEGE/IFS numerical weather prediction model developed by Météo-France and the European Center for Medium Range Forecast (ECMWF). ARPEGE-Climat v5.2 operates on a T127 triangular truncation that corresponds to a resolution of about 1.4° in both latitude and longitude. CNRM-CM5 is run in low-top configuration with 31 vertical levels (26 in the troposphere): the highest level is set at 10 hPa and there are 6 layers below 850 hPa except in regions of high orography. The surface component embedded in ARPEGE-Climat is SURFEX, which includes three schemes that represent the surfaces of natural land, inland water (lakes) and sea/ocean areas. The natural land surface scheme is based on the “Interaction between Soil Biosphere and Atmosphere” (ISBA) model (Noilhan and Planton 1989; Noilhan and Mahfouf 1996). The total runoff (surface runoff + deep drainage) simulated by SURFEX feeds the Total Runoff Integrating Pathways (TRIP, Oki and Sud 1998) river routing model used to convert the latter into river discharge on a daily basis. ARPEGE-Climat and SURFEX run in a generalized implicit coupling sharing the same time-step (30 min).

The ocean component of CNRM-CM5 is based on the “Nucleus for European Modelling of the Ocean” (NEMO, version v3.2) model, a numerical framework developed by several European institutions (CNRS, Mercator-Ocean, UK Met Office and NERC-NOCS). An extensive description of the ocean model can be found in Madec (2008). The global ocean configuration used in CNRM-CM5 is known as ORCA1 (Hewitt et al. 2011) and is characterized by a tripolar grid of 1° × 1° on average, with equatorial refinement up to 1/3 of a degree. In the vertical, 42 levels are used, and the model time-step is 1 h 30 min. The sea ice component is the GELATO5 model, which is embedded in NEMO and shares the same grid (Salas y Mélia 2002). The coupling among all the components (ARPEGE/SURFEX, NEMO/GELATO, TRIP) is carried out through the OASIS3 coupler (Valcke 2013) on a daily basis.

2.2 Methodology for initialization of the CNRM-CERFACS decadal system

Initial conditions for the CMIP5 decadal hindcasts produced with CNRM-CM5 are obtained from a preliminary simulation (hereafter referred to as NUD4IC) over 1958–2008, where the ocean component is nudged towards the NEMOVAR-COMBINE (Balmaseda et al. 2010) ocean reanalysis, while the other components (atmosphere, sea-ice, continents) are freely coupled. NEMOVAR-COMBINE (NEMOVAR for simplicity) reanalysis is based on a 3D-VAR assimilation data system (Weaver et al. 2005; Daget et al. 2009) and assimilates profiles of temperature and salinity from a version of the quality controlled EN3_v2a data set (Wijffels et al. 2008). The choice for NEMOVAR instead of other reanalysis products is motivated by the fact that first, NEMOVAR and CNRM-CM5 share the same ocean model version and grid avoiding spurious effect introduced by interpolation, especially over the vertical dimension, and second, they are integrated with the same physical and dynamical assumptions set in the namelist. Through the use of NUD4IC outputs as initial conditions datasets, our initialization strategy differs from the traditional full-field one, where raw initial conditions from reanalysis datasets are simply used. The rationale for NUD4IC is to try minimizing the initial shock when forecasts begin but also, on a practical side, to get states for components for which there is no available reanalyses such as land and sea-ice (thickness, surface albedo etc.) for initial conditions.

Two different approaches are combined for nudging in the NUD4IC experiment. At the surface, a restoring is applied in terms of heat and fresh water fluxes, by using a flux derivative term as follows:

$$\begin{aligned} Q_{ns} & = Q_{ns}^{o} + \frac{dQ}{dT}\left( {T_{k = 1} - SST_{NEMOVAR} } \right) \\ EMP & = EMP_{o} + \gamma^{ - 1} e_{3t} \frac{{\left( {S_{k = 1} - SSS_{NEMOVAR} } \right)}}{{S_{k = 1} }} \\ \end{aligned}$$
(1)

where Q_ns and Q_ns^o are the net non-solar heat fluxes at the surface (with and without the restoring term, respectively), T_k=1 is the sea surface temperature of the model and SST_NEMOVAR that of the reanalysis, and dQ/dT is a feedback coefficient between flux and temperature set to −40 W m−2 K−1, as diagnosed from Barnier et al. (1995, their Fig. 4). Similarly, EMP and EMP_o are the fresh water budgets at the surface, S_k=1 and SSS_NEMOVAR are the sea surface salinity of the model and of the reanalysis respectively, e_3t is the vertical weighting scale factor, and γ is the feedback parameter, which is set here to −167 mm/day. The dQ/dT coefficient plausibly represents corrections to real physical feedbacks involving the net non-solar flux at the atmosphere, whereas there is little feedback of surface salinity on the atmosphere; hence γ is rather an ad hoc measure to prevent surface salinity drift and also, tentatively, to conserve the density. Flux derivative terms are preferred to Newtonian damping for surface fields because they indirectly account for the prognostic evolution of the mixed layer depth, ensuring greater dynamical and physical coherence throughout the ocean column (see Servonnat et al. 2015 for more details).
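As a rough illustration of Eq. (1), the sketch below evaluates the two restoring terms at a single grid point. The function names and sample values are hypothetical. Only the coefficients −40 W m−2 K−1 and −167 mm/day come from the text, and the salinity coefficient is assumed here to stand for the combined factor γ⁻¹e3t expressed in mm/day. It is not the NEMO implementation.

```python
DQDT = -40.0   # W m-2 K-1, flux/temperature feedback coefficient quoted in the text
DEDS = -167.0  # mm/day, ASSUMED to represent the combined factor gamma^-1 * e3t

def restored_heat_flux(q_ns_o, t_model, sst_ref, dqdt=DQDT):
    """Net non-solar flux after restoring, first line of Eq. (1)."""
    # A model surface warmer than the reanalysis receives less heat (dqdt < 0).
    return q_ns_o + dqdt * (t_model - sst_ref)

def restored_emp(emp_o, s_model, sss_ref, deds=DEDS):
    """Surface fresh water budget after restoring, second line of Eq. (1)."""
    # The salinity departure is normalized by the model surface salinity.
    return emp_o + deds * (s_model - sss_ref) / s_model

# Example: a surface that is 0.5 K too warm loses 20 W m-2 relative to Q_ns^o.
q_ns = restored_heat_flux(q_ns_o=100.0, t_model=20.5, sst_ref=20.0)
```

With these sign conventions, a surface that is too warm or too salty is pulled back towards the reanalysis, while a surface that matches it leaves the fluxes unchanged.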

Below the mixed layer that is not affected by the surface restoring, a 3D Newtonian damping in temperature and salinity (see Madec 2008 for details) is however implemented following the equations:

$$\begin{aligned} \frac{\partial T}{\partial t} & = \cdots - \frac{1}{\beta }\left( {T - T_{NEMOVAR} } \right) \\ \frac{\partial S}{\partial t} & = \cdots - \frac{1}{\beta }\left( {S - S_{NEMOVAR} } \right) \\ \end{aligned}$$
(2)

where T and S are the model temperature and salinity, T_NEMOVAR and S_NEMOVAR are the corresponding NEMOVAR fields, and β is a timescale parameter. To conserve the ocean properties as much as possible and to avoid spurious effects on the ocean currents, which are left free in our case, the values of β must be carefully chosen as a function of depth and location. Here, no damping (1/β = 0) is applied within the mixed layer, which is free to evolve. Below the thermocline and down to 800 m depth, β is set to 10 days; for the deep ocean below, a weak restoring is chosen (β = 360 days). In addition, the nudging is switched off (1/β = 0) along the coastline (within a distance of 300 km from the coast).
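The depth and coastal dependence of the damping rate in Eq. (2) can be sketched as a simple piecewise function. This is a minimal illustration under stated assumptions: the names are hypothetical, and a single depth `z_top_m` is assumed to mark the base of the unrestored upper layer, whereas the text refers to both the mixed layer and the thermocline.

```python
def nudging_rate(depth_m, z_top_m, coast_dist_km):
    """Return the damping rate 1/beta (in day^-1) used in Eq. (2)."""
    # No subsurface nudging near the coast or in the free upper layer.
    if coast_dist_km < 300.0 or depth_m <= z_top_m:
        return 0.0
    # Strong restoring (beta = 10 days) down to 800 m.
    if depth_m <= 800.0:
        return 1.0 / 10.0
    # Weak restoring (beta = 360 days) in the deep ocean.
    return 1.0 / 360.0

def nudging_tendency(value_model, value_ref, rate):
    """Newtonian damping term -(1/beta) * (X - X_NEMOVAR), in units per day."""
    return -rate * (value_model - value_ref)

# Example: a 1 K warm departure at 500 m, far from the coast.
dT_dt = nudging_tendency(11.0, 10.0, nudging_rate(500.0, 100.0, 1000.0))
```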

Several tests have been performed to determine the optimal set of surface/subsurface parameters detailed above, and also the geographical locations where the subsurface damping term is applied. Our reference configuration in the following is the one where the subsurface nudging is only applied outside the 15°S–15°N latitudinal band; it has been retained for initializing the decadal hindcasts of the CNRM-CERFACS group as archived in CMIP5. This experiment is called NOTROP_NUD4IC (hereafter NOTROP_IC for short), indicating that no subsurface nudging is activated in the tropical strip. As documented in the following, the NOTROP_IC configuration is the one for which the initial shock is the most limited. For the sake of comparison only, another configuration named NOEQ_IC is analyzed in Sects. 4 and 5. In NOEQ_IC, the subsurface nudging is applied everywhere except within the 1°S–1°N band. Nudging right at the equator is indeed problematic because it leads to spurious, clearly unrealistic vertical velocities in the ocean. Note that whatever the configuration, the sea surface restoring is performed everywhere, and a 5° buffer zone is used between the non-nudged zone and the rest of the ocean where full nudging is applied.
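The latitudinal masking of the subsurface nudging can be pictured as a weight between 0 (no nudging) and 1 (full nudging). The sketch below is an assumption-laden illustration: the function name is hypothetical, the buffer is assumed to lie poleward of the non-nudged band, and the transition is assumed to be linear, none of which is specified in the text.

```python
import numpy as np

def nudging_weight(lat_deg, band_deg, buffer_deg=5.0):
    """Weight applied to 1/beta: 0 inside the band, 1 beyond the buffer."""
    lat = np.abs(np.asarray(lat_deg, dtype=float))
    # ASSUMED linear ramp across the buffer zone, clipped to [0, 1].
    return np.clip((lat - band_deg) / buffer_deg, 0.0, 1.0)

lats = np.array([0.0, 10.0, 17.5, 30.0])
w_notrop = nudging_weight(lats, band_deg=15.0)  # NOTROP_IC: no nudging in 15S-15N
w_noeq = nudging_weight(lats, band_deg=1.0)     # NOEQ_IC: no nudging in 1S-1N
```

Under these assumptions, the 10°N point is free in NOTROP_IC but fully nudged in NOEQ_IC, which is the difference the two configurations are designed to test.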

Following the CMIP5 experimental design (http://cmip-pcmdi.llnl.gov/cmip5/experiment_design.html), ensembles of 10 members of 10 years each, initialized on 1 January (hereinafter DEC), have been performed for starting dates between 1961 and 2006, at years 1 and 6 of each decade, namely 1961, 1966, 1971,…, 1996, 2001, 2006. To build the decadal ensembles, only the atmosphere is perturbed, by random selection of initial states within the month of January produced in NOTROP_IC for the corresponding starting date. These experiments have been published in the CMIP5 database together with additional starting dates for years 0 and 5 of each decade (1960, 1965, 1970,…, 2000). A similar protocol has been followed for the NOEQ_IC ensemble, which is not published in the archive. Note that for 6 starting dates (1960, 1961, 1980, 1981, 2005, 2006) and for NOTROP_IC only, the 10-year members have been extended up to 30 years. External forcings (GHG concentrations, aerosols, solar irradiance and observed volcanic eruptions) are prescribed in the model and are exactly the same as in the so-called historical experiments (HIST hereinafter), which correspond to the non-initialized runs (Taylor et al. 2012). Ten members are also available for HIST; they have been initialized in 1850 from 10 states randomly selected from a 1000-year long pre-industrial simulation (hereinafter piControl).

In this paper, we investigate the physical and dynamical processes playing a role in the model adjustment towards its own equilibrium state or model attractor. In the following analyses, we make the hypothesis that the model attractor can be estimated by HIST over the same period as the forecasts (1960–2010). Most of the diagnostics are thus presented through differences between the decadal predictions starting from NOTROP_IC (hereafter DEC for simplicity) and HIST. The latter is partitioned into members/leadtimes and starting dates to mimic the DEC dataset and for both, fields used to investigate the model drifts are generally averaged over all the members and all the starting dates produced for the CMIP5 CORE experiments (Taylor et al. 2012).
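The DEC minus HIST diagnostic described above can be sketched as follows, assuming annual-mean values stored in NumPy arrays. The array layouts and function names are hypothetical and chosen only for illustration: HIST is cut into 10-year segments aligned with the hindcast starting dates, then both datasets are averaged over members and starting dates.

```python
import numpy as np

def sample_like_hindcasts(hist, hist_first_year, start_years, n_lead=10):
    """Cut HIST (member, year) into segments shaped (start, member, lead)."""
    segments = []
    for year in start_years:
        first = year - hist_first_year
        segments.append(hist[:, np.arange(first, first + n_lead)])
    return np.stack(segments, axis=0)

def mean_drift(dec, hist_sampled):
    """DEC minus HIST as a function of leadtime, averaged over starts and members."""
    return dec.mean(axis=(0, 1)) - hist_sampled.mean(axis=(0, 1))

# Toy data: 3 HIST members whose value equals the calendar year.
hist = np.tile(np.arange(1960, 2020, dtype=float), (3, 1))
hist_sampled = sample_like_hindcasts(hist, 1960, [1961, 1966])
dec = hist_sampled + 1.0              # hindcasts uniformly offset by one unit
drift = mean_drift(dec, hist_sampled)  # one value per leadtime
```

A drift close to zero at a given leadtime then indicates, under the stated hypothesis, that the hindcasts have reached the model attractor.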

3 Model heat balance drift as a function of leadtime

The model drift is first investigated in terms of meridional heat transports computed on annual basis as a function of the forecast lead-time. The goal is to characterize the redistribution of energy between and within each component after initialization. Transports are estimated following the methodology described in Trenberth and Caron (2001) and Trenberth and Fasullo (2007), who derive all the components of the meridional heat transport directly from atmospheric energy budgets. The poleward heat transport by the atmosphere (TA) is thus estimated as:

$${\text{T}}_{\text{A}} = {\text{RT}} - {\text{SURF}}$$
(3)

where RT is the net heat budget at the top of the atmosphere (net solar minus net outgoing longwave radiation) and SURF is the net heat budget at the surface (net solar + net longwave + turbulent heat fluxes); both terms are integrated along latitude circles. The meridional heat transport by the ocean is estimated as TO = SURFsea, accounting for ocean points only. For the combined ocean and atmosphere system, the total heat transport is given by TA+O ~ RT since, in Eq. (3), SURFland is negligible and thus SURF ~ SURFsea. As pointed out by Magnusdottir and Saravanan (1999), the global average flux of the system must be set to zero within this framework. To do so, we have subtracted the global average fluxes derived from the HIST experiments from TA, TO and TA+O.
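For illustration, the implied northward transport can be obtained by integrating a zonal-mean net downward flux from the South Pole, after removing its global mean. The sketch below is a simplified stand-in for the procedure of Trenberth and Caron (2001), not a reproduction of it: the function names and the Earth radius value are assumptions, and the global mean removed by default is that of the input field itself, whereas the text uses the HIST global average.

```python
import numpy as np

EARTH_RADIUS = 6.371e6  # m, assumed value

def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def implied_transport(lat_deg, zonal_mean_flux, global_mean=None):
    """Northward transport (W) implied by a zonal-mean downward flux (W m-2).

    lat_deg must run from the South Pole to the North Pole.
    """
    phi = np.deg2rad(np.asarray(lat_deg, dtype=float))
    flux = np.asarray(zonal_mean_flux, dtype=float)
    weight = np.cos(phi)
    if global_mean is None:
        # Area-weighted global mean, removed so that the transport closes at the poles.
        global_mean = _trapezoid(flux * weight, phi) / _trapezoid(weight, phi)
    integrand = 2.0 * np.pi * EARTH_RADIUS**2 * (flux - global_mean) * weight
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(phi)
    # Cumulative integral from the South Pole, starting at zero.
    return np.concatenate(([0.0], np.cumsum(steps)))

# Idealized flux: heating in the tropics, cooling at high latitudes.
lat = np.linspace(-90.0, 90.0, 181)
flux = 1.0 - 3.0 * np.sin(np.deg2rad(lat)) ** 2
transport = implied_transport(lat, flux)
```

Applied to RT and to SURFsea, such a function would give estimates of TA+O and TO, and TA would follow as their difference in the sense of Eq. (3).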

Figure 1 compares the values of meridional heat transports for HIST and NOTROP_IC averaged over the same 1960–2000 temporal window. TA+O, TA and TO values are displayed first, together with the estimates of TO for each ocean basin taken separately, namely Atlantic, Pacific and Indian Oceans. The TA+O quantities are very similar for NOTROP_IC and HIST (Fig. 1a) as expected even in presence of mean biases in the latter. Those biases can be interpreted as a different partition of the heat transport between the two climate fluids leading a new equilibrium, whilst the total energy transport is conserved. Considering the atmosphere only (Fig. 1b), TA is weaker in HIST in the Southern Hemisphere, but slightly stronger in the northern tropical band (up to 25°N) and at the equator. Changes in TA are compensated by TO that is overestimated in HIST with respect to NOTROP_IC from 60°S to the equator but underestimated to the North (Fig. 1c). The differences in TA can be explained in the south by strong warm biases in austral SST in HIST (Voldoire et al. 2013, their Fig. 2) that are associated with a significant decrease of mid-latitude cyclones activity and with a storm-track shift, that mostly controls the atmospheric heat transport at mid-latitudes. In the Northern Hemisphere, the Hadley cell is too strong in HIST (not shown) compensated by a slackening of the AMOC as documented later on. At the equator, changes in TA are linked to the presence of the so-called double Inter-Tropical Convergence Zone (ITCZ) that is reinforced in HIST with respect to NOTROP_IC. Recall that NUD4IC experiments can be treated as AMIP-type simulations for the atmosphere, since the SST is constrained by the strong surface restoring. Compensation and redistribution can also occur between oceanic basins as displayed in Fig. 1d–f, showing the contribution of each ocean basin to the differences found between HIST and NOTROP_IC in the total TO. 
Enhanced meridional ocean heat transport in the Southern Hemisphere in HIST is explained by changes in the Indian Ocean sector that dominate the opposite signals in the Atlantic and Pacific. In the Northern Hemisphere, weak slackened heat transport in HIST hides a strong compensation between the Atlantic, where values are underestimated by about 30–50 %, and the Pacific where TO northward transport is particularly reinforced whatever the latitudes (Fig. 1d, e). Note that for validation of the ocean transport estimates based on surface and top of the atmosphere fluxes, the computational value issued directly from the ocean model is also provided in Fig. 1c (green curve). Agreement is very good, except in the 60°S–30°S latitudinal band where discrepancies occur as also pointed out in Trenberth and Caron (2001) suggesting that the protocol there has some limitations.

Fig. 1

(Top) meridional heat transport for HIST (black) and NOTROP_IC (red) experiments as a function of the latitude, derived from surface heat fluxes (Trenberth and Caron 2001; Trenberth and Fasullo 2007). a Total heat transport (TA+O), b atmosphere heat transport (TA) and c global ocean heat transport (TO). Meridional heat transport value estimated from direct model outputs from NEMO for HIST is superimposed in green. (Bottom) meridional ocean heat transports contributions for each basin: d Atlantic, e Pacific and f Indian oceans. HIST is represented here by one member among the historical ensemble (see Voldoire et al. 2013)

Figure 2 represents the DEC–HIST differences in terms of annual means of meridional heat transports as a function of the lead-time (Yr1–Yr10). The NOTROP_IC–HIST are also shown and referenced as Yr0; the goal here is to highlight how the model drifts towards its own equilibrium states in terms of energy balance after initialisation. By construction, values equal to 0 mean that the model is close to its attractor. Two important features can be highlighted:

Fig. 2

(Top) lead time (in years) versus latitude Hovmöller diagrams for DEC-HIST differences in meridional heat transport for a total heat transport (TA+O), b atmosphere heat transport (TA) and c global ocean heat transport (TO). (Bottom) same but for meridional ocean heat transports partitioned into oceanic basins: d Atlantic, e Pacific and f Indian. Grey shading stands for significance at the 95 % level of confidence based on a Welch t test with a degree of freedom equal to the number of members, namely 10 per forecast. Yr0 represents the NOTROP_IC–HIST differences. Contour interval is every 0.1 Petawatt

  1. Starting from weaker (stronger) TA+O heat transport in the NOTROP_IC initial conditions in the Northern Hemisphere (Southern Hemisphere between the equator and 30°S), as revealed by strong negative values, the anomalies decay, which suggests a progressive strengthening (weakening) of TA+O in DEC from lead time Yr1 to approximately Yr5, until DEC reaches the values of HIST (Fig. 2a). Interestingly, an overshoot occurs afterwards, as the DEC meridional heat transport becomes stronger than in HIST in the Northern Hemisphere at the end of the forecast (Yr10). Partitioning TA+O into its ocean and atmosphere components (Fig. 2b, c) shows that the drift in TA+O in the Northern Hemisphere over the forecast period is mainly driven by the adjustment of the ocean, while the role of the atmosphere is smaller. In the Southern Hemisphere, by contrast, both ocean and atmosphere play a role: weaker transport in the atmosphere in the initial state (Fig. 2b) is compensated by the ocean (Fig. 2c), as also diagnosed in Fig. 1. This counteracting balance continues throughout the forecast period while the values slowly decay. Treating the ocean basins separately (Fig. 2d–f) shows that in the Southern Hemisphere, the signals come mainly from the slow adjustment in the Indian Ocean and, to a lesser extent, in the Pacific basin. In the Northern Hemisphere, the Pacific plays the greatest role in the global heat transport adjustment of the modelled system over the first 5 years of the forecast. The Atlantic Ocean also contributes, but to a smaller extent, to the strengthening of TO from 40°N northwards at those leadtimes.

  2. The TO in the Atlantic Ocean (Fig. 2d) is responsible for the overshoot of DEC compared to HIST in TA+O after Yr5. Indeed, from Yr5 to Yr10 leadtime, the ocean meridional heat transport becomes significantly stronger in DEC than in HIST within the 0–40°N latitudinal band. The case of the Atlantic Ocean deserves particular attention. There is a fast adjustment in the tropical areas at Yr1 between 30°S and 30°N, since TO values in the Atlantic very quickly reach values close to 0 (i.e. the HIST mean state) although the departures between NOTROP_IC and HIST were very large. The mechanisms associated with this fast adjustment are described in Voldoire et al. (2013). Different processes are then active at longer leadtimes and will be thoroughly detailed in Sect. 5.

In the following, we investigate two specific physical mechanisms that explain the main part of the drifting heat transport described above. We will focus in Sect. 4 on the tropical adjustment at work in the Pacific Ocean at relatively short timescales (from Yr1 to Yr4), then in Sect. 5 on the mid-latitude mechanisms occurring in the North Atlantic region over the full forecast range.

4 Model drift in the Pacific: role of ENSO

Zonal mean differences of the whole Pacific Ocean temperature as a function of depth are first computed for NOTROP_IC–HIST experiments (Fig. 3a). The hindcast initial conditions are characterized by an excess of heat with respect to HIST in a broad tropical band between 30°S–10°N and down to 300–400 m depth; this is indicative for shallower thermocline of the model attractor with respect to the observations. The warmer ocean subsurface is straddled by cooler temperature in HIST around 45°S and 15°N with maximum loading between 100 and 500 m. South of 45°S, NOTROP_IC temperatures are considerably colder than HIST (differences ranging between 0.5 and 2.5 °C) from surface to bottom of the ocean. Relatively homogeneous differences there over the whole water column are indicative of poor representation in the CNRM-CM5 model of the Austral deep-water formation that feeds the deepest ocean. The Northern Pacific basin is characterized by a vertical dipole with warmer conditions in the mixed layer down to 150 m in NOTROP_IC versus HIST and colder conditions below down to about 1500 m.

Fig. 3

(Top) a NOTROP_IC–HIST and b NOEQ_IC–HIST zonal mean differences for temperature annual means as a function of depth in the Pacific Ocean. Contour interval is every 0.5 °C. HIST is represented here by one member among the historical ensemble (see Voldoire et al. 2013). (Bottom) lead-time (from OND Yr0 to OND Yr4) versus longitude plots for c DEC–HIST and d DEC_NOEQ–HIST seasonal means differences of the 20 °C isotherm depth (filled colors) and 10-m winds (arrows) averaged over 2°S–2°N. Yr0 OND represents the c NOTROP_IC–HIST and d NOEQ_IC–HIST differences that are present one season before the forecast starts. Contour interval is every 2 m and arrow units are given in the upper-right corner of the panel in m s−1

The evolution in DEC of the equatorial Pacific heat content from NOTROP_IC is investigated in Fig. 3c as a function of leadtime, based on a time versus longitude Hovmöller diagram of the DEC–HIST differences for 10-m wind and for 20 °C isotherm depth averaged between 2°S and 2°N. Considering the importance of the annual cycle in the equatorial Pacific, seasonal means (JFM, AMJ, JAS, OND) are preferred to annual means in the following. As in Fig. 2, NOTROP_IC–HIST differences in OND of Yr0 are also included in the graph for the ocean field. Figure 3c shows that, at the beginning of the forecast (OND Yr0 and JFM Yr1) and consistent with Fig. 3a, the thermocline is considerably deeper (by around 40 m) in DEC, especially in the western and central parts of the basin. Westerly wind anomalies develop concurrently west of the dateline at the beginning of the forecast, from AMJ Yr1, and persist up to the following fall. They maintain the initially deepened thermocline and simultaneously trigger equatorial downwelling Kelvin waves that cross the basin in about 3 months. A first wave reaches the eastern basin in AMJ, as shown by a deeper thermocline compared to the previous JFM Yr1 and OND Yr0. A second Kelvin wave of lower amplitude appears in JAS Yr1, with maximum amplitude in the east in late OND Yr1 and JFM Yr2. The latter is explained by the prevalence of westerly wind anomalies west of 200°E. This indeed leads to positive SST anomalies in the eastern Equatorial Pacific, reminiscent of the formation of an El Niño event following the traditional Bjerknes feedback mechanism (Bjerknes 1969; Wyrtki 1975). Discharge occurs in the western Pacific in late Yr1 and during Yr2, while the anomalous westerlies disappear and a weak La Niña tends to emerge in the central Pacific. After one ENSO cycle, the model has reached the HIST state, i.e. the model intrinsic equilibrium.

This is further illustrated in Fig. 4. The probability density function (PDF) of the leading principal component obtained by Empirical Orthogonal Function (EOF) analysis of the SST field in the Pacific basin over 20°S–20°N from HIST is displayed for seasonal means. The first EOF modes correspond to the ENSO mode of variability (not shown) for all the seasons and explain 55, 33, 54 and 65 % of the total variance in HIST for JFM, AMJ, JAS and OND respectively. We verified that the patterns obtained from HIST are essentially the same as those computed from piControl, where external forcings are kept constant at 1850 levels and from which the internal modes of variability of CNRM-CM5 can thus be assessed in a strict sense. Note that the PDF values obtained from HIST are highly consistent with those from the NEMOVAR observational estimates. All the members for all the starting dates of the DEC hindcasts are then projected onto this mode, and Fig. 4 represents in red the seasonal PDF of the standardized projection values, treated as a principal component, as a function of lead time. This PDF is compared to the climatological distribution corresponding to the model’s attractor estimated from HIST. For simplicity, a positive (negative) projection stands here for an El Niño (La Niña) event, without any additional criterion of SST threshold exceedance. Figure 4 shows that the histogram shifts progressively towards El Niño states (positive values) as a function of leadtime from AMJ in Yr1 to AMJ in Yr2. From JAS of Yr2, the histograms shift towards La Niña states (negative values). At the end of Yr3, the DEC hindcast statistics are indistinguishable from HIST. Figure 4 clearly highlights that the excess of heat with respect to the model’s own equilibrium, which has been inserted at the subsurface through initialization (Fig. 3a), is released during the first year of the prediction through the excitation of El Niño events for 80 % of the 10 members × 10 starting dates of the forecast dataset at JAS Yr1. In the following winter (OND Yr2), like a boomerang, 62 % of the forecasts simulate a cold ENSO event in response to the leadtime Yr1 El Niño, following the so-called recharge-discharge paradigm.

Fig. 4

Seasonal probability density function (PDF) as a function of leadtime (from JFM Yr1 to OND Yr3) obtained from the projection values of the DEC spatial SST anomalies onto the leading EOF of SST computed from HIST over the 20°S–20°N latitudinal strip in the Pacific Ocean. All the forecasts (10 members × 10 starting dates) initialized either from NOTROP_IC (red histograms) or NOEQ_IC (purple histograms), and their HIST and NEMOVAR counterparts (black and cyan lines respectively), have been used to get the PDF. DEC anomalies used for projection have been computed using HIST as climatology and all the data have been standardized (hence the horizontal axes are unitless)
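The projection step behind Fig. 4 can be sketched as follows. This is a minimal illustration, assuming flattened SST anomaly maps, cos(latitude) area weights and a standardization by a reference (HIST) mean and standard deviation; the function names, the three-point grid and all numbers are hypothetical and are not taken from the CNRM-CM5 processing chain.

```python
import math

def project_onto_eof(anomaly, eof, weights):
    """Area-weighted least-squares projection of one anomaly map onto a fixed EOF."""
    num = sum(w * a * e for a, e, w in zip(anomaly, eof, weights))
    den = sum(w * e * e for e, w in zip(eof, weights))
    return num / den

def standardize(values, ref_mean, ref_std):
    """Express projection values in units of the reference (e.g. HIST) spread."""
    return [(v - ref_mean) / ref_std for v in values]

def fraction_positive(values):
    """Share of forecasts with a positive projection, read here as El Nino-like."""
    return sum(1 for v in values if v > 0.0) / len(values)

# Toy grid: three points at 10S, 0 and 10N (hypothetical values throughout).
lats = [-10.0, 0.0, 10.0]
weights = [math.cos(math.radians(lat)) for lat in lats]
eof = [1.0, 2.0, 1.0]            # stand-in for the leading HIST EOF
forecast_maps = [
    [0.5, 1.0, 0.5],             # El Nino-like anomaly
    [-1.0, -2.0, -1.0],          # La Nina-like anomaly
    [2.0, 4.0, 2.0],
    [0.1, 0.2, 0.1],
]
pcs = [project_onto_eof(m, eof, weights) for m in forecast_maps]
pcs_std = standardize(pcs, ref_mean=0.0, ref_std=1.2)  # assumed HIST statistics
print(fraction_positive(pcs_std))
```

Applied to every member and starting date at a given seasonal leadtime, the fraction of positive values is the kind of statistic quoted in the text (for instance the share of forecasts in an El Niño state at JAS Yr1).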

Note that such a shock has been minimized in the DEC configuration because of the absence of subsurface nudging in the tropical band as presented above. To further illustrate this statement, the same analysis is carried out for the DEC_NOEQ experiment initialized from NOEQ_IC. Figure 3b shows that the excess of heat stored in the subsurface in NOEQ_IC is greater than for NOTROP_IC (Fig. 3a), where the model thermocline and surface winds are closer to a balanced state, thus potentially reducing the shock when the model is set free in forecast mode. Accordingly, the amplitude of the downwelling Kelvin wave triggered in the first year of the forecast is dramatically reinforced in DEC_NOEQ, and the system bounces back and forth between El Niño and La Niña events during the first 4 years of the forecasts (Fig. 3d) instead of 2 years for DEC (Fig. 3c). Reversed trade wind anomalies occur accordingly in the western and central parts of the Pacific from AMJ Yr2 to OND Yr2. Figure 4 confirms that the model releases the additional excess of heat in its initial conditions in DEC_NOEQ (in blue) by creating an artificial alternation of strong El Niño/La Niña events, leading to an oscillatory drift of the model towards its own equilibrium state. In DEC_NOEQ and for JAS Yr1, ~98 % of the forecasts simulate an El Niño event, and ~70 % simulate one with an amplitude greater than 1 standard deviation. Conversely, 80 % of the forecasts produce a La Niña state at OND Yr2 leadtime. The reduction of the shock in DEC is evident and has dictated our choice to retain the DEC configuration instead of DEC_NOEQ for the coordinated experiments archived through CMIP5 and used in the 5th IPCC report.

Beyond the tropical Pacific per se, the spurious ENSO dynamics triggered at the earliest leadtimes of the forecast have a significant worldwide influence and remotely lead to regional drifts. Figure 5 displays precipitation together with geopotential at 500 hPa (Z500) differences between DEC and HIST (Fig. 5a) and between DEC_NOEQ and HIST (Fig. 5b) at leadtime JFM Yr2, which corresponds to the peak of the El Niño episodes over the drifting period (see Fig. 4). The pattern of precipitation in both cases is consistent with the SST warming over the eastern equatorial Pacific and the equatorward shift of the ITCZ. For DEC_NOEQ, precipitation differences are stronger by a factor of 2 than in DEC. The associated Z500 spatial structure is reminiscent of the positive phase of the Pacific-North-American pattern (PNA, Wallace and Gutzler 1981; Barnston and Livezey 1987), characterized by negative Z500 anomalies located south of the Aleutian Islands and over the southeastern United States, concomitant with positive values over most of Canada extending eastward towards Europe. The spatial correlation between the model PNA internal mode of variability, determined as the first EOF of the Z500 anomalies from HIST, and the atmospheric pattern identified in Fig. 5 is equal to 0.34 for DEC and climbs to 0.56 for DEC_NOEQ. Note that in Fig. 5b for DEC_NOEQ, the center of positive Z500 anomalies is highly significant and spreads eastwards into the North Atlantic sector, leading to the prevalence of negative phases of the North Atlantic Oscillation (NAO, see Hurrell 2003 for a review). The spatial correlation between the anomalous pattern in Fig. 5 and the NAO mode, determined as the first EOF of Z500 anomalies from piControl over the North Atlantic sector, is 0.63 and 0.57 respectively.
Through teleconnection, the effects of the spurious ENSO at the early stage of the forecast are thus shown to spread globally; it therefore appears crucial to minimize the Pacific shock, which can control part of the drift in very remote places. This is all the more important because teleconnections are driven by anomalous diabatic sources, and therefore by nonlinear processes, which makes the a posteriori correction of the drifts difficult and non-trivial in the traditional linear sense.
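For readers who want to reproduce this kind of pattern comparison, a spatial correlation between a drift map and a reference mode can be sketched as below. This is only one common convention (centered, area-weighted); whether the values quoted above were computed with centering or with cos(latitude) weights is not stated here, so both choices are assumptions, as are the function name and the toy fields.

```python
import math

def pattern_correlation(field_a, field_b, weights):
    """Centered, area-weighted spatial correlation between two flattened maps."""
    wsum = sum(weights)
    mean_a = sum(w * a for a, w in zip(field_a, weights)) / wsum
    mean_b = sum(w * b for b, w in zip(field_b, weights)) / wsum
    cov = sum(w * (a - mean_a) * (b - mean_b)
              for a, b, w in zip(field_a, field_b, weights))
    var_a = sum(w * (a - mean_a) ** 2 for a, w in zip(field_a, weights))
    var_b = sum(w * (b - mean_b) ** 2 for b, w in zip(field_b, weights))
    return cov / math.sqrt(var_a * var_b)

# Toy example on four grid points (hypothetical Z500 values in metres).
lats = [30.0, 45.0, 60.0, 75.0]
weights = [math.cos(math.radians(lat)) for lat in lats]
drift_map = [-20.0, 10.0, 30.0, -5.0]   # e.g. a DEC minus HIST difference
mode_map = [-12.0, 2.0, 15.0, 4.0]      # e.g. a leading EOF pattern
print(round(pattern_correlation(drift_map, mode_map, weights), 2))
```

The correlation is insensitive to the amplitude of either map, which is why it complements, rather than replaces, statements about the magnitude of the differences.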

Fig. 5

Z500 (contours) and precipitation (shading) differences between hindcasts initialised from a NOTROP_IC and b from NOEQ_IC respectively, and HIST at leadtime JFM Yr2. Gray hatching stands for Z500 significance at 95 % determined by a Welch t test. Contour and shading intervals are 10 m and 0.5 mm/day for Z500 and rainfall, respectively. Negative contours are dashed and the zero line is thicker
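The Welch t test mentioned in the caption compares two ensembles without assuming equal variances. A minimal sketch of the statistic and of the Welch–Satterthwaite degrees of freedom at a single grid point is given below; the sample values are hypothetical, and the final comparison with a Student t critical value (needed to decide significance at 95 %) is deliberately left out because it requires a t-distribution quantile function.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na   # squared standard error of mean A
    vb = variance(sample_b) / nb   # squared standard error of mean B
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t_stat, dof

# Hypothetical Z500 values (m) at one grid point for two ensembles.
hindcasts = [5562.0, 5570.0, 5581.0, 5575.0, 5568.0]
reference = [5550.0, 5561.0, 5548.0, 5557.0, 5553.0, 5559.0]
t_stat, dof = welch_t(hindcasts, reference)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}")
```

Repeating the test at every grid point gives the mask that a hatching like the one in Fig. 5 would display.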

The excitation of the ENSO mode, in particular the strong El Niño during the first 2 years of the forecast, is connected to the enhanced meridional heat transport in the Pacific basin described in Sect. 3 for leadtime Yr1–Yr4. As shown by Sun and Trenberth (1998) and Sun (2000) for strong El Niño events, TA+O is enhanced with the contribution of the ocean being more important than the one from the atmosphere. In CNRM-CM5, the El Niño excitation appears to be a very efficient and powerful way for the model to remove the excess of heat, with respect to its own biased mean state, which has been inserted through initialization in the tropical Pacific.

5 Model drift in the North Atlantic

To understand the model drift over the North Atlantic region, we first investigate the differences between the initial states of the DEC hindcasts (NOTROP_IC experiment) and the model climatological attractor assessed by HIST. The barotropic stream function (BSF hereinafter) and the North Atlantic meridional stream function used as a metric for the AMOC are shown in Fig. 6. Discrepancies between NOTROP_IC and HIST are maximum for all the western boundary systems namely the North Brazilian current and the Gulf Stream, that are both responsible for mean transport of warm and salty tropical water masses towards the North Atlantic basin (Fig. 6a). Both are considerably stronger by about 10 Sv in NOTROP_IC compared to HIST and the Gulf Stream is located northward; such a shift is revealed by the presence of a latitudinal dipole off Florida. The subpolar gyre (SPG hereinafter) strength is also higher in NOTROP_IC than HIST (i.e. negative values in Fig. 6a) by more than 10 Sv in the Labrador Sea and South of Greenland, where maximum isopycnal doming occurs. It is however lower in NOTROP_IC versus HIST by about −5 Sv (positive values) on its easternmost core, at the longitude of Iceland and between 50 and 65°N. This is related to the eastward expansion of the SPG rim in HIST leading to a northeastward displacement of the North Atlantic Current (NAC) around the Rockall Bank. When partitioned into eastern and western branches following Mercier et al. (2015, their Fig. 1), the former is clearly reinforced in HIST.

Fig. 6

a NOTROP_IC–HIST barotropic stream function (BSF) differences. Color shading interval is every 1 Sv. The black contours correspond to the BSF long-term annual mean for HIST, contour is every 5 Sv. Negative contours are dashed and the zero line is thicker. The red and green contours over the SPG define the two spatial domains used for west SPG and east SPG indices respectively. b NOTROP_IC–HIST differences for the Atlantic Meridional Overturning Stream Function (AMOC) as a function of depth. Color shading interval is every 1 Sv. The contours correspond to the AMOC long-term annual mean for HIST (black), NOTROP_IC (blue) and NEMOVAR reanalysis (cyan). Contour interval is every 2 Sv starting at +10 Sv

The AMOC is more intense in NOTROP_IC than in HIST by ~6 Sv (Fig. 6b). A strong reduction occurs in the lower limb of the overturning cell in HIST (below 1000 m), and the AMOC maximum core is shallower (~750 m versus ~1500 m in NOTROP_IC) and shifted southward by about 10°–15°. Marginal strengthening concurrently occurs in the upper branch from the tropics to ~30°N, which is altogether suggestive of a vertical squeezing and southward retreat of the AMOC in HIST. Note that our nudging technique does not introduce a dramatic perturbation in the initial conditions for AMOC. The maximum core is found at the same location (~1500 m at ~40°N) for both NEMOVAR and NOTROP_IC, and the AMOC in the latter is only about 2 Sv weaker (13.5 vs. 15.5 Sv).

In summary, both the horizontal and vertical ocean circulations of the model attractor (HIST) are considerably weaker than in the initial states used for prediction, except over the eastern flank of the SPG. It is worth mentioning that conclusions from NOEQ_IC instead of NOTROP_IC are very similar (not shown), as expected by construction since identical nudging is applied in both configurations north of 15°N. The reader is invited to refer to Ruprich-Robert and Cassou (2014) and Voldoire et al. (2013) for a complete description of the CNRM-CM5 biases in the Atlantic.

Focus is now specifically laid on the model temporal drift of the SPG going from NOTROP_IC towards HIST. The SPG is crucial for the entire North Atlantic dynamics, and consequently for the drift over the full basin, because of the presence of deep convection and dense water formation, among other factors. Following the recommendations of Ruprich-Robert and Cassou (2014) and Barrier et al. (2013), the western and eastern branches of the SPG, separated by the Reykjanes Ridge and defined by the −10 Sv mean value in piControl, are treated separately (see Fig. 6a for the partition of the domains). Figure 7 shows a spaghetti plot of BSF evolution as a function of leadtime for DEC hindcasts together with their HIST counterparts; NEMOVAR and NOTROP_IC time series are superimposed. Note that in west SPG, the gyre circulation in the latter is stronger by approximately 2 Sv. By contrast, over East SPG, the NOTROP_IC and NEMOVAR values are very similar. The difference in behavior between the western and eastern gyres is partly explained by the timescale of the restoring at the surface, which depends on the mixed layer depth because dQ/dT is spatially constant. It can also be due to the fact that density may not be perfectly conserved when temperature and salinity are restored separately.
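The link between restoring timescale and mixed layer depth follows from the standard relation for flux-derivative restoring, tau = rho0 · cp · h / |dQ/dT|. The sketch below evaluates it for two depths; the coefficient of −40 W m−2 K−1, the two depths and the seawater constants are illustrative assumptions and are not values quoted in this section.

```python
RHO0 = 1025.0             # reference seawater density (kg m-3), typical value
CP = 3990.0               # seawater specific heat (J kg-1 K-1), typical value
SECONDS_PER_DAY = 86400.0

def restoring_timescale_days(mixed_layer_depth_m, dq_dt=-40.0):
    """E-folding time (days) of SST restoring over a mixed layer of given depth.

    dq_dt is the flux derivative in W m-2 K-1; the default is an assumed value.
    """
    return RHO0 * CP * mixed_layer_depth_m / abs(dq_dt) / SECONDS_PER_DAY

shallow = restoring_timescale_days(50.0)     # hypothetical shallow mixed layer
deep = restoring_timescale_days(1000.0)      # hypothetical deep convective layer
print(f"{shallow:.0f} days versus {deep / 365.0:.1f} years")
```

With a spatially constant dQ/dT, the timescale scales linearly with depth: under these assumed numbers the constraint acts within about two months over a 50 m layer but takes roughly three years over a 1000 m layer, so regions with deep winter mixed layers are held less tightly to the reanalysis.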

Fig. 7

Spaghetti plot of the BSF averaged over the a western and b eastern SPG region for all DEC (red) and HIST (gray) members as a function of leadtime; ensemble means (thick red and black lines) are superimposed. Blue and cyan lines stand for NOTROP_IC and NEMOVAR respectively. c, d Same as a, b but for the 1960 and 1980 hindcasts up to 30-year leadtime

As expected from Fig. 6a, the west SPG gyre circulation in DEC is initially strong but rapidly weakens to reach the HIST values after around 4–5 years (Fig. 7a). In the East, although fast weakening occurs at the earliest leadtimes as in the West, the model behavior is very different. The East SPG in DEC is initialized close to HIST values but unexpectedly drifts away from the model attractor with increasing forecast leadtime (Fig. 7b). The fact that the model moves away from its own climatological mean state is somewhat intriguing and counter-intuitive, because the initialization could have been interpreted there as “perfect” or “model best compatible” based on the agreement between NOTROP_IC and HIST. To go further, we use the 1960 and 1980 starting dates, for which the forecasts have been extended to 30 years in the CORE set of the CMIP5 experiments (Fig. 7c, d). Figure 7c confirms that in the West the model equilibrium is indeed reached before 10 years of integration and that the gyre circulation stabilizes around a mean state of −21 ± 2 Sv, as in HIST. In the eastern part, conversely (Fig. 7d), the circulation weakens in DEC until approximately 10–12 years leadtime and then undergoes a progressive recovery, without however reaching the HIST values by the end of the 30-year forecasts. This points to the complexity of the model adjustment, which involves a multitude of mechanisms operating at different timescales and spatial scales, either locally or remotely.

The SPG adjustment in DEC is now analyzed in terms of temperature and salinity indices. The ventilated ocean from the surface to 700 m is considered in the following (Fig. 8). The upper-level ocean in DEC undergoes a systematic initial warming during the first 4 years of the forecasts in both the western and eastern SPG, followed by a progressive cooling towards the model attractor (Fig. 8a, b). The 30-year forecasts indicate (not shown) that the model equilibrium assessed by HIST is reached around Yr11–Yr12 in the western part of the SPG, but is still far away in the eastern part, even at leadtime Yr30, where DEC remains significantly warmer than HIST. Note that, contrary to BSF, initial conditions for upper-level heat content depart much more from the model attractor in the eastern SPG (~+2.5°) than in the western one (~0.4°). The systematic behavior of the ventilated ocean is also found in SST for both SPG regions and is even more pronounced (not shown). The drift of the upper-ocean salinity (Fig. 8c, d) is very similar to that of temperature (bell shape) in the eastern SPG, except during the first year of the hindcasts, when a weak but systematic decrease occurs while temperature is rising very rapidly. In the western SPG, stabilization appears after 5 years following the initial increase of the salt content. It is difficult from Fig. 8 alone to assess the respective roles of temperature and salinity in the density changes that may be related, at least partially, to the circulation drift. Through T-S diagrams computed for DEC and NOTROP_IC, Fig. 9a shows that the density diminishes in West SPG up to leadtime Yr4, while densification takes over afterwards. Temperature fluctuations control most of the density changes, except from Yr3 to Yr5 when the salinity increase counteracts the concurrent warming. In East SPG (Fig. 9b), the density diminishes over the first 2 years of the forecasts, followed by a weak but progressive increase resulting from the competing influences of strong cooling and simultaneous freshening.

Fig. 8

a–d Spaghetti plot of the (a, b) ocean temperature and (c, d) salinity indices averaged from 0 to 700-m depth, over the western (left) and eastern (right) SPG region for all DEC (red) and HIST members (gray) as a function of leadtime; ensemble means (thick red and black lines) are superimposed. The blue line stands for NOTROP_IC

Fig. 9

Scatter plot between annual mean temperature and salinity averaged from 0 to 700-m depth over the a west SPG and b east SPG domains for DEC (plain color dots) as a function of leadtime given in years. The NOTROP_IC (labelled “IC”) is represented by the black crossed circle. Isopycnals are superimposed (grey lines)
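The competition between temperature and salinity that a T-S diagram makes visible can be approximated with a linearized equation of state, in which warming lowers density and salinification raises it. The sketch below is a simplification under stated assumptions: the expansion and contraction coefficients are rough values for cold upper-ocean water, the temperature and salinity changes are hypothetical, and the isopycnals of Fig. 9 are presumably based on a full nonlinear equation of state.

```python
RHO0 = 1027.0     # reference density (kg m-3), assumed
ALPHA = 1.5e-4    # thermal expansion coefficient (K-1), assumed
BETA = 7.6e-4     # haline contraction coefficient ((g/kg)-1), assumed

def density_change(d_temp, d_salt):
    """Split a density change (kg m-3) into thermal and haline contributions."""
    thermal = -RHO0 * ALPHA * d_temp   # warming (d_temp > 0) lowers density
    haline = RHO0 * BETA * d_salt      # salinification (d_salt > 0) raises it
    return thermal, haline, thermal + haline

# Hypothetical drift: +0.5 K of warming partly offset by +0.05 g/kg of salt.
thermal, haline, total = density_change(d_temp=0.5, d_salt=0.05)
dominant = "temperature" if abs(thermal) > abs(haline) else "salinity"
print(f"thermal {thermal:+.3f}, haline {haline:+.3f}, net {total:+.3f} kg m-3")
print(f"dominant control: {dominant}")
```

In this toy case the salt increase cancels about half of the warming-induced density loss, which is the kind of partial compensation described for the western SPG between Yr3 and Yr5.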

The temporal behavior of the model drift in the North Atlantic, as documented in Figs. 6, 7, 8 and 9, can be interpreted as a succession of several processes as a function of leadtime. From initial conditions up to Yr4, the SPG adjustment is consistent with the ocean response to persistent negative NAO-like forcing (e.g. Lohmann et al. 2009). The mean westerlies above the SPG in the DEC hindcasts at short leadtime (Fig. 10a) are considerably reduced compared to the NCEP reanalysis, while they are reinforced at midlatitudes along the 30°–50°N latitudinal band, especially off France and the Iberian Peninsula. We verify that this result does not depend on the reanalysis product used for the bias estimate. This is consistent with positive (negative) sea level pressure (SLP) anomalies at high latitudes (in the subtropics), leading to a slackened Icelandic Low and Azores High. It is interesting to note (Fig. 10b) that such an altered circulation is marginally reinforced with leadtime for Yr5–Yr10, as further detailed below, and is in fact present to a large extent from the beginning, as diagnosed from the NOTROP_IC coupled simulation (Fig. 10c). The latter could be considered for the atmosphere as an AMIP-type simulation since the SST is strongly restored to observations (see Sect. 2). Figure 10 thus highlights the important fact that the mean atmospheric biases in the North Atlantic in CNRM-CM5 can be first and foremost attributed to the mean biases of the atmospheric component of the model and strongly project upon a permanent NAO- pattern. Weaker winds over the northern North Atlantic lead to surface warming in the SPG region through reduced heat loss from the ocean to the atmosphere, and to a reduced formation of deep water masses accompanied by a shallower mixed layer and inhibited convection over the SPG (Fig. 11a, c), especially in the Labrador Sea.
Note that mixed layer depths are slightly reinforced along the storm-track consistently with the southward shifted zonal flow in NAO-. Additionally, the anomalous NAO- buoyancy forcing lowers the mean doming structure of the isopycnals in the SPG, which leads to an initial and strong slackening of the SPG by geostrophy as documented in Fig. 7 (Lohmann et al. 2009; Barrier et al. 2013). Reduced formation of intermediate to deep water masses in the western SPG under NAO- like conditions also diminishes the AMOC in agreement with previous modeling studies. Slackening occurs first in the sinking region around 40°–60°N and below 1000 m (Fig. 11b), then spreads southward and progressively invades the entire lower limb of the circulation at longer leadtimes (Fig. 11d). By contrast, the upper circulation is marginally reinforced in DEC between 0° and 30°N whatever the leadtime. Maximum loading is found around 30°N and corresponds to the southward retreat of the NAC (Fig. 6a and SST in Fig. 11a, c). Altogether, this results in a net positive northward heat transport over 0°–30°N as diagnosed for longer leadtimes in Fig. 2d.

Fig. 10

Differences in SLP (shading) and surface winds (vectors) between DEC and NCEP reanalysis for leadtimes averaged over a Yr1–Yr4, b Yr5–Yr10, and between NOTROP_IC and NCEP (c). Color shading interval is every 0.5 hPa. Arrow units are 5 in m s−1

Fig. 11

a, c Ocean mixed layer depth (m, color) and SST (°C, contour) differences between DEC and NOTROP_IC for leadtimes averaged over a Yr1–Yr4 and c Yr5–Yr10. For SSTs, negative contours are dashed and the zero line is thicker. Contour interval is every 0.2 °C for positive contours and 1 °C for negative contours. b, d Same but for the Atlantic meridional stream function (AMOC). The black contour represents the AMOC long-term annual mean for NOTROP_IC, contour interval is every 2 Sv

North of 30°N, the decrease of the deep water formation together with the slowdown of both the AMOC (Fig. 11) and gyre circulations (Fig. 7a, b) leads to a reduction of the advection of warm and saline water from the subtropical gyre into the SPG. The associated reduced meridional heat/salt transport starts compensating the warming effect due to buoyancy forcing from Yr5 leadtime onwards in the eastern part of the SPG (Fig. 9b), where cooling and freshening progressively take over. In the West SPG (Fig. 9a), this process occurs later, as estimated from the 30-year forecasts (not shown). From Yr5 to Yr10 leadtimes, cooling is explained by the advection of colder water from the Greenland–Iceland–Norwegian (GIN) Seas along the East Greenland Current due to much cooler conditions over the entire GIN basin consistent with NAO- like forcing (Fig. 10b), and associated spurious ice formation documented in detail in Germe et al. (2014). The progressive densification in the eastern SPG from Yr5 onwards, controlled by cooling (Fig. 8b), leads to a progressive deepening of the mixed layer there and a slow recovery of the BSF by geostrophic adjustment (Fig. 7d). Additionally, the spurious wind stress curl located off the British Isles, due to intrinsic biases of the ARPEGE atmospheric component (Fig. 10c), contributes to the eastward extension of the SPG as well as to its spin-up. The attractor of the model characterized in Fig. 6 through differences between NOTROP_IC and HIST can thus be understood, to first order, as the result of a permanent NAO- like forcing. South of 50°N, the anomalous circulation strength and shift project very well upon the ocean response to wind-driven NAO- forcing (compare Fig. 6a and Fig. 7a in Barrier et al. 2013), while the SPG drift is dominated at shorter leadtime by the associated NAO- buoyancy forcing and at longer leadtime by altered heat/salt convergence linked to the horizontal circulation and AMOC reduction.

As stated above, most of the biases in atmospheric circulation in the predictions are tightly linked to atmospheric model intrinsic flaws. A weak positive feedback between the ocean drift and the local atmosphere exists though as diagnosed in Fig. 12, where the relationship between SST and SLP forecasts over the entire SPG is investigated. The correlation between leadtime-evolving SST and SLP is equal to −0.75 corresponding to a feedback strength equal to −0.44 hPa/ °C in the model. Note that the significance of the latter values is difficult to assess because of limited sampling and strong autocorrelation between leadtimes. The fact that (1) similar calculation done over both the West SPG and East SPG regions gives approximately the same figure (−0.42 and −0.46 hPa/°C respectively), and (2) the SLP/SST values for HIST on one hand and for NOTROP_IC on the other hand, which have not been used for the computation of the regression coefficients, are very much consistent with the regression fit, tend to give some credit to our findings. Figure 12 finally shows that about 50 % of the total SLP bias over the SPG can be attributed to the atmospheric-alone component, as quantified by the difference between NCEP and NOTROP_IC while the rest of the biases, involving coupled processes (difference between NOTROP_IC and HIST), is tied to the slow adjustment of the ocean water masses and its associated feedback; it takes more than 10 years (limit here of the forecast range) to settle.

Fig. 12

Scatter plot between annual mean SLP and SST averaged over the SPG for NCEP reanalysis, NOTROP_IC and HIST (black crossed circle) and for DEC (plain color dots) as a function of leadtime given in years. The regression fit is computed from DEC only (solid line) and is simply prolonged (dashed line) for visual purpose
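The feedback strength read from a scatter plot of this kind is the slope of an ordinary least-squares fit of SLP on SST across leadtimes, with the correlation measuring how tight the relation is. A minimal sketch follows; the ten SST/SLP pairs are invented for illustration and do not reproduce the DEC values, and the helper name is hypothetical.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares slope, intercept and Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

# Hypothetical ensemble-mean values, one pair per leadtime year (Yr1 to Yr10).
sst = [7.9, 8.3, 8.6, 8.7, 8.5, 8.2, 8.0, 7.8, 7.7, 7.6]            # deg C
slp = [1010.6, 1010.3, 1010.4, 1010.1, 1010.3, 1010.5,
       1010.4, 1010.7, 1010.6, 1010.8]                               # hPa
slope, intercept, corr = linear_fit(sst, slp)
print(f"feedback = {slope:.2f} hPa/degC, correlation = {corr:.2f}")
```

Because consecutive leadtimes are strongly autocorrelated, the effective number of independent points is smaller than the number of dots, which is why the text treats the significance of the fitted values with caution.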

Finally, the remote impact of the spurious ENSO documented in Sect. 3 upon the North Atlantic drift behavior is now investigated. The differences between DEC_NOEQ ensembles, where ENSO dynamics is active over the first 3–4 years of the forecasts (see Fig. 4), and DEC ensembles are presented in Fig. 13 for Northern Hemisphere SLP, Arctic sea ice cover (SIC), North Atlantic total surface flux and mixed layer depth. Maps are provided for the JFM Yr2 leadtime, when the midlatitude ENSO teleconnection is maximal (Fig. 6). As shown in Fig. 5, the greater excitation of ENSO in DEC_NOEQ leads to an accentuation of the intrinsic PNA+ and NAO- patterns in the latter ensemble compared to DEC (Fig. 13a). In the North Pacific, negative SLP differences (~−3 hPa) are located south of the Aleutian Islands. In the North Atlantic sector, values are about +0.8 hPa (+1.4 hPa) over the western (eastern) SPG, which is substantial compared with the orders of magnitude in Fig. 10. The decrease of surface westerly winds associated with a NAO- pattern leads to surface warming (not shown) and positive heat flux differences between DEC_NOEQ and DEC over the SPG area, with values up to +25 W/m2 (+18 W/m2) over West (East) SPG (Fig. 13c). In addition to buoyancy, the reduction of surface wind stress contributes to a decrease of the turbulent vertical mixing in the ocean, and both effects lead to a shallower mixed layer depth in DEC_NOEQ over the entire SPG, with maximum loading in the Labrador Sea (Fig. 13d). The resilience time of the North Atlantic to the additional NAO- kick due to spurious ENSO excitation is about 3 years in terms of circulation and heat content (not shown).

Fig. 13

Differences between DEC_NOEQ and DEC ensemble mean hindcasts at JFM Yr2 leadtime for a sea level pressure (SLP), b North Atlantic net flux at the surface, and c North Atlantic ocean mixed layer depth. Gray hatching stands for significance at 95 % determined by a t test

6 Summary and discussion

Drifts are often present in climate models when initialized from observed conditions; they intrinsically affect any type of climate prediction based on numerical experiments. Model drifts are usually removed through more or less sophisticated techniques for forecast verification, but they are rarely analysed. In this study, we have adopted the opposite approach and provide a detailed physical and dynamical description of the drifts in the CNRM-CM5 coupled model by means of decadal hindcasts produced within CMIP5. We are not interested in the predictive performance of the forecast system; the scope of the paper is to deepen our understanding of the physical processes involved in the development of some systematic errors within a coupled model. The ultimate goal of this approach is to provide physical insights and lines of approach to, on the one hand, implement more appropriate initialisation techniques that minimize the drift and, on the other hand, reduce the systematic biases of the models. The challenge is to link the knowledge of the physical origin of model errors and their remote propagation pathways to future predictions and projections. Such an approach is also useful to assess the limits of bias correction techniques that could be too simple to correctly account for the physical and dynamical mechanisms that control the drifts, and could thus ultimately affect the true estimation of the predictability and the predictive skill of the forecast systems.

The methodology used for ocean initialization in the decadal hindcasts available in the CMIP5 archive for the CNRM-CM5 system is first described. It relies on a preliminary simulation of the coupled model in which the ocean component is restored over 1958–2008 towards the NEMOVAR ocean reanalysis. At the surface, a flux derivative restoring is applied over the entire ocean, while a 3D damping is implemented at the subsurface, below the mixed layer, outside the 15°S–15°N tropical band. The rest of the system is freely coupled, and initial conditions for the hindcasts are taken directly from this simulation, referred to as NOTROP_IC, following the so-called full field initialisation strategy. A 10-member ensemble for each starting date (referred to as DEC) is produced following the CMIP5 recommendations. To test the sensitivity of the results to the latitudinal window in which the coupled model is not restored at the subsurface, an additional simulation with 1°S–1°N instead of 15°S–15°N has been carried out (NOEQ_IC), together with the corresponding hindcasts (referred to as DEC_NOEQ).

We have first analyzed the spatio-temporal heat balance redistribution within the modeled climate system through the description of the drift of the meridional heat transport estimated from net TOA and net surface fluxes as a function of leadtime (Trenberth and Caron 2001; Trenberth and Fasullo 2007). As expected from previous literature, results show that, in terms of heat balance, ocean processes drive the low-frequency adjustment of the model towards its equilibrium. The atmosphere adjustment is rather fast except in the Southern Hemisphere over the storm-track region, where strong ocean–atmosphere coupling may take place. Partitioning the world ocean into separate basins indicates that the total drift in ocean meridional heat transport is dominated by the Pacific from Yr1 to Yr5 of the forecasts. In this basin, the transport progressively increases and reaches values close to HIST, i.e. the estimate of the model attractor, after ~Yr5. From Yr5 onwards, the adjustment of the North Atlantic Ocean dynamics seems to emerge and eventually becomes dominant.
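The flux-based transport diagnostic can be illustrated by integrating the zonal-mean net heat input northward from the South Pole: in steady state, the northward transport across a latitude equals the net heat gained south of it. The sketch below is a simplified version of that idea, not the procedure used for the figures; the 1° grid, the removal of the global-mean imbalance and the idealized flux profile are all assumptions made for illustration.

```python
import math

EARTH_RADIUS = 6.371e6  # m

def implied_northward_transport(lat_centers_deg, net_flux):
    """Northward heat transport (W) at the northern edge of each latitude band.

    lat_centers_deg: evenly spaced band centres, ordered south to north.
    net_flux: zonal-mean net downward flux (W m-2) into each band.
    """
    dphi = math.radians(lat_centers_deg[1] - lat_centers_deg[0])
    band_area = [2.0 * math.pi * EARTH_RADIUS ** 2
                 * math.cos(math.radians(lat)) * dphi
                 for lat in lat_centers_deg]
    # Remove the global-mean imbalance so the transport vanishes at both poles.
    mean_flux = sum(a * f for a, f in zip(band_area, net_flux)) / sum(band_area)
    transport, running = [], 0.0
    for area, flux in zip(band_area, net_flux):
        running += area * (flux - mean_flux)
        transport.append(running)
    return transport

# Idealized profile: heat gain in the tropics, heat loss at high latitudes.
lats = [-89.5 + i for i in range(180)]
flux = [100.0 * (math.cos(math.radians(lat)) ** 2 - 2.0 / 3.0) for lat in lats]
transport = implied_northward_transport(lats, flux)
print(f"{transport[119] / 1e15:.2f} PW across 30N")
```

Using TOA fluxes in this integral yields the total (atmosphere plus ocean) transport, surface fluxes yield the ocean part, and the atmospheric part follows as the difference; a drift in the fluxes with leadtime therefore translates directly into a drift in the implied transports.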

Based on the timescales and the basin characteristics involved in the global model adjustment, we focused first on the mechanisms that occur in the tropical Pacific at the early stage of the forecasts (from initial conditions to ~Yr4). In the Pacific, the model experiences an initial shock or fast adjustment that takes the form of a quasi-systematic excitation of ENSO warm events at leadtime Yr1, whatever the starting date. Weak cold events tend to occur the following year, while the spurious oscillatory behavior is damped afterwards in DEC. The first-year El Niño excitation can be viewed for the coupled model as an efficient way to rapidly adjust to its own ocean + atmosphere mean state, which is characterized by colder SST conditions and displaced variance in the western-central part of the Pacific basin and warmer conditions on the eastern side (Bellucci et al. 2014, their Fig. 7). CNRM-CM5 trade winds are accordingly weaker along the equator, especially in winter and spring. Through the excitation of ENSO warm events in Yr1, a significant amount of upper-ocean heat that has been initially stored within the thermocline through initialization to NEMOVAR, and that is inconsistent with the model mean state, is thus released to the atmosphere through diabatic heating. The Yr2 La Niña events can be simply explained by the so-called discharge-recharge mechanism (Jin et al. 1997) and the associated back-and-forth excitation of Kelvin waves. Such an adjustment mechanism is much more pronounced in DEC_NOEQ than in DEC, as the spurious ENSO flip-flop lasts up to Yr4 instead of Yr2. In DEC_NOEQ, the thermocline in the initial conditions is warmer, as in DEC, and is additionally much deeper because of subsurface nudging, leading altogether to a large excess of heat.
The mean depth and tilt of the thermocline are not in balance with the biased winds, and their adjustment leads to the excitation of strong ENSO warm events at Yr1 and to subsequent back-and-forth El Niño/La Niña events until the tropical heat content is compatible with that of the model attractor. The spurious excitation of ENSO events affects the atmospheric drift over a large part of the planet through teleconnection. This is particularly marked over the North Atlantic and in DEC_NOEQ, which is dominated by NAO- circulation during the first-year ENSO warm events. Based on DEC_NOEQ and DEC comparisons, we have decided to retain the latter configuration for the CMIP5 archive because of the minimized initial shock in the Pacific. The adjustment mechanism through ENSO as documented here seems to be present in numerous models, but its efficiency is expected to be model dependent, as suggested in Vannière et al. (2013).

We then investigated the drift over the North Atlantic. To a first order, it can be interpreted as the model response to intrinsic atmospheric circulation biases found in the stand-alone atmosphere component, which project onto the negative phase of the NAO. A fast adjustment occurs in the North Atlantic leading to a rapid slackening of both horizontal and vertical circulations. Over the SPG, the reduced oceanic loss of heat to the atmosphere due to the southward displacement of the mean westerlies leads to initial warming (from Yr1 to Yr4); the associated reduced density inhibits the formation of intermediate and deep water masses which feeds the lower limb of the AMOC. The latter slows down and contracts towards the surface. Note that the mechanisms for the AMOC reduction proposed here are different from the ones examined in Huang et al. (2015) using the CFSv2 decadal forecasts. In their case, the AMOC weakening is caused by a reduction of the upper ocean salinity in the SPG, likely due to an excessive freshwater transport from the Arctic due to rapid sea ice melting while in CNRM-CM5, drifts can be mostly interpreted as the integration by the ocean of intrinsic atmospheric biases.

At longer timescales, from Yr5 onwards, cooling takes over in the SPG because of (1) reduced meridional heat transport by the NAC due to the overall slackened circulations and (2) the progressive invasion of spurious colder water through the East Greenland Current, coming from the GIN Seas where sea ice forms rapidly in response to the NAO- circulation (not shown). A weak feedback is locally present between the atmospheric circulation and the ocean drift, and it controls the timescale over which the coupled model biases set in. In CNRM-CM5, this feedback is positive and progressively reinforces the intrinsic atmospheric model errors, which are considered the main seed for the total coupled model biases.

As pointed out in Vannière et al. (2013) and Xie et al. (2015), the challenge for the climate community is to move beyond the routine evaluation of climate models and to develop innovative techniques and approaches to trace climate model errors back to their physical origin. In other words, beyond the simple comparison of measurable quantities, model evaluation should be process-based in order to identify systematic model errors and the timescale over which they set in, with the ultimate goal of reducing them.

Several recommendations can be drawn from this study, which follows precisely this approach:

  1. The imprint of the zero-order coupled model biases can be found in the stand-alone atmospheric component, as shown here for the tropical Pacific (weaker trade winds) and the North Atlantic (NAO- like circulation). As a consequence, AMIP-type simulations are very relevant for anticipating a large part of the total biases of the coupled system; the advantage is that this type of experiment is computationally very cheap. It is our conjecture that this conclusion is valid for numerous models. The feedback between the ocean drift and the overlying atmosphere, as quantified from Fig. 12, might be model dependent though. In any case, its estimation requires many members because of the large atmospheric noise at midlatitudes.

  2. The temporal development of the model zero-order integrated errors in the ocean, whether it arises as a response to initial atmospheric biases, from intrinsic biases in the ocean component, or from various ocean–atmosphere feedbacks during the drifting period, is largely dominant and can be found in any single realization. As a consequence, few members and few starting dates are needed to evaluate those types of errors (see Figs. 7, 8 for instance). This assertion is especially valid for the midlatitude oceans as well as for the tropical Atlantic (not shown), but it is less true, although still pertinent, for the Pacific due to the presence of ENSO.

  3. Because of teleconnections, the properties of the drift in the tropical Pacific are important for understanding the global drift, and it may be imperative to minimize the spurious and systematic excitation of ENSO in initialized forecasts.

Hawkins et al. (2014) decompose the uncertainties of the total model drift into three contributions: sampling of internal variability (not enough members), dependence on the initial states (not enough starting dates) and interaction with the evolving forcing (radiative changes due to GHGs and volcanoes). We believe that our main findings are not strongly contaminated by those three factors, even if they are visible in some figures, especially for thermodynamical fields (e.g. Fig. 8). These factors are however important for the evaluation of predictive skill, because part of the signal we want to predict is removed by the classical bias correction technique, which consists in averaging all the members and starting dates to estimate the drift that is then removed a posteriori.

That said, it is interesting to note that, despite the presence of strong drifts, skill scores can be high when those drifts are removed in the classical way. This is particularly the case over the North Atlantic, as documented in Kirtman et al. (2013) in the 5th IPCC report. This suggests, on the one hand, that the coupled model adjustment and associated processes interfere only weakly with the initial conditions and the physical mechanisms at the source of the predictability and, on the other hand, that the linear assumption of the bias correction technique used in CMIP5 is valid. This might not be the case for the Pacific, because ENSO characteristics are strongly nonlinear with regard to the associated precipitation and diabatic heating that act as sources of teleconnections. The perturbation associated with the spurious ENSO excitation at the earliest lead times of the forecast may be significant enough to destroy any source of decadal predictability stored in the ocean initial states; this hypothesis may help explain, among other possible explanations, the very poor predictive skill of all the CMIP5 models over the entire Pacific. More work is needed on these specific issues.
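
For readers less familiar with the classical bias correction discussed above, the following minimal Python sketch illustrates the idea on synthetic numbers. It assumes that the drift at each lead time is estimated as the forecast-minus-observation error averaged over all members and starting dates, and that it is then subtracted from every hindcast. The function names, the array layout and the toy values are hypothetical and chosen for illustration only; they do not reproduce the processing chain applied to the CNRM-CM5 hindcasts.

```python
from statistics import fmean


def estimate_drift(hindcasts, observations):
    """Estimate the lead-time dependent drift (illustrative sketch).

    hindcasts[s][m][t]: forecast for start date s, member m, lead time t.
    observations[s][t]: verifying observation for start date s, lead time t.
    Returns one value per lead time: the forecast-minus-observation error
    averaged over all start dates and all members.
    """
    n_leads = len(observations[0])
    drift = []
    for t in range(n_leads):
        errors = [
            member[t] - observations[s][t]
            for s, members in enumerate(hindcasts)
            for member in members
        ]
        drift.append(fmean(errors))
    return drift


def remove_drift(hindcasts, drift):
    """Subtract the estimated drift from every member at every lead time."""
    return [
        [[value - drift[t] for t, value in enumerate(member)] for member in members]
        for members in hindcasts
    ]


# Toy data (hypothetical values): two start dates, two members, three lead times.
true_drift = [0.5, 1.0, 1.2]        # imposed drift, grows with lead time
perturbations = [-0.1, 0.1]         # member spread, zero on average
observations = [[0.0, 0.1, 0.2], [0.3, 0.2, 0.1]]
hindcasts = [
    [[obs[t] + true_drift[t] + p for t in range(3)] for p in perturbations]
    for obs in observations
]

drift = estimate_drift(hindcasts, observations)   # close to true_drift
corrected = remove_drift(hindcasts, drift)        # members now scatter around observations
```

Because the correction is a single additive term per lead time, it implicitly assumes that drift and predictable signal superpose linearly, which is the assumption questioned above for the Pacific. With few members or starting dates, the average also absorbs part of the signal itself.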