1 Introduction

Climate models simulate the climate system by numerically solving the fundamental governing equations on supercomputers, and they are becoming increasingly important to a widening group of users. They are used, for example, to assess possible future climate change due to anthropogenic greenhouse gas emissions (e.g., Solomon et al. 2007); they form the basis of operational seasonal to decadal climate prediction systems (e.g., Palmer et al. 2004); and they serve the climate research community as laboratories that help to unravel the functioning of the climate system (e.g., Delworth 1996; Timmermann et al. 1998; Jung et al. 2010b).

The history of climate modeling has clearly been a success story: complex Earth system models have been developed from simple atmosphere–ocean models within less than four decades. However, even the most sophisticated systems still show substantial shortcomings in representing key aspects such as the North Atlantic meridional overturning circulation, Arctic sea ice, and meso-scale phenomena such as eddies and topographically influenced ocean currents. The implications are wide and profound: the uncertainty of regional climate change projections is still high (e.g. Hawkins and Sutton 2009), and model error has a detrimental influence on the skill of seasonal and decadal climate predictions (Kirtman and Pirani 2008). Furthermore, recent progress in advancing the fidelity of climate models has become incremental. In view of this, it has been argued that radically new approaches are required to significantly advance the field of climate modeling and prediction (Shukla et al. 2009; Slingo et al. 2009).

There is general agreement that one of the main sources of model error lies in the fact that many climate-relevant processes need to be parameterized (Shukla et al. 2009; Jakob 2010), since they are too small in scale to be explicitly simulated by state-of-the-art climate models. The solution sounds straightforward: increase the resolution of the computational grid sufficiently so that important small-scale processes can be resolved explicitly. There is a downside to this argument, however: all existing global climate models still use the first generation of dynamical cores, which employ quasi-regular mesh methods. Mesh regularity dictates that approximately the same resolution has to be used everywhere. Significantly increasing resolution with regular mesh models therefore means that resolution needs to be increased globally, which makes the problem computationally prohibitive for most climate applications on present-day supercomputing facilities.

Given the existing computational constraints, restricting the use of high resolution to dynamically active regions, where it is actually needed, while retaining a relatively coarse resolution otherwise appears to be a much more promising way forward. The implementation of such an approach in climate modeling has become possible with the recent emergence of a new generation of global multi-resolution ocean models that employ unstructured meshes (e.g., Danilov et al. 2004; Wang et al. 2008; Ringler et al. 2013). In this study, the presently most advanced and well tested global multi-resolution sea ice-ocean model—the Finite Element Sea Ice-Ocean Model (FESOM; Danilov et al. 2004; Wang et al. 2008, 2013; Timmermann et al. 2009)—is coupled to the atmospheric model ECHAM6. This first global climate model with a radically different multi-resolution dynamical core for the sea ice-ocean system is expected to provide a substantial contribution to the diversity of climate models, which has been shown to be surprisingly small (Pennell and Reichler 2011).

The aim of this study is to give a detailed description of the formulation of the new model and to provide an overview of its performance in replicating the observed mean climate. As a starting point, it has been decided to use relatively moderate mesh-stretching factors for the sea ice-ocean component (from about 25 km in the northern North Atlantic and in the tropics to about 150 km in parts of the open ocean) in order to have a reference against which future configurations with large mesh-stretching factors can be tested. This paper is the first of a series that will explore the benefit of unstructured mesh approaches in climate modeling. Given the moderate mesh stretching chosen here, the new coupled model cannot be expected to produce vastly better results than existing state-of-the-art models run at similar resolutions. The only exception is in the tropics where, unlike for existing climate models, the unstructured mesh approach allows us to increase horizontal resolution in both directions (meridionally and zonally) simultaneously.

The outline of the paper is as follows: The individual model components, the way they are coupled, and the experimental setup are described in Sect. 2. The simulated mean climate of the atmosphere, ocean, and cryosphere is presented and compared to observations in Sect. 3. Finally, the results are briefly summarized and discussed in Sect. 4.

2 Model formulation

2.1 The atmospheric model ECHAM6

The atmospheric general circulation model ECHAM6 is the latest release of the ECHAM model developed at the Max-Planck-Institute for Meteorology (MPI) in Hamburg (Stevens et al. 2013). In the late 1980s, the first ECHAM model branched off from the then operational ECMWF (European Centre for Medium-Range Weather Forecasts) model (Simmons et al. 1989) and has since been further developed in Hamburg, hence the name ECHAM (Stevens et al. 2013).

ECHAM6 and earlier versions of the model have been widely used in climate research, both in uncoupled mode with prescribed lower boundary conditions and in coupled mode with dynamical ocean models, in particular the MPI ocean model (MPIOM; Marsland et al. 2003). Integrations of both kinds served as input for model intercomparisons in the framework of the Atmospheric Model Intercomparison Project (AMIP; Gates et al. 1999) and different phases of the Coupled Model Intercomparison Project (e.g. CMIP3 and CMIP5; Meehl et al. 2007; Taylor et al. 2012). ECHAM6 constitutes the atmospheric part of the new coupled model presented here. A detailed description of ECHAM6 together with a historical overview of the development of ECHAM is given in Stevens et al. (2013). Some characteristic properties of ECHAM6 as well as issues that are specific to the configuration used in this study will be listed in the following.

ECHAM6 is a spectral atmospheric model. Several configurations are available, differing in spectral truncation, vertical discretization, time step, and settings for key resolution-dependent parameters (Stevens et al. 2013). The T63L47 configuration used here has also been employed in MPI-ESM-LR (Max-Planck-Institute Earth System Model Low Resolution; Giorgetta et al. 2013). It retains total wavenumbers up to 63, which corresponds to a horizontal resolution of about \(1.85\times 1.85\) degrees; the atmosphere comprises 47 levels and has its top at 0.01 hPa (approx. 80 km). ECHAM6 includes the land surface model JSBACH (Stevens et al. 2013) and a hydrological discharge model (Hagemann and Dümenil 1997).
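The link between the spectral truncation and the quoted grid spacing can be illustrated with a short calculation. The sketch below assumes the commonly used "quadratic" Gaussian grid rule (at least 3T+1 longitudes, rounded up to an even number whose only prime factors are 2, 3 and 5, and half as many latitudes); the function names are hypothetical and the rule is a simplification, not taken from the ECHAM6 source code.

```python
def is_fft_friendly(n):
    """True if n is even and has no prime factors other than 2, 3 and 5."""
    if n % 2:
        return False
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def gaussian_grid_size(truncation):
    """Return (nlon, nlat) of a quadratic Gaussian grid for triangular truncation T.

    Assumption: nlon is the smallest FFT-friendly number >= 3*T + 1 and nlat = nlon / 2.
    """
    nlon = 3 * truncation + 1
    while not is_fft_friendly(nlon):
        nlon += 1
    return nlon, nlon // 2


nlon, nlat = gaussian_grid_size(63)
dlon_deg = 360.0 / nlon  # zonal grid spacing in degrees
print(f"T63: {nlon} x {nlat} grid points, zonal spacing {dlon_deg:.3f} degrees")
```

Under these assumptions T63 maps to 192 by 96 grid points, i.e. a zonal spacing of 1.875 degrees, consistent with the "about 1.85 degrees" stated above.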

Since with higher resolution “the simulated climate improves but changes are incremental” (Stevens et al. 2013), the T63L47 configuration appears to be a reasonable compromise between simulation quality and computational efficiency. All standard settings are retained with the exception of the T63 land-sea mask, which is adjusted to allow for a better fit between the grids of the ocean and atmosphere components. The FESOM land-sea distribution is regarded as ’truth’ and the (fractional) land-sea mask of ECHAM6 is adjusted accordingly. This adjustment is accomplished by a conservative remapping of the FESOM land-sea distribution to the T63 grid of ECHAM6 using an adapted routine that has primarily been used to map the land-sea mask of the MPIOM to ECHAM5 (H. Haak, personal communication).

2.2 The Finite Element Sea Ice-Ocean Model (FESOM)

The sea ice-ocean component in the coupled system is represented by FESOM, which allows one to simulate ocean and sea-ice dynamics on unstructured meshes with variable resolution. This makes it possible to refine areas of particular interest in a global setting and, for example, resolve narrow straits where needed. Additionally, FESOM allows for a smooth representation of coastlines and bottom topography. The basic principles of FESOM are described by Danilov et al. (2004), Wang et al. (2008), Timmermann et al. (2009) and Wang et al. (2013). FESOM has been validated in numerous studies with prescribed atmospheric forcing (see e.g., Sidorenko et al. 2011; Wang et al. 2012; Danabasoglu et al. 2014). Although its numerics are fundamentally different from those of regular-grid models, previous model intercomparisons (see e.g., Sidorenko et al. 2011; Danabasoglu et al. 2014) show that FESOM is a competitive tool for studying the ocean general circulation. The latest FESOM version, which is also used in this paper, is comprehensively described in Wang et al. (2013). In the following, we give a short model description and mention those settings that differ in the coupled setup.

The surface computational grid used by FESOM is shown in Fig. 1. We use a spherical coordinate system with the poles over Greenland and the Antarctic continent to avoid convergence of meridians in the computational domain. The mesh has a nominal resolution of 150 km in the open ocean and is gradually refined to about 25 km in the northern North Atlantic and the tropics. We use isotropic grid refinement in the tropics since biases in tropical regions are known to have a detrimental effect on the climate of the extratropics through atmospheric teleconnections (see e.g., Rodwell and Jung 2008; Jung et al. 2010a), especially over the Northern Hemisphere. Grid refinement (meridional only) in the tropical belt is employed also in the regular-grid ocean components of other existing climate models (see e.g., Delworth et al. 2006; Gent et al. 2011). The 3-dimensional mesh is formed by vertically extending the surface grid using 47 unevenly spaced z-levels and the ocean bottom is represented with shaved cells.

Fig. 1

Grids corresponding to (left) ECHAM6 at T63 (\(\approx 180\hbox { km}\)) horizontal resolution and (right) FESOM. The grid resolution for FESOM is indicated through color coding (in km). Dark green areas of the T63 grid correspond to areas where the land fraction exceeds 50 %; areas with a land fraction between 0 and 50 % are shown in light green

Although the latest version of FESOM (Wang et al. 2013) employs the K-Profile Parameterization (KPP) for vertical mixing (Large et al. 1994), we used the PP scheme by Pacanowski and Philander (1981) in this work. The reason is that, by the time the coupled simulations were started, the performance of the KPP scheme in FESOM had not yet been fully tested for long integrations in a global setting. The mixing scheme may be changed to KPP in forthcoming simulations. The background vertical diffusion is set to \(2\times 10^{-3}\hbox { m}^2\hbox {s}^{-1}\) for momentum and \(10^{-5}\hbox { m}^2\hbox {s}^{-1}\) for potential temperature and salinity. The maximum value of vertical diffusivity and viscosity is limited to \(0.01\hbox { m}^2\hbox {s}^{-1}\). We use the GM parameterization for the stirring due to eddy-induced velocity (Gent and McWilliams 1990; Gent et al. 1995; Griffies 1998), and biharmonic viscosity.
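To make the role of the background and maximum values concrete, the following is a minimal sketch of a Richardson-number-dependent mixing scheme in the commonly quoted form of Pacanowski and Philander (1981), \(\nu = \nu_0/(1+\alpha Ri)^n + \nu_b\) and \(\kappa = \nu/(1+\alpha Ri) + \kappa_b\). The background values and the cap are those given above; \(\nu_0 = 0.01\hbox { m}^2\hbox {s}^{-1}\), \(\alpha = 5\), \(n = 2\) and the clamping of negative Richardson numbers to zero are assumptions made for illustration and may differ from the FESOM implementation.

```python
def pp_mixing(ri, nu0=1.0e-2, alpha=5.0, n=2,
              nu_b=2.0e-3, kappa_b=1.0e-5, cap=1.0e-2):
    """Return (viscosity, diffusivity) in m^2 s^-1 for Richardson number ri.

    nu_b, kappa_b and cap follow the text; nu0, alpha and n are assumed values.
    """
    ri = max(ri, 0.0)  # assumption: unstable stratification treated like Ri = 0
    viscosity = nu0 / (1.0 + alpha * ri) ** n + nu_b
    diffusivity = viscosity / (1.0 + alpha * ri) + kappa_b
    # Both coefficients are limited to the prescribed maximum value.
    return min(viscosity, cap), min(diffusivity, cap)


for ri in (0.0, 0.25, 1.0, 10.0):
    nu, kappa = pp_mixing(ri)
    print(f"Ri = {ri:5.2f}: viscosity = {nu:.2e}, diffusivity = {kappa:.2e}")
```

In this form, mixing is strongest for weak stratification or strong shear (small Ri) and relaxes towards the background values as Ri grows.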

One of the major differences between the FESOM version in Wang et al. (2013) and that used in this paper is the thermodynamic sea ice scheme. Here it is based on the approach developed by Dorn et al. (2009) and allows for separate calculations of the heat balances for the ice-covered and open water fractions within the individual meshes. Dorn et al. (2009) demonstrate that such a subdivision of heat flux contributions improves the simulation of atmosphere-sea ice feedbacks in coupled models especially when sophisticated parameterizations of the surface albedo, the snow cover and melt pond fractions are used. ECHAM6 includes a sophisticated sea ice albedo and melt pond scheme (see e.g., Pedersen et al. 2009; Roeckner et al. 2012) by default.

Finally, FESOM runs with a linear free surface and, accordingly, freshwater flux is modeled by virtual salinity flux. The latter is based on the locally referenced salinity. No salinity restoring or any other flux corrections are applied.
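The virtual salinity flux can be illustrated as follows. With a linear free surface the volume of the surface layer does not change when freshwater is added, so the dilution effect is mimicked by a salt flux proportional to the local surface salinity. The sketch below is illustrative only: the sign convention (freshwater flux positive into the ocean), the units and the function names are assumptions, not taken from the FESOM code.

```python
def virtual_salinity_flux(surface_salinity, freshwater_flux):
    """Virtual salinity flux (psu m s^-1) for a net freshwater flux (m s^-1).

    Assumption: freshwater_flux > 0 means water enters the ocean
    (precipitation plus runoff minus evaporation), which lowers salinity.
    The local surface salinity is used as the reference value.
    """
    return -surface_salinity * freshwater_flux


def salinity_tendency(surface_salinity, freshwater_flux, layer_thickness):
    """Salinity change rate (psu s^-1) of a surface layer of fixed thickness (m)."""
    return virtual_salinity_flux(surface_salinity, freshwater_flux) / layer_thickness


# Example: 1e-7 m/s of net precipitation (about 8.6 mm/day) over water of salinity 35
flux = virtual_salinity_flux(35.0, 1.0e-7)
print(flux)
```

Because the local rather than a global reference salinity is used, the same freshwater input produces a smaller salinity change in fresh regions than in salty ones.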

2.3 Coupling

In the current setup, ECHAM6–FESOM employs the OASIS3-MCT coupler (Valcke et al. 2013) together with an intermediate regular exchange grid. The latter is introduced as a regular interface for FESOM to simplify the coupling procedure. Direct mapping between the unstructured ocean and regular atmospheric mesh will be implemented in future model versions. Currently, mapping between the finite-element mesh and the intermediate grid is done by FESOM, while the exchange between the intermediate grid and the atmospheric grid is carried out by OASIS3-MCT (see Sects. 2.3.1 and 2.3.2).

The coupling strategy is illustrated in Fig. 2. The air-sea fluxes are computed by the atmosphere model based on the surface fields provided by the sea ice-ocean model. In order to compute the air–sea fluxes it is therefore necessary to map the fields given on the unstructured ocean mesh onto the structured atmospheric grid. Similarly, the fluxes have to be mapped to the unstructured ocean grid. The method currently used to couple the fields and fluxes between the model components is described in the following. Coupling takes place every 6 h for the current setup.

Fig. 2

Schematic of the coupling procedure: ECHAM6 computes 12 air–sea fluxes based on 4 surface fields provided by FESOM. The 6-hourly averaged fields and 6-hourly accumulated fluxes are mapped between the model components every 6 h employing an intermediate exchange grid together with the OASIS3-MCT coupler

2.3.1 Ocean to atmosphere exchange

The mapping from the intermediate grid to the atmospheric grid is done by OASIS3-MCT applying a bilinear interpolation. In order to map the fields given on the unstructured ocean mesh to the intermediate grid, two cases (depending on the local resolution in the ocean) are distinguished for each grid box:

  1. The number of oceanic surface nodes enclosed by a grid box of the intermediate grid is at least three (the resolution is locally finer in the ocean than in the atmosphere).

     \(\rightarrow\) For each field a mean over the enclosed nodes is computed; this value is regarded as the grid box value.

  2. The above condition is not met (the resolution is locally coarser than in the atmosphere).

     \(\rightarrow\) The values for the grid box of the intermediate grid are determined by a linear interpolation of the ocean surface fields to the grid box center.

All FESOM surface fields (see Fig. 2) are averaged over the 6-hourly interval preceding the coupling.
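The two cases can be sketched in a few lines of Python. This is a simplified illustration, not the FESOM routine: it treats longitude and latitude as planar coordinates, ignores periodicity and land masks, and searches triangles by brute force. All function and variable names are hypothetical.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p in triangle (a, b, c); None if degenerate."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0.0:
        return None
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1


def box_value(box, nodes, values, triangles, min_nodes=3):
    """Value of one exchange-grid box, box = (lon_min, lon_max, lat_min, lat_max)."""
    lon0, lon1, lat0, lat1 = box
    inside = [v for (x, y), v in zip(nodes, values)
              if lon0 <= x < lon1 and lat0 <= y < lat1]
    # Case 1: ocean mesh locally finer -> mean over the enclosed nodes.
    if len(inside) >= min_nodes:
        return sum(inside) / len(inside)
    # Case 2: ocean mesh locally coarser -> linear interpolation to the box centre.
    centre = (0.5 * (lon0 + lon1), 0.5 * (lat0 + lat1))
    for i, j, k in triangles:
        w = barycentric(centre, nodes[i], nodes[j], nodes[k])
        if w is not None and min(w) >= -1e-12:
            return w[0] * values[i] + w[1] * values[j] + w[2] * values[k]
    return None  # box centre lies outside the ocean mesh


# Tiny example mesh: four nodes, two triangles, field = lon + 2 * lat
nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
values = [x + 2.0 * y for x, y in nodes]
triangles = [(0, 1, 2), (1, 3, 2)]
fine = box_value((-1.0, 5.0, -1.0, 5.0), nodes, values, triangles)   # case 1
coarse = box_value((1.0, 2.0, 1.0, 2.0), nodes, values, triangles)   # case 2
print(fine, coarse)
```

In the example, the large box encloses all four nodes and receives their mean, whereas the small box contains no node and receives the linearly interpolated value at its centre.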

2.3.2 Atmosphere to ocean exchange

Similar to the treatment of the ocean surface fields, the air–sea fluxes are accumulated over 6-hourly intervals. The fluxes are computed by the atmosphere model and bilinearly interpolated to the intermediate grid using OASIS3-MCT. From the intermediate grid the values on the unstructured ocean mesh are computed using a simple inverse distance weighting (Shepard 1968). This provides some form of downscaling. In order to enforce the conservation of the net heat and freshwater fluxes, the resulting fluxes on the ocean grid are scaled such that their spatially integrated net values are identical to the ones provided by ECHAM6. The methodology is similar to the OASIS3 global conservation option GLBPOS, where the residual flux is distributed globally on the target grid, proportionally to the value of the original flux (Valcke 2013). To avoid spurious exchanges between the hemispheres in ECHAM6–FESOM, fluxes that are relevant mainly at middle and high latitudes (snowfall, sublimation, and the heat flux into the ice) are scaled separately for each hemisphere rather than globally.
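The second mapping step and the subsequent conservation fix can be sketched as follows. The example uses inverse distance weighting over all supplied source points with a power of 2 and planar distances, followed by a multiplicative rescaling so that the area-integrated flux matches a prescribed target. The power, the neighbour selection and all names are assumptions for illustration, not the settings used in ECHAM6–FESOM.

```python
import math


def idw(target, sources, values, power=2.0):
    """Inverse-distance-weighted estimate at `target` from scattered source points."""
    num = den = 0.0
    for (x, y), v in zip(sources, values):
        d = math.hypot(target[0] - x, target[1] - y)
        if d == 0.0:
            return v  # target coincides with a source point
        w = d ** -power
        num += w * v
        den += w
    return num / den


def rescale_to_conserve(fluxes, areas, target_integral):
    """Scale fluxes so that sum(flux * area) equals target_integral.

    A multiplicative factor distributes the residual in proportion to the flux.
    """
    integral = sum(f * a for f, a in zip(fluxes, areas))
    if integral == 0.0:
        return list(fluxes)  # nothing to scale
    factor = target_integral / integral
    return [f * factor for f in fluxes]


# Two exchange-grid points, one ocean node halfway between them
midpoint_flux = idw((1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0])

# Three ocean nodes whose integrated flux (9) must match the atmospheric value (18)
scaled = rescale_to_conserve([1.0, 2.0, 3.0], [1.0, 1.0, 2.0], 18.0)
print(midpoint_flux, scaled)
```

For fluxes that are scaled per hemisphere, the same rescaling would simply be applied separately to the nodes north and south of the equator, each with its own target integral.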

2.4 Experimental setup

The simulation presented here was run under constant radiative conditions of the year 1990 and is similar to the control simulations performed with the GFDL climate model (e.g. Delworth et al. 2006, 2012). Two simulations with different initial states were performed: (1) ECHAM6 was initialized with the (quasi arbitrary) default initial state, and FESOM with the Polar Science Center Hydrographic Climatology (PHC, Steele et al. 2001) and zero velocities. (2) ECHAM6 was initialized with the state it attained after 5 years using method (1), and FESOM with the final state of a 60-year spin-up run under CORE-II atmospheric forcing with surface salinity restoring (Large and Yeager 2009; Wang et al. 2013). Method (2) was applied to reduce the initial shock of the coupled system (see also Delworth et al. 2006, 2012). A 60-year ocean spin-up was used in method (2) because the largest ocean adjustment to the atmospheric forcing happens within the first few decades (e.g. Griffies et al. 2009; Sidorenko et al. 2011). Only the results obtained with method (2) are presented in the remainder of this paper; only minor differences were found in terms of model drift and mean climate between the two methods.

The size of the triangles in the FESOM mesh varies significantly in space, and the time step needs to be chosen such that numerical stability is guaranteed everywhere. In this study a time step of 30 min was chosen for FESOM. The time step for ECHAM6 is set to 10 min as is common for the T63 configuration of the model (Stevens et al. 2013). As already mentioned above, the coupling between FESOM and ECHAM6 is performed every 6 h.
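The relation between the component time steps and the coupling interval is simple arithmetic, sketched here for reference. The constant and function names are hypothetical.

```python
COUPLING_S = 6 * 3600   # coupling interval: 6 h
OCEAN_DT_S = 30 * 60    # FESOM time step: 30 min
ATMOS_DT_S = 10 * 60    # ECHAM6 time step: 10 min


def steps_per_coupling(dt_s, coupling_s=COUPLING_S):
    """Number of model time steps between two coupling events."""
    if coupling_s % dt_s:
        raise ValueError("coupling interval must be a multiple of the time step")
    return coupling_s // dt_s


ocean_steps = steps_per_coupling(OCEAN_DT_S)
atmos_steps = steps_per_coupling(ATMOS_DT_S)
print(ocean_steps, atmos_steps)
```

Each exchanged ocean field is thus an average over 12 FESOM steps, and each flux is accumulated over 36 ECHAM6 steps.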

The coupled system was run for 350 years. The last 300 years are used for the diagnostics of the mean state of the coupled system to avoid most of the initial model drift. The atmospheric time series will be shown for the full 350 years. Time series for the ocean, on the other hand, include an additional 60 years from the uncoupled ocean spin-up run (i.e. 410 years in total).

2.5 Data for model evaluation

The atmospheric mean state of ECHAM6–FESOM is compared against ECMWF reanalysis (ERA) data to evaluate biases. ERA-40 has been chosen instead of ERA-Interim because it is available for a longer time period (1957–2002). Only those ERA-40 parameters were considered that are relatively well constrained by the data assimilation system, such as 2-m temperature, 10-m wind speed, 500-hPa geopotential height, and 300-hPa u-component (Uppala et al. 2005).

For precipitation, two different datasets have been used: the Global Precipitation Climatology Project (GPCP) dataset for 1979–2010 (Adler et al. 2003) and the CPC Merged Analysis of Precipitation (CMAP) for 1981–2010 (Xie and Arkin 1998). For total cloud cover, two different datasets have been used as well: data from the Moderate Resolution Imaging Spectroradiometer (MODIS) for 2000–2011 (Platnick et al. 2003; King et al. 2003) and the GCM-Oriented CALIPSO Cloud Product (GOCCP) for 2006–2010 (Chepfer et al. 2010). For top-of-the-atmosphere (TOA) outgoing longwave radiation, data from the Clouds and the Earth’s Radiant Energy System (CERES) for 2000–2012 (Loeb et al. 2012) have been chosen.

The mean ECHAM6–FESOM ocean temperature and salinity fields are evaluated against the PHC Climatology. For the sea ice extent, the satellite record from Fetterer et al. (2002) is used. Furthermore, the associated ocean transports and other characteristics are compared with those from other state-of-the-art coupled climate models and stand-alone sea ice-ocean models.

Simulation results from a set of well-established CMIP5 models (Taylor et al. 2012) that qualitatively represents the spread among CMIP5 models have been analyzed in order to put the performance of ECHAM6–FESOM into context. Single realizations of the so-called historical simulations for the period 1950–2005 were considered. In these, observed greenhouse gas and aerosol forcing from 1850 onwards was prescribed rather than constant 1990 forcing. Because of the inertia inherent to the climate system, this difference in design needs to be taken into account when discussing differences between ECHAM6–FESOM, the CMIP5 models, and the observations. In particular, a cooler surface temperature especially over the oceans is to be expected in the CMIP5 model simulations, because some commitment warming remains unrealized. This also has consequences for other parameters such as 500 hPa geopotential height and precipitation.

3 Results

In the following, the performance of ECHAM6–FESOM in simulating the observed mean state of the atmosphere, ocean, and sea ice will be described. The results will be discussed in the context of systematic errors found for other state-of-the-art climate models.

3.1 Atmosphere

3.1.1 2-m temperature

Climatological 2-m temperatures (2mT) for boreal winter and summer as simulated by ECHAM6–FESOM are shown in Fig. 3 together with corresponding systematic errors. Over large parts of the globe, 2mT is simulated to within 1 K of the climatological means from ERA-40. The spatial pattern of biases is generally similar to that of MPI-ESM-LR (Stevens et al. 2013), except in the northern North Atlantic. MPI-ESM-LR is globally slightly cooler because of the different model set-up regarding greenhouse gas and aerosol forcing.

Fig. 3

a Climatological 2-m temperature (°C) during boreal winter (DJF) as simulated by ECHAM6–FESOM and b corresponding differences between ECHAM6–FESOM and ERA-40 reanalysis (Uppala et al. 2005). c and d same as a and b, but for boreal summer (JJA)

The largest biases of around 10 K are found over Antarctica with colder (warmer) temperatures occurring in austral summer (winter) in ECHAM6–FESOM compared to ERA-40. The warmer temperatures in austral winter extend into the Southern Ocean especially around the Greenwich meridian—a problem also present in the historical simulations of CMIP5 models, such as MPI-ESM-LR (Stevens et al. 2013) and HadGEM2 (Collins et al. 2011), as well as in a present-day control run of EC-EARTH (Hazeleger et al. 2012). However, substantial biases over Antarctica have been detected even in the reanalysis data used for the comparison (Klöwer et al. 2014; Bromwich and Fogt 2004).

Over the continents CCSM4 (Gent et al. 2011), HadGEM2, and MIROC-ESM (Watanabe et al. 2011) tend to show larger biases compared to ECHAM6–FESOM and MPI-ESM-LR. In the stratocumulus regions west of South America and west of South Africa a lack of stratocumulus clouds contributes to strong positive biases of up to 6 K during austral winter. This problem is typical for many coupled climate models (Stevens et al. 2013) although the magnitudes in MPI-ESM-LR, CCSM4, and HadGEM2 are slightly smaller.

Furthermore, during boreal winter a pronounced cold bias of up to 10 K can be seen in the Barents Sea. This bias is consistent with an excessive amount of sea ice in this area (see Sect. 3.2.2). The problem of too much ice is also present in CCSM4 and to some extent in HadGEM2. By contrast, MPI-ESM-LR has a pronounced warm bias of up to 8 K in an extended area around Greenland, Iceland and Spitsbergen, preventing the formation of excessive sea ice. In boreal summer there are cold biases of 1–3 K over large areas of the North Atlantic, the North Pacific, and the Arctic. This bias is slightly bigger in MPI-ESM-LR except for the area around Greenland and Iceland. As discussed further below, some of the 2mT errors in the North Atlantic region can be explained by shortcomings in simulating the North Atlantic circulation (e.g. the Gulf Stream separation).

3.1.2 Precipitation

Simulated total precipitation climatologies are shown in Fig. 4 along with their biases. In absolute terms the largest biases are found in the tropics and subtropics (Fig. 4). During both boreal winter and summer there are large areas with wet biases in excess of \(5\hbox { mm day}^{-1}\) associated with the Intertropical Convergence Zone (ITCZ). In the surrounding areas of downward vertical motion, dry biases locally exceeding \(5\hbox { mm day}^{-1}\) can be seen. This feature also exists in other models and is known as the double ITCZ problem (e.g. Lin 2007; Gent et al. 2011; Hazeleger et al. 2012). However, only when annual means (not shown) are considered instead of seasonal means does an augmented double ITCZ become visible; it is caused by too strong precipitation in both boreal winter and summer at the respective location of the ITCZ.

Fig. 4

Same as Fig. 3, but for total precipitation (mm/day) compared to GPCP (Adler et al. 2003; Huffman et al. 2009)

The precipitation in the mid- and high latitudes is similar between ECHAM6–FESOM and the GPCP data. It is important to note, however, that there is considerable uncertainty in precipitation observations. Stephens et al. (2012) point out, for example, that the GPCP dataset tends to underestimate precipitation. CMAP (Xie and Arkin 1998) is quite similar to GPCP in the tropics and subtropics, but over the northern North Atlantic and the northern North Pacific, as well as in the area of the Antarctic Circumpolar Current (ACC) around 60°S, CMAP shows around 1–2\(\hbox { mm day}^{-1}\) less precipitation than GPCP; in relative terms this amounts to more than 50 % locally. Given that GPCP is believed to be more accurate over the oceans (Yin et al. 2004), only results from GPCP data are used here.

3.1.3 Total cloud cover

Figure 5 shows a total cloud cover comparison between ECHAM6–FESOM and MODIS data; the results found for MODIS are equally valid for GOCCP (not shown). The strongest negative biases in total cloud cover can be found in subtropical stratocumulus areas west of South America, west of South Africa, and west of Australia; similar results are found for MPI-ESM-LR (Stevens et al. 2013).

Fig. 5

Same as Fig. 3, but for total cloud cover (%) compared to MODIS satellite data (Platnick et al. 2003; King et al. 2003)

For austral winter the negative bias is particularly strong, exceeding 60 % in some areas west of South America and west of South Africa. In the high latitudes total winter cloud cover is overestimated. In the summer hemisphere biases have a more complicated structure. It should be noted that simulated total cloud cover very much depends on how it is calculated from the cloud cover in each model layer, which hampers any total cloud cover assessment.

3.1.4 Outgoing longwave radiation

The pattern of outgoing longwave radiation (OLR) with maxima in the subtropical areas is reproduced by ECHAM6–FESOM (Fig. 6). Especially for boreal winter there are large areas in which differences between ECHAM6–FESOM and observations are less than \(5\hbox { W m}^{-2}\). In this season the strongest positive biases of up to \(50\hbox { W m}^{-2}\) occur around and just north of the equator, whereas the strongest negative biases have a similar magnitude and occur in the tropical and subtropical South Atlantic. In boreal summer the strongest positive biases of up to \(50\hbox { W m}^{-2}\) are restricted to the western tropical Pacific; the strongest negative biases, again of similar magnitude, occur over the tropical Atlantic and Indonesia. Not surprisingly, the biases are very similar to those in MPI-ESM-LR. In HadGEM2 and CCSM4 biases are of similar magnitude but show substantially different patterns. This suggests that the atmospheric model formulation is the main driver of the above-mentioned biases.

Fig. 6

Same as Fig. 3, but for top of the atmosphere outgoing longwave radiation (\(\hbox {W m}^{-2}\)) compared to CERES satellite data (Loeb et al. 2012)

3.1.5 10-m wind

Climatological 10-m winds of the ECHAM6–FESOM simulation are shown in Fig. 7 together with their differences from ERA-40 reanalysis data. The areas with positive bias are larger than the areas with negative bias, indicating that the near-surface winds tend to be too strong. Another feature is that the position of the strongest westerlies over the Southern Hemisphere is shifted to the north, as can be seen from a positive wind speed bias around 35–50°S and a negative wind speed bias around 50–60°S. In MPI-ESM-LR, near-surface wind biases are generally similar to those in our simulations, whereas in HadGEM2 the patterns are different, including more areas with too weak winds than in ECHAM6–FESOM and MPI-ESM-LR.

Fig. 7

Same as Fig. 3, but for wind vectors (m/s) 10 m above the surface. Wind vectors are plotted as arrows, and colors indicate the wind speed

3.1.6 500-hPa geopotential height

For the geopotential height of the 500-hPa level (Z500) there is a tendency towards negative biases in the mid-latitudes and positive biases in the high latitudes (Fig. 8). Together with a slightly negative bias of around 10 m in the tropics in boreal winter, this leads to a weakened meridional gradient in Z500. Over the Northern Hemisphere a wave pattern of the bias can be seen during wintertime, with negative biases of up to around 50 m over the British Isles and similar positive biases over the Bering Sea. The systematic error structure for Z500 shows that the model tends to produce winds that are too zonal over the Northern Hemisphere, which is a well-known problem of many atmospheric models (e.g. Jung 2005).

Fig. 8

Same as Fig. 3, but for 500-hPa geopotential height (m)

The cyclonic circulation bias over the north-eastern North Atlantic in boreal winter may have two causes. First, coarse-resolution atmospheric models tend to underestimate Euro-Atlantic blocking events (e.g. Jung et al. 2012). Second, a location bias of the Gulf Stream/North Atlantic Current can result in an atmospheric circulation bias which in turn affects Euro-Atlantic blocking (Keeley et al. 2012; Scaife et al. 2011). In fact, both our SST (Fig. 11a) and our atmospheric circulation biases are similar to the ones in the coupled climate model simulation of Keeley et al. (2012, their Fig. 1).

The negative biases in austral winter south of Australia (exceeding 80 m) and west of the Drake Passage (around 50 m) tend to amplify the observed wave pattern in the Southern Hemisphere, resulting in winds that are not zonal enough. In MPI-ESM-LR bias patterns are similar, but negative biases are up to 40 m stronger than in ECHAM6–FESOM, and all year round negative biases of around 20–40 m occur in the tropics. This is presumably a result of the colder tropical lower troposphere in MPI-ESM-LR compared to ECHAM6–FESOM. HadGEM2 also shows negative Z500 biases over the tropics all year round along with a tendency for too high Z500 over the polar regions. On the other hand, CCSM4 shows a very different bias pattern with too high subtropical and tropical Z500, possibly due to higher near-surface temperatures in these areas, and negative Z500 biases over and around Antarctica.

3.1.7 Performance indices

Recently it has become common practice to calculate performance indices (PI) to obtain an objective overview of the relative skill of different climate models in simulating observed aspects of the climate system. One such index has been developed by Reichler and Kim (2008). Their metric is based on the error variance derived from the simulated and observed climatological spatial patterns of different variables. The error variance is scaled by the observed interannual variance. Furthermore, they normalize this scaled error variance with the scaled error variance averaged over all models participating in the Coupled Model Intercomparison Project 3 (CMIP3).

Systematic errors in some of the observational data used in Reichler and Kim (2008) have since been identified and corrected (e.g. Tokinaga and Xie 2011). We therefore used a modified set of observational data, summarized in Sect. 2.5. The GPCP precipitation dataset was chosen because of its better reliability over the ocean compared to CMAP (Yin et al. 2004). For total cloud cover the MODIS dataset was selected because of the longer time series available compared to GOCCP.

Since different physical processes are likely to be responsible for biases in different seasons and different regions, the PI is computed in our study separately for the four seasons and for the following five regions: the Arctic (60–90°N), the Northern Hemisphere mid-latitudes (30–60°N), the tropics (30°N–30°S), the Southern Hemisphere mid-latitudes (30–60°S), and the Antarctic (60–90°S). Furthermore, the mean absolute error was used instead of the error variance and no scaling of the error with the observed interannual variance was applied. Reichler and Kim (2008) applied the scaling to account for the spatial variability of the variance, but this method may put too much emphasis on low-variability regions, including foremost the tropics. Therefore we omit the scaling but account for a large part of the mainly meridional variability pattern by splitting the globe into the five different regions given above.

Finally, the mean absolute error is normalized by the mean absolute error averaged over the five CMIP5 models MPI-ESM-LR (Stevens et al. 2013), HadGEM2 (Collins et al. 2011), CCSM4 (Gent et al. 2011), GFDL-CM3 (Griffies et al. 2011), and MIROC-ESM (Watanabe et al. 2011). As a result of this normalization, PIs below (above) 1 indicate that ECHAM6–FESOM performs better (worse) than the average of these five well-established CMIP5 models compared to the observations.
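The normalization described above can be illustrated with a minimal sketch. It assumes that the fields for one variable, one season, and one region are already available on a common set of grid points. The function names, the optional area weights, and all numbers are hypothetical and are not taken from the analysis code behind Table 1; how the indices of individual variables are combined is not covered here.

```python
from statistics import fmean


def mean_absolute_error(model, obs, weights=None):
    """Weighted mean absolute error between two equally long sequences."""
    if weights is None:
        weights = [1.0] * len(model)  # assumption: equal weights unless given
    total_weight = sum(weights)
    return sum(w * abs(m - o) for m, o, w in zip(model, obs, weights)) / total_weight


def performance_index(candidate, references, obs, weights=None):
    """Candidate MAE divided by the MAE averaged over the reference models."""
    candidate_mae = mean_absolute_error(candidate, obs, weights)
    reference_mae = fmean([mean_absolute_error(r, obs, weights) for r in references])
    return candidate_mae / reference_mae


# Made-up values for one variable in one region and season
obs = [10.0, 12.0, 14.0, 16.0]
candidate = [10.5, 11.5, 14.5, 15.5]      # MAE = 0.5
references = [
    [11.0, 13.0, 15.0, 17.0],             # MAE = 1.0
    [10.0, 12.0, 14.0, 16.0],             # MAE = 0.0
    [12.0, 14.0, 16.0, 18.0],             # MAE = 2.0
]
pi = performance_index(candidate, references, obs)
print(pi)
```

With these made-up numbers the reference models have an average error of 1.0, so the candidate obtains a PI of 0.5, that is, it performs better than the reference average.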

The modified PIs for ECHAM6–FESOM are shown in Table 1. PIs for MPI-ESM-LR, which uses the same atmospheric component as ECHAM6–FESOM, are also shown in order to allow for an assessment of the overall impact FESOM has on the climate of the coupled system. According to the PIs, ECHAM6–FESOM performs slightly better than the average over the five CMIP5 models. ECHAM6–FESOM also performs well compared to MPI-ESM-LR. Again it must be kept in mind though that slight differences in the PIs can occur because of the different forcing in our ECHAM6–FESOM simulation (constant greenhouse gas forcing of 1990) compared to the historical CMIP5 simulations. In summary, the objective performance indices presented in Table 1 indicate that ECHAM6–FESOM can compete with other state-of-the-art climate models.

Table 1 Modified performance index (PI) for five regions in ECHAM6–FESOM and MPI-ESM-LR

3.2 Ocean

3.2.1 Temperature and salinity

The time series of the globally averaged oceanic potential temperature is shown in Fig. 9 (blue curve). Evidently, the drift of ECHAM6–FESOM is characterized by a continuous ocean warming of about 0.001 K/year. The first 60 years of the time series correspond to the ocean-only spin-up of FESOM and show a positive trend comparable to that of the coupled system. Climate models generally tend to simulate higher than observed average ocean temperatures, and such drifts are found in many models under present-day and pre-industrial forcing (e.g. Griffies et al. 2011; Lucarini and Ragone 2011).

Fig. 9

Time series of the globally averaged ocean potential temperature (°C; blue line) and of the net residual (all components included) heat flux across the ocean surface (\(\hbox {W m}^{-2}\); red line). The temperature time series includes 60 years of the ocean-only spin-up

The positive imbalance in the net downward surface heat flux (Fig. 9, red curve) associated with the drift shows a slightly negative trend but remains non-zero, indicating that the model has not yet reached its equilibrium in terms of the energy balance at the air–sea interface. Note that there is no change in global salinity as the net freshwater flux across the ocean surface is forced to be zero.
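To relate a temperature drift to a surface heat flux imbalance, a back-of-the-envelope conversion can be sketched as follows. The mean ocean depth, density, and specific heat capacity used here are round textbook values chosen for illustration only; they are assumptions and not model parameters reported in this study.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def equivalent_surface_flux(drift_k_per_year, mean_depth_m=3700.0,
                            density=1025.0, heat_capacity=3990.0):
    """Surface heat flux (W m^-2) that warms a water column at the given rate.

    mean_depth_m, density (kg m^-3) and heat_capacity (J kg^-1 K^-1) are
    assumed round values, not taken from the model configuration.
    """
    column_heat_capacity = density * heat_capacity * mean_depth_m  # J m^-2 K^-1
    return column_heat_capacity * drift_k_per_year / SECONDS_PER_YEAR


flux = equivalent_surface_flux(0.001)  # drift of about 0.001 K/year
print(f"{flux:.2f} W m^-2")
```

Under these assumptions a drift of 0.001 K/year corresponds to a net downward flux of a few tenths of a \(\hbox {W m}^{-2}\) averaged over the ocean surface.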

A Hovmöller diagram of the horizontally averaged potential temperature as a function of depth (Fig. 10a) indicates that the excessive amount of heat is stored at mid-depth levels of the ocean (at about 1,000 m depth). At the same time, the upper ocean develops a cold bias that first occurs at the surface and then becomes stronger and propagates downward to a depth of about 200 m. A similar pattern is also simulated for the horizontally averaged salinity (Fig. 10b): the deep ocean becomes saltier and the upper ocean fresher. The overall effect on density is spatially inhomogeneous in the upper ocean. In the deeper ocean the density is generally reduced.

Fig. 10

a Hovmöller diagram for the global profile of oceanic potential temperature changes (°C) including 60 years from the ocean-only spin-up. b The same as a but for salinity changes (PSU). Departures from PHC climatology (Steele et al. 2001) are shown. The top and bottom panels on the right show the profiles of temperature and salinity, respectively. Blue lines correspond to the PHC climatology and the red lines to the year 410

A similar drift in ocean hydrography is also described in Sterl et al. (2012), Delworth et al. (2006, 2012), and Jungclaus et al. (2013). These authors discuss different factors that may be responsible for the bias. Sterl et al. (2012) show that overestimation of the Mediterranean outflow can significantly increase the deep-ocean salinity bias. Delworth et al. (2012) attribute this anomaly to the insufficient eddy transport required to compensate for the wind-driven subduction in the subtropical gyres. They show that moving towards an eddy-resolving setting or a parameterization of the eddy stirring reduces the temperature biases significantly. Jungclaus et al. (2013) suggest that part of the problem arises from the improper interbasin exchange between the Indian and South Atlantic oceans.

A more in-depth analysis of the evolution of the mid-depth bias in ECHAM6–FESOM (not shown) indicates that the bias originates primarily in three regions in the Atlantic Ocean: the Strait of Gibraltar, the Gulf Stream separation zone, and a latitudinal belt around 40°S. This fits well with the arguments by Sterl et al. (2012), Delworth et al. (2012) and Jungclaus et al. (2013).

We see that in ECHAM6–FESOM the Mediterranean Sea contributes most to the deep-ocean bias below 1,000 m through the modification of the outflow water and its pathways. The anomalously saline and warm waters are advected westward in the subtropical gyre, gradually modifying the water masses in the entire deep Atlantic. We suspect that the too strong surface winds and the associated offset in the buoyancy fluxes over the subtropical North Atlantic (see Fig. 7) contribute to the drift, since it appears neither before the coupling nor in longer uncoupled FESOM simulations (see Danabasoglu et al. 2014). The bias related to the Gulf Stream separation zone and to the position of the Gulf Stream and North Atlantic Current does not penetrate deeper than about 1,000 m. It is most pronounced in the coupled mode, indicating once again that the model drift may be explained by the atmospheric forcing, which drives the coupled ocean towards a different climatological state.

Further insight into the nature of the model drift can be obtained by inspecting the horizontal patterns of temperature and salinity biases at different levels (Figs. 11, 12). The simulated sea surface temperature (SST; Fig. 11a) shows a cold bias in excess of 5 °C east of Newfoundland. This error is shared by most existing climate models (e.g. Scaife et al. 2011; Gent et al. 2011; Delworth et al. 2012; Sterl et al. 2012; Jungclaus et al. 2013) and is associated with a too southerly position of the North Atlantic Current. The bias amplitude varies within 5–7 °C in CCSM3 and CCSM4 (Gent et al. 2011) depending on the model resolution. Different remedies have been proposed to reduce this problem: Gent et al. (2011), for example, argue that the bias can be reduced by inclusion of a Nordic Sea overflow parameterization (as introduced by Danabasoglu et al. 2010). Scaife et al. (2011) show that a North Atlantic resolution of about 0.25° is sufficient in their case to substantially reduce the cold SST bias compared to a resolution of about 1°; interestingly, their study suggests that reducing the cold SST bias in the northern North Atlantic might also help to enhance the ability of coupled models to simulate Euro-Atlantic blocking. In contrast, Jungclaus et al. (2013) compare two climate configurations with 1.5° and 0.4° resolution in the ocean and report only minor changes between the two setups. This suggests that, while the problem of simulating the proper location and strength of the North Atlantic Current is indeed common and shared by different state-of-the-art climate models, possible remedies may well be model specific.

Fig. 11

Horizontal deviations of potential temperature (°C) from PHC climatology (Steele et al. 2001) at a the surface, b 500 m, c 1,000 m, and d 1,500 m depth. The time average was taken over years 51–350 of the coupled model run. Note the change in the color bar range for different depth levels

Fig. 12

Same as Fig. 11, but for salinity (PSU). Note the change in the color bar range for different depth levels

The SST bias in ECHAM6–FESOM, with a pronounced high SST along the North American coast, points to a problem with the Gulf Stream separation. Furthermore, deficiencies in the equatorial coastal upwelling regions in the South Atlantic and Pacific appear, as well as a warm bias in the Southern Ocean. These model shortcomings are typical problems in existing climate models. Sea surface salinity (SSS; Fig. 12a) indicates that the cold and warm biases east of Newfoundland and at the North American coast are associated with a freshening and salinization of the waters in the respective regions. Freshening is found in the whole Pacific and across the Atlantic Ocean in the belt 10–20°S. The Indian and Southern oceans also show higher SSS, with the largest bias in the Indian Ocean.

The patterns of model biases at deeper layers, shown in Fig. 11b,c,d for temperature and in Fig. 12b,c,d for salinity, indicate that the excessive heat is primarily stored in the Atlantic Ocean with a maximum at a depth of about 1,000 m (see also Fig. 10a). Essentially the same holds for the spatial pattern of the salinity bias. It is worth mentioning that the Atlantic temperature and salinity biases in the deeper ocean are very similar to those reported by Sterl et al. (2012). The Strait of Gibraltar in Sterl et al. (2012) is represented by only one grid box with a width of 90 km, a factor of six larger than in reality, and the strait is too wide in ECHAM6–FESOM as well. This suggests that the deep-ocean salinity bias might be caused by an excessive Mediterranean outflow. However, an additional experiment with ECHAM6–FESOM with a narrower Strait of Gibraltar (not shown) did not show any improvements regarding the deep-ocean bias. Besides, as already mentioned, the bias evolves only when FESOM is coupled with ECHAM6. This corroborates the hypothesis that the deep Atlantic bias in ECHAM6–FESOM is rather caused by the bias in the atmospheric forcing.

The general conclusion from inspecting the ECHAM6–FESOM temperature and salinity bias is that the model climatology drifts in a systematic way from the observed climatology in response to the “offset” in momentum, heat and buoyancy fluxes. Hopefully, increasing resolution to explicitly resolve eddies would alleviate some of the above model errors. In principle, this would favor the modeling approach based on unstructured meshes, as one can resolve eddies locally where it is required. Yet this remains to be shown in forthcoming studies.

3.2.2 Sea ice

Sea ice is an important component of the climate system and is very sensitive to the model formulation. Sea ice concentration is one of the most reliably measured quantities since the beginning of the satellite era in the late 1970s; sea ice extent, defined as the area with ice concentrations greater than 15 %, is directly derived from sea ice concentration and is commonly used for model validation.
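The definition of sea ice extent given above can be written down as a short sketch. The function name, the flat list layout of the grid cells, and the sample values are hypothetical; on an actual model grid the cell areas would come from the mesh description.

```python
def sea_ice_extent(concentration, cell_area_km2, threshold=0.15):
    """Total area (km^2) of all cells whose ice concentration exceeds threshold.

    concentration: ice concentration per cell as a fraction between 0 and 1
    cell_area_km2: area of each cell in km^2 (same ordering)
    threshold:     0.15 corresponds to the 15 % criterion quoted in the text
    """
    return sum(area for conc, area in zip(concentration, cell_area_km2)
               if conc > threshold)


# Four illustrative cells: only the last two exceed 15 % concentration
concentration = [0.00, 0.10, 0.16, 0.90]
cell_area_km2 = [100.0, 100.0, 200.0, 300.0]
extent = sea_ice_extent(concentration, cell_area_km2)
print(extent)
```

Note that a cell counts with its full area once the threshold is exceeded, regardless of how much of it is actually ice covered; in the example the extent is 500 km² although the two contributing cells are only 16 % and 90 % ice covered.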

ECHAM6–FESOM simulates a mean Arctic sea ice extent of \(14.56\,\times 10^6\hbox { km}^2\) in March and \(6.14\,\times 10^6\hbox { km}^2\) in September. These values are lower than observational estimates of \(15.7\,\times 10^6\hbox { km}^2\) in March and \(7.0\,\times 10^6\hbox { km}^2\) in September (available online from the NASA Earth Observatory website, Fetterer et al. 2002). The apparent underestimation of Arctic sea ice extent can be explained by the Arctic Ocean area in the current ECHAM6–FESOM setup being approximately 10 % smaller than the actual area, largely because of an inaccurate representation of the Canadian Arctic Archipelago (CAA) and some features along the Siberian coast. The modeled sea ice extent in the Southern Hemisphere is also underestimated. The simulated values are \(1.19\,\times 10^6\hbox { km}^2\) in March and \(18.03\,\times 10^6\hbox { km}^2\) in September compared to the observations of \(2.9\,\times 10^6\hbox { km}^2\) and \(18.7\,\times 10^6\hbox { km}^2\) (Fetterer et al. 2002).

Generally, the sea ice thickness and concentration patterns in ECHAM6–FESOM in the Northern and Southern hemispheres (Fig. 13) are similar to reanalysis and observational estimates (Zhang and Rothrock 2005; Fetterer et al. 2002). Compared to MPI-ESM-LR, which employs the same atmospheric component, sea ice in ECHAM6–FESOM is thicker in March north of Greenland and north of the CAA; the maximum thickness of Arctic sea ice in March is above 4 m in ECHAM6–FESOM and about 3.5 m in MPI-ESM-LR (Notz et al. 2013; Jungclaus et al. 2013), the former fitting better with reanalysis data (Zhang and Rothrock 2005). ECHAM6–FESOM simulates too thick ice in the western Nordic Seas. Some of this ice is trapped in a gyre, pointing to a possible artifact in the ocean circulation. There is also extensive ice in the Barents Sea which is consistent with the pronounced local cold bias in the lower atmosphere during boreal winter mentioned in Sect. 3.1. Despite some deficiencies, comparison with other climate models under present-day forcing reveals that the sea ice thickness distribution simulated by ECHAM6–FESOM is fairly realistic (see Fig. s07 in Tietsche et al. 2014).

Fig. 13

Ice thickness (m) in March (left) and September (right) for a the Northern Hemisphere and b the Southern Hemisphere. The time average was taken over years 51–350 of the coupled model run. The white and black contours indicate mean ice concentrations of 15 and 85 %, respectively

Another model deficiency is seen in the Labrador Sea, which is intermittently covered by sea ice extending into the northwestern North Atlantic. The presence of sea ice in the Labrador Sea is associated with reduced convection and deep water formation. The latter is known to affect the amplitude and variability of the North Atlantic meridional overturning circulation (AMOC) (e.g. Eden and Willebrand 2001; Eden and Jung 2001). Reduced deep convection and winter freezing of the Labrador Sea are artifacts inherent to many existing climate models and are addressed in more detail in the following section.

3.2.3 Mixed layer depth and Labrador Sea bias

The mixed layer depth (MLD) as simulated by ECHAM6–FESOM is shown in Fig. 14a. There are three main centers of deep convection: the northeastern North Atlantic, the Greenland Sea, and the eastern part of the Weddell Sea; deep convection in the Labrador Sea as inferred from MLD is much less pronounced. Given the lack of observational maps for MLD we resort to a comparison with ocean-only integrations. The spread of MLD from seven different ocean models is shown in Fig. 15 of Griffies et al. (2009) (see also Sidorenko et al. 2011, for FESOM). Similar to some ocean-only models, ECHAM6–FESOM exhibits extreme MLD in the eastern North Atlantic. We can speculate that the deep density bias in the North Atlantic, associated with a less dense ocean, contributes to deeper MLD. This bias is largest in the eastern North Atlantic and is probably driven by surface forcing, as explained in Sect. 3.2.1. Note that the deep bias is outweighed by fresh surface water in the Labrador Sea. By and large, the comparison reveals that the MLD simulated by FESOM when coupled to ECHAM6, including the average MLD in the Labrador Sea of about 800 m, lies within the range of values obtained by other state-of-the-art ocean models.

Fig. 14

a Annual maximum of the mixed layer depth (m) averaged over years 51–350 of the coupled model run. b From top to bottom: time series of the annual Labrador Sea mixed layer depth maximum; normalized fresh water content (blue) and the sea surface elevation (red); freshwater exports across Davis Strait (green), Fram Strait (red) and the sum of both (blue); time series of the deep salinity in the Labrador Sea

The time series of the annual Labrador Sea MLD maximum is shown in the upper panel of Fig. 14b. It varies between about 2,000 and 200 m. Clearly evident are persistent episodes with strong and weak Labrador Sea convection. The latter is associated with sea ice spreading too far into the northwestern North Atlantic, resulting in non-zero annual-mean thickness values in that region (Fig. 13a).

A similar model issue has been reported in previous studies. Wood et al. (1999), for example, find a collapse of Labrador Sea deep convection when their model is forced with increased greenhouse gases. Stouffer et al. (2005) indicate that the cold Labrador bias in the CM2.0 version of the GFDL climate model has been removed in newer versions by choosing a smaller viscosity in the extratropical ocean, which increases the strength of the subpolar gyre circulation. A similar treatment was proposed by Jochum et al. (2008) and implemented in CCSM4 (Gent et al. 2011). A different study by Sterl et al. (2012) also describes such a “cold event”, appearing in one of the runs with the EC-Earth climate model; they attribute it to the “chaotic nature of the climate system”. The recent work by Drijfhout et al. (2013) discusses the mechanism behind this abrupt climate shift and argues that the initial cooling, responsible for the cold event, is caused by a period of enhanced atmospheric blocking over the eastern subpolar gyre.

Although the reason for the low skill of ECHAM6–FESOM in simulating the sea ice and MLD in the Labrador Sea is not yet completely understood, some analysis has been undertaken in an attempt to identify its cause. The discussion below relies on several integrations with a total duration of 1,500 years, but the illustrations use only the first few hundred years of the model run. The second panel (from top) in Fig. 14b (blue line) depicts the normalized time series of the fresh water content in the upper 300 m diagnosed in the middle of the Labrador Sea. As one would expect, this quantity indicates that the cold Labrador event (defined as the collapse of Labrador Sea MLD) is associated with an increase of the upper-ocean fresh water content. Sea surface height is significantly influenced by such freshening through its steric component as shown in the same plot (red line). The analysis suggests that the freshwater increase and the cold event occur simultaneously.

This implies that the origin of the cold events may be associated with the source of the upper-ocean fresh water. The time series of the fresh water transport through Davis and Fram Straits are therefore also shown in the third panel of Fig. 14b. A reference salinity of 34.8 psu is used for the computation of the fresh water content. The transport time series are almost perfectly anticorrelated: the cold (shallow MLD) events are associated with an increase in the fresh water transport through Fram Strait and with a decrease in the fresh water transport through Davis Strait. This is supported by evidence that the simulated fresh water exported through Davis Strait stays largely confined to the Labrador Current, whereas the fresh water coming through Fram Strait mixes laterally and affects the deep convection in the Labrador Sea (see e.g. Wekerle et al. 2013). This suggests that the Fram Strait fresh water export contributes to the cold events. A cross-correlation analysis (not shown), however, indicates that the export through Fram Strait lags the cold events and is thus a consequence rather than the cause of the event: Fram Strait fresh water export appears to be part of a positive feedback to the “Labrador freezing” phenomenon. Excluding the influence of the atmospheric buoyancy flux as a cause, fluctuations in the Labrador Sea freshwater export or the input of saline water from the Irminger Current seem to be causing the sporadically occurring cold events. It is worth mentioning that closing the fresh water budget for the upper layer in the whole Labrador Sea is difficult since deep convection takes place in the interior of the basin.
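For readers unfamiliar with freshwater diagnostics relative to a reference salinity, a minimal sketch of the commonly used definitions is given here. Whether the diagnostics behind Fig. 14b follow exactly this form is an assumption; the function names, the discretization into layers and section faces, and the sample values are hypothetical. Only the reference salinity of 34.8 psu is taken from the text.

```python
S_REF = 34.8  # reference salinity (psu) quoted in the text


def freshwater_content(salinity, layer_thickness_m, s_ref=S_REF):
    """Freshwater content (m) of a water column relative to s_ref.

    Sums (s_ref - S) / s_ref over the layers, weighted by layer thickness.
    """
    return sum((s_ref - s) / s_ref * dz
               for s, dz in zip(salinity, layer_thickness_m))


def freshwater_transport(velocity_m_s, salinity, face_area_m2, s_ref=S_REF):
    """Freshwater transport (m^3 s^-1) through a section relative to s_ref.

    velocity_m_s is the velocity normal to each section face; water fresher
    than s_ref moving in the positive direction gives a positive transport.
    """
    return sum(v * (s_ref - s) / s_ref * a
               for v, s, a in zip(velocity_m_s, salinity, face_area_m2))


# Illustrative column: 100 m of water at 33.06 psu above 200 m at 34.8 psu
content = freshwater_content([33.06, 34.8], [100.0, 200.0])

# Illustrative section with two faces of 1 km^2 each and 0.1 m/s normal flow
transport = freshwater_transport([0.1, 0.1], [34.8, 33.06], [1.0e6, 1.0e6])
print(content, transport)
```

Water at the reference salinity contributes nothing to either quantity, so in this example only the fresher layer and the fresher face contribute (about 5 m of freshwater content and about 5,000 m³/s of freshwater transport).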

A further analysis of the upper-ocean fresh water inflow and export (not shown) reveals that the decrease in fresh water export precedes the increase of the upper Labrador Sea fresh water content. This also suggests that some shift in the dynamics of the subpolar gyre takes place; it may be associated with the change in the upper as well as in the deeper ocean. The time series for the deep-ocean salinity anomaly at about 1,600 m (bottom panel in Fig. 14b) indicates that the cold events are indeed associated with changes also in the deep ocean. The high deep-ocean salinity during the cold events is not the imprint of the reduced deep convection, but results from anomalously saline and warm water advected into the Labrador Sea from the deep eastern North Atlantic. In fact, every cold event simulated by ECHAM6–FESOM is preceded by such an inflow of this anomalous water into the Labrador Sea. Because the temperature effect dominates, this water is anomalously light, explaining the weakening of the subpolar gyre. The origin of this water can be traced back to the salinity anomaly occurring in the eastern North Atlantic. As discussed in Sect. 3.2.1, the shift in the surface forcing modifies the deep waters originating near the Strait of Gibraltar. Note also that the first cold event happens shortly after the salinity bias has propagated from the eastern Atlantic to the deep Labrador basin. This suggests that the bias in the whole deep North Atlantic, presumably caused by a bias in atmospheric forcing, is at least partly responsible for these intermittent periods of reduced deep convection and winter freezing in the Labrador Sea.

A plausible solution to the problem then is the reduction of the wind bias and, by this means, the associated offset in the buoyancy fluxes in the coupled model. A sensitivity study shows a strong impact of the gravity wave drag formulation in ECHAM6–FESOM on the deep North Atlantic bias. On the other hand, the suggestion made by Jochum et al. (2008) to reduce the oceanic viscosity did not lead to great improvements in ECHAM6–FESOM. This indicates once more that possible remedies may well be model specific.

3.2.4 Ocean circulation

The large scale ocean circulation is now assessed in terms of barotropic and AMOC streamfunctions (Figs. 15, 16). In general, the ocean circulation simulated by ECHAM6–FESOM is similar to those produced by other climate and ocean-only models (e.g. Griffies et al. 2009; Jungclaus et al. 2013), with some differences in the strength of the main gyres. ECHAM6–FESOM simulates the Atlantic subtropical gyre with a maximum of 40 Sv at the American coast; its subpolar gyre is relatively weak with a maximum of only 20 Sv. The latter varies between 37 and 28 Sv in Jungclaus et al. (2013) for medium and low resolution settings, respectively. The strength of the South Atlantic subtropical gyre amounts to about 75 Sv and lies between the low and medium resolution settings from MPI-ESM with 90 and 62 Sv, respectively. The maximum of the Kuroshio subtropical gyre is 80 Sv and is identical to the low resolution setting in MPI-ESM and 20 Sv stronger than in the medium resolution run.

The strength of the ACC as derived from the Drake Passage transport is about 160 Sv in ECHAM6–FESOM. This is consistent with the observational estimate of 134 Sv and a provided uncertainty of 11–20 % (Cunningham et al. 2003). The Drake Passage transport in ECHAM6–FESOM also agrees with the ocean-only model results described in Griffies et al. (2009) and the stand-alone FESOM in Sidorenko et al. (2011). Except for two, all models described in Griffies et al. (2009) simulate a Drake Passage transport between 140 and 190 Sv. The transports reported in Jungclaus et al. (2013) are 169 and 185 Sv for low and medium resolutions, respectively. On the other hand, Delworth et al. (2012) report a lower value of 116 Sv for the Drake Passage transport in the newer CM2.5 model version, compared to 130–140 Sv in CM2.1.
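The comparison with the observational estimate can be made explicit with a small arithmetic check. It assumes that the quoted 11–20 % is to be read as a symmetric relative uncertainty around 134 Sv, which is an interpretation made here for illustration; the function names are hypothetical.

```python
def relative_range(estimate, relative_uncertainty):
    """Return (low, high) for a symmetric relative uncertainty."""
    delta = estimate * relative_uncertainty
    return estimate - delta, estimate + delta


def within(value, estimate, relative_uncertainty):
    """True if value lies inside the symmetric uncertainty range."""
    low, high = relative_range(estimate, relative_uncertainty)
    return low <= value <= high


OBSERVED_SV = 134.0    # Cunningham et al. (2003), as quoted in the text
SIMULATED_SV = 160.0   # ECHAM6-FESOM, as quoted in the text

for rel in (0.11, 0.20):
    low, high = relative_range(OBSERVED_SV, rel)
    inside = within(SIMULATED_SV, OBSERVED_SV, rel)
    print(f"{rel:.0%}: {low:.1f} to {high:.1f} Sv, simulated inside: {inside}")
```

Under this reading the simulated transport lies just inside the range for the 20 % bound (upper limit of about 161 Sv) but outside it for the 11 % bound (upper limit of about 149 Sv), so the agreement holds at the upper end of the observational uncertainty.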

Fig. 15

Barotropic streamfunction (Sv) computed by integrating the barotropic flow from north to south. The time average was taken over years 51–350 of the coupled model run

Fig. 16

a Atlantic meridional overturning streamfunction (Sv) as a function of latitude and depth. b The same as a, but as a function of latitude and potential density referenced to 2,000 m. The time average was taken over years 51–350 of the coupled model run

The Atlantic meridional overturning circulation (AMOC) is shown in Fig. 16 as a function of both depth and potential density (referenced to 2,000 m). The maximum of the basin-wide mid-depth cell is at 1,000 m and at a mean potential density of about 1,036\(\hbox { kg/m}^3\). The high tropical resolution results in a system of currents and a related pattern of upwelling at the equator and downwelling to its north and south. This modifies the mid-depth cell at the equator. This feature is not found at lower tropical resolutions in ECHAM6–FESOM, but agrees well with the results shown by Delworth et al. (2012, their Fig. 15).

The bottom cell, which corresponds to the Antarctic bottom water, is clearly reproduced and has a maximum of 2 Sv in the North Atlantic and above 5 Sv in the South Atlantic. The AMOC maximum at 45°N is only slightly above 12 Sv, which is significantly less than in most other stand-alone ocean and coupled climate models (e.g. Griffies et al. 2009, 2011; Delworth et al. 2012; Gent et al. 2011; Jungclaus et al. 2013; Danabasoglu et al. 2014). Also, when run in an ocean-only configuration, FESOM produces a relatively weak AMOC (Sidorenko et al. 2011; Danabasoglu et al. 2014). In the coupled model the weak AMOC affects the heat transport to the north (see Sect. 3.3), and may also be partly responsible for the episodic freezing of large parts of the Labrador Sea (see above). The weak AMOC in FESOM will be addressed in future work; particular attention will be paid to the resolution of the Faroe and Denmark Strait overflows, as these are believed to play a key role in setting the strength and variability of the AMOC (e.g. Beismann and Barnier 2004; Yeager and Danabasoglu 2012).

3.3 Meridional heat transports

The annual mean meridional heat transports (MHT) simulated by ECHAM6–FESOM (Fig. 17) are largely consistent with observational estimates. The latter are based on satellite measurements of the net radiation at the top of the atmosphere (Fasullo and Trenberth 2008), direct oceanic observations (Ganachaud and Wunsch 2003; Msadek et al. 2013), and atmospheric reanalyses (Fasullo and Trenberth 2008). The peak values of the simulated total (atmosphere + ocean) MHT are 5.8 PW at 35°N and \(-5.9\hbox { PW}\) at 35°S, agreeing well with observational estimates of (\(5.9\pm 0.3\)) PW and (\(-5.9\pm 0.5\)) PW found at the same latitudes (Fasullo and Trenberth 2008). This agreement indicates that the zonal-mean distribution of absorbed solar radiation, which is the main driver of the considerable spread in total MHT among coupled climate models and is in turn largely determined by cloud reflection properties (Donohoe and Battisti 2012), is simulated comparatively well in ECHAM6–FESOM.

Fig. 17

Mean meridional heat transports (PW, northward positive) in ECHAM6–FESOM. To compute the individual contributions from the Atlantic and Indo-Pacific oceans, a zero heat transport through the Bering Strait has been specified

The simulated atmospheric MHT peaks at 41°N with 5.0 PW and at 39°S with \(-5.6\) PW (Fig. 17); observed peak values occur at the same latitudes, with essentially the same magnitude in the northern hemisphere ([\(5.1\pm 0.5\)] PW) but with a smaller magnitude in the southern hemisphere ([\(-4.9\pm 0.2\)] PW). The latter apparent bias is compensated by a relatively weak poleward MHT in the southern extratropical ocean. The global ocean MHT is however reasonably simulated in the northern extratropics and in the tropics, with peak values of 1.6 PW at 19°N [(\(1.7\pm 0.3\)) PW at 15°N according to Fasullo and Trenberth (2008)] and \(-1.3\,\hbox {PW}\) at 11°S [(\(-1.2\pm 0.5\)) PW according to Fasullo and Trenberth (2008)].

Splitting the ocean MHT into contributions from the Atlantic and Indo-Pacific oceans (Fig. 17) and comparing with estimates based on direct observations reveals a considerable underestimation of northward energy transport in the Atlantic Ocean in ECHAM6–FESOM. Simulated values of 0.21 PW at 30°S, 0.22 PW at 19°S, 0.84 PW at 25°N, and 0.47 PW at 47°N are consistently lower than the respective observational estimates of (\(0.35\pm 0.15\)) PW, (\(0.77\pm 0.2\)) PW, (\(1.27\pm 0.15\)) PW, and (\(0.6\pm 0.09\)) PW in Ganachaud and Wunsch (2003), the value close to 25°N being even slightly higher according to Msadek et al. (2013). This bias in the Atlantic is largely compensated by the Indo-Pacific MHT, resulting in reasonable values for the global ocean MHT. Since the northward MHT in the Atlantic is dominated by the overturning component (Msadek et al. 2013), the Atlantic MHT bias corresponds to the weak AMOC in ECHAM6–FESOM and to the biases at the North Atlantic deep water formation sites (Sect. 3.2.3).
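The size of the Atlantic MHT deficit relative to the observational uncertainty can be tabulated with a short sketch. The numbers are those quoted above; treating the quoted uncertainties as symmetric bounds, as well as the variable and function names, are assumptions made here for illustration.

```python
# (latitude, simulated PW, observed PW, observational uncertainty PW),
# values as quoted in the text (observations: Ganachaud and Wunsch 2003)
ATLANTIC_MHT = [
    ("30S", 0.21, 0.35, 0.15),
    ("19S", 0.22, 0.77, 0.20),
    ("25N", 0.84, 1.27, 0.15),
    ("47N", 0.47, 0.60, 0.09),
]


def compare(entries):
    """Return (latitude, deficit in PW, inside-uncertainty flag) per entry."""
    rows = []
    for label, simulated, observed, uncertainty in entries:
        deficit = observed - simulated
        rows.append((label, round(deficit, 2), abs(deficit) <= uncertainty))
    return rows


rows = compare(ATLANTIC_MHT)
for label, deficit, inside in rows:
    print(f"{label}: deficit {deficit:.2f} PW, within uncertainty: {inside}")
```

Read this way, the simulated transport is below the central estimate at all four latitudes, but only the value at 30°S remains within the quoted observational uncertainty; the largest deficits occur at 19°S and 25°N.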

3.4 Effects of increased horizontal resolution in the tropical Pacific

In order to analyse the effect of the increased horizontal resolution in the tropical ocean (resolved in the current setting with up to 0.25°, see Fig. 1), we performed a second simulation (LOW) with coarser resolution in the tropical belt (\(\approx \!1^{\circ }\)). Outside the tropics the two grids are identical. The model setup of LOW, including initialization and simulation length, is identical to that of the simulation analyzed throughout this study (REF in the following).

With higher resolution, narrow equatorial current systems emerge in the Pacific. This is revealed by the tropical barotropic streamfunction difference between REF and LOW (Fig. 18). There is an increased eastward volume transport at the equator in REF. Simultaneously, we observe an increased westward volume transport at approximately 2°N and 3°S in REF. In accordance, a more vigorous Equatorial Undercurrent (EUC) and distinguishable branches of the South Equatorial Current (SEC) are found in the equatorial ocean (not shown). Due to the increased horizontal resolution, the EUC is narrower in REF with a core speed of more than \(0.9\hbox { m s}^{-1}\) at 150 m depth. The EUC in LOW is broader and shows a peak current speed of only 0.5–0.6\(\hbox { m s}^{-1}\) at 150 m depth. Consistent with REF, Wyrtki and Kilonsky (1984, their Fig. 2) observed an EUC core speed of more than \(0.9\hbox { m s}^{-1}\). The North Equatorial Countercurrent (NECC) also tends to strengthen with higher resolution. Outside the tropical band between 10°N and 10°S, REF and LOW show only minor differences in the barotropic streamfunction.

Fig. 18

Mean barotropic streamfunction (Sv) for the tropical Indian and Pacific Ocean for a REF (\(\approx 0.25^{\circ }\) tropical resolution) and b LOW (\(\approx 1^{\circ }\)), years 51–350. c Difference in the barotropic streamfunction (REF − LOW). Streamfunctions have been computed by integrating from north to south

Consistent with the improved equatorial current systems in the Pacific, the annual mean SST in the Western Pacific warm pool region increases. A positive atmospheric feedback appears to strengthen this initial anomaly: convergent anomalous surface winds at the equator (not shown) result in anomalous downwelling and a further increase in SST. Overall, the mean SST in the warm pool region is higher by up to 1 K for REF compared to LOW (Fig. 19). Thus the equatorial cold tongue bias, present in most coupled climate models, is reduced by up to 1 K in REF compared to LOW. The improved equatorial ocean background state in REF compared to LOW has a positive effect on the representation of ENSO; details on this will be reported elsewhere.

Fig. 19 Difference (REF minus LOW) in mean surface temperature (SST over the ocean) between REF (\(\approx \!0.25^{\circ }\) tropical resolution) and LOW (\(\approx \!1^{\circ }\)), years 51–350.

The effects of increased horizontal resolution in the tropical ocean are strongest in the Pacific. In principle, ECHAM6–FESOM would allow the tropical resolution to be increased in the Pacific only. In that way, the improvement gained in the simulation of the tropical Pacific could be retained while reducing the number of grid points and hence the computational burden. One would expect that increased horizontal resolution in other key regions, such as the Gulf Stream/North Atlantic or the Arctic Ocean, also has the potential to improve the modeled climate. This will be investigated in forthcoming studies.
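The potential saving from refining only the tropical Pacific can be illustrated with a back-of-the-envelope node count. The sketch below is not based on the actual ECHAM6–FESOM meshes: it assumes a hypothetical tropical band 20° wide, a hypothetical Pacific longitude extent of 150°, and nominal resolutions of 0.25° and 1° as in REF and LOW. It ignores land, the transition zones between resolutions, and any effect on the time step.

```python
def estimate_band_nodes(lon_extent_deg, lat_extent_deg, resolution_deg):
    """Rough surface node count for a latitude-longitude box that is covered
    at one nominal resolution (near the equator, so no cos(lat) factor)."""
    if resolution_deg <= 0:
        raise ValueError("resolution_deg must be positive")
    return lon_extent_deg * lat_extent_deg / resolution_deg**2


def compare_tropical_refinement(band_width_deg=20.0,
                                pacific_lon_extent_deg=150.0,
                                fine_deg=0.25,
                                coarse_deg=1.0):
    """Return (nodes with the whole band refined, nodes with Pacific-only refinement).

    All default values are illustrative assumptions, not mesh parameters
    taken from the model.
    """
    whole_band = estimate_band_nodes(360.0, band_width_deg, fine_deg)
    pacific_only = (
        estimate_band_nodes(pacific_lon_extent_deg, band_width_deg, fine_deg)
        + estimate_band_nodes(360.0 - pacific_lon_extent_deg, band_width_deg, coarse_deg)
    )
    return whole_band, pacific_only


whole_band, pacific_only = compare_tropical_refinement()
print(f"whole band refined:   {whole_band:.0f} nodes")
print(f"Pacific-only refined: {pacific_only:.0f} nodes")
print(f"ratio:                {pacific_only / whole_band:.2f}")
```

With these placeholder values, the tropical band would need somewhat less than half as many surface nodes when only the Pacific sector is refined. The actual saving depends on the real basin geometry and on how smoothly the mesh resolution is varied.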

4 Summary and conclusions

A new coupled climate model, ECHAM6–FESOM, has been developed. Employing an unstructured grid for the sea ice-ocean component, ECHAM6–FESOM is the first coupled model of its kind and represents a major step towards multi-resolution climate modeling. In its present configuration, the overall fidelity of ECHAM6–FESOM in simulating the observed mean climate is comparable to that of some of the most realistic CMIP5 models. There are still a number of model shortcomings, however, such as an AMOC that is too weak, intermittent periods of winter ‘freezing’ of the whole Labrador Sea, and the development of large temperature and salinity biases in the deep North Atlantic. It has been argued that the deep ocean bias, presumably caused by a systematic shift in the surface winds, is likely a prerequisite for the Labrador freezing events. Understanding and rectifying these issues will be the subject of future research. The performance of ECHAM6–FESOM in simulating climate variability has not been addressed in this paper and will be the subject of a follow-up paper (Rackow et al. 2014).

In ECHAM6–FESOM, a multi-resolution approach is employed only for the sea ice-ocean component. The choice of a regular-grid atmosphere model was not made for practical reasons alone: it appears reasonable that the multi-resolution approach is particularly well suited for modeling the sea ice-ocean system with its strong boundary currents, localized deep convection regions, and complex geometry. Multi-resolution techniques are also particularly well suited for the representation of ice shelf-ocean interactions (Timmermann et al. 2012), which have attracted an increasing amount of attention in recent years (e.g., Hellmer et al. 2012).

In this study, an unstructured grid with moderate stretching factors was used for the sea ice-ocean component. This was done to make the new climate model more readily comparable to existing, more traditional systems and to provide a benchmark against which future experiments with strong grid refinements (e.g., for the western boundary currents, upwelling regions, or overflows) can be evaluated. It has been shown that an isotropically increased tropical resolution leads to a more realistic equatorial current system and a smaller cold SST bias in the tropical Pacific. In future studies, setups with increased horizontal resolution in other key regions, such as the Gulf Stream/North Atlantic and the Arctic Ocean, will be explored. Given that unstructured grid ocean models tend to be computationally more expensive per degree of freedom than traditional quasi-regular grid models, a major goal will be to exploit the potential of unstructured grid modeling with an optimized local refinement, such that the benefits for the simulated climate outweigh the computational costs.