Evaluation of an ensemble of Arctic regional climate models: spatiotemporal fields during the SHEBA year
- Cite this article as:
- Rinke, A., Dethloff, K., Cassano, J.J. et al. Clim Dyn (2006) 26: 459. doi:10.1007/s00382-005-0095-3
Simulations of eight different regional climate models (RCMs) have been performed for the period September 1997–September 1998, which coincides with the Surface Heat Budget of the Arctic Ocean (SHEBA) project period. Each of the models employed approximately the same domain covering the western Arctic, the same horizontal resolution of 50 km, and the same boundary forcing. The models differ in their vertical resolution as well as in the treatments of dynamics and physical parameterizations. Both the common features and differences of the simulated spatiotemporal patterns of geopotential, temperature, cloud cover, and long-/shortwave downward radiation between the individual model simulations are investigated. With this work, we quantify the scatter among the models and therefore the magnitude of disagreement and unreliability of current Arctic RCM simulations. Even with the relatively constrained experimental design we notice a considerable scatter among the different RCMs. We found the largest across-model scatter in the 2 m temperature over land, in the surface radiation fluxes, and in the cloud cover which implies a reduced confidence level for these variables.
Recent observations and climate modeling results (e.g., Johannessen et al. 2004; ACIA 2004) have highlighted the Arctic as a region of particular vulnerability to global climate change. Despite the climatic significance of the Arctic, many physical processes occurring in this region are still not well understood. Hence, it is not surprising that simulated climates of the Arctic vary widely (Walsh et al. 2002; Räisänen 2002; Covey et al. 2003), depending on the choice of climate model and physical parameterizations. Large variations have been found among the global climate model (GCM) simulations of Arctic sea level pressure, surface temperature, precipitation, and cloud cover. To arrive at more credible present-day Arctic simulations and estimates of possible Arctic climate changes, an improved understanding and modeling of Arctic climate processes is necessary. One useful approach to this end is the application of regional climate models (RCMs) to the Arctic region. The rationale for constructing a high-resolution RCM of the Arctic atmosphere is that the treatment of orography, the land-sea mask, and the physical processes is limited in GCMs by both vertical and horizontal model resolution. Using RCMs with specified “perfect” lateral boundary conditions prevents problems originating at lower latitudes from contaminating the results in the Arctic. Further, deficiencies of GCMs in describing the Arctic climate are at least partly due to inadequate parameterizations of important Arctic physical processes. Recent RCM studies have indicated the importance of accurate representation of momentum, heat and moisture exchange in the PBL (Dethloff et al. 2001; Mirocha and Curry, in preparation; Tjernström et al. 2005), surface albedo (Køltzow et al. 2003), cloud-radiation interaction (Cassano et al. 2001; Bromwich et al. 2001; Girard and Blanchet 2001; Morrison et al. 2003; Jones et al. 2004a), and Greenland topography (Box and Rinke 2003) for Arctic simulations.
Additionally, the need for RCM simulations is increasing in the context of regional integrated assessment and socio-economic applications (e.g., Lynch et al. 2004). This rationale has convinced several groups to construct RCMs for Arctic domains.
To evaluate RCM simulations in the Arctic, the Arctic Regional Climate Model Intercomparison Project (ARCMIP; Curry and Lynch 2002; http://www.curry.eas.gatech.edu/ARCMIP/index.html) has been organized. The primary ARCMIP activities focus on coordinated simulations by different Arctic RCMs to determine the level of uncertainty in current Arctic RCM simulations. The simulation experiments are carefully designed so that each of the models is operating under approximately the same external constraints (same boundary conditions, horizontal resolution, and domain size). The first ARCMIP experiment has been conducted for the 1997/1998 period of the Surface Heat Budget of the Arctic Ocean (SHEBA; Uttal et al. 2002) project, which included extensive field observations and accompanying satellite analyses. For the single SHEBA station point, the different SHEBA observations, satellite observations, and analysis products have been compared with each other (J. Liu et al., submitted) and have been used to evaluate RCM simulations (Tjernström et al. 2005).
The purpose of this paper is to evaluate and to intercompare the spatiotemporal distributions of simulated Arctic fields produced by eight different RCMs for the SHEBA-period. The differences between the models will be assessed with the aim to quantify the scatter among them and therefore the magnitude of disagreement and unreliability of current Arctic RCM simulations. Such an estimate is important in the light of future Arctic climate change simulations. The participating models, experimental design and analysis tools are described in Sect. 2. Results of the evaluation are presented sequentially, for the seasonal spatial patterns of the model ensemble (bias and across-model spread) in Sect. 3.1, and for the individual and ensemble height profiles in Sect. 3.2. Section 4 summarizes the results and presents conclusions.
2 Models, experimental design, and analysis
2.1 Regional climate models
Table 1 Grid information on the participating models
University of Colorado
Staggered Arakawa-B grid 70×55 grid points Polar-stereographic grid
Sigma Top at 10 hPa
University of Stockholm
Arakawa-C grid 70×55 grid points Polar-stereographic grid
Sigma-z Top at 34,800 m
University of Quebec
Arakawa-C grid 7,056 grid points Polar-stereographic grid
Gal-Chen Top at 29,042 m
Arakawa-C grid 66×53 grid points Rotated latitude/longitude grid
Sigma-p Top at 10 hPa
Arakawa-C grid 66×53 grid points Rotated latitude/longitude grid
Sigma-p Top at 10 hPa
University of Colorado
Staggered Arakawa-B grid 70×55 grid points Polar-stereographic grid
Sigma Top at 10 hPa
Arakawa-C grid 78×60 grid points Rotated latitude/longitude grid
Sigma-p Top at 1 hPa
Arakawa-C grid 73×55 grid points Rotated latitude/longitude grid
Sigma-p Top at 10 hPa
Basic information on the participating models
Lateral boundary scheme; Nudging zone
Exponential Relaxation; Giorgi et al. (1993); 5 points
Relaxation; Davies (1976); 5 points
Leapfrog with Asselin filter
Relaxation; Davies (1976); 9 points
Relaxation; Davies (1976); 6 points
Christensen et al. (1996)
Relaxation; Davies (1976); 5 points
Linear relaxation; Dudhia (1993); 5 points
Relaxation; Davies (1976); 8 points
2 time level
Relaxation; Davies (1976); 8 points
2.2 Experimental design
The ARCMIP1 experiment described here is focused on the SHEBA year, October 1997–September 1998. The simulation domain (Fig. 1) covers the western Arctic, including the SHEBA drift trajectory. The models are initialized in as common a manner as possible on 1 September 1997 and run continuously until 30 September 1998 for a 13-month-long simulation. The initial conditions are obtained from the ECMWF (European Centre for Medium-Range Weather Forecasts) operational analyses. Because they are treated differently among the models, orography, soil moisture, and soil temperature have been initialized by each group individually. The surface temperatures over land and glaciers are not prescribed, but calculated individually by each model, using its own energy balance calculations. All models used a common forcing for the lateral atmospheric boundary and the lower ocean/sea ice boundary. The lateral boundary conditions for the atmospheric prognostic variables (surface pressure, horizontal wind, temperature, specific humidity) were obtained from the 6-hourly ECMWF operational analyses. The lower boundary conditions for the sea ice fraction were determined from 6-hourly SSM/I satellite data (Bootstrap algorithm; Comiso 2002; http://www.nsidc.org/data/nsidc-0079.html). The lower boundary conditions for SST and sea ice temperature were taken from derived 6-hourly AVHRR satellite data (Key 2001), which were subsequently adjusted to be consistent with the sea ice data.
All models use slightly different model domains and grid types (see Table 1), therefore it was necessary to interpolate all model results to a common intercomparison grid (Fig. 1). This grid has 61×48 grid points and a horizontal resolution of 50 km. For the grid interpolation from the individual grids to the intercomparison grid, a linear triangle interpolation has been used.
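The paper does not spell out the details of the linear triangle interpolation. As a minimal sketch, assuming the usual barycentric form, the value at a target point inside a triangle of three source grid points could be computed as below. The function names, the coordinates (in km), and the sample values are hypothetical, and the search for the enclosing triangle (for example via a Delaunay triangulation of the source grid) is left out.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of point p with respect to triangle (a, b, c)."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_in_triangle(p, vertices, values):
    """Linearly interpolate the three vertex values to point p inside the triangle."""
    wa, wb, wc = barycentric_weights(p, *vertices)
    if min(wa, wb, wc) < -1e-12:
        raise ValueError("target point lies outside the triangle")
    return wa * values[0] + wb * values[1] + wc * values[2]


# Hypothetical source triangle on a 50 km grid with 2 m temperatures in K.
triangle = ((0.0, 0.0), (50.0, 0.0), (0.0, 50.0))
temperatures = (270.0, 272.0, 268.0)
value_at_target = interpolate_in_triangle((10.0, 20.0), triangle, temperatures)
```

At a vertex the weights reduce to (1, 0, 0), so the interpolated field reproduces the source values exactly at the source grid points.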
Six variables were selected to be presented and discussed here: two key dynamical variables (geopotential heights at 850 hPa and 500 hPa; z850 and z500), two key variables which are important to force sea ice models (longwave and shortwave downward radiation at the surface; LWD and SWD), 2 m air temperature because of its interest for the impact community, and total cloud cover because of its importance for the radiation budget. For the total cloud cover, one model (PolarMM5) did not submit data. Seasonal means have been calculated. The paper presents the winter (December 1997–February 1998) and summer (June–August 1998) results.
The RCMs’ spatiotemporal patterns are evaluated using numerical weather prediction (NWP) analysis products. However, a critical issue in using these analyses to evaluate the RCM performance is the accuracy of these analyses in the Arctic. The Arctic represents a challenge for data assimilation, given the relative lack of in situ data over the Arctic Ocean and the challenges of remote sensing from satellite in this region. In extremely cold and stable conditions, the quality of the background field, the handling of model and observation biases, and the accuracy of the linear forward models used to go from model space to satellite radiance space have to be scrutinized carefully and often lead to problems in the analysis (Viterbo 2003). In the data-sparse polar region, the analysis is strongly influenced by the analysis model parameterizations. The two most widely used numerical weather prediction analysis products are the NCEP/NCAR reanalysis and the ECMWF (re-)analysis. To assess the utility of these analysis products for the presented evaluation, we examined the differences between the NCEP/NCAR reanalysis and the ECMWF (re-)analysis for the SHEBA period. The differences between them (“inter-analysis bias”) are at most ±20 m for geopotential heights at 850 and 500 hPa, and ±4 K (±1 K) and ±1 K (±1 K) for the 850 hPa and 500 hPa winter (summer) temperatures, respectively. Additional insight into the quality of the analyses in the Arctic can be obtained from comparing them to in situ observations. Serreze (2003) compared the 2 m temperature from the ERA40 analyses to the International Arctic Buoy Program/Polar Exchange at the Sea Surface (IABP/POLES) data set (Martin and Munoz 1997). He found that, in general, the biases over the central Arctic Ocean (where data coverage is best) are within ±3 K in all months. In the Beaufort Sea (our study area), the bias is within ±1 K.
Alaska and other land regions appear to have cold biases between April and October (of up to 3 K). Hagemann et al. (2004) showed a warm ERA40 winter bias (up to 3 K) over Arctic land compared to the CRU2 surface air temperature (Jones and Moberg 2003). Concerning the surface radiation, at the local scale, ERA40 and satellite retrievals have been compared to in situ SHEBA measurements. J. Liu et al. (submitted) found that ERA40 has the smallest biases in SWD and gives the best statistics for the LWD for the entire SHEBA period (the mean radiation biases over different periods are within ±10 W/m2). At the Arctic basin scale, ERA40 data have quite consistent spatiotemporal patterns for SWD and LWD associated with the seasonal cycle. However, the different data sets (data analyses, satellite estimates) vary in their dominant patterns linked to synoptic variability. Evaluating clouds in the NWP analyses is particularly challenging in the Arctic. Clouds observed from the surface by different instruments during the polar night differ by up to 30% (Wyser and Jones 2005). Satellite cloud retrievals are also challenging in the Arctic owing to the difficulty in discriminating between clouds and the underlying snow/ice surface.
Given these considerations, the simulated z850 and z500 are evaluated here against the ECMWF operational analysis data (i.e., the lateral driving data used; resolution: 2×2 degrees and 31 levels). The simulated LWD, SWD, 2 m air temperature, and total cloud cover are evaluated against ECMWF re-analysis data (ERA40; resolution: T159 and 60 levels; Simmons and Gibson 2000).
3 Model ensemble results
Results of the evaluation of the spatial patterns of seasonal mean geopotential height, temperature, surface radiative fluxes and clouds and of the temporal development of the temperature- and humidity profiles are presented for the model ensemble.
The examination of the model ensemble statistics is motivated by, and aims to shed light on, a number of questions. In which areas (variables and/or geographical/height locations) do the models have difficulty in reaching a consensus, implying a reduced confidence level in the Arctic RCM results? Do the models have similar errors? If they do, are there common problems in the models? Probably, a small inter-model scatter and a large bias are, in some ways, preferable to the opposite.
The multi-model ensemble average (or model ensemble mean) of a certain variable is the average across the individual model values. These model ensemble means are compared with the ECMWF (re-)analyses (“model ensemble mean minus ECMWF”, model ensemble performance bias) to illustrate how successfully the ensemble mean simulates the observed climate. To characterize the across-model spread and the consistency among models, the intermodel standard deviation (sigma) has been calculated. A small sigma value, along with a small bias, indicates agreement among the models and shows that they capture the processes that govern the variable. A large sigma value indicates disagreement and unreliability in the simulations. Additionally, the across-model scatter will be discussed in relation to the magnitude of natural variability.
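The three quantities defined above can be sketched per grid point as follows. This is a minimal illustration, not the authors' processing code: the function name, the model labels, and the numbers are hypothetical, and the population form of the standard deviation (division by the number of models) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def ensemble_statistics(model_fields, reference):
    """Return (ensemble mean, performance bias, intermodel sigma) per grid point.

    model_fields maps a model name to its values on the common grid;
    reference holds the analysis values on the same grid points.
    """
    fields = list(model_fields.values())
    if any(len(f) != len(reference) for f in fields):
        raise ValueError("all fields must share the common grid")
    n_points = len(reference)
    ens_mean = [mean(f[i] for f in fields) for i in range(n_points)]
    # Performance bias: "model ensemble mean minus ECMWF".
    bias = [m - r for m, r in zip(ens_mean, reference)]
    # Assumption: population standard deviation across the models.
    sigma = [pstdev([f[i] for f in fields]) for i in range(n_points)]
    return ens_mean, bias, sigma


# Hypothetical 2 m temperatures (K) from three models at two grid points.
models = {
    "model_a": [270.0, 280.0],
    "model_b": [272.0, 284.0],
    "model_c": [274.0, 282.0],
}
analysis = [271.0, 283.0]
ens_mean, bias, sigma = ensemble_statistics(models, analysis)
```

Replacing `pstdev` with `stdev` from the same module would give the sample form (division by the number of models minus one), which yields somewhat larger values for a small ensemble.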
3.1 Seasonal spatial patterns
In this section, the spatial structures of the model ensemble performance bias and of the intermodel standard deviation of the six selected variables are presented for winter and summer.
3.1.1 Model ensemble performance biases
As the models are tightly constrained to follow the lateral forcing in a small domain, small large-scale flow biases are expected. For the geopotential height, the model ensemble performance bias is between −25 m and +5 m for both seasons. The uncertainty in the “observed” geopotential heights as determined by the difference between the NCEP and ECMWF analyses (inter-analysis bias) is up to ±20 m. Hence, the ensemble bias sometimes exceeds the inter-analysis bias (e.g., for z500 over Alaska and the Chukchi Peninsula in both seasons). The relative bias at 500 hPa is comparable to that at 850 hPa. An interesting feature is that the geopotential bias has the same pattern over the Beaufort/Chukchi Seas for both seasons, namely an underestimation of the height by the model mean compared to the ECMWF analysis. This bias feature is of barotropic nature, characterizing the entire column, and is seen for all seasons. Factors that may contribute to the bias likely lie at lower levels of the atmospheric column and relate to parameterized processes and, more generally, to the different surface forcing (SST, sea ice) used in the ARCMIP1 experiment and in the operational ECMWF analysis. To some degree, the large-scale flow bias patterns common to all models are surprising, as each model is known to normally develop its own bias structure and internal variability within the domain. But here, the small domain circumvents this to some extent.
Examination of the bias in the 2 m air temperature shows that the model ensemble has a bias in both seasons between −5 K and +5 K with the largest biases occurring along the coasts and over land with orographic structure. Factors responsible for the bias are the different orography/land-sea masks and horizontal resolutions in the RCMs relative to ERA40. Differences in the downwelling surface radiation are associated with different 2 m air temperature over land. Additionally, different land surface schemes, soil moisture- and soil temperature initialization are used in the different models and ERA40 and contribute therefore to the differences. To unravel the different contributions is very difficult. The 2 m temperature differences over ocean and sea ice are within ±3 K and are caused by differences in the surface specification (SST, sea ice) between ERA40 and ARCMIP1 experiment. Such differences in the surface forcing can contribute to differences in the geopotential height over the entire troposphere (e.g., Rinke and Dethloff 2000).
The models’ ensemble mean cloud biases relative to the ERA40 data are mainly within ±15% in all seasons within the model domain. The clouds in the Beaufort/Chukchi Seas are overestimated by the ensemble mean in winter by 5–20% and in spring/summer by 5–15%. Relative to the ensemble mean, the overestimation is on the order of 5–20%. Over land, the biases vary strongly by region. In winter, all models more or less overestimate the cloud cover in the Beaufort/Chukchi Seas (but have quite different magnitudes), while the models do not agree during summer. In summer, over the Beaufort/Chukchi Seas and the part of the Arctic Ocean, three models (ARCSyM, COAMPS, RCA) overestimate clouds by up to 40%, while the other four models (HIRHAM, REMO, RegCM, CRCM) underestimate the clouds by up to 40% in this area. Generally, the biases are within the inter-analysis uncertainty. Given this and the large uncertainty in cloud observations, little can be said about the veracity of the RCM cloud simulations.
During winter, the absolute SWD biases are very small while they are large in summer. However, the relative SWD biases (relative to the ensemble mean) are of comparable order for the different seasons. During summer, the SWD absolute bias shows an underestimation in the inner domain of 5–50 W/m2 which is consistent with the summertime cloud overestimation there. The SWD bias is almost a mirror image of the bias in cloud cover. The maximal underestimation occurs along the coasts (up to 50 W/m2). The individual model performance biases are quite different, with severe biases in some models (in ARCSyM and PolarMM5 during summer with underestimation of up to 90 W/m2 in the Beaufort Sea near the coast of Alaska).
Root mean square error, calculated for the model ensemble mean, median, and the individual models
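As a minimal sketch of how root mean square errors of this kind might be computed for the individual models, the ensemble mean, and the ensemble median, consider the following. The function names, model labels, and numbers are hypothetical, and the unweighted average over grid points is an assumption (no area weighting is applied).

```python
from math import sqrt
from statistics import mean, median


def rmse(simulated, reference):
    """Root mean square error over all grid points (unweighted)."""
    if len(simulated) != len(reference):
        raise ValueError("fields must share the same grid")
    return sqrt(mean((s - r) ** 2 for s, r in zip(simulated, reference)))


def rmse_table(model_fields, reference):
    """Return the RMSE of each model, the ensemble mean, and the ensemble median."""
    names = list(model_fields)
    # One tuple per grid point, holding the values of all models there.
    per_point = list(zip(*(model_fields[n] for n in names)))
    table = {n: rmse(model_fields[n], reference) for n in names}
    table["ensemble mean"] = rmse([mean(p) for p in per_point], reference)
    table["ensemble median"] = rmse([median(p) for p in per_point], reference)
    return table


# Hypothetical fields at two grid points; model_c acts as an outlier.
fields = {"model_a": [1.0, 3.0], "model_b": [2.0, 2.0], "model_c": [6.0, 4.0]}
scores = rmse_table(fields, [2.0, 2.0])
```

In this toy case the median scores better than the mean because a single outlying model pulls the mean away from the reference, while one individual model scores better than both.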
3.1.2 Intermodel standard deviations
The absolute magnitudes of the across-model scatter have been put into the context of the internal variability. Ideally, this internal variability should be model-derived, but because that is not available for the limited SHEBA period, the observed interannual standard deviation from ERA40 data (period 1979–2001) has been used. A small ratio (<1) indicates that the RCMs’ agreement with each other is relatively high, compared with natural variability, i.e. the model spread is insignificant. A ratio much larger than one indicates that the model spread is significant. Figure 3a, b show the ratio of the intermodel standard deviation (i.e., the across-model scatter) to the observed interannual standard deviation, for winter and summer, respectively.
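The ratio described above can be sketched as follows. The function names and the sample numbers are hypothetical; only the threshold of one comes from the text, which does not quantify how much larger than one a ratio must be to count as significant.

```python
def scatter_ratio(intermodel_sigma, interannual_sigma):
    """Ratio of across-model scatter to interannual standard deviation per grid point."""
    if len(intermodel_sigma) != len(interannual_sigma):
        raise ValueError("fields must share the same grid")
    ratios = []
    for s_model, s_ref in zip(intermodel_sigma, interannual_sigma):
        if s_ref <= 0:
            raise ValueError("interannual standard deviation must be positive")
        ratios.append(s_model / s_ref)
    return ratios


def below_one(ratios):
    """Flag grid points where the model spread is smaller than natural variability."""
    return [r < 1.0 for r in ratios]


# Hypothetical values: a geopotential-like point and a land-temperature-like point.
ratios = scatter_ratio([10.0, 4.0], [40.0, 2.0])
flags = below_one(ratios)
```

Here the first point has a ratio of 0.25 (spread smaller than the interannual variability), and the second a ratio of 2 (spread larger than it).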
The across-model scatter in the geopotential heights is between 5 and 20 m. However, compared with the interannual variability which is 2–5 times larger, the scatter is insignificant.
The across-model scatter in the 2 m air temperature is large over land areas, on the order of 1–5 K. In contrast to the ocean, where the surface temperature was prescribed, the land surface temperatures were not specified but were allowed to develop freely within each model, which gives rise to this scatter. The different land-surface schemes as well as the differences in the radiation budget are responsible for the large intermodel land temperature scatter. The use of different orographic data sets also contributes to the relatively large across-model scatter in the 2 m temperature over islands and near the coasts. The across-model scatter over land is significant, and mostly ~2 times larger than the interannual variability.
The across-model scatter in the total cloud cover ranges from 5 to 30%. To evaluate this relatively large scatter, it should be noted here that the models do not agree on a common definition of cloud cover. The different prognostic or diagnostic cloud schemes calculate the cloud fraction using different equations and/or different cloud overlap assumptions. Additionally, the cloud parameterizations differ with regard to the inclusion of cloud liquid water, the specification of the effective cloud droplet/ice particle radius, and the parameterization of the cloud optical properties (both shortwave and longwave). Compared with the interannual variability, the scatter is 2–5 times larger and thus significant over the whole domain.
The LWD across-model scatter within the inner domain is between 5 W/m2 and 35 W/m2. In summer, the scatter is smaller over the land (5–15 W/m2) and larger over the ice (15–20 W/m2), whereas in winter, the scatter over the entire domain is 10–35 W/m2. The scatter in SWD is up to 10 W/m2 in winter, and between 20 and 55 W/m2 in summer. A comparison of different radiation data sets (CASPR, ISCCP, NCEP, ERA40, SHEBA), averaged over different summertime week-long SHEBA sub-periods, shows for the SHEBA site a difference between the minimal and maximal estimate of ~150 W/m2 and ~50 W/m2 for SWD and LWD, respectively (J. Liu et al., submitted). The across-model radiation scatter is therefore within the uncertainty reflected by the satellite observations and analysis products. However, it is significantly larger than the interannual variability.
Summarizing, in general, the across-model scatter is twice as large as the ensemble bias. The exceptions are both the 2 m temperature over ocean/sea ice and the geopotential over selected parts of the region (which depends on the season). Related to the observed interannual variability, the across-model scatter is significant for 2 m land temperature, cloud cover, LWD, and SWD. In contrast, for the large-scale flow, the RCMs closely agree with each other.
3.2 Temperature and humidity profiles
In this section, we evaluate the performance of the simulated vertical profiles of temperature and humidity relative to the ECMWF operational analyses. This analysis includes only seven models (PolarMM5 is not included here because the necessary data were not submitted). The profiles are presented as a domain-averaged quantity. For the temperature, the profiles are presented as an average over both the whole domain and only land grid points. The humidity profiles are shown as the total average.
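A domain average restricted to a subset of grid points, such as land only, can be sketched with a boolean mask. This is a minimal illustration with hypothetical names and values; it assumes equal weights for all grid cells, which the text does not state explicitly.

```python
def domain_average(field, mask=None):
    """Average a 2-D field over all grid points, or only where mask is True."""
    total, count = 0.0, 0
    for j, row in enumerate(field):
        for i, value in enumerate(row):
            if mask is None or mask[j][i]:
                total += value
                count += 1
    if count == 0:
        raise ValueError("no grid points selected")
    return total / count


# Hypothetical temperatures (K) at one level on a 2 x 2 grid; True marks land.
temperature = [[260.0, 262.0], [250.0, 254.0]]
land_mask = [[False, False], [True, True]]
whole_domain = domain_average(temperature)
land_only = domain_average(temperature, land_mask)
```

Applying the function level by level would yield one profile for the whole domain and one for the land points.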
3.2.1 Winter and summer domain-averaged profiles
Figure 4 presents the winter and summer profiles of the individual models, the model ensemble mean, and the ECMWF analyses. The upper panel shows the temperature profiles averaged over the whole domain, while the second panel shows them averaged only over all land points. First, we recognize considerable scatter between the individual model profiles. In the lower troposphere, the difference between the coldest and warmest model is on the order of 4–5 K. This difference decreases with height and in the free troposphere at 500 hPa is on the order of 0.5–1 K. The same height-dependent scatter is found for the specific humidity profiles (lowest panel). The difference between the driest and wettest model is 0.2–0.3 g/kg in the lower troposphere, while at 500 hPa the difference is only about 0.05 g/kg (not taking into account the one outlier, which is COAMPS). The model ensemble mean temperature profile agrees well with the ECMWF profile, particularly in winter, while in summer the model ensemble mean is slightly colder relative to ECMWF. However, this bias arises from two severe cold individual model biases (ARCSyM, COAMPS). In contrast, the ensemble mean humidity profile reproduces the ECMWF profile neither in winter nor in summer. During winter, the ensemble mean is drier (up to ~0.1 g/kg) in the lower levels, but wetter at higher levels. During summer, the ensemble mean is again drier (up to ~0.5 g/kg) in the lower levels, but agrees at higher levels.
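The spread quoted above (warmest minus coldest, or wettest minus driest model at each level) can be sketched as below, with an option to leave out a named outlier. The function name, the model labels, and the humidity values are hypothetical and only chosen to resemble the magnitudes mentioned in the text.

```python
def model_spread(profiles, exclude=()):
    """Per-level difference between the largest and smallest model value.

    profiles maps a model name to its profile (one value per level);
    models listed in exclude are left out of the comparison.
    """
    kept = [p for name, p in profiles.items() if name not in exclude]
    if len(kept) < 2:
        raise ValueError("need at least two models to compute a spread")
    return [max(level) - min(level) for level in zip(*kept)]


# Hypothetical specific humidity (g/kg) at two levels: lower troposphere, 500 hPa.
humidity = {
    "model_a": [1.0, 0.30],
    "model_b": [1.2, 0.32],
    "model_c": [1.3, 0.35],
    "outlier": [1.1, 0.60],
}
spread_all = model_spread(humidity)
spread_without_outlier = model_spread(humidity, exclude=("outlier",))
```

In this toy case, leaving out the outlier reduces the 500 hPa spread from about 0.3 g/kg to about 0.05 g/kg, while the lower-level spread is unchanged.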
The modeled temperature profiles fall into two groups. ARCSyM, COAMPS, and CRCM show larger biases, while the other models have smaller biases and also share a common profile bias structure. The other models are generally slightly warmer than ECMWF, both in winter and summer, with the bias decreasing rapidly with increasing height. ARCSyM and COAMPS tend to be too cold relative to ECMWF (up to 3 K in lower levels), while CRCM shows a warm bias (up to 2 K), but again these biases rapidly decrease with height. All models, except CRCM, agree in a common humidity bias structure compared to ECMWF for both winter and summer. This bias structure is characterized by relatively dry conditions in the near-surface layers and by relatively wet conditions in the free troposphere. CRCM is too wet in both seasons compared with ECMWF (up to 1.5 g/kg in summer at lower levels). To make the conclusions less sensitive to extreme values, the ensemble median has also been calculated (not shown). The ensemble median temperature profiles do not change significantly, while the ensemble median humidity profiles move to even drier values below the 900 hPa level, compared to the ensemble mean profiles.
For a deeper evaluation of the temperature bias profiles, the biases have been calculated separately for land, open ocean, and sea ice grid points. Since the experimental set-up prescribed the ocean- and sea ice- surface temperatures, the across-model differences are small over the ice-free ocean as well as over sea ice (although a few models did not exactly follow the lower boundary forcing conditions for the experiment), but are large over land. The comparison of the two upper panels in Fig. 4 demonstrates clearly that the land part delivers the biggest contribution to the total temperature biases. The different downwelling radiation as well as the different land-surface parameterizations within the different models are likely to be responsible for the near surface temperature biases in both seasons.
3.2.2 Annual cycle of the domain-averaged profiles
The temporal development of the height profiles over the annual cycle (the 13 month SHEBA period Sept. 97-Sept. 98) has been evaluated. Figure 5 shows the height-time cross sections for the model ensemble mean temperature and humidity, for the ensemble performance biases and for the intermodel scatter.
The domain-averaged ensemble mean biases are very small and in the range between −1.4 K and +0.4 K for temperature, and between −0.4 g/kg and 0.1 g/kg for humidity. During autumn and spring, the model ensemble is colder and drier in the lowest levels relative to ECMWF, but warmer and wetter in the free troposphere. Looking at the individual model biases (not shown here), the most striking feature is the common humidity bias structure among all models (except CRCM): the models are drier than ECMWF in lower levels, with maxima in autumn and summer. In the free troposphere, the models are slightly wetter than ECMWF for the whole year. All models (except ARCSyM and CRCM) agree on maximal wet conditions in spring near 750–900 hPa compared to ECMWF. HIRHAM and ARCSyM tend to be outliers in summer and autumn by being drier over the entire column. COAMPS shows a pronounced wet bias from the end of spring to summer in the free troposphere. Concerning the temperature, the models agree (again except ARCSyM and CRCM) by being colder than ECMWF in the lower levels in the transition periods (autumn, spring) and by being warmer elsewhere. ARCSyM is an outlier and shows a pronounced cold bias (up to 2 K) over the entire period. In contrast to all other models, CRCM shows completely different bias patterns: it is too warm and too wet throughout the year in the near-surface levels, and tends to be too cold from January to June in the whole atmospheric column. It shares the dry summer bias in the lower troposphere with ARCSyM.
Due to the common patterns described above in most of the models, the across-model scatter is generally small, but largest at low levels, being 1–2.5 K and 0.1–0.4 g/kg for temperature and humidity, respectively. In the free troposphere, the scatter is very small (below 0.5 K and 0.1 g/kg). There is a pronounced seasonal dependency of the across-model scatter. The temperature scatter is maximal in winter and summer. Also, in summer, a larger humidity scatter has been calculated in the whole column, but this is mainly due to the larger ARCSyM and COAMPS biases.
Summarizing the temperature and humidity profile results, we arrive at two main conclusions. First, the performance biases of the individual models (and therefore also of the ensemble mean) as well as the across-model scatter are largest near the surface and within the lowest 1000 m. This highlights the influence of the different physical parameterizations on the simulated dynamical patterns for this part of the atmosphere. Second, the models share a common seasonal profile bias structure and have very similar PBL moisture and temperature biases compared to ECMWF (although the domain-averaged biases are small; for temperature within ±1.5 K, for humidity within 0.5 g/kg). One issue which has not been fully addressed in this work is whether this says more about the ECMWF model (PBL scheme, sea ice parameterization) than about the ARCMIP models’ deficiencies. However, as Beeseley et al. (2000) showed, the assimilation of temperature, wind, humidity, and height kept the overall thermodynamic state of ECMWF fairly close to SHEBA observations. Although the crude sea ice handling in the ECMWF model (snow-free isothermal slab) causes a strong underestimation of 2 m temperature fluctuations, the ECMWF monthly mean temperatures are in reasonable agreement with SHEBA tower and rawinsonde measurements.
4 Summary and conclusions
The model ensemble mean of the eight regional climate models reproduces the ECMWF analyses. Although some differences are obvious, the ensemble mean biases are of the same order as the differences between different observations or NWP analysis products. The ensemble mean biases are between −25 m and +5 m for the geopotential height, mainly within ±5 K for the 2 m temperature, within ±15% for cloud cover, and within ±20 W/m2 and between −50 W/m2 and +10 W/m2 for LWD and SWD, respectively. However, often only a few models show severe biases contributing substantially to the bias in the ensemble mean, and some single models perform better than the ensemble mean or median. In general, for the 2 m temperature, cloud cover, LWD, and SWD, the ensemble means outperform the individual models.
Even using the constrained experimental design (small integration domain, specified lower boundary condition for ocean and sea ice) and specified “perfect” horizontal boundary conditions from data analyses, there is a considerable scatter among the different RCMs. The largest and significant (compared with the natural variability) across-model scatter is found in the 2 m temperature over land (5 K), in the surface radiation fluxes (up to 55 W/m2), and in the cloud cover (5–30%). This is not surprising given the very complex and individually different land-surface and radiation-cloud schemes within the models. The scatter between the different models highlights the magnitude and seasonal dependency of the disagreement and unreliability for current Arctic RCM climate simulations. First, this is connected with a large uncertainty cascade in the column feedbacks and the regional circulation patterns. Second, this across-model scatter adds uncertainty for climate change and climate sensitivity estimates to the already pronounced magnitude of the internal variability of each individual Arctic RCM (Rinke et al. 2004; Wu et al. 2005). All this is to be discussed in relation to Arctic downscaled projections for a CO2-changed climate for which a similar across-model scatter can be expected and emphasizes the need for a multi-model ensemble approach for more reliable estimates.
The discussion of the model intercomparison showed that the performance biases as well as the across-model scatter are largest in the lowest height levels and near the surface. There, the individual physical model parameterizations come into play from which the land-surface and boundary layer parameterizations, radiative transfer, and treatment of clouds seem to be of primary importance. The available observations from a broad range of observational projects from the data-rich case period of 1997–1998 present an unprecedented opportunity to improve the treatments of these processes. Such an approach, based on individual models, started recently within the community by the validation of selected models with the SHEBA point measurements: Køltzow et al. (2003) showed that their model biases in the incoming and upward shortwave radiative fluxes at the surface in summer can be improved by using a more sophisticated surface albedo scheme. Wyser and Jones (2005) showed that a more sophisticated cloud scheme significantly improves the annual cycle of cloud cover in their model. Additionally, an intercomparison of off-line model components (radiative codes including cloud parameterization and surface layer codes) has been started within ARCMIP and will hopefully help to unravel the causes of the model differences.
We are grateful to Randy Reeder who re-processed the original AVHRR data with fitting and filtering procedures to make them adequate for our ARCMIP purpose. We thank Jiping Liu for the discussion about the differences between different observational radiation data sets. We thank Elizabeth Cassano for work connected with the driving data. ERA40 data used in this study have been provided by ECMWF, and Paul Berrisford provided the ERA40 interannual standard deviation data. The atmospheric forcing data for the simulations are also from ECMWF, and are being served by the Joint Office for Science Support (JOSS) at UCAR. The SSM/I and AVHRR data are from the National Snow and Ice Data Center, Boulder. AR, KD, CJ, KW, JHC, JEH, and MOK were supported by the European Union project GLIMPSE (EVK2-CT-2002–00164). MT and MZ are supported by the Swedish Research Council and by SWECLIM, and also acknowledge the support from the Naval Research Laboratory, Monterey, USA. Support for JJC, AHL, and MJS was provided by NSF grants OPP-0084291 and OPP-0229649. Data management for ARCMIP and JAC were supported from a grant by NSF/IARC. We are grateful to Ines Hebestadt and Sabine Erxleben for preparing the graphics.