Climate Dynamics, Volume 41, Issue 9, pp 2555–2575

The simulation of European heat waves from an ensemble of regional climate models within the EURO-CORDEX project

Authors

  • Robert Vautard
    • Laboratoire des Sciences du Climat et de l’Environnement, IPSL, CEA/CNRS/UVSQ
  • Andreas Gobiet
    • Wegener Center for Climate and Global Change and Institute for Geophysics, Astrophysics, and Meteorology, University of Graz
  • Daniela Jacob
    • Climate Service Center
  • Michal Belda
    • Department of Meteorology and Environment Protection, Charles University
  • Augustin Colette
    • Institut National de l’Environnement industriel et des risques (INERIS)
  • Michel Déqué
    • Météo-France/CNRM, CNRS/GAME
  • Jesús Fernández
    • Department Applied Mathematics and Computer Science, Universidad de Cantabria
  • Markel García-Díez
    • Instituto de Física de Cantabria, CSIC-UC
  • Klaus Goergen
    • Centre de Recherche Public - Gabriel Lippmann
  • Ivan Güttler
    • Croatian Meteorological and Hydrological Service (DHMZ)
  • Tomáš Halenka
    • Department of Meteorology and Environment Protection, Charles University
  • Theodore Karacostas
    • Department of Meteorology and Climatology, School of Geology, Aristotle University of Thessaloniki
  • Eleni Katragkou
    • Department of Meteorology and Climatology, School of Geology, Aristotle University of Thessaloniki
  • Klaus Keuler
    • Chair of Environmental Meteorology, Brandenburg University of Technology (BTU) Cottbus
  • Sven Kotlarski
    • Institute for Atmospheric and Climate Science, ETH Zurich
  • Stephanie Mayer
    • Uni Research, Bjerknes Center for Climate Research
  • Erik van Meijgaard
    • Royal Netherlands Meteorological Institute (KNMI)
  • Grigory Nikulin
    • Swedish Meteorological and Hydrological Institute
  • Mirta Patarčić
    • Croatian Meteorological and Hydrological Service (DHMZ)
  • John Scinocca
    • Environment Canada, Canadian Centre for Climate Modelling and Analysis
  • Stefan Sobolowski
    • Uni Research, Bjerknes Center for Climate Research
  • Martin Suklitsch
    • Wegener Center for Climate and Global Change and Institute for Geophysics, Astrophysics, and Meteorology, University of Graz
  • Claas Teichmann
    • Max-Planck-Institut für Meteorologie
  • Kirsten Warrach-Sagi
    • Institute of Physics and Meteorology, University of Hohenheim
  • Volker Wulfmeyer
    • Institute of Physics and Meteorology, University of Hohenheim
  • Pascal Yiou
    • Laboratoire des Sciences du Climat et de l’Environnement, IPSL, CEA/CNRS/UVSQ
Article

DOI: 10.1007/s00382-013-1714-z

Cite this article as:
Vautard, R., Gobiet, A., Jacob, D. et al. Clim Dyn (2013) 41: 2555. doi:10.1007/s00382-013-1714-z

Abstract

The ability of a large ensemble of regional climate models to accurately simulate heat waves at the regional scale of Europe was evaluated. Within the EURO-CORDEX project, several state-of-the-art models, including non-hydrostatic meso-scale models, were run for an extended time period (20 years) at high resolution (12 km), over a large domain allowing for the first time the simultaneous representation of atmospheric phenomena over a large range of spatial scales. Eight models were run in this configuration, and thirteen models were run at a classical resolution of 50 km. The models were driven with the same boundary conditions, the ERA-Interim re-analysis, and except for one simulation, no observations were assimilated in the inner domain. Results, which are compared with daily temperature and precipitation observations (ECA&D and E-OBS data sets), show that, even when forced by the same re-analysis, the ensemble exhibits a large spread. A preliminary analysis of the sources of spread, using in particular simulations of the same model with different parameterizations, shows that the simulation of hot temperature is primarily sensitive to the convection and the microphysics schemes, which affect incoming energy and the Bowen ratio. Further, most models exhibit an overestimation of summertime temperature extremes in Mediterranean regions and an underestimation over Scandinavia. Even after bias removal, the simulated heat wave events were found to be too persistent, but a higher resolution reduced this deficiency. The amplitude of events as well as the variability beyond the 90th percentile threshold were found to be too strong in almost all simulations and increasing resolution did not generally improve this deficiency. Resolution increase was also shown to induce large-scale 90th percentile warming or cooling for some models, with beneficial or detrimental effects on the overall biases. 
Even though full causality cannot be established on the basis of this evaluation work, the drivers of such regional differences were shown to be linked to changes in precipitation due to resolution changes, affecting the energy partitioning. Finally, the inter-annual sequence of hot summers over central/southern Europe was found to be fairly well simulated in most experiments despite an overestimation of the number of hot days and of the variability. The accurate simulation of inter-annual variability for a few models is independent of the model bias. This indicates that internal variability of high summer temperatures should not play a major role in controlling inter-annual variability. Despite some improvements, especially along coastlines, the analyses conducted here did not allow us to generally conclude that a higher resolution is clearly beneficial for a correct representation of heat waves by regional climate models. Even though local-scale feedbacks should be better represented at high resolution, combinations of parameterizations have to be improved or adapted accordingly.

Keywords

Regional climate modeling, Heat waves, Model evaluation, Climate projection, EURO-CORDEX

1 Introduction

Heat waves are among the most spectacular meteorological events that regularly punctuate the European summer climate. A heat wave is defined as a period of consecutive days with hot temperatures. Major events, often called “mega heat waves”, occurred several times during the last decade (Schär et al. 2004; Founda and Giannakopoulos 2009; Barriopedro et al. 2011), with important consequences on society. The 2003 event in Central Europe triggered a considerable interest in the climate research community, in particular because it was unprecedented and such events are expected to increase both in amplitude and frequency in future climate scenarios (Meehl and Tebaldi 2004; Fischer and Schär 2010; Seneviratne et al. 2012).

Much progress has been recently achieved in understanding processes involved in the development of European heat waves, including: key interactions between land surface and atmosphere (Fischer et al. 2007; Seneviratne et al. 2010) with possible non-local effects (Zampieri et al. 2009) and the influence of large-scale atmospheric circulation (Cassou et al. 2005). The development of heat waves results from the interaction of large-scale and small-scale processes. First of all, for a heat wave to develop, the synoptic environment needs to inhibit strong precipitation events. A typical situation in mid-Europe is the so-called “Omega weather situation” (Degirmendzic and Wibig 2007), which occurred in the summer 2003 heat wave. The persistence of this situation may result from the location of the high pressure area over the center of the region where land-surface-vegetation-atmosphere feedback processes are amplified. Dry soils tend to favor anticyclonic weather (see e.g. Haarsma et al. 2009), and temperatures are more sensitive to persistent blocking under dry than wet conditions (Quesada et al. 2012). These positive feedbacks between anticyclonic weather and soil dryness complement the classical feedback between soil moisture, evapo-transpiration and temperature (Seneviratne et al. 2010). Other feedbacks are possible through clouds and precipitation. The interaction of these large-scale and small-scale processes is not fully understood but they need to be reproduced by climate models in order to provide accurate heat wave assessments for future periods.

To simulate heat wave characteristics on regional scales, dynamical downscaling can be applied in which coarse resolution global climate models (GCMs) are used to drive regional climate models (RCMs) at a higher resolution. This requires that (1) GCMs provide boundary conditions which accurately represent the large-scale synoptic flow, (2) RCMs propagate the atmospheric structures in the inner model domain with high accuracy, and (3) the state of soil moisture and vegetation conditions within the RCM are realistic in order to accurately simulate regional scale feedbacks involving the land surface. This latter implies a sufficient accuracy in the surface energy budget of the coupled land-surface-atmosphere system, associated with a detailed representation of soil textures, soil water transport, root water uptake by the plants, plant transpiration, plant albedo and plant phenology. In practice, due to this complexity and the lack of long-term observations to calibrate parameterizations, such a high accuracy is difficult to reach and a large spread exists among models to simulate the energy budget (Lenderink et al. 2007; De Noblet-Ducoudré et al. 2012; Stegehuis et al. 2012). This complexity is also reflected by a high level of uncertainty in simulating European summer climate and its intra-seasonal and inter-annual variability evolution in climate scenarios (see e.g. Vidale et al. 2007).

The CORDEX RCM ensemble (Giorgi et al. 2009) provides an important opportunity to explore this uncertainty and to identify model deficiencies, which may lead to improvements of the parameterization chain. CORDEX has been designed over several regions of the world (see e.g. the NARCCAP experiment overview, Mearns et al. 2012). Over Europe, several regional climate model inter-comparison studies took place in the past decade within the PRUDENCE (Christensen and Christensen 2007; Jacob et al. 2007) and ENSEMBLES (van der Linden and Mitchell 2009) frameworks, and biases in both temperature averages and variability were discussed. However, the specific ability of regional models to simulate heat wave phenomena through a large inter-comparison was only partly discussed in these studies, and the spatial model resolution required to properly simulate these phenomena was not addressed. In particular, the question of whether high-resolution simulations improve the representation of heat wave characteristics relative to low-resolution simulations was not addressed. Such an improvement can be expected at small scales from a better representation of topography, coastlines, land use and induced flows. A higher resolution can also change the distributions of precipitation and clouds, with large-scale effects amplified by land–atmosphere couplings. However, a higher resolution is not expected to much improve the large-scale persistent atmospheric circulations that are typical of heat waves.

The design of the EURO-CORDEX experiment (Gobiet and Jacob 2012), which aims at downscaling CMIP5 simulations over Europe (http://www.euro-cordex.net/), allows for such an evaluation in a controlled framework. EURO-CORDEX includes a suite of experiments. One of them is designed to evaluate regional models in which the lateral boundary forcing for the RCMs is provided by the ERA-Interim (Uppala et al. 2008; Dee et al. 2011) weather re-analysis over the past two decades (1989–2008). The first goal of this paper is therefore to provide skill diagnostics regarding the simulation of European heat waves from the EURO-CORDEX experiments. Various metrics are used to compare simulations and observations. However, it does not aim at an in-depth analysis of mechanisms involved, which is left for a future study.

Eight regional climate models (ARPEGE, CanRCM4, CCLM, RACMO2, RCA, REGCM, REMO, WRF) with different configurations and physical parameterizations participated in this evaluation, including two meso-scale non-hydrostatic models (CCLM, WRF). As compared with previous inter-comparisons and ensemble climate downscaling studies (PRUDENCE, Jacob et al. 2007; ENSEMBLES, Christensen and Christensen 2007, RCA downscaling inter-comparison, Nikulin et al. 2011, CORDEX-Africa inter-comparison, Nikulin et al. 2012), EURO-CORDEX has the aim of using climate models over an unprecedented range of spatial scales over Europe: on the one hand, models are required to use a large domain encompassing the whole of Europe, Northern Africa and a large part of the North-Eastern Atlantic Ocean. This enables coverage of the development of synoptic-scale weather systems and larger-scale weather regimes. On the other hand, modeling groups are invited to provide simulations at a high resolution of 0.11° (about 12 km), and a low resolution of 0.44° (about 50 km). Both resolutions should in principle enable the simulation of the interaction between large-scale dynamics and the local scale. Six out of the eight participating models were performing their simulations at both horizontal resolutions using a fairly similar or identical setup.
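The correspondence between the two grid resolutions in degrees and their approximate spacing in kilometers can be checked with a short calculation. The sketch below is only illustrative: it assumes a spherical Earth with a mean radius of 6371 km and a spacing measured along a great circle (as near the equator of a rotated-pole grid); the function name `grid_spacing_km` is hypothetical and is not taken from the EURO-CORDEX specifications.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def grid_spacing_km(resolution_deg):
    """Great-circle length (km) of an arc spanning `resolution_deg` degrees."""
    return math.radians(resolution_deg) * EARTH_RADIUS_KM


for res in (0.11, 0.44):
    print(f"{res:.2f} deg is roughly {grid_spacing_km(res):.1f} km")

# Refining the grid by a factor of 4 in each horizontal direction
# multiplies the number of grid columns by about 4 * 4 = 16.
cells_ratio = round(0.44 / 0.11) ** 2
```

Under these assumptions the spacing comes out near 12 km and 49 km, consistent with the "about 12 km" and "about 50 km" quoted above.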

By using the two different horizontal resolutions it is further possible to investigate and understand the benefits of using a high spatial resolution for regional climate studies. Therefore, the second focus of this study lies on examining the added value of high resolution.

As a third aspect, the question of the timing of heat waves in Europe during the simulated period is also addressed. Besides the statistical measures of the heat wave climate, the correct representation of the inter-annual sequence of hot and cool summers is of importance: are regional climate models able to reproduce these phenomena without taking into account any observation in the simulation domain? This also raises the issue of internal variability, which links to the matter of uncertainty.

Section 2 contains a description of the observations, the models used and their simulation set-up. Section 3 presents the results of the statistical skill evaluation, and Sect. 4 contains a summary and a short discussion.

2 Simulations set-up, models and observations

The simulations cover the period 1989–2008 as initially imposed in the EURO-CORDEX project. All models used the 6-hourly ERA-Interim meteorological re-analysis as forcing data at the boundaries, with different nesting methods depending on the model. For ARPEGE, which is a global model, temperature, wind speed and specific humidity were nudged towards ERA-Interim outside the EURO-CORDEX domain (see technical description on http://wcrp.ipsl.jussieu.fr/cordex/domains.html). A strong nudging coefficient was used (1/10 to 10 time steps) in the transition part (known as the Davies region or the sponge region in limited area modeling), 1/10 time step in the outer part, and no nudging was used in the inner part. All other models are limited-area models and are forced at the boundaries using 3-dimensional re-analyses of wind, humidity, temperature or potential temperature and geopotential height. For CanRCM4, all physical prognostic variables were additionally specified on the boundaries, being made available from a previous assimilation run of its parent global model CanAM4 (von Salzen et al. 2013). For CCLM, cloud ice and liquid water, and for REMO only liquid water, are additionally prescribed at the domain boundaries. At the lower boundary, sea surface temperatures (SSTs) from the driving re-analysis were used to prescribe the surface temperature variation over the oceans. CanRCM4 used spectral nudging within the domain with a relaxation time of 24 h applied to spatial scales larger than 1,000 km, but other models did not use nudging inside the domain, allowing the model physics to freely operate without artificial relaxation terms. As in the other RCMs except CanRCM4, the ARPEGE equations are free inside the evaluation domain; however, because its driving domain is the rest of the globe instead of a narrow strip, ARPEGE assimilates more information from ERA-Interim than the other RCMs. 
Simulations were generally carried out over a larger grid than the official EURO-CORDEX grid to account for boundary imbalance effects, and a “relaxation region” of various widths (a few hundred kilometers in general) was used.

In total, 21 simulations were produced from eight models (ARPEGE V5.1, Déqué 2010; CanRCM4, Zadra et al. 2008, von Salzen et al. 2013; CCLM, Rockel et al. 2008, http://www.cosmo-model.org; RACMO2 V2.2, van Meijgaard et al. 2012; RCA, Samuelsson et al. 2011; REGCM, Giorgi et al. 2012; REMO, Jacob et al. 2012; WRF V3.3.1, Skamarock et al. 2008; Menut et al. 2012). Eight simulations were carried out with high resolution (0.11° or about 12 km), and thirteen with a low resolution (0.44° or about 50 km) on the CORDEX specified grid. Among these two sets, six simulations were made with the same model and similar setups (ARPEGE, CCLM, RACMO2, RCA, WRF-CRPGL and WRF-IPSL-INERIS) at high and low resolutions. In the case of REMO at high resolution, the model version used at low resolution was extended to include a rain advection parameterization. This change can be seen as adding a new process relevant on the small scale. In the following the simulations will be referred to by the name of the model followed by the name of the modeling group, team or community, and the resolution (11 for 0.11° or 44 for 0.44°). The names of the high resolution simulations are thus: ARPEGE-CNRM11, CCLM-CLMCOM11, RACMO2-KNMI11, RCA-SMHI11, REMO-CSC11, WRF-CRPGL11, WRF-IPSL-INERIS11 and WRF-UHOH11. The thirteen low-resolution simulations are: ARPEGE-CNRM44, CANRCM-CCCMA44, CCLM-CLMCOM44, RACMO2-KNMI44, RCA-SMHI44, REGCM-CUNI44, REGCM-DHMZ44, REMO-CSC44, WRF-AUTH44, WRF-CRPGL44, WRF-IPSL-INERIS44, WRF-UCAN44 and WRF-BCCR44. The various schemes and options used for the simulations are summarized in Table 1.
Table 1

Main characteristics and parameterizations of all models used

| Model/Resolution | Institute | Number of levels | Radiation scheme | Convection scheme | Microphysics scheme | Land-surface scheme | Boundary layer scheme | Soil initialization and spin-up | Land use |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ARPEGE-CNRM11, ARPEGE-CNRM44 | Météo-France | 31 | Morcrette (1990) | Bougeault (1985) | Ricard and Royer (1993) | Douville et al. (2000) | Ricard and Royer (1993) | Year 1989 is played twice | |
| CANRCM4-CCCMA44 | CCCma | 32 | Li and Barker (2005), von Salzen and McFarlane (2002) | Zhang and McFarlane (1995), von Salzen and McFarlane (2002) | von Salzen and McFarlane (2002) | CLASS, Verseghy (2000) | Abdella and McFarlane (1997), von Salzen and McFarlane (2002) | Simulation started Jan 1988 with climatology soil moisture and temperature | GLC2000 |
| CCLM-CLMCOM11, CCLM-CLMCOM44 | CLM-Community, http://www.clm-community.eu | 40 | Ritter and Geleyn (1992) | Tiedtke (1989) | Doms et al. (2007), Baldauf and Schulz (2004) | TERRA-ML, Doms (2007) | Louis (1979) | No spin up, initialization with climatological soil moisture; 10-year spin up (start in 1979) | GLC2000 (Joint Research Centre 2003) |
| RACMO2-KNMI11, RACMO2-KNMI44 | KNMI | 40 | Fouquart and Bonnel (1980), Mlawer et al. (1997) | Tiedtke (1989), Nordeng (1994), Neggers et al. (2009) | Tiedtke (1993), Tompkins et al. (2007), ECMWF-IFS, Neggers (2009) | Van den Hurk et al. (2000), Balsamo et al. (2009) | Lenderink and Holtslag (2004), Siebesma et al. (2007) | Initialized from ERA-Interim on 1979010100 | ECOCLIMAP (1 km) (Champeaux et al. 2003; Masson et al. 2003) |
| RCA-SMHI11, RCA-SMHI44 | SMHI | 40 | Savijärvi (1990), Sass et al. (1994) | Kain and Fritsch (1990, 1993) | Rasch and Kristjánsson (1998) | Samuelsson et al. (2006) | Cuxart et al. (2000) | Initialized from ERA-Interim on 1979010100 | ECOCLIMAP (see above refs.) |
| REGCM-CUNI | Charles University | 23 | Kiehl et al. (1996) | Grell (1993) | Pal et al. (2000) | Dickinson et al. (1993) | Holtslag et al. (1990), with realistic emissivity | Giorgi and Bates (1989) | USGS |
| REGCM-DHMZ | DHMZ | 23 | Kiehl et al. (1996) | Emanuel (1991) | Pal et al. (2000) | Dickinson et al. (1993) | Holtslag et al. (1990) | Giorgi and Bates (1989) | USGS |
| REMO-CSC11, REMO-CSC44 | Climate Service Center | 27 | Morcrette et al. (1986), Giorgetta and Wild (1995) | Tiedtke (1989), Nordeng (1994), Pfeifer (2006) | Lohmann and Roeckner (1996) | Hagemann (2002), Rechid et al. (2009) | Louis (1979) | Soil initialized from ERA-Interim. No spin up | USGS (Hagemann 2002) |
| WRF-AUTH44 | Aristotle Univ. of Thessaloniki | 30 | CAM 3.0, Collins et al. (2004) | Modified Kain-Fritsch (Kain 2004) | WSM 6-class, Hong and Lim (2006) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Soil initialized from ERA-Interim. No spin up | IGBP-MODIS 30 s + 10 m |
| WRF-BCCR44 | Bjerknes Center | 30 | CAM 3.0, Collins et al. (2004) | Modified Kain-Fritsch (Kain 2004) | WSM 3-class, Hong et al. (2004) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Soil initialized from ERA-Interim. No spin up | IGBP-MODIS (30″) |
| WRF-CRPGL11, WRF-CRPGL44 | CRP-GL | 50 | CAM 3.0, Collins et al. (2004) | Modified Kain-Fritsch (Kain 2004) | WSM 6-class, Hong and Lim (2006) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Soil initialized from ERA-Interim. No spin up | IGBP-MODIS (30″) |
| WRF-IPSL-INERIS11, WRF-IPSL-INERIS44 | IPSL and INERIS | 32 | RRTMG, Iacono et al. (2008) | Grell and Devenyi (2002) | Hong et al. (2004) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Soil initialized from ERA-Interim. No spin up | USGS Land Use |
| WRF-UCAN44 | Univ. de Cantabria | 30 | CAM 3.0, Collins et al. (2004) | Grell and Devenyi (2002) | WSM 6-class, Hong and Lim (2006) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Soil initialized from ERA-Interim. No spin up | IGBP-MODIS (30″) |
| WRF-UHOH | Univ. of Hohenheim | 50 | CAM 3.0, Collins et al. (2004) | Modified Kain-Fritsch (Kain 2004) | Morrison 2-moment scheme, Morrison et al. (2009) | NOAH, Ek et al. (2003) | YSU (Hong et al. 2006) | Simulation started Jan 1987 with ERA-Interim soil water content and temperature | IGBP-MODIS (30″) |

One of the novel aspects of the EURO-CORDEX climate simulations is the use of a high spatial resolution, and one of the objectives of this article is to detect possible changes in heat wave modeling skill at high resolution relative to low resolution. High resolution is expected to improve (1) the simulation of detailed, small-scale features of climate statistics such as coastal breezes or large valley flows, and (2) large-scale climate characteristics, because of improved interactions across a wider scale spectrum. The only way to cover the two improvement types is to compare simulations with observations or re-analyses having a higher resolution than the simulations themselves. However, the gridded observation-based data set available at the scale of Europe (E-OBS; Haylock et al. 2008) currently has a resolution of only 0.22°.

This led us to use, instead, station observations as provided by the European Climate Assessment & Data (ECA&D) collection of station data (Klein Tank et al. 2002). We used here daily mean temperatures (493 stations) and daily precipitation data (1,072 stations) from this archive. In order to ensure homogeneity in coverage we selected stations that contain at least 95 % of the days along the two decades where models were run (1989–2008). Data flagged as suspicious (Klein Tank et al. 2002) were not taken into account.

Model data were interpolated from their native grid to the target ECA&D stations, using the nearest neighbor method. The nearest neighbor method has the advantage of conserving physical properties of the model column, preserving extreme values and temporal variability. Model temperatures were adjusted by assuming a +6 K/km gradient between the model orography and the observation station height.
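The station matching and height adjustment described above can be sketched as follows. This is a minimal illustration and not the code used for the study: the function names, the dictionary layout of stations and grid cells, the haversine distance and the Earth radius are all assumptions. The sign convention is also an assumption: temperature is taken to decrease with height by 6 K/km, so a model cell lying above the station is warmed when brought to the station height.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean Earth radius
GRADIENT_K_PER_KM = 6.0    # magnitude of the vertical gradient quoted in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance (km) between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_cell(station, cells):
    """Return the model grid cell closest to the station (nearest neighbor)."""
    return min(
        cells,
        key=lambda c: great_circle_km(station["lat"], station["lon"], c["lat"], c["lon"]),
    )


def adjust_to_station_height(t_model, z_model_m, z_station_m):
    """Shift a model temperature (K) from the model orography to the station height.

    Assumed convention: temperature decreases with height, so a model cell
    that lies above the station is warmed when moved down to the station.
    """
    return t_model + GRADIENT_K_PER_KM * (z_model_m - z_station_m) / 1000.0


# Hypothetical station and grid cells, for illustration only.
station = {"lat": 48.0, "lon": 2.0, "elev": 100.0}
cells = [
    {"lat": 48.05, "lon": 2.10, "elev": 400.0, "t": 290.0},
    {"lat": 49.00, "lon": 3.00, "elev": 50.0, "t": 293.0},
]
cell = nearest_cell(station, cells)
t_at_station = adjust_to_station_height(cell["t"], cell["elev"], station["elev"])
```

Because the value of a single grid column is taken unchanged apart from the height shift, no spatial averaging is introduced, which is the property the nearest neighbor method is chosen for here.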

The E-OBS data set (0.22° regular latitude-longitude resolution on a grid with rotated pole in order to have almost equal areas over Europe) is nevertheless used, only qualitatively, to provide a map view of large-scale model biases for the 90th percentile (our Fig. 1). An elevation correction is also applied here to simulations in order to account for the difference between E-OBS and model orography.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig1_HTML.gif
Fig. 1

Biases (difference simulation-E-OBS) of the 90th percentile of daily mean temperature for each simulation. In order not to modify the model resolution and grid for this comparison, we chose to re-project 0.22° resolution E-OBS data on each model’s native grid using a bilinear interpolation. Simulated temperatures are corrected for elevation differences with the E-OBS data set

3 Results

3.1 Hot temperature frequencies

We first evaluate the ability of the models to correctly simulate the frequency of hot days. A hot day occurs, in our definition, when the summertime (JJA) daily mean temperature exceeds the observed 90th percentile (as in previous studies, see e.g. Fischer and Schär 2010) in the period 1989–2008. Figure 1 displays the biases (in K) of the simulated 90th percentile relative to E-OBS at 0.22° resolution. Biases typically range in the [−3 K; +3 K] interval, with some absolute values exceeding 5 K. These biases are robust: we split all data sets (E-OBS and all simulations) into two 10-year periods, recalculated the 90th percentile for each data set over each period and found that the areas with biases of the same sign corresponded roughly to the regions with an absolute bias of about 1 K (not shown in Fig. 1 because these areas would cover almost all colored areas). The area of common bias sign in the two periods always covers at least 95 % of the grid points above 1.2 K absolute bias in the full 20-year period. For almost all models the coverage is more than 99 %.
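The hot-day definition and the split-period robustness check can be illustrated with a small sketch. It is not the authors' implementation: the percentile estimator (linear interpolation between order statistics), the function names and the data layout (a dictionary mapping each year to its list of JJA daily mean temperatures) are assumptions made for the example.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty sequence")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def hot_day_fraction(simulated_jja, observed_jja, q=90.0):
    """Fraction of simulated JJA days above the observed q-th percentile."""
    threshold = percentile(observed_jja, q)
    return sum(t > threshold for t in simulated_jja) / len(simulated_jja)


def bias_sign_is_stable(sim_by_year, obs_by_year, split_year, q=90.0):
    """True if the percentile bias has the same sign before and from split_year."""
    signs = []
    for keep in (lambda y: y < split_year, lambda y: y >= split_year):
        sim = [t for y, ts in sim_by_year.items() if keep(y) for t in ts]
        obs = [t for y, ts in obs_by_year.items() if keep(y) for t in ts]
        signs.append(percentile(sim, q) - percentile(obs, q) > 0)
    return signs[0] == signs[1]
```

By construction the observations themselves yield a hot-day fraction close to 10 %, so a simulated fraction well above that value indicates a warm bias in the upper part of the distribution at that location.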

There is a general tendency to overestimate the 90th percentile in Southern and Central Europe and to underestimate it over Scandinavia. This is reflected in Fig. 2a showing the number of simulations exceeding the observed 90th percentile at each station. Since these biases are absent in the driving ERA-Interim re-analysis (Fig. 2b), we conclude that they indicate a common deficiency of most models used in this study, and that it is unlikely that the biases stem from potential biases in sea surface temperatures. Table 2 shows several measures of agreement with observations. Beyond the biases discussed above, the spatial pattern of the 90th percentile is fairly well reproduced, with spatial correlations r > 0.9, and root mean square errors in the order of 1–3 K.
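The two agreement measures quoted above, the spatial correlation and the root mean square error of the 90th percentile across stations, can be computed as in the following sketch. The use of the Pearson correlation coefficient is an assumption (the text only states "spatial correlations"), and the function names and sample values are hypothetical.

```python
import math


def spatial_correlation(sim, obs):
    """Pearson correlation between simulated and observed station values."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


def rms_error(sim, obs):
    """Root mean square difference (same unit as the inputs, here K)."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


# Hypothetical 90th percentiles (deg C) at four stations: a uniform +1 K bias
# leaves the spatial pattern intact (r = 1) but gives an RMS error of 1 K.
p90_obs = [20.0, 22.0, 24.0, 26.0]
p90_sim = [21.0, 23.0, 25.0, 27.0]
r = spatial_correlation(p90_sim, p90_obs)
rmse = rms_error(p90_sim, p90_obs)
```

The example shows why the two measures are reported together: a high correlation only says that the spatial pattern is reproduced, while the RMS error also reflects the biases.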
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig2_HTML.gif
Fig. 2

a Number of simulations [0–21] for which the 90th percentile exceeds that of observations. The area where temperature and precipitation averages are analyzed for the study of inter-annual variability in Sect. 3.4 is indicated by the black frame; b ERA-Interim 2 m temperature 90th percentile bias relative to E-OBS data, as in Fig. 1. c Mean Ensemble 90th percentile bias (K) over the 6 high-resolution simulations. d Same as for c but for the mean over the 9 low-resolution simulations

Table 2

Statistics of comparisons between simulations and observations (ECA&D)

| Simulation | 90th percentile spatial correl. | 90th percentile RMS error (K) | \( \sigma \) spatial correl. | \( \sigma \) 5th percentile (K) | \( \sigma \) median (K) | \( \sigma \) 95th percentile (K) | \( \xi \) spatial correl. | \( \xi \) 5th percentile | \( \xi \) median | \( \xi \) 95th percentile | Goodness of fit % of stations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ECA&D | 1 | 0 | 1 | 1.44 | 2.37 | 3.50 | 1 | −0.63 | −0.36 | −0.12 | 86 |
| ARPEGE-CNRM11 | 0.91 | 2.7 | 0.50 | 1.30 | 2.35 | 3.68 | 0.39 | −0.67 | −0.36 | −0.10 | 84 |
| ARPEGE-CNRM44 | 0.94 | 1.8 | 0.48 | 0.81 | 2.39 | 3.48 | 0.42 | −0.67 | −0.35 | −0.08 | 81 |
| CANRCM4 | 0.96 | 1.5 | 0.52 | 0.94 | 2.42 | 3.53 | 0.32 | −0.60 | −0.37 | −0.02 | 83 |
| CCLM-CLMCOM11 | 0.97 | 1.8 | 0.17 | 1.72 | 2.58 | 3.62 | 0.15 | −0.64 | −0.33 | −0.11 | 84 |
| CCLM-CLMCOM44 | 0.95 | 2.1 | 0.26 | 1.62 | 2.54 | 3.69 | 0.19 | −0.58 | −0.27 | −0.09 | 72 |
| RACMO2-KNMI11 | 0.97 | 1.7 | 0.42 | 1.09 | 2.42 | 3.54 | 0.20 | −0.58 | −0.32 | −0.08 | 76 |
| RACMO2-KNMI44 | 0.97 | 1.4 | 0.47 | 1.03 | 2.26 | 3.39 | 0.12 | −0.57 | −0.29 | −0.10 | 77 |
| RCA-SMHI11 | 0.96 | 2.2 | 0.12 | 1.25 | 2.14 | 3.21 | 0.01 | −0.55 | −0.28 | −0.04 | 73 |
| RCA-SMHI44 | 0.95 | 2.0 | 0.31 | 1.29 | 2.28 | 3.85 | −0.11 | −0.54 | −0.26 | −0.03 | 69 |
| REGCM-CUNI44 | 0.92 | 2.1 | 0.51 | 1.17 | 2.51 | 4.56 | 0.23 | −0.61 | −0.32 | −0.02 | 79 |
| REGCM-DHMZ44 | 0.94 | 2.6 | 0.41 | 1.52 | 2.56 | 4.06 | −0.06 | −0.60 | −0.30 | −0.07 | 73 |
| REMO-CSC11 | 0.97 | 1.5 | 0.42 | 1.25 | 2.24 | 3.19 | 0.34 | −0.70 | −0.37 | −0.10 | 81 |
| REMO-CSC44 | 0.97 | 1.2 | 0.46 | 1.01 | 2.06 | 3.00 | 0.20 | −0.60 | −0.33 | −0.07 | 76 |
| WRF-AUTH44 | 0.91 | 2.5 | 0.45 | 1.55 | 2.69 | 4.22 | 0.05 | −0.49 | −0.25 | −0.01 | 73 |
| WRF-BCCR44 | 0.96 | 1.7 | 0.50 | 1.17 | 2.14 | 3.85 | 0.17 | −0.58 | −0.31 | −0.05 | 77 |
| WRF-CRPGL11 | 0.97 | 1.3 | 0.45 | 1.36 | 2.42 | 3.72 | 0.17 | −0.56 | −0.33 | −0.09 | 84 |
| WRF-CRPGL44 | 0.96 | 1.4 | 0.48 | 1.17 | 2.36 | 3.65 | 0.08 | −0.56 | −0.31 | −0.10 | 81 |
| WRF-IPSL-INERIS11 | 0.97 | 1.1 | 0.54 | 1.29 | 2.34 | 3.48 | 0.35 | −0.52 | −0.25 | −0.06 | 73 |
| WRF-IPSL-INERIS44 | 0.96 | 1.4 | 0.44 | 1.22 | 2.50 | 3.87 | 0.28 | −0.64 | −0.35 | −0.09 | 85 |
| WRF-UCAN44 | 0.96 | 2.2 | 0.49 | 1.26 | 2.80 | 4.27 | 0.22 | −0.61 | −0.35 | −0.09 | 84 |
| WRF-UHOH11 | 0.95 | 1.6 | 0.53 | 1.23 | 2.12 | 3.50 | 0.21 | −0.54 | −0.31 | −0.00 | 75 |

Statistics are taken over all stations within the domain [20°W–40°E; 30°N–70°N]. Columns #2 and #3: Spatial correlation and root mean square error of the 90th percentiles with observations. Columns #4 to #7: Spatial correlation and the 5th, 50th and 95th percentiles of the Pareto scale parameter. Columns #8 to #11: Spatial correlation and the 5th, 50th and 95th percentiles of the Pareto shape parameter. Column #12: Percentage of stations having a GPD goodness of fit measure (RMSE/Scale parameter) less than 0.1

Figure 1 shows the simulation biases for the 7 simulations that were done at both 0.11° and 0.44° resolutions, each with similar model configurations. Despite the better representation of regional-scale surface forcings (e.g. topography), it is not obvious from this figure whether increasing the resolution reduces the bias in the 90th percentile of temperature. In general the two resolutions share the same large-scale bias patterns (see the ensemble mean for each set of simulations, Fig. 2c, d). Only for CCLM (South-Eastern Europe) and RCA (Southern Europe) does the increase of resolution from 0.44° to 0.11° reduce the model bias over large parts of the domain. Biases are more pronounced at high resolution than at low resolution for other cases (e.g. for REMO, RCA and RACMO2 over Scandinavia, WRF-IPSL-INERIS over Russia). Actually for RCA, RACMO2 and CCLM the increase of resolution leads to a large-scale cooling of the whole simulation over vast continental areas, shifting areas of positive bias southward (for RCA and RACMO2) or reducing the large warm bias (for CCLM). By contrast, a large-scale warming over the whole domain is found for REMO. For WRF-IPSL-INERIS and ARPEGE, the resolution increase does not lead to large-scale warming or cooling.

To help quantify the bias changes when switching from low to high resolution, we plotted the absolute improvement, i.e. the difference between the low-resolution and the high-resolution distances of the simulated 90th percentile to the observed percentile at each station (Fig. 3). For each model there are areas showing improvements and others showing degradation with increased horizontal resolution, and it is difficult to draw a general conclusion. In most cases, the patterns of improvement extend over large regions. For instance, CCLM shows large improvements over Eastern Europe and Russia, and in RCA the whole of Southern Europe is improved. For ARPEGE and REMO large parts of Europe have a larger bias at high resolution than at low resolution. WRF and RACMO2 do not exhibit improvements over vast areas.
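The improvement measure can be written out explicitly. The sketch below follows the sign convention of the Fig. 3 caption (positive values mean that the 0.11° simulation is closer to the observations than the 0.44° one); the function name and the numerical values are hypothetical.

```python
def improvement(p90_high_res, p90_low_res, p90_obs):
    """Absolute improvement (K) of the high-resolution 90th percentile.

    Positive when the high-resolution value is closer to the observed
    percentile than the low-resolution value, negative otherwise.
    """
    return abs(p90_low_res - p90_obs) - abs(p90_high_res - p90_obs)


# Hypothetical station: observed 25.0, low resolution 27.5, high resolution 26.0.
gain = improvement(26.0, 27.5, 25.0)   # 2.5 - 1.0 = 1.5 K closer to observations
loss = improvement(23.0, 24.5, 25.0)   # 0.5 - 2.0 = -1.5 K, a degradation
```

Because absolute distances are compared, the measure does not distinguish warm from cold biases: a simulation that moves from a +1 K to a −1 K bias shows zero improvement.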
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig3_HTML.gif
Fig. 3

Improvement (positive values) of the simulated 90th percentile with 0.11° resolution relative to 0.44°. The improvement is calculated as the absolute difference between the simulated and observed 90th percentiles at 0.44° minus the same absolute difference at 0.11°
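The improvement score mapped in Fig. 3 can be illustrated with a minimal Python sketch. The function name `improvement` and the station values below are hypothetical and only serve to show the sign convention (positive when the 0.11° run is closer to the observed percentile than the 0.44° run).

```python
def improvement(p90_low, p90_high, p90_obs):
    """Absolute improvement of the high-resolution run at one station (in K).

    Positive values mean that the 0.11 degree percentile is closer to the
    observed percentile than the 0.44 degree one.
    """
    return abs(p90_low - p90_obs) - abs(p90_high - p90_obs)


# Hypothetical station values (K), for illustration only.
stations = [
    {"obs": 24.0, "low": 26.5, "high": 25.0},  # warm bias reduced
    {"obs": 18.0, "low": 17.5, "high": 16.0},  # cold bias amplified
]
scores = [improvement(s["low"], s["high"], s["obs"]) for s in stations]
print(scores)
```

A score of zero means that both resolutions are equally far from the observations, even if their biases have opposite signs.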

One remarkable feature emerges in areas where there is neither large-scale improvement nor large-scale degradation. In such areas one can distinguish improvements at many (but not all) coastal stations, probably as a consequence of better-resolved sea and land breezes and associated temperatures, as well as a better land-sea contrast and representation of the coastline. This feature is evident in the REMO, RACMO2 and WRF-IPSL-INERIS cases, and to a certain extent also in the other models (see Fig. 3), with improvements along the coasts of Spain and the Nordic sea.

3.2 Analysis of peak over warm thresholds of temperatures

One important issue is whether models are able to represent the main characteristics of the observed temperature distribution’s upper tail. These characteristics can be estimated from extreme value theory (Coles 2001), by fitting a generalized Pareto distribution (GPD) to summer temperatures exceeding a high threshold u (or the peaks over threshold). Under conditions of independence and identical distribution and general technical assumptions of regularity, the probability distribution of high temperatures T exceeding a large threshold u can be approximated by a generalized Pareto distribution with a scale parameter \( \sigma \) and a shape parameter \( \xi \) (Embrechts et al. 1997, see p. 167):
$$ \Pr (T \le x \mid T > u) = \begin{cases} 1 - \left( 1 + \frac{\xi (x - u)}{\sigma } \right)^{ - \frac{1}{\xi }}, & \text{if } \xi \ne 0, \\ 1 - \exp \left( - \frac{x - u}{\sigma } \right), & \text{if } \xi = 0. \end{cases} $$
(1)

The shape parameter \( \xi \) controls the convexity of the return level variations as a function of return period. In this paper, we estimate the GPD parameters by a classical maximum likelihood method (Coles 2001). When \( \xi \) is negative the fitted GPD has a finite upper bound \( x_{\infty } = u - \frac{\sigma }{\xi } \). This is generally the case for temperature data in the warm tail (Parey et al. 2007; Yiou et al. 2008).
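As an illustration, the right-hand side of Eq. (1) and the upper bound \( x_{\infty } \) can be evaluated with a few lines of Python. This is only a sketch: the function names and the parameter values used in the example (u = 25, σ = 2, ξ = −0.25) are hypothetical and are not taken from the fits discussed in this paper.

```python
import math


def gpd_cdf(x, u, sigma, xi):
    """Right-hand side of Eq. (1); it rises from 0 at x = u towards 1."""
    if x <= u:
        return 0.0
    z = (x - u) / sigma
    if xi == 0.0:
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:  # only possible for xi < 0: x is at or beyond the bound
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def gpd_upper_bound(u, sigma, xi):
    """Finite upper bound u - sigma/xi for xi < 0, infinity otherwise."""
    return u - sigma / xi if xi < 0.0 else math.inf


# Hypothetical parameters: threshold 25, scale 2, shape -0.25.
print(gpd_upper_bound(25.0, 2.0, -0.25))
print(gpd_cdf(29.0, 25.0, 2.0, -0.25))
```

With a negative shape parameter the function saturates at 1 once x reaches the upper bound, which is why a shape closer to zero stretches the range of possible extremes.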

The choice of the threshold u is a compromise between the convergence of the empirical probability distribution function of the exceedances toward a GPD (Eq. 1), which favors a high threshold, and the number of data points available to estimate the GPD parameters, which favors a low one. Heuristic (and necessarily arbitrary) procedures can be devised for this choice (Coles 2001). Here we use as threshold u the 90th percentile of the data, implying that the parameters in Eq. (1) are estimated with 10 % of the data. We found that higher thresholds (95th or 99th percentiles) do not provide stable estimates of the GPD parameters. Generally, quantile–quantile plots of the observed and modeled distributions are used to determine the goodness of fit of the GPD parameters for the peaks over threshold. The large number of time series prevents such a graphical verification. Instead, we computed the root mean square error (RMSE) of the GPD fit with respect to the observed exceedances, normalized by the estimated scale parameter of the GPD. We consider that the GPD fit is acceptable when this normalized score is lower than 10 %. This heuristic procedure provides a way of assessing the goodness of fit of the GPD model for the distribution of threshold exceedances. Table 2 (last column) shows the percentage of ECA&D stations fulfilling this goodness-of-fit criterion for each simulation and for the observations.
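The normalized score can be sketched as follows. The text does not specify how the RMSE is evaluated, so this example makes an assumption: sorted exceedances are compared with GPD quantiles taken at the plotting positions (i + 0.5)/n. The function names and test values are hypothetical.

```python
import math


def gpd_quantile(p, u, sigma, xi):
    """Inverse of the GPD distribution function for 0 <= p < 1."""
    if xi == 0.0:
        return u - sigma * math.log(1.0 - p)
    return u + sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def normalized_rmse(peaks, u, sigma, xi):
    """RMSE between sorted peaks and GPD quantiles, divided by the scale."""
    ordered = sorted(peaks)
    n = len(ordered)
    squared = 0.0
    for i, value in enumerate(ordered):
        p = (i + 0.5) / n  # assumed plotting position, not from the paper
        squared += (value - gpd_quantile(p, u, sigma, xi)) ** 2
    return math.sqrt(squared / n) / sigma


def fit_is_acceptable(peaks, u, sigma, xi, tolerance=0.10):
    """Apply the 10 % criterion to the normalized score."""
    return normalized_rmse(peaks, u, sigma, xi) < tolerance
```

Dividing by the scale parameter makes the score dimensionless, so that the same 10 % tolerance can be applied to stations with very different spreads of extreme temperatures.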

Daily temperature values are not independent and extremes tend to occur in clusters of days. To account for this, our analysis is performed on clusters of threshold exceedances: clusters separated by fewer than 3 days are merged, and each cluster is represented by its peak temperature. This procedure ensures a rough independence of the threshold exceedances, and hence a reasonable application of extreme value theory.
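A minimal Python sketch of this declustering step is given below. It assumes one particular reading of the separation rule, namely that two exceedances belong to the same cluster when fewer than 3 days lie between them; the function name and the sample series are hypothetical.

```python
def cluster_peaks(temps, threshold, min_gap=3):
    """Return the peak temperature of each cluster of threshold exceedances.

    Two exceedances fall in the same cluster when fewer than `min_gap`
    days lie between them (assumed reading of the 3-day separation rule).
    """
    peaks = []
    last_day = None
    for day, temp in enumerate(temps):
        if temp <= threshold:
            continue
        if last_day is not None and day - last_day - 1 < min_gap:
            peaks[-1] = max(peaks[-1], temp)  # merge into the current cluster
        else:
            peaks.append(temp)  # start a new cluster
        last_day = day
    return peaks


# Hypothetical daily series with a threshold of 30.
series = [20, 31, 32, 20, 20, 33, 20, 20, 20, 30.5, 20]
print(cluster_peaks(series, 30))
```

In this example the first three exceedances are merged into one cluster because only two cooler days separate them, while the last exceedance starts a new cluster after a three-day gap.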

We fit a generalized Pareto distribution to the ECA&D daily temperature time series for JJA. The same analysis is performed on the model simulations at the grid points nearest to the ECA&D stations. Figure 4a, b shows the spatial distribution of the scale and shape parameters, respectively, fitted from observations in the ECA&D database. The scale parameter has a regional pattern with somewhat weaker values in the South than in the North. The shape parameter values are almost always negative, although a few stations near the Black Sea yield positive shape parameters. The spatial pattern of the shape parameter is opposite to the scale pattern, with negative values close to zero in Mediterranean areas. Since the upper bound of the distribution is related to the ratio of the scale to the shape parameter, this indicates a wide range of extreme values (beyond the 90th percentile) in Southern regions, and a smaller range in the Northern areas.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig4_HTML.gif
Fig. 4

Spatial distribution of a observed generalized Pareto scale parameter, b observed generalized Pareto shape parameter, c ensemble mean bias of the scale parameter and d ensemble mean bias of the shape parameter

Table 2 shows the spatial correlations, the median and the range of observed and simulated values of the scale and shape parameters. The ranges of their values are fairly well reproduced, but the spatial correlation is poor for the shape (r ≤ 0.42, value reached for ARPEGE) and somewhat larger (in the 0.15–0.55 range) for the scale. A systematic overestimation of the shape parameter is found (values are negative, but too close to 0), while no systematic bias occurs in the scale parameter (see median values). The overestimation is pronounced in Northern Europe while a weak underestimation is found in Southern Europe. The bias patterns of the shape and scale parameters are opposite to the patterns of the observed distributions (compare Fig. 4a, b with Fig. 4c, d) and therefore models tend to have, on average, a smoother North–South parameter gradient than observations. This indicates that the North–South gradient of upper temperature bounds (when the shape parameter is negative) is, on average, lower in the simulations than in the observations and that the range of extreme values beyond the threshold is increased in Northern areas and decreased in Southern areas.

3.3 Persistence and amplitude of heat waves

A heat wave is defined as a period of consecutive days with hot temperatures. Both persistence and amplitude of a hot temperature spell affect health (Anderson and Bell 2009). Therefore the ability of RCMs to predict both features of heat waves is important to evaluate. Here we investigate the ability of models to simulate the observed distribution of persistence and amplitude for hot spells. The distribution of persistence is directly influenced by the bias: a model that has an overestimated 90th percentile has an obvious overestimation of the duration of hot spells. In order to remove this effect we defined here hot days as days where daily mean temperature exceeds the 90th percentile of the simulated temperature of the same model instead of that of observations.

In Fig. 5a, b we show the average observed number of heat wave events of duration larger than a given number of days, as a function of this number of days (heavy black decreasing curve). The average is performed over the ECA&D stations lying in the [20°W–40°E; 30°N–70°N] domain. The figure also shows the ratio of the number of simulated to observed events for each model. The number of events persisting more than a few days is overestimated by a factor that generally increases with duration. This amplification is found in almost all simulations, with factors in the range 1–1.5 for durations larger than 5 days and 1–3 for the longest durations. The amplification remains modest only in a small subset of models (CANRCM4, WRF-AUTH44, WRF-CRPGL11, WRF-BCCR44). Figure 5c, d shows the observed frequency of days with temperature exceeding the 90th percentile in spells of consecutive days with duration larger than a given number of days, together with the ratio of simulated to observed frequencies. The same conclusions hold in this case: while the frequency is by construction equal to 10 % when spells of all durations are included, the overestimation factor increases with duration and reaches values of 2–3 for some models. Only the WRF-CRPGL simulation does not exhibit an overestimation.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig5_HTML.gif
Fig. 5

a, b Average observed number of heat wave events of duration larger than a given number of days, as a function of this number of days (heavy black decreasing curve). The average is performed over the ECA&D stations lying in the [20°W–40°E; 30°N–70°N] domain. The figure also shows the ratio of the number of simulated to observed events for each model (other curves with model legend given in the graph itself). To improve readability the ensemble is split into two sub-ensembles: non-WRF simulations (a) and WRF simulations (b). c, d Same as a, b for the frequency of days (instead of the number of events) in spells with durations larger than or equal to the value in abscissa. High-resolution simulations are highlighted with dashed lines and have the same color as the low-resolution simulations
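The persistence diagnostic can be sketched in Python as follows. This is an illustrative sketch rather than the processing chain used for Fig. 5: the function names are hypothetical, the percentile uses simple linear interpolation, spells are counted with durations larger than or equal to the given value, and the series is treated as a single season (in practice spells would be identified within each summer separately).

```python
def percentile(values, q):
    """Empirical percentile (0 <= q <= 100) with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def spell_lengths(temps, threshold):
    """Lengths of runs of consecutive days strictly above the threshold."""
    lengths, run = [], 0
    for temp in temps:
        if temp > threshold:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def count_events(lengths, min_duration):
    """Number of spells lasting at least `min_duration` days."""
    return sum(1 for n in lengths if n >= min_duration)


def event_ratio(sim_temps, obs_temps, min_duration):
    """Simulated-to-observed ratio, each series using its own 90th percentile."""
    sim = count_events(spell_lengths(sim_temps, percentile(sim_temps, 90)), min_duration)
    obs = count_events(spell_lengths(obs_temps, percentile(obs_temps, 90)), min_duration)
    return sim / obs if obs else float("nan")
```

Because each series is compared with its own 90th percentile, a uniform warm or cold bias does not change the spell lengths, which isolates the persistence error from the mean bias.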

This tendency of regional models to overestimate the duration of events has not been shown in previous work to be so systematic. It is also robust because this behavior was found when repeating the analysis separately over the two halves of the data set (not shown). However, for the largest durations (>10 days) the quantitative value of the amplification factor is less robust, as it is affected by time sampling: the number of events is small for larger durations (it rarely exceeds 5 over the 20-year period at any given station). The systematic overestimation of the number of such events lies principally in Western/Central Europe where there are many stations available. In a number of cases models predict 2–5 events in this area while the observed number of cases is in the range 0–3.

Another interesting feature revealed by Fig. 5 is that, except for WRF-IPSL-INERIS simulations at long durations, the six other high resolution model setups appear to have equal or improved durations of events. The improvement is substantial in particular for ARPEGE and REMO. This general improvement and the relation between low and high resolution simulations for each individual model is robust because it is also found from a separate analysis over the two halves of the period. Only in one case (REMO over the second half of the period), the relation is different, as the low-resolution simulation has a slightly lower over-estimation than the high-resolution simulation.

In order to evaluate the model skill in simulating observed heat wave amplitudes, we consider here only events of sufficient duration. As in Fischer and Schär (2010), a minimal number of 6 consecutive days exceeding the model 90th percentile is considered to define a heat wave. Then the amplitude is defined as the difference between the peak temperature over the period and the model 90th percentile itself.
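The amplitude definition can be illustrated with a short Python sketch. The function name and the sample series are hypothetical; the 6-day minimum duration and the use of the model's own 90th percentile follow the definition given above.

```python
def heat_wave_amplitudes(temps, p90, min_days=6):
    """Amplitude (K) of each spell of at least `min_days` days above p90."""
    amplitudes, run = [], []
    # A sentinel value closes a spell that is still open at the end.
    for temp in list(temps) + [float("-inf")]:
        if temp > p90:
            run.append(temp)
        else:
            if len(run) >= min_days:
                amplitudes.append(max(run) - p90)
            run = []
    return amplitudes


# Hypothetical series: a 6-day spell, a 3-day spell (ignored), a 6-day spell.
series = [20, 20, 26, 27, 29, 28, 26, 27, 20, 26, 26, 26, 20] + [30] * 6
print(heat_wave_amplitudes(series, 25))
```

Spells shorter than 6 days contribute no amplitude at all, so the statistics describe only events that already qualify as heat waves.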

Figure 6 shows the observed frequency of events of amplitude larger than a given temperature exceedance above the 90th percentile (the number of such events divided by the total number of events), as a function of this exceedance (in K), together with the simulated-to-observed frequency ratio. By definition this frequency is one for an amplitude larger than 0, because all events are taken into account, and it decreases as a function of amplitude. A quasi-systematic overestimation of the frequency of high-amplitude events is found. This behavior is robust because it is also found over the two halves of the data. Several simulations, mostly the high resolution ones, exhibit a frequency of events that is close to observations for amplitudes up to 5–6 K (ARPEGE-CNRM11, RACMO2-KNMI11, RCA-SMHI11, WRF-CRPGL11, WRF-UHOH11) while others exhibit too frequent large-amplitude events. An improvement with resolution is only found for CCLM and RCA. For REMO and ARPEGE the improvement is difficult to detect while for WRF the high resolution leads to a bias amplification. It is therefore not possible to conclude that resolution improves the amplitude distribution. Finally it is interesting to note that models that perform well in terms of frequency do not automatically perform well in terms of intensity (e.g., CCLM).
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig6_HTML.gif
Fig. 6

Solid black monotonically decreasing curve: frequency of observed events with bias-corrected amplitudes larger than the value in abscissa. The amplitude is defined as the maximum, over the period, of the difference between temperature and the model 90th percentile. Other curves (with model legend in the figure): ratio of model-to-observed frequency of events as a function of amplitude exceeding the value in abscissa. High-resolution simulations are highlighted with dashed lines. a Non-WRF simulations; b WRF simulations

3.4 Inter-annual variability of hot seasons

The inter-annual variability of the European summer climate is generally driven by the occurrence or non-occurrence of persistent heat waves characterizing the season’s heat. For instance, during the summer of 2003 two main heat waves occurred in June and in August, and in 2006 one major heat wave occurred in July. This variability is well characterized by the mean frequency of hot days over a given region (see e.g. Hirschi et al. 2011; Quesada et al. 2012). In order to focus on the area where high-impact heat waves recently occurred (Central/Southern Europe), we restricted our analysis of the inter-annual variability to the region [10°W–30°E; 36°N–61°N] as defined in Quesada et al. (2012) and shown in Fig. 2a. In order to evaluate the skill in reproducing the inter-annual summer variability of hot days we averaged the frequency of hot days for each season over all stations included in this area. Figure 7 shows the resulting time series. The frequency of hot days was calculated first relative to the observed 90th percentile (Fig. 7a, b) and then relative to each model’s 90th percentile (Fig. 7c, d).
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig7_HTML.gif
Fig. 7

a, b Time series of the observed and simulated seasonal frequency of hot days (days with daily temperature above the observed 90th percentile), averaged over stations in the region [10°W–30°E; 36°N–61°N]; c, d same as panels a, b with the 90th percentile taken from the model simulations themselves. See legend for the correspondence of curves
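The regional time series of Fig. 7 can be sketched in Python as follows. The data layout (a dictionary of daily summer temperatures per year for each station), the function names and the sample values are assumptions made for illustration only.

```python
def hot_day_frequency(series_by_year, threshold):
    """Fraction of days above the threshold for each year (one station)."""
    return {
        year: sum(1 for t in temps if t > threshold) / len(temps)
        for year, temps in series_by_year.items()
    }


def regional_mean(frequencies_by_station):
    """Average the yearly frequencies over all stations of the region."""
    years = sorted(next(iter(frequencies_by_station.values())))
    n = len(frequencies_by_station)
    return {
        year: sum(freq[year] for freq in frequencies_by_station.values()) / n
        for year in years
    }


# Hypothetical data: two stations, two summers of four days each.
station_a = {2003: [30, 31, 20, 20], 2004: [20, 20, 20, 31]}
station_b = {2003: [28, 28, 28, 28], 2004: [10, 10, 10, 10]}
frequencies = {
    "A": hot_day_frequency(station_a, 25),  # threshold: 90th percentile at A
    "B": hot_day_frequency(station_b, 25),  # threshold: 90th percentile at B
}
print(regional_mean(frequencies))
```

Passing the observed 90th percentile as threshold corresponds to panels a, b, while passing each model's own 90th percentile corresponds to panels c, d.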

Using this representation, three main hot summers are found over the simulation period: 1994, 2003 and 2006, each with a frequency of hot days clearly higher than normal. Models are able to reproduce this variability, but generally a strong positive bias is found over Southern/Central Europe. In several simulations, at both high and low resolution, the hot days frequency is overestimated by a factor of 2–3. This bias is generally accompanied by an overestimation of the temperature inter-annual variability, as found in previous studies (see e.g. Jacob et al. 2007). Most simulations reproduce the three main hot summers, even though no relaxation toward any kind of observation was made inside the model domain (except for CANRCM4-CCCMA44), which is larger than the blocking anticyclones and synoptic structures associated with heat waves. This also indicates that these three hot summers were strongly conditioned by the large-scale circulation, imposed onto the RCMs by the ERA-Interim boundary forcing.

When model percentiles, instead of observed percentiles, are used to define simulated hot days, the bias is removed by definition, but the simulated variability still remains larger than observed in several cases, creating conditions of heat waves in some years (for instance in 1997 for WRF-IPSL-INERIS44 and RCA-SMHI11) which have not been observed with a comparable intensity. Some simulations have a variability that is strongly correlated with observations (e.g. ARPEGE, CCLM). The factors driving the accurate simulation of inter-annual variability, synchronous to observations, are difficult to establish, and they seem rather independent of the biases themselves: models that best follow the (corrected) inter-annual evolution have large biases in hot day frequencies.

The fact that some models have an evolution strongly correlated with observations means that the internal climate variability within the domain is rather low, at least as far as summer heat is concerned. Thus inter-annual variability is mostly driven by the large-scale flow through boundary conditions. Internal processes such as convection, which could modulate such variability, in particular far from the influence of boundary conditions, do not seem to be effective, which qualifies results from previous studies (Sánchez-Gómez et al. 2009; Lucas-Picher et al. 2008).

It is noteworthy that the changes in biases are rather systematic across the time period. For RCA, a strong reduction of the bias is found for almost all years when passing from low to high resolution, and the reverse is found for REMO, and to a certain extent for ARPEGE (see Fig. 7a). For CCLM, RACMO2 and WRF, the change of resolution did not modify the bias systematically, but the bias changes are small. This confirms that differences found between pairs of low/high resolution simulations are not likely due to internal, random variability, but to systematic changes, which can be found every year.

4 Discussion

We have analyzed an ensemble of regional climate simulations over a 20-year period, carried out within the EURO-CORDEX framework, and examined the specific issue of summer hot days and heat waves. Simulations were made at two resolutions (50 km and 12 km), and were compared to observations of several hundreds of meteorological stations in Europe (493 for temperature). The first striking result is that, despite the common boundary forcing, there is a large spread between simulation results, which is not due to sampling limitations because differences between simulation and observation statistics are also found when restricting the analysis over each of the two halves of the data.

4.1 About sources of spread and differences between simulations

One source of spread, which could not be investigated here, could well be the method to process boundary conditions in the model (vertical layers used for ERA-Interim, relaxation region, number of extra-cells taken out of the CORDEX domain). Other sources of spread are certainly the way interactions between land surfaces and the atmosphere are taken into account in the models. In particular, the partitioning between sensible and latent heat fluxes as well as radiation fluxes is fairly uncertain (Lenderink et al. 2007; Stegehuis et al. 2012). The treatment of convection is a further candidate for introducing strong inter-model variability. The large differences obtained between the simulations made with the same model (WRF or REGCM) but different physics options indicate that a large part of the spread is due to parameterizations. These differences due to physical parameterizations are emphasized here due to the large domain size and specific weather situations with very weak advection.

Due to the strong feedbacks between soil moisture, vegetation and temperature we first suspected the representation of processes relating temperature and the water cycle to be a major source of spread. A clear relation between hot temperatures and precipitation can be seen by analyzing the average summertime frequency of hot days over the Central-Southern region (see Sect. 3.4) as a function of mean daily summertime precipitation over the same region (Fig. 8). Several models have a dry bias associated with a hot bias (e.g. CCLM-CLMCOM, ARPEGE), while others (several WRF simulations in particular) have reverse bias characteristics. Interestingly, simulations with fairly accurate hot-day statistics often have a wet bias, meaning that there is a compensation by other processes.
https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Fig8_HTML.gif
Fig. 8

Mean frequency of summertime hot days, defined as days where daily temperature exceeds the observed 90th percentile obtained from ECA&D observations, as a function of mean summertime daily precipitation. Results are averaged over Central/Southern Europe as described in Sect. 3.4. Results are for all simulations (0.11° resolution is marked in red and 0.44° resolution in blue), and for ECA&D observations (black diamond). The regression line for all model points is shown (r = −0.72)

In order to further investigate the processes involved in the differences in simulated summer temperature 90th percentiles (Fig. 1), we calculated the spatial correlations, over the area considered in Sect. 3.4, between temperature 90th percentile differences and the mean differences of a few other parameters: summertime mean precipitation, incoming shortwave radiation and evaporative fraction (the ratio of the latent heat flux to the sum of the latent and sensible heat fluxes). Tables 3, 4 and 5 show the matrices of correlations obtained for each of these parameters, respectively.
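The quantities entering these correlation matrices can be sketched in Python. The function names and the three-point sample fields are hypothetical; a plain Pearson coefficient over land grid points is assumed, without area weighting.

```python
import math


def evaporative_fraction(latent, sensible):
    """Latent heat flux divided by the sum of latent and sensible fluxes."""
    return latent / (latent + sensible)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference_correlation(t90_a, t90_b, field_a, field_b):
    """Spatial correlation between T90 differences and field differences
    for one pair of simulations (a minus b at every grid point)."""
    d_t = [a - b for a, b in zip(t90_a, t90_b)]
    d_f = [a - b for a, b in zip(field_a, field_b)]
    return pearson(d_t, d_f)


# Hypothetical pair of simulations on three land points: simulation a is
# warmer where its evaporative fraction is lower.
r = difference_correlation([30, 28, 26], [28, 27, 26],
                           [0.3, 0.4, 0.5], [0.5, 0.5, 0.5])
print(round(r, 3))
```

A strongly negative coefficient for the evaporative fraction, as in this toy case, indicates that the warmer simulation partitions more of the surface energy into sensible heat.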
Table 3

Spatial correlation coefficients between 90th percentile temperature differences and mean summer precipitation amount differences of each pair of simulations

https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Tab3_HTML.gif

Correlation is calculated over land only and over the area shown in Fig. 2a. Boldfaced numbers stand for a correlation larger than 0.5 in absolute value

Table 4

Spatial correlation coefficients between 90th percentile temperature differences and mean summer short-wave radiation differences of each pair of simulations

https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Tab4_HTML.gif

Correlation is calculated over land only and over the area shown in Fig. 2a. Boldfaced numbers stand for a correlation larger than 0.5 in absolute value

Table 5

Spatial correlation coefficients between 90th percentile temperature differences and mean evaporative fraction differences of each pair of simulations

https://static-content.springer.com/image/art%3A10.1007%2Fs00382-013-1714-z/MediaObjects/382_2013_1714_Tab5_HTML.gif

Correlation is calculated over land only and over the area shown in Fig. 2a. Boldfaced numbers stand for a correlation larger than 0.5 in absolute value

For REGCM, temperature differences are well anti-correlated with evaporative fraction differences, and to a lesser extent to precipitation differences, but are not correlated with short-wave radiation differences. Additional sensitivity tests made over 3 years showed that the large differences between the two runs are due to the differences of the convection scheme (Grell scheme for CUNI44 and Emanuel scheme for DHMZ44), the Emanuel scheme leading to a drier and much warmer simulation (see Fig. 8).

For WRF simulations the investigation of the differences is more complex, due to the multitude of physics options used. In general, correlations are higher with short-wave radiation, which can result from changes either in radiation, convection or microphysics schemes. For instance the differences between UCAN44 and CRPGL44, with a short-wave radiation spatial correlation of r = 0.87, are due to the convection scheme (Grell and Devenyi vs. Kain-Fritsch schemes, García-Díez et al. 2012), the former producing more clouds than the latter. The correlations (see Tables 3, 4, 5) between UCAN44 and BCCR44, and between IPSLINERIS44 and BCCR44, exhibit the same behavior (high for radiation and evaporative fraction) with the same difference in convection scheme, and with microphysics schemes that differ but belong to the same family (WSM3, WSM5 and WSM6). Thus a large source of spread is the choice of the convection scheme (see also Warrach-Sagi et al. 2012). Another strong source of spread is the microphysics scheme, mostly between the Morrison et al. (2009) scheme, used in UHOH11, and WSM-based schemes. The former was shown to provide less rain than the latter (Warrach-Sagi et al. 2012), which is most probably at the origin of the high anti-correlations of the temperature and precipitation differences between UHOH11 and other WRF configurations (see Table 3), also consistent with the anti-correlation with evaporative fraction differences (Table 5). There exist other sources of spread in the WRF ensemble, which could explain for instance the temperature differences between AUTH44 and other WRF simulations (vertical grid, processing of input variables and boundary conditions, …).

The differences between temperatures obtained from high and low-resolution simulations can also be investigated from Tables 3, 4, and 5 and further considerations. Except for ARPEGE, which has a specific behavior, temperature changes between high and low resolution simulations are anti-correlated with precipitation or evaporative fraction, and less clearly correlated with incoming radiation. This tends to indicate that models largely respond to changes in resolution through changes in precipitation and subsequent feedbacks. A companion paper (Kotlarski S, Keuler K, Déqué M, Gobiet A, Goergen K, Jacob D, Lüthi D, van Meijgaard E, Nikulin G, Suklitsch M, Teichmann C, Vautard R, Warrach-Sagi K: Regional climate modelling on European scales: A joint standard evaluation of the Euro-CORDEX RCM ensemble, in preparation for Climate Dynamics) shows that for all models but ARPEGE and REMO, increasing the resolution increases the rain amount in most regions. We found for instance that, within the region considered in Sect. 3.4, the land area fraction where mean summer precipitation increases is 96 % for CCLM, 72 % for RACMO2, 81 % for RCA, 73 % for WRF-CRPGL, 75 % for WRF-IPSL-INERIS, 35 % for ARPEGE and 10 % for REMO. The increase in precipitation amount is large for CCLM, RCA and RACMO2, and smaller for WRF simulations. This general, but not systematic, sensitivity of precipitation to resolution remains to be understood.

The sensitivity of parameterizations to spatial resolution can be strong. An example of such a resolution effect occurs for ARPEGE: the cold bias over the mountains is explained by the overly conservative ground snow scheme. At low resolution (50–300 km), snow tends to melt too early in Spring over the plains because of the positive snow-albedo feedback at the scale of the model. To cope with this, the ISBA scheme in ARPEGE allows snow to be maintained even when surface temperatures are above 0 °C, by diagnosing a different temperature for snow (Douville et al. 2000). We can see here (see Fig. 1, ARPEGE 0.11° panel) the detrimental effect of increasing resolution when physical parameterizations are not re-adjusted.

4.2 About persistence and amplitude biases

Another important result, consistent with previous inter-comparison studies, is that models share common biases concerning hot temperatures in Europe. Most models overestimate the 90th percentile over Southern Europe, and underestimate it over Scandinavia. Even when removing these biases by defining hot days relative to the model 90th percentile, models appear to have too persistent and too strong heat waves. This result is robust as it is found separately over the two halves of the data sets. There is no clear explanation for this. However, long spells of extreme hot days result from the combination of anticyclonic weather and amplifying land–atmosphere feedbacks. If the latter processes are exaggerated in models, one expects an exaggerated asymmetry and skewness in the temperature distribution (Jaeger and Seneviratne 2010), and therefore stretched temperature values at the extremes. This induces higher amplitudes and durations of events, as well as an exaggerated range of extreme values consistent with the extreme value diagnostics presented in Sect. 3.2. This possible cause would explain our results and those of previous studies (Christensen et al. 2008; Boberg and Christensen 2012). Moreover it is consistent with results recently obtained for the ENSEMBLES regional simulations (Stegehuis et al. 2012; Fischer et al. 2012). Therefore, surface energy fluxes deserve a careful investigation, which is left for a future study.

4.3 Final remarks

One of the broader aims of EURO-CORDEX is to provide a downscaling of CMIP5 simulations, with regional models covering a range of spatial scales from the local scale to the scale of synoptic weather systems and Rossby waves, allowing scale interactions. As far as hot temperatures are concerned, our analysis was not conclusive about the benefit of high relative to low resolution. Biases in the 90th percentile change with resolution, for some models quite significantly, even at large scale. This could be due to different effects induced by parameterizations at the two resolutions. Despite these changes, we could show that for most models, the strong heat wave persistence bias was reduced, especially for moderate durations. Although this would need a detailed specific analysis, an explanation could be that a higher resolution helps to better resolve smaller-scale and higher-frequency variability due to, for instance, mesoscale convergence zones and summertime convective systems, inducing more variability and intermittency in temperature extremes. There are also local improvements of biases, in particular in some coastal regions. This may be due to the ability of models to better resolve coastal breezes at the 12 km resolution than at the 50 km resolution and to a more accurate representation of the coastline. In mountainous areas improvements were difficult to detect. A reason for this could be that a 12 km resolution is still not fine enough to resolve mesoscale systems, valley flows and therefore the spatial temperature pattern. This is consistent with the improvement of weather forecasts obtained when increasing the resolution to the convection-permitting scale (Wulfmeyer et al. 2011; Rotach et al. 2009).

The analysis carried out in this study was also designed to improve our understanding of regional climate model uncertainties and to characterize them. Such uncertainties are essential components of the information to be delivered to users of regional climate projections from so-called climate services. The results obtained here cannot be directly translated into applicable information for these users, as several other major sources of uncertainty (such as biases of the driving global climate models) are present in regional future climate projections. However they indicate that the uncertainty solely due to regional climate modeling remains an important one for the simulation of heat waves, as the spread between simulations is large. The relation between precipitation and temperature biases (Fig. 8) shows that more observational and modeling efforts are certainly needed to better simulate the regional water cycle.

Acknowledgments

The EURO-CORDEX simulations and analysis were carried out in several groups within the framework of the IMPACT2C FP7 project (Grant FP7-ENV.2011.1.1.6-1). The BTU Cottbus offered an exchange ftp site for sharing all model simulation files. The contribution from CRP-GL was funded by the Luxembourg National Research Fund (FNR) through Grant FNR C09/SR/16 (CLIMPACT). We acknowledge financial support from the Spanish R&D program through Grants CGL2010-21869 (EXTREMBLES) and CGL2010-22158-C02-01 (CORWES). Charles University runs were supported partially in the framework of the Research Plan of MSMT (No. MSM 0021620860) and GACR project No. P209/11/2405. The contribution from UHOH was funded by the German Science Foundation (DFG) through project FOR 1695. The REMO simulations were supported by CSC, MPI, as well as BMBF and performed under the “Konsortial” share at the German Climate Computing Centre (DKRZ), as were the CCLM simulations carried out by BTU; we are further thankful for their various support. We are thankful to the French CCRT/TGCC supercomputing center support and the CEA and GENCI computing resource allocation agency for the WRF-IPSL-INERIS runs. The development of the modeling chain at INERIS and IPSL was carried out in part within the French national project SALUT’AIR (PRIMEQUAL research program). Part of the SMHI contribution was done in the ECLISE project that receives funding from the European Union Seventh Framework Programme (FP7/2007–2013) under Grant agreement 265240 and in the Swedish Mistra-SWECIA programme funded by Mistra (the Foundation for Strategic Environmental Research). The DHMZ contribution was partially supported by the Croatian Ministry of Science, Education and Sports (Grant No. 004-1193086-3035). We also acknowledge the Research Committee of AUTH for the financial support, the Scientific Computing Center of AUTH for the technical support and the EGI and HellasGrid infrastructures for the provision of computational resources. The BCCR contribution was supported in part by the Center for Climate Dynamics (SKD). We acknowledge the E-OBS dataset from the EU-FP6 project ENSEMBLES (http://ensembles-eu.metoffice.com) and the data providers in the ECA&D project (http://eca.knmi.nl). The Climate Data Operators software was extensively used throughout this analysis (https://code.zmaw.de/projects/cdo/).

Copyright information

© Springer-Verlag Berlin Heidelberg 2013