1 Introduction

The frequency and intensity of extreme summer heat events are projected to increase over Europe as a response to rising global mean temperatures (Stott et al. 2004; Christidis et al. 2014; Suarez-Gutierrez et al. 2018). However, the frequency and intensity of heat extremes could also be further exacerbated as a response to changes in the driving mechanisms of extreme summer temperatures under global warming, resulting in an increase in summer temperature variability (Schär et al. 2004; Meehl and Tebaldi 2004; Fischer et al. 2012; Lustenberger et al. 2014; Donat et al. 2017; Bathiany et al. 2018). Although some studies argue that the distribution of European summer extreme temperatures in a warmer climate mostly follows the mean summer warming (Ballester et al. 2009, 2010), the variability in European summer temperatures is large, and the evidence indicating that this variability could increase under global warming, potentially resulting in extremes that warm more than the mean summer climate, is robust (Fischer et al. 2012; Lustenberger et al. 2014; Bathiany et al. 2018). However, it remains unclear which of the two main driving mechanisms of extreme European summer temperatures controls this increase in variability. Several studies attribute the increase in summer temperature variability to the local thermodynamical effects of moisture limitation on temperature (Seneviratne et al. 2006; Diffenbaugh et al. 2007; Fischer and Schär 2009; Vogel et al. 2017; Donat et al. 2017). In contrast, other studies point to the large-scale dynamical atmospheric state as the main driver of heat extremes, and argue that changes towards more anticyclonic atmospheric patterns drive the increase in temperature variability (Meehl and Tebaldi 2004; Horton et al. 2015; Holmes et al. 2016; Mann et al. 2018).
Previous studies thus focus on attributing extreme heat variability changes under global warming to either only dynamical or only thermodynamical drivers, but do not quantify the contribution from both types of drivers simultaneously. Here we quantify the contributions from both colocated dynamical drivers and local thermodynamical drivers to extreme temperature variability, and how these contributions may change in a warmer world, using large samples of extreme summer temperature events simulated by the Max Planck Institute Grand Ensemble (MPI-GE).

Although colocated dynamical mechanisms and local thermodynamic mechanisms have not yet been considered simultaneously to investigate their contributions to increasing European extreme heat variability under future warming conditions, the relevance of this approach has been demonstrated for different regions, for past European extreme heat events, and for mean temperatures. Over the US, both a shift towards more anticyclonic conditions and drier soils can intensify hot extremes in a warmer world, with the latter thermodynamic drivers accounting for more than half of the summer temperature variability (Diffenbaugh and Ashfaq 2010; Merrifield et al. 2017). In the case of Europe, both dynamical mechanisms connected to atmospheric circulation and thermodynamical effects related to relative humidity have been shown to affect temperature during past extremely hot days in both the ERA-Interim record and historical simulations from different CMIP5 models (Krueger et al. 2015). Case-based studies also indicate that past record-breaking heat waves occurred as complex combinations of extreme conditions in both dynamic and thermodynamic drivers (Fischer et al. 2007; Miralles et al. 2014; Schumacher et al. 2019). Using past observed European heat waves as test cases, the best predictive skill for extreme temperatures arises from a combination of dynamical and thermodynamical drivers as predictors (Della-Marta et al. 2007). Under future climate conditions, both dynamical and thermodynamical drivers contribute to the changes in variability of European summer temperatures on daily to subdaily scales (Cattiaux et al. 2015). For European summertime mean temperature, studies considering the local downwelling radiation at the surface as a proxy for the local atmospheric state find that downwelling radiation does not substantially influence its change in variability (Fischer et al. 2012; Bathiany et al. 2018), or that it is not clear whether the relative contributions from downwelling radiation or soil drying dominate the change in variability (Fischer and Schär 2009). It is therefore clear that the two driving mechanisms are not completely independent of each other, but rather capable of intensifying or counteracting one another; and also not completely collinear, but each capable of accounting for part of the variability in extreme temperatures (Della-Marta et al. 2007; Zampieri et al. 2009; Horton et al. 2016; Sillmann et al. 2017; Bunzel et al. 2017; Schumacher et al. 2019). Thus, considering both the dynamical atmospheric state and local thermodynamical mechanisms simultaneously is key to understanding what drives heat extremes under current and future conditions, and which of these contributions dominates the increase in extreme European summer temperature variability under further global warming.

To understand how these colocated dynamic and local thermodynamic effects drive extreme temperatures in a warming world, and how relevant internal variability is in the development of heat extremes, we need to analyse large samples of extreme events that develop under a wide range of background conditions, but that are simulated under the same external forcings and model physics. For this we use simulations from the 100-member Max Planck Institute Grand Ensemble (MPI-GE), currently the largest existing ensemble using a comprehensive Earth System Model (Maher et al. 2019; Bittner et al. 2016; Hedemann et al. 2017; Suarez-Gutierrez et al. 2017). The large size of the ensemble is crucial to robustly sample internal variability and to empirically evaluate the statistical significance of changes in very rare events. An ensemble size of 100 simulations under the same forcing conditions allows 1-in-100-years events to occur on average every simulated year (Suarez-Gutierrez et al. 2018), providing the large samples of extreme events that develop under different driving conditions and global warming levels needed for our study. Another relevant aspect to note is that most previous studies investigate changes in variability in multi-model ensembles and using standard deviation changes as a proxy (Fischer and Schär 2009; Fischer et al. 2012; Bathiany et al. 2018). Although using multi-model ensembles gives a good overview of potential differences in how different models simulate these changes, they do not allow a clear separation between the forced transient warming and the deviations caused by internal variability, and can lead to misleading results. 
Using a very large single-model ensemble, we are able to instead directly evaluate how temperature deviations from the mean state change under global warming, based on a precise characterization of the simulated internal variability and the forced warming signal that are not confounded by different responses to forcing or model configurations.
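As a rough illustration of the ensemble-size argument above, assume that the \(N\) ensemble members are independent and that an event has the same annual probability \(p\) in every member. The expected number of such events per simulated year across the ensemble is then

\[ N\,p = 100 \times \frac{1}{100} = 1 . \]

The symbols \(N\) and \(p\) are introduced here only for illustration.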

We introduce a definition of extreme events with respect to the evolving decadal-mean climate state, 2\(\sigma\) events, that allows us to focus on the extreme events that would pose the biggest challenge to society, even if we manage to adapt to a warmer mean climate state. We define 2\(\sigma\) events as summer months (JJA) with monthly mean anomalies of European summer temperatures (EuSTs) that deviate from the decadal mean climate state by at least two EuST pre-industrial standard deviations. This moving threshold definition of very rare extreme events is only well-defined when both the evolving decadal-mean climate state and the probability distribution of EuSTs in a changing climate are known precisely, as occurs in large ensemble experiments, because it relies on a precise characterization of the simulated internal variability. This novel characterization differs from traditional fixed-threshold or percentile approaches in that it allows us to account for both a potential adaptation to the shift in the mean climate, and an increase in the number of events that are above a certain threshold resulting from an increase in either extreme event frequency or amplitude. By focusing on the events that would be extreme even under adaptation to a warmer mean climate, we can study changes in the number of heat extremes not only caused by the mean shift in the distribution toward higher temperatures, but also caused by an increase in variability that leads to temperature deviations from this mean state that are larger than those in our current climate, and that are caused by changes in the underlying driving mechanisms of extreme temperatures.

To represent the main driving mechanisms of summer temperature variability we consider colocated dynamical atmospheric conditions, represented by geopotential height at 500 hPa (Z500) and sea level pressure (SLP), as well as the local thermodynamical effects of moisture availability, represented by soil moisture fraction (SM) and evapotranspiration (ET). On the one hand, large Z500 and SLP anomalies are associated with the persistent blocking and anticyclonic conditions that act as a dynamical driver of extreme temperatures (Meehl and Tebaldi 2004; Della-Marta et al. 2007; Pfahl 2014; Horton et al. 2016; Sillmann et al. 2017; Schaller et al. 2018). On the other hand, a large decrease in ET indicates that less water is evaporated into the atmosphere. This may occur as a response to lower temperatures that lead to less heat available, and in turn to the evaporation of less moisture, showcasing a positive ET-EuST relation. However, lower evapotranspiration can also relate to limited moisture conditions that have a local thermodynamic driving effect on high temperatures. This driving mechanism, characterized by a negative ET-EuST relation, is initiated when moisture is limited for evaporation, increasing the fraction of radiative energy that is transformed into sensible, rather than latent, heat, and thus resulting in higher temperatures. Similarly, low SM anomalies can occur either as a response to high temperatures that may dry out the soil through evaporation, or as an indicator of the low moisture conditions that initiate the local thermodynamic driving mechanism leading to higher temperatures (Seneviratne et al. 2006; Horton et al. 2016; Sillmann et al. 2017; Vogel et al. 2017).

In this study, we investigate the internal variability in the response of extreme temperatures to different background states in a warming world, separating circumstantial from necessary conditions in the development of large samples of heat extremes. First, in Sect. 2 we introduce MPI-GE and evaluate its ability to simulate European heat extremes, as well as elaborate on the details of our approach and methods. In Sect. 3.1, we analyze extreme temperature events with respect to evolving decadal climate over Europe, and how these extremes change under warming in MPI-GE. In Sect. 3.2, we quantify the contributions from both colocated dynamical drivers and local thermodynamical drivers to the variability in extreme European summer temperatures by constructing a multiple linear regression model with these dynamical and thermodynamical drivers as predictors, in both current and future climate conditions. We also construct multiple regression models based on only dynamical and only thermodynamical drivers, to characterize the importance of simultaneously considering both sources of variability. In Sect.  3.3, we identify the driving mechanisms that dominate the change in extreme summer temperature variability by evaluating how often heat extremes develop under extreme atmospheric conditions, as opposed to extreme moisture limitation, and how these frequencies change under global warming. In Sect. 4 we discuss the caveats and limitations of our approach, followed by a summary of our main results and conclusions in Sect. 5.

2 Data and methods

We use transient climate simulations from the Max Planck Institute Grand Ensemble (MPI-GE) under historical and RCP4.5 forcing conditions (Maher et al. 2019). The ensemble consists of 100 realizations based on the same model physics and parametrizations and driven by the same external forcings, but each starting from a different initial climate state taken from different points of the model’s pre-industrial control run. The MPI-GE uses the model version MPI-ESM1.1 in the low-resolution (LR) configuration, with resolution T63 and 47 vertical levels in the atmosphere (Giorgetta et al. 2013) and 1.5\({}^{\circ }\) resolution and 40 vertical levels in the ocean (Jungclaus et al. 2013). MPI-ESM1.1 is fairly similar to the CMIP5 version of MPI-ESM (Taylor et al. 2012), but has a slightly lower equilibrium climate sensitivity of 2.8 \({}^{\circ }\)C (Giorgetta et al. 2013; Flato et al. 2013), and a new five-layer soil hydrology scheme (Hagemann and Stacke 2015) implemented in the land-surface model component of MPI-ESM1.1, JSBACH (Raddatz et al. 2007). Observational data from the CRUTEM4.6 (Jones et al. 2012) dataset are used for comparing the MPI-GE simulations to current climate conditions.

Despite its low resolution, comparable to most models in the CMIP5 ensemble, MPI-GE captures observed temperature variability adequately (Suarez-Gutierrez et al. 2018). MPI-GE offers an adequate representation of the observed estimate of internal variability in European summer temperatures, particularly in the upper tail of the temperature distribution (Suarez-Gutierrez et al. 2018). Temporal resolution is also relatively limited in MPI-GE, with only monthly output available. Ideally, we would use hourly to daily values to capture the amplitude of internal variability more precisely, to identify heat wave events based on specific indicators (e.g., Zampieri et al. 2016), and to separate the specific conditions leading to extreme temperatures from those that occur as a response to them. However, both dynamical and thermodynamical mechanisms leading to temperature extremes are based on the persistence of either anticyclonic or dry conditions, making their signal still clearly identifiable from monthly mean values.

The improved five-layer soil hydrology scheme in MPI-ESM1.1 also offers a better representation of both soil moisture and related atmospheric processes (Hagemann and Stacke 2015; Bunzel et al. 2017). Compared to the previous one-layer bucket scheme (Roeckner et al. 2003), that tends to overestimate evapotranspiration leading to excessively dry conditions, the new five-layer scheme offers a better representation of soil moisture memory (Hagemann and Stacke 2015), and more realistic simulations of the large-scale atmospheric patterns in 500 hPa geopotential height that lead to an improved representation of extreme temperature events (Bunzel et al. 2017). Despite potential shortcomings, MPI-ESM has been shown in model evaluation studies to adequately simulate the relevant dynamic and thermodynamic mechanisms contributing to the development of past observed extreme European summer temperatures (Krueger et al. 2015).

We define extreme summertime heat anomaly events based on European summer temperatures (EuSTs), defined as monthly mean near-surface 2 m air temperature anomalies for the summer months (JJA) over land-only grid cells in the region defined by the [35–68\({}^{\circ }\)N, 10\({}^{\circ }\)W–50\({}^{\circ }\)E] latitude–longitude domain. We then define temperature extremes as anomalies of EuST that deviate from the decadal mean climate state by at least two EuST pre-industrial standard deviations (2\(\sigma\) events). Since we expect the standard deviation to change with time under warming, we use a fixed standard deviation defined in the period of 1850–1899 and averaged across all historical ensemble members as a deviation threshold for extremes, while the mean decadal climate state is defined as the centered decadal running ensemble mean. This moving-threshold extreme event definition differs from traditional fixed-threshold approaches in that it allows us to focus on the events that would still be extreme even if we manage to adapt to the mean shift toward a warmer European summer climate. It also differs from percentile-based definitions of extremes in that it is not restricted to a fixed number of events in the uppermost extreme percentile. By using this 2\(\sigma\) extreme event definition we can account for both a potential adaptation to the shift in the mean climate, and an increase in the number of events that are above a certain threshold that results from an increase in either extreme event frequency or amplitude.
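The 2\(\sigma\) definition above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the processing code used for MPI-GE: the function name `two_sigma_events`, the array layout (member, year, month), the handling of the running mean at the ends of the record, and the choice to compute the standard deviation per member before averaging are all assumptions made for the example.

```python
import numpy as np

def two_sigma_events(eust, years, window=10, ref=(1850, 1899)):
    """Flag months whose EuST anomaly exceeds the moving 2-sigma threshold.

    eust  : array (member, year, month) of JJA monthly-mean anomalies
    years : array (year,) of calendar years
    """
    # Ensemble-mean summer anomaly for each year (mean over members and months).
    ens_mean = eust.mean(axis=(0, 2))
    # Running mean over `window` years, roughly centered on each year.
    # Near the ends of the record the window is truncated (an assumption).
    half = window // 2
    decadal = np.array([
        ens_mean[max(0, i - half):i + half].mean() for i in range(len(years))
    ])
    # Fixed sigma: standard deviation over the reference period,
    # computed for each member and then averaged across members.
    in_ref = (years >= ref[0]) & (years <= ref[1])
    n_members = eust.shape[0]
    sigma = eust[:, in_ref, :].reshape(n_members, -1).std(axis=1, ddof=1).mean()
    # Deviation of every month from the evolving decadal-mean state.
    deviation = eust - decadal[None, :, None]
    return deviation >= 2.0 * sigma, decadal, sigma
```

Summing the returned boolean mask over a period gives the number of 2\(\sigma\) events in that period, so the threshold moves with the decadal mean while its width stays fixed at the reference-period value.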

The variables representing the main driving mechanisms of European summertime heat events are defined as monthly mean anomalies of Z500, SLP, SM and ET over land-only grid cells in the same domain. The effect of the thermal expansion of the lower troposphere under global warming is removed by detrending Z500 anomalies at each grid cell, subtracting the ensemble mean Z500 averaged over the domain for each time step. Since the subtracted field is spatially constant, the spatial patterns of Z500 that define large-scale dynamics are not affected by this procedure (Cattiaux et al. 2013). SM is defined as the fraction of water accumulated in the root zone of the soil versus the maximum water storage capacity for each grid cell. Lastly, ET is defined as the flux of water going from the soil and vegetation into the atmosphere; positive ET values indicate net gain of water in the atmosphere and net loss in the soil. All anomalies are calculated with respect to the climatological period of 1961–1990.
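The Z500 detrending step described above can be illustrated as follows. This is a sketch only: the function name `remove_domain_mean_z500` and the array layout (member, time, lat, lon) are hypothetical, and the domain average is taken without area weighting, which is an assumption since the weighting is not specified in the text.

```python
import numpy as np

def remove_domain_mean_z500(z500):
    """Subtract the ensemble-mean, domain-mean Z500 at each time step.

    z500 : array (member, time, lat, lon) of Z500 anomalies over the domain;
           grid cells outside the land mask may be NaN.
    """
    # Ensemble mean at every time step and grid cell.
    ens_mean = z500.mean(axis=0)
    # Unweighted domain average (an assumption): one value per time step.
    domain_mean = np.nanmean(ens_mean, axis=(1, 2))
    # The subtracted field is spatially constant, so gradients are unchanged.
    return z500 - domain_mean[None, :, None, None]
```

Because the same value is removed from every grid cell at a given time step, differences between grid cells, and therefore the spatial patterns, are preserved.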

We use these variables, as well as global mean 2 m surface air temperature (GMST), as predictors in a stepwise multiple regression model to account for the variability in large samples of extreme 2\(\sigma\) EuST events simulated by MPI-GE. Our multiple regression model consists of several steps, starting with a forward selection of variables as predictors ranked by their individual correlation to extreme EuSTs, followed by a backward elimination of redundant predictors, accounting for multicollinearity, overfitting, and non-significance (Storch and Zwiers 1999). In the first step, we rank all variables in decreasing order of correlation to extreme EuSTs for each grid cell, and select the variable with the highest correlation as the first predictor in the regression model. In the next step, the variable with the next highest correlation is evaluated for multicollinearity, and it is selected as a predictor for the regression model only if the multiple correlation coefficient between the considered variable and any of the predictors already introduced in the model is below 0.95, corresponding to a variance inflation factor (VIF) of approximately 10 (O’Brien 2007). Once each new variable is added to the regression model, we evaluate whether the addition improves the model significantly. If the p-value of the newly added predictor is above 0.05, or the addition of the new variable does not reduce the fraction of unexplained variance compared to the step before, the variable is again eliminated from the model. If, on the contrary, the new predictor is significant and reduces the unexplained variance, then the remaining predictors in the model are tested for significance, and removed from the model if their p-values are above the 0.05 threshold. This forward-selection backward-elimination procedure is repeated until no other variables can be added to the regression model.
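The forward-selection backward-elimination procedure described above could be sketched as below for a single grid cell. This is an illustrative reading of the text, not the authors' implementation. The names `ols` and `stepwise` are hypothetical, the multicollinearity check regresses each candidate on all predictors already selected, and p-values use a normal approximation to the t distribution, which assumes a large sample. For reference, with VIF = 1/(1 − R²), a multiple correlation of R = 0.95 gives a VIF of about 10.3.

```python
import math
import numpy as np

def ols(X, y):
    """Least-squares fit with intercept.

    Returns slopes, their two-sided p-values, and the unexplained variance fraction.
    """
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = len(y) - A.shape[1]
    cov = resid @ resid / dof * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    # Normal approximation (assumption); an exact test would use Student's t.
    p = np.array([math.erfc(abs(ti) / math.sqrt(2)) for ti in t])
    unexplained = resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return beta[1:], p[1:], unexplained

def stepwise(X, y, names, r_max=0.95, alpha=0.05):
    """Return the names of the predictors retained by the stepwise procedure."""
    # Rank candidates by absolute correlation with the target.
    order = sorted(range(X.shape[1]),
                   key=lambda j: -abs(np.corrcoef(X[:, j], y)[0, 1]))
    selected, unexplained = [], 1.0
    for j in order:
        if selected:
            # Multiple correlation of the candidate with the selected predictors.
            _, _, u = ols(X[:, selected], X[:, j])
            if math.sqrt(max(0.0, 1.0 - u)) >= r_max:
                continue  # too collinear with the current model
        trial = selected + [j]
        _, p, u = ols(X[:, trial], y)
        if p[-1] > alpha or u >= unexplained:
            continue  # the new predictor does not improve the model
        # Backward elimination: drop predictors that are no longer significant.
        selected = [k for k, pk in zip(trial, p) if pk <= alpha]
        _, _, unexplained = ols(X[:, selected], y)
    return [names[k] for k in selected]
```

In a gridded analysis, `stepwise` would be applied independently at each grid cell, with the columns of `X` holding the local Z500, SLP, SM and ET anomalies and GMST for the sample of extreme events.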

3 Results

3.1 Changes in European heat extremes

Under the moderate warming scenario RCP4.5, corresponding in MPI-GE to a global warming level of 2.25 \({}^{\circ }\)C above preindustrial conditions by the end of the twenty-first century (Suarez-Gutierrez et al. 2018), the MPI-GE simulates an average increase in European summer temperatures (EuST) of almost 3 \({}^{\circ }\)C compared to twentieth century conditions (Fig. 1a). MPI-GE projects that, by the end of the twenty-first century, the average European summer month will be comparable to the warmest month observed in this region during the 2010 European heatwave. Furthermore, temperature anomalies during the most extreme summer months could reach values twice as large as those recorded in 2010. MPI-GE offers an adequate representation of the observed estimate of internal variability and of the frequency and amplitude of extreme European summer temperature events (Suarez-Gutierrez et al. 2018). The large size of MPI-GE makes it capable of simulating events as extreme as the most extreme European summer temperatures on record, unlike other large ensemble experiments with fewer members (Schaller et al. 2018). Some of the simulated events exhibit even substantially larger EuST deviations from the decadal mean state than those observed. However, these simulated events showcasing large EuST deviations have return periods of over hundreds of years, and the observational record may just be too short to determine whether or not the ensemble overestimates the amplitude of very extreme events.

Fig. 1

Change in European heat extremes with respect to the evolving mean climate. a Time series of simulated EuST anomalies for the period of 1850–2099. MPI-GE simulations (orange) are compared to CRUTEM4 observations of EuST anomalies (black and white crosses) for the period 1850–2017. Extreme EuST 2\(\sigma\) events (orange dots) are defined as simulated EuST anomalies that deviate from the decadal mean climate state by more than two EuST pre-industrial standard deviations (moving threshold; dashed red line). The decadal mean climate state is defined by the 10-year running ensemble mean (thick red line). The CRUTEM4 EuST observed anomalies that exceed the 2\(\sigma\) moving threshold are highlighted in black. b Relative change in number of local 2\(\sigma\) extreme EuST events with respect to the mean climate during the twenty-first century (2000–2099) relative to the twentieth century (1900–1999). c Relative change in variability based on change in EuST probability density distribution width (2.5th–97.5th percentiles) for late twenty-first century (2070–2099) compared to early twentieth century (1900–1929) for each grid cell. Stippling shows significance as late twenty-first century PDF widths larger (or smaller) than all the possible 30-year PDFs in the twentieth century. Simulations are historical runs for the period 1850–2005 and RCP4.5 runs for the period 2006–2099 from the MPI-GE. All anomalies are calculated with respect to the period of 1961–1990. Spatially averaged temperatures are calculated for the land points in the [35–68\({}^{\circ }\)N, 10\({}^{\circ }\)W–50\({}^{\circ }\)E] domain.

The novel moving-threshold definition that evolves with the changing decadal climate to characterize 2\(\sigma\) extreme events in the ensemble simulations is illustrated in Fig. 1a. These 2\(\sigma\) events correspond to months with spatially averaged EuST anomalies that deviate from the decadal mean climate state by more than two EuST pre-industrial standard deviations. Such events occur in the hundred MPI-GE simulations a total of 860 times during the twentieth century, and increase by 70% to 1483 extremes during the twenty-first century (Table 1). Extreme events appear to occur with decreased frequency in the second half of the twentieth century both in MPI-GE and observations. Although this decrease might occur due to a variety of factors, one factor influencing it is the climatological baseline period used to calculate anomalies. Using a climatological average in the late twentieth century, such as the period of 1961–1990, to calculate anomalies leads to approximately 25% fewer extremes in MPI-GE during the second half of the twentieth century compared to the two previous 50-year periods (Table 1). In contrast, using a pre-industrial climatological average, such as the period of 1851–1880, leads to a homogeneous number of extremes during the nineteenth and twentieth century, and to a slightly lower relative increase in extremes during the twenty-first century of 55% (Table 1). Although the choice of 1961–1990 as climatology period can inflate the relative increase in extreme events, we maintain this period as climatological baseline to facilitate the comparison to observations.

Observed EuST anomalies based on CRUTEM4 observations, also with respect to the 1961–1990 climatological baseline, occur 11 times above the 2\(\sigma\) threshold during the second half of the nineteenth century, 8 times during the twentieth century, and twice during the twenty-first century until 2017. 2\(\sigma\) extremes appear to occur with comparable frequencies in MPI-GE and the observational record across the different periods. However, this comparison in the number of observed and simulated 2\(\sigma\) events is based on the 2\(\sigma\) threshold definition drawn from MPI-GE simulations. This is because the calculation of the 2\(\sigma\) threshold relies on an accurate and well-defined evolving mean climate state and full characterization of internal variability that is not available for observations.

Table 1 Number of 2\(\sigma\) extremes for different climatological periods

The increase in extreme events with respect to the evolving decadal mean climate does not occur homogeneously over Europe. We find a maximum increase in the number of 2\(\sigma\) extreme events of more than 100% over Central Eastern Europe, accompanied by a lower increase of 50–100% over most of the central part of the continent. By contrast, some Southern and Northern European regions show a moderate decrease in extreme events with respect to mean climate conditions, of less than 25% (Fig. 1b). But not only do extreme events with EuSTs above the 2\(\sigma\) threshold occur more often over most of Europe in the twenty-first century; these events also exhibit larger deviations from the mean decadal climate than those under twentieth century conditions, due to an increase in EuST variability. The change in variability, illustrated by changes in the width of the EuST distribution, exhibits a similar pattern to the change in the number of extreme events (Fig. 1c). By the end of the twenty-first century, we find a maximum increase in summer temperature variability of 35% over Central Eastern Europe, while other Southern and Northern regions present no substantial change to a slight decrease compared with early twentieth century conditions. The decrease in the number of extreme heat events with respect to the evolving mean climate over these regions (Fig. 1b) is accompanied by a decrease in temperature variability (Fig. 1c). In some regions, this could occur as a result of consistently drier conditions over these areas in the future, which in turn result in decreased variability in moisture availability.
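The variability measure used for Fig. 1c, the width of the EuST distribution between the 2.5th and 97.5th percentiles, can be illustrated with a short sketch. The function names `pdf_width` and `relative_width_change` are hypothetical, and pooling all ensemble members and years of a 30-year period into one sample per grid cell is an assumption about how the distribution is built.

```python
import numpy as np

def pdf_width(samples, lower=2.5, upper=97.5):
    """Width of the distribution between two percentiles."""
    lo, hi = np.percentile(samples, [lower, upper])
    return hi - lo

def relative_width_change(reference, future):
    """Relative change in distribution width, in percent."""
    return 100.0 * (pdf_width(future) / pdf_width(reference) - 1.0)
```

Here `reference` would hold the pooled EuST anomalies for 1900–1929 and `future` those for 2070–2099 at one grid cell. A uniform shift of the distribution leaves the result unchanged, so only a change in spread is measured.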

Using MPI-GE we find variability changes that are comparatively smaller and constricted to smaller regions than those found from standard deviation change assessments in multi-model ensembles (Schär et al. 2004; Fischer and Schär 2009; Fischer et al. 2012). Whereas analyses of variability changes based on multi-model ensembles reflect intermodel differences and climate sensitivity uncertainty, as well as internal variability changes, our large-ensemble-based analysis shows variability changes due to internal variability change alone. Additionally, these previous studies were based on higher forcing scenarios more comparable to RCP8.5 than to RCP4.5 (namely SRES A2 and A1B). Also, these studies use the same climatology period to calculate temperature anomalies, 1961–1990, as the reference period to calculate changes in variability. We find that this leads to an underestimation of the variability in the reference period (Table 1, Fig. 1a, c and Supplementary Information SI Fig. S.1). To avoid this artificial increase in variability we use the period of 1900–1929 as reference for variability changes, and the period of 1961–1990 as climatological baseline. As long as the reference period does not overlap with the period used to define the climatology baseline, choosing a different 30-year variability reference period in the twentieth century does not substantially alter our results (SI Fig. S.1).

3.2 Drivers of variability in European heat extremes

To understand the background conditions that lead to heat extremes and which mechanisms dominate the variability of extreme summer temperatures over Europe, we develop a multiple regression model based on the variables describing the large-scale dynamic atmospheric state (Z500, SLP) and local thermodynamical effects of moisture availability (SM, ET), as well as the annual global mean surface temperature (GMST), as predictors of extreme EuSTs. Using this novel approach, we can account for up to 90% of the variability in extreme EuST (not shown). Figure 2 shows the results of our multiple regression analysis, with the point-to-point standardized regression coefficients for each of these predictors. The standardized regression coefficient refers to the power of each predictor to affect EuSTs, and illustrates the change in EuST in standard deviations (\(\sigma\)) driven by a change of one standard deviation in the predictor when all other predictors are held constant.
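To make the meaning of a standardized regression coefficient concrete, the sketch below standardizes the predictors and the target before fitting, so that each coefficient is expressed in standard deviations of EuST per standard deviation of the predictor. The function name `standardized_coefficients` is hypothetical, and this is a generic illustration rather than the exact computation behind Fig. 2.

```python
import numpy as np

def standardized_coefficients(X, y):
    """Regression coefficients after z-scoring predictors and target.

    X : array (sample, predictor), y : array (sample,)
    """
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # No intercept is needed because all variables are centered.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta
```

Equivalently, each standardized coefficient equals the raw coefficient multiplied by the ratio of the predictor's standard deviation to that of the target, which is why coefficients for variables with very different units, such as Z500 and SM, can be compared directly.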

We find that Z500 is the multiple regression predictor that presents the strongest relation to extreme summer temperatures, with 1\(\sigma\) deviations in Z500 driving above 1.2\(\sigma\) deviations in temperature over Northern Europe and slightly less southward. The decrease over Southern Europe may arise from the competing effects of thermodynamical drivers over this region; however it may also arise from the fact that the blocking centers over southern Europe are not necessarily colocated over the center of maximum temperatures, but are shifted westwards due to advection of warm air (Pfahl 2014), decreasing the Z500-EuST point-to-point regression coefficient in this region. The high Z500-EuST regression coefficient, combined with the fact that geopotential height at the 500 hPa level is the predictor most independent of surface temperature conditions, indicates that large Z500 anomalies and their associated persistent blocking events are the main individual driver of extreme summer temperature variability over Europe in MPI-GE.

In contrast, the other atmospheric variable considered, SLP, presents a regression coefficient that is slightly lower than for Z500, and negative. Although the notion that temperatures are higher where SLP is lower may appear counterintuitive, it relates to the fact that the peak temperatures are generally not reached where the centre of the high pressure system is located, but rather westwards from this point, where advection of warm air is strongest. In this way, although high-pressure systems and their associated anticyclonic conditions are the drivers of high summer temperatures, the highest temperature anomalies would tend not to occur over the points with the highest SLP anomalies, causing this negative relation. Additionally, SLP does not remain as unaffected by the surface conditions as Z500, and part of this negative relation may also appear as a response to the high surface temperatures causing a local low-pressure area. This may occur over regions where intense surface temperatures cause the heating of the air in the lower atmosphere above, resulting in air that is less dense than the air in the neighboring regions and tends to rise, generating a local low-pressure area or thermal low. Our monthly-mean based analysis cannot completely disentangle the driver and response relations between SLP and high temperatures. However, the fraction of explained EuST variance drops substantially when SLP is excluded from the multiple regression model, demonstrating that SLP plays a relevant role in characterizing the background conditions leading to heat extremes, and in explaining part of the EuST variability. Our results indicate that the colocated dynamical mechanisms combining the effects of Z500 and SLP are the dominating driver of extreme European summer temperature variability.

The local thermodynamical effects of moisture limitation also play a significant, albeit smaller, role in explaining extreme summer temperature variability. SM exhibits a negative regression coefficient that is significant mainly over Northern-Central Europe, where a 1\(\sigma\) decrease in SM corresponds to a maximum increase in temperature of 0.5\(\sigma\). However, as in the case of SLP, it is not directly clear whether the SM-EuST relation indicates limited moisture availability as a driver of extreme temperatures, or whether soil moisture becomes limited as a response to the high temperatures. Limited moisture availability can therefore be considered a driver of high temperatures only when evapotranspiration remains low, and a response otherwise, although it remains challenging to disentangle these two processes on monthly timescales.

Fig. 2

Standardized regression coefficients from multiple regression analysis. Point-to-point standardized regression coefficients between 2\(\sigma\) extreme EuST and different colocated drivers from the multiple regression analysis for the twentieth century (left column) and for the twenty-first century (right column). Hatching represents regions where the variable is excluded from the regression model either because its contribution is not significant, or because it exhibits too high multicollinearity with the remaining predictors in the model

The distinction between driver and response to high temperatures is more straightforward in the case of ET. The negative ET-EuST relation is slightly larger than for SM, and significant only over Southern Europe, where limited moisture conditions are commonplace. We find that a 1\(\sigma\) negative anomaly in ET leads to an increase in temperature of 0.25\(\sigma\) to 1\(\sigma\). This negative ET-EuST relation indicates that less water is evaporated into the atmosphere when high temperatures occur. This process can only be triggered by limited moisture availability that limits the amount of evapotranspiration. Under these limited-moisture conditions, less radiative energy is transformed into latent heat, increasing the amount of energy available for surface heating and increasing temperature, indicating ET as a driver of extreme EuSTs over this region. By contrast, in the case of the smaller positive ET-EuST relation over Northern Europe, the typically large moisture reservoirs in the soil and vegetation over this region result in larger positive ET anomalies that occur as a response to high temperatures.

The regression coefficients of some of the drivers show some differences in their pattern and strength between the twentieth and twenty-first centuries, indicating some changes in the effect that these drivers have on extreme temperatures under further global warming (Fig. 2, left vs. right columns). Although the effect of the dynamical variables on extreme temperatures remains dominant, we find a slight weakening of this effect in the warmer world of the twenty-first century, showcased by a decrease in the Z500 and SLP regression coefficients, particularly over Southern Europe. Additionally, we find a slight northward expansion of the area where evapotranspiration acts as a significant driver.

Lastly, we include GMST as a predictor to account for the global warming trend, and to avoid a spurious increase in the regression coefficients of predictors that also exhibit a similar trend. GMST presents a slightly positive relation to extreme EuSTs, which increases around twofold during the twenty-first century. This increase derives from EuSTs closely following and contributing to the increasing trend in global temperatures, but we have found no indication that anomalously warm years globally result in higher European summer temperatures (Suarez-Gutierrez et al. 2018).

Using the multiple regression model with both sets of drivers, the fraction of unexplained variance remains mostly under 0.1, indicating that the full regression model can explain ca. 90% of the EuST variability. By contrast, we find that performing the same multiple regression analysis with only dynamical or only thermodynamical drivers leads to a substantial decrease in explained variance, and more importantly, to an overestimation of the independent effect of each set of drivers on extreme temperatures. Considering only the dynamical drivers Z500 and SLP as predictors in the multiple regression model leads to higher regression coefficients, in particular for SLP, while explaining only around 60% of the EuST variability (Fig. 3, left column). On the other hand, considering only the thermodynamical drivers SM and ET as predictors in the multiple regression model leads to a more than twofold overestimation of their regression coefficients, as well as of the area of significance of their effect on temperature, while only explaining about 25% of the EuST variability (Fig. 3, right column). These results highlight the importance of considering both dynamical and thermodynamical sources of variability simultaneously in order to understand which driving mechanisms dominate the variability in EuSTs, and to account for the effect of multicollinearity between the different drivers.
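The overestimation that arises when one set of drivers is omitted can be illustrated with a minimal sketch on synthetic data for a single grid cell. The variable names (`dyn`, `thermo`, `temp`), the coefficients and the noise levels are illustrative assumptions and are not taken from MPI-GE; the sketch only shows how standardized coefficients and the fraction of unexplained variance could be computed, and how a predictor correlated with an omitted driver absorbs part of its effect.

```python
import numpy as np

def standardize(x):
    """Return x as anomalies in units of its own standard deviation."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def standardized_regression(y, X):
    """Least-squares fit of standardized y on the standardized columns of X.

    Returns the standardized coefficients and the fraction of
    unexplained variance (residual variance / total variance).
    """
    ys, Xs = standardize(y), standardize(X)
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    residual = ys - Xs @ beta
    return beta, residual.var() / ys.var()

# Synthetic single-grid-cell sample (assumed numbers, for illustration only).
rng = np.random.default_rng(0)
n = 5000
dyn = rng.standard_normal(n)                          # stand-in dynamical driver
thermo = -0.6 * dyn + 0.8 * rng.standard_normal(n)    # correlated thermodynamical driver
temp = 1.0 * dyn - 0.3 * thermo + 0.3 * rng.standard_normal(n)

# Full model: both drivers considered simultaneously.
beta_full, unexplained_full = standardized_regression(
    temp, np.column_stack([dyn, thermo])
)
# Reduced model: thermodynamical driver only.
beta_thermo, unexplained_thermo = standardized_regression(temp, thermo[:, None])

print("full model coefficients (dyn, thermo):", beta_full)
print("thermo-only coefficient:", beta_thermo)
print("unexplained variance (full, thermo-only):", unexplained_full, unexplained_thermo)
```

In this synthetic setting the thermodynamical coefficient is larger in magnitude in the reduced model than in the full model, while the reduced model leaves more variance unexplained, which mirrors qualitatively the behavior described above.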

Our findings show that the colocated dynamical atmospheric conditions combining the effects of geopotential height at 500 hPa and sea level pressure are the main driving mechanism of extreme European summer temperature variability. Although their effect is slightly weaker during the twenty-first century, we find that the dynamical atmospheric drivers are the dominating factor controlling the development of extreme summertime heat over Europe, both under past and future climate conditions. By contrast, local thermodynamical effects caused by limited moisture availability, particularly relevant over Southern and Central Europe, play a secondary role in accounting for extreme temperature variability.

Fig. 3

Regression coefficients from multiple regression analysis for each set of drivers. Point-to-point standardized regression coefficients for the twenty-first century from multiple regression analysis between 2\(\sigma\) extreme EuST and colocated dynamical drivers only (left column) and between 2\(\sigma\) extreme EuST and local thermodynamical drivers only (right column). Hatching represents regions where the variable is excluded from the regression model either because its contribution is not significant, or because it exhibits too high multicollinearity with the remaining predictors in the model. The last row represents the fraction of unexplained variance in EuST remaining for each multiple regression model

3.3 Drivers of variability change in European heat extremes

Using the multiple regression analysis in the previous section we quantify the effect of each driver on the total variability in large samples of EuST extremes. In this section, we evaluate how the specific background conditions of extreme heat events change under warming, to explain which driving mechanisms control the increase in extreme temperature variability and the 70% increase in heat extremes with respect to the changing mean climate during the twenty-first century. To do this, we first evaluate how often extreme temperature events occur under extreme atmospheric conditions as opposed to extreme moisture limitation, in past compared to future climate conditions.

Extreme 2\(\sigma\) EuST events occur most often, in up to 75% of the cases, accompanied by extreme Z500 anomalies, particularly over Northern Europe (Fig. 4, top row). During the twenty-first century this frequency decreases to under 25% in most of Central Europe, while increasing in some Mediterranean regions. In contrast, we find that during the twentieth century fewer than 25% of the extreme EuST events occur under limited moisture conditions of low SM and ET, a fraction that increases to up to 50% by the end of the twenty-first century (Fig. 4, bottom row). The area of this maximum increase corresponds with the area where we find the largest increase in temperature extremes and variability, and also with the high moisture-variability transition zone between dry climate conditions to the south, where commonplace moisture limitation constrains evapotranspiration, and wetter climate conditions to the north, with large moisture reservoirs and where evapotranspiration is limited by radiation (Koster et al. 2009; Fischer et al. 2012). We find that although most extreme EuST events develop under extreme Z500 anomalies, an increasing number of heat extremes develop under neutral atmospheric conditions in the twenty-first century, while up to 40% more of the extremes develop under extreme moisture limitation. The occurrence of these neutral-atmosphere, moisture-driven extremes increases with time, and is most marked in the second half of the twenty-first century (SI Fig. S.4).
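The co-occurrence frequencies discussed above could be computed along the following lines. This is a minimal sketch under stated assumptions: the function names and the toy numbers are hypothetical, the temperature-extreme mask is taken as given, and driver extremes are flagged as deviations of more than two standard deviations from a baseline period, in the spirit of the preindustrial reference used for Z500, SM and ET.

```python
import numpy as np

def extreme_mask(values, baseline, n_sigma=2.0, sign=+1):
    """Flag values deviating by more than n_sigma baseline standard
    deviations from the baseline mean, in the direction given by sign
    (+1 for high extremes such as Z500, -1 for low extremes such as ET or SM)."""
    anomaly = (values - baseline.mean()) / baseline.std()
    return sign * anomaly > n_sigma

def cooccurrence_percentage(temp_extreme, driver_extreme):
    """Percentage of temperature extremes that coincide with a driver extreme."""
    n_events = temp_extreme.sum()
    if n_events == 0:
        return np.nan
    return 100.0 * (temp_extreme & driver_extreme).sum() / n_events

# Toy single-grid-cell example (assumed numbers, for illustration only).
baseline_z500 = np.array([-1.0, 0.0, 1.0])            # stand-in baseline sample
z500 = np.array([2.5, 0.3, 1.9, -0.2, 3.0, 0.0])      # stand-in monthly values
temp_extreme = np.array([True, True, True, True, False, False])

z500_extreme = extreme_mask(z500, baseline_z500, n_sigma=2.0, sign=+1)
share = cooccurrence_percentage(temp_extreme, z500_extreme)
print(share)
```

For the moisture-limitation case, the same helper would be called with `sign=-1` for both ET and SM and the two resulting masks combined with a logical AND before computing the percentage.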

Fig. 4

Extreme temperatures under extreme atmospheric or moisture conditions. Percentage of extreme 2\(\sigma\) EuST events that exhibit simultaneous favorable 2\(\sigma\) extreme Z500 positive anomalies (top row) and simultaneous favorable 2\(\sigma\) extreme negative ET and SM anomalies (bottom row) during the twentieth century (1900–1999; left column), during the twenty-first century (2000–2099; middle column), and the difference in this frequency, twenty-first century minus twentieth century (right column). Z500, SM and ET extremes are also defined as anomalies that deviate by more than two standard deviations from the average preindustrial conditions in each driver, defined for the period 1850–1899

Lastly, we analyze the specific atmospheric and moisture conditions during extreme EuST events at grid cell level over the regions with large increases in EuST extremes (Fig. 5). We find that the most extreme summer temperatures develop when both persistent anticyclonic conditions and dryness occur, illustrated by high Z500 and low ET values respectively. These results indicate that extreme temperature events with respect to the mean climate state become more intense during the twenty-first century under both extreme atmospheric conditions and limited moisture. However, the increase of points in the lower half of Fig. 5b indicates that heat extremes start to develop more frequently as a result of moisture limitation, even under neutral or unfavorable atmospheric conditions in the second half of the twenty-first century (SI Fig. S.5). Thus, our results confirm the local thermodynamic effect of moisture limitation as the main driver of the increasing 2\(\sigma\) summer temperature extremes over Europe.

Fig. 5

Atmospheric and moisture conditions during extreme temperature events. Z500 against ET absolute values during EuST extreme events for the early twentieth century, period 1900–1929 (a), and for the late twenty-first century, period 2070–2099 (b), for each grid cell with an increase in extreme EuST events during the twenty-first century of 50% or larger in the [35–68\({}^{\circ }\)N, 10\({}^{\circ }\)W–50\({}^{\circ }\)E] domain. Each point represents one extreme event at grid cell level, defined for EuST anomalies larger than the 2\(\sigma\) threshold for the given grid cell. The color gradient represents the EuST anomaly for each event. Z500 and ET represent absolute values, with the thermal expansion effect removed from Z500

4 Discussion

Our analysis is based on monthly mean values due to the temporal resolution limitations in MPI-GE. Because the key characteristic of the anticyclonic or dry conditions that act as driving mechanisms of heat extremes is their persistence, their signal is still clearly identifiable in monthly values. However, we would ideally use hourly to daily values to capture the amplitude and duration of extremes more precisely, and to directly separate the specific conditions causing heat extremes from those occurring as a result of them. This is particularly relevant for mechanisms regarding sea level pressure and soil moisture, which can both act as a driver of extremes and result as a response to high temperatures. Although our results agree with the theoretical understanding of how heat extremes develop, it remains challenging to disentangle cause and effect processes on monthly timescales.

The distinction between cause and effect can be partly achieved by applying a multiple regression model that includes all potential sources of variability and that is based on large enough samples of heat extremes. This method quantifies the internal variability in the response of temperature to different background conditions, and can to some extent separate conditions that are necessary to the development of extremes from those that may be circumstantial. Using a multiple regression analysis also allows us to account for the multicollinearity effect of different drivers on each other. This multicollinearity refers to one or more of the drivers, in addition to having an effect on temperature, also having an intensifying or counteracting effect on the other drivers. Multicollinearity is inherent to the highly complex climate system and cannot be simply removed, but its effect on our analysis can be reduced. To avoid an inflation of the explained variability due to high multicollinearity we exclude from our multiple regression model variables with a variance inflation factor of more than 10. More conservative multicollinearity thresholds may eliminate relevant variables from the model and lead to spurious or less robust results, but do not substantially influence our conclusions (SI Fig. S.2).
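A screening of predictors by variance inflation factor (VIF) can be sketched as follows. The VIF of a predictor is \(1/(1-R^2)\), where \(R^2\) is obtained by regressing that predictor on all remaining predictors. The iterative removal of the predictor with the largest VIF is one common procedure and is an assumption here, as are the function names and the synthetic predictors; the exact exclusion procedure applied at each grid cell may differ.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X: 1 / (1 - R^2) from regressing that
    column on all remaining columns (with an intercept)."""
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residual = target - others @ coef
        r2 = 1.0 - residual.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

def drop_collinear(X, names, threshold=10.0):
    """Iteratively drop the predictor with the largest VIF until all
    remaining VIFs are at or below the threshold (assumed procedure)."""
    names = list(names)
    while X.shape[1] > 1:
        vifs = variance_inflation_factors(X)
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

# Synthetic predictors (illustrative only): the third is almost a linear
# combination of the first two, so it carries little independent information.
rng = np.random.default_rng(1)
n = 2000
a = rng.standard_normal(n)
b = rng.standard_normal(n)
c = 0.7 * a + 0.7 * b + 0.1 * rng.standard_normal(n)
X = np.column_stack([a, b, c])

X_kept, kept = drop_collinear(X, ["A", "B", "C"], threshold=10.0)
print(kept)
```

With a threshold of 10, a predictor is removed only when more than 90% of its variance is explained by the other predictors, which is why stricter thresholds can start to eliminate physically relevant variables.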

To achieve the best combination of predictors for our multiple regression model, we evaluate several other variables representing the main driving mechanisms. For representing the colocated dynamic drivers we additionally evaluate 850 hPa geopotential height (Z850) and North Atlantic (NA) jet stream position. First, Z850 presents a strong relation with EuST in MPI-GE; however, it also exhibits high correlation with Z500 and SLP. This high multicollinearity with the other atmospheric variables leads to the exclusion of Z850 from the multiple regression model in the majority of the domain when Z500 and SLP are included as predictors. By contrast, excluding SLP and, especially, Z500 to include Z850 instead results in a substantial decrease in explained EuST variance. Second, the NA jet stream position, represented as the latitude where the monthly-average zonal wind maxima occur, exhibits only a minor relation to extreme EuSTs in MPI-GE. Including NA jet stream position as a predictor in the model has only a minor effect on the explained EuST variance. Other variables that could potentially represent the dynamical driving mechanisms of extreme EuST are downwelling radiation and vertical velocity below the 500 hPa level. However, in addition to its effect on surface temperatures, downwelling radiation also affects, and can be affected via cloud-cover changes by, thermodynamical drivers such as evapotranspiration. Similarly, the lower atmosphere vertical velocity can also be locally affected by the moisture conditions. These aspects make the separation of causality and of dynamical and thermodynamical effects with these variables less straightforward.

For representing the effects of moisture limitation, in addition to concurrent SM and ET during the extreme month, we also consider soil moisture fraction in the month preceding the extremes as a predictor. However, similarly to the case of Z850, this lagged predictor exhibits too high collinearity with SM in the extreme month, while explaining less of the EuST variance. When only the SM in the month previous to the extreme is included as a predictor, this predictor exhibits a weaker relation with temperature (SI Fig. S.3). This relation, albeit weaker than when considering concurrent SM, is crucial to further disentangle cause and consequence in this mechanism, because it arises from the effect that soil dryness has as a driver of extreme temperatures one month in advance, excluding the direct drying of the soil that results as a consequence of extreme temperatures. While removing the confounding effect of concurrent extreme temperatures on soil drying, using SM during the previous month as a predictor also introduces a lack of information in the multiple regression model about the moisture conditions during the month of the extreme. This lack of characterization of the moisture state does not allow the multiple regression model to distinguish whether low ET is limited by low SM, or rather occurs under high SM conditions and is instead limited by incoming energy, and in turn results in an underestimation of the effect of ET on temperature and less explained temperature variance (SI Fig. S.3).

Another potential shortcoming of our approach is the limitation to point-to-point colocated relations in the multiple regression analysis, which may lead to an underestimation of non-local effects, particularly relevant for the large-scale dynamical drivers. The effect of this requirement of colocation is to some extent counteracted by the use of monthly averages of the dynamic atmospheric conditions as predictors. However, it is possible that the contribution from colocated dynamical drivers, that we find dominates extreme European summer temperature variability, is still underestimated in our study due to an underrepresentation of their non-local contribution to extreme temperatures.

Our results highlight the relevant role that internal variability plays in the development of extreme heat. To quantify the role of the main mechanisms driving extreme summer temperatures, we need to understand to what extent internal variability influences the development of heat extremes under different background conditions in both current and future climates. Large-ensemble experiments such as MPI-GE are a great tool for this purpose, because they provide large samples of extreme events that develop under different background conditions, but under the same external forcings and model physics. Basing our analysis on a large ensemble of simulations with the same climate model implies, however, that the results and conclusions drawn from this analysis may be subject to the flaws and uncertainties inherent to any single-model study. Our conclusions on how different driving mechanisms contribute to heat extremes and changes in temperature variability are characterized by how the different driving mechanisms perform and affect temperatures in MPI-ESM. This may occur differently in different climate models, and there is valuable knowledge to be gained from repeating this analysis on other large-ensemble experiments.

In particular, the thermodynamic effect of moisture availability as a driving mechanism of simulated heat extremes is affected by the soil hydrology scheme included in the model. Although the five-layer scheme in MPI-GE represents an improvement with respect to previous versions of MPI-ESM, biases in soil moisture memory may remain. The location of the transition zone between wet and dry climates may also vary in different models, causing differences in where the maximum increase in 2\(\sigma\) extremes occurs as a response to high moisture variability leading to extremely dry conditions. However, our conclusions are supported by robust evidence of a trend towards drier summer conditions over Europe (Briffa et al. 2009; Ruosteenoja et al. 2018; Spinoni et al. 2018), and of a poleward shift in the transition zone, where moisture is highly variable, that can affect temperature variability under global warming (Seneviratne et al. 2006; Fischer et al. 2012).

Regarding the dynamical drivers, there is no clear evidence of whether the blocking and anticyclonic conditions that act as dynamical driving mechanisms of heat extremes will occur with more or less frequency in the future (Woollings et al. 2018). While some studies find significant positive trends in the frequency of anticyclonic circulations (Horton et al. 2015), most climate models show a decline in blocking conditions with relatively good agreement (Matsueda and Endo 2017; Woollings et al. 2018). However, the atmospheric circulation over Europe is not realistically captured by most climate models. Climate models tend to consistently underestimate the occurrence and persistence of blocking events compared to observed estimates (Davini and D’Andrea 2016), a tendency that is also shown by MPI-ESM (Müller et al. 2018). Despite these potential biases, the relationship between blocking and heat extremes is often well captured by models, particularly in large-ensemble experiments (Schaller et al. 2018). This indicates that, although the effect of blocking as a driver of heat extremes may be well captured in MPI-GE, a biased representation of blocking conditions or their future occurrence may cause biases in the frequency or the intensity of heat extremes in MPI-GE. If this were the case, more frequent blocking conditions could lead to a larger increase of extreme events under future warming, in addition to the increasing number of extremes that occur under unfavorable atmospheric conditions due to the effect of moisture limitation.

Regardless of these potential limitations in our study, we demonstrate that to obtain robust quantifications of the contributions from different drivers of heat extremes, we need to account for the complex multicollinearity between these driving mechanisms, and this can only be achieved by considering all sources of variability simultaneously. We also demonstrate that the main driver of variability in European summer temperatures may be different from the dominant driver of variability change, and that large samples of extreme events like those provided by large-ensemble experiments are crucial to detect and understand these changes. Lastly, we demonstrate that summertime heat extremes relative to the changing mean climate will become more frequent, and more intense. Furthermore, the number of summer months that exhibit these extreme heat anomalies and also concurrent extreme moisture limitation increases by up to 40% in Central and Southern Europe. These findings highlight that the adaptation and mitigation measures required to deal with extreme European summertime heat extend well beyond the adaptation to the mean European summer temperature increase.

5 Summary and conclusions

We disentangle the contributions from colocated dynamic atmospheric conditions and local thermodynamic effects of moisture limitation as drivers of variability in large samples of European summer heat extremes simulated by the 100-member MPI Grand Ensemble. We use a novel extreme event definition to account for a potential adaptation to the mean shift toward a warmer climate state, as well as for an increase in the number of extremes caused by changes in temperature variability. Based on this extreme event definition, we find a 70% increase in the number of heat extremes with respect to the evolving decadal climate during the twenty-first century. This extreme event characterization allows us to study changes in the number of extremes not only caused by the shift in the distribution towards higher temperatures, but also caused by changes in temperature variability that result from changes in the underlying driving mechanisms of heat extremes. The increase in extremes with respect to the evolving mean climate reaches maximum values of more than 100% over central Eastern Europe, where summer temperature variability increases by 35% in MPI-GE. In agreement with previous studies (Fischer and Schär 2009; Fischer et al. 2012; Bathiany et al. 2018), our results indicate that summer temperature variability increases under global warming over some parts of Europe. This increase in variability results in temperature extremes that present larger deviations from the mean conditions, thus contradicting the assumption that European summer extreme temperatures mostly follow mean summer warming (Ballester et al. 2009, 2010).

With a multiple regression approach we find that the colocated dynamical mechanism represented by 500 hPa geopotential height and sea level pressure is the main driver of variability in extreme European summer temperatures in MPI-GE, both in past and future climate conditions. In particular, we find that 500 hPa geopotential height is the dominating individual driver of extreme temperature variability, especially over Northern Europe. By contrast, local thermodynamic drivers play a secondary role in explaining total extreme temperature variability. Furthermore, we find that considering both sources of variability simultaneously is crucial to understand extreme temperature variability. Assessing the contribution from only one type of driver explains much less of the extreme temperature variability and leads to an overestimation of the effect of the drivers on extreme temperatures, particularly when considering only the local thermodynamical drivers.

We find that the dynamical atmospheric mechanisms that act as the dominant driver of extreme summer temperature variability are not the dominant driver of variability change. Both under past and future climate conditions, most European summer heat extremes occur under extreme 500 hPa geopotential height conditions, and the most extreme summer temperature anomalies develop when both persistent anticyclonic conditions and dryness occur. However, we find that heat extremes develop 40% more frequently under concurrent extreme moisture limitation during the twenty-first century, even under neutral or unfavorable atmospheric conditions. This tendency toward an increasing frequency of neutral-atmosphere, moisture-driven heat extremes confirms that the increase in European summer heat extremes with respect to the evolving decadal climate and the associated variability increase are driven by the local thermodynamic effect of moisture limitation.