1 Introduction

The frequency and intensity of heat waves or extremely warm summers are expected to increase as anthropogenically-driven global warming progresses (Coumou et al. 2013; Russo et al. 2014). Potential impacts of such high temperature events range from health effects (Anderson et al. 2015, accepted; Marsha et al. 2015, submitted) to drought occurrence (Towler et al. 2015, submitted) and agricultural losses (Battisti and Naylor 2009; Levis et al. 2015, submitted). In particular, projected changes in growing season temperature may result in potential hazards for food security, as they can offset mitigation benefits resulting from irrigation or technological advancements (Lobell et al. 2011). Motivated by the notion that present-day society and agricultural systems are vulnerable to record-breaking summer temperatures, Battisti and Naylor (2009) examined the probability that summers at the end of the 21st century will be warmer than the record to date. They showed that by the end of the 21st century under a scenario of continued rise in greenhouse gas (GHG) concentrations, the probability of exceeding the historical summer temperature record is >50 % for almost the entire land surface. Recent examples of fatal heat waves and food crises linked to record-breaking summer temperatures can therefore serve as case studies for a potential future norm. Here, we revisit this issue, using a new generation of climate model simulations and focusing on potential benefits from climate mitigation in light of different emission scenarios.

A common approach to exploring the range of potential future climate outcomes is to consider different GHG emissions scenarios. Contrasting those scenarios and highlighting their relative costs and benefits has become a standard practice for the depiction of policy-relevant climate science (IPCC 2013). In preparation for the latest Assessment Report of the Intergovernmental Panel on Climate Change, a subset of four scenarios, the so-called Representative Concentration Pathways (RCPs; van Vuuren et al. 2011), have been adopted for climate model simulations: these are RCP 2.6, RCP 4.5, RCP 6.0, and RCP 8.5, with the numbers indicating the estimated global radiative forcing at the end of the 21st century due to human activities. Underlying the RCPs are different socio-economic storylines: deployment of aggressive mitigation strategies and net negative GHG emissions in the case of RCP 2.6, or business-as-usual use of resources in RCP 8.5. The remaining scenarios, RCP 4.5 and RCP 6.0, describe moderate mitigation pathways.

Besides the scenario uncertainty described above, uncertainty in climate projections also arises from structural differences between models, as well as irreducible uncertainty from intrinsic variability of the climate system on multi-decadal time scales (Hawkins and Sutton 2009). Communicating these different types of uncertainty associated with climate change projections has become increasingly important (Deser et al. 2012a) and requires probabilistic risk assessments (Field et al. 2014). The National Center for Atmospheric Research’s project on Benefits of Reducing Anthropogenic Climate changE (BRACE; O’Neill and Gettelman 2015, in preparation) offers an opportunity to tackle some of these challenges by providing two large ensembles of simulations with the Community Earth System Model (CESM), under scenarios RCP 4.5 and RCP 8.5 (Kay et al. 2015; Sanderson et al. 2015).

The aim of this study is to quantify the probability that summers in the future (2061–2080) will be warmer than any experienced in the past roughly 100 years and to investigate how this risk differs for RCP 4.5 versus RCP 8.5. The large ensemble sizes (15 members for each scenario) allow us to calculate this probability in a robust manner within the physically consistent framework of one specific model. Thus, we account for the effects of intrinsic climate variability in the presence of forced climate change without confounding uncertainties due to model differences. Most models in the Coupled Model Intercomparison Project (CMIP) archive do not provide enough ensemble members to tackle this issue in a robust manner, and aggregation across different models prohibits a straightforward physical interpretation of the results (Battisti and Naylor 2009). Hence, we interpret risk in our study as a probability, conditional on the model physics being representative of reality. We also address the influence of model bias in simulating present-day temperature variability as well as the role of future changes in temperature variability (Schär et al. 2004; Huntingford et al. 2013) on the results. In these respects, our study expands on that of Battisti and Naylor (2009), who considered only one emissions scenario, folded in model structural uncertainties by using a multi-model approach, and did not consider that temperature variance may change with climate change.

The paper is structured as follows. A description of the model and experimental setup is presented in Section 2. In Section 3 we investigate the changing risk of record-breaking summer temperatures in the future, its dependence on emissions scenario, as well as its effects on population. In addition, Section 3 examines the importance of changes in temperature variance and model bias on the results. A summary follows in Section 4.

2 Data and methods

Similar to most studies in this special issue, we make use of two ensembles of simulations with the National Center for Atmospheric Research Community Earth System Model version 1 (hereafter CESM) at about 1° horizontal resolution (Hurrell et al. 2013). One ensemble consists of 30 members for the period 1920 to 2100 under historical forcing until 2005 and RCP 8.5 thereafter (Kay et al. 2015). The individual ensemble members differ only by round-off errors in their initial atmospheric temperatures (10⁻¹⁴ °C), and hence provide an ideal setup to characterize the role of internal variability. The second ensemble consists of branches from the first 15 members of the RCP 8.5 ensemble at the year 2006 using RCP 4.5 forcing from 2006 to 2080 (Sanderson et al. 2015). We focus on the period 1920–2080 common to both ensembles. In addition, we make use of a 1800-year control simulation with the same model version under perpetual preindustrial (1850) radiative conditions.

In this study we define “summer” at each location as the climatologically warmest 3-month season based on the reference period 1920–2014, the period common to both observations and model simulations. We do this separately for observations and the model, using quadratically-detrended data. For observations we use monthly means from the Berkeley Earth Surface Temperature (BEST) data set at 1° spatial resolution (Rohde et al. 2013). Similar results were obtained with other data sets (not shown). The similarity of the BEST and model grids allows us to bilinearly regrid the observations to the model grid with minimal error. The model shows good agreement with observations on the definition of summer at most locations; areas of disagreement occur mainly in the tropics, where the amplitude of the seasonal cycle is modest (Fig. S1). We define the warmest summer during 1920–2014 as the “historical record”, considering the first 15 members of each ensemble: that is, the record occurring over 1425 summers (95 years × 15 ensemble members). We then calculate the probability that a summer in a given 20-year period in the future, i.e., post-2014, is warmer than the historical record by counting the number of summers exceeding the historical record and dividing by the total number of summers (300: 20 years × 15 ensemble members). For most locations, the warmest 3-month period does not change in the future (Fig. S1). Therefore, we use the same months to define summer in the future as derived from the historical period. Throughout this study the word “probability” is used in the sense of a relative likelihood based on a single climate model, rather than the strictly statistical use of the word that does not allow for model errors.
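The counting procedure described above can be sketched in a few lines. This is a minimal illustration rather than the code used for the analysis: the function name `exceedance_probability` and the toy temperatures are assumptions, and in practice the inputs would be the 1425 historical and 300 future summer means at a single grid cell.

```python
def exceedance_probability(historical, future):
    """Fraction of future summers warmer than the warmest historical summer.

    historical: summer-mean temperatures pooled over years and ensemble members
    future:     summer-mean temperatures of a 20-year window, pooled likewise
    """
    record = max(historical)  # the "historical record"
    n_exceed = sum(1 for t in future if t > record)
    return n_exceed / len(future)


# Toy example with made-up values (degrees C) for one grid cell.
historical = [21.0, 22.4, 21.7, 23.1, 22.0]
future = [22.9, 23.5, 24.0, 23.0, 23.2, 24.4]
probability = exceedance_probability(historical, future)
print(probability)
```

Because the record is the maximum over all pooled members, a single unusually warm simulated summer raises the threshold for every future summer at that location.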

3 Results

a) Evolution of record-breaking temperatures

Validating the model’s ability to produce realistic extreme summer temperatures similar to observations is inherently difficult, as the limited duration of the observational record prohibits robust estimates of extreme values (even for seasonal means). During the European heat wave in 2003, for example, summer temperatures exceeded their 1920–2014 climatological mean by four standard deviations (not shown). Such extreme temperatures have a one in 15,787 chance of occurrence under a stable, normally distributed climate. The model equivalent of the observed period 1920 to 2014 represents only 1425 summers (95 years × 15 ensemble members = 1425 summers) and hence might still be too short to capture such outliers. Indeed, in this model ensemble, as well as in the 1800-year long control simulation, no extreme summer in Europe comparable to the one observed in 2003 occurs.
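The quoted odds for a four standard deviation event can be checked against the normal distribution. The sketch below assumes that deviations of either sign are counted (a two-sided tail); that reading is an assumption made here because it reproduces the quoted figure. The helper name `two_sided_tail` is hypothetical.

```python
from statistics import NormalDist


def two_sided_tail(z):
    """Probability that a standard normal variable lies more than z
    standard deviations from its mean, in either direction."""
    return 2.0 * (1.0 - NormalDist().cdf(z))


p_two_sided = two_sided_tail(4.0)
one_in = 1.0 / p_two_sided
print(round(one_in))
```

Counting only the warm tail would halve the probability, making such a summer correspondingly rarer under the same stationarity assumption.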

Another way of comparing the occurrence of extreme summer temperatures in observations and the model is to calculate the fraction of land area over which a new temperature record is set within a given summer. We use the time period 1920 to 2014 from observations (BEST) and 1920 to 2080 from the two model ensembles. In the first summer (1920), by definition a record is set for the entire land area (100 %). In the second summer (1921), only the land fraction that breaks the current record (summer 1920) is counted, and so forth. The land area fraction for each year is then normalized by 1/n × total land area, with n being the number of years considered, so that for the first year the 100 % are normalized to 1 %. Hence, in a stable climate the theoretically expected land area fraction will be 1 % for each year (following Bador et al. 2015). We do this separately for the entire land surface as well as for Europe, the US, and South-East Asia (Fig. 1). For the model ensembles, the range across the ensemble members is shown. For the control simulation, the range across non-overlapping 160-year segments is shown. These ranges provide estimates against which the null-hypothesis of a stable climate can be tested.
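The normalization can be illustrated with synthetic stationary data. The sketch below is an assumption-laden toy, not the analysis code: it treats all grid cells as having equal area, draws independent Gaussian noise for each cell, and returns the record-setting fraction divided by its stationary expectation 1/n, so that a value of 1 marks the reference level expected in a stable climate. The function name `normalized_record_fraction` is hypothetical.

```python
import random


def normalized_record_fraction(series_by_cell):
    """For each year, the fraction of cells setting a new record, divided by
    the stationary expectation 1/n (n = number of years considered so far).

    series_by_cell: equal-length temperature series, one per grid cell
    (equal-area cells are assumed, so no area weighting is applied).
    """
    n_cells = len(series_by_cell)
    n_years = len(series_by_cell[0])
    running_max = [float("-inf")] * n_cells
    normalized = []
    for year in range(n_years):
        n_records = 0
        for cell, series in enumerate(series_by_cell):
            if series[year] > running_max[cell]:
                running_max[cell] = series[year]
                n_records += 1
        fraction = n_records / n_cells
        normalized.append(fraction * (year + 1))  # same as dividing by 1/n
    return normalized


# Stationary toy climate: 2000 independent cells, 95 years each.
rng = random.Random(0)
stationary = [[rng.gauss(0.0, 1.0) for _ in range(95)] for _ in range(2000)]
result = normalized_record_fraction(stationary)
print(result[0], sum(result) / len(result))
```

In a warming climate the normalized values drift above 1, which is the signal examined in Fig. 1. Real grid cells are spatially correlated, so the spread around 1 is larger than in this independent-cell toy.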

Fig. 1

Observed (black curve) and simulated (shading) land area fraction that sees record-breaking summer temperatures for a global, b Europe, c the contiguous US, d South-East Asia. Time series are normalized by 1/n × total land area with n being number of summers considered. Shading gives minimum and maximum values across the 15 members from RCP 8.5 (red) and RCP 4.5 (blue) ensembles. The range across the control simulation (gray shading), illustrating uncertainty due to internal variability, is constructed from ten non-overlapping 160-year long segments. Overlap between the shadings is visible due to transparency. Dashed black lines give maximum value from control simulation. Maps show respective domain

The period 1920 to about 1980 is indistinguishable from natural variability as suggested by the control simulation in both observations and the model ensembles. After that, and certainly in the early 21st century, the observed and simulated record-breaking global land area fraction significantly exceeds the range of the control simulation, implying a non-stationary climate (Fig. 1a). The land area fraction also increases for Europe, the contiguous US, and South-East Asia, even though for these regions most observed values are still within the uncertainty range estimated from the control simulation (Fig. 1b–d). Globally and in South-East Asia, the year 1998, which was marked by a strong El Niño event, stands out as having the highest land fraction (>20 %) over which records were broken. To put this event into perspective, no year from the entire control simulation ever exceeds 10 % globally. In Europe, the 2003 summer broke records on >40 % of the land area, while the control simulation produces records over up to 26 % of the land area. While model biases in El Niño magnitude might influence such record-breaking statistics (Coumou et al. 2013), we note that the CESM Large Ensemble simulates the amplitude (and frequency) of ENSO quite well (the standard deviation of the Nino3.4 Index is 1.15 °C compared to 1.0 °C in ERSSTv3b; see Deser et al. 2012b for more information).

Moving further into the 21st century, both scenarios are characterized by consistently larger record-breaking land area fractions globally than would be expected from a stable climate (Fig. 1a). In the year 2080, at least 10 %, and up to >40 % of the land area might see a new temperature record. The scenarios RCP 8.5 and RCP 4.5 start to separate eventually and RCP 8.5 sees a consistently higher land area fraction than RCP 4.5, which seems to stabilize. However, that stabilization occurs around an ensemble mean value of about 15 %, indicative of continued global-scale warming (see also Fig. 1 in Sanderson et al. 2015). In the subdomains Europe, contiguous US, and South-East Asia, the signal is noisier and the two scenarios largely overlap, with new records on anywhere between 1 and >80 % of the land area by 2080 (Fig. 1b–d). This is evident also when the land fraction is scaled by the global average temperature of the respective scenario (not shown).

b) Risk of future exceedance of historical record temperatures

As described in Section 2, we now examine the probability that summer temperatures during 2061–2080 will exceed the historical record. Under RCP 8.5, large parts of North and South America, central Europe, Asia and Africa are subject to probabilities of over 90 %, indicating that nearly every summer will be warmer than the warmest during the historical period (Fig. 2a). Some regions with probabilities <50 %, such as Alaska, the central US, Scandinavia, Siberia, and continental Australia, are subject to large interannual variability, and hence a tendency towards small signal-to-noise ratios (Mahlstein and Knutti 2011). Under RCP 4.5, mainly tropical regions see a probability >90 %, while most other regions remain <50 %.

Fig. 2

Probability that a given summer during 2061–2080 will be warmer than the warmest on record during the historical period 1920–2014, determined from the CESM ensembles under a RCP 8.5 and b RCP 4.5. c The difference between RCP 8.5 and RCP 4.5, i.e., how much higher the probability is for RCP 8.5 as compared to RCP 4.5. The globally averaged value of each map, excluding Antarctica, is given in the bottom left corner. The stippling in c indicates areas that see a risk of >90 % already under RCP 4.5

By 2061–2080 in RCP 8.5, the probability that the global land area is warmer than the historical record is on average about 80 %, excluding Antarctica (Fig. 2a). By the same time in RCP 4.5, the probability that the global land area will be warmer than the historical record from 1920 to 2014 is only 41 % (Fig. 2b). Globally, this amounts to an average reduction of 39 percentage points in the probability of record exceedance under RCP 4.5 compared with RCP 8.5 (Fig. 2c).

However, the scenario dependency of this metric is not spatially uniform. In other words, some regions appear to benefit more than others from RCP 4.5 as compared to RCP 8.5. In parts of Brazil, central Europe, and Eastern China, for example, the probability of exceeding the historical record temperature is more than 50 % larger in RCP 8.5 than in RCP 4.5 (Fig. 2c). The US East coast, Northeastern Africa, Indonesia and parts of China in contrast, show no significant difference in probability, since these regions already see a high probability of exceedance under RCP 4.5 (Fig. 2b, indicated by stippling). This also applies to large parts of the tropics, which—due to their small interannual variability—tend to have a large signal-to-noise ratio with global warming (Mahlstein and Knutti 2011). That is, the absence of a benefit from mitigation in these areas is due to saturation of our metric and, while physically interesting, is therefore not necessarily meaningful in a mitigation context (Fig. 2c). Yet another example is Eastern Australia, where there exists a low probability of exceedance in both RCP 4.5 and RCP 8.5 and only about 20 % difference in probability between the two scenarios.

To further illustrate these issues, Fig. 3 shows the distribution of summer temperatures from the historical (1920–2014) and future (2061–2080) time periods for the two scenarios at three representative locations: Cairo, Egypt (30.0 °N, 31.2 °E), Paris, France (48.9 °N, 2.4 °E), and Canberra, Australia (35.3 °S, 149.1 °E). We also show the observed distribution for the historical period. The observed and simulated historical record summer temperatures are marked by triangles for reference. By 2061–2080, Cairo sees a mean warming of 3.2 °C for RCP 4.5 and 5.2 °C for RCP 8.5. Due to the small interannual variability in Cairo, most summers in 2061–2080 under RCP 4.5 are already warmer than the warmest of the historical period, so that there is little apparent benefit from mitigation. Paris is projected to warm a similar amount as Cairo (3.0 °C for RCP 4.5 and 5.0 °C for RCP 8.5); however, due to the larger interannual variability, the distributions show greater overlap than for Cairo, and the probability of exceeding the historical record does not saturate in RCP 4.5, but increases monotonically from RCP 4.5 to RCP 8.5. Finally, Canberra warms less than the previous examples (2.1 °C for RCP 4.5 and 3.3 °C for RCP 8.5), while it has similar interannual variability as Paris. The probability of exceeding the record therefore stays below 50 % under both emissions scenarios. For all of these examples the model shows no significant bias in σ compared to observations.

Fig. 3

Kernel fit to the histograms of summer temperature in a Cairo, b Paris, c Canberra, given as anomalies relative to the mean over the historical period 1920–2014. The observed and simulated record, respectively, from the historical period 1920–2014 are given by the black and gray triangles; the blue and red triangles indicate the median of the RCP 4.5 and RCP 8.5, respectively. The number of summers considered is given in the legend of panel c

c) Land fraction and population exposure

As with many impact-relevant quantities, an issue of interest is the number of people who will be affected. To address this, we show the probability of exceeding the historical record as a function of time, land surface fraction, and population fraction, using population fixed at 2010 values from Jones et al. (2015) (Fig. 4). Using transient population scenarios instead of Jones et al. (2015) does not alter the results significantly (not shown). Under RCP 8.5 by 2070, 100 % of the world population is subject to >10 % probability of exceeding their local historical temperature record as defined up until 2014 (Fig. 4a). Under RCP 4.5, 90 % of the world population experiences >10 % probability; in other words, this result is largely independent of the emissions scenario (Fig. 4b). The results start to differ when higher probability categories are considered. Under RCP 8.5 by 2070, 87 % of the world population is subject to >50 % probability of exceeding their historical temperature record (Fig. 4a). This means that, on average, every second summer will be as warm or warmer than the hottest summer ever experienced by the population during 1920–2014. Under RCP 4.5 by 2070, this will affect only 45 % of people. An even larger disparity arises for >90 % probability of record exceedance: >60 % of the population under RCP 8.5, while only 10 % of the population is subject to this same level of probability under RCP 4.5.
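The exposure fractions discussed above amount to a weighted count of grid cells above a probability threshold. The sketch below shows that bookkeeping under stated assumptions: the function name `exposed_fraction` and all numbers are made up for illustration, and the weights stand in for either gridded population or grid-cell area.

```python
def exposed_fraction(probabilities, weights, threshold):
    """Share of the total weight (population or land area) located in cells
    whose exceedance probability is above `threshold`."""
    total = sum(weights)
    exposed = sum(w for p, w in zip(probabilities, weights) if p > threshold)
    return exposed / total


# Made-up values for four grid cells.
probability = [0.95, 0.60, 0.30, 0.05]  # exceedance probability per cell
population = [5.0, 3.0, 1.5, 0.5]       # hypothetical, e.g. millions of people
land_area = [1.0, 1.0, 1.0, 1.0]        # equal-area cells assumed

pop_share = exposed_fraction(probability, population, 0.5)
land_share = exposed_fraction(probability, land_area, 0.5)
print(pop_share, land_share)
```

In this toy case the population share exceeds the land share because the most populous cells also carry the highest probabilities; the same comparison underlies the land versus population curves in Fig. 4.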

Fig. 4

Probability of exceeding the historical summer temperature record as a function of time and land surface/population fraction for a RCP 8.5 and b RCP 4.5. Probability is calculated from 20-year moving windows and plotted at the central date. Population data are from Jones et al. (2015) and kept constant at 2010 values. c Timeslice of a and b for the period 2061–2080

In this global perspective, the land fraction exposed to a certain probability is very similar to the corresponding population fraction exposed; however, there are some interesting differences. Under RCP 4.5 by 2070, the population fraction seeing >70 % and >90 % probability of record exceedance is roughly 4 % smaller than the corresponding land fraction (Fig. 4c). Under RCP 8.5 this is reversed, indicating that densely populated regions preferentially shift to a higher probability category. In other words, in terms of human exposure to record temperature exceedance, the benefits would be disproportionately larger for RCP 4.5 as compared to RCP 8.5, although by only about 9 %.

d) Changes in variability

The interannual variability of summer temperature might change with warming due to shifts in regional hydroclimate or as a consequence of feedbacks (Seneviratne et al. 2006; Huntingford et al. 2013). Such changes could influence the probability of record temperature exceedance. In both ensembles, there are indeed regions where significant changes in σ occur towards the end of the 21st century (Fig. 5a and b). There is a well-understood decrease in temperature variability at Northern Hemisphere high-latitudes due to the diminished Arctic sea ice cover at the end of the century (Screen and Simmonds 2010). Robust increases in variability occur in the Eastern Sahel region, the Amazon, Central North America, and Eastern Europe (Fig. 5). While the σ changes in the Sahel region are likely related to a shift of the Intertropical Convergence Zone, σ changes in the other regions might be due to soil moisture feedbacks as suggested by, e.g., Seneviratne et al. (2006). To put the σ changes in CESM into a larger perspective, we have examined σ changes in 26 models participating in the fifth Coupled Model Intercomparison Project (CMIP5) using the period 2061–2080 from the RCP 8.5 simulations compared to the historical period 1920–2014. Many of the regions with a statistically significant σ change in the CESM Large Ensemble (e.g., the Amazon, central Europe, and Sahel) show the same sign σ change in more than 75 % of the CMIP5 models (not shown).

Fig. 5

a, b Changes in interannual summer temperature standard deviation (σ) for 2061–2080 minus 1920–2014 in a RCP 8.5 and b RCP 4.5. Only significant changes are shaded, according to an F-test on the variance with 95 % confidence level. c, d Difference in probability of exceeding the historical temperature record between the simulated temperature distribution in 2061–2080 according to c RCP 8.5 and d RCP 4.5 (see Fig. 2) and a temperature distribution with same mean but σ from the simulated historical period 1920–2014. Positive (negative) values indicate simulated change in σ leads to larger (smaller) probability

To investigate how the changes in variability impact our metric of record temperature exceedance, we substitute the simulated future temperature distribution with a synthetically generated temperature distribution using the simulated future mean, but using the simulated σ from the historical period 1920–2014 (Fig. 5c and d). Hence, any potential narrowing or widening of the temperature distribution with future warming is omitted. The synthetically generated temperature distributions have the same sample size as the simulated ones (20 years × 15 ensemble members = 300 summers). We stress that in this, as well as in other approaches (Battisti and Naylor 2009; De Vries et al. 2012; Gao et al. 2015), the generation of synthetic temperature distributions, although well grounded in physical understanding, is a purely statistical approach.
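The substitution experiment can be sketched as follows. This is an illustration under explicit assumptions, not the procedure as implemented for Fig. 5: the synthetic sample is drawn from a normal distribution, no detrending is applied before estimating the historical σ, and the function names and toy inputs are hypothetical.

```python
import random
from statistics import mean, stdev


def simulated_exceedance(future, historical):
    """Exceedance probability from the simulated future sample itself."""
    record = max(historical)
    return sum(1 for t in future if t > record) / len(future)


def synthetic_exceedance(future, historical, rng):
    """Exceedance probability for a synthetic sample that keeps the simulated
    future mean but uses the historical standard deviation.

    Assumption: the synthetic summers are drawn from a normal distribution.
    """
    record = max(historical)
    mu_future = mean(future)
    sigma_hist = stdev(historical)  # no detrending in this toy example
    synthetic = [rng.gauss(mu_future, sigma_hist) for _ in range(len(future))]
    return sum(1 for t in synthetic if t > record) / len(synthetic)


rng = random.Random(42)
historical = [rng.gauss(0.0, 1.0) for _ in range(1425)]  # 95 years x 15 members
future = [rng.gauss(3.0, 1.5) for _ in range(300)]       # warmer and wider
p_simulated = simulated_exceedance(future, historical)
p_synthetic = synthetic_exceedance(future, historical, rng)
sigma_effect = p_simulated - p_synthetic  # effect attributable to the σ change
print(p_simulated, p_synthetic, sigma_effect)
```

With only 300 synthetic summers the difference carries sampling noise of a few percent, which is worth keeping in mind when interpreting small values of this metric.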

As shown in Fig. 5c and d, potential future changes in temperature σ have a very minor effect (<2.5 %) on the probability of record exceedance in most locations. Certain regions, such as Eastern Europe or Siberia are subject to opposite signals for RCP 4.5 and RCP 8.5, because with a strong warming (RCP 8.5) and a widening of the temperature distribution, the probability of exceeding the record can become smaller than when the temperature distribution stays at its historical width (see also Section 3e for further discussion of this effect). For other regions the results are robust for both emissions scenarios. For example, along the US East coast or in Western France, the probability of exceeding the historical record increases by 1–7 % due to changes in σ.

e) Influence of model bias

Earlier studies raised concern regarding the capabilities of climate models to correctly simulate interannual temperature variability (Battisti and Naylor 2009). Here, we investigate CESM biases in temperature variability and how they influence our results. Unlike in previous sections, where we used the station-based BEST data set, we here use the European Reanalysis-Interim (ERAi; Dee et al. 2011) product. Topography influences the estimate of temperature variability at higher elevation, as higher elevations tend to have smaller interannual variability than surrounding lowlands. ERAi allows for a more appropriate comparison with the model with regard to variability, as it applies a similar topographic smoothing. We tested two other reanalyses (NCEP/NCAR, Twentieth Century Reanalysis), as well as two station-based data sets (BEST, HadCRUT4) to confirm that reanalyses tend to have higher variability than station-based data sets (not shown).

During the historical period 1979–2014, the model is in good agreement with observations over most of the land areas, as measured by the ratio of simulated and observed interannual summer temperature σ (quadratically detrended time series of 1979–2014 summer temperatures; Fig. 6a). The model tends to overestimate summer temperature variability in parts of India and South East Asia, likely related to biases in the representation of the Indian monsoon. A closer look at a grid point near New Delhi, India (28.6 °N, 77.2 °E), illustrates an overestimation bias (Fig. 6b). Further, New Delhi serves as an example of non-linear warming with forcing scenario, a topic not touched upon in this study. There are other places, such as the US East coast or central Africa, where variability is underestimated. In Africa, the biased location of the Intertropical Convergence Zone leads to an underestimate of precipitation variability with associated effects on temperature variance (Siongco et al. 2014).

Fig. 6

a Model bias in interannual summer temperature standard deviation (σ), expressed as simulated σ divided by observed σ over the period 1979–2014. Stippling indicates significant difference in variance of the two samples, applying an F-test at the 95 % level. b Grid point example of a significant model bias; legends as in Fig. 3. The bottom row (c, d) shows the difference in probability of exceeding the historical temperature record using two different 2061–2080 temperature distributions: (i) a temperature distribution with the same mean as the simulated temperature (see Fig. 2), but σ from the simulated historical period 1979–2014, and (ii) same as (i) but using the σ from the observed historical period 1979–2014 (ERAi). Positive (negative) values indicate model bias leads to overestimated (underestimated) probability. c (i)–(ii) for RCP 8.5 and d (i)–(ii) for RCP 4.5

To examine how these biases influence our metric of record exceedance probability, we repeat the analysis of Fig. 2, but using two different futures: (i) one in which we generate a normal distribution with the simulated future mean temperature, but using the simulated σ from the period 1979–2014 and (ii) one in which we generate a normal distribution with the simulated future mean temperature, but using the observed σ from the period 1979–2014. Neglecting, for now, potential changes of σ itself with warming, this approach isolates the influence of model bias in variability on our metric.

Figure 6c–d contrasts the future temperature distribution with simulated σ with the one with observed σ. From a global average perspective, the influence of model bias in temperature variance is very small: −0.6 % under RCP 8.5 and 1.2 % under RCP 4.5 by 2061–2080. These seemingly contradictory tendencies are a result of the combination of model bias and our metric. For example, in South-East Europe the model tends to overestimate summer temperature variability (Fig. 6a); in other words, the simulated historical σ is comparably large and the observed one is comparably small. Adding the same moderate warming (RCP 4.5) to distributions with each of these σ makes it easier for the distribution with the larger σ to exceed the historical record; hence, for RCP 4.5 the probability of exceedance for South-East Europe is overestimated. In contrast, once one adds a large warming (RCP 8.5), the distribution with the smaller σ tends to exceed the historical record more often, as now the mean warming is as large or larger than the historical record.
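The sign reversal described above follows directly from the normal distribution and can be reproduced analytically. In the sketch below every number is hypothetical: the record, the two σ values, and the two warming levels are chosen only to place the future mean below the record in one case and above it in the other.

```python
from statistics import NormalDist


def p_exceed(record, future_mean, sigma):
    """P(T > record) for T ~ Normal(future_mean, sigma)."""
    return 1.0 - NormalDist(future_mean, sigma).cdf(record)


record = 3.0       # historical record as an anomaly in degrees C (hypothetical)
sigma_model = 1.2  # simulated σ, overestimated (hypothetical)
sigma_obs = 0.8    # observed σ (hypothetical)

# Moderate warming: the future mean stays below the record.
bias_moderate = p_exceed(record, 2.0, sigma_model) - p_exceed(record, 2.0, sigma_obs)

# Large warming: the future mean lies above the record.
bias_large = p_exceed(record, 4.0, sigma_model) - p_exceed(record, 4.0, sigma_obs)

print(bias_moderate, bias_large)
```

When the future mean is below the record, a wider distribution places more of its upper tail beyond the record and the bias is positive; when the mean is above the record, a wider distribution places more of its lower tail below the record and the bias turns negative.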

Most of the biases in probability of exceeding the historical record are within ±10 % and therefore do not affect the conclusions from Fig. 2, where the range ±10 % is not shaded. However, such biases may need to be taken into account for certain applications.

4 Summary

This study investigates the probability that summers in the future will be as warm or warmer than the hottest on record during 1920–2014, based on large ensembles of simulations with CESM under RCP 4.5 and RCP 8.5 emissions scenarios. The aim was to depict the benefits of climate mitigation by contrasting the two emission scenarios, as well as to illustrate the influence of model bias and changes in temperature variability on the results. While model biases and changes in variability can be important locally, they are generally <10 % and thus do not alter the large scale picture presented here. On a global basis, we find that the probability of exceeding the historical record temperature increases from less than 10 % at present day to over 80 % at 2070 with projected warming under the business-as-usual scenario (RCP 8.5). This increase can be reduced to nearly half by moderate climate mitigation (RCP 4.5). A number of regions such as Brazil, Europe, or Eastern China benefit especially from mitigation by seeing risk reductions of over 50 %. Since these regions are densely populated, this results in global population benefiting disproportionately from climate mitigation when compared to just the land surface, as any mitigated warming in RCP 4.5 prevents those regions from shifting into a higher risk category (see also Lehner and Stocker 2015). Some regions show little risk reduction with climate mitigation, either due to small changes relative to interannual variability, or because for those regions the projected summer temperatures will robustly exceed the historical record temperatures already under RCP 4.5. Caution is therefore warranted when interpreting the results, as the relative weights of benefits and limitations of climate mitigation depend strongly on the metric considered. 
Besides human exposure to extreme heat, potential risk reductions from mitigation as shown here are also crucial for the sustainability of agriculture, which increasingly seeks to insure its yield against potential losses from recurring heat waves and other severe weather. Note that crop-specific sensitivities to regional climate change and technological advancements, both not addressed here, can complicate the attribution of yield changes to climate change (Lobell and Asner 2003). In conclusion, from the perspective of heat exposure of human and natural systems, there are clear and robust benefits associated with climate mitigation.