1 Introduction

Earth System Model (ESM) ensembles forced with prescribed Representative CO2 Concentration Pathway (RCP) scenarios (Meinshausen et al. 2011) show significant spread in twenty-first century projections of warming and compatible carbon emissions (e.g. Collins et al. 2013; Gillet et al. 2013; Matthews et al. 2009; Zickfield et al. 2009; 2012) (Fig. 1), even though each ensemble-member is consistent, or close to consistent, with observations of historic and present-day climate change (e.g. Hartmann et al. 2013; Rhein et al. 2013; Flato et al. 2013). This future spread leads to significant uncertainty in the sensitivity of future warming to carbon emissions, termed the Transient Climate Response to Emission (Gillet et al. 2013), or TCRE [K (1000 PgC)−1]. Based on the CMIP5-ensemble of 21 complex ESMs, the TCRE is estimated to be between 0.8 and 2.5 K (1000 PgC)−1 for the late twenty-first century (Collins et al. 2013; Gillet et al. 2013), while a separate observation constrained theoretical analysis (Goodwin et al. 2015) suggests TCRE = 1.1 ± 0.5 K (1000 PgC)−1. This large uncertainty in the TCRE introduces large uncertainty in the maximum cumulative carbon emission allowed to restrict CO2-induced warming to a policy-driven target (Zickfield et al. 2009), noting that warming from non-CO2 agents will also affect total anthropogenic warming (Pierrehumbert 2014) and that climate targets other than warming also affect allowable emissions (Steinacher et al. 2013). To reduce the considerable uncertainty in the warming-target allowable carbon emissions, the value of the TCRE must be better constrained.

Fig. 1

Twenty-first century warming and carbon emission projection ranges for four RCP scenarios from three model-ensembles. a Projected warming of global mean surface air-temperatures from the 1986–2005 to the 2081–2100 periods (K). b Projected compatible carbon emissions from 2012 to 2100 (1000 PgC). The CMIP5 ensemble is used in Assessment Report 5 of the IPCC (Flato et al. 2013). The WASP (simulation–consistent) ensemble contains 182,500 ensemble members that are consistent with 8 historic constraints based on the simulated historic ranges of the CMIP5 ensemble. The WASP (observation–consistent) ensemble contains 14,500 members that are consistent with 8 historic constraints from observations

One possible source of uncertainty in future model-ensemble climate projections arises due to discrepancies between the range of historic climate change simulated by models and observed in the real climate system (e.g. Flato et al. 2013). This study investigates how small discrepancies between observed and simulated historic climate affect the future projections made by a very large model ensemble. A very large model-ensemble of 4-million members is produced using a new efficient ESM (Sect. 2; “Appendix” section). Two subset model ensembles are then extracted. An historic simulation–consistent ensemble contains all ensemble members that are consistent with the ranges of eight past constraints simulated by the CMIP5 ensemble, while an observation–consistent ensemble contains all ensemble members that are consistent with the ranges of eight past constraints observed for the real climate system. Section 2 describes the new ESM, and the construction of the model ensembles. Section 3 presents the results in comparison to the CMIP5 projections, while Sect. 4 discusses the wider implications of the study.

2 Materials and methods

Section 2.1 describes a new simple ESM, while Sect. 2.2 then describes how an initial 4-million member random ensemble is generated. Sections 2.3 and 2.4 describe how further ensembles are extracted from the initial 4-million member ensemble. Section 2.3 describes the extraction of an ensemble consistent with historic CMIP5 simulation ranges, while Sect. 2.4 describes the extraction of an ensemble consistent with historic observational ranges.

2.1 Efficient earth system model description

A new efficient 8-box model of the atmosphere–ocean–terrestrial system is used (Fig. 2): the Warming, Acidification and Sea-level Projector (WASP). In the WASP model, ocean carbonate chemistry is approximated using the buffered carbon inventory approach of Goodwin et al. (2007, 2009, 2015). Global mean surface air-temperature increase is calculated using the warming-carbon emissions relationship of Goodwin et al. (2015), with additional terms for radiative forcing from non-CO2 agents (Meinshausen et al. 2011) and for equivalent carbon emissions from the ocean temperature-CO2 solubility feedback (Goodwin and Lenton 2009). A full description of the WASP model, including the model equations, is given in the “Appendix” section.

Fig. 2

Schematic of the Warming Acidification and Sea-level Projector (WASP). WASP is an 8-box model of the Earth System. Arrows indicate fluxes of carbon and heat. The ocean has prescribed e-folding timescales, τ, for tracers to equilibrate. Full details of the WASP model are found in the “Appendix” section

2.2 Generating the model ensembles

The new efficient ESM (WASP; “Appendix” section) is used to generate an initial 4-million member Monte Carlo ensemble, with 16 model input parameters varied randomly between ensemble-members.

Two forcing parameters are varied with random-normal distributions to approximate uncertainties in anthropogenic radiative forcing after Myhre et al. (2013) (Fig. 3, black). Fourteen internal model properties are varied with random-linear input distributions within prescribed ranges (Fig. 4, black), such that all possible values within the prescribed ranges are equally likely to occur in the initial 4-million member ensemble. Therefore, the assumed prior knowledge about the 14 internal model properties in the 4-million member ensemble is simply that they lie within their prescribed limits; no information about the relative likelihood of particular values within those limits is assumed.

Fig. 3

Normalised frequency distributions of radiative forcing input values for the initial 4-million member ensemble (black), simulation–consistent ensemble (blue) and the observation–consistent ensemble (red). a The CO2 radiative forcing coefficient, \( a_{{{\text{CO}}_{2} }} \) (Wm−2). b The total radiative forcing in 2011 relative to 1750 from all anthropogenic sources, R TOTAL (Wm−2)

Fig. 4

Normalised frequency density distributions for model input parameters in the initial 4-million member ensemble (black), the 182,500 member simulation–consistent ensemble (blue), and the 14,500 member observation–consistent ensemble (red). a The equilibrium climate parameter, λ (Wm−2 K−1). b The Equilibrium Climate Sensitivity (ECS, K) for a doubling of CO2, calculated from \( {\text{ECS}} = (a_{{{\text{CO}}_{2} }} \ln 2)/\lambda \). c The efficacy of ocean heat uptake, ε. d The ratio at equilibrium of warming of sea surface temperatures to Surface Air Temperatures, r SST:SAT. e The ratio at equilibrium of warming in the sub-surface ocean to sea surface temperatures, r sub:SST. f The fraction of total Earth System heat content increase from the ocean, f heat. g The CO2 fertilisation coefficient, \( \gamma_{{{\text{CO}}_{2} }} \). h The sensitivity of global Net Primary Productivity to global temperature, ∂NPP/∂T (PgC yr−1 K−1). i The global sensitivity of soil carbon residence time to temperature, ∂τ soil /∂T (yr K−1). j The e-folding timescale for the ocean surface mixed layer to equilibrate in carbon relative to the atmosphere, τ mixed (yr). The e-folding timescales for mixed-layer tracers to equilibrate with: k the upper ocean, τ upper (yr), l the intermediate ocean, τ inter (yr), m the deep ocean, τ deep (yr), and n the bottom ocean, τ bottom (yr). o The buffered carbon inventory of the air–sea system, I B

2.2.1 Monte Carlo forcing parameter distributions

Two parameters are altered between ensemble-members to encapsulate current uncertainty in the magnitude of anthropogenic radiative forcing over time (Myhre et al. 2013):

  1. The coefficient relating radiative forcing to the log change in atmospheric CO2, \( a_{{{\text{CO}}_{2} }} \) (Wm−2), is relatively well-constrained (Myhre et al. 2013, 1998), and so is varied with a random-normal distribution with mean 5.35 Wm−2 and standard deviation 0.27 Wm−2 (Fig. 3a, black), to reflect the mean and uncertainty range used in Myhre et al. (2013).

  2. The radiative forcing from non-Kyoto agents, R non-Kyoto (Wm−2), is varied with a random-normal distribution to approximate the mean and uncertainty in radiative forcing from agents other than Well Mixed Greenhouse Gases in Myhre et al. (2013). Note that Myhre et al. (2013) assume a slightly asymmetric distribution for anthropogenic radiative forcing from agents other than Well Mixed Greenhouse Gases, and that this asymmetry is ignored here in favour of a simpler random-normal relative frequency distribution (Fig. 3b, black).

The radiative forcing from non-Kyoto agents is set by scaling the R non-Kyoto at time t to be proportional to the radiative forcing from non-CO2 agents that are included in the Kyoto protocol (Meinshausen et al. 2011), \( R_{{non{-}{\text{CO}}_{2} }} \), using

$$ R_{non{-}Kyoto} (t) = f_{uncert} \times \frac{{R_{{non{-}{\text{CO}}_{2} }} (t)}}{{R_{{non{-}{\text{CO}}_{2} }}^{2011} }}, $$
(1)

where f uncert is an uncertainty factor varied with a random-normal distribution with mean −0.23 Wm−2 and standard deviation +0.5 Wm−2, \( R_{{non{-}{\text{CO}}_{2} }} (t) \) evolves over time as prescribed in the RCP scenarios (Meinshausen et al. 2011), and \( R_{{non{-}{\text{CO}}_{2} }}^{2011} \) is set to 0.69 Wm−2 to approximate the radiative forcing from non-CO2 agents included in the Kyoto protocol across the four RCP scenarios (Meinshausen et al. 2011). The total radiative forcing at time t, R TOTAL (t), is then set to,

$$ R_{TOTAL} (t) = a_{{{\text{CO}}_{2} }} \ln \frac{{{\text{CO}}_{2} (t)}}{278.0} + R_{{non{-}{\text{CO}}_{2} }} (t) + R_{non{-}Kyoto} (t). $$
(2)

Equations (1) and (2) are applied to the RCP scenarios (Meinshausen et al. 2011) to prescribe total radiative forcing over time, with \( a_{{{\text{CO}}_{2} }} \) and f uncert varied between ensemble members with random-normal distributions to reflect uncertainty in the magnitude of anthropogenic radiative forcing in 2011 (Fig. 3, black). The mean total anthropogenic radiative forcing in 2011 of all ensemble members is 2.3 Wm−2, and the 90 % range is from 1.5 to 3.2 Wm−2 (Fig. 3b, black). This approximates the best estimate for total anthropogenic radiative forcing for the real climate system in 2011 of 2.3 Wm−2 (Myhre et al. 2013), with an estimated 90 % range of 1.1–3.3 Wm−2.
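As an illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates the total forcing for the ensemble-mean parameter values quoted above. The function names and the 2011 CO2 concentration of 391 ppm are assumptions made for this example only; they are not taken from the WASP code or from the text.

```python
import math

A_CO2_MEAN = 5.35          # W m-2, mean CO2 forcing coefficient (from the text)
F_UNCERT_MEAN = -0.23      # W m-2, mean of the uncertainty factor (from the text)
R_NONCO2_2011 = 0.69       # W m-2, Kyoto non-CO2 forcing in 2011 (from the text)
CO2_PREINDUSTRIAL = 278.0  # ppm, reference concentration used in Eq. (2)


def r_non_kyoto(f_uncert, r_nonco2_t, r_nonco2_2011=R_NONCO2_2011):
    """Eq. (1): non-Kyoto forcing scaled to the Kyoto non-CO2 forcing."""
    return f_uncert * r_nonco2_t / r_nonco2_2011


def r_total(co2_ppm, r_nonco2_t, a_co2=A_CO2_MEAN, f_uncert=F_UNCERT_MEAN):
    """Eq. (2): CO2 forcing plus Kyoto non-CO2 plus non-Kyoto forcing."""
    return (a_co2 * math.log(co2_ppm / CO2_PREINDUSTRIAL)
            + r_nonco2_t
            + r_non_kyoto(f_uncert, r_nonco2_t))


# Assumed 2011 CO2 concentration (illustrative value, not from the text).
forcing_2011 = r_total(co2_ppm=391.0, r_nonco2_t=R_NONCO2_2011)
print(f"Total forcing in 2011: {forcing_2011:.2f} W m-2")
```

Under these assumptions the result is close to the ensemble-mean 2011 forcing of 2.3 Wm−2 stated in the text.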

2.2.2 Monte Carlo input parameter distributions

Fourteen model input properties are varied with random-linear distributions to encapsulate uncertainty in the response of the climate system to anthropogenic forcing:

  1. The range of the equilibrium climate parameter, λ (Wm−2 K−1), is set from 0.1 to 5.0 Wm−2 K−1 (Fig. 4a, black), to cover a large range of possible equilibrium climate parameter values suggested by palaeo-data, historic climate change and climate models (Collins et al. 2013).

  2. The range of the efficacy of ocean heat-uptake, ε (Frölicher et al. 2014; Winton et al. 2010), is set between 0.83 and 1.82 (Fig. 4c, black), equal to the range of ε displayed in CMIP5 models evaluated by Geoffroy et al. (2013).

  3. The ratio of SST-warming to SAT-warming at equilibrium, r SST:SAT, is varied from 0.25 to 1.1 (Fig. 4d, black) and,

  4. The ratio of global mean sub-surface ocean warming to SST-warming at equilibrium, r sub:SST, is varied between 0.01 and 1.0 (Fig. 4e, black). These input ranges for r SST:SAT and r sub:SST are chosen to encapsulate, and be broader than, the differences in these properties between the two models evaluated by Williams et al. (2012, see Fig. 3 therein) and to include value-ranges consistent with estimates of the land–sea warming ratio (Sutton et al. 2007).

  5. The range of the fraction of total Earth System heat-content increase that enters the ocean, f heat, is set from 0.8 to 0.98 (Fig. 4f, black), to reflect uncertainty in heat uptake by components of the Earth System since 1971 (Rhein et al. 2013).

  6. The range of the CO2 fertilisation coefficient (Alexandrov et al. 2003; “Appendix” section) is set to between 0 and 1 (Fig. 4g, black) to reflect the large uncertainty in the magnitude of the sensitivity of global Net Primary Productivity (NPP, PgC yr−1) to CO2 doubling (Ciais et al. 2013; Alexandrov et al. 2003).

  7. The range for the sensitivity of global NPP to global surface temperature, ∂NPP/∂T, is set to between −5.0 and +1.0 PgC yr−1 K−1 (Fig. 4h, black) to reflect the range displayed in ESMs evaluated by Friedlingstein et al. (2006).

  8. The range of the sensitivity of global mean soil-carbon residence time to global surface warming, ∂τ soil /∂T, is set to between −2.0 and +1.0 yr K−1 (Fig. 4i, black), to encapsulate the range displayed in ESMs (Friedlingstein et al. 2006).

  9. The range for the e-folding timescale for CO2 equilibration between the atmosphere and surface mixed-layer is set to between 0.1 and 0.5 years (Fig. 4j, black).

Equilibration timescales for tracer-exchange between the surface mixed layer and the sub-surface ocean regions are varied between ensemble members to reflect uncertainty in the timescales of ocean ventilation for different regions of the ocean (DeVries and Primeau 2011) and ocean overturning (Weaver et al. 2012). The ranges of the e-folding timescales to achieve tracer equilibration with the surface mixed layer (“Appendix” section; Fig. 2) are set to between:

  10. 5 and 40 years for the upper ocean, τ upper (Fig. 4k, black),

  11. 15 and 60 years for ocean intermediate water, τ inter (Fig. 4l, black),

  12. 75 and 500 years for ocean deep water, τ deep (Fig. 4m, black), and

  13. 250 and 1500 years for ocean bottom water, τ bottom (Fig. 4n, black).

  14. The prescribed range for the buffered carbon inventory of the air–sea system, I B, is set from 3100 PgC to 3900 PgC (Fig. 4o, black), equal to the range seen in ocean models tested by Goodwin et al. (2007, 2009).

The above describes how the plausible limits for the 14 internal model parameters are set. No prior judgement is made as to the relative likelihood of each parameter value within its prescribed limit, achieved by using random–linear input distributions (Fig. 4, black). The historic constraints are then used to select sub-sets of the initial 4-million member model ensemble. By choosing a sub-set of the initial 4-million member ensemble, the historic constraints themselves are used to determine the final relative likelihood of each parameter value within its prescribed input limit. It should be noted that alternative strategies for choosing the prescribed limits for each parameter, or applying prior knowledge to determine the relative likelihood of each input parameter within its prescribed limit prior to the observational tests, would result in different final ensembles.
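The sampling strategy described in Sects. 2.2.1 and 2.2.2 can be sketched in a few lines of Python. This is an illustration only, not the WASP implementation: the dictionary keys, the function name `draw_member` and the reduced ensemble size are assumptions, while the limits, means and standard deviations are those quoted in the text.

```python
import random

# Random-linear (uniform) limits for the 14 internal model properties.
UNIFORM_RANGES = {
    "lambda": (0.1, 5.0),         # W m-2 K-1
    "epsilon": (0.83, 1.82),      # efficacy of ocean heat uptake
    "r_sst_sat": (0.25, 1.1),
    "r_sub_sst": (0.01, 1.0),
    "f_heat": (0.8, 0.98),
    "gamma_co2": (0.0, 1.0),      # CO2 fertilisation coefficient
    "dnpp_dt": (-5.0, 1.0),       # PgC yr-1 K-1
    "dtau_soil_dt": (-2.0, 1.0),  # yr K-1
    "tau_mixed": (0.1, 0.5),      # yr
    "tau_upper": (5.0, 40.0),     # yr
    "tau_inter": (15.0, 60.0),    # yr
    "tau_deep": (75.0, 500.0),    # yr
    "tau_bottom": (250.0, 1500.0),  # yr
    "i_b": (3100.0, 3900.0),      # PgC
}

# Random-normal (mean, standard deviation) for the two forcing parameters.
NORMAL_PARAMS = {
    "a_co2": (5.35, 0.27),     # W m-2
    "f_uncert": (-0.23, 0.5),  # W m-2
}


def draw_member(rng):
    """Draw one set of 16 input parameters for a single ensemble member."""
    member = {name: rng.uniform(lo, hi)
              for name, (lo, hi) in UNIFORM_RANGES.items()}
    member.update({name: rng.gauss(mean, sd)
                   for name, (mean, sd) in NORMAL_PARAMS.items()})
    return member


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# 1000 members here for speed; the study uses 4 million.
ensemble = [draw_member(rng) for _ in range(1000)]
```

Each member would then be passed to the model, and the historic constraints applied to the model output rather than to the inputs themselves.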

2.3 Extracting a historic CMIP5 simulation–consistent model ensemble

At year 2012, the initial 4-million ensemble-members are checked against 8 constraints reflecting ranges of anthropogenic surface warming, heat uptake and carbon uptake (Table 1) as simulated by the CMIP5 ensemble. These eight constraints cover the climate variables that the efficient ESM can simulate: global mean surface warming, ocean heat uptake and ocean and terrestrial carbon uptake (Fig. 2). They also represent metrics used to assess the CMIP5 models in the literature (e.g. Flato et al. 2013; Jha et al. 2014; Song et al. 2014).

Table 1 The historical constraints used to assess WASP ensemble members for simulation–consistency to the CMIP5 models and the resulting ranges in the simulation–consistent WASP ensemble

The eight separate constraints to assess the WASP ensemble members for historic consistency to the CMIP5 ensemble are (Table 1):

  1. SAT warming from the 1850 to the 1961–1990 average is between 0.1 and 1.0 K in the CMIP5 ensemble members analysed in Song et al. (2014; see Fig. 1 therein);

  2. SAT warming from the 1961–1990 average to 2005 is between 0.3 and 1.1 K in the CMIP5 ensemble members analysed in Song et al. (2014; see Fig. 1 therein);

  3. The mean decadal rate of SAT warming from 1951 to 2012 is between 0.5 and 2.3 K per decade in the CMIP5 ensemble members analysed by Flato et al. (2013);

  4. SST warming from the 1870–1900 period to the 1985–2005 period is between 0.2 and 0.7 K in the ten CMIP5 models analysed by Jha et al. (2014; see Fig. 5 therein);

    Fig. 5

    Hierarchy of the ensembles from the WASP model. The initial model ensemble contains 4-million members where two radiative forcing coefficients are varied with random-normal distributions to approximate uncertainty in historic radiative forcing (Fig. 3, black) and 14 input parameters are randomly varied between prescribed limits (Fig. 4, black). Of this initial model-ensemble, a Simulation–consistent ensemble is extracted comprising the 182,500 members that are consistent with 8 constraints based on the historic simulated ranges of the CMIP5 ensemble (Table 1). A further Observation–consistent ensemble is extracted comprising 14,500 members that are consistent with 8 historic observational constraints (Table 2). Some 8200 ensemble members are contained in both the historic Simulation–consistent and Observation–consistent ensembles

  5. The heat content change of the whole ocean from 1971 to 2005 is between 80 and 380 ZJ in the CMIP5 ensemble members analysed by Flato et al. (2013);

  6. The heat added to the upper 700 m of the ocean from 1971 to 2005 is between 25 and 370 ZJ in the CMIP5 ensemble members analysed by Flato et al. (2013). The upper ocean heat content is represented in WASP by the sum of the mixed layer and upper ocean boxes (Fig. 2; see Appendix Table 5);

  7. The residual terrestrial carbon uptake from 1986 to 2005 is between 0 and 3.0 PgC yr−1 in the CMIP5 models analysed by Flato et al. (2013); and

  8. The ocean carbon uptake from 1986 to 2005 is between 1.6 and 2.3 PgC yr−1 in the CMIP5 models analysed by Flato et al. (2013).

To be counted as historically consistent with the CMIP5 ensemble, a WASP ensemble-member must lie within all eight of these ranges. Of the 4-million initial Monte Carlo WASP ensemble members, some 182,500 are judged to be simulation–consistent with the historical range of the CMIP5 ensemble (Figs. 3, 4, blue). These 182,500 ensemble members make up the simulation–consistent model ensemble (Fig. 5). There are small variations in number of simulation–consistent ensemble members for each RCP scenario, reflecting small differences in prescribed forcing between 2005 and 2012 (Meinshausen et al. 2011).
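The all-eight-ranges test can be expressed compactly. The sketch below is illustrative: the key names and the layout of the per-member diagnostics dictionary are hypothetical, and the numerical ranges are simply those quoted in the list above.

```python
# Historic CMIP5 ranges as quoted in the text; key names are hypothetical.
CMIP5_RANGES = {
    "sat_warming_1850_to_1961_1990_K": (0.1, 1.0),
    "sat_warming_1961_1990_to_2005_K": (0.3, 1.1),
    "sat_rate_1951_2012_K_per_decade": (0.5, 2.3),
    "sst_warming_1870_1900_to_1985_2005_K": (0.2, 0.7),
    "ocean_heat_1971_2005_ZJ": (80.0, 380.0),
    "upper700_heat_1971_2005_ZJ": (25.0, 370.0),
    "land_carbon_uptake_1986_2005_PgC_per_yr": (0.0, 3.0),
    "ocean_carbon_uptake_1986_2005_PgC_per_yr": (1.6, 2.3),
}


def is_simulation_consistent(diagnostics, ranges=CMIP5_RANGES):
    """Return True only if every diagnostic lies within its range."""
    return all(lo <= diagnostics[name] <= hi
               for name, (lo, hi) in ranges.items())


# Hypothetical member sitting at the midpoint of every range.
midpoint_member = {name: 0.5 * (lo + hi)
                   for name, (lo, hi) in CMIP5_RANGES.items()}
print(is_simulation_consistent(midpoint_member))

# The same member fails once a single diagnostic leaves its range.
failing_member = dict(midpoint_member,
                      ocean_carbon_uptake_1986_2005_PgC_per_yr=2.5)
print(is_simulation_consistent(failing_member))
```

Applying such a filter to every member of the initial ensemble yields the simulation–consistent subset.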

The simulation–consistent ESM-ensemble is able to reproduce the majority of CMIP5 historical simulation–range parameter space (Table 2) with two exceptions. Firstly, there are slightly reduced ranges in simulated warming relative to the 1961–1990 average, with the efficient ESM unable to reproduce the lowest warming from 1850 or the greatest warming up to 2005 (Table 1). This may reflect the lack of internal variability in the efficient ESM, since the CMIP5-simulated warming ranges reflect both an anthropogenic signal and internal variability but the efficient ESM ensemble reflects only the anthropogenic signal. Secondly, there is a reduced range of simulated ocean heat uptake by the upper 700 m of the ocean from 1971 to 2005 (Table 1), although the entire range of CMIP5-simulated total ocean heat uptake for this period is represented in the efficient ESM ensemble. The reduced range of simulated heat uptake for the upper 700 m of the ocean is likely to be the result of the simplistic box-model representation of the ocean regions in WASP (Fig. 2), relative to the 3D spatial representation of ocean regions in the CMIP5 models.

Table 2 The observational constraints used to assess for observation–consistency and the resulting ranges in the observation–consistent model-ensemble

2.4 Extracting the observationally consistent model ensemble

At year 2012, the initial 4-million ensemble-members are checked against eight observational constraints reflecting anthropogenic surface warming, heat uptake and carbon uptake (Table 2).

The eight separate observational constraints used to assess the ensemble members for observation-consistency (Table 2) are again chosen to cover the climate variables that the efficient ESM can simulate: global mean surface warming, ocean heat uptake and ocean and terrestrial carbon uptake (Fig. 2). However, these constraints represent metrics used to express the historic observations of climate change in the literature (e.g. Ciais et al. 2013; Hartmann et al. 2013; Rhein et al. 2013). The eight historic constraints are:

  1. Global mean Surface Air Temperature (SAT) warming from the 1850–1900 to the 2003–2012 periods is from 0.72 to 0.85 K (Hartmann et al. 2013);

  2. The mean decadal rate of SAT warming from 1951 to 2012 is from 0.09 to 0.14 K decade−1 (Hartmann et al. 2013);

  3. The decadal Sea Surface Temperature (SST) warming rate from 1971 to 2010 is from 0.09 to 0.13 K decade−1 (Rhein et al. 2013; as represented in WASP by the mixed layer ocean box, Fig. 2);

  4. The heat content change of the Earth System from 1971 to 2010 is from 196 to 351 ZJ (Rhein et al. 2013);

  5. The heat content change of the Earth System from 1993 to 2010 is from 127 to 201 ZJ (Rhein et al. 2013);

  6. The heat added to the upper 700 m of the ocean from 1971 to 2010 is from 82 to 154 TW (Rhein et al. 2013; as represented in WASP by the sum of the mixed layer and upper ocean boxes, Fig. 2; see Appendix Table 5);

  7. The residual terrestrial anthropogenic carbon uptake since the preindustrial is from 70 to 250 PgC (Ciais et al. 2013); and

  8. The anthropogenic ocean carbon uptake since the preindustrial is from 125 to 185 PgC (Ciais et al. 2013).

These eight observational constraints represent the estimated 90 % ranges for each quantity (Table 2). Each WASP ensemble member is therefore judged to be observation–consistent if it lies within the estimated 90 % range (Table 2) of at least 7 of the 8 observational constraints, and may miss the 90 % range of the remaining constraint by up to an extra 50 % relative to the best estimate. Allowing an ensemble-member to be classified as observation–consistent while missing the 90 % range for one out of eight observation–constraints provides a mechanism for the tails of the distribution for each observational constraint to be included in the observation–consistent model ensemble.

Of the 4-million initial ensemble members (Figs. 3, 4, black), those judged to be observationally-consistent are extracted to form an observationally-consistent model-ensemble of some 14,500 members (Figs. 3, 4, red; Fig. 5).
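One possible reading of the 7-out-of-8 rule is sketched below in Python. Two assumptions are made loudly here and are not taken from the text: the best estimate of each constraint is approximated by the midpoint of its 90 % range, and "an extra 50 %" is interpreted as widening each half-range about that best estimate by a factor of 1.5. The key names are hypothetical; the ranges are those listed above.

```python
# Observational 90 % ranges quoted in the text; key names are hypothetical.
OBS_RANGES = {
    "sat_warming_K": (0.72, 0.85),
    "sat_rate_K_per_decade": (0.09, 0.14),
    "sst_rate_K_per_decade": (0.09, 0.13),
    "earth_heat_1971_2010_ZJ": (196.0, 351.0),
    "earth_heat_1993_2010_ZJ": (127.0, 201.0),
    "upper700_heat_TW": (82.0, 154.0),
    "land_carbon_PgC": (70.0, 250.0),
    "ocean_carbon_PgC": (125.0, 185.0),
}


def relaxed_range(lo, hi, extra=0.5):
    """Widen a range about its best estimate by the fraction `extra`.

    ASSUMPTION: the best estimate is taken as the range midpoint.
    """
    best = 0.5 * (lo + hi)
    return best - (1.0 + extra) * (best - lo), best + (1.0 + extra) * (hi - best)


def is_observation_consistent(diagnostics, ranges=OBS_RANGES):
    """Pass if all eight ranges are met, or if seven are met and the
    remaining diagnostic lies within its relaxed range."""
    missed = [name for name, (lo, hi) in ranges.items()
              if not lo <= diagnostics[name] <= hi]
    if not missed:
        return True
    if len(missed) > 1:
        return False
    lo, hi = relaxed_range(*ranges[missed[0]])
    return lo <= diagnostics[missed[0]] <= hi


# Hypothetical members for illustration.
midpoint = {name: 0.5 * (lo + hi) for name, (lo, hi) in OBS_RANGES.items()}
near_miss = dict(midpoint, sat_warming_K=0.87)  # inside the relaxed range
far_miss = dict(midpoint, sat_warming_K=0.90)   # outside the relaxed range
print(is_observation_consistent(midpoint),
      is_observation_consistent(near_miss),
      is_observation_consistent(far_miss))
```

A different definition of the best estimate or of the relaxation would change which borderline members are retained, so the sketch should be read as one interpretation rather than the exact selection rule.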

The observation–consistent ESM-ensemble is able to reproduce the majority of the observation–consistent parameter space (Table 2) with two exceptions. Firstly, the observation–constrained ESM-ensemble shows a reduced range for the decadal rate of SAT warming from 1951 to 2012. This reduced range is interpreted here as a result of the simple ESM having no internal decadal temperature variability, which reduces the possible range of temperature change achieved by the simple ESM between two specific years and over short periods of time. Secondly, the simulated Earth System heat content increase shows reduced ranges, missing the lower end of the observational range for the 1971–2010 period but missing the upper end of the observational range for the 1993–2010 period. It is unclear precisely why this discrepancy arises, though it may result from the simplicity of ocean circulation and heat uptake representations in WASP, which comprise a surface mixed layer ocean attached to four sub-surface ocean regions with fixed e-folding tracer equilibration timescales (Fig. 2).

3 Model ensemble results and projections

3.1 Twenty-first century warming projections

For each RCP scenario, the twenty-first century warming projection ranges for the simulation–consistent efficient ESM-ensemble (Fig. 5; Table 1) are similar to the ranges from the CMIP5 ensemble of complex climate models (Fig. 1, compare WASP (simulation–consistent) to CMIP5, and Table 3, compare to IPCC 2013; tables SPM.2 and SPM.3 therein). This suggests that the WASP climate model represents a viable tool to quickly emulate future global mean warming ranges simulated by an ensemble of complex climate models, given their historic performance under anthropogenic forcing. The WASP observation–consistent ensemble (Fig. 5; Table 2) can be considered in this way: an emulation of the projected warming ranges that may be simulated by a complex climate model ensemble, if the historic ranges of the complex climate model ensemble closely matched historic observations (Table 2).

Table 3 Twenty-first century warming projections from the observation–consistent and simulation–consistent WASP model ensembles for the four RCP scenarios

Now consider the twenty-first century warming projections made by the observation–consistent ensemble (Fig. 1a). The observation–consistent ensemble has narrower ranges for future projected warming than the simulation–consistent ensemble (Fig. 1a), reflecting the narrower historical performance range of the observation–consistent ensemble (Table 2, compare observation–consistent to simulation–consistent ensembles). For each RCP scenario, the lower bounds of the projected warming range agree with the CMIP5 ensemble (Fig. 1a, compare WASP (observation–consistent) to CMIP5, and Table 3, compare to IPCC 2013; tables SPM.2 and SPM.3 therein). However, the projected upper bounds for twenty-first century warming are significantly reduced compared to CMIP5 for all RCP scenarios (Fig. 1a). Thus, the upper components of the projected warming ranges made by the simulation–consistent ensemble are not supported by historic observations [Fig. 1a, compare WASP (simulation–consistent) with WASP (observation–consistent)]. The close agreement between the simulation–consistent ensemble and CMIP5 also suggests that further work is required to establish whether the upper bounds of the CMIP5 projections are also not supported by observations (Fig. 1a).

3.2 Twenty-first century compatible carbon emission projections

Projected compatible carbon emission ranges in both the simulation–consistent and observation–consistent ensembles overlap with, but are generally greater than, the CMIP5 ranges (Fig. 1b), and are most similar for the RCP3PD and RCP4.5 scenarios. However, the sizes of the ranges themselves (i.e. the 95th percentile minus the 5th percentile) are similar across all model ensembles. These similarly sized ranges suggest that the variation in compatible carbon emissions between CMIP5 models encapsulates, but does not exceed, the uncertainty implied by carbon cycle observations.

The greater compatible carbon emissions projections made by the efficient ESM compared to CMIP5 models, especially for RCPs 6.0 and 8.5 (Fig. 1b), may be the result of the relatively simple representation of the carbon cycle in WASP (Fig. 2; “Appendix” section). For example, the hydrological cycle is not represented in WASP, but may significantly influence the response of the carbon cycle to emissions (e.g. Alexandrov et al. 2003). Also, there are likely to be regional differences in the responses of the carbon cycle to warming and rising CO2 (e.g. Alexandrov et al. 2003), and such regional differences are not represented in WASP (Fig. 2; “Appendix” section).

3.3 Climate system parameters

The sensitivity of climate to carbon emissions is often characterised by the Equilibrium Climate Sensitivity (ECS, in K) and the Transient Climate Response to Emission [TCRE, in K (1000 PgC)−1] (IPCC 2013).

The Equilibrium Climate Sensitivity represents the equilibrium surface warming for a sustained doubling of atmospheric CO2, and is calculated for the WASP ensemble members by \( {\text{ECS}} = a_{{{\text{CO}}_{2} }} \ln 2/\lambda \). The random-normal input distribution for \( a_{{{\text{CO}}_{2} }} \) (Fig. 3a, black) reflects current uncertainty in the CO2-radiative forcing link (Myhre et al. 2013), while the random-linear input distribution for λ (Fig. 4a, black) assumes λ must lie between 0.1 and 5.0 Wm−2 K−1, but no prior knowledge of relative weighting to particular values within this range is assumed. These input distributions result in the ECS ranging from a minimum of 0.6 K to a maximum of 43 K in the initial 4-million member ensemble.
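The way the two input distributions map onto a broad prior ECS range can be illustrated with a short Monte Carlo sketch. The function name `ecs`, the seed and the sample size are assumptions for this example; the distribution parameters are those stated in the text.

```python
import math
import random


def ecs(a_co2, lam):
    """Equilibrium Climate Sensitivity (K): ECS = a_CO2 * ln(2) / lambda."""
    return a_co2 * math.log(2.0) / lam


rng = random.Random(1)  # fixed seed for reproducibility
# 100,000 draws here for speed; the study uses 4 million.
ecs_samples = [
    ecs(rng.gauss(5.35, 0.27),   # a_CO2: random-normal, W m-2
        rng.uniform(0.1, 5.0))   # lambda: random-linear, W m-2 K-1
    for _ in range(100_000)
]

print(f"ECS for mean a_CO2 and lambda = 1.0: {ecs(5.35, 1.0):.2f} K")
print(f"Sampled ECS range: {min(ecs_samples):.2f} to {max(ecs_samples):.2f} K")
```

Because ECS varies as 1/λ, a uniform prior on λ places most prior ECS values at the low end while leaving a long tail towards very high sensitivities. The extremes quoted in the text come from the tails of the full 4-million member draw, so a smaller sample such as this one will generally span a somewhat narrower range.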

Now consider how the constraints used to select the simulation–consistent ensemble (Table 1) and observation–consistent ensemble (Table 2) restrict this broad range of ECS values. The 14,500 observation–consistent ensemble members have a mean ECS of 2.4 K and a 5th to 95th percentile range of 1.4–4.4 K (Fig. 4b, red), almost identical to the IPCC best estimate range of 1.5–4.5 K (Bindoff et al. 2013). The IPCC best estimate for the ECS is in part based on the same observational constraints (Table 2) and historic forcing estimates (Fig. 3, black; Myhre et al. 2013) used to constrain the observation–consistent model ensemble, and so these two ECS ranges are not independent of one another. However, their close agreement does provide confidence that the WASP model framework is a viable and efficient tool to use observations of the climate system to constrain climate parameters.

For the 182,500 simulation–consistent ensemble-members, the mean ECS is 2.9 K while the 5th to 95th percentile range is 1.4–5.6 K (Fig. 4b, blue). This mean value is comparable to the mean for the CMIP5 ensemble members of 3.2 K (Flato et al. 2013). However, the ECS range is greater than the CMIP5 ensemble range of 2.1–4.7 K (Flato et al. 2013). The greater ECS range in the simulation–consistent ensemble may reflect the far greater number of ensemble-members, covering a greater extent of parameter space. While the historic performance of the simulation–consistent ensemble matches the range simulated by CMIP5 (Table 1), the much greater number of ensemble-members means that more ways of achieving this historic performance range are explored in the simulation–consistent ensemble.

The Transient Climate Response to Emission (TCRE) represents the anthropogenic surface warming per unit cumulative carbon emitted since the preindustrial period at a specified point in time. The 5th–95th percentile range for the TCRE, evaluated at year 2100 for RCP8.5, is 1.1–2.7 K (1000 PgC)−1 for the simulation–consistent ensemble and 1.0–1.6 K (1000 PgC)−1 for the observation–consistent ensemble. These ranges are consistent with the IPCC best estimate of between 0.8 and 2.5 K (1000 PgC)−1 for the late twenty-first century (IPCC 2013), although the observation–consistent ensemble range is narrower and concentrated towards the lower end of the IPCC estimate.
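The TCRE definition amounts to a single ratio, scaled to the units K (1000 PgC)−1. The short sketch below shows the arithmetic; the function name and the input values are hypothetical and are not results from the study.

```python
def tcre(warming_K, cumulative_emissions_PgC):
    """TCRE in K per 1000 PgC: surface warming per unit cumulative emission."""
    return 1000.0 * warming_K / cumulative_emissions_PgC


# Hypothetical example: 3.0 K of warming after 2000 PgC of cumulative emissions.
example_tcre = tcre(warming_K=3.0, cumulative_emissions_PgC=2000.0)
print(f"TCRE = {example_tcre:.1f} K per 1000 PgC")
```

Evaluating this ratio for every ensemble member at year 2100 and taking the 5th and 95th percentiles gives ranges of the kind quoted above.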

The observation–consistent and simulation–consistent ensemble ranges for the equilibrium warming ratios r SST:SAT (Fig. 4d) and r sub:SST (Fig. 4e) encompass values consistent with the ranges of these ratios in more complex 3D models following idealised carbon emissions previously analysed by Williams et al. (2012; see Fig. 3 therein). Other properties, such as the efficacy of ocean heat uptake (Fig. 4c), the fraction of total Earth System heat uptake by the ocean (Fig. 4f), and other parameters (Fig. 4h–o), have frequency distributions in the observation–consistent and simulation–consistent ensembles (Fig. 4, red and blue) that are similar to the input distribution (Fig. 4, black), implying that these properties are not well constrained by the historical tests used (Tables 1, 2).

4 Discussion

A new Earth System Model was presented for efficient projection of the global mean impacts of carbon emissions: the Warming Acidification and Sea-level Projector (WASP: Fig. 2; “Appendix” section). This new model was then used to construct an initial 4-million member Monte Carlo ensemble with randomly varied parameter values (Figs. 3, 4, black), which was forced with historic CO2 concentrations and future RCP scenarios to year 2100 (Meinshausen et al. 2011).

Eight historic constraints, representing the historic warming, heat and carbon ranges simulated by the CMIP5 ensemble (Table 1), were used to extract a simulation–consistent model ensemble of 182,500 members (Fig. 5; Figs. 3, 4, blue). This simulation–consistent model ensemble was then used to make projections of future warming and compatible carbon emissions for four RCP scenarios (Fig. 1; Tables 3, 4). These projections are in good agreement with the projection ranges from the complex CMIP5 model ensemble (Fig. 1, and compare Tables 3, 4 to IPCC 2013; tables SPM.2 and SPM.3 therein). It should be noted, however, that the WASP model does not contain internal climate variability, whereas the CMIP5 models do. This lack of internal climate variability in WASP constitutes an important caveat when comparing both the historic performance and future projections of CMIP5 to the simulation–consistent model ensemble.
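The extraction step can be sketched as a simple filter: an ensemble member is retained only if every one of its historic diagnostics falls within the corresponding constraint range. The sketch below assumes a plain "inside the range" acceptance rule, and the diagnostic names, ranges and member values are hypothetical placeholders rather than the entries of Table 1.

```python
def is_consistent(member_diagnostics, constraints):
    """True if every constrained diagnostic of a member lies within its (low, high) range."""
    return all(
        low <= member_diagnostics[name] <= high
        for name, (low, high) in constraints.items()
    )


def extract_ensemble(members, constraints):
    """Keep only the members that satisfy all constraints simultaneously."""
    return [member for member in members if is_consistent(member, constraints)]


# Hypothetical constraints (two shown for brevity; the study applies eight).
example_constraints = {
    "historic_warming_k": (0.6, 1.0),
    "ocean_heat_uptake_zj": (150.0, 350.0),
}

# Hypothetical diagnostics from three randomly parameterised members.
example_members = [
    {"id": 0, "historic_warming_k": 0.5, "ocean_heat_uptake_zj": 200.0},
    {"id": 1, "historic_warming_k": 0.8, "ocean_heat_uptake_zj": 250.0},
    {"id": 2, "historic_warming_k": 0.9, "ocean_heat_uptake_zj": 400.0},
]

accepted = extract_ensemble(example_members, example_constraints)
print([member["id"] for member in accepted])
```

Requiring all constraints to hold at once is what makes the accepted ensemble so much smaller than the initial one: a member that matches historic warming but not heat uptake is still rejected.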

Table 4 Twenty-first century compatible carbon emission projections from the observation–consistent and simulation–consistent WASP model ensembles for the four RCP scenarios

Starting again from the initial 4-million member ensemble, another eight constraints, this time representing observational ranges for historic warming, heat and carbon (Table 2), were used to extract an observation–consistent model ensemble of 14,500 members (Fig. 5; Figs. 3, 4, red). This observation–consistent model ensemble was then used to make twenty-first century projections of warming and compatible carbon emissions for the four RCP scenarios (Fig. 1; Tables 3, 4). The lower bounds of the projected twenty-first century warming for the observation–consistent ensemble are consistent with both the CMIP5 and simulation–consistent ranges for each RCP scenario (Fig. 1a). However, the upper bounds of projected warming are reduced by 20–35 %. This reduction shows that the upper bounds of projected warming in the simulation–consistent ensemble are not supported by historic observations.

Only 4.5 % of the 182,500 simulation–consistent ensemble members are also consistent with observations (Fig. 5). This implies that a large part of parameter space consistent with historic CMIP5 performance (Table 1) is not observation–consistent. Some 43 % of the 14,500 observation–consistent ensemble members (Fig. 5) lie outside the simulation–consistent ensemble (Table 1). Therefore, significant areas of observation–consistent parameter space are not currently contained within the historic CMIP5 performance-range.
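The two percentages above describe the same overlap from opposite sides, so they can be cross-checked with simple arithmetic. The short sketch below uses only the rounded figures quoted in the text; the variable names are illustrative, and the small gap between the two estimates is what one would expect from rounding of the percentages and ensemble sizes.

```python
n_simulation_consistent = 182_500
n_observation_consistent = 14_500

# Members in both ensembles, estimated from the simulation-consistent side (4.5 %).
overlap_from_simulation_side = 0.045 * n_simulation_consistent

# Members in both ensembles, estimated from the observation-consistent side
# (43 % lie outside the simulation-consistent ensemble, so 57 % lie inside it).
overlap_from_observation_side = (1.0 - 0.43) * n_observation_consistent

relative_gap = abs(overlap_from_simulation_side - overlap_from_observation_side) / max(
    overlap_from_simulation_side, overlap_from_observation_side
)

print(round(overlap_from_simulation_side), round(overlap_from_observation_side))
print(f"relative gap between the two estimates: {relative_gap:.1%}")
```

Both estimates give an overlap of roughly 8,200 to 8,300 members, agreeing to within about 1 %.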

The Equilibrium Climate Sensitivity range in the observation–consistent ensemble, of 1.4–4.4 K, is comparable to the IPCC estimate of 1.5–4.5 K (Bindoff et al. 2013). Also, the ranges of the Transient Climate Response to Emission, TCRE, in the observation–consistent and simulation–consistent model ensembles are comparable to previous estimates based on models (Collins et al. 2013; Gillet et al. 2013; Matthews et al. 2009; Zickfield et al. 2009, 2012) and observationally constrained theory (Goodwin et al. 2015). These agreements for the ECS, TCRE and projected warming ranges imply that the WASP model presented here (Fig. 2; “Appendix” section) is a useful tool, within a model-hierarchy, for quickly constraining climate parameter estimates and making future projections based on historic observations and historic model performance.

Small discrepancies exist between the range of historic climate change simulated by models and that observed in the real climate system, and their possible impact on future climate projections is discussed in Assessment Report 5 (Flato et al. 2013) and elsewhere (e.g. England et al. 2013; Song et al. 2014). Dangerous climate change is often defined as global mean surface warming of 2 K or more above preindustrial, although other climate targets should also be considered (Steinacher et al. 2013). The 2 K threshold equates to around an additional 1.4 K above the warming already achieved by the 1986–2005 period (IPCC 2013). This dangerous climate change limit is at the lower bounds of twenty-first century projections for RCP4.5 and RCP6.0 (Fig. 1; Collins et al. 2013), and is considerably below the lower bound of warming projected for RCP8.5. Here, close agreement is found for the lower bounds of twenty-first century warming projected by the CMIP5 ensemble and the observation–consistent ensemble (Fig. 1). This lower bound agreement implies that the projected likelihood of crossing the defined threshold for dangerous climate change is not significantly affected by small CMIP5-observation discrepancies in the historic period. Indeed, the observation–consistent ensemble presented here is in agreement with the CMIP5 ensemble in suggesting that, to stand a good chance of avoiding dangerous climate change by the end of the century, global emissions should be controlled to follow a path below RCP4.5, for example towards RCP3PD emission levels (Fig. 1).