1 Introduction

During the past century the Earth has experienced a considerable warming. Although most of the warming has been attributed to the burning of fossil fuels, the division between natural and anthropogenic components remains uncertain. Most climate research has been centered on the use of AOGCMs (atmosphere–ocean general circulation models) to simulate the climate system using fundamental physical, chemical and biological processes. At the same time semi-empirical statistical models (Otterman et al. 2002; Lean and Rind 2008; Humlum et al. 2011; Scafetta 2012; Mascioli et al. 2012; Zhou and Tung 2013; Canty et al. 2013; Chylek et al. 2014a, b) have been developed to provide additional insight into the anthropogenic and natural components of climate variability, and to point out components of the climate system that are not yet properly captured by the AOGCMs.

Multiple linear regression analysis (e.g. Wilks 2006) has been used recently (Lean and Rind 2008; Foster and Rahmstorf 2011; Zhou and Tung 2013; Canty et al. 2013; Chylek et al. 2014a, b; Miksovsky et al. 2015) to estimate the relative importance of the natural and anthropogenic components of the past warming. The method assumes a linear relation between the observed temperature and a set of selected physically plausible explanatory variables or predictors. A typical set of explanatory variables contains the known radiative forcings and additional factors characterizing the oceanic influence on climate (Compo and Sardeshmukh 2009; Zhou and Tung 2013; Canty et al. 2013; Chylek et al. 2014a, b; Miksovsky et al. 2015).

The natural radiative forcing includes the top of the atmosphere solar variability (SOL) and volcanic aerosols (VOLC) (Douglass and Clader 2002; Haigh 2003; Scafetta and West 2006; Camp and Tung 2007; Lean and Rind 2008). The anthropogenic component includes anthropogenic well mixed greenhouse gases (GHG) and anthropogenic aerosols (AER). The oceanic influences are usually characterized by the El Niño Southern Oscillation (ENSO) index (Lean and Rind 2008; Foster and Rahmstorf 2011). However, the Atlantic Multi-decadal Oscillation (AMO) (Schlesinger and Ramankutty 1994; Delworth and Mann 2000; Gray et al. 2004; Canty et al. 2013) and the Pacific Decadal Oscillation (PDO) have also exerted a considerable influence on the past century global and regional climate (Polyakov and Johnson 2000; Zhang et al. 2007; Mahajan et al. 2011; Frankcombe and Dijkstra 2011; Zhou and Tung 2013; Canty et al. 2013; Muller et al. 2013; Chylek et al. 2014a, b; Li et al. 2014).

In this note we compare simulations of the past and projections of the future mean global warming by 42 CMIP5 climate models and the statistical regression model. The projections are all based on the assumption of a future moderate increase in anthropogenic greenhouse gas radiative forcing such that it will plateau before 2100 with a magnitude of 4.5 W/m2 above its pre-industrial value. This plausible future trajectory of radiative forcing is described by the IPCC as a Representative Concentration Pathway RCP4.5. We also show that the relative contribution of the natural climate variability (characterized by the AMO) to the anticipated global warming will likely increase during the second half of the current century due to the expected levelling off of the anthropogenic forcing under the RCP4.5 scenario.

2 Data

In our analysis we use the global mean annual temperatures as compiled by the NASA GISS (Goddard Institute for Space Studies) available at http://data.giss.nasa.gov/gistemp/. The NASA GISS procedure includes filling the regions without observations and smoothing the data using correlations with stations up to 1200 km radius (Hansen and Lebedeff 1987; Hansen et al. 2010). This may prevent an underestimation of temperature variability from unobserved parts of the globe—especially the Arctic—which might have occurred in other temperature data sets. However, since the highest rate of anthropogenic warming is not necessarily limited to the Arctic (Lean and Rind 2008) the procedure may also overestimate the warming. To demonstrate the range of uncertainty we consider also the NASA GISS temperature data set constructed with a smaller (250 km) smoothing radius.

The historic radiative forcing used in our analysis is shown in Fig. 1. The forcing by anthropogenic greenhouse gases (GHG) and aerosols (AER), SOL, and VOLC are from IPCC (2013). The annual ENSO index is obtained by averaging monthly data from the NOAA website http://www.esrl.noaa.gov/psd/data/correlation/censo.long.data, and the PDO is from the website http://jisao.washington.edu/pdo/PDO.latest. Since we are interested in decadal scale variability we use an annually averaged ENSO index, rather than the monthly data used in earlier studies (e.g. Lean and Rind 2008; Foster and Rahmstorf 2011; Canty et al. 2013).
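The annual averaging of a monthly index described above can be sketched as follows. This is only a minimal illustration, assuming the monthly series starts in January and dropping any incomplete final year; the function name `annual_means` and the synthetic input are hypothetical and stand in for the NOAA data.

```python
import numpy as np

def annual_means(monthly, start_year):
    """Average a monthly index into calendar-year means.

    `monthly` is a 1-D sequence assumed to start in January of
    `start_year`; a trailing incomplete year is dropped.
    """
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.size // 12
    table = monthly[: n_years * 12].reshape(n_years, 12)  # one row per year
    years = np.arange(start_year, start_year + n_years)
    return years, table.mean(axis=1)

# Synthetic stand-in for a monthly ENSO index (not real data):
# five full years plus three extra months that are discarded.
rng = np.random.default_rng(0)
monthly_enso = rng.normal(size=12 * 5 + 3)
years, enso_annual = annual_means(monthly_enso, 1900)
```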

Fig. 1

a Radiative forcing due to carbon dioxide (CO2), all well mixed greenhouse gases (GHG), anthropogenic aerosols (AER), combined CO2, GHG, and aerosols (CO2GHGAER), and volcanic aerosol (VOLC) after IPCC (2013). b Solar radiative forcing after IPCC (2013). c Three of the considered versions of the AMO indices after NOAA (AMO_K), van Oldenborgh (AMO_O), and Trenberth and Shea (AMO_TS). d The ENSO and PDO indices from sources described in the text

There are several different AMO indices (Fig. 1c) resulting from different de-trending of the North Atlantic temperature data (Kaplan et al. 1998; Trenberth and Shea 2006; van Oldenborgh et al. 2009), or from principal component analysis (Parker et al. 2007). Without trying to resolve the differences of opinion or to argue for one specific index, we use either the average of these most often used AMO indices, or all of them separately (when we need to show the dependence of the climate variable of interest on the selected form of the AMO index).

As previously stated, for future climate change projections up to the year 2100 we use the total radiative forcing (CO2, CH4, O3, N2O, halocarbons, and anthropogenic aerosols) prescribed by the IPCC (2013) for the RCP4.5 scenario. Under this scenario the radiative forcing in the year 2100 is approximately 4.5 W/m2 above its pre-industrial value.

3 Aerosol and GHG radiative forcing

Multiple regression analysis can lead to results with a larger uncertainty of regression coefficients when explanatory variables are significantly correlated (e.g. Wilks 2006). Since the degree of association between the GHGs and anthropogenic aerosol (AER) forcing is generally high, some of the previous regression analyses (e.g. Lean and Rind 2008; Chylek et al. 2014a) combined the GHGs and aerosol forcing into one effective forcing (GHGA), represented by the sum of the GHGs and aerosol radiative forcing. We follow the same procedure. We use the notation GHGA for the combined effect of GHG and tropospheric aerosols.

The large temporal and spatial variability of atmospheric aerosols makes global measurements of the aerosol optical depth and composition difficult. Although the direct aerosol effect (aerosol interaction with solar and terrestrial radiation) is reasonably well understood, the aerosol indirect effect (e.g. Chylek et al. 2006) (interaction affecting the cloud micro-structure, cloud albedo and cloud life cycle) remains a major source of uncertainty. The aerosol treatment in climate models varies, with some of the Coupled Models Inter-comparison Project phase 5 (CMIP5) models restricting their treatment to a direct aerosol effect; others attempt to include indirect effects as well (e.g. Wilcox et al. 2013; Flato et al. 2013), while some models do not consider an aerosol effect at all. In the following regression analysis we consider aerosol radiative forcing including an indirect effect as prescribed by the IPCC (2013), and we classify the CMIP5 models according to their aerosol treatment following the description in Table 9.1 of Flato et al. (2013).

In addition to the GHG and AER there are pairs of other potential explanatory variables (Table 1) that are also significantly correlated (e.g. GHGA and SOL). Whenever any of these pairs appears among the set of explanatory variables, their regression coefficients carry a considerable uncertainty. Although the collinearity affects the interpretability of the individual regression coefficients, it does not affect the predictive ability of the regression model as a whole.
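A check for correlated predictors of the kind discussed above can be sketched as follows. The sketch forms the combined forcing as a plain sum and lists predictor pairs whose Pearson correlation exceeds a threshold. The threshold of 0.5, the function names, and the synthetic series are illustrative assumptions, not values from the analysis.

```python
import numpy as np

def correlated_pairs(predictors, threshold=0.5):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold.

    `predictors` maps a name to an annual time series; the threshold
    is an arbitrary illustrative choice.
    """
    names = list(predictors)
    data = np.vstack([np.asarray(predictors[n], dtype=float) for n in names])
    corr = np.corrcoef(data)  # rows are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic forcings (not IPCC data): aerosol forcing is built to
# track the GHG forcing with opposite sign.
years = np.arange(1900, 2016)
rng = np.random.default_rng(1)
ghg = 0.02 * (years - 1900)
aer = -0.4 * ghg + rng.normal(scale=0.02, size=years.size)
sol = rng.normal(scale=0.1, size=years.size)

ghga = ghg + aer  # combined effective forcing, as a simple sum
pairs = correlated_pairs({"GHG": ghg, "AER": aer, "SOL": sol})
```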

Table 1 Correlations between explanatory variables considered

4 Regression analysis of the 1900–2015 annual mean global temperature

Using the set of conventional explanatory variables (e.g. Lean and Rind 2008) GHGA, SOL, VOLC, and ENSO, we expand the annual mean global temperature as

$$T(t) = A_{0} + A_{1}\,\mathrm{GHGA}(t) + A_{2}\,\mathrm{SOL}(t) + A_{3}\,\mathrm{VOLC}(t) + A_{4}\,\mathrm{ENSO}(t)$$

where GHGA(t), SOL(t), and VOLC(t) are respectively the prescribed 1900–2015 annual radiative forcing due to anthropogenic GHG and aerosols, the variability of solar radiation at the top of the atmosphere, and volcanic aerosols. ENSO(t) is a time series of the annual ENSO index. The regression coefficients A0 to A4 are determined by a least squares fit. When only some of the explanatory variables are considered for a particular regression model, the coefficients of those variables not considered are set to zero. Additional terms are added when the AMO and PDO are considered as potential predictors. The fraction of the dependent variable variance accounted for by the regression is given by R2, the square of the multiple correlation coefficient.
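The least squares fit and the R2 statistic can be illustrated with a short sketch. It is not the code used for the results reported here: the function name `fit_regression` is hypothetical, the input series are synthetic, and the "true" coefficients are arbitrary values chosen so that the fit can be checked. Setting a coefficient to zero is implemented simply by leaving that predictor out of the list.

```python
import numpy as np

def fit_regression(temperature, predictors):
    """Ordinary least squares fit of T(t) = A0 + sum_k A_k X_k(t).

    Returns the coefficient vector (A0 first) and R^2, the fraction
    of the temperature variance accounted for by the fit.
    """
    y = np.asarray(temperature, dtype=float)
    columns = [np.ones_like(y)] + [np.asarray(p, dtype=float) for p in predictors]
    X = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coeffs
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot

# Synthetic predictors and temperature (illustrative values only).
rng = np.random.default_rng(42)
n = 116                                   # years 1900-2015
t = np.arange(n)
ghga = np.linspace(0.0, 2.5, n)
sol = 0.1 * np.sin(2 * np.pi * t / 11.0)
volc = -np.abs(rng.normal(scale=0.3, size=n))
enso = rng.normal(size=n)
temp = (0.4 * ghga + 0.5 * sol + 0.1 * volc + 0.05 * enso
        + rng.normal(scale=0.01, size=n))

coeffs, r2 = fit_regression(temp, [ghga, sol, volc, enso])
```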

The regression model using the usual explanatory variables (GHGA, SOL, VOLC, and ENSO) accounts for 93.68 % of the observed 1900–2015 temperature variance (Table 2). The residual (difference between the observed and modeled temperature) of this model (model #1 in Table 2) was found to be correlated with the AMO (Chylek et al. 2014a), suggesting the AMO as an additional potential explanatory variable, in agreement with previously reported results (e.g. Canty et al. 2013; Zhou and Tung 2013). The use of the PDO as a potential explanatory variable has also been explored. When both the AMO and PDO are added to the set of explanatory variables, the fraction of temperature variance accounted for increases to 95.68 %. However, the VOLC, ENSO and PDO are found not to be statistically significant predictors (model #2 in Table 2). In order to eliminate non-essential predictors we use a backward selection algorithm with the p value as a selection parameter; at each step the statistically non-significant predictor with the highest p value is eliminated. Although the PDO may affect the AMO behavior (Guan and Nigam 2009; Chylek et al. 2014a), the PDO itself (p value = 0.72) is not a significant predictor in the context of our set of explanatory variables, as found earlier also by Canty et al. (2013). A compromise between complexity and accuracy leads to a parsimonious model with just three statistically significant explanatory variables, namely GHGA, SOL, and AMO, which still accounts for 95.51 % of the observed (1900–2015) temperature variance (Table 2). We shall use this three predictor regression model (GHGA, SOL and AMO) below to estimate future warming and to compare the regression model’s predicted future warming with the warming projected by the CMIP5 models.
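A backward selection of this kind can be sketched as below. Several choices in the sketch are assumptions rather than details taken from the analysis: the significance threshold `alpha = 0.05`, the large-sample normal approximation used for the two-sided p values (a Student t distribution would normally be used), and the neglect of serial correlation in the residuals. The function names and the synthetic data are hypothetical.

```python
import math
import numpy as np

def ols_p_values(y, X):
    """OLS coefficients and approximate two-sided p values.

    p values use a normal approximation to the t distribution,
    which is an assumption adequate only for large samples.
    """
    n, k = X.shape
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_stats = coeffs / np.sqrt(np.diag(cov))
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t_stats])
    return coeffs, p

def backward_select(y, predictors, alpha=0.05):
    """Drop the least significant predictor until all have p <= alpha."""
    y = np.asarray(y, dtype=float)
    kept = list(predictors)
    while kept:
        cols = [np.ones_like(y)] + [np.asarray(predictors[n], dtype=float) for n in kept]
        _, p = ols_p_values(y, np.column_stack(cols))
        p_pred = p[1:]                      # ignore the intercept
        worst = int(np.argmax(p_pred))
        if p_pred[worst] <= alpha:
            break
        kept.pop(worst)
    return kept

# Synthetic example: temperature depends on GHGA, SOL and AMO only.
rng = np.random.default_rng(7)
n = 116
t = np.arange(n)
predictors = {
    "GHGA": np.linspace(0.0, 2.5, n),
    "SOL": 0.1 * np.sin(2 * np.pi * t / 11.0),
    "AMO": 0.2 * np.sin(2 * np.pi * t / 65.0),
    "PDO": rng.normal(size=n),
}
temp = (0.4 * predictors["GHGA"] + 0.5 * predictors["SOL"]
        + 0.5 * predictors["AMO"] + rng.normal(scale=0.02, size=n))
kept = backward_select(temp, predictors)
```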

Table 2 Linear regression models considered

5 The AMO as an explanatory variable

The origin of the Atlantic Multi-decadal Oscillation is not yet fully understood (Dima and Lohmann 2007). Some studies (e.g. Mahajan et al. 2011; Delworth and Zeng 2012; Zhang et al. 2013) suggest a connection to the Atlantic Meridional Overturning Circulation (AMOC), while others point to the upwelling of warm water along the Antarctic Circumpolar Current (Toggweiler and Russell 2008). Cycles of about 20 and 60–70 years have been identified in tree ring and ice core records (Delworth and Mann 2000; Gray et al. 2004; Frankcombe and Dijkstra 2010; Chylek et al. 2011).

Since the AMO index is derived from the North Atlantic sea surface temperature, it represents not only the basic multi-decadal variability connected possibly to the AMOC, but also the imprints of other processes affecting the North Atlantic temperature (e.g. solar variability, volcanic aerosol, cloudiness). The direct warming effect due to increasing atmospheric concentration of greenhouse gases is assumed to be removed from the AMO index by different de-trending methods leading to slightly different AMO indices (Kaplan et al. 1998; Trenberth and Shea 2006; van Oldenborgh et al. 2009).

Although the twentieth century AMO-like cycle has been mimicked by an aerosol effect in some of the CMIP5 models (Booth et al. 2012; Wilcox et al. 2013; Zhang et al. 2013; Chylek et al. 2014b), analyses of the central England temperature (Tung and Zhou 2013), tree rings (Delworth and Mann 2000; Gray et al. 2004) and ice core data (Chylek et al. 2011) suggest that the AMO cyclic behavior existed for hundreds and possibly thousands of years before the beginning of anthropogenic influences. Thus the addition of the AMO to the set of predictors seems to be a justifiable choice (Mascioli et al. 2012; Zhou and Tung 2013; Canty et al. 2013; Kavvada et al. 2013; Muller et al. 2013; Miksovsky et al. 2015).

6 CMIP5 model simulations of the past (1900–2015) global warming

The global mean temperature as simulated by 42 individual CMIP5 models varies between 12 and 15 °C in 1860, and increases under the RCP4.5 forcing to 14–18 °C in 2100. The ensemble mean of the CMIP5 simulations increases by 2.6 °C, from 13.6 to 16.2 °C (Fig. 2).

Fig. 2

Simulation of the global mean temperature (1860–2100) by 42 CMIP5 models with the prescribed historic and RCP4.5 radiative forcing. The thick black line (CMIP5_Ave) is the ensemble mean of all CMIP5 RCP4.5 simulations. The thick red line (T1200) is the observed global mean temperature variability after NASA GISS with 1200 km smoothing, normalized to the CMIP5 mean in the year 1900

The temperature increase between the years 1900 (average of 1900–1910) and 2015 (average of 2005–2015) simulated by individual models (Fig. 3a) under the RCP4.5 scenario varies between 0.58 and 1.70 °C, with a CMIP5 model mean increase of 1.05 °C. The observed global warming (1900–2015) is 0.95 °C with the GISS smoothing radius of 1200 km, or 0.86 °C with the 250 km smoothing radius (red columns in Fig. 3a). Of the 42 CMIP5 models, 28 simulate the 1900–2015 temperature increase within ±0.25 °C of the observed range, while 13 models show a higher warming and one model a lower warming than that. This represents a partial success of the first-principles, physics-based CMIP5 climate models in reproducing the observed past (1900–2015) warming. The three predictor (GHGA, SOL, and AMO) regression model (blue column in Fig. 3a) reproduces the observed NASA GISS T1200 mean global temperature increase quite accurately (0.96 °C compared to the observed value of 0.95 °C). The reproduction of the past is of course no guarantee of reliable future projections.
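The warming metric used here, a difference between two period means, and the sorting of models relative to the observed range can be sketched as follows. The sketch treats the periods as inclusive of both end years and reads "within ±0.25 °C of the observed range" as the interval from 0.86 − 0.25 to 0.95 + 0.25 °C; both readings are assumptions. The model names and the linear test series are hypothetical.

```python
import numpy as np

def period_mean(years, values, start, end):
    """Mean of `values` over the years start..end (inclusive)."""
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    mask = (years >= start) & (years <= end)
    return values[mask].mean()

def warming(years, values, early=(1900, 1910), late=(2005, 2015)):
    """Difference between the late and early period means."""
    return period_mean(years, values, *late) - period_mean(years, values, *early)

def classify(model_warming, observed_range, tol=0.25):
    """Split models into within / higher / lower than the observed range +- tol."""
    lo = min(observed_range) - tol
    hi = max(observed_range) + tol
    within = [m for m, w in model_warming.items() if lo <= w <= hi]
    higher = [m for m, w in model_warming.items() if w > hi]
    lower = [m for m, w in model_warming.items() if w < lo]
    return within, higher, lower

# A synthetic series warming at 0.01 degC per year.
years = np.arange(1860, 2101)
series = 0.01 * (years - 1860)
dT = warming(years, series)

# Hypothetical models with illustrative warming values (degC).
models = {"model_A": 1.05, "model_B": 1.70, "model_C": 0.58}
within, higher, lower = classify(models, observed_range=(0.86, 0.95))
```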

Fig. 3

a The 1900–2015 warming (defined as a difference between the decadal means of 1900–1910 and 2005–2015) simulated by individual CMIP5 models and their average (red column CMIP5_Ave), together with the observed temperature increase (black columns for T1200 and T250), and the warming reproduced by the regression model of the T1200 (blue column Reg1200). b The projected 2015–2100 warming (defined as the difference between decadal averages of 2090–2100 and 2005–2015) by individual CMIP5 models; the CMIP5 models’ mean (red column CMIP5_Ave); and the warming predicted by the regression model (blue column Reg1200) of the T1200 temperature

7 CMIP5 model projections of the future 2015–2100 mean global warming under the RCP4.5 scenario

As noted in the Introduction, the IPCC (2013) prescribes several pathways to the year 2100 (Representative Concentration Pathways, RCP), each specified by the assumed increase of anthropogenic radiative forcing in the year 2100 relative to its pre-industrial value. The RCP4.5, RCP6.0 and RCP8.5 pathways describe respectively radiative forcing increases of about 4.5, 6.0, and 8.5 W/m2. Most simulations done by the CMIP5 models assume RCP4.5 radiative forcing, which we consider in the following analysis. The radiative forcing equivalent to the doubling of CO2 (about 3.7 W/m2) is reached within the decade of the 2060s under RCP4.5.

The correlation coefficient for the period 2015–2100 between the radiative forcing and the CMIP5 ensemble mean projected warming is extremely high: 0.99 for the RCP4.5 scenario. Thus the ensemble mean of all CMIP5 temperature simulations is, to a very good approximation, a linear function of the applied forcing. The response of individual models, however, varies depending on the parameterization of the model’s physical processes. For the RCP4.5 scenario the individual models’ projected 2015–2100 warming varies between 0.71 and 2.22 °C (Fig. 3b), with a mean of all CMIP5 RCP4.5 simulations of 1.47 °C.

To project the future temperature using the regression model we need, in addition to the RCP prescribed radiative forcing, an estimate of the future natural variability represented by the AMO and solar variability (SOL). For the AMO we assume a cyclic behavior that repeats the twentieth century AMO cycle. Similarly the solar variability is assumed to repeat its past behavior.
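One simple way to realize such a cyclic extension is to repeat the most recent full cycle of the index until the target year. The sketch below does this under stated assumptions: the cycle length of 65 years is a hypothetical value chosen for illustration, the input is a synthetic sine wave rather than an observed AMO index, and the function name `extend_cyclic` is not from the original analysis.

```python
import numpy as np

def extend_cyclic(years, values, period, end_year):
    """Extend an annual series to `end_year` by repeating its last cycle.

    `period` is the assumed cycle length in years; each added value
    equals the value one or more whole periods earlier.
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    last_cycle = values[-period:]
    n_extra = int(end_year - years[-1])
    extra_years = np.arange(years[-1] + 1, end_year + 1)
    extra_values = last_cycle[np.arange(n_extra) % period]
    return (np.concatenate([years, extra_years]),
            np.concatenate([values, extra_values]))

# Synthetic AMO-like index for 1900-2015 (not an observed index).
years = np.arange(1900, 2016)
amo = 0.2 * np.sin(2 * np.pi * (years - 1900) / 65.0)
ext_years, ext_amo = extend_cyclic(years, amo, period=65, end_year=2100)
```

The same helper could be applied to the solar forcing with a different assumed period.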

Comparing the CMIP5 climate models and the statistical regression model (Fig. 3), we have nine CMIP5 models that simulate both the historic global mean temperature increase (1900–2015) and the projected 2015–2100 warming within ±0.25 °C of that obtained by the regression model (also within ±0.25 °C of the observed 1900–2015 warming). Thus we consider these CMIP5 models and the three predictor statistical regression model to be in broad agreement. On the other hand there are seven CMIP5 models that project the future 2015–2100 warming to be at least 1.0 °C higher than that suggested by the regression model and more than 0.5 °C above the mean of all the CMIP5 models (Fig. 3b). We note further that all seven of these CMIP5 models with the highest projected future warming incorporate a fully interactive aerosol-cloud interaction (an indirect aerosol effect). As has been pointed out earlier in the case of Arctic warming (Chylek et al. 2016), the CMIP5 models classified as having fully interactive aerosols (Flato et al. 2013, their Table 9.1) generally project a higher warming than the rest of the models.

Concerning the total 1900–2100 warming, we have eight CMIP5 models and the regression model that project the total 1900–2100 warming (Fig. 4) to stay below 2 °C under the RCP4.5 scenario. The mean warming (2.8 °C) of models with fully interactive aerosols is statistically significantly different (p = 0.001 for the Welch two sample t test) from the mean (2.3 °C) of models without it. Thus the aerosol indirect effect, as it is presently represented in climate models, leads to a higher projected global warming than in models without an indirect effect. In this regard it should be recalled that the aerosol indirect effect is regarded presently as a relatively poorly understood and simulated phenomenon.
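The Welch two sample t test compares two group means without assuming equal variances. A minimal sketch of its test statistic and the Welch–Satterthwaite degrees of freedom is given below. The two input lists are made-up numbers for illustration only and are not the CMIP5 model results; the p value would then follow from a Student t distribution with the returned degrees of freedom, which is left to a statistics library.

```python
import math

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)  # sample variances
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    se2 = var_a / na + var_b / nb
    t = (mean_a - mean_b) / math.sqrt(se2)
    df = se2 ** 2 / ((var_a / na) ** 2 / (na - 1) + (var_b / nb) ** 2 / (nb - 1))
    return t, df

# Illustrative warming values in degC for two hypothetical groups.
with_interactive = [2.5, 2.9, 3.1, 2.7]
without_interactive = [2.0, 2.4, 2.2, 2.6, 2.3]
t_stat, dof = welch_t(with_interactive, without_interactive)
```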

Fig. 4

The mean global warming between 1900 and 2100 as projected by 42 CMIP5 models and the regression model under the RCP4.5 scenario. Black columns designate models with a fully interactive aerosol (Flato et al. 2013; Table 9.1) and gray columns designate models without it. The red column is the mean warming of the CMIP5 models and the blue column is the global warming according to the regression model. Eight CMIP5 models and the regression model project the total 1900–2100 warming to stay below 2 °C under the RCP4.5 scenario. The mean warming of models with a fully interactive aerosol effect (2.8 °C) is statistically significantly different (p = 0.001 for the Welch two sample t test) from the mean (2.3 °C) of models without it

The choice of different forms of the AMO index in the regression model has little effect on the regression model total temperature (Fig. 5). However, the fraction of the AMO contribution (Table 3) to global warming increases significantly (by about a factor of two) in the second half of the twenty-first century (Fig. 5) due to the levelling off of the anthropogenic radiative forcing according to the RCP4.5 scenario (IPCC 2013).

Fig. 5

a Assumed form of a cyclic AMO extended to 2100. b Simulated past and projected future temperatures by regression models with different AMO indices [after NOAA-Kaplan (T_K), van Oldenborgh (T_O), Trenberth and Shea (T_TS), and Parker (AMO_P)] are close to each other. Also shown are the temperature projected by the ensemble mean of all CMIP5 RCP4.5 simulations (CMIP5, black dashed line) and the temperature projected by the CMIP5 models with the highest (GFDL-CM3, blue dashed line) and the lowest (GFDL-ESM2G, orange dashed line) projected warming. The observed NASA GISS temperature (T1200) is shown in red. All temperature anomalies are 5 year moving averages normalized to a zero 1900–1910 mean. c Individual contributions of the anthropogenic greenhouse gases and aerosols (GHGA), solar variability (SOL), and the average AMO index towards the 1900–2015 warming and d towards the 2015–2100 warming under the RCP4.5 scenario. Due to the levelling off of the GHGA radiative forcing during the second half of the twenty-first century, the contribution of the AMO (representing natural climate variability) relative to that of the GHGA in the twenty-first century is more than double what it was during the twentieth century (see also Table 3). The AMO in c, d stands for the average of the four considered AMO indices

Table 3 The AMO and the GHGA contributions to the global warming within the past (1975–2005) and the expected future AMO cycle for the AMO_K index after Kaplan et al. (1998), AMO_TS index according to Trenberth and Shea (2006), AMO_O after van Oldenborgh et al. (2009), and AMO_P after Parker et al. (2007)

8 Discussion and summary

In addition to the usual explanatory variables of the regression model (anthropogenic greenhouse gases and aerosols, solar variability, volcanic aerosols, and ENSO) we have included the AMO and the PDO as potential proxies for additional unforced natural climate variability. In agreement with earlier studies we confirm that the AMO is an effective explanatory variable (predictor) for the 1900–2015 global mean temperature while the PDO is not (Table 2). The parsimonious regression model (i.e., the version providing a balance of simplicity and accuracy) that accounts for 95.5 % of the global mean temperature variance (1900–2015) contains only three explanatory variables: radiative forcing due to anthropogenic greenhouse gases and aerosol (GHGA), the AMO index, and the solar variability (SOL). The regression model reproduces accurately the 1900–2015 warming (0.96 °C modeled compared to the observed 0.95 °C), and projects another 0.95 °C warming before the end of the twenty-first century under radiative forcing specified by the RCP4.5 pathway (IPCC 2013). There is a broad agreement between the projection of the future warming by the CMIP5 climate models and the regression model.

However, seven of the CMIP5 models (GFDL-CM3, MIROC-ESM, MIROC-ESM-CHEM, HadGEM2-ES, HadGEM2-CC, CESM1-CAM5, and CSIRO-Mk3-6-0) project the future (2015–2100) global warming to be more than 1 °C higher than the regression model projection (Fig. 3b). All these models projecting the highest warming include a fully interactive aerosol effect (Flato et al. 2013). At the other end there are five CMIP5 models (Fig. 3b) that project a future global warming that is below that of the regression model.

Eight of the CMIP5 models (GFDL-ESM2G, GFDL-ESM2M, GISS-E2-R-p2, GISS-E2-R-CC, GISS-E2-R-p1, FIO-ESM, inmcm4, and FGOALS_g2) and the regression model project the total 1900–2100 global warming to stay below 2 °C, as long as anthropogenic radiative forcing does not exceed that specified by the RCP4.5 (IPCC 2013). We note that none of these models has a fully interactive aerosol as defined by Flato et al. (2013, their Table 9.1).

The relative influence of the AMO (natural variability) is expected to increase during the second half of the twenty-first century, when the GHGA forcing, under the RCP4.5 scenario, will level off (Fig. 5). Our analysis shows that the anthropogenic warming will be of the same magnitude as natural variability during the second half of the twenty-first century if we limit GHG emissions to keep radiative forcing within the RCP4.5 scenario.