1 Introduction

While there is very strong evidence that climate change in the past decades has been dominated by human effects on the environment, natural factors still exert a significant influence on climate variations (IPCC 2013). The respective roles of various natural influences, their interplay and their expected variation in the future are, however, still poorly understood.

Global climate simulations have by now reached a degree of development where they are able to model many of these effects; however, there is still a very significant diversity between the outcomes of various simulations depending on model assumptions and the effects considered. The high complexity of these simulations often also makes their results hard to interpret from a physical point of view. (For a detailed discussion see IPCC 2013, Chapter 9).

As a result, at the opposite end of the spectrum of models, the simplest possible approach, a linear multiregression analysis of the empirical data, has become increasingly popular in recent years. In such studies, the time series of a global indicator of climate change y, most commonly the mean global surface temperature (GST) or the mean sea surface temperature (SST), is fitted by a linear combination of a number of parameters \(x_i\):

$$\begin{aligned} y(t) &= x_0 + c_1 x_1(t-{\varDelta } t_1) + c_2 x_2(t-{\varDelta } t_2) \\ &\quad + \dots + c_N x_N(t-{\varDelta } t_N) \end{aligned} \tag{1}$$

Here, t is time and the \(c_i\)’s and \({\varDelta } t_i\)’s are fitting parameters. The set of \(x_i\) considered are usually indices of human and natural influences on the climate system, such as greenhouse gas (GHG) emission, tropospheric and stratospheric aerosol input (these are predominantly of anthropogenic and natural origin, respectively), solar radiative forcing as well as indices of the most important internal oscillations of the atmosphere–ocean system such as the El Niño Southern Oscillation (ENSO).
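As a minimal sketch of how a fit of the form of Eq. (1) can be set up, the following Python snippet builds the lagged design matrix and solves for the coefficients by ordinary least squares. The function names (`lagged_design`, `fit_lagged`), the restriction to non-negative integer lags measured in time steps, and the synthetic test series are assumptions made for illustration; they are not taken from any of the studies cited here.

```python
import numpy as np

def lagged_design(xs, lags):
    """Design matrix for Eq. (1): a constant column plus each regressor
    x_i shifted by its (non-negative, integer) lag in time steps."""
    max_lag = max(lags)
    n = len(xs[0]) - max_lag              # samples left after shifting
    cols = [np.ones(n)]
    for x, lag in zip(xs, lags):
        # values x_i(t - lag) for t = max_lag, ..., len(x) - 1
        cols.append(np.asarray(x, float)[max_lag - lag : max_lag - lag + n])
    return np.column_stack(cols), max_lag

def fit_lagged(y, xs, lags):
    """Least-squares estimate of (x_0, c_1, ..., c_N) and explained variance R^2."""
    A, max_lag = lagged_design(xs, lags)
    target = np.asarray(y, float)[max_lag:]
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1.0 - resid.var() / target.var()
    return coef, r2

# Synthetic (hypothetical) data: y depends on x1 with a lag of 3 steps
# and on x2 with no lag.
rng = np.random.default_rng(0)
t = np.arange(200)
x1 = np.sin(2 * np.pi * t / 40.0)
x2 = rng.normal(size=t.size)
y = 0.5 - 1.0 * x2
y[3:] += 2.0 * x1[:-3]

coef, r2 = fit_lagged(y, [x1, x2], lags=[3, 0])
print(coef, r2)
```

Because the synthetic series contains no noise, the fit should recover the constant term and both coefficients essentially exactly; with real data the residual variance determines \(R^2\).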

Such multiregression analyses rely on two assumptions that inherently limit the validity of their conclusions. The first is the assumption of linearity, which can reasonably be expected to hold as long as the input parameters vary within a certain limited range. How wide that range is, and what the response of the climate system will be beyond it, is an issue that can only be examined in physical models of the climate system such as global simulations. The second critical assumption, although often implicit, is that the set of \(x_i\)’s is complete and non-redundant, i.e. that the input parameters are independent and include all physically important factors influencing climate change. Neglecting an important factor will result in assigning an unduly high explanatory power to the factors considered, while a correlation between two \(x_i\) time series (whether it is a chance correlation or the result of a physical link between the effects) will result in aliasing, i.e. an indecisive situation with a strong cross-talk between the corresponding \(c_i\) coefficients.
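One common way to quantify this kind of cross-talk between regressors is the variance inflation factor (VIF), which equals \(1/(1-R_i^2)\), where \(R_i^2\) is the variance of \(x_i\) explained by the remaining regressors. The sketch below is offered only as an illustration: the VIF is not a diagnostic used in the studies discussed here, and the function name `variance_inflation` and the three synthetic series are assumptions.

```python
import numpy as np

def variance_inflation(X):
    """VIF of each column of X (shape: n_samples x n_regressors),
    computed as the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Hypothetical regressors: two nearly identical trends and one oscillation.
t = np.linspace(0.0, 1.0, 62)
x1 = t
x2 = t + 0.05 * np.sin(2 * np.pi * 3 * t)   # almost collinear with x1
x3 = np.cos(2 * np.pi * 5 * t)              # largely independent of both

vif = variance_inflation(np.column_stack([x1, x2, x3]))
print(vif)
```

Large values for the first two columns would indicate that their coefficients cannot be determined independently of each other, which is the situation referred to above as aliasing.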

Perhaps the most influential of these studies was the work of Lean and Rind (2008, 2009). The input parameters considered were monthly values of the multivariate ENSO index, total solar irradiance, stratospheric aerosol concentration, and a net anthropogenic forcing parameter. The time lags \({\varDelta } t_i\) were chosen to maximize the explained variance: this generally resulted in lags of a few months except in the case of anthropogenic forcing, where an optimal time lag of 10 years was found. The regression was found to explain 76 % of the variance in the CRU monthly GST data from 1889 to 2006.
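Choosing a lag so as to maximize the explained variance amounts to a simple scan over candidate lags. The following sketch does this for a single regressor; it is not the procedure of Lean and Rind, whose implementation details are not given here. The function names, the use of a common evaluation window for all lags, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def r2_at_lag(y, x, lag, max_lag):
    """R^2 of the fit y(t) = a + c * x(t - lag), evaluated on the window
    t >= max_lag so that every candidate lag uses the same samples."""
    y = np.asarray(y, float)
    x = np.asarray(x, float)
    n = len(y) - max_lag
    target = y[max_lag:]
    shifted = x[max_lag - lag : max_lag - lag + n]
    A = np.column_stack([np.ones(n), shifted])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    return 1.0 - resid.var() / target.var()

def best_lag(y, x, max_lag):
    """Return the lag in 0..max_lag with the highest R^2, plus all scores."""
    scores = [r2_at_lag(y, x, lag, max_lag) for lag in range(max_lag + 1)]
    return int(np.argmax(scores)), scores

# Synthetic (hypothetical) example: the response follows x with a 5-step delay.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = np.full(300, 1.0)
y[5:] += 0.8 * x[:-5]
y += 0.05 * rng.normal(size=300)

lag, scores = best_lag(y, x, max_lag=10)
print(lag)
```

Note that when a regressor is dominated by a smooth trend, as is the case for anthropogenic forcing, \(R^2\) varies only weakly with the lag, so an optimum found in this way may be poorly constrained.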

In a more recent extension of this approach, Chylek et al. (2014) include the Atlantic Multidecadal Oscillation (AMO) index among the explanatory variables. Further differences in their analysis as compared to Lean and Rind include the use of annual, rather than monthly, data and another GST time series (GISS vs. CRU). Even without the introduction of time lags, the factors considered by Lean and Rind were found to explain 89 % of the variance in the mean annual GST series for the period considered (1900–2012), the higher explanatory power being presumably due to the use of yearly averages. Adding the AMO index to the natural factors influencing climate was found to increase the explained variance to 94 %. A most surprising finding, however, was that a combination of greenhouse gas (GHG) emission and AMO alone explains 93 % of the variance, nearly as much as all the factors combined! (To place this in context we note that GHG alone explains \({\sim }81\,\%\) of the variance.)

The fact that a high explanatory power can be achieved in linear multiregression models employing quite different sets of explanatory variables suggests that a strong aliasing may be present either between the various explanatory variables, or between some of the explanatory variables and the dependent variable (the mean GST). The objective of the present paper is to examine to what extent such aliasing may influence the curious results of Chylek et al.

The structure of the paper is the following. In Sect. 2 we present and discuss the data used. Section 3 focuses on the issue of potential aliasing between the AMO index and the GST time series, while in Sect. 4 we address aliasing between the various natural factors determining climate. Section 5 concludes the paper.

2 Data analysis

In contrast to the previous studies, in most of the present work we restrict the analysis to the time period 1950–2011. This is motivated by the following considerations of homogeneity of the data set.

  • The ENSO index is directly available for this period only, earlier values being reconstructions based on temperature data alone.

  • The usual definition of the AMO index involves the subtraction of a linear trend purported to represent anthropogenic influence—this assumed linearity of anthropogenic effects is also questionable over a longer base line.

  • All standard GST anomaly (GSTA) data sets rely heavily on sea surface temperatures. In the period prior to 1950 major bias corrections are applied to these data, the details of which are a source of added uncertainty in the results (Smith and Reynolds 2002; Thompson et al. 2008; Hansen et al. 2010).

  • Recently there has been some disagreement regarding a possible temporal inhomogeneity in the calibration of the sunspot number series upon which the solar irradiance reconstruction is based (Clette et al. 2014). This issue also mainly affects data before about 1950. A caveat to be noted in connection with this temporal limitation is that due to a more limited long term variation of solar activity levels during this period, our approach may result in a lower weight attributed to solar irradiation variations in natural climate variations.

Table 1 URLs of data sources

URLs of our data sources are listed in Table 1.

For GSTA we use the data available at the NASA GISS web site (Hansen et al. 2010) while SST data were downloaded from the home page of NOAA (Smith and Reynolds 2004). Regression analyses of GST and SST data yield very similar results. In this paper, after analysing GST data to confirm some previous results we will focus on the analysis of the SST data set.

Radiative forcing by anthropogenic aerosols, greenhouse gases and aerosols from volcanic eruptions was taken from Hansen et al. (2007, 2011). Results using greenhouse gas (GHG) data alone or all anthropogenic forcings proved to be virtually indistinguishable; in addition, a strong aliasing exists between these effects, so in what follows anthropogenic forcings will routinely be represented by GHG alone.

Instead of using a model dependent reconstruction of solar radiative forcing we characterise solar activity variations simply by the International Sunspot Number series as given by the Royal Observatory of Belgium (unrevised values).

Annual values of the ENSO and NAO indices were obtained from the NOAA data base (Wolter and Timlin 1993, 1998; Jones et al. 1997).

Values of the AMO index are also available from NOAA, but in most of our analysis here we calculated this index directly from the SST grids as described in Enfield et al. (2001), without applying any smoothing to the data.

3 AMO versus GSTA

In the present paper we extend previous multiregression analyses to address a number of questions raised in earlier work. First, there may be an issue regarding the validity of the use of the AMO index as an independent explanatory variable, given that the definition of this index is based on the mean sea surface temperature over a significant fraction of the globe (i.e., the N-Atlantic basin). It is to some extent natural to expect that, upon linear detrending, these data will reflect naturally induced global temperature variations.

As a first step we carried out an analysis closely following that of Chylek et al. (2014), performing a linear regression in order to explain the temporal evolution of the GSTA with different subsets of the potential explanatory variables (anthropogenic, natural, AMO; cf. caption of Fig. 1).

The resulting values of the explained variance \(R^2\) as well as the adjusted value \(R_{adj}^2\), which takes into account the number of variables included in the regression, are summarized in Table 2. Overall, our numbers are in good agreement with the results shown in Table 1 of Chylek et al. (2014). Note that we also tested whether the introduction of the North Atlantic Oscillation (NAO) index as an explanatory variable improves the regressions but, as is apparent from the table, the results were negative.
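For reference, the adjustment for the number of regressors can be sketched as follows. The text does not state which formula was used; the snippet assumes the standard definition \(R_{adj}^2 = 1-(1-R^2)(n-1)/(n-p-1)\) for n samples and p explanatory variables (excluding the constant term). The function name `adjusted_r2` is an assumption.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p explanatory variables
    (the constant term is not counted in p)."""
    if n - p - 1 <= 0:
        raise ValueError("need more samples than fitted parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Illustrative call: 62 annual values (1950-2011) and a single regressor.
r2_adj_one = adjusted_r2(0.81, n=62, p=1)
# Same raw R^2 with five regressors gives a slightly lower adjusted value.
r2_adj_five = adjusted_r2(0.81, n=62, p=5)
print(r2_adj_one, r2_adj_five)
```

With several dozen annual values and only a handful of regressors the correction is small, so differences between \(R^2\) and \(R_{adj}^2\) mainly matter when comparing fits whose explained variances are already close.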

Table 2 Multilinear regressions applied to GST anomaly using subsets of the potential explanatory variables, including natural and anthropogenic effects

In order to check whether the good explanatory power of AMO is simply a consequence of its definition based on mean temperature anomalies on a non-negligible part of the global ocean surface, we set out to investigate whether the N-Atlantic region is of any special interest from this point of view.

Fig. 1 Sea surface regions defined for comparison with the North Atlantic region, as outlined for the usual definition of the AMO index (Enfield et al. 2001)

For this purpose, we arbitrarily define eight other regions of comparable size on the global ocean (see Fig. 1). For each region we determine a “multidecadal oscillation” (MO) index using the same procedure as for the AMO (but with no smoothing applied to the annual values). With each of these MO data sets, combined with the GHG data, we perform a linear regression to the GSTA record. Results are shown in Table 3. (The first line of the table refers to the AMO index calculated by us from the SST data.)
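A regional index of this kind can be sketched as an area-weighted mean of the gridded SST anomalies over a latitude-longitude box, followed by removal of a linear trend and no smoothing. The code below is a minimal illustration under stated assumptions: the function name `mo_index`, the cosine-of-latitude weighting, the encoding of land cells as NaN, the 5-degree grid and the box limits are all hypothetical and are not the region definitions used in this paper.

```python
import numpy as np

def mo_index(sst, lat, lon, lat_range, lon_range, years):
    """Detrended, area-weighted regional mean of annual SST anomalies.

    sst   : array (n_years, n_lat, n_lon), NaN over land or missing cells
    years : array (n_years,) used as the abscissa of the linear trend
    """
    lat = np.asarray(lat, float)
    lon = np.asarray(lon, float)
    years = np.asarray(years, float)
    in_lat = (lat >= lat_range[0]) & (lat <= lat_range[1])
    in_lon = (lon >= lon_range[0]) & (lon <= lon_range[1])
    box = sst[:, in_lat][:, :, in_lon]
    # cos(latitude) weights, set to zero wherever the data are missing
    w = np.cos(np.deg2rad(lat[in_lat]))[None, :, None] * np.ones_like(box)
    w = np.where(np.isnan(box), 0.0, w)
    mean = np.nansum(box * w, axis=(1, 2)) / w.sum(axis=(1, 2))
    # remove the least-squares linear trend; no smoothing is applied
    slope, intercept = np.polyfit(years, mean, 1)
    return mean - (slope * years + intercept)

# Synthetic (hypothetical) field: a uniform warming trend plus a 60-year cycle.
years = np.arange(1950, 2012, dtype=float)
lat = np.arange(-87.5, 90.0, 5.0)
lon = np.arange(-177.5, 180.0, 5.0)
osc = 0.2 * np.sin(2 * np.pi * (years - 1950) / 60.0)
field = (0.01 * (years - 1950) + osc)[:, None, None] * np.ones((1, lat.size, lon.size))
field[:, 20, 30] = np.nan                      # one 'land' cell inside the box

index = mo_index(field, lat, lon, (0.0, 70.0), (-80.0, 0.0), years)
```

Because the detrending is linear, any part of a slow oscillation that projects onto a straight line over the analysis period is removed together with the trend, which is one reason why the choice of base period affects the resulting index.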

It is clear that significant differences are present between the explanatory power of the MO indices of different regions with respect to the GSTA. For some regions the value of \(R_{adj}^2\) hardly differs from its counterpart obtained with GHG alone, while for others the inclusion of the MO index in the analysis leads to a spectacular improvement. The N-Atlantic region (and the mid-Atlantic region overlapping with it) yields the highest explained variance of all; yet, interestingly, it is almost equalled by the Western Pacific region (and, to some extent, the NW-Pacific region overlapping with it).

Fig. 2 MO indices as a function of time for selected sea surface regions individually (top) and overplotted (bottom), after smoothing by an 11-year sliding window

Indeed, the overall parallelism in the MO indices calculated for these regions is borne out in Fig. 2. The characteristic 50–60 year cyclicity of AMO is very clearly reflected in the Western Pacific region, albeit with a higher added “noise” in the form of short-term fluctuations, and also in a smoother but slightly more distorted form in the NW-Pacific. In the first part of the period there is a hint of some phase lag between the Atlantic and Pacific signals but in view of the smoothing applied the only certain statement is that any phase lag is well below the width of the smoothing window (i.e. no more than a few years).
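The smoothing referred to here can be illustrated with a simple running mean. The shape of the sliding window is not specified in the text, so the sketch below assumes a flat (boxcar) window of 11 annual values and drops the ends of the series where the window is incomplete; the function name `sliding_mean` is likewise an assumption.

```python
import numpy as np

def sliding_mean(x, width=11):
    """Centered running mean with a flat window; the output is shorter
    than the input by width - 1 samples (incomplete windows are dropped)."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(x, float), kernel, mode="valid")

# A linear ramp is left unchanged apart from the loss of 5 points at each end.
ramp = np.arange(30.0)
smoothed = sliding_mean(ramp)
print(smoothed[:3], smoothed.size)
```

A window of this width strongly damps variations shorter than about a decade, which is why phase differences of only a few years between two smoothed series cannot be resolved reliably.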

Other areas of the global ocean such as the Indian Ocean, the Eastern Pacific or the Southern Atlantic, on the other hand, do not show such a clear reflection of the AMO signal, nor is their explanatory power with regard to the GSTA even nearly as high.

Fig. 3 Same as Fig. 2 (bottom) for the base period 1920–2011

In view of the relative shortness of the time period considered (1950–2011) the robustness of our findings may be in question. In order to address this, ignoring the issues related to the use of pre-1950 data listed in Sect. 2, in Fig. 3 we plot smoothed MO indices for the relevant regions starting from 1920. The plot confirms that the parallel variation of smoothed detrended temperature anomalies between the N-Atlantic and W-Pacific regions was present also in the first half of the 20th century, lending support to the robustness of the result.

The question naturally arises: what is the origin of this surprising parallelism between two distant regions of the globe? One possibility may be that the AMO is a global climate phenomenon that is for some reason more strongly imprinted in the regional signal in some areas than in others (e.g. due to being suppressed in ENSO-affected areas). Alternatively, it is also possible that we are dealing with a real teleconnection effect. The physical mechanism of this teleconnection is an open question. One possibility is an atmospheric teleconnection across the Eurasian land mass: the prevailing westerlies may carry the AMO-induced atmospheric anomalies (SAT and moisture) to the Pacific, where the effect is attenuated from West to East. The other possibility of a link via the global thermohaline oceanic circulation is even more tentative as it would imply a connection, possibly by advection of deep/bottom water, across the Southern hemisphere. In either case, given that the other main manifestation of this oscillatory phenomenon is in the Western Pacific region, it may be more aptly called the “Atlanto-Pacific Multidecadal Oscillation”.

The mechanism driving the multidecadal variation is still open to question; here we only note that internal variations in ocean basins on a time scale of decades naturally arise from thermohaline oscillations (cf. Huang 2010, Ch. 5).

Table 3 Results of linear regression of Global Surface Temperature Anomalies when the explanatory variables employed were the GHG and MO indices

4 AMO versus other natural factors

The results in Table 2 confirm the surprising finding by Chylek et al. (2014) that anthropogenic forcing combined with the AMO has the potential to explain a very high fraction of the observed variance in GSTA, while adding all the other natural forcings only leads to a slight further improvement in the fit. What makes this result rather puzzling is that these other natural factors, combined with the anthropogenic forcing, could also explain a rather high proportion (88 %) of the variance even without the inclusion of AMO. Therefore, if the AMO were indeed the predominant natural factor influencing climate, this high explanatory power of other natural influences would remain unexplained or would have to be attributed to a chance coincidence.

Note, however, that Table 2 also shows that anthropogenic effects alone can explain 81 % of the variance of the temperature anomaly. External forcing and the ENSO only add 7 % to this while the AMO adds approximately 10 %. AMO and other natural effects combined add 14 % instead of the 17 % that would be expected if they were uncorrelated. The correlation between AMO and other natural factors that is needed to explain this discrepancy is then not as excessive as it might appear at first sight.
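The bookkeeping behind this argument can be written out explicitly. The notation below (\(\Delta R^2\) for the increase in explained variance relative to the anthropogenic-only fit) is introduced here purely for illustration, and reading the shortfall as variance shared between AMO and the other natural factors is a heuristic interpretation rather than a rigorous decomposition:

$$\begin{aligned} \Delta R^2_{\mathrm{nat}} + \Delta R^2_{\mathrm{AMO}} &\approx 0.07 + 0.10 = 0.17, \\ \Delta R^2_{\mathrm{nat+AMO}} &\approx 0.14, \\ \Delta R^2_{\mathrm{nat}} + \Delta R^2_{\mathrm{AMO}} - \Delta R^2_{\mathrm{nat+AMO}} &\approx 0.03 . \end{aligned}$$

On this reading, only about 3 percentage points of the total variance are attributable to either AMO or the other natural factors, which is small compared with the contribution that each makes separately.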

To independently check this inference, we have performed a regression analysis with a detrended MO index as dependent variable and other natural factors as explanatory variables. The results are shown in Table 4. It is apparent that in the regions displaying a marked AMO signal natural factors explain typically about 20 % of the variance in the MO index, in agreement with our arguments above.

Table 4 Results of linear regression when the dependent variable is one of the different MO indices and explanatory variables are the natural factors alone or combined with anthropogenic forcing
Table 5 Results from multilinear regressions to SST anomalies calculated over regions unaffected by the AMO

The four regions found to display a strong AMO signal in the previous section together cover a significant fraction of the total sea surface area inside the \(-60<\phi <+60\) latitude zone. This prompts the question whether the strong imprint of AMO on the global mean SST signal (and, by inference, on the GST signal) is simply due to its predominance in a significant fraction of the base area or else it influences climate on a more global scale.

Fig. 4 A snapshot of the distribution of the MO index defined over the sea surface area not affected by the AMO signal (blue-to-red scale with arbitrary normalization)

In order to clarify this issue we consider the mean sea surface temperature anomaly in the part of the \(-60<\phi <+60\) latitude zone not covered by the four regions displaying a strong AMO signal (see Fig. 4). The results of a multiregression analysis of this mean anomaly series are shown in Table 5, and an example of the regression is shown in Fig. 5. It is clearly seen that the explanatory power of AMO with respect to the mean SST anomaly series is drastically reduced if the main AMO regions are omitted.

Fig. 5 Time series of the MO index for the area shown in Fig. 4 (blue line). The purple dashed line indicates the result of a linear regression in which the explanatory variables are the natural and anthropogenic effects

5 Conclusion

In this paper we have presented results from a series of linear multiregression analyses of global and regional SST.

A most surprising and unexpected finding was the very strong appearance of the AMO signal in the detrended SST anomaly time series over the Western tropical Pacific and over the NW Pacific. To the best of our knowledge, such a clear indication of a strong AMO imprint in this oceanic region (or indeed any oceanic region far from the North Atlantic) has not been demonstrated before. Hints of an in-phase variation of temperatures in the NW-Pacific and N-Atlantic can be seen in both proxy reconstructions and coupled atmosphere-ocean models (Delworth and Mann 2000). The definitions of the so-called Pacific Decadal Oscillation (PDO) and Interdecadal Pacific Oscillation (IPO) involve much larger areas over which the AMO signal becomes intertangled with the long-term residual of ENSO and other effects. Principal component analysis (PCA) and other similar methods applied to the spatio-temporal distribution of SST data have in fact already demonstrated the presence of a component related to AMO with a rather high amplitude in the NW Pacific (McCabe and Palecki 2006; d’Orgeville and Peltier 2007), while the other dominant component is a decade-scale ENSO residual. Our results are in agreement with these earlier findings but also display a surprisingly sharp geographical dichotomy between oceanic areas affected by one or the other principal component. These areas predominantly (though not exclusively) lie in the Northern and Southern hemispheres for the AMO and ENSO components, respectively, which also supports the use of the term “Northern Multidecadal Oscillation” as recently suggested by Steinman et al. (2015).

Aliasing between the AMO signal and other natural factors is found to be relatively weak (\({\sim }20\,\%\)). On the other hand, the fact that a strong and evident AMO signal is displayed in detrended SST anomalies over a significant fraction of the global oceanic surface implies a strong aliasing link between the AMO index and the global mean SST anomaly.

Indeed, if the AMO signal, defined on the basis of SST values over a larger area is present over a significant fraction of the ocean surface, this casts doubt on the use of AMO as an explanatory factor in multiregression analysis. “Explaining” global mean temperature variations by the mean temperature variations on a large fraction of the globe is a rather tautological exercise. It may seem that AMO is part of the problem rather than part of the solution. One may even argue that this is also true for ENSO to some extent, even though SST data are only one of six inputs that are combined to form this particular index.

Without delving too deeply into the philosophical issue of the meaning of “explanatory variable”, we may suggest that a sensible approach is to consider a statistically significant correlation “explanatory” when three conditions are met: the physical link is at least not implausible; the possible alternatives (an inverse cause-effect relationship or the two time series being due to a common cause) are a priori much less plausible; and the two data sets are truly independent. These criteria are satisfied for external climate forcings such as TSI, stratospheric aerosols or anthropogenic forcings (note that aliasing between these effects may still be an issue). For internal variabilities independence is difficult to ascertain: even when an index (e.g. the SOI) is based on data that do not include SST, it can be argued that it is just another manifestation of the same weather pattern that is also reflected in [a subset of] the SST data analysed. Short of a definitive answer to this conundrum, it remains important to choose an internal variability index that has as little direct dependence as possible on the data set being analysed.

To summarize our findings, naturally induced climate variations seem to be dominated by two internal variability modes of the ocean–atmosphere system, with a marked geographical dichotomy in their respective areas of dominance. The influence of external forcings is non-negligible but secondary to these internal variabilities. However, for a proper assessment of the influence of forcings, a multiregression analysis in which internal variabilities, especially AMO, are treated as explanatory variables on an equal footing with the forcings seems to be inappropriate.