1 Introduction

Reliable climate prediction across multiple time scales has been a great challenge for many decades. It is not only important for scientific research, but also relevant to societal sectors including agriculture, water management, energy, and health (Brunet et al. 2010; Robertson and Wang 2012; Clements et al. 2013; Raff et al. 2013; White et al. 2017). During the past years, many studies have targeted seasonal climate predictions (van den Dool 2007; Fan 2010; Bengtsson et al. 1993; Stockdale et al. 1998), while studies focusing on subseasonal to seasonal (S2S), annual, and decadal scales are also growing in number, promoted by international projects (Vitart et al. 2012; Robertson et al. 2014; García-Serrano and Doblas-Reyes 2012). Thanks to these efforts, considerable progress has been made in identifying the sources of predictability on multiple time scales (Mani et al. 2014; van Oldenborgh et al. 2005; Mochizuki et al. 2010), designing statistical-empirical approaches (van den Dool 2007; Fan 2010), and developing process-based dynamical models (Bengtsson et al. 1993; Stockdale et al. 1998).

However, due to the complexity of the climate system, current climate prediction skill is still limited. For statistical-empirical approaches, relations between target processes and their predictors may not hold in the context of climate change (Holland and Stroeve 2011; Doblas-Reyes et al. 2013). As a result, a good hindcast does not necessarily imply high predictive skill in real-time forecasts. For process-based dynamical models, the lack of perfect initial conditions and the inability to perfectly model the climate system introduce unavoidable uncertainty and inadequacy into the simulations. Consequently, climate predictions based on dynamical models may be biased (Palmer 2000; Slingo and Palmer 2011). To meet the increasing demands of many societal applications, continuous efforts are thus required to improve current prediction models and to develop new theories.

Climate predictability arises from both externally forced and internally generated variability (Doblas-Reyes et al. 2013; Fyfe et al. 2011; Meehl et al. 2014). Thus, the variance of climate variables related to external forcings and internal oscillations determines the theoretical limit of predictive skill. Even setting external forcings aside, internal variability is not fully captured by current models (Doblas-Reyes et al. 2013; Meehl et al. 2014). As a result, there are systematic errors in the model simulations and the predictive skill is lower than expected. In this study, we investigate this issue from a new perspective. We ask: besides simulating the internal oscillations in dynamical models, can we instead extract the effects of internal variability from climate variables, and then make predictions for the residuals? As shown in Eq. (1),

$$\begin{aligned} x(t)=M(t)+\varepsilon (t), \end{aligned}$$
(1)

where x(t) denotes the observed climate variable, M(t) represents the effects of internal variability, and \(\varepsilon (t)\) stands for the residuals.

According to recent studies (Koscielny-Bunde et al. 1998; Fraedrich and Blender 2003; Eichner et al. 2003; Kantelhardt et al. 2006; Lin et al. 2007; Chen et al. 2007; Vyushin and Kushner 2009; Yuan et al. 2010; Dangendorf et al. 2014; Jiang et al. 2017), this idea is feasible, since for many climate variables the variability on different time scales has been found not to be arbitrary, but to follow the scaling relation

$$\begin{aligned} x(st)=s^{H}x(t), \end{aligned}$$
(2)

where s represents the time scale and H is the Hurst exponent (Hurst 1951). This scaling behavior indicates that knowledge of the high-frequency variability allows one to infer the low-frequency variability of a given process. The process is therefore not temporally independent, but auto-correlated with no cut-off correlation time. We call this property fractal or long-term climate memory, since climate states from long ago may still influence the current climate state (Rybski et al. 2008; Zhu et al. 2010; Yuan et al. 2013). In physics, this phenomenon is a kind of “inertia”, while in climate science the closest concept is “natural internal variability” (Yuan and Fu 2014). For example, by taking the effects of climate memory into account, the pause of the warming trend (hiatus) since the late twentieth century was simulated successfully (Lovejoy et al. 2015; Lovejoy 2015a), which further supported the argument that natural variability is one main reason for the warming “hiatus” (Risbey and Lewandowsky 2017; Medhaug et al. 2017). Although climate memory itself cannot account for the whole climate variability, it determines the initial states induced by past climate conditions, from which one can further study how the processes will evolve. Therefore, climate memory corresponds to M(t) in Eq. (1), which represents the effects of natural internal variability on the current climate state.

Recently, in analogy to the stochastic climate model proposed by Hasselmann (1976), a new stochastic model, the Fractional Integral Statistical Model (FISM), was developed (Yuan et al. 2013). Based on fractional integral techniques, FISM has been shown to extract the memory signals quantitatively (Yuan et al. 2014). Accordingly, the climate state at a given time point can be decomposed into two parts: the memory part and the weather-scale dynamical excitation, which correspond to M(t) and \(\varepsilon (t)\) in Eq. (1), respectively. This decomposition opens a new gate for climate prediction: M(t), which can be calculated quantitatively, determines the lower bound of predictive skill.

In this study, we investigate this lower bound in detail. To what extent can the memory part M(t) explain the total climate variability? Does the explained variance vary across climate processes? How do climate memory effects decay over time? By applying FISM to different observational records and to artificial data with different scaling behaviors, we address these questions and provide fundamental information for further climate predictions.

This paper is organized as follows. The data and methods used in this work are briefly described in Sect. 2. In Sect. 3, we present the results for different observational records, and compare the estimated explained variances with those calculated from artificial datasets. An extensive Monte-Carlo simulation is performed in Sect. 4, where the roles of climate memory in both one-step and multi-step prediction are evaluated. In Sect. 5, we discuss and conclude this paper.

2 Data and methods

2.1 Data

In order to study the climate memory effects, we selected six records of different variables, as shown in Table 1:

  1. surface air temperature observed in Stockholm (\(\hbox {SAT}_{\mathrm{land}}\)) (Klein Tank et al. 2002), monthly means calculated from daily measurements, available at the European Climate Assessment & Dataset (ECAD), http://www.ecad.eu;

  2. surface air temperature observed on Cocos Island (\(\hbox {SAT}_{\mathrm{island}}\)) (Menne et al. 2012), monthly data available at the Royal Netherlands Meteorological Institute (KNMI) Climate Explorer, http://climexp.knmi.nl/;

  3. precipitation sums observed in Copenhagen (Klein Tank et al. 2002), monthly sums calculated from daily measurements, available at ECAD, http://www.ecad.eu;

  4. river runoff (RR) records of the Weser river, monthly means calculated from daily measurements at Vlotho, provided by the Global Runoff Data Centre, 56068 Koblenz, Germany, http://www.bafg.de/GRDC/EN/Home/homepage_node.html;

  5. Northern Hemisphere temperature anomaly (NHTA) data, monthly, available at the National Aeronautics and Space Administration (NASA) Goddard Institute for Space Studies (GISS), https://data.giss.nasa.gov/gistemp/;

  6. the Pacific Decadal Oscillation (PDO) index, monthly, available at the National Oceanic and Atmospheric Administration (NOAA), https://www.esrl.noaa.gov/psd/data/climateindices/list/.

These records are either well-known climatic indices (e.g. PDO, NHTA) or long-term in situ observations that have been widely used in previous studies (Kantelhardt et al. 2006; Beniston 2009; Kärner and de Freitas 2014). Accordingly, results based on these records can provide an initial impression of climate memory effects on different variables. Before analysis, the seasonal cycle in each record was removed by subtracting the annual cycle, as \(x(t)=\tau (t)-\langle \tau (t) \rangle\), where \(\tau (t)\) is the original record and x(t) is the anomaly series used for analysis; a minimal sketch of this deseasonalization is given below.
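
For illustration, the deseasonalization can be written in a few lines; the sketch below assumes a one-dimensional monthly record starting in January (the function name and interface are ours, not part of the cited datasets):

```python
import numpy as np

def monthly_anomalies(tau):
    """Remove the mean annual cycle: x(t) = tau(t) - <tau(t)>, where <.>
    denotes the long-term mean of the corresponding calendar month."""
    tau = np.asarray(tau, dtype=float)
    x = np.empty_like(tau)
    for m in range(12):                      # calendar months
        idx = np.arange(m, tau.size, 12)     # all Januaries, all Februaries, ...
        x[idx] = tau[idx] - tau[idx].mean()  # subtract the monthly climatology
    return x
```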

We also used the Fourier filtering technique to generate artificial data with different memory strengths (Turcotte 1997). For each memory strength, 3000 samples were generated to ensure adequate statistical accuracy. By analyzing these artificial data, an extensive Monte-Carlo simulation was performed, with which we confirm and better illustrate the climate memory effects.
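
The Fourier filtering technique itself is easy to sketch: white Gaussian noise is transformed to Fourier space, its amplitudes are rescaled as \(f^{-\beta /2}\) with \(\beta =2\alpha -1\), and the result is transformed back, yielding a series whose DFA exponent is approximately \(\alpha\). A minimal NumPy version (ours) is:

```python
import numpy as np

def fourier_filtered_noise(n, alpha, rng=None):
    """Generate a Gaussian series of length n with DFA exponent ~ alpha
    (alpha = 0.5 corresponds to white noise) via Fourier filtering."""
    rng = np.random.default_rng(rng)
    beta = 2.0 * alpha - 1.0                  # power spectrum S(f) ~ f**(-beta)
    coeffs = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    coeffs[1:] *= freqs[1:] ** (-beta / 2.0)  # impose the power-law spectrum
    coeffs[0] = 0.0                           # remove the mean (f = 0) component
    y = np.fft.irfft(coeffs, n)
    return (y - y.mean()) / y.std()           # standardize

# e.g. one "SAT_island-like" surrogate of 3000 months (alpha = 0.85)
sample = fourier_filtered_noise(3000, alpha=0.85, rng=42)
```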

2.2 Methods

2.2.1 Detecting long-term climate memory

In this study, we employed a widely used method, second-order detrended fluctuation analysis (DFA-2) (Peng et al. 1994; Kantelhardt et al. 2001), to detect long-term climate memory. Suppose we have a time series \(\{x_i\}, i=1, \ldots , N\). In DFA-2, one considers the cumulated sum \(Y_k=\sum _{i=1}^{k}\{x_i-\langle x \rangle \}\) and studies it in non-overlapping time windows of size s. In each window, we determine the local trend by a second-order polynomial fit and obtain the squared fluctuation \(F_{s}^{2}(j)\) as the variance of \(Y_k\) around this best quadratic fit, where j indexes the j-th window. By averaging over all windows, we obtain the fluctuation function F(s). If F(s) increases with s as \(F(s) \sim s^{\alpha }\), and the scaling exponent \(\alpha\) [see also the Hurst exponent in Hurst (1951)] is larger than 0.5, the time series \(\{x_i\}\) is characterized by long-term memory; the larger \(\alpha\), the stronger the memory. It is worth noting that the long-term climate memory detected by DFA-2 is actually linear memory originating from mono-fractal behavior. For variables with multifractality (Kantelhardt et al. 2006; Bogachev and Bunde 2011), it remains a challenge to properly quantify the corresponding nonlinear memory effects. Accordingly, only linear climate memory is considered in this study.
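
For reference, a minimal sketch of DFA-2 as just described (ours; illustrative, not optimized):

```python
import numpy as np

def dfa2(x, scales):
    """DFA-2: return the fluctuation function F(s) for each window size s;
    the slope of log F(s) vs. log s estimates the exponent alpha."""
    y = np.cumsum(np.asarray(x, dtype=float) - np.mean(x))  # profile Y_k
    F = []
    for s in scales:
        f2 = []
        for j in range(y.size // s):          # non-overlapping windows
            seg = y[j * s:(j + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 2), t)    # quadratic fit
            f2.append(np.mean((seg - trend) ** 2))          # F_s^2(j)
        F.append(np.sqrt(np.mean(f2)))        # average over all windows
    return np.array(F)

# example: estimate alpha over the reliable range s >= 8 (see below)
scales = np.unique(np.logspace(np.log10(8), np.log10(500), 20).astype(int))
# F = dfa2(x, scales); alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]
```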

In addition, as reported in previous studies, the output of DFA normally exhibits an initial crossover. For DFA-2, the reliable scaling range usually starts from \(s=8\) (Bogachev et al. 2017). As a result, DFA-2 cannot determine whether the scaling behavior \(F(s) \sim s^{\alpha }\) still holds for \(s<8\). To study the relation between F(s) and s at small time scales (\(s<8\)), we also employed the Haar wavelet technique of order 2 (WT2) (Koscielny-Bunde et al. 1998; Bogachev et al. 2017), which does not suffer from the initial crossover. For more details of WT2 and the corresponding outputs, please refer to the supplementary material.

2.2.2 Estimating climate memory effects

To estimate the role of climate memory in explaining climate variability, we applied the fractional integral statistical model (FISM) to extract memory signals quantitatively. It has been shown that the weather-scale excitation part \(\varepsilon (t)\) and the memory part M(t) are connected via fractional integration (Yuan et al. 2013). Accordingly, FISM was designed using the Riemann–Liouville fractional integral formula shown in Eq. (3); the discrete version can be found in Yuan et al. (2014),

$$\begin{aligned} x(t) = \frac{1}{\varGamma (q)} \int _{u=0}^{t-\delta } \frac{\varepsilon (u)}{(t-u)^{1-q}} {\mathrm{d}}u + \varepsilon (t). \end{aligned}$$
(3)

In Eq. (3), \(\varGamma\) denotes the gamma function, q is the integral order, \(t-u\) represents the distance between historical time point u and present time t, and \(\delta\) is the sampling time interval (e.g., monthly). Theoretically, it is possible to calculate M(t) quantitatively from historical weather-scale excitations \(\varepsilon (u)\), as shown below (Yuan et al. 2014),

$$\begin{aligned} M(t) = \frac{1}{\varGamma (q)} \int _{u=0}^{t-\delta } \frac{\varepsilon (u)}{(t-u)^{1-q}} {\mathrm{d}}u, \end{aligned}$$
(4)

where \(\varepsilon (u)=\varepsilon (0), \varepsilon (\delta ), \ldots , \varepsilon (t-\delta )\). \(\varepsilon (u)\) can be derived inversely from Eq. (3), provided the historical observations x(t) and the integral order q are known. Since q can be calculated from the DFA exponent \(\alpha\) as \(q=\alpha -0.5\), the procedure to extract M(t) from the observations can be summarized as follows:

  1. Apply DFA-2 to the observed records x(t), and determine the DFA exponent \(\alpha\) and the integral order q.

  2. Substitute x(t) and q into Eq. (3), and calculate the historical weather-scale excitations \(\varepsilon (u)\) by inverting Eq. (3).

  3. Substitute \(\varepsilon (u)\) into Eq. (4), and calculate M(t) by evaluating the integral in Eq. (4).

It is worth noting that there is a “spin-up” time of about 100 steps in step (2); only after this period can the estimated \(\varepsilon (u)\) be considered reliable (Yuan et al. 2014). A sketch of the full decomposition is given below.
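
To make the procedure concrete, here is a minimal sketch under two assumptions we make explicit: the sampling interval is \(\delta =1\), and the Riemann–Liouville kernel is discretized directly as \((t-u)^{q-1}\); the exact discrete weights of Yuan et al. (2014) may differ in detail.

```python
import numpy as np
from math import gamma

def fism_decompose(x, alpha):
    """Decompose x(t) = M(t) + eps(t) via Eqs. (3)-(4), with delta = 1
    and the Riemann-Liouville kernel discretized as (t - u)**(q - 1)."""
    x = np.asarray(x, dtype=float)
    q = alpha - 0.5                           # integral order from DFA exponent
    n = x.size
    M = np.zeros(n)
    if q <= 0:                                # alpha <= 0.5: no long-term memory
        return M, x.copy()
    eps = np.zeros(n)
    g = gamma(q)
    for t in range(n):
        u = np.arange(t)                      # historical time points 0 .. t-1
        M[t] = np.sum(eps[u] * (t - u) ** (q - 1)) / g  # Eq. (4)
        eps[t] = x[t] - M[t]                  # invert Eq. (3) forward in time
    # the first ~100 values of eps (and M) are unreliable ("spin-up")
    return M, eps
```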

With the extracted M(t), we can further calculate the variance explained by climate memory, or in other words, the climate memory effects. As shown below,

$$\begin{aligned} EV = \frac{var(M(t))}{var(M(t))+var(x(t)-M(t))}, \end{aligned}$$
(5)

the higher the explained variance, the more we may expect from climate memory, i.e., the stronger the predictability. Note that M(t) in Eq. (5) contains the historical memory from the past up to \(t-\delta\). By comparing it with the residuals (\(x(t)-M(t)\)) at the current time point t, the climate memory effects in one-step prediction are estimated. To study the climate memory effects in multi-step prediction, Eq. (5) is generalized as

$$\begin{aligned} EV_{n} = \frac{var(M(t))}{var(M(t))+var(x(t+n-1)-M(t))}, \end{aligned}$$
(6)

where n is the number of steps. When \(n=1\), Eq. (6) reduces to Eq. (5). When \(n=2\), we compare the memory part M(t) with the residuals (\(x(t+1)-M(t)\)) at the next time point \(t+1\). In this way, by varying n, the variance explained by M(t) at future time points (\(n=2, 3, \ldots\)) can be calculated. In this work, we first study the climate memory effects in different climatic variables using Eq. (5), and then discuss the climate memory effects in multi-step prediction using artificial data. A sketch of this calculation is given below.
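
A minimal sketch of Eqs. (5) and (6) (the function name and interface are ours; x and M are assumed to be aligned series of equal length):

```python
import numpy as np

def explained_variance(x, M, n=1):
    """Explained variance of the memory part, Eqs. (5)-(6): compare M(t),
    built from the past up to t - delta, with the residual at the target
    time t + n - 1 (n = 1 reproduces Eq. (5), one-step prediction)."""
    x = np.asarray(x, dtype=float)
    M = np.asarray(M, dtype=float)
    shift = n - 1
    mem = M[:M.size - shift] if shift > 0 else M
    resid = x[shift:] - mem                  # x(t + n - 1) - M(t)
    return mem.var() / (mem.var() + resid.var())
```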

Table 1 Information on the six records
Fig. 1 DFA-2 results for the six observational records. The relations between the fluctuation function F(s) and the time scale s are shown in log-log plots, and the slope of each line represents the DFA-2 exponent \(\alpha\). From bottom to top, DFA-2 results of precipitation (black), surface air temperature over land (red), river runoff (blue), surface air temperature over island (green), Northern Hemisphere temperature anomaly (pink), and the PDO index (yellow) are shown. The DFA exponents \(\alpha\) are marked beside the lines. A reference dashed line with slope 0.5 is shown at the bottom of the figure

Fig. 2 Weather-scale dynamical excitations \(\varepsilon (t)\) of the six records presented in Table 1, shown from top to bottom in the order: precipitation, surface air temperature over land, river runoff, surface air temperature over island, Northern Hemisphere temperature anomaly, and PDO index. Positive \(\varepsilon (t)\) values are marked by red bars, negative ones by blue. The gray area covers the \(\varepsilon (t)\) of the recent 30 years (1980–2010) in NHTA, where more positive values are found, corresponding to the significant warming trend of NHTA during this period (see Fig. 3)

Fig. 3 Original data (black, 1961–2010) of the six records as well as the extracted memory part M(t) (red, 2001–2010). From top to bottom the figure shows the time series of precipitation, surface air temperature over land, river runoff, surface air temperature over island, Northern Hemisphere temperature anomaly, and PDO index, respectively. The stronger the climate memory, the larger the variance explained by the memory part M(t)

Fig. 4 Explained variances of M(t) for the six observational records. 50 years of data (1961–2010) were used for the calculation, with the first 10 years (1961–1970) as “spin-up” time and the last 10 years (2001–2010) as test zone. Accordingly, climate memory effects accumulated over 30 years are calculated as the explained variance. The explained variances corresponding to the \(\alpha\) values are marked as green points. To verify the results calculated from observational records, the same procedure was also applied to a large number of artificial datasets, and the estimated explained variances along with uncertainties (95% distribution range) are shown as black points and error bars. The comparison shows good agreement between the results obtained from the six records and those from the numerical simulations. With increasing \(\alpha\), larger variances can be explained by climate memory

Fig. 5 Explained variances of M(t) estimated from an extensive Monte-Carlo simulation. To avoid potential biases due to limited data length in practice, artificial data of longer length (3000 “months”, 250 “years”) are used. As in Fig. 4, climate memory effects accumulated over the past 30 years are considered. The yellow, green, and blue areas indicate the reported ranges of climate memory strength (\(\alpha\) values) in precipitation records, surface air temperature records over land (\(\hbox {SAT}_{\mathrm{land}}\)), and surface air temperature over island (\(\hbox {SAT}_{\mathrm{island}}\)). For precipitation, only very low variances (\(0\sim 5\%\)) can be explained by M(t). For \(\hbox {SAT}_{\mathrm{land}}\), the variance explained by climate memory can be up to 23.8%, while for \(\hbox {SAT}_{\mathrm{island}}\) the maximum explained variance is even higher, around 43.6%

Table 2 Explained variance for different variables
Fig. 6 Climate memory effects in multi-step predictions. Different from Figs. 4 and 5, where the explained variances of M(t) were calculated by comparing M(t) and \(\varepsilon (t)\) at the current time point t (step \(n = 1\)), this figure shows climate memory effects on the more distant future (\(n \ge 2\)). Results for data with different \(\alpha\) values are shown in different colors, and the climate memory effects are found to decay with the number of steps n as a power law. The colored lines are calculated using the fitted formula (Eq. 7), which agrees well with the results obtained from numerical simulations (colored points). For cases with strong climate memory (e.g. \(\alpha \ge 0.85\)), the climate memory effects can remain strong (e.g., larger than 10%) for a long time (more than ten steps)

3 Climate memory effects in different climatic variables

We first estimate the memory strength using DFA-2 for the six records presented in Table 1. Figure 1 shows the DFA results in a double-logarithmic plot. Nearly perfect straight lines are obtained for all six records, indicating power-law relations between F(s) and s over a scaling range from several months to decades. By further applying WT2 to the six records, the scaling range is extended down to \(s=1\) month (see Figure S1). Accordingly, scaling behavior from 1 month to decades is found for all six records. The DFA exponents \(\alpha\) (slopes) differ, indicating that the records have different memory strengths. For the precipitation observed in Copenhagen, the DFA exponent \(\alpha\) is around 0.5 (black points), indicating the absence of memory. This result is consistent with previous studies, which reported that precipitation is usually characterized by very weak long-term memory, or even behaves as white noise on time scales from months to decades (Kantelhardt et al. 2006; Jiang et al. 2017). For the river runoff measured in the Weser river, the memory strength is higher, with DFA exponent \(\alpha =0.79\) (blue points). For the surface air temperature record measured at Stockholm (\(\hbox {SAT}_{\mathrm{land}}\)), the DFA exponent \(\alpha =0.65\) (red points), which, in agreement with previous works, is around the average level of \(\hbox {SAT}_{\mathrm{land}}\) over the world (Eichner et al. 2003; Yuan et al. 2010). Surface air temperature over islands (\(\hbox {SAT}_{\mathrm{island}}\)), however, has stronger climate memory (Eichner et al. 2003); we find \(\alpha =0.85\) for the SAT record observed at Cocos Island (green points). Compared with station records, data averaged or derived over large spatial scales may have stronger climate memory. As shown by the pink points in Fig. 1, the DFA exponent \(\alpha\) of the Northern Hemisphere temperature anomalies (NHTA) is 0.85, while the PDO index, which represents the leading pattern of monthly sea surface temperature anomalies over the North Pacific (poleward of 20°N) (Zhang et al. 1997; Mantua et al. 1997), has the strongest long-term memory, with \(\alpha =1.08\) (yellow points).

Using the \(\alpha\) values, we then determined the corresponding q values in FISM (see Table 1). Accordingly, we can derive the historical weather-scale dynamical excitations \(\varepsilon (u)\) inversely from Eq. (3). Before doing so, however, we need to emphasize that the long-term memory revealed in Fig. 1 is on scales from 1 month to decades. For example, the scaling ranges revealed for precipitation and NHTA are from 1 month to around 30 years; for \(\hbox {SAT}_{\mathrm{land}}\) and river runoff, the scaling ranges are larger, from 1 month to around 50 years; while for \(\hbox {SAT}_{\mathrm{island}}\) and the PDO index, the scaling ranges are smaller, from 1 month to around 20 years. Limited by the data length, we are unable to check whether the scaling behavior extends to longer time scales, but previous studies have argued that the detected long-term memory may be scale-dependent (Fraedrich and Larnder 1993; Markonis and Koutsoyiannis 2016). In fact, as reported by Lovejoy (2015b), the scaling ranges revealed in this work belong to the regime of “macroweather”, which has a typical scaling range from 10 days to about 50 years. In view of the potential risk that the detected long-term memory may change on time scales longer than 50 years, in the following analysis we only focus on the recent 50 years (1961–2010), the period during which most meteorological stations of the world have been operating regularly.

In Fig. 2, the derived \(\varepsilon (u)\) for the six records from 1961–2010 are presented. The time series seem to fluctuate rapidly like uncorrelated noise, but dynamically there are reasons for them to take their particular values. For instance, one can see clearly that the \(\varepsilon (u)\) values in NHTA fluctuated around zero until 1980 and then became predominantly positive (see the gray area). This is reasonable, as a significant warming trend was observed in NHTA after 1980 (Fig. 3), which is attributed to external forcings such as the increased concentration of atmospheric greenhouse gases (GHGs). Although we do not study the warming effects of GHGs directly, their effects are manifested in \(\varepsilon (u)\) in the form of more positive values. A similar pattern can also be found in the case of \(\hbox {SAT}_{\mathrm{land}}\), where more positive \(\varepsilon (u)\) values are visible during the past 30 years. Therefore, \(\varepsilon (u)\) carries information from external forcings and determines the changing direction of climate states. It is not simple white noise, but behaves as the “climate gene” of the considered variable.

With the extracted \(\varepsilon (u)\), M(t) can be easily calculated from Eq. (4). As introduced in Sect. 2.2.2, since the \(\varepsilon (u)\) values derived from Eq. (3) for the first 10 years have large errors (“spin-up” time), we only use the \(\varepsilon (u)\) of the middle 30 years (1971–2000) to calculate the memory part M(t) for the last 10 years (2001–2010). In Fig. 3, M(t) (red curves) as well as the original time series (black curves) are shown for all six records. Apparently, with increasing climate memory (from top to bottom), the variance explained by M(t) becomes higher. For precipitation, since no memory was found (\(\alpha =0.5\); \(q=0\)), M(t) vanishes as a horizontal line, while for variables with strong memory (e.g. the PDO index), M(t) captures a large portion of the variability. To quantify the variance explained by M(t), we follow Eq. (5); the results are shown in Fig. 4. Based on the last 10 years, the explained variance of climate memory for the \(\hbox {SAT}_{\mathrm{land}}\) record observed in Stockholm is only \(4.2\%\), while for \(\hbox {SAT}_{\mathrm{island}}\) the explained variance is higher (\(19.8\%\)). For regionally averaged temperatures, the memory part explains approximately \(25.8\%\) of the variance in NHTA. For the PDO index, the explained variance is higher still and amounts to \(68.4\%\). For the river runoff data, the explained variance is \(16.9\%\), which is higher than in the case of \(\hbox {SAT}_{\mathrm{land}}\).

To validate the results from these six records, we applied the same calculation to a large number of artificial datasets, divided into 13 groups with different memory strengths (DFA exponents \(\alpha =0.50, 0.55, \ldots , 1.05, 1.10\)). For each group, the numerically simulated mean explained variance along with its uncertainty was obtained. As shown in Fig. 4, the black dots represent the mean explained variances, while the red and blue dashed lines are the upper and lower bounds of the 95% distribution range. The results from the six records are in good agreement with the numerical simulations, indicating that the extracted memory part M(t) in each record is reasonable and can be used for the estimation of climate memory effects.
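
Schematically, this validation pipeline combines the sketches given above. Note that, for brevity, the memory part below accumulates over the full available history rather than over a strict 30-year window as in the paper, so it is an approximation of our procedure:

```python
import numpy as np

# 13 groups of memory strengths, 3000 surrogates per group, 50 "years" each
alphas = np.arange(0.50, 1.11, 0.05)
summary = {}
for alpha in alphas:
    evs = []
    for seed in range(3000):
        x = fourier_filtered_noise(600, alpha, rng=seed)  # 600 months
        M, eps = fism_decompose(x, alpha)
        # 120-month spin-up + 360-month history, then 120-month test zone
        evs.append(explained_variance(x[480:], M[480:]))
    summary[alpha] = (np.mean(evs), np.percentile(evs, [2.5, 97.5]))
```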

However, it is worth noting that due to the limited data length, the explained variances of the six records were calculated using only 10 years of data (the test zone, 2001–2010, 120 months). This short length may bias the estimation of climate memory effects. In fact, by studying how the explained variance varies with data length, we found that 1680 months is the minimum length needed to produce a stable variance (figure not shown). Accordingly, to obtain unbiased climate memory effects over the past 30 years, the minimum data length is 10 years of “spin-up” time plus 30 years of historical “observations” plus 140 years (1680 months) of test zone, i.e., 180 years in total. Obviously, this data length is too long for most stations in the world; even among the six long records considered here, only two are longer than 180 years (the SAT observed in Stockholm and the river runoff data from the Weser river). Therefore, to obtain unbiased estimates of climate memory effects, we have to turn to artificial data of longer length.

4 Climate memory effects in one-step and multi-step climate predictions

In this section, we estimate the climate memory effects using long artificial datasets. By setting a long enough test zone (\(>1680\) months), M(t) calculated over this long period can be used to produce a reliable and stable variance, which is essential for the estimation of unbiased memory effects. Figure 5 shows the explained variances for different memory strengths (\(\alpha\) values). Compared to Fig. 4, the unbiased memory effects obtained from the longer data (Fig. 5) are slightly higher, indicating that the memory effects calculated from short data (Fig. 4) are underestimated. Although in practice an unbiased estimation of climate memory effects is normally hindered by the length of observational data, using the DFA exponent \(\alpha\) it is possible to estimate the climate memory effects indirectly from Fig. 5. For instance, it has been reported that monthly precipitation records are usually characterized by very weak long-term memory, with \(\alpha\) ranging from 0.5 to 0.65 (Kantelhardt et al. 2006; Jiang et al. 2017). According to Fig. 5 (the yellow area), climate memory may therefore explain no more than \(5\%\) of the variance in precipitation. For \(\hbox {SAT}_{\mathrm{land}}\), \(\alpha\) normally ranges from 0.55 to 0.80 (Eichner et al. 2003; Yuan et al. 2010); as a result, approximately 0.5–23.8% of the SAT variability can be explained (green area in Fig. 5). For \(\hbox {SAT}_{\mathrm{island}}\), stronger climate memory has been reported, with \(\alpha\) ranging from 0.65 to 0.9 (Eichner et al. 2003); accordingly, higher variances of up to \(43.6\%\) may be explained (blue area in Fig. 5). Table 2 provides additional information on different variables, including relative humidity (RH) (Lin et al. 2007; Chen et al. 2007), river runoff (RR) (Kantelhardt et al. 2006), and sea level change (SLC) (Dangendorf et al. 2014).

By definition, the explained variance calculated from Eq. (5) can be considered as the predictability of the variable under study. However, it is worth noting that the results shown in Figs. 4 and 5 only represent the climate memory effects in one-step prediction, as we used sliding windows of 30 years to calculate the memory part at different time points. To show how the climate memory effects decay over time, we turn to Eq. (6): by varying n, climate memory effects in multi-step prediction are calculated. Figure 6 shows the climate memory effects for steps \(n=2,3,\ldots ,10\). With increasing n, the explained variance of M(t) shows a power-law decay. This is reasonable, as the long-term climate memory discussed in this work is essentially a scaling behavior. Using this power-law relationship between the explained variances and n, we fitted the formula

$$\begin{aligned} EV_{n} = 10^{a_1 e^{-\alpha /b_1}} \cdot n^{a_2 e^{-1.5\alpha }+b_2}, n\ge 2, \end{aligned}$$
(7)

where \(\alpha\) is the DFA exponent and n denotes the number of steps. The parameters \(a_1, b_1, a_2, b_2\) are \(-45.542\), 0.186, \(-3.656\), and 0.532, respectively. With this formula, the explained variance can be estimated easily once the memory strength \(\alpha\) and the step number n are given (see the sketch below). As shown in Fig. 6, we found good agreement between the results calculated from the formula (lines) and those obtained from the Monte-Carlo simulations (solid circles). It is worth noting that for variables with different memory strengths, the memory effects can be very different. For example, if a given variable is characterized by weak climate memory (e.g., \(\alpha =0.65\), the case of \(\hbox {SAT}_{\mathrm{land}}\)), the climate memory effects become negligible (e.g., smaller than \(1\%\)) after only a few steps (months). However, if the considered variable has stronger climate memory (e.g., \(\alpha =0.85\), the case of NHTA), the climate memory effects remain non-negligible for a very long time. Therefore, climate memory effects should be properly considered not only in short-term climate predictions (e.g., one-step-forward prediction), but also in long-term predictions, especially when the considered variables/systems are characterized by strong climate memory.
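
Once \(\alpha\) is known, Eq. (7) with the fitted parameters can be evaluated directly; the sketch below (names are ours) reproduces, e.g., the contrast between \(\hbox {SAT}_{\mathrm{land}}\)-like (\(\alpha =0.65\)) and NHTA-like (\(\alpha =0.85\)) memory:

```python
import numpy as np

def ev_n(alpha, n):
    """Explained variance after n steps (n >= 2), fitted formula Eq. (7)."""
    a1, b1, a2, b2 = -45.542, 0.186, -3.656, 0.532
    prefactor = 10.0 ** (a1 * np.exp(-alpha / b1))
    return prefactor * n ** (a2 * np.exp(-1.5 * alpha) + b2)

steps = np.arange(2, 11)
print(ev_n(0.65, steps))   # drops below 1% within several months
print(ev_n(0.85, steps))   # still above 10% after ten steps
```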

5 Discussion and conclusion

In this work, we studied the effects of climate memory on climate prediction. Using the Fractional Integral Statistical Model (FISM), the influence of the climate states of the past 30 years on the current climate is quantified [the memory part M(t)], which allows one to calculate the variance explained by climate memory. It was shown that the stronger the climate memory, the higher the explained variance. Except for precipitation, the effects of climate memory are non-negligible for most climatic variables. For cases with strong climate memory (e.g. river runoff, PDO), the variance explained by climate memory can be higher than 20%. Beyond the effects of climate memory on next-step prediction (e.g., the next month), high explained variances may persist even in multi-step prediction; for NHTA, for example, the climate memory effects can remain higher than 10% even after 10 months.

Different from the short-term persistence of weather systems, such as cyclones and blocks, which lasts only a few days, the climate memory studied in this work is long-term (linear) climate memory. As the name implies, long-term climate memory refers to persistence on longer time scales (monthly, annual, inter-annual, ...), which is actually a manifestation of scaling behavior in the climate system. Therefore, to avoid mixing short- and long-term climate memory, we used monthly data, from which the persistence on scales smaller than 1 month has been eliminated. By employing WT2 (see the supplementary material for more details), we have confirmed that the scaling behavior of the variables studied in this work starts from 1 month, so it is reasonable to use monthly data to study the climate memory effects. However, limited by the data length, we can only detect the scaling behavior up to a few decades (maximum 50 years) with DFA-2. Considering that long-term climate memory may be scale-dependent and change on time scales larger than 50 years (Lovejoy 2015b; Markonis and Koutsoyiannis 2016), we focused only on the recent 50 years (1961–2010) and studied the climate memory effects of the past 30 years. In other words, our analysis is valid on time scales up to a few decades (below 50 years); for longer time scales, one first has to check whether the scaling behavior exhibits crossovers. Furthermore, it is worth noting that besides long-term linear climate memory, there may be long-term nonlinear memory in some variables (e.g. river runoff) (Bogachev and Bunde 2011; Kantelhardt et al. 2006), which reflects multi-fractal behavior of the variable under study. Up to now, however, it remains challenging to properly quantify nonlinear climate memory; accordingly, our work only considered the effects of linear climate memory.

In view of the non-negligible contribution of climate memory to climate variability, our work emphasizes the importance of past climate states in future climate predictions. Similar to the concept of “inertia” in physics, long-term climate memory is a measure in climate science that quantifies how past climate states are memorized. Traditionally, for processes with memory (or auto-correlation), the simplest prediction method is linear extrapolation, while the widely used methods are autoregressive-type models such as the ARFIMA (autoregressive fractionally integrated moving average) model (Baillie and Chung 2002; Taqqu et al. 1995). Although these traditional methods may be powerful for processes with strong memory, they lack a physical basis. ARFIMA, for instance, is also based on fractional integral techniques and is capable of simulating long-term correlated processes, as shown in Eq. (8),

$$\begin{aligned} x(t) = \sum _{\nu =1}^{\infty }a(q;\nu )x(t-\nu )+\varepsilon (t), \end{aligned}$$
(8)

where the statistical weights \(a(q;\nu )\) are defined as (Caballero et al. 2002)

$$\begin{aligned} a(q;\nu ) = (-1)^{\nu +1} \frac{\varGamma (1+q)}{\varGamma (1+q-\nu )\varGamma (1+\nu )}, \end{aligned}$$
(9)

\(\varepsilon (t)\) denotes independent and identically distributed (i.i.d.) Gaussian noise, and \(\varGamma\) is the Gamma function. Like FISM, ARFIMA uses the fractional integral order q, but in ARFIMA the fractional integration \(a(q;\nu )\) acts directly on the historical observations x(t), whereas in FISM the fractional integration is applied to the historical weather-scale excitations \(\varepsilon (u)\) (Eq. 3). As discussed in Yuan et al. (2014), this design makes it difficult for ARFIMA to give the noise term \(\varepsilon (t)\) a physical meaning, while in FISM \(\varepsilon (t)\) is not simple white noise, but carries information from external forcings (see Fig. 2). Therefore, different from traditional methods such as simple linear extrapolation or the more advanced ARFIMA model, our method pays more attention to the underlying physics, such as understanding why and how long-term memory arises in climate (Yuan et al. 2014). Accordingly, besides calculating M(t) precisely using FISM, it is also possible to focus on the non-memory part \(\varepsilon (t)\). In this way, a new perspective for climate prediction is suggested: instead of predicting future climate directly, as current models do, we might focus on the non-memory part \(\varepsilon (t)\) and predict future climate indirectly by coupling \(\varepsilon (t)\) into FISM. The variance explained by M(t) determines a lower bound of the predictive skill, which can be raised if the future \(\varepsilon (t)\) can be properly estimated. Since it is still challenging to fully capture internal variability in current models, this new research strategy may open a new gate for climate prediction and deserves more attention in the future.
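
For comparison, the ARFIMA weights of Eq. (9) and a truncated simulation of Eq. (8) can be sketched as follows; the recursive form of the weights is algebraically equivalent to the Gamma-function expression but numerically stable for large \(\nu\) (the truncation length and all names are ours):

```python
import numpy as np

def arfima_weights(q, n_lags):
    """Statistical weights a(q; nu) of Eq. (9), computed recursively:
    a(q; 1) = q, a(q; nu+1) = a(q; nu) * (nu - q) / (nu + 1)."""
    a = np.empty(n_lags)
    a[0] = q                                  # a(q; 1)
    for nu in range(1, n_lags):
        a[nu] = a[nu - 1] * (nu - q) / (nu + 1)
    return a

def arfima_simulate(q, n, n_lags=500, rng=None):
    """Simulate a long-term correlated series via Eq. (8), truncating the
    infinite sum at n_lags and using i.i.d. Gaussian noise eps(t)."""
    rng = np.random.default_rng(rng)
    a = arfima_weights(q, n_lags)
    x = np.zeros(n)
    eps = rng.standard_normal(n)
    for t in range(n):
        lags = min(t, n_lags)
        # Eq. (8): weighted sum over past values plus fresh noise
        x[t] = a[:lags] @ x[t - 1::-1][:lags] + eps[t]
    return x
```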