1 Introduction

Soil moisture plays a key role in the hydrological and energy cycles as well as in the carbon cycle. The influence of soil moisture on precipitation and surface temperature has long been recognised and has drawn renewed attention in recent years (e.g. Delworth and Manabe 1988; Hong and Kalnay 2000; Koster and Suarez 2003; Conil et al. 2008). The memory of soil moisture is particularly important for the seasonal prediction of precipitation, temperature and other meteorological variables, and various modelling studies (for example, see Dirmeyer 2000) have shown that there is predictability based on anomalies in land surface moisture. Delworth and Manabe (1988) pioneered the study of the temporal variability of soil moisture in the Geophysical Fluid Dynamics Laboratory (GFDL) general circulation model and suggested that soil moisture variations can be considered as a first-order Markov process. The autocorrelation function of the process decays exponentially:

$$r(\tau) = \exp(-\lambda \tau) \tag{1}$$

where τ is the time lag and λ⁻¹ is the e-folding time of the anomaly correlation in the absence of forcing. The e-folding time is generally referred to as the temporal scale of soil moisture at which the soil integrates precipitation into a red soil moisture process. Here, time scales are considered for which precipitation is uncorrelated and spectrally white. The e-folding time has been studied intensively using either observations (Entin et al. 2000) or atmospheric general circulation models (Wu and Dickinson 2004), since soil moisture may enhance extreme climate events, such as dry or wet spells in warm seasons.
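As a minimal illustration of Eq. 1, the following sketch simulates a discrete first-order Markov (AR(1)) process driven by white noise and compares its sample autocorrelation with exp(−λτ). The function names, the decay rate `lam` and the series length are illustrative assumptions and are not taken from the studies cited above.

```python
import numpy as np

def simulate_ar1(n, lam, rng):
    """Simulate x[t] = phi * x[t-1] + white noise, with phi = exp(-lam)."""
    phi = np.exp(-lam)
    noise = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = noise[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

def autocorrelation(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rng = np.random.default_rng(0)
lam = 0.25  # assumed decay rate per time step (e-folding time of 4 steps)
x = simulate_ar1(100_000, lam, rng)

lags = (1, 2, 4, 8)
r_sample = [autocorrelation(x, lag) for lag in lags]
r_theory = [float(np.exp(-lam * lag)) for lag in lags]
```

For a sufficiently long series, `r_sample` should track `r_theory` closely, which is the exponential decay that defines the e-folding time.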

Manabe and Delworth (1990) suggested that soil moisture has the potential to contribute substantially to low-frequency atmospheric variability, as approximately half of the total variance of the soil moisture process resides beyond a long ‘separation time scale’ in a first-order Markov model (defined as 2πλ⁻¹), implying a long-range correlation. Little work has been done to address this issue in more detail. Amenu et al. (2005) reported 17-, 34- and 60-month low-frequency modes in the Illinois soil moisture observations corresponding to El Niño–Southern Oscillation signals; this gave observational support to longer-term correlation beyond the classic e-folding time. Several studies have revealed that geophysical variables exhibit long-term correlations (equivalent to enhanced low-frequency variability) (Mandelbrot and Wallis 1969), such as runoff (Livina et al. 2003; Mudelsee 2007; Wang et al. 2008) and near-surface temperature (Fraedrich and Blender 2003; Koscielny-Bunde et al. 1998). However, precipitation, the major driver of the hydrological cycle, is spectrally white (Kantelhardt et al. 2006). Blender and Fraedrich (2006) suggest soil moisture as the major source of memory in the hydrological cycle over land. Observed long-term correlations do not follow exponential decays but follow power laws with exponents quantifying the strength of these correlations. The long-term memory of soil moisture may be relevant for the clustering of soil moisture deficiencies (D’Andrea et al. 2006) and the occurrence of heatwaves (Seneviratne et al. 2006).
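The ‘half of the total variance’ statement can be checked with a short calculation. This is a sketch under the assumption of a continuous-time first-order Markov process, whose power spectrum in terms of the angular frequency ω is Lorentzian; the symbol ω is introduced here for illustration only.

$$S(\omega) \propto \frac{1}{\lambda^{2} + \omega^{2}}, \qquad \frac{\int_{0}^{\lambda} S(\omega)\,d\omega}{\int_{0}^{\infty} S(\omega)\,d\omega} = \frac{\lambda^{-1}\arctan(1)}{\lambda^{-1}\,\pi/2} = \frac{1}{2}$$

The angular frequency ω = λ corresponds to the period 2πλ⁻¹, so under this assumption half of the variance lies at periods longer than the separation time scale.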

Hasselmann (1976) introduced the concept of stochastic climate models and stated that climate variability can in principle be modelled by autoregressive (AR) processes. Mitchell (1964) also pointed out that ‘…persistence in meteorological data can ordinarily be described very well by a first-order linear Markov model’. Based on this concept, Delworth and Manabe (1988) advanced the e-folding time model of soil moisture memory. However, many climatic variables behave non-linearly, e.g. temperature (Bartos and Jánosi 2006), suggesting that linear models cannot fully capture the statistical properties of such processes. To fully characterise the statistical properties of soil moisture and to better understand the underlying dynamics, it is necessary to determine the degree of non-linearity in soil moisture time series. Linearity may be defined using the Fourier phases of the time series: if the statistical properties do not depend on the Fourier phases (this is tested by randomly shuffling the phases), the time series is linear; otherwise, the series is considered to be non-linear. This definition includes the linear AR processes used by Delworth and Manabe (1988).

Ashkenazy et al. (2003) suggested a method to assess the non-linearity in geophysical time series based on an analysis of the volatility time series, which is given by the absolute values of the increments. The main observation is that time series with long-term memory based on linear processes are characterised by an absence of long-term memory in the volatility time series, whereas for non-linear processes, the long-term memory in the volatility is preserved. Volatility correlation is found, for example, in river fluxes (Livina et al. 2003) and land surface temperatures (Bartos and Jánosi 2006; Govindan et al. 2003). Similar properties of temperature are detected in proxy records (Ashkenazy et al. 2003) and in the abyssal equatorial Pacific (Kalisky et al. 2005). These studies suggest considerable non-linearity and ‘clustering’ of magnitudes in these geophysical variables; that is, a large magnitude tends to follow a large magnitude, whilst a small magnitude tends to follow a small magnitude.

The aim of this paper is to determine the temporal correlation properties of soil moisture on short and long time scales and to find possible relationships between both regimes. The degree of non-linearity is addressed by a long-term memory analysis of the volatility time series. Since global high-quality observations of soil moisture are sparse in space and time, we use the European Centre for Medium-Range Weather Forecasts (ECMWF) re-analysis product ERA-40 for the period 1957 to 2002. This paper is organised as follows: in Section 2, the dataset and the analysis methods are described. In Section 3, global results on the short-term and long-term memory properties are presented, and Section 4 includes a non-linearity analysis based on the volatility correlation properties. Section 5 concludes with a brief summary and discussion.

2 Data and methods

The variability analysis of soil moisture is based on global daily fields in the ECMWF re-analysis product ERA-40. To determine the short-term and long-term variability correlations (which are related to memory), we use the detrended fluctuation analysis (DFA). Non-linearity of the underlying processes is determined by an analysis of the volatility of the soil moisture time series.

2.1 ERA-40 re-analysis data

We use land surface soil moisture data in the ERA-40 global re-analysis, produced by the ECMWF (Uppala et al. 2005). The land surface parameterisation of ERA-40 (van den Hurk et al. 2000) models the soil–atmosphere and soil–vegetation interactions and delivers a daily surface water and energy balance at each grid cell on four prognostic layers for soil moisture with layer thicknesses of 7, 21, 72 and 189 cm during the entire period of 1957 to 2002. Both daily data and monthly averages are used in this study. The seasonal cycle is removed at each grid point by subtracting the respective monthly and daily climatological means. Trends are not removed.
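The removal of the seasonal cycle can be sketched as follows for monthly data. This is a minimal example on synthetic values, not the ERA-40 processing itself; the function name `monthly_anomalies`, the series length and the noise amplitude are assumptions made for illustration.

```python
import numpy as np

def monthly_anomalies(values, months):
    """Subtract the climatological mean of each calendar month.

    values : 1-D array of monthly means
    months : 1-D integer array (1..12) with the calendar month of each value
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    anomalies = np.empty_like(values)
    for m in np.unique(months):
        mask = months == m
        anomalies[mask] = values[mask] - values[mask].mean()
    return anomalies

# Synthetic example: a seasonal cycle plus weak noise (arbitrary length).
rng = np.random.default_rng(1)
n_years = 40
months = np.tile(np.arange(1, 13), n_years)
seasonal = 0.05 * np.sin(2 * np.pi * (months - 1) / 12)
series = 0.30 + seasonal + 0.01 * rng.standard_normal(months.size)
anom = monthly_anomalies(series, months)
```

The same idea applies to daily data when the grouping key is the calendar day instead of the calendar month. No trend is removed, in line with the description above.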

2.2 Detrended fluctuation analysis and power spectra

The DFA (Peng et al. 1994) is a spectral method developed to detect long-term memory in stationary time series. First, the anomaly time series are determined by removing the climatological means from the original time series. As the first step of DFA, the anomaly series are integrated to the so-called profile. To determine the fluctuation function F(τ), the profile time series is partitioned into segments of duration τ, and linear fits are calculated separately for each segment. The fluctuation function F(τ) is the square root of the mean of the variances of the profile with respect to the fits at the time scale τ. To obtain robust estimates, overlapping windows are used. When the original time series shows a polynomial trend of order k − 1, polynomials of order k are fitted and subtracted in the segments (denoted as DFAk; note that the above-mentioned DFA does not eliminate trends, Bunde et al. 2000). In this work, DFA is computed using MATLAB. Both DFA1 and DFA2 are performed, and only the DFA1 results are reported since DFA2 produces the same results.
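The procedure described above can be sketched in a few lines. This is a hedged illustration and not the MATLAB code used in this work: it uses non-overlapping windows for brevity (the analysis here uses overlapping windows), takes F as the root-mean-square residual, and the names `dfa`, `dfa_exponent` and the chosen scales are assumptions.

```python
import numpy as np

def dfa(x, scales, order=1):
    """Return the DFA fluctuation function F(s) for each window length s."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())           # integrate anomalies to the profile
    fluct = np.empty(len(scales))
    for k, s in enumerate(scales):
        n_seg = profile.size // s
        # One column per non-overlapping segment of length s.
        segments = profile[: n_seg * s].reshape(n_seg, s).T
        t = np.linspace(-1.0, 1.0, s)           # rescaled time, for conditioning
        design = np.vander(t, order + 1)        # polynomial design matrix
        coef, *_ = np.linalg.lstsq(design, segments, rcond=None)
        residual = segments - design @ coef     # profile minus local fits
        fluct[k] = np.sqrt(np.mean(residual ** 2))
    return fluct

def dfa_exponent(x, scales, order=1):
    """Slope of log F(s) versus log s, i.e. the fluctuation exponent."""
    fluct = dfa(x, scales, order)
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return float(slope)

rng = np.random.default_rng(42)
white = rng.standard_normal(2 ** 15)
scales = [16, 32, 64, 128, 256, 512]
alpha_white = dfa_exponent(white, scales)            # expected near 0.5
alpha_walk = dfa_exponent(np.cumsum(white), scales)  # expected near 1.5
```

White noise and its running sum serve as sanity checks, since their exponents are known in advance; `order=1` corresponds to DFA1 and `order=2` to DFA2.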

If the power spectrum follows a power law, S(f) ∼ f^(−β), the fluctuation function F(τ) also obeys a power law, F(τ) ∼ τ^α, where α is the DFA exponent. This exponent α can be determined from the slope of log F(τ) versus log τ. The exponents are related by β = 2α − 1. A long-term memory process is characterised by fluctuation exponents α > 0.5 (β > 0, for low frequencies). An uncorrelated process (white noise) is given for α = 0.5 (β = 0), and an anti-persistent process has α < 0.5 (β < 0). Specifically, α = 1.5 (β = 2) corresponds to Brownian noise, which can be regarded as the integration of white noise. Stationarity is violated for α > 1 (β > 1), the threshold being given by 1/f (or flicker) noise.
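The relations in this paragraph can be collected in a small helper. The function names and the tolerance `tol` around α = 0.5 are illustrative assumptions; only the relation β = 2α − 1 and the thresholds 0.5 and 1 come from the text.

```python
def spectral_exponent(alpha):
    """Spectral exponent beta from the DFA exponent alpha (beta = 2*alpha - 1)."""
    return 2.0 * alpha - 1.0

def classify(alpha, tol=0.05):
    """Label a fluctuation exponent; tol is an assumed tolerance around 0.5."""
    if alpha > 1.0:
        return "non-stationary"
    if alpha > 0.5 + tol:
        return "long-term memory"
    if alpha < 0.5 - tol:
        return "anti-persistent"
    return "uncorrelated"

# Example exponents: white noise, a stationary long-memory case, Brownian noise.
examples = {a: (spectral_exponent(a), classify(a)) for a in (0.5, 0.8, 1.5)}
```

In practice the tolerance would have to reflect the uncertainty of the fitted slope, which depends on the length of the series and the fitting range.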

2.3 Volatility analysis and non-linearity

An empirical relationship between non-linearity and volatility series was suggested by Ashkenazy et al. (2001). For a given time series u(i) with increments Δu(i) = u(i + 1) − u(i), the volatility series is defined as the absolute value of the increments, vol(i) = |Δu(i)|. It was found that long-range correlated linear series have uncorrelated volatility series, whilst long-range correlated non-linear series have correlated volatility series. The detection of non-linearity in time series is rather involved and requires long time series.
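The definition of the volatility series translates directly into code. The sketch below is a minimal example; the function name and the sample values are made up for illustration.

```python
import numpy as np

def volatility(u):
    """Volatility series vol(i) = |u(i+1) - u(i)| of a time series u."""
    u = np.asarray(u, dtype=float)
    return np.abs(np.diff(u))

# Made-up soil moisture values (volumetric fraction), for illustration only.
u = [0.30, 0.32, 0.29, 0.29, 0.35]
vol = volatility(u)  # one element shorter than u
```

The resulting series is one element shorter than the input and is non-negative by construction; its correlation properties are then analysed with the same tools as the original series.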

3 Temporal variability in soil moisture layers

The e-folding time is used to estimate the memory time scale of soil moisture assuming an exponential decay of the autocorrelation function of the time series. Using Eq. 1, the 1-month lag autocorrelation values r = 0.8, 0.6, 0.4 and 0.2 correspond to e-folding times λ⁻¹ of 4.5, 2.0, 1.1 and 0.6 months, respectively. The 1-month lag autocorrelation of global soil moisture for the top surface layer (7 cm depth) in ERA-40 data is shown in Fig. 1. This result is based on monthly anomalies at each grid cell, which are calculated by removing the monthly climatological mean from the monthly time series. The substantial spatial variability of the e-folding times in the top surface layer is consistent with previous studies (e.g. Delworth and Manabe 1988; Wu and Dickinson 2004). We find short e-folding times in the Tropics and an increase with latitude, as well as relatively higher values in arid and broadleaf forest regions. The processes involved in the dynamics of soil moisture (mainly precipitation, evapotranspiration and runoff) contribute to the variability, but the precise mechanisms are model-dependent and still not clearly understood (Delworth and Manabe 1988; Wu and Dickinson 2004).
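The conversion from the 1-month lag autocorrelation to the e-folding time follows from Eq. 1 with τ = 1 month, i.e. λ⁻¹ = −1/ln r. A minimal sketch, with an illustrative function name:

```python
import math

def e_folding_time(r, lag=1.0):
    """E-folding time from the autocorrelation r at a given lag (Eq. 1)."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return -lag / math.log(r)

# One-month lag autocorrelations quoted in the text, rounded to 0.1 month.
times = [round(e_folding_time(r), 1) for r in (0.8, 0.6, 0.4, 0.2)]
# times == [4.5, 2.0, 1.1, 0.6]
```

The mapping is strongly non-linear: a change in r from 0.6 to 0.8 more than doubles the e-folding time.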

Fig. 1 One-month lag autocorrelation coefficient of monthly soil moisture anomalies in the top surface layer with 7 cm depth

To examine the fluctuation regimes, three regions are selected for the DFA: a mid-latitude region in Europe marked as region I with an e-folding time of 2–3 months, an evergreen broadleaf forest region marked as region II and an arid region marked as region III, both with an e-folding time of approximately 1 year. These three regions represent different types of fluctuations in global soil moisture. In these analyses, daily soil moisture data are used. First, the daily values are averaged in each of the four layers for the whole re-analysis period in each selected region (this yields 12 time series in total). The DFA is applied to the anomalies obtained by removing the climatological means from the daily soil moisture time series. The DFA in the three regions I, II and III shows the following results:

Region I (Europe)

A strong seasonality is present in the top three layers, whilst it is weak in the bottom layer (not shown). The DFA fluctuation functions show two distinct power law regimes (Fig. 2, I) with a crossover around 3 months. The regimes correspond to power spectra, S(f) ∼ f^(−β), based on the relationship β = 2α − 1 between the spectral exponent β and the fluctuation exponent α. In the first regime, the DFA fluctuation exponents in 1–90 days increase according to α ≈ 1.42, 1.54, 1.70 and 1.82 from surface to bottom layer. Thus, in the two upper layers, the first regime (shorter time scales) shows β ≈ 2, which is consistent with the e-folding time scale obtained within the Markovian framework. The exact value β = 2 is found in the high-frequency limit of the Lorentzian power spectrum S(f) ∼ 1/(λ² + f²), which is obtained for an exponential decay, exp(−λτ), of the autocorrelation function. In the deepest layer, α ≈ 1.8 is found, which is related to β ≈ 2.6; such ‘redder’ spectra have been found in previous studies (Wu and Dickinson 2004). The increase of α and β captures ‘the redder spectra in deeper layers’ related to an increasing memory. In the second regime, beyond 3 months, the exponents of the fluctuation functions of the four layers converge to the same value α ≈ 0.8 (β ≈ 0.6). At such long time scales, the exchange between the soil layers is complete and the variability is coherent. Note that values 0 < β < 1 indicate stationary long-term memory time series.
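The two limits of the Lorentzian spectrum mentioned above can be written out explicitly. This is a short worked step under the first-order Markov assumption, using the relation β = 2α − 1 from Section 2.2:

$$S(f) \sim \frac{1}{\lambda^{2} + f^{2}} \approx \begin{cases} \lambda^{-2}, & f \ll \lambda \quad (\beta = 0,\ \alpha = 0.5) \\ f^{-2}, & f \gg \lambda \quad (\beta = 2,\ \alpha = 1.5) \end{cases}$$

A purely Markovian process would therefore approach α = 0.5 beyond the crossover, so the value α ≈ 0.8 found in the second regime lies above what the exponential model alone would give.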

Fig. 2 Log–log plots of DFA fluctuation functions F(τ) ∼ τ^α from daily soil moisture anomalies in the three regions (I Europe, II Amazon and III Sahara) and the four layers from top to bottom [7 cm (filled circles), 21 cm (x), 72 cm (open diamonds) and 189 cm (open circles)]. Crossover time scales are indicated by a vertical bar. The exponent α for short time scales is indicated (determined in 1–90 days in I and in 1–300 days in II and III)

Region II (Amazon)

This broadleaf forest region shows intense seasonality of soil moisture in all layers. The crossover time scale is longer than in region I (Europe) and reaches roughly 1 year (Fig. 2, II). Below 1 year, the fluctuation exponent α, determined in 1–300 days, increases with depth: α = 1.20, 1.26, 1.42 and 1.65 from top to bottom. Hence, the power spectra S(f) ∼ f^(−β) scale with the exponents β = 1.4, 1.52, 1.84 and 2.3, and the increase of the memory with depth is similar to Europe. However, the short-term memory is slightly weaker in this region than in Europe. In the long-term regime above 1 year, α converges to α = 1.28 (β = 1.56). This value indicates non-stationarity of the anomaly time series up to the maximum time scale given by the duration of the ERA-40 data. Therefore, averages determined from these data should be interpreted carefully; for climatological means, longer time periods are necessary.

Region III (Sahara)

In this region, seasonality is present only in the top two layers, and throughout the year, the top layer is wetter than the second layer. Besides the rare rainfall events, evaporation is the dominant process in the dynamics of soil moisture. Due to the small amounts involved and due to subsequent fast evaporation, rapidly varying precipitation does not reach the deeper layers. The crossover times extend up to 1 year in the top two layers (Fig. 2, III), beyond which DFA fluctuations converge to α = 1.54 (β = 2.08) determined in 1–300 days. The two bottom layers, which are not impacted by the high-frequency components of the precipitation variability, show a unique variability for all time scales, with the same exponent, which is the limit of the two top layers for large time scales. Thus, all spectra are red (β = 2) in the whole frequency range accessible in the data set. The non-stationarity involved is even more vigorous than in the Amazon region.

A global view of the inter-annual long-term memory is determined for the top 7-cm layer by a fit of the fluctuation exponent α in the time interval of 2–10 years (Fig. 3). Since all of the four layers at each location have identical long-term correlation properties (see Fig. 2), this map shows the long-term memory of the total soil moisture content. The similar spatial structure of the autocorrelation coefficients in Fig. 1 and the fluctuation exponents in Fig. 3 suggests a relationship between the short-range and the long-range correlations. Such a relationship is also found in the land surface temperature anomalies (Kiraly et al. 2006). The physical mechanisms leading to this interdependence need to be further clarified, possibly by simulations with coupled land–atmosphere climate models.

Fig. 3 Long-term fluctuation exponent α of soil moisture variability determined in the top layer (7 cm) within 2–10 years

4 Volatility correlations

Non-linear geophysical processes can lead to a clustering of volatility which appears mainly as seasonality and long-range correlations (see Section 2.3). To determine the degree of non-linearity involved in the dynamics of soil moisture, a long-term memory analysis is applied to the volatility time series vol(i) = |u(i + 1) − u(i)| (Ashkenazy et al. 2001). In the volatility time series, the seasonal cycle is removed. The main observation is that time series with long-term memory based on linear processes are characterised by an absence of long-term memory in the volatility time series, whereas for non-linear processes, the long-term memory in the volatility is preserved. The numerical differentiation used to produce volatility series is known to strongly enhance the noise level inherent in the data (Bartos and Jánosi 2006), and the typical volatility of fluctuations around local trends is extremely small, especially in arid regions, leading to noisy results.

To substantiate the analysis, we apply a further test based on surrogate data for the soil moisture time series, in which the non-linearity is destroyed whilst other statistical properties are preserved. To produce such surrogate data, Schreiber and Schmitz (2000) suggest the iterative amplitude-adjusted Fourier transform (iAAFT) method, which preserves both the power spectrum and the probability distribution of the series. The method proceeds as follows: (1) the sequence of the original time series is shuffled, (2) the shuffled series is Fourier-transformed, (3) its power spectrum is replaced by the power spectrum of the original series whilst the phases are retained and (4) the inverse transform is applied and the histogram of the result is adjusted to the histogram of the original series. Steps (2)–(4) are repeated until the result converges. This surrogate series has random Fourier phases, and the non-linearities stored in the phases are destroyed. By comparing the correlation properties of the volatility series obtained from the increments of the original soil moisture series and of the surrogate series, we can conclude whether the correlation in the volatility series is an indication of non-linearity.
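The four steps above can be sketched as follows. This is a minimal illustration of the iAAFT idea on a synthetic series and not the implementation used in this work; the function name `iaaft`, the iteration limit `n_iter` and the convergence test (stop when the series no longer changes) are assumptions.

```python
import numpy as np

def iaaft(x, n_iter=100, rng=None):
    """Surrogate with the value distribution of x and nearly its power spectrum."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    target_amp = np.abs(np.fft.rfft(x))  # Fourier amplitudes to be preserved
    sorted_x = np.sort(x)                # value distribution to be preserved
    s = rng.permutation(x)               # step (1): random shuffle
    for _ in range(n_iter):
        # Steps (2)-(3): keep the current phases, impose the original amplitudes.
        phases = np.angle(np.fft.rfft(s))
        s_spec = np.fft.irfft(target_amp * np.exp(1j * phases), n=x.size)
        # Step (4): rank-order remapping onto the original values.
        ranks = np.argsort(np.argsort(s_spec))
        s_new = sorted_x[ranks]
        if np.array_equal(s_new, s):     # assumed convergence criterion
            break
        s = s_new
    return s

# Synthetic red-noise test series (AR(1) with coefficient 0.8, illustrative only).
rng = np.random.default_rng(7)
n = 1024
noise = rng.standard_normal(n)
x = np.empty(n)
x[0] = noise[0]
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + noise[t]
surrogate = iaaft(x, rng=rng)
```

Because the last operation is the rank-order remapping, the surrogate reproduces the histogram of the original exactly and its power spectrum only approximately; ending on the spectral step instead would reverse this trade-off.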

Before we perform a global analysis, we consider the volatility anomaly series in the first two regions, Europe and Amazon (marked as I and II in Fig. 1). The long-term memory is determined by DFA (see Fig. 4). The results of the analysis of the surrogate data and the volatility reveal that (see Table 1):

  1. the soil moisture time series show long-term memory (α = 0.8 in Europe, α = 1.28 in the Amazon region),
  2. the surrogate data (with non-linearities eliminated) show the same long-term memory (this confirms the application of the iAAFT-method; Schreiber and Schmitz 2000),
  3. the volatility time series of soil moisture and of the surrogate time series show no long-term memory (α = 0.5).

Fig. 4 Volatility correlations for soil moisture and surrogate data for the regions in Fig. 2 [mid-latitude region (I) Europe, broadleaf forest region (II) Amazon]

Table 1 The decision process for the assessment of non-linearity in long-term memory (LTM) data by volatility analysis and the comparison with linear surrogate data (the result in the present analysis is italicised)

The global distribution of the fluctuation exponent reveals no clear spatial pattern (not shown); therefore, we present scatter plots of the fluctuation exponent α obtained at the data grid points (Fig. 5). This figure confirms that the surrogate data have the same long-term memory as the soil moisture time series (see α_original vs. α_surro), with a wide distribution ranging from 0.3 to 1.8. The long-term memory is lost in the transformation to the volatility time series (α_original vs. α_vol), leading to the conclusion of linearity. The exponents of the volatility time series, α_vol vs. α_surro_vol, show a narrow distribution between α = 0.4 and 0.6 centred at 0.5. Therefore, we conclude that the soil moisture time series is based on a linear physical process; the long-term memory is the same as in the linear surrogate data.

Fig. 5 Combined scatter plot of the fluctuation exponent α for the original soil moisture time series, α_original, surrogate series, α_surro, volatility of the original time series, α_original_vol, and volatility of the surrogate data, α_surro_vol. The three plots are: α_original vs. α_surro (x), α_original vs. α_vol (open circles) and α_original_vol vs. α_surro_vol (filled circles)

Since the link between non-linearity of the time series and the long-range volatility correlation is empirical (Ashkenazy et al. 2001), Kalisky et al. (2005) studied theoretical predictions of the relationship between the correlation exponent of a time series and its volatility. For a linear process obeying a power law, up to the value α = 0.75, the volatility exponent is practically constant, α_vol = 0.5, and then changes to an approximately linear increase. Unfortunately, this result is not confirmed in our analysis (see α_original vs. α_vol for α_original > 0.75 in Fig. 5).

5 Summary and discussion

Soil moisture is an important agent in land–atmosphere interactions since it couples rapid precipitation fluctuations to storage with a memory of the order of months to years. The memory time scale of soil moisture is widely characterised by an e-folding time within a Markovian framework. We study the power law fluctuations of soil moisture and find that there are two regimes, with either short-range or long-range correlations. The short-range correlation is equivalent to an e-folding time that is considered as the time scale at which soil moisture is integrated from uncorrelated precipitation. Beyond this time scale, the long-range correlations of soil moisture may contribute to low-frequency climate variability. Within the short-range correlation regime, deep layers show a redder spectrum, whilst within the long-range correlation regime, soil moisture shows an identical spectrum in the different layers. There is a clear correspondence between the e-folding time and the long-range correlations, modified by local conditions. Whereas the e-folding time is easily interpreted by a simple first-order autoregressive process, a simple model and the interpretation of the long-term power law correlations are less clear and may originate in the complex interactions within the hydrological cycle (Blender and Fraedrich 2006). In very dry and very wet as well as highly elevated regions, the fluctuation exponent for large inter-annual time scales is α > 1; this shows the non-stationarity of the soil moisture process at very large time scales and hampers prediction by traditional statistics.

The volatility correlation of soil moisture, which is considered as an empirical indicator for non-linearity, reveals that soil moisture exhibits a white volatility spectrum. This suggests that the underlying processes of soil moisture in ERA-40 are linear and that the statistical properties of soil moisture may be well approximated by linear models such as the Markovian model used by Delworth and Manabe (1988). This conclusion is substantiated by a surrogate data test. Further work is needed to clarify the physical processes and whether this corresponds to real linearity or is in fact an artefact of the model used in the re-analysis.

The linearity found in the soil moisture time series does not necessarily conflict with the expected ‘non-linear’ interactions between soil moisture and other components of the land surface water budget, since the latter fall under a different definition of non-linearity, one that refers to the dynamical equations (for a review on the definition of non-linearity, see Ashkenazy et al. 2003). Moreover, there is no proven link between the non-linearity found in data and the non-linearity in the governing dynamical equations (e.g. Hsieh 2001).