1 Introduction

The Madden–Julian oscillation (MJO) is the most prominent form of tropical intraseasonal variability (Madden and Julian 1994; Lau and Waliser 2005; Zhang 2005). The oscillation propagates eastward in the tropical region with phase speeds on the order of 5–10 m s−1 and influences the monsoons in Asia, Australia and the Americas (Mo 2000; Nogues-Paegle et al. 2000; Goswami and Mohan 2001; Higgins and Shi 2001; Jones and Carvalho 2002). Moreover, several studies have shown that the MJO can affect the distribution of rainfall and extreme events in many locations around the world (Mo and Higgins 1998; Higgins et al. 2000; Jones 2000; Carvalho et al. 2004; Jones et al. 2004a). Since the MJO involves intense tropical convective heating anomalies (Kiladis et al. 2005), tropical–extratropical interactions are significant during its life cycle. Accordingly, previous investigations have shown that the MJO modulates weather forecast skill (Ferranti et al. 1990; Lau and Chang 1992; Hendon et al. 2000; Jones and Schemm 2000; Seo et al. 2005) and potential predictability (Waliser et al. 2003; Jones et al. 2004a, b).

An interesting aspect of the MJO is that it was discovered through rigorous statistical analysis of data from the few sparse tropical weather stations available at the time (Madden and Julian 1971, 1972). In that context, spectral and cross-spectral analyses of relatively short time series (~5–9 years) revealed the presence of a significant intraseasonal oscillation in the tropical atmosphere. In contrast, characterization of its spatial structure, identification of individual events and variations on seasonal-to-interannual time scales are more easily attained with gridded atmospheric data (e.g., Jones and Carvalho 2006). Presently, knowledge about the history of the MJO is limited to the period from 1948 to the present, for which reanalyzed data are available (Kalnay et al. 1996; Bengtsson et al. 2004). It is important to note that uncertainties about the variability of the MJO in the reanalysis period are still present, since significant changes in the observational system have occurred. For example, the number of tropical weather stations has varied substantially over time and satellite data became available in the reanalysis after 1978 (Kistler et al. 2001).

Although seasonal variations in the MJO are relatively well understood (Wang 2005), physical mechanisms involved in interannual and longer variations are much less clear (Hendon et al. 1999; Slingo et al. 1999; Lau 2005; Slingo et al. 2005). Jones and Carvalho (2006) and Pohl and Matthews (2007), for instance, found that the activity of the MJO has increased in recent years. Moreover, Jones and Carvalho (2006) showed that the MJO appears to exhibit regime changes on low-frequency time scales (e.g., 5–15 years). Consequently, some fundamental aspects about the temporal behavior of the MJO remain unexplored, especially how decadal climate changes and global warming might affect the MJO.

To investigate mechanisms associated with the MJO on different time scales, one would ideally like to use a comprehensive global coupled climate model and perform a large ensemble of numerical experiments. Although substantial progress has been made over the years, only a very limited number of climate models exhibit intraseasonal variability that resembles the observed MJO (Zhang 2005). Currently, global climate models are unable to accurately represent all characteristics of the MJO (Lin et al. 2006).

This paper investigates the long-term activity of the MJO employing a probabilistic approach. We refer here to activity as the number of events occurring in a period of time. Reanalysis data were used to develop a stochastic model capable of simulating the non-stationary behavior of the MJO during 1948–2008. The model is driven by observed sea surface temperature (SST) anomalies in the tropical Indian and Pacific Oceans warm pool region (hereafter warm pool), and a large ensemble of simulations was performed to infer the activity of the MJO in the instrumental period (1880–2008). Specifically, this study investigates the following questions. Do interannual-to-decadal variations in SST in the warm pool lead to substantial changes in MJO activity? Was there a change in the MJO in the 1970s? Can this change be associated with SST anomalies in the warm pool region? What was the level of MJO activity in the pre-reanalysis era?

The paper is organized as follows. Section 2 describes data sets and Sect. 3 discusses identification of MJO events. Section 4 summarizes the basic core of a stochastic model previously published. Section 5 discusses extensions performed in the model to account for non-stationarity of the MJO. Results are presented in Sect. 6. Discussion and conclusions are provided in Sect. 7.

2 Data

The primary data set used in this study is the National Centers for Environmental Prediction/National Center for Atmospheric Research reanalysis (NNR) (Kalnay et al. 1996; Kistler et al. 2001). Daily averages of zonal wind components at 850-hPa (U850) and 200-hPa (U200) were used for the period 1 January 1948–31 December 2008. In general, interannual changes in the MJO based on NNR are consistent with the 40-year European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40; Slingo et al. 2005; Jones and Carvalho 2006). To complement the analysis, daily averages of outgoing longwave radiation (OLR; Liebmann and Smith 1996) during 1 January 1979–31 December 2008 were used to characterize the convective signal associated with the MJO.

Daily climatologies of OLR, U850 and U200 were computed by averaging each calendar day and smoothing the resulting time series with 300 passes of a 1-2-1 moving average. Daily climatologies were subtracted from the original time series to remove the annual cycle. To isolate the MJO signal, the time series were detrended and filtered in the frequency domain to retain variations with periods between 20 and 200 days. This procedure follows Matthews (2000), who determined that the wide 20–200 day band more accurately represents isolated MJO events.
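The preprocessing described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code used in the study: the function names (`smooth_121`, `daily_anomalies`, `bandpass_fft`) are hypothetical, the climatology is treated as periodic when smoothed, leap days are assumed to have been dropped, the trend is assumed to be linear, and the band-pass is implemented by zeroing Fourier coefficients outside the 20–200 day band.

```python
import numpy as np

def smooth_121(x, passes=300):
    """Apply a 1-2-1 moving average `passes` times, treating x as periodic."""
    y = np.asarray(x, dtype=float).copy()
    for _ in range(passes):
        y = 0.25 * np.roll(y, 1) + 0.5 * y + 0.25 * np.roll(y, -1)
    return y

def daily_anomalies(series, day_of_year, passes=300):
    """Subtract a smoothed calendar-day climatology from a daily series.

    day_of_year holds integers 0..364 (assumption: leap days removed beforehand).
    """
    series = np.asarray(series, dtype=float)
    day_of_year = np.asarray(day_of_year)
    clim = np.array([series[day_of_year == d].mean() for d in range(365)])
    clim = smooth_121(clim, passes)
    return series - clim[day_of_year]

def bandpass_fft(x, low_days=20.0, high_days=200.0, dt_days=1.0):
    """Remove a linear trend, then keep only periods between low_days and high_days."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size)
    x = x - np.polyval(np.polyfit(t, x, 1), t)   # assumption: linear detrending
    freq = np.fft.rfftfreq(x.size, d=dt_days)    # cycles per day
    spec = np.fft.rfft(x)
    keep = (freq >= 1.0 / high_days) & (freq <= 1.0 / low_days)
    spec[~keep] = 0.0                            # zero everything outside the band
    return np.fft.irfft(spec, n=x.size)
```

A sharp spectral cutoff like this one is the simplest choice; a tapered response would reduce ringing near the ends of the record.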

Interannual-to-decadal changes and long-term trends in the warm pool region (15°S–15°N; 50°E–150°W) were analyzed with monthly SST from four different data sets during January 1880–December 2008 (Kaplan et al. 1998; Smith and Reynolds 2003; Ishii et al. 2005; HadISST 2006). A time series of SST anomalies in the warm pool was obtained by averaging the four data sets and removing the mean seasonal cycle.

3 Identification of the MJO during 1948–2008

Identification of MJO events follows the procedure discussed in Jones (2009) and is summarized here for completeness. A combined empirical orthogonal function (EOF) analysis (Wilks 1995) was performed on U200 and U850 anomalies averaged in latitude (15°S–15°N). The first two EOFs, which account for 23.8 and 19.5% of the total variance respectively, were used to represent the MJO. The spatial structures of the first two EOFs can be seen in Jones (2009) and Wheeler and Hendon (2004).

The phase diagram based on the first two normalized principal components (PC1, PC2) was then used to identify MJO events according to the following criteria: (1) the phase angle between PC1 and PC2 systematically rotated anti-clockwise indicating eastward propagation at least to phase 5 (maritime continent), (2) the amplitude (PC1² + PC2²)^0.5 was always larger than 0.35, (3) the mean amplitude during the event was larger than 0.9, and (4) the entire duration of the event lasted between 30 and 90 days. The upper bound of 90 days was used to include MJO events with long durations, especially those that propagated to the western/central Pacific and remained active in those regions for several days. Based on these conditions, all MJO events identified in this study started in phases 1–4, propagated eastward and ended in phases 4–8 (i.e., isolated events) or restarted from previous MJO occurrences (i.e., successive events; phase continues from 8 to 1). A total of 239 MJO events were identified during 1948–2008. Figure 1 shows the canonical life cycle of the MJO as composites of OLR anomalies for events during 1979–2008.
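Criteria (2)–(4) translate directly into a short check on the daily principal components of a candidate event. The sketch below is illustrative only: the function name is hypothetical, the 30 and 90 day bounds are assumed to be inclusive, and criterion (1), the systematic anti-clockwise rotation, is not tested here because the text does not specify how "systematically" was quantified.

```python
import numpy as np

def meets_amplitude_and_duration(pc1, pc2, min_amp=0.35, mean_amp=0.9,
                                 min_days=30, max_days=90):
    """Check criteria (2)-(4) for one candidate event (one value per day)."""
    # Daily amplitude (PC1^2 + PC2^2)^0.5
    amp = np.hypot(np.asarray(pc1, dtype=float), np.asarray(pc2, dtype=float))
    duration_ok = min_days <= amp.size <= max_days   # assumption: inclusive bounds
    return bool(duration_ok and amp.min() > min_amp and amp.mean() > mean_amp)
```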

Fig. 1

Phase composites of OLR anomalies. Light (dark) shading indicates positive (negative) anomalies. Contour interval is 2.5 W m−2; zero contours omitted

A few issues regarding the identification of MJO events deserve mention. Since this study focuses on the long-term variability of the MJO and there is no OLR data prior to 1977, events are identified here based only on circulation characteristics. Small differences in the number of events (less than 10%) can arise when one uses OLR anomalies as another metric for the MJO. In addition, several previous studies have shown that the first two EOFs represent the bulk of the MJO variability (e.g., Jones and Carvalho 2006; Pohl and Matthews 2007). While the third EOF can sometimes be separated from the fourth and might be important in interannual shifts in the eastward propagation of the MJO (Kessler 2001), the actual initiation of MJO events is well represented by the first two EOFs. This issue was further confirmed by analyzing MJO occurrences during significant ENSO events (not shown). Additional discussions are provided in Jones and Carvalho (2006), Pohl and Matthews (2007) and Jones (2009).

To understand the variability of the MJO in the context of the historical record, Fig. 2a shows the number of events per calendar year. A noticeable feature is a change in behavior in the early-to-mid 1970s. A statistical test at the 5% level indicates a significant change point in 1973 with a mean of ~3.0 events before and 4.7 events after (see Rodionov 2004 for details on the statistical test). Pohl and Matthews (2007) also noticed a change in the amplitude of the MJO in 1974, which is consistent with the analysis presented here. This change in behavior does not coincide with the introduction of satellite data in the NNR (Kistler et al. 2001), but rather is consistent with the major climate shift that occurred in the basic state in the Pacific Ocean in the latter part of the 1970s (Hare and Mantua 2000; Deser et al. 2004). Figure 2b shows 30-year running means of annual SST anomalies in the warm pool region and reveals the long-term warming in the ocean basins (Levitus et al. 2005). Another interesting aspect is that annual mean anomalies are both positive and negative during 1948–1976 and show a systematic warming from the mid 1970s to present. This apparent relationship between the two trends suggests that the activity of the MJO may be sensitive to the positive trend in the warm pool and motivates the research presented in the following sections. Additional motivation for this problem is based on the results of Slingo et al. (1999), who performed general circulation model experiments forced with observed SSTs and partially reproduced the positive trends in tropical intraseasonal amplitudes since the mid 1970s.

Fig. 2

a Number of MJO events per year. Dashed lines denote the mean during 1948–1972 and 1973–2008. b Annual mean sea surface temperature anomalies (°C) in the tropical Indian and Pacific Oceans warm pool. Smoothed solid line denotes 30-year running mean

4 Homogeneous stochastic model of the MJO

The variability of the MJO in the instrumental period was investigated in the context of stochastic simulations. Jones (2009) introduced a homogeneous stochastic model capable of simulating the temporal and spatial variability of the MJO. The model consists of three components which are briefly summarized here.

The temporal variability of the MJO was modeled with a nine-state first-order Markov Chain approach (Wilks 1995). To model the full range of temporal evolution of the MJO, we consider a discrete time series X t in which nine states are possible. State 0 corresponds to days when the MJO is quiescent; states 1–8 when it is active and in one of the eight phases shown in the MJO composites. In this case, 81 conditional probabilities need to be estimated: P {X t+1 = S j | X t  = S i }, where S j  = [0, 8] and S i  = [0, 8] are states (also expressed as P ji , j = [0, 8] and i = [0, 8]). The conditional probabilities therefore determine transitions from situations of non-MJO to MJO (and vice-versa) and transitions through the eight phases that characterize its life cycle (Table 1). The model is called homogeneous because the transition probabilities do not vary in time (see Wilks 1995 and Jones 2009 for additional details).
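As an illustration of how the 81 conditional probabilities can be estimated, the sketch below counts day-to-day transitions in a daily state series and normalizes each row. It is a generic maximum-likelihood estimate for a first-order Markov chain, offered under the assumption that states are coded as integers 0–8; the function name is hypothetical and the indexing convention chosen here is `P[i, j]` = probability of moving from state `i` today to state `j` tomorrow.

```python
import numpy as np

def transition_matrix(states, n_states=9):
    """Estimate P[i, j] = Pr(X_{t+1} = j | X_t = i) from a daily state series."""
    states = np.asarray(states, dtype=int)
    counts = np.zeros((n_states, n_states))
    # Accumulate one count per observed (today, tomorrow) pair
    np.add.at(counts, (states[:-1], states[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that were never visited are left as zeros
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)
```

With this convention, the first row holds the probabilities of leaving the quiescent state, so the entries `P[0, 1]` to `P[0, 4]` correspond to the initiation of events in phases 1–4.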

Table 1 Transition probabilities of homogeneous stochastic model

Figure 3 schematically displays all possible phase transitions in the model. “Primary” events start when there is a transition from phase 0 to phases 1, 2, 3 or 4 (conditional probabilities P 01, P 02, P 03 and P 04). After an MJO starts, it can remain in the same phase (curved arrows) for a few days or propagate eastward or westward. Note that probabilities for eastward propagation are always higher than westward propagation (Table 1). A “successive” MJO begins when an event propagates to phase 8 and continues to phase 1 (conditional probability P 81).

Fig. 3

Schematic diagram showing all possible phase transitions in the stochastic model of the MJO. Phase S = 0 corresponds to inactive MJO. Curved arrows indicate that the system remains in the same phase

Recently, Matthews (2008) investigated the characteristics of primary and successive MJOs and found that 40% of the MJOs are primary events (1974–2005). Despite differences in methodology, this study is in good agreement, since the percentage of observed primary events found was 63% in 1948–2008 and 36% in 1974–2008.

Once the temporal evolution of the MJO is simulated, the second component of the model determines the spatial structure. Given the time series of simulated phase transitions (X t−simul), the spatial structures of OLR, U200 and U850 anomalies (or any other field) from observed phase composites are assigned to each phase S of X t−simul, where S ∈ [1, 8]. The third component of the model assigns amplitudes to the events with an amplitude factor that follows a Gaussian distribution. This implies that the amplitudes of the simulated events can be weaker (stronger) than the “canonical” MJO (Fig. 1). Examples of MJO simulations are shown in Jones (2009).

In summary, events generated by the stochastic model occur irregularly in time and can appear as isolated events or successive MJOs. The oscillations in the model can have different zonal propagations, that is, the MJO can start in any of the phases 1–4 (i.e., Indian Ocean to Indonesia), propagate eastward and end in any of the phases 4–8. The zonal scale of simulated events is consistent with observations, since the spatial structure is assigned from observed composites. Moreover, MJOs in the model have different durations lasting between 30 and 90 days and each event can be stronger or weaker than the mean composite according to a normal distribution. Since the present study focuses on the variability of the MJO in the instrumental period, only the temporal component of the model is relevant here.
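The temporal component can be illustrated with a minimal simulation of a first-order Markov chain: each day, the next state is drawn from the row of the transition matrix that corresponds to the current state. The function name is hypothetical, and the matrix used in the usage note is a toy three-state example, not the values of Table 1.

```python
import numpy as np

def simulate_states(P, n_days, start=0, seed=None):
    """Draw a daily state sequence from transition matrix P (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    states = np.empty(n_days, dtype=int)
    states[0] = start
    for t in range(1, n_days):
        # Tomorrow's state depends only on today's state
        states[t] = rng.choice(P.shape[0], p=P[states[t - 1]])
    return states
```

For example, with the toy matrix `[[0, 1, 0], [0, 0, 1], [1, 0, 0]]` the chain cycles deterministically through states 0, 1, 2; with a nine-state matrix estimated from observations, the same loop produces irregular sequences of quiescent days and MJO phases.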

5 Non-homogeneous stochastic model of the MJO

5.1 MJO activity and variability in the warm pool

The annual number of events shown in Fig. 2a clearly indicates that the occurrence of the MJO is a non-stationary process. In this section, we discuss modifications made to the temporal component of the stochastic model to incorporate this non-stationarity. We begin by examining how transition probabilities of MJO initiation vary over time. We first define a constant parameter given by P HMG = P 01 + P 02 + P 03 + P 04, that is, the sum of conditional probabilities of primary events in the homogeneous model (Table 1, first row, P HMG = 0.015649). Next, we define P NHM = P 01 + P 02 + P 03 + P 04, which is the same sum but with the conditional probabilities of primary events estimated in 3-year moving windows during 1948–2008. The choice of a 3-year window was tested against alternative sizes (1, 2, …, 15 and 20 years, among others). As the size of the window increases, P NHM converges to P HMG. On the other hand, if the size of the window is too small (e.g., 1 year), the results become sensitive to the non-stationary nature of the MJO. The 3-year size was deemed optimal and, as shown next, this window size reproduces some important multi-year variations in the activity of the MJO.
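The construction of P NHM/P HMG can be sketched as follows, assuming a daily series of states coded 0–8. Because P 01 + P 02 + P 03 + P 04 is simply the probability that a quiescent day is followed by one of phases 1–4, it can be estimated as a relative frequency, once over the whole record (P HMG) and once per moving window (P NHM). Function names are hypothetical, a year is approximated as 365 days, and each value is assigned to the first day of its window.

```python
import numpy as np

def primary_initiation_prob(states):
    """Estimate P01 + P02 + P03 + P04 from a daily state series."""
    states = np.asarray(states, dtype=int)
    current, nxt = states[:-1], states[1:]
    from_quiet = current == 0
    if not from_quiet.any():
        return np.nan                      # no quiescent days in this sample
    # Fraction of quiescent days followed by phase 1, 2, 3 or 4
    return np.mean(np.isin(nxt[from_quiet], [1, 2, 3, 4]))

def initiation_ratio(states, window_days=3 * 365, step_days=1):
    """Ratio of windowed (P_NHM) to full-record (P_HMG) initiation probability."""
    p_hmg = primary_initiation_prob(states)
    starts = range(0, len(states) - window_days + 1, step_days)
    p_nhm = np.array([primary_initiation_prob(states[s:s + window_days])
                      for s in starts])
    return p_nhm / p_hmg
```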

The ratio P NHM/P HMG is displayed in Fig. 4 (top) and demonstrates the non-stationarity of the MJO. When the ratio is above (below) 1, the conditional probability of primary MJOs is above (below) the expected value from a stationary process simulated with the homogeneous stochastic model (see also Fig. 17 in Jones 2009). P NHM/P HMG has a positive trend from ~0.2 (1948) to ~1.7 (2006). Note, however, that the initiation of primary events exhibits multi-year periods when P NHM/P HMG was significantly larger (less) than 1. For instance, during 1972–1978 and 1999–2003, P NHM reached values up to two and three times larger than P HMG. Periods of high activity lasting ~3–5 years are also noticeable in the early part of the record, and are more clearly visible after removing the long-term trend (as shown next).

Fig. 4

Top: time series of P NHM/P HMG, which indicates changes in the initiation of primary MJO events relative to the homogeneous stochastic model (see text for additional details). Smooth curve was obtained with a 365-day moving average. Middle: 3-year running means of SST anomalies. Bottom: 3-year running variances of SST anomalies. SST anomalies are computed in the Indian and Pacific Oceans warm pool region. Values are assigned to the first day in the 3-year running windows

As discussed in the introduction, this paper explores relationships between interannual-to-decadal variations in SST and activity of the MJO. For this purpose, Fig. 4 also shows 3-year running means (middle) and 3-year running variances (bottom) of SST anomalies in the warm pool region. In addition to the obvious long-term trend, the running means show high and low periods of SST anomalies lasting roughly 3–7 years. Note, in addition, that the warm pool domain does not encompass the Niño3.4 region and therefore El Niño/Southern Oscillation (ENSO) effects in the running means and variances are not prominent (i.e., opposite SST anomalies in the Indian Ocean and western Pacific associated with ENSO may cancel each other out). Another important feature is that, although the 3-year running variance does not reveal a significant trend, it is highly non-stationary and shows several periods of high variance lasting between ~5 and 10 years. The reasons for selecting this domain are threefold. First, this is the region where the MJO initiates and its convective signal is strongest (Madden and Julian 1994). Second, some studies indicated weak to no contemporaneous correlation between ENSO and MJO (Hendon et al. 1999; Slingo et al. 1999). Third, this study specifically investigates the hypothesis that low-frequency variations and trends in the warm pool interact with the activity of the MJO.

The next step consisted of exploring relationships between 3-year running means (M SSTA) and variances (V SSTA) of SST anomalies and P NHM/P HMG. The time series of P NHM/P HMG, M SSTA and V SSTA were first detrended by removing the respective 30-year running means (angle brackets are used to differentiate detrended series from raw series). The reason for removing 30-year running means instead of linear trends is explained later in this section.
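Running means and variances over a fixed window can be computed compactly with a sliding-window view. The sketch below is a generic illustration: the function name is hypothetical, the population variance (divisor equal to the window length) is an assumption, and element `k` of each output corresponds to the window that starts at sample `k`, in line with assigning values to the first day of the window. With monthly SST anomalies, a 3-year window would be 36 samples.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def running_mean_var(x, window):
    """Running mean and variance; output k covers samples k .. k+window-1."""
    w = sliding_window_view(np.asarray(x, dtype=float), window)
    # Assumption: population variance (ddof=0) within each window
    return w.mean(axis=1), w.var(axis=1)
```

The same function applied with a 30-year window gives the slowly varying component that is subtracted to detrend each series; how the ends of the record are handled in that step is not specified in the text and is left open here.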

Figure 5 shows lag correlations between detrended P NHM/P HMG and M SSTA (solid) and indicates that M SSTA leads P NHM/P HMG by 1.7 years (624 days). The calculation was performed using the entire data record (1948–2008). We note that the correlation is negative (~−0.5), suggesting that cold (warm) conditions in the warm pool region lead to increases (decreases) in MJO activity 1.7 years later. In contrast, lag correlations between 〈P NHM/P HMG〉 and V SSTA (dot-dash) show that large variability in SST anomalies leads high MJO activity by ~3.3 years (1,200 days).
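A lag correlation of this kind pairs the predictor at time t − lag with the predictand at time t and scans over lags. The following sketch assumes two equally sampled series of the same length and uses a hypothetical function name; positive lags mean that `x` leads `y`.

```python
import numpy as np

def lag_correlation(x, y, max_lag):
    """Correlation between x(t - lag) and y(t) for lag = 0 .. max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Drop the last `lag` values of x and the first `lag` values of y
        out[lag] = np.corrcoef(x[:x.size - lag], y[lag:])[0, 1]
    return out
```

The lag of the most negative (or most positive) value, for example `np.argmin(lag_correlation(x, y, max_lag))`, then gives the lead time in sampling units.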

Fig. 5

Lag correlations between detrended time series of P NHM/P HMG and 3-year running means (solid) and P NHM/P HMG and 3-year running variances (dot-dash) of SST anomalies in the warm pool region. Correlations larger (less) than ±0.22 are significant at 5% level. Period: 1948–2008

The significance level of the correlations was estimated computing the time between effectively independent samples following the method of Trenberth (1984):

$$ T_{0} = 1 + 2\sum\limits_{L = 1}^{N} {\left( {1 - {\frac{L}{N}}} \right)R_{L} } $$
(1)
$$ N_{\text{eff}} = N{\frac{\Delta t}{{T_{0} }}} $$
(2)

where R L is the auto-correlation at lag L, N is the total number of data points, N eff is the effective number of degrees of freedom and Δt is the sampling interval (1 day). As a conservative approach, we used the lowest number of degrees of freedom among the time series of 〈P NHM/P HMG〉, M SSTA and V SSTA, which indicated that correlations greater (less) than +0.22 (−0.22) are significant at the 5% level (~80 degrees of freedom in ~61 years).
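Equations (1) and (2) can be evaluated directly once the autocorrelation function is available. The sketch below is a plain transcription with hypothetical function names; in practice the sum is often truncated at a maximum lag because sample autocorrelations at long lags are noisy, which is why the autocorrelations are passed in as an array rather than always summed to N.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation R_L for L = 1 .. max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:x.size - L], x[L:]) / denom
                     for L in range(1, max_lag + 1)])

def effective_sample_size(r, n, dt=1.0):
    """Return (T0, N_eff) from Eqs. (1)-(2); r[L-1] is R_L for L = 1 .. len(r)."""
    r = np.asarray(r, dtype=float)
    lags = np.arange(1, r.size + 1)
    t0 = 1.0 + 2.0 * np.sum((1.0 - lags / n) * r)   # Eq. (1)
    return t0, n * dt / t0                          # Eq. (2)
```

As a sanity check, for an idealized autocorrelation R L = 0.5^L and N = 1000, the sum gives T 0 ≈ 3.0 days, so roughly one in three daily samples is effectively independent.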

Based on the above relationships, a multi-linear regression model for the detrended 〈P NHM/P HMG〉 was constructed as:

$$ \left\langle {{\frac{{P_{\text{NHM}} }}{{P_{\text{HMG}} }}}} \right\rangle (t) = \sum\limits_{j = 0}^{4} {a_{j} \times M_{\text{SST}} (t - j \times \tau_{j} )} + \sum\limits_{j = 0}^{4} {b_{j} \times V_{\text{SST}} (t - j \times \gamma_{j} )} $$
(3)

where a j and b j are regression coefficients, M SST are detrended 3-year running means at lags τ j (0, 156, 312, 468 and 624 days), and V SST are detrended 3-year running variances at lags γ j (0, 300, 600, 900 and 1,200 days). Stepwise regression tests showed that the model with both predictors provides the best fit to 〈P NHM/P HMG〉 variability.
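A regression of this form can be fitted by ordinary least squares on a design matrix whose columns are lagged copies of the two predictors. The sketch below is one possible implementation, not the authors' code: function names are hypothetical, the lags are taken to be the listed values (for daily data, `mean_lags = [0, 156, 312, 468, 624]` and `var_lags = [0, 300, 600, 900, 1200]`), all series are assumed to share the same time axis, and no intercept is included because the series are detrended.

```python
import numpy as np

def lagged_design(m_sst, v_sst, mean_lags, var_lags):
    """Stack lagged predictors column-wise; rows start at t = max lag."""
    max_lag = max(max(mean_lags), max(var_lags))
    n = len(m_sst) - max_lag
    # Column for lag k holds x(t - k) for t = max_lag .. len(x) - 1
    cols = [m_sst[max_lag - k: max_lag - k + n] for k in mean_lags]
    cols += [v_sst[max_lag - k: max_lag - k + n] for k in var_lags]
    return np.column_stack(cols), max_lag

def fit_lagged_regression(ratio, m_sst, v_sst, mean_lags, var_lags):
    """Least-squares coefficients (a_j then b_j) and the fitted series."""
    X, max_lag = lagged_design(np.asarray(m_sst, dtype=float),
                               np.asarray(v_sst, dtype=float),
                               mean_lags, var_lags)
    y = np.asarray(ratio, dtype=float)[max_lag:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef
```

The first `max(lags)` samples of the predictand are dropped because the lagged predictors are not defined there.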

Figure 6 shows the empirical fit of 〈P NHM/P HMG〉 using this multi-linear regression model, which accounts for 50% of the total variance. Since the plot shows detrended time series, it is clear now that initiation of primary MJOs exhibited periods of high (low) activity lasting ~3–7 years even in the early part of the record. The empirical fit reproduces most of these low-frequency variations in P NHM/P HMG, especially after the early 1970s; variations in P NHM/P HMG before then are within ±0.3.

Fig. 6

Detrended time series of observed P NHM/P HMG (solid curve with circles) and multi-linear regression fit (thick solid curve). Explained variance: 50%. Period: 1952–2006

5.2 Sensitivity analysis

The empirical result shown above brings an important new insight relating multi-annual changes in the behavior of the MJO and low-frequency SST variability in the warm pool. Since there was no a priori hypothesis for this finding, we discuss now several a posteriori sensitivity tests to quantify the robustness of the relationships between 〈P NHM/P HMG〉, M SST and V SST.

We first tested the sensitivity of the lag correlations shown in Fig. 5 by recomputing the correlations but in 25-year moving windows instead of using the entire data record. It was found that the results can be sensitive for window sizes less than 20 years. Figure 7 summarizes the three main parameters: most negative and positive correlations between P NHM/P HMG and 3-year running means and variances of SST anomalies (top) as well as time lags between the corresponding time series (middle). The bottom panel shows the explained variance of the regression fit in the dependent 25-year samples. As before, correlations greater/less than ±0.22 are significant at 5% level. In general, the correlations and time lags do not deviate significantly from the values obtained using the entire record, although some variations are noted in the early and later part of the observations. In contrast, one notes that the explained variance can be quite high when fitting the regression model in 25-year windows, especially after the mid 1980s.

Fig. 7

Temporal variations in regressions between P NHM/P HMG and 3-year running means and variances of SST anomalies. Regressions are computed in 25-year moving windows. Top: solid (dot-dash) curve shows most negative (positive) correlation between P NHM/P HMG and 3-year running means (variances) of SST anomalies. Middle: solid (dot-dash) curve indicates time lag between P NHM/P HMG and 3-year running means (variances) of SST anomalies. Bottom: solid curve shows the explained variance in the P NHM/P HMG regression. Numbers in the abscissa indicate the mid-year of the 25-year moving windows

To evaluate the generality of the relationships between 〈P NHM/P HMG〉, M SST and V SST, regression models developed in 25-year moving windows were validated on the remaining independent data. To illustrate the process, Fig. 8 shows three examples. Training in the early part of the record (top) reproduces only some of the variations in 〈P NHM/P HMG〉 in the 1970s and early 1980s, and that particular model clearly underestimates other large changes in 〈P NHM/P HMG〉. This difficulty arises because variations in 〈P NHM/P HMG〉 were actually small in the early record, so the fit does not reproduce changes in other periods. Interestingly, the regression fitted during the mid 1960s to early 1990s (middle) is able to reproduce large positive and negative changes in 〈P NHM/P HMG〉 in later years. Likewise, the regression fitted from the early 1980s to the end of the record (bottom) captures well the changes in 〈P NHM/P HMG〉 from the late 1960s to early 1980s.

Fig. 8

Examples of empirical reconstruction of 〈P NHM/P HMG〉 obtained with the regression model which uses lagged 3-year running means and variances of SST anomalies as predictors. Training is performed on 25 years of data and validation on the remaining data. Observed 〈P NHM/P HMG〉 is shown in solid curve with circles

To fully quantify the accuracy of the relationships between 〈P NHM/P HMG〉, M SST and V SST, the process described above was repeated for each 25-year moving window during 1948–2008 (in steps of 6 months) and the fitted regressions were applied to the remaining independent data. Except for ~1965–1969, the explained variance in 〈P NHM/P HMG〉 (Fig. 9) ranges from about 20% in the early part of the record to as much as 55% after 1970. This result indicates that the relationships between 〈P NHM/P HMG〉, M SST and V SST are relatively stable over many years except during the mid 1950s to late 1960s. Although we cannot rule out a potential issue with the MJO representation in the NNR before the introduction of satellite data, it is nevertheless interesting that small variations in 〈P NHM/P HMG〉 during the early part of the record were coincident with small variations in 3-year running means and variances of SST anomalies relative to other periods (see also Figs. 4, 5).

Fig. 9

Temporal variation of explained variance in P NHM/P HMG obtained with the regression model which uses lagged 3-year running means and variances of SST anomalies as predictors. Training is performed on 25 years of data and validation on the remaining data. Validation is performed in 25-year moving windows of independent data. Numbers in the abscissa indicate the mid-year of the moving windows

5.3 Error analysis

A detailed error analysis associated with the representation of P NHM/P HMG was carried out. This was divided into two components. First, the previous analysis showed that it is possible to reproduce low-frequency changes in 〈P NHM/P HMG〉 using M SST and V SST as predictors. While the correlation between observed and fitted 〈P NHM/P HMG〉 can be high, differences in amplitudes may also be large (Fig. 6). A calculation similar to Fig. 9 was performed to estimate the root-mean square (rms) error between observed and predicted 〈P NHM/P HMG〉 in 25-year moving windows (Fig. 10). The rms error ranges from very small values to a maximum of 0.65. The frequency distribution (Fig. 11) indicates that ~88% of the rms error values lie between 0 and 0.2. This is encouraging and shows that the regression model fitted during one time period can be applied to simulate 〈P NHM/P HMG〉 in another period with errors that are not too large. The standard deviation of the rms error is 0.13. Additional discussions about the impact of random errors in the regression model are provided later in this section and in Sect. 6.
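For reference, the two summary quantities used in this paragraph, the rms error of a prediction and the share of rms values falling within a given range, amount to the following. Function names are hypothetical and the inclusive bounds are an assumption.

```python
import numpy as np

def rms_error(observed, predicted):
    """Root-mean square difference between two equally long series."""
    d = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def fraction_within(values, low=0.0, high=0.2):
    """Fraction of values in [low, high] (assumption: inclusive bounds)."""
    values = np.asarray(values, dtype=float)
    return float(np.mean((values >= low) & (values <= high)))
```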

Fig. 10

Temporal variation of root-mean square (rms) error in regression model between P NHM/P HMG (predictand) and lagged 3-year running means and variances of SST anomalies (predictors). Error is estimated in 25-year moving windows of independent data. Numbers in the abscissa indicate the mid-year of the moving windows. Mean, max and standard deviation values are indicated at the bottom

Fig. 11

Frequency distribution of root-mean square (rms) error in regression model between P NHM/P HMG (predictand) and lagged 3-year running means and variances of SST anomalies (predictors)

The second potential source of error is associated with uncertainties in SST observations in the ocean basins. As an estimate of this uncertainty, we considered the standard deviation among the four SST data sets used in this study (Sect. 2). Figure 12 shows the spread in the monthly SST anomalies in the warm pool and indicates large values before the 1950s, especially from the 1880s to the mid 1940s. This measure of uncertainty in SST anomalies was used to create upper/lower bounds in the estimation of P NHM/P HMG as described next.

Fig. 12

Uncertainty level in sea surface temperature monthly anomalies from four different data sets during 1880–2008

5.4 A model for the non-stationary behavior of the MJO

We finally discuss the implementation of a non-stationary component in the stochastic model of the MJO. Two aspects were considered: a long-term trend in the probability of initiation of primary events (Fig. 4 top) and low-frequency changes in the probability ratio 〈P NHM/P HMG〉 (Fig. 6). They are further discussed as follows.

As a predictor for the long-term changes in P NHM/P HMG (Fig. 4 top), we considered the trend in SST anomalies in the warm pool (Fig. 4 middle). The correlation between 30-year running means of P NHM/P HMG and 30-year running means of SST anomalies is 0.962. To estimate the statistical significance, we generated 1,000 pairs of synthetic time series of P NHM/P HMG and SST anomalies using first order auto-regressive processes and computed the correlations between the 30-year running means of these time series (see Carvalho et al. 2007 for details on similar analysis). The observed correlation was compared with the frequency distribution of the correlations from synthetic time series and determined to be statistically significant at 2% level (statistics were stable after 100 realizations of synthetic time series).
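A Monte Carlo test of this kind can be sketched as follows: generate pairs of independent first-order autoregressive series, smooth each with the same running mean, correlate them, and compare the observed correlation with the resulting null distribution. This is a generic illustration under stated assumptions: function names are hypothetical, the lag-1 coefficients `phi_x` and `phi_y` would in practice be estimated from the observed series, unit-variance Gaussian noise is used, and the p-value is taken as two-sided.

```python
import numpy as np

def ar1_series(n, phi, rng):
    """First-order autoregressive series driven by unit-variance noise."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

def running_mean(x, window):
    """Boxcar running mean over full windows only."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def mc_pvalue(r_obs, n, window, phi_x, phi_y, n_sim=1000, seed=0):
    """Share of synthetic correlations at least as large in magnitude as r_obs."""
    rng = np.random.default_rng(seed)
    r_sim = np.empty(n_sim)
    for i in range(n_sim):
        a = running_mean(ar1_series(n, phi_x, rng), window)
        b = running_mean(ar1_series(n, phi_y, rng), window)
        r_sim[i] = np.corrcoef(a, b)[0, 1]
    return float(np.mean(np.abs(r_sim) >= abs(r_obs)))
```

Heavy smoothing leaves few independent samples, so large correlations between unrelated series are common; this is why the null distribution is built from smoothed synthetic series rather than from a standard table.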

Based on these results, a 30-year running mean trend was added to the multi-linear regression model (3) and expressed as:

$$ {\frac{{P_{\text{NHM}} }}{{P_{\text{HMG}} }}}(t) = 1.25 + \sum\limits_{j = 0}^{4} {a_{j} \times M_{\text{SST}} (t - j \times \tau_{j} )} + \sum\limits_{j = 0}^{4} {b_{j} \times V_{\text{SST}} (t - j \times \gamma_{j} )} + 0.48\times {Tr_{30}} (t) $$
(4)

where Tr30 is a non-dimensional 30-year running mean of SST anomalies, which accounts for the long-term warming in the warm pool region. Here we note that if a linear trend were used, P NHM/P HMG could become negative for very low values of SST anomalies. The empirical constants (1.25 and 0.48) were derived to keep the bias in the model to a minimum, as shown in the next section. Moreover, given the uncertainty in the observed SST anomalies shown previously, upper/lower bounds for P NHM/P HMG were constructed by expressing SST anomalies as SSTA ± δ, where δ is the temporal spread in the four data sets of SST (Fig. 12). When accounting for these uncertainties in SST anomalies, the mean range between upper and lower bounds is ~3% of 〈P NHM/P HMG〉 (maximum ~16%) during 1880–1947 and 2% (maximum ~16%) in 1948–2008.

As discussed in Sect. 4 and in more detail in Matthews (2008) and Jones (2009), MJO events can also initiate as successive events, in which one event starts right after the previous one ends. Transitions from phase-8 to phase-1 are represented by conditional probabilities P 81. To account for the non-stationarity in P 81, a time series was created as P81NHM/P81HMG, where P81HMG is the conditional probability in the homogeneous model (Table 1) and P81NHM is estimated in 3-year moving windows. The time series P81NHM/P81HMG and P NHM are correlated (0.51) and therefore P81NHM is empirically derived as:

$$ \frac{P81_{\text{nhm}}}{P81_{\text{hmg}}}(t) = 0.4 + 0.56 \times \frac{P_{\text{nhm}}}{P_{\text{hmg}}}(t) $$
(5)

Likewise, the 0.4 constant was obtained to minimize the overall bias of the stochastic model. Upper/lower bounds in P81NHM/P81HMG were constructed based on the corresponding values for P NHM/P HMG.

Models (4) and (5) were used to construct conditional probabilities of primary and successive MJOs using observed SST anomalies and uncertainties during 1880–2008 (Fig. 13). In the final version of the non-homogeneous stochastic model, the multi-linear regression model (3) was fitted to the entire data record (1948–2008). The results show that P NHM/P HMG was less than 1 for most of the late 1800s and the beginning of the 20th century, suggesting low MJO activity. Progressive increases to high probability of MJO activity are seen in the early 1930s and persist until the beginning of the reanalysis period. It is interesting to note that low-frequency changes in P NHM/P HMG are seen throughout the period. The early 1970s were marked by a substantial increase in the likelihood of MJO occurrences, which persisted until the early 1980s. Subsequently, another large increase is noted in the late 1980s. The upper/lower bounds in the probability ratio of primary MJO initiation (Fig. 13 top) vary from a minimum of almost 0 to a maximum of 3. It is also worth noting that the probability ratio of successive MJO initiation became systematically greater than one approximately after the changepoint in 1973. This suggests a possible remarkable modification in the behavior of the MJO, in which high MJO activity is seen as a high frequency of “packets” of MJO events rather than large numbers of isolated events. The implication of this aspect for tropical ocean dynamics is an issue that deserves additional investigation.

Fig. 13

Probability ratios of MJO initiation: P NHM/P HMG (top) and P81NHM/P81HMG (bottom). Shaded orange regions indicate upper/lower confidence levels given uncertainties in SST anomalies (see text for additional details). Vertical dotted line in 1948 separates pre-reanalysis and reanalysis eras. Period: 1880–2008

The probability ratios shown in Fig. 13 were used in the stochastic simulations of the MJO. The standard deviation of the rms error (σ = 0.13) shown in Figs. 10 and 11 was used to represent random errors in the regression model (3) that captures low-frequency changes in the MJO. This procedure was implemented as follows. For any given day during 1880–2008, the probability ratio of primary MJO initiation is represented as P NHM/P HMG ± ε, where ε has a Gaussian distribution with zero mean and standard deviation σ. Note that P NHM/P HMG ± ε is bounded by the upper/lower limits shown in Fig. 13 top. This is necessary because a randomly drawn ε could be very large just by chance, which would generate unrealistic values of P NHM/P HMG. Once P NHM/P HMG is determined, model (5) specifies P81NHM/P81HMG and the temporal component of the stochastic model is fully implemented.
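The random-error step can be illustrated with a short sketch. It assumes that "bounded" means the perturbed ratio is clipped to the upper/lower limits (the text does not say whether values are clipped or redrawn), and the function names are hypothetical.

```python
import random

SIGMA = 0.13  # standard deviation of the regression rms error quoted in the text


def perturbed_ratio(ratio, lower, upper, rng, sigma=SIGMA):
    """Add a zero-mean Gaussian error to P_NHM/P_HMG and clip the result
    to [lower, upper] (clipping is an assumption of this sketch)."""
    eps = rng.gauss(0.0, sigma)
    return min(max(ratio + eps, lower), upper)


def successive_ratio(primary_ratio):
    """Model (5): P81_NHM/P81_HMG as a linear function of P_NHM/P_HMG."""
    return 0.4 + 0.56 * primary_ratio


rng = random.Random(42)
# Hypothetical values for one day: ratio 1.2 with bounds [1.0, 1.4]
primary = perturbed_ratio(1.2, 1.0, 1.4, rng)
successive = successive_ratio(primary)
```

In an actual simulation the ratio and its bounds would be taken from the time series in Fig. 13 for the day in question rather than from fixed placeholder numbers.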

Lastly, it is important to mention that the model described above is intended to simulate the non-stationarity of the MJO. In this context, we used the entire record of available MJO observations (1948–2008) to construct an empirical model relating probabilities of MJO initiation to low-frequency and long-term changes in SST in the warm pool. The construction of probabilities of MJO activity before 1948 is based on the statistical relationships derived during 1948–2008. Although the model presented here takes into account uncertainties in SST observations and errors in the multi-linear regression model (3), the stochastic simulations of MJO activity in the pre-reanalysis era should be interpreted with caution.

6 Variability of the MJO during 1880–2008

The non-homogeneous stochastic model of the MJO was used to perform simulations during 1880–2008. Here, monthly SST anomalies in the warm pool region drove changes in conditional probabilities of primary and successive MJOs as described in the previous section. To illustrate the process, the model was initialized on 1 January 1880 with the observed SST anomaly in the warm pool and model (4) determined P NHM/P HMG. A random error was generated and added to P NHM/P HMG. We recall that the random error is bounded by the upper/lower limits in P NHM/P HMG (Fig. 13 top) and therefore represents the uncertainty in the SST anomalies and the error in the regression model (3). Next, since P HMG is constant, P NHM is known. The increase (decrease) in P NHM was then used to change P 01, P 02, P 03 and P 04 values in Table 1. For instance, if P NHM/P HMG is 1.2, the sum of P 01, P 02, P 03 and P 04 increases by 20% (the increase in each individual term is done maintaining the same relative ratios) and P 00 decreases by 20%. Likewise, the corresponding change in P 81 in this case is 1.072 (model 5). Thus, P 81 increases by 7.2% and P 80 decreases by the same amount. All the remaining probabilities in Table 1 remain unaltered so that we can specifically investigate changes in MJO initiation associated with SST variability in the warm pool. Note that when P NHM/P HMG is equal to 1, the MJO in the non-homogeneous model behaves as a stationary process. Other details about the homogeneous model are discussed in Jones (2009).
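The adjustment of the transition probabilities can be sketched as below. The numerical probabilities are placeholders, not the values in Table 1, and the sketch assumes that P 00 and P 80 absorb the change so that each row of conditional probabilities still sums to one; this is one possible reading of the procedure rather than a confirmed detail.

```python
def adjust_initiation_row(row, ratio):
    """row = [P00, P01, P02, P03, P04]. Scale P01..P04 by `ratio`,
    preserving their relative proportions, and set P00 to the remainder
    so the row still sums to 1 (an assumption of this sketch)."""
    scaled = [p * ratio for p in row[1:]]
    p00 = 1.0 - sum(scaled)
    if p00 < 0.0:
        raise ValueError("ratio too large: initiation probabilities exceed 1")
    return [p00] + scaled


def adjust_successive(p80, p81, ratio81):
    """Scale P81 by `ratio81` and shift P80 by the same absolute amount."""
    new_p81 = p81 * ratio81
    return p80 - (new_p81 - p81), new_p81


# Placeholder probabilities for illustration only (not from Table 1)
row0 = [0.98, 0.008, 0.006, 0.004, 0.002]
ratio = 1.2                                   # P_NHM / P_HMG
ratio81 = 0.4 + 0.56 * ratio                  # model (5), equals 1.072
new_row0 = adjust_initiation_row(row0, ratio)
new_p80, new_p81 = adjust_successive(0.7, 0.3, ratio81)
```

With `ratio = 1.0` the initiation row is returned unchanged, which corresponds to the stationary behavior noted in the text; P 81 then scales by 0.96 according to model (5).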

An ensemble of 1,000 members was constructed such that each member runs for 129 years at daily resolution. Each simulation resulted in a time series of phases and represents situations of active MJO (phases 1–8) and quiescent periods (phases equal to 0) (see Jones 2009 for examples). Next, we computed the number of MJO events per calendar year during 1880–2008; thus, each year has a frequency distribution of 1,000 data points.
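Counting events in a simulated phase series might look like the following sketch. It assumes that a primary event begins when the phase leaves 0, that a successive event begins on a phase-8 to phase-1 transition, and that each event is assigned to the calendar year of its first day; the last convention and the function name are assumptions, since the text does not state them.

```python
def count_events_per_year(phases, years):
    """Count MJO event starts per calendar year.

    phases: daily phase values (0 = quiescent, 1-8 = active MJO)
    years:  calendar year of each day, same length as `phases`
    """
    counts = {}
    prev = 0
    for phase, year in zip(phases, years):
        counts.setdefault(year, 0)
        primary = prev == 0 and phase != 0      # initiation from quiescence
        successive = prev == 8 and phase == 1   # new event right after the last
        if primary or successive:
            counts[year] += 1
        prev = phase
    return counts


# Toy series: one primary event, one successive event, then another primary
toy = [0] + list(range(1, 9)) + list(range(1, 9)) + [0, 0, 3, 4, 5, 6, 7, 8, 0]
toy_counts = count_events_per_year(toy, [1880] * len(toy))
```

Applying this function to each of the 1,000 members gives, for every year, the 1,000 event counts that form the frequency distribution mentioned above.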

To summarize the results, Fig. 14 displays the ensemble mean number of events per year (black line), inter-quartile (red shading) and 5th–95th percentile spreads (orange shading). The simulations can be compared with the observed number of events (blue curve with circles). Moreover, the horizontal dashed lines represent the inter-quartile (3–5 events year−1) and ensemble mean (~3.6 events year−1) obtained with similar simulations performed with the homogeneous stochastic model and provide expected values if the MJO were a stationary process. Some important aspects are worth noting. In the pre-reanalysis era, the ensemble mean and spread suggest that the MJO was quite inactive during 1895–1924. In fact, the lower boundary in the spread of the simulations dropped to 0 events per year. From 1925 to 1947, the ensemble mean exhibited little change, indicating that the MJO was near normal activity, that is, the MJO in the non-homogeneous model behaved similarly to a stationary process. In the reanalysis era, the ensemble mean follows the range of observed events, except during 1948–1955 when the number of observed events was very low. Note that no events were observed in 1949. Whether or not this reflects an issue with the reanalysis data in the pre-satellite period is a topic that needs to be further investigated. Nevertheless, the ensemble mean tracks the range of observed MJOs and shows that the model is capable of simulating the non-stationarity of the oscillation. This is also evidenced by the increase in activity in the ensemble mean in the mid 1970s. Here we point out that random errors in P NHM/P HMG associated with errors in the regression model (3) largely cancel out when one computes ensemble means from the simulations. Moreover, it is interesting to notice that some observed extremely active and inactive years (e.g. 1 or 7 events per year) are outside the range of the 5th–95th percentile spread.
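The ensemble statistics plotted for each year can be computed along the lines of the sketch below. The interpolation method used for the percentiles is an assumption (the text does not specify one), and the function name is hypothetical.

```python
from statistics import mean, quantiles


def summarize_year(counts):
    """Ensemble mean, quartiles and 5th/95th percentiles of the number
    of events in one year across ensemble members."""
    # quantiles(..., n=100) returns 99 cut points; index i is percentile i + 1
    pct = quantiles(counts, n=100, method="inclusive")
    return {
        "mean": mean(counts),
        "p05": pct[4],
        "q25": pct[24],
        "q75": pct[74],
        "p95": pct[94],
    }


# Illustration with a synthetic "ensemble" of 101 evenly spaced values
stats = summarize_year(list(range(101)))
```

For the simulations described here, `counts` would hold the 1,000 event counts of a single calendar year, and the function would be applied once per year from 1880 to 2008.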

Fig. 14

MJO activity during 1880–2008. Black solid curve shows ensemble mean number of events year−1 obtained with non-homogeneous stochastic model simulations (1,000 members). Red and orange shading indicate inter-quartile and 5th–95th percentile spreads. Observed number of MJO events year−1 is shown by blue curve with circles. Vertical dotted line indicates the beginning of reanalysis data. Horizontal lines indicate inter-quartile (3–5 events year−1) and ensemble mean number of events year−1 (3.6 events year−1) obtained with a homogeneous stochastic model

The activity of the MJO in observations and stochastic model simulations can be further seen in Fig. 15, which shows the number of events in 3- and 5-year windows. This approach is useful to characterize variations in the MJO on low-frequency time scales and long-term trends. As before, the horizontal lines provide a background of MJO activity if the oscillation behaved as a stationary process. We observe that the ensemble means track the observed number of MJOs very well. The correlation between detrended observed 3-year number of events and 3-year ensemble mean (Fig. 15 top) is 0.54 (correlations greater/less than ± 0.22 are significant at 5% level). Likewise, the correlation between detrended observed 5-year number of events and 5-year ensemble mean (Fig. 15 bottom) is 0.68 (correlations greater/less than ± 0.66 are significant at 5% level). It is also interesting that the model captures the regime shift in MJO activity in the mid 1970s. The agreement between observations and stochastic simulations supports the hypothesis that we can formulate the non-stationarity of the MJO in terms of low-frequency changes and long-term trends in SST in the warm pool region. The simulations then suggest that, before the reanalysis period, the activity of the MJO was near normal levels in 1880–1896 and 1928–1947 and significantly lower during 1897–1927.
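The comparison between observed and simulated moving-window counts can be sketched as follows. The sketch assumes that detrending means removing a least-squares straight line and that windows are summed over consecutive years; it does not reproduce the significance thresholds quoted above, and all names are hypothetical.

```python
def window_sums(x, width):
    """Number of events in moving windows of `width` consecutive years."""
    return [sum(x[i:i + width]) for i in range(len(x) - width + 1)]


def detrend(y):
    """Remove the least-squares linear trend from a sequence."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def detrended_window_correlation(observed, ensemble_mean, width):
    """Correlation of detrended moving-window counts (e.g. width 3 or 5)."""
    obs = detrend(window_sums(observed, width))
    sim = detrend(window_sums(ensemble_mean, width))
    return pearson(obs, sim)
```

Here `observed` and `ensemble_mean` would be annual series of event counts over the reanalysis period, and `width` would be 3 or 5 to match the two panels of Fig. 15.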

Fig. 15

Similar to Fig. 14, but shows the activity of the MJO in number of events in 3-year (top) and 5-year (bottom) moving windows

To summarize changes in MJO activity during the instrumental period, we performed a statistical test to detect change points in the ensemble mean number of events per year (Fig. 14 black curve). The analysis was performed separately for the periods before and after 1948 (the beginning of reanalysis) and the statistical test used is described in Rodionov (2004). Figure 16 displays the results as decadal shifts in the mean number of annual events. The simulations suggest that the MJO was near normal levels of activity during 1880–1895 (3.4 events) and afterward entered a period of low activity in 1896–1917 (2.6 events). Subsequently, the MJO returned to near normal levels for several decades in 1918–1947 (3.3 events). In the reanalysis period, the simulations indicate near normal levels in 1948–1972 and an increasing trend to high activity in 1973–1989 (3.9 events) and 1990–2008 (4.6 events). As noted previously, the changepoint in about 1973–1974 was discussed in Jones and Carvalho (2006) and Pohl and Matthews (2007).

Fig. 16

Changes in MJO activity during 1880–2008. Each bar summarizes statistics obtained with stochastic simulations: mean number of events year−1 (circles), 95% confidence level (horizontal ticks) and ± one standard deviation (tips of the bars). Changes between adjacent time periods indicated at the bottom axis are significant at 5% level

Based on the stochastic simulations presented here, we derived probabilities of MJO occurrences. Figure 17 shows cumulative probabilities of n or more events per year for the six regimes of MJO activity displayed in Fig. 16. Sample sizes are given by the number of years in the regime multiplied by 1,000 (i.e., the number of ensemble members). Note that the probability curves in any two adjacent regimes are statistically different from each other at the 5% significance level, since they were obtained after the identification of changepoints in the time series of number of events per year. It is evident that the probabilities changed significantly between regimes of low and high activity. For example, the probability of 5 or more events in a year was 0.214 in 1880–1895, 0.076 in 1896–1917, 0.197 in 1918–1947 and 0.193 in 1948–1972. After the changepoint in the 1970s, this probability appears to have increased: 0.329 in 1973–1989 and 0.510 in 1990–2008. We finally note that the non-homogeneous stochastic simulations presented here can answer some additional interesting questions. For example, what is the probability that 7 successive MJOs could occur in a given year and would last 60 days each? How did this probability change over time? Extreme situations like this could have an important role in tropical ocean dynamics such as ENSO variability.
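The cumulative probabilities can be estimated as simple empirical frequencies, as in the sketch below. It assumes that the counts from all years in a regime and all ensemble members are pooled into one sample, consistent with the sample sizes described above; the function names and the toy data are hypothetical.

```python
def prob_n_or_more(counts, n):
    """Empirical probability of n or more events in a year."""
    return sum(1 for c in counts if c >= n) / len(counts)


def cumulative_curve(counts, max_n):
    """Probabilities of n or more events for n = 0, 1, ..., max_n."""
    return [prob_n_or_more(counts, n) for n in range(max_n + 1)]


# Toy pooled sample of annual event counts (not simulation output)
pooled = [3, 4, 5, 6, 2]
p5 = prob_n_or_more(pooled, 5)
curve = cumulative_curve(pooled, 7)
```

For a regime such as 1990–2008, `pooled` would contain 19 × 1,000 annual counts, and `prob_n_or_more(pooled, 5)` would correspond to the probability of a very active year quoted in the text.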

Fig. 17

Cumulative probability of number of MJO events year−1. Each curve represents cumulative probabilities in the given decades shown in the inset. x-axis denotes n or more events year−1. Differences in probability curves between two adjacent time periods are significant at 5% level

7 Discussion and conclusions

This study investigated several questions associated with long-term variability of the MJO. Do interannual-to-decadal variations in SST in the warm pool lead to substantial changes in MJO activity? Was there a change in the MJO in the 1970s? Can this change be associated with SST anomalies in the warm pool region? What was the level of MJO activity in the pre-reanalysis era? These questions were investigated from a probabilistic point of view using stochastic simulations.

In order to represent the non-stationarity of the MJO, modifications were made to the stochastic model described in Jones (2009). A significant finding shown here is the empirical result that relates low-frequency variations in SST anomalies to conditional probabilities of MJO initiation. Cold (warm) SST anomalies in the warm pool, expressed as 3-year running means, lead high (low) MJO activity by about 1.7 years. Likewise, 3-year running variances indicate that high (low) SST variability in the warm pool leads to high (low) MJO activity ~3.3 years later. Furthermore, the 30-year running mean trend in SST anomalies in the warm pool is significantly correlated with the trend in the MJO in the reanalysis period.

A multi-linear regression model that uses 3-year running means (M SSTA) and variances (V SSTA) of SST anomalies in the warm pool as predictors for detrended variations in the probability of primary MJO initiation, 〈P NHM/P HMG〉, was discussed in detail. When the model is trained in one time period, it is relatively successful in reproducing changes in 〈P NHM/P HMG〉 in independent data. The relationships between 〈P NHM/P HMG〉, M SSTA and V SSTA, however, do not appear to hold in the early part of the reanalysis record (before the 1960s), when 〈P NHM/P HMG〉 exhibited small variations. Whether or not the NNR data realistically represent MJO variability before the introduction of satellite data is a topic that needs to be further evaluated. Coupled ocean-atmosphere reanalysis using only conventional data could possibly provide additional insight into this problem.

A large ensemble of stochastic simulations shows that the non-homogeneous model is capable of reproducing the activity of the MJO during the reanalysis period. The MJO exhibited a regime of near normal activity in 1948–1972 (3.4 events year−1) and two regimes of high activity in 1973–1989 (3.9 events) and 1990–2008 (4.6 events). Additionally, this is the first study to derive statistical inferences about the activity of the MJO before the availability of reanalysis data. Stochastic simulations indicate decadal shifts with near normal levels in 1880–1895 (3.4 events), low activity in 1896–1917 (2.6 events) and a return to near normal levels for an extended period from 1918 until the beginning of reanalysis data.

Variations in the activity of the MJO were also found in decadal changes of cumulative probabilities of number of events per year. Simulations indicate that probabilities of very active years (5 or more events) varied as: 0.214 (1880–1895), 0.076 (1896–1917), 0.197 (1918–1947) and 0.193 (1948–1972). After the changepoint in the 1970s, this probability increased to 0.329 (1973–1989) and 0.510 (1990–2008). Evidently, the stochastic simulations of MJO activity before the reanalysis period need to be further evaluated. It would be interesting to compare the results shown here with coupled ocean-atmosphere simulations from a model that realistically represents the MJO.

In a separate study, Jones and Carvalho (manuscript in preparation) performed stochastic simulations similar to the ones described here and derived projections of changes in MJO activity in the A1B global warming scenario (IPCC 2007). In that case, conditional probabilities of MJO initiation were driven by projections of SST anomalies from five different global coupled climate models participating in the 4th Assessment Report of the Intergovernmental Panel on Climate Change. Stochastic simulations suggest substantial increases in the probabilities of very active years (6 or more events): 0.39 (2009–2048) and 0.56 (2049–2099). Thus, if the current warming rate in the warm pool continues throughout the 21st century, the impact on the MJO will likely be strong as well. It is reasonable to expect that very active years of MJO activity will be associated with other changes in the climate system too.

In the context above, several studies have shown that the MJO may exert an important role as stochastic forcing of ENSO variability (Moore and Kleeman 1999; Batstone and Hendon 2005; Zavala-Garay et al. 2005; Marshall et al. 2009). A unique aspect of our model is that it generates MJO irregularity represented as primary and successive events with different durations. This model can be used to investigate the stochastic forcing of the MJO by coupling it to an ocean model. Since conditional probabilities of MJO initiation in the stochastic model vary with SST anomalies, possible feedback processes could be examined in detail. For instance, as the MJO forces the ocean model, low-frequency changes in tropical SSTs would modify the probabilities of MJO initiation and therefore change the nature of the stochastic forcing.

Lastly, as pointed out in the introduction, the sparseness and irregularity of tropical weather station data make it difficult to characterize the activity of the MJO before the reanalysis era. The stochastic simulations presented in this study differ substantially from traditional techniques used in previous observational MJO studies. The authors are presently developing additional statistical methods to improve the reconstruction of the activity of the MJO in the pre-reanalysis era.