1 Introduction

Globally around \(60\,\%\) of the terrestrial precipitation directly originates from moisture transported from the ocean (Trenberth et al. 2007; Gimeno et al. 2012). The variability of the oceanic water supply greatly influences water availability for all regions. Excessive transports are usually major causes for extreme weather and flood events (Knippertz and Wernli 2010; Galarneau et al. 2010; Chang et al. 2012; Knippertz et al. 2013), while interrupted transports can lead to droughts and subsequent socioeconomic stresses (Cai et al. 2012, 2014). Hence, a clear understanding of the mechanisms that force observed changes to the hydrological cycle is of major importance.

Most of the major oceanic source regions of atmospheric moisture are confined to the tropics and subtropics, where the high sea surface temperature (SST) and anticyclonic circulations provide favorable conditions for evaporation to occur under clear sky conditions. The surplus of evaporation (E) over precipitation (P) provides a useful estimate of the net water input to the atmosphere (E\(-\)P). However, large-scale estimates of this flux are largely limited to reanalysis datasets, which suffer from model biases and data inhomogeneity issues (Hegerl et al. 2014; Wang and Dickinson 2012; Trenberth et al. 2007, 2011). Evaporation from reanalysis is not constrained by precipitation and radiation (Hartmann et al. 2013), and spurious trends and biases can be introduced by changing satellite observations (e.g. Bosilovich et al. 2005; Robertson et al. 2011), which also contribute considerably to budget errors over land (Pan et al. 2012). Similarly, precipitation from reanalysis also depends strongly on the parameterization schemes adopted by a specific model (i.e. it is a “type C” variable: Kistler et al. 2001; Kalnay et al. 1996). Moreover, oceanic freshwater fluxes computed from E and P close the water budget less well than values derived from atmospheric moisture fluxes (Rodríguez et al. 2010).

Therefore, like many studies (e.g. Trenberth and Guillemot 1998; Trenberth and Stepaniak 2001) we use the moisture divergence fields computed from “type B” variables (i.e. ones that are more dependent on assimilated observations and less dependent on model parameterizations) to balance the water budget. This indirect approach is more reliable and consistent among observations (Trenberth 1997b; Roads 2002, 2003; Gimeno et al. 2012). Moreover, it is the large-scale convergence rather than locally enhanced evaporation that controls the precipitation patterns in the tropics (Mo and Higgins 1996; Soden 2000; Su and Neelin 2002; Trenberth et al. 2003; Zahn and Allan 2011), and analysis of the moisture divergence provides insights into the major modes of precipitation variability, as well as the moisture sources themselves.

On interannual time scales, large-scale atmospheric variability is closely associated with the El Niño Southern Oscillation (ENSO). Associated with the altered Walker circulation (Bjerknes 1966, 1969) and strengthened and shifted Hadley cell (Oort and Yienger 1996; Xw et al. 1950; Hu and Fu 2007; Wang 2002), the atmospheric hydrological cycle is also reorganized. Recently, there have been investigations of different types of ENSO events and their corresponding mechanisms and impacts (Capotondi et al. 2014). Most of them take the SST anomaly (SSTA) patterns as the starting point, and emphasize the different zonal SSTA structures (Larkin and Harrison 2005a, b; Ashok et al. 2007; Kao and Yu 2009; Kug et al. 2009; Fu et al. 1986; Trenberth and Stepaniak 2001; Trenberth and Smith 2006; Giese and Ray 2011; Capotondi 2013). Although each study uses a different index definition and separation criterion, gives different names to the El Niño types, and emphasizes somewhat different aspects of these events, there appears to be some correspondence between these parallel studies:

  • the “1972 type ENSO” in Fu et al. (1986), the “conventional El Niño” in Larkin and Harrison (2005a) and Ashok et al. (2007), the “Eastern Pacific (EP) type ENSO” in Kao and Yu (2009) and Yu and Kao (2007), and the “Cold Tongue (CT) El Niño” in Kug et al. (2009), all refer to those events associated with anomalously warm SSTs over the eastern equatorial Pacific;

  • the “1963 type ENSO”, the “dateline El Niño” and “El Niño Modoki”, the “Central Pacific (CP) type ENSO”, and the “Warm Pool (WP) El Niño” in the aforementioned studies define the counterpart with its warming centered closer to the central equatorial Pacific.

The events identified by these studies are generally consistent when their data periods overlap (see Fig. 1 in Singh et al. (2011) for a summary), suggesting that these diverse interpretations all point to essentially the same phenomena (Kug et al. 2009). Studies starting from spatial patterns in other variables find a similar east-central contrast in the El Niño categorizations: surface salinity (Singh et al. 2011), the first occurrence of significant SSTA (Xu and Chan 2001; Kao and Yu 2009), sea level anomalies (Bosc and Delcroix 2008) and outgoing longwave radiation (OLR) in the equatorial Pacific (Chiodi and Harrison 2010).

Empirical orthogonal function (EOF) analysis is a commonly used technique in studies that describe ENSO. However, the orthogonality constraint on the resultant patterns and time-series means that they do not necessarily have direct physical interpretations. This sometimes hampers the ability of this technique to capture non-linear features embedded in the data, particularly when there is a relative spread of variances across multiple EOFs all related to the same forcing. Previous studies suggest that a complete description of different characters and evolutionary features of El Niños cannot be captured fully by a single index, and a second mode reflecting the zonal SST contrast is a necessary complement (Trenberth and Stepaniak 2001; Trenberth and Smith 2006; Kao and Yu 2009). These complementary modes broadly correspond to the two flavours of El Niños, but have serious deficiencies when considering individual events (Johnson 2013). In such cases, additional efforts and other techniques, like regression analyses, are required to enable a clear interpretation of the EOF results.

Similar to EOF analysis, the self-organizing map (SOM) is a powerful dimension-reduction tool, but it is free from the orthogonality constraint. Introduced into the geography community in the 1990s, it has been more commonly used for determining synoptic circulation patterns and downscaling (Hewitson and Crane 1994, 2002; Crane and Hewitson 1998; Reusch et al. 2007; Verdon-Kidd and Kiem 2009; Verdon-Kidd et al. 2014). Here, we explore its potential applications in large-scale climatic analysis. In this study, we first use conventional EOF-correlation analysis to illustrate how the tropical atmospheric moisture circulation responds to different flavors of El Niños. Then, noting that the different types of El Niños are associated with patterns of anomalous moisture divergence that may not be orthogonal, whereas EOF analysis imposes orthogonality, we obtain a new perspective from a neural network algorithm (SOM). More details on the SOM algorithm are described in Sect. 2, including data preprocessing procedures, and the El Niño phase separation method. Sections 3.1, 3.2 and 3.3 show the distinct moisture divergence responses to extreme and moderate El Niños, which are validated by the SOM results described in Sect. 3.4. A summary and discussion is given in Sect. 4.

2 Methods and data

2.1 Moisture divergence

In this study we use the ERA-Interim (ERA-I) reanalysis data (Dee and Uppala 2009), a third generation atmospheric reanalysis product (Trenberth et al. 2011). ERA-I has some major improvements over its predecessor (ERA-40) in hydrological components (Trenberth et al. 2011), and outperforms NCEP I, II and MERRA in depicting the global ocean-land moisture transports (Trenberth et al. 2011). The near surface fields in ERA-I are better correlated with buoy observations (implying more faithful air-sea water fluxes) compared to NCEP products (Praveen Kumar et al. 2011). It also represents the latest and best reanalysis for reproducing and interpreting the atmospheric branch of the hydrological cycle (Trenberth et al. 2011; Lorenz and Kunstmann 2012).

Horizontal moisture divergence was computed following Trenberth and Guillemot (1998):

$$\begin{aligned} \boldsymbol{\nabla} \cdot {\mathbf {Q}} = \boldsymbol{\nabla} \cdot \frac{1}{g} \int _0^{P_s} q {\mathbf {v}}dp \end{aligned}$$
(1)

Specific humidity (q), horizontal winds (\({\mathbf {v}}\)) and surface pressure (\(P_s\)) were obtained from ERA-I for the period of 1st January 1979 to 31st December 2012. Horizontal moisture fluxes were computed on each of the 60 sigma levels using 6-hourly data, to capture as much covariance of q and \({\mathbf {v}}\) as possible. The original full resolution (\(0.75^{\circ }\times 0.75^{\circ }\)) divergence anomaly (with respect to the 34-year mean annual cycle) was temporally averaged into calendar months, and spatially filtered to a lower \(3 \,^{\circ }\times 3 \,^{\circ }\) resolution, before passing into the EOF analysis.
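As an illustration of Eq. (1), the following is a minimal NumPy sketch rather than the processing chain actually used with ERA-I. It assumes the fields are already available on levels with known pressure (ordered from the model top to the surface) and on a regular latitude-longitude grid; the function names, the trapezoidal integration and the centred finite differences are all assumptions made for this example.

```python
import numpy as np

G = 9.81                # gravitational acceleration (m s^-2)
EARTH_RADIUS = 6.371e6  # mean Earth radius (m)

def column_moisture_flux(q, u, v, p):
    """Vertically integrated moisture flux (1/g) * int q*v dp, in kg m^-1 s^-1.

    q, u, v, p have shape (nlev, nlat, nlon); p is pressure in Pa and
    increases along axis 0 from the model top to the surface.
    """
    dp = np.diff(p, axis=0)

    def integrate(f):
        # Trapezoidal rule over the pressure coordinate.
        return np.sum(0.5 * (f[1:] + f[:-1]) * dp, axis=0) / G

    return integrate(q * u), integrate(q * v)

def flux_divergence(qx, qy, lat_deg, lon_deg):
    """Horizontal divergence on a sphere, in kg m^-2 s^-1.

    Multiply by 86400 to convert to mm/day of liquid water.
    The grid must not include the poles (cos(lat) = 0).
    """
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    coslat = np.cos(lat)[:, None]
    dqx_dlon = np.gradient(qx, lon, axis=1)
    dqycos_dlat = np.gradient(qy * coslat, lat, axis=0)
    return (dqx_dlon + dqycos_dlat) / (EARTH_RADIUS * coslat)
```

Applying the two functions to each 6-hourly field before averaging into calendar months retains the covariance of q and \({\mathbf {v}}\) at sub-monthly time scales, which is the reason given above for working with 6-hourly data.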

2.2 ENSO events and phase separation

ERA-I SST data during the same time period were used to compute the Nino 3.4 index (Trenberth 1997a). After filtering with a 5-month running mean to remove intra-seasonal variability, the time-series was normalized by its standard deviation. El Niño (La Niña) events are determined by the criterion that the Nino 3.4 index exceeds \(+ 0.75 \, \sigma\) \((-0.75 \, \sigma )\) for at least six consecutive months. If this criterion is met, the beginning of the event is defined as the first month that exceeded \(\pm 0.75 \, \sigma\).
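The event criterion above can be sketched as follows. This is an illustrative implementation, not the authors' code: the helper names are hypothetical, the running mean is assumed to be centred, and months at the ends of the record where the 5-month window is incomplete are simply left undefined.

```python
import numpy as np

def running_mean(x, window=5):
    """Centred running mean; positions with an incomplete window are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    half = window // 2
    out[half:len(x) - half] = np.convolve(x, np.ones(window) / window, mode="valid")
    return out

def normalize(x):
    """Divide by the standard deviation, ignoring NaN values."""
    return x / np.nanstd(x)

def find_events(index, threshold=0.75, min_months=6):
    """Return (start, end) pairs, end exclusive, of runs above the threshold.

    A run counts as an event only if it lasts at least `min_months`
    consecutive months; `start` is the first month above the threshold.
    """
    above = np.nan_to_num(index, nan=-np.inf) > threshold
    events, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_months:
                events.append((start, i))
            start = None
    if start is not None and len(above) - start >= min_months:
        events.append((start, len(above)))
    return events
```

With `index = normalize(running_mean(nino34))`, where `nino34` is a hypothetical monthly anomaly series, `find_events(index)` lists candidate El Niño events and `find_events(-index)` lists candidate La Niña events.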

Tracking the evolution of El Niño events through a sequence of phases could be achieved by defining phases according to either their calendar months or their timing relative to the magnitude of the SSTA. Using Nino 3.4 SSTA as the index, Xu and Chan (2001) suggested a 3-month delay in the onset time of “Summer” type El Niños compared with “Spring” type El Niños, which also show distinct warming structures. Considering this time shift in the evolutionary pathways, the calendar-month approach (e.g. using Aug-Oct as the starting phase for both types) might end up comparing events at different evolution stages, particularly for the pre-mature phases.

Therefore, taking into account the irregularity of El Niño events, we defined a relative-amplitude-based method to split each event into five evolutionary phases:

  1. “Pre-event” phase: three preceding months before the Nino 3.4 index reaches the El Niño criterion (defined above);

  2. “Starting” phase: from the beginning of an event to the time when the Nino 3.4 index rises \(70\,\%\) of the way up to its maximum (See “Appendix” for an illustration);

  3. “Peak” phase: the phase in between the “Starting” and the “Decaying” phases;

  4. “Decaying” phase: from the time when the Nino 3.4 index drops \(30\,\%\) from its maximum value to the El Niño criterion, until the end of the event;

  5. “Post-event” phase: three subsequent months after the Nino 3.4 index drops below the El Niño criterion.

The Nino 3.4 index experiences its fastest changes during the “Starting” and “Decaying” phases (whereby we assume swift changes in the overlying atmosphere, which is shown later to be the case). As monthly mean Nino 3.4 SST is used, linear interpolation was used to estimate the timing of the phases more precisely (i.e. in days). The same interpolating factors are later applied to other variables (e.g. moisture divergence) in creating the phase composites. More details are given in the “Appendix”.
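The interpolation of phase boundaries might look like the sketch below. It is only one reading of the definitions above, since the exact formulation is in the “Appendix”: the \(70\,\%\) and \(30\,\%\) levels are assumed to be measured along the range between the El Niño criterion and the event maximum, and the function names are hypothetical.

```python
import numpy as np

def crossing_time(t, y, level, rising=True):
    """First time at which y crosses `level`, linearly interpolated.

    Returns None if no crossing is found between consecutive samples.
    """
    for i in range(len(y) - 1):
        y0, y1 = y[i], y[i + 1]
        crosses = (y0 < level <= y1) if rising else (y0 > level >= y1)
        if crosses:
            return t[i] + (level - y0) / (y1 - y0) * (t[i + 1] - t[i])
    return None

def phase_boundaries(t, y, criterion=0.75, rise_frac=0.7, drop_frac=0.3):
    """End of the "Starting" phase and start of the "Decaying" phase.

    t, y: times and normalized Nino 3.4 values for a single event.
    Assumption: both fractions refer to the range (maximum - criterion).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    k = int(np.argmax(y))
    amplitude = y[k] - criterion
    starting_end = crossing_time(
        t[:k + 1], y[:k + 1], criterion + rise_frac * amplitude, rising=True)
    decaying_start = crossing_time(
        t[k:], y[k:], y[k] - drop_frac * amplitude, rising=False)
    return starting_end, decaying_start
```

The fractional positions returned here play the role of the interpolating factors mentioned above: the same weights between two adjacent monthly fields can be reused for moisture divergence or any other variable when building phase composites.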

Unlike other El Niños that have a single maximum in the Nino 3.4 time-series, the 1986/1987 case features a dual peak, with its first peak occurring in January 1987 and the second, larger, peak in August 1987. In the phase separation procedure described above, only the second peak was identified as the maximum, and the presence of the first peak was not accounted for. However, computations with the 1986/1987 event excluded give very similar results, and suggest that the major conclusions are insensitive to its inclusion.

2.3 Self-organizing maps

SOM is a type of neural network algorithm that introduces a specified number of neurons into the spatio-temporal space of the input dataset, and through an iterative, unsupervised learning process, locates these neurons in such a way that they collectively represent the data values within the entire data space, but individually represent local variability (Kohonen 1990, 2001). Unlike EOF analysis, there are no linear or orthogonal constraints, and the neuron distribution is determined solely by the distribution of the input data. These characteristics allow SOM to represent the dimensions of the input variables along which the variance in the sequence of inputs is most pronounced (Cavazos 1999; Liu et al. 2006).

In addition to positioning the neurons within the multi-dimensional data space, the neurons are themselves laid out in a “map” that topologically links them so that neighbouring neurons tend to be more similar than non-neighbouring neurons. This map is most commonly a 2D grid with a hexagonal or rectangular layout that determines how many neighbours each neuron has (Kohonen 2001), though other options are possible. The topological links between neighbours facilitates examination of evolutionary paths of a physical phenomenon across the map’s neurons, as well as effectively visualizing high-dimensional data and serving as an alternative classification method, as will be shown in the results section.

Even if it is non-linear, the transition from extreme El Niño states to strong La Niña states is nevertheless a continuum and we can represent this using SOM with a simplified 1D map. Thus, each neuron is topologically related only to its immediate neighbours in the 1D array of neurons (of course, each neuron still represents a location in the multi-dimensional data space). A description of the initialization and training formulation to obtain the SOM is given in the “Appendix”.

The size of the SOM array is usually an arbitrary choice made by the user. Analogous to other statistical methods, there is a trade-off between the degree of generalization, the amount of detail to represent, and the capacity of the available data sample to adequately represent the variance and distribution of the data. Therefore some trial and error experiments are usually recommended to determine an appropriate SOM size. In this case, a 1D array with five neurons gives results that can be easily related to ENSO variability. Using seven neurons (not shown) yields similar patterns with large differences to the five-neuron setup only occurring in the neutral and moderate ENSO states, where the influence of other climate variability is relatively larger. This is consistent with Johnson (2013), who suggested that no more than nine SOM neurons could be distinguished in patterns of equatorial Pacific SSTA.
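For readers unfamiliar with the method, the sketch below shows a generic 1D SOM of the Kohonen type. It is not the initialization and training formulation of the “Appendix”: the random-sample initialization, the Gaussian neighbourhood function, the linearly decaying learning rate and radius, and all parameter values are assumptions chosen for brevity.

```python
import numpy as np

def train_som_1d(data, n_neurons=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1D self-organizing map with online (sample-by-sample) updates.

    data: array of shape (n_samples, n_features), e.g. one flattened
    anomaly field per month. Returns weights of shape (n_neurons, n_features).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialize each neuron at a randomly chosen sample.
    weights = data[rng.choice(len(data), n_neurons, replace=False)].copy()
    positions = np.arange(n_neurons)  # neuron coordinates on the 1D map
    for it in range(n_iter):
        frac = it / n_iter
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # decaying neighbourhood radius
        x = data[rng.integers(len(data))]
        # Best-matching unit: the neuron closest to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Neighbours on the 1D map are pulled towards the sample as well.
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights
```

Rerunning such a sketch with `n_neurons=5` and `n_neurons=7` is the kind of trial-and-error experiment referred to above; because the neighbourhood update links adjacent neurons, the trained array tends to be ordered along the dominant path through the data.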

3 Results

3.1 El Niño-La Niña transitions

The two leading EOFs of the moisture divergence anomaly field are found to be ENSO-related, and they explain \(15\,\%\) and \(11\,\%\) of the total variance, respectively. Figure 1 displays the patterns and principal components of EOF #1 and #2, together with the climatological average moisture divergence (negative values indicate moisture convergence or \(P > E\)).
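The quantities quoted here (spatial patterns, principal components and explained variance fractions) can be obtained from a singular value decomposition. The following is a generic sketch, not the configuration used in this study; in particular, it assumes the anomaly fields have been flattened to a time-by-gridpoint matrix and it omits any latitude-dependent area weighting.

```python
import numpy as np

def eof_analysis(anom, n_modes=2):
    """EOF analysis of a (n_time, n_grid) anomaly matrix via SVD.

    Returns (eofs, pcs, var_frac):
      eofs     (n_modes, n_grid)  orthonormal spatial patterns
      pcs      (n_time, n_modes)  principal component time-series
      var_frac (n_modes,)         fraction of total variance explained
    """
    anom = np.asarray(anom, dtype=float)
    anom = anom - anom.mean(axis=0)  # remove any residual time mean
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    var_frac = s ** 2 / np.sum(s ** 2)
    pcs = u[:, :n_modes] * s[:n_modes]
    return vt[:n_modes], pcs, var_frac[:n_modes]
```

By construction the returned patterns are mutually orthogonal and the principal components are mutually uncorrelated, which is the constraint discussed in Sect. 1.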

Fig. 1
Subplots a and b show the EOF#1 and EOF#2 of tropical Pacific moisture divergence anomalies (mm/day), respectively. c shows their principal component time-series (PC#1 in blue and PC#2 in red). d is the climatological mean moisture divergence (1979\(-\)2012)

The first EOF (Fig. 1a) features a westward-pointing horseshoe structure over the tropical Pacific region that is in good agreement with the typical ENSO SSTA pattern. Anomalous convergence collocates with the warm SST anomalies during the mature phase of an El Niño, and the encompassing divergent anomalies correspond to the negative SSTAs over the warm pool and South Pacific Convergence Zone (SPCZ). This suggests the influence of thermally driven circulation changes on the moisture divergence patterns, with the climatological convergence/divergence regions (Fig. 1d) shifted eastward following the zonal movement of warm SST. Significant correlations (\(p < 0.01\)) with Nino 4 (\(r = 0.68\)), Nino 3.4 (\(r = 0.85\)), Nino 3 (\(r = 0.85\)) and Nino 1+2 (\(r = 0.70\)) indices lend further support to the ENSO attribution. All warm events can be easily recognized in the PC#1 time-series (Fig. 1c), except the 1994/1995 event (which is also the weakest judging by the Nino 3.4 amplitude; not shown).

Although this horseshoe-like spatial pattern of EOF#1 resembles that in the EOF#2 of Ashok et al. (2007), from which they diagnosed the “El Niño Modoki”, the correlation between PC#1 and the El Niño Modoki Index is not particularly high (\(r = 0.31,\, p < 0.01\)). This is partly due to the different fields used in Ashok et al. (2007) (SST) and in this study (moisture divergence), and the non-linear responses of atmospheric circulation to the surface forcing. Therefore this pattern does not effectively distinguish Modoki-associated moisture divergence fields from other warm events, but rather represents the broad structure of ENSO cycles in general.

The second EOF pattern (Fig. 1b) features a southwest-northeast dipole mode over the western Pacific (west of the dateline), and a north-south gradient over the eastern Pacific similar to that found in EOF#1 but shifted \(6 \,^{\circ }\) equatorward. The PC#2 time-series (Fig. 1c) shows more month-to-month variability than PC#1, but some ENSO signatures are still recognizable, with the 1982/1983 and 1997/1998 El Niño cases being most prominent, similar to the Eastern Pacific index time-series in Kao and Yu (2009). A closer look at the two spikes reveals that during these two events they lag their PC#1 counterparts by about one season, but experience fast changes, suggesting a quick restructuring of the moisture circulation patterns.

Fig. 2
Scatter plot of PC#2 against Nino 3.4 index with all El Niño (circles) and La Niña (triangles) events color coded. Non-ENSO months are denoted by small black dots. Evolutionary pathways of the 1982/1983 (red), 1991/1992 (blue) and 1997/1998 (purple) El Niño events are illustrated by solid lines, with the final month being represented with a solid square

Besides greater warming magnitudes, these two warm events (1982/1983 and 1997/1998) differ from the others from a number of additional perspectives (see next section). It has previously been noted that two leading EOFs are required to describe different evolutions of ENSO events (Trenberth and Stepaniak 2001; Kao and Yu 2009). Therefore we also attribute EOF#2 to ENSO, representing the non-linear responses not captured by EOF#1. This non-linearity is illustrated by the outlying dots in the scatter plot of PC#2 against Nino 3.4 (Fig. 2). In general, PC#2 and Nino 3.4 are negatively correlated. However, the 1982/1983 and 1997/1998 events, and to a lesser extent the 1991/1992 case, contaminate this negative correlation and make the otherwise strong correlation rather poor (\(r = -0.3, \, p < 0.01\)). Not all of the months during these three warm cases are outliers; therefore, to reveal the evolutionary paths of these exceptional events, we linked the points of these events in chronological order. Consistently for all three of them, as the El Niño event emerges and rises in amplitude (Nino 3.4 increasing), PC#2 decreases, following the linear path defined by the negative relationship. When Nino 3.4 approaches its maximum value, PC#2 swiftly deviates away from the negative relationship and becomes strongly positive. During this period (which will be shown to be the peak-to-decaying phases), there is no further rise in the SST amplitude, yet the moisture divergence field experiences fast changes. Subsequently, both Nino 3.4 and PC#2 decrease towards zero.

Fig. 3
Scatter plot of PC#1 and PC#2 with all El Niño (circles) and La Niña (triangles) events color coded. Non-ENSO months are denoted by small black dots. Data points for the extreme El Niño group are enclosed by a red ellipse; the moderate El Niño group by green circles, and the La Niña group by blue circles. Square-boxed numbers show the locations of the five SOM neurons in PC#1, PC#2 space, i.e. regressed onto EOF#1 and EOF#2 using least squares fit

A scatter plot of PC#1 against PC#2 summarizes the complete El Niño-La Niña response (Fig. 3). Two linear relationships are required to fully capture the moisture divergence responses to ENSO effects:

  1. The negative La Niña-neutral-moderate El Niño correlation (\(r = - 0.46,\, p < 0.01\));

  2. The positive moderate-extreme El Niño correlation (\(r = 0.64, \, p < 0.01\)).

Although both are statistically significant, these two linear relationships represent very different time subsets (\(97\,\%\) and \(3\,\%\) of the data, respectively). Despite extreme El Niños only constituting around \(3\,\%\) of the total time (14 out of 408 months exceeding \(2\sigma\) in Nino 3.4), both PC#1 and PC#2 show high positive values, and the associated reorganization of atmospheric convection and related global disruptions (Cai et al. 2014) mean that special attention to these extreme cases is well deserved.

Three groups of nearby points are circled in Fig. 3 to represent typical patterns for extreme El Niño state (1983-1, 1983-2, 1998-1), moderate El Niño state (1991-1911, 1997-1998, 2002-11) and strong La Niña state (1988-12, 2007-12, 2010-11), respectively. Other states can be approximated by the linear relationships defined above. The composite for each group was generated by averaging the linear combinations of EOF#1 and #2 from the corresponding months, and the results are shown in Fig. 4. The spatial pattern of the strong La Niña composite (Fig. 4a) is similar to that of EOF#1 and of the moderate El Niño composite (Fig. 4c), but with opposite sign. This is a result of both PC#1 and PC#2 switching sign but remaining approximately the same magnitude (Fig. 3). The extreme El Niño group (Fig. 4e) displays distinct spatial patterns and stronger magnitudes (note the different color scale). Both the maximum convergence and divergence anomalies in the extreme El Niño composite reach \(13.0\,{\mathrm {mm/day}}\) or above, which is more than twice the December to February (DJF) climatology (not shown). A zonally elongated convergence band occurs over the eastern Pacific, which co-locates with enhanced precipitation anomalies (Kug et al. 2009; Cai et al. 2012). The climatological SPCZ swings equatorward by a larger amount than during moderate El Niños (the zonal SPCZ feature will be discussed in the next section). A sharp meridional gradient covers the entire tropical Pacific. This is suggested to be the response to the weakened meridional SST contrast over the eastern Pacific (Cai et al. 2014), and the descent anomalies to the north of the equator, mostly caused by dry advection (Su and Neelin 2002). Lastly, the Northern Hemisphere (NH) branch of the Hadley cell intensifies in both the ascending and descending branches and shifts equatorward by a larger magnitude (Hu and Fu 2007; Quan et al. 2004).

Fig. 4
Composites of moisture divergence anomaly fields (mm/day) for a, b La Niña group, c, d moderate El Niño group and e, f extreme El Niño group. Subplots a, c and e show the composites reconstructed from only EOF #1 and EOF #2, and b, d, f are the composites of the actual anomaly fields during the same calendar months

These expressions in the space defined by EOFs #1 and #2 of the anomalous moisture divergence during these three event composites are a good representation of the anomaly fields in the full dimensional space (compare Fig. 4a, c, e with Fig. 4b, d, f). This is especially so for the strong La Niña and extreme El Niño composites, while the moderate El Niño composite (Fig. 4d) shows some moisture divergence anomaly features in the South Pacific that are not represented by only EOFs #1 and #2 (Fig. 4c). Note that some anomalous features are expected when using a composite formed from only three monthly fields.

3.2 El Niño classification

Given the unusualness of the three warm events, it is justified to make the following El Niño classification from a moisture divergence perspective:

  1. 1.

    Extreme El Niño: represented by 1982/1983, 1991/1992 and 1997/1998 cases;

  2. 2.

    Moderate El Niño: represented by 1986/1987, 1994/1995, 2002/2003 and 2009/2010 cases.

The 1982/1983 and 1997/1998 events have been found to be exceptional in various El Niño classification studies, either from an SSTA zonal contrast point of view (Kug et al. 2009; Kao and Yu 2009; Larkin and Harrison 2005a, b; Giese and Ray 2011), or by the SSTA onset timing differences (Xu and Chan 2001), or using variables other than SST (Singh et al. 2011; Chiodi and Harrison 2010). The results presented above suggest distinct features from a moisture divergence perspective, and therefore differentiate El Niños on a new dimension.

Unlike the unambiguity in the 1982/1983 and 1997/1998 cases, the 1991/1992 event falls into different groups in different studies: Kug et al. (2009) classified it into the “Mix group” (mix of Cold Tongue and Warm Pool El Niño), and in Kao and Yu (2009) and Singh et al. (2011) it was grouped into the EP category. Similarly in the case of moisture divergence responses it diverges from the linear transitions between La Niña and moderate El Niños, but not as much as the other two extreme events (Fig. 2).

To examine the relationship between different El Niño responses to the SSTA zonal structure, we also created scatter plots of PC#2 against Nino 4, Nino 3 and Nino 1+2 indices (not shown). The negative correlation among non-El Niño and moderate El Niño points becomes weaker as the index moves from west to east. This suggests better correspondence between the moderate ENSO cycle and central-western Pacific SST variations, while extreme El Niños are more related to the east-west SSTA contrast. Moreover, Kao and Yu (2009) and Capotondi (2013) also found consistent east-west differences in the subsurface temperature structures associated with the two types of El Niños. Zonal SST gradient, ocean heat content propagation and the thermocline feedback are key to explaining the observed differences in the atmospheric circulation, moisture divergence and subsequently precipitation responses.

3.3 El Niño phase comparison

To examine the El Niño differences in more detail, each event is broken into five evolutionary phases according to its relative Nino 3.4 amplitude, and the phase composites for moderate and extreme El Niños are shown in Figs. 5 and 6, respectively.

Fig. 5
Phase composites of moisture divergence anomalies (mm/day) for moderate El Niños in a “Pre-event” phase, b “Starting” phase, c “Peak” phase, d “Decaying” phase, and e “Post-event” phase. Green hatch overlay denotes areas where the anomaly reverses the sign of the climatology. Surface pressure composite fields are plotted as contour lines with a contour interval of 4, and 850 hPa horizontal wind anomalies (m/s) are plotted as vectors

“Pre-event” and “Post-event” are both 3 months in duration by definition. With the dual-peaked 1986/1987 case excluded, “Starting” phase has an average duration of 2.9 months, “Peak” phase around 4.0 months and “Decaying” phase 1.7 months. Therefore an El Niño would typically experience fast SSTA changes in central Pacific within one season, then meander for a slightly longer time in its “Peak” phase, followed by an even faster drop in SSTA in the “Decaying” phase.

Although their onset timings and overall durations differ, the “Peak” phases always occur during the Nov-Dec-Jan season (the dual-peaked 1986/1987 case being the exception, where the second peak started in July-Aug of 1987). This has been suggested to be the result of a phase-locking mechanism with the seasonal SST cycle (Xu and Chan 2001; see also Fig. 4 in Wang 2002). Such a feature helps to reconcile the amplitude-based and calendar-month-based approaches, and allows the results to be related to those from other studies.

Fig. 6
Same as Fig. 5 but for extreme El Niños

Notable differences between moisture divergence anomalies associated with the extreme and moderate groups start to emerge in the “Starting” phase (Figs. 5b, 6b), reach a maximum in “Decaying” phase (Figs. 5d, 6d), and persist into the “Post-event” phase (Figs. 5e, 6e). In addition to anomalies that are both larger and have a maximum convergence anomaly further east in the extreme El Niño composite, an important new finding is that the extension of the anomalous moisture convergence to the eastern Pacific moves on to the equator during the peak and decaying phases (Fig. 6c, d), whereas it stays north of the equator throughout moderate El Niños (Fig. 5). Shoaling of the thermocline and the resultant influence on SST is very sensitive to the latitude of the anomalous moisture convergence and its associated wind stress. This latitudinal difference and the stronger westerly wind anomalies that accompany it may contribute to the extension of SSTA further into the eastern Pacific during extreme El Niños. The anomalous convergence also exists in balance with a more zonally symmetric Southern Hemisphere (SH) surface pressure field and stronger southerlies east of the dateline in the peak and decaying phases, displacing the SPCZ to a more zonal orientation (see Cai et al. 2012).

In contrast, easterly anomalies occur over equatorial eastern Pacific during a moderate El Niño. Together with the off-equator position of the moisture convergence anomaly, these act to confine surface warming to the central and western Pacific, and deep convection does not occur in the east (consistent with smaller OLR reductions, Chiodi and Harrison 2010).

To the north of the equator, northwesterly anomalies are stronger in the extreme El Niños. Associated with a more compact NH Hadley cell, this dry advection helps maintain the sharp meridional gradient in the moisture divergence field (Su and Neelin 2002), which is strong enough to reverse the climatology (indicated by the green hatching in Fig. 6) in the “Decaying” phase. Moreover, such a peak-to-decaying phase differentiation is not confined to the moisture divergences observed here: the pattern correlations of SSTA from CT El Niños and WP El Niños in corresponding phases (calendar-month-based) were strongly positive during the peak phases of these two types of El Niños, but swiftly became negative one season later (Kug et al. 2009). Similar results were also found for precipitation and pressure velocity fields (Kug et al. 2009).

3.4 SOM analysis

Although two EOFs capture much of the time-varying ENSO signal, their physical interpretation is hampered by their lack of independence. Both the EOFs and the PC time-series are constrained, by definition, to be orthogonal, but that does not mean that they are unrelated. This can be seen in Fig. 3, where despite an overall zero correlation between PC#1 and PC#2, a non-linear relationship clearly exists between the two PC time-series. Furthermore, the pattern of EOF#2 will have been constrained so that (a) it is orthogonal to EOF#1; and (b) it has the precise characteristics such that the projection of moisture divergence onto it during the few extreme El Niño months when there is a positive relationship with PC#1 exactly counterbalances the projections during all the other months when there is a negative relationship with PC#1, so that the overall correlation with PC#1 is zero. It is unlikely that EOF#2 will have been unaffected by these constraints, and some ENSO-related information would likely have been spread into higher order EOFs as a result.
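The distinction between uncorrelated and unrelated can be illustrated with a toy example. The series below are synthetic and are not the PC time-series of this study; a symmetric parabola is used purely because it makes the cancellation exact, whereas the relationship in Fig. 3 is closer to two linear branches.

```python
import numpy as np

def zero_correlation_demo(n=201):
    """Return x, y and their Pearson correlation for y fully determined by x.

    The positive slope for x > 0 exactly counterbalances the negative
    slope for x < 0, so the overall linear correlation vanishes.
    """
    x = np.linspace(-1.0, 1.0, n)
    y = x ** 2 - np.mean(x ** 2)
    r = np.corrcoef(x, y)[0, 1]
    return x, y, r
```

Here `r` is zero to within rounding error even though `y` is a deterministic function of `x`, and restricting the sample to `x > 0` or `x < 0` gives a strong positive or negative correlation, respectively.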

Fig. 7

Self-organizing map (SOM) neurons on moisture divergence anomalies (mm/day); a–e are SOM neurons 1 to 5. Note that a uses a different color scale than the others

This provides the motivation for our SOM analysis of the same moisture divergence field, to explore its utility in easily capturing this non-linear behaviour. By quantifying the distances between a carefully chosen number of SOM neurons, an equivalent El Niño classification is also achieved.

Figure 7 displays the five SOM neurons we obtained. The 1st neuron (Fig. 7a) shows good agreement with the extreme El Niño group composite in Fig. 4e, in terms of both spatial pattern and anomaly strength. The 2nd (Fig. 7b) and 5th (Fig. 7e) neurons resemble the moderate El Niño group (Fig. 4c) and the La Niña group (Fig. 4a), respectively. Moving from neuron-1 to neuron-5, one observes a gradual transition of the moisture divergence field; the remaining two neurons (neuron-3 and -4) can therefore be expected to represent the neutral and weak La Niña ENSO states.

Fig. 8

Stacked time-series of SOM training sample counts, defined as the number of training samples allocated to each neuron in each sliding 13-month time window

This attribution is substantiated by the locations of each neuron in the space defined by EOFs #1 and #2, obtained by least squares estimation of the PC#1 and PC#2 coefficients that best replicate each neuron (shown by the red numbered squares in Fig. 3). The sequence of neurons follows the pathway defined by the two correlations. Figure 8 shows the number of months in each sliding 13-month window allocated to each neuron. The allocation is based upon selecting the closest neuron, in a Euclidean distance sense, to each monthly field. The time-series of neuron-1 displays non-zero values only during the 1982/1983 and 1997/1998 El Niños, and for a shorter period in the 1991/1992 case. The La Niña neuron (neuron-5) shows good correspondence with La Niña years (1983/1984, 1988/1989, 1999/2000/2001, 2007/2008 and 2010/2011). Neuron-2 becomes active either during a moderate El Niño (1986/1987, 1994/1995, 2002/2003 and 2009/2010) or in the early phase of an extreme El Niño (1982/1983 and 1997/1998). The rest of the time period is mostly represented by the neutral and weak La Niña neurons (neuron-3 and neuron-4). Instead of the discrete and selection-exclusive sample counting method used here, one could also use a spatially weighted correlation time-series to reveal more subtle features in the temporal variations of each neuron.
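The allocation and counting steps described above can be sketched as follows. This is a minimal illustration on toy data, not the code used in this study: the function names, the three-point "fields", the two toy EOFs and the choice of a centred window are all assumptions, and no area weighting is applied. Because the toy EOFs are mutually orthogonal, the least squares coefficients reduce to simple projections.

```python
import math

def nearest_neuron(field, neurons):
    """Index of the neuron closest to `field` in Euclidean distance."""
    dists = [math.dist(field, n) for n in neurons]
    return dists.index(min(dists))

def sliding_counts(labels, n_neurons, window=13):
    """Per-neuron sample counts in each window (centred: an assumption)."""
    half = window // 2
    counts = []
    for t in range(half, len(labels) - half):
        chunk = labels[t - half : t + half + 1]
        counts.append([chunk.count(k) for k in range(n_neurons)])
    return counts

def ls_coefficients(pattern, eofs):
    """Least squares coefficients of `pattern` on mutually orthogonal EOFs."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return [dot(pattern, e) / dot(e, e) for e in eofs]

# Toy data (hypothetical): 2 neurons, 15 monthly fields of 3 grid points.
neurons = [[1.0, 0.0, -1.0], [-1.0, 0.0, 1.0]]
fields = [[0.9, 0.1, -1.2]] * 5 + [[-0.8, 0.0, 1.1]] * 10

# Allocate each month to its closest neuron, then count per 13-month window.
labels = [nearest_neuron(f, neurons) for f in fields]
counts = sliding_counts(labels, n_neurons=2)

# Locate the first neuron in the space of two orthogonal toy EOFs.
eofs = [[1.0, 0.0, -1.0], [1.0, -2.0, 1.0]]
coeffs = ls_coefficients(neurons[0], eofs)

print(labels)
print(counts)
print(coeffs)
```

Each row of `counts` sums to the window length, so stacking the per-neuron series gives a constant total in the manner of Fig. 8.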

Table 1 Inter-neuron distances and the means and standard deviations of intra-group distances (mm/day)

To validate the El Niño classifications, we computed inter-neuron distances (Table 1), defined as the Euclidean distance between each pair of neurons, and the mean and standard deviation of intra-group distances. Intra-group distances refer to the distances between all training samples and the neuron they are allocated to. The average and standard deviation of the intra-group distances serve as a measure of how closely the training samples cluster around the neuron (though note that the distances cannot simply be averaged or summed to represent distances across multiple groups because the distances will be based on different directions in the high dimensional space).
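The two distance measures defined above can be sketched as follows. This is a minimal illustration on toy two-dimensional vectors, not the calculation behind Table 1: the function names and data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not specify the estimator.

```python
import math
import statistics

def inter_neuron_distances(neurons):
    """Symmetric matrix of Euclidean distances between all neuron pairs."""
    n = len(neurons)
    return [[math.dist(neurons[i], neurons[j]) for j in range(n)]
            for i in range(n)]

def intra_group_stats(samples, neurons):
    """Mean and standard deviation of sample-to-neuron distances, per neuron.

    Each sample is allocated to its closest neuron. Neurons that receive
    no samples are omitted from the result.
    """
    groups = {k: [] for k in range(len(neurons))}
    for s in samples:
        dists = [math.dist(s, n) for n in neurons]
        k = dists.index(min(dists))
        groups[k].append(dists[k])
    return {k: (statistics.mean(d), statistics.pstdev(d))
            for k, d in groups.items() if d}

# Toy data (hypothetical): 2 neurons and 4 training samples.
neurons = [[0.0, 0.0], [3.0, 4.0]]
samples = [[0.0, 1.0], [0.0, -3.0], [3.0, 4.0], [3.0, 6.0]]

inter = inter_neuron_distances(neurons)
intra = intra_group_stats(samples, neurons)

print(inter)
print(intra)
```

Comparing an inter-neuron distance with the intra-group spread of the two neurons involved gives a rough sense of whether their training samples overlap.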

Table 2 Correlation matrix between the 5 SOM neurons

As shown in Table 1, the extreme El Niño neuron (N1) shows, in general, increasingly larger distances from the moderate El Niño (97.6, N2), neutral (112.8, N3), weak La Niña (105.2, N4) and strong La Niña (120.6, N5) neurons. The separation between the extreme and moderate El Niño neurons (97.6) is larger than the direct distance from moderate El Niño to strong La Niña (81.2 from N2 to N5). Table 2 shows the pattern correlations between the neurons, thus removing the effect of magnitude on the inter-neuron distances. The moderate El Niño neuron (N2) has a much better (but opposite-signed) pattern match with the La Niña neurons (N4 and N5) than with the extreme El Niño neuron (N1). Therefore the distinction between extreme and moderate El Niños suggested by the SOM analysis is justified. On the other hand, the distance between the moderate El Niño and neutral neurons (46.1 from N2 to N3) is much smaller, which is consistent with the relatively clustered data distribution in EOF #1, #2 space (Fig. 3).
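The reason for examining pattern correlations alongside distances can be seen in a toy example. The sketch below uses hypothetical four-point patterns, not the neurons of Fig. 7, and assumes a centred, unweighted spatial correlation; the definition used for Table 2 may differ (for example, through area weighting).

```python
import math

def pattern_correlation(a, b):
    """Centred, unweighted spatial correlation of two flattened fields."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

# Hypothetical reference pattern and two variants of it.
neuron_a = [2.0, -1.0, 0.5, -1.5]
neuron_b = [3 * v for v in neuron_a]  # same pattern, triple amplitude
neuron_c = [-v for v in neuron_a]     # same amplitude, opposite sign

dist_ab = math.dist(neuron_a, neuron_b)
dist_ac = math.dist(neuron_a, neuron_c)
corr_ab = pattern_correlation(neuron_a, neuron_b)
corr_ac = pattern_correlation(neuron_a, neuron_c)

print(f"a-b: distance {dist_ab:.3f}, correlation {corr_ab:+.2f}")
print(f"a-c: distance {dist_ac:.3f}, correlation {corr_ac:+.2f}")
```

Here the two pairs are equally far apart in Euclidean distance, yet one pair shares the same pattern and the other has the opposite pattern, which is the kind of ambiguity that a correlation matrix resolves.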

4 Conclusions and discussion

We have used EOF and SOM analyses to characterize the spatial patterns of inter-annual variability in the atmospheric moisture divergence over the tropical Pacific, a key component of the hydrological cycle that is linked directly to anomalies in the surface water balance (E\(-\)P). This variability is of course dominated by ENSO influences, with the moisture divergence shifting eastwards to follow the eastward shift of the warmest equatorial SST during moderate El Niños, accompanied by an equatorward rotation of the SPCZ. The moisture divergence anomalies associated with La Niña events have spatial patterns and magnitudes similar to those of moderate El Niños, but with opposite sign. Our analysis finds, however, that the moisture divergence patterns during extreme El Niño events are not simply a strengthening of the moderate El Niño pattern but exhibit distinct characteristics: the tropical convergence centre moves much further east, the NH Hadley Cell is more compact and the SPCZ swings further towards the equator. These differences from moderate El Niño behaviour are particularly apparent from the peak of the event through the decaying phase, which is consistent with previous studies using other climate variables (Kug et al. 2009; Xu and Chan 2001).

This complex behaviour is evident in the EOF results, with a clear non-linear relationship found between the leading two PC time-series even though they are constrained by EOF analysis to have no linear dependence. This motivated our use of the SOM technique, which is not constrained by the spatial and temporal orthogonality requirements of EOF decomposition. The SOM analysis simplifies the non-linear relationship between two EOF patterns into a simple sequence of five patterns (SOM neurons) representing the range of states from La Niña to extreme El Niño. SOM neuron count time-series and inter-neuron distance/correlation statistics further validate the classification of extreme and moderate El Niños.

Our findings have a number of implications. First, a single index such as Niño 3.4 is insufficient to measure the range of atmospheric moisture divergence responses to ENSO, consistent with prior findings for other variables (Trenberth and Stepaniak 2001; Trenberth and Smith 2006; Chiodi and Harrison 2010; Kao and Yu 2009). An index is required to represent the SST zonal contrast that distinguishes different types of El Niño and that is likely to be the key factor causing differences in moisture divergence patterns. Our results suggest that alternatives to the conventional EOF method that are free from orthogonality constraints, such as SOM, deserve more attention when determining additional ENSO indices.

Second, analyses of ENSO behaviour need to consider more ENSO classes than the basic La Niña, neutral and El Niño classification. Our analysis of atmospheric moisture divergence demonstrates that this distinction is present in the atmospheric branch of the hydrological cycle too, providing a new perspective on the existing literature, and confirms the coupled ocean-atmosphere signature of this ENSO difference that is not necessarily implied by the SST-based analyses. The consistency with SST-based studies is not a coincidence. The sensitivity of ocean temperature and atmospheric convection is reversed between the central and eastern Pacific: central Pacific SSTAs are much more effective at inducing anomalous convection than their eastern counterparts, due to the warmer background SSTs (Kug et al. 2009; Hoerling et al. 1997; Capotondi et al. 2014), while subsurface temperature below the mixed layer has a stronger response to thermocline changes over the eastern Pacific (Capotondi et al. 2014). Therefore, once warm SST anomalies develop over the eastern Pacific or are advected from the west in an extreme El Niño, possibly modulated by the seasonality of Kelvin wave propagation (Harrison and Schopf 1984) or by favourable timing of the Australian and Asian monsoons (Xu and Chan 2001), the induced thermocline feedback could trigger large magnitudes of deep convection over the eastern Pacific, as manifested by OLR troughs (Chiodi and Harrison 2010) and by the moisture divergence changes presented in this study for extreme El Niño (e.g. the first SOM neuron, Fig. 7a).

Similar concerns relate to the use of EOF analyses to classify ENSO behaviour. Due to EOF orthogonality constraints, the pattern of variation covering La Niña to moderate El Niño events is mostly captured by EOF#1 but is also partly represented in EOF#2, which in turn partly represents the contrasting moisture divergence response to moderate and extreme El Niños. Classifications need to consider this complexity and ideally use methods, such as the SOM presented here, that can represent these behaviours as separate patterns rather than in the mixed form of the EOF analysis.

Third, the observed non-linear response highlights the need for a coupled Hadley-Walker cell view in explaining the different El Niño types. Although commonly interpreted as a meridional circulation cell, the Hadley cell is not zonally symmetric, but rather a 3D helix circulation where the zonal asymmetry is modulated by the Walker circulation. In neutral ENSO conditions, the warm pool low and the subtropical highs to the east form a triangular shape (Fig. 5a, see also Fig. 1 in Zhang and Song (2006)). In the mature phase of an extreme El Niño, strong eastern warming weakens or even reverses the Walker circulation, and compresses the equatorial-low-subtropical-high polarity (Fig. 6d); the pitch distance of the 3D Hadley-Walker helix circulation is reduced. As a result, the dry air intrusion from the subtropics becomes more effective, due to both a tighter pressure gradient and reduced opportunity for evaporation to replenish the moisture because of the shorter travel distance. The reduced trade winds and evaporation also play a role (Su and Neelin 2002). As warming is more confined to the western-central Pacific in a moderate El Niño, the modulation of the Walker circulation is not strong enough to reverse the equatorial-low-subtropical-high polarity.

Finally, we note limitations to this study. The limited time span of ERA-I data allows only a small sample of seven El Niño events to be included. Of the three extreme El Niños, two coincided with major volcanic eruptions (the March 1982 El Chichon and the June 1991 Mt. Pinatubo), and we did not address the possible role that volcanic forcing may play in tropical moisture divergence. Moreover, the Pacific exhibits distinct decadal (PDO, Pacific Decadal Oscillation) to inter-decadal (IPO, Inter-decadal Pacific Oscillation) variations, with largely consistent manifestations in SST, sea level pressure, wind stress, thermocline evolution, Hadley circulation and ENSO variability (Power et al. 1999; Mantua et al. 1997; Folland and Renwick 2002; Wang and Fiedler 2006; Quan et al. 2004; Trenberth and Stepaniak 2001). The change in PDO/IPO phase around 1976/1977 has been identified as a major “climate shift” (Trenberth 1990; Trenberth and Stepaniak 2001), after which El Niño activity increased and the structure of the SPCZ changed (Folland and Renwick 2002), possibly caused by the altered zonal SST structure (van der Wiel et al. 2015). Therefore, the validity of the results presented here might be limited to positive PDO/IPO epochs. Further investigation with earlier datasets is needed to determine whether they hold in La Niña dominated periods.