1 Introduction

The El Niño Southern Oscillation (ENSO) is an important air-sea interaction mode in the tropical Pacific region that has attracted extensive attention in the past few decades. ENSO events can be classified as Eastern-Pacific (EP) ENSO or Central-Pacific (CP) ENSO according to the location of the maximum sea surface temperature anomaly (SSTA). The EP ENSO is also referred to as the canonical ENSO or Cold Tongue ENSO (Kug et al. 2009; Yeh et al. 2009), with the center of maximum SSTAs located in the eastern Pacific. The CP ENSO is also known as the “ENSO Modoki” (Ashok et al. 2007) or Warm Pool ENSO (Kug et al. 2009; Yeh et al. 2009), and has the largest SSTA variability in the central Pacific (Ashok et al. 2007; Kao and Yu 2009; Kug et al. 2009; Yu and Kao 2007; Yu et al. 2010; Zheng et al. 2014). Compared with the canonical ENSO, ENSO Modoki can generate different teleconnections (Taschetto and England 2009; Zhang et al. 2015), and has different effects on precipitation (Ashok et al. 2009; Feng and Li 2011; Feng et al. 2016a; Jiang et al. 2019; Zhang et al. 2013, 2014), the Hadley circulation (Feng and Li 2013), the stratosphere (Xie et al. 2012, 2014a, b, c), aerosol concentrations (Feng et al. 2016b, 2017), and tropical cyclone activity (Kim et al. 2009; Wang et al. 2013; Magee et al. 2017). As ENSO has a worldwide climatic effect, ENSO prediction provides a source of predictability for short-range climate prediction. Given the clearly different climatic effects of the two flavors of ENSO events, it is necessary to comprehensively explore the predictability of both.

The study of ENSO predictability involves two aspects: actual prediction skill and potential predictability. The former focuses on the accuracy of the model in predicting ENSO events against observations, whereas the latter quantitatively measures the upper limit of ENSO prediction skill under a perfect model assumption (Tang et al. 2018). It should be noted that no model is perfect in reality. Therefore, the measured potential predictability is partly model dependent. Nevertheless, a useful strategy to explore ENSO predictability is to conduct retrospective forecast experiments using numerical models with varying degrees of complexity. For the canonical ENSO events, many retrospective forecasts with complicated coupled general circulation models (CGCMs) have covered only 20–60 years (Luo et al. 2008; Qiao et al. 2013; Kirtman et al. 2014; MacLachlan et al. 2015; Huang et al. 2017; Takaya et al. 2017; Zhu et al. 2017; Zhang et al. 2018; Barnston et al. 2019; Ding et al. 2019; Johnson et al. 2019; Lin et al. 2020). Such relatively short retrospective forecast periods contain only a few ENSO cycles, which is inadequate for a statistically robust understanding of ENSO predictability, particularly with respect to its interdecadal variation. The existing long-term ENSO retrospective predictions have been mainly conducted with intermediate (Chen et al. 2004; Zheng et al. 2009; Cheng et al. 2010; Liu et al. 2019; Gao et al. 2020) and hybrid coupled models (Tang et al. 2008a; Deng and Tang 2009; Tang and Deng 2011). These efforts revealed the interdecadal variation of the actual prediction skill for the canonical ENSO and the possible reasons accounting for this phenomenon.

However, the predictability of the two flavors of ENSO in long-period retrospective forecasts has received limited attention. Based on 30–60 year hindcasts using various models, previous studies have reached conflicting conclusions. Some studies have reported that the CP ENSO is more predictable than the EP ENSO due to its high persistence and low “Spring Predictability Barrier” (SPB) (Kim et al. 2009; Yang and Jiang 2014). Other studies have found that the models performed better in predicting the EP ENSO than the CP ENSO (Hendon et al. 2009; Jeong et al. 2012; Imada et al. 2015; Lee et al. 2018; Zheng and Yu 2017). Such divergent results may be due to the limited hindcast durations. In addition, the previous studies focused only on the actual prediction skill, and the variation of the potential predictability for both the EP ENSO and CP ENSO remains unclear. Yao et al. (2019) reported that the Community Earth System Model (CESM) can simulate the two flavors of ENSO fairly well. Therefore, we conducted an ensemble long-term retrospective forecast with the CESM from 1881 to 2017. This study focuses on the following issues: (1) how well does the CESM perform in terms of actual skill for the two flavors of ENSO; (2) what are the features of the potential predictability for the two flavors of ENSO; (3) do the actual skill and potential predictability also undergo seasonal and interdecadal variations for the two flavors of ENSO, particularly for the CP ENSO? If so, which factor dominates the variation in the predictability for the two flavors of ENSO? These essential questions have not been well explored in previous studies. This paper is organized as follows. The model, ensemble construction and evaluation metrics are described in Sect. 2. The characteristics of the actual skill of the two flavors of ENSO are presented in Sect. 3. In Sect. 4, we explore the possible reason accounting for the variation in the predictability of the two flavors of ENSO. Finally, the conclusions and discussion are summarized in Sect. 5.

2 Model and methodology

2.1 Model information

In this study, we used the CESM version 1.2.1, which includes atmosphere, ocean, land, land-ice, and sea-ice components. It is one of the most popular fully coupled CGCMs and has been involved in many seasonal forecast studies (Hurrell et al. 2013; Bellenger et al. 2014; Hu and Duan 2016; Bellomo et al. 2018; Hu et al. 2019; Xu et al. 2021). The atmospheric component is the Community Atmosphere Model version 4 (CAM4; Neale et al. 2013; horizontal resolution 0.9° × 1.25° with a 26-layer hybrid sigma-pressure vertical coordinate). The oceanic component is the Parallel Ocean Program ocean model version 2 (POP 2; Smith et al. 2010; horizontal resolution 1.1° × 0.54–1° with 60 layers in the vertical direction). It has been reported that the CESM can reasonably portray the characteristics of the two flavors of ENSO with the above model configuration (Yao et al. 2019). It is also employed as the operational model of the National Marine Environmental Forecasting Center in China (Li et al. 2015a, b; Zhang et al. 2018, 2019), and we modified its nudging scheme by adjusting the nudging weight of the subsurface ocean temperature at depths above 500 m and adding wind data assimilation below 500 hPa to improve the simulation and prediction skill for ENSO (Song et al. 2021).

Before 1983, the monthly mean upper ocean temperature and 6-hourly mean wind variables were extracted from the monthly Simple Ocean Data Assimilation version 2.2.4 (SODA 2.2.4; Carton and Giese 2008) and European Centre for Medium–Range Weather Forecasts (ECMWF) twentieth century reanalysis (ERA-20C; Stickler et al. 2014) to initialize the retrospective forecast. We then used the Global Ocean Data Assimilation System (GODAS; Behringer and Xue 2004) and ECMWF interim reanalysis (ERA-interim; Berrisford et al. 2011) as the oceanic and wind assimilation data from 1983 to 2017. The validation SST data were extracted from the SODA and GODAS datasets before and after 1983, respectively. To eliminate the effect of the low-frequency variation, the anomalies in this study were identified by subtracting the corresponding climatology of the running 20-year window both for the forecast and observational data as in Deng and Tang (2009).
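As a rough illustration of the anomaly definition above, the following sketch removes a running 20-year monthly climatology from a monthly series. The function name `running_anomaly`, the centring of the window on each year, and the treatment of the ends of the record (the window is shifted inward so that it always spans 20 years) are assumptions made here for illustration; the text does not specify these details.

```python
def running_anomaly(monthly, window_years=20):
    """Remove a running-window monthly climatology from a monthly series.

    monthly: sequence of length n_years * 12, January first.
    Assumption (not from the paper): the window is centred on each year
    and shifted inward near the ends so it always spans `window_years`.
    """
    if len(monthly) % 12 != 0:
        raise ValueError("expected whole years of monthly data")
    n_years = len(monthly) // 12
    if n_years < window_years:
        raise ValueError("series shorter than the climatology window")
    anomalies = []
    for year in range(n_years):
        # First year of the climatology window used for this year.
        start = min(max(year - window_years // 2, 0), n_years - window_years)
        for month in range(12):
            # Climatology: mean of this calendar month over the window.
            values = [monthly[(start + k) * 12 + month] for k in range(window_years)]
            anomalies.append(monthly[year * 12 + month] - sum(values) / window_years)
    return anomalies
```

Applying the same function to both the forecast and the validation series keeps the two anomaly definitions consistent, which is the point of using a common running climatology.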

2.2 Methodology

2.2.1 Ensemble construction strategy

To perturb the initial conditions, we employ the climatically relevant singular vector (CSV, Kleeman et al. 2003; Tang et al. 2006) method to obtain the singular vectors of the sea temperature above 200 m, which can consider the influence of the uncertainties of the upper sea temperature on SST prediction. The CSV method can capture the climatically relevant optimal error growth like the traditional SV but avoids using tangent linear and adjoint models. The CSV has also been used in various CGCMs to investigate climate predictability (Hawkins and Sutton 2011; Islam et al. 2016; Li et al. 2020). More details of the CSV algorithm may be found in the relevant literature (Kleeman et al. 2003; Tang et al. 2006). In this study, a random linear combination of the first three CSVs is employed to perturb the initial conditions and form 20 ensemble members. The ensemble retrospective forecast starts on January 1st, April 1st, July 1st and October 1st each calendar year from 1881 to 2017, with 12 month integrations for the initial conditions.
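The ensemble construction step can be sketched as follows, assuming the first three CSVs are already available as flattened vectors. The Gaussian weights, the `amplitude` scaling, and the name `make_ensemble` are hypothetical choices for illustration only; the text does not state how the random weights or the perturbation magnitude were chosen, and the CSV computation itself is not shown.

```python
import random

def make_ensemble(initial_state, csvs, n_members=20, amplitude=0.1, seed=0):
    """Perturb an initial state with random linear combinations of CSVs.

    initial_state: list of floats (e.g. flattened upper-ocean temperature).
    csvs: list of singular vectors, each the same length as initial_state.
    Assumptions: Gaussian weights and a fixed `amplitude`; neither is
    specified in the paper.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # One random weight per singular vector.
        weights = [rng.gauss(0.0, 1.0) for _ in csvs]
        member = [
            x + amplitude * sum(w * vec[j] for w, vec in zip(weights, csvs))
            for j, x in enumerate(initial_state)
        ]
        members.append(member)
    return members
```

Because every perturbation is a combination of the supplied vectors, the ensemble spread is confined to the subspace they span, which is the intent of perturbing along the fastest-growing climatically relevant directions.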

2.2.2 Predictability skill measures

In the current study, we employ the anomaly correlation coefficient (ACC) to measure the deterministic prediction skill, defined as:

$$ACC\left( t \right) = \frac{{\sum\limits_{i = 1}^{M} {\left[ {a_{i}^{o} \left( t \right) - \overline{{a^{o} }} \left( t \right)} \right]\left[ {a_{i}^{p} \left( t \right) - \overline{{a^{p} }} \left( t \right)} \right]} }}{{\sqrt {\sum\limits_{i = 1}^{M} {\left[ {a_{i}^{o} \left( t \right) - \overline{{a^{o} }} \left( t \right)} \right]^{2} } } \sqrt {\sum\limits_{i = 1}^{M} {\left[ {a_{i}^{p} \left( t \right) - \overline{{a^{p} }} \left( t \right)} \right]^{2} } } }}$$
(1)

The contribution of the prediction of the \(i\)th initial condition to the ACC (denoted as C) can be expressed as (Tang et al. 2008c):

$$C_{i} (t) = \frac{\frac{1}{M}\left[ a_{i}^{p} (t) \right]\left[ a_{i}^{o} (t) \right]}{ACC(t)} \times 100\%$$
(2)

where \(a_{i}^{p} \left( t \right)\) and \(a_{i}^{o} \left( t \right)\) indicate the ensemble mean prediction and the corresponding observation of the \(i\)th initial condition at the \(t\)th lead time, respectively. The overbar indicates the mean over all initial conditions. \(M\) represents the total number of predictions. The square brackets [] in Eq. (2) denote a standardized variable.
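A minimal sketch of Eqs. (1) and (2) for a single lead time is given below. It assumes that standardization uses the population standard deviation (division by \(M\)), in which case the contributions \(C_i\) sum to 100%; the function names are illustrative, and series with zero variance are not handled.

```python
import math

def standardize(values):
    """Return (x - mean) / std using the population standard deviation."""
    m = len(values)
    mean = sum(values) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / m)
    return [(x - mean) / std for x in values]

def acc(obs, pred):
    """Anomaly correlation coefficient of Eq. (1) at one lead time."""
    zo, zp = standardize(obs), standardize(pred)
    return sum(o * p for o, p in zip(zo, zp)) / len(obs)

def contributions(obs, pred):
    """Percentage contribution C_i of each prediction to the ACC, Eq. (2)."""
    zo, zp = standardize(obs), standardize(pred)
    r = acc(obs, pred)
    return [100.0 * o * p / (len(obs) * r) for o, p in zip(zo, zp)]
```

A prediction whose standardized anomaly has the opposite sign to the observed one yields a negative \(C_i\), that is, it lowers the overall ACC.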

To measure the potential predictability, we used an information-based measure (Tang et al. 2008b) and a variance-based metric (Kumar et al. 2016). Unlike the actual prediction skill metric, both types of measures avoid using observations. For a Gaussian variable \(v\), which is usually the case for most seasonal variables (including ENSO prediction), the climatological variance \(\sigma_{v}^{2}\) can be decomposed into a sum of noise variance (ensemble prediction variance) and signal (ensemble mean) variance (DelSole and Tippett 2007; Tippett et al. 2010):

$$\sigma_{v}^{2} { = }\left\langle {\sigma_{v|i}^{2} } \right\rangle { + }\left\langle {\mu_{v|i}^{2} } \right\rangle$$
(3)

The information-based measure mutual information (MI) and variance-based metric signal to total variance ratio (STR) are defined as (DelSole and Tippett 2007; Tang et al. 2013):

$$STR = \frac{Var(Signal)}{{Var(Signal) + Var(Noise)}}{ = }\frac{{\left\langle {\mu_{v|i}^{2} } \right\rangle }}{{\left\langle {\mu_{v|i}^{2} } \right\rangle { + }\left\langle {\sigma_{v|i}^{2} } \right\rangle }}$$
(4)
$$MI = \iint p\left( v,i \right)\ln \frac{p\left( v,i \right)}{p\left( v \right)p\left( i \right)}\,dv\,di = \frac{1}{2}\left( \ln \sigma_{v}^{2} - \left\langle \ln \sigma_{v|i}^{2} \right\rangle \right)$$
(5)

where the symbol \(\left\langle \cdots \right\rangle\) denotes the mean over all initial conditions. \(p\left( v \right)\), \(p\left( i \right)\) and \(p\left( {v,i} \right)\) indicate the climatological distribution, probability distribution for the initial condition \(i\) and forecast distribution of the \(i\)th initial condition, respectively. \(\mu_{v|i}\) and \(\sigma_{v|i}^{2}\) denote the ensemble mean and ensemble prediction variance of the \(i\)th initial condition, respectively. In the perfect model framework, the potential correlation (denoted as R) is the ACC between the prediction (ensemble mean) and a “perfect observation” (an arbitrarily chosen ensemble member). R has the following theoretical relationship with the STR (Kumar 2009; Tang et al. 2013):

$$R_{STR} = \sqrt {STR}$$
(6)

Given that the arithmetic mean is larger than or equal to the geometric mean, then we can derive from Eqs. (3)–(5) (Yang et al. 2012; Tang et al. 2013):

$$MI \ge \frac{1}{2}\left( {\ln \sigma_{v}^{2} - \ln \left\langle {\sigma_{v|i}^{2} } \right\rangle } \right) = - \frac{1}{2}\ln \left( {\frac{{\left\langle {\sigma_{v|i}^{2} } \right\rangle }}{{\sigma_{v}^{2} }}} \right) = - \frac{1}{2}\ln \left( {1 - \ {STR} } \right)$$
(7)

The transformation \(\sqrt {1 - e^{ - 2MI} }\) also exhibits proper limiting behavior of the “potential” skill scores (referred to as \(R_{MI}\); Joe 1989; DelSole 2005), with a maximum value (1) when MI approaches infinity, and minimum value (0) when MI vanishes. Therefore, it can be deduced that:

$$R_{MI} = \sqrt {1 - e^{ - 2MI} } \ge \sqrt {STR} = R_{STR}$$
(8)

When the prediction variance \(\sigma_{v|i}^{2}\) is the same for all initial conditions, the equalities in Eqs. (7)–(8) hold.
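The relationship between the two potential correlation measures can be checked numerically with a small sketch of Eqs. (3)–(8). It assumes that the climatological variance is obtained from the ensemble itself through Eq. (3), that ensemble variances are population variances, and that no ensemble has zero spread; the function name and the toy numbers are illustrative only.

```python
import math

def potential_predictability(forecasts):
    """Compute STR, MI and the potential correlations from ensemble forecasts.

    forecasts: list with one entry per initial condition, each entry a
    list of ensemble-member values at a fixed lead time.
    """
    n = len(forecasts)
    means = [sum(f) / len(f) for f in forecasts]
    variances = [
        sum((x - m) ** 2 for x in f) / len(f) for f, m in zip(forecasts, means)
    ]
    clim_mean = sum(means) / n
    signal = sum((m - clim_mean) ** 2 for m in means) / n  # signal variance
    noise = sum(variances) / n                             # noise variance
    clim_var = signal + noise                              # Eq. (3)
    str_value = signal / clim_var                          # Eq. (4)
    mean_log_var = sum(math.log(v) for v in variances) / n
    mi = 0.5 * (math.log(clim_var) - mean_log_var)         # Eq. (5)
    return {
        "STR": str_value,
        "MI": mi,
        "R_STR": math.sqrt(str_value),                     # Eq. (6)
        "R_MI": math.sqrt(1.0 - math.exp(-2.0 * mi)),      # Eq. (8)
    }

# Toy ensembles: three initial conditions with different spreads.
varying = [[1.0, 1.2, 0.8], [-1.0, -0.5, -1.5], [0.2, 0.1, 0.3]]
# Same ensemble means, but identical spread for every initial condition.
constant = [[1.0, 1.2, 0.8], [-1.0, -0.8, -1.2], [0.2, 0.4, 0.0]]
res_varying = potential_predictability(varying)
res_constant = potential_predictability(constant)
```

With unequal spreads `R_MI` exceeds `R_STR`, and with identical spreads the two coincide up to rounding, as stated for Eqs. (7)–(8).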

Relative entropy (RE) is another information-based measure to quantify the potential predictability of each prediction, and the mean RE for all individual predictions is equal to MI (DelSole 2004; DelSole and Tippett 2007; Yang et al. 2012; Tang et al. 2013), which can be defined as:

$$RE(i) = \frac{1}{2}\left[ \frac{\left( \mu_{v|i} - \mu_{v} \right)^{2}}{\sigma_{v}^{2}} + \ln \left( \frac{\sigma_{v}^{2}}{\sigma_{v|i}^{2}} \right) + \frac{\sigma_{v|i}^{2}}{\sigma_{v}^{2}} - 1 \right]$$
(9)

where \(\sigma_{v\left| i \right.}^{2}\) and \(\sigma_{v}^{2}\) are the ensemble and climatological variances, respectively, and \(\mu_{v\left| i \right.}\) and \(\mu_{v}\) are the ensemble and climatological means, respectively. The RE represents the extra information provided by the difference between the climatological distribution and the ensemble predicted distribution (Cover and Thomas 1991), and can be decomposed into the signal component (SC; the first term) and the dispersion component (DC; the last three terms). The SC indicates the additional information arising from the difference between the climatological and prediction means, while the DC represents the reduction in climatological uncertainty achieved by the prediction.
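The decomposition of Eq. (9) into SC and DC, and the statement that the mean RE equals MI, can be illustrated with the sketch below. The ensemble means and variances are made-up toy values, and the climatological mean and variance are taken from them via Eq. (3), which is the assumption under which the identity holds exactly.

```python
import math

def relative_entropy(ens_mean, ens_var, clim_mean, clim_var):
    """Return (RE, SC, DC) of Eq. (9) for one Gaussian ensemble forecast."""
    # Signal component: shift of the forecast mean from climatology.
    sc = 0.5 * (ens_mean - clim_mean) ** 2 / clim_var
    # Dispersion component: change of spread relative to climatology.
    dc = 0.5 * (math.log(clim_var / ens_var) + ens_var / clim_var - 1.0)
    return sc + dc, sc, dc

# Toy ensemble statistics for four initial conditions (illustrative values).
means = [1.0, -1.0, 0.2, -0.2]
variances = [0.05, 0.20, 0.10, 0.15]
n = len(means)
clim_mean = sum(means) / n
signal = sum((m - clim_mean) ** 2 for m in means) / n
clim_var = signal + sum(variances) / n  # Eq. (3)

results = [relative_entropy(m, v, clim_mean, clim_var) for m, v in zip(means, variances)]
mean_re = sum(r[0] for r in results) / n
mi = 0.5 * (math.log(clim_var) - sum(math.log(v) for v in variances) / n)  # Eq. (5)
```

In this toy case the forecasts with the largest mean shifts have the largest SC, while the DC depends only on how much narrower each ensemble is than the climatological distribution.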

Following the previous studies (Hendon et al. 2009; Jeong et al. 2012; Yang and Jiang 2014; Ren et al. 2016; Lee et al. 2018; Zheng and Yu 2017), we employed the Niño 3 index to depict the variability of the canonical ENSO or EP ENSO events, which is defined as the averaged SSTAs from 5° S–5° N and 90°–150° W. For the CP ENSO events, we adopt the ENSO Modoki index (EMI) defined by Ashok et al. (2007) as:

$$EMI = SSTA_{A} - 0.5 \times (SSTA_{B} + SSTA_{C} )$$
(10)

Here \(SSTA_{A}\), \(SSTA_{B}\) and \(SSTA_{C}\) represent the averaged SSTAs over region A (10° S–10° N, 165° E–140° W), B (15° S–5° N, 110° W–70° W), and C (10° S–20° N, 125° E–145° E), respectively. These two indexes (with a correlation coefficient of − 0.34) in the CESM can basically represent the slightly negative correlation between the observational Niño 3 index and EMI (with a correlation coefficient of − 0.33).
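The two indices can be computed from a gridded SSTA field as sketched below. The box limits follow the definitions above, converted to degrees east; the cosine-latitude area weighting, the inclusive box edges, and the function names are assumptions for illustration, since the text only states that the SSTAs are averaged over each region.

```python
import math

# Region bounds as (lat_min, lat_max, lon_min, lon_max), longitudes in degrees east.
NINO3 = (-5, 5, 210, 270)    # 5S-5N, 150W-90W
BOX_A = (-10, 10, 165, 220)  # 10S-10N, 165E-140W
BOX_B = (-15, 5, 250, 290)   # 15S-5N, 110W-70W
BOX_C = (-10, 20, 125, 145)  # 10S-20N, 125E-145E

def box_mean(ssta, lats, lons, box):
    """Cosine-latitude weighted mean of ssta[j][i] over a lat/lon box.

    Assumption: area weighting by cos(latitude) and inclusive box edges.
    """
    lat_min, lat_max, lon_min, lon_max = box
    total = 0.0
    weight_sum = 0.0
    for j, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        w = math.cos(math.radians(lat))
        for i, lon in enumerate(lons):
            if lon_min <= lon <= lon_max:
                total += w * ssta[j][i]
                weight_sum += w
    return total / weight_sum

def enso_indices(ssta, lats, lons):
    """Return (Niño 3 index, EMI) for one SSTA field, with EMI as in Eq. (10)."""
    nino3 = box_mean(ssta, lats, lons, NINO3)
    emi = box_mean(ssta, lats, lons, BOX_A) - 0.5 * (
        box_mean(ssta, lats, lons, BOX_B) + box_mean(ssta, lats, lons, BOX_C)
    )
    return nino3, emi
```

A spatially uniform anomaly gives an EMI of zero by construction, whereas warming confined to the central Pacific gives a positive EMI and leaves the Niño 3 index near zero.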

3 Deterministic prediction skill of the two flavors of ENSO

Figure 1 presents the ACC (solid lines) and persistence (dashed lines) skills of the forecast ensemble mean Niño 3 index (blue lines) and EMI (red lines) from 1881 to 2017 as a function of lead time. As expected, the ACC skills for the two flavors of ENSO drop with increasing lead time. Our ensemble system has considerably high prediction skill for both the EP ENSO and CP ENSO. The effective predictive skill (ACC > 0.5) extends for 7 and 6 months for the EP ENSO and CP ENSO, respectively. The ACC skill of the Niño 3 index is higher than that of the EMI at almost all lead times. This indicates that the EP ENSO is more predictable than the CP ENSO in the CESM, which is consistent with the results based on an intermediate model (Zheng and Yu 2017). For the EP ENSO, the model forecast is clearly superior to the persistence forecast. For the CP ENSO, the model forecast is more skillful than the persistence forecast at lead times longer than 3 months, but is worse than persistence at lead times of 1, 2 and 12 months. A possible reason is that the CP ENSO is more sensitive to the initial shock and therefore has a lower prediction skill than persistence skill at the beginning of the forecast. In addition, model bias may also affect the prediction for the two flavors of ENSO. Figure 2a presents the difference of the climatic SST between the control run of the CESM and the observations. There is a notable warm SST bias in the tropical Pacific Ocean, which indicates a weaker west–east SST gradient along the equator and results in a weaker Niño 3 index and EMI than in the corresponding observations. The difference in the variances between the simulated and observational Niño 3 index (EMI) is − 0.21 (− 0.22). The warm SST bias also contributes to the smaller predicted variances of the Niño 3 index (Fig. 2b) and EMI (Fig. 2c) as compared with the observations for all lead times, especially for long lead times and the CP ENSO. That the CP ENSO suffers from more model bias may also be reflected by its higher persistence skill than prediction skill at long lead times, which also indicates that there is more room for improving the prediction of the CP ENSO with the CESM. More interestingly, the persistence skill of the EMI is higher than that of the Niño 3 index after a lead of 4 months. These results are consistent with those of Yang and Jiang (2014) based on version 2 of the National Centers for Environmental Prediction Climate Forecast System, which showed that the EMI is more persistent than the Niño 3 index.

Fig. 1

ACC of the Niño 3 index (blue solid line) and EMI (red solid line) plotted against the observations. The blue and red dashed lines indicate the corresponding persistence skill, respectively

Fig. 2

a Difference between the climatic SST for the control run and observations. The ratio of the predicted b Niño 3 index and c EMI to the corresponding observations as a function of lead time

Figure 3 presents the ACC skill for the Niño 3 index (Fig. 3a) and EMI (Fig. 3b) as a function of the target month for different initial conditions, in order to examine the seasonal variations of the prediction skills for the two flavors of ENSO. There are pronounced SPB phenomena for both the EP ENSO and CP ENSO, with an obvious decay in the prediction skill across the spring season for all start months. The predictions started in January and October exhibit a more marked decline in their skills than the skill of the predictions initiated in April and July for the two flavors of ENSO. However, the SPB is more obvious for the CP ENSO than the EP ENSO. In detail, the effective predictive skill (ACC > 0.5) is 6 months for the EP ENSO and 5 months for the CP ENSO for predictions starting in January. When the predictions start in April, the effective prediction skills for the EP ENSO and CP ENSO are 11 and 10 months, respectively.

Fig. 3

ACC of the a Niño 3 index and b EMI as compared with the observations for different start months. The dashed green lines indicate the boreal spring

Previous studies have shown that the predictability of the EP ENSO has undergone a significant interdecadal variation in intermediate (Chen et al. 2004; Zheng et al. 2009; Cheng et al. 2010) and hybrid coupled models (Tang et al. 2008a; Deng and Tang 2009; Tang and Deng 2011). This is also the case for the CGCM ensemble prediction system, as shown in Fig. 4. Prediction skill is relatively high from 1961 to 2017 and lower before 1960 for both flavors of ENSO. These periods of high and low prediction skills for the EP ENSO are generally consistent with previous results from the intermediate and hybrid coupled models. However, the periods of the highest and lowest prediction skills are different for the two flavors of ENSO. Specifically, the highest (lowest) prediction skills occur during 1981–2000 (1921–1940) for the EP ENSO, but during 2001–2017 (1941–1960) for the CP ENSO. Figure 5 presents the ACC skills for the Niño 3 index and EMI (blue lines) with a running window of 20 years, averaged over lead times of 1–12 months. The ACC skill of the EP ENSO is generally higher than that of the CP ENSO, indicating that the EP ENSO is more predictable than the CP ENSO in the CESM ensemble forecast system during the past 137 years. The higher ACC skills occur after 1960 for both flavors of ENSO. However, there are some differences in the variations of ACC skills between the EP ENSO and CP ENSO. The ACC of the EP ENSO decreases from 1890 to 1930 and increases after 1930. In contrast, the ACC of the CP ENSO exhibits decreasing trends in 1881–1930 and 1940–1960, and increasing trends in 1930–1940 and after 1960. Figures 4 and 5 indicate that the prediction skills for the two flavors of ENSO both undergo remarkable interdecadal variations in the CESM, with relatively higher (lower) predictability after (before) the 1960s. Apparently, this cannot be simply explained by the improvement in data quality or the growing number of ocean observations in recent years. For the EP ENSO, there is relatively high skill during 1881–1920 (ACC > 0.5), a period with insufficient observations.

Fig. 4

ACC of the ensemble mean Niño 3 index (a) and EMI (b) as compared with the observations for seven consecutive 20-year periods since 1881

Fig. 5

a ACC (blue line), standardized variance (red line), and variance of the depth of the 20 °C isotherm in the tropical eastern Pacific Ocean (5° S–5° N, 250°–280° E; green line) of the ensemble mean Niño 3 index. b ACC (blue line), standardized variance (red line) of the EMI, and WWV (green line) in the tropical Pacific Ocean. Both plots are averaged from 1 to 12 lead months with a 20-year running window. The labels on the x-axis indicate the middle year of each 20-year window

As reported in previous studies (Chen et al. 2004; Tang et al. 2008a, b, c; Kumar and Chen 2015), one possible reason for the high skill may be the variation of the ENSO signal intensity (red lines in Fig. 5), as these variances mimic the low-frequency variability of the prediction skill for both types of ENSO. For the EP ENSO (Fig. 5a), the variation of its signal strength is significantly correlated with the variation in the depth of the thermocline in the eastern Pacific Ocean (green line, with a correlation coefficient of 0.89), which represents the strength of the thermocline feedback that plays the leading role for the EP ENSO. Strong variations of the thermocline in the eastern Pacific Ocean are associated with a strong thermocline feedback, which results in strong EP ENSO signal variability and high prediction skill. For the CP ENSO (Fig. 5b), the variation of its signal strength is highly negatively correlated with the warm water volume (WWV; volume of water with temperatures > 20 °C from 5° S–5° N and 120°–280° E) in the tropical Pacific Ocean (green line), with a correlation coefficient of − 0.78. The background SST in the tropical Pacific Ocean presents an increasing cold tongue mode due to global warming (Zhang et al. 2010; Li et al. 2015a, b, 2017, 2019, 2021), which is characterized by cooling in the eastern equatorial Pacific and warming elsewhere in the tropical Pacific (Cane et al. 1997; Karnauskas et al. 2009; Compo and Sardeshmukh 2010; Zhang et al. 2010; Solomon and Newman 2012; Funk and Hoell 2015). This corresponds to vigorous upwelling of cold water in the eastern equatorial Pacific and a decrease in the WWV after 1950 (green line in Fig. 5b). This background state change reflects a weakened Bjerknes feedback intensity that suppresses the growth of SSTAs in the eastern equatorial Pacific, which finally leads to more frequent CP ENSO events (Li et al. 2017). The increased occurrence of the CP ENSO indicates a strong variation of its signal, which favors its high prediction skill. In summary, the increased frequency of the two flavors of ENSO events provides strong additional information compared with the climatological prediction, which leads to larger predictability and high prediction skill. We further investigate this in the next section.

4 Potential predictability of the two flavors of ENSO

Potential predictability can reveal the upper limit of the prediction skill and indicates the room for possible improvement of the skill in an ensemble prediction system. Figure 6a presents the information-based (\(R_{MI}\)) and variance-based (\(R_{STR}\)) potential predictability measures as functions of lead time for the Niño 3 index (red line) and EMI (blue line). Both potential predictability metrics decrease with lead time for the two flavors of ENSO, indicating that the predictability declines with lead time for both the EP ENSO and CP ENSO. However, the potential predictability of the EP ENSO is higher than that of the CP ENSO at all lead times, irrespective of the predictability metric. This explains why the EP ENSO is more predictable than the CP ENSO as shown in Fig. 1. Figure 6a shows that the information-based potential predictability measure is higher than the variance-based metric for both flavors of ENSO, especially for long lead times and the CP ENSO. This difference may arise because the latter only measures the linear statistical dependence between the ensemble mean prediction and the perfect observation, whereas the former includes more information than the latter according to Eqs. (7)–(8) (Yang et al. 2012; Tang et al. 2013). Therefore, the variance-based potential predictability somewhat underestimates the intrinsic limit of predictability, especially for the CP ENSO, and the information-based potential predictability is more suitable for evaluating the predictability in an ensemble prediction system. Unless otherwise stated, the potential correlation (R) used hereafter is \(R_{MI}\). Given that the averaged RE for all the individual predictions is equal to MI (DelSole 2004; DelSole and Tippett 2007; Tang et al. 2013) and the RE includes the SC and DC, the potential predictability \(R_{MI}\) can also be derived from these two components. The SC has a greater contribution to the potential predictability than the DC for both flavors of ENSO, particularly for the CP ENSO and long lead times (Fig. 6b).

Fig. 6

a MI- and STR-based potential correlation of the Niño 3 index (red lines) and EMI (blue lines) as compared with the observations. The solid and dashed lines indicate the MI- and STR-based metrics, respectively. b The ratio of SC to DC for the Niño 3 index (red line) and EMI (blue line) as a function of lead time

The discussion in Sect. 3 showed that the actual prediction skills for the two flavors of ENSO have seasonal and interdecadal variations. Two important questions arise from this: how does the potential predictability behave, and what controls this behavior? To address these issues, we calculated another information-based potential predictability measure (RE) for the Niño 3 index and EMI. Unlike R and MI, which quantify the overall potential predictability, the RE measures the potential predictability of each prediction. The averaged RE for all individual predictions is equal to the MI, and is also related to R according to Eq. (8). The RE is derived from two components: SC and DC. There are significant linear relationships between the RE and SC for the two flavors of ENSO (Fig. 7a, b), with correlation coefficients of 0.79 and 0.93 for the EP ENSO and CP ENSO, respectively. In contrast, the relationships between DC and RE are less significant for both the EP ENSO and the CP ENSO (Fig. 7c, d). This indicates that the SC has a greater contribution to the variability of RE than the DC for the two flavors of ENSO. A larger SC usually provides more additional information and corresponds to a higher potential predictability (RE), especially for the CP ENSO.

Fig. 7

Plots of RE versus SC and DC for the a, c Niño 3 index and b, d EMI

Figure 8 presents the relationships between RE and C for all lead times. Most predictions make a small or even a negative contribution to the ACC. Large values of RE typically correspond to large C, whereas C shows no regular pattern at low values of RE. This “triangle” relationship between potential predictability and the deterministic prediction skill depends mainly on the SC contribution. This further indicates that a prediction with a larger SC corresponds to higher potential predictability and makes a larger contribution to the deterministic prediction skill for both flavors of ENSO.

Fig. 8

Plots of the contribution of each ensemble mean prediction to the ACC versus RE, SC, and DC for the a–c Niño 3 index and d–f EMI, respectively

Furthermore, we examined the seasonal variability of the potential correlation (R), SC and DC. There is a marked decline in R (Fig. 9a, d) during the boreal spring for both flavors of ENSO, and the decline is sharper for the CP ENSO than for the EP ENSO. This indicates that there are also SPB phenomena in potential predictability for both flavors of ENSO, and the CP ENSO has a more pronounced SPB than the EP ENSO. This barrier is primarily due to the seasonal variability of the SC (Fig. 9b, e), which declines sharply in the boreal spring for both flavors of ENSO. In contrast, the DC exhibits no obvious seasonal variability (Fig. 9c, f), especially for the CP ENSO. Therefore, the SC controls the SPB in potential predictability and actual prediction skill for both flavors of ENSO.

Fig. 9

MI-based potential correlation, SC, and DC for the a–c Niño 3 index and d–f EMI for different start months. The dashed green lines indicate the boreal spring

In addition, there are also distinct interdecadal variations of the potential predictability for both flavors of ENSO (Figs. 10, 11), and the trends are generally consistent with those of the ACC skills. In general, the potential predictability of the EP ENSO is higher than that of the CP ENSO for all decades. Relatively low predictability occurs from 1890 to 1930 for the EP ENSO (Fig. 10a), but before the 1960s for the CP ENSO (Fig. 11a). The SC variations also determine the interdecadal variations of potential predictability for both flavors of ENSO (Figs. 10b, 11b). The periods with more ENSO events provide a large SC, which results in high predictability and good deterministic prediction skill. Compared with the CP ENSO, the DC has a greater contribution to the potential predictability for the EP ENSO at short lead times. The above discussion indicates that the SC determines the variations of potential predictability and deterministic skill on various time scales, especially for the CP ENSO. As such, the potential predictability is also a suitable indicator of the actual deterministic skill in the CESM, which is consistent with the previous findings in other models (Tang et al. 2008b; Cheng et al. 2011; Xue et al. 2013; Kumar and Chen 2015; Liu et al. 2019).

Fig. 10

Running a RE, b SC, and c DC for the ensemble Niño 3 index with a 20-year running window. The labels on the x-axis indicate the middle year of each 20-year window

Fig. 11

Same as Fig. 10, but for EMI

5 Discussion and conclusions

ENSO is the dominant interannual air–sea interaction climatic mode in the tropical Pacific Ocean. It has a worldwide effect because it modulates the atmospheric circulation. A new type of ENSO, the CP ENSO, has maximum SSTAs in the central Pacific Ocean and is distinctly different from the canonical ENSO in terms of its formation mechanism and climatic effects. The predictability of the two flavors of ENSO is an interesting topic that has received limited attention. A long-term ensemble retrospective prediction provides a convenient way to evaluate the predictability of ENSO in terms of both the actual prediction skill and potential predictability. In this study, we conducted an ensemble retrospective prediction from 1881 to 2017 with the CESM to evaluate the predictability of the two flavors of ENSO. The CP ENSO has a lower deterministic prediction skill than the EP ENSO. The potential predictability declines with lead time for both flavors of ENSO, and the EP ENSO has a higher upper limit for the prediction skill and is more predictable than the CP ENSO. The information-based metric is more suitable for evaluating the potential predictability because it measures both the linear and nonlinear statistical dependence between the ensemble mean and a hypothetical observation.

The predictability of both flavors of ENSO undergoes distinct seasonal and interdecadal variations in both actual skill and potential predictability. The CP ENSO has a more obvious SPB than the EP ENSO. Relatively higher predictability occurs after the 1960s for the two flavors of ENSO, but the trends are not synchronized. The highest (lowest) prediction skill occurs during 1981–2000 (1921–1940) for the EP ENSO, but during 2001–2017 (1941–1960) for the CP ENSO. In general, a larger SC corresponds to higher potential predictability, and determines the seasonal and interdecadal variations of predictability for both flavors of ENSO. The SC contributes more to the predictability of the CP ENSO than to that of the EP ENSO.

To the best of our knowledge, this is the first study to explore the predictability of the two flavors of ENSO based on long-term retrospective forecasting with a CGCM in terms of both the actual skill and potential predictability. Given that the potential predictability measures the upper limit of the actual prediction, the difference between the ACC and potential correlation (R) indicates the margin for improvement in the current prediction. The results of this study show that there is significant scope for improvement in the predictions of the two flavors of ENSO with the CESM, especially for the CP ENSO (Fig. 12). However, it should be noted that a model-based estimate of predictability may be model dependent because of various model biases. In addition, the quality of the assimilated reanalysis data may be limited by the sparse observations prior to the satellite era. Therefore, further in-depth analysis is needed to validate the different aspects of the model-predicted variability, especially for the CP ENSO. The underlying reasons for the limits of the current prediction skill for the two flavors of ENSO also require further study. Relevant improvements to the model that address this limitation should also be considered in the future.

Fig. 12

Difference between the potential predictability and actual prediction skill for the Niño 3 index (blue line) and EMI (red line). The labels on the x-axis indicate the middle year of each 20-year window