1 Introduction

The El Niño-Southern Oscillation (ENSO) is characterized by an interannual variability of sea surface temperature (SST) in the tropical Pacific (Philander 1983; Ropelewski and Halpert 1987). Although the ENSO phenomenon originates and develops in the tropical Pacific, it has global climatic, ecological, economic and societal impacts through ocean and atmospheric teleconnection (Rasmusson and Wallace 1983; Trenberth et al. 1998; Ham et al. 2014). Since the extreme 1982/83 El Niño event, considerable progress has been made in the development of observing systems, theories and numerical models related to ENSO (Wallace et al. 1998; Wang 2018; Tang et al. 2018; Zhang et al. 2020). These endeavors have deepened our understanding of ENSO dynamics and improved the prediction skill of the ENSO, and it is skillfully predictable with a 1-year lead time in ENSO hindcast experiments (Zebiak and Cane 1987; Kirtman and Schopf 1998; Fedorov and Philander 2000; Yeh et al. 2012). Despite this progress, there are still considerable uncertainties in real-time ENSO predictions (Jin et al. 2008; Luo et al. 2008; Tang et al. 2008). Especially after the 1990s, El Niño events have behaved more diversely. A new type of El Niño event, the central-Pacific El Niño (CP-El Niño), occurs frequently and increases the difficulty in ENSO forecasts (Ashok et al. 2007; Kao and Yu 2009; Kug et al. 2009). The so-called CP-El Niño events have warm SSTs concentrated in the central tropical Pacific, and their spatial structures and evolution mechanisms are obviously different from those of traditional El Niño events with warm SSTs concentrated in the eastern tropical Pacific.

Lorenz (1975) suggested that uncertainties in numerical weather and climate predictions are mainly caused by both initial and model errors and classified predictability problems into two types: the first type associated with initial errors and the second type associated with model errors. Many studies have described ENSO as a self-sustaining oscillation and suggested that the model-based predictions of ENSO depended dominantly on the initial conditions (Latif et al. 1998; Chen et al. 2004). Therefore, many studies have explored ENSO predictability from the viewpoint of initial error growth to find the initial error that has the largest impact on the prediction of ENSO, that is, the optimally growing initial error (OGE) (Moore and Kleeman 1996; Xue et al. 1997a, b; Thompson 1998; Fan et al. 2000; Samelson and Tziperman 2001). In these studies, the linear singular vector (LSV) method is commonly used to solve the optimization problem. For example, Moore and Kleeman (1996) and Xue et al. (1997a, b) used LSV to investigate the OGEs that induce the largest uncertainties in ENSO prediction. However, the LSV may not be a reasonable approximation to the optimally growing initial perturbation in a nonlinear model when the amplitude of the initial perturbation is large and/or the forecast time is long.

To overcome these limitations in the LSV method, Mu et al. (2003) proposed the concept of conditional nonlinear optimal perturbation (CNOP) to study the predictability problems of weather and climate events. CNOP represents a perturbation which, under a given physical constraint, results in the largest nonlinear error evolution at the prediction time. If being superimposed on the initial fields of a weather and climate event, it denotes the optimally growing initial error that has the largest effect on the uncertainty of prediction results. Presently, the CNOP method has been widely used in a variety of ENSO models with multiple complexities to search for the OGEs of ENSO and investigate the predictability problems concerned with initial errors, such as the maximal prediction error (Xu and Duan 2008; Tao et al. 2017), the spring prediction barrier (Mu et al. 2007a, b; Yu et al. 2009; Duan et al. 2009), and the sensitive area of target observation (Tian and Duan 2016; Duan and Hu 2016; Duan et al. 2018). Moreover, CNOP method is also extended to investigate the ENSO predictability problems concerning the uncertainties in model parameters and external forcings, that is, find the optimal model parameter perturbations (Yu et al. 2012; Gao et al. 2018; Tao et al. 2019) and nonlinear optimal external forcings (Duan and Zhao 2015; Tao et al. 2020) which lead to the the largest growth of prediction error.

Mathematically, the essence of computing the CNOP is to find the global maximum value of a nonlinear function with regard to initial perturbations, which can be transformed into a minimum optimization problem. An appropriate nonlinear optimization algorithm is then necessary to compute the CNOP. There are two main types of optimization algorithms. One type is the traditional optimization algorithm that requires the gradient information of the cost function with regard to initial perturbations. If the corresponding adjoint model of a numerical model has been conducted, the gradient will be efficiently calculated by the adjoint model. The other type is the intelligent algorithm that does not require gradient information and are adjoint-free. For simple or intermediate complex models with relatively small dimensions, these two types of optimization algorithms can be used to calculate CNOP conveniently. However, complicated coupled ocean–atmosphere numerical models usually have no corresponding adjoint models. In particular, coding the adjoint model of a complex numerical model is a large, tedious, and time-consuming job. Meanwhile, because the degrees of freedom in complicated coupled models are generally approximately 106–107, the cost of computing the CNOP directly using intelligent algorithms is usually a formidable challenge and almost unacceptable. All these reasons limit the further application of CNOPs in complex coupled models. Therefore, how to calculate CNOPs in numerical models, especially in complicated coupled models, remains a challenging problem.

Despite these difficulties, scientists have examined numerous methods to compute CNOPs or their approximation. For some intermediate coupled models (ZC, Zebiak and Cane 1987; ICM, Zhang et al. 2003), their adjoint models are coded in advance to compute the gradient of the cost function (Xu 2006; Gao et al. 2016), and then CNOPs are computed by traditional optimization algorithms (Mu et al. 2007b; Tao et al. 2017). Duan et al. (2009) proposed an adjoint-free ensemble-based method to look for a CNOP and found that the spatial pattern of a CNOP obtained approximately by this ensemble method is qualitatively similar to that computed by the adjoint method. Their initial errors, superposed on the “true” El Niño events to be predicted, are generated by taking the differences between the initial fields of the “true” El Niño events at the starting month and those in each month of the dominant 4-year period of ENSO preceding each reference year. Since then, the ensemble-based method has been used in the complicated FGOALS-g (Flexible Global Ocean Atmosphere Land System-gmail model in IAP/LASG, Yan and Yu 2012), CESM (Community Earth System Model in NCAR, Neale et al. 2012) and GFDL_CM2p1 (Geophysical Fluid Dynamics Laboratory Climate Model version 2p1, Delworth et al. 2006) models to compute CNOPs approximately and search for the initial errors that induce an obvious spring predictability barrier for the ENSO (Duan and Wei 2013; Duan and Hu 2016; Qi 2018). However, for the ensemble-based method, Combined Empirical Orthogonal Function (CEOF) analysis is conducted on the selected error samples, composite analysis is performed according to the sign of the time series of CEOFs with larger variance, and then several types of initial errors are obtained. Their results depend heavily on the sample space, yet they cannot contain all satisfying initial errors. Furthermore, even if the sample space is large enough, the results obtained by CEOF and composite analysis may be common (basic) characteristics suitable for many cases but not the optimal solution for a particular El Niño event. As noted by Hu and Duan (2016), a more effective algorithm must be developed to compute CNOPs in complicated models.

In addition, to overcome the difficulties in writing adjoint models but still compute CNOPs by optimization algorithms, Wang and Tan (2010) applied an adjoint-free ensemble projection (EP) algorithm to approximate the gradient of cost function with respect to initial perturbations and then computed the CNOP. The key of this EP algorithm is to use a localization technique and assess spurious correlations between observation stations and model grids. However, this treatment will determine the covariance of the localized Schur radii artificially and empirically. To avoid the localization procedure of the EP algorithm in Wang and Tan (2010), Chen et al. (2015b) proposed a singular vector decomposition (SVD)-based EP algorithm to compute the CNOP, which overcomes the uncertainty caused by empirically choosing the localization radius. The main concept is to construct CNOPs using former main singular vectors of historical time series in numerical models to greatly reduce the degrees of freedom in the optimization problem. Thus, the optimized variables are the weight coefficients of the main SV modes rather than the original physical variables, and the issue of computing the CNOP is transformed into determining the optimal weight coefficient combination of the main SV modes. The SVD-based EP algorithm has been successfully applied to compute CNOPs in the intermediate ZC model combined with traditional algorithms and intelligent algorithms (Chen et al. 2015b; Wen 2015). Therefore, we naturally ask whether the SVD-based EP algorithm can be used in complex coupled ocean–atmosphere models to compute CNOPs and explore the OGEs in El Niño events.

In this study, we use the SVD-based EP algorithm to compute CNOPs in the CESM and search for OGEs of El Niño events. This paper is organized as follows: The CESM used in this study is depicted in Sect. 2. Section 3 introduces the definition of the CNOP and how to compute the CNOP by the SVD-based EP algorithm. The validity of computing the CNOP in the CESM is verified in Sect. 4. Section 5 gives numerical experimental results for the OGEs of El Niño events in detail. The related dynamic mechanisms of error growth are explained in Sect. 6. Section 7 summarizes the study and provides some discussion.

2 The community earth system model

The CESM (version 1.0.3) in this study, superseding version 4 of the Community Climate System Model (CCSM4), is a fully coupled global climate model that provides state-of-the-art simulations of the Earth’s past, present and future climate states. The CESM consists of five component models: atmosphere, ocean, land, land ice, and sea ice, plus one central flux coupler component that coordinates the models and exchanges geophysical fluxes between them. Based on the needs of researchers, the atmospheric components of the CESM can be alternatively accessed as the Community Atmosphere Model (CAM), the high-top atmosphere Whole Atmosphere Community Climate Model (WACCM), or the CAM with chemistry (CAM-CHEM). The Community Atmosphere Model version 4 (CAM4) used in this study has a finite-volume (FV) dynamic core with 26 vertical layers and a horizontal resolution of 0.9° (longitude) × 1.25° (latitude). The model configuration is described in detail in Neale et al. (2012). The ocean component is based on the Parallel Ocean Program version 2 (POP2, Smith et al. 2010) from the Los Alamos National Laboratory (LANL), which has 60 vertical levels with a layer thickness varying from 10 m in the top 150 m to 250 m below 4000 m. It uses spherical coordinates in the Southern Hemisphere and a displaced pole grid in the Northern Hemisphere. The horizontal resolution is approximately 1° (longitude) × 0.27° (latitude) at the equator, with the domain ranging from 79°S to 89°N. The CAM4 and POP2 are coupled through the version 7 coupler (CPL7, Craig et al. 2012) together with the Community Land Model version 4 (CLM4, Oleson et al. 2010), the Community Ice Code version 4 (CICE4, Hunke and Lipscomb 2008) of LANL, and the Glimmer Community Ice Sheet Model version 1.6 (Glimmer-CISM1.6, Rutt et al. 2009; Lipscomb et al. 2013). More details of the CESM and its simulation of the climate system are highlighted in Hurrell et al. (2013). Bellenger et al. (2014) noted that despite some biases in the tropical Pacific interannual variability, the CESM accurately simulates the fundamental characteristics of observational ENSO events.

3 CNOP and SVD-based EP algorithm

The CNOP represents an initial perturbation, subjected to a given constraint, and has the largest nonlinear evolution at the end of the prediction time. The CNOP method is a natural generalization of the LSV to a nonlinear system. Here, we briefly introduce the CNOP approach.

Let \({\varvec{M}}\) be a nonlinear propagation operator (i.e., the numerical model) and \(x_{0}\) be an initial perturbation superimposed on the reference state \(X{(}t{)}\), which is a solution to the nonlinear model and satisfies \(X{(}t{)} = {\varvec{M}}{(}X_{0} ,t_{0} ,t{)}\), with \(X_{0}\) being the initial field of reference state \(X{(}t{)}\) and \(t_{0}\) being the initial prediction time.

For selected norms \(\left\| \cdot \right\|_{\delta }\) and \(\left\| \cdot \right\|\), the CNOP (\(x_{0\delta }\)) is the solution of the following optimization problem:

$$J{(}x_{0\delta } {)} = \mathop {{\mathbf{max}}}\limits_{{\left\| {x_{0} } \right\|_{\delta } \le \delta }} \left\| {{\mathbf{M}}{(}X_{0} + x_{0} {,}t_{0} ,t{)} - {\mathbf{M}}{(}X_{0} {,}t_{0} ,t{)}} \right\|^{2}$$
(1)

where \(\left\| {x_{0} } \right\|_{\delta } \le \delta\) is the initial constraint defined by the selected norm \(\left\| \cdot \right\|_{\delta }\) and \(\delta\) is a prescribed positive number that defines the constraint radius of the initial perturbation \(x_{0}\). The norm \(\left\| \cdot \right\|\) measures the evolution of perturbations. To conveniently apply existing optimization algorithms, let \(J_{{1}} {(}x_{0} {)} = - J{(}x_{0} {)}\) and solve the equivalent minimization problem in Eq. (1) to compute the CNOP.

Both traditional algorithms and intelligent algorithms are often used to solve the above optimization problem. For traditional optimization algorithms, the key is to obtain the gradient of the cost function with respect to the initial perturbation x0. However, the CESM has no corresponding adjoint model, and the gradient cannot be efficiently computed. Thus, adjoint-free intelligent algorithms may be used to compute CNOPs. However, the number of degrees of freedom in the CESM is very large, and the cost of computing CNOPs directly using intelligent algorithms is almost unacceptable. In this study, the CNOP will be computed by the SVD-based EP method proposed by Chen et al. (2015b).

The projection of a continuous system into a discrete numerical model can be expressed as \(x(t) = \sum\nolimits_{i = 1}^{N} {\sigma_{i} u_{i} } v_{i}^{T}\), where \(N\) is the degree of freedom of a numerical model, \(\sigma_{i}\) is the singular value arranged from the highest to lowest, \(u_{i}\) is the spatial mode corresponding to \(v_{i}\), and \(v_{i}\) is the time series of \(u_{i}\).

A forced and dissipative dynamic system tends toward a low-dimensional attractor after long evolution (Teman 1991; Osborne and Pastorello 1993; Foias 1997). The spatial modes \(u_{i}\) can be such that \(\left| {\sigma_{i} } \right|\) monotonously decreases quickly enough as \(i\) increases. If \(m {(}m \le N{)}\) former main spatial modes are used as the bases to construct the approximate state space of the whole system, the original \(N\)-dimensional system can be truncated to an \(m\)-dimensional approximate system.

SVD statistically provides a standard method that reduces the dimension of the system effectively by choosing spatial modes. If \(m\) spatial modes are chosen and combined linearly to approximate the state vector of the discrete system, then \(x_{0} = \sum\nolimits_{i = 1}^{m} {a_{i} u_{i} }\), and the optimization problem that computes the CNOP is then transformed into the following:

$$J_{1} {(}a_{\delta } {)} = \mathop {{\text{min}}}\limits_{{\left\| a \right\|_{\delta } \le \delta }} \left( { - \left\| {{\mathbf{M}}{(}X_{0} + \sum\limits_{i = 1}^{m} {a_{i} u_{i} } ,t_{0} ,t{)} - {\mathbf{M}}{(}X_{0} ,t_{0} ,t{)}} \right\|^{{2}} } \right)$$
(2)

where \(a_{i}\) is the weight coefficient of the chosen mode \(u_{i}\). Because the selected spatial modes are orthonormal vectors, the constrained condition becomes \(\left\| a \right\|_{\delta } \le \delta\). This approach searches for the optimal combinations of weight coefficients for the chosen bases.

The ability of the SVD-based EP algorithm to compute CNOP has been verified in the ZC model that has an adjoint model. Chen et al. (2015b) combined the SVD-based EP method with the traditional SPG2 algorithm (Birgin et al. 2000) to calculate the CNOP and explore the optimal precursor of ENSO events. The results show that the CNOP obtained by the SVD-based EP algorithm can effectively approximate the CNOP calculated by the adjoint method and retain the general spatial characteristics of the latter. Wen (2015) combined this method with four intelligent algorithms to compute CNOPs in the ZC model. Their results show that the spatial pattern of the CNOP obtained by the new method looks similar to that obtained by the adjoint method, and the variation trend of the cost function in different calendar months is almost the same as that of the latter. These studies demonstrated that, independent of the optimization algorithm, the SVD-based EP algorithm can be used to approximately compute CNOPs.

4 Validity of computing a CNOP in the CESM

The CESM is a fully coupled earth system model that includes ocean, atmosphere, land, sea ice, and land ice components and has a very high dimensionality. The number of grids on the upper tropical Pacific alone reaches 105. Therefore, it is necessary to reduce the dimensions of the model to calculate the CNOP. Based on the characteristics of the low-dimensional attractor for the ENSO system, we apply the SVD-based EP algorithm to reduce the dimensions of optimization problems, calculate the CNOP in a truncated phase space and search for the OGEs in El Niño events.

4.1 Experimental strategy

The fixed external data of tracer gases, insolation, aerosols and land cover in the year 2000 are used to force the CESM for 179 years, and the Niño 3.4 index of historical data for the last 85 years (95–179) is analyzed (the figure is neglected). We define the El Niño (La Niña) event when the Niño 3.4 index is greater (smaller) than 0.5 °C (− 0.5 °C) in more than five consecutive months. During this 85-year span, the ENSO simulated in the CESM behaves with 2- to 5-year periodic characteristics, and 22 El Niño events and 24 La Niña events occur. Among them, 21 El Niño events and 16 La Niña events have the characteristics of phase lock, that is, the peaks of the Niño 3.4 indices occur in winter. Figure 1 shows the Empirical Orthogonal Function (EOF) analysis results of SSTA from the 85-year historical data. It can be seen that the leading EOF (EOF1) of SSTA presents a typical EP-El Niño-like mode, and can explain 71.64% of the total variance. The second mode (EOF2) has warm signals in the tropical central and western Pacific, with cold signals in the tropical eastern-southern Pacific, which is similar to the mature CP-El Niño mode. The 6.86% explained variance of EOF2 is obviously smaller than that of EOF1, which means the EOF1 mode is more steady than EOF2. Comparing the principal components (PC) of two modes, PC1 behaves a stronger interannual variability than PC2, which also illustrates that EP-El Niño events simulated by the CESM are usually much stronger than CP-El Niño events. All these results show that the CESM can capture the characteristics of period, phase-locking and diversity of ESNO events. Although some studies have pointed out that the anomalous warm centers of El Niño events are relatively west, corresponding to observations (Capotondi 2013), and the asymmetry of El Niño and La Niña is weak (Zhang and Sun 2014), in general, the fundamental features of the model simulation are consistent with those of observations. Therefore, in this study, it is feasible that we can use the CESM to study the ENSO OGEs.

Fig. 1
figure 1

EOF analysis results of SSTA in the tropical Pacific region from the 85-year historical data. a1 Is the leading EOF mode (EOF1) and a2 is the time series (PC1) of EOF1. b Is the same as (a) but for the second EOF (EOF2)

There are many El Niño events in the 85-year historical run of the CESM. However, because calculating one CNOP requires a large amount of computer time, we are limited by the present computational conditions and cannot study each ENSO event in detail. Thus, we focus on six selected El Niño events with different intensities and diversities (Fig. 2), and these events are denoted by EP1, EP2, EP3, EP4, CP1, and CP2. Among them, EP1, EP2 and EP3 are strong EP-El Niño events, EP4 is a relatively moderate EP-El Niño event, and CP1 and CP2 are weak CP-El Niño events. Figure 3 shows the evolutions of SSTA and sea surface wind anomalies for EP1, EP4 and CP1-El Niño. EP1 and EP4 are typical EP-El Niño events. The SST warm anomaly occurs along the equatorial eastern Pacific coast, extends westward, and finally matures in winter. For the CP1-El Niño event, the warm SSTA anomaly first appears in the tropical central and western Pacific and finally reaches the mature phase in winter. CP1-El Niño has no obvious propagating characteristics in the developing process, which is consistent with the results of the observed CP-El Niño.

Fig. 2
figure 2

Time-dependent Niño3.4 indices for four EP-El Niños, two CP-El Niños and one neutral state from the 85-year control run, denoted by EP1, …, EP4, CP1, CP2, and NR1, respectively. These events are chosen as reference states to conduct the optimization calculation experiments. The red asterisk denotes the start month of prediction of January (0)

Fig. 3
figure 3

Evolutions of the SST anomaly (units: °C) and sea surface wind anomaly (units: m/s) of a EP1-El Niño, b EP4-El Niño and c CP1-El Niño

All these selected El Niño events reached their peak in the winter of the development year, consistent with the observed phase-locked characteristics. In this context, we use Year(0) to denote the year when El Niño reaches the peak and Year(− 1/1) to signify the year before/after Year(0). For each El Niño event, we make 12-month predictions (i.e., lead time of 12 months) with the start months being January (0) (Exactly speaking, January 1st) to assure that we can focus on the predictions during the onset and growth phase of El Niño events.

To study the OGE of an El Niño event, perfect model experiments are conducted in this study. That is, the CESM is assumed to be perfect. In this condition, the uncertainties in El Niño predictions are considered to be caused only by initial errors that are superimposed on the initial sea temperature fields of the six “true” El Niño events. In the following numerical experiments, the initial errors of the sea temperature fields cover the region (19°S-19°N, 129.375°E-84.375°W), which includes the whole tropical Pacific. To explore the role of subsurface processes, the initial errors extend from the surface to a 165 m depth, which is approximately the bottom of the thermocline over the western equatorial Pacific.

Let \(T_{i,j,0}^{p} {(}t{)}\) be the predicted SST, \(T_{i,j,0}^{t} {(}t{)}\) be the “true” SST to be predicted, and (i, j, 0) be the grid points in the predicted area of the sea surface. The prediction error is defined as the difference between the SSTA in the tropical Pacific for the predicted El Niño and that of the “true” El Niño events, and its value is measured as follows:

$$\left\| {T^{p} {(}t{)} - T^{t} {(}t{)}} \right\| = \sqrt {\sum\nolimits_{i,j} {{[}T_{i,j,0}^{p} {(}t{)} - T_{i,j,0}^{t} {(}t{)]}^{{2}} } } .$$
(3)

Then, the CNOP can be computed by solving the following optimization problem:

$$J{(}T^{\prime}_{0\delta } {)} = \mathop {{\text{max}}}\limits_{{\left\| {T^{\prime}_{0} } \right\|_{\delta } \le \delta }} \left\| {T^{p} {(}t{)} - T^{t} {(}t{)}} \right\|^{2}$$
(4)

In Eq. 4, \(\left\| {T^{\prime}_{0} } \right\|_{\delta } \le \delta\) is the constraint of initial errors and is defined as follows:

$$\left\| {T^{\prime}_{0} } \right\|_{\delta } = \sqrt {\sum\nolimits_{i,j,k} {\left( {\frac{{\cos \varphi_{i,j,k} T^{\prime}_{i,j,k} }}{{\sigma_{i,j,k} }}} \right)^{{2}} } } \le 200$$
(5)

where \(T^{\prime}_{0}\) represents the initial error of sea temperature with dimensions of 3.094 × 105, i, j and k represent the grid positions of the zonal, meridional and vertical directions, respectively, and \({\varphi }_{i,j,k}\), \({\sigma }_{i,j,k}\) and \({T}_{i,j,k}^{^{\prime}}\) represent the latitude, standard deviation and initial error of sea temperature at the grid (i, j, k), respectively. The constraint \(\delta = {200}\) means that the averaged absolute value of initial error in each grid point does not exceed 0.36 × standard deviation.

As mentioned above, because the dimensions of the optimization problem are very large, the cost that optimizes the cost function directly is unacceptable. Therefore, we first perform singular vector decomposition on the sea temperature anomaly in January for the long-term (95–179 year) historical data of the CESM and then optimize the linear combination of the weight coefficient of several modes with the larger explained variance to obtain the CNOP. The corresponding concept has been fully discussed in Sect. 3. Figure 4 shows the distribution of the first 85 singular values for the long-term historical data in the CESM. We can see that the singular values decrease very rapidly as the mode numbers increase. The maximum is larger than 3500 for the first mode, and the singular value is less than 1000 from the fourth mode number. Obviously, the more singular vectors that are retained, the better the reducing-dimension space approximates to the original; however, more computing cost is required. Therefore, it is wise to set the mode number so that the reduced space approximates well to that of the original CESM and the computation cost is acceptable under the current calculating conditions. In the following numeral experiments, we truncate the former 10 modes with a larger explained variance as the basement to build the approximating space. For these ten former singular vectors, their cumulative explained variance reaches 79.27% and can capture the signals, including medium and large scales in the original CESM.

Fig. 4
figure 4

Distribution of the first 85 singular values for the 95–179 year historical run in the CESM

It is important to select an appropriate optimization algorithm to compute the CNOP. If the CESM is smooth, the gradient of the cost function with regard to the optimization variables will exist. Then, traditional optimization algorithms can be selected to implement optimization calculations. In fact, because this type of algorithm searches the minimum of the cost function along the direction indicated by the gradient, the searching speed is relatively fast. However, there are many parameterization processes in the CESM, and some processes have switching properties, which causes that the gradient of the objection function may not exist. Figure 5 shows the relations between the evolution of prediction errors and the norm of initial errors. The figure shows that when the random initial errors with the norms 120, 60, 12, 6, 1.2, 0.6 and 0.12 are superimposed in January (0) of the NR1 neutral state (its Niño 3.4 indices are shown in Fig. 2), the corresponding prediction errors do not decrease with a linear trend as the norm of initial error tends to 0. That is, we may not find the gradient of the cost function in the CESM, and it is necessary to adopt optimization algorithms that do not require the gradients. As mentioned above, some intelligent optimization algorithms are appropriate choices. Moreover, they have been combined with the dimension-reduction method to approximate the CNOP effectively in the ZC model with intermediate complexity (Wen 2015), which provides some useful references and adds to our confidence in computing the CNOP in the CESM.

Fig. 5
figure 5

Evolutions of random initial errors that are superimposed in January (0) of the NR1 neutral state. The X axis represents the value of the norm of random initial error, and the Y axis represents the evolution of prediction errors. The red dots denote the prediction errors when the norms are 120, 60, 12, 6, 1.2, 0.6 and 0.12

In this study, the relatively new differential evolution (reduced as DE) algorithm (Rainer and Kenneth 1997) is combined with the reducing-dimension method to calculate the CNOP. The DE algorithm uses the information between candidate solutions to guide the search direction. The DE algorithm, as a type of heuristic algorithm, has the following advantages: (1) no requirement for smoothness of the cost function; (2) ability to parallelize; and (3) good search for the global minimum. In addition, this algorithm has fewer control parameters (only three) and a relatively simple evolution strategy, which makes it fit for coding. More importantly, this algorithm maintains the memory of the historical optimal solutions, which makes it highly suitable for a problem that aims to seek a relatively satisfactory solution rather than a strict optimal solution due to the calculation cost or other reasons. Therefore, it is reasonable to select the DE algorithm to carry out optimization calculations in the complex CESM.

Figure 6 summarizes the schematic diagrams that the reducing-dimension method is combined with the DE optimization algorithm to compute the CNOP. Firstly, the model is integrated for a long term as the control run. Secondly, SVD is performed on the optimized variables of control run and appropriate SVs are selected (10 SVs in this study), based on the cumulative explained variance, as the basement that constructs the CNOP. Thirdly, these selected spatial modes, the CESM and reference states X0 are called by the DE algorithm to calculate the optimal combinations of weighted coefficients. Finally, the CNOP is calculated by combining the selected SVs linearly based on weighted coefficients. Figure 6b illustrates the flow chart that the DE algorithm searches for the optimal weighted coefficient combination. The algorithm first initializes the population of coefficient vectors randomly (the population size is 20 in this study), and then judges if this population meets stop conditions. If stop conditions are met, the algorithm outputs the optimal coefficient combination. If the answer is no, the DE algorithm generates new trial vectors by evolution and crossover. Each trial coefficient vector corresponds to an initial perturbation x0 superimposed on the initial field X0 of reference state, and its corresponding costfunction can be calculated by the CESM. Then, during the selection procedure, the algorithm substitutes some population members that have smaller costfunction with the trial coefficient vectors that have larger costfunction, and generates a new population. One iteration is then completed. For this new population, the algorithm will repeat above iterative process, until the stop conditions are met and the optimal weighted coefficient combination is output.

Fig. 6
figure 6

Schematic diagrams of computing CNOPs in the CESM. a Experimental procedures that the SVD-based EP algorithm is combined with DE optimization algorithm to compute the CNOP. b Flow chart of DE optimization algorithm

4.2 Verifying results for computing CNOP

Before using the CNOP method to explore the OGEs of El Niño events in the CESM, it is necessary to check whether the SVD-based EP algorithm is properly combined with the DE algorithm to calculate the CNOP in the complex coupled numerical model. This validity can be tested from two aspects. In mathematics, as the iterative step increases, the cost function \(J{(}x_{0} {)}\) increases rapidly until it converges to the maximum value. In addition, a CNOP and its development should have physical meaning that accords with the observations or numerical results in other ENSO models to some degree. To attain the above two purposes, we deliberately choose a neutral state that is close to the climatological monthly mean state, denoted by NR1, as the reference state (Fig. 2) to compute the CNOP. In fact, this type of CNOP has been studied in detail in the intermediate ZC model (Duan et al. 2008; Duan et al. 2013). These results show that because the observed ENSO events usually have asymmetrical characteristics with strong El Niño and relatively weak La Niña events, the CNOP calculated on a neutral state often develops into a strong El Niño event. This type of CNOP has the most possibility to develop into an El Niño event and is considered as the optimal precursor of El Niño. Then, if we calculate a CNOP on the neutral state in the complex CESM by the SVD-based EP algorithm and DE algorithm, can its cost function increase rapidly and converge to a maximum? Can the CNOP develop into a strong El Niño event?

Figure 7 shows the varying trend of the cost function as the iteration number of the optimization calculation increases when the initial error is superimposed in January (0) of the NR1 neutral state. It can be seen that the cost function is approximately \(1.2\times {10}^{4}\) at the first iteration, increases rapidly as the iteration number becomes larger, and slows after eight iterations. Although the value of the cost function continues increasing and may eventually exceed \(3.0\times {10}^{4}\) with more iterations, the difference between the final value and the cost function for the tenth iteration may not be very large. Thus, to save computer time, when the value of the cost function has not been obviously improved after several successive iterations, the iterative calculation is stopped. For this case, ten iteration steps are selected for the optimization calculation. These results show that the combination of the reducing-dimension method and DE algorithm can mathematically approximate to find the maximum value of the cost function \(J{(}x_{0} {)}\) in the CESM and the CNOP that has the largest development at the end time of the optimization calculation. Figure 8 shows the spatial structure of the 12-month lead development of the CNOP. It can be seen that the CNOP-type initial perturbation eventually develops into a strong EP-El Niño event, which is consistent with the results in the ZC model (Duan et al. 2013).

Fig. 7
figure 7

Varying trend of the cost function as the iteration number of the optimization calculation increases when the CNOP-type perturbation is superimposed in January (0) of the NR1 neutral state

Fig. 8
figure 8

12-lead evolution (units: °C) of the CNOP-type SST perturbation superimposed in January (0) of the NR1 neutral state

The above results demonstrate that the CNOP calculated in the CESM is not only the maximum point of the cost function in the phase space but also develops into a typical El Niño event, which verifies the validity of the CNOP calculation in the CESM by combining the SVD-based EP algorithm with the DE algorithm. In the following section, we utilize this method to compute the CNOP and explore the OGEs of El Niño events.

5 Optimally growing initial errors of El Niño events

As mentioned above, under the hypothesis perfect model experiments, the uncertainties in El Niño predictions are considered to be caused only by initial errors. We first investigate the characteristics of the spatial structures of the OGEs for the six selected El Niño events. Numerical experimental results for the four EP-El Niño events are shown in Fig. 9. We can see that for the strong EP1-El Niño (Fig. 9a) and EP3-El Niño (Fig. 9c), the spatial structures of the CNOPs are very similar, and their correlation coefficient reaches 0.68. These OGEs possess negative SSTA errors in the whole equatorial Pacific accompanied by a negative west–east slope of subsurface temperature from the subsurface (approximately 160 m depth) to the surface in the equatorial central-eastern Pacific. For the moderate EP4-El Niño event (Fig. 9d), the spatial structure of OGE is similar to those of the EP1 and EP3 cases, except that the sign is approximately opposite, and their correlation coefficients reach − 0.61 and − 0.79, respectively. However, the OGE of the strong EP2-El Niño (Fig. 9b) is obviously different and behaves as a basin-wide dipolar pattern of tropical sea temperature errors from the sea surface to the subsurface, with positive errors in the upper layers of the equatorial eastern Pacific and negative errors in the lower layers of the equatorial western Pacific. We also find that the OGEs for the CP1-El Niño (Fig. 9e) and CP2-El Niño (Fig. 9f) have similar spatial structures to those for the EP1- and EP3-El Niño.

Fig. 9
figure 9

Sea temperature (units: °C) of CNOP-type errors for the six El Niño events. The top 4 rows correspond to ocean depths from the sea surface of 0 m, 55 m, 105 m and 155 m. The bottom row is the meridional mean of sea temperature errors over 5°S-5°N

The development processes of these OGEs are further investigated. We integrate the CESM with the initial fields being the initial sea temperature of each “true” El Niño event plus the OGEs, and then, by subtracting the “true” state, we obtain the evolutions of the prediction errors caused by these OGEs. For the strong EP1- and EP3-El Niño, the developing processes of OGEs are also very similar, and the averaged results are shown in Fig. 10a. It can be seen that the initial negative temperature errors, existing in the tropical central-eastern Pacific, are locally and rapidly amplified during several months and eventually evolve into a prediction error similar to the mature phase of La Niña. That is, the development of this type of initial error presents a growth behavior similar to a La Niña-like evolving mode. For the moderate EP4-El Niño (Fig. 10b), the developing process of CNOP error is also similar to those of the EP1 and EP3 cases, except that positive sea temperature errors exist in the tropical central-eastern Pacific and develop into a prediction error similar to the mature phase of El Niño. However, for the strong EP2-El Niño (Fig. 10c), the positive SST errors in the upper layers of the tropical eastern Pacific decay rapidly, and the negative sea temperature errors in the lower layers of the equatorial western Pacific propagate to the equatorial eastern Pacific and sea surface, amplify persistently and eventually also cause a prediction error similar to the mature phase of La Niña. This type of error initially experiences an El Niño-like decaying phase, subsequently exhibits a transition to a cold phase, and finally evolves into a La Niña-like mode. For the weak CP1- and CP2-El Niño, their developing processes of CNOP-type initial errors are similar to those of the strong EP1- and EP3-El Niño (Fig. 10a), and corresponding figures are not given for simplicity.

Fig. 10
figure 10

Evolutions of errors in SSTAs (units: °C) and sea surface wind (units: m/s) over the tropical Pacific Ocean as well as equatorial (5°S-5°N) subsurface temperature errors (units: °C) for a type-1, b type-2 and c type-3 OGEs of El Niño events. The type-1 OGE is the average of OGEs of EP1 and EP3-El Niño, and the type-2 (type-3) OGE corresponds to that of EP4-El Niño (EP2-El Niño)

The impacts of El Niño events on global weather and climate are related to not only their intensity but also their spatial patterns. Thus, it is necessary to further investigate the effects of OGEs on the intensity and spatial structure of El Niño events. Figure 11 shows the “true” and predicted SST anomalies at the end of the prediction time for six El Niño events. It is shown that when the OGEs are superposed on the “true” El Niño events, the strong EP1-, EP2- and EP3-El Niño will be predicted as neutral states. For the moderate EP4-El Niño, the development of OGEs eventually overestimates it as an extremely strong El Niño event, which may be because the intensity of the EP4-El Niño event is moderate, so there is still considerable room to enhance the intensity of the event. When OGEs are added to the CP1 and CP2 reference states, the two weak CP-El Niño events tend to be predicted as La Niña events. Figure 11c shows the differences between the prediction results and those “true” El Niño events, which also shows that except for the OGE of EP4-El Niño, which causes a mature El Niño-like prediction error, others lead to mature La Niña-like prediction errors and cause El Niño events to be predicted as neutral states or even La Niña events.

Fig. 11
figure 11

SSTA of a control run, b prediction results and c difference between (a) and (b) at the end month of December(0) for six El Niño events, where the prediction results are obtained by integrating the CESM with the initial fields of the control runs plus their corresponding CNOP-type initial errors

We notice that, although EP1 and EP4-El Niño have little differences in intensity, the spatial structures of initial sea temperature errors identified by CNOP method are completely different from each other (shown in Fig. 9a and Fig. 9d). How to explain this phenomenon? In our previous studies, Xu and Duan (2008) found that the structures of CNOPs superimposed on the onset phase of EP-El Niño, and their effects on El Niño predictions are sensitive to the intensities of events. Due to nonlinear saturation, no matter what kind of initial fields, the intensities of El Niño and La Niña events can not exceed a limit, which causes the saturation of initial error growth. Then for strong EP-El Niño events, the CNOP-type initial errors lead to underestimate the intensities of EP-El Niño events or predicts them as La Niña events. The CNOP includes the SSTA pattern that has a zonal dipolar structure with positive errors in the equatorial central-western Pacific and negative errors in the equatorial eastern Pacific, and the negative thermocline depth errors along the equator (somewhat similar to the OGE of EP1-El Niño in Fig. 9a). However, for a weak EP-El Niño, the CNOP may behave as a similar pattern but with the opposite sign (somewhat similar to the OGE of EP4-El Niño in Fig. 9d) and tends to yield a false alarm of a strong EP-El Niño event. Duan et al. (2009) further pointed out that, for weak El Niño events, it is very difficult to determine which type of initial errors leads to worse predictions. All these results demonstrate that, for strong EP-El Niño events, the direction of OGE growth points to cold SST errors, however, for relatively weak EP-El Niño events, there exist more uncertainties in the structures of CNOP and it is arduous to find some conditions that determine the direction of error growth into warm or cold SST errors. Despite these results are model dependent, they may explain, to some extent, why the structures of CNOP-type initial errors of EP1 and EP4-El Niño events are quite different from each other though the intensity of the former is only slightly larger than that of the latter. Specifically for the EP4-El Niño, although its intensity is not very weak (see Figs. 2b, 3), the warm center of SSTA in mature and decaying phase locates in the equatorial central Pacific (see Fig. 2b), and the largest positive SSTA error growth occurs in the equatorial eastern Pacific (see Fig. 11c). That is to say, there are considerable rooms for the CNOP to develop into large warm SST errors and make the EP4-El Niño predicted as a strong El Niño event. Based on above analysis, we hold that the CNOP of EP4-El Niño has physical meaning rather than a mathematical artifact.

The above results demonstrate that we have found three types of OGEs for El Niño events with different intensities and diversity in the CESM. Among them, two types of initial errors in sea temperature have the same spatial structures, except for the opposite sign, and are claimed as type-1 and type-2 OGEs, respectively. The type-1 (type-2) OGE possesses negative (positive) SSTA errors in the whole equatorial Pacific accompanied by a negative (positive) west–east slope of subsurface temperature from the subsurface (approximately 160 m depth) to the surface in the equatorial central-eastern Pacific. The negative (positive) sea temperature errors are locally amplified for several months and eventually evolve into a La Niña (El Niño)-like mode. The type-1 OGE tends to induce the “true” strong EP-El Niño event to be predicted as a neutral state or induce the “true” weak CP-El Niño event to be predicted as a La Niña event. The type-2 OGE tends to induce a moderate EP-El Niño event to be predicted as an extremely strong EP-El Niño event. Although the third type of OGE also develops into a mature La Niña-like mode and induces the “true” strong EP-El Niño to be predicted as a neutral state, its spatial structure is obviously different from that of the type-1 OGE and behaves as a basin-wide dipolar pattern of tropical sea temperature errors from the sea surface to the subsurface, with the positive errors in the upper layers of the equatorial eastern Pacific and the negative errors in the lower layers of the equatorial western Pacific. The negative sea temperature error in the lower layers of the equatorial western Pacific propagates to the eastern Pacific, amplifies and eventually causes a prediction error similar to a La Niña-like mode. We call this kind of initial error a type-3 OGE.

6 Dynamic mechanisms of OGEs

As demonstrated in Sect. 5, there are three types of initial errors that often induce the largest uncertainties in the predictions of El Niño events and especially affect the prediction of the intensity and spatial structure of El Niño events. For the type-1 (type-2) initial error, the cold (warm) sea temperature errors are mainly concentrated in the eastern tropical Pacific and develop locally into a La Niña (El Niño)-like mode. The type-3 initial error shows a basin-wide dipolar structure in the tropical Pacific Ocean from the subsurface to the surface, the warm pole in the eastern tropical Pacific decays gradually, and the cold pole in the subsurface of the western tropical Pacific propagates eastward and grows continuously into a La Niña-like mode. The developmental processes of these three types of initial errors are quite different. What dynamic mechanisms support the developing processes of these OGEs? To illustrate this question, we explore the time-dependent SST, sea surface wind and equatorial subsurface temperature components of the prediction errors caused by three types of OGEs (Fig. 10).

Physically, when the type-1 (type-2) OGE is superposed on the initial field of an El Niño event, a weak negative (positive) SSTA error initially occurs in the equatorial Pacific, and a larger negative (positive) subsurface temperature is located in the greater depths of the central-eastern equatorial Pacific (see the top panel of Fig. 10a, b). Their dynamic growing mechanisms can be explained by the Bjerknes positive feedback (Bjerknes 1969). On the one hand, the negative (positive) SST errors lead to anomalous easterly (westerly) along the equator, acting as a trigger of the Bjerknes positive feedback, intensify (weaken) the easterly trade wind over the equatorial Pacific and the upwelling in the equatorial eastern Pacific, which means a positive (negative) upwelling error. Meanwhile, lots of cold (warm) waters in the subsurface layer of the eastern equatorial Pacific are transported upward by upwelling until they reach the surface. Both of these factors contribute to the sustained growth of the negative (positive) SSTA errors in the eastern equatorial Pacific. That is, the Bjerknes positive feedback mechanism plays a dominant role and causes the negative (positive) SSTA errors in the eastern tropical Pacific to be further amplified, ultimately evolving into a mature La Niña (El Niño)-like mode and yielding negative (positive) prediction errors for El Niño events.

For the type-3 OGE (Fig. 10c), although the initial positive sea temperature errors in the central and eastern equatorial Pacific generate zonal westerly in the central equatorial Pacific by the Bjerknes positive feedback process, this westerly is too weak to intensify the weak warm SST error (the top panel in Fig. 10c). Meanwhile, the relatively larger negative subsurface temperature errors in the western equatorial Pacific make the thermocline shallow and inspire upwelling Kelvin waves that propagate eastward and carry cold water with them. When the upwelling Kelwin waves arrive at the eastern equatorial Pacific, they cause a negative SST perturbation. This SST cooling associated with wave dynamics competes with the weak warming by the Bjerknes positive feedback. Soon, the former defeats the latter, and then the positive SST errors in the eastern equatorial Pacific start to decay gradually. Once the positive SST errors disappear, the negative SST errors subsequently occurring over the eastern equatorial Pacific are further intensified through the easterlies and stronger upwelling due to the Bjerknes positive feedback and develop into a La Niña-like mode that underestimates the El Niño events. In short, during the earlier developing process of the type-3 OGE, the negative feedback associated with equatorial Kelvin waves traveling from the western equatorial Pacific to the east equatorial Pacific plays an important role, once negative SST errors in the eastern equatorial Pacific appear, the Bjerknes positive feedback becomes the most important growing mechanism that makes the negative SST error ultimately evolve into a mature La Niña-like mode.

7 Conclusions and discussion

In this study, we use the CESM in NCAR to investigate the optimally growing initial errors that cause the largest uncertainty in El Niño predictions by the conditional nonlinear optimal perturbation method. To overcome the difficulties in computing a CNOP without an adjoint CESM, we apply the SVD-based EP algorithm (Chen et al. 2015b) to calculate the CNOP of El Niño in the CESM. Based on the characteristics of low-dimensional attractors for the ENSO system, for the 85-year historical data of the CESM, we truncate the former 10 modes as the basements to build the approximating space. The cumulative explained variance of these ten former singular vectors reaches 79.27% and can capture the signals with medium and large scales in the original model and reflect the CESM dynamics. Then, physical variables are treated as the combinations of selected bases, and the optimization problem of the cost function with regard to initial fields is transformed into one concerning the combination coefficients. Hence, the dimension of the optimization problems can be largely reduced from 105 to 10. Then, we calculate the CNOP in a truncated phase space by the DE intelligent optimization algorithm. Numerical experimental results demonstrate that the CNOP superimposed on a neutral state can develop remarkably into a strong El Niño event, which verifies the validity of computing a CNOP in the CESM by combining the concepts of reducing dimensions and an optimization algorithm.

For six El Niño events with different intensities and diversities, this study finds three types of OGEs in the CESM and calls them type-1, type-2 and type-3 initial errors. Among them, the type-1 and type-2 initial errors have similar spatial structures and dynamic growing mechanisms, except that the errors have the opposite sign. The type-1 (type-2) initial error is characterized by negative (positive) SSTA errors in the equatorial Pacific accompanied by a negative (positive) west–east slope of subsurface temperature from the subsurface to the surface in the equatorial central-eastern Pacific. For these two types of OGEs, the negative (positive) sea temperature errors in the eastern tropical Pacific are locally intensified and eventually cause a prediction error similar to a mature La Niña (El Niño)-like mode by the Bjerknes positive feedback mechanism. The type-3 initial error behaves as a basin-wide dipolar pattern of tropical sea temperature errors from the sea surface to the subsurface, with positive errors in the upper layers of the equatorial eastern Pacific and negative errors in the lower layers of the equatorial western Pacific. During the earlier period, the negative feedback associated with equatorial Kelvin waves traveling from the western equatorial Pacific to the east equatorial Pacific plays an important role, while once negative SST errors in the eastern equatorial Pacific appear, the Bjerknes positive feedback mechanism becomes the leading factor and makes the negative SST error ultimately also evolve into a mature La Niña-like mode. Although the type-1 and type-3 initial errors have different spatial patterns, both eventually induce a prediction error similar to the mature La Niña-like mode at the end of prediction time and cause El Niño events to be underpredicted or even predicted as neutral states or La Niña events. The type-2 initial error evolves into a mature El Niño-like mode, which may lead to a relatively weak El Niño event to be predicted as an extremely strong El Niño event.

As mentioned in the introduction, many endeavors have previously been made to explore the OGEs of El Niño events based on CNOP approach. These results can be strongly model dependent, ranging from intermediate coupled models, to comprehensive coupled general circulation models. Mu et al. (2007b) and Yu et al. (2009) used adjoint methods to calculate the CNOP-type errors of EP-El Niño in the ZC model and identified two types of OGEs. One type possesses an SSTA pattern with positive errors in the central-western equatorial Pacific and negative errors in the eastern equatorial Pacific and a thermocline depth pattern with negative errors along the equator. Another type possesses spatial patterns that are almost opposite to those of the former type. In their results, the initial negative (positive) SST errors in the eastern equatorial Pacific develop locally into a La Niña (El Niño)-like mode by the Bjerknes positive feedback mechanism, which are somewhat similar to the developing processes of the type-1 and type-2 OGEs in this study. However, due to the limitation of the simplicity of the ZC model, they cannot identify the type-3 OGE, emphasizing the propagation and development of the negative subsurface temperature errors located in the western equatorial Pacific. It is worth mentioning that, in this study, the OGEs for some EP-El Niño and CP-El Niño events have similar spatial patterns and development processes. Tian and Duan (2016) also suggested the same opinion based on the results in the corrected ZC model. Although their EP-type-1 and CP-type-1 CNOP errors developed into El Niño-like modes and induced both EP-El Niño and CP-El Niño to be predicted as extremely strong EP-El Niño events, the type-1 OGE in the CESM develops into a La Niña-like mode and induces two types of predicted El Niño events to be predicted as neutral states or La Niña events.

Based on another intermediate coupled model (ICM, Zhang et al 2003), Tao et al (2017) also calculated CNOP by an adjoint method and investigated the largest error growth in El Niño predictions. Obviously different from those derived from the ZC model and CESM, their CNOP-type initial errors in SSTAs and sea level anomalies (SLAs) possess the characteristic of seasonal variation, rather than sensitive to the phase and intensity of El Niño. For the SSTA components of four seasonal CNOPs, negative errors can be clearly seen near the dateline in the equatorial Pacific, however, the structures off the equator are totally different. Compared with CNOP-type SSTA errors, the optimal SLA errors are relatively less sensitive to the season and share a dipolar pattern along 10°N, with a positive error in the west and a negative error in the east. Besides, a strong negative SLA CNOP in the equatorial eastern Pacific is distinct in autumn, winter and spring, while a weak positive signal in summer. For the reasons that CNOPs in the ICM depend on the annual cycle, rather than the phase and intensity of El Niño, Tao et al (2017) hold the view that this may be due to that the statistical relationship between the wind stress anomaly and SSTA depends on the season in the ICM, and the relationship between the temperature anomalies Te of subsurface waters entrained into the mixed layer and the thermocline displacement are nonlocal, which makes the largest entrainment temperature anomalies Te occurred in the equatorial central Pacific. Then, the optimal initial errors in the ICM depend on the start season and the SSTA CNOPs are distinct in the equatorial central Pacific. Though CNOPs in the ICM vary seasonally, there are some similarities between them and type-1 OGE in this study. All of them have negative SSTA errors in the equatorial central Pacific. In addition, in the equatorial eastern Pacific, the optimal SLA errors in the ICM are negative except for in summer, and type-1 OGE in the CESM is also negative from the subsurface to surface, which means, in this region, they have the negative heat content errors in the upper ocean. All these CNOP-type initial errors evolve into the La Niña modes and tend to make the El Niño events to be underpredicted.

Duan and Hu (2016) used an ensemble-based method to find two types of initial errors that induce significant spring predictability barriers for EP-El Niño events in the CESM. Comparing the numerical results of two studies, it is obvious that the two types of initial errors found in Duan and Hu (2016) correspond qualitatively to the type-1 and type-3 OGEs in this study, regardless of their spatial structures and development processes. The reason that they cannot find the type-2 OGE, which develops into an El Niño-like mode and tends to predict the “true” El Niño as an extremely strong event, may be because they focused only on the strong EP-El Niño events, whereas we paid attention to both strong and relatively moderate El Niño events (for example, the EP4 case). Although the results for the two studies are somewhat similar, they are quite different. First, the purposes of the two studies are different. This study aims to find the initial errors that produce the largest prediction error at the prediction time, that is, OGE. However, Duan and Hu (2016) focused on finding the initial errors that produce a significant spring predictability barrier (SPB). Although such initial errors also produce large prediction errors, they may not be the initial errors that have the greatest impact on the forecast results. Second, the methods used in the two studies are different. In this study, the optimization algorithm is used to directly compute OGE. Duan and Hu (2016) used an ensemble-based method to obtain two types of initial errors that produce a significant SPB. As mentioned in the introduction, their results depend on the sample space, which cannot cover all realistic initial errors that yield a significant SPB. In addition, the results obtained by CEOF and composite analysis may be common (basic) characteristics suitable for many cases but may not be the optimal solution for a particular El Niño event. In any case, the similarities between the results in Duan and Hu (2016) and those in the present study establish the reasonability and validity of our results. However, this study avoids the difficulty of coding the adjoint model and computes the CNOP directly in the complex CESM by combining the concepts of reducing dimensions and the DE intelligent algorithm, which may provide another way to extend the application of the CNOP method in complicated coupled models.

Three types of OGEs with different spatial features, for El Niño predictions, are found in this study. According to the spatial modes and dynamical growth mechanisms, initial sea temperature errors that lead to large prediction errors, mainly originate from two regions. One locates in the upper layers of the tropical eastern Pacific, and the other locates in the subsurface of the equatorial western Pacific. Based on the hindcast/forecast data from the North American Multimodel Ensemble (NMME) system (Kirtman et al. 2014), Hua and Su (2020) found that the main SST initial errors leading to the failure of El Niño forecast are mainly from the tropical southeast Pacific, which supports our results. In addition, Duan and Wei (2013) used the realistic forecast data generated by the complicated coupled FGOALS-g model (Yan and Yu 2012) and pointed out that the initial errors that grow in a manner similar to El Niño or La Niña events were most likely to result in large prediction errors for El Niño forecasts. In other words, these initial errors have the similar growth mechanism to those of ENSO events. Many studies have shown that, one of main developing mechanisms of ENSO is that initial anomalous signals occur in the subsurface of the equatorial western Pacific, and then propagate eastward through the equatorial Kelvin wave (Battisti and Hirst 1989; Ballester et al. 2015; Lai et al. 2015), which emphasizes the role of the subsurface of the equatorial western Pacific and supports the results in this study.

It is worthing noting that this study reduces the dimensions of optimization problem and calculates CNOPs in a truncated phase space. It is different from the method that reduces the dimensions of the CESM directly and calculates the CNOP of a truncated model. The dimensions of complex coupled models can be decreased directly by reducing model resolution. If the reduction of resolution is not very large, the cost that calculates CNOP directly using intelligent optimization algorithms is still very high. It is necessary to decrease the resolution to a certain extent to ensure that CNOP can be directly calculated. However, too low resolutions will seriously affect the simulation and prediction ability of the model. As a result, it is difficult to guarantee that the calculated CNOP can represent that in the original complicated model. However, the dimension reduction based on SVD-based EP algorithm in this study can not only retain the simulation ability of original models, but also assure that the appropriate number of SV modes are selected under present calculating conditions, so as to succeed in computing CNOPs and grasping their main characteristics. As Chen et al. (2015b) pointed out, the distributions of significant SV modes are closely related to physical problems. Compared with low-resolution models, the number of SVs selected to construct CNOPs will not increase sharply as the number of grid points increases in high-resolution models. The method used in this studies is suitable for solving the nonlinear optimization problems in complex models with high dimensions.

This work is a preliminary study, and only six El Niño events with different intensities and diversities are selected to compute the CNOP in the CESM. For La Niña events and neutral conditions, the predictability problems related to initial errors are also important. Hu and Duan (2016) and Hu et al. (2019) found that there exist similarities between optimal precursory perturbations superimposed on the neutral states and most likely to evolve into ENSO events, and OGEs associated with ENSO events in the CESM, and emphasized that the off-equatorial regions around 10°N in the central Pacific may also be the important error resources for La Niña predictions. These results are equally obtained by the EOF-based ensemble method and need to be further studied by the optimization method. More importantly, many researches, including this study, are based on perfect model hypothesis, it is necessary to validate these results in hindcast experiments or realistic forecasts of ENSO events. In addition, this study mainly focuses on the effect of initial errors in the tropical Pacific on El Niño prediction. However, many studies have shown that ENSO is often influenced by processes outside the tropical Pacific. Kao and Yu (2009) and Yu and Kim (2011) showed that wind forcing from the subtropical and extratropical atmosphere may affect the occurrence of CP-El Niño events by the seasonal footprinting mechanism and emphasized that the signals in the north Pacific are important for some CP-El Niño events. Saji et al. (1999) and Zhou et al. (2015) noted that the variability of the Indian Ocean dipole (IOD) can affect El Niño in the tropical Pacific through both the Indonesian through flow and atmospheric bridges. In this sense, initial errors in these regions may also affect El Niño predictions. In addition, we only emphasize the effect of oceanic initial error on the largest uncertainties in ENSO predictions. Apparently, atmospheric initial conditions and external forcings also affect the prediction skills of ENSO events. Zheng and Zhu (2010) suggested that the coupled assimilation of wind data can decrease the initial errors in ocean currents and improve SST forecast skills. Chen et al. (2015a) demonstrated that the random occurrence of westerly wind bursts (WWBs) in the tropical western Pacific may be an important factor that predicts the diversity of El Niño events successfully. In the future, we will select more ENSO cases, explore more important initial variables, and perturb initial variables in more extensive areas to study ENSO predictability, including spring prediction barriers and target observations.