1 Introduction

One of the most important modes of natural climate variability is the El Niño Southern Oscillation (ENSO). Many climate models can now simulate an ENSO cycle with a reasonable level of fidelity, but errors persist in both the mean climate of the tropical Pacific and in the structure, frequency and magnitude of ENSO (Bellenger et al. 2013). The CMIP3 models (Coupled Model Intercomparison Project, version 3), and more recently the CMIP5 models, have been used in many studies to assess ENSO processes and to investigate the impact anthropogenic climate change may have on ENSO, often with inconclusive results (Collins et al. 2010). Studies that consider alternative definitions of ENSO events (and extreme ENSO events) in terms of variations in precipitation seemingly show more robust changes in the future, but may also be influenced by mean biases in models (Cai et al. 2014, 2015). Errors and biases in models can influence the characteristics of the modelled ENSO and its projected response under climate change. Developing a greater understanding of ENSO feedback processes and their relation to model biases and the uncertainty of long-term ENSO projections is an ongoing priority area of research (e.g. the CLIVAR ENSO in a Changing Climate Research Focus).

ENSO is governed by a suite of air-sea feedback processes in the equatorial Pacific. As described by Bjerknes (1969), the development of El Niño involves weakening of the easterly trade winds as sea surface temperature (SST) warms in the eastern equatorial Pacific, rendering a reduced zonal SST gradient. The westerly wind anomalies drive weaker surface currents and upwelling, causing the thermocline to tilt eastward. The deepened thermocline in the east leads to increased SSTs which in turn reinforce the westerly wind anomaly, and so forth. These reinforcing processes are termed the Bjerknes feedback. Conversely, negative SST anomalies (SSTAs) associated with La Niña, the cool ENSO phase, cause an opposite ocean response resulting in further SST cooling. The development of the Bjerknes Stability Index (BJ index, Jin et al. 2006) allows for a relatively simple method of quantifying these equatorial Pacific feedback strengths. The BJ index includes the damping feedback of the mean ocean currents and damping via reduced atmosphere–ocean heat flux, termed the thermodynamic damping. The positive feedbacks to a warm SST anomaly resulting from anomalous eastward advection in the ocean, weakening east Pacific upwelling and a flattening thermocline, referred to as the zonal advective, Ekman and thermocline feedbacks respectively, are also included in the BJ index. The BJ index is derived from a linear stability analysis of a simplified model of the upper ocean and atmosphere in the Pacific (see Sect. 3 for details).

Coupled atmosphere–ocean models still struggle to accurately represent ENSO and the feedbacks that control ENSO characteristics. ENSO amplitude and period are often diverse in multi-model ensembles. Here the focus is on the dominant positive and negative ENSO feedbacks, which are found in a number of studies assessing ENSO feedbacks in models to be areas for improvement (Guilyardi 2006; Lin 2007; Lloyd et al. 2009, 2012; Kim and Jin 2010a; Bellenger et al. 2013; Kim et al. 2013).

Thermodynamic damping, the strength of the response of surface heat flux to sea surface temperature anomalies in the east equatorial Pacific, is a major source of ENSO diversity. This feedback is usually too weak in models and shows large variations in both CMIP3 and CMIP5 (Lloyd et al. 2009; Kim and Jin 2010a; Kim et al. 2013). This is often primarily related to an underestimated shortwave damping response caused by weak atmospheric ascent in response to east Pacific sea surface temperature anomalies during El Niños as well as a weak response of clouds (Lloyd et al. 2012). Guilyardi et al. (2009) and Bony and Dufresne (2005) also find that the distribution of atmospheric convection is important for an accurate thermodynamic damping. Another important component of thermodynamic damping is the cooling of SSTs via latent heat flux, which is also found to be weak in many CGCMs and can be linked to biases in wind speed or near surface humidity sensitivity in the equatorial Pacific (Lin 2007; Lloyd et al. 2010).

In the case of positive feedbacks, such as the zonal advective, Ekman and thermocline feedbacks, the strength of the coupling between atmosphere and ocean, e.g. the response of surface wind stress to sea surface temperature anomalies, is important. This coupling has been found to be generally weak in models and varies in strength between them (Guilyardi 2006; Lloyd et al. 2009; Kim and Jin 2010a; Kim et al. 2013). The strength of this coupling is slightly improved in atmosphere-only models and CMIP5 relative to CMIP3 (Lloyd et al. 2010; Bellenger et al. 2013; Kim et al. 2013) though the reasons behind this improvement are unclear. Weak positive ENSO feedbacks in models are also related to other ocean–atmosphere feedback loops such as the response of ocean currents to wind stress anomalies, important to the zonal advective feedback, and the response of the thermocline slope to wind stress anomalies which is a key component of the thermocline feedback (Lübbecke and McPhaden 2013, 2014; Graham et al. 2014). The biases in these feedbacks can often be linked to errors in the mean equatorial Pacific climate. In particular, the cold tongue bias and weak upper ocean stratification common in models can coincide with weak air-sea couplings (Kim et al. 2013).

Multi-model studies into the response of ENSO to climate change generally do not indicate robust responses, particularly when investigating ENSO amplitude and period (van Oldenborgh et al. 2005; Guilyardi 2006; Merryfield 2006; Yeh et al. 2006; Vecchi and Wittenberg 2010; Collins et al. 2010; DiNezio et al. 2012). However, recent studies using alternative metrics (e.g. precipitation as an indicator of extreme El Niño events) find a projected increase in strong events (Cai et al. 2014, 2015). Santoso et al. (2013) also find a projected increase in extreme events when considering zonal SST propagation as an indicator of strong El Niños. These studies highlight the need to also include process-based metrics in ENSO analysis. It has been suggested that initial mean state or the relative dominance of different feedbacks may have an impact on the ENSO amplitude response to climate change (Zelle and van Oldenborgh 2005; Merryfield 2006; Yeh and Kirtman 2007). A stability analysis of CMIP3 models by Kim and Jin (2010a, b) also finds different ENSO amplitude and ENSO stability responses to warmer climate conditions in that ensemble. Despite this inter-ensemble variation, consistent ENSO feedback responses were found for thermodynamic damping (becoming more negative), zonal advective feedback and Ekman feedback (becoming more positive) in response to increased CO2. These feedback responses counteract each other, resulting in different ENSO stability (the feedback summation) responses in different models. Similar results have also been found in analyses by DiNezio et al. (2012) and Philip and van Oldenborgh (2006). Despite this, Kim and Jin (2010a, b) find that ENSO stability is a good predictor of ENSO amplitude in CMIP3. In more recent models, ENSO stability climate change responses may be less diverse; Kim et al. (2014) find that a subset of the CMIP5 ensemble, which they judge to most accurately simulate the observed relative importance of feedback terms, does agree on an ENSO amplitude and stability response which rises then falls in later decades. However, the inter-ensemble link between stability and amplitude is less clear in CMIP5 when compared to CMIP3 (Kim et al. 2013).

While the BJ index has proved to be a powerful tool in the assessment of ENSO feedback strength, an ongoing discussion involves the reliability and accuracy of the BJ index as a quantitative predictor of ENSO amplitude. Graham et al. (2014) compare the BJ index with a calculation of the full non-linear mixed-layer heat budget and find that positive feedbacks, associated with ocean processes, tend to be misrepresented by the BJ index. The linear calculation of atmospheric feedbacks is also known to result in underestimated feedback strength in the case of the shortwave damping and the Bjerknes feedback (Jin et al. 2003; Timmermann et al. 2003; An and Jin 2004; Lloyd et al. 2012; Bellenger et al. 2013).

One potential issue is in the area-averaging used in the BJ Index calculation. This involves the choice of various boxes, over which different variables are averaged in order to estimate ENSO feedbacks. CMIP models still feature systematic errors and biases in e.g. the distribution of SSTs, with the cold tongue being too cold and extending too far into the west. Biases are different in different models and, ideally, one should adjust the locations and sizes of the different box-averages in order to accurately capture the different feedback processes (see Kim and Jin 2010a, b).

One way of artificially minimising biases in models is to use flux adjustments. While not a way of formally correcting models, the approach can be useful in understanding the role of model biases on e.g. modelled variability or the climate change response. Here we exploit a ‘perturbed physics’ ensemble of models in which flux adjustments are employed to prevent model drift due to radiation imbalances caused by the parameter perturbations (Collins et al. 2011). This has the side effect that the SST distribution in the different model versions is relatively close to that observed, thus alleviating some of the issues of model biases. However, flux adjustment does not solve all problems and there are still some biases in models and some differences in mean state. These are exploited to relate errors in feedbacks to errors in that mean state. The use of a perturbed physics approach results in an ensemble that covers a range of ENSO variability allowing for inter-ensemble relationships to suggest possible causes of feedback biases. While it does not represent the diversity of different mean-state errors, we consider it a stepping-stone to the understanding of the multi-model ensemble. If we can relate feedbacks to mean state errors in an ensemble in which model mean states are close to each other, but in which ENSO characteristics are quite different, this should aid us in the much harder multi-model problem.

Section 2 outlines the perturbed physics ensemble (PPE) used here as well as the reanalysis data used for comparison. The BJ index and its calculation are introduced in Sect. 3. In Sect. 4 the stability analysis results for the HadCM3 PPE during 1986–2015 are given in comparison to the reanalysis data. ENSO feedbacks are also related to the equatorial mean state of the PPE. The ENSO climate change response of the PPE is shown in Sect. 5 and possible reasons for the responses found are suggested. Section 6 briefly assesses the relationship in the PPE between ENSO amplitude and atmospheric noise. Results are summarised and discussed in Sect. 7 and suggestions for future work are given.

2 Data

2.1 HadCM3 perturbed physics ensemble (PPE)

The analysis carried out here uses a HadCM3 perturbed physics ensemble. Perturbed physics ensembles are generated from a single model and are a relatively easy way of producing a large ensemble, in comparison to a multi-model ensemble, which uses a number of different models. Perturbed physics allows for the experiment to be controlled while exploring uncertainties in processes or feedbacks. A single model is used while values of uncertain parameters are perturbed. Different physical schemes can also be switched in and out as well as perturbing parameters. This approach enables sources of uncertainty in projections due to uncertain parameters (rather than model structure) to be found. The PPE features a fully coupled version of HadCM3 (Collins et al. 2011) with an interactive sulphur cycle. The ensemble consists of a total of thirty-three members, which can be split into two sixteen-member sub-ensembles with one ensemble member having standard HadCM3 parameters. The first sub-ensemble features multiple perturbations to only parameters in the atmosphere component simultaneously (as opposed to single parameter perturbations). The second has perturbations to only schemes and parameters in the ocean component but features standard atmosphere settings. The PPE also uses flux adjustment in order to avoid model drift caused by top of atmosphere imbalances due to perturbations and to improve regional climate change and feedback simulation. The flux adjustment varies spatially and seasonally and is computed during a ‘spin-up’ phase and fixed (with a seasonal cycle) throughout the integration of all the experiments used here. Thus flux adjustments do not damp SST anomalies, as they do not depend on the value of the SST anomaly.

We analyse output from experiments spanning 200 years beginning in 1896 with historical forcings followed by the SRES A1B emissions scenario. This scenario describes a future of rapid economic growth and a peak in global population mid-century. Eighteen overlapping de-trended 30 year time periods with start years staggered at 10 year intervals (1896–1925, 1906–1935, …, 2066–2095) are used when responses under climate change are analysed. Fields used are sea temperatures to depth of 300 m, surface zonal wind stress, surface zonal ocean current, surface meridional ocean current, upwelling ocean current, precipitation, surface sensible heat flux, surface latent heat flux, shortwave radiation and longwave radiation.
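As a minimal sketch of how the staggered analysis windows described above can be enumerated, the following lists the overlapping 30 year periods as inclusive (start, end) year pairs. The function name and its defaults are illustrative assumptions and are not taken from the model archive or any library.

```python
def staggered_periods(first_start=1896, last_end=2095, length=30, step=10):
    """Return inclusive (start, end) year pairs for overlapping windows.

    Defaults follow the periods quoted in the text; the function itself
    is a hypothetical helper written for illustration only.
    """
    periods = []
    start = first_start
    # Keep adding windows while the whole window fits inside the record.
    while start + length - 1 <= last_end:
        periods.append((start, start + length - 1))
        start += step
    return periods


periods = staggered_periods()
print(len(periods), periods[0], periods[-1])
```

Each window would then be de-trended separately before anomalies are computed.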

2.2 CMIP5 HadCM3

Comparison of the results in this study with previous studies required the use of the CMIP5 versions of HadCM3. CMIP5 has been designed for use in AR5 (Taylor et al. 2012). Monthly means of ocean temperature, surface heat flux, zonal and meridional ocean currents and surface wind stress for 1950–2000 of the historical HadCM3 runs are used for the calculation of ENSO feedbacks.

2.3 Reanalyses

In order to calculate an observed value of the BJ index, the SODA 2.2.4 and OAFlux datasets are used. The SODA 2.2.4 (Simple Ocean Data Assimilation; Carton et al. 2005; Carton and Giese 2008) reanalysis uses an ocean general circulation model that is based on the Parallel Ocean Program (POP) physics and has a resolution of 0.25° latitude by 0.4° longitude with 40 levels. SODA covers 1871–2010 and version 2.2.4 uses 20CRv2 surface wind stress as observed surface forcing. The model is then nudged towards observations that include hydrographic profiles, moored hydrographic observations and remotely sensed SST data. Fields used in the calculation of the Bjerknes stability index for the time period 1984–2009 are monthly mean values of sea temperature, zonal surface wind stress, zonal ocean velocity, meridional ocean velocity and vertical ocean velocity.

The OAFlux (Objectively Analyzed air-sea fluxes, Yu and Weller 2007) is a 49 year analysis which uses a combination of satellite data and atmospheric reanalyses to produce products on a 1° latitude-longitude grid. OAFlux provides the latent and sensible heat fluxes for the calculation of the observed BJ index here as well as sea surface temperature used in the calculation of the thermodynamic damping observed value. OAFlux data covers the time period of 1958–2010 though only 1984–2009 is used here due to the availability of shortwave and longwave radiation fields. OAFlux net surface heat flux is obtained by combining OAFlux sensible and latent heat fluxes with the ISCCP (International Satellite Cloud Climatology Project, Schiffer and Rossow 1983) shortwave and longwave radiation.

3 Methods

The BJ index is derived from the linear equation for sea temperature anomalies averaged over the mixed layer (e.g. An et al. 1999), which are then area-averaged over the central and east equatorial Pacific, the areas which are most relevant to ENSO-related variability. Here the mixed layer is taken to have a fixed depth of 50 m. This is based on a PPE mean mixed layer depth of 50.5 ± 4.9 m, calculated as the depth at which the temperature is 0.5 °C below the SST (e.g. Philip and van Oldenborgh 2006). A number of linear approximations can then be used to obtain equations which form a simple linear coupled recharge oscillator system with growth rate \(I_{BJ}\) such that
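The mixed layer depth criterion can be illustrated with a short sketch. It assumes a single temperature profile whose first level stands in for the SST, and it interpolates linearly between model levels; the function name and the example profile are hypothetical and are not drawn from the PPE output.

```python
def mixed_layer_depth(depths, temps, delta=0.5):
    """Depth (m) at which temperature first falls `delta` below temps[0].

    Assumes depths increase downward and temps[0] represents the SST.
    Returns None if the threshold is not reached within the profile.
    """
    target = temps[0] - delta
    for k in range(1, len(depths)):
        if temps[k] <= target:
            z0, z1 = depths[k - 1], depths[k]
            t0, t1 = temps[k - 1], temps[k]
            # Linear interpolation between the two bracketing levels.
            return z0 + (t0 - target) * (z1 - z0) / (t0 - t1)
    return None


# Hypothetical profile (depth in m, temperature in deg C).
depths = [5, 15, 25, 35, 45, 55, 65]
temps = [27.0, 27.0, 26.9, 26.8, 26.6, 26.4, 25.0]
mld = mixed_layer_depth(depths, temps)
```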

$$I_{BJ} = \frac{R}{2},$$
(1)
$$R = CD + TD + ZA + EK + TC.$$
(2)

R represents the collective strength of various ENSO feedbacks. As the BJ index \(\left( {I_{BJ} } \right)\) represents the growth rate of the system, this means that for \(I_{BJ} > 0\) the leading mode of the system is linearly unstable, and when \(I_{BJ} < 0\), the leading ENSO mode is damped.
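Equations (1) and (2) amount to a simple sum, sketched below. The component values are hypothetical placeholders chosen only to show the sign convention (damping terms negative, positive feedbacks positive); they are not results from this study.

```python
def bj_index(cd, td, za, ek, tc):
    """Return I_BJ = R / 2 with R = CD + TD + ZA + EK + TC (all in year^-1)."""
    r = cd + td + za + ek + tc
    return r / 2.0


# Hypothetical feedback strengths in year^-1, for illustration only.
i_bj = bj_index(cd=-0.5, td=-1.0, za=0.3, ek=0.4, tc=1.2)
is_unstable = i_bj > 0  # a positive growth rate means a linearly unstable mode
print(round(i_bj, 3), is_unstable)
```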

The first two feedbacks represent the damping feedbacks of mean current damping (CD) and thermodynamic damping (TD) such that

$$CD = - \left( {\frac{{\langle \bar{u}\rangle_{E} }}{{L_{x} }} + \frac{{\langle - 2y\bar{v}\rangle_{E} }}{{L_{y}^{2} }} + \frac{{\langle H\left( {\bar{w}} \right)\bar{w}\rangle_{E} }}{{H_{m} }}} \right),$$
(3)
$$TD = - \alpha ,$$
(4)
$$- \alpha = - \alpha_{SW} - \alpha_{LW} - \alpha_{SH} - \alpha_{LH} ,$$
(5)

where \(\langle \rangle_{E}\) represents area-averaging over the east equatorial Pacific (averaging bounds of longitude 180°E–280°E, latitude 5°S–5°N). The overbar represents the time mean and u, v, w are the surface zonal, surface meridional and upwelling ocean current respectively. A step function, \(H\left( x \right)\), is used for upwelling currents such that only positive upwelling values are used \(\left( {H\left( {\bar{w}} \right) = 0\;{\text{for}}\;\bar{w} < 0,\;H\left( {\bar{w}} \right) = 1\;{\text{for}}\;\bar{w} > 0} \right)\). Lx and Ly are the longitudinal and latitudinal extents of the area averaging region. Hm is taken to be 50 m (depth of mixed layer). The thermodynamic damping term, α, is calculated as the slope of the east Pacific heat flux anomalies regressed onto east Pacific sea surface temperature (SST) anomalies. The thermodynamic damping can be split into separate heat flux components; shortwave radiation (αSW), longwave radiation (αLW), sensible heat (αSH) and latent heat (αLH) fluxes. These are calculated by regressing the seasonal anomalies of the relevant heat flux in the east Pacific against east Pacific SST anomalies. In order to obtain all feedbacks in units of year−1 the heat flux dampings are normalised using the heat capacity of water (cp = 4180 Jkg−1 K−1), the density of seawater (ρ = 1029 kgm−3) and the mixed layer depth, taken fixed as 50 m. Note that the usual sign convention is for surface fluxes to be defined as positive upward. A positive net heat flux (from the ocean to the atmosphere) would lead to a cooling of SSTs. Hence the use of the negative sign in Eq. (4).
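The normalisation of the heat flux damping into units of year−1 can be sketched as follows. This is an illustrative calculation under stated assumptions: a 365 day year, fluxes defined positive upward, and a synthetic anomaly series standing in for east Pacific area averages. The function name is hypothetical.

```python
import numpy as np

RHO = 1029.0   # density of seawater, kg m^-3 (value quoted in the text)
CP = 4180.0    # heat capacity of water, J kg^-1 K^-1 (value quoted in the text)
H_M = 50.0     # fixed mixed layer depth, m
SECONDS_PER_YEAR = 365.0 * 86400.0  # assumption: 365-day year


def thermodynamic_damping(sst_anom, flux_anom_up):
    """Return TD = -alpha in year^-1.

    alpha is the least-squares slope (W m^-2 K^-1) of upward heat flux
    anomalies regressed onto SST anomalies, normalised by rho * cp * H_m.
    """
    alpha_wm2k = np.polyfit(sst_anom, flux_anom_up, 1)[0]
    alpha_per_year = alpha_wm2k * SECONDS_PER_YEAR / (RHO * CP * H_M)
    return -alpha_per_year


# Synthetic series: flux responds at exactly 15 W m^-2 K^-1 (made-up value).
months = np.arange(360)
sst = np.sin(2 * np.pi * months / 48.0)
flux = 15.0 * sst
td = thermodynamic_damping(sst, flux)
```

With these constants, 1 W m−2 K−1 corresponds to roughly 0.147 year−1, so the synthetic slope of 15 W m−2 K−1 gives a damping of about −2.2 year−1.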

The remaining three feedbacks are the positive feedbacks of zonal advective (ZA), Ekman (EK) and thermocline (TC) feedbacks where

$$ZA = \mu_{a} \beta_{u} \left\langle {\frac{{ - \partial \bar{T}}}{\partial x}} \right\rangle_{E} ,$$
(6)
$$EK = \mu_{a} \beta_{w} \left\langle {\frac{{ - \partial \bar{T}}}{\partial z}} \right\rangle_{E} ,$$
(7)
$$TC = \mu_{a} \beta_{h} \left\langle {\frac{{\bar{w}}}{{H_{1} }}} \right\rangle_{E} .$$
(8)

The Bjerknes feedback, μa, is the linear regression coefficient of equatorial Pacific wind stress anomalies against east Pacific SSTAs. βu is found by regressing east Pacific zonal surface ocean current anomalies against equatorial Pacific surface wind stress anomalies, βw is the regression coefficient of east Pacific upwelling ocean current against equatorial Pacific surface wind stress anomalies and βh is obtained by the regression of thermocline slope (represented as the difference between east and west Pacific sea temperatures averaged from the surface to a depth of 300 m) anomalies against equatorial Pacific surface wind stress anomalies. Once calculated, the feedbacks are summed together to obtain the total BJ index (example shown in Fig. 2).
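The chain of regressions behind Eq. (6) can be sketched in the same way. The code below is an illustration under assumptions: the input series are synthetic, the magnitudes are invented, and the mean zonal temperature gradient is supplied as a single number in K m−1. The Ekman and thermocline terms would follow the same pattern with βw or βh and the corresponding mean-state factor.

```python
import numpy as np

SECONDS_PER_YEAR = 365.0 * 86400.0  # assumption: 365-day year


def slope(x, y):
    """Least-squares slope of y regressed onto x."""
    return np.polyfit(x, y, 1)[0]


def zonal_advective_feedback(sst_e, taux_eq, u_e, minus_dTdx_mean):
    """Return ZA = mu_a * beta_u * <-dT/dx>_E, converted from s^-1 to year^-1."""
    mu_a = slope(sst_e, taux_eq)    # wind stress response to east Pacific SSTA
    beta_u = slope(taux_eq, u_e)    # zonal current response to wind stress
    return mu_a * beta_u * minus_dTdx_mean * SECONDS_PER_YEAR


# Synthetic anomalies with invented coupling strengths.
months = np.arange(360)
sst_e = np.sin(2 * np.pi * months / 48.0)   # K
taux_eq = 0.01 * sst_e                      # N m^-2
u_e = 5.0 * taux_eq                         # m s^-1
za = zonal_advective_feedback(sst_e, taux_eq, u_e, minus_dTdx_mean=5.0e-7)
```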

Clearly, the use of area-averaging in the calculation of the BJ index is important and while a method for choosing the longitudinal extents of the area-averaging boxes exists (given in Kim and Jin 2010b) this is not used here as the use of flux adjustment in the ensemble minimises spatial differences between ensemble members. Instead, a fixed division between the east and west Pacific at longitude 180° is chosen for all time periods and ensemble members with an east boundary at 80°W for east Pacific area-averaging. Averaging over the full basin (e.g. for wind stress in calculation of μa) has the same east boundary but a west boundary at 120°E. Sensitivity tests to the definitions of the dividing boundary of the two regions show that the main conclusions of the paper are not affected by these choices.

The role of atmospheric noise is also estimated in Sect. 6 as the residual of wind stress as a function of equatorial Pacific SST (Philip and van Oldenborgh 2009, 2010; Philip et al. 2010) given as:

$$\tau_{x} \left( {x,y,t} \right) = \mathop \sum \limits_{i = 1}^{n} A_{i} \left( {x,y} \right)T_{i} \left( t \right) + \epsilon \left( {x,y,t} \right).$$
(9)

where \(\tau_{x} \left( {x,y,t} \right)\) is the tropical Pacific surface wind stress anomaly, Ti(t) are SST anomalies area-averaged over n separate regions along the equator (here we choose three in keeping with Philip et al. 2010), Ai(x, y) are the wind stress patterns corresponding to the relevant SST anomalies and \(\epsilon \left( {x,y,t} \right)\) is the stochastic forcing by random wind stress variations, i.e. the wind stress residual unrelated to SST anomalies, which is referred to throughout as the ‘atmospheric noise’. The three averaging areas used divide the longitude range 140°E–280°E into three equal regions with latitude bounds of 5°S–5°N (Philip and van Oldenborgh 2010). To assess the strength of the noise in the central equatorial Pacific the standard deviation of the area-average over Niño 3.4 is taken (longitude of 190°E–240°E, latitude of 5°S–5°N). Other analysis for the mean Pacific climate involves area-averaging over the Niño 3 (longitude of 210°E–270°E, latitude of 5°S–5°N) and Niño 4 (longitude of 160°E–210°E, latitude of 5°S–5°N) regions. For analysis of zonal ocean currents Niño 3′ and Niño 4′ regions are also defined with the same longitude bounds as Niño 3 and Niño 4 but with latitude bounds of 2.5°S–2.5°N.
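A minimal sketch of the decomposition in Eq. (9) is given below, assuming the wind stress anomalies have been flattened to a (time, grid point) array and that the n box-averaged SST anomaly series are supplied as columns. The least-squares fit is done with `numpy.linalg.lstsq`; the function name, array layout and synthetic data are assumptions made for illustration.

```python
import numpy as np


def wind_stress_noise(taux_anom, sst_boxes):
    """Split wind stress anomalies into an SST-related part and a residual.

    taux_anom: array (time, points) of zonal wind stress anomalies.
    sst_boxes: array (time, n) of box-averaged SST anomalies T_i(t).
    Returns (patterns A_i with shape (n, points), residual epsilon).
    No intercept is fitted because both inputs are anomalies.
    """
    patterns, *_ = np.linalg.lstsq(sst_boxes, taux_anom, rcond=None)
    residual = taux_anom - sst_boxes @ patterns
    return patterns, residual


# Synthetic test case with known patterns and added random noise.
rng = np.random.default_rng(1)
n_time, n_points = 360, 12
sst_boxes = rng.normal(size=(n_time, 3))
true_patterns = rng.normal(size=(3, n_points))
taux = sst_boxes @ true_patterns + 0.1 * rng.normal(size=(n_time, n_points))

patterns, residual = wind_stress_noise(taux, sst_boxes)
# Noise strength: S.D. of the residual averaged over a region (here all points
# stand in for the Nino 3.4 box).
noise_strength = residual.mean(axis=1).std()
```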

Error bars for area-averaged mean and standard deviation in the present day time period (1986–2015) are estimated using a moving block bootstrap with windows of 10 months for mean and 36 months for standard deviation. Ensemble mean error is given by one ensemble standard deviation. Error bounds for feedbacks are based on the 95% confidence interval for the linear least squares regression fits used in the calculation of the coefficients. This is calculated using the linear fit standard error and the 97.5th percentile of the Student's t distribution. Figures showing climate change trend lines are dashed or solid to demonstrate insignificant or significant trends respectively. A trend is classed as significant if the difference between 1986–2015 and 2066–2095 is larger than the standard deviation of the 1896–1985 time period (referred to as ‘natural variability’).
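The moving block bootstrap can be sketched as follows. This is one common variant (overlapping blocks drawn with replacement and concatenated to the original length), shown under the assumption that the error is taken as the standard deviation of the bootstrap estimates; the function name, the number of resamples and the synthetic series are illustrative choices.

```python
import numpy as np


def moving_block_bootstrap(series, block_len, stat=np.mean, n_boot=1000, seed=0):
    """Bootstrap standard error of `stat` using overlapping blocks.

    Blocks of `block_len` consecutive values are drawn with replacement and
    concatenated (then truncated) to the length of the original series.
    """
    series = np.asarray(series, dtype=float)
    n = series.size
    n_blocks = -(-n // block_len)          # ceiling division
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + np.arange(block_len)).ravel()[:n]
        estimates[i] = stat(series[idx])
    return estimates.std(ddof=1)


# Synthetic monthly anomaly series spanning 30 years.
months = np.arange(360)
sst_anom = np.sin(2 * np.pi * months / 48.0)
mean_err = moving_block_bootstrap(sst_anom, block_len=10, stat=np.mean)
sd_err = moving_block_bootstrap(sst_anom, block_len=36, stat=np.std)
```

Resampling blocks rather than individual months retains the serial correlation within each block, which is why longer blocks are used for the standard deviation.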

4 Twentieth century ENSO stability

Perturbations to both atmosphere and ocean parameters impact the amplitude of ENSO (Figs. 1g, 2a) with the standard deviation of Niño 3.4 SST anomalies ranging from 0.5 to 1.5 °C. Later analysis using mean climate conditions focuses on the Niño 3 and Niño 4 regions as opposed to just Niño 3.4 used here. This is to clarify differences in the relationship between feedbacks and the east and west Pacific. In the case of ENSO amplitude, using Niño 3 instead of Niño 3.4 does not alter the results given here, as inter-ensemble variations in these regions are closely related. It is found that ENSO amplitude bias is mixed but nineteen PPE members show too strong an ENSO amplitude; only seven PPE members show a weaker than observed ENSO amplitude (a significant bias is defined as lying outside of the reanalysis ±95% linear fit confidence interval for BJ index calculations or ±1 moving block bootstrap error for ENSO amplitude). SSTAs across the equator, not just in Niño 3.4, show this bias (Fig. 1g) with ocean perturbation ensemble members favouring the strongest SSTAs. The BJ index (Fig. 2b) during 1986–2015 shows a positive bias for twenty PPE members in comparison with observations. All other PPE members show no significant bias.

Fig. 1

a Mean sea surface temperature (SST) averaged over latitude −5° to 5° (1986–2015) for the standard HadCM3 PPE member (black line), atmosphere (red line) and ocean (blue line) HadCM3 PPE means with shading for ±1 S.D. and SODA reanalysis (dashed black line), b same as a but for mean zonal surface ocean current and averaged over latitude −2.5° to 2.5°, c same as a but for mean zonal surface wind stress, d same as a but for mean depth of 20 °C isotherm, e same as a but for mean upward net heat flux, f same as a but for mean precipitation, g same as a but for S.D. of SST anomalies. Vertical dashed lines show the east and west regions used in the calculation of the BJ index. Vertical solid lines show Niño 3 (east) and Niño 4 (west) regions

Fig. 2

a ENSO amplitude as represented by the standard deviation (S.D.) of Niño 3.4 Sea Surface Temperature Anomalies. b The BJ index (BJ) and its components, mean current damping (CD), thermodynamic damping (TD), and zonal advective (ZA), Ekman (EK) and thermocline (TC) feedbacks calculated for all HadCM3 Perturbed Physics Ensemble (PPE) members for 1986–2015. PPE members given by dots colored by perturbation type; black for standard parameters, blue for PPE members that feature perturbations to the ocean parameters and red for atmosphere parameter perturbations. Reanalysis values ±95% linear fit confidence intervals (for BJ index calculations) and moving block bootstrap error (for ENSO amplitude) are shown by the black horizontal dashed lines and grey shading. Ensemble means ± 1 S.D. shown by diamonds and attached error bars. Min/max error bars for PPE members are given along the top of figure

Here the total BJ index for the standard HadCM3 model is found to be 0.65 ± 0.26 year−1 (Table 1; Fig. 2) in comparison with a previous study by Kim et al. (2013) who find a HadCM3 BJ index of approximately 0.4 year−1 (approximated from the figures as the exact value is not stated in the text). Aside from the use of flux adjustments here, there are a number of differences in the methods of these two studies that result in slight differences in the feedback strengths found. This study mainly follows the original BJ index formula given in Jin et al. (2006), though we use the calculation of μa (the sensitivity of zonal wind stress to SSTA) given in Kim et al. (2013) as we consider this to be more consistent with recent studies (e.g. Graham et al. 2014; Lübbecke and McPhaden 2013). Our analysis is consistent with Graham et al. (2014); however, Kim et al. (2013) follow the formula in Kim and Jin (2010a, b), which may give a different result. The main difference between the formula used here and the one in Kim et al. (2013) is in the calculation of the mean current damping, which in the latter includes regression of the SSTAs at the longitudinal and latitudinal boundaries of the averaging box against the full box averaged SSTA as well as consideration of the boundary ocean current, rather than the full box averaged ocean current. The influence of mean upwelling is also not included in the alternative mean current damping calculation.

Table 1 Twentieth century (1986–2015 for PPE, 1984–2009 for SODA/OAFlux) and twenty-first century (2066–2095) mean SST in Niño 3 and Niño 4, ENSO amplitude (S.D. of Niño 3.4 SSTAs), BJ index, mean current damping (CD), thermodynamic damping (TD), zonal advective feedback (ZA), thermocline feedback (TC), and Ekman feedback (EK) for the reanalysis, the standard parameter ensemble member (0), the atmosphere PPE (members 1–16) mean (± S.D.) and the ocean PPE (members 17–32) mean (± S.D.)

A large discrepancy between the analysis presented here and that of Kim et al. (2013) lies in the thermodynamic damping, found by Kim et al. (2013) to be approximately 0.15–0.2 year−1 for the CMIP5 version of HadCM3. Thermodynamic damping for the standard PPE member in this study is significantly larger in strength. Aside from the formula differences outlined above, one of the main differences between these two studies is that the model experiment design differs between them. Kim et al. (2013) use the CMIP5 historical non-flux-adjusted HadCM3 run. Here we use a perturbed physics ensemble of HadCM3 with flux adjustment. Repeating the methods used here on the CMIP5 historical HadCM3 run gives a thermodynamic damping of −0.28 ± 0.06 year−1, much weaker than that found for the standard parameter HadCM3 PPE member (Table 1) in this study.

Other differences include the calculation of anomalies; Kim et al. (2013) use a 7 year smoothing as opposed to the seasonal anomalies used here. Finally, Kim et al. (2013) find the separating boundary between east and west Pacific averaging areas by using a method based on SST EOFs, whereas here the boundary is taken to be fixed at 180°. Therefore the feedbacks will be calculated over different areas in the equatorial Pacific.

Positive feedback discrepancies are smaller. Kim et al. (2013) find the HadCM3 zonal advective, Ekman and thermocline feedbacks to be approximately 0.3, 0.4 and 1.2 year−1 respectively. While there are some differences to this for the standard PPE member, such as the stronger Ekman feedback and weaker zonal advective feedback (see Table 1), these are most likely caused by the different methods described above and the use of flux adjustment in the PPE, which improves the HadCM3 feedback strengths in relation to the observed BJ index. The representation of the relative strength of the BJ index and its components can be tested by correlating the six points of the BJ index and its components for an ensemble member with those found using the reanalysis. A higher correlation then indicates a more accurate representation of the strengths of the feedbacks in relation to each other. The correlation between the components of the standard HadCM3 PPE member BJ index and SODA/OAFlux BJ index is increased to 0.97 in this analysis from 0.87 found by Kim et al. (2013). All but one PPE member have correlations exceeding 99% significance using this measure (by Student's t test), a criterion used in Kim et al. (2014) to select the ‘best’ models for assessing projected change in ENSO stability. The root mean squared error (RMSE) of an ensemble member’s BJ index and components with the reanalysis BJ index can also be used as a measure of accuracy (e.g. Kim et al. 2013). Here RMSEs range from 0.19 to 0.55 year−1, with ensemble members with lower RMSEs corresponding to those which show the higher correlations with the reanalysis BJ index and therefore have the more accurate relative feedback strengths. An important note here is that selecting the ‘best’ models in this ensemble, those with the lowest RMSEs (below 0.35 year−1), gives no significant difference in the projected response of the BJ index and ENSO amplitude in Sect. 5, in contrast to the MME case in Kim et al. (2014).
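The two accuracy measures described above (correlation and RMSE across the six values BJ, CD, TD, ZA, EK and TC) can be sketched as below. The numbers are hypothetical placeholders and are not the values in Table 1; the function name is an assumption made for illustration.

```python
import numpy as np


def compare_feedbacks(member, reference):
    """Return (Pearson correlation, RMSE) between two sets of six values.

    Each input holds the BJ index and its five components in year^-1,
    in the same order for the ensemble member and the reanalysis.
    """
    member = np.asarray(member, dtype=float)
    reference = np.asarray(reference, dtype=float)
    corr = np.corrcoef(member, reference)[0, 1]
    rmse = np.sqrt(np.mean((member - reference) ** 2))
    return corr, rmse


# Order: BJ, CD, TD, ZA, EK, TC. Hypothetical values for illustration only.
reference = [0.2, -0.5, -2.0, 0.6, 0.3, 2.0]
member = [0.4, -0.4, -1.0, 0.3, 0.4, 1.5]
corr, rmse = compare_feedbacks(member, reference)
```

A high correlation indicates that the relative ordering and spread of the feedbacks resemble the reanalysis, while the RMSE also penalises offsets in absolute strength.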

The use of perturbed physics provides a range of ENSO stability over the ensemble, either strengthening or weakening the feedbacks, often depending on the type of perturbation (e.g. perturbation to atmosphere or to ocean parameters). Atmosphere perturbations typically show increased ENSO stability (smaller or more negative BJ index) from the standard PPE member, with an atmosphere PPE mean (Table 1, PPE members 1–16) significantly different from the ocean PPE mean (Table 1, PPE members 17–32; 99% significance by a Student's t test). This is the result of weakened positive feedbacks and a stronger mean current damping in response to atmosphere perturbations. The atmosphere ensemble means for the CD, ZA, TC and EK feedbacks (Table 1) are all significantly below the ocean PPE means (99% significance by Student's t test). Ocean perturbations tend to have a destabilising effect, with increased positive feedbacks and a weaker mean current damping for ocean PPE members. In contrast, thermodynamic damping is strengthened by ocean perturbations and weakened by atmosphere perturbations. The atmosphere PPE mean damping is significantly weaker than the ocean PPE mean damping (99% significance by Student's t test). These results show that ocean perturbations strengthen ocean–atmosphere feedbacks while weakening the impact of mean currents. Conversely, atmosphere perturbations strengthen the impact of mean ocean currents but weaken ocean–atmosphere couplings.

The strongest positive feedback is the thermocline (TC) feedback which shows consistently high values around 1–2 year−1 in comparison with the other components, in agreement with the reanalysis. However, this feedback is still comparatively weak for many PPE members, the reasons for which are examined in Sect. 4.1. Twenty-eight out of the thirty-three PPE members have a TC feedback significantly weaker than the observed feedback. For the remaining positive feedbacks, the reanalysis shows the strength of the zonal advective feedback as around double the strength of the Ekman feedback (Table 1). This is not the case for the HadCM3 PPE where zonal advective and Ekman feedbacks tend to have similar magnitudes. The zonal advective feedback is generally too weak (twenty-eight PPE members significantly weaker) whereas the Ekman feedback shows no consistent significant biases, although eleven of sixteen ocean ensemble members are significantly stronger than the reanalysis. Thermodynamic damping shows the largest bias in the ensemble with nineteen ensemble members having thermodynamic damping with a magnitude less than half that of the reanalysis value. None of the ensemble members approach the observed damping rate of −2.58 ± 0.15 year−1. Again, the reasons for this are discussed in Sect. 4.1.

The relationship between ENSO amplitude and the BJ index in the PPE is somewhat weak, with a correlation between the two measures of only 0.41. Previous studies found stronger relationships (Jin et al. 2006; Kim and Jin 2010a, b; Kim et al. 2013). For example, Kim et al. (2013) found a positive correlation between ENSO amplitude and BJ index of 0.79 in a set of CMIP3 models. However, for the CMIP5 models, a few outlier models weakened the relationship considerably (Kim et al. 2013). The weaker correlation here is largely due to a negative correlation between thermodynamic damping and ENSO amplitude (−0.54), in contrast to previous studies which typically find weaker thermodynamic damping coincides with stronger ENSO (Lloyd et al. 2009). The remaining dominant feedbacks have positive correlations with ENSO amplitude; 0.56, 0.62 and 0.45 for zonal advective, Ekman and thermocline feedbacks respectively (correlations are 99% significant using a student’s t test for zonal advective and Ekman feedback; 95% significant for the thermocline feedback). This discrepancy with previous studies and possible reasons behind this are discussed further in the summary.

Despite many of the dominant feedbacks being weak, in agreement with previous studies (Lloyd et al. 2009; Kim and Jin 2010a; Kim et al. 2013), these biases counteract each other to a certain degree resulting in no consistent bias for the full BJ index.

4.1 Positive feedback biases in relation to mean climate

Equatorial sea surface temperatures in these flux-adjusted versions of HadCM3 are generally very well represented compared to other models (Guilyardi 2006). Mean equatorial Pacific temperatures are close to those observed (Table 1) and ENSO amplitude (standard deviation of Niño 3.4 SSTAs) is also relatively close to observations for the standard parameter model. Ensemble mean SSTs remain close to the reanalysis along the equator (Fig. 1a), with the most inter-ensemble variation occurring in atmosphere perturbation ensemble members; the ensemble standard deviation of mean Niño 3 SST is 0.39 °C for atmosphere perturbation PPE members compared to 0.23 °C for ocean perturbation PPE members. Note that the CMIP5 HadCM3 historical run has mean SSTs of 24.8 ± 0.09 °C and 27.2 ± 0.08 °C in the Niño 3 and Niño 4 areas respectively, significantly below the HadCM3 PPE means and the reanalysis. However, despite the accuracy of mean temperatures in comparison to other models, model errors introduce a number of biases in ocean currents and surface heat flux. Upwelling ocean currents tend to be too strong, with a PPE mean (±1 S.D.) Niño 3′ upwelling of 1.29 × 10^−5 ± 0.06 × 10^−5 m s−1 compared to observed upwelling of 1.00 × 10^−5 ± 0.03 × 10^−5 m s−1. The largest mean climate bias is in zonal surface ocean currents, which are too strong along the equator (Fig. 1b), particularly in the averaged Niño 4′ area with a PPE mean speed of −0.61 ± 0.07 m s−1 compared with an observed zonal ocean current speed of −0.35 ± 0.02 m s−1 (see Fig. 3a). The west Pacific also shows the most inter-ensemble variation, as shown by the ensemble standard deviation shading in Fig. 1b. Unlike the upwelling ocean current, mean Niño 4′ zonal ocean currents vary depending on perturbation type, with atmosphere perturbation ensemble members favouring stronger mean zonal advection (atmosphere PPE mean of −0.66 ± 0.05 m s−1 compared to the ocean PPE mean of −0.56 ± 0.05 m s−1; significantly different at the 99% level by t test). It is this bias and diversity that has the largest link to positive ENSO feedback strength.

Fig. 3

HadCM3 PPE (1986–2015) a zonal advective (ZA, y axis) feedback, b the surface zonal ocean current response to surface zonal wind stress anomalies, βu, c Thermocline feedback (TC), d the thermocline slope response to surface zonal wind stress anomalies, βh, e Ekman feedback (EK), and f the upwelling ocean current response to surface zonal wind stress anomalies, βw plotted against averaged Niño 4′ zonal surface ocean current (\(\bar{u}\)). PPE members given by dots colored by perturbation type with solid fit lines for 99% significant fit (by t test) and dashed fit lines for 95% significant fits also colored by perturbation type (black for full ensemble, red for atmosphere perturbations only, blue for ocean perturbations only). Reanalysis result shown by a cross. Correlations are printed on the figures with black text for the full thirty-three PPE members, red for the sixteen atmosphere perturbation members and blue for the sixteen ocean perturbation members

A positive correlation of 0.7 (Fig. 3a) exists between the zonal advective feedback and mean zonal surface ocean current (\(\bar{u}\)) averaged over the Niño 4′ area. This relationship suggests that the strong zonal ocean current bias in the PPE may be a cause of the weak zonal advective feedback. The relationship is mainly shown by the ocean perturbation PPE members, highlighting the importance of ocean model parameters to this feedback. Referring to the original feedback calculation (Eq. 6), the zonal advective feedback can be decomposed into three components: the response of equatorial Pacific surface zonal wind stress to east Pacific SSTAs (μa), the response of east Pacific zonal surface ocean current to equatorial Pacific wind stress anomalies (βu) and the mean zonal SST gradient \(\left( {\left\langle {\frac{{ - \partial \bar{T}}}{\partial x}} \right\rangle_{E} } \right)\). Examining these components shows that βu is the underlying cause of the weak total feedback: its PPE mean is 3.52 × 10^8 ± 0.67 × 10^8 m Pa−1 year−1 compared with 6.00 × 10^8 ± 0.5 × 10^8 m Pa−1 year−1 in the reanalysis. This is also the component most strongly related to the total feedback, with a correlation of 0.74, compared to 0.68 for the mean zonal SST gradient and −0.041 for the sensitivity of wind stress to SSTAs. βu has a correlation of 0.58 with zonal ocean surface currents (Fig. 3b), suggesting that stronger mean ocean currents respond less to surface wind anomalies, resulting in a weakened zonal advective feedback. Conversely, PPE members with weaker mean ocean currents show βu closer to the observed value. This leads to the conclusion that the strong mean zonal currents shown by the PPE may cause the weak zonal advective feedback via a reduced response of the ocean current to wind stress.
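The decomposition described above can be illustrated with a short sketch. It assumes that the two sensitivity coefficients are estimated as least-squares regression slopes and that the three components enter the feedback as a simple product, following the description in the text; Eq. 6 itself is not reproduced here, and all function names, variable names and input values are hypothetical.

```python
import numpy as np

def regression_slope(x, y):
    """Least-squares slope of y regressed on x (1-D anomaly series)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

def zonal_advective_feedback(ssta_east, taux_anom, u_anom_east, mean_neg_dTdx):
    """Return (mu_a, beta_u, ZA), assuming ZA = mu_a * beta_u * <-dT/dx>_E.

    ssta_east     : east Pacific SST anomalies
    taux_anom     : equatorial Pacific zonal wind stress anomalies
    u_anom_east   : east Pacific zonal surface current anomalies
    mean_neg_dTdx : mean zonal SST gradient term, <-dT/dx>_E
    """
    mu_a = regression_slope(ssta_east, taux_anom)      # wind stress response to SSTA
    beta_u = regression_slope(taux_anom, u_anom_east)  # current response to wind stress
    return mu_a, beta_u, mu_a * beta_u * mean_neg_dTdx

# Noise-free synthetic series in arbitrary units, built with known slopes.
ssta = np.linspace(-2.0, 2.0, 50)
taux = 0.01 * ssta
u_anom = 5.0 * taux

mu_a, beta_u, za = zonal_advective_feedback(ssta, taux, u_anom, mean_neg_dTdx=0.5)
print(mu_a, beta_u, za)
```

Writing the feedback as a product makes the attribution in the text explicit: for a fixed μa and SST gradient, a fractional reduction in βu produces the same fractional reduction in the total feedback.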

Similarly, the thermocline feedback can also be linked to the mean equatorial Pacific climate. Figure 3c shows the positive relationship between the thermocline feedback and mean Niño 4′ zonal surface ocean current. By decomposing the thermocline feedback (Eq. 8) into the sensitivity of the equatorial Pacific thermocline slope to surface wind stress anomalies (βh), the sensitivity of equatorial Pacific wind stress to east Pacific SSTAs (μa) and the mean upwelling current \(\left( {\left\langle {\frac{{\bar{w}}}{{H_{1} }}} \right\rangle_{E} } \right)\), the cause of the weak feedback can be found in the sensitivity of the thermocline slope to surface wind stress anomalies, βh, which is significantly weaker in the PPE than the reanalysis. A possible reason for this can be seen in the positive relationship shown between Niño 4′ zonal surface ocean current and βh (Fig. 3d), which shows the PPE members with stronger zonal ocean currents have a thermocline less sensitive to surface wind stress anomalies (correlation of 0.63).

Inter-ensemble variation in the remaining positive feedback, the Ekman feedback (Eq. 7), is similarly found to be linked to the zonal surface ocean current (correlation of 0.61, Fig. 3e). This is caused by the dominating component, the upwelling ocean current response to zonal wind stress anomalies (\(\beta_{w}\)). PPE members with weaker zonal ocean currents demonstrate a stronger sensitivity of east Pacific upwelling to surface wind stress anomalies, whereas stronger ocean currents coincide with upwelling currents that are less sensitive to surface wind stress anomalies (correlation of 0.73, Fig. 3f). Weaker correlations (−0.48) are also found between the Ekman feedback and mean Niño 3 thermocline depth via variations in the less dominant component, the mean vertical ocean temperature gradient. A correlation of −0.78 is found between this component and mean thermocline depth (not shown in figures).

The HadCM3 ensemble shows biases in the positive ENSO feedbacks, namely weak zonal advective and thermocline feedbacks. Results suggest that the strong mean zonal ocean current speeds in the PPE, particularly in ensemble members featuring perturbations to the atmosphere component, suppress ocean sensitivity to surface wind anomalies resulting in a weak thermocline slope response and a weak surface ocean current response and subsequently reduced zonal advective and thermocline feedbacks.

4.2 Thermodynamic damping biases in relation to mean climate

Inter-ensemble variations of thermodynamic damping are linked to precipitation in the west Pacific (Fig. 4a), the region which demonstrates the most inter-ensemble variation (Fig. 1f), such that wetter conditions encourage a larger thermodynamic damping response to positive SSTAs (a greater heat flux away from the surface per unit SST). The HadCM3 PPE typically has less precipitation than the reanalysis in Niño 4 (though the difference is only significant for eleven PPE members; ten of which have atmosphere perturbations), which may be linked to the weak bias in thermodynamic damping as suggested by the correlation of 0.72 between the two. By decomposing thermodynamic damping into individual heat flux damping rates (Eq. 5), it is found that the bias is largely caused by weak shortwave (−αSW) and latent heat flux (−αLH) damping terms, the dominant components of thermodynamic damping.

Fig. 4

The HadCM3 PPE (1986–2015) strength of a thermodynamic damping (α), b shortwave damping (αSW) plotted against mean Niño 4 precipitation (\(\bar{P}\), x axis), c latent heat flux damping (αLH), and d the strength of shortwave damping (αSW) plotted against mean Niño 3 SST (\(\bar{T}\)). PPE members given by dots colored by perturbation type with solid fit lines for 99% significant fit (by t test) and dashed fit lines for 95% significant fits also colored by perturbation type (black for full ensemble, red for atmosphere perturbations only, blue for ocean perturbations only). Reanalysis result shown by a cross. Correlations are printed on the figures with black text for the full thirty-three PPE members, red for the sixteen atmosphere perturbation members and blue for the sixteen ocean perturbation members

Shortwave damping represents the damping due to decreased incoming shortwave radiation in response to warming SSTs during El Niño. This feedback has two regimes in the Tropical Pacific depending on the large-scale circulation. In regions of subsidence, a warm SST anomaly acts to reduce static stability, which breaks up marine stratiform clouds in the area, leading to an increase of solar radiation reaching the surface. This is a positive feedback that typically occurs at higher latitudes (Klein and Hartmann 1993). However, in warmer tropical areas of ascent, such as over the warm pool in the west equatorial Pacific, a positive SST anomaly increases convective cloud cover causing less shortwave radiation to reach the surface and creating a damping feedback (Ramanathan and Collins 1991).

The shortwave radiation damping in the ensemble, which shows the largest bias, is found to be linked to mean SST and precipitation (Fig. 4c, d). There is a positive correlation (0.64) between αSW and the mean SST (Fig. 4d) and a weaker positive correlation (0.45) between αSW and mean Niño 4 precipitation. So, warmer mean temperatures and more west Pacific precipitation are linked to a stronger east Pacific shortwave radiation response to positive SST anomalies. It is possible this is the result of a convection bias where cooler mean temperatures and less precipitation are indicative of convection occurring less often in response to El Niño SSTAs, with less cloud formation in the east Pacific during El Niños. This results in smaller negative shortwave radiation anomalies and a slight positive shortwave damping regime in the far-east equatorial Pacific (which is not present in the observations). This causes the area-averaged east Pacific shortwave damping to be weakly negative compared with the reanalysis, as shown in Fig. 4c by the weak shortwave damping strength (αSW). The relationship with mean SST is particularly strong for atmosphere perturbation PPE members (correlation of 0.69), suggesting that atmosphere parameters have the most impact on inter-ensemble variation of shortwave damping.

Latent heat flux damping represents the cooling due to increased evaporation during warmer El Niño temperatures and is reliant on the specific humidity difference near the sea surface and on wind speed anomalies. Like shortwave damping, latent heat flux damping is found to have a strong relationship with precipitation in the west equatorial Pacific. PPE members that show wetter conditions exhibit latent heat flux damping closer to observed (Fig. 4b). Again, this could signify a link to the representation of convection in the PPE. Similar to shortwave damping, the relationship between latent heat flux damping and precipitation is particularly strong for atmosphere PPE members (correlation of 0.89).

The near surface humidity difference response to SSTAs in the east equatorial Pacific is suggested to be the dominating contribution to latent heat flux anomalies in the east Pacific (Zhang and McPhaden 1995; Lloyd et al. 2010; Lin 2007). However, in this PPE the near surface humidity response to SSTAs shows no consistent biases in comparison to the observations (not shown), suggesting that a wind speed bias in the PPE, linked to precipitation biases, is a more likely cause of the weak latent heat flux damping.

Links with Niño 4 mean surface wind stress are somewhat weaker than those shown with precipitation, with a correlation of 0.47 (95% significance) with latent heat flux damping. However, the Bjerknes feedback (μa, wind stress anomalies regressed against Niño 3 SSTA) is also examined in relation to the latent heat flux feedback in Fig. 5. The Bjerknes feedback over the tropical Pacific for the reanalysis (Fig. 5a) shows a positive feedback concentrated mainly in the Niño 4 region, demonstrating decreasing winds there as the Walker circulation slows during El Niño in response to positive Niño 3 SSTAs. The HadCM3 PPE mean (Fig. 5b, c) shows a south-eastward extension of the area of the Bjerknes feedback in comparison to the reanalysis, as shown by the difference between the PPE mean and the reanalysis in Fig. 5c. This bias shows a high level of ensemble agreement; stippling marks where more than 70% of PPE members are within the PPE mean ± 1 S.D. Figure 5d correlates the Bjerknes feedback calculated at each gridpoint against the area-averaged east Pacific latent heat flux damping and finds that the spatial biases in μa have an impact on the strength of the east Pacific latent heat flux damping. PPE members with a weaker central Pacific Bjerknes feedback (and a stronger positive east and west Pacific feedback) tend to have a weak east Pacific latent heat flux feedback, indicated by a negative correlation in the central equatorial Pacific and positive correlations further to the east and west. Therefore, winds that weaken too much in the east Pacific during El Niño (demonstrated by a strong positive feedback), as they do in this PPE, reduce the evaporation in the east Pacific, causing a weak latent heat flux anomaly and hence weak latent heat flux damping.

Fig. 5

a Zonal surface wind stress anomalies regressed against East Pacific SSTAs (μ) for ERA40/SODA 2.0.2, b the HadCM3 ensemble mean (1986–2015), c the difference between the reanalysis and the PPE mean (b − a) and d the inter-ensemble correlation of area-averaged east Pacific latent heat flux against μ calculated at each gridpoint. Stippling shows where the linear fit is at least 99% significant (a), where over 70% of ensemble members are within the ensemble mean ± 1 S.D. (b, c) and where correlation is at least 95% significant (d). Niño 3 (east) and Niño 4 (west) areas are outlined with a solid black line. The Niño 3.4 region is shown by a dashed line

The largest feedback bias and source of BJ index uncertainty in the HadCM3 PPE is thermodynamic damping (Fig. 2b). By examining inter-ensemble relationships it appears that the representation of precipitation is a driver of thermodynamic variability, suggesting links to atmospheric convection. Both evaporation and radiative feedbacks are weak in the PPE and both can be linked to mean precipitation strength and spatial pattern that may impact cloud cover in response to SSTAs and wind speed responses which, in turn, affect cooling via reduced radiation and evaporation during El Niño.

5 Projected ENSO response

Projected sea surface temperatures in the Niño 3 region for the HadCM3 PPE are shown in Fig. 6a (time series are found using overlapped 30-year average segments, with start years staggered every 10 years). Mean sea surface temperatures for the equatorial Pacific steadily increase through the twentieth century and then accelerate around the year 2000 in this A1B scenario. There is little variation in the response between ensemble members (despite the different climate sensitivities of the atmosphere-perturbation experiments; see Collins et al. 2011, Fig. 5) and the temperature increase tends to be slightly larger in the far east Pacific than in the west (Table 1).

Fig. 6

HadCM3 PPE from 1896 to 2095 for a Niño 3 area-averaged SST, b ENSO amplitude (Niño 3.4 area-averaged S.D. of SSTAs) and c Bjerknes stability index. Single ensemble members are plotted by thin lines, the ensemble mean is plotted by a thick line colored depending on ensemble member perturbation type. Reanalysis values ± 95% linear fit confidence intervals (for BJ index calculations) and moving block bootstrap error (for ENSO amplitude and mean SST) are shown by the black horizontal dashed lines and grey shading. PPE member/mean lines are dashed if a change from 1986–2016 to 2065–2095 is no larger than natural variability (defined as the S.D. of 1896–1985). Time series are found using 30-year average segments, with start years staggered every 10 years. The result is plotted at the midpoint of each time segment (e.g. 1910, 1920…)

Time series of ENSO amplitude (standard deviation of detrended Niño 3.4 SSTAs in each overlapping 30-year segment, Fig. 6b) show that ENSO amplitude ensemble means for both atmosphere and ocean PPE members steadily increase through the historical period and into the twenty-first century (see also Table 1) suggesting stronger El Niño or La Niña events under climate change. This increase in amplitude is shown to be significant (larger than natural variability, defined as the S.D. of the 1895–1985 time period) for twenty-five PPE members, so is a fairly consistent result across the ensemble. The spread seen in Fig. 6b is largely due to the different baseline ENSO amplitudes.
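The segment-based amplitude calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `segment_amplitudes` and its arguments are hypothetical, detrending is taken to be a linear fit within each segment, and the input series is synthetic rather than model output.

```python
import numpy as np

def segment_amplitudes(time_years, ssta, first_year, last_year, length=30, stride=10):
    """Standard deviation of linearly detrended SSTAs in overlapping segments.

    Segments are `length` years long with start years staggered every
    `stride` years; each result is assigned to the segment midpoint."""
    time_years = np.asarray(time_years, dtype=float)
    ssta = np.asarray(ssta, dtype=float)
    midpoints, amplitudes = [], []
    for start in range(first_year, last_year - length + 2, stride):
        in_segment = (time_years >= start) & (time_years < start + length)
        t = time_years[in_segment]
        x = ssta[in_segment]
        t_centred = t - t.mean()  # centring keeps the linear fit well conditioned
        trend = np.polyval(np.polyfit(t_centred, x, 1), t_centred)
        midpoints.append(start + length / 2)
        amplitudes.append(np.std(x - trend))
    return np.array(midpoints), np.array(amplitudes)

# Synthetic monthly series for 1896-2095: a slow warming trend plus an
# oscillation whose amplitude grows with time (illustrative only).
time = 1896 + np.arange(200 * 12) / 12.0
growth = 1.0 + 0.002 * (time - 1896)
series = 0.01 * (time - 1896) + growth * np.sin(2 * np.pi * time / 4.0)

mid, amp = segment_amplitudes(time, series, 1896, 2095)
print(len(mid), amp[0] < amp[-1])
```

Detrending within each segment removes the contribution of the mean warming to the standard deviation, so that the resulting time series reflects changes in variability rather than changes in the mean.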

Despite this increase in ENSO amplitude, the BJ index ensemble means show a slight decrease (significant for the atmosphere ensemble mean, Fig. 6c), implying a more stable ENSO. The reasons for this response can be found by examining the individual ENSO feedbacks. The positive ENSO feedbacks for the PPE show some steady, though comparatively small, changes in strength over time (Fig. 7a–c; Table 1). Inter-ensemble correlations show that positive feedbacks tend to be positively related to ENSO amplitude; however, the projected response of the dominant thermocline feedback shows a slight weakening (Fig. 7a). A weakening response of surface zonal wind stress to SSTAs (μa) in the east Pacific contributes to the weakened thermocline feedback, with eighteen of thirty-three PPE members showing significant negative trends over time (Fig. 8c). However, the other dominant component, the thermocline slope response to surface zonal wind stress (βh, Fig. 8a), demonstrates significant increases in strength for the majority of atmosphere PPE members (greater than natural variability for fourteen of sixteen atmosphere PPE members), reducing the magnitude of the thermocline feedback response for these ensemble members. Those ensemble members which do not show a significant response over time in βh are dominated by the decreasing wind stress sensitivity, μa, resulting in a reduced thermocline feedback. This is in contrast to studies by Kim et al. (2014) and Borlace and Cai (2013), who find a thermocline feedback response dominated by βh.

Fig. 7

HadCM3 PPE from 1896 to 2095 for a thermocline feedback, b zonal advective feedback, c Ekman feedback, d thermodynamic damping and e shortwave damping. Single ensemble members are plotted by thin lines, the ensemble mean is plotted by a thick line colored depending on ensemble member perturbation type. Reanalysis values ± 95% linear fit confidence intervals are shown by the black horizontal dashed lines and grey shading. PPE member/mean lines are dashed if a change from 1986–2016 to 2065–2095 is no larger than natural variability (defined as the S.D. of 1896–1985). Time series are found using 30-year average segments, with start years staggered every 10 years. The result is plotted at the midpoint of each time segment (e.g. 1910, 1920…)

Fig. 8

HadCM3 PPE from 1896 to 2095 a thermocline slope response to zonal surface wind stress anomalies (βh), b zonal surface ocean current response to wind stress anomalies (βu), c zonal surface wind stress response to SST anomalies (μa), and d upwelling ocean current response to wind stress anomalies (βw). Single ensemble members are plotted by thin lines, the ensemble means are plotted by a thick line colored depending on ensemble member perturbation type. Reanalysis values ± 95% linear fit confidence intervals are shown by the black horizontal dashed lines and grey shading. PPE member/mean lines are dashed if a change from 1986–2016 to 2065–2095 is no larger than natural variability (defined as the S.D. of 1896–1985). Time series are found using 30-year average segments, with start years staggered every 10 years. The result is plotted at the midpoint of each time segment (e.g. 1910, 1920…)

Conversely, the zonal advective feedback shows an increase under climate change (Fig. 7b; Table 1). This is a result of the increasing response of surface zonal ocean currents to surface wind stress anomalies (βu, Fig. 8b), which shows significant positive trends for thirty-one PPE members, with PPE means increasing from 3.24 × 10^8 to 4.68 × 10^8 m Pa−1 year−1 (for atmosphere PPE members) and from 3.81 × 10^8 to 5.03 × 10^8 m Pa−1 year−1 (for ocean PPE members).

The final positive feedback, the Ekman feedback, also shows increases over time (Fig. 7c, Table 1). This response is largely a result of an increasing response of upwelling ocean currents to surface wind stress over time (Fig. 8d), with ensemble mean increases of 1.16 × 10^3 and 0.29 × 10^3 m Pa−1 year−1 for atmosphere and ocean PPE members respectively. Because of these positive feedback responses, feedbacks relating to ocean current anomalies (zonal advective and Ekman) make a larger contribution to positive ENSO feedbacks, balancing out the weakened thermocline influence.

The responses of μa, βh and βw are linked to the changing background climate, as they are on an inter-ensemble basis. Here it is found that μa, βh and βw show relationships with the change in mean Niño 4 zonal surface wind stress (correlations of −0.45, 0.43 and 0.64 respectively). This suggests that weakening zonal winds are linked to a weakening equatorial Pacific wind stress feedback and increased thermocline and upwelling sensitivity to surface wind stress. The decreased wind stress feedback can be explained by the shifts in atmospheric circulation in response to a warmer climate. Initially the wind stress feedback spans the basin as the ascending branch of the Walker circulation is located far west. As temperatures warm, ascent shifts eastward, so that the wind stress response to SST anomalies is located closer to the central Pacific and spans less of the equatorial Pacific, resulting in a basin-wide decrease in wind stress sensitivity. The response of this feedback is found to vary in MMEs (Kim and Jin 2010a). Philip and van Oldenborgh (2006) suggest that while warmer mean SST can increase wind sensitivity to SSTAs, the more stable atmosphere can counteract this, which may explain the variation in its projected response.

Similarly, responses of βu and βh are mixed in MMEs (Kim and Jin 2010a), though Kim et al. (2014) find consensus on an increase then decrease of βh in warming conditions in a subset of models. PPE members showing significant increases in βh here are those which have the strongest mean surface wind stress and zonal ocean currents in the present day time period (those which have atmosphere perturbations). Weakened mean westward wind stress causes a flatter mean thermocline as it deepens in the east and shoals in the west. Both of these changes have been found to cause an increased sensitivity of the thermocline slope to wind stress anomalies (Kim et al. 2014; Philip and van Oldenborgh 2006). Kim et al. (2014) also find that projected βh response is closely tied to changes in mean thermocline slope which in turn is linked to changes in zonal winds due to SST warming gradients. The diversity of the responses shown by different ensemble members here suggests that the initial mean state of the climate has an impact on the projected thermocline feedback in a warming climate.

While the response of the zonal advective feedback can be attributed to an increasing sensitivity of ocean currents to surface wind stress, the cause of this response is unknown, as no significant relationships between the βu response and changing mean climate were found. Kim et al. (2013) suggest that the strength of βu can be linked to mean zonal surface winds and upper ocean temperature in the equatorial Pacific in coupled models and suggest upper ocean stratification may have an impact on zonal ocean current sensitivity to wind stress. A significant relationship is shown between the sensitivity of ocean currents to wind stress and mean zonal ocean current on an inter-ensemble basis in this study (Fig. 3b). It is possible that the slowing of the Walker circulation and the accompanying eastward shift of ENSO-related variability cause a larger response of the weakened east Pacific surface ocean current to wind stress anomalies, resulting in a more effective zonal advective feedback.

The largest response of the BJ index components is the response of thermodynamic damping (Fig. 7d; Table 1). This increases in strength for all HadCM3 PPE members (significant for thirty PPE members), leading to an increase in ENSO stabilisation. This response is consistent with other studies which find increased damping in warmer conditions (Philip and van Oldenborgh 2006; Kim and Jin 2010a; Kim et al. 2014). Of the two dominant components of the thermodynamic damping, this response is found to be mainly caused by an increase in the strength of shortwave damping, with significant increases for twenty-nine PPE members (Fig. 7e). As shown in Fig. 4c, this term is sensitive to the mean Niño 4 precipitation, suggesting a link to convection and the hydrological cycle. The same is also seen under climate change (Fig. 9). The climate change responses of shortwave damping and equatorial Pacific precipitation are measured by taking the difference between 1986–2015 and 2065–2095 values. Changes in mean Niño 4 precipitation are correlated with changes in shortwave damping strength, with a correlation of 0.53 that is significant at the 99% level. This suggests that ensemble members that show a large Niño 4 precipitation response show a greater increase in the strength of shortwave damping. Watanabe et al. (2012) also note the relationship of ENSO characteristics, namely ENSO amplitude, with mean precipitation in a number of PPEs and suggest this is caused by the amplification of coupled feedbacks in response to wetter conditions; however, the relationship does not hold for MMEs. The correlations presented here are suggestive of a relationship between mean precipitation and ENSO amplitude that is mediated through shortwave damping of SST anomalies. However, it is uncertain if this relationship can be extended to explain differences in multi-model ensemble ENSO projections.
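The significance of an inter-ensemble correlation such as the 0.53 quoted above can be checked with the standard t statistic for a Pearson correlation, t = r √(n − 2) / √(1 − r²). The sketch below assumes that all thirty-three PPE members are treated as independent samples, which is an assumption made here rather than a statement of the method used; the function name is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# r = 0.53 and n = 33 are the correlation and ensemble size quoted in the text.
t = correlation_t_statistic(0.53, 33)
print(round(t, 2))  # 3.48
```

With 31 degrees of freedom the two-sided 99% critical value of Student's t is approximately 2.74, so under this assumption a correlation of 0.53 across thirty-three members exceeds the 99% threshold, consistent with the significance level stated above.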

Fig. 9

The HadCM3 PPE projected response (2065–2095 minus 1986–2015) of the strength of shortwave damping (ΔαSW, y axis) against the projected response of mean Niño 4 precipitation (ΔP, x axis). Each PPE member is given by a dot coloured by its perturbation type, red for atmospheric perturbation, blue for ocean perturbation and black for standard parameters. Correlations are printed on the figure with black text for the full thirty-three PPE members, red for the sixteen atmosphere perturbation members and blue for the sixteen ocean perturbation members. Linear fits which are 99% significant (by a student’s t test) are shown by solid lines. 95% significance shown by dashed lines

6 ENSO amplitude in relation to atmospheric noise

The discrepancy between the BJ index and ENSO amplitude, both on an inter-ensemble basis and in the projected response, raises the question of the impact of atmospheric noise on ENSO amplitude, which is not accounted for in the BJ index. The atmospheric noise is defined as the residual of zonal surface wind stress as a function of SST along the equator (Philip and van Oldenborgh 2009, 2010; Philip et al. 2010). Here we use the standard deviation of the atmospheric noise averaged over the central Pacific as a measure of this stochastic forcing (see Methods). There is a strong positive relationship between ENSO amplitude and atmospheric noise strength, both in inter-ensemble variations, where a correlation of 0.82 is found, and in the projected responses over time, with a correlation of 0.79 (Fig. 10). Atmosphere perturbation PPE members show a larger range in the magnitude of atmospheric noise (S.D. of 7.9 × 10^−4 Pa compared to 5.2 × 10^−4 Pa for ocean perturbations), in agreement with Philip et al. (2010), and show a larger correlation with ENSO amplitude (correlation of 0.85, Fig. 10a).
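The noise definition above can be illustrated with a minimal sketch in which the wind stress anomaly is regressed on a single SST anomaly series and the standard deviation of the residual is taken as the noise strength. The single-predictor linear fit is a simplifying assumption made here; the statistical atmosphere model in the cited studies, and the procedure described in Methods, may differ. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def atmospheric_noise_std(ssta, taux_anom):
    """Return (regression slope, S.D. of the residual) of wind stress on SSTA.

    The residual is the part of the zonal wind stress anomaly that is not
    linearly explained by the SST anomaly."""
    ssta = np.asarray(ssta, dtype=float)
    taux_anom = np.asarray(taux_anom, dtype=float)
    slope, intercept = np.polyfit(ssta, taux_anom, 1)
    residual = taux_anom - (slope * ssta + intercept)
    return slope, residual.std(ddof=1)

# Synthetic example: an SST-driven wind stress response plus random noise
# (values in Pa, chosen for illustration only).
rng = np.random.default_rng(1)
ssta = rng.normal(0.0, 1.0, size=360)
taux = 0.01 * ssta + rng.normal(0.0, 0.005, size=360)

slope, noise_std = atmospheric_noise_std(ssta, taux)
print(f"slope = {slope:.4f} Pa per degree C, noise S.D. = {noise_std:.4f} Pa")
```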

Fig. 10

ENSO amplitude against Niño 3.4 atmospheric noise (ε) for a the time period 1986–2015 and b the difference between 2065–2095 and 1986–2015. Each PPE member is given by a dot coloured by its perturbation type, red for atmospheric perturbation, blue for ocean perturbation and black for standard parameters. Correlations are printed on the figure with black text for the full thirty-three PPE members, red for the sixteen atmosphere perturbation members and blue for the sixteen ocean perturbation members. Linear fits which are 99% significant (by a student’s t test) are shown by solid lines. 95% significance shown by dashed lines

In the projected response, PPE members showing a larger increase in atmospheric noise, typically ocean perturbation PPE members, coincide with those showing a larger increase in ENSO amplitude (Fig. 10b). Note that the ensemble mean projected difference in atmospheric noise amplitude is not significantly different from zero for the atmosphere PPE members (0.08 ± 0.67 × 10⁻³ Pa). The strong relationships found here suggest that atmospheric noise plays an important role in ENSO amplitude variation in the PPE, although it is unclear from this whether atmospheric noise drives the variation in ENSO amplitude or whether there is simply more atmospheric wind variability in models with higher-amplitude ENSO. Nevertheless, this relationship may explain some of the discrepancy between the BJ index and ENSO amplitude.
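The significance convention used in the figure captions (solid lines at 99%, dashed lines at 95%) can be illustrated with the standard t statistic for a correlation coefficient. The sketch below assumes a two-sided test and independent ensemble members, neither of which is stated explicitly in the text; the function names are hypothetical and the critical values are standard table values for the two sample sizes used here.

```python
import numpy as np

# Two-sided critical values of Student's t from standard tables,
# keyed by degrees of freedom (n - 2), then by confidence level.
T_CRIT = {
    31: {0.95: 2.040, 0.99: 2.744},  # 33 PPE members
    14: {0.95: 2.145, 0.99: 2.977},  # 16 PPE members
}

def pearson_r(x, y):
    """Pearson correlation between two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])

def t_statistic(r, n):
    """t statistic for a correlation r estimated from n samples."""
    return r * np.sqrt(n - 2) / np.sqrt(1.0 - r ** 2)

def fit_line_style(r, n):
    """Return 'solid' (99%), 'dashed' (95%) or None (not significant)."""
    t = abs(t_statistic(r, n))
    crit = T_CRIT[n - 2]
    if t >= crit[0.99]:
        return "solid"
    if t >= crit[0.95]:
        return "dashed"
    return None

# Correlations quoted in the text for Fig. 10a.
style_all = fit_line_style(0.82, 33)         # full ensemble
style_atmosphere = fit_line_style(0.85, 16)  # atmosphere perturbations
```

Under these assumptions, a correlation of 0.82 across 33 members gives a t statistic of roughly 8, well above the 99% threshold.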

7 Summary and discussion

We use a HadCM3 perturbed physics ensemble (PPE) to assess ENSO feedbacks, as calculated by the BJ Index, in a historical and a future climate scenario over 1895–2095, and relate biases and responses of feedbacks to the model mean state. The use of the PPE, which features the same model framework throughout (as opposed to a multi-model ensemble), allows for a simpler comparison of the model climate and ENSO feedbacks to establish links between them.

Dominant components of the BJ index, namely thermodynamic damping, zonal advective and thermocline feedbacks, are weak. In comparison to previous studies, the biases in the positive feedbacks in the PPE are slightly smaller than those found in MMEs. A possible reason for this is the use of flux adjustment in the ensemble, which minimises equatorial Pacific SST biases when compared with those shown in the CGCMs used in MMEs. This is supported by a significant difference between the thermodynamic damping in the standard parameter member of this PPE and that in a non-flux-adjusted version of HadCM3. Improvements in SSTA strength and spatial distribution undoubtedly have an impact on the ocean response during El Niño event development. Despite this improvement in mean SST, model biases still persist, resulting in the feedback biases and diversity found here.

In the case of the negative thermodynamic damping, feedback biases in the ensemble are caused by weak shortwave and latent heat flux damping. Damping tends to be weaker for atmosphere perturbation PPE members and also shows larger variation when atmosphere parameters are perturbed, particularly for shortwave damping. For the positive zonal advective and thermocline feedbacks, the main causes of ensemble bias are weak responses of ocean currents and the thermocline slope to wind stress anomalies, in keeping with previous studies (Kim et al. 2013). Positive feedbacks also tend to be weaker for atmosphere perturbations. This is caused by weaker mean state contributions to the feedbacks (reduced zonal temperature gradient and reduced east Pacific upwelling) and reduced ocean sensitivity. Many of these feedback biases can be related to biases in the mean state of the equatorial Pacific, which then affect variability. Most importantly, the variation and bias of the mean zonal ocean current in this ensemble are linked to all three positive feedbacks, an addition to the links between mean climate and ENSO feedbacks suggested by Kim et al. (2013). PPE members with strong mean zonal advection have an ocean that is less sensitive to El Niño induced wind stress anomalies than in observations, resulting in reduced zonal advective and thermocline feedbacks. Weak shortwave damping and latent heat flux damping, also found by Lloyd et al. (2010) to be the biggest cause of thermodynamic damping biases in AMIP models, are found to be linked to reduced precipitation. This may signify convection biases that cause less cloud cover to form during El Niño (reducing shortwave damping) and that may alter the wind speed response (reducing latent heat flux damping).

Understanding the impact of mean state biases on dominant feedbacks also helps in understanding the response of the ensemble in the climate change scenario. The HadCM3 ensemble projects a decrease in the BJ index in response to global warming, in contrast to the increasing ENSO amplitude. ENSO-related variability increases in the central Pacific, and heat flux anomalies in particular become considerably stronger. The decrease in the BJ index results primarily from a large increase in the strength of thermodynamic damping, which almost doubles in strength for most ensemble members, and a reduction in the dominant positive feedback, the thermocline feedback. The increased damping is attributed to an increase in shortwave damping in response to warmer SSTs in the central Pacific. Relationships with precipitation (e.g. more central Pacific precipitation coincides with an increase in shortwave damping strength) are found here, which again may suggest a link to the response of tropical Pacific convection; Chadwick et al. (2013, 2014) suggest that changes in precipitation spatial pattern are dominated by shifting convergence zones. The previously dominant thermocline feedback shows a decrease in strength, and the zonal advective and Ekman feedbacks show a slight increase in strength. This results in the positive feedbacks being almost equal in strength by the end of the analysed emissions scenario (2066–2095). The response of the thermocline feedback is in contrast with other studies, which find that ENSO amplitude variation is controlled by the thermocline feedback (Kim et al. 2014; Borlace and Cai 2013), which in turn is governed by the thermocline response to equatorial zonal surface wind stress (βh).
For the thermocline feedback here, the initial mean state (as governed by perturbation type) has an impact on the climate change response: ensemble members featuring atmosphere perturbations (initially the ensemble members with the least ocean sensitivity) show significant increases in the sensitivity of the thermocline to surface wind stress anomalies under climate change, whereas ensemble members with ocean perturbations tend not to. This results in less weakening of the thermocline feedback for the atmosphere perturbation ensemble members, as the increase in thermocline response to zonal surface wind stress (βh) tends to balance out the decreasing wind stress response to SSTA (μa) for these ensemble members. This demonstrates the importance of the mean state of the climate to ENSO feedback projections.
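As a rough illustration of this compensation, suppose the thermocline feedback scales as the product of μa, βh and a remaining mean-state factor. The product form is an assumption based on the usual BJ index formulation rather than a result of this analysis, and the symbols TH and M are labels introduced only for this sketch. For small fractional changes:

```latex
% Assumption: TH = \mu_a \, \beta_h \, M, where M (a label introduced here)
% collects the remaining mean-state contribution to the thermocline feedback.
\frac{\Delta \mathrm{TH}}{\mathrm{TH}}
  \approx \frac{\Delta \mu_a}{\mu_a}
  + \frac{\Delta \beta_h}{\beta_h}
  + \frac{\Delta M}{M}
```

Under this assumption, a fractional increase in βh offsets a fractional decrease in μa of similar size, which is consistent with the smaller weakening of the thermocline feedback in the atmosphere perturbation members.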

In contrast to previous studies which find a positive relationship between ENSO amplitude and the BJ index (Kim and Jin 2010a, b; Borlace and Cai 2013), a key finding here is that the BJ index and ENSO amplitude are only weakly or even negatively correlated, both across ensemble members under present-day conditions and in the climate change response. [Note that for CMIP5 models this relationship also breaks down as outlier models weaken the relationship between BJ index and ENSO amplitude (Kim et al. 2013)]. There are a number of reasons why this may be the case.

The first of these relates to the assumption that the amplitudes of the components of the BJ index are directly comparable and can therefore be added together to provide a measure of ENSO stability. The terms of the BJ index are approximations of the true feedbacks and rely on a number of assumptions of linearity. If these approximations are not directly comparable in magnitude, as assumed, then the sum of these feedbacks, the BJ index, cannot be reliably used as an approximation of ENSO amplitude, as feedbacks may dominate responses when they should not, or may be masked by other feedbacks. This may be the case here, where the negative thermodynamic damping dominates over the positive feedbacks, which alone approximate inter-ensemble variations of ENSO amplitude relatively well (correlation of 0.63 between the sum of the three positive feedbacks and ENSO amplitude).

Secondly, the derivation of the BJ Index relies heavily on assumptions of linearity and neglects processes that lie outside the framework from which the BJ index is derived. For example, we find here significant relationships between ENSO amplitude and atmospheric noise, which are not accounted for in the BJ index calculation and should be considered in future work. It is known that a number of these linearity assumptions do not hold fully, such as the linearity of the wind stress response (Kang and Kug 2002; Philip and van Oldenborgh 2009; Choi et al. 2013) or the approximation of thermodynamic damping (Lloyd et al. 2010, 2012; Bellenger et al. 2013), which may cause inaccuracies in the BJ index (Graham et al. 2014). It is also important to note that non-linearities are most likely to occur, and the assumptions of linearity to break down, at the height of an El Niño event, when SSTAs peak. The strength of this peak is essentially what is measured by ENSO amplitude, whereas the BJ index aims to quantify the growth rate of the El Niño event in the lead-up to this peak, when the linearity assumptions are more reasonable. The two measures may therefore not be as strongly linked as suggested in previous studies.

Bearing these caveats in mind and considering the results found here, the use of the BJ index as a measure of ENSO amplitude, either on an inter-ensemble basis or as a tool for assessing projected ENSO response, remains questionable. Despite this, the BJ index still allows for assessment of the linear dynamics of ENSO and proves useful when attempting to relate feedback bias and response to the background climate. Future work should focus on improving understanding of the feedbacks that the BJ index attempts to approximate, and on ways in which these issues (e.g. non-linearities) can be accounted for in ENSO analysis, for example the use of the mixed layer heat budget by Graham et al. (2014).

Results here suggest that even when SST biases are improved (here by the use of flux adjustment) and analysis is confined to a single model framework (by use of a PPE), common CGCM biases (e.g. strong equatorial Pacific zonal ocean currents) have an impact on ENSO feedbacks. These cause persistent feedback biases, such as weak thermocline and zonal advective feedbacks and thermodynamic damping, and are a cause of uncertainty in projections. The representations of mean zonal advection and precipitation are found to be particularly important. Links between heat flux feedbacks and precipitation are suggestive of links to the hydrological cycle; however, it remains to be established whether this behaviour persists in multi-model ensembles and what the underlying causes of this relationship are.