Skill of the Saudi-KAU CGCM in Forecasting ENSO and its Comparison with NMME and C3S Models

This paper assesses the skill of the Saudi-King Abdulaziz University coupled ocean–atmosphere Global Climate Model, namely Saudi-KAU CGCM, in forecasting the El Niño-Southern Oscillation (ENSO)-related sea surface temperature. The model performance is evaluated based on a reforecast of 38 years from 1982 to 2019, with 20 ensemble members of 12-month integrations. The analysis is executed on ensemble mean data separately for boreal winter (December to February: DJF), spring (March to May: MAM), summer (June to August: JJA), and autumn (September to November: SON) seasons. It is found that the Saudi-KAU model mimics the observed climatological pattern and variability of the SST in the tropical Pacific region. A cold bias of about 0.5–1.0 °C is noted in the ENSO region during all seasons at 1-month lead times. A statistically significant positive correlation coefficient is observed for the predicted SST anomalies in the tropical Pacific Ocean that lasts out to 6 months. Across varying times of the year and lead times, the model shows higher skill for autumn and winter target seasons than for spring or summer ones. The skill of the Saudi-KAU model in predicting Niño 3.4 index is comparable to that of state-of-the-art models available in the Copernicus Climate Change Service (C3S) and North American Multi-Model Ensemble (NMME) projects. The ENSO skill demonstrated in this study is potentially useful for regional climate services providing early warning for precipitation and temperature variations on sub-seasonal to seasonal time scales.


Introduction
On seasonal time scales, the El Niño-Southern Oscillation (ENSO) is a major source of predictive skill for atmospheric circulation anomalies (McPhaden et al. 2006;Timmermann et al. 2018). Therefore, assessing and improving ENSO prediction is one of the most important challenges in the modeling and seasonal forecasting community (e.g., Barnston et al. 2012). There are various dynamical prediction systems (e.g., ECMWF; NCEP; Meteo-France and others participating in NMME and C3S multi-model projects) available in the community that provide seasonal forecasts on a regular basis . Most of the models perform reasonably well for ENSO prediction with some limitations (Risbey et al. 2021;Tippett et al. 2012). Since March 2017, Saudi-KAU CGCM is a part of the ENSO plume published monthly at the International Research Institute for Climate 1 3 Published in partnership with CECCR at King Abdulaziz University and Society (IRI). 1 Therefore, it is important to analyze and document the ENSO prediction skill of this model.
ENSO is a naturally occurring atmosphere/ocean coupled phenomenon that originates in the tropical Pacific region. The warm (El Niño) and cold (La Niña) phases of ENSO are associated with the above and below normal SST anomalies in the central and eastern equatorial Pacific regions (Yang and DelSole, 2012;Yeh et al. 2018). The seesaw of warm and cold phases of ENSO has a large impact on the temperature and precipitation of different regions around the globe (e.g., Mason and Goddard, 2001;Diaz et al. 2001;Ehsan et al. 2013Ehsan et al. , 2020aEhsan et al. , 2020aAbid et al. 2016;Almazroui et al. 2019;Atif et al. 2019;Ehsan 2020) and modulates the occurrence and intensity of heat waves, floods, droughts, severe storms and tropical cyclones (Camargo and Sobel, 2005;Curtis et al. 2007;Camargo et al. 2007;Allen et al. 2015;Sobel et al. 2016). Since ENSO is an important predictor for climate extremes and for the regional and global climate, it is important that models predict the ENSO reasonably well in advance.
There is a long history in the literature of assessing various aspects of the skill of ENSO forecasts. Recently, Tippett et al. (2019) investigated probabilistic forecasts of ENSO phases and strength from the North American Multi-Model Ensemble (NMME), complementary to the examination by Barnston et al. (2019) of the skill of deterministic ENSO forecasts from the NMME. Similarly, Trenary et al. (2018) explored the ENSO forecast skill and the optimal lagged ensemble size to reduce the forecast error of the Climate Forecast System version 2 (CFSv2). Furthermore, Jin et al. (2008) analyzed the ENSO prediction skill by using the datasets of "APCC-CliPAS" and "DEMETER" projects. Other studies highlight particular aspects of ENSO prediction skill. For example, the evaluation of ENSO predictability up to 2-4 years has been discussed in Chen et al. (2004) and Luo et al. (2008). The initialization of ENSO prediction prior to March/April in any year leads to a sharp decrease in skill, which is referred to as a "spring predictability barrier" (Webster and Yang, 1992;Zhu et al. 2015). This spring predictability barrier typically restricts the seasonal prediction based on the coupled atmosphere-ocean model up to 1 year lead time (Graham et al. 2011). Larson and Kirtman (2017) investigated the dynamical processes that are responsible for the growth of certain errors and the sources of error which limits the ENSO predictability.
The Saudi-KAU Atmospheric Global Climate Model (AGCM) forced with observed SSTs has been used in many studies to understand the potential predictability due to ENSO and the influence of ENSO on regional climate (e.g., Ehsan et al. 2017a, b;Abid et al. 2018;Rahman et al. 2018;Kamil et al. 2019;Almazroui et al. 2019). However, to date, there has been no study available that analyses the skill of the Saudi-KAU initialized coupled model (CGCM) in forecasting ENSO. The article is organized as follows. Section 2 introduces the Saudi-KAU model, prediction, and observational datasets and methods. Observed and simulated mean and variability analysis of SST, and skill assessment are presented in Sect. 3. A summary and conclusions are given in Sect. 4.

Description of the Saudi-KAU Coupled GCM
The prediction data used in this study comes from the Saudi-KAU CGCM. Briefly, the numerical design of Saudi-KAU spectral dynamical core can be traced back to Bourke (1974). The code solves prognostic equations for vorticity and divergence, from which the zonal and meridional components of the wind are derived diagnostically. Fast Fourier transform and Legendre transforms are performed at each time step so that all linear calculations are done in spectral or wave space, while all nonlinear calculations are performed in (physical) grid space. To advance the model state forward in time, a semi-implicit time integration scheme is adopted. The atmospheric model can be used at several horizontal and vertical resolutions, but in this work we use spectral T42 (2.8° × 2.8°) horizontal resolution with 20 vertical levels (For details please see Almazroui et al. 2017). The atmospheric model is coupled with the modular ocean model version 2.2 (MOM2.2), developed at the Geophysical Fluid Dynamics Laboratory, (GFDL; Pacanowski 1995). MOM2.2 uses the Arakawa B-grid to solve the primitive equations and the hydrostatic and Boussinesq approximations. In MOM2.2, the zonal grid spacing is 1.0°, while the (1/3)° meridional grid spacing between 8°S and 8°N increases gradually to 3.0° between 30° S and 30° N, being fixed at 3.0° for the higher latitudes, and it has 32 vertical levels. In MOM2.2, a mixed layer model developed by Noh and Kim (1999) is implanted to improve the vertical structure of the upper ocean. The convection, radiation, planetary boundary layer (PBL), and Land-Surface Model (LSM) are based on the Simplified Arakawa-Schubert Cumulus (SAS) scheme by Moorthi and Suarez (1992), Nakajima and Tanaka (1986), Holtslag andBoville (1993), andBonan (1998) parametrizations, respectively. The physical schemes used in the initial conditions and hindcast simulations are identical.
Published in partnership with CECCR at King Abdulaziz University

Experimental Design
The initialization method described in Fig. 1a is based on an atmosphere and ocean nudging performed at every model time step. Zonal and meridional winds, specific humidity, temperature and surface pressure are nudged in the AGCM, while potential temperature and salinity are nudged in the OGCM. A typical equation used for nudging is: Here, "T" is the potential temperature, C p is the specific heat capacity at constant pressure, " " is the density, "Q" the heat energy, and "H" the enthalpy; the last term on the right-hand side of the above equation is the nudging term. The main parameter that controls the nudging strength is τ, also known as the relaxation time. This factor is chosen based on empirical considerations and differs from variable to variable. For example, τ is set to 5 days (432,000 s) for the temperature and salinity in case of ocean models, and for atmospheric variables, the atmospheric nudging time scale "τ" is fixed at 6 h (21,600 s). If τ is small, the solution converges toward the observations quickly, and the dynamics does not have time to adjust. That is, nudging terms contribute significantly to prognostic tendencies. In this case, the dynamical balance may not be satisfied when the nudging terms are removed and the analysis field is used as the initial condition (e.g., initial shocks). If τ is large, the model can grow too far away from observations before the nudging becomes effective. Initial conditions are available for the period 01 Mar 1980 to 31 Dec 2019 with 6-hourly output frequency. The lagged average forecast (LAF) is used as a perturbation method to generate the initial conditions for the ensemble hindcast simulations (Hoffman and Kalnay 1983). Importantly, the atmospheric initial conditions are perturbed with a 6-h interval while the ocean initial condition remains the same for all ensemble members (Fig. 1b). The retrospective ensemble forecast dataset from Saudi-KAU CGCM with 20 ensemble members starts on the first of each month and extends 12 months beyond the start day.

NMME and C3S Models Data
In this study, we analyze the ENSO forecast skill of the Saudi-KAU CGCM and compare it to that of operational models in the NMME and C3S projects. The NMME project contains models from operational forecast centers and research institutes in the USA and Canada, while C3S contains models from European countries. The exceptional feature of the "NMME" is the accessibility of consistent hindcasts on seasonal to sub-seasonal time scales. For details, please see Kirtman et al. (2014). The real-time forecasts are generated on the 8th day of every month to use in operational sub-seasonal to seasonal climate prediction. From C3S, we utilized the forecast data from the European

Methods
Saudi-KAU model provides the reforecast for all the calendar months. To compare with our model, we selected only those models from NMME and C3S which are operational and most importantly have reforecast available for all calendar months. In this setup, we have different time periods for different models. In this regard, the lead seasonal (December to February: DJF, March to May: MAM, June to August: JJA, and September to November: SON) or monthly ensemble mean climatologies for each model were constructed over the period mentioned in Table 1, while the respective anomalies were computed by removing these climatologies from the corresponding model climate. The observed seasonal and monthly climatologies and anomalies match those for individual models. For all analysis, the ensemble mean is utilized for Saudi-KAU as well as NMME and C3S models. The Niño 3.4 index (Barnston et al. 1997) is the spatially averaged SST anomaly over the equatorial Pacific region (5°S-5°N and 170°W-120°W). Two other ENSO indices are used together with Niño 3.4, to represent the ENSO diversity (Capotondi et al. 2015), namely Niño 3 (5°S-5°N and 150°-90°W) and Niño 4 (5°S-5°N and 160°E-150°W). For the analysis purpose, we re-gridded the model and observation datasets to 1.0° × 1.0° resolution by using the bi-linear re-gridding technique.

Results
First, we compare the mean state and variability of SST in the tropical Pacific Ocean, followed by the skill assessment, and we finish by comparing the skill of the Saudi-KAU CGCM with other models in predicting Niño 3.4 index.

Mean and Variability of SSTs in Tropical Pacific Ocean: Saudi-KAU vs Observation
The seasonal climatology of the predicted SSTs from the Saudi-KAU model is estimated based on the 20-ensemble members and compared with the observed SSTs shown in Fig. 2. The model predicted seasonal mean SSTs for each season (i.e., DJF, MAM, JJA, and SON) is at Lead-1 forecasts (i.e., initialization dates in the preceding November, February, May and August respectively), over the 1982-2019 study period. Figure 2 shows that the zonal gradient of the mean SSTs in the tropical Pacific is well simulated by the model compared to the observations. The Saudi-KAU model captures well the equatorial cold tongue along the equator over the eastern Pacific. However, the cold tongue extends westward more than in observations, which introduces a cold bias in the model. To explain it further, supplementary Fig. S1 shows the SST bias (negative in all seasons, but varying in magnitude) over the tropical Pacific domain during all four seasons. The analysis reveals the time when the errors in the SST in the tropical Pacific Ocean are greatest among the seasons, and when they are the smallest as simulated by the Saudi-KAU model. Typically, we can see cold bias in all seasons, and the highest cold bias (~ 1.0 to 1.5 °C) is observed during SON, while the lowest (0.5-1.0 °C) cold bias is observed in boreal winter time ( Fig. S1-a) in the central-eastern equatorial Pacific. Moreover, a strong positive bias is observed near the coasts of South America in the Niño 1 + 2 region during all seasons (Fig. S1). This positive bias is also highest during boreal fall (SON) (Fig. S1-d). Figure 3 shows the interannual variability of the predicted SSTs as measured by the standard deviation of each seasonal pattern in the tropical Pacific region, compared to observations for DJF, MAM, JJA, and SON at Lead-1. The overall structure of the SST variability in observations is well captured by the Saudi-KAU model in all seasons. Generally, tropical Pacific SST anomalies tend to have a maximum variance during boreal fall and winter time, and minimum values during boreal spring and summer, and the Saudi-KAU model simulates this feature quite well (Fig. 3), with reduced latitudinal extent in all seasons. In observations, the highest variability (~ 1.5 to 1.75 °C) is observed during SON and DJF, while MAM and JJA have lower standard deviation values (~ 0.75 to 1.0 °C) in the central-eastern equatorial Pacific. The Saudi-KAU model is able to capture the variability of the most variable seasons (DJF and SON) with good accuracy relative to observations. However, the longitudinal extent of the variability is small compared to observations in both seasons (Fig. 3a-b for DJF, and 3 g-h for SON). Spring (MAM) and summer (JJA) are the seasons with low variability, and the model also has low variability during these two seasons with a reduced longitudinal extent (Fig. 3c, d for MAM, and 3e-f for JJA). Figure 4 shows the trajectory maps of the Niño3.4 index, comparing the predictions of the Saudi-KAU CGCM with the observations. The prediction trajectories are shown as lines from each consecutive start month extended to the maximum lead, which is 12 months in this case. Generally, the prediction anomalies closely match the observations. However, this correspondence weakens with increasing lead time. Some ENSO events (e.g., 1998, and were relatively very well predicted even at long leads; on the other hand, several other specific events were less well predicted . These results are largely consistent with the results from Barnston et al. (2019), who compared the Niño3.4 trajectories in CMC11-CanCM3, NCEP-CFSv2, as well as a multi-model ensemble (MME) composed of eight individual models. Importantly, the predictions started during El Niño years have less spread compared to those in normal years.

Trajectory Maps and Skill Assessment
In Fig. 5, we compare the warm (El Niño: 13 events) and cold (La Niña: 13 events) episodes in observations and model (at six leads) during the period 1982-2019, for the boreal winter/DJF season. Again, Niño 3.4 is used to define the warm (SST anomaly in the Niño 3.4 region must be greater than or equal to + 0.5 C) and cold (SST anomaly in the Niño 3.4 region must be less than or equal to − 0.5 C) episodes. 11 out of 13 warm episodes were well predicted by the Saudi-KAU model up to 6 months in advance. However, all forecasts missed the1987-1988 warm event. While the warm event of 1994-95 was missed at longer leads (Lead-5 and 6), the model was able to capture this event at shorter lead times, but with a lower amplitude than in observations. Turning our attention to cold episodes, the model is somewhat able to capture 12 out of 13 events quite well, with the exception of the last (2000-01) of the triplet La Niña event (1999)(2000)(2001). In addition, the model amplitude of Niño 3.4 during cold episodes is typically higher than in observations. Figure 6 shows the anomaly correlation coefficients (ACC) of the three ENSO indices for four seasons and for lead time up to 6 months. Among the three ENSO indices, the Niño 3.4 shows a comparatively higher prediction skill for all seasons. By season, the lowest correlation skill is observed during boreal summer. For instance, the 1 3  SON (g, h) seasons. The period of analysis is 1982 to 2019. The unit of SST climatology is oC Published in partnership with CECCR at King Abdulaziz University occur in DJF (Fig. 7a), similar to Kim et al. (2012) findings that the ENSO prediction skill is higher during the winter. In the case of MAM (Fig. 7b) and JJA (Fig. 7c), the highest values are located slightly off the equator (north-central and southwestern); however, in case of SON (Fig. 7d) the highest contour values are located right at the central-eastern equatorial Pacific. Furthermore, we see a sharp drop in the correlation values during JJA and MAM at longer lead times as compared to DJF and SON (supplementary Figs. 2,4,5,6). The lowest skill values occur during JJA at longer leads and MAM for February initial conditions, which may be due to the spring predictability Published in partnership with CECCR at King Abdulaziz University barrier, as this remains a large challenge that limits the models' skill in forecasting ENSO (e.g., Chen et al. 2020;Zhang et al. 2021). Nonetheless, the correlation analysis indicates that the Saudi-KAU CGCM skill remains statistically significant at approximately the 6-month lead time, and therefore, it can be potentially useful for regional climate services providing early warning of seasonal climate forecasts for precipitation and temperature that depend on the predictions of the ENSO (e.g., Timmermann et al. 2018).

Saudi-KAU CGCM vs NMME and C3S Models
In this section, we compute the temporal anomaly correlation coefficients between the ensemble mean of the models' forecasts and observations for each target calendar month Fig. 4 Observed (black) and forecasted (Saudi-KAU model: blue) SST anomalies spatially averaged over Niño3.4 region during the period . A forecast trajectory (12 month's forecast) is shown for each calendar month. The unit of SST anomalies is o C and lead separately for Saudi-KAU CGCM and other eight models from the NMME and C3S projects. Figure 8 shows the temporal anomaly correlation between the ensemble mean of the models' forecast and observations as a function of target month and lead. The correlation tables for different models can be used to compare the predictive potential in each model. CanSIPSv2, CanCM4i, GEM-NEMO, Meteo-France-Sys7, and ECMWF are among those models that show the highest skill (greater than 0.90), which can persist for several months. It is important to mention here that CanSIPSv2 that is basically the multi-model mean of CanCM4i and GEM-NEMO (Lin et al. 2020) shows the highest skill as compared to individual models, which provides further evidence that a multi-model approach is better than using the single model (e.g., Palmer et al. 2004;Hagedorn et al. 2005;). Saudi-KAU, NCEP-CFSv2, COLA-RSMAS-CCSM4, and GFDL-SPEAR, also show high skill for short leads. However, this skill diminishes quickly as the lead time increases. Nevertheless, the Saudi-KAU CGCM predictions are at approximately the same skill level as the other stateof-the-art models.

Summary and Conclusions
The Saudi-KAU CGCM skillfully predicts ENSO-related SST anomalies. The hindcast from the Saudi-KAU CGCM is based on 20 ensemble members for the period 1982-2019, while the other models have different hindcast periods. The verification measures used here for the skill assessment are the mean, variability, and the temporal anomaly correlation coefficients of the Saudi-KAU CGCM. Overall, the Saudi-KAU CGCM is able to capture the observed SST climatological pattern and interannual variability (measured as the standard deviation) over the tropical Pacific region during four standard seasons. The Saudi-KAU CGCM shows a cold bias (0.5-1 °C) in the equatorial Pacific and a warm bias near the coast of South America in almost all seasons (SON shows highest bias). Trajectory maps of the Nino3.4 index  Anomaly correlation coefficients (ACC) averaged over the three ENSO ((a) Niño 3, (b) Niño 3.4,(c) Niño 4) regions for four target seasons and six starts. On average, ACC is usually above significant level for three ENSO regions (except for Niño 3 region at lead 6 and target season JJA) and four target seasons, representing the positive skill of the Saudi-KAU seasonal forecasting system Published in partnership with CECCR at King Abdulaziz University from the Saudi-KAU CGCM generally follow the observed pattern, particularly during the strong ENSO events. A statistically significant positive prediction skill of the SST anomalies in the tropical Pacific Ocean during all seasons is noted up to six months in advance of the forecasts. Comparatively, the skill is lower in spring and summer than fall and winter varying by year and lead times. A multi-model comparison of Niño3.4 index skill shows that the Saudi-KAU predictions are in line with the other state-of-the-art models.
The high skill demonstrated by the Saudi-KAU CGCM forecasts suggests that useful operational forecasts of the ENSO are feasible. However, there are a number of caveats which are of concern as compared to other state-of-the-art ENSO forecasting models, namely: (1) Low horizontal and vertical resolution of the Saudi-KAU AGCM.
(3) Low prediction skill during boreal summer at longer leads.
The current Saudi-KAU CGCM can be upgraded to a version with higher horizontal and vertical resolutions in addition to the new physical schemes for convection, microphysics, and land surface processes that have been introduced during recent years (e.g., Almazroui et al. 2017). Future plans also include an upgrade to a more up-to-date oceanic component, because the current ocean model MOM2.2 is quite old and does not contain recent improvements in the representation of oceanic physical processes. Finally, the cold bias in the central-eastern equatorial Pacific, and warm bias over some localized areas, is another area of focus for future improvements of the Saudi-KAU CGCM. Due to the Here, the color bar is adjusted to minimum value 0.3, which is the 95% confidence level threshold disproportionate impacts of ENSO on the climate in different parts of the world, progress in these topics will probably result in better ENSO forecasts at longer lead times that may lead to better forecasts of regional seasonal rainfall and temperature anomalies (e.g., Ehsan et al. 2021;Acharya et al. 2021) which may have direct positive societal impacts.