Forecast skill of the Indian monsoon and its onset in the ECMWF seasonal forecasting system 5 (SEAS5)

Accurate forecasting of variations in Indian monsoon precipitation and progression on seasonal time scales remains a challenge for prediction centres. We examine prediction skill for the seasonal-mean Indian summer monsoon and its onset in the European Centre for Medium-Range Weather Forecasts (ECMWF) seasonal forecasting system 5 (SEAS5). We analyse summer hindcasts initialised on 1st of May, with 51 ensemble members, for the 36-year period of 1981–2016. We evaluate the hindcasts against the Global Precipitation Climatology Project (GPCP) precipitation observations and the ECMWF reanalysis 5 (ERA5). The model has significant skill at forecasting dynamical features of the large-scale monsoon and local-scale monsoon onset tercile category one month in advance. SEAS5 shows higher skill for monsoon features calculated using large-scale indices compared to those at smaller scales. Our results also highlight possible model deficiencies in forecasting the all India monsoon rainfall.


Introduction
The Indian summer monsoon (ISM) presents an interesting challenge for seasonal prediction. ISM precipitation varies across different spatio-temporal scales, including the variability of seasonal mean rainfall and of ISM onset and withdrawal dates over the Indian subcontinent on seasonal and interannual timescales. These variations have large impacts on major water resources, ecosystems, agriculture and thus the Indian population. Improved forecasting of the ISM progression and seasonal rainfall can help alleviate water stress for agriculture and domestic needs and mitigate the impacts of hydrometeorological disasters.
Similar to its contemporary coupled ensemble prediction systems, the European Centre for Medium-Range Weather Forecasts (ECMWF) seasonal forecasting system 4 (SEAS4) represents the mean Asian monsoon circulation well despite systematic errors associated with monsoon precipitation (Kim et al. 2012). It has good skill at predicting ISM interannual variability (Jie et al. 2017; Pandey et al. 2015), and shows relatively low model bias for ISM rainfall compared to other models (Jain et al. 2019). The new ECMWF seasonal forecasting system 5 (SEAS5) has many improvements over SEAS4, including a reduction in cold sea surface temperature (SST) errors over the equatorial Pacific and Indian Ocean regions (Johnson et al. 2019). These improvements have led to significantly higher prediction skill for large-scale ISM rainfall interannual variability in SEAS5 compared to its predecessor, and also make SEAS5 a front runner in predicting extreme ISM rainfall (Köhn-Reich and Bürger 2019). However, SEAS5, like SEAS4, still shows stronger than observed teleconnections with the tropical Pacific.
It remains unclear how well SEAS5 represents the local-scale ISM and its onset, and what errors remain in its monsoon dynamics. Studies have shown that model deficiencies at seasonal timescales have more serious consequences for forecast skill compared to those arising from errors in the initial conditions (Pokhrel et al. 2016). Thus, an assessment of ISM simulation and its prediction skill in the SEAS5 system will be useful to user communities (scientific and operational), by providing information on the strengths and limitations of the model. In this study, we aim to examine the representation of the ISM and its progression in the ECMWF SEAS5 forecasting system with a focus on prediction of the monsoon onset. We quantify the prediction skill of seasonal mean ISM and monsoon onset variability in the SEAS5 hindcasts, using various objective indices.
The article is organized as follows: the SEAS5 hindcast and verification data used in the study are introduced in Sect. 2.1; methods are provided in Sect. 2.2; results for the prediction skill of seasonal mean ISM are discussed in Sect. 3 and for the monsoon onset in Sect. 4; and Sect. 5 provides the summary and concludes this study.

Data
The SEAS5 coupled model features the Integrated Forecast System (IFS; cycle 43r1) atmospheric model coupled to the HTESSEL land-surface model and the Nucleus for European Modelling of the Ocean (NEMO; version 3.4.1) ocean model. SEAS5 seasonal forecasts have O320 (≈ 36 km) horizontal resolution and 91 vertical levels for the atmosphere, and ORCA 0.25 (≈ 27 km) horizontal resolution and 75 vertical levels for the ocean. The SEAS5 forecasts are integrated for 7 months with a 51-member ensemble initialised on the first of every month. Johnson et al. (2019) and ECMWF (2017) give a detailed description of the SEAS5 forecasting system. We use 36 years (1981-2016) of the retrospective seasonal forecasts (also referred to as reforecasts or hindcasts) to estimate the forecast skill of the system. For our study, the focus is on analysing the hindcast set initialised on 1st of May. Studies have shown that dynamical coupled models initialised in May can skilfully predict ISM rainfall (e.g., Wang et al. 2015b; DelSole and Shukla 2010). [...] (Xie et al. 2003). We use GPCP due to its better representation of the Indian monsoon in comparison to other merged rainfall datasets (Prakash et al. 2015).

Objective indices
To evaluate prediction skill we use objective indices that reflect the key physical mechanisms associated with the Indian monsoon and define the monsoon onset date based on these mechanisms. The different indices used and their respective domains are summarized in Table 1.
- AIRI (All India Rainfall Index) is defined as the weighted average of JJA rainfall anomalies over the Indian region and is an index used by ECMWF. The weights are scaled by the fraction of low-altitude land and normalized to a unit-area average at the GPCP native resolution. The region covered and the weights at each grid-point are shown in Table 1. This mask allows us to calculate rainfall over India only and ignore the surrounding regions.
- TTGI (Tropospheric Temperature Gradient Index) is defined as the difference in vertically integrated tropospheric temperature (600-200 hPa) between the northern and southern regions of the South Asian domain. Pre-monsoon warming over the Asian region, along with the elevated heat pump due to the Tibetan Plateau, establishes a meridional temperature gradient and forms a heat low over South Asia, which is a precursor to the monsoon onset. The TTGI monsoon index is a seasonal average of the TTGI time series for the months of JJA. The TTGI monsoon onset date is the date on which the northern box (depicted in the TTGI panel of Table 1) becomes warmer than the southern box (i.e. the date on which the TTGI becomes positive). As our results were found not to be sensitive to any smoothing of the TTGI, no smoothing is applied here.
- WYI (Webster and Yang Index) measures the vertical shear of the zonal wind. [...]
- WLI (Wang and LinHo Index) defines monsoon onset at each grid-point using rainfall, providing spatial variability at the local scale. To avoid noise inherent in time series of grid-point rainfall, the WLI is calculated using a smoothed pentad rainfall time series, taken relative to the minimum in the annual cycle (January mean rainfall). The monsoon onset pentad at each grid-point is then the pentad in which the five-pentad running average of relative rainfall exceeds 5 mm day⁻¹. The spatial variability of climatological (1981-2016) WLI monsoon onset pentads for GPCP is shown in Table 1.
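The two onset definitions described above (TTGI turning positive, and the smoothed relative rainfall exceeding 5 mm day⁻¹ for the WLI) can be illustrated with a minimal Python sketch. The function names, the 0-based day index and the centred placement of the five-pentad window are assumptions made for illustration; the text does not state whether the running average is centred or trailing, and this is not the code used in the study.

```python
from statistics import mean


def ttgi_onset_day(ttgi_series):
    """Return the 0-based index of the first day with TTGI > 0, or None."""
    for day, value in enumerate(ttgi_series):
        if value > 0:
            return day
    return None


def running_mean(values, window=5):
    """Centred running mean (assumption); edges without a full window are None."""
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = mean(values[i - half:i + half + 1])
    return out


def wli_onset_pentad(pentad_rain, january_mean, threshold=5.0):
    """Return the 1-based pentad at which the five-pentad running mean of
    rainfall relative to the January mean first exceeds `threshold` (mm/day)."""
    relative = [r - january_mean for r in pentad_rain]
    smoothed = running_mean(relative, window=5)
    for i, value in enumerate(smoothed):
        if value is not None and value > threshold:
            return i + 1
    return None


# Toy grid-point series (mm/day), purely illustrative values.
rain = [1.0, 1.0, 2.0, 4.0, 8.0, 10.0, 12.0, 12.0, 11.0, 9.0]
print(ttgi_onset_day([-2.0, -1.0, -0.2, 0.3, 0.8]))
print(wli_onset_pentad(rain, january_mean=1.0))
```

In practice the same functions would be applied to each ensemble member and each grid-point separately, so that an onset date is available per member, year and location.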

Verification methods
In this study, we analyse the deterministic skill of the SEAS5 ensemble mean, as well as the probabilistic skill of the SEAS5 ensemble forecasts. Comparison of interannual variability between the hindcast ensemble mean and observations uses the Pearson correlation coefficient (CC). We use a one-sided Student's t-test to examine the existence of skill (CC > 0), and test for significant differences at the 5% level.
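As an illustration of the deterministic verification step, the sketch below computes the Pearson CC between an ensemble-mean series and observations and applies a one-sided t-test for CC > 0. It assumes independent years (no adjustment for autocorrelation) and uses a tabulated critical value of about 1.691 for 34 degrees of freedom (36 hindcast years) at the one-sided 5% level; both choices, and all function names, are assumptions for this example only.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(cc, n):
    """t statistic for a correlation cc from n pairs (n - 2 degrees of freedom)."""
    return cc * math.sqrt((n - 2) / (1.0 - cc ** 2))


# One-sided 5% critical value of Student's t for 34 degrees of freedom
# (tabulated value, hard-coded here as an assumption of the example).
T_CRIT_34DF_5PCT = 1.691


def has_skill(cc, n, t_crit=T_CRIT_34DF_5PCT):
    """True when cc is significantly greater than zero at the one-sided 5% level."""
    return t_statistic(cc, n) > t_crit


print(has_skill(0.30, 36), has_skill(0.25, 36))
```

With 36 years, this threshold corresponds to a CC of roughly 0.28, so correlations below that value would not be counted as skilful under these assumptions.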
To quantify the relationship between real-world skill and potential predictability of the ensemble forecast system, we use the 'ratio of predictable components' (RPC; Eade et al. 2014). RPC is calculated as the CC (actual skill) divided by the ratio between standard deviation of the model ensemble mean and the average standard deviation of individual members (potential predictability). A forecasting system having RPC equal to 1 indicates that the predictable component of the real world is the same as in the model world.
$$\mathrm{RPC} = \frac{\mathrm{CC}}{\sqrt{\sigma^2_{sig} / \sigma^2_{tot}}}$$
where CC is the correlation coefficient, $\sigma^2_{sig}$ is the signal variance of the model ensemble mean and $\sigma^2_{tot}$ is the variance of all ensemble members. The actual predictability of any forecast system is usually different to its potential predictability, and RPC represents this difference. RPC greater (less) than 1 denotes underconfident (overconfident) forecasts.
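A minimal sketch of the RPC calculation is given below, following the verbal definition above: CC divided by the ratio of the ensemble-mean standard deviation to the average standard deviation of individual members. The data layout (`hindcast[year][member]`), the use of population variances and the function names are assumptions for illustration, not the implementation used in the study.

```python
import math
from statistics import mean, pvariance


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rpc(hindcast, obs):
    """Ratio of predictable components for hindcast[year][member] and obs[year]."""
    n_members = len(hindcast[0])
    ens_mean = [mean(members) for members in hindcast]
    # Signal variance: interannual variance of the ensemble mean.
    var_sig = pvariance(ens_mean)
    # Total variance: interannual variance of each member, averaged over members.
    var_tot = mean([pvariance([year[m] for year in hindcast])
                    for m in range(n_members)])
    cc = pearson_cc(ens_mean, obs)
    return cc / math.sqrt(var_sig / var_tot)


# Toy example: 4 years, 2 members (illustrative numbers only).
toy_hindcast = [[2.0, 0.0], [1.0, 3.0], [4.0, 2.0], [3.0, 5.0]]
toy_obs = [1.0, 2.0, 3.0, 4.0]
print(rpc(toy_hindcast, toy_obs))
```

In this toy case the ensemble mean matches the observations exactly (CC = 1) while individual members are noisy, so the RPC exceeds 1, which is the underconfident situation described above.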
Reliability of the ensemble forecast system is measured based on the relationship between the intraensemble spread and the error of the ensemble mean forecast, as the 'spread-error ratio' (SER; Ho et al. 2013). For a large and perfect ensemble, the RMSE (root mean square error) of the ensemble mean would be equal to the ensemble spread about the ensemble mean (Weisheimer et al. 2011). An ensemble system with SER larger (smaller) than one is considered overdispersed (underdispersed), and the probabilistic forecasts are expected to be unreliable.
$$\mathrm{SER} = \sqrt{\frac{m+1}{m}}\,\frac{\sqrt{\sigma^2_{tot}}}{\mathrm{RMSE}}$$
where m is the number of ensemble members, RMSE is the root mean square error of the ensemble mean and $\sigma^2_{tot}$ is the variance of all ensemble members. To estimate the sampling uncertainty of RPC and SER, we use a bootstrapping approach: we generate a distribution of RPC and SER values from 1000 random samples drawn, with replacement, from the ensemble members and the hindcast years. From this distribution, using a two-sided test, we identify RPC and SER values that are statistically indistinguishable from 1 at the 95% confidence level.
The skill of SEAS5 at forecasting the monsoon onset is also quantified in terms of tercile categories: (a) early, (b) normal and (c) late onset. Model skill for onset tercile categories is estimated in terms of deterministic (Accuracy, ACC; Heidke skill score, HSS) and probabilistic (Brier skill score, BSS; Ranked probability skill score, RPSS) forecasts (WCRP 2015). ACC is the fraction of forecasts in which the model predicts the correct onset category, whereas HSS measures the accuracy of forecasts at predicting the observed category relative to that of random chance. Negative values for HSS indicate that the model forecast is worse than a randomly generated forecast set. BSS is calculated separately for each of the three onset categories and measures the mean-squared forecast probability error. RPSS measures the sum of squared probability errors, which is cumulative across the three forecast categories, in order from early to normal to late onsets. Negative values for BSS and RPSS indicate a forecast which is worse than a climatological forecast (with a probability of 1/3 in each onset category). Please refer to Table 2 for a detailed description of the verification skill scores used to quantify the skill of an ensemble forecast system.
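The spread-error ratio and its bootstrap uncertainty can be sketched as follows. This example assumes that the ensemble spread is the year-averaged sample variance of members about the ensemble mean, that the finite-ensemble factor is sqrt((m+1)/m), and that the 95% interval is taken from the 2.5th and 97.5th percentiles of the bootstrap distribution; the function names and data layout are hypothetical, and the study's exact procedure may differ.

```python
import math
import random
from statistics import mean, variance


def spread_error_ratio(hindcast, obs):
    """SER for hindcast[year][member] against obs[year]."""
    m = len(hindcast[0])
    ens_mean = [mean(members) for members in hindcast]
    rmse = math.sqrt(mean([(f - o) ** 2 for f, o in zip(ens_mean, obs)]))
    # Spread: intra-ensemble sample variance, averaged over years (assumption).
    spread_var = mean([variance(members) for members in hindcast])
    return math.sqrt((m + 1) / m) * math.sqrt(spread_var) / rmse


def bootstrap_interval(hindcast, obs, n_boot=1000, seed=0):
    """95% bootstrap interval of SER, resampling years and members with replacement."""
    rng = random.Random(seed)
    n_years, m = len(hindcast), len(hindcast[0])
    values = []
    for _ in range(n_boot):
        years = [rng.randrange(n_years) for _ in range(n_years)]
        members = [rng.randrange(m) for _ in range(m)]
        sample = [[hindcast[y][k] for k in members] for y in years]
        sample_obs = [obs[y] for y in years]
        try:
            values.append(spread_error_ratio(sample, sample_obs))
        except ZeroDivisionError:
            continue  # skip degenerate resamples with zero RMSE
    values.sort()
    n = len(values)
    return values[int(0.025 * n)], values[min(n - 1, int(0.975 * n))]


def indistinguishable_from_one(interval):
    """True when the bootstrap interval contains 1."""
    lo, hi = interval
    return lo <= 1.0 <= hi
```

The same resampling loop could be reused for RPC by swapping the statistic that is evaluated on each resample.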

Monsoon dynamics and thermodynamics
In this section, we examine the model performance in terms of the dynamics and thermodynamics of the ISM in order to understand the monsoon rainfall errors. We will first assess the climatological mean state of the model, to diagnose systematic biases in the model. Next we will analyse the interannual prediction skill for the different monsoon features. For an objective analysis of monsoon characteristics, we use seasonal (JJA) mean indices: AIRI, TTGI, WYI and WFI, as described in Table 1.

Climatological mean and interannual variability
SEAS5 generally simulates the pattern of ISM features well (Fig. 1). It clearly shows the enhanced meridional gradient in tropospheric temperature (Li and Wang 2016), which establishes the lower level cross-equatorial flow (Findlater 1969) and consequently the upper level tropical easterly jet (Koteswaram 1958). SEAS5 shows a warmer than observed temperature gradient (Fig. 1j) which strengthens the monsoon flow, as indicated by the anomalous westerly bias at 850 hPa (Fig. 1k). An upper level easterly wind bias north of India and a westerly bias to the south (Fig. 1l) are due to a slight northward shift in the positions of the tropical easterly jet and the sub-tropical westerly jet in SEAS5 (not shown).
SEAS5 represents the climatological pattern of the seasonal mean monsoon precipitation well: i.e. enhanced precipitation over the Western Ghats, monsoon core region, Gangetic Delta and the Himalayas (Fig. 1a). SEAS5 overestimates seasonal mean precipitation over the Western Ghats and Himalayas and underestimates it over the Gangetic Plains and Delta (Fig. 1i), as do other seasonal forecast models, such as the North American Multi-Model Ensemble (NMME) seasonal systems (Singh et al. 2019). The underestimation of ISM mean rainfall is not unique to SEAS5; other contemporary models also show a drier ISM at the same forecast lead times (Jain et al. 2019).
The verification of SEAS5 seasonal mean (JJA) monsoon indices is summarized in Table 3. SEAS5 seasonal mean monsoon indices differ significantly from those in observations/reanalyses, due to biases discussed in the above paragraph. Tropospheric temperature for SEAS5 is generally warmer than in reanalysis and the local-scale rainfall biases lead to a seasonal mean dry bias in AIRI. Due to the westerly wind biases, at lower and upper levels, SEAS5 underestimates the zonal vertical wind shear (WYI) and overestimates the zonal horizontal wind shear (WFI). For almost all the monsoon indices, the interannual spread in SEAS5 is usually larger than observed, except for WYI (Table 3). For the detailed interannual ensemble member spread of monsoon indices in SEAS5 compared to observations, please refer to Fig. 2. Compared to the ensemble spread of AIRI in SEAS5, ensemble spread is higher for WYI and lower for TTGI. SEAS5 ensemble spread generally encompasses the seasonal mean observations except in years such as 1994 for TTGI, WYI and WFI, and 2002 for AIRI and TTGI. Further, as all of these indices are indicators of Indian monsoon strength, they are known to not be entirely independent (Moron and Robertson 2014). Our results indicate that these indices show similar patterns of interannual variability and there are significant correlations between the monsoon rainfall index (AIRI) and the three other indices, for both SEAS5 and observations (Fig. 2b-d).

Table 2 Description and calculation of the verification skill scores used in the study

Verification skill score: Accuracy (ACC) quantifies the fraction of forecasts predicting the correct tercile category amongst all forecasts and ranges from 0 (no skill) to 1 (perfect score).
Calculation: $$\mathrm{ACC} = \frac{1}{N}\sum_{i=1}^{C} n(F_i, O_i)$$ where C is the number of forecast categories (early/normal/late), N is the total number of forecasts (years) and n(F_i, O_i) is the number of accurate forecasts in category i.

Verification skill score: Heidke Skill Score (HSS) represents the accurate forecasts after eliminating those which are correct due to random chance. This score ranges from −∞ to 1, with 0 meaning no skill and 1 meaning a perfect forecast score. Negative values for HSS indicate that the model forecast is worse than a randomly generated forecast set.
Calculation: $$\mathrm{HSS} = \frac{\frac{1}{N}\sum_{i=1}^{C} n(F_i, O_i) - \frac{1}{N^2}\sum_{i=1}^{C} n(F_i)\,n(O_i)}{1 - \frac{1}{N^2}\sum_{i=1}^{C} n(F_i)\,n(O_i)}$$ where C is the number of forecast categories, N is the total number of forecasts (years), n(F_i, O_i) represents the accurate forecasts and n(F_i)n(O_i) is the product of the number of forecasts and the number of observations in category i (the forecasts expected to be correct by chance).

Verification skill score: Ranked Probability Skill Score (RPSS) measures how well the multi-category probabilistic forecast predicts the actual observed category in a cumulative sense. This score ranges from −∞ (highest possible error) to 1 (perfect score), with 0 indicating no skill when compared to the reference climatology.
Calculation: $$\mathrm{RPSS} = 1 - \frac{\mathrm{RPS}}{\mathrm{RPS}_{clim}}, \qquad \mathrm{RPS} = \frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{C}\left(\sum_{k=1}^{c} F_{k,n} - \sum_{k=1}^{c} O_{k,n}\right)^2$$ where RPS is the Ranked Probability Score and RPS_clim is the reference RPS climatology, calculated with the same formula as RPS, but with a climatological probability of 1/3 for the value of F in all cases. F is the forecast probability, O is the observed category, N is the number of forecasts and C is the number of forecast categories: (1) early, (2) normal, (3) late; quantified as cumulative categorical forecast probability in the given order.

Verification skill score: Brier Skill Score (BSS) defines the skill of the probabilistic forecast for a category; it is calculated separately for each category and reflects the mean-squared probability error. This score ranges from −∞ (highest possible error) to 1 (perfect score), with 0 indicating no skill when compared to the reference climatology.
Calculation: $$\mathrm{BSS} = 1 - \frac{\mathrm{BS}_c}{\mathrm{BS}_{clim}}, \qquad \mathrm{BS}_c = \frac{1}{N_c}\sum_{n=1}^{N_c}\left(F_n - O_n\right)^2$$ where BS_clim is the reference Brier Score climatology, calculated with the same formula as BS_c but with a climatological probability of 1/3 for the value of F in all cases, and BS_c is the Brier Score of a particular category. F is the forecast probability of that category, O is the observed category and N_c is the number of forecasts (years) in the same category. O is 1 for the observed category and 0 for other categories.
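The deterministic scores described in Table 2 can be illustrated with a short sketch that computes ACC and HSS from paired lists of forecast and observed tercile categories. The standard multi-category Heidke formulation is assumed here, and the category labels and function name are hypothetical choices for the example.

```python
CATEGORIES = ("early", "normal", "late")


def acc_and_hss(forecast, observed, categories=CATEGORIES):
    """Return (ACC, HSS) for paired lists of forecast and observed categories."""
    n = len(forecast)
    # Fraction of years in which the forecast category matches the observed one.
    acc = sum(f == o for f, o in zip(forecast, observed)) / n
    # Fraction of hits expected by chance, from the marginal category counts.
    expected = sum(forecast.count(c) * observed.count(c) for c in categories) / n ** 2
    hss = (acc - expected) / (1.0 - expected)
    return acc, hss


# Toy example with six forecast years (illustrative values only).
fc = ["early", "normal", "late", "early", "normal", "late"]
ob = ["early", "normal", "late", "late", "early", "normal"]
print(acc_and_hss(fc, ob))
```

In this toy case half of the forecasts are correct, while one third would be expected to be correct by chance, so the HSS is positive but well below 1.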
Analysis of the monthly mean AIRI (Fig. 3) shows that SEAS5 also represents the monsoon (May to September) seasonal cycle well. Monthly AIRI strongly increases from May to July, as the monsoon peaks, and then shows a gradual reduction from July to September. SEAS5 has a wet bias in the months before the monsoon peak (May-June) and a dry bias from July to September. Thus, the seasonal mean dry bias for AIRI in SEAS5 stems from insufficient rainfall from July onwards. The interannual spread of the monthly mean AIRI is larger in SEAS5 than observed, similar to the interannual spread of the seasonal mean AIRI (Table 3).

Skill of interannual prediction
To evaluate how well SEAS5 hindcasts represent predictable modes of ISM variability, we compare the principal component (PC) time series for the first two modes (PC1 and PC2) of ISM variability between SEAS5 and GPCP (Fig. 4). For the PC time series calculation we use the first two leading empirical orthogonal functions (EOF1 and EOF2) calculated for GPCP monthly-mean summer rainfall anomalies over the South Asian domain (15°S-30°N, 60°E-120°E) (ECMWF 2017). The first EOF (EOF1; Fig. 4a) resembles precipitation patterns associated with summer La Niña events, with enhanced precipitation over the eastern equatorial Indian Ocean and southeast Asia and reduced precipitation over the northern Bay of Bengal and East Asia. The second EOF (EOF2; Fig. 4c) pattern has enhanced precipitation over the Indian subcontinent, surrounding oceans and western equatorial Indian Ocean, which resembles the [...]. The PC1 (Fig. 4b) and PC2 (Fig. 4d) time series are generated by spatially regressing the seasonal mean rainfall anomalies of each year (SEAS5 and GPCP) onto the EOF1 and EOF2 patterns respectively. Comparison of the SEAS5 PC time series against observations shows moderate but significant correlations between the model and observations for the first two modes of interannual monsoon variability (Fig. 4b, d).
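The projection step described above, in which each year's anomaly field is regressed onto a fixed EOF pattern to obtain one PC value per year, can be sketched as follows. The example treats each field as a flattened list of grid-point values and uses an unweighted least-squares regression coefficient; area (latitude) weighting is omitted for brevity, which is an assumption of this sketch rather than a statement about the study's method.

```python
def project_onto_eof(anomaly_field, eof_pattern):
    """Least-squares regression coefficient of one anomaly field on an EOF pattern.

    Both arguments are flattened lists of grid-point values of equal length.
    """
    numerator = sum(a * e for a, e in zip(anomaly_field, eof_pattern))
    denominator = sum(e * e for e in eof_pattern)
    return numerator / denominator


def pc_time_series(anomalies_by_year, eof_pattern):
    """One PC value per year, obtained by projecting each year's anomaly field."""
    return [project_onto_eof(field, eof_pattern) for field in anomalies_by_year]


# Toy three-point "domain" (illustrative values only).
eof1 = [1.0, -1.0, 0.5]
years = [[2.0, -2.0, 1.0],   # exactly twice the EOF pattern
         [1.0, 1.0, 0.0]]    # orthogonal to the EOF pattern
print(pc_time_series(years, eof1))
```

Applying the same GPCP-derived pattern to both the hindcast and the observed anomalies keeps the two PC series directly comparable, since they measure the amplitude of an identical spatial structure.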
When we look at skill in ISM interannual variability, in terms of CC, RPC and SER (Table 3), the results suggest that large-scale monsoon features measured by temperature gradient (TTGI) and vertical wind-shear (WYI) are better represented in SEAS5 than smaller-scale features (AIRI and WFI), as large-scale seasonal mean monsoon indices in SEAS5 are significantly correlated with the reanalysis. Other models also show moderate skill for ISM rainfall (Rajeevan et al. 2012) but have better skill for large-scale monsoon circulation (Johnson et al. 2017). Further, for smaller-scale monsoon features (AIRI and WFI), SEAS5 RPC is significantly lower than 1 (Eade et al. 2014), which indicates overconfident forecasts, where the ensemble mean resembles the ensemble members more than the observations. RPC for large-scale monsoon features (TTGI and WYI) is statistically indistinguishable from 1, which suggests that the forecast skill of SEAS5 is close to the potential predictability limit estimated by the model ensemble spread. Similar to RPC, SEAS5 has SER statistically indistinguishable from 1 for large-scale indices (TTGI, WYI), and SER lower than 1 for smaller-scale indices. This suggests that SEAS5 has reliable forecasts for large-scale monsoon features and the forecasts associated with smaller-scale monsoon features are underdispersed, which leads to unreliable probabilistic forecasts for the smaller-scale indices.
Fig. 2 [...] The numerical values shown in panels b-d are CC between AIRI and each respective index for observed (red) and SEAS5 (black); an asterisk represents statistical significance at the 5% level
Fig. 3 Monthly interannual spread of AIRI for SEAS5 (blue) and GPCP (white) in boxplots. For each boxplot, whiskers show the range (max-min) of the AIRI, middle dash is the median and box ends show the inter-quartile range
The model's interannual predictive skill at each grid-point for seasonal (JJA) mean rainfall, tropospheric temperature and vertical zonal wind shear is shown in Fig. 5. SEAS5 skill, in terms of correlation, at each grid-point, indicates that the thermodynamic and dynamic features such as the vertical wind shear of zonal winds (Fig. 5c) and tropospheric temperature (Fig. 5b) are generally better represented in SEAS5 than the precipitation anomaly (Fig. 5a), which is a common feature amongst all current models (Kim et al. 2012). SEAS5 has significant correlations with observed precipitation anomalies (p-value less than 0.05; Fig. 5a), or RPC statistically indistinguishable from 1 (Fig. 5d), only over parts of southern, eastern and western India, with only parts of northern India having SER statistically indistinguishable from 1. For zonal vertical wind shear, SEAS5 has good prediction skill over most parts of India (Fig. 5b). Its actual skill matches well with potential predictability (Fig. 5e), and it has reliable forecasts (Fig. 5h), over the whole Indian subcontinent and surrounding seas. Fig. 5c shows that SEAS5 skill for tropospheric temperature is high over the whole of the tropical Indian Ocean region (15°S to 15°N), and SEAS5 has reliable forecasts for tropospheric temperature over southern India (Fig. 5i). SEAS5 generally produces underdispersed and overconfident forecasts (SER < 1), for all three variables, for most parts of the study domain (Fig. 5g-i), which is a common problem among other seasonal forecasting systems (Weisheimer et al. 2011). Nonetheless, there are some regions where SER > 1, i.e. parts of the central and western Indian subcontinent for precipitation, southern India for zonal vertical wind shear, and the western equatorial Indian Ocean and parts of the Tibetan Plateau for tropospheric temperature. These regions have unreliable (underconfident) probabilistic forecasts with SEAS5, because the ensemble spread is larger than the ensemble mean error.

Monsoon onset and progression
In this section, we examine the model performance for the ISM onset and its progression. We will first assess the climatological mean and interannual variability of modelled monsoon onset dates and progression. Next we will analyse the model prediction skill for tercile categories of monsoon onset. To calculate the monsoon onset dates we use objective indices: TTGI, WYI, WFI and WLI, as described in Table 1.

Climatological mean, interannual variability and skill
Before analysing SEAS5 prediction skill for the monsoon onset, we examine the climatology of the gradual progression of the monsoon over the Indian subcontinent, during different pentads (Fig. 6). Climatologically, the beginning of the monsoon onset over India is traditionally considered to occur around 1st of June (Krishnamurthy and Shukla 2000) along the coast of Kerala (Ananthakrishnan and Soman 1988). Prior to that, the onset of monsoon rains is only clearly seen over the Bay of Bengal and Arabian Sea, and a north-westerly flow dominates over the Indian landmass, which brings dry air towards India during the pre-monsoon. In GPCP, we see monsoon onset well established over Kerala by late May, pentad 29 (Fig. 6g); however, the monsoon onset over continental India is just beginning in SEAS5 (Fig. 6a). Monsoon onset over Kerala for SEAS5 is delayed by a pentad compared to GPCP, and occurs by pentad 30 (Fig. 6b).
As the summer progresses, monsoon winds become established bringing in moist air from the south-west. Advance of the monsoon over India follows the south-east to north-west direction; that is, perpendicular to the monsoon flow. This is due to the presence of pre-monsoon mid-level north-westerly dry winds which are slowly eroded by low-level moist monsoon flow from the tropics (Parker et al. 2016). These dynamics are well represented in SEAS5. Further, although slow to start, the monsoon precipitation in SEAS5 encompasses almost the whole of India by pentad-39 (Fig. 6i), whereas in observations, it only does so by pentad-40 (Fig. 6t) i.e. mid-July (Krishnamurthy and Shukla 2000). SEAS5 shows similar south-east to north-west progression of the monsoon onset, as observed, but the onset progression is slower in SEAS5 than GPCP during May, and faster during July.
To present a holistic assessment of prediction skill for the ISM onset, we use different objective indices based on predominant physical aspects of the monsoon to calculate onset dates in the model and reanalysis, rather than verifying the model against the classical subjective criterion of increased rainfall over a small region of Kerala. SEAS5 ensemble mean climatological monsoon onset dates are statistically similar to the onset dates for reanalysis with TTGI and WYI (Table 3). For these two onset indices the interannual spread of the monsoon onset date in SEAS5 is also statistically similar to the spread in reanalysis and the interannual variability is significantly correlated between the two (Table 3). However, the seasonal mean monsoon values for the same indices (TTGI, WYI) have interannual spread that is significantly different between SEAS5 and reanalysis (Table 3).
Fig. 4 Modes of June-September GPCP ISM precipitation variability as a EOF1 and c EOF2 patterns over the Indian region; and principal component (PC) time series associated with the b first (PC1) and d second (PC2) leading modes of EOF of ISM precipitation
The average onset date measured by WFI is significantly earlier in SEAS5 than ERA5 (Table 3). Even the histogram and the tercile bounds of the WFI onset dates in SEAS5 are shifted towards earlier dates than in reanalysis (Fig. 7c) and the interannual spread is also statistically different for onset dates between SEAS5 and reanalysis (Table 3). However, SEAS5 represents the WFI onset variability moderately well, unlike for the WFI seasonal mean monsoon index. The RPC and SER of SEAS5, with all onset indices, are statistically indistinguishable from 1. RPC equal to 1 suggests that, for monsoon onset dates, SEAS5 skill is comparable to its potential predictability, i.e. the model predicts itself and the reality with similar skill. SER statistically indistinguishable from 1 suggests that the SEAS5 monsoon onset probabilistic forecasts are reliable, wherein ensemble members are statistically indistinguishable from the observations.
We also consider SEAS5 onset forecast skill at each grid-point based on the WLI (Fig. 8). SEAS5 represents the interannual variability of the monsoon onset well over parts of northern India, central India, coasts of the Indian peninsula and marginal seas of the Indian Ocean, as the hindcast is significantly correlated with observations (Fig. 8a). SEAS5 shows good forecast skill for onset variability over most parts of India (Fig. 8a), higher than its skill at representing seasonal mean precipitation (Fig. 5a). Over all of the Indian subcontinent and surrounding seas the forecast skill for monsoon onset is higher than that of random chance (HSS > 0; Fig. 8b). As the monsoon onset occurs within 1-2 months lead time, whereas seasonal mean precipitation is accumulated over months 2-4 of the forecast, this difference in skill between monsoon onset and mean precipitation is expected. However, for the local-scale onset, RPC and SER are statistically indistinguishable from 1 (Fig. 8c, d) over only some small parts of India and the marginal seas, similar to seasonal mean AIRI RPC and SER (Fig. 5d, g). SEAS5 has reliable forecasts (SER ≃ 1) and close-to-perfect predictability potential (RPC ≃ 1) over only small parts of northern, central and western India. Over most parts of India, RPC is less than 1 (overconfident forecasts), and over some parts of eastern India and the Bay of Bengal RPC is greater than 1 (underconfident forecasts). Over most parts of India, SER is greater than 1, indicating unreliable underconfident forecasts, where the forecast spread exceeds the ensemble mean error.
Fig. 5 [...] Stippling indicates a CC that is significant at the 5% level. d-f show the same as a-c but for RPC and g-i show the same as a-c but for SER. Stippling shows RPC and SER values which are statistically indistinguishable from 1 at the 95% interval

Tercile forecast skill
Forecasting systems are generally better at categorical forecasts than absolute deterministic forecasts. SEAS5 ensemble mean ISM onset forecast skill, for the correct onset category (defined by early, normal and late terciles), for all single onset indices has been verified using ACC and HSS (Fig. 9a-b; see Sect. 2.2.2). Higher values for ACC and HSS indicate better forecast skill, with positive values of HSS indicating forecast skill better than that from a randomly generated forecast. As for the deterministic forecast skill for the onset dates (Table 3), SEAS5 ensemble mean prediction skill for monsoon onset categories (Fig. 9a-b) is also higher with larger scale onset indices (TTGI and WYI) than a smaller scale index (WFI).
Fig. 6 Spatial pattern of total pentad precipitation (mm; shaded), pentad-mean low-level winds (m s⁻¹; vectors) and the WLI monsoon onset isochrone for that pentad (red contour). a Pentad-29 to j pentad-40 in SEAS5. k-t show the same as a-j but with GPCP and ERA5 datasets for verification. Dates in the subplot titles show the start dates of the respective pentads
The overall performance of SEAS5 in delivering a probabilistic categorical forecast is indicated by BSS and RPSS (see Sect. 2.2.2), shown in Fig. 9c-f. Higher positive values of these skill scores indicate better forecast skill (RPSS and BSS range from − ∞ to 1). Any value higher than 0 for BSS and RPSS indicates a forecast better than that from climatology. BSS provides skill for SEAS5 in forecasting onset for each category; early (BSS-E), normal (BSS-N) and late (BSS-L). RPSS summarizes the model performance scores over the three tercile categories. Figure 9 shows that SEAS5 categorical probabilistic forecast skill is better than that of a random forecast (positive HSS) and a climatological forecast (positive RPSS and BSS for all categories).
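The probabilistic scores discussed above can be sketched in a few lines. In this example the forecast probability of each tercile category is taken as the fraction of ensemble members falling in that category, the Brier score is averaged over all forecast years, and the climatological reference assigns a probability of 1/3 to each category; the category labels, function names and the averaging over all years are assumptions for illustration only.

```python
CATEGORIES = ("early", "normal", "late")


def category_probabilities(member_categories):
    """Forecast probabilities as the fraction of members in each category."""
    m = len(member_categories)
    return [member_categories.count(c) / m for c in CATEGORIES]


def brier_score(probs, observed, k):
    """Mean squared probability error for category index k over all years."""
    total = 0.0
    for p, o in zip(probs, observed):
        outcome = 1.0 if o == CATEGORIES[k] else 0.0
        total += (p[k] - outcome) ** 2
    return total / len(observed)


def ranked_probability_score(probs, observed):
    """Mean over years of squared errors in the cumulative category probabilities."""
    total = 0.0
    for p, o in zip(probs, observed):
        cum_f = cum_o = 0.0
        for k, c in enumerate(CATEGORIES):
            cum_f += p[k]
            cum_o += 1.0 if o == c else 0.0
            total += (cum_f - cum_o) ** 2
    return total / len(observed)


def skill_scores(probs, observed):
    """Return ({category: BSS}, RPSS) relative to a 1/3 climatological forecast."""
    clim = [[1.0 / 3.0] * 3 for _ in observed]
    bss = {c: 1.0 - brier_score(probs, observed, k) / brier_score(clim, observed, k)
           for k, c in enumerate(CATEGORIES)}
    rpss = 1.0 - (ranked_probability_score(probs, observed)
                  / ranked_probability_score(clim, observed))
    return bss, rpss
```

Under these definitions a forecast that always issues probabilities of 1/3 scores exactly zero for BSS and RPSS, which is why positive values are read as an improvement over climatology.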
The skill in SEAS5 increases with the scale of the monsoon onset index, from indices computed at the local-scale (WFI) to those at the large scale (TTGI), perhaps due to the model's ability to simulate the large-scale monsoon features better. Early monsoon onsets are better predicted than later onsets in SEAS5, due to the shorter lead time of forecast. However, skill in different categories of onset, calculated using BSS, does not change linearly with the spatial scale of onset index. Forecast skill for WFI in the late onset category is worse than climatology, and the model does relatively better for the early onset category with WFI. This might stem from the fact that SEAS5 generally predicts earlier onsets with WFI than in reanalysis (Fig. 7c) due to stronger low-level westerlies (Fig. 1k) strengthening the WFI in SEAS5 (Table 3).
SEAS5 skill for representing monsoon onset categories at the local-scale is shown in Fig. 8b. The SEAS5 ensemble forecast has the skill to represent the local-scale monsoon onset category accurately in more than 50% of the forecasts (ACC > 0.5) over most parts of India. The model predicts tercile categories of monsoon onset better than that of a random forecast over large parts of India (HSS > 0). Similar to other forecasting systems like GloSea5-GC2 (Chevuturi et al. 2019), SEAS5 has very good skill at predicting monsoon onset categories over most of the Indian subcontinent, whereas deterministic skill for the monsoon onset is limited to certain regions.

Conclusion and discussion
We have assessed the seasonal prediction skill of the Indian monsoon and its onset in the ECMWF SEAS5 coupled ensemble seasonal forecast system. Using multiple monsoon indices we verified the deterministic, probabilistic and categorical skill of the SEAS5 forecasting system. SEAS5 shows an overall dry bias over the Indian subcontinent, as seen in other contemporary models (Jain et al. 2019). The strengthened lower-level monsoon winds (caused by a warmer than observed temperature gradient) are associated with increased rainfall over the Arabian Sea and Western Ghats. SEAS5 has notable mean-state ISM precipitation errors, including overestimation of rainfall over the high orography regions (Western Ghats and Himalayas) and underestimation of rainfall over the Gangetic plains and delta, as shown by other seasonal forecasting systems (Singh et al. 2019). Known difficulties in representing orographic precipitation (Pokhrel et al. 2016) and irrigation in surface processes (Mathur and Achuta-Rao 2020) may play a role in these errors. We should, however, be cautious because the observations over the high orography have large uncertainties due to sparse observational networks. Despite local-scale precipitation errors, SEAS5 represents the interannual variability of the precipitation patterns, associated with the first two EOFs, moderately well. SEAS5 has better skill at predicting the large-scale circulation variability than the all India rainfall, consistent with other forecasting systems (Kim et al. 2012; Johnson et al. 2017). SEAS5 also has better skill for seasonal mean monsoon indices averaged over their defined domains than for the respective monsoon index calculated at each grid-point. This is because spatial averaging over larger domains yields improved skill due to the extended spatial coherence of monsoon variability (Jain et al. 2019).
The progression of the monsoon onset over the Indian subcontinent in SEAS5, compared to that observed, is slower during May and faster during July.
SEAS5 has small biases in representing the climatological strengths of JJA mean monsoon indices as well as the mean monsoon onset dates calculated with smaller scale indices. However, the interannual spread for the JJA mean ISM monsoon indices in SEAS5 is generally higher than observed, although the spread in monsoon onset dates is statistically similar to that of the observations. This difference in the skill for mean monsoon features versus monsoon onset dates is also seen in the UK Met Office seasonal coupled forecast model, GloSea5-GC2 (Johnson et al. 2017; Chevuturi et al. 2019), which is linked to the difference in forecast lead times. SEAS5 has good skill at representing the large-scale monsoon onset date for ISM and large-scale JJA monsoon, with reliable probabilistic forecasts (SER ≃ 1) and close-to-perfect predictability potential (RPC ≃ 1). Generally, SEAS5 has SER and RPC values significantly lower than 1, at the 95% confidence level, for smaller scale monsoon indices, such as all India rainfall, which indicates unreliable (underdispersive or overconfident) forecasts. Thus, we can conclude that, for small-scale monsoon features, the SEAS5 forecasting system predicts itself better than it predicts reality, as seen in GloSea5-GC2 (Chevuturi et al. 2019). Parts of central India in SEAS5 have unreliable (overdispersive and underconfident) forecasts for both monsoon rainfall and onset, which indicates that the forecast ensemble spread for these features is much larger than the ensemble mean error.

Same as Fig. 2 but showing the onset dates calculated with a TTGI, b WYI and c WFI. The dashed horizontal lines show the upper and lower terciles for monsoon onset for SEAS5 (black) and ERA5 (red). The heatmap below the boxplot shows the tercile categories of the onset dates for SEAS5 and ERA5. If the onset date of SEAS5 or ERA5 lies between their respective upper and lower tercile lines, it is considered a normal onset (white box in heatmap); if it is before or after, it is considered an early (blue box in heatmap) or late (red box in heatmap) onset respectively. Histogram distributions of the onset dates for the SEAS5 ensemble (black) and ERA5 (red) are shown on the right of the boxplots, corresponding to the onset dates on the y-axis
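To make the interpretation of SER and RPC concrete, the sketch below computes both from a hindcast ensemble and a verifying series. It assumes that SER is the ratio of the mean ensemble spread to the RMSE of the ensemble mean, and that RPC follows the commonly used estimate of the ensemble-mean correlation with observations divided by the square root of the ratio of ensemble-mean variance to the average single-member variance. These are assumptions about the definitions, the function names are hypothetical, and the data are synthetic rather than SEAS5 output.

```python
import numpy as np

def spread_error_ratio(ens, obs):
    """Mean ensemble spread divided by RMSE of the ensemble mean.

    ens: array (n_years, n_members); obs: array (n_years,)
    """
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    spread = np.sqrt(np.mean(np.var(ens, axis=1, ddof=1)))   # intra-ensemble spread
    rmse = np.sqrt(np.mean((ens.mean(axis=1) - obs) ** 2))   # ensemble-mean error
    return spread / rmse

def ratio_of_predictable_components(ens, obs):
    """Estimate of RPC: observed over model predictable component."""
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ens_mean = ens.mean(axis=1)
    r = np.corrcoef(ens_mean, obs)[0, 1]                # skill against observations
    signal_var = np.var(ens_mean, ddof=1)               # variance of the ensemble mean
    total_var = np.mean(np.var(ens, axis=0, ddof=1))    # average variance of single members
    return r / np.sqrt(signal_var / total_var)

# Synthetic illustration: a shared signal plus noise
rng = np.random.default_rng(0)
n_years, n_members = 2000, 51
signal = rng.normal(size=n_years)
obs = signal + rng.normal(size=n_years)
# members drawn with the same noise level as the observations
reliable = signal[:, None] + rng.normal(size=(n_years, n_members))
# members with too little noise, so spread is too small for the error
overconfident = signal[:, None] + 0.3 * rng.normal(size=(n_years, n_members))

ser_reliable = spread_error_ratio(reliable, obs)
ser_overconf = spread_error_ratio(overconfident, obs)
rpc_reliable = ratio_of_predictable_components(reliable, obs)
rpc_overconf = ratio_of_predictable_components(overconfident, obs)
```

With these synthetic data the ensemble whose members share the noise level of the observations gives SER and RPC close to 1, while the ensemble with too little member noise gives values well below 1. This matches the reading in the text of an underdispersive, overconfident system that predicts itself better than it predicts reality.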
Our results show a steady decrease in skill as we move from onsets calculated using large-scale indices to those at smaller scales. However, we also show that this is not true when we analyse the skill for different onset categories (defined by early, normal and late terciles). Onsets calculated from horizontal wind shear (WFI) have higher skill for early onsets and lower skill for late onsets. Due to the model's westerly wind bias at lower levels, WFI in SEAS5 is stronger than in observations. This strengthened WFI leads to an increased tendency in SEAS5 for earlier WFI onsets. In SEAS5, the application of mean-state bias correction techniques to reduce the error in low-level circulation may reduce precipitation biases and improve the representation of the associated monsoon onset date.

Fig. 8 a Grid-point-wise CC of WLI onset pentads between SEAS5 and GPCP. Stippling shows locations where the p-value is less than 0.05 (significant at the 5% level). b ACC calculated at each grid-point for WLI onset pentads based on tercile categories. Stippling shows regions of positive HSS. c same as a but for RPC, and d same as a but for SER. Stippling shows RPC and SER values which are statistically indistinguishable from 1 at the 95% confidence level
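The tercile categories referred to above can be derived as in the following sketch, which assigns each onset date to early, normal or late relative to the terciles of its own dataset and converts ensemble members into category probabilities. Two choices here are assumptions rather than details confirmed by the text: dates falling exactly on a tercile boundary are treated as normal, and the ensemble terciles are taken from all members and years pooled together. The function names are hypothetical.

```python
import numpy as np

def tercile_categories(dates, reference=None):
    """Return 0 (early), 1 (normal) or 2 (late) for each onset date.

    Terciles are computed from `reference`, or from `dates` itself if omitted,
    so that each dataset is categorised against its own climatology.
    """
    dates = np.asarray(dates, dtype=float)
    ref = dates if reference is None else np.asarray(reference, dtype=float)
    lower, upper = np.percentile(ref, [100.0 / 3.0, 200.0 / 3.0])
    # before the lower tercile: early; after the upper tercile: late; else normal
    return np.where(dates < lower, 0, np.where(dates > upper, 2, 1))

def category_probabilities(ens_dates):
    """Fraction of ensemble members in each tercile category, per year.

    ens_dates: array (n_years, n_members) of onset dates
    """
    ens_dates = np.asarray(ens_dates, dtype=float)
    cats = tercile_categories(ens_dates, reference=ens_dates.ravel())
    return np.stack([(cats == k).mean(axis=1) for k in range(3)], axis=1)

# Hypothetical onset dates (day of year), for illustration only
obs_cats = tercile_categories(np.arange(1, 10))
rng = np.random.default_rng(1)
ens_probs = category_probabilities(rng.integers(140, 170, size=(36, 51)))
```

Categorising the model and the reanalysis each against their own terciles removes the effect of a mean bias in onset date, so a model that is systematically early can still place a given year in the correct category.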
SEAS5 has better skill than either random or climatological forecasts when giving probabilistic forecasts for the monsoon onset category. This skill applies not only to large-scale onset prediction but also to rainfall onsets at the local scale on a pentad-by-pentad basis. Better skill for monsoon onset compared to the poor skill for precipitation forecasts in SEAS5 is also seen in GloSea5-GC2 (Johnson et al. 2017; Chevuturi et al. 2019), which is associated with the difference in forecast lead time for the mean monsoon precipitation (2-4 months) and monsoon onset (1-2 months). Over most parts of India, SEAS5 shows good skill at forecasting the onset pentad category and can predict the onset pentad accurately with moderate skill over parts of northern central, western and eastern India, despite the systematic biases in precipitation. The two seasonal forecast models, SEAS5 and GloSea5-GC2 (Chevuturi et al. 2019), both show good skill at predicting local-scale monsoon onset over the core monsoon region at the same lead time (May forecasts). Previous studies have shown multi-model ensembles to enhance forecast skill for Indian monsoon rainfall (e.g. Kumar et al. 2012). Such a multi-model ensemble should improve forecast skill for the local-scale monsoon onset, but detailed analysis is required in future to identify the best approach for multi-model combinations.
Operationally, IMD issues probabilistic seasonal forecasts of ISM by mid-April, with an update by 1st of June for region-wise or all India rainfall using statistical and dynamical models (IMD 2020). IMD also issues a monsoon onset date for Kerala using a statistical forecast model by mid-May and has recently updated its methodology to identify the local-scale monsoon onset date with gridded datasets rather than station information (Pai et al. 2020). Local-scale agro-advisories are currently only provided to the farmers at different timescales, through the Gramin Krishi Mausam Sewa project by the Agricultural Meteorology Division of IMD (AMD 2020). Although the current study's outcomes may not be directly beneficial to the end-user (e.g. farmers), our results show that SEAS5 has appreciable skill for operational state-level products of ISM rainfall and local-scale monsoon onset almost a month in advance over parts of India. Future investigation can help identify specific user-oriented ISM metrics and SEAS5's skill for such metrics relative to current operational forecasts (Rao et al. 2019). Good prediction of the local-scale monsoon features, provided to the farmers a month in advance over major agricultural regions, may help reduce resource wastage, mitigate losses and improve crop yield through better-informed decision making in the agriculture sector.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.