1 Introduction

The primary driver of the Indian summer monsoon is the contrast in heating between the vast expanse of the Asian landmass to the north and the Indian Ocean to the south. The initiation of the monsoon season, known as the onset of the southwest monsoon, unfolds gradually and involves multiple stages characterized by notable shifts in the large-scale atmospheric and oceanic circulation patterns spanning from the Earth's surface to the upper troposphere. This phenomenon has been explored in many previous studies [1, 2]. One such transition is characterized by an annual reversal of the winds. The onset of the southwest monsoon over Kerala, situated at the southern tip of India (hereafter denoted as MOK), holds significant importance as it signals the start of the rainy season for a predominantly agricultural nation like India. Annually, the India Meteorological Department (IMD) announces the onset date primarily based on the daily rainfall recorded at various weather stations across Kerala, as emphasized in the work of Ananthakrishnan and Soman in 1988 [3]. The onset process is directly connected to the seasonal changes in both regional and planetary-scale atmospheric characteristics. The seasonal shift in tropospheric temperature and pressure gradients is attributed to the diabatic and sensible heating originating from the elevated Tibetan Plateau. This plateau serves as a heat source in the region, as documented in an earlier study by Xavier et al. [4]. This unique heat source plays a pivotal role in initiating the southwest monsoon over India, as outlined in research by Murakami and Ding [5] and Yanai et al. [6].

The onset of the monsoon is also associated with alterations in the upper tropospheric circulation patterns, such as the northward shift of the mid-latitude westerly jet beyond the Tibetan Plateau, as observed in Yin [7], and the prevalence of a prominent tropical easterly jet stream (TEJ) near the 150 hPa level over the southern periphery of the Tibetan high, as discussed by Koteswaram [8]. Another characteristic observed during the MOK is the presence of significant vertical wind shear over the Indian subcontinent, resulting from the strong upper-level easterly jet and a low-level westerly jet. Previous research, as highlighted in earlier studies such as Rao [1], Soman and Kumar [9], and Simon and Joshi [10], has also pointed to an elevation in atmospheric moisture levels, particularly in the mid-troposphere (around 700–500 hPa), occurring approximately 8–10 days before the MOK. Krishnamurti and Ramanathan [11] investigated observational aspects related to the development of energy exchanges and differential heating in the context of the GARP Monsoon Experiment (MONEX). Sikka and Gadgil [12] observed a strip of intense convection, referred to as the "maximum cloud zone," moving northward across the southern tip of India during the onset phase of the monsoon, as noted by Joseph et al. [13].

In light of the evolving meteorological conditions on both large and local scales during the onset of the monsoon, several recent studies, including works by Fasullo and Webster [14], Joseph et al. [15], Lu and Chan [16], Lu et al. [17], Taniguchi and Koike [18], Wang et al. [19], and Zeng and Lu [20], have proposed various objective definitions for characterizing the monsoon's onset process. Ananthakrishnan and Soman [21], and later Soman and Kumar [9], defined the onset date of the monsoon as the point at which there is a shift from light to heavy rainfall. This transition was identified as the onset date if, in the initial 5 days following this shift, the daily average rainfall accumulated to a minimum of 10 mm. Fasullo and Webster [14] developed a Hydrological Onset and Withdrawal Index (HOWI) of the monsoon by using the vertically integrated moisture transport as the variable determining the transition. Joseph et al. [15] utilized the depth of westerlies and widespread convection around Kerala in defining the MOK. Wang et al. [19] determined the MOK by the establishment of rapid and sustained 850-hPa zonal wind averaged over the southern Arabian Sea. Zeng and Lu [20] and Lu et al. [17] employed precipitable water data to establish their monsoon onset definitions. In contrast, the IMD adheres to criteria established by Pai and Rajeevan [22]. These criteria involve a combination of three variables: rainfall measurements from multiple stations across the state of Kerala, the depth of westerly winds up to 600 hPa, and satellite-observed outgoing longwave radiation values north of the equator. Another study by Goswami and Gouda [23] indicated that rainfall is the most suitable variable for defining the MOK.

Recently, IMD has operationally implemented various global, regional and coupled models to cater to the needs of all users on time scales ranging from the short range (up to 3 days) and medium range (up to 7 days) to the extended range (up to 3–4 weeks). For the extended range forecast (ERF), IMD has implemented an operational system (up to 4 weeks) [24,25,26,27] based on the Climate Forecast System version 2 (CFSv2) coupled model adopted from NCEP [28]. These studies have indicated that the operational ERF is capable of predicting different transition phases of the monsoon, including onset, withdrawal and active-break-active transitions, 2–3 weeks in advance. Apart from the ERF, IMD also utilizes the Global Forecast System (GFS) and the Global Ensemble Forecast System (GEFS) models for medium-range forecasts. These models operate at approximately 12 km resolution and provide operational forecasts for a period of 10 days. During the monsoon season of 2021, IMD introduced a district-level forecasting system for the medium range. This system is built upon a multi-model ensemble (MME) approach, which combines outputs from five global models to provide forecasts for a period of 5 days at the district level, as documented in the study by Bushair et al. [29]. The rationale behind the present study is to develop an objective dynamical prediction system for the MOK that reduces the subjectivity in declaring the onset of the monsoon and helps avoid bogus onsets. Considering the success of the operational dynamical modelling systems in IMD on both the medium and extended range time scales, the objective of the present study is to develop an objective method for determining the MOK based on the operational ERF system (about 3 to 4 weeks in advance) and to supplement it on the medium range time scale (5 days in advance) with district-level MME forecasts over India.

2 Operational ERF and medium range MME system in IMD

The present ERF system of IMD was initially adopted at Indian Institute of Tropical Meteorology (IITM) Pune. The modelling system can predict the active-break cycle of monsoon and can be used for various applications [24, 30, 31]. The atmospheric version of the model is the Global Forecast System (GFS) model and the oceanic component is the GFDL Modular Ocean Model V.4 (MOM4). The suite of models consists of: CFSv2 at T382, CFSv2 at T126, GFSbc (bias-corrected SST from CFSv2) at T382 and GFSbc at T126 with 4 members each with a total of 16 members. The operational ERF system of IMD is run weekly based on the Initial Condition (IC) of every Wednesday, and forecasts are generated for a duration of four weeks, commencing from the following Friday and extending through Thursday. The atmosphere and ocean ICs are available from the Global Data Assimilation System (GDAS) and Global Ocean Data Assimilation System (GODAS) run at National Centre for Medium Range Weather Forecasting (NCMRWF) and Indian National Centre for Ocean Information Services (INCOIS) respectively.
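The weekly schedule described above can be sketched in a few lines of Python. This is only an illustration, assuming that each of the four forecast weeks runs from a Friday to the following Thursday; the function name `erf_target_weeks` is hypothetical and not part of the IMD system.

```python
from datetime import date, timedelta


def erf_target_weeks(ic, n_weeks=4):
    """Return (start, end) dates of the weekly forecast windows for a Wednesday IC.

    Assumption: week 1 starts on the Friday after the IC and every week
    covers seven days (Friday through the following Thursday).
    """
    if ic.weekday() != 2:  # Monday is 0, so Wednesday is 2
        raise ValueError("the initial condition is expected to be a Wednesday")
    first_friday = ic + timedelta(days=2)
    weeks = []
    for k in range(n_weeks):
        start = first_friday + timedelta(weeks=k)
        weeks.append((start, start + timedelta(days=6)))
    return weeks


if __name__ == "__main__":
    for start, end in erf_target_weeks(date(2021, 5, 12)):
        print(start.isoformat(), "to", end.isoformat())
```

For the Wednesday IC of 12 May 2021, this sketch gives the target weeks 14–20 May, 21–27 May, 28 May–3 June and 4–10 June.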

IMD has also implemented the Multi-Model Ensemble (MME) based district-level forecast for 5 days during the monsoon season of 2021 [29, 32]. In the preparation of the MME forecasts, five different numerical weather prediction (NWP) models are employed. These include the Global Forecast System (GFS), run both at IMD and at the National Centers for Environmental Prediction (NCEP). Additionally, the Global Ensemble Forecast System (GEFS) is operational at IMD. The Unified Model (NCUM) is run at the National Centre for Medium Range Weather Forecasting (NCMRWF), and the Global Spectral Model (GSM) is run by the Japan Meteorological Agency (JMA). The GFS model was first introduced at IMD in 2010 with a resolution of T382L64, as outlined in the work by Durai et al. [33]. The current version, GFS model 14.1.0, utilized at IMD, operates with a spectral resolution of T1534, which is approximately equivalent to a spatial resolution of 12.5 km, and it incorporates 64 hybrid vertical levels, as documented by some recent studies [34, 35]. The NCEP GFS model data with a horizontal resolution of 0.25° × 0.25° is also used (https://www.emc.ncep.noaa.gov/emc/pages/numerical_forecast_systems/gfs.php). Compared to the NCEP-GFS model, IMD-GFS utilizes more Indian observations during assimilation. The global Unified Model, or NCUM (from NCMRWF), operates with a horizontal resolution of N1024, which is approximately equivalent to 12 km. Vertically, it encompasses 70 levels, extending up to an altitude of 80 km [36,37,38]. IMD routinely receives data from the NCUM model. IMD also receives data from the JMA's Global Spectral Model (GSM) at a spatial resolution of 25 km for forecasts up to 10 days. Outputs from all five of these models are used to produce MME products at the district level.

3 Present criteria used by IMD for declaring the monsoon onset over Kerala

For more than 100 years, IMD has determined the date of MOK operationally every year. In operational mode, the date of MOK was based on the synoptic conditions given in Forecasting Manual Unit (FMU) Report No. IV–18.2 by Ananthakrishnan et al. [39], which relied on rainfall. Rao [1] further indicated that, in association with such rainfall, the lower tropospheric westerly wind over Kerala is strong and the relative humidity of the air is high from the surface to at least 500 hPa. According to these previous criteria, if, starting from May 10th, any five of the following seven stations (Colombo, Minicoy, Thiruvananthapuram, Alappuzha, Kochi, Kozhikode and Mangalore) record rainfall of at least 1 mm over a 24 h period for two consecutive days, the MOK is declared on the second of those 2 days. IMD had been considering these factors in a subjective way to determine the onset date. In 2016, IMD introduced a revised set of criteria for declaring the MOK (https://mausam.imd.gov.in/), based on the daily rainfall of 14 stations over Kerala and the neighbouring area along with the wind field and Outgoing Longwave Radiation (OLR) over the southeast Arabian Sea. The new criteria emphasize a sharp increase in rainfall over Kerala, coinciding with the setting up of the large-scale monsoon flow and the extension of westerlies up to 600 hPa over the Arabian Sea. The MOK declaration hinges on specific conditions: if, after May 10th, at least 60% of the 14 reporting stations along the west coast of India register rainfall of 2.5 mm or more for two consecutive days, the MOK can be declared on the second day. However, this declaration is contingent upon meeting the following three additional criteria.

  (i) The depth of westerlies over the Arabian Sea, within the region bounded by the equator to 10°N and 55°–80°E, must extend up to the 600 hPa level.

  (ii) The zonal wind speed at 925 hPa, within the area bounded by 5°–10°N and 70°–80°E, should fall within the range of 15 to 20 knots.

  (iii) The OLR value within the box bounded by 5°–10°N and 70°–80°E should be less than 200 W/m².
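The declaration rule above can be expressed as a short Python check. This is a minimal sketch rather than IMD's operational procedure: it assumes that the three circulation criteria are evaluated on the second of the two rainy days, that the 15 to 20 knot range is inclusive, and that all inputs are daily series starting after 10 May. The function names `rain_criterion` and `imd_onset_index` are hypothetical.

```python
def rain_criterion(day_rain_mm, min_mm=2.5, min_fraction=0.6):
    """True if at least 60% of the reporting stations record >= 2.5 mm."""
    wet = sum(1 for r in day_rain_mm if r >= min_mm)
    return wet >= min_fraction * len(day_rain_mm)


def imd_onset_index(rain_mm, westerly_to_600hpa, u925_knots, olr_wm2):
    """Return the index of the day on which the onset could be declared.

    rain_mm            : list of days, each a list of station rainfall (mm)
    westerly_to_600hpa : list of bools, westerlies extend up to 600 hPa
    u925_knots         : list of box-mean 925 hPa zonal wind speeds (knots)
    olr_wm2            : list of box-mean OLR values (W/m^2)
    Returns None if the criteria are never met.
    """
    for d in range(1, len(rain_mm)):
        # Rainfall criterion on two consecutive days (d-1 and d).
        two_wet_days = rain_criterion(rain_mm[d - 1]) and rain_criterion(rain_mm[d])
        # Assumption: circulation criteria are checked on the second day.
        circulation_ok = (
            westerly_to_600hpa[d]
            and 15.0 <= u925_knots[d] <= 20.0
            and olr_wm2[d] < 200.0
        )
        if two_wet_days and circulation_ok:
            return d  # onset is declared on the second of the two days
    return None
```

With 14 stations, the 60% requirement means at least 9 stations must report 2.5 mm or more, since 8 stations give only about 57%.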

3.1 Objective definition of MOK based on extended range forecast

3.1.1 Objective definition of MOK using the hindcast and forecast data (2003 to 2019)

Since forecasters declare the MOK by considering many factors, some subjectivity is involved in fixing the exact date of MOK. NWP models could therefore be useful for declaring the MOK based on objective criteria. When establishing an objective prediction for the MOK, it’s crucial to capture the shift from isolated pre-monsoon synoptic events to the widespread and continuous rainfall characteristic of the monsoon. Failing to do so can lead to inaccurate, or what are often referred to as “bogus”, MOK declarations, as was the case in 2002, as documented by Flatau et al. [40].

In the present case, the ERF-based MOK date is defined by using three forecast indices: one from rainfall over Kerala (Region ‘R1’ in Fig. 1) and two based on the strength and depth of the low-level westerly jet over the Arabian Sea (Region ‘R2’ in Fig. 1), namely the zonal wind at 850 hPa and the mid-tropospheric zonal wind at 600 hPa. The MOK date is then determined as the first day from which all three prescribed conditions are met continuously for a period of five consecutive days. These criteria are slightly different from the objective criteria used by Joseph et al. [41], who used low-level wind as well as rainfall based on the ERF run at IITM, Pune, which is slightly different from the operational ERF system implemented at IMD in 2017 [24]. In this current study, the MOK is established by considering the following indices.

  (i) Rainfall averaged over the domain (R1) bounded by 8°–12°N, 74°–78°E exceeds 80% of its mean;

  (ii) The zonal wind at 850 hPa, averaged over the region spanning 5°–12°N, 55°–75°E (referred to as R2), exceeds 70% of its mean value;

  (iii) The zonal wind at 600 hPa, averaged over the same region (R2), is greater than zero.
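The three-index rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the "mean" used for the 80% and 70% thresholds is taken here as the mean of the forecast series itself over the forecast window, and the inputs are assumed to be daily area-averaged values already extracted for R1 and R2. The function names are hypothetical.

```python
def first_sustained_day(flags, run_length=5):
    """Index of the first day that starts `run_length` consecutive True flags."""
    run = 0
    for i, ok in enumerate(flags):
        run = run + 1 if ok else 0
        if run == run_length:
            return i - run_length + 1
    return None


def three_index_onset(rain_r1, u850_r2, u600_r2, run_length=5):
    """Forecast-day index of the MOK from the three indices, or None.

    Assumption: thresholds are 80% and 70% of the mean of each input series.
    """
    rain_thr = 0.8 * sum(rain_r1) / len(rain_r1)
    u850_thr = 0.7 * sum(u850_r2) / len(u850_r2)
    flags = [
        r > rain_thr and u > u850_thr and w > 0.0
        for r, u, w in zip(rain_r1, u850_r2, u600_r2)
    ]
    return first_sustained_day(flags, run_length)
```

The returned index counts days from the start of the forecast series, so it must be added to the first forecast date to obtain a calendar date.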

Fig. 1 Regions selected for defining the three indices: rainfall averaged over 8°–12°N, 74°–78°E (R1); zonal wind at 850 and 600 hPa averaged over 5°–12°N, 55°–75°E (R2)

As shown in Fig. 1, the region ‘R2’ is used for the wind indices defined in ‘ii’ and ‘iii’ above and the region ‘R1’ is used for the rainfall index. In addition to these three parameters, the onset of monsoon over Kerala is also associated with an increase in the pressure difference along the west coast. To examine the climatological features of this south-north pressure difference over the west coast, the two boxes shown in Fig. 2a are considered. Once the land heating is established in May, the south-north pressure difference increases from 15th May onwards. The threshold value of the south-north pressure difference along the west coast for defining the MOK is calculated from the climatological mean value of the south-north pressure difference over the two boxes, as shown in Fig. 2b. When the climatological mean pressure difference is plotted from 15th May to 15th June, it reaches a value of 200 Pa (2 hPa) on the climatological MOK date of 1st June, and this value is taken as the threshold. The south-north pressure difference along the west coast of India sets the pressure gradient required for the stronger south-westerly monsoon wind from the Arabian Sea to reach the Kerala coast. In this second method, the ERF onset of monsoon over Kerala is defined as the first day from which all three given conditions are satisfied for 5 consecutive days and the threshold value of 2 hPa pressure difference is also met.
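The four-index variant can be sketched by adding the pressure-difference test to the daily flags of the three-index method. The text does not state whether the 2 hPa threshold must hold on every one of the five days or only on the onset day; the sketch below assumes every day, and this is an assumption rather than the authors' stated rule. The function name `four_index_onset` and its inputs (box-mean MSLP in Pa for the south and north boxes) are hypothetical.

```python
def four_index_onset(three_index_ok, mslp_south_pa, mslp_north_pa,
                     threshold_pa=200.0, run_length=5):
    """Forecast-day index of the MOK using four indices, or None.

    three_index_ok : list of bools, True where the rainfall and both
                     wind conditions hold on that day
    mslp_south_pa  : daily box-mean MSLP (Pa) over the southern box
    mslp_north_pa  : daily box-mean MSLP (Pa) over the northern box
    Assumption: the south-minus-north difference must reach the threshold
    on each of the consecutive days.
    """
    run = 0
    for i, (ok, ps, pn) in enumerate(
        zip(three_index_ok, mslp_south_pa, mslp_north_pa)
    ):
        all_ok = ok and (ps - pn) >= threshold_pa
        run = run + 1 if all_ok else 0
        if run == run_length:
            return i - run_length + 1
    return None
```

Under this assumption the four-index date can never be earlier than the three-index date for the same forecast, because the pressure test only removes days from the qualifying set.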

Fig. 2 a The regions used for calculation of the South (8°–10°N, 73°–77°E) minus North (18°–20°N, 71°–75°E) normal mean sea level pressure gradient along the west coast of India from 15 May to 15 June. b The climatological South-North pressure gradient (Pa) along the west coast; the value on the normal onset date over Kerala of 1st June indicates the threshold value

The operational ERF system of IMD consists of both hindcast and forecast runs performed on the fly. The hindcast run is performed for 16 years (2003–2018), along with the forecast for 2019, based on the Initial Condition (IC) of 15th May each year. By using the objective method, the MOK was calculated for the 17 years, including the real-time forecast year of 2019, based on the ERF with the initial condition of 15th May for each year. Later, as the forecast model runs with every Wednesday IC, the objective method was tested for the years 2020, 2021 and 2022 with different initial conditions close to 15th May. The real-time forecast of MOK was also prepared for the years 2020, 2021 and 2022 by using the two methods, with three indices and four indices respectively.

The forecast onset dates based on the three indices in (i) to (iii) above are calculated for the whole 20 years from 2003 to 2022, with a fixed IC date of 15th May from 2003 to 2019 and the Wednesday nearest to 15th May during 2020, 2021 and 2022. The ERF rainfall for 32 days averaged over the region ‘R1’ in Fig. 1 is shown as vertical lines, along with a dotted blue horizontal line representing 80% of its mean, from 16th May to 16th June for each year from 2003 to 2019 (Fig. 3). Similarly, the zonal wind at 850 hPa averaged over the region ‘R2’ is represented by a dark green line, with a horizontal dotted line marking 70% of its mean (Fig. 3). To show the depth of the westerlies, the zonal wind at 600 hPa averaged over the region ‘R2’ is represented by a dark yellow line, with a dotted yellow line marking zero (Fig. 3). For each year up to 2019, the objectively defined MOK is marked by a red dot and the observed MOK date by a green dot in Fig. 3. The dynamically predicted onset date with four indices is then defined by additionally including the predicted South-North pressure gradient threshold of 2 hPa, as shown in Fig. 4. Thus, Fig. 4 indicates the forecast MOK using the four indices during the period from 2003 to 2022, represented by a red dot, along with the observed MOK declared by IMD, represented by a green dot.

Fig. 3 The time series of extended range forecast rainfall for 32 days, based on the initial condition of 15 May, averaged over the region ‘R1’ in Fig. 1 (vertical lines), along with a dotted blue horizontal line representing 80% of its mean, from 16th May to 16th June for each year from 2003 to 2019. The zonal wind at 850 hPa averaged over the region ‘R2’ is represented by a dark green line, with a horizontal dotted line marking 70% of its mean. The depth of the westerlies is shown by the zonal wind at 600 hPa averaged over the region ‘R2’ (dark yellow line), with the dotted yellow line marking zero. The red circle indicates the predicted MOK date with three indices and the green circle indicates the observed MOK date

Fig. 4 The time series of extended range forecast mean sea level pressure difference (South-North MSLP) in Pa between the south (8°–10°N, 73°–77°E) and north (18°–20°N, 71°–75°E) regions indicated in Fig. 2, from 16th May to 16th June for each year from 2003 to 2019, based on the initial condition of 15 May. The horizontal line indicates the threshold value of 200 Pa (2 hPa). The red circle indicates the predicted MOK date with the four indices and the green circle indicates the observed MOK date of IMD

To assess the performance of the ERF, a near normal (NN) date of MOK is taken to be within ± 2 days of the normal (N) date of 1st June. Thus, if the onset is between 30 and 31 May or between 2 and 3 June, the MOK is considered to be Near Normal (NN). The earlier onsets are categorized into Slightly Early (SE), with onset between 27 and 29 May; Early (EA), with onset between 24 and 26 May; and Very Early (VE), with onset on or before 23rd May. Similarly, the three late categories are Slightly Late (SL), with onset between 04 and 06 June; Late (LA), with onset between 07 and 09 June; and Very Late (VL), with onset on or after 10th June, as given below.

  • Very Early (VE): 23 May or earlier

  • Early (EA): 24–26 May

  • Slightly Early (SE): 27–29 May

  • Near Normal (NN): 30–31 May

  • Normal (N): 01 June

  • Near Normal (NN): 02–03 June

  • Slightly Late (SL): 04–06 June

  • Late (LA): 07–09 June

  • Very Late (VL): 10 June or later
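The category boundaries listed above translate directly into a small lookup on the deviation (in days) from the normal date of 1 June. The following Python sketch is illustrative; the function name `onset_category` is hypothetical.

```python
from datetime import date


def onset_category(onset):
    """Classify an onset date relative to the normal date of 1 June."""
    dev = (onset - date(onset.year, 6, 1)).days  # negative means early
    if dev == 0:
        return "N"    # 01 June
    if abs(dev) <= 2:
        return "NN"   # 30-31 May or 02-03 June
    if dev < 0:
        if dev >= -5:
            return "SE"   # 27-29 May
        if dev >= -8:
            return "EA"   # 24-26 May
        return "VE"       # 23 May or earlier
    if dev <= 5:
        return "SL"       # 04-06 June
    if dev <= 8:
        return "LA"       # 07-09 June
    return "VL"           # 10 June or later


if __name__ == "__main__":
    print(onset_category(date(2003, 6, 8)))   # observed MOK in 2003
    print(onset_category(date(2007, 5, 28)))  # observed MOK in 2007
```

Applied to the observed dates quoted in this section, 8 June 2003 falls in LA and 28 May 2007 falls in SE.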

Following this classification, the forecast MOK dates with three indices and with four indices, along with the IMD observed MOK dates, are given in Table 1. Table 1 also contains the deviation, in number of days, from the observed MOK during the whole period. The performance of the ERF in predicting the MOK date is thus assessed in terms of the 8 categories over the whole period from 2003 to 2022, using both the three and the four indices. As shown in Table 1, the mean deviation in onset days between the forecast MOK with three indices and the IMD observed MOK during the period from 2003 to 2019 is found to be 2.12. However, the mean deviation between the IMD observed MOK day and the ERF MOK day with four indices is found to be 0.35, which indicates the better performance of the dynamical MOK forecast with four indices. Even during the remaining 3 years from 2020 to 2022, with different ICs, the ERF of MOK with four indices performed better than that with three indices. As shown in Table 1, with four indices the forecast MOK and observed MOK during the period from 2003 to 2019 deviate by 8 or more days during the years 2003, 2007, and 2010, whereas during the other 14 years the forecast and observed MOK dates are close to one another. The MOK forecast with three indices also deviates by more than 8 days during 2003 and 2010.

Table 1 IMD observed date of Monsoon Onset over Kerala (MOK) along with the extended range forecast date of MOK with three indices and four indices during the years 2003 to 2022, together with the category of each MOK and its deviation from the observed MOK date

To examine the reasons for such large deviations between forecast and observed MOK days during 2003, 2007, and 2010, the large-scale variables are also analysed. It may be mentioned here that during 2003, the observed onset was ‘LA’ (8th June), whereas the forecast onset was ‘SE’. During 2007, the observed onset was ‘SE’ but the forecast was ‘SL’, and for 2010 the observed onset was ‘NN’, whereas the forecast was ‘VE’. To investigate the reason for the large deviation between observed and forecast onset dates during the three seasons, the daily observed rainfall from 15 May to 15 June over the Kerala coast is shown in Fig. 5a–c for 2003, 2007 and 2010 respectively. The observed rainfall during 2003 indicated an increase of rainfall only around 8th June, and thus the observed MOK was ‘LA’ (Fig. 5a). The study by Pattanaik et al. [42] also showed that a long-duration cyclone with re-curvature characteristics over the eastern part of the Bay of Bengal during the month of May can delay the onset of monsoon. This happened during the 2003 season, as shown in Fig. 6, when the cyclonic storm during 10–19 May had a recurving track and delayed the MOK. In the ERF, the rainfall criterion was satisfied for an early onset of monsoon over Kerala, as indicated in Fig. 3. However, the ERF possibly could not predict the genesis and re-curvature of the cyclone over the Bay of Bengal on the extended range time scale, and hence the predicted MOK was slightly early, leading to a large deviation between the observed and forecast dates of MOK.

Fig. 5 IMD observed time series of rainfall averaged over the region ‘R1’ bounded by 8°–12°N, 74°–78°E during the period from 15 May to 15 June for the years (a) 2003, (b) 2007, and (c) 2010

Fig. 6 The track of the cyclonic storm of 10–20 May 2003, which recurved over the Bay of Bengal and delayed the monsoon onset over Kerala

During 2007, the observed rainfall from 15 May to 15 June (Fig. 5b) over the Kerala coast indicated a spell of rainfall during the last week of May, with the observed monsoon onset over Kerala on 28th May, a Slightly Early (SE) onset. The slightly early onset during 2007 was consistent with the increase in convective activity and associated rainfall, together with wind patterns satisfying the requirements for declaring the onset of monsoon over Kerala [43]. The ERF of MOK for 2007, in contrast, indicated a Slightly Late (SL) onset, with 5th June as the forecast onset date. To understand this deviation between the observed and forecast MOK day, consider the ERF based on 15th May 2007 with the three forecast variables (rainfall, zonal wind at 850 hPa and zonal wind at 600 hPa). It indicated a rainfall spell in the region (R1) commencing around 22nd May and continuing till 30th May, with the rainfall criterion of exceeding 80% of its mean satisfied during 26–30 May (Fig. 3). The ERF zonal winds at 850 hPa and 600 hPa averaged over the region R2 satisfied their threshold conditions on 28th May (Fig. 3). Thus, the three objective conditions were satisfied together on 28 May. However, as per the definition, the onset is declared on the first day on which the rainfall condition is satisfied, which was 26th May in this case, when the zonal wind conditions were not yet satisfied (Fig. 3). Similarly, the MSLP condition was not satisfied during the first rainfall spell from 26 to 30 May 2007. Consequently, the objective definition of MOK based on ERF was not met during the first rainfall spell but only during the second rainfall spell, giving an onset date of 5th June, as shown in Figs. 3 and 4. Thus, the deviation in the forecast day of MOK during 2007 was due to narrowly missing the objective criteria during the first rainfall spell, even though some of the conditions were satisfied.

Another year in which the observed and forecast MOK differ by 8 days is 2010. The observed MOK was on 31st May, a near normal (NN) onset, whereas the MOK in ERF was on 23rd May, a ‘VE’ onset. The observed rainfall over region R1 from 15 May to 15 June indicated that rainfall commenced around 17th May and continued till the middle of June (Fig. 5c), yet the observed MOK was declared on 31st May. The ERF rainfall over the region R1 and zonal winds at 850 and 600 hPa over the region R2 in Fig. 3, together with the MSLP difference in Fig. 4, clearly indicated a forecast MOK on 23rd May with the second method (four indices) and on 20th May with the first method (three indices), both falling in the VE category. Thus, the deviation between observed and forecast MOK during 2010 is basically due to the subjective interpretation of the forecasters, who declared the observed MOK on 31 May and not earlier, despite observed rainfall continuing since 17th May, because the wind conditions were not favourable till 30th May [44].

It is also seen in Table 1 that the MOK during 2019 was 7 days later than the normal date of 1st June, a delay that was predicted well by the ERF with four indices, which gave an MOK 11 days late. It may be mentioned here that the monsoon onset over Kerala in 2019 was influenced by the severe cyclonic storm “Vayu” over the Arabian Sea, which pulled in the moisture-rich westerly winds, halting the advancement of the monsoon and hence delaying the MOK, with the observed onset on 8th June. This delay was also predicted well in advance by the operational ERF with the initial conditions of 1st May, 8th May and 15th May, as shown by Pattanaik et al. [26].

3.1.2 Extended range forecast of MOK during 2020, 2021, and 2022

Based on the objective criteria fixed for declaring the MOK date using the hindcast period from 2003 to 2019, the real-time forecast performance is also evaluated for the recent 3 years from 2020 to 2022. During 2020, the onset phase of the monsoon over Kerala was not adversely influenced by the presence of super cyclone ‘AMPHAN’ over the Bay of Bengal during 15–21 May. Although the onset over the Bay of Bengal was earlier than normal, the MOK was on the normal date of 1 June, which was associated with the presence of the cyclonic storm “Nisarga” during 1–4 June over the Arabian Sea. The observed rainfall time series over the region ‘R1’ from 15th May to 15th June during 2020, 2021, and 2022 is shown in Fig. 7a–c respectively.

Fig. 7 IMD observed time series of rainfall averaged over the region ‘R1’ bounded by 8°–12°N, 74°–78°E during the period from 15 May to 15 June for the years (a) 2020, (b) 2021, and (c) 2022

As shown in Fig. 7a, during 2020 the observed rainfall over the Kerala coast occurred in a first spell during 15–22 May and a second spell from 29 May to 08 June, with the onset occurring during the second spell. As shown by Pattanaik et al. [27], the normal onset of monsoon over Kerala during 2020 is also reflected well in the observed rainfall anomalies associated with the cyclone “NISARGA” for the week from 29 May to 04 June 2020 over the Kerala coast of India (top panel of Fig. 8). The normal onset is predicted with a 2-week lead time based on the initial conditions of 27 May and 20 May (bottom panel of Fig. 8), with an indication of positive rainfall anomalies over the entire southwestern coastal belt of India covering the region of Kerala. However, the week 3 forecast based on the IC of 13 May could not capture the rainfall patterns, and thus the MOK for 2020 was not predicted well with the 13th May IC (bottom right panel of Fig. 8). Based on the objective methods discussed above, the ERF time series of rainfall over the region ‘R1’ satisfies the rainfall criterion of exceeding 80% of its mean for 5 continuous days from 4 to 8 June, which indicates the MOK on 4th June (Fig. 9a). However, when the wind conditions are also considered, the conditions are satisfied only on 6th June. Also, when the MSLP and the other three criteria are considered simultaneously, as given in Fig. 9b, the MOK becomes 12 June (Table 1). Thus, based on the IC of 13 May, the forecast indicated a slight delay in MOK against the observed normal onset date (Table 1). This is also reflected in Fig. 8, where the positive rainfall anomaly over the Kerala region for the target week of 29 May–04 June based on the 13th May IC was not well captured. However, based on the initial condition of 20th May with the use of four indices, the MOK forecast is found to be 3rd June, which is very close to the observed MOK of 1st June (figures not shown).

Fig. 8 Observed weekly rainfall anomaly for the target week from 15 to 21 May 2020 (top panel) and the real-time extended range forecast rainfall anomaly for the same target week based on the initial conditions of 13 May (1-week lead), 06 May (2-week lead) and 29 April (3-week lead) 2020 (bottom panel)

Fig. 9

Time series of the extended range forecast for 32 days (14 May to 14 June 2020) based on the initial condition of 13 May 2020. (a) Rainfall averaged over the region ‘R1’ (in Fig. 1), shown as vertical lines, with the dotted blue horizontal line representing 80% of its mean. Similarly, the zonal wind at 850 hPa averaged over the region ‘R2’ is shown as a dark green line, with the horizontal dotted line representing 70% of its mean. The depth of the westerlies, given by the zonal wind at 600 hPa averaged over ‘R2’, is shown in dark yellow, with the dotted yellow line marking zero. (b) The fourth index, the south-north meridional difference of MSLP, based on the IC of 13 May 2020

During the 2021 monsoon season, the observed rainfall time series (Fig. 7b) indicated good rainfall activity from 15 May to 15 June, with the observed MOK declared by IMD on 3 June. The mean and anomaly ERF of wind and rainfall for 4 weeks, based on the 12 May initial condition and valid for 14 May to 10 June, indicated stronger monsoon circulation during weeks 1 to 3, with above-normal rainfall over the Kerala coast for the first 3 weeks (valid for 14–20 May, 21–27 May, and 28 May–03 June), thereby indicating an early MOK (Fig. 10a–d). Based on the objective methods discussed above, the ERF time series of rainfall over the region ‘R1’ and of the wind over the region ‘R2’, based on the IC of 12 May 2021, are shown in Fig. 11a. Similarly, the MSLP forecast time series exceeding the threshold is shown in Fig. 11b. The objectively defined MOK based on the four indices is found to be slightly early (27 May) compared with the onset of 3 June operationally declared by IMD. It may be mentioned here that rainfall over the Kerala region continued during the last week of May 2021; however, because the wind criteria were not fulfilled during that time, the MOK was declared on 3 June 2021 [45].

Fig. 10

a Extended range forecast of 850 hPa wind for 4 weeks based on the IC of 12 May 2021 and valid for 14 May to 10 June 2021. b Same as ‘a’ but for 850 hPa wind anomaly. c, d Same as ‘a’ and ‘b’ but for mean rainfall and rainfall anomaly for 4 weeks from 14 May to 10 June 2021 based on the IC of 12 May 2021

Fig. 11

Time series of the extended range forecast for 32 days (13 May to 13 June 2021) based on the initial condition of 12 May 2021. a Rainfall averaged over the region ‘R1’ (in Fig. 1), shown as vertical lines, with the dotted blue horizontal line representing 80% of its mean. Similarly, the zonal wind at 850 hPa averaged over the region ‘R2’ is shown as a dark green line, with the horizontal dotted line representing 70% of its mean. The depth of the westerlies, given by the zonal wind at 600 hPa averaged over ‘R2’, is shown in dark yellow, with the dotted yellow line marking zero. b The fourth index, the south-north meridional difference of MSLP, based on the IC of 12 May 2021

During the 2022 monsoon season, the observed rainfall time series (Fig. 7c) indicated good rainfall activity from 15 May to 15 June, with a first spell of relatively higher rainfall persisting during 15–22 May and a second spell of slightly lower rainfall during 23 May–02 June. The observed MOK was declared by IMD on 29 May. The ERF of wind and rainfall, along with their anomalies, for 4 weeks valid for 13 May to 09 June and based on the IC of 11 May 2022 indicated stronger monsoon circulation during weeks 1 to 3, with above-normal rainfall over the Kerala coast in the week 1, week 2 and week 3 forecasts valid for 13–19 May, 20–26 May and 27 May–02 June 2022, respectively (Fig. 12).

Fig. 12

a Extended range forecast of 850 hPa wind for 4 weeks based on the IC of 11 May 2022 and valid for 13 May to 09 June 2022. b Same as ‘a’ but for 850 hPa wind anomaly. c, d Same as ‘a’ and ‘b’ but for mean rainfall and rainfall anomaly for 4 weeks from 13 May to 09 June 2022

Based on the objective methods discussed above, the ERF time series of rainfall over the region ‘R1’ and of the wind over the region ‘R2’, based on the IC of 11 May 2022, are shown in Fig. 13a. Similarly, the MSLP forecast time series exceeding the threshold is shown in Fig. 13b. The objectively defined MOK based on the four indices is found to be slightly early (20 May) compared with the onset of 29 May operationally declared by IMD. An early onset was also indicated by the ERF based on the earlier IC of 27 April, for which the objective criteria gave an MOK of 25 May (Fig. 14). Even for the later ICs of 4 May and 18 May, the objectively defined MOK was found to be 22 May and 27 May, respectively (figure not shown). Table 1 also indicates that the MOK forecast with four indices performed better than that with three indices over the whole period from 2003 to 2022, with mean deviations of the MOK date of 0.75 and 3.05 days, respectively.

Fig. 13

Time series of the extended range forecast for 32 days (12 May to 12 June 2022) based on the initial condition of 11 May 2022. (a) Rainfall averaged over the region ‘R1’ (in Fig. 1), shown as vertical lines, with the dotted blue horizontal line representing 80% of its mean. Similarly, the zonal wind at 850 hPa averaged over the region ‘R2’ is shown as a dark green line, with the horizontal dotted line representing 70% of its mean. The depth of the westerlies, given by the zonal wind at 600 hPa averaged over ‘R2’, is shown in dark yellow, with the dotted yellow line marking zero. (b) The fourth index, the south-north meridional difference of MSLP, based on the IC of 11 May 2022

Fig. 14

Time series of the extended range forecast for 32 days (28 April to 29 May 2022) based on the initial condition of 27 April 2022. (a) Rainfall averaged over the region ‘R1’ (in Fig. 1), shown as vertical lines, with the dotted blue horizontal line representing 80% of its mean. Similarly, the zonal wind at 850 hPa averaged over the region ‘R2’ is shown as a dark green line, with the horizontal dotted line representing 70% of its mean. The depth of the westerlies, given by the zonal wind at 600 hPa averaged over ‘R2’, is shown in dark yellow, with the dotted yellow line marking zero. (b) The fourth index, the south-north meridional difference of MSLP, based on the IC of 27 April 2022

To assess the performance of the ERF in predicting the date of MOK over the whole period from 2003 to 2022 (Table 1), error histograms of the predicted MOK date with method-1 (3 parameters) and method-2 (4 parameters), using bins of 4 days, are shown in Fig. 15. The years are counted in 4-day bins on either side of the normal date of MOK, with negative and positive deviations denoting early and late onset, respectively. As shown in Fig. 15, deviations within ± 4 days account for 12 cases in method-2 compared with 11 cases in method-1. Larger positive deviations (> 4 days; late to very late onset years) account for 8 cases in method-1 and 5 cases in method-2, indicating better skill with 4 parameters than with 3 parameters. However, for larger negative departures (− 4 to − 8 days), that is, the early onset cases, method-1 has 1 case against 3 cases in method-2. Thus, overall, method-2 with 4 parameters performed better than method-1 with 3 parameters. The Heidke Skill Score (HSS), a verification measure for categorical forecasts, is also higher in method-2 (0.15) than in method-1 (− 0.08) when the eight categories of MOK (Very Early, Early, Slightly Early, Near Normal, Normal, Slightly Late, Late and Very Late) are considered. Thus, the dynamically defined onset date over Kerala based on the real-time ERF indicated reasonable skill at least about 2–3 weeks in advance. Each year, IMD issues the forecast for the date of MOK around 15 May. Thus, when forecasting the MOK with the objective method applied to the operational ERF, the 8 categories of onset classification used in the present study may be more appropriate.
However, some subjective judgment is also needed to define the actual date of MOK when it is declared operationally by IMD on the medium-range time scale (5–7 days in advance). Thus, the MME-based forecast can also be used for the exact date of MOK on the medium-range time scale. To illustrate the MME-based district-level forecast, the real-time forecast for 5 days based on the IC of 24 May 2022 is shown in Fig. 16a–e. The precipitation forecasts from five operational NWP modelling systems, viz. the Global Forecast System (GFS) and the Global Ensemble Forecast System (GEFS) from IMD, the GFS model run at the National Centers for Environmental Prediction (NCEP), the Unified Model run at the National Centre for Medium-Range Weather Forecasting (NCMRWF) and the Global Spectral Model (GSM) run by the Japan Meteorological Agency (JMA), have been used for developing the MME forecasts [29, 32]. The day-5 forecast valid for 29 May 2022, in which many districts show an increase in rainfall, suggests 29 May as the expected date of MOK. In addition to the MME-based objective forecast of MOK, the specification of a criterion for the onset is generally a subjective decision based on an overall judgment that takes into account the changes in the circulation features, the seasonal reversal of winds, and a sustained increase in rainfall over Kerala [40]. Studies by Singh et al. [46], Baburaj et al. [47] and Lenka et al. [48, 49] have also indicated that the MJO phases are linked with different convection centres and hence influence the global circulation, the rainfall and the onset of the monsoon. Baburaj et al. [47] showed the dependence of the monsoon onset over Kerala on the phases of the Madden Julian Oscillation (MJO), which is one of the dominant intra-seasonal oscillations over the tropics. They showed that the MJO plays a major role in preconditioning the atmosphere and ocean before the MOK. They further showed that during their study period from 1980 to 2020, almost 80% of MOK occurred in MJO phases 1, 2 and 3, whereas the remaining 20% occurred in the other phases of the MJO, except phase 8.
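For reference, the Heidke Skill Score quoted above for the categorical MOK forecasts can be computed from a contingency table of forecast versus observed categories. The sketch below uses the standard multi-category definition, HSS = (PC − E) / (1 − E), where PC is the proportion of correct forecasts and E is the proportion expected correct by chance. The function name and the integer coding of the onset categories are illustrative assumptions, not the exact implementation used in this study.

```python
import numpy as np

def heidke_skill_score(forecast_cat, observed_cat, n_categories):
    """Multi-category Heidke Skill Score.

    forecast_cat, observed_cat: sequences of integer category codes
    in the range 0 .. n_categories-1 (hypothetical coding, e.g. 0 = Very Early).
    """
    # Contingency table: rows are forecast categories, columns are observed.
    table = np.zeros((n_categories, n_categories), dtype=float)
    for f, o in zip(forecast_cat, observed_cat):
        table[f, o] += 1.0

    n = table.sum()
    proportion_correct = np.trace(table) / n
    # Chance agreement from the product of the marginal totals.
    expected = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2

    if expected == 1.0:
        # Degenerate case: every forecast and observation in one category.
        return float("nan")
    return (proportion_correct - expected) / (1.0 - expected)
```

A perfect forecast gives a score of 1, a forecast no better than chance gives 0, and negative values (as obtained for method-1) indicate fewer correct categories than expected by chance.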

Fig. 15

Error histogram of predicted date of Monsoon Onset over Kerala (MOK) for the whole period from 2003 to 2022 as indicated in Table 1 with Method-1 and Method-2
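A minimal sketch of how such an error histogram could be built is given below. It assumes that the error is the forecast MOK date minus the observed MOK date (both as day of year), that 4-day bins are centred on zero deviation, and that errors beyond the outermost bin are clipped into it; the function name, the `max_abs` limit and the half-open bin edges (NumPy's default) are illustrative choices, not necessarily those used for Fig. 15.

```python
import numpy as np

def mok_error_histogram(forecast_doy, observed_doy, width=4, max_abs=16):
    """Count years per error bin of `width` days.

    forecast_doy, observed_doy: MOK dates as day of year, one value per year.
    Negative errors mean the forecast onset was early, positive means late.
    """
    errors = np.asarray(forecast_doy) - np.asarray(observed_doy)
    # Clip so that very large deviations fall in the outermost bins.
    clipped = np.clip(errors, -max_abs, max_abs)
    # Bin edges: -max_abs, ..., -4, 0, 4, ..., +max_abs.
    edges = np.arange(-max_abs, max_abs + width, width)
    counts, _ = np.histogram(clipped, bins=edges)
    return errors, edges, counts
```

Calling the function once with the method-1 dates and once with the method-2 dates from a table of onset dates would give two sets of counts that can be plotted side by side.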

Fig. 16

District-level rainfall forecast for 5 days from the multi-model ensemble of 5 NWP models, based on the initial condition of 24 May 2022 and valid for (a) 25 May, (b) 26 May, (c) 27 May, (d) 28 May and (e) 29 May 2022

Thus, the ERF can provide the MOK category forecast reasonably well about 3 weeks in advance, in conjunction with the forecast of the exact MOK date on the medium-range time scale, 5 days in advance. However, specifying the exact date of MOK a few days in advance will involve some subjective judgment that takes into account the changes in the circulation features, the seasonal reversal of winds, the condition of the MJO and a sustained increase in rainfall over Kerala, as explained in an earlier paper by Flatau et al. [40].

4 Conclusions

A dynamical prediction system for the monsoon onset over Kerala, based on the real-time operational medium and extended range forecast systems of IMD, has been developed by using the multi-model ensemble district-level forecasts on the medium-range time scale (up to 5 days) and the coupled-model-based extended range forecast (about 3 weeks in advance). The objective criteria for defining MOK with the dynamical method are intended to avoid “bogus onsets” that are unrelated to the large-scale monsoon system. The objective method was tested for the forecast of MOK for the years 2003 to 2022 using the operational extended range forecast. Two objective prediction systems for MOK are developed based on the real-time operational coupled model extended range forecast system of IMD by using (i) three indices, consisting of the forecast rainfall over Kerala and two indices based on the strength and depth of the low-level westerly jet over the Arabian Sea, and (ii) four indices, consisting of the three indices mentioned in ‘i’ and the meridional south-north mean sea level pressure gradient along the west coast of India.

Using the three indices, the MOK is defined as the first day of a period of 5 consecutive days during which all three conditions are satisfied: the rainfall averaged over the Kerala coast exceeds 80% of its mean, the zonal wind at 850 hPa averaged over the Arabian Sea exceeds 70% of its mean, and the zonal wind at 600 hPa exceeds zero (i.e., becomes westerly). In addition, the MOK is also defined by adding one more index, namely the south-north pressure gradient exceeding the threshold of 2 hPa, when all 3 conditions are satisfied. Besides the normal (1 June) and near normal (within ± 2 days) categories, the MOK dates are classified into 3 additional early onset categories (Slightly Early, SE; Early, EA; and Very Early, VE) and 3 late onset categories (Slightly Late, SL; Late, LA; and Very Late, VL).
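The onset criteria summarised above can be sketched in a few lines of code. The example below is illustrative only: the function and argument names are hypothetical, the reference means are assumed to be taken over the forecast series itself, and the pressure-gradient condition is assumed to be required on the same 5 days as the other three conditions.

```python
import numpy as np

def first_onset_day(rain, u850, u600, mslp_grad=None,
                    rain_frac=0.8, u850_frac=0.7, mslp_thresh=2.0, run=5):
    """Index of the first day of the first `run`-day spell meeting all criteria.

    rain:      daily forecast rainfall averaged over the Kerala box (R1)
    u850:      daily 850 hPa zonal wind averaged over the Arabian Sea box (R2)
    u600:      daily 600 hPa zonal wind averaged over R2
    mslp_grad: optional daily south-north MSLP difference in hPa (4-index method)
    Returns None if no qualifying spell exists in the series.
    """
    rain = np.asarray(rain, dtype=float)
    u850 = np.asarray(u850, dtype=float)
    u600 = np.asarray(u600, dtype=float)

    # Assumption: thresholds are fractions of the mean of each forecast series.
    ok = (rain > rain_frac * rain.mean()) \
        & (u850 > u850_frac * u850.mean()) \
        & (u600 > 0.0)
    if mslp_grad is not None:
        ok &= np.asarray(mslp_grad, dtype=float) > mslp_thresh

    # Scan for the first window of `run` consecutive days with all criteria met.
    for start in range(len(ok) - run + 1):
        if ok[start:start + run].all():
            return start
    return None
```

Omitting `mslp_grad` corresponds to the three-index definition, while passing it corresponds to the four-index definition; the returned index is an offset in days from the first forecast day.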

The results indicated that the dynamical method of defining MOK based on the extended range forecast (ERF) worked well during the 20 years from 2003 to 2022, except for relatively large deviations in 2003, 2007, and 2010 in the hindcast period and in 2020 in the forecast period. The forecast MOK dates with 4 indices also performed better than those with 3 indices, with the mean error between the MOK date declared by IMD and the ERF MOK date over the whole period from 2003 to 2022 found to be 1.6 days and 4.2 days, respectively. The analysis further indicated that the ERF could not capture the MOK well in 2003 because it could not predict the genesis and re-curvature of the observed cyclonic storm over the Bay of Bengal. Similarly, during 2007 and 2010 the rainfall criteria were satisfied; however, the wind and MSLP conditions were not in agreement, which led to the large deviation between the observed and forecast MOK dates. During 2020, the forecast based on the initial condition of 13 May did not predict the MOK well, whereas the forecast based on the initial condition of 20 May did. Thus, the dynamically defined onset date over Kerala based on the real-time ERF indicated reasonable skill at least about 2 to 3 weeks in advance.

It is also demonstrated that the district-level multi-model ensemble (MME) forecast in the medium range (5 days in advance), based on 5 global models, can complement the MOK forecast from the operational ERF system (about 3 weeks in advance) by providing the exact date of monsoon onset about 5 days in advance. The exact MOK date in 2022 was very well reflected in the medium-range MME forecast, with an increase of rainfall on 29 May (the onset date) in the day-5 forecast based on the 24 May initial condition.