1 Introduction

Accurate rainfall forecasting is crucial for applications such as disaster preparedness, agricultural planning, water resource management, and urban development. However, predicting future rainfall patterns remains challenging due to the dynamic and complex nature of weather systems (Chen et al. 2023). Identifying underlying patterns and linkages in historical rainfall data is difficult, as it requires models capable of integrating diverse climatic factors and addressing the intrinsic variability and unpredictability of weather systems (Zhao et al. 2024). Two essential techniques in rainfall forecasting have emerged: conceptual modeling and system theoretical modeling. Conceptual modeling is widely used in hydrological forecasting due to its focus on understanding physical principles governing hydrological processes. This approach relies on specific catchment characteristics but faces challenges in rainfall prediction, such as data calibration difficulties and computational demands (Xu et al. 2022). In contrast, system theoretical methods, like the ARMAX model, focus on mapping input–output relationships without detailed physical insights. While useful in time series forecasting, ARMAX models struggle with predicting nonlinear rainfall variations (Zhao et al. 2023).

1.1 Literature Review

Traditional forecasting methods often fall short, particularly in regions like Delhi, where rainfall patterns are influenced by multiple neighboring states. These methods struggle with integrating complex inter-regional climatic data and tend to rely on simplistic, linear models that cannot adequately capture both rapid daily changes and longer-term weekly trends. Existing models also face difficulties in managing noisy data and distinguishing significant climatic signals from transient anomalies (Xie et al. 2021). An alternative approach that has gained prominence in improving rainfall forecasting is the use of machine learning algorithms. Machine learning provides a data-driven method that can complement traditional forecasting techniques by uncovering complex patterns and relationships within rainfall data (Truong et al. 2023). Techniques such as Support Vector Machines (SVM), Artificial Neural Networks (ANN), Gradient Boosting, and Random Forests are examples of algorithms that can be trained on historical rainfall data to detect hidden patterns and trends. These methods excel at capturing nonlinear relationships and adapting to dynamic conditions, which is particularly beneficial for rainfall forecasting, where abrupt shifts and complex interactions are common. By inputting historical meteorological data into these algorithms, machine learning models can learn to identify correlations between various factors (such as temperature, humidity, and wind speed) and rainfall (Zhou et al. 2021). Once trained, these models can predict future rainfall based on real-time or forecasted meteorological data. A key advantage of machine learning is its capacity to simultaneously consider numerous variables and their interactions, leading to more precise predictions (Yin et al. 2023). Moreover, as machine learning models encounter new data, they can continually refine their predictions through ongoing learning.

Rainfall forecasting has made significant strides with the use of machine learning and data-driven methods. The unpredictable nature of rainfall poses unique challenges for accurate prediction (Kumar and Yadav 2020). Machine learning algorithms have been shown to outperform traditional methods in forecasting. For instance, Abbot and Marohasy (2014) use ANN to forecast continuous rainfall based on climate indices, demonstrating improved accuracy. Chen et al. (2013) find that a neural network model enhances runoff prediction by managing complex spatial rainfall distributions in Taiwan. Mekanik et al. (2013) compare ANN and multiple regression for spring rainfall forecasting in Victoria, Australia, showing that ANN provides better generalization. Yu et al. (2017) compare Random Forests (RF) and SVM for real-time radar-derived rainfall forecasting, noting that single-mode RF and SVM models perform better than their multiple-mode counterparts for 1-h-ahead predictions. Feng et al. (2015) introduce a wavelet analysis-support vector machine (WA-SVM) model for arid regions, which forecasts monthly rainfall accurately over various lead times. This approach is especially useful for urban areas needing short-term rainfall forecasts.

Abbasi et al. (2021) developed a hybrid model that integrates RF, Deep Auto-Encoder, and support vector regression (SVR) methods for streamflow prediction, improving long-term accuracy and reducing uncertainty. Tan et al. (2021) introduced a novel method combining RF and inverse distance weighting (RF-IDW) for generating precipitation and temperature climate surfaces, which improves accuracy by accounting for spatial complexities and environmental factors. Rahman et al. (2022) proposed a real-time rainfall prediction system for smart cities, integrating fuzzy logic with four supervised machine learning techniques; using historical weather data, this fusion-based framework outperforms other models. Diez-Sierra and Del-Jesus (2020) evaluated the performance of eight statistical and machine learning methods driven by atmospheric synoptic patterns for long-term daily rainfall prediction in Tenerife, Spain, finding that neural networks excel in predicting rainfall occurrence and intensity. Nunno et al. (2022) developed a reliable precipitation prediction model using machine learning algorithms for the northern region of Bangladesh; their hybrid model, based on M5P and SVR, performs exceptionally well for precipitation prediction. Exploring machine learning's potential, Zhang et al. (2022) focus on spatial patterns in precipitation forecasts, investigating convolutional neural network models to improve the skill of predicting precipitation occurrence while balancing the trade-off between false positives and negatives. Additionally, Endalie et al. (2022) demonstrate the application of various machine learning techniques to enhance rainfall prediction accuracy. Collectively, these studies underscore the growing role of machine learning in addressing the complexities of rainfall prediction across different contexts, highlighting its potential for improved accuracy and practical application.

The novelty of this study lies in its innovative approach of integrating meteorological data from multiple states to enhance rainfall forecasting accuracy for a specific target region, Delhi, India. By utilizing diverse climatic data inputs from Uttarakhand, Haryana, Punjab, Uttar Pradesh, Himachal Pradesh, Madhya Pradesh, and Rajasthan, this research pioneers a cross-regional data integration methodology. This approach not only improves predictive performance but also provides a robust framework for regional weather forecasting, demonstrating the practical application of advanced machine learning techniques in enhancing meteorological predictions. The study also explores and compares the performance of various machine learning models, namely CatBoost, ElasticNet, Multilayer Perceptron (MLP), Lasso, Random Forest (RF), LGBM, Linear Regression (LR), Ridge, XGBoost, and Stochastic Gradient Descent (SGD), in capturing intricate patterns and delivering high accuracy in both daily and weekly forecasts. This cross-state data integration strategy represents a significant advancement in the field of rainfall prediction, offering a novel perspective on leveraging regional climatic variations for more accurate and reliable forecasts.

1.2 Objective of the study

The primary objective of this study is to enhance the accuracy of rainfall forecasting for Delhi by leveraging historical rainfall data from neighboring states, including Uttarakhand, Uttar Pradesh, Haryana, Punjab, Himachal Pradesh, Madhya Pradesh, and Rajasthan. The research examines how the climate and geography of these surrounding regions influence the rainfall patterns in Delhi. To achieve this, the study employs a range of algorithms, including CatBoost, ElasticNet, MLP, Lasso, RF, LGBM, LR, Ridge, XGBoost, and SGD.

2 Study Area and Data Preparation

2.1 Study Area

The research region encompasses Uttarakhand, Uttar Pradesh, Haryana, Punjab, Himachal Pradesh, Madhya Pradesh, Rajasthan, and the Union Territory of Delhi. This area features a diverse range of geographic and climatic conditions, including the Himalayan foothills, the Gangetic plains, deserts, and the metropolitan agglomeration of Delhi. Due to this geographical diversity, the region experiences a wide range of climatic conditions. Uttarakhand and Himachal Pradesh have a moderate climate with robust summer monsoons. The Gangetic plains, covering Uttar Pradesh, Haryana, and Punjab, experience a subtropical climate characterized by hot summers and milder winters. Rajasthan is largely desert or semi-arid, with extreme temperatures and scant rainfall. Delhi, part of the National Capital Territory, has a subtropical climate with additional urban heat island effects. Rainfall in the region is significantly influenced by the Indian Monsoon, which provides most of the annual precipitation. The monsoon season generally spans from June to September, with varying start and end dates across the region. The Himalayan foothills and parts of Himachal Pradesh receive substantial rainfall due to orographic lifting, while Rajasthan's arid areas experience minimal monsoon precipitation. The variability in rainfall patterns results from the intricate interaction between monsoon winds, topography, and local climatic factors. Figure 1(a) illustrates the location of the research area.

Fig. 1

(a) Study area map. (b) Feature importance results using random forest. (c) Presents the correlation matrix for daily rainfall in Delhi

Accurate rainfall forecasting is essential for various sectors in this region. Since a significant portion of the population relies on agriculture, timely and precise rainfall predictions are crucial for optimizing irrigation and crop management. Reliable forecasts also support reservoir management, water allocation, and flood control decisions. Given the region's vulnerability to both droughts and floods, accurate rainfall predictions enable proactive disaster management measures. In urban areas like Delhi, forecasts play a key role in planning drainage and flood control infrastructure. The unique aspect of this study lies in its cross-state analysis aimed at improving rainfall forecasting for Delhi. By utilizing historical rainfall data from neighboring states, the research seeks to enhance the accuracy of predictions for Delhi. This approach recognizes the interconnections between the precipitation patterns of these states, driven by the monsoon system and regional climatic influences. The study aims to bridge the gap between localized meteorological data and broader regional factors, investigating how the rainfall patterns of Uttarakhand, Uttar Pradesh, Haryana, Punjab, Himachal Pradesh, Madhya Pradesh, and Rajasthan affect Delhi. Additionally, it explores how these interconnections can be leveraged to refine the accuracy of rainfall forecasting for Delhi.

2.2 Data Collection

The approach for collecting data for this comprehensive research region, which includes Delhi and the states of Uttarakhand, Uttar Pradesh, Haryana, Punjab, Himachal Pradesh, Madhya Pradesh, and Rajasthan, involves gathering daily rainfall data. This data was sourced from WRIS (India's Water Resources Information System), a comprehensive database managed by the Ministry of Jal Shakti of the Indian Government. WRIS provides a repository of hydrological and meteorological information, offering access to essential historical rainfall observations. The dataset includes daily rainfall records spanning four decades, from 1980 to 2021.

2.3 Data Preprocessing

Temporal adjustments were crucial in processing the dataset. The approach involved using rainfall metrics from neighboring states from the previous day to predict Delhi's rainfall for the current day. This shift is based on the meteorological insight that weather patterns in adjacent areas can influence conditions in a central location like Delhi. For example, rainfall in Uttar Pradesh today might indicate potential rainfall in Delhi tomorrow. This alignment is essential for accurate predictive modeling. Given the complexity of meteorological events, the study employed a dual model approach. One model was designed to capture the day-to-day variations in rainfall, while the other focused on longer, weekly trends. This bifurcation provided a comprehensive perspective, allowing for precise detection of immediate triggers and broader, week-long patterns.
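The one-day shift described above can be sketched with pandas; the values below are invented for illustration and are not the actual WRIS records:

```python
import pandas as pd

# Hypothetical daily rainfall (mm) for two neighbouring states and Delhi;
# column names mirror the states used in the study.
df = pd.DataFrame({
    "Uttar Pradesh": [0.0, 5.2, 12.1, 3.4],
    "Haryana":       [1.1, 0.0, 8.3, 2.0],
    "Delhi":         [0.2, 1.5, 9.8, 4.1],
})

# Shift the neighbouring-state columns forward by one day so that
# yesterday's rainfall in those states lines up with today's Delhi rainfall.
predictors = df[["Uttar Pradesh", "Haryana"]].shift(1)
target = df["Delhi"]

# Drop the first row, which has no previous-day predictors.
aligned = predictors.join(target).dropna()
print(aligned)
```

After the shift, each row pairs Delhi's same-day rainfall with the previous day's rainfall in the neighbouring states, which is the alignment the dual daily/weekly models are trained on.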

2.4 Feature and Correlation Analysis

In developing the machine learning model for rainfall prediction, various features were incorporated. The RandomForestRegressor from the `sklearn.ensemble` library was employed to train the model using seven features: 'Uttarakhand', 'Uttar Pradesh', 'Haryana', 'Punjab', 'Himachal Pradesh', 'Madhya Pradesh', and 'Rajasthan'. Feature importance was assessed using the trained Random Forest model, where the importance of each feature is determined by the total reduction in the criterion attributed to that feature, also known as Gini importance. As shown in Fig. 1(b), 'Uttar Pradesh' had the highest importance score of 0.823254, indicating its substantial impact on the model's predictions. This was followed by 'Uttarakhand' with a score of 0.072523, 'Rajasthan' at 0.031806, 'Punjab' at 0.028555, 'Haryana' at 0.024239, 'Himachal Pradesh' at 0.010216, and 'Madhya Pradesh' at 0.009406. These results highlight that rainfall in 'Uttar Pradesh' plays a significant role in the model’s predictive accuracy.
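A minimal sketch of this importance assessment, using `RandomForestRegressor` on synthetic stand-in data (the real rainfall matrix is not reproduced here, and the target is deliberately constructed to depend mostly on the 'Uttar Pradesh' column to mimic the dominance reported above):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
states = ["Uttarakhand", "Uttar Pradesh", "Haryana", "Punjab",
          "Himachal Pradesh", "Madhya Pradesh", "Rajasthan"]

# Synthetic stand-in for the historical rainfall matrix (n_days x 7 states).
X = rng.gamma(shape=0.5, scale=8.0, size=(500, 7))
# Assumed dependence: target driven mainly by column 1 ("Uttar Pradesh").
y = 0.8 * X[:, 1] + 0.05 * X[:, 0] + rng.normal(0, 0.5, 500)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Gini importances sum to 1; rank the states by their contribution.
for name, imp in sorted(zip(states, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:18s} {imp:.3f}")
```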

A correlation matrix was also computed to examine the relationships between the different features. This matrix contains Pearson correlation coefficients that measure the linear relationship between pairs of features, with values ranging from -1 to 1. A value closer to 1 indicates a strong positive correlation, while a value closer to -1 signifies a strong negative correlation. As illustrated in Fig. 1(c), all features exhibit a strong positive correlation, suggesting that similar weather patterns influence rainfall across these regions. Analyzing the features in this manner offers valuable insights into their importance and relationships, which is essential for understanding the model's predictions and improving its performance.
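Such a matrix can be computed with `pandas.DataFrame.corr`; the sketch below uses synthetic data built from a shared "monsoon" component plus state-specific noise, so all columns come out positively correlated, as in Fig. 1(c):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Shared monsoon signal plus independent state-level noise (illustrative
# assumption, not the study's data).
monsoon = rng.gamma(2.0, 5.0, 365)
df = pd.DataFrame({
    state: monsoon + rng.normal(0, 2.0, 365)
    for state in ["Uttarakhand", "Uttar Pradesh", "Haryana", "Punjab"]
})

corr = df.corr()  # Pearson coefficients in [-1, 1]
print(corr.round(2))
```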

3 Methodology

Rainfall forecasting using machine learning follows a structured process. It begins with the collection of historical weather data, including rainfall observations. The data is then preprocessed to ensure quality and consistency by addressing missing values, outliers, and normalization. Key climatic factors are identified through feature selection and engineering. The dataset is divided into training, validation, and testing sets. The model is trained on the training set and its performance is evaluated on the validation dataset. Once validated, the model predicts future rainfall for different time frames. It is subsequently deployed for operational use, with ongoing monitoring to ensure accuracy as weather patterns evolve. Figure 2 illustrates the flowchart of the approach used in this study for the daily model, with a similar flowchart being developed for the weekly model.

Fig. 2
figure 2

Illustrates the methodology flowchart for the daily forecasting model

3.1 Model Preparation

The study employed time series segmentation to partition the data into training, validation, and test sets, reflecting the importance of maintaining temporal order in time series data. This method segments the data across chronological intervals: the training set includes the earliest data, the validation set the subsequent interval, and the test set the most recent observations. Unlike random splits, this approach preserves chronological accuracy and the inherent temporal dependencies of the data. By using time series splitting, the study maintained the temporal structure of the data, ensuring that future predictions are based on a realistic sequence of events. Based on this concept, two models are developed, as presented in Table SI-18 (supplementary information): a Daily Model and a Weekly Model. Both models use data from Uttarakhand, Haryana, Punjab, Uttar Pradesh, Himachal Pradesh, Madhya Pradesh, and Rajasthan as inputs to predict rainfall for Delhi. The Daily Rainfall Model must quickly detect significant shifts, such as predicting rain after a sunny day, while maintaining robustness against brief anomalies and focusing on daily trends. The Weekly Rainfall Model analyzes seven-day patterns to identify trends like increasing rainfall or alternating dry and wet days, aiming to capture broader trends despite occasional data anomalies. Both models must balance sensitivity to temporal changes with resilience to noise in meteorological data.
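A chronological split of this kind can be sketched as follows; the 70/15/15 fractions are illustrative assumptions, not the study's actual split sizes:

```python
import numpy as np

def time_series_split(X, y, train_frac=0.7, val_frac=0.15):
    """Chronological split: earliest rows train, middle validate, latest test."""
    n = len(X)
    i_train = int(n * train_frac)
    i_val = int(n * (train_frac + val_frac))
    return ((X[:i_train], y[:i_train]),
            (X[i_train:i_val], y[i_train:i_val]),
            (X[i_val:], y[i_val:]))

# 1,000 chronologically ordered days of 7-state predictors (synthetic).
X = np.arange(1000 * 7).reshape(1000, 7)
y = np.arange(1000)

train, val, test = time_series_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))  # 700 150 150
```

Because no shuffling occurs, every validation and test observation lies strictly later in time than every training observation, which is what makes the evaluation realistic for forecasting.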

3.2 Pipelines Creation and Model Training

In machine learning, feature scaling is crucial for maintaining the effectiveness of models, particularly those that are sensitive to the magnitudes of data points or rely on gradient descent optimization techniques, such as K-Nearest Neighbors (KNN) and Neural Networks. To address the issue of diverse units and scales in the dataset, normalization is essential. For this purpose, the Scikit-learn StandardScaler was employed. This scaler standardizes features by adjusting the mean to zero and scaling each feature to have unit variance. This process ensures that no single feature disproportionately influences the model, leading to more balanced and accurate predictions. Scikit-learn's pipelines facilitated this process by integrating preprocessing and modeling steps, simplifying the workflow and ensuring consistent preprocessing across datasets, which enhances reproducibility.
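The scaling-plus-model pipeline might look like the sketch below, where Ridge stands in for any of the ten regressors and the data is synthetic:

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

# A pipeline couples scaling and regression so the scaler is fitted on the
# training data only and re-applied identically at prediction time.
pipe = Pipeline([
    ("scale", StandardScaler()),  # zero mean, unit variance per feature
    ("model", Ridge(alpha=1.0)),  # stand-in for any of the ten regressors
])

rng = np.random.default_rng(1)
X_train = rng.gamma(0.5, 8.0, size=(300, 7))          # 7-state predictors
y_train = X_train @ rng.uniform(0, 1, 7)              # synthetic target

pipe.fit(X_train, y_train)
scaled = pipe.named_steps["scale"].transform(X_train)
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

Fitting the scaler inside the pipeline, rather than on the full dataset, prevents information from the validation and test periods leaking into training, which matters for the chronological splits used here.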

3.3 CatBoost Regressor Algorithm

CatBoost, like other gradient boosting algorithms, builds a strong predictive model by combining multiple weak learners, typically decision trees (Karbasi et al. 2022). The final prediction is generated by aggregating the predictions from these individual trees, each weighted according to their importance. The overall process can be described as follows:

For a binary classification problem, the formula for CatBoost's prediction can be expressed as:

$$F\left(x\right)=\sigma \left({\sum }_{m=1}^{M}{\alpha }_{m}\bullet {f}_{m}\left(x\right)\right)$$
(1)

where \(F(x)\) represents the estimated probability of the positive class (class 1) for input \(x\), and \(M\) is the number of trees in the ensemble. The prediction of the \(m\)-th decision tree is \({f}_{m}\left(x\right)\), and the weight associated with the \(m\)-th tree is denoted by \({\alpha }_{m}\). \(\sigma\) is the sigmoid function that converts the weighted sum of tree predictions into a probability between 0 and 1.

CatBoost employs several techniques to effectively manage categorical data and enhance the boosting process. It utilizes ordered boosting, which adjusts the learning rate for each tree according to its depth and sequence position. Additionally, CatBoost uses a method that optimizes the order of categorical feature levels during splits to minimize overfitting. While the provided formula offers a high-level overview of CatBoost's functionality, the actual implementation includes additional complexities and refinements that further boost its performance and accuracy.
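The aggregation in Eq. (1) can be illustrated with toy stand-ins for the fitted trees; the callables and weights below are invented for illustration and do not reflect CatBoost's actual learned structure:

```python
import numpy as np

def sigmoid(z):
    # Maps the weighted sum of tree outputs to a probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def ensemble_predict(x, trees, weights, classification=False):
    """Eq. (1): F(x) = sigma(sum_m alpha_m * f_m(x)).
    For regression the sigmoid is omitted and the raw weighted sum is used."""
    raw = sum(a * f(x) for a, f in zip(weights, trees))
    return sigmoid(raw) if classification else raw

# Three toy "trees": simple threshold rules on a scalar input.
trees = [lambda x: 1.0 if x > 0 else -1.0,
         lambda x: 1.0 if x > 1 else -1.0,
         lambda x: 1.0 if x > 2 else -1.0]
weights = [0.5, 0.3, 0.2]  # alpha_m

print(ensemble_predict(1.5, trees, weights))          # weighted sum of votes
print(ensemble_predict(1.5, trees, weights, True))    # squashed to (0, 1)
```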

Other methodologies, including Lasso, Ridge, ElasticNet Regression, Light Gradient-Boosting Machine Regressor (LGBM), Linear Regression (LR), Multilayer Perceptron (MLP), Random Forest (RF), Stochastic Gradient Descent (SGD), and XGBoost, have been added to the supplementary information as SI-1 to SI-9.

3.4 Model Evaluation

Several assessment metrics are employed to compare against the actual observations as follows:

Mean squared error (MSE) is calculated as the average of the squared differences between actual (\({y}_{i}\)) and predicted (\({\widehat{y}}_{i}\)) values:

$$\text{MSE}=\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}$$
(2)

Mean absolute error (MAE) is the average of the absolute differences between actual and predicted values:

$$\text{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-{\widehat{y}}_{i}\right|$$
(3)

Root mean square error (RMSE) is the square root of MSE:

$$\text{RMSE}=\sqrt{\text{MSE}}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}$$
(4)

Root mean square percent error (RMSPE) is the square root of the mean squared percentage error:

$$\text{RMSPE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left(\frac{{y}_{i}-{\widehat{y}}_{i}}{{y}_{i}}\right)}^{2}}$$
(5)

R-squared (R2) is computed as \(1\) minus the ratio of the sum of squared differences between actual and predicted values to the sum of squared differences between actual values and the mean (\(\overline{y }\)) of the dependent variable:

$${\text{R}}^{2}=1-\frac{\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({y}_{i}-\overline{y}\right)}^{2}}$$
(6)

Here, \(n\) is the total number of data points, \({y}_{i}\) the observed value and \({\widehat{y}}_{i}\) the predicted value for the \(i\)-th data point, and \(\overline{y}\) the mean of the dependent variable; \(\sum\) denotes summation over all data points.
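The five metrics in Eqs. (2)-(6) can be implemented directly; the zero-rainfall guard for RMSPE is an assumption here, since the paper does not state how zero observations are handled:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the five assessment metrics from Eqs. (2)-(6)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(mse)
    # RMSPE is undefined where y_i = 0, so restrict to non-zero observations
    # (an assumption; zero-rainfall days are common in practice).
    nz = y_true != 0
    rmspe = np.sqrt(np.mean((err[nz] / y_true[nz]) ** 2))
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "RMSPE": rmspe, "R2": r2}

metrics = evaluate([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
print({k: round(v, 4) for k, v in metrics.items()})
```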

4 Results and Discussion

Table SI-19 in the supplementary information presents a detailed summary of key parameters for various machine learning models. These parameters are crucial for optimizing model performance and are adjusted based on the dataset and desired outcomes. This table illustrates the adaptability and customization options available in modern regression algorithms.

4.1 Daily Model Results and Discussion

This section presents the results and discussion of the daily forecasting model. The performance of various models is evaluated using critical assessment criteria, including MSE, MAE, RMSE, RMSPE, and R2. These metrics offer insights into each model’s accuracy, precision, and ability to generalize.

4.1.1 Training Performance Metrics for the Daily Models

Table 1(a) summarizes the training results of different models used for daily forecasting, with the best-performing models highlighted in bold. CatBoost, XGBoost, and RF stand out, showcasing excellent accuracy with perfect R2 values. These models effectively capture underlying data patterns, resulting in minimal errors across various metrics. LGBM also performs well, though its slightly higher RMSPE indicates some room for improvement in capturing percentage errors. The Neural Network MLP model, built with multiple layers, achieves low error scores and a high R2 value, demonstrating its capability to understand complex relationships within the data. In contrast, the LR and Ridge models deliver moderate results, indicating that they may not effectively grasp complex relationships compared to more advanced models. ElasticNet and Lasso models exhibit relatively higher errors and lower R2 values, suggesting that their regularization components might hinder optimal data fitting. The SGD model performs similarly to LR and Ridge but could benefit from hyperparameter fine-tuning for better results. CatBoost, XGBoost, and RF emerge as top contenders for the daily forecasting task due to their remarkable accuracy and ability to capture underlying patterns. LGBM and MLP also offer strong alternatives, especially if computational efficiency is a priority. While simpler linear models like LR, Ridge, ElasticNet, and Lasso yield satisfactory results, they may struggle with intricate data relationships. Evaluating these models on validation and test datasets is essential to ensure their generalizability and ultimately determine the most suitable model, considering both accuracy and interpretability.

Table 1 (a) Training results, (b) Validation results, and (c) Testing results of models in daily rainfall forecasting

4.1.2 Validation Performance Metrics for Daily Models

Table 1(b) presents the validation results for various models used in the daily forecasting task. Comparing these results with the training dataset in Table 1(a) reveals several key observations. CatBoost maintains its exceptional performance on the validation dataset, reaffirming its strong generalization ability. It achieves a perfect R2 score and minimal errors across all metrics, consistently capturing the data's underlying patterns with high accuracy. ElasticNet, Lasso, LR, Ridge, and SGD show similar trends in both datasets, with only marginal differences in their error scores and R2 values. While these models may not achieve the same level of accuracy as more complex ones, their consistency across datasets suggests they capture core trends reasonably well. LGBM and MLP, which demonstrated strong performance in Table 1(a), continue to excel on the validation dataset. Their low error scores and high R2 values highlight their robustness and ability to generalize beyond the training data. XGBoost and RF maintain their accuracy, mirroring their training dataset performance. These models exhibit a perfect R2 score, indicating their capacity to consistently capture and reproduce the data's patterns. The validation results reaffirm the strengths identified in the training dataset. CatBoost, XGBoost, and RF consistently deliver remarkable accuracy across both datasets. LGBM and MLP also maintain strong performance, indicating their reliability and potential for practical application.

4.1.3 Test Performance Metrics for the Daily Models

Table 1(c) provides an overview of the testing results for various models employed in the daily forecasting task. Comparing these testing results with the training and validation datasets in Tables 1(a) and 1(b), several trends can be discerned. CatBoost demonstrates exceptional consistency, maintaining a perfect R2 value and exhibiting minimal errors across all metrics, underscoring its robust generalizability and ability to consistently capture underlying patterns. ElasticNet, Lasso, LR, Ridge, and SGD continue to show stable performance across the datasets, with slight variations in their error scores and R2 values. LGBM and MLP, which exhibited strong results in the training and validation datasets, retain their robust performance on the test dataset, with low error scores and high R2 values, suggesting their capacity to generalize beyond the training data. XGBoost and RF again demonstrate impressive accuracy on the test dataset, reflected by perfect R2 values. These results highlight the dependability and generalizability of several models, namely CatBoost, XGBoost, RF, LGBM, and MLP.

The best model, which predicts rainfall using CatBoost, RF, and LGBM for daily data, is displayed in Figures SI-10 to SI-12 of the supplementary information. The worst model, however, is depicted in Figure SI-13, which uses Lasso for daily data to predict rainfall. Figure SI-10 shows how well the CatBoost model matches actual rainfall data with its predictions, demonstrating striking accuracy. Similarly, the RF model performs well, successfully capturing daily fluctuations, as shown in Figure SI-11. The LGBM model also precisely identifies rainfall peaks and troughs, maintaining close agreement with actual data (Figure SI-12). Conversely, the Lasso model has significant difficulties, exhibiting noticeable discrepancies from actual rainfall data, as illustrated in Figure SI-13.

4.1.4 Comparison of Predicted Error Plots for Daily Models

To gain a deeper understanding of the performance of various machine learning models (CatBoost, ElasticNet, Lasso, LGBM, LR, MLP, RF, Ridge, SGD, and XGBoost) in predicting rainfall, error scatter plots shown in Fig. 3(a)-(j) were analyzed to illustrate the relationship between predicted and actual rainfall values. The figures reveal that CatBoost and RF keep prediction errors within the 0-2% band, as shown in Fig. 3(a) and (g). XGBoost shows slightly larger errors, in the 2-5% range, as illustrated in Fig. 3(j). In contrast, LGBM displays a broader error range, spanning 2 to 15%, as shown in Fig. 3(c). ElasticNet, Lasso, LR, MLP, Ridge, and SGD consistently show errors exceeding 15%, as depicted in Fig. 3(b), (d), (e), (f), (h), and (i), respectively. Based on these findings, it is reasonable to conclude that CatBoost and RF outperform the other models in terms of robustness and reliability for predicting daily rainfall.
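The band counting behind these percentages can be sketched as follows; the example values are invented, and the band edges follow the 0-2%, 2-5%, 5-15%, >15% grouping read off the scatter plots:

```python
import numpy as np

def percent_error_bands(y_true, y_pred, bands=(2, 5, 15)):
    """Fraction of predictions whose absolute percentage error falls in
    each band (0-2%, 2-5%, 5-15%, >15% by default)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    nz = y_true != 0  # percentage error undefined at zero rainfall
    pct = 100 * np.abs(y_pred[nz] - y_true[nz]) / y_true[nz]
    edges = (0, *bands, np.inf)
    return {f"{lo}-{hi}%": np.mean((pct >= lo) & (pct < hi))
            for lo, hi in zip(edges[:-1], edges[1:])}

y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([10.1, 20.8, 33.0, 48.0])  # 1%, 4%, 10%, 20% errors
dist = percent_error_bands(y_true, y_pred)
print(dist)
```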

Fig. 3

(a) Scatter plots for the CatBoost daily model. (b) Scatter plots for the ElasticNet daily model. (c) Scatter plots for the LGBM daily model. (d) Scatter plots for the Lasso daily model. (e) Scatter plots for the LR daily model. (f) Scatter plots for the MLP daily model. (g) Scatter plots for the RF daily model. (h) Scatter plots for the Ridge daily model. (i) Scatter plots for the SGD daily model. (j) Scatter plots for the XGBoost daily model

4.2 Results and Discussion for Weekly Models

This section presents the results and discusses the weekly forecasting models. The performance of various models is evaluated with respect to \(\text{MSE}\), \(\text{MAE}\), \(\text{RMSE}\), \(\text{RMSPE}\), and R2.

4.2.1 Training Performance Metrics for the Weekly Models

Table 2(a) presents the training results for various models used in the weekly forecasting task. The analysis reveals the following insights into model performance: XGBoost emerges as the most accurate, exhibiting minimal error scores across all metrics and an impressive R2 value of 0.99. This indicates that XGBoost effectively captures underlying patterns within the data, leading to highly accurate forecasts. RF and CatBoost also show near-perfect accuracy with R2 values of 0.99, benefiting from ensemble approaches that leverage multiple decision trees to achieve accurate forecasts and strong performance across all metrics. ElasticNet, Lasso, LR, Ridge, and SGD deliver reasonably good performance with moderate error scores and R2 values ranging from 0.92 to 0.96. LGBM and MLP further demonstrate strong performance, with LGBM achieving a high R2 value of 0.99 and MLP a value of 0.98, showcasing their proficiency in capturing complex relationships and modeling weekly trends. The top performers, based on consistent and accurate forecasts, are XGBoost, RF, CatBoost, LGBM, and MLP. To ensure their generalizability and suitability for real-world applications, their performance is further assessed on independent validation and test datasets.

Table 2 (a) Training results, (b) Validation results, and (c) Testing results of models in weekly rainfall forecasting

4.2.2 Validation Performance Metrics for the Weekly Models

Table 2(b) provides a comprehensive overview of the validation results for various models used in the weekly forecasting task. The insights gained from these results allow for a thorough assessment of each model's performance. XGBoost continues to stand out as the most accurate model, showing consistent performance across all metrics with an R2 value of 0.99. This indicates that XGBoost effectively captures the underlying trends in the data, leading to precise weekly forecasts on the validation dataset. RF and CatBoost also maintain exceptional accuracy, each with an R2 value of 0.99; these ensemble methods continue to leverage multiple decision trees effectively, resulting in strong performance across all metrics. ElasticNet, Lasso, LR, Ridge, and SGD demonstrate stable and respectable performance, with R2 values ranging from 0.92 to 0.96. This range highlights their ability to consistently capture the fundamental trends in the data. LGBM and MLP also perform strongly, with LGBM achieving an R2 value of 0.99 and MLP a value of 0.98, showcasing their effectiveness in modeling complex relationships and reinforcing their value for weekly trend forecasting. Again, XGBoost, RF, CatBoost, LGBM, and MLP emerge as the most consistent and accurate models, providing precise predictions across the validation dataset.

4.2.3 Test Performance Metrics for the Weekly Models

Table 2(c) provides an overview of the testing results for the various models used in weekly forecasting. The analysis of these results offers valuable insights into each model's performance. XGBoost remains a standout performer, maintaining consistent accuracy across all metrics. ElasticNet, Lasso, LR, Ridge, and SGD demonstrate steady performance, with moderate error scores and R2 values ranging from 0.92 to 0.96, indicating their ability to consistently capture the fundamental trends in the data. LGBM and MLP also continue to excel, showcasing their strengths with low error scores and high R2 values of 0.99 and 0.98, respectively. RF and CatBoost maintain their impressive accuracy with an R2 value of 0.99. XGBoost, RF, CatBoost, LGBM, and MLP consistently deliver accurate predictions across the testing dataset.

The best models for predicting rainfall using CatBoost, RF, and LGBM for weekly data are illustrated in Figures SI-14 to SI-16 of the supplementary information. Figure SI-14 demonstrates the CatBoost model's ability to match actual rainfall data with its predictions, showcasing striking accuracy. Figure SI-15 highlights the RF model's effective performance in capturing weekly fluctuations. Figure SI-16 shows how well the LGBM model identifies rainfall peaks and troughs, maintaining close agreement with actual data. Conversely, Figure SI-17 depicts the Lasso model, which faces significant challenges and exhibits noticeable discrepancies from the actual rainfall data.

4.2.4 Comparison of Predicted Error Plots for Weekly Models

To gain a deeper understanding of the performance of the various machine learning models (CatBoost, ElasticNet, Lasso, LGBM, LR, MLP, RF, Ridge, SGD, and XGBoost) in predicting rainfall, error scatter plots shown in Fig. 4(a)-(j) were analyzed to illustrate the relationship between predicted and actual rainfall values. The analysis reveals that XGBoost exhibits data point errors below 2% for rainfall predictions, as shown in Fig. 4(j). CatBoost and RF demonstrate data point errors ranging from 2 to 5%, with some errors extending between 5 and 15%, as depicted in Fig. 4(a) and (g). In contrast, LGBM displays a broader error range, spanning from 2 to 15% for rainfall data, as illustrated in Fig. 4(c). ElasticNet, Lasso, LR, MLP, Ridge, and SGD consistently show higher error percentages, exceeding 15% for rainfall predictions, as shown in Fig. 4(b), (d), (e), (f), (h), and (i), respectively. Based on these findings, it is reasonable to conclude that XGBoost outperforms the other models in terms of robustness and reliability for predicting weekly rainfall.
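The error bands read off Fig. 4 can be reproduced numerically: for each point, the percentage error is the absolute difference between predicted and actual rainfall relative to the actual value, and points are then counted per band. A minimal sketch on hypothetical values (the small epsilon guarding against zero-rainfall weeks is an assumption, as the handling of dry weeks is not specified):

```python
import numpy as np

def percent_errors(y_true, y_pred, eps=1e-6):
    # Absolute percentage error per point; eps avoids division
    # by zero for dry (zero-rainfall) weeks.
    return 100.0 * np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), eps)

def band_counts(errors, edges=(2, 5, 15)):
    # Count points below 2%, in 2-5%, in 5-15%, and above 15%,
    # matching the bands discussed for Fig. 4.
    bins = np.digitize(errors, edges)
    return {"<2%": int(np.sum(bins == 0)),
            "2-5%": int(np.sum(bins == 1)),
            "5-15%": int(np.sum(bins == 2)),
            ">15%": int(np.sum(bins == 3))}

# Hypothetical actual vs. predicted weekly rainfall (mm)
y_true = np.array([10.0, 8.0, 5.0, 12.0, 6.0])
y_pred = np.array([10.1, 8.3, 5.5, 11.0, 4.5])

print(band_counts(percent_errors(y_true, y_pred)))
```

A model whose points concentrate in the lowest band, as reported for XGBoost, is the most reliable by this criterion.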

Fig. 4

Scatter plots for the weekly models: (a) CatBoost, (b) ElasticNet, (c) LGBM, (d) Lasso, (e) LR, (f) MLP, (g) RF, (h) Ridge, (i) SGD, and (j) XGBoost

5 Conclusion

This study provides an in-depth analysis of rainfall forecasting models using historical data from Uttarakhand, Uttar Pradesh, Haryana, Punjab, Himachal Pradesh, Madhya Pradesh, Rajasthan, and Delhi. The evaluation of both daily and weekly forecasting approaches revealed significant insights into model performance and accuracy. For daily forecasting, the CatBoost model demonstrated exceptional accuracy with an R2 of 0.99, RMSE of 0.0014, and MAE of 0.0011, effectively capturing intricate daily rainfall patterns. XGBoost and RF followed closely with an R2 of 0.99, RMSE of 0.0022, and MAE of 0.0017, also showing strong predictive capabilities. In contrast, Lasso Regression exhibited lower accuracy with an R2 of 0.75, RMSE of 1.50, and MAE of 1.02, highlighting its limitations in modeling detailed rainfall variability. For weekly forecasting, XGBoost emerged as the top performer with an R2 of 0.99, RMSE of 0.10, and MAE of 0.019, demonstrating its robustness in capturing weekly rainfall trends. RF achieved an R2 of 0.99, RMSE of 0.112, and MAE of 0.032, providing consistent and reliable predictions. CatBoost also performed effectively with an R2 of 0.99, RMSE of 0.118, and MAE of 0.062. These findings underscore the effectiveness of advanced machine learning techniques in enhancing rainfall forecasting accuracy. The study highlights the critical role of precise rainfall predictions in informed decision-making across agriculture, water resource management, flood control, and urban planning. The successful application of these models promises to improve our understanding of regional climate patterns and their implications for weather forecasting in northern India.