Abstract
In the present study, a nowcasting model is developed using Automatic Weather Station (AWS) data collected at Thiruvananthapuram, Kerala, India. The proposed model is based on a machine learning technique, Random Forest (RF), coupled with Principal Component Analysis (PCA). PCA reduces multicollinearity in the AWS data, which enables the RF to assess the independent effects of the predictors efficiently and to predict rainy or non-rainy conditions of the atmosphere for the next 4 hours during July, the peak month of the summer monsoon. The sensitivity and feasibility of the model were tested for different predictors such as wind speed, temperature, pressure, relative humidity, sunshine, and rainfall, and the demarcation between rainy and non-rainy events was computed using a precision-recall curve. The performance of the proposed algorithm for rainfall events is evaluated using statistics such as accuracy, precision, recall, probability of detection (POD), and false alarm rate (FAR). The proposed algorithm is found to nowcast with an accuracy of 90% and a probability of detection of 68%. The analysis of in-situ observations establishes that the most influential predictors for the nowcasting of rainfall are atmospheric pressure and wind speed.
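As an illustration of the evaluation step described in the abstract, the following minimal sketch computes the listed statistics from a rain/no-rain contingency table and picks a probability threshold by scanning the precision-recall trade-off. It is not the authors' implementation. Several points are assumptions: the threshold is chosen by maximising the F1 score, which the abstract does not specify; FAR is computed as the false alarm ratio, FP / (TP + FP), since the abstract does not state which definition is used; and the labels and probabilities are made-up toy values, not AWS data. The function names `scores` and `pick_threshold` are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def contingency(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (hits, false alarms, misses, correct negatives) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def _ratio(num: int, den: int) -> float:
    """Safe division: an empty denominator yields 0.0 instead of an error."""
    return num / den if den else 0.0


def scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, POD and FAR for a binary rain forecast."""
    tp, fp, fn, tn = contingency(y_true, y_pred)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "pod": _ratio(tp, tp + fn),  # same formula as recall
        "far": _ratio(fp, tp + fp),  # assumption: false alarm ratio
    }


def pick_threshold(y_true: Sequence[int], rain_prob: Sequence[float]) -> float:
    """Scan every observed probability as a cut-off and keep the one with
    the highest F1 (assumed criterion, not stated in the source)."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(rain_prob)):
        pred = [1 if p >= t else 0 for p in rain_prob]
        s = scores(y_true, pred)
        p, r = s["precision"], s["recall"]
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


# Toy data (made up): 1 = rainy, 0 = non-rainy; rain_prob stands in for
# the class probabilities a classifier such as a random forest would output.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
rain_prob = [0.05, 0.10, 0.20, 0.30, 0.15, 0.40, 0.80, 0.35, 0.90, 0.85]

threshold = pick_threshold(y_true, rain_prob)
y_pred = [1 if p >= threshold else 0 for p in rain_prob]
result = scores(y_true, y_pred)
print(threshold, result)
```

Under the standard definitions used here, POD and recall coincide, so the two differ only if a different convention is adopted. Rainy events are typically the minority class, which is one reason a precision-recall curve can be more informative than accuracy alone when choosing the threshold.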
Data availability
The pre-processed data used in this study can be made available on request.
Acknowledgments
The authors thank the Editor-in-Chief and the two anonymous reviewers for their constructive comments and suggestions, which improved the study. The authors acknowledge the Space Applications Centre (SAC), ISRO, Ahmedabad, as the data for the study were downloaded from the website www.mosdac.gov.in.
Funding
The authors did not receive any grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
Conceptualization: Bipasha Paul Shukla and Anupam Priamvada; Methodology: Anupam Priamvada; Formal analysis and investigation: Anupam Priamvada; Writing, original draft preparation: Anupam Priamvada; Writing, review and editing: Bipasha Paul Shukla and Nita H. Shah; Resources: Bipasha Paul Shukla and Nita H. Shah.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by: H. Babaie
About this article
Cite this article
Shah, N.H., Priamvada, A. & Shukla, B.P. Random forest-based nowcast model for rainfall. Earth Sci Inform 16, 2391–2403 (2023). https://doi.org/10.1007/s12145-023-01037-0
DOI: https://doi.org/10.1007/s12145-023-01037-0