Abstract
Accurate and sustainable management of water resources is among the most important concerns of basin and river engineering. In this study, a hybrid machine learning (ML) model was developed by combining CatBoost with a Genetic Algorithm (GA) to improve river flow prediction. The study was applied to the Sakarya Basin, which lies in a semi-arid climate region of Turkey. The forecast performance of the models was assessed under a one-day-ahead forecast scenario using data from the Adatepe, Aktaş and Rüstümköy flow measurement stations (FMS). Daily flow data from these stations between 2002 and 2012 were used, and the proposed model was tested against CatBoost, Long Short-Term Memory (LSTM) and a classical estimation method, Linear Regression (LR). The study also aimed to improve predictive performance by combining a genetic algorithm with the gradient boosting model (GA-CatBoost). The developed hybrid model outperformed the benchmark models. The results showed that the developed model can be successfully applied to river flow forecasting.
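The abstract does not give the details of the GA-CatBoost coupling, so the following is only a minimal sketch of how a genetic algorithm can search a hyperparameter space, not the authors' implementation. The parameter names and ranges in `BOUNDS`, the GA settings, and `toy_fitness` are all illustrative assumptions. In a setting like the one described, the fitness function would be replaced by the validation error of a gradient boosting model trained on lagged daily flows, and integer-valued parameters such as tree depth would be rounded before use.

```python
import random

# Hypothetical search space: the names mirror common gradient-boosting
# hyperparameters, but the ranges are illustrative assumptions.
BOUNDS = {
    "learning_rate": (0.01, 0.3),
    "depth": (2.0, 10.0),
    "l2_leaf_reg": (1.0, 10.0),
}


def random_individual(rng):
    """Draw one candidate uniformly from the search space."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in BOUNDS}


def mutate(ind, rng, rate=0.2, scale=0.1):
    """Gaussian mutation, clipped back into the allowed range."""
    out = dict(ind)
    for k, (lo, hi) in BOUNDS.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, scale * (hi - lo))
            out[k] = min(hi, max(lo, out[k] + step))
    return out


def tournament(pop, scores, rng, k=3):
    """Return the lowest-error individual among k random contestants."""
    idx = rng.sample(range(len(pop)), k)
    return pop[min(idx, key=lambda i: scores[i])]


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Minimise `fitness` (lower is better) over BOUNDS with a simple GA."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best = pop[min(range(pop_size), key=lambda i: scores[i])]
        children = [best]  # elitism: the best candidate always survives
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    scores = [fitness(ind) for ind in pop]
    i = min(range(pop_size), key=lambda j: scores[j])
    return pop[i], scores[i]


def toy_fitness(params):
    """Stand-in for a validation error; NOT a real model evaluation."""
    return ((params["learning_rate"] - 0.1) ** 2
            + (params["depth"] - 6.0) ** 2 / 100.0
            + (params["l2_leaf_reg"] - 3.0) ** 2 / 100.0)


if __name__ == "__main__":
    best_params, best_error = genetic_search(toy_fitness)
    print(best_params, best_error)
```

Because the best candidate of each generation is carried over unchanged, the best error never gets worse from one generation to the next, which makes the search easy to reason about even with a noisy or expensive fitness function.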
Availability of Data and Materials
The data will be available upon reasonable request.
References
Ardabili S, Mosavi A, Dehghani M, Várkonyi-Kóczy AR (2020) Deep learning and machine learning in hydrological processes, climate change and earth systems: a systematic review. In: Engineering for Sustainable Future: Selected papers of the 18th International Conference on Global Research and Education Inter-Academia–2019 18. Springer, pp 52–62
Basilio SA, Goliatt L (2022) Gradient boosting hybridized with exponential natural evolution strategies for estimating the strength of geopolymer self-compacting concrete. Knowledge-Based Eng Sci 3:1–16
Box GEP, Tiao GC (1975) Intervention analysis with applications to economic and environmental problems. J Am Stat Assoc 70:70–79. https://doi.org/10.1080/01621459.1975.10480264
Carvalho TMN, de Assis de Souza Filho F (2021) Variational mode decomposition hybridized with gradient boost regression for seasonal forecast of residential water demand. Water Resour Manag 35:3431–3445. https://doi.org/10.1007/s11269-021-02902-7
Ceribasi G, Ceyhunlu AI, Wałęga A, Młyński D (2022) Investigation of the effect of climate change on energy produced by hydroelectric power plants (HEPPs) by trend analysis method: A case study for Dogancay I-II HEPPs. Energies 15:2474. https://doi.org/10.3390/en15072474
Danandeh Mehr A, Rikhtehgar Ghiasi A, Yaseen ZM et al (2022) A novel intelligent deep learning predictive model for meteorological drought forecasting. J Ambient Intell Humaniz Comput 1–15
Dorogush AV, Ershov V, Gulin A (2018) CatBoost: Gradient boosting with categorical features support. arXiv preprint arXiv:1810.11363
Fan J, Wang X, Wu L et al (2018) Comparison of Support Vector Machine and Extreme Gradient Boosting for predicting daily global solar radiation using temperature and precipitation in humid subtropical climates: A case study in China. Energy Convers Manag 164:102–111. https://doi.org/10.1016/j.enconman.2018.02.087
Fernández-Carrillo VH, Quej-Chi VH, De los Santos-Posadas HM, Carrillo-Ávila E (2022) Do AI models improve taper estimation? A comparative approach for teak. Forests 13:1465. https://doi.org/10.3390/f13091465
Ghimire S, Yaseen ZM, Farooque AA et al (2021) Streamflow prediction using an integrated methodology based on convolutional neural network and long short-term memory networks. Sci Rep 11:1–26
Goliatt L, Yaseen ZM (2023) Development of a hybrid computational intelligent model for daily global solar radiation prediction. Expert Syst Appl 212:118295
He X, Luo J, Li P, Zuo G, Xie J (2020) A hybrid model based on variational mode decomposition and gradient boosting regression tree for monthly runoff forecasting. Water Resour Manag 34:865–884
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Huang G, Wu L, Ma X et al (2019) Evaluation of CatBoost method for prediction of reference evapotranspiration in humid regions. J Hydrol. https://doi.org/10.1016/j.jhydrol.2019.04.085
Ibrahim KSMH, Huang YF, Ahmed AN et al (2022) Forecasting multi-step-ahead reservoir monthly and daily inflow using machine learning models based on different scenarios. Appl Intell. https://doi.org/10.1007/s10489-022-04029-7
Imrie CE, Durucan S, Korre A (2000) River flow prediction using artificial neural networks: generalisation beyond the calibration range. J Hydrol 233:138–153
Ivanov AM, Gorbarenko AV, Kireeva MB, Povalishnikova ES (2022) Identifying climate change impacts on hydrological behavior on large-scale with machine learning algorithms. Geogr Environ Sustain 15:80–87
Karbasi M, Jamei M, Ali M et al (2022) Forecasting weekly reference evapotranspiration using Auto Encoder Decoder Bidirectional LSTM model hybridized with a Boruta-CatBoost input optimizer. Comput Electron Agric 198:107121
Khan MI, Maity R (2020) Hybrid deep learning approach for multi-step-ahead daily rainfall prediction using GCM simulations. IEEE Access 8:52774–52784. https://doi.org/10.1109/access.2020.2980977
Kilinc HC, Ahmadianfar I, Demir V et al (2023) Daily scale streamflow forecasting based-hybrid gradient boosting machine learning model. Research Square
Kilinc HC (2022) Daily streamflow forecasting based on the hybrid particle swarm optimization and long short-term memory model in the Orontes basin. Water 14:490. https://doi.org/10.3390/w14030490
Kilinc HC, Yurtsever A (2022) Short-term streamflow forecasting using hybrid deep learning model based on grey wolf algorithm for hydrological time series. Sustainability 14:3352
Kim J, Han H, Johnson LE et al (2019) Hybrid machine learning framework for hydrological assessment. J Hydrol 577:123913
Kumar P, Singh AK (2022) A comparison between MLR, MARS, SVR and RF techniques: hydrological time-series modeling. J Hum Earth Futur 3:90–98
Li L, Qiao J, Yu G et al (2022) Interpretable tree-based ensemble model for predicting beach water quality. Water Res 211:118078. https://doi.org/10.1016/j.watres.2022.118078
Mahmood R, Jia S (2022) A comprehensive approach to develop a hydrological model for the simulation of all the important hydrological components: The case of the Three-River Headwater Region, China. Water 14:2778. https://doi.org/10.3390/w14182778
Munawar HS, Hammad AWA, Waller ST (2021) A review on flood management technologies related to image processing and machine learning. Autom Constr
Naganna SR, Beyaztas BH, Bokde N, Armanuos AM (2020) On the evaluation of the gradient tree boosting model for groundwater level forecasting. Knowledge-Based Eng Sci 1:48–57
Nguyen DH, Le Hien X, Heo J-Y, Bae D-H (2021) Development of an extreme gradient boosting model integrated with evolutionary algorithms for hourly water level prediction. IEEE Access 9:125853–125867. https://doi.org/10.1109/access.2021.3111287
Niu D, Diao L, Zang Z et al (2021) A machine-learning approach combining wavelet packet denoising with catboost for weather forecasting. Atmosphere (Basel) 12:1618. https://doi.org/10.3390/atmos12121618
Patrous Z (2018) Evaluating XGBoost for user classification by using behavioral features extracted from smartphone sensors. MSc thesis, KTH Royal Institute of Technology, Stockholm, Sweden
Prokhorenkova L, Gusev G, Vorobev A et al (2018) CatBoost: unbiased boosting with categorical features. Adv Neural Inf Process Syst 31
Qi C, Wu M, Liu H et al (2023) Machine learning exploration of the mobility and environmental assessment of toxic elements in mining-associated solid wastes. J Clean Prod 136771
Sarioglu FC, Yaslan Y (2019) Item prediction with RNN using different types of user-item interactions. Signal Process Commun Appl Conf
Singh SK, Goyal A (2020) Performance analysis of machine learning algorithms for cervical cancer detection. Int J Healthc Inf Syst Informatics. https://doi.org/10.4018/IJHISI.2020040101
Solak CN, Peszek Ł, Yilmaz E et al (2020) Use of diatoms in monitoring the Sakarya river basin, Turkey. Water 12:703. https://doi.org/10.3390/w12030703
Taylor KE (2001) Summarizing multiple aspects of model performance in a single diagram. J Geophys Res 106:7183–7192
Tur R, Yontem S (2021) A comparison of soft computing methods for the prediction of wave height parameters. Knowledge-Based Eng Sci 2:31–46
Wang L, Guo Y, Fan M (2022) Improving annual streamflow prediction by extracting information from high-frequency components of streamflow. Water Resour Manag 36:4535–4555. https://doi.org/10.1007/s11269-022-03262-6
Xia F, Jiang D, Kong L et al (2022) Prediction of dichloroethene concentration in the groundwater of a contaminated site using XGBoost and LSTM. Int J Environ Res Public Health 19:9374. https://doi.org/10.3390/ijerph19159374
Xie T, Zhang G, Hou J et al (2019) Hybrid forecasting model for non-stationary daily runoff series: A case study in the Han River Basin, China. J Hydrol
Yaseen ZM, Sulaiman SO, Deo RC, Chau K-W (2018) An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction. J Hydrol 569:387–408. https://doi.org/10.1016/j.jhydrol.2018.11.069
Yukseltan E, Yucekaya A, Bilge AH, Agca Aktunc E (2021) Forecasting models for daily natural gas consumption considering periodic variations and demand segregation. Socioecon Plann Sci 74:100937. https://doi.org/10.1016/j.seps.2020.100937
Zeng H, Shao B, Dai H et al (2023) Prediction of fluctuation loads based on GARCH family-CatBoost-CNNLSTM. Energy 263:126125
Zhang Y, Zhao Z, Zheng J (2020) CatBoost: A new approach for estimating daily reference crop evapotranspiration in arid and semi-arid regions of Northern China. J Hydrol 588:125087. https://doi.org/10.1016/j.jhydrol.2020.125087
Zheng Z, Ali M, Jamei M et al (2023) Design data decomposition-based reference evapotranspiration forecasting model: A soft feature filter based deep learning driven approach. Eng Appl Artif Intell 121:105984
Zounemat-Kermani M, Batelaan O, Fadaee M, Hinkelmann R (2021) Ensemble machine learning paradigms in hydrology: A review. J Hydrol 598:126266. https://doi.org/10.1016/j.jhydrol.2021.126266
Acknowledgements
The authors acknowledge the data source, the “General Directorate of Electrical Works Survey Administration”. In addition, this research was previously published as a preprint; readers can refer to that version (Kilinc et al. 2023).
Author information
Contributions
Huseyin Cagan Kilinc: Conceptualization; Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing); Resources; Software.
Iman Ahmadianfar: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Vahdettin Demir: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Salim Heddam: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Ahmed M. Al-Areeq: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Sani I. Abba: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Mou Leong Tan: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Bijay Halder: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Haydar Abdulameer Marhoon: Data curation; Formal analysis; Methodology; Investigation; Visualization; Writing (original draft preparation, review and editing).
Zaher Mundher Yaseen: Supervision; Conceptualization; Formal analysis; Project administration; Investigation; Writing (review and editing).
Ethics declarations
Ethical Approval
The authors declare that this article has not been published in any other journal and that no plagiarism has occurred.
Consent to Participate
The authors agree to participate in the journal.
Consent to Publish
The authors agree to publish in the journal.
Competing Interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kilinc, H.C., Ahmadianfar, I., Demir, V. et al. Daily Scale River Flow Forecasting Using Hybrid Gradient Boosting Model with Genetic Algorithm Optimization. Water Resour Manage 37, 3699–3714 (2023). https://doi.org/10.1007/s11269-023-03522-z
DOI: https://doi.org/10.1007/s11269-023-03522-z