Applications of Data-driven Models for Daily Discharge Estimation Based on Different Input Combinations

Water Resources Management

Abstract

Accurate and reliable discharge estimation is vital for managing water resources, agriculture, industry, and floods at the basin scale. In this study, five data-driven tree-based algorithms, namely the M5-Pruned model (M5P, Model-1), Random Forest (RF, Model-2), Random Tree (RT, Model-3), Reduced Error Pruning Tree (REP Tree, Model-4), and Decision Stump (DS, Model-5), were examined for estimating the daily discharge at the Govindpur site on the Burhabalang River, India. The models were calibrated with a 10-year daily hydrological time series (river stage (h) and daily discharge (Q)) measured from 2004 to 2013. To assess the reliability of the developed models, 70% of the dataset was used for training and 30% for testing. Model precision was optimized by investigating five scenarios of input variables based on different time-lag combinations, and each model was examined under these scenarios during both the training and testing phases. Model performance was evaluated using five statistical metrics: correlation coefficient (R²), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE), and Root Relative Squared Error (RRSE). Results showed that Model-3 (RT) outperformed the other proposed models. By comparison, Model-5 (DS) struggled to capture the river's flow rate and performed poorly, with R² values ranging from 0.64 to 0.94 across the scenarios. Therefore, it can be concluded that the RT model could be used as a robust model for sustainable flood plain management.
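
As a rough illustration of the workflow the abstract describes, the sketch below builds time-lagged stage inputs, splits the records 70/30, and computes the five reported metrics. It is not the authors' implementation: the function names (make_lagged, chronological_split, metrics) are hypothetical, the chronological ordering of the split is an assumption (the abstract gives only the proportions), R² is assumed to be the squared Pearson correlation, and RAE and RRSE follow their common textbook definitions relative to a mean-of-observations predictor.

```python
import math


def make_lagged(stage, discharge, n_lags):
    """Pair each Q(t) with the inputs [h(t), h(t-1), ..., h(t-n_lags)].

    The lag structure is an assumed example of a "time-lag combination";
    the abstract does not list the five scenarios themselves.
    """
    rows = []
    for t in range(n_lags, len(stage)):
        inputs = [stage[t - k] for k in range(n_lags + 1)]
        rows.append((inputs, discharge[t]))
    return rows


def chronological_split(rows, train_fraction=0.7):
    """Split rows in time order: first 70% for training, the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def metrics(actual, predicted):
    """Return R2, MAE, RMSE, RAE (%) and RRSE (%) for paired observations.

    Assumes neither series is constant (otherwise a denominator is zero).
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    abs_dev = sum(abs(a - mean_a) for a in actual)
    sq_dev_a = sum((a - mean_a) ** 2 for a in actual)
    sq_dev_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    r = cov / math.sqrt(sq_dev_a * sq_dev_p)  # Pearson correlation
    return {
        "R2": r ** 2,
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": 100.0 * abs_err / abs_dev,
        "RRSE": 100.0 * math.sqrt(sq_err / sq_dev_a),
    }
```

Under these definitions, RAE and RRSE values below 100% indicate that a model does better than always predicting the mean observed discharge, which is what makes them convenient for comparing models across input scenarios.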

Availability of Data and Materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author information

Contributions

Manish Kumar: Project administration, Conceptualization, Writing (original draft), Formal analysis, Visualization. Ahmed Elbeltagi: Software, Formal analysis, Writing (original draft), and editing. Chaitanya B. Pande, Ali Najah Ahmed, Ming Fai Chow: Writing (original draft), Visualization. Anuradha Kumari, Deepak Kumar: Data curation, Writing, review, and editing. Quoc Bao Pham: Supervision, Writing, review, and editing.

Corresponding author

Correspondence to Quoc Bao Pham.

Ethics declarations

Ethical Approval

Not applicable.

Consent to Participate

Not applicable.

Consent to Publish

Not applicable.

Competing Interests

This manuscript has not been published or presented elsewhere in part or entirety and is not under consideration by another journal. There are no conflicts of interest to declare.

About this article

Cite this article

Kumar, M., Elbeltagi, A., Pande, C.B. et al. Applications of Data-driven Models for Daily Discharge Estimation Based on Different Input Combinations. Water Resour Manage 36, 2201–2221 (2022). https://doi.org/10.1007/s11269-022-03136-x

  • DOI: https://doi.org/10.1007/s11269-022-03136-x
