The use of explainable artificial intelligence for interpreting the effect of flow phase and hysteresis on turbidity prediction

  • Original Article
  • Environmental Earth Sciences

Abstract

Predicting turbidity (T), which represents the amount of fine sediment in water, is essential for effective water quality management. In this study, two ensemble learning models, XGBoost and the light gradient boosting decision tree (LGB), were employed to predict T using discharge (Q) as the independent variable. The input data were classified into three groups based on flow phase (rising limb, falling limb, and base flow), and datasets with different time frequencies (2, 8, and 24 h) were used to develop the models. In the first model set (Model 1), a separate model was trained for each phase, and its performance was tested on the Q of the corresponding phase. For comparison, a second model set (Model 2) was developed with XGBoost and LGB using the entire period without classification. Model 1, which used data classified into the three phases, outperformed Model 2. Further analysis of the flow phase and of hysteresis in the relationship between Q and T showed that the differing data distributions among the three phases accounted for the performance difference between Models 1 and 2; because Model 1 takes these differences into account, it performed better. Shapley additive explanations (SHAP), a novel explainable artificial intelligence method, provided a reasonable interpretation of the difference in predictions between Models 1 and 2.
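The contrast between the two model sets can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phase rule in `classify_phase` (a base-flow threshold plus the sign of the change in Q), the synthetic event, and all function names are assumptions made for illustration, and an ordinary least-squares line stands in for XGBoost and LGB so that the example needs only the standard library. The errors are computed in-sample, so the sketch shows only why hysteresis favours phase-wise models, not how well they generalise.

```python
from statistics import mean


def classify_phase(q, base_threshold):
    """Label each discharge sample as 'base', 'rising', or 'falling'.

    Assumed rule (not taken from the paper): samples at or below
    base_threshold are base flow; above it, an increase from the previous
    sample is 'rising' and anything else is 'falling'.
    """
    phases = []
    for i, value in enumerate(q):
        if value <= base_threshold:
            phases.append("base")
            continue
        previous = q[i - 1] if i > 0 else value
        phases.append("rising" if value > previous else "falling")
    return phases


def fit_line(x, y):
    """Least-squares fit of y = a + b*x (stand-in for a boosted-tree model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0  # constant x: fall back to the mean of y
    return my - b * mx, b


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def compare_models(q, t, base_threshold):
    """Return (phases, RMSE of per-phase models, RMSE of one pooled model)."""
    phases = classify_phase(q, base_threshold)

    # Model 2 analogue: a single model fitted to the whole record.
    a, b = fit_line(q, t)
    pooled_pred = [a + b * qi for qi in q]

    # Model 1 analogue: one model per flow phase, applied to its own phase.
    params = {}
    for phase in set(phases):
        idx = [i for i, p in enumerate(phases) if p == phase]
        params[phase] = fit_line([q[i] for i in idx], [t[i] for i in idx])
    phase_pred = [params[p][0] + params[p][1] * qi for p, qi in zip(phases, q)]

    return phases, rmse(t, phase_pred), rmse(t, pooled_pred)


# Synthetic single event with hysteresis: for the same Q, turbidity is
# higher on the rising limb (T = 3Q) than on the falling limb (T = Q).
q = [5, 5, 10, 20, 30, 40, 30, 20, 10, 5, 5]
t = [2, 2, 30, 60, 90, 120, 30, 20, 10, 2, 2]

phases, rmse_by_phase, rmse_pooled = compare_models(q, t, base_threshold=5)
print(phases)
print(f"per-phase RMSE: {rmse_by_phase:.3f}, pooled RMSE: {rmse_pooled:.3f}")
```

Because the rising and falling limbs follow different Q–T relationships in this synthetic event, a single pooled fit cannot match both, whereas the per-phase fits can. Swapping `fit_line` for a gradient-boosting regressor would leave the structure of the comparison unchanged.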

Data Availability

The data are presented in the manuscript itself; additional data will be made available on reasonable request.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1F1A1065518) (50%). This study [G21S302588802] was supported by the technology development project of the Ministry of SMEs in 2022 (50%).

Author information

Contributions

JP: conceptualization, carried out the modeling and data analysis, investigation, methodology, writing—original draft, writing—review and editing. WHL: conceptualization, investigation, writing—review and editing. IK: conceptualization, investigation, writing—review and editing. JCJ: conceptualization, methodology, writing—review and editing.

Corresponding authors

Correspondence to Ilsuk Kang or Woo Hyoung Lee.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Park, J., Joo, J.C., Kang, I. et al. The use of explainable artificial intelligence for interpreting the effect of flow phase and hysteresis on turbidity prediction. Environ Earth Sci 82, 375 (2023). https://doi.org/10.1007/s12665-023-11056-1

  • DOI: https://doi.org/10.1007/s12665-023-11056-1
