Abstract
Despite high pollution levels in Indian rivers, a comprehensive study of their water quality index (WQI) is still lacking. This study assesses the spatial distribution of WQI values and classes across Indian river systems and explores the use of machine learning (ML) models to predict WQI classes from a reduced number of input parameters. WQI values were computed, and their classes determined, from six water quality parameters in an available decennial dataset (n = 3595) on Indian rivers.
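The abstract does not state which WQI formulation or which six parameters were used. As a rough illustration only, the sketch below assumes the widely used weighted arithmetic form, in which each parameter receives a quality rating relative to a permissible limit and a weight inversely proportional to that limit. The parameter names, limits, ideal values, and class bands are hypothetical placeholders, not values from the study.

```python
# Illustrative sketch of a weighted arithmetic WQI; NOT the study's confirmed method.
# All parameter names, limits, ideal values, and class bands are hypothetical.

def weighted_arithmetic_wqi(values, limits, ideals):
    """Return sum(w_i * q_i) / sum(w_i), with w_i proportional to 1 / limit_i."""
    inverse_limits = {name: 1.0 / limits[name] for name in values}
    k = 1.0 / sum(inverse_limits.values())  # normalizing constant so weights sum to 1
    weighted_sum = 0.0
    weight_total = 0.0
    for name, observed in values.items():
        weight = k * inverse_limits[name]
        # Quality rating: 0 at the ideal value, 100 at the permissible limit.
        rating = 100.0 * (observed - ideals[name]) / (limits[name] - ideals[name])
        weighted_sum += weight * rating
        weight_total += weight
    return weighted_sum / weight_total


def wqi_class(wqi, bands):
    """Map a WQI value to the first band whose upper bound it does not exceed."""
    for upper_bound, label in bands:
        if wqi <= upper_bound:
            return label
    return "unsuitable"


# Hypothetical inputs for three placeholder parameters.
LIMITS = {"ph": 8.5, "bod_mg_l": 5.0, "nitrate_mg_l": 45.0}
IDEALS = {"ph": 7.0, "bod_mg_l": 0.0, "nitrate_mg_l": 0.0}
BANDS = [(25.0, "excellent"), (50.0, "good"), (75.0, "poor"), (100.0, "very poor")]

sample = {"ph": 7.6, "bod_mg_l": 2.0, "nitrate_mg_l": 10.0}
sample_wqi = weighted_arithmetic_wqi(sample, LIMITS, IDEALS)
print(round(sample_wqi, 1), wqi_class(sample_wqi, BANDS))
```

Under this assumed form, a sample sitting exactly at every permissible limit scores 100 and one at every ideal value scores 0, which makes the class bands easy to interpret.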
Modeling experiments were designed around five models for predicting WQI classes: Decision Tree (DT), Random Forest (RF), Gradient Boosted Trees (GBT), Artificial Neural Network (ANN), and Support Vector Machine (SVM). Each model was trained on 2990 records of input parameters and their WQI classes, and tested on the remaining 605 records under different framework sets. Model performance was evaluated using accuracy, weighted mean recall and precision, and F-score.
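The evaluation described above can be sketched in plain Python. The split sizes come from the abstract; everything else is an assumption. The abstract does not say how class-wise recall and precision are weighted or how the F-score is aggregated, so the hypothetical helpers below default to equal class weights, accept explicit weights, and compute the F-score from the averaged precision and recall. The labels are invented for illustration.

```python
from collections import Counter

# Split sizes reported in the abstract; together they cover the full dataset.
TRAIN_N, TEST_N, TOTAL_N = 2990, 605, 3595
assert TRAIN_N + TEST_N == TOTAL_N


def per_class_precision_recall(y_true, y_pred):
    """Return two dicts (precision, recall) keyed by class label."""
    labels = sorted(set(y_true) | set(y_pred))
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)
    actual = Counter(y_true)
    precision = {c: hits[c] / predicted[c] if predicted[c] else 0.0 for c in labels}
    recall = {c: hits[c] / actual[c] if actual[c] else 0.0 for c in labels}
    return precision, recall


def weighted_mean(scores, weights=None):
    """Average per-class scores; equal weights unless `weights` is given (assumption)."""
    if weights is None:
        weights = {c: 1.0 for c in scores}
    total = sum(weights[c] for c in scores)
    return sum(scores[c] * weights[c] for c in scores) / total


def evaluate(y_true, y_pred, weights=None):
    """Accuracy, weighted mean precision/recall, and F-score of the two averages."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision, recall = per_class_precision_recall(y_true, y_pred)
    p = weighted_mean(precision, weights)
    r = weighted_mean(recall, weights)
    f_score = 2 * p * r / (p + r) if (p + r) else 0.0
    return {"accuracy": accuracy, "precision": p, "recall": r, "f_score": f_score}


# Invented labels standing in for true and predicted WQI classes.
y_true = ["good", "good", "poor", "poor", "poor", "bad"]
y_pred = ["good", "poor", "poor", "poor", "bad", "bad"]

equal_weight_metrics = evaluate(y_true, y_pred)
support_weight_metrics = evaluate(y_true, y_pred, weights=Counter(y_true))
print(equal_weight_metrics)
print(support_weight_metrics)
```

When the weights are the class frequencies in the test set, weighted mean recall equals accuracy, so the two metrics carry different information only under another weighting, such as equal class weights.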
Our study demonstrates that the two largest systems, Ganga and Brahmaputra, lie at opposite extremes of the mean WQI spectrum, reflecting the impact of contrasting population density, industrial activity, land-use/land-cover change, and agricultural use on riverine WQI. Our modeling experiments show that, with only three input parameters, GBT can predict WQI classes with performance metrics above 80%. With only two input parameters, GBT, RF, and ANN can all provide reliable estimates. Our study highlights that ML models can serve as decision-support tools for water resource policymakers and managers in making effective pollution control and water resource management decisions.
Data availability
All data are presented in the main manuscript and the supplementary information.
Acknowledgements
We acknowledge Pandit Deendayal Energy University for its extended support in carrying out this research. We thank Yuvraj Singh Jadon (Lloyds Banking Group, Cardiff, UK) for his timely help in better understanding the ML-related work. The authors thank the two anonymous reviewers for their constructive comments, which helped improve the clarity of the manuscript.
Funding
Not applicable.
Author information
Contributions
Study conception and design were by AD and SS. Machine learning algorithms were implemented by SS and PS. Modeling experiments were performed by SS. The manuscript was written by SS and AD.
Ethics declarations
Consent to participate
Not applicable.
Consent to publish
Not applicable.
Competing interests
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Singh, S., Das, A. & Sharma, P. Predictive modeling of water quality index (WQI) classes in Indian rivers: Insights from the application of multiple Machine Learning (ML) models on a decennial dataset. Stoch Environ Res Risk Assess (2024). https://doi.org/10.1007/s00477-024-02741-z