Abstract
Evaluating water quality is essential for protecting both the environment and human wellbeing. Although earlier studies have explored machine learning for water quality evaluation, little research has addressed classifying groundwater used for irrigation from a reduced set of input parameters while still obtaining satisfactory results, and the feasibility of doing so remains to be established. In this study, we developed machine learning models to simulate the Irrigation Water Quality Index (IWQI), together with an economical model that uses an optimal number of inputs while retaining the highest possible accuracy. We evaluated eight classification algorithms: LightGBM, CatBoost, Extra Trees, Random Forest, Gradient Boosting, Support Vector Machines, Multi-Layer Perceptrons, and K-Nearest Neighbors. Two scenarios were considered. The first used six inputs: conductivity, chloride (\(\mathrm{Cl}^{-}\)), bicarbonate (\(\mathrm{HCO}_3{}^{-}\)), sodium (\(\mathrm{Na}^{+}\)), calcium (\(\mathrm{Ca}^{2+}\)), and magnesium (\(\mathrm{Mg}^{2+}\)). The second used three parameters, total hardness (TH), chloride (\(\mathrm{Cl}^{-}\)), and sulfate (\(\mathrm{SO}_4{}^{2-}\)), which were selected based on the Mutual Information (MI) result. The models achieved satisfactory performance: the LightGBM classifier was the best model with six inputs, yielding a 91.08% F1 score, and the Extra Trees classifier was the best model with three parameters, yielding an 86.30% F1 score. Our findings contribute to the development of accurate and efficient machine learning models for water quality evaluation.
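To make the two quantities named in the abstract concrete, the following is a minimal sketch of Mutual Information ranking and F1 scoring in plain Python. It is not the authors' pipeline. It assumes that each feature has already been binned into discrete levels and that F1 is macro-averaged over classes, neither of which is stated in the abstract. The feature names, bin labels, and toy samples are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of the feature levels
    py = Counter(ys)            # marginal counts of the class labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging mode)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy samples: one informative feature and one noisy feature.
labels = ["good", "good", "poor", "poor", "good", "poor"]
features = {
    "chloride_bin": ["low", "low", "high", "high", "low", "high"],
    "noise_bin": ["a", "b", "a", "b", "a", "b"],
}

# Rank features by how much information they share with the class label.
ranking = sorted(
    features,
    key=lambda name: mutual_information(features[name], labels),
    reverse=True,
)

if __name__ == "__main__":
    for name in ranking:
        print(name, round(mutual_information(features[name], labels), 4))
    print("macro F1:", round(macro_f1(["a", "b", "a", "b"], ["a", "b", "b", "b"]), 4))
```

A feature that separates the classes perfectly reaches the entropy of the label, while an uninformative feature scores close to zero. Keeping only the top-ranked features is one way to arrive at a reduced input set such as the three-parameter scenario. For continuous measurements, a library estimator would normally replace the binning assumed here.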
Data Availability
The study’s data can be accessed by interested parties upon request, subject to ethical, legal, practical, and technical constraints.
Acknowledgements
The authors would like to express their gratitude to all anonymous referees for their helpful remarks and opinions. We are grateful for the assistance given by the Directorate of Scientific Research and Technological Development (DGRST) and the Algerian project PRFU: Classification of Big Data Using Machine Learning Approaches, project number C00L07UN070120220004.
Funding
The authors declare that this study received no financial support or funding.
Author information
Contributions
The study was designed by all authors, with Aymen Zegaar handling data preparation and results analysis. Co-authors Pr. Samira Ounoki and Dr. Abdelmoutia Telli edited and reviewed the final manuscript.
Ethics declarations
Ethical Responsibilities of Authors
The authors have complied with the Instructions for Authors’ "Ethical Responsibilities of Authors" and acknowledge that authorship cannot be altered, except for minor changes, once the paper is submitted.
Conflict of Interest
The authors declare that there are no conflicts of interest regarding the research project.
About this article
Cite this article
Zegaar, A., Ounoki, S. & Telli, A. Machine Learning For Groundwater Quality Classification: A Step Towards Economic and Sustainable Groundwater Quality Assessment Process. Water Resour Manage 38, 621–637 (2024). https://doi.org/10.1007/s11269-023-03690-y
DOI: https://doi.org/10.1007/s11269-023-03690-y