Abstract
Classifying and predicting water quality parameters (WQPs) such as fecal coliform in river waters is crucial for developing a decision support system or tool for water quality protection and water resource management. This study develops Support Vector Machine (SVM) classification and regression models for the Upper Green River Watershed, Kentucky, USA, using linear, polynomial, and Radial Basis Function (RBF) kernels. A sensitivity analysis over a range of gamma and C values is performed for the SVM models to obtain the best predictions of fecal coliform. Further, Least Squares Support Vector Machine (LS-SVM) is employed to strengthen the accuracy of forecasts of individual input parameters. The SVM results are compared with those of Artificial Neural Networks (ANNs) for the same watershed. While the ANN models perform better than the linear and polynomial SVM models, the SVM RBF regression models predict stream water quality as well as or slightly better than the ANN models for the same inputs. The SVM RBF model attains coefficients of determination of 0.91, 0.87, and 0.90 for fecal coliform in training, testing, and overall, respectively; the corresponding values for feed-forward ANNs are 0.82, 0.90, and 0.85. The LS-SVM results indicate that climate parameters are more crucial for water quality modeling than land use parameters.
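As a rough illustration of the quantities named in the abstract, the sketch below writes out the three kernel functions and a coefficient of determination in plain Python. It is not the authors' implementation: the function names, the default values of `degree`, `gamma`, and `coef0`, and the toy inputs are assumptions made here for illustration, and the R² formula shown (one minus the ratio of residual to total sum of squares) is one common definition that may differ from the one used in the study.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, z> + coef0) ** degree.

    The defaults are illustrative assumptions, not values from the study.
    """
    return (gamma * linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - z||^2).

    A larger gamma makes similarity fall off faster with distance.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical two-feature samples, purely for demonstration.
    x, z = [0.0, 0.0], [1.0, 0.0]
    # Show how gamma changes the RBF similarity of the same pair of points.
    for gamma in (0.01, 0.1, 1.0, 10.0):
        print(f"gamma={gamma}: k(x, z)={rbf_kernel(x, z, gamma=gamma):.4f}")
```

In a full sensitivity analysis, each candidate pair of gamma and C would be used to fit a model, and the resulting training and testing R² values would be compared; the loop above only shows the effect of gamma on the kernel itself, since C enters through the model fitting step.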
Data availability
The data is owned by the Department of Biology, Western Kentucky University, USA.
Acknowledgements
The authors appreciate the help of Mr. Tim Rink and Jenna Harbaugh (GIS Analysts), Dr. Stuart Foster, Director of the Kentucky Climate Center, Dr. Ouida Meier, and Prof. Albert J. Meier of Western Kentucky University in providing the required data. The corresponding author would also like to thank the Council of Scientific and Industrial Research (CSIR), India, for grant No. 24(0356)/19/EMR-II for the project titled ‘Experimental and Computational Studies of Surface Water Quality parameters from Morphometry and Spectral Characteristics.’
Funding
The motivation, methodology, and objectives of this manuscript were supported by the Council of Scientific and Industrial Research (CSIR), India, under grant No. 24(0356)/19/EMR-II for the project titled ‘Experimental and Computational Studies of Surface Water Quality parameters from Morphometry and Spectral Characteristics’.
Author information
Contributions
All authors contributed to the study's conception and design. Jagadeesh Anmala formulated the problem and the modeling methodology, and secured the funding. Maitreyee Talnikar performed the modeling and wrote the first draft of the manuscript. Jagadeesh Anmala reviewed the modeling and revised the manuscript significantly. Chandu Parimi reviewed the manuscript, especially the results, discussion, and conclusions sections. Turuganti Venkateswarlu reviewed and verified the modeling simulations and contributed to the revision of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Talnikar, M., Anmala, J., Venkateswarlu, T. et al. Support vector machine (SVM) model development for prediction of fecal coliform of Upper Green River Watershed, Kentucky, USA. Sustain. Water Resour. Manag. 10, 114 (2024). https://doi.org/10.1007/s40899-024-01092-5
DOI: https://doi.org/10.1007/s40899-024-01092-5