Abstract
Laboratory determination of trihalomethanes (THMs) is time-consuming, so a THMs model based on easily obtainable water quality parameters would be useful. This study compared random forest regression (RFR), support vector regression (SVR), and Log-linear regression models for predicting the concentrations of total trihalomethanes (T-THMs), bromodichloromethane (BDCM), and dibromochloromethane (DBCM), using nine water quality parameters as input variables. The models were developed and tested on a dataset of 175 samples collected from a water treatment plant. With its optimal parameter combination, the RFR model outperformed the Log-linear regression model in predicting T-THMs (N25 = 82–88%, rp = 0.70–0.80), while the SVR model performed slightly better than the RFR model in predicting BDCM (N25 = 85–98%, rp = 0.70–0.97). For T-THMs and DBCM, the RFR model outperformed both of the other models. The study concludes that the RFR model is superior overall to the SVR and Log-linear regression models and could be used to monitor THM concentrations in water supply systems.
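The abstract reports model performance as N25 and rp without defining them. The sketch below shows one way these two metrics could be computed, assuming that N25 denotes the percentage of predictions falling within ±25% of the measured value and that rp denotes the Pearson correlation coefficient between measured and predicted concentrations. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def n25(measured, predicted, tolerance=0.25):
    """Percentage of predictions within +/- tolerance of the measured value.

    Assumption: N25 is taken to mean the share of samples whose predicted
    concentration deviates from the measured one by at most 25%.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for m, p in zip(measured, predicted)
        if abs(p - m) <= tolerance * abs(m)
    )
    return 100.0 * hits / len(measured)


def pearson_r(measured, predicted):
    """Pearson correlation coefficient (assumed meaning of rp)."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_m * var_p)


# Hypothetical measured and predicted T-THMs concentrations (not study data).
measured = [20.0, 35.0, 50.0, 42.0]
predicted = [22.0, 30.0, 65.0, 40.0]

print(f"N25 = {n25(measured, predicted):.0f}%")
print(f"rp  = {pearson_r(measured, predicted):.2f}")
```

Under this definition, a model can score a high N25 with a modest rp (or the reverse), which is presumably why the abstract quotes both values for each model.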
Data availability
Data will be made available on request.
Acknowledgements
Financial support was received from the Key Program of the Shanghai Science and Technology Commission (19DZ1204401).
Funding
This work was supported by the Key Program of the Shanghai Science and Technology Commission (19DZ1204401).
Author information
Contributions
H.L.: Acquisition and analysis of data; Methodology; Model testing; Writing. Y.C.: Critical revision for important intellectual content. Y.Z.: Methodology; Polishing; Final approval of the version to be submitted. X.H.: Polishing. S.S.: Investigation; Polishing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Li, H., Chu, Y., Zhu, Y. et al. Trihalomethane prediction model for water supply system based on machine learning and Log-linear regression. Environ Geochem Health 46, 31 (2024). https://doi.org/10.1007/s10653-023-01778-3
DOI: https://doi.org/10.1007/s10653-023-01778-3