Abstract
Measuring water quality parameters is costly and time-consuming, which motivates the use of mathematical models to uncover the relationships among them. This paper presents a data mining technique and an improved version of it for estimating water quality parameters. Surface water and groundwater quality data from Hamedan (Iran), collected between 2006 and 2015, were analyzed using the M5 model tree and a modified version optimized with the Excel Solver Platform (ESP). Electrical conductivity (EC), total dissolved solids (TDS), sodium adsorption ratio (SAR), and total hardness (TH) were the target variables, whereas pH and the concentrations of sodium (Na), chlorine (Cl), bicarbonate (HCO3), sulfate (SO4), magnesium (Mg), calcium (Ca), and potassium (K) were the inputs. The results showed that, for both sources, pH was the least influential parameter on EC, TDS, SAR, and TH. Among the target parameters, the models estimated TH more accurately than the others, whereas SAR proved to be a complex variable to model. Comparing the M5 and M5-ESP models showed that applying the ESP significantly decreased the normalized root mean square error (NRMSE) of the M5 model: the mean NRMSE decreased by 18.95% and 20.29% in estimating groundwater and surface water quality parameters, respectively. Moreover, both the M5 and M5-ESP models estimated the target parameters of groundwater better than those of surface water.
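To make the reported error metric concrete, the following is a minimal sketch of how an NRMSE and a percentage decrease between two models could be computed. The abstract does not define the normalization, so dividing the RMSE by the mean of the observed values is an assumption here (normalizing by the observed range is another common choice). The function names `nrmse` and `percent_decrease` are hypothetical and are not taken from the paper.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values.

    Assumption: mean-based normalization; the paper may use another one.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return math.sqrt(mse) / mean_observed


def percent_decrease(before, after):
    """Relative decrease of an error metric, in percent of the original value."""
    return 100.0 * (before - after) / before
```

Under this definition, a base-model NRMSE of 0.50 that drops to 0.40 after optimization corresponds to a 20% decrease, which is the kind of figure the abstract reports as a mean across target parameters.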
Data availability
Data were obtained from the Ministry of Energy, Regional Water Company of Hamedan, Iran.
Acknowledgments
The authors thank the data provider. We also thank the editors and reviewers for their constructive comments.
Funding
This research received no external funding.
Author information
Contributions
Formal analysis and conceptualization: Maryam Bayatvarkeshi and Zaher Mundher Yaseen; methodology: Maryam Bayatvarkeshi and Mahtab Zarei; writing of the original draft: Maryam Bayatvarkeshi, Zaher Mundher Yaseen, and Monzur Imteaz; project administration: Zaher Mundher Yaseen; manuscript revision: Zaher Mundher Yaseen and Ozgur Kisi; supervision: Ozgur Kisi and Zaher Mundher Yaseen.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
The manuscript was prepared in accordance with the journal's ethical guidelines.
Consent to participate
The research does not involve human participants and/or animals.
Consent to publish
The research was conducted following the consent ethics of Environmental Science and Pollution Research.
Additional information
Responsible editor: Xianliang Yi
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Bayatvarkeshi, M., Imteaz, M., Kisi, O. et al. Application of M5 model tree optimized with Excel Solver Platform for water quality parameter estimation. Environ Sci Pollut Res 28, 7347–7364 (2021). https://doi.org/10.1007/s11356-020-11047-w
DOI: https://doi.org/10.1007/s11356-020-11047-w