Abstract
Data mining (DM) is a highly useful research field with many real-life applications, including in water science. It is the process of extracting valuable information from large volumes of data stored in various databases. The extracted information takes the form of patterns, associations, changes, anomalies, and significant structures. In water resources management and environmental engineering, predicting and modelling parameters plays an integral role in decision making. Rivers are the most critical freshwater resource for millions of people worldwide, and they are dynamic in nature (floods/droughts) in terms of both the quantity and the quality of available freshwater. Depending on basin characteristics, river flow and sediment regime may be influenced by natural processes such as erosion and sediment transport as well as by anthropogenic factors such as urban stormwater runoff and the discharge of semi-treated sanitary/industrial sewage. Artificial intelligence (AI) techniques are therefore used to decrease model development costs and reduce prediction errors, yielding more efficient models. In this chapter, some well-known techniques and AI-based methods are introduced, and their applications are elaborated. The models comprise extreme learning machine (ELM), least squares support vector machine (LSSVM), genetic programming (GP), adaptive neuro-fuzzy inference system (ANFIS), and multivariate adaptive regression splines (MARS). Each technique is then illustrated with a brief literature review. After its basic concept is presented, each method is described through its mathematical formulation. Finally, pseudocode for each method is provided as a practical guideline for implementation. This chapter is intended for graduate students, researchers, educators, and practitioners interested in engineering optimization.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Shirvani-Hosseini, S., Samadi-Koucheksaraee, A., Ahmadianfar, I., Gharabaghi, B. (2022). Data Mining Methods for Modeling in Water Science. In: Bozorg-Haddad, O., Zolghadr-Asli, B. (eds) Computational Intelligence for Water and Environmental Sciences. Studies in Computational Intelligence, vol 1043. Springer, Singapore. https://doi.org/10.1007/978-981-19-2519-1_8
DOI: https://doi.org/10.1007/978-981-19-2519-1_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-2518-4
Online ISBN: 978-981-19-2519-1
eBook Packages: Intelligent Technologies and Robotics (R0)