Abstract
Predicting the failure or success of an event or value is a problem that has recently been addressed using data mining techniques. By combining information from the past with information from the present, we can improve the chance of making the best decision about a future event. In this paper, we evaluate several popular classification algorithms for modeling a water quality detection system. The experiment is carried out using data gathered from the Thüringer Fernwasserversorgung water company. We briefly introduce the baseline steps we followed to obtain a decent model for this binary classification problem. We describe the algorithms we used and the purpose of each, and we conclude with a final best model. Representative models are compared using the F1 score as the performance measure. Finding the best model allows for early recognition of undesirable changes in drinking water quality and enables water supply companies to take countermeasures in time.
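For readers unfamiliar with the metric, the F1 score used for the comparison is the harmonic mean of precision and recall on the positive class. The following is a minimal sketch, not the authors' evaluation code: the function name `f1_score_binary`, the encoding of the event class as label 1, and the toy labels are all assumptions made for illustration.

```python
def f1_score_binary(y_true, y_pred):
    """F1 score for the positive class (assumed to be label 1) of a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: precision or recall is zero or
        # undefined, so F1 is reported as 0 by convention here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 marks an undesirable water quality event, 0 normal operation.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

Because F1 ignores true negatives, it is a common choice when the event class is much rarer than the normal class, where plain accuracy can look high for a model that never predicts an event.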
Copyright information
© 2018 Springer International Publishing AG
About this chapter
Cite this chapter
Muharemi, F., Logofătu, D., Andersson, C., Leon, F. (2018). Approaches to Building a Detection Model for Water Quality: A Case Study. In: Sieminski, A., Kozierkiewicz, A., Nunez, M., Ha, Q. (eds) Modern Approaches for Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 769. Springer, Cham. https://doi.org/10.1007/978-3-319-76081-0_15
DOI: https://doi.org/10.1007/978-3-319-76081-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76080-3
Online ISBN: 978-3-319-76081-0
eBook Packages: Engineering, Engineering (R0)