Anomaly Detection Procedures in a Real World Dataset by Using Deep-Learning Approaches

  • Alabbas Alhaj Ali
  • Abdul Rasheeq
  • Doina Logofătu
  • Costin Bădică
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11431)

Abstract

Water covers 71% of the Earth’s surface and is vital for all known forms of life, so the quality of drinking water is very important. Major chemical elements at concentrations below the desirable limit are good for health, but concentrations above that limit may have adverse effects on human health. Major problems faced by the world population are due to excess fluoride, sulfate, chloride, nitrate, and sodium in water. In this paper, we address the problem of changes in drinking water quality and the crucial task that public water companies have of monitoring it. Requirements for drinking water quality monitoring change frequently, e.g., due to contamination by civilization itself or within the supply and distribution network. The proposed methods are the K-Nearest Neighbour (KNN) algorithm and a classification neural network based on logistic regression, with the aim of obtaining an appropriate solution in an adequate period of time. The paper also compares the results of the proposed methods with those of other methods applied in previous work. All experiments are carried out using data gathered from the Thüringer Fernwasserversorgung (TFW) water company.
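The two classifiers named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic two-feature readings rather than the TFW dataset, a single logistic unit stands in for the classification neural network, and every name and parameter here (`make_data`, `k`, `lr`, `epochs`, the cluster centres) is an assumption chosen for illustration.

```python
import math
import random

def make_data(n, seed):
    """Synthetic stand-in for sensor readings (assumption, not TFW data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 10 == 0 else 0        # one reading in ten is an anomaly
        centre = 3.0 if label else 0.0         # anomalies sit away from normal readings
        X.append((rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)))
        y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    nearest = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi   # sigmoid output minus target
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def logistic_predict(w, b, x):
    """Predict 1 (anomaly) when the linear score is positive."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if z > 0 else 0

def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)

train_X, train_y = make_data(200, seed=0)
test_X, test_y = make_data(50, seed=1)

w, b = train_logistic(train_X, train_y)
knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x), test_X, test_y)
lr_acc = accuracy(lambda x: logistic_predict(w, b, x), test_X, test_y)
print(f"KNN accuracy: {knn_acc:.2f}, logistic regression accuracy: {lr_acc:.2f}")
```

On data this cleanly separated both classifiers should agree on almost every reading. Real water-quality data are far more imbalanced and noisy, so accuracy alone would be a weak basis for comparing methods there.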

Keywords

Drinking water quality, Data analysis, Neural network, Logistic regression, K-Nearest Neighbour (KNN)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Alabbas Alhaj Ali (1)
  • Abdul Rasheeq (1)
  • Doina Logofătu (1)
  • Costin Bădică (2)
  1. Department of Computer Science and Engineering, Frankfurt University of Applied Sciences, Frankfurt am Main, Germany
  2. Department of Computer Sciences and Information Technology, University of Craiova, Craiova, Romania
