Anomaly Detection Through Comparison of Heterogeneous Machine Learning Classifiers vs KPCA

  • Conference paper
Security in Computing and Communications (SSCC 2015)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 536))

Abstract

Anomaly detection applies to a wide range of critical infrastructure elements, because the anomalies that occur change frequently and the threats that are regularly identified must be avoided. From this perspective, we have to identify abnormal patterns in applications and model them with new machine learning classifiers. In this paper we investigate performance by comparing heterogeneous machine learning classifiers: ICA (Independent Component Analysis), LDA (Linear Discriminant Analysis), PCA (Principal Component Analysis), Kernel PCA and other learning classifiers. Kernel PCA (KPCA) is a non-linear extension of PCA that is used to classify data and detect anomalies through an orthogonal transformation of the input space into a (usually high-dimensional) feature space. KPCA uses the kernel trick to extract the principal components from the set of corresponding eigenvectors, and uses the kernel width as the performance parameter that determines the classification rate. KPCA is applied to two datasets from the UCI Machine Learning Repository and one real bank dataset, using the classic Gaussian kernel internally. Finally, KPCA performance is compared with projection methods (ICA, LDA, PLSDA and PCA), another kernel technique (SVM-K) and non-kernel techniques (ID3, C4.5, Rule C4.5, k-NN and NB) applied to the same datasets using training and test set combinations.
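
To make the KPCA steps named in the abstract concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It builds a Gaussian kernel matrix, centres it in feature space, extracts the leading eigenvectors, and scores a point by its reconstruction error in feature space. The scoring rule, the power-iteration eigen-solver, the function names (`fit_kpca`, `novelty_score`) and the synthetic data are all assumptions chosen for illustration; the paper's datasets and exact decision rule are not reproduced here.

```python
import math
import random


def gaussian_kernel(a, b, width):
    """Gaussian kernel; `width` is the kernel width (sigma)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def fit_kpca(train, width, n_components=2, n_iter=300):
    """Fit kernel PCA on `train` (hypothetical helper, illustration only)."""
    n = len(train)
    k = [[gaussian_kernel(a, b, width) for b in train] for a in train]
    row_mean = [sum(row) / n for row in k]
    total_mean = sum(row_mean) / n
    # Centre the kernel matrix, i.e. subtract the feature-space mean.
    work = [[k[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]
    rng = random.Random(1)
    alphas = []
    for _ in range(n_components):
        # Power iteration finds the dominant eigenvector of `work`.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(n_iter):
            w = matvec(work, v)
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        eigval = dot(v, matvec(work, v))
        # Deflate so the next pass finds the next eigenvector.
        work = [[work[i][j] - eigval * v[i] * v[j] for j in range(n)]
                for i in range(n)]
        # Rescale so the principal direction has unit length in feature space.
        alphas.append([vi / math.sqrt(eigval) for vi in v])
    return {"train": train, "width": width, "row_mean": row_mean,
            "total_mean": total_mean, "alphas": alphas}


def novelty_score(model, z):
    """Reconstruction error of z in feature space (larger = more anomalous)."""
    train, width = model["train"], model["width"]
    n = len(train)
    kz = [gaussian_kernel(z, x, width) for x in train]
    kz_mean = sum(kz) / n
    # Squared feature-space distance from z to the training mean.
    potential = gaussian_kernel(z, z, width) - 2.0 * kz_mean + model["total_mean"]
    kz_centred = [kz[i] - kz_mean - model["row_mean"][i] + model["total_mean"]
                  for i in range(n)]
    projections = [dot(alpha, kz_centred) for alpha in model["alphas"]]
    return potential - sum(p * p for p in projections)


# Synthetic "normal" data (an assumption, not one of the paper's datasets).
data_rng = random.Random(0)
train = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 0.3)) for _ in range(40)]
model = fit_kpca(train, width=1.0)
print("near the training data:", novelty_score(model, (0.1, 0.0)))
print("far from the training data:", novelty_score(model, (10.0, 10.0)))
```

In this sketch the `width` argument plays the role of the kernel width: a very small width makes every point look isolated, while a very large one flattens the differences between points, so in practice it would be tuned on held-out data.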

Author information

Corresponding author

Correspondence to Goverdhan Reddy Jidiga.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jidiga, G.R., Sammulal, P. (2015). Anomaly Detection Through Comparison of Heterogeneous Machine Learning Classifiers vs KPCA. In: Abawajy, J., Mukherjea, S., Thampi, S., Ruiz-Martínez, A. (eds) Security in Computing and Communications. SSCC 2015. Communications in Computer and Information Science, vol 536. Springer, Cham. https://doi.org/10.1007/978-3-319-22915-7_44

  • DOI: https://doi.org/10.1007/978-3-319-22915-7_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22914-0

  • Online ISBN: 978-3-319-22915-7

  • eBook Packages: Computer Science, Computer Science (R0)