Anomaly Detection Through Comparison of Heterogeneous Machine Learning Classifiers vs KPCA

Jidiga, Goverdhan Reddy; Sammulal, P.

doi:10.1007/978-3-319-22915-7_44

Part of the book series: Communications in Computer and Information Science (CCIS, volume 536)

Included in the following conference series:

International Symposium on Security in Computing and Communication

Abstract

Anomaly detection is applicable to a wide range of critical infrastructure elements, because anomaly occurrences change frequently and the threats identified in regular operation must be avoided. From this perspective, abnormal patterns in applications have to be identified and modeled using new machine learning classifiers. In this paper we investigate performance by comparing heterogeneous machine learning classifiers: ICA (Independent Component Analysis), LDA (Linear Discriminant Analysis), PCA (Principal Component Analysis), Kernel PCA and other learning classifiers. Kernel PCA (KPCA) is a non-linear extension of PCA that classifies the data and detects anomalies by mapping the input space into a (usually high-dimensional) feature space and applying the orthogonal transformation of PCA there. KPCA uses the kernel trick to extract the principal components from the set of corresponding eigenvectors, and uses the kernel width as the performance parameter that determines the classification rate. KPCA is applied to two datasets from the UCI Machine Learning Repository and one real bank dataset, and is implemented with the classic Gaussian kernel. Finally, the performance of KPCA is compared with projection methods (ICA, LDA, PLSDA and PCA), another kernel technique (SVM-K) and non-kernel techniques (ID3, C4.5, Rule C4.5, k-NN and NB) applied to the same datasets using training and test set combinations.
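
To make the role of the Gaussian kernel and its width concrete, the following is a minimal sketch of KPCA-based anomaly scoring, not the authors' implementation. It assumes the kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) and scores each point by its reconstruction error in feature space, which is one common way to turn KPCA into an anomaly detector; the abstract does not say which decision rule the paper uses. The class name `KPCANovelty`, the parameters `sigma` and `n_components`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 * sigma^2))."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

class KPCANovelty:
    """Hypothetical helper: KPCA with a feature-space reconstruction-error score."""

    def __init__(self, sigma=1.0, n_components=2):
        self.sigma = sigma                # kernel width (assumed tuning parameter)
        self.n_components = n_components  # number of principal components kept

    def fit(self, X):
        self.X = np.asarray(X, dtype=float)
        K = gaussian_kernel(self.X, self.X, self.sigma)
        self.col_mean = K.mean(axis=0)
        self.total_mean = K.mean()
        # Center the kernel matrix, i.e. center the data in feature space.
        Kc = K - K.mean(axis=0)[None, :] - K.mean(axis=1)[:, None] + self.total_mean
        w, v = np.linalg.eigh(Kc)
        top = np.argsort(w)[::-1][: self.n_components]
        w, v = w[top], v[:, top]
        if np.any(w <= 1e-12):
            raise ValueError("n_components exceeds the numerical rank of the kernel matrix")
        # Scale coefficients so each feature-space eigenvector has unit norm.
        self.alpha = v / np.sqrt(w)
        return self

    def score(self, Z):
        Z = np.asarray(Z, dtype=float)
        Kz = gaussian_kernel(Z, self.X, self.sigma)
        # Center the test kernel rows consistently with the training data.
        Kzc = Kz - Kz.mean(axis=1)[:, None] - self.col_mean[None, :] + self.total_mean
        proj = Kzc @ self.alpha
        # Squared norm of the centered feature vector; k(z, z) = 1 for this kernel.
        norm2 = 1.0 - 2.0 * Kz.mean(axis=1) + self.total_mean
        # Reconstruction error: the part not explained by the kept components.
        return norm2 - (proj ** 2).sum(axis=1)

# Synthetic illustration: a tight cluster of "normal" points and two queries.
rng = np.random.default_rng(0)
X_train = 0.3 * rng.standard_normal((50, 2))
queries = np.array([[0.0, 0.0],   # inside the cluster
                    [5.0, 5.0]])  # far from the cluster
model = KPCANovelty(sigma=1.0, n_components=2).fit(X_train)
scores = model.score(queries)
print(scores)  # the second (distant) point should score higher
```

A larger score means the point is less well described by the principal components learned from the training data. In practice a threshold on the score, and the value of `sigma`, would have to be chosen on held-out data.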

Author information

Authors and Affiliations

Department of Technical Education, Government of Telangana, Hyderabad, India
Goverdhan Reddy Jidiga
JNTUH College of Engineering, JNTU University, Karimnagar, Hyderabad, India
P. Sammulal

Corresponding author

Correspondence to Goverdhan Reddy Jidiga.

Editor information

Editors and Affiliations

Deakin University, Geelong, Victoria, Australia
Jemal H. Abawajy
IBM Research-India, New Delhi, India
Sougata Mukherjea
Indian Institute of Information Technology and Management, Kerala, India
Sabu M. Thampi
University of Murcia, Espinardo, Murcia, Spain
Antonio Ruiz-Martínez

About this paper

Cite this paper

Jidiga, G.R., Sammulal, P. (2015). Anomaly Detection Through Comparison of Heterogeneous Machine Learning Classifiers vs KPCA. In: Abawajy, J., Mukherjea, S., Thampi, S., Ruiz-Martínez, A. (eds) Security in Computing and Communications. SSCC 2015. Communications in Computer and Information Science, vol 536. Springer, Cham. https://doi.org/10.1007/978-3-319-22915-7_44

DOI: https://doi.org/10.1007/978-3-319-22915-7_44
Published: 08 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22914-0
Online ISBN: 978-3-319-22915-7
eBook Packages: Computer Science, Computer Science (R0)
