Abstract
Anomaly detection applies to a wide range of critical infrastructure elements, where anomaly patterns change frequently and regularly identified threats must be avoided. From this perspective, abnormal patterns in applications must be identified and modelled using modern machine learning classifiers. In this paper we investigate performance by comparing heterogeneous machine learning classifiers: ICA (Independent Component Analysis), LDA (Linear Discriminant Analysis), PCA (Principal Component Analysis), Kernel PCA and other learning classifiers. Kernel PCA (KPCA) is a non-linear extension of PCA used to classify data and detect anomalies by an orthogonal transformation of the input space into a (usually high-dimensional) feature space. KPCA uses the kernel trick to extract the principal components from the corresponding eigenvectors, and the kernel width serves as a performance parameter that determines the classification rate. KPCA is implemented on two UCI machine learning repository datasets and one real bank dataset, using the classic Gaussian kernel internally. Finally, KPCA performance is compared with projection methods (ICA, LDA, PLSDA and PCA), another kernel technique (SVM-K) and non-kernel techniques (ID3, C4.5, Rule C4.5, k-NN and NB) applied to the same datasets using combinations of training and test sets.
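The KPCA procedure described in the abstract (Gaussian kernel, kernel trick, principal components extracted from the eigenvectors of the kernel matrix, kernel width as a tuning parameter) can be sketched as follows. This is a minimal illustrative implementation using only NumPy, not the authors' code; the function names, the synthetic two-cluster data, and the kernel width value are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_matrix(X, width):
    # K[i, j] = exp(-||x_i - x_j||^2 / (2 * width^2))
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-d2 / (2.0 * width ** 2))

def kernel_pca(X, n_components=2, width=1.0):
    n = X.shape[0]
    K = gaussian_kernel_matrix(X, width)
    # centre the kernel matrix in feature space: Kc = H K H, H = I - 11^T/n
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigendecomposition of the centred kernel matrix; keep leading eigenpairs
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # scale eigenvectors by 1/sqrt(lambda) so feature-space components are unit norm
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    # projections of the training points onto the kernel principal components
    return Kc @ alphas

# tiny synthetic example: two well-separated noisy clusters in 3-D
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 3)),
               rng.normal(3.0, 0.1, (10, 3))])
Z = kernel_pca(X, n_components=2, width=1.0)
```

With a well-chosen kernel width, points from the two clusters receive projections of opposite sign on the first kernel principal component, which is the property an anomaly detector built on KPCA exploits: observations that project far from the bulk of the training data in the leading components are flagged as anomalous.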
© 2015 Springer International Publishing Switzerland
Jidiga, G.R., Sammulal, P. (2015). Anomaly Detection Through Comparison of Heterogeneous Machine Learning Classifiers vs KPCA. In: Abawajy, J., Mukherjea, S., Thampi, S., Ruiz-Martínez, A. (eds) Security in Computing and Communications. SSCC 2015. Communications in Computer and Information Science, vol 536. Springer, Cham. https://doi.org/10.1007/978-3-319-22915-7_44
Print ISBN: 978-3-319-22914-0
Online ISBN: 978-3-319-22915-7