Abstract
In this paper, we apply hidden Markov model (HMM) based sequence classification to misuse-based intrusion detection. An HMM is a statistical Markov model that regards the system as a group of observable states and hidden states. We use HMMs to detect intrusive program traces in public benchmark datasets, including the University of New Mexico (UNM) and Massachusetts Institute of Technology Lincoln Laboratory (MIT LL) datasets. We compare the performance of the HMM with that of the Naïve Bayes (NB) classification algorithm, support vector machines (SVM), and other basic machine learning algorithms. Our experimental results on the UNM and MIT LL datasets show that the HMM achieves performance comparable to that of previous methods.
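As a rough illustration of HMM-based sequence classification, the sketch below scores a trace under one HMM per class with the forward algorithm and picks the class whose model gives the higher likelihood. It is a minimal sketch under stated assumptions, not the setup used in the experiments: the two-state models, their probabilities, the three-symbol alphabet standing in for system call identifiers, and the function names are all hypothetical. In practice the parameters would be estimated from labeled traces, for example with the Baum-Welch algorithm.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the forward algorithm in log space.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from hidden state i to j
    emit[i][o]  : probability that hidden state i emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Pick the label whose HMM assigns the trace the highest likelihood."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))


# Hypothetical toy models (assumed values, not learned from UNM or MIT LL data).
# Symbols 0, 1, 2 stand in for three system call identifiers.
MODELS = {
    "normal": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.80, 0.15, 0.05], [0.15, 0.80, 0.05]],
    ),
    "intrusion": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.1, 0.1, 0.8], [0.2, 0.2, 0.6]],
    ),
}

if __name__ == "__main__":
    for trace in ([0, 1, 0, 0, 1], [2, 2, 1, 2, 2]):
        print(trace, "->", classify(trace, MODELS))
```

Working in log space avoids numerical underflow on long traces, which matters because program traces can contain many thousands of system calls.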
Acknowledgements
This work was supported by Dongseo University, “Dongseo Frontier Project” Research Fund of 2015.
Cite this article
Cha, KH., Kang, DK. Experimental analysis of hidden Markov model based secure misuse intrusion trace classification and hacking detection. J Comput Virol Hack Tech 13, 233–238 (2017). https://doi.org/10.1007/s11416-017-0293-7
DOI: https://doi.org/10.1007/s11416-017-0293-7