Abstract
Intrusion detection is essential for countering the growing number of security threats. As new types of attack appear continually, traditional approaches to detecting hazardous content face a severe challenge. This work proposes a new feature grouping method for selecting features for intrusion detection. The method is based on agglomerative hierarchical clustering combined with mutual information, and it is tested on the KDD CUP 99 dataset. Agglomerative hierarchical clustering is used to construct a hierarchical tree over the features, and the tree is then cut into a given number of groups. Within each group, the feature that has the largest mutual information with the class label is selected. The performance evaluation results show that better classification performance can be attained with the features selected in this way.
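The grouping-and-selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the average-linkage rule, and the distance 1 / (1 + MI) between two features are assumptions made for illustration, and the features are assumed to be discrete (continuous KDD CUP 99 attributes would need to be discretized first).

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def group_features(columns, num_groups):
    """Agglomerative (bottom-up) clustering of feature columns.

    Assumption: the distance between two features is 1 / (1 + MI), so
    strongly dependent features merge first, and groups are compared by
    average linkage. The paper may use a different measure or linkage.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    names = list(columns)
    dist = {
        frozenset((a, b)): 1.0 / (1.0 + mutual_information(columns[a], columns[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    groups = [[name] for name in names]  # every feature starts as its own group

    def linkage(g1, g2):
        # Average pairwise distance between the members of two groups.
        total = sum(dist[frozenset((a, b))] for a in g1 for b in g2)
        return total / (len(g1) * len(g2))

    # Merge the two closest groups until the requested number remains;
    # this is equivalent to cutting the hierarchical tree at that level.
    while len(groups) > num_groups:
        i, j = min(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda pair: linkage(groups[pair[0]], groups[pair[1]]),
        )
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


def select_features(columns, labels, num_groups):
    """Pick, from each group, the feature with the largest MI with the label."""
    groups = group_features(columns, num_groups)
    return [
        max(group, key=lambda f: mutual_information(columns[f], labels))
        for group in groups
    ]


# Toy data (hypothetical, not taken from KDD CUP 99).
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_noisy": [0, 0, 0, 1, 1, 1, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "b_copy": [1, 0, 1, 0, 1, 0, 1, 0],
}
print(group_features(toy_columns, 2))
print(select_features(toy_columns, toy_labels, 2))
```

On the toy data, the two mutually redundant pairs end up in separate groups, and one representative is kept per group, which is the intended effect of grouping before selection: redundant features compete with each other rather than crowding out features from other groups.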
Copyright information
© 2014 IFIP International Federation for Information Processing
About this paper
Cite this paper
Song, J., Zhu, Z., Price, C. (2014). Feature Grouping for Intrusion Detection System Based on Hierarchical Clustering. In: Teufel, S., Min, T.A., You, I., Weippl, E. (eds) Availability, Reliability, and Security in Information Systems. CD-ARES 2014. Lecture Notes in Computer Science, vol 8708. Springer, Cham. https://doi.org/10.1007/978-3-319-10975-6_21
DOI: https://doi.org/10.1007/978-3-319-10975-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10974-9
Online ISBN: 978-3-319-10975-6
eBook Packages: Computer Science, Computer Science (R0)