Abstract
A major drawback of signature-based intrusion detection systems is their inability to detect novel attacks that do not match the known signatures already stored in the database. Anomaly detection is a kind of intrusion detection in which the activities of a system are monitored and classified as normal or anomalous based on their expected behavior. Tree-based classifiers have been successfully used to separate abnormal behavior from normal behavior. Tree pruning is a machine learning technique used to minimize the size of a decision tree (DT) in order to reduce the complexity of the classifier and improve its predictive accuracy. In this paper, we attempt to prune a DT using the particle swarm optimization (PSO) algorithm and apply it to the network intrusion detection problem. The proposed technique is a hybrid approach in which PSO is used for node pruning and the pruned DT is used to classify network intrusions. Both single-objective and multi-objective PSO algorithms are used in the proposed approach. The experiments are carried out on the well-known KDD99Cup dataset, which has been widely used as a benchmark for network intrusion detection problems. The results of the proposed technique are compared with those of other state-of-the-art classifiers, and the proposed technique is observed to perform better than the other classifiers in terms of intrusion detection rate, false positive rate, accuracy, and precision.
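To make the idea of PSO-driven node pruning concrete, the following is a minimal sketch of a single-objective binary PSO, in the style of the discrete binary variant by Kennedy and Eberhart, searching over a bitmask in which bit j = 1 means "keep the subtree under internal node j" and bit j = 0 means "collapse it to a leaf". It is not the authors' implementation. The function names, the inertia weight `w`, the coefficients `c1` and `c2`, the velocity clamp `v_max`, and the toy fitness (a made-up per-node accuracy gain minus a size penalty, ignoring the tree hierarchy) are all assumptions for illustration; a real fitness would evaluate the pruned DT on validation data.

```python
import math
import random

# Hypothetical per-node accuracy gains for keeping each subtree.
# Negative values stand in for subtrees that overfit. Illustrative only.
GAINS = [0.08, -0.02, 0.05, -0.01, 0.03, -0.04, 0.06, 0.00]
SIZE_PENALTY = 0.01  # assumed cost per kept internal node


def toy_fitness(mask):
    """Stand-in for 'validation accuracy minus tree-size penalty'."""
    return sum((g - SIZE_PENALTY) * bit for g, bit in zip(GAINS, mask))


def binary_pso(fitness, n_bits, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize fitness(mask) over 0/1 lists of length n_bits."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                # Velocity is pulled toward personal and global bests.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Sigmoid of velocity gives the probability of the bit being 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


if __name__ == "__main__":
    mask, score = binary_pso(toy_fitness, len(GAINS))
    print("keep mask:", mask, "fitness:", round(score, 4))
```

A multi-objective variant would replace the scalar fitness with a vector (for example, detection rate and false positive rate) and keep a set of non-dominated masks instead of a single global best; that extension is not shown here.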
Acknowledgements
The authors would like to extend their sincere appreciation to the Deanship of Scientific Research at King Saud University for its funding of this research through the Research Group Project no. RGP-214.
Cite this article
Malik, A.J., Khan, F.A. A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput 21, 667–680 (2018). https://doi.org/10.1007/s10586-017-0971-8
DOI: https://doi.org/10.1007/s10586-017-0971-8