Feature selection for intrusion detection using new multi-objective estimation of distribution algorithms

Applied Intelligence

Abstract

Handling a large number of features has become a critical problem in Intrusion Detection Systems (IDS). Feature Selection (FS) is therefore used to select the significant features, which reduces computational complexity and improves classification performance. In this paper, we present a new multi-objective feature selection algorithm, MOEDAFS (Multi-Objective Estimation of Distribution Algorithms (EDA) for Feature Selection). MOEDAFS is based on EDA and Mutual Information (MI): EDA is used to explore the search space, and MI is integrated as a probabilistic model that guides the search by modeling the redundancy and relevance relations between features. We propose four probabilistic models for MOEDAFS. MOEDAFS selects the best feature subsets (non-dominated solutions), those with better detection accuracy and a smaller number of features, using two objective functions: minimizing the classification Error Rate (ER) and minimizing the Number of Features (NF). To demonstrate the performance of MOEDAFS, we design a comparative study with internal and external comparisons on the NSL-KDD dataset. The internal comparison is performed between the four versions of MOEDAFS. The external comparison is made against well-known deterministic, metaheuristic, and multi-objective feature selection algorithms that return either a single solution or multiple solutions. Experimental results demonstrate that MOEDAFS outperforms recent algorithms.
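The ingredients named in the abstract (an estimation of distribution loop, mutual information between discrete variables, and non-dominated selection on the pair (ER, NF)) can be illustrated with a minimal Python sketch. This is not the authors' MOEDAFS: the abstract does not specify the four MI-based probabilistic models, so the sketch substitutes a plain univariate marginal-frequency model, and every name in it (`mutual_information`, `non_dominated`, `eda_feature_selection`, `toy_error`) is hypothetical. The error-rate function is a toy stand-in for a real classifier evaluated on a dataset such as NSL-KDD.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(solutions, objectives):
    """Keep the solutions whose objective vectors no other solution dominates."""
    scored = [(s, objectives(s)) for s in solutions]
    return [s for s, f in scored if not any(dominates(g, f) for _, g in scored)]


def eda_feature_selection(n_features, error_rate, generations=20, pop_size=30, seed=0):
    """Univariate EDA over binary feature masks with a Pareto archive.

    Assumption: the model is a vector of independent selection probabilities,
    re-estimated from the archive. The paper's MI-based models are not reproduced.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_features  # P(feature i is selected)
    archive = []

    def objectives(mask):
        return (error_rate(mask), sum(mask))  # (ER, NF), both minimized

    for _ in range(generations):
        # Sample a population of feature masks from the current model.
        population = [
            tuple(int(rng.random() < p) for p in probs) for _ in range(pop_size)
        ]
        # Merge with the archive and keep only non-dominated masks.
        candidates = sorted(set(archive) | set(population))
        archive = non_dominated(candidates, objectives)
        # Re-estimate marginals from the archive, clamped to keep some exploration.
        probs = [
            min(0.95, max(0.05, sum(m[i] for m in archive) / len(archive)))
            for i in range(n_features)
        ]
    return archive


def toy_error(mask):
    """Hypothetical error rate: only features 0 and 2 reduce the error."""
    return 0.5 - 0.2 * mask[0] - 0.2 * mask[2]


if __name__ == "__main__":
    for mask in eda_feature_selection(5, toy_error):
        print(mask, toy_error(mask), sum(mask))
```

In this sketch `mutual_information` is shown only as the quantity one could use to score relevance (feature versus class label) or redundancy (feature versus feature); it is deliberately not wired into the sampling model, since how MI enters the model is exactly what the paper's four variants define.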

Author information

Corresponding author

Correspondence to Sofiane Maza.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Maza, S., Touahria, M. Feature selection for intrusion detection using new multi-objective estimation of distribution algorithms. Appl Intell 49, 4237–4257 (2019). https://doi.org/10.1007/s10489-019-01503-7

  • DOI: https://doi.org/10.1007/s10489-019-01503-7
