Artificial Intelligence Review, Volume 51, Issue 3, pp 403–443

A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers

  • Ahmed I. Saleh
  • Fatma M. Talaat
  • Labib M. Labib


An Intrusion Detection System (IDS) is an effective security tool that helps prevent unauthorized access to network resources by analyzing network traffic. However, due to the large amount of data flowing over the network, effective real-time intrusion detection is almost impossible. The goal of this paper is to design a Hybrid IDS (HIDS) that can be employed in real time and is suitable for the multi-class classification problem. HIDS relies on a Naïve Bayes feature selection (NBFS) technique, which is used to reduce the dimensionality of the sample data. HIDS also offers a feature that other techniques do not have, namely outlier rejection. Outliers are noisy input samples that can lead to a high misclassification rate if they are used for model training. Outlier rejection is accomplished by applying a distance-based methodology to choose the most informative training examples, which are then used to train an Optimized Support Vector Machine (OSVM). The trained OSVM is then employed to reject outliers. Finally, after outlier rejection, HIDS detects attacks by applying a Prioritized K-Nearest Neighbors (PKNN) classifier. Hence, HIDS is a three-pronged strategy with three main contributions: (i) NBFS, which is employed for dimensionality reduction, (ii) OSVM, which is applied for outlier rejection, and (iii) PKNN, which is used for detecting input attacks. HIDS has been compared against recent techniques using three well-known intrusion detection datasets: KDD Cup ’99, NSL-KDD and Kyoto 2006+. HIDS can detect attacks quickly and can accordingly be employed for real-time intrusion detection. Thanks to OSVM and PKNN, HIDS achieved high detection rates, particularly for rare attacks such as R2L and U2R. PKNN is also suitable for the multi-label classification problem.
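The reject-then-classify flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how OSVM or PKNN work internally, so both stages are replaced by simple stand-ins. Outlier rejection is approximated here by a class-centroid distance threshold instead of an optimized SVM, and "prioritized" k-NN is assumed to mean inverse-distance weighting of neighbor votes. The feature selection stage (NBFS) is omitted. The function names and the `factor` and `k` parameters are hypothetical.

```python
import math
from collections import defaultdict


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def reject_outliers(X, y, factor=2.0):
    """Stand-in for the outlier rejection stage (NOT the paper's OSVM).

    Drops every sample whose distance to its own class centroid exceeds
    `factor` times the mean centroid distance of that class.
    """
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)

    kept_X, kept_y = [], []
    for label, rows in by_class.items():
        c = centroid(rows)
        dists = [math.dist(r, c) for r in rows]
        limit = factor * (sum(dists) / len(dists))
        for r, d in zip(rows, dists):
            if d <= limit:
                kept_X.append(r)
                kept_y.append(label)
    return kept_X, kept_y


def weighted_knn(X, y, query, k=3):
    """Stand-in for PKNN (assumption: priority = inverse distance).

    Takes the k nearest training samples and lets each one vote for its
    label with weight 1/distance, so closer neighbors count for more.
    """
    nearest = sorted(zip(X, y), key=lambda p: math.dist(p[0], query))[:k]
    votes = defaultdict(float)
    for x, label in nearest:
        votes[label] += 1.0 / (math.dist(x, query) + 1e-9)
    return max(votes, key=votes.get)


# Toy data: the point [10, 10] is a noisy "normal" sample far from its class.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10],
     [5, 5], [5, 6], [6, 5], [6, 6]]
y = ["normal"] * 5 + ["attack"] * 4

clean_X, clean_y = reject_outliers(X, y)
prediction = weighted_knn(clean_X, clean_y, [5.4, 5.4])
```

On this toy data the noisy sample is discarded before classification, so it cannot vote on later queries. A real system would substitute a tuned SVM and the paper's own neighbor-prioritization rule for the two stand-ins.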


Keywords: IDS; Anomaly detection; Misuse detection; Feature selection (FS); Naïve Bayes (NB); Support Vector Machines (SVM); K-Nearest Neighbor (KNN); Multi-label classification



Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  • Ahmed I. Saleh (1)
  • Fatma M. Talaat (1)
  • Labib M. Labib (1)

  1. Computer Engineering and Systems Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
