Toward Modeling Lightweight Intrusion Detection System Through Correlation-Based Hybrid Feature Selection

  • Jong Sou Park
  • Khaja Mohammad Shazzad
  • Dong Seong Kim
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3822)


Research on modeling intrusion detection systems (IDS) has focused on improving detection models in two ways: (i) detection model design based on classification algorithms, clustering algorithms, and soft computing techniques such as Artificial Neural Networks (ANN), Hidden Markov Models (HMM), Support Vector Machines (SVM), K-means clustering, and fuzzy approaches, and (ii) feature selection through wrapper and filter approaches. However, these approaches incur large overhead, because both feature selection and the cross-validation used to minimize generalization error are computationally heavy. In addition, the selected feature set varies with the detection model, which makes these approaches inefficient for modeling a lightweight IDS. This paper therefore proposes a new approach to modeling a lightweight IDS based on a new feature selection method named Correlation-based Hybrid Feature Selection (CBHFS), which can significantly decrease training and testing times while retaining high detection rates, low false positive rates, and stable feature selection results. Experimental results on the KDD 1999 intrusion detection datasets show the feasibility of this approach for modeling a lightweight IDS.
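The abstract does not spell out how CBHFS works, so the following is only a minimal sketch of the general idea behind a correlation-based filter: a subset is scored highly when its features correlate with the class but not with each other. It uses the well-known CFS merit heuristic with a greedy forward search. The function names `cfs_merit` and `greedy_cfs` are hypothetical, absolute Pearson correlation is assumed as the correlation measure, and nothing here should be read as the authors' CBHFS procedure.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """Correlation-based merit of the feature columns listed in `subset`.

    Assumption: absolute Pearson correlation stands in for whatever
    correlation measure a real implementation would use. Constant
    columns would yield NaN and should be dropped beforehand.
    """
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all distinct pairs.
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(X, y):
    """Forward selection: repeatedly add the feature that raises the merit most."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best:
            break  # no remaining feature improves the merit
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

Because the score depends only on correlations and not on a trained classifier, a filter of this kind avoids the repeated model training and cross-validation that the abstract identifies as the main source of overhead in wrapper approaches.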


Support Vector Machine, Feature Selection, Hidden Markov Model, Intrusion Detection, Feature Subset
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Jong Sou Park (1)
  • Khaja Mohammad Shazzad (1)
  • Dong Seong Kim (1)
  1. Computer Engineering Department, Hankuk Aviation University
