Automatic Feature Construction for Network Intrusion Detection

  • Binh Tran
  • Stjepan Picek
  • Bing XueEmail author
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10593)


The notion of cyberspace became impossible to separate from the notions of cyber threat and cyberattack. Since cyberattacks are getting easier to run, they are also becoming more serious threats from the economic damage perspective. Consequently, we are evident of a continuous adversarial relationship between the attackers trying to mount as powerful as possible attacks and defenders trying to stop the attackers in their goals. To defend against such attacks, defenders have at their disposal a plethora of techniques but they are often falling behind the attackers due to the fact that they need to protect the whole system while the attacker needs to find only a single weakness to exploit. In this paper, we consider one type of a cyberattack – network intrusion – and investigate how to use feature construction via genetic programming in order to improve the intrusion detection accuracy. The obtained results show that feature construction offers improvements in a number of tested scenarios and therefore should be considered as an important step in defense efforts. Such improvements are especially apparent in scenario with the highly unbalanced data, which also represents the most interesting case from the defensive perspective.


  1. 1.
    Browne, R.: Nato: we ward off 500 cyberattacks each month, January 2017.
  2. 2.
  3. 3.
    Fratantonio, Y., Qian, C., Chung, S., Lee, W.: Cloak and Dagger: from two permissions to complete control of the UI feedback loop. In: Proceedings of the IEEE Symposium on Security and Privacy, Oakland, San Jose, CA, May 2017Google Scholar
  4. 4.
    García-Teodoro, P., Díaz-Verdejo, J., Maciá-Fernández, G., Vázquez, E.: Anomaly-based network intrusion detection: techniques. Syst. Chall. Comput. Secur. 28(1–2), 18–28 (2009)CrossRefGoogle Scholar
  5. 5.
    Wu, S.X., Banzhaf, W.: Review: the use of computational intelligence in intrusion detection systems: a review. Appl. Soft Comput. 10(1), 1–35 (2010)CrossRefGoogle Scholar
  6. 6.
    Tsai, C.F., Hsu, Y.F., Lin, C.Y., Lin, W.Y.: Intrusion detection by machine learning: a review. Expert Syst. Appl. 36(10), 11994–12000 (2009)CrossRefGoogle Scholar
  7. 7.
    Al-Sahaf, H., Al-Sahaf, A., Xue, B., Johnston, M., Zhang, M.: Automatically evolving rotation-invariant texture image descriptors by genetic programming. IEEE Trans. Evol. Comput. 21(1), 83–101 (2017)Google Scholar
  8. 8.
    Tran, B., Xue, B., Zhang, M.: Genetic programming for feature construction and selection in classification on high-dimensional data. Memet. Comput. 8(1), 3–15 (2015)CrossRefGoogle Scholar
  9. 9.
    Tran, B., Zhang, M., Xue, B.: Multiple feature construction in classification on high-dimensional data using GP. In: IEEE Symposium Series on Computational Intelligence (SSCI), pp. 210–218, December 2017Google Scholar
  10. 10.
  11. 11.
    Habibi, A., et al.: UNB ISCX NSL-KDD dataset.
  12. 12.
    Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A.: A detailed analysis of the KDD CUP 99 data set. In: Proceedings of the Second IEEE International Conference on Computational Intelligence for Security and Defense Applications, CISDA 2009, Piscataway, NJ, USA, pp. 53–58. IEEE Press (2009)Google Scholar
  13. 13.
    Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A.: Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 31(3), 357–374 (2012)CrossRefGoogle Scholar
  14. 14.
    Curry, R., Heywood, M.I.: One-class genetic programming. In: Vanneschi, L., Gustafson, S., Moraglio, A., De Falco, I., Ebner, M. (eds.) EuroGP 2009. LNCS, vol. 5481, pp. 1–12. Springer, Heidelberg (2009). doi: 10.1007/978-3-642-01181-8_1 CrossRefGoogle Scholar
  15. 15.
    Cao, V.L., Nicolau, M., McDermott, J.: One-class classification for anomaly detection with kernel density estimation and genetic programming. In: Heywood, M.I., McDermott, J., Castelli, M., Costa, E., Sim, K. (eds.) EuroGP 2016. LNCS, vol. 9594, pp. 3–18. Springer, Cham (2016). doi: 10.1007/978-3-319-30668-1_1 CrossRefGoogle Scholar
  16. 16.
    To, C., Elati, M.: A Parallel genetic programming for single class classification. In: Proceedings of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, GECCO 2013 Companion, pp. 1579–1586. ACM, New York (2013)Google Scholar
  17. 17.
    Song, D., Heywood, M.I., Zincir-Heywood, A.N.: Training genetic programming on half a million patterns: an example from anomaly detection. IEEE Trans. Evol. Comput. 9(3), 225–239 (2005)CrossRefGoogle Scholar
  18. 18.
    Wang, W., Gombault, S., Guyet, T.: Towards fast detecting intrusions: using key attributes of network traffic. In: Proceedings of the 2008 The Third International Conference on Internet Monitoring and Protection, ICIMP 2008, pp. 86–91. IEEE Computer Society, Washington, DC (2008)Google Scholar
  19. 19.
    Zargari, S., Voorhis, D.: Feature selection in the corrected KDD-dataset. In: 2012 Third International Conference on Emerging Intelligent Data and Web Technologies, pp. 174–180, September 2012Google Scholar
  20. 20.
    Lee, W., Stolfo, S.J.: A framework for constructing features and models for intrusion detection systems. ACM Trans. Inf. Syst. Secur. 3(4), 227–261 (2000)CrossRefGoogle Scholar
  21. 21.
    Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29(2), 131–163 (1997)CrossRefzbMATHGoogle Scholar
  22. 22.
    Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco (1993)Google Scholar
  23. 23.
    Tran, B., Xue, B., Zhang, M.: Using feature clustering for GP-based feature construction on high-dimensional data. In: McDermott, J., Castelli, M., Sekanina, L., Haasdijk, E., García-Sánchez, P. (eds.) EuroGP 2017. LNCS, vol. 10196, pp. 210–226. Springer, Cham (2017). doi: 10.1007/978-3-319-55696-3_14 CrossRefGoogle Scholar
  24. 24.
    Bhowan, U., Johnston, M., Zhang, M., Yao, X.: Reusing genetic programming for ensemble selection in classification of unbalanced data. IEEE Trans. Evol. Comput. 18(6), 893–908 (2014)CrossRefGoogle Scholar
  25. 25.
    Evolutionary Computation Laboratory: ECJ: a Java-based evolutionary computation research system.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. 1.School of Engineering and Computer ScienceVictoria University of WellingtonWellingtonNew Zealand
  2. 2.Cyber Security Research GroupDelft University of TechnologyDelftThe Netherlands

Personalised recommendations