Learning to Detect Network Intrusion from a Few Labeled Events and Background Traffic

  • Gustav Šourek
  • Ondřej Kuželka
  • Filip Železný
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9122)


Intrusion detection systems (IDS) analyse network traffic data with the goal to reveal malicious activities and incidents. A general problem with learning within this domain is a lack of relevant ground truth data, i.e. real attacks, capturing malicious behaviors in their full variety. Most of existing solutions thus, up to a certain level, rely on rules designed by network domain experts. Although there are advantages to the use of rules, they lack the basic ability of adapting to traffic data. As a result, we propose an ensemble tree bagging classifier, capable of learning from an extremely small number of true attack representatives, and demonstrate that, incorporating a general background traffic, we are able to generalize from those few representatives to achieve competitive results to the expert designed rules used in existing IDS Camnep.


Intrusion detection Random forests NetFlow Camnep 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Van Assche, A., Blockeel, H.: Seeing the forest through the trees: Learning a comprehensible model from an ensemble. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) ECML 2007. LNCS (LNAI), vol. 4701, pp. 418–429. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  2. 2.
    Bartos, K., Rehak, M.: Trust-based solution for robust self-configuration of distributed intrusion detection systems, pp. 121–126 (2012)Google Scholar
  3. 3.
    Błaszczyński, J., Stefanowski, J., Idkowiak, Ł.: Extending bagging for imbalanced data. In: Burduk, R., Jackowski, K., Kurzynski, M., Wozniak, M., Zolnierek, A. (eds.) CORES 2013. AISC, vol. 226, pp. 269–278. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  4. 4.
    Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)MATHCrossRefGoogle Scholar
  5. 5.
    Chaudhary, U.K., Papapanagiotou, I., Devetsikiotis, M.: Flow classification using clustering and association rule mining. In: 2010 15th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks (CAMAD), pp. 76–80. IEEE (2010)Google Scholar
  6. 6.
    Chen, C., Liaw, A., Breiman, L.: Using random forest to learn imbalanced data. University of California, Berkeley (2004)Google Scholar
  7. 7.
    Claise, B.: Cisco systems netflow services export version 9 (September 2004)Google Scholar
  8. 8.
    Elbasiony, R.M., Sallam, E.A., Eltobely, T.E., Fahmy, M.M.: A hybrid network intrusion detection framework based on random forests and weighted k-means. Ain Shams Engineering Journal 4(4), 753–762 (2013)CrossRefGoogle Scholar
  9. 9.
    Erman, J., Mahanti, A., Arlitt, M., Cohen, I., Williamson, C.: Offline/realtime traffic classification using semi-supervised learning. Performance Evaluation 64(9), 1194–1213 (2007)CrossRefGoogle Scholar
  10. 10.
    Fernández-Delgado, M., Cernadas, E., Barro, S., Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems? The Journal of Machine Learning Research 15(1), 3133–3181 (2014)MATHGoogle Scholar
  11. 11.
    Huang, T.M., Kecman, V.: Semi-supervised learning from unbalanced labeled data–an improvement. In: Negoita, M.G., Howlett, R.J., Jain, L.C. (eds.) KES 2004. LNCS (LNAI), vol. 3215, pp. 802–808. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  12. 12.
    Jiang, H., Moore, A.W., Ge, Z., Jin, S., Wang, J.: Lightweight application classification for network management. In: Proceedings of the 2007 SIGCOMM Workshop on Internet Network Management, pp. 299–304. ACM (2007)Google Scholar
  13. 13.
    Karagiannis, T., Papagiannaki, K., Faloutsos, M.: Blinc: multilevel traffic classification in the dark. In: ACM SIGCOMM Computer Communication Review, vol. 35, pp. 229–240. ACM (2005)Google Scholar
  14. 14.
    Khan, S.S., Madden, M.G.: A survey of recent trends in one class classification. In: Coyle, L., Freyne, J. (eds.) AICS 2009. LNCS (LNAI), vol. 6206, pp. 188–197. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  15. 15.
    Laskov, P., Düssel, P., Schäfer, C., Rieck, K.: Learning intrusion detection: supervised or unsupervised? In: Roli, F., Vitulano, S. (eds.) ICIAP 2005. LNCS, vol. 3617, pp. 50–57. Springer, Heidelberg (2005)CrossRefGoogle Scholar
  16. 16.
    Leung, K., Leckie, C.: Unsupervised anomaly detection in network intrusion detection using clusters, pp. 333–342 (2005)Google Scholar
  17. 17.
    McHugh, J.: Testing intrusion detection systems: a critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory. ACM Transactions on Information and system Security 3(4), 262–294 (2000)CrossRefGoogle Scholar
  18. 18.
    Mizutani, M., Takeda, K., Murai, J.: Behavior rule based intrusion detection, pp. 57–58 (2009)Google Scholar
  19. 19.
    Adetunmbi, A., Olusola, A.S.: Oladele, and Daramola O Abosede. Analysis of kdd99 intrusion detection dataset for selection of relevance features. In: Proceedings of the World Congress on Engineering and Computer Science, vol. 1, pp. 20–22 (2010)Google Scholar
  20. 20.
    Perdisci, R., Gu, V., Lee, W.: Using an ensemble of one-class svm classifiers to harden payload-based anomaly detection systems. In: Sixth International Conference on Data Mining, ICDM 2006, pp. 488–498. IEEE (2006)Google Scholar
  21. 21.
    Pevný, T., Ker, A.D.: The challenges of rich features in universal steganalysis (2013)Google Scholar
  22. 22.
    Rehak, M., Pechoucek, M., Celeda, P., Novotny, J., Minarik, P.: Camnep: agent-based network intrusion detection system, pp. 133–136 (2008)Google Scholar
  23. 23.
    Rehak, M., Pechoucek, M., Grill, M., Stiborek, J., Bartoš, K., Celeda, P.: Adaptive multiagent system for network traffic monitoring. IEEE Intelligent Systems (3), 16–25 (2009)Google Scholar
  24. 24.
    Rossi, D., Valenti, S.: Fine-grained traffic classification with netflow data, pp. 479–483 (2010)Google Scholar
  25. 25.
    So-In, C.: A survey of network traffic monitoring and analysis tools. Cse 576m Computer System Analysis Project, Washington University in St. Louis (2009)Google Scholar
  26. 26.
    Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A., Stiller, B.: An overview of ip flow-based intrusion detection. IEEE Communications Surveys Tutorials 12(3), 343–356 (2010)CrossRefGoogle Scholar
  27. 27.
    Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.-A.: A detailed analysis of the kdd cup 99 data set (2009)Google Scholar
  28. 28.
    Tsai, C.-F., Hsu, Y.-F., Lin, C.-Y., Lin, W.-Y.: Intrusion detection by machine learning: A review. Expert Systems with Applications 36(10), 11994–12000 (2009)CrossRefGoogle Scholar
  29. 29.
    Zhang, J., Zulkernine, M., Haque, A.: Random-forests-based network intrusion detection systems. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 38(5), 649–659 (2008)CrossRefGoogle Scholar

Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • Gustav Šourek
    • 1
  • Ondřej Kuželka
    • 2
  • Filip Železný
    • 1
  1. 1.CTU PraguePragueCzech Republic
  2. 2.Cardiff UniversityCardiffUK

Personalised recommendations