A Survey on the Use of Data Points in IDS Research

  • Heini Ahde
  • Sampsa Rauti
  • Ville Leppanen
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 942)


In today’s diverse cyber threat landscape, there is a need for anomaly-based intrusion detection systems that learn the normal behavior of a system and can detect previously unknown attacks. However, the data gathered by an intrusion detection system is useless unless reasonable data points are formed from the collected data sets for machine learning methods to work on. In this paper, we present a survey of the data points used in previous anomaly-based IDS research. We also introduce a novel categorization of the features used to form these data points.


Network security, Intrusion detection, Data points


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of Turku, Turku, Finland
