Effectiveness of Hard Clustering Algorithms for Securing Cyber Space

  • Sakib Mahtab Khandaker
  • Afzal Hussain
  • Mohiuddin Ahmed
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 256)


In the era of big data, accurately identifying cyber attacks is more challenging than before. The characteristics of big data impose constraints on existing network anomaly detection techniques. Among these techniques, unsupervised algorithms have an advantage over supervised ones because they do not require labeled training data. Among the unsupervised techniques, hard clustering is widely accepted for deployment. Therefore, in this paper, we investigate the effectiveness of different hard clustering techniques for identifying a range of state-of-the-art cyber attacks, such as backdoors, fuzzers, worms, and reconnaissance, in the popular UNSW-NB15 dataset. The existing literature reports only the overall accuracy of identifying all attack types together; our investigation instead evaluates the effectiveness of hard clustering for individual attack types. The experimental results reveal the performance of a number of hard clustering techniques. The insights from this paper will help both the cyber security and data science communities to design robust techniques for securing cyber space.
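As a rough illustration of the per-attack evaluation described above, the following minimal sketch clusters a handful of records with k-means and reports a detection rate for each attack category. It is not the authors' method: the abstract does not name the algorithms used, so k-means is assumed here as one representative hard clustering technique. The rule "the largest cluster is normal traffic", the helper names `kmeans` and `per_attack_detection_rate`, and the two-feature toy records are all hypothetical and are not taken from UNSW-NB15.

```python
from collections import Counter, defaultdict

def kmeans(points, k, iters=20):
    """Plain k-means: a hard clustering, so each point gets exactly one label."""
    # Assumption: deterministic init from the first k points (keeps the demo reproducible).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def per_attack_detection_rate(labels, categories):
    """Fraction of each attack category that falls outside the 'normal' cluster."""
    # Assumption: the largest cluster represents normal traffic.
    normal_cluster = Counter(labels).most_common(1)[0][0]
    hits, totals = defaultdict(int), defaultdict(int)
    for label, category in zip(labels, categories):
        if category == "normal":
            continue
        totals[category] += 1
        hits[category] += label != normal_cluster
    return {category: hits[category] / totals[category] for category in totals}

# Hypothetical toy records (two numeric features each), not real UNSW-NB15 rows.
points = [
    (1.0, 1.0), (9.0, 9.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
    (0.8, 0.9), (1.0, 1.3), (9.2, 8.8), (1.5, 1.2), (9.0, 8.5),
]
categories = [
    "normal", "fuzzers", "normal", "normal", "normal",
    "normal", "normal", "fuzzers", "backdoor", "backdoor",
]

labels = kmeans(points, k=2)
rates = per_attack_detection_rate(labels, categories)
print(rates)
```

In this toy setup one backdoor record sits close to the normal traffic and is missed, which is the kind of per-category difference an aggregate accuracy figure would hide.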


Network traffic analysis, Cyber attacks, Unsupervised clustering, Big data



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2019

Authors and Affiliations

  • Sakib Mahtab Khandaker (1, 2), corresponding author
  • Afzal Hussain (1, 2)
  • Mohiuddin Ahmed (1, 2)
  1. Islamic University of Technology, Gazipur City, Bangladesh
  2. Canberra Institute of Technology, Reid, Australia
