Effectiveness of Hard Clustering Algorithms for Securing Cyber Space

  • Conference paper
Smart Grid and Internet of Things (SGIoT 2018)

Abstract

In the era of big data, accurately identifying cyber attacks is more challenging than before. The characteristics of big data impose constraints on existing network anomaly detection techniques. Among these techniques, unsupervised algorithms have an advantage over supervised ones because they do not require labeled training data, and among the unsupervised techniques, hard clustering is widely accepted for deployment. In this paper, we therefore investigate the effectiveness of different hard clustering techniques for identifying a range of state-of-the-art cyber attacks, such as backdoors, fuzzers, worms, and reconnaissance, in the popular UNSW-NB15 dataset. The existing literature reports identification accuracy only for all attack types taken together; our investigation instead assesses the effectiveness of hard clustering for each individual attack type. The experimental results reveal the performance of a number of hard clustering techniques. The insights from this paper will help both the cyber security and data science communities design robust techniques for securing cyber space.
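To make the idea of per-attack evaluation concrete, the following is a minimal sketch of one hard clustering technique, k-means, used as an anomaly detector. It is an illustration under stated assumptions, not the authors' implementation: the two-feature records are made-up toy values rather than UNSW-NB15 data, the farthest-first initialisation is chosen only to keep the run deterministic, and treating the smallest cluster as anomalous is a common heuristic that the abstract does not confirm. All function names are hypothetical.

```python
from collections import Counter

def sqdist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iters=100):
    """Lloyd's k-means. The assignment is 'hard': every record
    receives exactly one cluster label."""
    centroids = farthest_first(points, k)
    for _ in range(max_iters):
        labels = [
            min(range(k), key=lambda c: sqdist(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels

def per_category_rate(categories, flagged):
    """Fraction of records flagged as anomalous, per ground-truth category."""
    totals, hits = Counter(categories), Counter()
    for cat, is_flagged in zip(categories, flagged):
        if is_flagged:
            hits[cat] += 1
    return {cat: hits[cat] / totals[cat] for cat in totals}

# Toy records (hypothetical values, NOT taken from UNSW-NB15).
records = [
    ((0.0, 0.0), "normal"), ((0.2, 0.1), "normal"), ((0.1, 0.3), "normal"),
    ((0.3, 0.2), "normal"), ((0.2, 0.4), "normal"), ((0.4, 0.1), "normal"),
    ((0.1, 0.1), "normal"), ((0.3, 0.3), "normal"),
    ((5.0, 5.1), "fuzzers"), ((5.2, 4.9), "fuzzers"),
    ((5.1, 5.0), "worms"), ((0.2, 0.2), "worms"),  # one worm mimics normal traffic
]
points = [features for features, _ in records]
categories = [category for _, category in records]

labels = kmeans(points, k=2)
sizes = Counter(labels)
rare = min(sizes, key=sizes.get)           # heuristic: smallest cluster = anomalous
flagged = [lab == rare for lab in labels]
rates = per_category_rate(categories, flagged)
print(rates)
```

On this toy input, both fuzzers records are flagged but only one of the two worms records is, so a single aggregate accuracy figure would hide the weaker result for worms. That is the kind of per-attack difference the paper sets out to measure.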

Author information

Corresponding author

Correspondence to Sakib Mahtab Khandaker.

Copyright information

© 2019 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Khandaker, S.M., Hussain, A., Ahmed, M. (2019). Effectiveness of Hard Clustering Algorithms for Securing Cyber Space. In: Pathan, AS., Fadlullah, Z., Guerroumi, M. (eds) Smart Grid and Internet of Things. SGIoT 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 256. Springer, Cham. https://doi.org/10.1007/978-3-030-05928-6_11

  • DOI: https://doi.org/10.1007/978-3-030-05928-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05927-9

  • Online ISBN: 978-3-030-05928-6

  • eBook Packages: Computer Science, Computer Science (R0)
