
Unsupervised Clustering of Honeypot Attacks by Deep HTTP Packet Inspection

  • Conference paper
Foundations and Practice of Security (FPS 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14551)


Abstract

The increasing complexity of cyberattacks has prompted researchers to propose automated methods for classifying them. Current research favors supervised detection methods; however, these must be continually trained on vast labelled datasets and cannot generalize to unseen events. We propose a novel unsupervised detection approach that performs deep packet inspection on HTTP-specific features, unlike prior work that relies on generic numerical network-based features. Our method has three phases: pre-processing, dimension reduction, and clustering. By analyzing the content of each HTTP packet, we perfectly isolate each web attack in the CIC-IDS2017 dataset into its own cluster. We further run our method on real-world data collected from a honeypot platform to demonstrate its classification abilities. In future work, the proposed method could be applied to other protocols and extended with additional correlation techniques to classify complex attacks.
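The three phases named in the abstract (pre-processing, dimension reduction, clustering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the algorithms, so PCA over one-hot features and k-means are stand-ins; the chosen features (`method`, `path`, `has_query`, `user_agent`), the `request` helper, the sample requests, and the user-agent strings are all hypothetical.

```python
import random


def preprocess(raw_request):
    """Phase 1: turn a raw HTTP request into categorical features (assumed set)."""
    head, _, _body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = dict(line.split(": ", 1) for line in lines[1:] if ": " in line)
    path, _, query = target.partition("?")
    return {
        "method": method,
        "path": path,
        "has_query": str(bool(query)),
        "user_agent": headers.get("User-Agent", "<none>"),
    }


def one_hot(records):
    """Encode each (feature, value) pair as its own 0/1 column."""
    vocab = sorted({item for rec in records for item in rec.items()})
    return [[1.0 if rec[k] == v else 0.0 for k, v in vocab] for rec in records]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pca(rows, n_components, iters=200, seed=0):
    """Phase 2 (stand-in): project centred rows onto their top principal
    components, found by power iteration with deflation."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centred = [[x - m for x, m in zip(row, means)] for row in rows]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in means]
        for _ in range(iters):
            scores = [dot(row, v) for row in centred]
            w = [dot(scores, col) for col in zip(*centred)]  # X^T X v
            for c in components:  # stay orthogonal to earlier components
                proj = dot(w, c)
                w = [a - proj * b for a, b in zip(w, c)]
            norm = dot(w, w) ** 0.5
            if norm < 1e-12:
                break
            v = [a / norm for a in w]
        components.append(v)
    return [[dot(row, c) for c in components] for row in centred]


def kmeans(points, k, iters=20):
    """Phase 3 (stand-in): Lloyd's k-means with farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda i: dist2(p, centers[i])) for p in points]
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_requests(raw_requests, k, n_components=2):
    records = [preprocess(r) for r in raw_requests]
    return kmeans(pca(one_hot(records), n_components), k)


def request(method, target, agent, body=""):
    """Hypothetical helper that builds a raw HTTP/1.1 request string."""
    return (f"{method} {target} HTTP/1.1\r\nHost: honeypot.example\r\n"
            f"User-Agent: {agent}\r\n\r\n{body}")


# Hypothetical traffic: two login attempts, two admin-path probes, two query probes.
SAMPLE = [
    request("POST", "/login.php", "bruteforcer/1.0", "user=admin&pass=123456"),
    request("POST", "/login.php", "bruteforcer/1.0", "user=admin&pass=letmein"),
    request("GET", "/admin", "scanner/2.0"),
    request("GET", "/phpmyadmin", "scanner/2.0"),
    request("GET", "/search?q=%3Cscript%3E", "curl/7.88.1"),
    request("GET", "/search?q=%27+OR+1%3D1", "curl/7.88.1"),
]
labels = cluster_requests(SAMPLE, k=3)
print(labels)
```

On this toy input, requests that share the same hypothetical behaviour receive the same label. Real honeypot traffic would likely need a richer feature set and a principled way to choose the number of clusters, neither of which this sketch attempts.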

This research was supported by Thales Research and Technology (TRT) Canada.



Author information

Corresponding author

Correspondence to Christopher Neal.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Aurora, V., Neal, C., Proulx, A., Boulahia Cuppens, N., Cuppens, F. (2024). Unsupervised Clustering of Honeypot Attacks by Deep HTTP Packet Inspection. In: Mosbah, M., Sèdes, F., Tawbi, N., Ahmed, T., Boulahia-Cuppens, N., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2023. Lecture Notes in Computer Science, vol 14551. Springer, Cham. https://doi.org/10.1007/978-3-031-57537-2_4


  • DOI: https://doi.org/10.1007/978-3-031-57537-2_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-57536-5

  • Online ISBN: 978-3-031-57537-2

  • eBook Packages: Computer Science, Computer Science (R0)
