Skip to main content

BotTokenizer: Exploring Network Tokens of HTTP-Based Botnet Using Malicious Network Traces

  • 1123 Accesses

Part of the Lecture Notes in Computer Science book series (LNSC,volume 10726)

Abstract

Nowadays, malicious software and especially botnets leverage HTTP protocol as their communication and command (C&C) channels to connect to the attackers and control compromised clients. Due to its large popularity and facility across firewall, the malicious traffic can blend with legitimate traffic and remains undetected. While network signature-based detection systems and models show extraordinary advantages, such as high detection efficiency and accuracy, their scalability and automatization still need to be improved.

In this work, we present BotTokenizer, a novel network signature-based detection system that aims to detect malicious HTTP C&C traffic. BotTokenizer automatically learns recognizable network tokens from known HTTP C&C communications from different botnet families by using words segmentation technologies. In essence, BotTokenizer implements a coarse-grained network signature generation prototype only relying on Uniform Resource Locators (URLs) in HTTP requests. Our evaluation results demonstrate that BotTokenizer performs very well on identifying HTTP-based botnets with an acceptable classification errors.

Keywords

  • HTTP-based botnet detection
  • Network tokens
  • Words segmentation

This is a preview of subscription content, access via your institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-319-75160-3_23
  • Chapter length: 21 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   84.99
Price excludes VAT (USA)
  • ISBN: 978-3-319-75160-3
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   107.00
Price excludes VAT (USA)
Fig. 1.
Fig. 2.
Fig. 3.
Fig. 4.
Fig. 5.
Fig. 6.
Fig. 7.
Fig. 8.
Fig. 9.

Notes

  1. 1.

    Snort. http://www.snort.org/.

  2. 2.

    Suricata. http://suricata-ids.org/.

  3. 3.

    ANUBIS. http://www.assiste.com/Anubis.html/.

  4. 4.

    CWSandbox. http://www.cwsandbox.org/.

  5. 5.

    Alexa. http://www.alexa.com/topsites/.

  6. 6.

    MCFP. https://stratosphereips.org/category/dataset.html.

  7. 7.

    VirusTotal. https://www.virustotal.com/.

  8. 8.

    scikit-learn: http://scikit-learn.org/stable/.

References

  1. Antonakakis, M., Demar, J., Stevens, K., Dagon, D.: Unveiling the network criminal infrastructure of TDSS/TDL4. Damballa Research Report 2012 (2012)

    Google Scholar 

  2. Chiba, D., Yagi, T., Akiyama, M., Aoki, K., Hariu, T., Goto, S.: BotProfiler: profiling variability of substrings in HTTP requests to detect malware-infected hosts. In: 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 758–765. IEEE (2015)

    Google Scholar 

  3. Garcia, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)

    CrossRef  Google Scholar 

  4. Goebel, J., Holz, T.: Rishi: identify bot contaminated hosts by IRC nickname evaluation. HotBots 7, 8–8 (2007)

    Google Scholar 

  5. Goodman, N.: A survey of advances in botnet technologies. arXiv preprint arXiv:1702.01132 (2017)

  6. Gu, G., Perdisci, R., Zhang, J., Lee, W., et al.: BotMiner: clustering analysis of network traffic for protocol-and structure-independent botnet detection. In: USENIX Security Symposium, vol. 5, pp. 139–154 (2008)

    Google Scholar 

  7. Gu, G., Yegneswaran, V., Porras, P., Stoll, J., Lee, W.: Active botnet probing to identify obscure command and control channels. In: Annual Computer Security Applications Conference, ACSAC 2009, pp. 241–253. IEEE (2009)

    Google Scholar 

  8. Gu, G., Zhang, J., Lee, W.: BotSniffer: detecting botnet command and control channels in network traffic (2008)

    Google Scholar 

  9. Han, X., Kheir, N., Balzarotti, D.: PhishEye: live monitoring of sandboxed phishing kits. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 1402–1413. ACM (2016)

    Google Scholar 

  10. Jang, J., Brumley, D., Venkataraman, S.: BitShred: feature hashing malware for scalable triage and semantic analysis. In: Proceedings of the 18th ACM Conference on Computer and Communications security, pp. 309–320. ACM (2011)

    Google Scholar 

  11. Kim, H.A., Karp, B.: Autograph: toward automated, distributed worm signature detection. In: USENIX Security Symposium, San Diego, CA, vol. 286 (2004)

    Google Scholar 

  12. Kirda, E., Kruegel, C., Banks, G., Vigna, G., Kemmerer, R.: Behavior-based spyware detection. In: USENIX Security, vol. 6 (2006)

    Google Scholar 

  13. Li, Z., Sanghi, M., Chen, Y., Kao, M.Y., Chavez, B.: Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience. In: 2006 IEEE Symposium on Security and Privacy, pp. 15. IEEE (2006)

    Google Scholar 

  14. Lu, W., Rammidi, G., Ghorbani, A.A.: Clustering botnet communication traffic based on n-gram feature selection. Comput. Commun. 34(3), 502–514 (2011)

    CrossRef  Google Scholar 

  15. Ma, J., Saul, L.K., Savage, S., Voelker, G.M.: Identifying suspicious URLs: an application of large-scale online learning. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 681–688. ACM (2009)

    Google Scholar 

  16. Malan, D.J., Smith, M.D.: Host-based detection of worms through peer-to-peer cooperation. In: Proceedings of the 2005 ACM Workshop on Rapid Malcode, pp. 72–80. ACM (2005)

    Google Scholar 

  17. Nelms, T., Perdisci, R., Ahamad, M.: ExecScent: mining for new C&C domains in live networks with adaptive control protocol templates. In: USENIX Security, pp. 589–604 (2013)

    Google Scholar 

  18. Newsome, J., Karp, B., Song, D.: Polygraph: automatically generating signatures for polymorphic worms. In: 2005 IEEE Symposium on Security and Privacy, pp. 226–241. IEEE (2005)

    Google Scholar 

  19. Perdisci, R., Ariu, D., Giacinto, G.: Scalable fine-grained behavioral clustering of HTTP-based malware. Comput. Netw. 57(2), 487–500 (2013)

    CrossRef  Google Scholar 

  20. Perdisci, R., Lee, W., Feamster, N.: Behavioral clustering of HTTP-based malware and signature generation using malicious network traces. In: NSDI, vol. 10, p. 14 (2010)

    Google Scholar 

  21. Perdisci, R., et al.: VAMO: towards a fully automated malware clustering validity analysis. In: Proceedings of the 28th Annual Computer Security Applications Conference, pp. 329–338. ACM (2012)

    Google Scholar 

  22. Rafique, M.Z., Caballero, J.: FIRMA: malware clustering and network signature generation with mixed network behaviors. In: Stolfo, S.J., Stavrou, A., Wright, C.V. (eds.) RAID 2013. LNCS, vol. 8145, pp. 144–163. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-41284-4_8

    CrossRef  Google Scholar 

  23. Saad, S., Traore, I., Ghorbani, A., Sayed, B., Zhao, D., Lu, W., Felix, J., Hakimian, P.: Detecting P2P botnets through network behavior analysis and machine learning. In: 2011 Ninth Annual International Conference on Privacy, Security and Trust (PST), pp. 174–180. IEEE (2011)

    Google Scholar 

  24. Sakib, M.N., Huang, C.T.: Using anomaly detection based techniques to detect HTTP-based botnet C&C traffic. In: 2016 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2016)

    Google Scholar 

  25. Singh, S., Estan, C., Varghese, G., Savage, S.: Automated worm fingerprinting. In: OSDI, vol. 4, p. 4 (2004)

    Google Scholar 

  26. Small, S., Mason, J., Monrose, F., Provos, N., Stubblefield, A.: To catch a predator: a natural language approach for eliciting malicious payloads. In: USENIX Security Symposium, pp. 171–184 (2008)

    Google Scholar 

  27. Sourdis, I., Pnevmatikatos, D.: Fast, large-scale string match for a 10 Gbps FPGA-based network intrusion detection system. In: Y. K. Cheung, P., Constantinides, G.A. (eds.) FPL 2003. LNCS, vol. 2778, pp. 880–889. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45234-8_85

    CrossRef  Google Scholar 

  28. Spitzner, L.: The honeynet project: trapping the hackers. IEEE Secur. Priv. 99(2), 15–23 (2003)

    CrossRef  Google Scholar 

  29. Wang, K., Cretu, G., Stolfo, S.J.: Anomalous payload-based worm detection and signature generation. In: Valdes, A., Zamboni, D. (eds.) RAID 2005. LNCS, vol. 3858, pp. 227–246. Springer, Heidelberg (2006). https://doi.org/10.1007/11663812_12

    CrossRef  Google Scholar 

  30. Wang, X., Zheng, K., Niu, X., Wu, B., Wu, C.: Detection of command and control in advanced persistent threat based on independent access. In: 2016 IEEE International Conference on Communications (ICC), pp. 1–6. IEEE (2016)

    Google Scholar 

  31. Witten, I.H., Paynter, G.W., Frank, E., Gutwin, C., Nevill-Manning, C.G.: KEA: practical automatic keyphrase extraction. In: Proceedings of the Fourth ACM Conference on Digital Libraries, pp. 254–255. ACM (1999)

    Google Scholar 

  32. Wurzinger, P., Bilge, L., Holz, T., Goebel, J., Kruegel, C., Kirda, E.: Automatically generating models for botnet detection. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 232–249. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04444-1_15

    CrossRef  Google Scholar 

  33. Xie, Y., Yu, F., Achan, K., Panigrahy, R., Hulten, G., Osipkov, I.: Spamming botnets: signatures and characteristics. ACM SIGCOMM Comput. Commun. Rev. 38(4), 171–182 (2008)

    CrossRef  Google Scholar 

  34. Yin, H., Song, D., Egele, M., Kruegel, C., Kirda, E.: Panorama: capturing system-wide information flow for malware detection and analysis. In: Proceedings of the 14th ACM Conference on Computer and Communications Security, pp. 116–127. ACM (2007)

    Google Scholar 

  35. Zand, A., Vigna, G., Yan, X., Kruegel, C.: Extracting probable command and control signatures for detecting botnets. In: Proceedings of the 29th Annual ACM Symposium on Applied Computing, pp. 1657–1662. ACM (2014)

    Google Scholar 

  36. Zarras, A., Papadogiannakis, A., Gawlik, R., Holz, T.: Automated generation of models for fast and precise detection of http-based malware. In: 2014 Twelfth Annual International Conference on Privacy, Security and Trust (PST), pp. 249–256. IEEE (2014)

    Google Scholar 

  37. Zeidanloo, H.R., Manaf, A.B.A.: Botnet detection by monitoring similar communication patterns. arXiv preprint arXiv:1004.1232 (2010)

  38. Zeng, Y., Hu, X., Shin, K.G.: Detection of botnets using combined host-and network-level information. In: 2010 IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 291–300. IEEE (2010)

    Google Scholar 

  39. Zhang, J., Perdisci, R., Lee, W., Sarfraz, U., Luo, X.: Detecting stealthy P2P botnets using statistical traffic fingerprints. In: 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), pp. 121–132. IEEE (2011)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhixin Shi .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Verify currency and authenticity via CrossMark

Cite this paper

Qi, B., Shi, Z., Wang, Y., Wang, J., Wang, Q., Jiang, J. (2018). BotTokenizer: Exploring Network Tokens of HTTP-Based Botnet Using Malicious Network Traces. In: Chen, X., Lin, D., Yung, M. (eds) Information Security and Cryptology. Inscrypt 2017. Lecture Notes in Computer Science(), vol 10726. Springer, Cham. https://doi.org/10.1007/978-3-319-75160-3_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-75160-3_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75159-7

  • Online ISBN: 978-3-319-75160-3

  • eBook Packages: Computer ScienceComputer Science (R0)