Diagnosing bot infections using Bayesian inference

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Prior research in botnet detection has used the bot lifecycle to build detection systems. These systems, however, use rule-based decision engines that lack automated adaptability and learning, accuracy tunability, the ability to cope with gaps in training data, and the ability to incorporate local security policies. To counter these limitations, we propose to replace the rigid decision engines in contemporary bot detectors with a more formal Bayesian inference engine. Bottleneck, our prototype implementation, builds confidence in bot infections based on the causal bot lifecycle encoded in a Bayesian network. We evaluate Bottleneck by applying it as a post-processing decision engine on lifecycle events generated by two existing bot detectors (BotHunter and BotFlex) on two independently-collected datasets. Our experimental results show that Bottleneck consistently achieves comparable or better accuracy than the existing rule-based detectors when the test data is similar to the training data. For differing training and test data, Bottleneck, due to its automated learning and inference models, easily surpasses the accuracies of rule-based systems. Moreover, Bottleneck’s stochastic nature allows its accuracy to be tuned with respect to organizational needs. Extending Bottleneck’s Bayesian network into an influence diagram allows for local security policies to be defined within our framework. Lastly, we show that Bottleneck can also be extended to incorporate evidence trust scores for false alarm reduction.
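
To make the idea of a tunable, probabilistic decision engine concrete, the following is a minimal sketch and not the Bottleneck implementation. It assumes a simplified network in which every lifecycle event depends only on the infection state (a naive Bayes structure), whereas the paper encodes the causal bot lifecycle. The event names, the prior, and all conditional probabilities are made-up illustrative values.

```python
# Hypothetical lifecycle events with made-up conditional probabilities:
# (P(event observed | infected), P(event observed | clean)).
LIKELIHOODS = {
    "inbound_scan": (0.60, 0.20),
    "exploit": (0.70, 0.05),
    "binary_download": (0.80, 0.02),
    "cnc_communication": (0.90, 0.01),
    "outbound_scan": (0.75, 0.03),
}

# Assumed prior probability that a host is infected.
PRIOR_INFECTED = 0.01


def posterior_infected(observed):
    """Return P(infected | evidence) by Bayes' rule.

    Events are assumed conditionally independent given the infection
    state; events missing from `observed` count as evidence of absence.
    """
    p_inf = PRIOR_INFECTED
    p_clean = 1.0 - PRIOR_INFECTED
    for event, (p_given_inf, p_given_clean) in LIKELIHOODS.items():
        if event in observed:
            p_inf *= p_given_inf
            p_clean *= p_given_clean
        else:
            p_inf *= 1.0 - p_given_inf
            p_clean *= 1.0 - p_given_clean
    return p_inf / (p_inf + p_clean)


def is_infected(observed, threshold=0.5):
    """Flag a host when the posterior reaches the chosen threshold."""
    return posterior_infected(observed) >= threshold
```

With these illustrative numbers, a host showing only an exploit and C&C communication gets a posterior of roughly 0.25, so it is flagged at a threshold of 0.2 but not at 0.5. Moving the threshold is how such an engine trades detections against false alarms.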

Notes

  1. Reference is not included due to double blind review.

  2. While some rule-based engines soften the impact of these fundamental problems using regression-based weight assignment and soft timers [3], these schemes lack a formal rigor and remain susceptible to data overfitting.

  3. A few variants of the Conficker botnet copy these DLL-form bot binaries to removable media, but since these events cannot be extracted from the network trace, we do not include them in our lifecycle. Similarly, the bot also performs password cracking to copy itself to the ADMIN$ folder, updates registry values, and performs other configuration changes, which we also do not consider as lifecycle events because they cannot be extracted from the network trace.

  4. Implies a network of clean hosts. In the SysNet Lab dataset, the benign network traffic was collected from hosts in the lab, so it was necessary to ensure that this data was clean and free of any bot traffic for a sound evaluation.

  5. BotHunter’s three conditions are (1) evidence of (local host infection AND outward bot coordination or attack); (2) at least two distinct signs of outward bot coordination or attack; and (3) evidence that a host attempts communication with a confirmed malware site (E8[rb]).

  6. E8[rb] is not a bot lifecycle event.

  7. The steepness of the curve is due to the small size of the SysNet data trace, which includes ten bot infections and benign data from 22 hosts. The trace therefore contains a limited number of instances that are not uniformly distributed with respect to the threshold, which results in the steep ROC curve.

  8. BotHunter’s three conditions are (1) evidence of (local host infection AND outward bot coordination or attack); (2) at least two distinct signs of outward bot coordination or attack; and (3) evidence that a host attempts communication with a confirmed malware site.
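
Conditions of the kind listed in notes 5 and 8 amount to a fixed boolean rule over the observed events. The sketch below illustrates that shape under stated assumptions: the event labels other than E8[rb] are hypothetical placeholders, the grouping of events into "local infection" and "outward" sets is illustrative, and the rule is assumed to fire when any one of the three conditions holds.

```python
# Hypothetical event labels; only "E8[rb]" is named in the notes above.
LOCAL_INFECTION = {"local_infection"}
OUTWARD_ACTIVITY = {"egg_download", "cnc_communication", "outbound_attack"}
MALWARE_SITE_CONTACT = "E8[rb]"


def rule_based_verdict(events):
    """Return True if any of the three conditions is met (assumed OR)."""
    events = set(events)
    outward = events & OUTWARD_ACTIVITY
    # (1) local host infection AND outward coordination or attack
    cond1 = bool(events & LOCAL_INFECTION) and bool(outward)
    # (2) at least two distinct signs of outward coordination or attack
    cond2 = len(outward) >= 2
    # (3) communication with a confirmed malware site
    cond3 = MALWARE_SITE_CONTACT in events
    return cond1 or cond2 or cond3
```

A rule like this returns only yes or no: a host with a single sign of outward activity is never flagged, however strong that evidence is, and the rule has no threshold that could be adjusted.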

References

  1. Bencsáth, B., Pék, G., Buttyán, L., Félegyházi, M.: Duqu: analysis, detection, and lessons learned. In: 2012 ACM European Workshop on System Security (EuroSec), vol. 2012 (2012)

  2. Falliere, N., Murchu, L.O., Chien, E.: W32.Stuxnet dossier. In: White Paper, Symantec Corp., Security Response, 2011, online 5 June (2013)

  3. Gu, G., Porras, P., Yegneswaran, V., Fong, M., Lee, W.: Bothunter: detecting malware infection through ids-driven dialog correlation. In: Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, SS’07. USENIX Association, Berkeley, pp. 12:1–12:16 (2007). http://dl.acm.org/citation.cfm?id=1362903.1362915

  4. Khattak, S., Ahmed, Z., Syed, A. A., Khayam, S.A.: Poster: Botflex: a community-driven tool for botnet detection, online 17 May (2013)

  5. Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs. Springer, New York (2007)

  6. Sommer, R., Paxson, V.: Outside the closed world: on using machine learning for network intrusion detection. In: 2010 IEEE Symposium on Security and Privacy (SP), pp. 305–316. doi:10.1109/SP.2010.25

  7. Ramay, N.R., Khattak, S., Syed, A.A., Khayam, S.A.: Poster: Bottleneck: a generalized, flexible, and extensible framework for botnet defense, online 13 April (2013)

  8. Cooper, G., Herskovits, E.: A bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9, 309–347 (1992). doi:10.1007/BF00994110

  9. Cheng, J., Greiner, R.: Learning Bayesian belief network classifiers: algorithm and systems. In: Stroulia, E., Matwin, S. (eds.) Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 2056, pp. 141–151. Springer, Berlin, Heidelberg (2001)

  10. Netica programmer's library reference manual. http://www.norsys.com/netica-j/docs/NeticaJ_Man.pdf, online 20 May (2013)

  11. Spiegelhalter, D. J., Dawid, A.P., Lauritzen, S.L., Cowell, R.G.: Bayesian analysis in expert systems. Stat. Sci. 219–247 (1993)

  12. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, New York (1988)

  13. Conficker: https://mil.fireeye.com/edp.php?sname=Bot.Conficker, online June (2013)

  14. Inside the storm: https://www.blackhat.com/presentations/bh-usa-08/Stewart/BH_US_08_Stewart_Protocols_of_the_Storm.pdf

  15. Roesch, M., et al.: Snort-lightweight intrusion detection for networks. In: Proceedings of the 13th USENIX Conference on System Administration, Seattle, Washington, pp. 229–238 (1999)

  16. Bro: http://www.bro.org/, online 10 April (2013)

  17. Kruegel, C., Mutz, D., Robertson, W., Valeur, F.: Bayesian event classification for intrusion detection. In: 19th Annual Proceedings Computer Security Applications Conference, pp. 14–23. IEEE, New York (2003)

  18. CERT Polska: http://www.cert.pl/PDF/Report_Virut_EN.pdf, online 25 February (2013)

  19. Nayatel: http://www.nayatel.pk/index.php, online April (2013)

  20. Team cymru: https://www.team-cymru.org/, online April (2013)

  21. The ICSI networking and security group. http://www.icir.org/, online 13 May (2013)

  22. Bottleneck: http://sysnet.org.pk/w/Code_and_Tools#Bottleneck, online 15 June (2013)

  23. Stewart, J.: Inside the storm: protocols and encryption of the storm botnet. In: Black Hat Technical Security Conference, New York (2008)

  24. Emerging threats malware rulesets: http://www.emergingthreats.net, online 5 August (2013)

  25. Poole, D., Mackworth, A.: Artificial intelligence: foundations of computational agents, online November (2013)

  26. Netica-j reference manual—Norsys Software Corp. http://www.norsys.com/downloads/NeticaJ_Man_418.pdf, online 16 September (2013)

  27. Costa, E., Lorena, A., Carvalho, A., Freitas, A.: A review of performance evaluation measures for hierarchical classifiers. In: Evaluation Methods for Machine Learning II: Papers from the AAAI-2007 Workshop, pp. 1–6 (2007)

  28. Lozano, J.A., Santafé, G., Inza, I.: Classifier performance evaluation and comparison. In: International Conference on Machine Learning and Applications (ICMLA 2010). http://www.icmla-conference.org/icmla10/CFP_Tutorial_files/jose.pdf

  29. Han, J., Kamber, M., Pei, J.: Data Mining Concepts and Techniques, 3rd edn. http://www.amazon.de/Data-Mining-Concepts-Techniques-Management/dp/0123814790/ref=tmm_hrd_title_0?ie=UTF8&qid=1366039033&sr=1-1 (2012)

  30. Invernizzi, L., Miskovic, S., Torres, R., Saha, S., Lee, S., Mellia, M., Kruegel, C., Vigna, G.: Nazca: detecting malware distribution in large-scale networks. In: Proceedings of the Network and Distributed System Security Symposium (NDSS) (2014)

  31. Kapravelos, A., Shoshitaishvili, Y., Cova, M., Kruegel, C., Vigna, G.: Revolver: an automated approach to the detection of evasive web-based malware. In: USENIX Security, Citeseer, pp. 637–652 (2013)

  32. Chinchani, R., Van Den Berg, E.: A fast static analysis approach to detect exploit code inside network flows. In: Recent Advances in Intrusion Detection. Springer, New York, pp. 284–308 (2006)

  33. Baldoni, R., Di Luna, G.A., Querzoni, L.: Collaborative detection of coordinated port scans. In: Distributed Computing and Networking. Springer, New York, pp. 102–117 (2013)

  34. Muelder, C., Ma, K.-L., Bartoletti, T.: Interactive visualization for network and port scan detection. In: Recent Advances in Intrusion Detection. Springer, New York, pp. 265–283 (2006)

  35. Zargar, S.T., Joshi, J., Tipper, D.: A survey of defense mechanisms against distributed denial of service (ddos) flooding attacks. IEEE Commun Surv Tutor 15(4), 2046–2069 (2013), online 28 May (2013)

  36. Feinstein, L., Schnackenberg, D., Balupari, R., Kindred, D.: Statistical approaches to ddos attack detection and response. In: Proceedings of the DARPA Information Survivability Conference and Exposition, vol. 1. IEEE, New York, pp. 303–314 (2003)

  37. Zhao, Y., Xie, Y., Yu, F., Ke, Q., Yu, Y., Chen, Y., Gillum, E.: Botgraph: large scale spamming botnet detection. In: NSDI, vol. 9, pp. 321–334 (2009)

  38. Nelms, T., Perdisci, R., Ahamad, M.: Execscent: mining for new c&c domains in live networks with adaptive control protocol templates. In: USENIX Security, pp. 589–604 (2013)

  39. Perdisci, R., Ariu, D., Giacinto, G.: Scalable fine-grained behavioral clustering of http-based malware. Comput. Netw. 57(2), 487–500 (2013)

  40. Goebel, J., Holz, T.: Rishi: identify bot contaminated hosts by IRC nickname evaluation. In: Proceedings of the First Conference on First Workshop on Hot Topics in Understanding Botnets, Cambridge, p. 8 (2007)

  41. Saad, S., Traore, I., Ghorbani, A., Sayed, B., Zhao, D., Lu, W., Felix, J., Hakimian, P.: Detecting p2p botnets through network behavior analysis and machine learning. In: 2011 Ninth Annual International Conference on Privacy, Security and Trust (PST), pp. 174–180. IEEE, New York (2011)

  42. Hsu, C.-H., Huang, C.-Y., Chen, K.-T.: Fast-flux bot detection in real time. In: Recent Advances in Intrusion Detection, pp. 464–483. Springer, New York (2010)

  43. Antonakakis, M., Perdisci, R., Nadji, Y., Vasiloglou, N., Abu-Nimeh, S., Lee, W., Dagon, D.: From throw-away traffic to bots: detecting the rise of DGA-based malware. In: Proceedings of the 21st USENIX Security Symposium (2012)

  44. Khattak, S., Ramay, N., Khan, K., Syed, A., Khayam, S.: A taxonomy of botnet behavior, detection and defense. IEEE Commun. Surv. Tutor., online June (2014)

  45. Abu Rajab, M., Zarfoss, J., Monrose, F., Terzis, A.: A multifaceted approach to understanding the botnet phenomenon. In: Proceedings of the 2006 ACM SIGCOMM Internet Measurement Conference (IMC), vol. 2006 (2006)

  46. Gu, G., Perdisci, R., Zhang, J., Lee, W., et al.: Botminer: clustering analysis of network traffic for protocol-and structure-independent botnet detection. In: USENIX Security Symposium, pp. 139–154 (2008)

  47. Silva, S.S., Silva, R.M., Pinto, R.C., Salles, R.M.: Botnets: a survey. Comput. Netw. 57(2), 378–403 (2013)

  48. Hachem, N., Ben Mustapha, Y., Granadillo, G.G., Debar, H.: Botnets: lifecycle and taxonomy. In: 2011 Conference on Network and Information Systems Security (SAR-SSI), pp. 1–8. IEEE, New York (2011)

  49. Lu, C., Brooks, R.: Botnet traffic detection using hidden markov models. In: Proceedings of the Seventh Annual Workshop on Cyber Security and Information Intelligence Research, p. 31. ACM, New York (2011)

  50. Kidmose, E.: Botnet detection using hidden Markov models. Master’s thesis, Aalborg University (2014)

Author information

Corresponding author

Correspondence to Ayesha Binte Ashfaq.

About this article

Cite this article

Ashfaq, A.B., Abaid, Z., Ismail, M. et al. Diagnosing bot infections using Bayesian inference. J Comput Virol Hack Tech 14, 21–38 (2018). https://doi.org/10.1007/s11416-016-0286-y

  • DOI: https://doi.org/10.1007/s11416-016-0286-y