Abstract
Prior research in botnet detection has used the bot lifecycle to build detection systems. These systems, however, use rule-based decision engines that lack automated adaptability and learning, accuracy tunability, the ability to cope with gaps in training data, and the ability to incorporate local security policies. To counter these limitations, we propose to replace the rigid decision engines in contemporary bot detectors with a more formal Bayesian inference engine. Bottleneck, our prototype implementation, builds confidence in bot infections based on the causal bot lifecycle encoded in a Bayesian network. We evaluate Bottleneck by applying it as a post-processing decision engine on lifecycle events generated by two existing bot detectors (BotHunter and BotFlex) on two independently collected datasets. Our experimental results show that Bottleneck consistently achieves comparable or better accuracy than the existing rule-based detectors when the test data is similar to the training data. When the training and test data differ, Bottleneck, owing to its automated learning and inference models, easily surpasses the accuracy of the rule-based systems. Moreover, Bottleneck’s stochastic nature allows its accuracy to be tuned with respect to organizational needs. Extending Bottleneck’s Bayesian network into an influence diagram allows local security policies to be defined within our framework. Lastly, we show that Bottleneck can also be extended to incorporate evidence trust scores for false alarm reduction.
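To make the idea of a tunable Bayesian decision engine concrete, the following is a minimal sketch and not Bottleneck's actual network. It assumes a simplified structure in which lifecycle events are conditionally independent given the infection state; the event names, the conditional probabilities, and the prior are all made up for illustration.

```python
from math import prod

# Hypothetical lifecycle events with made-up conditional probabilities:
# (P(event observed | infected), P(event observed | clean)).
CPT = {
    "inbound_scan": (0.60, 0.20),
    "exploit": (0.70, 0.05),
    "egg_download": (0.80, 0.02),
    "cnc_traffic": (0.90, 0.01),
}
PRIOR_INFECTED = 0.01  # assumed prior, not taken from the paper


def posterior_infected(observed, prior=PRIOR_INFECTED, cpt=CPT):
    """Return P(infected | evidence) for a dict mapping event -> bool.

    Events missing from `observed` are treated as unobserved and skipped,
    so the posterior is still defined when some evidence is unavailable.
    """
    def likelihood(index):
        # Multiply P(e) for seen events and 1 - P(e) for events seen absent.
        return prod(
            cpt[event][index] if seen else 1.0 - cpt[event][index]
            for event, seen in observed.items()
        )

    infected = prior * likelihood(0)
    clean = (1.0 - prior) * likelihood(1)
    return infected / (infected + clean)


def is_bot(observed, threshold=0.5):
    """Declare an infection when the posterior reaches the chosen threshold."""
    return posterior_infected(observed) >= threshold
```

Raising `threshold` trades detections for fewer false alarms, which is the kind of tuning a fixed rule set does not offer.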
Notes
Reference is not included due to double blind review.
While some rule-based engines soften the impact of these fundamental problems using regression-based weight assignment and soft timers [3], these schemes lack formal rigor and remain susceptible to overfitting.
A few variants of the Conficker botnet copy these DLL-form bot binaries to removable media, but since these events cannot be extracted from the network trace, we do not include them in our lifecycle. Similarly, the bot also performs password cracking to copy itself to the ADMIN$ folder, updates registry values, and performs other configuration changes, which we also do not consider as lifecycle events because they cannot be extracted from the network trace.
Implies a network of clean hosts. In the SysNet Lab dataset, the benign network traffic was collected from the hosts in the lab and it was necessary to ensure that the benign data was clean and free of any bot traffic for judicious evaluation.
BotHunter’s three conditions are (1) evidence of (local host infection AND outward bot coordination or attack); (2) at least two distinct signs of outward bot coordination or attack; and (3) evidence that a host attempts communication with a confirmed malware site (E8[rb]).
E8[rb] is not a bot lifecycle event.
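BotHunter's three conditions can be read as a simple predicate over the event labels seen for one host. The sketch below assumes that meeting any one condition is enough for a declaration; which labels count as infection or outward evidence is passed in by the caller, and the labels used in the test values are hypothetical. Only `E8[rb]` is taken from the text.

```python
def bothunter_declares_bot(events, infection_events, outward_events,
                           malware_site_event="E8[rb]"):
    """Evaluate the three conditions on the event labels seen for one host."""
    events = set(events)
    infection_seen = events & set(infection_events)
    outward_seen = events & set(outward_events)

    # (1) local host infection AND outward bot coordination or attack
    cond1 = bool(infection_seen) and bool(outward_seen)
    # (2) at least two distinct signs of outward bot coordination or attack
    cond2 = len(outward_seen) >= 2
    # (3) communication attempt with a confirmed malware site
    cond3 = malware_site_event in events

    # Assumption: any single condition triggers a declaration.
    return cond1 or cond2 or cond3
```

Because the outcome is a fixed Boolean combination, there is no score to threshold, which is the rigidity that a probabilistic engine is meant to remove.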
The steepness of the curve is due to the small size of the SysNet data trace, which includes ten bot infections and benign data from 22 hosts. The trace therefore contains a limited number of instances, and these are not uniformly distributed with respect to the threshold, which results in the steep ROC curve.
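The effect of a small trace on the ROC curve can be illustrated with a short sketch. The scores below are synthetic and made up for illustration; only the counts (ten infected hosts, 22 benign hosts) follow the note above. The function is a plain threshold sweep, not the evaluation code used in the paper.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Sweep thresholds from strictest to most permissive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Synthetic posteriors (made up): 10 infected hosts followed by 22 benign hosts.
scores = [0.95] * 8 + [0.60] * 2 + [0.60] * 1 + [0.05] * 21
labels = [1] * 10 + [0] * 22
points = roc_points(scores, labels)
```

With scores clustered at a few values, the curve has only a handful of points and jumps almost vertically between them.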
BotHunter’s three conditions are (1) evidence of (local host infection AND outward bot coordination or attack); (2) at least two distinct signs of outward bot coordination or attack; and (3) evidence that a host attempts communication with a confirmed malware site.
References
Bencsáth, B., Pék, G., Buttyán, L., Félegyházi, M.: Duqu: analysis, detection, and lessons learned. In: 2012 ACM European Workshop on System Security (EuroSec), vol. 2012 (2012)
Falliere, N., Murchu, L.O., Chien, E.: W32.Stuxnet dossier. In: White Paper, Symantec Corp., Security Response, 2011, online 5 June (2013)
Gu, G., Porras, P., Yegneswaran, V., Fong, M., Lee, W.: BotHunter: detecting malware infection through IDS-driven dialog correlation. In: Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, SS’07. USENIX Association, Berkeley, pp. 12:1–12:16 (2007). http://dl.acm.org/citation.cfm?id=1362903.1362915
Khattak, S., Ahmed, Z., Syed, A. A., Khayam, S.A.: Poster: Botflex: a community-driven tool for botnet detection, online 17 May (2013)
Jensen, F.V., Nielsen, T.D.: Bayesian Networks and Decision Graphs. Springer, New York (2007)
Sommer, R., Paxson, V.: Outside the closed world: on using machine learning for network intrusion detection. In: 2010 IEEE Symposium on Security and Privacy (SP), pp. 305–316 (2010). doi:10.1109/SP.2010.25
Ramay, N.R., Khattak, S., Syed, A.A., Khayam, S.A.: Poster: Bottleneck: a generalized, flexible, and extensible framework for botnet defense, online 13 April (2013)
Cooper, G., Herskovits, E.: A Bayesian method for the induction of probabilistic networks from data. Mach. Learn. 9, 309–347 (1992). doi:10.1007/BF00994110
Cheng, J., Greiner, R.: Learning Bayesian belief network classifiers: algorithm and systems. In: Stroulia, E., Matwin, S. (eds.) Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 2056, pp. 141–151. Springer, Berlin, Heidelberg (2001)
Netica programmers library reference manual. http://www.norsys.com/netica-j/docs/NeticaJ_Man.pdf, online 20 May (2013)
Spiegelhalter, D. J., Dawid, A.P., Lauritzen, S.L., Cowell, R.G.: Bayesian analysis in expert systems. Stat. Sci. 219–247 (1993)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, New York (1988)
Conficker: https://mil.fireeye.com/edp.php?sname=Bot.Conficker, online June (2013)
Inside the storm: https://www.blackhat.com/presentations/bh-usa-08/Stewart/BH_US_08_Stewart_Protocols_of_the_Storm.pdf
Roesch, M., et al.: Snort-lightweight intrusion detection for networks. In: Proceedings of the 13th USENIX Conference on System Administration, Seattle, Washington, pp. 229–238 (1999)
Bro: http://www.bro.org/, online 10 April (2013)
Kruegel, C., Mutz, D., Robertson, W., Valeur, F.: Bayesian event classification for intrusion detection. In: 19th Annual Proceedings Computer Security Applications Conference, pp. 14–23. IEEE, New York (2003)
CERT Polska: http://www.cert.pl/PDF/Report_Virut_EN.pdf, online 25 February (2013)
Nayatel: http://www.nayatel.pk/index.php, online April (2013)
Team cymru: https://www.team-cymru.org/, online April (2013)
The ICSI networking and security group. http://www.icir.org/, online 13 May (2013)
Bottleneck: http://sysnet.org.pk/w/Code_and_Tools#Bottleneck, online 15 June (2013)
Stewart, J.: Inside the storm: protocols and encryption of the storm botnet. In: Black Hat Technical Security Conference, New York (2008)
Emerging threats malware rulesets: http://www.emergingthreats.net, online 5 August (2013)
Poole, D., Mackworth, A.: Artificial intelligence: foundations of computational agents, online November (2013)
Netica-j reference manual—Norsys Software Corp. http://www.norsys.com/downloads/NeticaJ_Man_418.pdf, online 16 September (2013)
Costa, E., Lorena, A., Carvalho, A., Freitas, A.: A review of performance evaluation measures for hierarchical classifiers. In: Evaluation Methods for Machine Learning II: Papers from the AAAI-2007 Workshop, pp. 1–6 (2007)
Lozano, J.A., Santafé, G., Inza, I.: Classifier performance evaluation and comparison. In: International Conference on Machine Learning and Applications (ICMLA 2010). http://www.icmla-conference.org/icmla10/CFP_Tutorial_files/jose.pdf
Han, J., Kamber, M., Pei, J.: Data Mining Concepts and Techniques, 3rd edn. http://www.amazon.de/Data-Mining-Concepts-Techniques-Management/dp/0123814790/ref=tmm_hrd_title_0?ie=UTF8&qid=1366039033&sr=1-1 (2012)
Invernizzi, L., Miskovic, S., Torres, R., Saha, S., Lee, S., Mellia, M., Kruegel, C., Vigna, G.: Nazca: detecting malware distribution in large-scale networks. In: Proceedings of the Network and Distributed System Security Symposium (NDSS) (2014)
Kapravelos, A., Shoshitaishvili, Y., Cova, M., Kruegel, C., Vigna, G.: Revolver: an automated approach to the detection of evasive web-based malware. In: USENIX Security, Citeseer, pp. 637–652 (2013)
Chinchani, R., Van Den Berg, E.: A fast static analysis approach to detect exploit code inside network flows. In: Recent Advances in Intrusion Detection. Springer, New York, pp. 284–308 (2006)
Baldoni, R., Di Luna, G.A., Querzoni, L.: Collaborative detection of coordinated port scans. In: Distributed Computing and Networking. Springer, New York, pp. 102–117 (2013)
Muelder, C., Ma, K.-L., Bartoletti, T.: Interactive visualization for network and port scan detection. In: Recent Advances in Intrusion Detection. Springer, New York, pp. 265–283 (2006)
Zargar, S.T., Joshi, J., Tipper, D.: A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Surv. Tutor. 15(4), 2046–2069 (2013), online 28 May (2013)
Feinstein, L., Schnackenberg, D., Balupari, R., Kindred, D.: Statistical approaches to ddos attack detection and response. In: Proceedings of the DARPA Information Survivability Conference and Exposition, vol. 1. IEEE, New York, pp. 303–314 (2003)
Zhao, Y., Xie, Y., Yu, F., Ke, Q., Yu, Y., Chen, Y., Gillum, E.: Botgraph: large scale spamming botnet detection. In: NSDI, vol. 9, pp. 321–334 (2009)
Nelms, T., Perdisci, R., Ahamad, M.: Execscent: mining for new c&c domains in live networks with adaptive control protocol templates. In: USENIX Security, pp. 589–604 (2013)
Perdisci, R., Ariu, D., Giacinto, G.: Scalable fine-grained behavioral clustering of http-based malware. Comput. Netw. 57(2), 487–500 (2013)
Goebel, J., Holz, T.: Rishi: identify bot contaminated hosts by IRC nickname evaluation. In: Proceedings of the First Conference on First Workshop on Hot Topics in Understanding Botnets, Cambridge, p. 8 (2007)
Saad, S., Traore, I., Ghorbani, A., Sayed, B., Zhao, D., Lu, W., Felix, J., Hakimian, P.: Detecting p2p botnets through network behavior analysis and machine learning. In: 2011 Ninth Annual International Conference on Privacy, Security and Trust (PST), pp. 174–180. IEEE, New York (2011)
Hsu, C.-H., Huang, C.-Y., Chen, K.-T.: Fast-flux bot detection in real time. In: Recent Advances in Intrusion Detection, pp. 464–483. Springer, New York (2010)
Antonakakis, M., Perdisci, R., Nadji, Y., Vasiloglou, N., Abu-Nimeh, S., Lee, W., Dagon, D.: From throw-away traffic to bots: detecting the rise of DGA-based malware. In: Proceedings of the 21st USENIX Security Symposium (2012)
Khattak, S., Ramay, N., Khan, K., Syed, A., Khayam, S.: A taxonomy of botnet behavior, detection and defense. IEEE Commun. Surv. Tutor., online June (2014)
Abu Rajab, M., Zarfoss, J., Monrose, F., Terzis, A.: A multifaceted approach to understanding the botnet phenomenon. In: Proceedings of the 2006 ACM SIGCOMM Internet Measurement Conference (IMC), vol. 2006 (2006)
Gu, G., Perdisci, R., Zhang, J., Lee, W., et al.: Botminer: clustering analysis of network traffic for protocol-and structure-independent botnet detection. In: USENIX Security Symposium, pp. 139–154 (2008)
Silva, S.S., Silva, R.M., Pinto, R.C., Salles, R.M.: Botnets: a survey. Comput. Netw. 57(2), 378–403 (2013)
Hachem, N., Ben Mustapha, Y., Granadillo, G.G., Debar, H.: Botnets: lifecycle and taxonomy. In: 2011 Conference on Network and Information Systems Security (SAR-SSI), pp. 1–8. IEEE, New York (2011)
Lu, C., Brooks, R.: Botnet traffic detection using hidden markov models. In: Proceedings of the Seventh Annual Workshop on Cyber Security and Information Intelligence Research, p. 31. ACM, New York (2011)
Kidmose, E.: Botnet detection using hidden Markov models. Master’s thesis, Aalborg University (2014)
Cite this article
Ashfaq, A.B., Abaid, Z., Ismail, M. et al. Diagnosing bot infections using Bayesian inference. J Comput Virol Hack Tech 14, 21–38 (2018). https://doi.org/10.1007/s11416-016-0286-y
DOI: https://doi.org/10.1007/s11416-016-0286-y