Supervised Detection of Infected Machines Using Anti-virus Induced Labels

(Extended Abstract)
  • Tomer Cohen
  • Danny Hendler
  • Dennis Potashnik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10332)


Traditional antivirus software relies on signatures to uniquely identify malicious files. Malware writers have responded by developing obfuscation techniques that aim to evade content-based detection. A consequence of this arms race is that numerous new malware instances are generated every day, which limits the effectiveness of static detection approaches. For effective and timely malware detection, signature-based mechanisms must be augmented with detection approaches that are harder to evade.

We introduce a novel detector that uses the information gathered by IBM’s QRadar SIEM (Security Information and Event Management) system and leverages anti-virus reports for automatically generating a labelled training set for identifying malware. Using this training set, our detector is able to automatically detect complex and dynamic patterns of suspicious machine behavior and issue high-quality security alerts. We believe that our approach can be used for providing a detection scheme that complements signature-based detection and is harder to circumvent.
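The labelling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the event schema, the per-day grouping of events, the event-type names, and the function name `build_training_set` are all assumptions made for illustration, and the abstract does not specify which features or time windows the detector uses. The sketch only shows how anti-virus reports might serve as labels for per-machine behaviour records.

```python
from collections import Counter, defaultdict


def build_training_set(events, av_reports):
    """Group events into (machine, day) instances and label them from AV reports.

    events: iterable of (machine, day, event_type) tuples (hypothetical schema).
    av_reports: iterable of (machine, day) pairs on which anti-virus
        software flagged the machine (hypothetical schema).
    Returns a list of (machine, day, feature_counts, label) rows,
    where label 1 means "reported as infected" and 0 means "no report".
    """
    flagged = set(av_reports)

    # Count how often each event type occurs per machine per day.
    counts = defaultdict(Counter)
    for machine, day, event_type in events:
        counts[(machine, day)][event_type] += 1

    # Attach the anti-virus induced label to every instance.
    rows = []
    for (machine, day), feats in sorted(counts.items()):
        label = 1 if (machine, day) in flagged else 0
        rows.append((machine, day, dict(feats), label))
    return rows


# Toy data, invented for this example only.
events = [
    ("host-a", "2017-01-01", "dns_failure"),
    ("host-a", "2017-01-01", "dns_failure"),
    ("host-a", "2017-01-01", "firewall_deny"),
    ("host-b", "2017-01-01", "login"),
]
av_reports = [("host-a", "2017-01-01")]

training_set = build_training_set(events, av_reports)
for row in training_set:
    print(row)
```

Rows labelled this way could then be passed to any supervised learner. Note that machines without an anti-virus report are treated as clean here, which is itself a simplifying assumption.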



This research was supported by IBM’s Cyber Center of Excellence in Beer Sheva and by the Cyber Security Research Center and the Lynne and William Frankel Center for Computing Science at Ben-Gurion University. We thank Yaron Wolfshtal from IBM for allowing Tomer to use IBM’s facilities, for providing us with the data on which this research is based, and for many helpful discussions.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel
  2. IBM Cyber Center of Excellence, Beer Sheva, Israel
