Towards a Methodical Evaluation of Antivirus Scans and Labels

“If You’re Not Confused, You’re Not Paying Attention”

  • Conference paper
Information Security Applications (WISA 2013)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8267)

Abstract

In recent years, researchers have relied heavily on labels provided by antivirus (AV) companies to establish ground truth for applications and algorithms of malware detection, classification, and clustering. Furthermore, companies use those labels to guide their mitigation and disinfection efforts. Ironically, however, there is no prior systematic work that validates the performance of antivirus vendors, the reliability of those labels (or even detections), or how they affect the said applications. Equipped with malware samples from several malware families that are manually inspected and labeled, we pose the following questions: How do different antivirus scans perform relative to one another? How correct are the labels given by those scans? How consistent are AV scans with one another? Our answers to these questions reveal alarming results about the correctness, completeness, coverage, and consistency of the labels utilized by much existing research. We invite the research community to challenge the assumption that antivirus scans and labels can serve as ground truth for evaluating malware analysis and classification techniques.

Notes

  1. Ironically, some of those works pointed out the problem and yet used AV-provided labels for validating their malware clustering algorithms [3, 15].

  2. We use malware samples accumulated over a period of a year (mid 2011 to 2012). As we will see later, this would give the AV vendors an advantage and might overestimate their performance compared to more emerging threats (APT).

  3. The greedy strategy, by adding the AV scan with least overlap to the current set, is the best known approximation [21].
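
The greedy strategy mentioned in the last note can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each AV scan is represented as the set of sample identifiers it detects, and it reads "least overlap with the current set" as the standard greedy set-cover rule of picking the scan that adds the most not-yet-covered samples. The function name `greedy_cover`, the scan names, and the sample identifiers are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(detections: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick scans until no remaining scan covers a new sample.

    `detections` maps a (hypothetical) scan name to the set of sample
    IDs that scan detects. This is the textbook greedy set-cover rule,
    used here as an assumed reading of the strategy in the note.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(detections)
    while remaining:
        # Iterate names in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # every remaining scan overlaps fully with what is covered
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


# Hypothetical toy data: four scans over six samples.
detections = {
    "scan_a": {"s1", "s2", "s3"},
    "scan_b": {"s3", "s4"},
    "scan_c": {"s4", "s5", "s6"},
    "scan_d": {"s1", "s6"},
}
selected = greedy_cover(detections)
print(selected)
```

On this toy input the sketch selects `scan_a` first and then `scan_c`, after which the two remaining scans add no new samples and the loop stops. Choosing by newly covered samples rather than by raw overlap count is a design choice made for this sketch; the two rules coincide only when all scans detect the same number of samples.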

References

  1. VirusTotal - Free Online Virus, Malware and URL Scanner. https://www.virustotal.com/en/ (August 2013)

  2. Bailey, M., Oberheide, J., Andersen, J., Mao, Z.M., Jahanian, F., Nazario, J.: Automated classification and analysis of internet malware. In: Kruegel, C., Lippmann, R., Clark, A. (eds.) RAID 2007. LNCS, vol. 4637, pp. 178–197. Springer, Heidelberg (2007)

  3. Bayer, U., Comparetti, P.M., Hlauschek, C., Krügel, C., Kirda, E.: Scalable, behavior-based malware clustering. In: NDSS (2009)

  4. Kerr, D.: Ubisoft hacked; users’ e-mails and passwords exposed. http://cnet.co/14ONGDi (July 2013)

  5. Kinable, J., Kostakis, O.: Malware classification based on call graph clustering. J. Comput. Virol. 7(4), 233–245 (2011)

  6. Kong, D., Yan, G.: Discriminant malware distance learning on structural information for automated malware classification. In: Proceedings of the 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2013)

  7. Kruss, P.: Complete zeus source code has been leaked to the masses. http://www.csis.dk/en/csis/blog/3229 (March 2011)

  8. Lanzi, A., Sharif, M.I., Lee, W.: K-tracer: a system for extracting kernel malware behavior. In: NDSS (2009)

  9. Mohaisen, A., Alrawi, O.: Unveiling zeus: automated classification of malware samples. In: WWW (Companion Volume), pp. 829–832 (2013)

  10. Mozzherina, E.: An approach to improving the classification of the New York Times annotated corpus. In: Klinov, P., Mouromtsev, D. (eds.) KESW 2013. CCIS, vol. 394, pp. 83–91. Springer, Heidelberg (2013)

  11. Park, Y., Reeves, D., Mulukutla, V., Sundaravel, B.: Fast malware classification by automated behavioral graph matching. In: CSIIR Workshop, ACM (2010)

  12. Perdisci, R., Lee, W., Feamster, N.: Behavioral clustering of http-based malware and signature generation using malicious network traces. In: USENIX NSDI (2010)

  13. Rieck, K., Holz, T., Willems, C., Düssel, P., Laskov, P.: Learning and classification of malware behavior. In: Zamboni, D. (ed.) DIMVA 2008. LNCS, vol. 5137, pp. 108–125. Springer, Heidelberg (2008)

  14. Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19(4), 639–668 (2011)

  15. Rossow, C., Dietrich, C.J., Grier, C., Kreibich, C., Paxson, V., Pohlmann, N., Bos, H., van Steen, M.: Prudent practices for designing malware experiments: status quo and outlook. In: IEEE Symposium on Security and Privacy (2012)

  16. Sharif, M.I., Lanzi, A., Giffin, J.T., Lee, W.: Automatic reverse engineering of malware emulators. In: IEEE Symposium on Security and Privacy (2009)

  17. Shaw, A.: Livingsocial hacked: cyber attack affects more than 50 million customers. http://abcn.ws/15ipKsw (April 2013)

  18. Silveira, V.: An update on linkedin member passwords compromised. http://linkd.in/Ni5aTg (July 2012)

  19. Strayer, W.T., Lapsley, D.E., Walsh, R., Livadas, C.: Botnet detection based on network behavior. In: Lee, W., Wang, C., Dagon, D. (eds.) Botnet Detection. Advances in Information Security, vol. 36. Springer, New York (2008)

  20. Tian, R., Batten, L., Versteeg, S.: Function length as a tool for malware classification. In: IEEE MALWARE (2008)

  21. Vazirani, V.V.: Approximation Algorithms. Springer, Heidelberg (2004)

  22. Yan, G., Brown, N., Kong, D.: Exploring discriminatory features for automated malware classification. In: Rieck, K., Stewin, P., Seifert, J.-P. (eds.) DIMVA 2013. LNCS, vol. 7967, pp. 41–61. Springer, Heidelberg (2013)

  23. Zhao, H., Xu, M., Zheng, N., Yao, J., Ho, Q.: Malicious executables classification based on behavioral factor analysis. In: IC4E (2010)

Acknowledgement

We would like to thank Andrew West for proofreading this work, and Allison Mankin and Burt Kaliski for their feedback. We would like to further thank Trevor Tonn, Ryan Olson, Brandon Dixon, Leo Fernandes, and Blake Hartstein for sharing with us the dataset and for their valuable feedback.

Author information

Corresponding author

Correspondence to Aziz Mohaisen.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Mohaisen, A., Alrawi, O., Larson, M., McPherson, D. (2014). Towards a Methodical Evaluation of Antivirus Scans and Labels. In: Kim, Y., Lee, H., Perrig, A. (eds) Information Security Applications. WISA 2013. Lecture Notes in Computer Science, vol 8267. Springer, Cham. https://doi.org/10.1007/978-3-319-05149-9_15

  • DOI: https://doi.org/10.1007/978-3-319-05149-9_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05148-2

  • Online ISBN: 978-3-319-05149-9

  • eBook Packages: Computer Science (R0)
