Malware Biodiversity Using Static Analysis

  • Jeremy D. Seideman
  • Bilal Khan
  • Antonio Cesar Vargas
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 523)

Abstract

Malware changes constantly and is released rapidly, which it must do to remain effective in a changing computing landscape. Some malware files are related to each other; studies that find malware samples to be similar often base that determination on common behavior or code. Given that new malware is often developed from existing malware, some code fragments, behaviors, and techniques may influence further development more than others. We propose a method for determining the extent to which previously released malware influences the development of new malware. The method lets us examine how malware changes over time and observe trends in the changing malware landscape. This method, which involves a historical study of malware, can then be extended to investigate specific behaviors or code fragments. Our results show that, under the comparison method we used, over 64% of the malware samples we analyzed contribute to the biodiversity of the malware ecosystem and influence new malware development.

Keywords

  • Cosine Similarity
  • Discovery Date
  • Infection Technique
  • Biodiversity Calculation
  • Biodiversity Curve
These keywords were added by machine and not by the authors.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jeremy D. Seideman (1)
  • Bilal Khan (2)
  • Antonio Cesar Vargas (3)
  1. The Graduate Center, City University of New York, New York, USA
  2. Department of Mathematics and Computer Science, John Jay College, CUNY, New York, USA
  3. NacoLabs Consulting, LLC, New York, USA
