Metamorphic code generation from LLVM bytecode

Original Paper

Abstract

Metamorphic software changes its internal structure across generations while its functionality remains unchanged. Malware writers have employed metamorphism to evade signature detection and other advanced detection strategies. However, code morphing also has potential security benefits, since it can serve to increase the “genetic diversity” of software. We have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures. It uses a common intermediate representation (IR) bytecode in its optimizer. Consequently, any supported high-level programming language is transformed to this IR bytecode as part of the LLVM compilation process. Our metamorphic generator functions at the IR bytecode level, which provides many advantages over morphing at the assembly or source code level. The morphing techniques that we employ include dead code insertion and transposition. The inserted dead code is actually executed within the morphed code, which makes its detection and removal more challenging. We have verified the effectiveness of our code morphing using hidden Markov model analysis.
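The two morphing techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' generator and does not use LLVM: it assumes a hypothetical toy three-address instruction format, and the names `run`, `independent`, `morph`, and the `dead0`, `dead1`, ... scratch registers are invented for illustration. The inserted instructions are executed by the interpreter, but they write only to scratch registers that the original program never reads, and adjacent instructions are swapped only when they have no data dependence.

```python
import random

# Hypothetical toy IR (NOT LLVM IR): each instruction is (dest, op, arg1, arg2),
# where an argument is either a register name (str) or an integer constant.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run(program, inputs, result_reg="r"):
    """Interpret the toy program and return the value left in result_reg."""
    regs = dict(inputs)
    for dest, op, a, b in program:
        va = regs[a] if isinstance(a, str) else a
        vb = regs[b] if isinstance(b, str) else b
        regs[dest] = OPS[op](va, vb)
    return regs[result_reg]


def independent(i1, i2):
    """True if two adjacent instructions can be swapped safely:
    neither writes a register that the other reads or writes."""
    d1, _, a1, b1 = i1
    d2, _, a2, b2 = i2
    return d1 != d2 and d1 not in (a2, b2) and d2 not in (a1, b1)


def morph(program, seed, input_regs, n_dead=3):
    """Return a structurally different program with the same result.

    Assumes no register in `program` is named dead0, dead1, ...
    """
    rng = random.Random(seed)
    out = list(program)

    # Dead code insertion: these instructions DO execute, but they write only
    # to fresh scratch registers, so the program's result is unaffected.
    for k in range(n_dead):
        pos = rng.randrange(len(out) + 1)
        dead = (f"dead{k}", rng.choice(sorted(OPS)),
                rng.choice(input_regs), rng.randrange(1, 100))
        out.insert(pos, dead)

    # Transposition: swap adjacent instructions that have no data dependence.
    for i in range(len(out) - 1):
        if independent(out[i], out[i + 1]) and rng.random() < 0.5:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


# Base program computing r = (x + y) * (x - y).
base = [
    ("t1", "add", "x", "y"),
    ("t2", "sub", "x", "y"),
    ("r", "mul", "t1", "t2"),
]

if __name__ == "__main__":
    args = {"x": 7, "y": 3}
    for seed in range(3):
        variant = morph(base, seed, ["x", "y"])
        print(seed, run(variant, args), variant)
```

Each seed yields a longer, reordered instruction sequence that still computes the same value as `base`. A real IR-level generator would also have to handle control flow, memory effects, and optimizer passes that might strip unused results, none of which this sketch models.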

Keywords

Hidden Markov Model · Intermediate Representation · Base File · Dead Code · Hidden Markov Model Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag France 2013

Authors and Affiliations

  1. Department of Computer Science, San Jose State University, San Jose, USA
