Metamorphic code generation from LLVM bytecode

  • Teja Tamboli
  • Thomas H. Austin
  • Mark Stamp
Original Paper


Metamorphic software changes its internal structure from one generation to the next while its functionality remains unchanged. Malware writers have used metamorphism to evade signature detection and other, more advanced detection strategies. However, code morphing also has potential security benefits, since it can increase the “genetic diversity” of software. We have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures, and its optimizer works on a common intermediate representation (IR) bytecode. Consequently, a program written in any supported high-level language is transformed to this IR bytecode as part of the LLVM compilation process. Our metamorphic generator operates at the IR bytecode level, which provides many advantages over morphing at the assembly or source code level. The morphing techniques that we employ include dead code insertion and transposition. The dead code is actually executed within the morphed code, which makes its detection and removal more challenging. We have verified the effectiveness of our code morphing using hidden Markov model analysis.
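The two morphing techniques named in the abstract, dead code insertion and transposition, can be illustrated with a minimal sketch. It assumes a toy three-address instruction format and a tiny interpreter written for this example; it is not LLVM IR and not the authors' generator. All names (`run`, `insert_dead_code`, `transpose`, `morph`, the `dead0`, `dead1`, ... registers) are hypothetical. As in the abstract, the inserted dead instructions are actually executed, but their results are never read by the live code.

```python
import random

# Toy three-address instructions: (dest, op, lhs, rhs).
# Operands are register names (str) or integer constants.
# Assumption: this format only stands in for IR bytecode; it is not LLVM IR.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def value(regs, operand):
    """Look up a register, or return a constant unchanged."""
    return regs[operand] if isinstance(operand, str) else operand

def run(program, inputs, result_reg):
    """Execute every instruction in order and return one register."""
    regs = dict(inputs)
    for dest, op, lhs, rhs in program:
        regs[dest] = OPS[op](value(regs, lhs), value(regs, rhs))
    return regs[result_reg]

def insert_dead_code(program, input_regs, rng, count):
    """Insert instructions that execute but write only to fresh registers.

    Assumes the original program never uses registers named dead0, dead1, ...
    """
    out = list(program)
    for n in range(count):
        dead = (f"dead{n}", rng.choice(sorted(OPS)),
                rng.choice(input_regs), rng.randrange(1, 100))
        out.insert(rng.randrange(len(out) + 1), dead)
    return out

def independent(first, second):
    """True if two adjacent instructions have no data dependency."""
    d1, _, a1, b1 = first
    d2, _, a2, b2 = second
    return d1 != d2 and d1 not in (a2, b2) and d2 not in (a1, b1)

def transpose(program, rng, attempts):
    """Swap randomly chosen adjacent instructions when it is safe to do so."""
    out = list(program)
    for _ in range(attempts):
        if len(out) < 2:
            break
        i = rng.randrange(len(out) - 1)
        if independent(out[i], out[i + 1]):
            out[i], out[i + 1] = out[i + 1], out[i]
    return out

def morph(program, input_regs, seed, dead_count=4, swap_attempts=20):
    """Produce one morphed variant; different seeds give different variants."""
    rng = random.Random(seed)
    padded = insert_dead_code(program, input_regs, rng, dead_count)
    return transpose(padded, rng, swap_attempts)

# Base program: r = (x + y) * (x - y)
base = [
    ("t0", "add", "x", "y"),
    ("t1", "sub", "x", "y"),
    ("r", "mul", "t0", "t1"),
]

if __name__ == "__main__":
    inputs = {"x": 7, "y": 3}
    for seed in range(3):
        variant = morph(base, list(inputs), seed)
        # Structure differs between variants, behavior does not.
        assert run(variant, inputs, "r") == run(base, inputs, "r")
        print(seed, variant)
```

Each seed yields a structurally different instruction sequence that computes the same result as the base program. A real generator working on IR bytecode would also have to handle control flow, memory operations, and side effects, which this sketch ignores.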


Hidden Markov Model · Intermediate Representation · Base File · Dead Code · Hidden Markov Model Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag France 2013

Authors and Affiliations

  1. Department of Computer Science, San Jose State University, San Jose, USA
