Simple substitution distance and metamorphic detection

  • Gayathri Shanmugam
  • Richard M. Low
  • Mark Stamp
Original Paper


To evade signature-based detection, metamorphic viruses transform their code before each new infection. Software similarity measures are a potentially useful means of detecting such malware. We can compare a given file to a known sample of metamorphic malware and compute their similarity; if the two are sufficiently similar, we classify the file as malware of the same family. In this paper, we analyze an opcode-based software similarity measure inspired by simple substitution cipher cryptanalysis. We show that the technique provides a useful means of classifying metamorphic malware.
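As a rough illustration of the idea, and not the authors' implementation, the sketch below treats the opcode sequence of a file as "ciphertext" and the opcode digram statistics of a known family sample as the "language". It then searches greedily for the opcode relabelling (a simple substitution key) that best aligns the two and returns the remaining mismatch as a distance. The function names, the absolute-difference score, and the pairwise-swap search are all assumptions made for this example; a low distance would suggest the file resembles the family.

```python
from itertools import combinations


def digram_matrix(opcodes, alphabet):
    """Relative digram frequencies over `alphabet`; other opcodes are skipped."""
    index = {op: i for i, op in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0.0] * n for _ in range(n)]
    total = 0
    for a, b in zip(opcodes, opcodes[1:]):
        if a in index and b in index:
            counts[index[a]][index[b]] += 1
            total += 1
    if total:
        counts = [[c / total for c in row] for row in counts]
    return counts


def distance(expected, observed, key):
    """Sum of absolute differences after relabelling observed symbols by `key`."""
    n = len(key)
    return sum(
        abs(expected[i][j] - observed[key[i]][key[j]])
        for i in range(n)
        for j in range(n)
    )


def substitution_distance(family_ops, file_ops, alphabet):
    """Hypothetical score: best digram mismatch found by greedy key swaps."""
    expected = digram_matrix(family_ops, alphabet)
    observed = digram_matrix(file_ops, alphabet)
    key = list(range(len(alphabet)))  # start from the identity substitution
    best = distance(expected, observed, key)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(key)), 2):
            key[i], key[j] = key[j], key[i]  # try swapping two key entries
            score = distance(expected, observed, key)
            if score < best - 1e-12:
                best = score
                improved = True
            else:
                key[i], key[j] = key[j], key[i]  # undo a swap that did not help
    return best


if __name__ == "__main__":
    alphabet = ["mov", "push"]  # toy alphabet, assumed for illustration
    family = ["mov", "mov", "push", "mov", "mov", "push"]
    relabelled = ["push", "push", "mov", "push", "push", "mov"]
    print(substitution_distance(family, relabelled, alphabet))
```

In practice the alphabet would be the most frequent opcodes in the family sample, and a classification threshold on the distance would have to be chosen empirically. A greedy search of this kind can stop in a local minimum, so the returned value is only an upper bound on the best achievable mismatch.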


Keywords: Receiver Operating Characteristic Curve; Family Virus; Simple Substitution; Metamorphic Virus; Benign File



Copyright information

© Springer-Verlag France 2013

Authors and Affiliations

  • Gayathri Shanmugam (1)
  • Richard M. Low (2)
  • Mark Stamp (3)
  1. Department of Computer Science, San Jose State University, San Jose, USA
  2. Department of Mathematics, San Jose State University, San Jose, USA
  3. Department of Computer Science, San Jose State University, San Jose, USA
