Chi-squared distance and metamorphic virus detection

  • Annie H. Toderici
  • Mark Stamp 
Original Paper


Metamorphic malware changes its internal structure with each generation while maintaining its original behavior. Current commercial antivirus software generally scans for known malware signatures and therefore cannot detect metamorphic malware that sufficiently morphs its internal structure. Machine learning methods such as hidden Markov models (HMMs) have shown promise for detecting hacker-produced metamorphic malware. However, previous research has shown that it is possible to evade HMM-based detection by carefully morphing with content from benign files. In this paper, we combine HMM detection with a statistical technique based on the chi-squared test to build an improved detection method. We discuss our technique in detail and provide experimental evidence to support our claim of improved detection.
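
To make the idea of a chi-squared-based score concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that a file is represented as a sequence of opcodes and that a reference profile of expected opcode frequencies is available; both the representation and the names `chi_squared_distance` and `family_profile` are hypothetical. The score is the standard Pearson chi-squared statistic, so a file whose opcode counts match the profile scores near zero and a file that deviates scores higher.

```python
from collections import Counter


def chi_squared_distance(observed_opcodes, expected_freq):
    """Pearson chi-squared statistic of observed opcode counts against
    expected relative frequencies (a dict mapping opcode -> probability).

    Opcodes that do not appear in expected_freq are ignored in this sketch.
    """
    n = len(observed_opcodes)
    counts = Counter(observed_opcodes)
    stat = 0.0
    for opcode, p in expected_freq.items():
        expected = n * p  # expected count under the reference profile
        if expected > 0:
            stat += (counts.get(opcode, 0) - expected) ** 2 / expected
    return stat


# Hypothetical reference profile, e.g. estimated from known family members.
family_profile = {"mov": 0.5, "push": 0.25, "call": 0.25}

similar = ["mov", "mov", "push", "call"]      # matches the profile exactly
different = ["push", "push", "push", "call"]  # contains no mov at all

print(chi_squared_distance(similar, family_profile))    # 0.0
print(chi_squared_distance(different, family_profile))  # 6.0
```

How such a score would be thresholded or combined with an HMM score is not stated in the abstract, so the sketch leaves that step out.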


Keywords: Hidden Markov Model · Dead Code · Metamorphic Virus · Hidden Markov Model Classifier · Benign File



Copyright information

© Springer-Verlag France 2012

Authors and Affiliations

  1. Department of Computer Science, San Jose State University, San Jose, USA
