Dueling hidden Markov models for virus analysis

  • Ashwin Kalbhor
  • Thomas H. Austin
  • Eric Filiol
  • Sébastien Josse
  • Mark Stamp
Original Paper


Recent work has presented hidden Markov models (HMMs) as a compelling option for malware identification. However, some advanced metamorphic malware, such as MetaPHOR and MWOR, has proven more challenging to detect with these techniques. In this paper, we develop the dueling HMM strategy, which leverages our knowledge of different compilers for more precise identification. We also show how this approach can be combined with previous techniques to minimize the performance overhead. Additionally, we examine the HMMs in order to identify the meaning of their hidden states. Specifically, we examine HMMs for four different compilers, hand-written assembly code, three virus construction kits, and two metamorphic malware families, noting similarities and differences in their hidden states.
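
One plausible reading of the dueling strategy, offered as a minimal sketch rather than the authors' implementation: score an opcode sequence against several trained HMMs (for example, one per compiler and one per malware family) and label it with the best-scoring model. The abstract does not specify the scoring rule, so the per-symbol log-likelihood, the function names `log_likelihood` and `duel`, and the toy two-state models below are all assumptions made for illustration.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: return log P(obs | model).

    obs: list of symbol indices (e.g. opcode IDs)
    pi:  initial state distribution
    A:   state transition matrix, A[i][j] = P(state j | state i)
    B:   emission matrix, B[i][k] = P(symbol k | state i)
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def duel(obs, models):
    """Score obs against every model; return (best_label, all_scores).

    Scores are log-likelihood per symbol (an assumed normalization) so that
    sequences of different lengths are comparable.
    """
    scores = {
        name: log_likelihood(obs, pi, A, B) / len(obs)
        for name, (pi, A, B) in models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical hand-set models over a two-symbol alphabet (0 and 1).
# Real models would be trained on opcode sequences.
MODELS = {
    "compiler_a": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.9, 0.1], [0.8, 0.2]],  # mostly emits symbol 0
    ),
    "virus_kit": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.1, 0.9], [0.2, 0.8]],  # mostly emits symbol 1
    ),
}

if __name__ == "__main__":
    label, scores = duel([1, 1, 0, 1, 1, 1], MODELS)
    print(label, scores)
```

In practice the models would be trained (for example with Baum-Welch) on opcode sequences extracted from each compiler and malware family; the hand-set parameters here only make the scoring mechanics visible.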


Keywords (machine-generated, not supplied by the authors): Hidden Markov model; Hidden state; Assembly code; Threshold approach; Dead code



Copyright information

© Springer-Verlag France 2014

Authors and Affiliations

  • Ashwin Kalbhor (1)
  • Thomas H. Austin (1)
  • Eric Filiol (2)
  • Sébastien Josse (3)
  • Mark Stamp (1)

  1. Department of Computer Science, San José State University, San Jose, USA
  2. ESIEA, Laboratoire (C + V)^O, Laval, France
  3. Direction générale de l’armement (DGA), Rennes, France
