Packer Identification Using Hidden Markov Model

Part of the Lecture Notes in Computer Science book series (LNAI, volume 10607)

Abstract

Most modern malware is packed to evade anti-virus software. Packers apply various obfuscation techniques to hide a program's true behavior from static analysis, so handling packed malware remains a hard problem. This paper proposes a novel approach to packer detection that combines the BE-PUM tool with a Hidden Markov Model (HMM). First, BE-PUM detects the sequence of obfuscation techniques that may be embedded in the analyzed binary. Then, the HMM estimates from the generated sequence how likely it is that a packer is present. Because HMMs are effective for pattern recognition, the proposed technique can accurately identify the packers deployed in binary files. We performed experiments on more than 2000 real-world malware samples taken from VirusShare, and the results are very promising.
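The second step can be illustrated with a minimal sketch. It assumes one discrete HMM per packer and picks the packer whose model gives the observed sequence the highest likelihood, computed with the standard scaled forward algorithm. The abstract does not state how the models are built or compared, so this scoring scheme is an assumption, and the packer names, the integer technique IDs (0, 1, 2) and all probabilities below are hypothetical toy values, not figures from the paper.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(start)
    # alpha[i]: probability of the first observation, ending in state i
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t]
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob

def identify_packer(obs, models):
    """Score obs against every model and return (best_name, all_scores)."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy models: (start, transition, emission) with 2 hidden
# states and 3 observable obfuscation-technique IDs. Values are made up.
MODELS = {
    "packer_A": (
        [0.9, 0.1],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
    ),
    "packer_B": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [[0.1, 0.1, 0.8], [0.1, 0.2, 0.7]],
    ),
}

if __name__ == "__main__":
    best, scores = identify_packer([0, 0, 1, 1], MODELS)
    print(best, scores)
```

In practice the observation sequence would come from the technique detector rather than being typed by hand, and the model parameters would be estimated from labeled samples. Working in log space with per-step rescaling keeps long sequences from underflowing.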

Keywords

  • Malware
  • Obfuscation techniques
  • Packers
  • Hidden Markov Model
  • BE-PUM


Acknowledgments

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2015.16.

Author information

Corresponding author

Correspondence to Nguyen Minh Hai.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hai, N.M., Tho, Q.T. (2017). Packer Identification Using Hidden Markov Model. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_8

  • DOI: https://doi.org/10.1007/978-3-319-69456-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69455-9

  • Online ISBN: 978-3-319-69456-6

  • eBook Packages: Computer Science, Computer Science (R0)