Abstract
Most modern malware is packed in order to evade anti-virus software. Packers apply various obfuscation techniques to hide a program's true behavior from static analysis, so handling packed malware remains a difficult problem. This paper proposes a novel approach to packer detection that combines the BE-PUM tool with a Hidden Markov Model. First, BE-PUM is applied to detect the sequence of possible obfuscation techniques embedded in the analyzed binary. Then, a Hidden Markov Model is used to estimate, from the generated sequences, the likelihood that a packer is present. Because Hidden Markov Models are effective for pattern recognition, the proposed technique can accurately identify the packers deployed in binary files. We performed experiments on more than 2000 real-world malware samples taken from VirusShare, and the results are promising.
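The second step can be illustrated with a minimal sketch, assuming one discrete Hidden Markov Model per packer and a maximum-likelihood decision. The abstract does not give the model topology, the parameters, or the decision rule, so everything below is hypothetical: the packer names, the three technique IDs (0, 1, 2), and all probabilities. The input sequence stands in for the obfuscation-technique sequence that BE-PUM would produce.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n_states = len(start)
    # Forward variables for the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of each scale sums to log P(obs).
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def identify_packer(obs, models):
    """Return the name of the model that assigns obs the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical two-state, left-to-right models: (start, transition, emission).
# Columns of the emission matrices are the assumed technique IDs 0, 1, 2.
MODELS = {
    "packer_A": (
        [1.0, 0.0],
        [[0.5, 0.5], [0.0, 1.0]],
        [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]],
    ),
    "packer_B": (
        [1.0, 0.0],
        [[0.5, 0.5], [0.0, 1.0]],
        [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]],
    ),
}

if __name__ == "__main__":
    print(identify_packer([0, 0, 2, 2], MODELS))
```

With these made-up parameters, the sequence `[0, 0, 2, 2]` is attributed to `packer_A`, because `packer_B` gives technique 2 a low emission probability in both of its states. In practice the parameters would be estimated from labeled sequences rather than written by hand, and a threshold on the best score would be needed to report that no known packer is present.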
Keywords
- Malware
- Obfuscation techniques
- Packers
- Hidden Markov Model
- BE-PUM
Acknowledgments
This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2015.16.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Hai, N.M., Tho, Q.T. (2017). Packer Identification Using Hidden Markov Model. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_8
DOI: https://doi.org/10.1007/978-3-319-69456-6_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69455-9
Online ISBN: 978-3-319-69456-6
eBook Packages: Computer Science, Computer Science (R0)