SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android Applications

  • Conference paper
Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous 2021)

Abstract

Software repackaging is a common approach for creating malware. Malware authors often use software repackaging to obfuscate code containing malicious payloads. This forces analysts to spend a large amount of time filtering out benign obfuscated methods in order to locate potentially malicious methods for further analysis. If an effective mechanism for filtering out benign obfuscated methods were available, the number of methods that analysts must consider could be reduced, allowing them to be more productive. In this paper, we present Semeo, an obfuscation-resilient approach for semantic equivalence analysis of Android apps. Semeo automatically determines, with high accuracy, whether a repackaged and obfuscated version of a method is semantically equivalent to its original version. Semeo also handles widely used and complicated types of obfuscation, as well as scenarios in which multiple obfuscation types are applied in tandem. Our empirical evaluation corroborates that Semeo significantly outperforms the state of the art, achieving 100% precision in identifying semantically equivalent methods across almost all apps under analysis. Semeo consistently provides over 80% recall when one or two types of obfuscation are used and 73% recall when five different types of obfuscation are applied in combination.

Notes

  1. In general, the problem of determining the semantic equivalence of two programs is undecidable [18], so our approach is necessarily a heuristic.

  2. Available from https://github.com/techexpertize/SignApk.

References

  1. Ponomarenko, A.: A tool for checking backward API/ABI compatibility of a Java library (2013). https://github.com/lvc/japi-compliance-checker

  2. Android. Dalvik bytecode (2015). https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html

  3. Fenton, C.: A pattern based Dalvik deobfuscator which uses limited execution to improve semantic analysis (2015). https://github.com/CalebFenton/dex-oracle

  4. Fenton, C.: Generic Android Deobfuscator (2015). https://github.com/CalebFenton/simplify

  5. Christodorescu, M., Jha, S.: Static analysis of executables to detect malicious patterns. In: Proceedings of the 12th Conference on USENIX Security Symposium - Volume 12, SSYM’03, Washington, DC, pp. 12 (2003)

  6. Collberg, C., Myles, G., Huntwork, A.: Sandmark-a tool for software protection research. IEEE Secur. Priv. 1(4), 40–49 (2003)

  7. Collberg, C., Nagra, J.: Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, 1st edn. Addison-Wesley Professional, Boston (2009)

  8. Contagio Mini Dump: Pokemon GO with Droidjack - Android sample (2016). http://contagiominidump.blogspot.com

  9. Crussell, J., Gibler, C., Chen, H.: Attack of the clones: detecting cloned applications on android markets. In: Foresti, S., Yung, M., Martinelli, F. (eds.) ESORICS 2012. LNCS, vol. 7459, pp. 37–54. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33167-1_3

  10. Crussell, J., Gibler, C., Chen, H.: AnDarwin: scalable detection of semantically similar android applications. In: Crampton, J., Jajodia, S., Mayes, K. (eds.) ESORICS 2013. LNCS, vol. 8134, pp. 182–199. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40203-6_11

  11. Goodin, D.: Fake Pokémon Go app on Google Play infects phones with screenlocker (2016). http://arstechnica.com/security/2016/07/fake-pokemon-go-app-on-google-play-infects-phones-with-screenlocker/

  12. DARPA: Automated Program Analysis for Cybersecurity (APAC) (2012). http://www.darpa.mil/program/automated-program-analysis-for-cybersecurity

  13. Desnos, A.: AndroGuard, May 2013. http://androguard.blogspot.com/

  14. Duan, Y., et al.: Things you may not know about android (un) packers: a systematic study based on whole-system emulation. In: 25th Annual Network and Distributed System Security Symposium, NDSS, San Diego, CA, pp. 18–21 (2018)

  15. Feng, Y., Anand, S., Dillig, I., Aiken, A.: Apposcopy: semantics-based detection of android malware through static analysis. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014, pp. 576–587. ACM, New York (2014)

  16. Guan, Q., Huang, H., Luo, W., Zhu, S.: Semantics-based repackaging detection for mobile apps. In: Caballero, J., Bodden, E., Athanasopoulos, E. (eds.) ESSoS 2016. LNCS, vol. 9639, pp. 89–105. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30806-7_6

  17. IDC Research: Smartphone OS Market Share, 2015 Q2. IDC Research Report (2015)

  18. Jones, N.D.: Computability and Complexity: From a Programming Perspective. MIT Press, Cambridge (1997)

  19. Sullivan, J.: Pokémon Go bundles with malicious remote administration tool DroidJack (2016). http://blog.trustlook.com/2016/09/02/pokemon-go-bundles-with-malicious-remote-administration-tool-droidjack/

  20. Cannell, J.: Obfuscation: malware’s best friend (2013). https://blog.malwarebytes.com/threat-analysis/2013/03/obfuscation-malwares-best-friend/

  21. Junaid, M., Liu, D., Kung, D.C.: Dexteroid: detecting malicious behaviors in android apps using reverse-engineered life cycle models. CoRR arXiv:1506.05217 (2015)

  22. Komondoor, R., Horwitz, S.: Semantics-preserving procedure extraction. In: Proceedings of the ACM Symposium on Principles of Programming Languages, pp. 155–169 (2000)

  23. Konstantinou, E.: Metamorphic virus: analysis and detection. Technical Report RHUL-MA-2008-02, Royal Holloway, University of London, January 2008

  24. Kruegel, C., Robertson, W., Vigna, G.: Detecting kernel-level rootkits through binary analysis. In: Proceedings of the 20th Annual Computer Security Applications Conference, ACSAC’04, Tucson, AZ, USA, pp. 91–100 (2004)

  25. Li, Z., Sun, J., Yan, Q., Srisa-an, W., Tsutano, Y.: Obfusifier: obfuscation-resistant android malware detection system. In: Chen, S., Choo, K.-K.R., Fu, X., Lou, W., Mohaisen, A. (eds.) SecureComm 2019. LNICST, vol. 304, pp. 214–234. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37228-6_11

  26. Linux Foundation: JavaAPI Compliance Checker (2015). http://ispras.linuxbase.org/index.php/Java_API_Compliance_Checker

  27. Mr.Trojans. ALAN - Android Malware Evaluating Tools Released (2015). http://seclist.us/alan-android-malware-evaluating-tools-released.html

  28. Myles, G., Collberg, C.: K-gram based software birthmarks. In: Proceedings of the 2005 ACM Symposium on Applied Computing, SAC’05, Santa Fe, New Mexico, pp. 314–318 (2005)

  29. National Cyber Security Center (UK): Code Obfuscation (2014). https://www.ncsc.gov.uk/content/files/protected_files/guidance_files/Code-obfuscation.pdf

  30. Partush, N., Yahav, E.: Abstract semantic differencing via speculative correlation. In: SIGPLAN Not., vol. 49, no. 10, pp. 811–828 (2014)

  31. Person, S., Dwyer, M.B., Elbaum, S., Păsăreanu, C.S.: Differential symbolic execution. In: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering, SIGSOFT’08/FSE-16, pp. 226–237. ACM, New York (2008)

  32. Pomilia, M.: A study on obfuscation techniques for android malware. Technical Report, Sapienza University of Rome, March 2016. http://midlab.diag.uniroma1.it/articoli/matteo_pomilia_master_thesis.pdf

  33. Preda, M.D., Maggi, F.: Testing android malware detectors against code obfuscation: a systematization of knowledge and unified methodology. J. Comput. Virol. Hacking Tech. 13(3), 209–232 (2016). https://doi.org/10.1007/s11416-016-0282-2

  34. Proofpoint Staff: DroidJack Uses Side-Load...It’s Super Effective! Backdoored Pokemon GO Android App Found (2016). https://www.proofpoint.com/us/threat-insight/post/droidjack-uses-side-load-backdoored-pokemon-go-android-app

  35. Rad, B.B., Masrom, M.: Metamorphic virus variants classification using opcode frequency histogram. In: Proceedings of the International Conference on Computers, pp. 147–155 (2010)

  36. Ramos, D.A., Engler, D.R.: Practical, low-effort equivalence verification of real code. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 669–685. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22110-1_55

  37. Rastogi, V., Chen, Y., Jiang, X.: DroidChameleon: evaluating android anti-malware against transformation attacks. In: Proceedings of the ACM Symposium on Information, Computer and Communications Security, pp. 329–334 (2013)

  38. Siek, J.G., Lee, L.-Q., Lumsdaine, A.: The Boost Graph Library: User Guide and Reference Manual. Addison-Wesley Longman Publishing Co., Inc., Boston (2002)

  39. Tsutano, Y., Bachala, S., Srisa-an, W., Rothermel, G., Dinh, J.: An efficient, robust, and scalable approach for analyzing interacting android apps. In: Proceedings of the International Conference on Software Engineering, Buenos Aires, Argentina, May 2017

  40. Tsutano, Y., Bachala, S., Srisa-an, W., Rothermel, G., Dinh, J.: JITANA: a modern hybrid program analysis framework for android platforms. J. Comput. Lang. 52, 55–71 (2019)

  41. Wong, W., Stamp, M.: Hunting for metamorphic engines. J. Comput. Virol. 2(3), 211–229 (2006)

  42. Zhauniarovich, Y., Gadyatskaya, O., Crispo, B., La Spina, F., Moser, E.: FSquaDRA: fast detection of repackaged applications. In: Atluri, V., Pernul, G. (eds.) DBSec 2014. LNCS, vol. 8566, pp. 130–145. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-43936-4_9

Author information

Corresponding author

Correspondence to Witawas Srisa-an.

Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Hu, Z., Silva, B.V.R.E., Bagheri, H., Srisa-an, W., Rothermel, G., Dinh, J. (2022). SEMEO: A Semantic Equivalence Analysis Framework for Obfuscated Android Applications. In: Hara, T., Yamaguchi, H. (eds) Mobile and Ubiquitous Systems: Computing, Networking and Services. MobiQuitous 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 419. Springer, Cham. https://doi.org/10.1007/978-3-030-94822-1_18

  • DOI: https://doi.org/10.1007/978-3-030-94822-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94821-4

  • Online ISBN: 978-3-030-94822-1

  • eBook Packages: Computer Science (R0)
