Juxtapp: A Scalable System for Detecting Code Reuse among Android Applications

  • Steve Hanna
  • Ling Huang
  • Edward Wu
  • Saung Li
  • Charles Chen
  • Dawn Song
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7591)

Abstract

Mobile application markets such as the Android Marketplace provide a centralized showcase of applications that end users can purchase or download for free onto their mobile phones. Despite the influx of applications to the markets, applications are cursorily reviewed by marketplace maintainers due to the vast number of submissions. User policing and reporting is the primary method to detect misbehaving applications. This reactive approach to application security, especially when programs can contain bugs, malware, or pirated (inauthentic) code, puts too much responsibility on the end users. In light of this, we propose Juxtapp, a scalable infrastructure for code similarity analysis among Android applications. Juxtapp provides a key solution to a number of problems in Android security, including determining if apps contain copies of buggy code, have significant code reuse that indicates piracy, or are instances of known malware. We evaluate our system using more than 58,000 Android applications and demonstrate that our system scales well and is effective. Our results show that Juxtapp is able to detect: 1) 463 applications with confirmed buggy code reuse that can lead to serious vulnerabilities in real-world apps, 2) 34 instances of known malware and variants (13 distinct variants of the GoldDream malware), and 3) pirated variants of a popular paid game.
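The abstract does not spell out how code similarity is computed, but the keywords list basic blocks, pairwise similarity, and Jaccard similarity. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each application has already been disassembled into basic blocks of opcode names, represents an application as the set of k-grams over those opcodes, and scores a pair of applications by the Jaccard similarity of their sets. The function names, the choice k = 3, and the sample opcode sequences are illustrative assumptions.

```python
from typing import Iterable, List, Set, Tuple

KGram = Tuple[str, ...]


def kgrams(opcodes: List[str], k: int = 3) -> Set[KGram]:
    """Return the set of contiguous k-grams in one basic block's opcodes."""
    # A block shorter than k yields an empty range and so an empty set.
    return {tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)}


def app_features(basic_blocks: Iterable[List[str]], k: int = 3) -> Set[KGram]:
    """Union the k-grams of every basic block into one feature set per app."""
    features: Set[KGram] = set()
    for block in basic_blocks:
        features |= kgrams(block, k)
    return features


def jaccard(a: Set[KGram], b: Set[KGram]) -> float:
    """Jaccard similarity |A intersect B| / |A union B|."""
    union = a | b
    if not union:
        # Assumption: two empty feature sets give no evidence of reuse.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical opcode sequences for two apps that share one basic block.
original = [
    ["const/4", "invoke-virtual", "move-result", "if-eqz"],
    ["new-instance", "invoke-direct", "return-void"],
]
repackaged = [
    ["const/4", "invoke-virtual", "move-result", "if-eqz"],
    ["const-string", "invoke-static", "return-void"],
]

similarity = jaccard(app_features(original), app_features(repackaged))
print(f"similarity = {similarity:.2f}")
```

Comparing exact sets in this way costs memory proportional to the number of distinct k-grams per application. A system operating on tens of thousands of applications would likely need a more compact representation, which this sketch does not attempt.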

Keywords

Basic Block, Pairwise Similarity, Jaccard Similarity, Android Application, Code Clone
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Steve Hanna (1)
  • Ling Huang (2)
  • Edward Wu (1)
  • Saung Li (1)
  • Charles Chen (1)
  • Dawn Song (1)
  1. UC Berkeley, USA
  2. Intel Labs, USA
