Plagiarizing Smartphone Applications: Attack Strategies and Defense Techniques

  • Rahul Potharaju
  • Andrew Newell
  • Cristina Nita-Rotaru
  • Xiangyu Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7159)


In this paper, we show how an attacker can deliver malware to a large number of smartphone users by plagiarizing Android applications and using elements of social engineering to increase the infection rate. Our analysis of meta-information for 158,000 smartphone applications indicates that 29.4% of the applications are more likely to be plagiarized. We propose three detection schemes that rely on syntactic fingerprinting to detect plagiarized applications under different levels of obfuscation used by the attacker. Our analysis of 7,600 smartphone application binaries shows that our schemes detect all instances of plagiarism from a set of real-world malware incidents with 0.5% false positives and scale to millions of applications using only commodity servers.
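The general idea of syntactic fingerprinting can be illustrated with a minimal sketch. It is not one of the three schemes proposed in the paper: it assumes Python source as a stand-in for Android application code, uses counts of abstract syntax tree node types as the feature vector, and compares vectors by cosine similarity. The function names `fingerprint` and `cosine_similarity` and the sample programs are hypothetical, chosen only for illustration.

```python
import ast
from collections import Counter
from math import sqrt


def fingerprint(source: str) -> Counter:
    """Build a feature vector of AST node-type counts.

    Identifier names and literal values are ignored, so simple
    renaming obfuscation does not change the vector.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample programs (assumptions for this sketch).
ORIGINAL = """
def total(prices):
    result = 0
    for p in prices:
        result += p
    return result
"""

# Same structure as ORIGINAL with every identifier renamed.
RENAMED = """
def a(b):
    c = 0
    for d in b:
        c += d
    return c
"""

# Structurally different code.
UNRELATED = """
class Greeter:
    def greet(self, name):
        print("hello " + name)
"""

if __name__ == "__main__":
    base = fingerprint(ORIGINAL)
    print("renamed:  ", cosine_similarity(base, fingerprint(RENAMED)))
    print("unrelated:", cosine_similarity(base, fingerprint(UNRELATED)))
```

Because the vector depends only on syntactic structure, the renamed copy produces the same fingerprint as the original, while the unrelated program scores lower. Stronger obfuscation, such as inserting or reordering code, would change the counts, which is presumably why a detection system needs more than one scheme.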


Keywords: Feature Vector, Abstract Syntax, Malicious Code, Android Application, Clone Detection
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Rahul Potharaju (1)
  • Andrew Newell (1)
  • Cristina Nita-Rotaru (1)
  • Xiangyu Zhang (1)
  1. Department of Computer Science, Purdue University, USA
