Automated Software Engineering, Volume 23, Issue 4, pp 591–618

Detecting plagiarized mobile apps using API birthmarks

  • Daeyoung Kim
  • Amruta Gokhale
  • Vinod Ganapathy
  • Abhinav Srivastava


This paper addresses the problem of detecting plagiarized mobile apps. Plagiarism is the practice of building mobile apps by reusing code from other apps without the consent of the corresponding app developers. Recent studies on third-party app markets have suggested that plagiarized apps are an important vehicle for malware delivery on mobile phones. Malware authors repackage official versions of apps with malicious functionality, and distribute them for free via these third-party app markets. An effective technique to detect app plagiarism can therefore help identify malicious apps. Code plagiarism has long been a problem and a number of code similarity detectors have been developed over the years to detect plagiarism. In this paper we show that obfuscation techniques can be used to easily defeat similarity detectors that rely solely on statically scanning the code of an app. We propose a dynamic technique to detect plagiarized apps that works by observing the interaction of an app with the underlying mobile platform via its API invocations. We propose API birthmarks to characterize unique app behaviors, and develop a robust plagiarism detection tool using API birthmarks.
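The idea of comparing apps by their observed API invocations can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify how an API birthmark is built or compared. The sketch assumes a birthmark is the set of k-grams over a recorded API call trace and that two birthmarks are compared with Jaccard similarity. The function names, the traces, and the choice of k are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set, Tuple

Birthmark = Set[Tuple[str, ...]]


def api_kgrams(trace: Sequence[str], k: int = 2) -> Birthmark:
    """Return the set of length-k windows over an API call trace.

    Assumption: a birthmark is modeled as a k-gram set; the paper may
    define it differently.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def jaccard(a: Birthmark, b: Birthmark) -> float:
    """Jaccard similarity of two birthmarks; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity(trace_a: Sequence[str], trace_b: Sequence[str], k: int = 2) -> float:
    """Compare two API call traces through their k-gram birthmarks."""
    return jaccard(api_kgrams(trace_a, k), api_kgrams(trace_b, k))


# Hypothetical traces: labels stand in for API calls logged while running each app.
original = [
    "Activity.onCreate",
    "LocationManager.getLastKnownLocation",
    "HttpURLConnection.connect",
    "Activity.onPause",
]
# Same behavior with one extra call injected, as a repackaged app might show.
repackaged = [
    "Activity.onCreate",
    "LocationManager.getLastKnownLocation",
    "HttpURLConnection.connect",
    "SmsManager.sendTextMessage",
    "Activity.onPause",
]
# An app with no call sequences in common with the original.
unrelated = [
    "Activity.onCreate",
    "Camera.open",
    "MediaRecorder.start",
    "Activity.onPause",
]

if __name__ == "__main__":
    print("original vs repackaged:", similarity(original, repackaged))
    print("original vs unrelated: ", similarity(original, unrelated))
```

Because the traces are recorded at run time, renaming classes or reordering code in the app package would not change them, which is the intuition behind preferring a dynamic birthmark over a purely static scan. A real detector would also have to drive both apps with comparable inputs so that the traces are comparable.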


Keywords: Mobile apps, Plagiarism, API birthmarks



This work is supported in part by NSF Grants 0952128, 1117711, 1420815, 1441724 and 1408803.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Daeyoung Kim (1)
  • Amruta Gokhale (1)
  • Vinod Ganapathy (1)
  • Abhinav Srivastava (2)

  1. Rutgers University, Piscataway, USA
  2. AT&T Labs–Research, Bedminster, USA
