On Identifying and Explaining Similarities in Android Apps

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

App updates and repackaging are recurrent in the Android ecosystem, filling markets with similar apps that must be identified. Although several approaches improve the scalability of detecting repackaged or cloned apps, researchers and practitioners eventually need a comprehensive pairwise comparison (or a simultaneous comparison of multiple apps) to understand and validate the similarities among apps. In this work, we present the design and implementation of SimiDroid, our research prototype tool for multi-level similarity comparison of Android apps. SimiDroid aims to support the comprehension of similarities and changes among app versions and among repackaged apps. In particular, we demonstrate the need for and usefulness of such a framework through case studies that implement different dissection scenarios and reveal various insights into how repackaged apps are built. We further show that the similarity comparison plugins implemented in SimiDroid yield more accurate results than the state of the art.

Author information

Corresponding author

Correspondence to Li Li.

Electronic supplementary material

ESM 1

(PDF 708 kb)

About this article

Cite this article

Li, L., Bissyandé, T.F., Wang, HY. et al. On Identifying and Explaining Similarities in Android Apps. J. Comput. Sci. Technol. 34, 437–455 (2019). https://doi.org/10.1007/s11390-019-1918-8

  • DOI: https://doi.org/10.1007/s11390-019-1918-8
