Empirical study of android repackaged applications

Empirical Software Engineering

Abstract

The growing popularity of Android applications has raised concerns about piracy and the spread of malware, particularly adware: malware that seeks to present unwanted advertisements to the user. A popular way to distribute malware in the mobile world is to repackage legitimate apps. This process consists of downloading, unpacking, manipulating, and recompiling an application, and then publishing it again in an app store. In this paper, we conduct an empirical study of over 15,000 apps to gain insights into the factors that drive the spread of repackaged apps. We also examine the motivations of developers who publish repackaged apps and those of users who download them, as well as the factors that determine which apps are chosen for repackaging and the ways in which apps are modified during the repackaging process. Having observed that adware is particularly prevalent in repackaged apps, we focus on this type of malware and examine how an app is modified when adware is injected into its code. Our findings shed much-needed light on this class of malware. They can be useful to security experts and allow us to make recommendations that could lead to more effective malware detection tools. Furthermore, on the basis of our results, we propose a novel app indexing scheme that minimizes the number of comparisons needed to detect repackaged apps.

Author information

Corresponding author

Correspondence to Kobra Khanmohammadi.

Additional information

Communicated by: David Lo, Meiyappan Nagappan, Fabio Palomba and Sebastiano Panichella

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khanmohammadi, K., Ebrahimi, N., Hamou-Lhadj, A. et al. Empirical study of android repackaged applications. Empir Software Eng 24, 3587–3629 (2019). https://doi.org/10.1007/s10664-019-09760-3

  • DOI: https://doi.org/10.1007/s10664-019-09760-3
