Journal of Computer Science and Technology, Volume 30, Issue 5, pp 942–956

Detecting Android Malware Using Clone Detection

  • Jian Chen
  • Manar H. Alalfi
  • Thomas R. Dean
  • Ying Zou
Regular Paper


Android is currently one of the most popular smartphone operating systems. However, Android also has the largest share of global mobile malware, and its security issues have drawn significant public attention. In this paper, we investigate the use of a clone detector to identify known Android malware. We collect a set of Android applications known to contain malware and a set of benign applications. We extract the Java source code from the binary code of the applications and use NiCad, a near-miss clone detector, to find the classes of clones in a small subset of the malicious applications. We then use these clone classes as signatures to find similar source files in the rest of the malicious applications. The benign collection is used as a control group. In our evaluation, we successfully decompile more than 1 000 malicious apps in 19 malware families. Our results show that using a small portion of the malicious applications as a training set detects 95% of previously known malware with very few false positives and a high accuracy of 96.88%. Our method can effectively and reliably pinpoint malicious applications that belong to certain malware families.
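The idea of using clone classes as signatures can be illustrated with a minimal sketch. This is not NiCad and not the authors' implementation: it assumes a simplified line-level similarity from Python's standard `difflib`, a hypothetical threshold of 0.7, and made-up code fragments. The names `normalize`, `similarity`, `matches_signature`, and `accuracy` are illustrative only.

```python
from difflib import SequenceMatcher


def normalize(source: str) -> list[str]:
    """Strip indentation and drop blank lines so layout differences are ignored."""
    return [line.strip() for line in source.splitlines() if line.strip()]


def similarity(a: str, b: str) -> float:
    """Line-level similarity in [0, 1] between two source fragments."""
    matcher = SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()


def matches_signature(candidate: str, signatures: list[str],
                      threshold: float = 0.7) -> bool:
    """True if the candidate is a near-miss match of any signature fragment.

    The threshold is an assumed value chosen for illustration.
    """
    return any(similarity(candidate, sig) >= threshold for sig in signatures)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Standard accuracy: correct decisions over all decisions."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical fragments, invented for this sketch.
SIGNATURE = """
public void onReceive(Context c, Intent i) {
    String number = "000111";
    sendSms(number, "SUBSCRIBE");
    abortBroadcast();
}
"""

VARIANT = """
public void onReceive(Context c, Intent i) {
    String number = "000222";
    sendSms(number, "SUBSCRIBE");
    abortBroadcast();
}
"""

BENIGN = """
public int add(int a, int b) {
    return a + b;
}
"""

if __name__ == "__main__":
    print(matches_signature(VARIANT, [SIGNATURE]))
    print(matches_signature(BENIGN, [SIGNATURE]))
```

In this sketch the variant differs from the signature in one line out of five and is therefore flagged, while the benign fragment shares only a closing brace and is not. A real near-miss clone detector would also normalize identifiers and literals before comparison, which this sketch omits.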


Keywords: Android, malware, clone detection





Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Jian Chen (1)
  • Manar H. Alalfi (2)
  • Thomas R. Dean (1)
  • Ying Zou (1)

  1. Department of Electrical and Computer Engineering, Queen’s University, Kingston, Canada
  2. School of Computing, Queen’s University, Kingston, Canada
