String-based Malware Detection for Android Environments

  • Alejandro Martín
  • Héctor D. Menéndez
  • David Camacho
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 678)


Android platforms are considered the least secure smartphone devices. The increasing number of malicious apps published on Android markets poses a serious threat to users' sensitive data, compromising more devices every day. The commercial solutions that aim to fight this malware are based on signature methodologies whose detection ratio is low. Furthermore, these engines can be easily defeated by obfuscation techniques, which are extremely common in app plagiarism. This work aims to improve malware detection using only the binary information and the permissions that are normally used by anti-virus engines, in order to provide a scalable solution based on machine learning. To evaluate the performance of this approach, we carry out experiments using 5000 malware and 5000 benign-ware samples, and compare the results with 56 anti-virus engines from VirusTotal.


Malware Classification Android 





Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Alejandro Martín (1)
  • Héctor D. Menéndez (2), email author
  • David Camacho (1)
  1. Universidad Autónoma de Madrid, Madrid, Spain
  2. University College London, London, UK
