The Usability of Metadata for Android Application Analysis

  • Takeshi Takahashi
  • Tao Ban
  • Chin-Wei Tien
  • Chih-Hung Lin
  • Daisuke Inoue
  • Koji Nakao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9947)


The number of security incidents faced by Android users is growing along with the surge in malware targeting Android devices. Such malware reaches these devices in the form of Android Packages (APKs). Assorted techniques for protecting Android users from such malware have been reported, but most of them focus on the APK files themselves. In contrast to these approaches, we use metadata, such as web information obtained from online APK markets, to improve the accuracy of malware identification. In this paper, we introduce malware detection schemes that use metadata, including the categories and descriptions of APKs. We introduce two types of schemes: a statistical scheme and a support vector machine-based scheme. Finally, we analyze and discuss the performance and usability of the schemes, and confirm the usability of web information for identifying malware.
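The abstract does not spell out how the statistical scheme works, so the following is only a minimal sketch of one way market category metadata could feed a risk score: a smoothed malware fraction per category. The records, the `category_risk` function, and the `smoothing` parameter are all hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter

# Toy, made-up records: (market category, is_malware label).
# These values are assumptions for illustration only, not data from the paper.
SAMPLES = [
    ("games", True), ("games", False), ("games", False), ("games", False),
    ("tools", True), ("tools", True), ("tools", False),
    ("books", False), ("books", False),
]


def category_risk(samples, smoothing=1.0):
    """Return a Laplace-smoothed malware fraction for each category."""
    total = Counter(cat for cat, _ in samples)
    bad = Counter(cat for cat, is_malware in samples if is_malware)
    return {
        cat: (bad[cat] + smoothing) / (total[cat] + 2 * smoothing)
        for cat in total
    }


risk = category_risk(SAMPLES)
# Print categories from highest to lowest estimated risk.
for cat, score in sorted(risk.items(), key=lambda kv: -kv[1]):
    print(f"{cat}: {score:.2f}")
```

Smoothing keeps categories with very few samples from receiving an extreme score of 0 or 1. A real system would presumably combine such a score with other signals, such as the description text.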


Keywords: Android, Malware, APK, Risk analysis, Machine learning



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Takeshi Takahashi (1)
  • Tao Ban (1)
  • Chin-Wei Tien (2)
  • Chih-Hung Lin (2)
  • Daisuke Inoue (1)
  • Koji Nakao (1)
  1. National Institute of Information and Communications Technology, Tokyo, Japan
  2. Institute for Information Industry, Taipei, Taiwan
