The Usability of Metadata for Android Application Analysis

  • Conference paper

Neural Information Processing (ICONIP 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9947)

Abstract

The number of security incidents faced by Android users is growing along with the surge in malware targeting Android terminals. Such malware arrives at the terminals in the form of Android Packages (APKs). Assorted techniques for protecting Android users from such malware have been reported, but most of them focus on the APK files themselves. Unlike these approaches, we use metadata, such as web information obtained from online APK markets, to improve the accuracy of malware identification. In this paper, we introduce malware detection schemes that use metadata, including the categories and descriptions of APKs. We introduce two types of schemes: a statistical scheme and a support vector machine-based scheme. Finally, we analyze and discuss the performance and usability of the schemes and confirm the usability of web information for identifying malware.

Notes

  1. We used the language-detection library [2] to detect the language, stemmify [7] for the stemming operation, and stoplist/en.txt of MALLET [5] as the list of stop words.

  2. We used MALLET to run LDA and considered 300 topics because the MALLET documentation states that “The number of topics should depend to some degree on the size of the collection, but 200 to 400 will produce reasonably fine-grained results.”

  3. We used the k-means [4] function provided by the “kmeans” Ruby gem [8].
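
For readers unfamiliar with the clustering step mentioned in the last note, the following is a minimal sketch of the common batch (Lloyd-style) k-means iteration, written in Python rather than Ruby. It is not the code of the kmeans gem, whose internals may differ, and the input vectors are made-up stand-ins for whatever per-app features were clustered. The function name `kmeans`, its parameters, and the choice to seed the centroids with the first k points are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Batch k-means sketch; returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: seed with the first k points so that runs are deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors forming two well-separated groups.
    sample = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(sample, 2))
```

Seeding with the first k points keeps the sketch reproducible; practical implementations usually pick the initial centroids at random or with a smarter seeding rule, so their results can vary between runs.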

References

  1. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

  2. Cybozu Labs: Language Detection Library for Java, December 2014. https://code.google.com/p/language-detection/

  3. Gorla, A., Tavecchia, I., Gross, F., Zeller, A.: Checking app behavior against app descriptions. In: ICSE 2014, Proceedings of the 36th International Conference on Software Engineering (2014)

  4. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1: Statistics, pp. 281–297 (1967)

  5. McCallum, A.K.: MALLET: a machine learning for language toolkit, December 2014. http://mallet.cs.umass.edu

  6. OPERA SOFTWARE ASA: Opera Mobile Store, January 2015. http://apps.opera.com/

  7. Ray Pereda: stemmify, December 2014. https://rubygems.org/gems/stemmify

  8. RubyGems.org: kmeans, December 2014. https://rubygems.org/gems/kmeans/

  9. Sarma, B.P., Li, N., Gates, C., Potharaju, R., Nita-Rotaru, C., Molloy, I.: Android permissions: a perspective combining risks and benefits. In: Proceedings of the 17th ACM Symposium on Access Control Models and Technologies, SACMAT 2012, pp. 13–22. ACM, New York (2012). http://doi.acm.org/10.1145/2295136.2295141

  10. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)

  11. Takahashi, T., Ban, T., Mimura, T., Nakao, K.: Fine-grained risk level quantification schemes based on APK metadata. In: Arik, S., Huang, T., Lai, W.K., Liu, Q. (eds.) ICONIP 2015. LNCS, vol. 9491, pp. 663–673. Springer, Heidelberg (2015). doi:10.1007/978-3-319-26555-1_75

  12. Takahashi, T., Nakao, K., Kanaoka, A.: Data model for android package information and its application to risk analysis system. In: First ACM Workshop on Information Sharing and Collaborative Security. ACM, November 2014

  13. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)

  14. VirusTotal: virustotal for android, January 2015. http://www.virustotal.com/ja

  15. Wang, Y., Zheng, J., Sun, C., Mukkamala, S.: Quantitative security risk assessment of android permissions and applications. In: Wang, L., Shafiq, B. (eds.) DBSec 2013. LNCS, vol. 7964, pp. 226–241. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39256-6_15

Author information

Corresponding author

Correspondence to Takeshi Takahashi.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Takahashi, T., Ban, T., Tien, CW., Lin, CH., Inoue, D., Nakao, K. (2016). The Usability of Metadata for Android Application Analysis. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9947. Springer, Cham. https://doi.org/10.1007/978-3-319-46687-3_60

  • DOI: https://doi.org/10.1007/978-3-319-46687-3_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46686-6

  • Online ISBN: 978-3-319-46687-3

  • eBook Packages: Computer Science, Computer Science (R0)
