The Usability of Metadata for Android Application Analysis

Takahashi, Takeshi; Ban, Tao; Tien, Chin-Wei; Lin, Chih-Hung; Inoue, Daisuke; Nakao, Koji

doi:10.1007/978-3-319-46687-3_60

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9947)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

The number of security incidents faced by Android users is growing, along with the surge in malware targeting Android terminals. Such malware arrives at Android terminals in the form of Android Packages (APKs). Assorted techniques for protecting Android users from such malware have been reported, but most of them focus on the APK files themselves. Unlike these approaches, we use metadata, such as web information obtained from online APK markets, to improve the accuracy of malware identification. In this paper, we introduce malware detection schemes that use metadata, including the categories and descriptions of APKs. We introduce two types of schemes: a statistical scheme and a support vector machine-based scheme. Finally, we analyze and discuss the performance and usability of the schemes, and confirm the usability of web information for identifying malware.

Notes

1. We used the language-detection library [2] to detect the language, stemmify [7] for the stemming operation, and stoplist/en.txt of MALLET [5] as the list of stop words.
2. We used MALLET for running LDA and considered 300 topics because the MALLET documentation states that “The number of topics should depend to some degree on the size of the collection, but 200 to 400 will produce reasonably fine-grained results.”
3. We used the “kmeans” [4] function of the kmeans Ruby gem [8].
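
The notes above name the building blocks of the description-processing pipeline: stop-word removal, stemming, topic modelling, and k-means clustering. The following is a minimal Python sketch of two of those steps, under stated assumptions. It is not the authors' implementation, which relied on stemmify, MALLET, and the kmeans Ruby gem. The stop list, the `naive_stem` suffix stripper, and the function names are all hypothetical stand-ins. Language detection and LDA are omitted, and the vectors passed to `kmeans` are assumed to be per-description feature vectors such as topic proportions.

```python
import re

# Tiny illustrative stop list (assumption); the paper used MALLET's stoplist/en.txt.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "is",
              "this", "your", "with"}


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer, not Porter."""
    for suffix in ("ing", "es", "s"):
        # Strip only if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(description):
    """Lowercase, tokenise, drop stop words, and stem an APK description."""
    tokens = re.findall(r"[a-z]+", description.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd-style k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


if __name__ == "__main__":
    print(preprocess("Tracking the locations of your friends"))
    # Hypothetical two-dimensional feature vectors for four descriptions.
    vectors = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
    print(kmeans(vectors, k=2)[0])
```

Seeding the centroids with the first k points keeps the sketch deterministic. A production clustering step would more likely use random or k-means++ initialisation with several restarts.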

References

1. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
2. Cybozu Labs: Language Detection Library for Java, December 2014. https://code.google.com/p/language-detection/
3. Gorla, A., Tavecchia, I., Gross, F., Zeller, A.: Checking app behavior against app descriptions. In: ICSE 2014, Proceedings of the 36th International Conference on Software Engineering (2014)
4. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1: Statistics, pp. 281–297 (1967)
5. McCallum, A.K.: MALLET: a machine learning for language toolkit, December 2014. http://mallet.cs.umass.edu
6. OPERA SOFTWARE ASA: Opera Mobile Store, January 2015. http://apps.opera.com/
7. Ray Pereda: stemmify, December 2014. https://rubygems.org/gems/stemmify
8. RubyGems.org: kmeans, December 2014. https://rubygems.org/gems/kmeans/
9. Sarma, B.P., Li, N., Gates, C., Potharaju, R., Nita-Rotaru, C., Molloy, I.: Android permissions: a perspective combining risks and benefits. In: Proceedings of the 17th ACM Symposium on Access Control Models and Technologies, SACMAT 2012, pp. 13–22. ACM, New York (2012). http://doi.acm.org/10.1145/2295136.2295141
10. Schölkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)
11. Takahashi, T., Ban, T., Mimura, T., Nakao, K.: Fine-grained risk level quantification schemes based on APK metadata. In: Arik, S., Huang, T., Lai, W.K., Liu, Q. (eds.) ICONIP 2015. LNCS, vol. 9491, pp. 663–673. Springer, Heidelberg (2015). doi:10.1007/978-3-319-26555-1_75
12. Takahashi, T., Nakao, K., Kanaoka, A.: Data model for android package information and its application to risk analysis system. In: First ACM Workshop on Information Sharing and Collaborative Security. ACM, November 2014
13. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
14. VirusTotal: virustotal for android, January 2015. http://www.virustotal.com/ja
15. Wang, Y., Zheng, J., Sun, C., Mukkamala, S.: Quantitative security risk assessment of android permissions and applications. In: Wang, L., Shafiq, B. (eds.) DBSec 2013. LNCS, vol. 7964, pp. 226–241. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39256-6_15

Author information

Authors and Affiliations

National Institute of Information and Communications Technology, Tokyo, Japan
Takeshi Takahashi, Tao Ban, Daisuke Inoue & Koji Nakao
Institute for Information Industry, Taipei, Taiwan
Chin-Wei Tien & Chih-Hung Lin

Corresponding author

Correspondence to Takeshi Takahashi.

Editor information

Editors and Affiliations

The University of Tokyo, Tokyo, Japan
Akira Hirose
Kobe University, Kobe, Japan
Seiichi Ozawa
Okinawa Institute of Science and Technology Graduate University, Onna, Japan
Kenji Doya
Nara Institute of Science and Technology, Ikoma, Japan
Kazushi Ikeda
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Chinese Academy of Sciences, Beijing, China
Derong Liu

About this paper

Cite this paper

Takahashi, T., Ban, T., Tien, CW., Lin, CH., Inoue, D., Nakao, K. (2016). The Usability of Metadata for Android Application Analysis. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9947. Springer, Cham. https://doi.org/10.1007/978-3-319-46687-3_60

DOI: https://doi.org/10.1007/978-3-319-46687-3_60
Published: 29 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46686-6
Online ISBN: 978-3-319-46687-3
eBook Packages: Computer Science, Computer Science (R0)
