A Visualization-Based Analysis on Classifying Android Malware

Coulter, Rory; Pan, Lei; Zhang, Jun; Xiang, Yang

doi:10.1007/978-3-030-30619-9_22

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11806)

Included in the following conference series:

International Conference on Machine Learning for Cyber Security

Abstract

Since the introduction of the Android mobile platform, mobile malware has evolved in both attack sophistication and its ability to evade detection. Given the right combination of elements, malicious applications can be detected among those that pose no threat, and the threats that exist across malware types reveal distinguishable attack characteristics. This paper investigates these benign and attacking characteristics. We propose a novel approach that plots complex features into dendrograms to visually distinguish Android apps. We visualize these complicated relationships and evaluate the effect of different text mining methods. Specifically, we employ machine learning techniques, including feature reduction using Principal Component Analysis and the Random Forest classifier, to compare eight different models. Using the Drebin dataset, we achieve an average accuracy of 95.83%.
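
As a rough illustration of the dendrogram idea mentioned in the abstract, the following minimal sketch clusters a handful of apps by their feature sets and records the order in which they merge, which is the information a dendrogram draws. It is not the authors' implementation: the abstract does not state the features, distance measure, or linkage used, so the toy apps, the Jaccard distance, and average linkage are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy apps: each is a set of string features.
# The app names and feature sets are invented for illustration only.
APPS = {
    "benign_a": {"permission.INTERNET", "permission.ACCESS_NETWORK_STATE"},
    "benign_b": {"permission.INTERNET", "permission.ACCESS_NETWORK_STATE",
                 "permission.VIBRATE"},
    "malware_a": {"permission.SEND_SMS", "permission.READ_CONTACTS",
                  "permission.INTERNET"},
    "malware_b": {"permission.SEND_SMS", "permission.READ_CONTACTS",
                  "permission.RECEIVE_SMS"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; 0 means identical feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, apps):
    """Mean pairwise distance between the members of two clusters."""
    dists = [jaccard_distance(apps[x], apps[y]) for x in c1 for y in c2]
    return sum(dists) / len(dists)


def build_dendrogram(apps):
    """Agglomerative clustering (assumed: average linkage).

    Returns the merge history as (left, right, height) tuples,
    closest pair first.
    """
    clusters = [(name,) for name in sorted(apps)]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], apps),
        )
        height = average_linkage(c1, c2, apps)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(tuple(sorted(c1 + c2)))
        merges.append((c1, c2, round(height, 3)))
    return merges


if __name__ == "__main__":
    for left, right, height in build_dendrogram(APPS):
        print(f"{height:.3f}: {left} + {right}")
```

With this toy data the two benign apps merge first and the two malicious apps merge second, so the two groups only join at the top of the tree. On real data the separation would depend heavily on the chosen features and distance.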

Author information

Authors and Affiliations

School of Software and Electrical Engineering, Swinburne University of Technology, Hawthorn, VIC, 3122, Australia
Rory Coulter, Jun Zhang & Yang Xiang
School of Information Technology, Deakin University, Geelong, VIC, 3216, Australia
Lei Pan

Corresponding author

Correspondence to Lei Pan.

Editor information

Editors and Affiliations

Xidian University, Xi’an, China
Xiaofeng Chen
Fujian Normal University, Fuzhou, China
Xinyi Huang
Swinburne University of Technology, Hawthorn, VIC, Australia
Jun Zhang

About this paper

Cite this paper

Coulter, R., Pan, L., Zhang, J., Xiang, Y. (2019). A Visualization-Based Analysis on Classifying Android Malware. In: Chen, X., Huang, X., Zhang, J. (eds) Machine Learning for Cyber Security. ML4CS 2019. Lecture Notes in Computer Science, vol 11806. Springer, Cham. https://doi.org/10.1007/978-3-030-30619-9_22

DOI: https://doi.org/10.1007/978-3-030-30619-9_22
Published: 09 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30618-2
Online ISBN: 978-3-030-30619-9
eBook Packages: Computer Science, Computer Science (R0)