Abstract
Since the introduction of the Android mobile platform, mobile malware has evolved in both attack sophistication and its ability to evade detection. Malicious applications can hide among applications that pose no threat, yet the threats across malware types reveal distinguishable attack characteristics. This paper investigates the characteristics of both benign and malicious applications. By plotting complex features into dendrograms, we propose a novel approach to visually distinguish Android apps. We visualize these complicated relationships and evaluate the effect of different text mining methods. Specifically, we employ machine learning techniques, including feature reduction using Principal Component Analysis and the Random Forest classifier, to compare eight different models. Using the Drebin dataset, we achieved an average accuracy of 95.83%.
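As a rough illustration of how apps can be arranged into a dendrogram, the following minimal sketch clusters a handful of apps by the overlap of their feature sets. It assumes Jaccard distance and average linkage, which are illustrative choices and not necessarily the method used in this paper. The app names and permission sets are hypothetical toy data, not samples from the Drebin dataset.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_distance(c1, c2, apps):
    """Average pairwise Jaccard distance between two clusters (average linkage)."""
    total = sum(jaccard_distance(apps[x], apps[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def build_dendrogram(apps):
    """Agglomerative clustering; returns merges as (left, right, distance) in merge order."""
    clusters = [(name,) for name in sorted(apps)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], apps),
        )
        dist = cluster_distance(left, right, apps)
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
        merges.append((left, right, dist))
    return merges


# Hypothetical toy data: each app is represented by a set of requested permissions.
apps = {
    "benign_a": {"INTERNET", "ACCESS_NETWORK_STATE"},
    "benign_b": {"INTERNET", "ACCESS_NETWORK_STATE", "VIBRATE"},
    "malware_a": {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    "malware_b": {"SEND_SMS", "READ_CONTACTS", "RECEIVE_SMS"},
}

merges = build_dendrogram(apps)
for left, right, dist in merges:
    print(f"{left} + {right} at distance {dist:.3f}")
```

In this toy example the two benign apps merge first and the two malware apps merge second, so the top-level split of the tree separates the two groups. Real feature sets are far larger and noisier, so such a clean separation should not be expected in general.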
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Coulter, R., Pan, L., Zhang, J., Xiang, Y. (2019). A Visualization-Based Analysis on Classifying Android Malware. In: Chen, X., Huang, X., Zhang, J. (eds) Machine Learning for Cyber Security. ML4CS 2019. Lecture Notes in Computer Science, vol 11806. Springer, Cham. https://doi.org/10.1007/978-3-030-30619-9_22
DOI: https://doi.org/10.1007/978-3-030-30619-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30618-2
Online ISBN: 978-3-030-30619-9
eBook Packages: Computer Science (R0)