A Visualization-Based Analysis on Classifying Android Malware

  • Conference paper
Machine Learning for Cyber Security (ML4CS 2019)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11806)

Abstract

Since the introduction of the Android mobile platform, mobile malware has evolved in both attack sophistication and its ability to evade detection. Given the right combination of elements, malicious applications can be detected among those that pose no threat, as the threats across these malware types reveal distinguishable attack characteristics. This paper investigates both benign and attack characteristics. By plotting complex features into dendrograms, we propose a novel approach to visually distinguish Android apps. We visualize the complicated relationships among these features and evaluate the effect of different text mining methods. Specifically, we employ machine learning techniques, including feature reduction using Principal Component Analysis and the Random Forest classifier, to compare eight different models. Using the Drebin dataset, we achieved an average accuracy of 95.83%.
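
To make the dendrogram idea concrete, the following is a minimal sketch of agglomerative clustering over app feature sets. It is not the authors' implementation: the app names, the feature strings, the Jaccard distance, and the average-linkage rule are all assumptions chosen for illustration, and the data are invented rather than drawn from the Drebin dataset. The list of merges it produces is the information a dendrogram plot would draw.

```python
from itertools import combinations

# Hypothetical apps and string features, invented for illustration only
# (not taken from the Drebin dataset).
APPS = {
    "benign_notes": {"permission.INTERNET", "api.openFile", "intent.MAIN"},
    "benign_weather": {"permission.INTERNET",
                       "permission.ACCESS_FINE_LOCATION", "intent.MAIN"},
    "mal_sms_a": {"permission.SEND_SMS", "permission.READ_CONTACTS",
                  "api.sendTextMessage"},
    "mal_sms_b": {"permission.SEND_SMS", "permission.READ_CONTACTS",
                  "api.sendTextMessage", "permission.INTERNET"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two feature sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2, apps):
    """Mean pairwise Jaccard distance between the members of two clusters."""
    dists = [jaccard_distance(apps[x], apps[y]) for x in c1 for y in c2]
    return sum(dists) / len(dists)


def build_dendrogram(apps):
    """Agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [(name,) for name in sorted(apps)]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        (i, j), height = min(
            (((i, j), average_linkage(clusters[i], clusters[j], apps))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        merges.append((clusters[i], clusters[j], round(height, 3)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


merges = build_dendrogram(APPS)
for left, right, height in merges:
    print(f"{height:.3f}  {left} + {right}")
```

In this toy data the two SMS-sending samples merge first and the two benign apps merge next, so the final, highest merge separates the two groups. Real feature sets are far larger and noisier, which is one reason a feature-reduction step such as the Principal Component Analysis named in the abstract may be needed before classification.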



Author information

Corresponding author

Correspondence to Lei Pan.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Coulter, R., Pan, L., Zhang, J., Xiang, Y. (2019). A Visualization-Based Analysis on Classifying Android Malware. In: Chen, X., Huang, X., Zhang, J. (eds) Machine Learning for Cyber Security. ML4CS 2019. Lecture Notes in Computer Science, vol 11806. Springer, Cham. https://doi.org/10.1007/978-3-030-30619-9_22

  • DOI: https://doi.org/10.1007/978-3-030-30619-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30618-2

  • Online ISBN: 978-3-030-30619-9

  • eBook Packages: Computer Science, Computer Science (R0)
