MINAD: Multi-inputs Neural Network based on Application Structure for Android Malware Detection

Abstract

As demand for smartphones has grown, the number of malicious applications has increased exponentially, by tens of thousands per month. Among smartphone platforms, the highly popular Android operating system has become the one most targeted by malware. Signature-based scanning is easily bypassed by techniques such as polymorphism or payload encryption. With the support of recent tools and sandboxes, Android applications can be easily decoded and their runtime behavior tracked. This makes machine learning methods promising for malware classification. However, defining a suitable model with effective features while avoiding over-fitting remains a challenge for researchers. In this paper, we propose the MINAD (Multi-Inputs Neural network based on application structure for Android malware Detection) method. First, we collect features of an Android application from many aspects and group them into three categories, System-based, Library-based, and User-based, corresponding to the parts of the Android application structure that relate to Android system definitions, libraries, and user definitions. Second, each group is reconstructed into an effective feature set. Finally, a multi-input deep neural network is designed with two phases: it learns an abstract representation of each feature group before making the final malware detection decision. We evaluate the method on more than 155,000 samples collected from the Google Play Store and the Drebin and AMD datasets. The results show that MINAD not only improves Android malware detection accuracy in comparison with other methods but also improves model stability and reduces computation cost.
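The grouping and two-phase idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the prefix lists, the vocabulary, the layer sizes, and every function name below are assumptions chosen for illustration, and the weights are random rather than trained. Features are split into system, library, and user groups; each group passes through its own dense layer (phase 1), and the concatenated outputs feed a single sigmoid unit (phase 2).

```python
import math
import random

# Hypothetical prefixes for assigning a feature name to a group (assumption).
SYSTEM_PREFIXES = ("android.", "java.", "dalvik.")
LIBRARY_PREFIXES = ("com.google.", "com.facebook.", "org.apache.")

# Hypothetical per-group vocabularies; a real system would derive them from data.
VOCAB = {
    "system": ["android.permission.SEND_SMS", "android.permission.INTERNET"],
    "library": ["com.google.ads.AdView"],
    "user": ["com.example.app.MainActivity"],
}


def group_of(feature):
    """Assign a feature name to the system, library, or user group."""
    if feature.startswith(SYSTEM_PREFIXES):
        return "system"
    if feature.startswith(LIBRARY_PREFIXES):
        return "library"
    return "user"


def encode(features, vocabulary):
    """Binary presence vector of the given features over a fixed vocabulary."""
    present = set(features)
    return [1.0 if name in present else 0.0 for name in vocabulary]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias, activation):
    """Fully connected layer: activation(W x + b), with W stored as rows."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_model(vocab, hidden=4, seed=0):
    """One branch layer per feature group plus a shared decision layer."""
    rng = random.Random(seed)
    branches = {g: init_layer(rng, len(names), hidden)
                for g, names in vocab.items()}
    head = init_layer(rng, hidden * len(vocab), 1)
    return branches, head


def predict(model, vocab, app_features):
    """Return a score in (0, 1); higher would mean 'more likely malware'."""
    branches, head = model
    merged = []
    for group in sorted(vocab):
        own = [f for f in app_features if group_of(f) == group]
        w, b = branches[group]
        # Phase 1: each group is abstracted by its own branch.
        merged.extend(dense(encode(own, vocab[group]), w, b, relu))
    w, b = head
    # Phase 2: the merged abstractions produce the final decision.
    return dense(merged, w, b, sigmoid)[0]


model = build_model(VOCAB)
score = predict(model, VOCAB, ["android.permission.SEND_SMS",
                               "com.example.app.MainActivity"])
```

Keeping a separate branch per group means each input has its own first-layer weights, so a large group cannot dominate a small one before the merge. A trained version would typically be built with a deep learning framework rather than hand-written layers.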



Acknowledgements

This research is funded by the Vietnam Academy of Science and Technology (VAST) under grant number VAST01.05/18-19.

Author information

Corresponding author

Correspondence to Giang L. Nguyen.

About this article

Cite this article

Nguyen, D.V., Nguyen, G.L., Nguyen, T.T. et al. MINAD: Multi-inputs Neural Network based on Application Structure for Android Malware Detection. Peer-to-Peer Netw. Appl. (2021). https://doi.org/10.1007/s12083-021-01244-w

Keywords

  • Android malware
  • Malware detection
  • Machine learning
  • Neural network