Android malware classification using convolutional neural network and LSTM

Abstract

Mobile phones are among the latest technological developments of the 20th century, and the number of phishing, sniffing, and other attacks against them is increasing. Signature-based detection methods are usable, but they are unreliable against new kinds of malware, are neither accurate nor sufficient on their own, and cannot efficiently detect rapid changes in malware behavior. Our classification process analyzes the decompiled source code with Jadx and also analyzes the applications themselves to extract useful features, using two kinds of analysis: static and dynamic. We concentrate on Android malware classification using call graphs, generating call graphs for both classes.dex and lib.so files, which has not been done in previous work. The proposed classification method is CNN-LSTM, which combines a convolutional neural network with long short-term memory (a type of recurrent neural network) and is therefore a reasonable choice for learning complex, sequential features. We design a sequential neural network for sequence classification and conduct a set of experiments on malware detection. Finally, CNN-LSTM is compared with several other classification methods, including a convolutional neural network (CNN), support vector machine (SVM), Naive Bayes, and Random Forest. The results show that our method is more effective, efficient, and reliable than the others when run on the same hardware and dataset.
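
To make the CNN-LSTM idea concrete, the following is a minimal sketch of the data flow only: a 1D convolution with ReLU, max pooling, a single LSTM layer, and a sigmoid output. It is not the authors' implementation. The weights are random and untrained, and the layer sizes, function names, and the example input (a hypothetical call-graph-derived sequence already scaled to [0, 1]) are all assumptions made for illustration.

```python
import math
import random

def conv1d_relu(seq, kernels, bias):
    """Valid 1D convolution over a scalar sequence, followed by ReLU.
    Returns one feature vector (one value per kernel) per window position."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        out.append([max(0.0, sum(w * x for w, x in zip(kern, window)) + b)
                    for kern, b in zip(kernels, bias)])
    return out

def max_pool(frames, size=2):
    """Non-overlapping max pooling along the time axis, per channel."""
    pooled = []
    for t in range(0, len(frames) - size + 1, size):
        block = frames[t:t + size]
        pooled.append([max(col) for col in zip(*block)])
    return pooled

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_last_hidden(frames, W, U, b, hidden):
    """Run a single-layer LSTM over frames and return the final hidden state.
    W, U, b hold input, recurrent, and bias weights for the four gates,
    in the order: input, forget, cell candidate, output."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in frames:
        gates = []
        for g in range(4):
            gates.append([
                sum(W[g][j][i] * x[i] for i in range(len(x)))
                + sum(U[g][j][i] * h[i] for i in range(hidden))
                + b[g][j]
                for j in range(hidden)
            ])
        in_gate = [sigmoid(z) for z in gates[0]]
        forget_gate = [sigmoid(z) for z in gates[1]]
        candidate = [math.tanh(z) for z in gates[2]]
        out_gate = [sigmoid(z) for z in gates[3]]
        # Cell state keeps part of the old memory and adds gated new content.
        c = [f * c_old + i_g * g_c
             for f, c_old, i_g, g_c in zip(forget_gate, c, in_gate, candidate)]
        h = [o * math.tanh(c_j) for o, c_j in zip(out_gate, c)]
    return h

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def build_model(channels=4, kernel=3, hidden=5, seed=0):
    """Hypothetical, untrained parameters; sizes are illustrative only."""
    rng = random.Random(seed)
    return {
        "kernels": rand_matrix(channels, kernel, rng),
        "conv_bias": [0.0] * channels,
        "W": [rand_matrix(hidden, channels, rng) for _ in range(4)],
        "U": [rand_matrix(hidden, hidden, rng) for _ in range(4)],
        "b": [[0.0] * hidden for _ in range(4)],
        "out_w": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
        "out_b": 0.0,
        "hidden": hidden,
    }

def predict_malware_probability(model, seq):
    """Conv1D + ReLU -> max pool -> LSTM -> sigmoid score in (0, 1)."""
    frames = max_pool(conv1d_relu(seq, model["kernels"], model["conv_bias"]))
    h = lstm_last_hidden(frames, model["W"], model["U"], model["b"],
                         model["hidden"])
    z = sum(w * v for w, v in zip(model["out_w"], h)) + model["out_b"]
    return sigmoid(z)

# Hypothetical call-graph-derived sequence, already scaled to [0, 1].
example_sequence = [0.1, 0.4, 0.4, 0.9, 0.2, 0.7, 0.3, 0.8, 0.5, 0.6]
model = build_model()
probability = predict_malware_probability(model, example_sequence)
print(f"untrained malware score: {probability:.3f}")
```

The convolution stage summarizes local patterns in the sequence, and the LSTM then reads those summaries in order, which is the usual motivation for stacking the two. In practice the parameters would be learned from labeled samples with a deep learning framework rather than drawn at random as they are here.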

Source code dataset

Source code classification model

Author information

Corresponding author

Correspondence to Soodeh Hosseini.

About this article

Cite this article

Hosseini, S., Nezhad, A.E. & Seilani, H. Android malware classification using convolutional neural network and LSTM. J Comput Virol Hack Tech 17, 307–318 (2021). https://doi.org/10.1007/s11416-021-00385-z

Keywords

  • Android Malware Detection
  • Call Graph
  • Convolutional Neural Network
  • Long Short-Term Memory