Skip to main content
Log in

SysDroid: a dynamic ML-based android malware analyzer using system call traces

  • Published:
Cluster Computing Aims and scope Submit manuscript

Abstract

Android is a popular open-source operating system highly susceptible to malware attacks. Researchers have developed machine learning models, learned from attributes extracted using static/dynamic approaches to identify malicious applications. However, such models suffer from low detection accuracy, due to the presence of noisy attributes, extracted from conventional feature selection algorithms. Hence, in this paper, a new feature selection mechanism known as selection of relevant attributes for improving locally extracted features using classical feature selectors (SAILS), is proposed. SAILS, targets on discovering prominent system calls from applications, and is built on the top of conventional feature selection methods, such as mutual information, distinguishing feature selector and Galavotti–Sebastiani–Simi. These classical attribute selection methods are used as local feature selectors. Besides, a novel global feature selection method known as, weighted feature selection is proposed. Comprehensive analysis of the proposed feature selectors, is conducted with the traditional methods. SAILS results in improved values for evaluation metrics, compared to the conventional feature selection algorithms for distinct machine learning models, developed using Logistic Regression, CART, Random Forest, XGBoost and Deep Neural Networks. Our evaluations observe accuracies ranging between 95 and 99% for dropout rate and learning rate in the range 0.1–0.8 and 0.001–0.2, respectively. Finally, the security evaluation of malware classifiers on adversarial examples are thoroughly investigated. A decline in accuracy with adversarial examples is observed. Also, SAILS recall rate of classifier subjected to such examples estimate in the range of 24.79–92.2%. However, prior to the attack, the true positive rate obtained by the classifier is reported between 95.2 and 99.79%. The results suggest that the hackers can bypass detection, by discovering the classifier blind spots, on augmenting a small number of legitimate attributes.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

We’re sorry, something doesn't seem to be working properly.

Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.

References

  1. Aafer, Y., Du, W., Yin, H.: Droidapiminer: mining api-level features for robust malware detection in android. In: International Conference on Security and Privacy in Communication Systems, pp. 86–103. Springer, Berlin (2013)

  2. Afonso, V.M., de Amorim, M.F., Grégio, A.R.A., Junquera, G.B., de Geus, P.L.: Identifying android malware using dynamically obtained features. J. Comput. Virol. Hacking Tech. 11(1), 9–17 (2015)

    Article  Google Scholar 

  3. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K., Siemens, C.: Drebin: effective and explainable detection of android malware in your pocket. Ndss 14, 23–26 (2014)

    Google Scholar 

  4. Arshad, S., Shah, M.A., Wahid, A., Mehmood, A., Song, H., Yu, H.: Samadroid: a novel 3-level hybrid malware detection model for android operating system. IEEE Access 6, 4321–4339 (2018)

    Article  Google Scholar 

  5. Bhandari, S., Panihar, R., Naval, S., Laxmi, V., Zemmari, A., Gaur, M.S.: Sword: semantic aware android malware detector. J. Inf. Secur. Appl. 42, 46–56 (2018)

    Google Scholar 

  6. Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 387–402. Springer, Berlin (2013)

  7. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)

    Article  Google Scholar 

  8. Burguera, I., Zurutuza, U., Nadjm-Tehrani, S.: Crowdroid: behavior-based malware detection system for android. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 15–26. ACM (2011)

  9. Cyber security facts and statistics for 2019. https://us.norton.com/internetsecurity-emerging-threats-10-facts-/about-todays-cybersecurity-landscape-that/-you-should-know.html (2019). Accessed 10 Aug 2019

  10. Cao, Y., Yang, J.: Towards making systems forget with machine unlearning. In: 2015 IEEE Symposium on Security and Privacy, pp. 463–480. IEEE (2015)

  11. Chen, L., Hou, S., Ye, Y., Chen, L.: An adversarial machine learning model against android malware evasion attacks. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, pp. 43–55. Springer, Berlin (2017)

  12. Chen, S., Xue, M., Fan, L., Hao, S., Xu, L., Zhu, H., Li, B.: Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning approach. Comput. Secur. 73, 326–344 (2018)

    Article  Google Scholar 

  13. Chen, T., Guestrin, C.: Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIQKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)

  14. Grosse, K., Papernot, N., Manoharan, P., Backes, M., McDaniel, P.: Adversarial examples for malware detection. In: European Symposium on Research in Computer Security, pp. 62–79. Springer, Berlin (2017)

  15. Han, W., Xue, J., Wang, Y., Huang, L., Kong, Z., Mao, L.: Maldae: detecting and explaining malware based on correlation and fusion of static and dynamic characteristics. Comput. Secur. 83, 208–233 (2019)

    Article  Google Scholar 

  16. Hosmer Jr., D.W., Lemeshow, S., Sturdivant, R.X.: Applied Logistic Regression, vol. 398. Wiley, New York (2013)

    Book  Google Scholar 

  17. Hou, S., Saas, A., Chen, L., Ye, Y.: Deep4maldroid: a deep learning framework for android malware detection based on linux kernel system call graphs. In: 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW), pp. 104–111. IEEE (2016)

  18. Hou, S., Saas, A., Ye, Y., Chen, L.: Droiddelver: an android malware detection system using deep belief network based on API call blocks. In: International Conference on Web-Age Information Management, pp. 54–66. Springer, Berlin (2016)

  19. Hou, S., Ye, Y., Song, Y., Abdulhayoglu, M.: Hindroid: an intelligent android malware detection system based on structured heterogeneous information network. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1507–1515. ACM (2017)

  20. Ishibashi, H., Hihara, S., Iriki, A.: Acquisition and development of monkey tool-use: behavioral and kinematic analyses. Can. J. Physiol. Pharmacol. 78(11), 958–966 (2000)

    Article  Google Scholar 

  21. Largeron, C., Moulin, C., Géry, M.: Entropy based feature selection for text categorization. In: Proceedings of the 2011 ACM Symposium on Applied Computing, pp. 924–928. ACM (2011)

  22. Mobile malware evolution 2019. https://securelist.com/mobile-malware-evolution-2018/89689/ (2019). Accessed 10 Aug 2019

  23. Michael, S., Florian, E., Thomas, S., Felix, C.F., Hoffmann, J.: Mobilesandbox: looking deeper into android applications. In: Proceedings of the 28th International ACM Symposium on Applied Computing (SAC) (2013)

  24. Naway, A., Li, Y.: A review on the use of deep learning in android malware detection. arXiv preprint arXiv:1812.10360 (2018)

  25. Roundy, K.A., Miller, B.P.: Hybrid analysis and control of malware. In: International Workshop on Recent Advances in Intrusion Detection, pp. 317–338. Springer, Berlin (2010)

  26. Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21(3), 660–674 (1991)

    Article  MathSciNet  Google Scholar 

  27. Santos, I., Penya, Y.K., Devesa, J., Bringas, P.G.: N-grams-based file signatures for malware detection. ICEIS 2(9), 317–320 (2009)

    Google Scholar 

  28. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  29. Suciu, O., Marginean, R., Kaya, Y., Daume III, H., Dumitras, T.: When does machine learning FAIL? Generalized transferability for evasion and poisoning attacks. In: 27th USENIX Security Symposium (USENIX Security 18), pp. 1299–1316 (2018)

  30. Tong, F., Yan, Z.: A hybrid approach of mobile malware detection in android. J. Parallel Distrib. Comput. 103, 22–31 (2017)

    Article  Google Scholar 

  31. Virustotal. http://virustotal.com/ (2019). Accessed 10 Aug 2019

  32. Wang, W., Zhao, M., Wang, J.: Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network. J. Ambient Intell. Humaniz. Comput. 10(8), 3035–3043 (2019)

    Article  Google Scholar 

  33. Yang, Y., Shen, H.T., Ma, Z., Huang, Z., Zhou, X.: L2, 1-norm regularized discriminative feature selection for unsupervised. In: Twenty-Second International Joint Conference on Artificial Intelligence (2011)

  34. Yuan, Z., Lu, Y., Xue, Y.: Droiddetector: android malware characterization and detection using deep learning. Tsinghua Sci. Technol. 21(1), 114–123 (2016)

    Article  Google Scholar 

  35. Zhang, J., Zhang, K., Qin, Z., Yin, H., Wu, Q.: Sensitive system calls based packed malware variants detection using principal component initialized multilayers neural networks. Cybersecurity 1(1), 10 (2018)

    Article  Google Scholar 

  36. Zheng, Z., Wu, X., Srihari, R.: Feature selection for text categorization on imbalanced data. ACM SIGKDD Explor. Newslett. 6(1), 80–89 (2004)

    Article  Google Scholar 

  37. Zhou, Y., Jiang, X.: Dissecting android malware: characterization and evolution. In: 2012 IEEE Symposium on Security and Privacy, pp. 95–109. IEEE (2012)

  38. 9apps: Andriod app website. https://www.9apps.com/ (2019). Accessed 10 Aug 2019

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Mohammad Shojafar.

Ethics declarations

Conflicts of interest

There is no conflict of interest for the paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

In this section, different scores of N-gram for GSS, WFS, and MI are presented to illustrate the malware and benign samples for various features (see the Figs. 1112 and 13).

Fig. 11
figure 11

Score difference of N-grams for GSS\(^*\)

Fig. 12
figure 12

Score difference of N-grams for WFS\(^*\)

Fig. 13
figure 13

Score difference of N-grams for MI\(^*\)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ananya, A., Aswathy, A., Amal, T.R. et al. SysDroid: a dynamic ML-based android malware analyzer using system call traces. Cluster Comput 23, 2789–2808 (2020). https://doi.org/10.1007/s10586-019-03045-6

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10586-019-03045-6

Keywords

Navigation