SysDroid: a dynamic ML-based android malware analyzer using system call traces

Ananya, A.; Aswathy, A.; Amal, T. R.; Swathy, P. G.; Vinod, P.; Shojafar, Mohammad

doi:10.1007/s10586-019-03045-6

Published: 13 January 2020

Volume 23, pages 2789–2808, (2020)

Cluster Computing

Ananya A.¹,
Aswathy A.¹,
Amal T. R.¹,
Swathy P. G.¹,
Vinod P.¹ &
…
Mohammad Shojafar²,³ (ORCID: orcid.org/0000-0003-3284-5086)

Abstract

Android is a popular open-source operating system that is highly susceptible to malware attacks. Researchers have developed machine learning models, trained on attributes extracted using static or dynamic approaches, to identify malicious applications. However, such models suffer from low detection accuracy because of noisy attributes extracted by conventional feature selection algorithms. Hence, this paper proposes a new feature selection mechanism known as selection of relevant attributes for improving locally extracted features using classical feature selectors (SAILS). SAILS aims to discover prominent system calls from applications and is built on top of conventional feature selection methods such as mutual information, distinguishing feature selector and Galavotti–Sebastiani–Simi. These classical attribute selection methods are used as local feature selectors. In addition, a novel global feature selection method known as weighted feature selection is proposed. The proposed feature selectors are comprehensively compared with the traditional methods. SAILS yields improved values for the evaluation metrics, compared with the conventional feature selection algorithms, for distinct machine learning models developed using Logistic Regression, CART, Random Forest, XGBoost and Deep Neural Networks. Our evaluations show accuracies ranging between 95 and 99% for dropout rates in the range 0.1–0.8 and learning rates in the range 0.001–0.2. Finally, the security of the malware classifiers against adversarial examples is thoroughly investigated. A decline in accuracy with adversarial examples is observed: the recall of classifiers built with SAILS, when subjected to such examples, is estimated in the range 24.79–92.2%, whereas prior to the attack the true positive rate obtained by the classifiers is between 95.2 and 99.79%. The results suggest that attackers can bypass detection by discovering the classifier's blind spots and adding a small number of legitimate attributes.
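As a rough illustration of what a local feature selector does with system calls, the sketch below scores each call against the malware class using two of the classical measures named in the abstract. It assumes the textbook text-categorization definitions, GSS(t, c) = P(t, c)P(¬t, ¬c) − P(t, ¬c)P(¬t, c) and pointwise mutual information MI(t, c) = log(P(t, c) / (P(t)P(c))); the exact formulation used in the paper is not visible here. The function names and the toy traces are hypothetical, and this is not the SAILS algorithm itself.

```python
import math


def local_scores(samples, labels, feature):
    """Return (GSS, pointwise MI) of one binary feature for the malware class.

    samples: list of sets of system call names (hypothetical toy format).
    labels:  list of 0 (benign) / 1 (malware), aligned with samples.
    """
    n = len(samples)
    # Joint counts of feature presence (t) and malware class membership (c).
    n_tc = sum(1 for s, y in zip(samples, labels) if feature in s and y == 1)
    n_t_notc = sum(1 for s, y in zip(samples, labels) if feature in s and y == 0)
    n_nott_c = sum(1 for s, y in zip(samples, labels) if feature not in s and y == 1)
    n_nott_notc = n - n_tc - n_t_notc - n_nott_c

    p_tc, p_t_notc = n_tc / n, n_t_notc / n
    p_nott_c, p_nott_notc = n_nott_c / n, n_nott_notc / n

    # GSS coefficient: positive when the call co-occurs with malware.
    gss = p_tc * p_nott_notc - p_t_notc * p_nott_c

    # Pointwise mutual information; undefined (treated as -inf) when the
    # call never appears in a malware sample.
    p_t = p_tc + p_t_notc
    p_c = p_tc + p_nott_c
    mi = math.log(p_tc / (p_t * p_c)) if p_tc > 0 else float("-inf")
    return gss, mi


def rank_features(samples, labels):
    """Rank all observed system calls by GSS, highest first (ties by name)."""
    vocabulary = set().union(*samples)
    return sorted(vocabulary,
                  key=lambda f: (-local_scores(samples, labels, f)[0], f))


# Toy data: each sample is the set of system calls seen in one trace.
samples = [
    {"open", "sendto"},
    {"open", "sendto", "ptrace"},
    {"open", "read"},
    {"read", "write"},
]
labels = [1, 1, 0, 0]

if __name__ == "__main__":
    for name in rank_features(samples, labels):
        print(name, local_scores(samples, labels, name))
```

In this toy set, `sendto` appears only in the malware traces and so ranks first, while `read` appears only in the benign traces and ranks last. A selector would keep the top-ranked calls and discard the rest.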

Author information

Authors and Affiliations

Department of Computer Science & Engineering, SCMS School of Engineering and Technology, Ernakulam, Kerala, India
Ananya A., Aswathy A., Amal T. R., Swathy P. G. & Vinod P.
ICS/5GIC, University of Surrey, Guildford, GU2 7XH, UK
Mohammad Shojafar
University of Padua, 35131, Padua, Italy
Mohammad Shojafar

Corresponding author

Correspondence to Mohammad Shojafar.

Ethics declarations

Conflicts of interest

There is no conflict of interest for the paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This section presents the GSS, WFS, and MI scores of N-grams for malware and benign samples across various features (see Figs. 11, 12 and 13).
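For readers unfamiliar with N-gram features over system call traces, the following minimal sketch counts contiguous N-grams in one trace. It assumes a trace is simply an ordered list of system call names; the paper's actual trace format and N-gram construction are not shown in this text, and `syscall_ngrams` is a hypothetical helper name.

```python
from collections import Counter


def syscall_ngrams(trace, n):
    """Count contiguous n-grams in an ordered list of system call names."""
    # range() is empty when the trace is shorter than n, giving an empty Counter.
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


# Toy trace (an assumption for illustration, not data from the paper).
trace = ["open", "read", "write", "read", "write", "close"]
bigrams = syscall_ngrams(trace, 2)

if __name__ == "__main__":
    for gram, count in bigrams.most_common():
        print(gram, count)
```

Each distinct N-gram can then be treated as one feature and scored by a selector such as GSS, WFS, or MI.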

About this article

Cite this article

Ananya, A., Aswathy, A., Amal, T.R. et al. SysDroid: a dynamic ML-based android malware analyzer using system call traces. Cluster Comput 23, 2789–2808 (2020). https://doi.org/10.1007/s10586-019-03045-6

Received: 10 August 2019
Revised: 29 November 2019
Accepted: 31 December 2019
Published: 13 January 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s10586-019-03045-6
