SysDroid: a dynamic ML-based android malware analyzer using system call traces

  • Ananya A. 
  • Aswathy A. 
  • Amal T. R. 
  • Swathy P. G. 
  • Vinod P. 
  • Mohammad Shojafar Email author


Android is a popular open-source operating system highly susceptible to malware attacks. Researchers have developed machine learning models, learned from attributes extracted using static/dynamic approaches to identify malicious applications. However, such models suffer from low detection accuracy, due to the presence of noisy attributes, extracted from conventional feature selection algorithms. Hence, in this paper, a new feature selection mechanism known as selection of relevant attributes for improving locally extracted features using classical feature selectors (SAILS), is proposed. SAILS, targets on discovering prominent system calls from applications, and is built on the top of conventional feature selection methods, such as mutual information, distinguishing feature selector and Galavotti–Sebastiani–Simi. These classical attribute selection methods are used as local feature selectors. Besides, a novel global feature selection method known as, weighted feature selection is proposed. Comprehensive analysis of the proposed feature selectors, is conducted with the traditional methods. SAILS results in improved values for evaluation metrics, compared to the conventional feature selection algorithms for distinct machine learning models, developed using Logistic Regression, CART, Random Forest, XGBoost and Deep Neural Networks. Our evaluations observe accuracies ranging between 95 and 99% for dropout rate and learning rate in the range 0.1–0.8 and 0.001–0.2, respectively. Finally, the security evaluation of malware classifiers on adversarial examples are thoroughly investigated. A decline in accuracy with adversarial examples is observed. Also, SAILS recall rate of classifier subjected to such examples estimate in the range of 24.79–92.2%. However, prior to the attack, the true positive rate obtained by the classifier is reported between 95.2 and 99.79%. The results suggest that the hackers can bypass detection, by discovering the classifier blind spots, on augmenting a small number of legitimate attributes.


Android malware Machine learning (ML) Deep learning (DL) Feature selection Adversarial machine learning (AML) Attacks 


Compliance with ethical standards

Conflicts of interest

There is no conflict of interest for the paper.


  1. 1.
    Aafer, Y., Du, W., Yin, H.: Droidapiminer: mining api-level features for robust malware detection in android. In: International Conference on Security and Privacy in Communication Systems, pp. 86–103. Springer, Berlin (2013)Google Scholar
  2. 2.
    Afonso, V.M., de Amorim, M.F., Grégio, A.R.A., Junquera, G.B., de Geus, P.L.: Identifying android malware using dynamically obtained features. J. Comput. Virol. Hacking Tech. 11(1), 9–17 (2015)CrossRefGoogle Scholar
  3. 3.
    Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K., Siemens, C.: Drebin: effective and explainable detection of android malware in your pocket. Ndss 14, 23–26 (2014)Google Scholar
  4. 4.
    Arshad, S., Shah, M.A., Wahid, A., Mehmood, A., Song, H., Yu, H.: Samadroid: a novel 3-level hybrid malware detection model for android operating system. IEEE Access 6, 4321–4339 (2018)CrossRefGoogle Scholar
  5. 5.
    Bhandari, S., Panihar, R., Naval, S., Laxmi, V., Zemmari, A., Gaur, M.S.: Sword: semantic aware android malware detector. J. Inf. Secur. Appl. 42, 46–56 (2018)Google Scholar
  6. 6.
    Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F.: Evasion attacks against machine learning at test time. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 387–402. Springer, Berlin (2013)CrossRefGoogle Scholar
  7. 7.
    Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)CrossRefGoogle Scholar
  8. 8.
    Burguera, I., Zurutuza, U., Nadjm-Tehrani, S.: Crowdroid: behavior-based malware detection system for android. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 15–26. ACM (2011)Google Scholar
  9. 9.
  10. 10.
    Cao, Y., Yang, J.: Towards making systems forget with machine unlearning. In: 2015 IEEE Symposium on Security and Privacy, pp. 463–480. IEEE (2015)Google Scholar
  11. 11.
    Chen, L., Hou, S., Ye, Y., Chen, L.: An adversarial machine learning model against android malware evasion attacks. In: Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, pp. 43–55. Springer, Berlin (2017)CrossRefGoogle Scholar
  12. 12.
    Chen, S., Xue, M., Fan, L., Hao, S., Xu, L., Zhu, H., Li, B.: Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning approach. Comput. Secur. 73, 326–344 (2018)CrossRefGoogle Scholar
  13. 13.
    Chen, T., Guestrin, C.: Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIQKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)Google Scholar
  14. 14.
    Grosse, K., Papernot, N., Manoharan, P., Backes, M., McDaniel, P.: Adversarial examples for malware detection. In: European Symposium on Research in Computer Security, pp. 62–79. Springer, Berlin (2017)CrossRefGoogle Scholar
  15. 15.
    Han, W., Xue, J., Wang, Y., Huang, L., Kong, Z., Mao, L.: Maldae: detecting and explaining malware based on correlation and fusion of static and dynamic characteristics. Comput. Secur. 83, 208–233 (2019)CrossRefGoogle Scholar
  16. 16.
    Hosmer Jr., D.W., Lemeshow, S., Sturdivant, R.X.: Applied Logistic Regression, vol. 398. Wiley, New York (2013)CrossRefGoogle Scholar
  17. 17.
    Hou, S., Saas, A., Chen, L., Ye, Y.: Deep4maldroid: a deep learning framework for android malware detection based on linux kernel system call graphs. In: 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW), pp. 104–111. IEEE (2016)Google Scholar
  18. 18.
    Hou, S., Saas, A., Ye, Y., Chen, L.: Droiddelver: an android malware detection system using deep belief network based on API call blocks. In: International Conference on Web-Age Information Management, pp. 54–66. Springer, Berlin (2016)CrossRefGoogle Scholar
  19. 19.
    Hou, S., Ye, Y., Song, Y., Abdulhayoglu, M.: Hindroid: an intelligent android malware detection system based on structured heterogeneous information network. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1507–1515. ACM (2017)Google Scholar
  20. 20.
    Ishibashi, H., Hihara, S., Iriki, A.: Acquisition and development of monkey tool-use: behavioral and kinematic analyses. Can. J. Physiol. Pharmacol. 78(11), 958–966 (2000)CrossRefGoogle Scholar
  21. 21.
    Largeron, C., Moulin, C., Géry, M.: Entropy based feature selection for text categorization. In: Proceedings of the 2011 ACM Symposium on Applied Computing, pp. 924–928. ACM (2011)Google Scholar
  22. 22.
    Mobile malware evolution 2019. (2019). Accessed 10 Aug 2019
  23. 23.
    Michael, S., Florian, E., Thomas, S., Felix, C.F., Hoffmann, J.: Mobilesandbox: looking deeper into android applications. In: Proceedings of the 28th International ACM Symposium on Applied Computing (SAC) (2013)Google Scholar
  24. 24.
    Naway, A., Li, Y.: A review on the use of deep learning in android malware detection. arXiv preprint arXiv:1812.10360 (2018)Google Scholar
  25. 25.
    Roundy, K.A., Miller, B.P.: Hybrid analysis and control of malware. In: International Workshop on Recent Advances in Intrusion Detection, pp. 317–338. Springer, Berlin (2010)Google Scholar
  26. 26.
    Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 21(3), 660–674 (1991)MathSciNetCrossRefGoogle Scholar
  27. 27.
    Santos, I., Penya, Y.K., Devesa, J., Bringas, P.G.: N-grams-based file signatures for malware detection. ICEIS 2(9), 317–320 (2009)Google Scholar
  28. 28.
    Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)MathSciNetzbMATHGoogle Scholar
  29. 29.
    Suciu, O., Marginean, R., Kaya, Y., Daume III, H., Dumitras, T.: When does machine learning FAIL? Generalized transferability for evasion and poisoning attacks. In: 27th USENIX Security Symposium (USENIX Security 18), pp. 1299–1316 (2018)Google Scholar
  30. 30.
    Tong, F., Yan, Z.: A hybrid approach of mobile malware detection in android. J. Parallel Distrib. Comput. 103, 22–31 (2017)CrossRefGoogle Scholar
  31. 31.
    Virustotal. (2019). Accessed 10 Aug 2019
  32. 32.
    Wang, W., Zhao, M., Wang, J.: Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network. J. Ambient Intell. Humaniz. Comput. 10(8), 3035–3043 (2019)CrossRefGoogle Scholar
  33. 33.
    Yang, Y., Shen, H.T., Ma, Z., Huang, Z., Zhou, X.: L2, 1-norm regularized discriminative feature selection for unsupervised. In: Twenty-Second International Joint Conference on Artificial Intelligence (2011)Google Scholar
  34. 34.
    Yuan, Z., Lu, Y., Xue, Y.: Droiddetector: android malware characterization and detection using deep learning. Tsinghua Sci. Technol. 21(1), 114–123 (2016)CrossRefGoogle Scholar
  35. 35.
    Zhang, J., Zhang, K., Qin, Z., Yin, H., Wu, Q.: Sensitive system calls based packed malware variants detection using principal component initialized multilayers neural networks. Cybersecurity 1(1), 10 (2018)CrossRefGoogle Scholar
  36. 36.
    Zheng, Z., Wu, X., Srihari, R.: Feature selection for text categorization on imbalanced data. ACM SIGKDD Explor. Newslett. 6(1), 80–89 (2004)CrossRefGoogle Scholar
  37. 37.
    Zhou, Y., Jiang, X.: Dissecting android malware: characterization and evolution. In: 2012 IEEE Symposium on Security and Privacy, pp. 95–109. IEEE (2012)Google Scholar
  38. 38.
    9apps: Andriod app website. (2019). Accessed 10 Aug 2019

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. 1.Department of Computer Science & EngineeringSCMS School of Engineering and TechnologyErnakulamIndia
  2. 2.ICS/5GIC, University of SurreyGuildfordUK
  3. 3.University of PaduaPaduaItaly

Personalised recommendations