StackDroid: Evaluation of a Multi-level Approach for Detecting the Malware on Android Using Stacked Generalization

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1035)


Attackers or cyber criminals are getting encouraged to develop android malware because of the rapidly growing rate of android users. To detect android malware, researchers and security specialist have been started to contribute on android malware analysis and detection related tasks using machine learning algorithms. In this paper, Stacked Generalization has been used to minimize the error rate and a multi-level architecture based approach named StackDroid has been presented and evaluated. In this experiment, Extremely Randomized Tree (ET), Random Forest (RF), Multi-Layer Perceptron (MLP) and Stochastic Gradient Descent (SGD) classifiers have been used as base classifiers in level 1 and Extreme Gradient Boosting (EGB) has been used as final predictor in level 2. It’s been found that StackDroid provides 99% of Area Under Curve (AUC), 1.67% of False Positive Rate (FPR) and 97% detection accuracy on DREBIN dataset which provides a strong basement to the development of android malware scanner.


Android malware Machine learning Multi-level approach Stacked generalization Android malware detection 


  1. 1.
    Sen, S., Aysan, A.I., Clark, J.A.: SAFEDroid: using structural features for detecting android malwares. In: Lin, X., Ghorbani, A., Ren, K., Zhu, S., Zhang, A. (eds.) SecureComm 2017. LNICSSITE, vol. 239, pp. 255–270. Springer, Cham (2018). Scholar
  2. 2.
    Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K., Siemens, C.E.R.T.: DREBIN: effective and explainable detection of android malware in your pocket. NDSS 14, 23–26 (2014)Google Scholar
  3. 3.
    Saracino, A., Sgandurra, D., Dini, G., Martinelli, F.: MADAM: effective and efficient behavior-based android malware detection and prevention. IEEE Trans. Depend. Secure Comput. 15, 83–97 (2016)CrossRefGoogle Scholar
  4. 4.
    Reina, A., Fattori, A., Cavallaro, L.: A system call-centric analysis and stimulation technique to automatically reconstruct android malware behaviors. In: EuroSec, April 2013Google Scholar
  5. 5.
    Backes, M., Gerling, S., Hammer, C., Maffei, M., von Styp-Rekowsky, P.: AppGuard – fine-grained policy enforcement for untrusted android applications. In: Garcia-Alfaro, J., Lioudakis, G., Cuppens-Boulahia, N., Foley, S., Fitzgerald, W.M. (eds.) DPM/SETOP -2013. LNCS, vol. 8247, pp. 213–231. Springer, Heidelberg (2014). Scholar
  6. 6.
    Bugiel, S., Davi, L., Dmitrienko, A., Fischer, T., Sadeghi, A.R., Shastry, B.: Towards taming privilege-escalation attacks on android. NDSS 17, 19 (2012)Google Scholar
  7. 7.
    Gibler, C., Crussell, J., Erickson, J., Chen, H.: AndroidLeaks: automatically detecting potential privacy leaks in android applications on a large scale. In: Katzenbeisser, S., Weippl, E., Camp, L.J., Volkamer, M., Reiter, M., Zhang, X. (eds.) Trust 2012. LNCS, vol. 7344, pp. 291–307. Springer, Heidelberg (2012). Scholar
  8. 8.
    Viswanath, H., Mehtre, B.M.: U.S. Patent No. 9,959,406, U.S. Patent and Trademark Office, Washington, DC (2018)Google Scholar
  9. 9.
    Zhong, X., Zeng, F., Cheng, Z., Xie, N., Qin, X., Guo, S.: Privilege escalation detecting in android applications. In: 2017 3rd International Conference on Big Data Computing and Communications (BIGCOM), pp. 9–44. IEEE (2017)Google Scholar
  10. 10.
    Aafer, Y., Du, W., Yin, H.: DroidAPIMiner: mining API-level features for robust malware detection in android. In: Zia, T., Zomaya, A., Varadharajan, V., Mao, M. (eds.) SecureComm 2013. LNICST, vol. 127, pp. 86–103. Springer, Cham (2013). Scholar
  11. 11.
    Demontis, A., et al.: Yes, machine learning can be more secure! a case study on Android malware detection. IEEE Trans. Depend. Secure Comput. (2017)Google Scholar
  12. 12.
    Papadopoulos, H., Georgiou, N., Eliades, C., Konstantinidis, A.: Android malware detection with unbiased confidence guarantees. Neurocomputing. 280, 312 (2018)CrossRefGoogle Scholar
  13. 13.
    Shabtai, A., Moskovitch, R., Elovici, Y., Glezer, C.: Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. Inf. Secur. Tech. Rep. 14(1), 16–29 (2009)CrossRefGoogle Scholar
  14. 14.
    Egele, M., Scholte, T., Kirda, E., Kruegel, C.: A survey on automated dynamic malware-analysis techniques and tools. ACM Comput. Surv. (CSUR) 44(2), 6 (2012)CrossRefGoogle Scholar
  15. 15.
    Burguera, I., Zurutuza, U., Nadjm-Tehrani, S.: Crowdroid: behavior-based malware detection system for android. In: Proceedings of the 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 15–26. ACM (2011)Google Scholar
  16. 16.
    Wolpert, D.H.: Stacked generalization. Neural Netw. 5(2), 241–259 (1992)CrossRefGoogle Scholar
  17. 17.
    Sahs, J., Khan, L.: A machine learning approach to android malware detection. In: Intelligence and Security Informatics Conference (EISIC), pp. 141–147. IEEE (2012)Google Scholar
  18. 18.
    Scikit-learn: machine learning in Python Scikit-learn 0.19.1 documentation.
  19. 19.
    Yerima, S.Y., Sezer, S., McWilliams, G., Muttik, I.: A new android malware detection approach using Bayesian classification. In: IEEE 27th International Conference on Advanced Information Networking and Applications (AINA), pp. 121–128 (2013)Google Scholar
  20. 20.
    Wu, W.C., Hung, S.H.: DroidDolphin: a dynamic Android malware detection framework using big data and machine learning. In: 2014 Conference on Research in Adaptive and Convergent Systems, pp. 247–252. ACM (2014)Google Scholar
  21. 21.
    Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 27 (2011)Google Scholar
  22. 22.
    Feizollah, A., Anuar, N.B., Salleh, R., Amalina, F., Ma’arof, R.U.R., Shamshirband, S.: A study of machine learning classifiers for anomaly-based mobile botnet detection. Malays. J. Comput. Sci. 26(4), 251–265 (2014). (in Malaysian)Google Scholar
  23. 23.
    Talha, K.A., Alper, D.I., Aydin, C.: APK auditor: permission-based Android malware detection system. Digit. Invest. 13, 1–14 (2015)CrossRefGoogle Scholar
  24. 24.
    Urcuqui, C., Navarro, A.: Machine learning classifiers for android malware analysis. In: IEEE Colombian Conference on Communications and Computing (COLCOM), pp. 1–6 (2016)Google Scholar
  25. 25.
    Xiao, X., Zhang, S., Mercaldo, F., Hu, G., Sangaiah, A.K.: Android malware detection based on system call sequences and LSTM. Multimedia Tools Appl. 78, 3979–3999 (2019)CrossRefGoogle Scholar
  26. 26.
    Wang, C., Xu, Q., Lin, X., Liu, S.: Research on data mining of permissions mode for Android malware detection. Clust. Comput. 1–14 (2018)Google Scholar
  27. 27.
    Python package for stacking (machine learning technique).
  28. 28.
    Townsend, J.T.: Theoretical analysis of an alphabetic confusion matrix. Percept. Psychophys. 9(1), 40–50 (1971)CrossRefGoogle Scholar
  29. 29.
    Davis, J., Goadrich, M.: The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 233–240. ACM (2006)Google Scholar
  30. 30.
    Sokolova, M., Japkowicz, N., Szpakowicz, S.: Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. In: Sattar, A., Kang, B. (eds.) AI 2006. LNCS (LNAI), vol. 4304, pp. 1015–1021. Springer, Heidelberg (2006). Scholar
  31. 31.
    Boyd, K., Eng, K.H., Page, C.D.: Area under the precision-recall curve: point estimates and confidence intervals. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) ECML PKDD 2013. LNCS (LNAI), vol. 8190, pp. 451–466. Springer, Heidelberg (2013). Scholar
  32. 32.
    Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Mach. Learn. 63(1), 3–42 (2006)CrossRefGoogle Scholar
  33. 33.
    Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)CrossRefGoogle Scholar
  34. 34.
    Taud, H., Mas, J.F.: Multilayer perceptron (MLP). In: Camacho Olmedo, M.T., Paegelow, M., Mas, J.-F., Escobar, F. (eds.) Geomatic Approaches for Modeling Land Change Scenarios. LNGC, pp. 451–455. Springer, Cham (2018). Scholar
  35. 35.
    Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Lechevallier, Y., Saporta, G. (eds.) Proceedings of COMPSTAT 2010, pp. 177–186. Physica-Verlag HD, Heidelberg (2010)CrossRefGoogle Scholar
  36. 36.
    Tsiang, S.C.: The rationale of the mean-standard deviation analysis, skewness preference, and the demand for money. In: Finance Constraints and the Theory of Money, pp. 221–248 (1989)CrossRefGoogle Scholar
  37. 37.
    Leys, C., Ley, C., Klein, O., Bernard, P., Licata, L.: Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J. Exp. Soc. Psychol. 49(4), 764–766 (2013)CrossRefGoogle Scholar
  38. 38.
    Chen, T., He, T., Benesty, M.: Xgboost: extreme gradient boosting. R package version 0.4-2, pp. 1-4 (2015)Google Scholar
  39. 39.
    Kurtulmus, A.B., Daniel, K.: Trustless Machine Learning Contracts; Evaluating and Exchanging Machine Learning Models on the Ethereum Blockchain. Algorithmia Res. (2018)Google Scholar
  40. 40.
    Rana, M.S., Rahman, S.S.M.M., Sung, A.H.: Evaluation of tree based machine learning classifiers for android malware detection. In: Nguyen, N.T., Pimenidis, E., Khan, Z., Trawiński, B. (eds.) ICCCI 2018. LNCS (LNAI), vol. 11056, pp. 377–385. Springer, Cham (2018). Scholar

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. 1.Department of Computer Science and EngineeringJahangirnagar UniversitySavar, DhakaBangladesh
  2. 2.Department of Software EngineeringDaffodil International UniversityDhakaBangladesh

Personalised recommendations