StackDroid: Evaluation of a Multi-level Approach for Detecting the Malware on Android Using Stacked Generalization

Part of the Communications in Computer and Information Science book series (CCIS, volume 1035)

Abstract

Attackers and cyber criminals are encouraged to develop Android malware by the rapidly growing number of Android users. To detect Android malware, researchers and security specialists have begun contributing to Android malware analysis and detection using machine learning algorithms. In this paper, Stacked Generalization is used to minimize the error rate, and a multi-level architecture named StackDroid is presented and evaluated. In the experiments, Extremely Randomized Tree (ET), Random Forest (RF), Multi-Layer Perceptron (MLP) and Stochastic Gradient Descent (SGD) classifiers serve as base classifiers at level 1, and Extreme Gradient Boosting (EGB) serves as the final predictor at level 2. StackDroid achieves an Area Under Curve (AUC) of 99%, a False Positive Rate (FPR) of 1.67% and a detection accuracy of 97% on the DREBIN dataset, which provides a strong foundation for the development of an Android malware scanner.
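The two-level arrangement described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is only an illustration under stated assumptions, not the authors' implementation: the data is synthetic (a stand-in for DREBIN features), all hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` is swapped in for Extreme Gradient Boosting at level 2 so that the example needs no extra dependency.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

SEED = 0  # arbitrary seed, chosen only for reproducibility

# Synthetic binary data (assumption): 1 = malware, 0 = benign.
X, y = make_classification(n_samples=600, n_features=20, random_state=SEED)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=SEED
)

# Level 1: the four base classifiers named in the abstract.
base_classifiers = [
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=SEED)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=SEED)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=SEED)),
    ("sgd", SGDClassifier(random_state=SEED)),
]

# Level 2: a gradient-boosted final predictor. This is a stand-in
# (assumption); the paper uses Extreme Gradient Boosting here.
final_predictor = GradientBoostingClassifier(random_state=SEED)

# cv=3 means the level-2 model is trained on out-of-fold level-1 outputs.
model = StackingClassifier(
    estimators=base_classifiers, final_estimator=final_predictor, cv=3
)
model.fit(X_train, y_train)

# The three metrics reported in the abstract: accuracy, AUC and FPR.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_score),
    "fpr": fp / (fp + tn),
}
print(metrics)
```

Training the level-2 model on out-of-fold predictions, rather than on predictions for samples the base classifiers have already seen, is what keeps the stacked estimate from simply inheriting level-1 overfitting.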

Keywords

  • Android malware
  • Machine learning
  • Multi-level approach
  • Stacked generalization
  • Android malware detection


Author information

Corresponding author

Correspondence to Sheikh Shah Mohammad Motiur Rahman.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Motiur Rahman, S.S.M., Saha, S.K. (2019). StackDroid: Evaluation of a Multi-level Approach for Detecting the Malware on Android Using Stacked Generalization. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1035. Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-1_53

  • DOI: https://doi.org/10.1007/978-981-13-9181-1_53

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-9180-4

  • Online ISBN: 978-981-13-9181-1

  • eBook Packages: Computer Science, Computer Science (R0)