Prediction of Android Malicious Software Using Boosting Algorithms

  • Conference paper
Emerging Technologies in Computing (iCETiC 2021)

Abstract

Android malware is a group of malicious software variants, including viruses, ransomware and spyware, designed to cause substantial damage to data and systems or to access a network without authorization. With the steady shift in technology, Android has supplanted other mobile platforms by being flexible and user-friendly. As the number of Android apps grows every day, the number of malware programs aimed at attacking Android users is also on the rise. It is therefore urgent to identify and remove malicious Android applications before installation to prevent losses to users. Several studies have already used machine learning algorithms to predict Android malware; however, the literature survey conducted for this study found no significant research focusing specifically on the family of boosting algorithms. Therefore, the objective of this paper is to classify malicious and benign Android applications using boosting algorithms. To attain this objective, four widely used boosting models, namely AdaBoost, CatBoost, XGBoost, and GradientBoost, were developed. CatBoost and GradientBoost achieved the highest F1 score (93.9%), followed by AdaBoost and XGBoost (F1 score 93.5% each).
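
The abstract compares the models by F1 score, the harmonic mean of precision and recall. As a minimal sketch of how that metric is computed for a binary malware/benign classifier, the following uses made-up labels and predictions; they are illustrative only and are not taken from the paper's dataset. Treating malware as the positive class (label 1) is an assumption, and the helper names `confusion_counts` and `f1_score` are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives)."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # malware correctly flagged
        elif pred == positive and truth != positive:
            fp += 1  # benign app wrongly flagged
        elif pred != positive and truth == positive:
            fn += 1  # malware missed
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly flagged
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only (assumption): 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

tp, fp, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} F1={f1_score(tp, fp, fn):.3f}")
```

Because F1 ignores true negatives, it is less flattering than plain accuracy when benign apps greatly outnumber malicious ones, which is one common reason to report it for malware detection.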

Author information

Corresponding author

Correspondence to Nafiz Imtiaz Khan.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Nath, D.D., Khan, N.I., Akhter, J., Rahaman, A.S.M.M. (2021). Prediction of Android Malicious Software Using Boosting Algorithms. In: Miraz, M.H., Southall, G., Ali, M., Ware, A., Soomro, S. (eds) Emerging Technologies in Computing. iCETiC 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 395. Springer, Cham. https://doi.org/10.1007/978-3-030-90016-8_2

  • DOI: https://doi.org/10.1007/978-3-030-90016-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-90015-1

  • Online ISBN: 978-3-030-90016-8

  • eBook Packages: Computer Science, Computer Science (R0)
