Enhanced DNNs for malware classification with GAN-based adversarial training

  • Original Paper
Journal of Computer Virology and Hacking Techniques

Abstract

Deep learning-based malware classification has gained momentum recently. However, deep learning models are vulnerable to adversarial perturbation attacks, especially when applied in network security applications. Deep neural network (DNN)-based malware classifiers that consume whole bit sequences are also vulnerable, despite their satisfactory performance and reduced feature-engineering effort. This paper therefore proposes a DNN-based malware classifier that operates on the raw bit sequences of Windows programs. We then propose two adversarial attacks that target our trained DNNs to generate adversarial malware. A defensive mechanism is proposed that treats perturbations as noise added to the bit sequences. In this mechanism, a generative adversarial network (GAN)-based model is designed to filter out the perturbation noise, and the adversarial samples with the highest probability of fooling the target DNNs are chosen for adversarial training. The experiments show that the GAN with filter-based model produces the highest-quality adversarial samples at medium cost. The evasion ratio under the GAN with filter-based model is as high as 50.64% on average. When GAN-based adversarial samples are incorporated into training, the enhanced DNN achieves a satisfactory 90.20% accuracy while the evasion ratio stays below 9.47%. The GAN helps secure the DNN-based malware classifier with negligible performance degradation compared with the original DNN. The evasion ratio is markedly reduced when the classifier faces powerful adversarial attacks, including \({\textit{FGSM}}^r\) and \({\textit{FGSM}}^k\).
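To make the notions of a gradient-sign attack and an evasion ratio concrete, the following is a minimal sketch, not the method of this paper. It assumes a toy logistic-regression "classifier" over three continuous features in [0, 1] (the weights `w`, bias `b`, and sample values are invented for illustration), applies a single plain FGSM step rather than the \({\textit{FGSM}}^r\) or \({\textit{FGSM}}^k\) variants, and assumes the evasion ratio means the fraction of adversarial malware samples that the classifier labels as benign.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability that x is malware under a toy logistic model (assumption)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm_step(w, b, x, y, eps):
    """One plain FGSM step: x + eps * sign(dLoss/dx), clipped to [0, 1].

    For a logistic model with binary cross-entropy loss,
    dLoss/dx_i = (p - y) * w_i, where p is the predicted probability.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, sign)]


def evasion_ratio(w, b, samples, threshold=0.5):
    """Fraction of malware samples scored below the detection threshold."""
    evaded = sum(1 for x in samples if predict_proba(w, b, x) < threshold)
    return evaded / len(samples)


# Hypothetical model parameters and malware feature vectors (true label y = 1).
w = [2.0, -1.0, 1.5]
b = -1.0
malware = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

adversarial = [fgsm_step(w, b, x, y=1, eps=0.3) for x in malware]

clean_ratio = evasion_ratio(w, b, malware)      # before the attack
adv_ratio = evasion_ratio(w, b, adversarial)    # after the attack
print(f"evasion ratio: clean={clean_ratio:.2f}, adversarial={adv_ratio:.2f}")
```

In this toy setting all three original samples are detected, while two of the three perturbed samples fall below the threshold. Adversarial training, in the generic sense, would add such perturbed samples with their true malware label back into the training set; a real attack on executables would also need to preserve program functionality, which this sketch ignores.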

Notes

  1. https://www.kaggle.com/c/malware-classification.

  2. https://www.virustotal.com/gui/.


Acknowledgements

This work has been supported by the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province under Grant Nos. 2020SE401, 2020SE306 and 2020SE305.

Author information

Corresponding author

Correspondence to Shaowen Yao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Y., Li, H., Zheng, Y. et al. Enhanced DNNs for malware classification with GAN-based adversarial training. J Comput Virol Hack Tech 17, 153–163 (2021). https://doi.org/10.1007/s11416-021-00378-y

  • DOI: https://doi.org/10.1007/s11416-021-00378-y
