
Deep learning vs. adversarial noise: a battle in malware image analysis

Cluster Computing

Abstract

The number of malware variants has risen steeply, driven by their growing sophistication and their use of the latest technologies. This poses a severe threat to smart devices and IT infrastructure. Malware visualization is an attractive analysis technique, primarily because it requires neither disassembly nor code execution. In this approach, malicious executables are transformed into images, from which textural features are extracted using the Local Binary Pattern (LBP) technique. Classification models are then built using ResNet50, VGG16, and custom models tailored to the task. These models are evaluated on two benchmark datasets: MalImg (9,342 malware samples across 25 families) and the Malware Classification Challenge dataset (BIG2015) (10,868 labeled malware samples across nine families). The models are also validated on a self-built dataset, which we named Malhub, consisting of 26,452 executables from 20 families. Furthermore, we implemented a white-box adversarial attack using additive noise (Gaussian, Local Variable, Poisson, Salt and Pepper, Speckle). We observed F1 scores in the range of 0.992 to 0.993 for MalImg, 0.874 to 0.878 for BIG2015, and 0.014 to 0.992 for the Malhub dataset. These results indicate that further effort is required to tune machine learning models to detect adversarial examples.
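The pipeline summarized above (bytes to grayscale image, LBP texture features, additive noise) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the function names, the image width of 64, the basic 3 x 3 LBP variant, and the noise parameters are hypothetical and are not taken from the paper, whose actual settings may differ. The sketch only perturbs the image; it does not model how a white-box attack uses knowledge of the classifier. scikit-image provides `skimage.feature.local_binary_pattern` and `skimage.util.random_noise` (with modes such as gaussian, localvar, poisson, s&p, and speckle), but whether the authors used those exact calls is not stated here.

```python
import numpy as np


def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape raw bytes into a 2-D grayscale image (one byte = one pixel)."""
    buf = np.frombuffer(data, dtype=np.uint8)
    height = len(buf) // width  # drop the incomplete final row
    if height == 0:
        raise ValueError("input is shorter than one image row")
    return buf[: height * width].reshape(height, width)


def lbp_8neighbour(img: np.ndarray) -> np.ndarray:
    """Basic 3x3 LBP: each neighbour >= centre contributes one bit."""
    h, w = img.shape
    padded = np.pad(img.astype(np.int16), 1, mode="edge")
    centre = padded[1:-1, 1:-1]
    # Clockwise neighbour offsets, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy: 1 + dy + h, 1 + dx: 1 + dx + w]
        codes |= (neighbour >= centre).astype(np.uint8) << np.uint8(bit)
    return codes


def add_gaussian_noise(img, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise on a [0, 1] scale, then re-quantize."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = img.astype(np.float64) / 255.0
    noisy = scaled + rng.normal(0.0, sigma, size=img.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255).round().astype(np.uint8)


def add_salt_and_pepper(img, amount=0.05, rng=None):
    """Set roughly `amount` of the pixels to 0 (pepper) or 255 (salt)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape)
    noisy[mask < amount / 2] = 0
    noisy[mask > 1 - amount / 2] = 255
    return noisy


# Random bytes stand in for a real executable (hypothetical input).
rng = np.random.default_rng(0)
fake_binary = rng.integers(0, 256, size=64 * 64, dtype=np.uint8).tobytes()
image = bytes_to_image(fake_binary, width=64)
clean_lbp = lbp_8neighbour(image)
noisy_lbp = lbp_8neighbour(add_gaussian_noise(image, sigma=0.05, rng=rng))
changed = float(np.mean(clean_lbp != noisy_lbp))
print(f"fraction of LBP codes changed by noise: {changed:.2f}")
```

Comparing the LBP codes before and after perturbation gives a rough feel for how sensitive texture features are to small pixel changes, which is the kind of effect an additive-noise attack relies on.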

Data availability

Not applicable.

Notes

  1. Cyber Security Trends Report (2021): https://purplesec.us/cyber-security-trends-2021.

  2. CrowdStrike Global Threat Report (2021): https://go.crowdstrike.com/rs/281-OBQ-266/images/Report2021GTR.pdf.

  3. Global industry sectors most targeted by malware incidents in 2020: https://www.statista.com/statistics/223517/malware-infection-weekly-industries/

  4. VirusShare: https://virusshare.com/

  5. scikit-image: https://scikit-image.org/

  6. https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html

  7. https://keras.io/keras_tuner/

  8. GitHub link: https://github.com/OPTIMA-CTI/DL-Adversarial-Noise

Acknowledgements

We gratefully acknowledge the support provided by the HORIZON Europe Framework Programme for the project “OPTIMA - Organization-specific Threat Intelligence Mining and Sharing” (No. 101063107).

Funding

Not applicable.

Author information

Contributions

KAA: software, validation, investigation, data curation, writing—original draft, writing—review and editing. VP: conceptualization, investigation, supervision, writing—original draft, writing—review and editing. KARR: supervision, writing—original draft, writing—review and editing. SLA: software, validation, investigation, writing—original draft, writing—review and editing.

Corresponding author

Correspondence to Vinod Puthuvath.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 2196 KB)

Appendix A

This section presents the detailed evaluation results of the ML classifiers and the Dense (2-layers) model. Additionally, we include the ROC curves for the proposed classification model.

Table 15 Performance of ML classifiers and Dense (2-layers) using LBP images with VGG16 as the pre-trained network (size \(=\) 128 \(\times\) 128)
Table 16 Performance of ML classifiers and Dense (2-layers) using LBP images with ResNet50 as the pre-trained network (size \(=\) 128 \(\times\) 128)
Table 17 Evaluation results of ML classifiers and Dense (2-layers) using non-LBP images with VGG16 as the pre-trained network (size \(=\) 128 \(\times\) 128)
Table 18 Assessment of ML classifiers and Dense (2-layers) using non-LBP images with ResNet50 as the pre-trained network (size \(=\) 128 \(\times\) 128)

See Figs. 18, 19, and 20.

Fig. 18 ROC curve for the BIG2015 dataset using the proposed fusion model (ResNet50||CNN||DL)

Fig. 19 ROC curve for the MalImg dataset using the proposed fusion model (ResNet50||CNN||DL)

Fig. 20 ROC curve for the Malhub dataset using the proposed fusion model (ResNet50||CNN||DL)
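For readers unfamiliar with how ROC curves are obtained for a multi-family classifier, the following is a minimal one-vs-rest sketch in NumPy. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, labels are assumed to be integer class indices, and `probs` is assumed to hold one score column per family.

```python
import numpy as np


def roc_curve_binary(y_true, scores):
    """Return (fpr, tpr) for binary labels and real-valued scores."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    y_sorted = y_true[order]
    s_sorted = scores[order]
    # Keep one point per distinct threshold so tied scores move together.
    last = np.r_[np.nonzero(np.diff(s_sorted))[0], len(s_sorted) - 1]
    tps = np.cumsum(y_sorted)[last]   # true positives at each threshold
    fps = np.cumsum(~y_sorted)[last]  # false positives at each threshold
    tpr = np.concatenate([[0.0], tps / max(tps[-1], 1)])
    fpr = np.concatenate([[0.0], fps / max(fps[-1], 1)])
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def one_vs_rest_auc(labels, probs):
    """AUC per class, treating each class in turn as the positive class."""
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    aucs = {}
    for k in range(probs.shape[1]):
        fpr, tpr = roc_curve_binary(labels == k, probs[:, k])
        aucs[k] = auc_trapezoid(fpr, tpr)
    return aucs


# Toy example with three hypothetical families and made-up scores.
labels = np.array([0, 1, 2, 0, 1, 2])
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
for family, auc in one_vs_rest_auc(labels, probs).items():
    print(f"family {family}: AUC = {auc:.3f}")
```

Each family gets its own curve because ROC analysis is defined for binary decisions; plotting `fpr` against `tpr` per family yields one line per class.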

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Asmitha, K.A., Puthuvath, V., Rafidha Rehiman, K.A. et al. Deep learning vs. adversarial noise: a battle in malware image analysis. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04397-4

  • DOI: https://doi.org/10.1007/s10586-024-04397-4
