Deep learning vs. adversarial noise: a battle in malware image analysis

Asmitha, K. A.; Puthuvath, Vinod; Rafidha Rehiman, K. A.; Ananth, S. L.

doi:10.1007/s10586-024-04397-4


Published: 17 April 2024

Cluster Computing (2024)



Abstract

The number of malware variants has risen steeply, driven by their growing sophistication and their use of the latest technologies. This poses a severe threat to smart devices and IT infrastructure. Malware visualization has emerged as an attractive technique, primarily because it requires neither disassembly nor code execution. In this approach, malicious executables are transformed into images, from which textural features are extracted using the Local Binary Pattern (LBP) technique. Classification models are then built using ResNet50, VGG16, and customized models tailored to the task. These models are evaluated extensively on two benchmark datasets: the MalImg dataset (9,342 malware instances across 25 families) and the Malware Classification Challenge dataset (BIG2015; 10,868 labeled malware instances across nine families). The models are additionally validated on a self-built dataset, which we named Malhub, consisting of 26,452 executables from 20 families. Furthermore, we implemented a white-box adversarial attack using additive noise (Gaussian, Local Variable, Poisson, Salt and Pepper, Speckle). We observed F1 scores in the range of 0.992–0.993 for MalImg, 0.874–0.878 for BIG2015, and 0.014–0.992 for the Malhub dataset. These results indicate that further effort is required to tune machine learning models to detect adversarial examples.
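The pipeline summarized in the abstract (bytes to image, LBP texture extraction, additive noise) can be illustrated with a minimal NumPy-only sketch. This is not the authors' implementation: the function names, the image width of 64, the basic 3×3 LBP variant, the noise parameters, and the choice to perturb the image before LBP are all assumptions made for illustration, and only two of the five noise types are shown.

```python
import numpy as np

def bytes_to_image(data: bytes, width: int = 64) -> np.ndarray:
    """Reshape raw bytes into a 2-D uint8 grayscale image, zero-padding the last row."""
    buf = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(buf) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[:len(buf)] = buf
    return padded.reshape(height, width)

def lbp_8(image: np.ndarray) -> np.ndarray:
    """Basic 3x3 LBP (assumed variant): one bit per neighbour that is >= the centre."""
    img = np.pad(image.astype(np.int16), 1, mode="edge")
    centre = img[1:-1, 1:-1]
    h, w = centre.shape
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes |= (neighbour >= centre).astype(np.uint8) << np.uint8(bit)
    return codes

def add_gaussian_noise(image: np.ndarray, sigma: float = 0.1, rng=None) -> np.ndarray:
    """Add zero-mean Gaussian noise on a [0, 1] intensity scale, then clip and requantize."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = image.astype(np.float64) / 255.0
    noisy = np.clip(scaled + rng.normal(0.0, sigma, image.shape), 0.0, 1.0)
    return np.round(noisy * 255.0).astype(np.uint8)

def add_salt_and_pepper(image: np.ndarray, amount: float = 0.05, rng=None) -> np.ndarray:
    """Set roughly `amount` of the pixels to 0 (pepper) or 255 (salt), half each."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    mask = rng.random(image.shape)
    noisy[mask < amount / 2] = 0
    noisy[mask > 1 - amount / 2] = 255
    return noisy

# Placeholder input: 4096 synthetic bytes standing in for an executable.
rng = np.random.default_rng(0)
sample = bytes(range(256)) * 16
img = bytes_to_image(sample, width=64)              # 64 x 64 grayscale image
noisy = add_salt_and_pepper(img, amount=0.1, rng=rng)
texture_clean = lbp_8(img)                          # LBP map of the clean image
texture_noisy = lbp_8(noisy)                        # LBP map of the perturbed image
```

Comparing `texture_clean` with `texture_noisy` gives a rough sense of how far additive noise shifts the texture features a downstream classifier would see. A library implementation, such as the noise and LBP routines in scikit-image (listed in the Notes), would be the more likely choice in practice.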


Data availability

Not applicable.

Notes

Cyber Security Trends Report (2021): https://purplesec.us/cyber-security-trends-2021.
Crowdstrike global threat report (2021): https://go.crowdstrike.com/rs/281-OBQ-266/images/Report2021GTR.pdf
Global industry sectors most targeted by malware incidents in 2020: https://www.statista.com/statistics/223517/malware-infection-weekly-industries/
VirusShare: https://virusshare.com/
scikit-image: https://scikit-image.org/
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
https://keras.io/keras_tuner/
Github link: https://github.com/OPTIMA-CTI/DL-Adversarial-Noise


Acknowledgements

We gratefully acknowledge the support provided by the HORIZON Europe Framework Programme for the project “OPTIMA - Organization-specific Threat Intelligence Mining and Sharing” (No. 101063107).

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Computer Applications, Cochin University of Science & Technology, Kochi, 682022, Kerala, India
K. A. Asmitha, Vinod Puthuvath & K. A. Rafidha Rehiman
Department of Mathematics, University of Padua, Padua, Italy
Vinod Puthuvath
Cisco Systems India Pvt. Ltd., Bangalore, India
S. L. Ananth


Contributions

KAA: software, validation, investigation, data curation, writing—original draft, writing—review and editing. VP: conceptualization, investigation, supervision, writing—original draft, writing—review and editing. KARR: supervision, writing—original draft, writing—review and editing. SLA: software, validation, investigation, writing—original draft, writing—review and editing.

Corresponding author

Correspondence to Vinod Puthuvath.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 2196 KB)

Appendix A

This section presents the detailed evaluation results of ML classifiers and Dense (2-Layers). Additionally, we include the ROC curve for the proposed classification model.

Table 15 Performance of ML classifiers and Dense (2-layers) using LBP images, with VGG16 as the pre-trained network (size = 128 × 128)

Table 16 Performance of ML classifiers and Dense (2-layers) using LBP images, with ResNet50 as the pre-trained network (size = 128 × 128)

Table 17 Evaluation results of ML classifiers and Dense (2-layers) using non-LBP images (size = 128 × 128), with pre-trained VGG16

Table 18 Assessment of ML classifiers and Dense (2-layers) using non-LBP images (size = 128 × 128), with pre-trained ResNet50

See Figs. 18, 19, and 20.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Asmitha, K.A., Puthuvath, V., Rafidha Rehiman, K.A. et al. Deep learning vs. adversarial noise: a battle in malware image analysis. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04397-4


Received: 10 October 2023
Revised: 12 February 2024
Accepted: 26 February 2024
Published: 17 April 2024
DOI: https://doi.org/10.1007/s10586-024-04397-4
