Windows and IoT malware visualization and classification with deep CNN and Xception CNN using Markov images

Sharma, Osho; Sharma, Akashdeep; Kalia, Arvind

doi:10.1007/s10844-022-00734-4

Published: 09 August 2022

Volume 60, pages 349–375, (2023)

Journal of Intelligent Information Systems

Abstract

Context

Technological advances have led to a tremendous increase in the complexity and volume of specialized malware affecting computing devices across the globe. Alongside malware targeting Windows devices, IoT devices, which have less computational power, have also been hit by malware attacks in the recent past. Because updated malware datasets are scarce, malware recognition and classification has become harder, particularly in IoT environments where few malware samples are available. Identifying a malware family can reveal the underlying intent of the malware, and traditional machine learning algorithms have performed well in this area. However, since such methods require a large amount of feature engineering, deep learning algorithms for malware recognition and classification have been developed. In particular, malware visualization-based approaches, which have shown decent success in the past, leave room for improvement, which the current study exploits.

Objectives

The current work aims to use malware images (grayscale, RGB, Markov) and deep CNNs for effective Windows and IoT malware recognition and classification, using both traditional learning and transfer learning approaches.
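The abstract does not say how the grayscale images are built. As a minimal sketch of one common convention (an assumption here, not necessarily the authors' procedure), each byte of the binary can be treated as one pixel intensity and the byte stream wrapped into rows of a fixed width. The function name `bytes_to_grayscale` and the default width of 256 are hypothetical choices for illustration.

```python
def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Wrap raw bytes into rows of `width` pixels with values 0-255.

    Assumption: one byte maps to one pixel, and the last row is
    zero-padded so that every row has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Number of zero bytes needed to reach a multiple of `width`.
    padding = -len(data) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


if __name__ == "__main__":
    # Toy input standing in for a binary file's contents.
    image = bytes_to_grayscale(bytes(range(10)), width=4)
    for row in image:
        print(row)
```

In practice the rows would be written out with an imaging library and resized to the input size the classifier expects; those steps are omitted here because the abstract does not describe them.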

Methods and Design

First, grayscale, RGB and Markov images were created from malware binaries. In particular, generating Markov images from a Markov probability matrix is intended to retain the global statistics of malware bytes, which are generally lost during image transformation operations. A Gabor filter-based approach is then used to extract textures, and two models classify the malware images into families: a custom-built deep CNN and an Xception CNN that was pretrained on 1.5 million images from the ImageNet dataset and fine-tuned for malware images.
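The Markov probability matrix can be illustrated with a short sketch. It assumes the usual byte-level formulation: entry (i, j) is the estimated probability that byte value j immediately follows byte value i in the binary, which gives a fixed 256 x 256 image regardless of file size. The function names and the linear scaling of probabilities to 0-255 pixel values are assumptions for illustration; the abstract does not specify the authors' exact normalization.

```python
def markov_probability_matrix(data: bytes) -> list[list[float]]:
    """Estimate a 256x256 byte-transition probability matrix.

    Row i holds P(next byte = j | current byte = i). Rows for byte
    values that never occur as a predecessor are left as all zeros.
    """
    counts = [[0] * 256 for _ in range(256)]
    # Count every pair of adjacent bytes (current, next).
    for cur, nxt in zip(data, data[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([0.0] * 256)
        else:
            matrix.append([c / total for c in row])
    return matrix


def markov_image(data: bytes) -> list[list[int]]:
    """Map transition probabilities to 0-255 pixels (assumed linear scaling)."""
    return [[round(255 * p) for p in row] for row in markov_probability_matrix(data)]


if __name__ == "__main__":
    sample = b"\x00\x01\x00\x01\x00\x02"  # toy byte sequence, not real malware
    probs = markov_probability_matrix(sample)
    print(probs[0][1], probs[0][2], probs[1][0])
```

Because the matrix depends only on adjacent-byte statistics, two files of very different lengths produce images of the same shape, which is one reason this representation avoids the resizing step that a plain byte-to-pixel layout needs.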

Results and Conclusions

To assess the effectiveness of the suggested framework, two public benchmark Windows malware image datasets, one custom-built Windows malware image dataset and one custom-built IoT malware image dataset were used. In particular, the methods demonstrate excellent classification results on the 500 GB Microsoft Malware Challenge dataset. A comparison of the suggested solutions with state-of-the-art methods indicates the effectiveness and low computational cost of our malware recognition and classification solution.

Data Availability

Microsoft dataset (Ronen et al., 2018): https://www.kaggle.com/c/malware-classification. Malimg dataset (Nataraj et al., 2011): https://www.kaggle.com/datasets/keerthicheepurupalli/malimg-dataset9010. Custom Windows malware dataset sources: https://virusshare.com/, https://github.com/ytisf/theZoo, https://vx-underground.org/archive/VxHeaven/index.html. Custom IoT malware dataset sources: https://vx-underground.org/archive/VxHeaven/index.html, https://github.com/ytisf/theZoo. Malware can damage computing environments; caution must therefore be taken before downloading malware samples.

Code Availability

Code is available on request at https://forms.gle/mp9GihTmsAzAUNpT7.

References

Amer, E., & Zelinka, I. (2020). A dynamic Windows malware detection and prediction method based on contextual understanding of API call sequence. Computers & Security, 92, 101760. https://doi.org/10.1016/j.cose.2020.101760
Amin, M., Tanveer, T. A., Tehseen, M., Khan, M., Khan, F. A., & Anwar, S. (2020). Static malware detection and attribution in android byte-code through an end-to-end deep system. Future Generation Computer Systems, 102, 112–126. https://doi.org/10.1016/j.future.2019.07.070
Amin, M., Shehwar, D., Ullah, A., Guarda, T., Tanveer, T. A., & Anwar, S. (2020). “A deep learning system for health care IoT and smartphone malware detection,” Neural Comput & Applic. https://doi.org/10.1007/s00521-020-05429-x
Anandhi, V., Vinod, P., & Menon, V. G. (2021). “Malware visualization and detection using DenseNets,” Pers Ubiquit Comput. https://doi.org/10.1007/s00779-021-01581-w.
Andresini, G., Appice, A., De Rose, L., & Malerba, D. (2021). GAN augmentation to deal with imbalance in imaging-based intrusion detection. Future Generation Computer Systems, 123, 108–127. https://doi.org/10.1016/j.future.2021.04.017
Bai, Y., Xing, Z., Ma, D., Li, X., & Feng, Z. (2021). Comparative analysis of feature representations and machine learning methods in Android family classification. Computer Networks, 184, 107639. https://doi.org/10.1016/j.comnet.2020.107639
Bakour, K., & Ünver, H. M. (2021). VisDroid: Android malware classification based on local and global image features, bag of visual words and machine learning techniques. Neural Computing and Applications, 33(8), 3133–3153. https://doi.org/10.1007/s00521-020-05195-w
Chollet, F. (2017). “Xception: Deep Learning with Depthwise Separable Convolutions,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1800–1807. https://doi.org/10.1109/CVPR.2017.195.
Dai, Y., Li, H., Qian, Y., & Lu, X. (2018). A malware classification method based on memory dump grayscale image. Digital Investigation, 27, 30–37. https://doi.org/10.1016/j.diin.2018.09.006
Darabian, H., et al. (2020). Detecting Cryptomining Malware: A Deep Learning Approach for Static and Dynamic Analysis. Journal Grid Computing, 18(2), 293–303. https://doi.org/10.1007/s10723-020-09510-6
Darem, A., Abawajy, J., Makkar, A., Alhashmi, A., & Alanazi, S. (2021). Visualization and deep-learning-based malware variant detection using OpCode-level features. Future Generation Computer Systems, 125, 314–323. https://doi.org/10.1016/j.future.2021.06.032
De Lorenzo, A., Martinelli, F., Medvet, E., Mercaldo, F., & Santone, A. (2020). Visualizing the outcome of dynamic analysis of Android malware with VizMal. Journal of Information Security and Applications, 50, 102423. https://doi.org/10.1016/j.jisa.2019.102423
Dehkordy, D. T., & Rasoolzadegan, A. (2021). A new machine learning-based method for android malware detection on imbalanced dataset. Multimedia Tools and Applications, 80(16), 24533–24554. https://doi.org/10.1007/s11042-021-10647-z
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009) “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. https://doi.org/10.1109/CVPR.2009.5206848.
Dhalaria, M., & Gandotra, E. (2020). “CSForest: an approach for imbalanced family classification of android malicious applications,” p. 13. https://doi.org/10.1007/s41870-021-00661-7.
Ding, Y., Zhang, X., Hu, J., & Xu, W. (2020). “Android malware detection method based on bytecode image.” Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-020-02196-4.
EscuderoGarcía, D., & DeCastro-García, N. (2021). Optimal feature configuration for dynamic malware detection. Computers & Security, 105, 102250. https://doi.org/10.1016/j.cose.2021.102250
Farrokhmanesh, M., & Hamzeh, A. (2019). Music classification as a new approach for malware detection. Journal of Computer Virology and Hacking Techniques, 15(2), 77–96. https://doi.org/10.1007/s11416-018-0321-2
Ganesh, M., Pednekar, P., Prabhuswamy, P., Nair, D. S., Park, Y., & Jeon, H. (2017). “CNN-Based Android Malware Detection,” in 2017 International Conference on Software Security and Assurance (ICSSA), Altoona, PA, pp. 60–65. https://doi.org/10.1109/ICSSA.2017.18.
Gibert, D., Mateu, C., Planes, J., & Vicens, R. (2019). Using convolutional neural networks for classification of malware represented as images. Journal of Computer Virology and Hacking Techniques, 15(1), 15–28. https://doi.org/10.1007/s11416-018-0323-0
Gibert, D., Mateu, C., & Planes, J. (2020). HYDRA: A multimodal deep learning framework for malware classification. Computers & Security, 95, 101873. https://doi.org/10.1016/j.cose.2020.101873
He, K., Zhang, X., Ren, S., & Sun, J. (2016). “Deep Residual Learning for Image Recognition,” pp. 770–778. Accessed: Nov. 09, 2021. [Online]. Available: https://openaccess.thecvf.com/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html
Jain, M., Andreopoulos, W., & Stamp, M. (2020). Convolutional neural networks and extreme learning machines for malware classification. Journal of Computer Virology and Hacking Techniques, 16(3), 229–244. https://doi.org/10.1007/s11416-020-00354-y
Li, Z., Qin, Z., Huang, K., Yang, X., & Ye, S. (2017). “Intrusion Detection Using Convolutional Neural Networks for Representation Learning.” In D. Liu, S. Xie, Y. Li, D. Zhao, & E.-S. M. El-Alfy (Eds.), Neural Information Processing, (vol. 10638, pp. 858–866). Springer International Publishing. https://doi.org/10.1007/978-3-319-70139-4_87.
Liu, L., & Wang, B. (2017). “Automatic Malware Detection Using Deep Learning Based on Static Analysis,” in Data Science, Singapore, pp. 500–507. https://doi.org/10.1007/978-981-10-6385-5_42.
“Malware Statistics & Trends Report | AV-TEST.” (2022). https://www.av-test.org/en/statistics/malware/ (accessed May 14, 2022).
Mercaldo, F., & Santone, A. (2020). Deep learning for image-based mobile malware detection. Journal of Computer Virology and Hacking Techniques, 16(2), 157–171. https://doi.org/10.1007/s11416-019-00346-7
Moti, Z., et al. (2021). Generative adversarial network to detect unseen Internet of Things malware. Ad Hoc Networks, 122, 102591. https://doi.org/10.1016/j.adhoc.2021.102591
Moti, Z., Hashemi, S., & Jahromi, A. N. (2020). “A Deep Learning-based Malware Hunting Technique to Handle Imbalanced Data,” in 2020 17th International ISC Conference on Information Security and Cryptology (ISCISC), Tehran, Iran, pp. 48–53. https://doi.org/10.1109/ISCISC51277.2020.9261913.
Naeem, H., et al. (2020). Malware detection in industrial internet of things based on hybrid image visualization and deep learning model. Ad Hoc Networks, 105, 102154. https://doi.org/10.1016/j.adhoc.2020.102154
Nataraj, L., Karthikeyan, S., Jacob, G., & Manjunath, B. S. (2011). “Malware images: visualization and automatic classification,” in Proceedings of the 8th International Symposium on Visualization for Cyber Security - VizSec ’11, Pittsburgh, Pennsylvania, pp. 1–7. https://doi.org/10.1145/2016904.2016908.
Pei, X., Yu, L., & Tian, S. (2020). AMalNet: A deep learning framework based on graph convolutional networks for malware detection. Computers & Security, 93, 101792. https://doi.org/10.1016/j.cose.2020.101792
Pundir, S., Obaidat, M. S., Wazid, M., Das, A. K., Singh, D. P., & Rodrigues, J. J. P. C. (2021). “MADP-IIME: malware attack detection protocol in IoT-enabled industrial multimedia environment using machine learning approach,” Multimedia Systems. https://doi.org/10.1007/s00530-020-00743-9.
Ren, Z., Chen, G., & Lu, W. (2020). Malware visualization methods based on deep convolution neural networks. Multimedia Tools and Applications, 79(15–16), 10975–10993. https://doi.org/10.1007/s11042-019-08310-9
Ronen, R., Radu, M., Feuerstein, C., Yom-Tov, E., & Ahmadi, M. (2018) “Microsoft Malware Classification Challenge,” arXiv:1802.10135 [cs], Accessed: Feb. 12, 2022. [Online]. Available: http://arxiv.org/abs/1802.10135
Stamp, M., Chandak, A., Wong, G., & Ye, A. (2021). “On Ensemble Learning,” arXiv:2103.12521 [cs], Accessed: Jan. 22, 2022. [Online]. Available: http://arxiv.org/abs/2103.12521
Sudhakar & Kumar, S. (2021). “MCFT-CNN: Malware classification with fine-tune convolution neural networks using traditional and transfer learning in Internet of Things.” Future Generation Computer Systems, 125, 334–351. https://doi.org/10.1016/j.future.2021.06.029.
ytisf. (2022). theZoo - A Live Malware Repository. Accessed: May 14, 2022. [Online]. Available: https://github.com/ytisf/theZoo
Tuncer, T., Ertam, F., & Dogan, S. (2021). Automated malware identification method using image descriptors and singular value decomposition. Multimedia Tools and Applications, 80(7), 10881–10900. https://doi.org/10.1007/s11042-020-10317-6
Vasan, D., Alazab, M., Wassan, S., Safaei, B., & Zheng, Q. (2020a). Image-Based malware classification using ensemble of CNN architectures (IMCEC). Computers & Security, 92, 101748. https://doi.org/10.1016/j.cose.2020.101748
Vasan, D., Alazab, M., Wassan, S., Naeem, H., Safaei, B., & Zheng, Q. (2020b). IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture. Computer Networks, 171, 107138. https://doi.org/10.1016/j.comnet.2020.107138
Verma, V., Muttoo, S. K., & Singh, V. B. (2020). Multiclass malware classification via first- and second-order texture statistics. Computers & Security, 97, 101895. https://doi.org/10.1016/j.cose.2020.101895
“VirusShare.com.” https://virusshare.com/ (accessed May 14, 2022).
“VirusTotal - Stats.” https://www.virustotal.com/gui/stats (accessed May 14, 2022).
“vx-underground.” https://www.vx-underground.org/archive/VxHeaven/index.html (accessed May 14, 2022).
Xiao, G., Li, J., Chen, Y., & Li, K. (2020). MalFCS: An effective malware classification framework with automated feature extraction based on deep convolutional neural networks. Journal of Parallel and Distributed Computing, 141, 49–58. https://doi.org/10.1016/j.jpdc.2020.03.012
Yuan, B., Wang, J., Liu, D., Guo, W., Wu, P., & Bao, X. (2020). Byte-level malware classification based on markov images and deep learning. Computers & Security, 92, 101740. https://doi.org/10.1016/j.cose.2020.101740
Zhang, J., et al. (2021). Malware Detection Based on Multi-level and Dynamic Multi-feature Using Ensemble Learning at Hypervisor. Mobile Netw Appl, 26(4), 1668–1685. https://doi.org/10.1007/s11036-019-01503-4

Author information

Authors and Affiliations

Research Scholar, Department of Computer Science, Himachal Pradesh University, Shimla, India
Osho Sharma
Assistant Professor, Department of Computer Science and Engineering, UIET, Panjab University, Chandigarh, India
Akashdeep Sharma
Professor, Department of Computer Science, Himachal Pradesh University, Shimla, India
Arvind Kalia

Contributions

All authors contributed equally to this manuscript.

Corresponding author

Correspondence to Akashdeep Sharma.

Ethics declarations

Conflict of interest

The authors state that they have no known competing financial interests or personal ties that could have appeared to affect the work reported in this study.

Consent to participate

Not Applicable.

Human and Animal Ethics

No Humans or Animals were harmed in any way.

Consent for publication

Not Applicable.

Credit authorship contribution statement

All authors contributed equally to this study.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sharma, O., Sharma, A. & Kalia, A. Windows and IoT malware visualization and classification with deep CNN and Xception CNN using Markov images. J Intell Inf Syst 60, 349–375 (2023). https://doi.org/10.1007/s10844-022-00734-4

Received: 18 May 2022
Revised: 29 July 2022
Accepted: 01 August 2022
Published: 09 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10844-022-00734-4
