
Automated malware identification method using image descriptors and singular value decomposition

Multimedia Tools and Applications

Abstract

Cyber-attacks have become a significant problem worldwide, and many methods, networks, and applications have been proposed in the literature to provide information security. Automated malware classification has become one of the most active research areas in information security and digital forensics, and image processing methods have been applied to malware detection and recognition. This work proposes an automated malware classification method built on three feature extractors: local binary pattern (LBP), singular value decomposition (SVD), and a novel local ternary pattern network (LTPNet). The features produced by this hybrid extractor are reduced using principal component analysis (PCA), and the reduced features are passed to a linear discriminant analysis (LDA) classifier. The method is evaluated on Malimg, a commonly used, large, heterogeneous malware dataset containing 9339 malware samples in 25 classes. On this dataset, the proposed LBP-SVD-LTPNet based method achieved an 88.08% success rate, which is higher than the accuracy of the selected deep learning methods: convolutional neural network (CNN), multi-layer perceptron (MLP), gated recurrent units (GRU), GoogleNet, VGG16, and ResNet. These results demonstrate that the proposed LBP-SVD-LTPNet based malware classification method is successful.
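To make the feature-extraction idea in the abstract concrete, the following is a minimal sketch of how LBP, a plain local ternary pattern (LTP), and SVD descriptors could be computed from a grayscale malware image and concatenated. It is not the authors' implementation: the 3x3 neighbourhood, the LTP threshold, the number of retained singular values, and all function names are assumptions made for illustration, and the plain LTP shown here stands in for the paper's LTPNet, whose details are not given in the abstract.

```python
import numpy as np

# Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
# (Assumed ordering; any fixed ordering gives an equivalent descriptor.)
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def neighbour_diffs(img):
    """Return an (8, H-2, W-2) array of neighbour-minus-centre differences."""
    img = np.asarray(img, dtype=np.int32)
    h, w = img.shape
    centre = img[1:h - 1, 1:w - 1]
    return np.stack(
        [img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] - centre for dy, dx in OFFSETS]
    )


def code_histogram(bits):
    """Pack 8 binary planes into codes 0..255 and count them in 256 bins."""
    weights = (1 << np.arange(8)).reshape(8, 1, 1)
    codes = (bits * weights).sum(axis=0)
    return np.bincount(codes.ravel(), minlength=256)


def lbp_histogram(img):
    """LBP: a neighbour contributes a 1 bit when it is >= the centre pixel."""
    return code_histogram(neighbour_diffs(img) >= 0)


def ltp_histograms(img, threshold=5):
    """Plain LTP split into an 'upper' and a 'lower' binary pattern.

    threshold is a hypothetical value chosen for this sketch.
    """
    diffs = neighbour_diffs(img)
    upper = code_histogram(diffs >= threshold)   # ternary value +1
    lower = code_histogram(diffs <= -threshold)  # ternary value -1
    return np.concatenate([upper, lower])


def singular_values(img, k=16):
    """First k singular values of the image, zero-padded if fewer exist."""
    s = np.linalg.svd(np.asarray(img, dtype=np.float64), compute_uv=False)
    out = np.zeros(k)
    out[:min(k, s.size)] = s[:k]
    return out


def describe(img):
    """Concatenate LBP (256), LTP (512), and SVD (16) features."""
    return np.concatenate(
        [lbp_histogram(img), ltp_histograms(img), singular_values(img)]
    )
```

In the pipeline the abstract describes, vectors like the one returned by `describe` would then be reduced with PCA and classified with LDA; those two stages are omitted here to keep the sketch short.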





Author information


Corresponding author

Correspondence to Turker Tuncer.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Tuncer, T., Ertam, F. & Dogan, S. Automated malware identification method using image descriptors and singular value decomposition. Multimed Tools Appl 80, 10881–10900 (2021). https://doi.org/10.1007/s11042-020-10317-6


  • DOI: https://doi.org/10.1007/s11042-020-10317-6
