Integrating a Rule-Based Approach to Malware Detection with an LSTM-Based Feature Selection Technique

Bhardwaj, Sonam; Dave, Mayank

doi:10.1007/s42979-023-02177-2

Original Research
Published: 27 September 2023

SN Computer Science, volume 4, article number 737 (2023)

Abstract

Advances in technology have amplified malware activity, affecting both networks and their users. Network traffic must therefore be analysed dynamically for malware before it is forwarded to the next host. By exploiting network vulnerabilities, attackers gain control of a system and install their own network rules to let malicious traffic through. YARA (Yet Another Recursive Acronym) rules are an effective string- and pattern-matching approach to malware analysis, and the effectiveness of an analysis depends on the quality and number of rules it uses. A YARA rule examines its condition against a suspicious sample and then either fires or does not. This binary verdict may limit the usefulness of YARA rules and the results they produce. The proposed approach therefore selects malware features with a machine-learning-based long short-term memory (LSTM) model, and combines rule-based traffic analysis with this LSTM-based feature selection to strengthen the malware detection model. The research assesses model integrity by comparing performance results with and without LSTM-based feature (parameter) selection. With LSTM-based feature selection, the model achieved its best accuracy of 97%, which supports its suitability for malware detection on diverse datasets from different network environments.
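The binary nature of a rule verdict can be illustrated with a small sketch. This is not the authors' implementation and it does not use the YARA engine or its Python bindings; the `Rule` class, the `rule_fires` and `match_counts` helpers, the pattern identifiers, and the sample bytes are all hypothetical names and values chosen for illustration. The sketch only mimics the "any of them" and "all of them" style of condition over named byte patterns.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Rule:
    """A toy stand-in for a YARA rule: named byte patterns plus a condition."""
    name: str
    strings: Dict[str, bytes]   # identifier -> byte pattern (hypothetical values)
    condition: str              # "any" or "all", mimicking "any of them" / "all of them"


def match_counts(rule: Rule, sample: bytes) -> Dict[str, int]:
    """Count non-overlapping occurrences of each pattern in the sample."""
    return {ident: sample.count(pattern) for ident, pattern in rule.strings.items()}


def rule_fires(rule: Rule, sample: bytes) -> bool:
    """Return the binary verdict: the rule either fires or it does not."""
    hits = [count > 0 for count in match_counts(rule, sample).values()]
    if rule.condition == "all":
        return all(hits)
    if rule.condition == "any":
        return any(hits)
    raise ValueError(f"unsupported condition: {rule.condition!r}")


# Hypothetical rule and sample, for illustration only.
rule = Rule(
    name="suspicious_downloader",
    strings={"$url": b"http://", "$cmd": b"cmd.exe", "$ps": b"powershell"},
    condition="all",
)
sample = b"GET http://example.test/a.bin; cmd.exe /c start a.bin"

verdict = rule_fires(rule, sample)    # False, because "$ps" never occurs
counts = match_counts(rule, sample)   # per-pattern counts: 1, 1 and 0
```

Here the sample matches two of the three patterns, yet the verdict is simply `False`. The per-pattern counts keep the graded information that the verdict discards; a learned feature selector could plausibly work from signals of this kind, though how the proposed approach derives its features is not specified in the abstract.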

Data availability

This research work uses publicly available datasets that have been cited and referenced.

References

Tahir R. A study on malware and malware detection techniques. Int J Edu Mgmt Engg. 2018;8(2):20. https://doi.org/10.5815/ijeme.2018.02.03.
Faruk MJ, Miner P, Coughlan R, Masum M, Shahriar H, Clincy V, Cetinkaya C. Smart connected aircraft: towards security, privacy, and ethical hacking. In: International Conference on Security of Information and Networks (SIN). IEEE. 2021; p. 1–5. https://doi.org/10.1109/SIN54109.2021.9699243
Zhang K. A machine learning based approach to identify SQL injection vulnerabilities. In: IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE. 2019; p. 1286–1288. https://doi.org/10.1109/ASE.2019.00164
Kim MS. A Study on the Attack Index Packet Filtering Algorithm Based on Web Vulnerability. In: Big Data, cloud computing, and data science engineering. Springer; 2023, p. 145–152. https://doi.org/10.1007/978-3-031-19608-9_12
Shandilya SK, Ganguli C, Izonin I, Nagar AK. Cyber attack evaluation dataset for deep packet inspection and analysis. Data Brief. 2023;46:108771. https://doi.org/10.1016/j.dib.2022.108771.
Catal C, Giray G, Tekinerdogan B. Applications of deep learning for mobile malware detection: A systematic literature review. Neur Comp Appl. 2022; 1–26. https://doi.org/10.1007/s00521-021-06597-0
Naik N, Jenkins P, Savage N, Yang L. Cyberthreat Hunting-Part 1: triaging ransomware using fuzzy hashing, import hashing and YARA rules. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE. 2019; p. 1–6. https://doi.org/10.1109/FUZZ-IEEE.2019.8858803
Mira F, Huang W. Performance evaluation of string based malware detection methods. In: International Conference on Automation and Computing (ICAC). IEEE. 2018; p. 1–6. https://doi.org/10.23919/IConAC.2018.8749096
Xiao X, Zhang S, Mercaldo F, Hu G, Sangaiah AK. Android malware detection based on system call sequences and LSTM. Multim Tls Appl. 2019;78:3979–99. https://doi.org/10.1007/s11042-017-5104-0.
Zhang J. A practical logic obfuscation technique for hardware security. IEEE Trans VLSI sys. 2015;24(3):1193–7. https://doi.org/10.1109/TVLSI.2015.2437996.
Cesare S, Xiang Y, Zhou W. Malwise—an effective and efficient classification system for packed and polymorphic malware. IEEE Trans Compu. 2012;62(6):1193–206. https://doi.org/10.1109/TC.2012.65.
Abbas MFB, Srikanthan T. Low-complexity signature-based malware detection for IoT devices. In: Applications and Techniques in Information Security: International Conference, ATIS 2017, Auckland, New Zealand, Proceedings. Springer. 2017; p. 181–189. https://doi.org/10.1007/978-981-10-5421-1_15
Naik N, Jenkins P, Savage N, Yang L, Naik K, Song J. Embedding fuzzy rules with YARA rules for performance optimisation of malware analysis. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE. 2020; p. 1–7. https://doi.org/10.1109/FUZZ48607.2020.9177856
Culling C. Which YARA rules rule: basic or advanced? In: GIAC (GCIA) Gold Certification and RES. 2018; p. 5500.
Brengel M, Rossow C. YARIX: scalable YARA-based Malware Intelligence. In: USENIX Security Symposium. 2021. p. 3541–3558.
VirusTotal. YARA in a nutshell. 2019; [Online]. https://virustotal.github.io/yara/. Accessed 3 Feb 2023.
Andrade EDO, Viterbo J, Vasconcelos CN, Guérin J, Bernardini FC. A model based on lstm neural networks to identify five different types of malware. Proc Compu Sci. 2019;159:182–91. https://doi.org/10.1016/j.procs.2019.09.173.
Akbar F, Hussain M, Mumtaz R, Riaz Q, Wahab AWA, Jung KH. Permissions-based detection of android malware using machine learning. Symmetry. 2022;14(4):718. https://doi.org/10.3390/sym14040718.
Kambar MEZN, Esmaeilzadeh A, Kim Y, Taghva K. A survey on mobile malware detection methods using machine learning. In: IEEE Annual Computing and Communication Workshop and Conference (CCWC). IEEE. 2022; p. 0215–0221. https://doi.org/10.1109/CCWC54503.2022.9720753
Yakura H, Shinozaki S, Nishimura R, Oyama Y, Sakuma J. Neural malware analysis with attention mechanism. Comp Secur. 2019;87:101592. https://doi.org/10.1016/j.cose.2019.101592.
Nadeem MW, Goh HG, Ponnusamy V, Aun Y. DDoS detection in SDN using machine learning techniques. Comput Mater Contin. 2022;71(1):771–89. https://doi.org/10.32604/cmc.2022.021669.
Mirjalili S, Mirjalili SM, Yang XS. Binary bat algorithm. Neural Comput Appl. 2014;25:663–81. https://doi.org/10.1007/s00521-013-1525-5.
Nakamura RY, Pereira LA, Costa KA, Rodrigues D, Papa JP, Yang XS. BBA: a binary bat algorithm for feature selection. In: SIBGRAPI conference on graphics, patterns and images. IEEE. p. 291–297. https://doi.org/10.1109/SIBGRAPI.2012.47
Alotaibi FM, Vassilakis VG. Sdn-based detection of self-propagating ransomware: the case of badrabbit. IEEE Access. 2021;9:28039–58. https://doi.org/10.1109/ACCESS.2021.3058897.
Masum M, Faruk MJ, Shahriar H, Qian K, Lo D, Adnan MI. Ransomware classification and detection with machine learning algorithms. In: IEEE Annual Computing and Communication Workshop and Conference (CCWC). IEEE. 2022; p. 0316–0322. https://doi.org/10.1109/CCWC54503.2022.9720869
Şahín CB, Dírí B. Robust feature selection with LSTM recurrent neural networks for artificial immune recognition system. IEEE Access. 2019;7:24165–78. https://doi.org/10.1109/ACCESS.2019.2900118.
Ghanei H, Manavi F, Hamzeh A. A novel method for malware detection based on hardware events using deep neural networks. Compu Viro Hack Tech. 2021;17(4):319–31. https://doi.org/10.1007/s11416-021-00386-y.
Burks R, Islam KA, Lu Y, Li J. Data augmentation with generative models for improved malware detection: A comparative study. In: IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON). IEEE. 2019; p. 0660–0665. https://doi.org/10.1109/UEMCON47517.2019.8993085
Lu J, Liu A, Dong F, Gu F, Gama J, Zhang G. Learning under concept drift: a review. IEEE trans knowl Data Engg. 2018;31(12):2346–63. https://doi.org/10.1109/TKDE.2018.2876857.
Thabtah F, Hammoud S, Kamalov F, Gonsalves A. Data imbalance in classification: experimental evaluation. Info Sci. 2020;513:429–41. https://doi.org/10.1016/j.ins.2019.11.004.
Fahy C, Yang S, Gongora M. Scarcity of labels in non-stationary data streams: A survey. ACM Comp Surv (CSUR). 2022;55(2):1–39. https://doi.org/10.1145/3494832.
Duch W, Wieczorek T, Biesiada J, Blachnik M. Comparison of feature ranking methods based on information entropy. In: IEEE International Joint Conference on Neural Networks. IEEE. 2004; p. 1415–1419. https://doi.org/10.1109/IJCNN.2004.1380157
Jaramillo LES. Malware detection and mitigation techniques: Lessons learned from Mirai DDOS attack. Info Sys Eng Manag. 2018;3(3):19. https://doi.org/10.20897/jisem/2655.
Wireshark Tool. https://www.wireshark.org/. Accessed 27 July 2023
Mukhiya SK, Ahmed U. Hands-On Exploratory Data Analysis with Python: Perform EDA techniques to understand, summarize, and investigate your data. Packt Publishing Ltd; 2020.
Vinesmsuic dataset [Available online] at: https://www.kaggle.com/code/vinesmsuic/malware-detection-using-deeplearning/data. Accessed 8 Feb 2023.
Android Malware Dataset [Available online] at: https://www.kaggle.com/datasets/shashwatwork/android-malware-dataset-for-machine-learning. Accessed 8 Feb 2023.
IoT Malware Dataset [Available online]: https://www.kaggle.com/datasets/efecastaneda/malware-iot-log-file. Accessed 8 Feb 2023.
Elsayed MS, Le-Khac NA, Jurcut AD. InSDN: A novel SDN intrusion dataset. IEEE access. 2020;8:165263–84. https://doi.org/10.1109/ACCESS.2020.3022633.
Reddy KV, Ambati SR, Reddy YS, Reddy AN. AdaBoost for Parkinson's disease detection using robust scaler and SFS from acoustic features. In: Smart Technologies, Communication and Robotics (STCR). IEEE. p. 1–6. https://doi.org/10.1109/STCR51658.2021.9588906
Mashru D, Mangipudi GM, Swamy H, Halangali S, Sushma E. A decentralised instant messaging application with end-to-end encryption. In: 2023 20th Learning and Technology Conference (L&T) (pp. 48–53). IEEE; 2023.
Si Q, Xu H, Tong Y, Zhou Y, Liang J, Cui L, Hao Z. Malware Detection Using Automated Generation of Yara Rules on Dynamic Features. In: Science of Cyber Security: 4th International Conference, SciSec 2022, Matsue, Japan, August 10–12, 2022, Revised Selected Papers (pp. 315-330). Cham: Springer International Publishing; 2022.
Liu H, Patras P. NetSentry: a deep learning approach to detecting incipient large-scale network attacks. Comput Commun. 2022;191:119–32.
Kim S, Kim J, Nam S, Kim D. WebMon: ML-and YARA-based malicious webpage detection. Comput Netw. 2018;137:119–31.

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, India
Sonam Bhardwaj & Mayank Dave

Corresponding author

Correspondence to Sonam Bhardwaj.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Research Trends in Computational Intelligence” guest edited by Anshul Verma, Pradeepika Verma, Vivek Kumar Singh, and S. Karthikeyan.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhardwaj, S., Dave, M. Integrating a Rule-Based Approach to Malware Detection with an LSTM-Based Feature Selection Technique. SN COMPUT. SCI. 4, 737 (2023). https://doi.org/10.1007/s42979-023-02177-2

Received: 13 February 2023
Accepted: 21 July 2023
Published: 27 September 2023
DOI: https://doi.org/10.1007/s42979-023-02177-2
