Next-generation antivirus for JavaScript malware detection based on dynamic features

  • Regular Paper
Knowledge and Information Systems

Abstract

There are many kinds of exploit kits, each built around several vulnerabilities, but almost all of them are written in JavaScript. We therefore created an antivirus, endowed with machine learning, that specializes in detecting JavaScript malware from its runtime behavior. In our methodology, the JavaScript file is executed in a controlled environment so that its suspicious behavior can be investigated. Overall, our antivirus dynamically monitors and weighs 7690 suspicious behaviors that a JavaScript file can exhibit on Windows 7. In our experiments, the proposed antivirus is compared with antiviruses based on both deep and shallow networks. Our antivirus achieves an average accuracy of 99.75% in distinguishing benign files from malware, with a training time of 8.92 s. Establishing the relationship between accuracy and training time is essential in information security: eight new malware samples are released every second, so an antivirus with excessive training time may already be obsolete by the time it is released. Our proposed model overcomes this limitation of the state of the art by combining high accuracy with fast training. In addition, the proposed antivirus is able to detect JavaScript malware that employs anti-forensic techniques such as obfuscation, polymorphism, and fileless attacks.
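To make the link between training time and obsolescence concrete, the following minimal sketch multiplies the release rate quoted in the abstract (eight new malware samples per second) by a training duration. The function name and the one-hour comparison point are illustrative assumptions, not values or code from the paper.

```python
# Figures quoted in the abstract.
NEW_MALWARE_PER_SECOND = 8
REPORTED_TRAINING_SECONDS = 8.92

# Hypothetical comparison point (an assumption, not from the paper):
# a model that needs one hour to train.
HYPOTHETICAL_SLOW_TRAINING_SECONDS = 60 * 60


def samples_released_during(training_seconds: float) -> float:
    """Estimate how many new malware samples appear while a model trains."""
    return NEW_MALWARE_PER_SECOND * training_seconds


if __name__ == "__main__":
    fast = samples_released_during(REPORTED_TRAINING_SECONDS)
    slow = samples_released_during(HYPOTHETICAL_SLOW_TRAINING_SECONDS)
    print(f"Released during 8.92 s of training: about {fast:.0f} samples")
    print(f"Released during 1 h of training: about {slow:.0f} samples")
```

Under these assumptions, roughly 71 new samples appear during the reported 8.92 s of training, whereas a hypothetical one-hour training run would already lag behind by 28,800 samples when it finishes.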

Notes

  1. The RedMonk Programming Language Rankings: January 2018. Available at: https://redmonk.com/sogrady/2018/03/07/language-rankings-1-18/. Accessed in July 2021.

  2. Web technology for developers: JavaScript. Available at: https://developer.mozilla.org/en-US/docs/Web/JavaScript. Accessed in July 2021.

  3. PE: Portable Executable.

  4. HynekPetrak: JavaScript Malware Collection. Available at: https://github.com/HynekPetrak/javascript-malware-collection. Accessed in June 2021.

  5. RBM: restricted Boltzmann machine.

  6. HynekPetrak: JavaScript Malware Collection. Available at: https://github.com/HynekPetrak/javascript-malware-collection. Accessed in June 2021.

  7. The jQuery Plugin Registry. Available at: https://plugins.jquery.com/. Accessed in February 2021.

  8. VirusTotal: online service that identifies malware files using the main commercial antiviruses worldwide. Available at: https://www.VirusTotal.com. Accessed in February 2021.

  9. Cuckoo: Automated Malware Analysis. Available at: https://cuckoosandbox.org/. Accessed in February 2021.

  10. Definition and Statistics about Infostealers. Available at: https://www.microsoft.com/en-us/wdsi/threats/worms. Accessed in February 2021.

Author information

Corresponding author

Correspondence to Sidney M. L. de Lima.

About this article

Cite this article

de Lima, S.M.L., Souza, D.M., Pinheiro, R.P. et al. Next-generation antivirus for JavaScript malware detection based on dynamic features. Knowl Inf Syst 66, 1337–1370 (2024). https://doi.org/10.1007/s10115-023-01978-4

  • DOI: https://doi.org/10.1007/s10115-023-01978-4
