Multiclass Malware Classification Using Either Static Opcodes or Dynamic API Calls

Chanajitt, Rajchada; Pfahringer, Bernhard; Gomes, Heitor Murilo; Yogarajan, Vithya

doi:10.1007/978-3-031-22695-3_30

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13728)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

Today’s malware variants are growing at an unprecedented rate. To avoid detection by existing antivirus engines, attackers have been increasing the complexity of packers, layers of obfuscation, and encryption to obstruct the process of reverse engineering. This paper presents an automated method using static analysis for extracting opcode sequences of a length of up to 5000 and employing these sequences for classifying potential malware into eight classes, namely ransomware, trojan, backdoor, rootkit, virus, miner, benign, and other. Our empirical analysis compares four different classifiers: MLP, LSTM, GRU, and Transformer. The experimental results demonstrate that the GRU approach achieves the highest F1-score of up to 87%. In addition, we analyze dynamic API call sequences. We use a public malware dataset that comprises more than 7000 sample sequences of 342 API calls each for apps from eight different malware families. A GRU network achieves the best result for this dataset, producing an F1-score of 78%.
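The abstract describes opcode sequences of up to 5000 opcodes that are fed to sequence classifiers over eight classes. The paper's own preprocessing is not visible in this preview, so the following is only a minimal sketch of one common way to turn such sequences into fixed-length integer inputs. The padding and unknown-token ids, the frequency-ordered vocabulary, and the helper names `build_vocab` and `encode` are all assumptions made for illustration; only the length limit and the class names come from the abstract.

```python
from collections import Counter

MAX_LEN = 5000  # maximum opcode sequence length stated in the abstract
PAD_ID = 0      # assumed id used to pad short sequences
UNK_ID = 1      # assumed id for opcodes missing from the vocabulary

# The eight target classes named in the abstract.
CLASSES = ["ransomware", "trojan", "backdoor", "rootkit",
           "virus", "miner", "benign", "other"]


def build_vocab(sequences):
    """Map each opcode mnemonic to an integer id (hypothetical helper).

    More frequent opcodes get smaller ids; ties are broken alphabetically
    so the mapping is deterministic.
    """
    counts = Counter(op for seq in sequences for op in seq)
    ordered = sorted(counts, key=lambda op: (-counts[op], op))
    return {op: i + 2 for i, op in enumerate(ordered)}


def encode(seq, vocab, max_len=MAX_LEN):
    """Truncate to max_len, map opcodes to ids, and right-pad with PAD_ID."""
    ids = [vocab.get(op, UNK_ID) for op in seq[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Toy opcode sequences standing in for disassembler output.
samples = [["push", "mov", "call", "ret"], ["mov", "mov", "xor", "jmp"]]
vocab = build_vocab(samples)
encoded = [encode(s, vocab) for s in samples]
```

Each encoded sample is a list of exactly `MAX_LEN` integers, which is the shape an embedding layer in front of an MLP, LSTM, GRU, or Transformer classifier would typically expect.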

Author information

Authors and Affiliations

Department of Computer Science, University of Waikato, Hamilton, New Zealand
Rajchada Chanajitt & Bernhard Pfahringer
School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Heitor Murilo Gomes
School of Computer Science, University of Auckland, Auckland, New Zealand
Vithya Yogarajan

Corresponding author

Correspondence to Rajchada Chanajitt.

Editor information

Editors and Affiliations

University of New South Wales, Sydney, NSW, Australia
Haris Aziz
University of Western Australia, Perth, WA, Australia
Débora Corrêa
University of Western Australia, Perth, WA, Australia
Tim French

About this paper

Cite this paper

Chanajitt, R., Pfahringer, B., Gomes, H.M., Yogarajan, V. (2022). Multiclass Malware Classification Using Either Static Opcodes or Dynamic API Calls. In: Aziz, H., Corrêa, D., French, T. (eds) AI 2022: Advances in Artificial Intelligence. AI 2022. Lecture Notes in Computer Science, vol 13728. Springer, Cham. https://doi.org/10.1007/978-3-031-22695-3_30

DOI: https://doi.org/10.1007/978-3-031-22695-3_30
Published: 03 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22694-6
Online ISBN: 978-3-031-22695-3
eBook Packages: Computer Science, Computer Science (R0)
