Multiclass Malware Classification Using Either Static Opcodes or Dynamic API Calls

  • Conference paper
AI 2022: Advances in Artificial Intelligence (AI 2022)

Abstract

The number of malware variants is growing at an unprecedented rate. To avoid detection by existing antivirus engines, attackers have been increasing the complexity of packers, layers of obfuscation, and encryption to obstruct the process of reverse engineering. This paper presents an automated method that uses static analysis to extract opcode sequences of up to 5000 opcodes and employs these sequences to classify potential malware into eight classes, namely ransomware, trojan, backdoor, rootkit, virus, miner, benign, and other. Our empirical analysis compares four different classifiers: MLP, LSTM, GRU, and Transformer. The experimental results demonstrate that the GRU approach achieves the highest F1-score, up to 87%. In addition, we analyze dynamic API call sequences. We use a public malware dataset that comprises more than 7000 sample sequences of 342 API calls each for apps from eight different malware families. A GRU network achieves the best result for this dataset, producing an F1-score of 78%.
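The abstract does not say how extracted opcode sequences are turned into model input. The following is a minimal sketch of one common approach, not the authors' pipeline: each opcode mnemonic is mapped to an integer ID, and each sequence is truncated or right-padded to a fixed length. The 5000 cap comes from the abstract; the names `build_vocab`, `encode`, `PAD_ID`, and `UNK_ID`, the frequency-ordered vocabulary, and the padding scheme are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List

MAX_LEN = 5000  # sequence length cap taken from the abstract
PAD_ID = 0      # assumed: ID reserved for padding
UNK_ID = 1      # assumed: ID reserved for opcodes not seen when building the vocabulary


def build_vocab(sequences: Iterable[List[str]], min_count: int = 1) -> Dict[str, int]:
    """Map each opcode mnemonic to an integer ID, most frequent first."""
    counts = Counter(op for seq in sequences for op in seq)
    vocab: Dict[str, int] = {}
    # Sort by descending count, then alphabetically, so IDs are deterministic.
    for op, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[op] = len(vocab) + 2  # IDs 0 and 1 are reserved
    return vocab


def encode(seq: List[str], vocab: Dict[str, int], max_len: int = MAX_LEN) -> List[int]:
    """Truncate to max_len opcodes, map to IDs, and right-pad with PAD_ID."""
    ids = [vocab.get(op, UNK_ID) for op in seq[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for disassembler output.
    train = [["push", "mov", "call", "mov"], ["mov", "xor", "ret"]]
    vocab = build_vocab(train)
    print(encode(["mov", "push", "jmp"], vocab, max_len=5))  # [2, 4, 1, 0, 0]
```

Fixed-length integer sequences of this kind can be fed to an embedding layer followed by a recurrent network such as a GRU or LSTM. The 342-call API sequences mentioned in the abstract could be encoded the same way with a different `max_len`.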

Notes

  1. https://upx.github.io/

  2. https://docs.python.org/3/library/zlib.html

  3. https://virusshare.com

  4. https://urlhaus.abuse.ch

  5. https://fileHorse.com

  6. http://www.virustotal.com

  7. https://www.mono-project.com/docs/tools+libraries/tools/monodis/

  8. https://www.capstone-engine.org/lang_python.html

  9. https://man7.org/linux/man-pages/man1/objdump.1.html

Author information

Corresponding author

Correspondence to Rajchada Chanajitt.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chanajitt, R., Pfahringer, B., Gomes, H.M., Yogarajan, V. (2022). Multiclass Malware Classification Using Either Static Opcodes or Dynamic API Calls. In: Aziz, H., Corrêa, D., French, T. (eds) AI 2022: Advances in Artificial Intelligence. AI 2022. Lecture Notes in Computer Science, vol 13728. Springer, Cham. https://doi.org/10.1007/978-3-031-22695-3_30

  • DOI: https://doi.org/10.1007/978-3-031-22695-3_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22694-6

  • Online ISBN: 978-3-031-22695-3

  • eBook Packages: Computer Science, Computer Science (R0)
