API Call Based Malware Detection Approach Using Recurrent Neural Network—LSTM

  • J. Mathew
  • M. A. Ajay Kumara
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 940)


The number of malware variants grows every year, as most malware developers tweak existing, easily available malware code to create custom versions. Although the behaviour of these variants remains largely the same, their signatures change, so static signature-based detection schemes fail to identify them. One promising alternative is dynamic analysis, which observes the malware's behaviour at run time. Malware execution depends largely on the Application Programming Interface (API) calls issued to the operating system to carry out malicious tasks. Behaviour-based detection techniques that monitor such API calls can therefore deliver promising results, as they are inherently semantic-aware. In this paper, we use the ability of Recurrent Neural Networks (RNNs) to capture long-term features of time-series and sequential data, and study how effectively RNNs can distinguish malware from benign programs based on their behaviour, specifically their system call sequences. We trained an RNN with Long Short-Term Memory (LSTM) units on the most informative sequences in the API-call dataset, selected by ranking features with Term Frequency-Inverse Document Frequency (TF-IDF), and achieved an accuracy of up to 92% in classifying previously unseen test API-call sequences as malware or benign.
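The abstract does not specify the exact TF-IDF variant or how the ranked features are used to select sequences, so the following is only a minimal sketch of the general idea: each API-call trace is treated as a document, each API call as a term, and calls are ranked by their mean TF-IDF weight. The function names (`tfidf_rank`, `keep_top_calls`), the cutoff `k`, and the toy traces are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_rank(sequences):
    """Rank API call names by mean TF-IDF weight over all traces.

    Assumption: plain TF (relative frequency within a trace) and
    IDF = log(N / df); the paper may use a different weighting.
    """
    n_docs = len(sequences)
    # Document frequency: in how many traces each API call occurs.
    df = Counter()
    for seq in sequences:
        df.update(set(seq))

    totals = Counter()
    for seq in sequences:
        for call, count in Counter(seq).items():
            tf = count / len(seq)
            idf = math.log(n_docs / df[call])
            totals[call] += tf * idf

    # Highest mean weight first; ties broken alphabetically for determinism.
    return sorted(
        ((call, total / n_docs) for call, total in totals.items()),
        key=lambda item: (-item[1], item[0]),
    )


def keep_top_calls(sequence, ranked, k):
    """Keep only the calls of a trace that are among the k top-ranked ones."""
    top = {call for call, _ in ranked[:k]}
    return [call for call in sequence if call in top]


# Hypothetical toy traces, for illustration only.
traces = [
    ["NtCreateFile", "NtWriteFile", "NtClose"],
    ["NtCreateFile", "RegSetValueExA", "RegSetValueExA", "NtClose"],
    ["NtCreateFile", "NtReadFile", "NtClose"],
]
ranked = tfidf_rank(traces)
filtered = [keep_top_calls(trace, ranked, k=3) for trace in traces]
```

Calls that occur in every trace (here `NtCreateFile` and `NtClose`) receive an IDF of zero and fall to the bottom of the ranking, which is why TF-IDF tends to surface the calls that discriminate between traces. In a pipeline like the one described, the filtered sequences would then be integer-encoded and fed to an LSTM classifier.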


API system call · Malware detection · TF-IDF · RNN-LSTM



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science and Engineering, Amrita School of Engineering, Bengaluru, Amrita Vishwa Vidyapeetham, Bengaluru, India
