Malware Detection Based on Opcode Sequence and ResNet

  • Xuetao Zhang
  • Meng Sun
  • Jiabao Wang
  • Jinshuang Wang (Email author)
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 895)

Abstract

Traditional static malware detection methods struggle to keep pace with the rapid growth of malware variants, so machine-learning-based detection approaches have begun to flourish. Typically, operation codes (opcodes) disassembled from binary programs are fed to classifiers such as SVM and KNN. However, this kind of feature extraction does not make full use of the sequential relations between opcodes, and the classification models have few dimensions and limited matching ability. This paper therefore proposes a malware detection model based on a residual network. First, the model extracts opcode sequences with a disassembler. To make the opcode vectors more expressive, opcodes are represented with Word2Vec, and the word vectors are further optimized during training. Organizing opcodes into an overlapping opcode matrix and then applying convolution produces redundant information. To overcome this problem, the opcode sequences are downsampled when they are organized into the opcode matrix, which effectively controls time and space complexity. To improve the classification ability of the model, a ResNet-based classifier with more layers and cross-layer connections is used to match malicious code in more dimensions. Experiments show that the model reaches a malware classification accuracy of 98.2%, while the processing-time overhead relative to traditional classifiers remains negligible.
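
The difference between an overlapping opcode matrix and a downsampled one can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: downsampling is read here as taking non-overlapping windows (stride equal to the window width) instead of sliding the window one opcode at a time, and the opcode list, the 2-dimensional vectors, and the helper names `opcode_windows` and `embed_rows` are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Dict, List


def opcode_windows(opcodes: List[str], width: int, stride: int) -> List[List[str]]:
    """Cut an opcode sequence into rows of `width` opcodes,
    advancing `stride` opcodes between consecutive rows."""
    rows = []
    for start in range(0, len(opcodes) - width + 1, stride):
        rows.append(opcodes[start:start + width])
    return rows


def embed_rows(rows: List[List[str]], table: Dict[str, List[float]]) -> List[List[float]]:
    """Replace every opcode by its vector and concatenate the vectors of each row."""
    return [[value for op in row for value in table[op]] for row in rows]


# Hypothetical disassembler output (assumption, not taken from the paper).
opcodes = ["push", "mov", "sub", "mov", "call", "add", "pop", "ret"]

# Toy 2-dimensional vectors standing in for trained Word2Vec embeddings (assumption).
table = {
    "push": [0.1, 0.2],
    "mov": [0.3, 0.1],
    "sub": [0.0, 0.5],
    "call": [0.9, 0.4],
    "add": [0.2, 0.6],
    "pop": [0.1, 0.3],
    "ret": [0.8, 0.7],
}

# Stride 1: every opcode appears in up to `width` rows, so rows overlap heavily.
overlapping = opcode_windows(opcodes, width=4, stride=1)

# Stride equal to width: each opcode appears in exactly one row.
downsampled = opcode_windows(opcodes, width=4, stride=4)

# Numeric matrix that a convolutional classifier could take as input.
matrix = embed_rows(downsampled, table)

print(len(overlapping), "overlapping rows vs", len(downsampled), "downsampled rows")
```

For this 8-opcode toy sequence the overlapping layout yields 5 rows and the downsampled layout yields 2, which is the kind of reduction in redundant input that keeps time and space cost under control.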

Keywords

Opcode, N-gram, ResNet, Word2vec

Notes

Acknowledgements

This work is supported by the Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (BK20180080).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Xuetao Zhang (1)
  • Meng Sun (1)
  • Jiabao Wang (1)
  • Jinshuang Wang (1), Email author
  1. Department of Cyberspace Security, Army Engineering University of PLA, Nanjing, China
