Malicious Software Classification Using VGG16 Deep Neural Network’s Bottleneck Features

  • Conference paper
Information Technology - New Generations

Abstract

Malicious software (malware) has been extensively employed for illegal purposes, and thousands of new samples are discovered every day. The ability to classify samples with similar characteristics into families makes it possible to create mitigation strategies that work for a whole class of programs. In this paper, we present a malware family classification approach using VGG16 deep neural network’s bottleneck features. Malware samples are represented as byteplot grayscale images, and the convolutional layers of a VGG16 deep neural network pre-trained on the ImageNet dataset are used for bottleneck feature extraction. These features are used to train an SVM classifier for the malware family classification task. The experimental results on a dataset comprising 10,136 samples from 20 different families showed that our approach can effectively be used to classify malware families with an accuracy of 92.97%, outperforming similar approaches proposed in the literature which require feature engineering and considerable domain expertise.
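As an illustration of the byteplot representation mentioned above, the following is a minimal sketch of how a binary's raw bytes could be arranged into a grayscale image, with each byte value (0 to 255) treated as one pixel intensity. The function name `byteplot`, the default row width of 256, and the zero-padding of the last row are assumptions made for this example; the abstract does not state the image dimensions or padding scheme used in the paper.

```python
import math
from typing import List


def byteplot(data: bytes, width: int = 256) -> List[List[int]]:
    """Arrange raw bytes row by row into a 2-D grid of grayscale pixels.

    Each byte becomes one pixel with intensity 0-255.
    `width` is a hypothetical choice, not a value taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Number of rows needed to hold every byte at the given width.
    height = math.ceil(len(data) / width)
    # Assumption: pad the final row with zero bytes so the grid is rectangular.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Toy input standing in for the contents of an executable file.
sample = bytes(range(10))
image = byteplot(sample, width=4)
for row in image:
    print(row)
```

In the pipeline the abstract describes, an image like this would then be passed through the convolutional layers of an ImageNet-pre-trained VGG16 network, and the resulting bottleneck features would be used to train the SVM classifier. Those steps are omitted here because they depend on details the abstract does not give.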

Notes

  1. Available at http://www.virussign.com.

  2. Available at http://www.virustotal.com.

Acknowledgements

This work has been partially supported by the Brazilian National Council for Scientific and Technological Development (grants 302923/2014-4 and 313152/2015-2). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPUs used for this research.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Rezende, E., Ruppert, G., Carvalho, T., Theophilo, A., Ramos, F., Geus, P.d. (2018). Malicious Software Classification Using VGG16 Deep Neural Network’s Bottleneck Features. In: Latifi, S. (eds) Information Technology - New Generations. Advances in Intelligent Systems and Computing, vol 738. Springer, Cham. https://doi.org/10.1007/978-3-319-77028-4_9

  • DOI: https://doi.org/10.1007/978-3-319-77028-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77027-7

  • Online ISBN: 978-3-319-77028-4

  • eBook Packages: Engineering, Engineering (R0)
