Malware Detection Using Deep Transferred Generative Adversarial Networks

  • Jin-Young Kim
  • Seok-Jun Bu
  • Sung-Bae Cho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10634)


Malicious software is increasingly generated with modified features, the very characteristics that detection methods rely on. Automatic classification of malicious software is efficient because it does not need to store every characteristic. In this paper, we propose a transferred generative adversarial network (tGAN) for automatic classification and detection of zero-day attacks. Because GAN training is unstable and often yields a generator that produces nonsensical outputs, we propose a method to pre-train the GAN with an autoencoder structure. We analyze the detector and visualize its performance by observing how malicious software clusters under the t-SNE algorithm. The proposed model outperforms the conventional machine learning algorithms it is compared with.


Malicious software; Zero-day attack; Generative adversarial network; Autoencoder; Transfer learning



This work was supported by the Defense Acquisition Program Administration and the Agency for Defense Development under contract UD160066BD.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, Yonsei University, Seoul, Korea
