AnimeGAN: A Novel Lightweight GAN for Photo Animation

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1205)

Abstract

In this paper, we propose a novel approach for transforming photos of real-world scenes into anime-style images, a meaningful and challenging task in computer vision and artistic style transfer. Our approach combines neural style transfer and generative adversarial networks (GANs). Existing methods have not achieved satisfactory animation results on this task, and their most significant problems are: 1) the generated images lack obvious anime-style textures; 2) the generated images lose the content of the original photos; 3) the network parameters require large memory capacity. To address these problems, we propose a novel lightweight generative adversarial network, called AnimeGAN, for fast animation style transfer. We further propose three novel loss functions that give the generated images better anime visual effects: a grayscale style loss, a grayscale adversarial loss, and a color reconstruction loss. AnimeGAN can easily be trained end-to-end with unpaired training data, and its parameters require lower memory capacity. Experimental results show that our method rapidly transforms real-world photos into high-quality anime images and outperforms state-of-the-art methods.
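The abstract names two color-aware losses: a grayscale style loss computed on grayscale images (so style is matched by texture rather than color) and a color reconstruction loss that keeps the generated image's colors close to the input photo. The sketch below is an illustrative NumPy rendering of those ideas, not the paper's exact formulation; the function names, the Gram-matrix style term, and the choice of YUV space for color reconstruction are assumptions made for the example.

```python
import numpy as np

def rgb_to_gray(img):
    # ITU-R BT.601 luma weights; img has shape (..., 3) with RGB in [0, 1]
    return img @ np.array([0.299, 0.587, 0.114])

def rgb_to_yuv(img):
    # BT.601 RGB -> YUV conversion matrix
    m = np.array([[ 0.299,      0.587,      0.114     ],
                  [-0.14713,   -0.28886,    0.436     ],
                  [ 0.615,     -0.51499,   -0.10001   ]])
    return img @ m.T

def gram_matrix(feat):
    # feat: (H, W, C) feature map; the Gram matrix captures texture
    # statistics independent of spatial layout
    c = feat.reshape(-1, feat.shape[-1])
    return c.T @ c / c.shape[0]

def grayscale_style_loss(gen_feat, anime_gray_feat):
    # L1 distance between Gram matrices of features extracted from the
    # generated image and from a *grayscale* anime image, so the style
    # signal is texture, not the anime images' color distribution
    return np.abs(gram_matrix(gen_feat) - gram_matrix(anime_gray_feat)).mean()

def color_reconstruction_loss(gen_img, photo_img):
    # L1 distance in YUV space (one assumption here): luminance and
    # chroma channels are penalized separately, which keeps the output's
    # colors anchored to the input photo
    return np.abs(rgb_to_yuv(gen_img) - rgb_to_yuv(photo_img)).mean()
```

In a full training loop these terms would be weighted and added to the adversarial and content losses; the weights and the feature extractor (e.g. a pretrained VGG) are left out of this sketch.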

Keywords

Generative adversarial networks · Neural style transfer · Computer vision

Notes

Acknowledgment

The work described in this paper was supported by the National Natural Science Foundation of China under Grant No. 61300127. Any conclusions or recommendations stated here are those of the authors and do not necessarily reflect official positions of NSFC.


Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. School of Civil Engineering, Wuhan University, Wuhan, China
  2. School of Computer Science, Hubei University of Technology, Wuhan, China
