Image Manipulation with Perceptual Discriminators

  • Diana Sungatullina
  • Egor Zakharov
  • Dmitry Ulyanov
  • Victor Lempitsky
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11210)

Abstract

Systems that perform image manipulation using deep convolutional networks have achieved remarkable realism. Perceptual losses and losses based on adversarial discriminators are the two main classes of learning objectives behind these advances. In this work, we show how these two ideas can be combined in a principled and non-additive manner for unaligned image translation tasks. This is accomplished through a special architecture of the discriminator network within a generative adversarial learning framework. The new architecture, which we call a perceptual discriminator, embeds the convolutional parts of a pre-trained deep classification network inside the discriminator network. The resulting architecture can be trained on unaligned image datasets while benefiting from the robustness and efficiency of perceptual losses. We demonstrate the merits of the new architecture in a series of qualitative and quantitative comparisons with baseline approaches and state-of-the-art frameworks for unaligned image translation.
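
The central construction described above can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the authors' exact implementation: it embeds the frozen convolutional part of an ImageNet-pre-trained VGG-19 (a backbone commonly used for perceptual losses) inside a discriminator and trains only a small head on top. The class name `PerceptualDiscriminator`, the truncation point (after relu4_1), and the head design are hypothetical choices made for illustration.

```python
# Minimal sketch of a "perceptual discriminator": the frozen convolutional
# part of a pre-trained classifier is embedded inside the discriminator,
# and only a small head on top is trained adversarially. Illustrative only;
# the truncation point and head architecture are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

class PerceptualDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional features of an ImageNet-pre-trained VGG-19,
        # truncated after relu4_1 (512 channels) and frozen.
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        self.features = nn.Sequential(*list(vgg.features.children())[:21])
        for p in self.features.parameters():
            p.requires_grad = False  # weights stay fixed during training
        # Small trainable head mapping features to a patch-wise score map.
        self.head = nn.Sequential(
            nn.Conv2d(512, 256, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 1, kernel_size=1),
        )

    def forward(self, x):
        # Gradients still flow *through* the frozen features, so a
        # generator can be trained against this discriminator as usual.
        f = self.features(x)
        return self.head(f)

# Usage sketch: score a batch of 3-channel images.
disc = PerceptualDiscriminator()
scores = disc(torch.randn(4, 3, 128, 128))  # -> (4, 1, 16, 16) score map
```

Because the backbone is fixed, the discriminator cannot drift away from its perceptual features during adversarial training, which is one way to read the abstract's claim that the architecture inherits the robustness of perceptual losses.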

Keywords

Image translation · Image editing · Perceptual loss · Generative adversarial networks

Notes

Acknowledgements

This work has been supported by the Ministry of Education and Science of the Russian Federation (grant 14.756.31.0001).


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Diana Sungatullina (1)
  • Egor Zakharov (1)
  • Dmitry Ulyanov (1)
  • Victor Lempitsky (1)
  1. Skolkovo Institute of Science and Technology, Moscow, Russia