
Unpaired Image-to-Image Translation Using Adversarial Consistency Loss

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12354)

Abstract

Unpaired image-to-image translation is a class of vision problems whose goal is to learn a mapping between two image domains from unpaired training data. The cycle-consistency loss is a widely used constraint for such problems, but because it enforces a strict pixel-level correspondence, it cannot change shapes, remove large objects, or ignore irrelevant texture. In this paper, we propose a novel adversarial-consistency loss for image-to-image translation. This loss does not require a translated image to map back to one specific source image; instead, it encourages translated images to retain the important features of their source images, overcoming the drawbacks of the cycle-consistency loss noted above. Our method achieves state-of-the-art results on three challenging tasks: glasses removal, male-to-female translation, and selfie-to-anime translation.
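
To make the contrast between the two constraints concrete, the sketch below (a minimal PyTorch-style illustration, not the authors' implementation; the generator, discriminator, and tensor names G_st, G_ts, D_src, and x_src are hypothetical placeholders) compares the strict pixel-level round-trip penalty of cycle consistency with a discriminator-based consistency term that only asks the round-trip image to look like a plausible source-domain image.

    import torch
    import torch.nn.functional as F

    def cycle_consistency_loss(G_st, G_ts, x_src):
        """CycleGAN-style constraint: the round trip source -> target -> source
        must reproduce the exact source image pixel by pixel (L1 penalty).
        This strict pixel-level matching is what blocks shape changes and
        the removal of large objects."""
        x_fake = G_st(x_src)   # translate source image into the target domain
        x_back = G_ts(x_fake)  # translate back into the source domain
        return F.l1_loss(x_back, x_src)

    def adversarial_consistency_loss(G_st, G_ts, D_src, x_src):
        """Adversarial-consistency idea (simplified): the round-trip image
        only has to be judged a realistic source-domain image by the
        discriminator D_src, not to match x_src pixel-wise, so the
        generators stay free to alter shapes and drop large objects."""
        x_fake = G_st(x_src)
        x_back = G_ts(x_fake)
        logits = D_src(x_back)
        # Non-saturating generator objective: push D_src to accept x_back as real.
        return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

Because the adversarial term never compares individual pixels, it trades the hard reconstruction guarantee of cycle consistency for this flexibility, at the cost of an extra discriminator and the usual care required when training GANs; the loss used in the paper is more elaborate than this sketch.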

Keywords

Generative adversarial networks · Dual learning · Image synthesis

Acknowledgements

This work was supported by funding from the Key-Area Research and Development Program of Guangdong Province (No. 2019B121204008), start-up research funds from Peking University (7100602564), and the Center on Frontiers of Computing Studies (7100602567). We would also like to thank the Imperial Institute of Advanced Technology for GPU support.

Supplementary material

Supplementary material 1 (PDF, 6.4 MB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

Hyperplane Lab, CFCS, Computer Science Department, Peking University, Beijing, China
