
Cross-Domain Interpolation for Unpaired Image-to-Image Translation

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 11754)

Abstract

Unpaired image-to-image translation is a recent and challenging problem that consists of extracting and matching latent vectors from a source domain A and a target domain B. Both latent spaces are matched and interpolated by directed correspondence functions: F for \(A \rightarrow B\) and G for \(B \rightarrow A\). Current efforts focus on models based on Generative Adversarial Networks (GANs), since they synthesize quite realistic samples across different domains by learning critical features from their latent spaces. Nonetheless, domain exploration is not explicitly supervised; as a result, most GAN-based models fail to learn the key features, so the correspondence function overfits and either fails in the reverse direction or loses translation quality. In this paper, we propose a guided learning model based on bi-directional translation loops between the source and target manifolds, taking into account the Wasserstein distance between their probability distributions. The bi-directional translation is CycleGAN-based, but it treats the latent space Z as an intermediate domain that guides the learning process and reduces the error induced by the loops. We show experimental results on several public datasets, including Cityscapes, Horse2zebra, and Monet2photo, available at the EECS-Berkeley webpage (http://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/). Our results are competitive with the state of the art in terms of visual quality, stability, and other baseline metrics.
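
The abstract describes a CycleGAN-style, bi-directional translation loop that passes through a shared latent space Z and measures a Wasserstein distance between domain distributions. The following PyTorch sketch illustrates one plausible wiring of such a loop; every module name and hyper-parameter here (Encoder, Decoder, Critic, cycle_losses, the 128-dimensional Z, the 64x64 toy resolution) is an assumption made for illustration and is not taken from the authors' implementation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an image from a domain into the shared latent space Z (illustrative)."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, z_dim),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Maps a latent code z back into image space (toy 64x64 output)."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 128 * 8 * 8), nn.ReLU(inplace=True),
            nn.Unflatten(1, (128, 8, 8)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z)

class Critic(nn.Module):
    """Scalar-output critic for a WGAN-style (Wasserstein) adversarial term."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, 1),
        )
    def forward(self, x):
        return self.net(x)

def wasserstein_critic_loss(critic, real, fake):
    """WGAN critic objective: maximize E[D(real)] - E[D(fake)];
    returned negated so it can be minimized with a standard optimizer."""
    return -(critic(real).mean() - critic(fake).mean())

def cycle_losses(x_a, x_b, enc_A, enc_B, dec_A, dec_B):
    """Bi-directional loop A -> Z -> B -> Z -> A (and the symmetric loop for B)."""
    z_a = enc_A(x_a)               # A -> Z
    fake_b = dec_B(z_a)            # Z -> B   (F: A -> B)
    rec_a = dec_A(enc_B(fake_b))   # B -> Z -> A   (G: B -> A)

    z_b = enc_B(x_b)
    fake_a = dec_A(z_b)
    rec_b = dec_B(enc_A(fake_a))

    l1 = nn.L1Loss()
    cycle = l1(rec_a, x_a) + l1(rec_b, x_b)  # image-level cycle consistency
    # Latent-space consistency: Z treated as an intermediate domain that
    # guides the loop (illustrative interpretation of the abstract).
    latent = l1(enc_B(fake_b), z_a) + l1(enc_A(fake_a), z_b)
    return cycle, latent, fake_a, fake_b

if __name__ == "__main__":
    x_a = torch.randn(2, 3, 64, 64)  # toy batch from domain A
    x_b = torch.randn(2, 3, 64, 64)  # toy batch from domain B
    enc_A, enc_B, dec_A, dec_B = Encoder(), Encoder(), Decoder(), Decoder()
    cycle, latent, fake_a, fake_b = cycle_losses(x_a, x_b, enc_A, enc_B, dec_A, dec_B)
    adv_b = wasserstein_critic_loss(Critic(), x_b, fake_b.detach())
    print(cycle.item(), latent.item(), adv_b.item())
```

In a full training loop, the adversarial (critic) term would be combined with the image-level cycle term and the latent consistency term, and the critic would be kept approximately Lipschitz with weight clipping or a gradient penalty, as is standard for WGAN-style objectives.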

Keywords

  • Image-to-image translation
  • Generative Adversarial Network
  • Cross-domain interpolation

The present work was supported by grant 234-2015-FONDECYT (Master Program) from Cienciactiva of the National Council for Science, Technology and Technological Innovation (CONCYTEC-PERU) and the Vicerrectorate for Research of Universidad Nacional de Ingeniería (VRI - UNI).

Author information

Corresponding author

Correspondence to Jorge López.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

López, J., Mauricio, A., Díaz, J., Cámara, G. (2019). Cross-Domain Interpolation for Unpaired Image-to-Image Translation. In: Tzovaras, D., Giakoumis, D., Vincze, M., Argyros, A. (eds.) Computer Vision Systems. ICVS 2019. Lecture Notes in Computer Science, vol 11754. Springer, Cham. https://doi.org/10.1007/978-3-030-34995-0_49

  • DOI: https://doi.org/10.1007/978-3-030-34995-0_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34994-3

  • Online ISBN: 978-3-030-34995-0

  • eBook Packages: Computer Science, Computer Science (R0)