
GANHopper: Multi-hop GAN for Unsupervised Image-to-Image Translation

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12371)


Abstract

We introduce GANHopper, an unsupervised image-to-image translation network that transforms images gradually between two domains, through multiple hops. Instead of executing translation directly, we steer the translation by requiring the network to produce in-between images that resemble weighted hybrids between images from the input domains. Our network is trained on unpaired images from the two domains only, without any in-between images. All hops are produced using a single generator along each direction. In addition to the standard cycle-consistency and adversarial losses, we introduce a new hybrid discriminator, which is trained to classify the intermediate images produced by the generator as weighted hybrids, with weights based on a predetermined hop count. We also add a smoothness term to constrain the magnitude of each hop, further regularizing the translation. Compared to previous methods, GANHopper excels at image translations involving domain-specific image features and geometric variations while also preserving non-domain-specific features such as general color schemes.
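The hop mechanism described in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, `generator` stands in for a trained network, images are flattened lists of floats, and the linear i/n hybrid-label schedule and mean-absolute smoothness penalty are assumptions chosen to mirror the description above.

```python
def hop_weights(num_hops):
    """Hybridness labels for the hybrid discriminator: hop i of n is
    treated as an i/n-weighted hybrid of the target domain (assumed
    linear schedule over the predetermined hop count)."""
    return [i / num_hops for i in range(1, num_hops + 1)]

def multi_hop_translate(image, generator, num_hops):
    """Apply one shared generator repeatedly, collecting every
    intermediate hop instead of translating in a single step."""
    hops = []
    current = image
    for _ in range(num_hops):
        current = generator(current)
        hops.append(current)
    return hops

def smoothness_penalty(image, hops):
    """Mean absolute change between consecutive hops, constraining the
    magnitude of each hop as the abstract's smoothness term does."""
    sequence = [image] + hops
    total = 0.0
    for prev, nxt in zip(sequence, sequence[1:]):
        total += sum(abs(a - b) for a, b in zip(prev, nxt)) / len(prev)
    return total / len(hops)

# Toy usage: a "generator" that nudges every pixel by a fixed amount.
nudge = lambda img: [v + 0.1 for v in img]
labels = hop_weights(4)          # [0.25, 0.5, 0.75, 1.0]
hops = multi_hop_translate([0.0, 0.0], nudge, 3)
```

With four hops the intermediate images would be supervised toward hybridness weights 0.25, 0.5, 0.75, and 1.0; the cycle-consistency and adversarial losses mentioned in the abstract are omitted from this sketch.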



Author information

Corresponding author

Correspondence to Wallace Lira.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 872 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Lira, W., Merz, J., Ritchie, D., Cohen-Or, D., Zhang, H. (2020). GANHopper: Multi-hop GAN for Unsupervised Image-to-Image Translation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12371. Springer, Cham. https://doi.org/10.1007/978-3-030-58574-7_22


  • DOI: https://doi.org/10.1007/978-3-030-58574-7_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58573-0

  • Online ISBN: 978-3-030-58574-7

  • eBook Packages: Computer Science, Computer Science (R0)
