Inducing Optimal Attribute Representations for Conditional GANs

Bhattarai, Binod; Kim, Tae-Kyun

doi:10.1007/978-3-030-58571-6_5

Binod Bhattarai¹² &
Tae-Kyun Kim^12,13

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12352))

Included in the following conference series:

European Conference on Computer Vision

3673 Accesses
13 Citations

Abstract

Conditional GANs (cGANs) are widely used in translating an image from one category to another. Meaningful conditions on GANs provide greater flexibility and control over the nature of the target domain synthetic data. Existing conditional GANs commonly encode target domain label information as hard-coded categorical vectors in the form of 0s and 1s. The major drawbacks of such representations are inability to encode the high-order semantic information of target categories and their relative dependencies. We propose a novel end-to-end learning framework based on Graph Convolutional Networks to learn the attribute representations to condition the generator. The GAN losses, the discriminator and attribute classification loss, are fed back to the graph resulting in the synthetic images that are more natural and clearer with respect to the attributes generation. Moreover, prior-arts are mostly given priorities to condition on the generator side, not on the discriminator side of GANs. We apply the conditions on the discriminator side as well via multi-task learning. We enhanced four state-of-the-art cGANs architectures: Stargan, Stargan-JNT, AttGAN and STGAN. Our extensive qualitative and quantitative evaluations on challenging face attributes manipulation data set, CelebA, LFWA, and RaFD, show that the cGANs enhanced by our methods outperform by a large margin, compared to their counter-parts and other conditioning methods, in terms of both target attributes recognition rates and quality measures such as PSNR and SSIM.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN. arXiv preprint arXiv:1701.07875 (2017)
Bhattarai, B., Bodur, R., Kim, T.K.: AugLabel: exploiting word representations to augment labels for face attribute classification. In: ICASSP (2020)
Google Scholar
Cao, J., Huang, H., Li, Y., Liu, J., He, R., Sun, Z.: Biphasic learning of GANs for high-resolution image-to-image translation. In: CVPR (2019)
Google Scholar
Cavallanti, G., Cesa-Bianchi, N., Gentile, C.: Linear algorithms for online multitask classification. JMLR 11, 2901–2934 (2010)
MathSciNet MATH Google Scholar
Chen, T., Zhai, X., Ritter, M., Lucic, M., Houlsby, N.: Self-supervised GANs via auxiliary rotation loss. In: CVPR (2019)
Google Scholar
Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., Abbeel, P.: InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In: NIPS (2016)
Google Scholar
Chen, Y.C., et al.: Facelet-bank for fast portrait manipulation. In: CVPR (2018)
Google Scholar
Chen, Z.M., Wei, X.S., Wang, P., Guo, Y.: Multi-label image recognition with graph convolutional networks. In: CVPR (2019)
Google Scholar
Choi, Y., Choi, M., Kim, M., Ha, J.W., Kim, S., Choo, J.: StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. In: CVPR (2018)
Google Scholar
Ding, H., Sricharan, K., Chellappa, R.: ExprGAN: facial expression editing with controllable expression intensity. In: AAAI (2018)
Google Scholar
Gecer, B., Bhattarai, B., Kittler, J., Kim, T.-K.: Semi-supervised adversarial learning to generate photorealistic face images of new identities from 3D morphable model. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 230–248. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_14
Chapter Google Scholar
Grover, A., Leskovec, J.: Node2vec: scalable feature learning for networks. In: SIGKDD (2016)
Google Scholar
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of Wasserstein GANs. In: NIPS (2017)
Google Scholar
Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: NIPS (2017)
Google Scholar
Hamilton, W.L., Ying, R., Leskovec, J.: Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584 (2017)
He, Z., Zuo, W., Kan, M., Shan, S., Chen, X.: AttGAN: facial attribute editing by only changing what you want. IEEE TIP 28, 5464–5478 (2019)
MathSciNet MATH Google Scholar
Hong, S., Yang, D., Choi, J., Lee, H.: Inferring semantic layout for hierarchical text-to-image synthesis. In: CVPR (2018)
Google Scholar
Huang, X., Liu, M.-Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11207, pp. 179–196. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01219-9_11
Chapter Google Scholar
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)
Google Scholar
Kaneko, T., Hiramatsu, K., Kashino, K.: Generative attribute controller with conditional filtered generative adversarial networks. In: CVPR (2017)
Google Scholar
Kaneko, T., Hiramatsu, K., Kashino, K.: Generative adversarial image synthesis with decision tree latent controller. In: CVPR (2018)
Google Scholar
Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: ICLR (2018)
Google Scholar
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: CVPR (2019)
Google Scholar
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: ICLR (2016)
Google Scholar
Lample, G., Zeghidour, N., Usunier, N., Bordes, A., Denoyer, L., et al.: Fader networks: manipulating images by sliding attributes. In: NIPS (2017)
Google Scholar
Liu, M., et al.: STGAN: a unified selective transfer network for arbitrary image attribute editing. In: CVPR (2019)
Google Scholar
Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: NeurIPS (2017)
Google Scholar
Liu, Y., Li, Q., Sun, Z.: Attribute-aware face aging with wavelet-based generative adversarial networks. In: CVPR (2019)
Google Scholar
Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: ICML (2013)
Google Scholar
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NurIPS (2013)
Google Scholar
Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)
Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. In: ICLR (2018)
Google Scholar
Miyato, T., Koyama, M.: cGANs with projection discriminator. In: ICLR (2018)
Google Scholar
Nian, F., Chen, X., Yang, S., Lv, G.: Facial attribute recognition with feature decoupling and graph convolutional networks. IEEE Access 7, 85500–85512 (2019)
Article Google Scholar
Odena, A., et al.: Is generator conditioning causally related to GAN performance? In: ICML (2018)
Google Scholar
Odena, A., Olah, C., Shlens, J.: Conditional image synthesis with auxiliary classifier GANs. In: ICML (2017)
Google Scholar
Perarnau, G., Van De Weijer, J., Raducanu, B., Álvarez, J.M.: Invertible conditional GANs for image editing. In: NIPSW (2016)
Google Scholar
Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social representations. In: SIGKDD (2014)
Google Scholar
Pumarola, A., Agudo, A., Martinez, A.M., Sanfeliu, A., Moreno-Noguer, F.: GANimation: anatomically-aware facial animation from a single image. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 835–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_50
Chapter Google Scholar
Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H.: Generative adversarial text to image synthesis (2016)
Google Scholar
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. In: NIPS (2016)
Google Scholar
Shen, W., Liu, R.: Learning residual images for face attribute manipulation. In: CVPR (2017)
Google Scholar
Shmelkov, K., Schmid, C., Alahari, K.: How good is my GAN? In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11206, pp. 218–234. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01216-8_14
Chapter Google Scholar
Taherkhani, F., Nasrabadi, N.M., Dawson, J.: A deep face identification network enhanced by facial attributes prediction. In: CVPRW (2018)
Google Scholar
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: WWW (2015)
Google Scholar
Xiao, T., Hong, J., Ma, J.: ELEGANT: exchanging latent encodings with GAN for transferring multiple face attributes. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 172–187. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_11
Chapter Google Scholar
Zakharov, E., Shysheya, A., Burkov, E., Lempitsky, V.: Few-shot adversarial learning of realistic neural talking head models. arXiv preprint arXiv:1905.08233 (2019)
Zhang, G., Kan, M., Shan, S., Chen, X.: Generative adversarial network with spatial attention for face attribute editing. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11210, pp. 422–437. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01231-1_26
Chapter Google Scholar
Zhang, H., et al.: StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In: CVPR (2017)
Google Scholar
Zhang, Z., Song, Y., Qi, H.: Age progression/regression by conditional adversarial autoencoder. In: CVPR (2017)
Google Scholar
Zhao, B., Meng, L., Yin, W., Sigal, L.: Image generation from layout. In: CVPR (2019)
Google Scholar
Zhou, J., Cui, G., Zhang, Z., Yang, C., Liu, Z., Sun, M.: Graph neural networks: A review of methods and applications. arXiv:1812.08434 (2018)

Download references

Acknowledgements

Authors would like to thank EPSRC Programme Grant ‘FACER2VM’(EP/N007743/1) for generous support. We would also like to thank Prateek Manocha, undergraduate student from IIT Guwahati for some of the baseline experiments during his summer internship at Imperial College London.

Author information

Authors and Affiliations

Imperial College London, London, UK
Binod Bhattarai & Tae-Kyun Kim
KAIST, Daejeon, South Korea
Tae-Kyun Kim

Authors

Binod Bhattarai
View author publications
You can also search for this author in PubMed Google Scholar
Tae-Kyun Kim
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Binod Bhattarai .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1946 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Bhattarai, B., Kim, TK. (2020). Inducing Optimal Attribute Representations for Conditional GANs. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12352. Springer, Cham. https://doi.org/10.1007/978-3-030-58571-6_5

Download citation

DOI: https://doi.org/10.1007/978-3-030-58571-6_5
Published: 09 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58570-9
Online ISBN: 978-3-030-58571-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics