
Inducing Optimal Attribute Representations for Conditional GANs

  • Conference paper
  • First Online:
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12352)

Included in the following conference series: European Conference on Computer Vision (ECCV)

Abstract

Conditional GANs (cGANs) are widely used to translate an image from one category to another. Meaningful conditions on GANs provide greater flexibility and control over the nature of the target-domain synthetic data. Existing conditional GANs commonly encode target-domain label information as hard-coded categorical vectors of 0s and 1s. The major drawbacks of such representations are their inability to encode the high-order semantic information of the target categories and their relative dependencies. We propose a novel end-to-end learning framework based on Graph Convolutional Networks that learns attribute representations to condition the generator. The GAN losses, i.e. the discriminator loss and the attribute classification loss, are fed back to the graph, resulting in synthetic images that are more natural and sharper with respect to the generated attributes. Moreover, prior work mostly applies conditions on the generator side of GANs, not on the discriminator side. We apply the conditions on the discriminator side as well, via multi-task learning. We enhance four state-of-the-art cGAN architectures: StarGAN, StarGAN-JNT, AttGAN and STGAN. Our extensive qualitative and quantitative evaluations on the challenging face attribute manipulation datasets CelebA, LFWA, and RaFD show that the cGANs enhanced by our method outperform their counterparts and other conditioning methods by a large margin, in terms of both target attribute recognition rates and quality measures such as PSNR and SSIM.
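To make the conditioning idea concrete, here is a minimal sketch, not the authors' released code: a small graph convolutional encoder (in the style of Kipf and Welling) turns attribute nodes into dense embeddings, which then replace the hard-coded 0/1 condition vector fed to a cGAN generator. All module names, feature sizes, and the pooling in condition_vector are illustrative assumptions; the exact architecture is not specified in the abstract.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W) (Kipf-Welling style)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):
        # a_hat: normalized adjacency of the K-node attribute graph, shape (K, K)
        return torch.relu(self.linear(a_hat @ h))

class AttributeGraphEncoder(nn.Module):
    """Maps K attribute nodes (e.g. word-vector features) to K dense embeddings."""
    def __init__(self, in_dim=300, hid_dim=128, out_dim=64):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim)
        self.gc2 = GCNLayer(hid_dim, out_dim)

    def forward(self, node_feats, a_hat):
        return self.gc2(self.gc1(node_feats, a_hat), a_hat)  # (K, out_dim)

def condition_vector(target_attrs, attr_emb):
    """Replace the hard-coded 0/1 condition with a mix of learned embeddings.

    target_attrs: (B, K) float tensor of desired target attributes (0/1).
    attr_emb:     (K, D) embeddings from the graph encoder.
    Returns a (B, D) condition fed to the generator alongside the input image.
    """
    return target_attrs @ attr_emb
```

A typical call would build a_hat from attribute co-occurrences in the training labels and initialize node_feats with word vectors for the attribute names; both of these choices are assumptions here, not details taken from the abstract.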
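A similarly hedged sketch of the second idea, conditioning the discriminator via multi-task learning: the discriminator predicts the attributes alongside its real/fake score, and because the generator's condition is built from the graph embeddings above, both loss terms back-propagate into the graph encoder, making the framework end-to-end. The layer layout and the unweighted loss sum are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskDiscriminator(nn.Module):
    """Discriminator with an adversarial head and an attribute head."""
    def __init__(self, num_attrs, img_channels=3, base=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(img_channels, base, 4, 2, 1), nn.LeakyReLU(0.01),
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.LeakyReLU(0.01),
            nn.Conv2d(base * 2, base * 4, 4, 2, 1), nn.LeakyReLU(0.01),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv_head = nn.Linear(base * 4, 1)           # real/fake score
        self.attr_head = nn.Linear(base * 4, num_attrs)  # attribute logits

    def forward(self, img):
        h = self.backbone(img)
        return self.adv_head(h), self.attr_head(h)

def generator_objective(adv_logit, attr_logits, target_attrs):
    """Adversarial + attribute-classification loss on generated images.

    Since the generator's condition vector is built from GCN embeddings,
    gradients of both terms flow back through the generator into the graph.
    """
    adv_loss = F.binary_cross_entropy_with_logits(
        adv_logit, torch.ones_like(adv_logit))
    cls_loss = F.binary_cross_entropy_with_logits(attr_logits, target_attrs)
    return adv_loss + cls_loss
```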



Acknowledgements

The authors would like to thank the EPSRC Programme Grant ‘FACER2VM’ (EP/N007743/1) for its generous support. We would also like to thank Prateek Manocha, an undergraduate student at IIT Guwahati, for running some of the baseline experiments during his summer internship at Imperial College London.

Author information


Corresponding author

Correspondence to Binod Bhattarai.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 1946 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Bhattarai, B., Kim, T.K. (2020). Inducing Optimal Attribute Representations for Conditional GANs. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12352. Springer, Cham. https://doi.org/10.1007/978-3-030-58571-6_5


  • DOI: https://doi.org/10.1007/978-3-030-58571-6_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58570-9

  • Online ISBN: 978-3-030-58571-6

  • eBook Packages: Computer Science, Computer Science (R0)
