Improving GANs for Long-Tailed Data Through Group Spectral Regularization

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13675)

Abstract

Deep long-tailed learning aims to train useful deep networks on practical, real-world imbalanced distributions, wherein most labels of the tail classes are associated with only a few samples. A large body of work trains discriminative models for visual recognition on long-tailed distributions. In contrast, we aim to train conditional Generative Adversarial Networks, a class of image generation models, on long-tailed distributions. We find that, similar to recognition, state-of-the-art methods for image generation also suffer from performance degradation on tail classes. This degradation is mainly due to class-specific mode collapse for tail classes, which we observe to be correlated with the spectral explosion of the conditioning parameter matrix. We propose a novel group Spectral Regularizer (gSR) that prevents the spectral explosion and thereby alleviates mode collapse, which results in diverse and plausible image generation even for tail classes. We find that gSR combines effectively with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data. Extensive experiments demonstrate the efficacy of our regularizer on long-tailed datasets with different degrees of imbalance. Project Page: https://sites.google.com/view/gsr-eccv22.
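The abstract does not spell out how the regularizer is computed, so the following is only a rough sketch of what a group spectral penalty could look like, not the authors' implementation. It assumes that a class's conditioning parameters form a flat vector, that the vector is reshaped row by row into `num_groups` groups, and that the penalty is the squared largest singular value of the resulting matrix, estimated by power iteration. The function name `group_spectral_penalty` and all of its parameters are hypothetical.

```python
import math
import random


def group_spectral_penalty(params, num_groups, n_iters=100, seed=0):
    """Squared largest singular value of `params` reshaped into groups.

    Hypothetical helper: the grouping scheme and the use of power
    iteration are assumptions made for illustration only.
    """
    if num_groups <= 0 or len(params) % num_groups != 0:
        raise ValueError("len(params) must be divisible by num_groups")
    cols = len(params) // num_groups
    # Reshape the flat vector into a (num_groups x cols) matrix M.
    rows = [[float(x) for x in params[i * cols:(i + 1) * cols]]
            for i in range(num_groups)]

    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]

    # Power iteration on M^T M: v converges to the top right singular vector.
    for _ in range(n_iters):
        u = [sum(r[j] * v[j] for j in range(cols)) for r in rows]   # u = M v
        v = [sum(rows[i][j] * u[i] for i in range(num_groups))
             for j in range(cols)]                                  # v = M^T u
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v == 0.0:
            return 0.0  # M maps v to zero, e.g. when M is the zero matrix
        v = [x / norm_v for x in v]

    # With v a unit vector, ||M v||^2 approximates sigma_max(M)^2.
    u = [sum(r[j] * v[j] for j in range(cols)) for r in rows]
    return sum(x * x for x in u)
```

In a training loop, a term like this would presumably be summed over classes and added to the generator loss with a weighting coefficient; those details are not given in the abstract and are left out here.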

Acknowledgements

This work was supported in part by the SERB-STAR Project (Project: STR/2020/000128), Govt. of India, and a Google Research Award. Harsh Rangwani is supported by the Prime Minister’s Research Fellowship (PMRF). We thank Lavish Bansal for help with the StyleGAN experiments.

Author information

Correspondence to Harsh Rangwani.

Electronic supplementary material

Supplementary material 1 (PDF, 5278 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rangwani, H., Jaswani, N., Karmali, T., Jampani, V., Babu, R.V. (2022). Improving GANs for Long-Tailed Data Through Group Spectral Regularization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13675. Springer, Cham. https://doi.org/10.1007/978-3-031-19784-0_25

  • DOI: https://doi.org/10.1007/978-3-031-19784-0_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19783-3

  • Online ISBN: 978-3-031-19784-0

  • eBook Packages: Computer Science (R0)
