Abstract
Deep long-tailed learning aims to train effective deep networks on real-world imbalanced distributions, in which each of the many tail classes has only a few samples. A large body of work trains discriminative models for visual recognition on such long-tailed distributions. In contrast, we aim to train conditional Generative Adversarial Networks, a class of image generation models, on long-tailed distributions. We find that, as in recognition, state-of-the-art methods for image generation also suffer performance degradation on tail classes. This degradation is mainly due to class-specific mode collapse for tail classes, which we observe to be correlated with a spectral explosion of the conditioning parameter matrix. We propose a novel group Spectral Regularizer (gSR) that prevents this spectral explosion and alleviates mode collapse, resulting in diverse and plausible image generation even for tail classes. gSR combines effectively with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data. Extensive experiments demonstrate the efficacy of our regularizer on long-tailed datasets with varying degrees of imbalance. Project Page: https://sites.google.com/view/gsr-eccv22.
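To make the idea concrete, the following is a minimal NumPy sketch of a group spectral penalty in the spirit of gSR, assuming (hypothetically) that the class-conditioning parameters form one row per class, that each row is reshaped into a fixed number of groups, and that the penalty is the average spectral norm (largest singular value) of the per-class group matrices. The function name, grouping scheme, and averaging are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def group_spectral_penalty(W, num_groups):
    """Sketch of a group spectral penalty (illustrative, not the exact gSR).

    W          : (num_classes, d) conditioning parameter matrix,
                 e.g. per-class conditional BatchNorm gains.
    num_groups : number of groups each d-dim row is reshaped into.

    Each class's row is reshaped to (num_groups, d // num_groups), and the
    largest singular value of that group matrix is penalized; a large value
    signals the "spectral explosion" associated with tail-class collapse.
    """
    num_classes, d = W.shape
    assert d % num_groups == 0, "feature dim must split evenly into groups"
    penalty = 0.0
    for c in range(num_classes):
        group_matrix = W[c].reshape(num_groups, d // num_groups)
        # ord=2 on a matrix gives its spectral norm (largest singular value)
        penalty += np.linalg.norm(group_matrix, ord=2)
    return penalty / num_classes
```

In training, such a term would be added to the generator loss with a weighting coefficient, so that gradient descent discourages growth of the top singular values of the tail-class conditioning parameters.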
Acknowledgements
This work was supported in part by SERB-STAR Project (Project:STR/2020/000128), Govt. of India and a Google Research Award. Harsh Rangwani is supported by Prime Minister’s Research Fellowship (PMRF). We thank Lavish Bansal for help with StyleGAN experiments.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Rangwani, H., Jaswani, N., Karmali, T., Jampani, V., Babu, R.V. (2022). Improving GANs for Long-Tailed Data Through Group Spectral Regularization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13675. Springer, Cham. https://doi.org/10.1007/978-3-031-19784-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19783-3
Online ISBN: 978-3-031-19784-0