Zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis

  • Original article
The Visual Computer

Abstract

Zero-shot learning (ZSL) trains classifiers on seen classes and uses class-level semantic attributes to recognize unseen classes. In recent years, feature generation methods have been used to address the data imbalance issue in ZSL by synthesizing samples of unseen classes. However, current generation methods train the generator on seen classes only, so the synthesized visual features are inevitably biased toward seen classes and lack discriminative information. In this paper, we propose an effective ZSL method, termed zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis. To mitigate the category bias problem, we incorporate a domain discriminator and a semantic decoder into the generative network. Together, they maximize the decision boundary between categories so that unseen classes can be distinguished, and they promote the alignment of synthetic features with their corresponding semantic embeddings to enhance semantic consistency. In addition, to reduce the interference of redundant information in visual features, we propose a batch recombining-based disentangling method. The original visual features are projected by a conditional encoder into categorization-relevant and categorization-irrelevant representations, and a correlation penalty is employed to ensure the independence of the two representations. The categorization-irrelevant representation is then reorganized and spliced with the categorization-relevant representation, and disentanglement is achieved by calculating the difference between the post-reorganized and pre-reorganized representations. Finally, we train a classifier using the categorization-relevant representations to improve classification accuracy. Experiments on four publicly available ZSL datasets show that the proposed method achieves superior results, demonstrating its effectiveness in addressing the challenges of ZSL.
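
The batch recombining step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the encoder, the exact form of the correlation penalty, or how the pre- and post-reorganization difference is measured. The function names, the use of a squared Pearson cross-correlation, the random permutation across the batch, and the mean squared difference are all assumptions made for illustration.

```python
import numpy as np


def split_representation(z, relevant_dim):
    """Split encoded features into (relevant, irrelevant) parts.

    Assumption: the two parts are contiguous slices of one encoder output.
    """
    return z[:, :relevant_dim], z[:, relevant_dim:]


def correlation_penalty(relevant, irrelevant, eps=1e-8):
    """Mean squared Pearson cross-correlation between the two parts.

    Assumption: one plausible penalty; it is near 0 when the parts are
    linearly uncorrelated across the batch.
    """
    r = (relevant - relevant.mean(axis=0)) / (relevant.std(axis=0) + eps)
    u = (irrelevant - irrelevant.mean(axis=0)) / (irrelevant.std(axis=0) + eps)
    cross = r.T @ u / relevant.shape[0]  # shape: (relevant_dim, irrelevant_dim)
    return float(np.mean(cross ** 2))


def recombine_batch(relevant, irrelevant, rng):
    """Shuffle the irrelevant part across the batch and splice it back on.

    Assumption: "reorganized" is modeled as a random permutation of samples.
    """
    perm = rng.permutation(irrelevant.shape[0])
    return np.concatenate([relevant, irrelevant[perm]], axis=1), perm


def recombination_gap(pre, post):
    """Mean squared difference between pre- and post-recombination vectors.

    Assumption: a stand-in for the difference the abstract mentions.
    """
    return float(np.mean((post - pre) ** 2))


rng = np.random.default_rng(0)
z = rng.normal(size=(8, 6))  # stand-in for conditional-encoder outputs
rel, irr = split_representation(z, relevant_dim=4)
post, perm = recombine_batch(rel, irr, rng)
penalty = correlation_penalty(rel, irr)
gap = recombination_gap(z, post)
```

In a trainable version, `penalty` and `gap` would be loss terms computed on encoder outputs inside an automatic-differentiation framework. The random matrix `z` is used here only so the sketch runs on its own.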

Data availability

Data will be made available on request.

Acknowledgements

This work was supported in part by the Key Research Projects of Higher Education Institutions in Henan (Project No. 23A520022), the Henan Postgraduate Education Reform and Quality Improvement Project (Project No. YJS2022KC19), and the Special Fund Project for Basic Scientific Research of Zhongyuan University of Technology (Project No. K2021TD05).

Author information

Contributions

Conceptualization was done by GY, JF, and AH; JF, GY, AH, and XL were involved in methodology; JF, GY, XL, and CW helped in validation; JF, GY, CW, and XL assisted in formal analysis; investigation was done by JF; GY helped in resources; JF helped in writing—original draft preparation; JF, AH, and GY helped in writing—review and editing; visualization was done by JF and AH; supervision was done by GY, BC, and XL; GY was involved in funding acquisition. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Guan Yang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fang, J., Yang, G., Han, A. et al. Zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03393-4

  • DOI: https://doi.org/10.1007/s00371-024-03393-4
