Zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis

Fang, Juan; Yang, Guan; Han, Ayou; Liu, Xiaoming; Chen, Bo; Wang, Chen

doi:10.1007/s00371-024-03393-4

Original article
Published: 29 April 2024

The Visual Computer (2024)

Juan Fang^1,2, Guan Yang^1,2^na1, Ayou Han^1,2^na1, Xiaoming Liu^1,2, Bo Chen^3 & Chen Wang^1

Abstract

Zero-shot learning (ZSL) trains classifiers on seen classes and uses class-level semantic attributes to recognize unseen classes. In recent years, feature generation methods have been employed to address the data imbalance issue in ZSL by synthesizing samples of unseen classes. However, current generation methods train the generator on seen classes only, so the synthesized visual features are inevitably biased toward seen classes and lack discriminative information. In this paper, we propose an effective ZSL method, termed zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis. To mitigate the category bias problem, we incorporate a domain discriminator and a semantic decoder into the generative network. These components maximize the margin of the decision boundary between categories, help distinguish unseen classes, and promote the alignment of synthetic features with their corresponding semantic embeddings, thereby enhancing semantic consistency. In addition, to reduce the interference of redundant information in visual features, we propose a batch recombining-based disentangling method. A conditional encoder projects the original visual features into categorization-irrelevant and categorization-relevant representations, and a correlation penalty enforces independence between the two. The categorization-irrelevant representation is reorganized within the batch and spliced with the categorization-relevant representation, and disentanglement is achieved by calculating the difference between the post-reorganized and pre-reorganized representations. Finally, we train a classifier on the categorization-relevant representations to improve classification accuracy. Experiments on four publicly available ZSL datasets show that the proposed method achieves superior results, demonstrating its effectiveness in addressing the challenges of ZSL.
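The batch recombining idea summarized in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the cross-correlation form of the penalty, the random within-batch permutation, the squared-difference measure, and every name below (`correlation_penalty`, `recombine_batch`, `recombination_gap`, `head`) are assumptions made purely for illustration.

```python
import numpy as np


def correlation_penalty(z_rel, z_irr, eps=1e-8):
    """Mean squared cross-correlation between two representations.

    Assumed form of a correlation penalty: it approaches zero when the
    dimensions of z_rel and z_irr are linearly uncorrelated over the batch.
    """
    a = (z_rel - z_rel.mean(axis=0)) / (z_rel.std(axis=0) + eps)
    b = (z_irr - z_irr.mean(axis=0)) / (z_irr.std(axis=0) + eps)
    cross = a.T @ b / z_rel.shape[0]  # (d_rel, d_irr) correlation matrix
    return float(np.mean(cross ** 2))


def recombine_batch(z_rel, z_irr, rng):
    """Shuffle the irrelevant parts across the batch and splice them back.

    Each sample keeps its own categorization-relevant part but receives the
    categorization-irrelevant part of another (randomly chosen) sample.
    """
    perm = rng.permutation(z_irr.shape[0])
    spliced = np.concatenate([z_rel, z_irr[perm]], axis=1)
    return spliced, perm


def recombination_gap(head, z_rel, z_irr, rng):
    """Mean squared change in head(...) caused by the recombination.

    `head` is a hypothetical downstream map (for example a classifier).
    A small gap suggests that `head` does not depend on the irrelevant part.
    """
    before = head(np.concatenate([z_rel, z_irr], axis=1))
    spliced, _ = recombine_batch(z_rel, z_irr, rng)
    after = head(spliced)
    return float(np.mean((after - before) ** 2))


# Toy usage with random stand-ins for the encoder outputs.
rng = np.random.default_rng(0)
z_rel = rng.normal(size=(512, 4))
z_irr = rng.normal(size=(512, 4))
print(correlation_penalty(z_rel, z_irr))
print(recombination_gap(lambda z: z[:, :4].sum(axis=1), z_rel, z_irr, rng))
```

In this toy setup the `head` reads only the first four (relevant) columns, so swapping the irrelevant parts leaves its output unchanged. In a trained model, both quantities would presumably enter the loss as differentiable terms rather than being computed after the fact.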

Data availability

Data will be made available on request.

Acknowledgements

This work was supported in part by the Key Research Projects of Higher Education Institutions in Henan (Project No. 23A520022), the Henan Postgraduate Education Reform and Quality Improvement Project (Project No. YJS2022KC19), and the Special Fund Project for Basic Scientific Research of Zhongyuan University of Technology (Project No. K2021TD05).

Author information

Guan Yang and Ayou Han have contributed equally to this work.

Authors and Affiliations

School of Computer Science, Zhongyuan University of Technology, Zhengzhou, 450007, China
Juan Fang, Guan Yang, Ayou Han, Xiaoming Liu & Chen Wang
Henan Key Laboratory on Public Opinion Intelligent Analysis, Zhengzhou, 450007, China
Juan Fang, Guan Yang, Ayou Han & Xiaoming Liu
School of Mathematics and Statistics, Shenzhen University, Shenzhen, 518000, China
Bo Chen

Contributions

Conceptualization was done by GY, JF, and AH; JF, GY, AH, and XL were involved in methodology; JF, GY, XL, and CW helped in validation; JF, GY, CW, and XL assisted in formal analysis; investigation was done by JF; GY helped in resources; JF helped in writing—original draft preparation; JF, AH, and GY helped in writing—review and editing; visualization was done by JF and AH; supervision was done by GY, BC, and XL; GY was involved in funding acquisition. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Guan Yang.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fang, J., Yang, G., Han, A. et al. Zero-shot learning via categorization-relevant disentanglement and discriminative samples synthesis. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03393-4

Accepted: 28 March 2024
Published: 29 April 2024
DOI: https://doi.org/10.1007/s00371-024-03393-4
