Semantic Contrastive Embedding for Generalized Zero-Shot Learning

Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes when labeled examples are available only for the seen classes. Recent feature generation methods learn a generative model that synthesizes the missing visual features of unseen classes to mitigate the data-imbalance problem in GZSL. However, the original visual feature space is suboptimal for GZSL recognition because it lacks semantic information, which is vital for recognizing unseen classes. To tackle this issue, we propose to integrate the feature generation model with an embedding model. Our GZSL framework maps both the real samples and the synthetic samples produced by the generation model into an embedding space, where we perform the final GZSL classification. Specifically, we propose a semantic contrastive embedding (SCE) for our GZSL framework. Our SCE consists of an attribute-level contrastive embedding and a class-level contrastive embedding, which aim to capture transferable and discriminative information, respectively, in the embedding space. We evaluate our GZSL method with semantic contrastive embedding, named SCE-GZSL, on four benchmark datasets. The results show that SCE-GZSL achieves state-of-the-art or second-best performance on these datasets.
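
As a rough illustration of the class-level contrastive idea, the following minimal sketch computes a generic supervised contrastive (InfoNCE-style) loss over a batch of embeddings. It is not the exact SCE objective, which the abstract does not specify. The function name `class_contrastive_loss`, the `temperature` value, and the toy inputs are assumptions made for illustration; in the framework described above, the embeddings would come from both real and synthesized features.

```python
import numpy as np

def class_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's exact loss).

    Embeddings sharing a class label are treated as positives; all other
    samples in the batch act as negatives. Expects at least two samples.
    """
    # L2-normalize so the dot product is a cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)

    # Exclude each sample's similarity to itself from the softmax.
    sim = np.where(not_self, sim, -np.inf)
    sim_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    log_prob = sim - sim_max - log_norm

    # Positives: other samples with the same class label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0  # skip anchors that have no positive in the batch

    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())
```

With toy embeddings that form two tight clusters, the loss is close to zero when the labels match the clusters and large when they cut across them, which is the discriminative behavior the class-level term is meant to encourage.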

We thank the editors and reviewers for their constructive comments on our paper. This work was supported by the National Science Foundation of China (Grant Nos. U1713208 and 61876085), the China Postdoctoral Science Foundation (Grant Nos. 2017M621748, 2020M681606 and 2019T120430), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX21_0302).

