A study on zero-shot learning from semantic viewpoint

Bhagat, P K; Choudhary, Prakash; Singh, Kh Manglem

doi:10.1007/s00371-022-02470-w

Survey
Published: 30 May 2022

The Visual Computer, volume 39, pages 2149–2163 (2023)

Abstract

Humans recognize an unseen object class by relating it to classes they have already seen, provided they have some background knowledge of the unseen class. Zero-shot learning is a learning paradigm that aims to build a recognition model whose training (seen) and testing (unseen) classes are mutually exclusive. A zero-shot learning model trained on labeled data from seen classes can also recognize unseen classes when it is given sufficient information about the relationship between seen and unseen classes. The semantic space contains this semantic information about both seen and unseen classes. It is an important part of zero-shot learning and acts as a bridge between the two. In this article, we provide a compact and comprehensive survey of zero-shot learning. First, we explain the different ways to construct the semantic space, along with their pros and cons. Next, we present a categorization of zero-shot learning methods from the point of view of semantic space construction. Finally, we present performance evaluation measures together with relevant and influential zero-shot learning databases.
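
To make the bridging role of the semantic space concrete, the following is a minimal sketch and not a method taken from the survey. It assumes that the semantic space consists of hand-written binary attribute vectors, that visual features are projected into it by a linear map fitted on seen classes only, and that an unseen class is predicted as the nearest class prototype. The class names, attribute names, feature values, and the helper functions `fit_projection` and `classify` are all hypothetical and chosen for illustration.

```python
# Toy zero-shot classification through a shared semantic (attribute) space.
# All data is synthetic; class and attribute names are illustrative assumptions.

# Semantic space: one attribute vector per class, ordered as
# [striped, hooved, aquatic]. Seen and unseen classes are disjoint.
SEEN = {"tiger": [1.0, 0.0, 0.0], "horse": [0.0, 1.0, 0.0], "dolphin": [0.0, 0.0, 1.0]}
UNSEEN = {"zebra": [1.0, 1.0, 0.0], "clownfish": [1.0, 0.0, 1.0]}

# Labeled training features exist for seen classes only (hypothetical 3-D features).
TRAIN = [
    ([0.9, 0.1, 0.0], "tiger"), ([1.0, 0.0, 0.1], "tiger"),
    ([0.1, 1.0, 0.0], "horse"), ([0.0, 0.9, 0.1], "horse"),
    ([0.0, 0.1, 1.0], "dolphin"), ([0.1, 0.0, 0.9], "dolphin"),
]


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def fit_projection(train, prototypes, lr=0.5, epochs=300):
    """Fit W so that W applied to a feature approximates its class attributes.

    Uses plain batch gradient descent on the mean squared error.
    """
    dim_in = len(train[0][0])
    dim_out = len(next(iter(prototypes.values())))
    w = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(epochs):
        grad = [[0.0] * dim_in for _ in range(dim_out)]
        for x, label in train:
            pred = matvec(w, x)
            target = prototypes[label]
            for i in range(dim_out):
                err = pred[i] - target[i]
                for j in range(dim_in):
                    grad[i][j] += 2.0 * err * x[j] / len(train)
        for i in range(dim_out):
            for j in range(dim_in):
                w[i][j] -= lr * grad[i][j]
    return w


def classify(w, x, prototypes):
    """Project x into the semantic space and return the nearest class prototype."""
    z = matvec(w, x)

    def sq_dist(name):
        return sum((zi - pi) ** 2 for zi, pi in zip(z, prototypes[name]))

    return min(prototypes, key=sq_dist)


W = fit_projection(TRAIN, SEEN)           # training touches seen classes only
zebra_like = [0.9, 1.0, 0.1]              # striped and hooved
clownfish_like = [1.0, 0.1, 0.9]          # striped and aquatic
print(classify(W, zebra_like, UNSEEN))      # zebra
print(classify(W, clownfish_like, UNSEEN))  # clownfish
```

No zebra or clownfish sample is used during fitting. The two test samples are labeled correctly only because their attribute vectors share dimensions with the seen classes, which is the sense in which the semantic space bridges seen and unseen classes.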

Acknowledgements

The authors would like to thank the Department of Computer Science and Engineering, National Institute of Technology Manipur, and the National Institute of Technology Hamirpur, India, for providing the platform and equipment that made this study possible.

Author information

Authors and Affiliations

National Institute of Technology Manipur, Imphal, Manipur, 795001, India
P K Bhagat & Kh Manglem Singh
National Institute of Technology Hamirpur, Hamirpur, Himachal Pradesh, 177005, India
Prakash Choudhary

Corresponding author

Correspondence to P K Bhagat.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest to disclose. All authors participated in writing this paper, and the work is original and has not been published elsewhere. This manuscript has not been submitted to, nor is it under review at, another journal or other publishing venue.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bhagat, P.K., Choudhary, P. & Singh, K.M. A study on zero-shot learning from semantic viewpoint. Vis Comput 39, 2149–2163 (2023). https://doi.org/10.1007/s00371-022-02470-w

Accepted: 18 March 2022
Published: 30 May 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s00371-022-02470-w
