A study on zero-shot learning from semantic viewpoint

  • Survey
The Visual Computer

Abstract

Humans recognize an unseen object class by relating it to classes they have already seen, provided they have some background knowledge of the unseen class. Zero-shot learning is a learning paradigm that aims to build a recognition model for the case where the training and testing classes are mutually exclusive. A zero-shot learning model trained on labeled data from seen classes can recognize unseen classes when sufficient information about the relationship between seen and unseen classes is given. The semantic space holds this information for both seen and unseen classes; it is an essential part of zero-shot learning and acts as a bridge between the two sets of classes. In this article, we provide a compact and comprehensive survey of zero-shot learning. First, we explain the different ways to construct the semantic space, along with their pros and cons. Next, we categorize zero-shot learning methods from the point of view of semantic space construction. Finally, we present performance evaluation measures together with a relevant and influential zero-shot learning database.
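As an illustration of how a semantic space can bridge seen and unseen classes, the following is a minimal sketch in the spirit of attribute-based zero-shot recognition. It is not the method of any particular paper in the survey. The attribute names, class signatures, and 3-D "visual features" are hypothetical toy values, and the per-attribute nearest-centroid predictor is a simplification chosen only to keep the example self-contained.

```python
from statistics import fmean

# Hypothetical semantic space: each class is described by binary attributes.
ATTRIBUTES = ("striped", "hooved", "winged")
SEEN = {"horse": (0, 1, 0), "tiger": (1, 0, 0), "eagle": (0, 0, 1)}
UNSEEN = {"zebra": (1, 1, 0), "bee": (1, 0, 1)}  # no training images exist

# Toy labeled training data (feature vector, class name), seen classes only.
TRAIN = [
    ([0.1, 0.9, 0.0], "horse"), ([0.2, 0.8, 0.1], "horse"),
    ([0.9, 0.1, 0.1], "tiger"), ([0.8, 0.2, 0.0], "tiger"),
    ([0.0, 0.1, 0.9], "eagle"), ([0.1, 0.0, 0.8], "eagle"),
]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [fmean(column) for column in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_attribute_models(train, signatures):
    """For each attribute, store centroids of samples with and without it."""
    models = []
    for k in range(len(ATTRIBUTES)):
        pos = [x for x, c in train if signatures[c][k] == 1]
        neg = [x for x, c in train if signatures[c][k] == 0]
        models.append((centroid(pos), centroid(neg)))
    return models


def predict_attributes(x, models):
    """Project a feature vector into the semantic (attribute) space."""
    return tuple(int(sq_dist(x, pos) < sq_dist(x, neg)) for pos, neg in models)


def classify_unseen(x, models, candidates):
    """Pick the candidate class whose signature is closest in Hamming distance."""
    predicted = predict_attributes(x, models)
    return min(
        candidates,
        key=lambda c: sum(p != s for p, s in zip(predicted, candidates[c])),
    )


MODELS = fit_attribute_models(TRAIN, SEEN)
print(classify_unseen([0.8, 0.9, 0.1], MODELS, UNSEEN))  # prints "zebra"
```

The attribute predictors are fitted on seen classes only. The unseen classes enter solely through their attribute signatures, which is the bridging role the abstract assigns to the semantic space.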



Acknowledgements

The authors would like to thank the Department of Computer Science and Engineering, National Institute of Technology Manipur, and the National Institute of Technology Hamirpur, India, for providing the platform and equipment that made this study possible.

Author information

Corresponding author

Correspondence to P K Bhagat.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest to disclose. All authors participated in writing this paper; the work is original and has not been published elsewhere. This manuscript has not been submitted to, and is not under review at, another journal or other publishing venue.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bhagat, P.K., Choudhary, P. & Singh, K.M. A study on zero-shot learning from semantic viewpoint. Vis Comput 39, 2149–2163 (2023). https://doi.org/10.1007/s00371-022-02470-w

  • DOI: https://doi.org/10.1007/s00371-022-02470-w
