Research progress of zero-shot learning

Applied Intelligence

Abstract

Although supervised learning has seen encouraging breakthroughs since the renaissance of deep learning, recognizing large-scale object classes remains a challenge, especially when some classes have few or no training samples. This paper comprehensively reviews the development of zero-shot learning (ZSL), covering its evolution, key technologies, mainstream models, current research hotspots and future research directions. First, the evolution is traced from multi-shot through few-shot to zero-shot learning. Second, the key techniques of ZSL are analyzed in detail in three aspects: visual feature extraction, semantic representation and visual-semantic mapping. Third, some typical models are interpreted in chronological order. Finally, closely related articles from the last three years are collected to analyze the current research hotspots and list future research directions.
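To make the three key techniques named in the abstract concrete, the following is a minimal sketch of how they can fit together. It is not a model from this paper. It assumes that visual features have already been extracted, that each unseen class is described by a hand-written attribute vector (the semantic representation), and that a linear map `W` from visual to attribute space is already available. The matrix `W`, the attribute names and the toy class signatures are all hypothetical values chosen for illustration.

```python
import math


def project(W, x):
    """Visual-semantic mapping: multiply the linear map W by feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def zero_shot_predict(x, W, class_attributes):
    """Return the class whose attribute vector is closest to the projection of x."""
    s = project(W, x)
    return max(class_attributes, key=lambda c: cosine(s, class_attributes[c]))


# Hypothetical toy setup: 3 visual features mapped to 2 attributes
# ("striped", "four_legged"). W is assumed to have been learned on seen classes.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]

# Semantic representation of classes that had no training images.
unseen_classes = {"zebra": [1.0, 1.0], "horse": [0.0, 1.0]}

print(zero_shot_predict([0.9, 0.8, 0.1], W, unseen_classes))  # zebra
print(zero_shot_predict([0.1, 0.7, 0.2], W, unseen_classes))  # horse
```

The nearest-signature rule with cosine similarity is only one possible choice. In practice the map would be learned from seen-class data and the class descriptions might come from attributes, word vectors or text.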

Funding

This research was funded by the National Natural Science Foundation of China (Nos. 51875266 and U1904194).

Author information

Contributions

Conceptualization, X.S.; data curation, X.S., H.S.; formal analysis, X.S.; funding acquisition, J.G., H.S.; investigation, X.S., H.S.; methodology, X.S.; resources, X.S., H.S.; software, X.S., H.S.; supervision, J.G., H.S.; writing—original draft, X.S.; writing—review & editing, X.S., H.S.

Corresponding authors

Correspondence to Jinan Gu or Hongying Sun.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Cite this article

Sun, X., Gu, J. & Sun, H. Research progress of zero-shot learning. Appl Intell 51, 3600–3614 (2021). https://doi.org/10.1007/s10489-020-02075-7

  • DOI: https://doi.org/10.1007/s10489-020-02075-7