Zero-shot learning and hashing with binary visual similes

  • Haofeng Zhang
  • Yang Long
  • Ling Shao


Conventional zero-shot learning methods usually learn mapping functions that project image features into a semantic embedding space, in which the nearest neighbors are found among predefined attributes. These attributes, which cover both seen and unseen classes, are often annotated by experts with high-dimensional real values, which requires considerable human labor. In this paper, we propose a simple but effective method to reduce the annotation work. In our strategy, only the unseen classes need to be annotated, each with several binary codes, which requires only about one percent of the original annotation work. In addition, we design a Visual Similes Annotation System (ViSAS) to annotate the unseen classes, build both linear and deep mapping models, and test them on four popular datasets. The experimental results show that our method can outperform state-of-the-art methods in most circumstances.
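
The nearest-neighbor step described above can be illustrated with a minimal sketch. It assumes that some mapping model has already projected an image feature into a score vector of the same length as the binary class codes; the class names, codes, threshold, and function names below are hypothetical and are not taken from the paper, and the mapping model itself is not shown.

```python
from typing import Dict, List, Sequence


def binarize(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn a real-valued embedding into a binary code (threshold is an assumption)."""
    return [1 if s >= threshold else 0 for s in scores]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_unseen_class(
    scores: Sequence[float],
    class_codes: Dict[str, List[int]],
    threshold: float = 0.5,
) -> str:
    """Return the unseen class whose binary annotation is closest in Hamming distance."""
    code = binarize(scores, threshold)
    return min(class_codes, key=lambda name: hamming(code, class_codes[name]))


# Hypothetical binary annotations for two unseen classes (toy data).
UNSEEN_CODES = {
    "zebra": [1, 0, 1, 0],
    "whale": [0, 1, 0, 1],
}

# Hypothetical output of a mapping model for one test image.
predicted = nearest_unseen_class([0.9, 0.2, 0.7, 0.1], UNSEEN_CODES)
print(predicted)
```

Because the annotations are binary, the comparison reduces to a Hamming distance, which is also what makes the same codes usable for hashing-style retrieval.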


Zero-shot learning, Zero-shot hashing, Visual similes, Binary annotation




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
  2. Open Laboratory, School of Computing, Newcastle University, Newcastle upon Tyne, UK
  3. Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, United Arab Emirates
