Is Attribute-Based Zero-Shot Learning an Ill-Posed Strategy?

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9851)


One transfer learning approach that has gained wide popularity lately is attribute-based zero-shot learning. Its goal is to learn novel classes that were never seen during the training stage. The classical route toward this goal is to incorporate prior knowledge, in the form of a semantic embedding of classes, and to learn to predict classes indirectly via their semantic attributes. Despite the amount of research devoted to this subject, no known algorithm has reported a predictive accuracy exceeding that of supervised learning with very few training examples. For instance, the direct attribute prediction (DAP) algorithm, which forms a standard baseline for the task, is known to be only as accurate as supervised learning trained with as few as two examples from each hidden class on some popular benchmark datasets. In this paper, we argue that this lack of significant results in the literature is not a coincidence: attribute-based zero-shot learning is fundamentally an ill-posed strategy. The key insight is that the mechanical task of predicting an attribute is, in fact, quite different from the epistemological task of learning the "correct meaning" of the attribute itself. In more precise mathematical terms, attribute-based zero-shot learning is equivalent to the mirage goal of learning with respect to one distribution of instances in the hope of being able to predict with respect to any arbitrary distribution. We demonstrate this overlooked fact on synthetic and real datasets. The data and software related to this paper are available at
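The distribution-mismatch argument can be illustrated with a small synthetic sketch (not taken from the paper; the class geometry and the "striped" attribute below are invented for illustration). A linear attribute predictor, a stand-in for DAP's per-attribute classifiers, is trained only on seen classes, where the attribute happens to correlate with the region of feature space each class occupies. On an unseen class that breaks this correlation, the same predictor degrades to near chance, even though it was highly accurate on the training distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two *seen* classes. The binary attribute "striped" is perfectly
# correlated with the region of feature space each class occupies.
seen_striped = rng.normal(loc=[+2.0, +2.0], scale=1.0, size=(n, 2))  # striped = 1
seen_plain = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(n, 2))    # striped = 0
X = np.vstack([seen_striped, seen_plain])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Linear least-squares attribute predictor (a simple stand-in for a
# per-attribute classifier as used in DAP).
Xb = np.hstack([X, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

def predict_striped(Z):
    Zb = np.hstack([Z, np.ones((len(Z), 1))])
    return (Zb @ w > 0.5).astype(float)

# Near-perfect on the training distribution.
print("attribute accuracy on seen classes:", (predict_striped(X) == y).mean())

# An *unseen* class that is striped but occupies a new region of feature
# space, breaking the correlation the classifier exploited.
unseen = rng.normal(loc=[-2.0, +2.0], scale=1.0, size=(n, 2))
print("attribute accuracy on unseen class:", predict_striped(unseen).mean())
```

The classifier predicts the attribute via a proxy that holds only under the seen-class distribution; nothing in the training objective constrains its behavior on the unseen region, which is the sense in which the strategy is ill-posed.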


Keywords: Zero-shot learning · Attribute-based classification · Multi-label classification



Research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST) and the Saudi Arabian Oil Company (Saudi Aramco).


References

  1. Abu-Mostafa, Y.S., Magdon-Ismail, M., Lin, H.T.: Learning from Data (2012)
  2. Alabdulmohsin, I.: Algorithmic stability and uniform generalization. In: NIPS, pp. 19–27. Curran Associates, Inc. (2015)
  3. Ben-David, S., Blitzer, J., Crammer, K., Pereira, F.: Analysis of representations for domain adaptation. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19, pp. 137–144. MIT Press, Cambridge (2006)
  4. Boser, B.E., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Fifth Annual Workshop on Computational Learning Theory, pp. 144–152 (1992)
  5. Bousquet, O., Boucheron, S., Lugosi, G.: Introduction to statistical learning theory. In: Bousquet, O., von Luxburg, U., Rätsch, G. (eds.) Machine Learning 2003. LNCS (LNAI), vol. 3176, pp. 169–207. Springer, Heidelberg (2004)
  6. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
  7. Dinu, G., Baroni, M.: Improving zero-shot learning by mitigating the hubness problem. In: ICLR Workshop Track (2015). arXiv:1412.6568
  8. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. JMLR 9, 1871–1874 (2008)
  9. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR, pp. 1778–1785. IEEE (2009)
  10. Fei-Fei, L., Fergus, R., Perona, P.: One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 594–611 (2006)
  11. Haehl, V., Vardaxis, V., Ulrich, B.: Learning to cruise: Bernstein's theory applied to skill acquisition during infancy. Hum. Mov. Sci. 19(5), 685–715 (2000)
  12. Jayaraman, D., Grauman, K.: Zero-shot recognition with unreliable attributes. In: NIPS, pp. 3464–3472 (2014)
  13. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp. 951–958. IEEE (2009)
  14. Lampert, C.H., Nickisch, H., Harmeling, S.: Attribute-based classification for zero-shot visual object categorization. IEEE Trans. Pattern Anal. Mach. Intell. 36(3), 453–465 (2014)
  15. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: CVPR, pp. 3337–3344. IEEE (2011)
  16. Rader, N., Bausano, M., Richards, J.E.: On the nature of the visual-cliff-avoidance response in human infants. Child Dev. 51(1), 61–68 (1980)
  17. Palatucci, M., Pomerleau, D., Hinton, G.E., Mitchell, T.M.: Zero-shot learning with semantic output codes. In: NIPS, pp. 1410–1418 (2009)
  18. Romera-Paredes, B., Torr, P.: An embarrassingly simple approach to zero-shot learning. In: ICML, pp. 2152–2161 (2015)
  19. Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)
  20. Shigeto, Y., Suzuki, I., Hara, K., Shimbo, M., Matsumoto, Y.: Ridge regression, hubness, and zero-shot learning. In: Appice, A., Rodrigues, P.P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds.) ECML PKDD 2015. LNCS (LNAI), vol. 9284, pp. 135–151. Springer, Heidelberg (2015). doi:10.1007/978-3-319-23528-8_9
  21. Socher, R., Ganjoo, M., Manning, C.D., Ng, A.: Zero-shot learning through cross-modal transfer. In: NIPS, pp. 935–943 (2013)
  22. Thrun, S., Mitchell, T.M.: Lifelong robot learning. Rob. Auton. Syst. 15, 25–46 (1995)
  23. Vapnik, V.N.: An overview of statistical learning theory. IEEE Trans. Neural Netw. 10(5), 988–999 (1999)

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  2. Facebook Artificial Intelligence Research (FAIR), Menlo Park, USA
