Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost

  • Thomas Mensink
  • Jakob Verbeek
  • Florent Perronnin
  • Gabriela Csurka
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7573)


We are interested in large-scale image classification, and especially in the setting where images corresponding to new or existing classes are continuously added to the training set. Our goal is to devise classifiers that can incorporate such images and classes on the fly at (near) zero cost. We cast this problem as one of learning a metric that is shared across all classes, and explore k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers. We learn metrics on the ImageNet 2010 challenge dataset, which contains more than 1.2M training images of 1K classes. Surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier and performs comparably to linear SVMs. We also study generalization performance, among other ways by applying the learned metric to the ImageNet-10K dataset, where we obtain competitive performance. Finally, we explore zero-shot classification and show how the zero-shot model can be combined very effectively with small training datasets.
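To make the "near-zero cost" idea concrete, the following is a minimal sketch of a nearest class mean classifier under a fixed linear metric. It is an illustration only, not the authors' implementation: the class name `NearestClassMean`, the helper `project`, the method names, and the toy 2-D features are all hypothetical, and the projection matrix `W` is assumed to have been learned beforehand (the metric learning step itself is not shown). The point it illustrates is that adding images or a whole new class only updates a per-class running sum; nothing is retrained.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def project(W: Sequence[Vector], x: Vector) -> List[float]:
    """Apply the linear map W (one row per output dimension) to x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


class NearestClassMean:
    """NCM classifier under a shared linear metric W (assumed already learned)."""

    def __init__(self, W: Sequence[Vector]) -> None:
        self.W = W
        self.sums: Dict[str, List[float]] = {}
        self.counts: Dict[str, int] = defaultdict(int)

    def add_images(self, label: str, features: Sequence[Vector]) -> None:
        """Add images to a new or existing class by updating its running sum."""
        for x in features:
            if label not in self.sums:
                self.sums[label] = [0.0] * len(x)
            self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
            self.counts[label] += 1

    def mean(self, label: str) -> List[float]:
        """Current mean feature vector of a class."""
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, x: Vector) -> str:
        """Return the class whose projected mean is closest to the projected x."""
        px = project(self.W, x)

        def squared_distance(label: str) -> float:
            pm = project(self.W, self.mean(label))
            return sum((a - b) ** 2 for a, b in zip(px, pm))

        return min(self.sums, key=squared_distance)


# Toy usage with an identity metric (plain Euclidean distance).
W = [[1.0, 0.0], [0.0, 1.0]]
ncm = NearestClassMean(W)
ncm.add_images("cat", [[0.0, 0.0], [0.0, 2.0]])
ncm.add_images("dog", [[4.0, 4.0], [6.0, 4.0]])
first = ncm.predict([1.0, 1.0])

# A new class is added on the fly: only its mean is computed.
ncm.add_images("bird", [[10.0, 0.0]])
second = ncm.predict([9.0, 1.0])
print(first, second)
```

In this sketch the metric is shared by all classes, so a class unseen when `W` was fixed can still be classified as soon as at least one image of it is available. For clarity the class mean is re-projected on every prediction; a practical version would likely cache the projected means.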


Keywords (machine-generated, not supplied by the authors): Training Image, Query Image, Reference Class, Transfer Learning, Large Scale Image



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Thomas Mensink (1, 2)
  • Jakob Verbeek (1)
  • Florent Perronnin (2)
  • Gabriela Csurka (2)
  1. LEAR, INRIA Grenoble, Montbonnot, France
  2. TVPA, Xerox Research Centre Europe, Meylan, France
