Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets

  • Chapter

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Many real-life large-scale datasets are open-ended and dynamic: new images are continuously added to existing classes, new classes appear over time, and the semantics of existing classes might evolve too. Therefore, we study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers, the k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers. Since the performance of distance-based classifiers depends heavily on the distance function used, we cast the problem as one of learning a low-rank metric that is shared across all classes. For the NCM classifier, we introduce a new metric learning approach, and we also introduce an extension to allow for richer class representations.

Experiments on the ImageNet 2010 challenge dataset, which contains over one million training images of one thousand classes, show that, surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier. Moreover, the NCM performance is comparable to that of linear SVMs, which obtain current state-of-the-art performance. We also study experimentally how well the learned metrics generalize to classes that were not used to learn them. Using a metric learned on 1,000 classes, we show results for the ImageNet-10K dataset, which contains 10,000 classes, and obtain performance that is competitive with the current state-of-the-art, while being orders of magnitude faster.

This chapter is largely based on previously published work [30, 31].
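
The idea of an NCM classifier that absorbs new images and new classes cheaply can be illustrated with a minimal sketch. It assumes a projection matrix W has already been obtained somehow; the metric learning procedure of the chapter is not reproduced here. The class name NearestClassMean and its methods are hypothetical names chosen for this illustration.

```python
import numpy as np


class NearestClassMean:
    """NCM classifier using squared Euclidean distance after projecting with W.

    W is assumed to be given (shape d x D, with d <= D); learning W is out of scope.
    """

    def __init__(self, W):
        self.W = np.asarray(W, dtype=float)
        self.sums = {}    # label -> running sum of projected features
        self.counts = {}  # label -> number of images seen for that label

    def add_images(self, label, X):
        """Fold the rows of X into the mean of `label`, creating the class if it is new."""
        Z = np.atleast_2d(X) @ self.W.T
        self.sums[label] = self.sums.get(label, 0.0) + Z.sum(axis=0)
        self.counts[label] = self.counts.get(label, 0) + Z.shape[0]

    def predict(self, X):
        """Return, for each row of X, the label of the closest projected class mean."""
        labels = list(self.sums)
        means = np.stack([self.sums[c] / self.counts[c] for c in labels])
        Z = np.atleast_2d(X) @ self.W.T
        # Pairwise squared distances between projected queries and class means.
        d2 = ((Z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        return [labels[i] for i in d2.argmin(axis=1)]


# Toy usage: a fixed 2 x 3 projection that simply drops the third feature.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
clf = NearestClassMean(W)
clf.add_images("a", np.array([[0.0, 0.0, 9.0], [0.0, 2.0, -9.0]]))
clf.add_images("b", np.array([[4.0, 4.0, 0.0]]))
clf.add_images("c", np.array([[0.0, 10.0, 0.0]]))  # a new class arrives later
predictions = clf.predict(np.array([[0.0, 9.0, 0.0], [4.0, 3.0, 0.0]]))
```

Because the projection is linear, the mean of the projected features equals the projection of the mean, so adding an image or a class only updates one running sum and one count while W stays fixed.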

Notes

  1. Strictly speaking the covariance matrix is not properly defined as the low-rank matrix $W^\top W$ is non-invertible.

  2. See http://www.image-net.org/challenges/LSVRC/2010/index.
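
The first note can be unpacked with a short derivation. This is a sketch under stated assumptions: the symbols $x$, $\mu_c$, $d$ and $D$, and the shape $W \in \mathbb{R}^{d \times D}$ with $d < D$, are introduced here for illustration and do not appear on this page.

```latex
\begin{align}
d_W(x, \mu_c) &= \lVert W x - W \mu_c \rVert_2^2
               = (x - \mu_c)^\top W^\top W \, (x - \mu_c), \\
\operatorname{rank}(W^\top W) &\le d < D .
\end{align}
```

Reading $d_W$ as a Mahalanobis distance would require a covariance $\Sigma$ with $\Sigma^{-1} = W^\top W$. Since $W^\top W$ is a $D \times D$ matrix of rank at most $d$, it is singular, so no such $\Sigma$ exists in the full feature space.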

References

  1. Bai B, Weston J, Grangier D, Collobert R, Qi Y, Sadamasa K, Chapelle O, Weinberger K (2010) Learning to rank with (a lot of) word features. Inf Retr 13(3):291–314 (special issue on learning to rank)

  2. Bengio S, Weston J, Grangier D (2011) Label embedding trees for large multi-class tasks. In: NIPS

  3. Bottou L (2010) Large-scale machine learning with stochastic gradient descent. In: COMPSTAT

  4. Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge

  5. Chai J, Liu H, Chen B, Bao Z (2010) Large margin nearest local mean classifier. Signal Process 90(1):236–248

  6. Chechik G, Sharma V, Shalit U, Bengio S (2010) Large scale online learning of image similarity through ranking. J Mach Learn Res 11:1109–1135

  7. Clinchant S, Csurka G, Perronnin F, Renders J-M (2007) XRCE’s participation to ImagEval. In: ImagEval workshop at CIVR

  8. Csurka G, Dance C, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: ECCV int workshop on stat learning in computer vision

  9. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: CVPR

  10. Deng J, Berg A, Li K, Fei-Fei L (2010) What does classifying more than 10,000 image categories tell us? In: ECCV

  11. Fei-Fei L, Fergus R, Perona P (2006) One-shot learning of object categories. IEEE Trans Pattern Anal Mach Intell 28(4):594–611

  12. Gao T, Koller D (2011) Discriminative learning of relaxed hierarchy for large-scale visual recognition. In: ICCV

  13. Gauvain J-L, Lee C-H (1994) Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans Speech Audio Process 2(2):291–298

  14. Globerson A, Roweis S (2006) Metric learning by collapsing classes. In: NIPS

  15. Goldberger J, Roweis S, Hinton G, Salakhutdinov R (2005) Neighbourhood component analysis. In: NIPS

  16. Gordo A, Rodríguez J, Perronnin F, Valveny E (2012) Leveraging category-level labels for instance-level image retrieval. In: CVPR

  17. Gray R, Neuhoff D (1998) Quantization. IEEE Trans Inf Theory 44(6):2325–2383

  18. Guillaumin M, Mensink T, Verbeek J, Schmid C (2009) TagProp: discriminative metric learning in nearest neighbor models for image auto-annotation. In: ICCV

  19. Guillaumin M, Verbeek J, Schmid C (2009) Is that you? Metric learning approaches for face identification. In: ICCV

  20. Jégou H, Douze M, Schmid C (2008) Hamming embedding and weak geometric consistency for large scale image search. In: ECCV

  21. Jégou H, Douze M, Schmid C (2011) Product quantization for nearest neighbor search. IEEE Trans Pattern Anal Mach Intell 33(1):117–128

  22. Jégou H, Perronnin F, Douze M, Sánchez J, Pérez P, Schmid C (2012) Aggregating local image descriptors into compact codes. IEEE Trans Pattern Anal Mach Intell 34(9):1704–1716

  23. Köstinger M, Hirzer M, Wohlhart P, Roth P, Bischof H (2012) Large scale metric learning from equivalence constraints. In: CVPR

  24. Lampert C, Nickisch H, Harmeling S (2009) Learning to detect unseen object classes by between-class attribute transfer. In: CVPR

  25. Larochelle H, Erhan D, Bengio Y (2008) Zero-data learning of new tasks. In: AAAI conference on artificial intelligence

  26. Le Q, Ranzato M, Monga R, Devin M, Chen K, Corrado G, Dean J, Ng A (2012) Building high-level features using large scale unsupervised learning. In: ICML

  27. Lin Y, Lv F, Zhu S, Yang M, Cour T, Yu K, Cao L, Huang T (2011) Large-scale image classification: fast feature extraction and SVM training. In: CVPR

  28. Lowe D (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  29. Lucchi A, Weston J (2012) Joint image and word sense discrimination for image retrieval. In: ECCV

  30. Mensink T, Verbeek J, Perronnin F, Csurka G (2012) Metric learning for large scale image classification: generalizing to new classes at near-zero cost. In: ECCV

  31. Mensink T, Verbeek J, Perronnin F, Csurka G (2013) Distance-based image classification: generalizing to new classes at near-zero cost. IEEE Trans Pattern Anal Mach Intell (to appear)

  32. Nistér D, Stewénius H (2006) Scalable recognition with a vocabulary tree. In: CVPR

  33. Nowak E, Jurie F (2007) Learning visual similarity measures for comparing never seen objects. In: CVPR

  34. Parameswaran S, Weinberger KQ (2010) Large margin multi-task metric learning. In: NIPS

  35. Perronnin F, Sánchez J, Mensink T (2010) Improving the Fisher kernel for large-scale image classification. In: ECCV

  36. Perronnin F, Akata Z, Harchaoui Z, Schmid C (2012) Towards good practice in large-scale learning for image classification. In: CVPR

  37. Rohrbach M, Stark M, Schiele B (2011) Evaluating knowledge transfer and zero-shot learning in a large-scale setting. In: CVPR

  38. Saenko K, Kulis B, Fritz M, Darrell T (2010) Adapting visual category models to new domains. In: ECCV

  39. Sánchez J, Perronnin F (2011) High-dimensional signature compression for large-scale image classification. In: CVPR

  40. Tommasi T, Caputo B (2009) The more you know, the less you learn: from knowledge transfer to one-shot learning of object categories. In: BMVC

  41. Veenman C, Tax D (2005) LESS: a model-based classifier for sparse subspaces. IEEE Trans Pattern Anal Mach Intell 27(9):1496–1500

  42. Webb AR (2002) Statistical pattern recognition. Wiley, New York

  43. Weinberger KQ, Chapelle O (2009) Large margin taxonomy embedding for document categorization. In: NIPS

  44. Weinberger K, Saul L (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10:207–244

  45. Weinberger K, Blitzer J, Saul L (2006) Distance metric learning for large margin nearest neighbor classification. In: NIPS

  46. Weston J, Bengio S, Usunier N (2011) WSABIE: scaling up to large vocabulary image annotation. In: IJCAI

  47. Zhang J, Marszałek M, Lazebnik S, Schmid C (2007) Local features and kernels for classification of texture and object categories: a comprehensive study. Int J Comput Vis 73(2):213–238

  48. Zhou X, Zhang X, Yan Z, Chang S-F, Hasegawa-Johnson M, Huang T (2008) Sift-bag kernel for video event analysis. In: ACM multimedia

Author information

Corresponding author

Correspondence to Thomas Mensink.

Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Mensink, T., Verbeek, J., Perronnin, F., Csurka, G. (2013). Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_9

  • DOI: https://doi.org/10.1007/978-1-4471-5520-1_9

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-5519-5

  • Online ISBN: 978-1-4471-5520-1

  • eBook Packages: Computer Science, Computer Science (R0)
