Abstract
Many real-life large-scale datasets are open-ended and dynamic: new images are continuously added to existing classes, new classes appear over time, and the semantics of existing classes may evolve as well. We therefore study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers: the k-nearest neighbor (k-NN) classifier and the nearest class mean (NCM) classifier. Since the performance of distance-based classifiers depends heavily on the distance function used, we cast the problem as one of learning a low-rank metric that is shared across all classes. For the NCM classifier, we introduce a new metric learning approach, together with an extension that allows for richer class representations.
Experiments on the ImageNet 2010 challenge dataset, which contains over one million training images from one thousand classes, show that, surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier. Moreover, the NCM performance is comparable to that of linear SVMs, which obtain current state-of-the-art performance. We also study experimentally how well the learned metrics generalize to classes that were not used to learn them. Using a metric learned on 1,000 classes, we show results for the ImageNet-10K dataset, which contains 10,000 classes, and obtain performance that is competitive with the current state-of-the-art while being orders of magnitude faster.
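As a rough illustration of why an NCM classifier can absorb new classes at negligible cost, the following minimal sketch assumes that a low-rank projection matrix W has already been learned and is shared by all classes. The class name `NearestClassMean`, the helper functions, and the toy data are hypothetical and are not taken from the chapter; the metric learning step itself is not shown.

```python
from math import fsum


def project(W, x):
    """Apply a linear map W (d rows, D columns, d < D) to a D-dimensional vector."""
    return [fsum(w * v for w, v in zip(row, x)) for row in W]


def class_mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [fsum(column) / n for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return fsum((p - q) ** 2 for p, q in zip(a, b))


class NearestClassMean:
    """Hypothetical NCM classifier using a fixed, shared projection W."""

    def __init__(self, W):
        self.W = W        # assumed to be learned beforehand (not shown here)
        self.means = {}   # class label -> projected class mean

    def add_class(self, label, examples):
        # Adding a class only requires its mean; W is left untouched.
        # By linearity, projecting the mean equals averaging the projections.
        self.means[label] = project(self.W, class_mean(examples))

    def predict(self, x):
        # Assign x to the class whose projected mean is closest.
        z = project(self.W, x)
        return min(self.means, key=lambda c: sq_dist(z, self.means[c]))


# Toy 2x3 projection that simply drops the third feature dimension.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

ncm = NearestClassMean(W)
ncm.add_class("cat", [[0.0, 0.0, 5.0], [2.0, 0.0, -5.0]])
ncm.add_class("dog", [[0.0, 4.0, 1.0], [0.0, 6.0, 1.0]])
first = ncm.predict([1.0, 1.0, 9.0])

# A class unseen so far is added by computing one mean; nothing is retrained.
ncm.add_class("bird", [[10.0, 10.0, 0.0]])
second = ncm.predict([9.0, 9.0, 0.0])
```

In this sketch, adding a class costs one pass over its examples plus one projection, which is the property the abstract refers to as near-zero cost. How well a fixed W serves classes it was not trained on is an empirical question that the chapter studies.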
Notes
- 1. Strictly speaking, the covariance matrix is not properly defined, as the low-rank matrix W⊤W is non-invertible.
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Mensink, T., Verbeek, J., Perronnin, F., Csurka, G. (2013). Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets. In: Farinella, G., Battiato, S., Cipolla, R. (eds) Advanced Topics in Computer Vision. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-5520-1_9
DOI: https://doi.org/10.1007/978-1-4471-5520-1_9
Publisher Name: Springer, London
Print ISBN: 978-1-4471-5519-5
Online ISBN: 978-1-4471-5520-1
eBook Packages: Computer Science (R0)