Improving Bag-of-Features for Large Scale Image Search

Jégou, Hervé; Douze, Matthijs; Schmid, Cordelia

doi:10.1007/s11263-009-0285-2

Published: 11 August 2009

Volume 87, pages 316–336 (2010)

International Journal of Computer Vision

Abstract

This article improves recent methods for large scale image search. We first analyze the bag-of-features approach in the framework of approximate nearest neighbor search. This leads us to derive a more precise representation based on Hamming embedding (HE) and weak geometric consistency constraints (WGC). HE provides binary signatures that refine the matching based on visual words. WGC filters matching descriptors that are not consistent in terms of angle and scale. HE and WGC are integrated within an inverted file and are efficiently exploited for all images in the dataset. We then introduce a graph-structured quantizer which significantly speeds up the assignment of the descriptors to visual words. A comparison with the state of the art shows the interest of our approach when high accuracy is needed.
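The role of the binary signatures can be illustrated with a minimal Python sketch. The abstract does not say how the signatures are computed, so the construction below is an assumption: each descriptor is projected with a matrix `P`, and each projected component is compared with a per-visual-word median threshold. All function and parameter names (`project`, `learn_thresholds`, `signature`, `he_match`, `max_dist`) are hypothetical and are not taken from the authors' code.

```python
from statistics import median


def project(P, x):
    """Multiply the projection matrix P (a list of rows) by descriptor x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def learn_thresholds(P, training_descriptors):
    """Per-bit thresholds for ONE visual word (assumed construction):
    the median of each projected component over that word's descriptors."""
    projected = [project(P, x) for x in training_descriptors]
    return [median(column) for column in zip(*projected)]


def signature(P, thresholds, x):
    """Binary signature packed into an int: bit i is set when the i-th
    projected component of x exceeds the i-th threshold."""
    bits = 0
    for i, (z, t) in enumerate(zip(project(P, x), thresholds)):
        if z > t:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two packed signatures."""
    return bin(a ^ b).count("1")


def he_match(word_q, sig_q, word_db, sig_db, max_dist):
    """Two descriptors match only if they share a visual word AND their
    signatures are within max_dist bits of each other."""
    return word_q == word_db and hamming(sig_q, sig_db) <= max_dist
```

In this sketch, the visual word alone decides which inverted-file list is scanned, and the Hamming test then discards descriptors that fall in the same cell but are far apart inside it. The threshold `max_dist` is a tuning parameter whose value is not given in the abstract.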

Experiments performed on three reference datasets and a dataset of one million images show that the binary signatures and the weak geometric consistency constraints significantly improve accuracy while remaining efficient. Estimating the full geometric transformation, i.e., a re-ranking step on a short-list of images, is shown to be complementary to our weak geometric consistency constraints. Our approach is shown to outperform the state of the art on the three datasets.
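A weak geometric consistency check can be sketched as a voting scheme over the angle and scale differences of matched descriptors. The Python sketch below is one way to do this under stated assumptions, not the authors' implementation: the input tuple layout, the bin counts, the clipping range `max_log_scale`, and the choice of scoring an image by the smaller of its two histogram peaks are all illustrative.

```python
import math
from collections import defaultdict


def wgc_scores(matches, n_angle_bins=8, n_scale_bins=8, max_log_scale=3.0):
    """Score database images by the consistency of their matches.

    matches: iterable of (image_id, q_angle, q_scale, db_angle, db_scale),
    angles in radians, scales strictly positive (hypothetical layout).
    Returns {image_id: score}.
    """
    angle_hist = defaultdict(lambda: [0] * n_angle_bins)
    scale_hist = defaultdict(lambda: [0] * n_scale_bins)
    two_pi = 2.0 * math.pi

    for image_id, q_angle, q_scale, db_angle, db_scale in matches:
        # Angle difference wrapped into [0, 2*pi), then quantized.
        d_angle = (db_angle - q_angle) % two_pi
        a_bin = min(int(d_angle / two_pi * n_angle_bins), n_angle_bins - 1)

        # Log-scale ratio clipped to [-max_log_scale, max_log_scale].
        d_scale = math.log2(db_scale / q_scale)
        d_scale = max(-max_log_scale, min(max_log_scale, d_scale))
        s_bin = min(
            int((d_scale + max_log_scale) / (2.0 * max_log_scale) * n_scale_bins),
            n_scale_bins - 1,
        )

        angle_hist[image_id][a_bin] += 1
        scale_hist[image_id][s_bin] += 1

    # An image scores highly only if many matches agree on BOTH a
    # rotation bin and a scale-change bin.
    return {
        image_id: min(max(angle_hist[image_id]), max(scale_hist[image_id]))
        for image_id in angle_hist
    }
```

Because each match only increments two histogram bins, a check of this kind can be applied to every image reached through the inverted file, whereas estimating a full geometric transformation is typically reserved for a short-list.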

Author information

Authors and Affiliations

INRIA Grenoble Rhône-Alpes, 655 Avenue de l’Europe, Montbonnot, Saint-Ismier Cedex, 38334, France
Hervé Jégou, Matthijs Douze & Cordelia Schmid

Corresponding author

Correspondence to Hervé Jégou.

About this article

Cite this article

Jégou, H., Douze, M. & Schmid, C. Improving Bag-of-Features for Large Scale Image Search. Int J Comput Vis 87, 316–336 (2010). https://doi.org/10.1007/s11263-009-0285-2

Received: 02 February 2009
Accepted: 28 July 2009
Published: 11 August 2009
Issue Date: May 2010
DOI: https://doi.org/10.1007/s11263-009-0285-2
