Learning to Match Images in Large-Scale Collections

  • Song Cao
  • Noah Snavely
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7583)


Many computer vision applications require computing structure and feature correspondence across a large, unorganized image collection. This is a computationally expensive process, because the graph of matching image pairs is unknown in advance, and so methods for quickly and accurately predicting which of the O(n 2) pairs of images match are critical. Image comparison methods such as bag-of-words models or global features are often used to predict similar pairs, but can be very noisy. In this paper, we propose a new image matching method that uses discriminative learning techniques—applied to training data gathered automatically during the image matching process—to gradually compute a better similarity measure for predicting whether two images in a given collection overlap. By using such a learned similarity measure, our algorithm can select image pairs that are more likely to match for performing further feature matching and geometric verification, improving the overall efficiency of the matching process. Our approach processes a set of images in an iterative manner, alternately performing pairwise feature matching and learning an improved similarity measure. Our experiments show that our learned measures can significantly improve match prediction over the standard tf-idf-weighted similarity and more recent unsupervised techniques even with small amounts of training data, and can improve the overall speed of the image matching process by more than a factor of two.


Image Retrieval Visual Word Image Pair Training Pair Large Scale Image 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Agarwal, S., Snavely, N., Simon, I., Seitz, S., Szeliski, R.: Building Rome in a day. In: ICCV (2009)Google Scholar
  2. 2.
    Frahm, J.-M., Fite-Georgel, P., Gallup, D., Johnson, T., Raguram, R., Wu, C., Jen, Y.-H., Dunn, E., Clipp, B., Lazebnik, S., Pollefeys, M.: Building Rome on a Cloudless Day. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 368–381. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  3. 3.
    Simon, I., Snavely, N., Seitz, S.: Scene summarization for online image collections. In: ICCV (2007)Google Scholar
  4. 4.
    Philbin, J., Sivic, J., Zisserman, A.: Geometric latent dirichlet allocation on a matching graph for large-scale image datasets. IJCV (2010)Google Scholar
  5. 5.
    Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: ICCV (2003)Google Scholar
  6. 6.
    Li, X., Wu, C., Zach, C., Lazebnik, S., Frahm, J.-M.: Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 427–440. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  7. 7.
    Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: CVPR (2007)Google Scholar
  8. 8.
    Jegou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: CVPR (2009)Google Scholar
  9. 9.
    Chum, O., Matas, J.: Unsupervised discovery of co-occurrence in sparse high dimensional data. In: CVPR (2010)Google Scholar
  10. 10.
    Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV (2004)Google Scholar
  11. 11.
    Zhang, J., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: a comprehensive study. IJCV (2007)Google Scholar
  12. 12.
    Naikal, N., Yang, A., Sastry, S.: Informative feature selection for object recognition via sparse PCA. In: ICCV (2011)Google Scholar
  13. 13.
    Mikulík, A., Perdoch, M., Chum, O., Matas, J.: Learning a Fine Vocabulary. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 1–14. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  14. 14.
    Schindler, G., Brown, M., Szeliski, R.: City-scale location recognition. In: CVPR (2007)Google Scholar
  15. 15.
    Knopp, J., Sivic, J., Pajdla, T.: Avoiding Confusing Features in Place Recognition. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 748–761. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  16. 16.
    Turcot, P., Lowe, D.: Better matching with fewer features: The selection of useful features in large database recognition problems. In: Workshop on Emergent Issues in Large Amounts of Visual Data, ICCV (2009)Google Scholar
  17. 17.
    Philbin, J., Isard, M., Sivic, J., Zisserman, A.: Descriptor Learning for Efficient Retrieval. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 677–691. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  18. 18.
    Xing, E., Ng, A., Jordan, M., Russell, S.: Distance metric learning with application to clustering with side-information. In: NIPS (2003)Google Scholar
  19. 19.
    Schultz, M., Joachims, T.: Learning a distance metric from relative comparisons. In: NIPS (2003)Google Scholar
  20. 20.
    Frome, A., Malik, J.: Learning distance functions for exemplar-based object recognition. In: ICCV (2007)Google Scholar
  21. 21.
    Chechik, G., Sharma, V., Shalit, U., Bengio, S.: An online algorithm for large scale image similarity learning. In: NIPS (2009)Google Scholar
  22. 22.
    Bai, B., Weston, J., Grangier, D., Collobert, R., Sadamasa, K., Qi, Y., Chapelle, O., Weinberger, K.: Supervised semantic indexing. In: CIKM (2009)Google Scholar
  23. 23.
    Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: CVPR (2006)Google Scholar
  24. 24.
    Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Lost in quantization: Improving particular object retrieval in large scale image databases. In: CVPR (2008)Google Scholar
  25. 25.
    Chum, O., Philbin, J., Sivic, J., Isard, M., Zisserman, A.: Total recall: Automatic query expansion with a generative feature model for object retrieval. In: ICCV (2007)Google Scholar
  26. 26.
    Jégou, H., Douze, M., Schmid, C.: Improving bag-of-features for large scale image search. IJCV 87, 316–336 (2010)CrossRefGoogle Scholar
  27. 27.
    Frome, A., Singer, Y., Sha, F., Malik, J.: Learning globally-consistent local distance functions for shape-based image retrieval and classification. In: ICCV (2007)Google Scholar
  28. 28.
    Malisiewicz, T., Gupta, A., Efros, A.A.: Ensemble of exemplar-SVMs for object detection and beyond. In: ICCV (2011)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Song Cao
    • 1
  • Noah Snavely
    • 1
  1. 1.Cornell UniversityIthacaUSA

Personalised recommendations