Image Retrieval for Online Browsing in Large Image Collections

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8199)


Two new methods for large-scale image retrieval are proposed, showing that the classical ranking of images by similarity addresses only one of several possible user requirements. The novel retrieval methods add zoom-in and zoom-out capabilities, answering the "What is this?" and "Where is this?" questions.
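The classical similarity ranking referred to above can be illustrated as cosine similarity between tf-idf weighted bag-of-words histograms. This is a minimal sketch under common assumptions; the function names, weighting, and data layout are illustrative, not the paper's exact pipeline.

```python
import numpy as np

def bow_rank(query_hist, db_hists, idf):
    """Rank database images against a query by cosine similarity of
    tf-idf weighted visual-word histograms (illustrative sketch).

    query_hist : 1-D array of visual-word counts for the query image
    db_hists   : list of 1-D count arrays, one per database image
    idf        : 1-D array of inverse-document-frequency weights
    """
    q = query_hist * idf
    q = q / np.linalg.norm(q)           # L2-normalize the query vector
    sims = []
    for i, h in enumerate(db_hists):
        d = h * idf
        d = d / np.linalg.norm(d)       # L2-normalize the database vector
        sims.append((float(q @ d), i))  # cosine similarity, image index
    return sorted(sims, reverse=True)   # best-matching images first
```

An image identical to the query (up to scaling of its histogram) scores 1.0 and is ranked first; images sharing no visual words score 0.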

The functionality is obtained by modifying the scoring and ranking functions of a standard bag-of-words image retrieval pipeline. We show the importance of document-at-a-time (DAAT) scoring and of query expansion for the recall of zoomed images.
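Document-at-a-time (DAAT) scoring walks the posting lists of all query visual words in parallel, so each image's full score is accumulated before moving on to the next image id. The following is a minimal sketch assuming an inverted index mapping visual words to sorted (image id, weight) postings; the index layout and weights are assumptions, not the paper's implementation.

```python
def daat_score(query_words, inverted_index):
    """Document-at-a-time scoring over an inverted file (sketch).

    inverted_index : visual word -> list of (image_id, weight) pairs,
                     each list sorted by image_id
    Returns a dict image_id -> accumulated score.
    """
    postings = [inverted_index[w] for w in query_words if w in inverted_index]
    cursors = [0] * len(postings)       # one read position per posting list
    scores = {}
    while True:
        # Smallest unprocessed image id across all posting lists.
        frontier = [p[c][0] for p, c in zip(postings, cursors) if c < len(p)]
        if not frontier:
            break
        doc = min(frontier)
        total = 0.0
        for i, p in enumerate(postings):
            c = cursors[i]
            if c < len(p) and p[c][0] == doc:
                total += p[c][1]        # accumulate this word's contribution
                cursors[i] += 1         # advance past the scored posting
        scores[doc] = total             # doc is fully scored before moving on
    return scores
```

In contrast to term-at-a-time scoring, each image's complete score is available the moment its id is passed, which is what makes early ranking decisions (and the zoom-aware scoring modifications) possible.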

The proposed methods were tested on a standard large annotated image dataset, together with images of the Sagrada Familia and 100,000 confuser images downloaded from Flickr. For completeness, we present in detail the components of state-of-the-art image retrieval pipelines. Finally, open problems related to zoom-in and zoom-out queries are discussed.


Keywords: Image Retrieval · Visual Word · Ranking Function · Query Image · Query Expansion





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

1. Center of Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
