Region-of-Interest Retrieval in Large Image Datasets with Voronoi VLAD

  • Conference paper
Computer Vision Systems (ICVS 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9163)

Abstract

We investigate the problem of visual-query based retrieval from large image datasets when the visual queries comprise arbitrary regions of interest (ROI) rather than entire images. Our proposal is a compact image descriptor, termed the Voronoi VLAD (VVLAD), that combines the vector of locally aggregated descriptors (VLAD) of Jégou et al. with a multi-level, Voronoi-based spatial partitioning of each dataset image. The proposed multi-level Voronoi partitioning uses a spatial hierarchical K-means over interest-point locations and computes a VLAD over each cell. To reduce the matching complexity when handling very large datasets, we propose the following modifications. First, we utilize the tree structure of the spatial hierarchical K-means to perform a top-to-bottom pruning for local similarity maxima, rather than exhaustively matching against all cells (Fast-VVLAD). Second, we propose to aggregate VLADs of adjacent Voronoi cells in order to reduce the overall VVLAD storage requirement per image. Finally, we propose a new image similarity score for Fast-VVLAD that combines relevant information from all partition levels into a single similarity measure. For a range of ROI queries in two standard datasets, Fast-VVLAD achieves comparable or higher mean Average Precision than the state-of-the-art Multi-VLAD framework while offering more than two-fold acceleration.
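
As a rough illustration of the descriptor described in the abstract, the following sketch builds one VLAD per cell of a multi-level partition obtained by running K-means on interest-point locations. It is not the authors' implementation: the function names (`kmeans`, `vlad`, `voronoi_vlad`), the branching factor, the number of levels, the plain L2 normalisation, and the random toy data are all assumptions made for illustration. The pruning, cell aggregation, and similarity score of Fast-VVLAD are not covered.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means (illustrative helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)

    def assign(c):
        # Squared distance from every point to every centroid.
        d2 = ((points[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(iters):
        labels = assign(centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, assign(centroids)

def vlad(descriptors, codebook):
    """Sum residuals to the nearest visual word, flatten, and L2-normalise."""
    k, d = codebook.shape
    v = np.zeros((k, d))
    if len(descriptors):
        d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        for j in range(k):
            sel = descriptors[nearest == j]
            if len(sel):
                v[j] = (sel - codebook[j]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def voronoi_vlad(locations, descriptors, codebook, branch=2, levels=2):
    """Return {cell_path: VLAD}. () is the whole image, (i,) its i-th cell, and so on.

    Assigning each interest point to its nearest K-means centroid is the
    Voronoi partition induced by those centroids.
    """
    cells = {}

    def recurse(idx, path, depth):
        cells[path] = vlad(descriptors[idx], codebook)
        if depth == levels or len(idx) < branch:
            return  # deepest level reached, or too few points to split
        _, labels = kmeans(locations[idx], branch)
        for c in range(branch):
            recurse(idx[labels == c], path + (c,), depth + 1)

    recurse(np.arange(len(locations)), (), 0)
    return cells

# Toy data (assumption): 200 interest points, 8-D descriptors, 4 visual words.
rng = np.random.default_rng(1)
locations = rng.uniform(0, 100, size=(200, 2))
descriptors = rng.normal(size=(200, 8))
codebook = rng.normal(size=(4, 8))
cells = voronoi_vlad(locations, descriptors, codebook)
print(sorted(cells))
```

Because every cell descriptor is unit-normalised, a query ROI's VLAD can be compared with each cell by a dot product. Doing so for every cell is the exhaustive matching that the abstract's tree-based pruning is meant to avoid.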

This work was funded in part by Innovate UK, project REVQUAL (101855), and EPSRC (Industrial PhD CASE award, co-sponsored by BAFTA).

References

  1. Arandjelovic, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2911–2918 (2012)

  2. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. of Comput. Vis. 60(2), 91–110 (2004)

  3. Lazebnik, S., et al.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2169–2178 (2006)

  4. Philbin, J., et al.: Lost in quantization: improving particular object retrieval in large scale image databases. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)

  5. Chum, O., et al.: Total recall: automatic query expansion with a generative feature model for object retrieval. In: IEEE International Conference on Computer Vision, pp. 1–8 (2007)

  6. Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3304–3311 (2010)

  7. Perronnin, F., Dance, C.: Fisher kernels on visual vocabularies for image categorization. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

  8. Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

  9. Arandjelovic, R., Zisserman, A.: All about VLAD. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1578–1585 (2013)

  10. Jégou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 304–317. Springer, Heidelberg (2008)

  11. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-264–II-271 (2003)

  12. Perronnin, F., Liu, Y., Sánchez, J., Poirier, H.: Large-scale image retrieval with compressed fisher vectors. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3384–3391 (2010)

  13. Jégou, H., Perronnin, F., Douze, M., Sánchez, J., Pérez, P., Schmid, C.: Aggregating local image descriptors into compact codes. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 9, pp. 1704–1716 (2012)

  14. Jégou, H., Chum, O.: Negative evidences and co-occurences in image retrieval: the benefit of pca and whitening. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 774–787. Springer, Heidelberg (2012)

  15. Chum, O., Matas, J.: Unsupervised discovery of co-occurrence in sparse high dimensional data. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3416–3423 (2010)

  16. Mikolajczyk, K., et al.: A comparison of affine region detectors. Int. J. of Comput. Vis. 65(1–2), 43–72 (2005)

  17. Mikolajczyk, K., Schmid, C.: An affine invariant interest point detector. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002, Part I. LNCS, vol. 2350, pp. 128–142. Springer, Heidelberg (2002)

Author information

Correspondence to Yiannis Andreopoulos.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chadha, A., Andreopoulos, Y. (2015). Region-of-Interest Retrieval in Large Image Datasets with Voronoi VLAD. In: Nalpantidis, L., Krüger, V., Eklundh, JO., Gasteratos, A. (eds) Computer Vision Systems. ICVS 2015. Lecture Notes in Computer Science, vol 9163. Springer, Cham. https://doi.org/10.1007/978-3-319-20904-3_21

  • DOI: https://doi.org/10.1007/978-3-319-20904-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20903-6

  • Online ISBN: 978-3-319-20904-3

  • eBook Packages: Computer Science, Computer Science (R0)
