Large Scale Image Retrieval Using Vector of Locally Aggregated Descriptors

  • Giuseppe Amato
  • Paolo Bolettieri
  • Fabrizio Falchi
  • Claudio Gennaro
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8199)


The vector of locally aggregated descriptors (VLAD) is a promising approach to image search on a very large scale. This representation was proposed to overcome the quantization error that affects the Bag-of-Words (BoW) representation. However, text search engines have not yet been used to index VLAD, because it is not a sparse vector of occurrence counts. For this reason, BoW is still the most widely adopted method for finding images that depict the same object or location, given a query image and a large image dataset.

In this paper, we propose to enable the inverted files of standard text search engines to exploit the VLAD representation in large-scale image search scenarios. We show that using inverted files with VLAD significantly outperforms BoW in both efficiency and effectiveness on the same hardware and software infrastructure.
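To make the contrast between the two representations concrete, the following is a minimal sketch of the commonly used definitions of BoW and VLAD, not the indexing method proposed in the paper. The function names, the tiny two-word vocabulary, and the 2-D descriptors are hypothetical and chosen only for illustration; real systems use high-dimensional local features and learned vocabularies.

```python
import math

def nearest(centroids, x):
    """Index of the centroid closest to x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], x)))

def bow(descriptors, centroids):
    """Bag-of-Words: count how many descriptors are assigned to each visual word."""
    counts = [0] * len(centroids)
    for x in descriptors:
        counts[nearest(centroids, x)] += 1
    return counts

def vlad(descriptors, centroids):
    """VLAD: for each centroid, sum the residuals (x - c) of its assigned
    descriptors, concatenate the sums, then L2-normalize the result."""
    dim = len(centroids[0])
    acc = [[0.0] * dim for _ in centroids]
    for x in descriptors:
        i = nearest(centroids, x)
        for d in range(dim):
            acc[i][d] += x[d] - centroids[i][d]
    flat = [v for block in acc for v in block]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (hypothetical): two 2-D centroids and three 2-D local descriptors.
centroids = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 2.0), (9.0, 10.0)]

bow_vec = bow(descriptors, centroids)    # non-negative integer counts, one per word
vlad_vec = vlad(descriptors, centroids)  # signed real values, dim * len(centroids) entries
```

The BoW vector holds one occurrence count per visual word, which maps directly onto term frequencies in an inverted file. The VLAD vector is dense and contains signed real values, which is why it cannot be handed to a text search engine as it is.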


bag of features, bag of words, local features, compact codes, image retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Giuseppe Amato (1)
  • Paolo Bolettieri (1)
  • Fabrizio Falchi (1)
  • Claudio Gennaro (1)
  1. CNR, ISTI, Pisa, Italy
