Multimedia Systems, Volume 21, Issue 3, pp 245–254

Visual word expansion and BSIFT verification for large-scale image search

  • Wengang Zhou (Email author)
  • Houqiang Li
  • Yijuan Lu
  • Meng Wang
  • Qi Tian
Regular Paper


Recently, great advances have been made in large-scale content-based image search. Most state-of-the-art approaches are based on the bag-of-visual-words model with local features, such as SIFT, for image representation. Visual matching between images is obtained by vector quantization of local features. Feature quantization is performed either with hierarchical k-NN, which introduces severe quantization loss, or with ANN (approximate nearest neighbor) search such as a k-d tree, which is computationally inefficient. In addition, feature matching by quantization ignores the vector distance between features, which may cause many false-positive matches. In this paper, we propose constructing a supporting visual word table for all visual words by visual word expansion. Given the initial quantization result, multiple approximate nearest visual words are identified by looking them up in the supporting visual word table, which benefits retrieval recall. Moreover, we present a matching verification scheme based on the binary SIFT (BSIFT) signature. The L2 distance between original SIFT descriptors is shown to be well preserved by the Hamming distance between the corresponding binary SIFT signatures. With BSIFT verification, false-positive matches can be identified and removed effectively and efficiently, which greatly improves the precision of large-scale image search. We evaluate the proposed approach on two public datasets for large-scale image search. The experimental results demonstrate the effectiveness and efficiency of our scheme.
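The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how the supporting table is built or how a SIFT descriptor is binarized, so the brute-force nearest-word table, the median-threshold binarization, and all function names below (build_supporting_table, expand, binarize, verify) are assumptions made only for illustration.

```python
import math
from statistics import median


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_supporting_table(codebook, k):
    """For each visual word, store the indices of its k nearest other words.

    Assumption: a brute-force offline pass over the codebook; the paper's
    actual construction may differ.
    """
    table = []
    for i, word in enumerate(codebook):
        others = [j for j in range(len(codebook)) if j != i]
        others.sort(key=lambda j: l2(word, codebook[j]))
        table.append(others[:k])
    return table


def expand(initial_word, table):
    """Return the initial quantization result followed by its supporting words."""
    return [initial_word] + table[initial_word]


def binarize(descriptor):
    """Hypothetical binary signature: bit i is 1 if component i exceeds the median."""
    m = median(descriptor)
    return [1 if v > m else 0 for v in descriptor]


def hamming(sig_a, sig_b):
    """Number of positions at which two binary signatures differ."""
    return sum(x != y for x, y in zip(sig_a, sig_b))


def verify(sig_a, sig_b, threshold):
    """Keep a candidate match only if the Hamming distance is within threshold."""
    return hamming(sig_a, sig_b) <= threshold


if __name__ == "__main__":
    # Toy 2-D "codebook"; real SIFT descriptors have 128 dimensions.
    codebook = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.0, 2.0]]
    table = build_supporting_table(codebook, k=2)
    print(expand(0, table))
    print(verify(binarize([1, 5, 2, 8]), binarize([2, 6, 1, 9]), threshold=1))
```

In this sketch the table is computed once offline, so expanding a query feature at search time costs a single lookup, and verification compares short bit lists instead of full floating-point descriptors. The threshold value is a free parameter here and would need to be tuned on real data.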


Visual word expansion, Binary SIFT, Matching verification, Image search



This work was supported as follows: Dr. Li was supported in part by NSFC under contract No. 61272316; Dr. Lu in part by the Research Enhancement Program (REP), start-up funding from Texas State University, and DoD HBCU/MI grant W911NF-12-1-0057; Dr. Tian in part by ARO grant W911NF-12-1-0057, NSF IIS 1052851, Faculty Research Awards from Google, NEC Laboratories of America, and FXPAL, and the UTSA START-R award.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Wengang Zhou (1) (Email author)
  • Houqiang Li (2)
  • Yijuan Lu (3)
  • Meng Wang (4)
  • Qi Tian (1)

  1. Department of Computer Science, University of Texas at San Antonio, Texas, USA
  2. Department of EEIS, University of Science and Technology of China, Hefei, People’s Republic of China
  3. Department of Computer Science, Texas State University, Texas, USA
  4. School of Computer and Information, Hefei University of Technology, Hefei, People’s Republic of China
