Bayes pooling of visual phrases for object retrieval

Multimedia Tools and Applications

Abstract

Object retrieval remains an open problem. A promising approach is based on the matching of visual phrases. However, this approach is often corrupted by visual phrase burstiness, i.e., the repetitive occurrence of certain visual phrases. Burstiness leads to over-counting of the co-occurring visual patterns between two images and thus degrades the accuracy of image similarity measurement. In addition, existing methods are incapable of capturing the complete geometric variation between images. In this paper, we propose a novel strategy to address these two problems. Firstly, we propose a unified framework for matching geometry-constrained visual phrases. This framework makes it possible to combine the optimal geometry constraints to improve the validity of matched visual phrases. Secondly, we propose to address the problem of visual phrase burstiness from a probabilistic view. This approach effectively filters out the bursty visual phrases by explicitly modelling their distribution. Experiments on five benchmark datasets demonstrate that our method outperforms other approaches consistently and significantly.
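
The over-counting effect of burstiness described above can be illustrated with a minimal sketch. It assumes each image is reduced to a list of hypothetical visual phrase labels, and it contrasts a raw count of matching pairs with a generic square-root damping. The damping is a common heuristic used here only for illustration; it is not the probabilistic model proposed in the paper, and the function names and labels are assumptions.

```python
from collections import Counter
from math import sqrt


def raw_match_count(query_phrases, db_phrases):
    """Count every matching pair of phrases; repeated phrases multiply."""
    q, d = Counter(query_phrases), Counter(db_phrases)
    return sum(q[p] * d[p] for p in q.keys() & d.keys())


def damped_match_score(query_phrases, db_phrases):
    """Illustrative square-root damping (an assumption, not the paper's method)."""
    q, d = Counter(query_phrases), Counter(db_phrases)
    return sum(sqrt(q[p] * d[p]) for p in q.keys() & d.keys())


# Hypothetical phrase labels: "window" is bursty (e.g. a repetitive facade).
query = ["window"] * 4 + ["door", "roof"]
database = ["window"] * 5 + ["door", "tree"]

print(raw_match_count(query, database))     # 4*5 + 1*1 = 21
print(damped_match_score(query, database))  # sqrt(20) + 1
```

In this toy case the single bursty phrase contributes 20 of the 21 raw matches, which is the kind of dominance that a burstiness-aware similarity measure tries to suppress.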

Notes

  1. Source code is available at: http://jiangwh.weebly.com/

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grants 61471049, 61532018 and 61372169, and by the BUPT Excellent Ph.D. Students Foundation under Grant CX201425.

Author information

Corresponding author

Correspondence to Wenhui Jiang.

About this article

Cite this article

Jiang, W., Zhao, Z. & Su, F. Bayes pooling of visual phrases for object retrieval. Multimed Tools Appl 75, 9095–9119 (2016). https://doi.org/10.1007/s11042-015-2939-0

  • DOI: https://doi.org/10.1007/s11042-015-2939-0
