Bayes pooling of visual phrases for object retrieval

Jiang, Wenhui; Zhao, Zhicheng; Su, Fei

doi:10.1007/s11042-015-2939-0

Published: 30 September 2015

Volume 75, pages 9095–9119 (2016)

Multimedia Tools and Applications

Abstract

Object retrieval remains an open problem. A promising approach is based on matching visual phrases. However, this approach is often corrupted by visual phrase burstiness, i.e., the repetitive occurrence of certain visual phrases. Burstiness leads to over-counting of the visual patterns that co-occur in two images, and thus degrades the accuracy of image similarity measurement. In addition, existing methods cannot capture the complete geometric variation between images. In this paper, we propose a novel strategy to address both problems. First, we propose a unified framework for matching geometry-constrained visual phrases. This framework makes it possible to combine the optimal geometry constraints to improve the validity of matched visual phrases. Second, we address visual phrase burstiness from a probabilistic view. This approach effectively filters out bursty visual phrases by explicitly modelling their distribution. Experiments on five benchmark datasets demonstrate that our method outperforms other approaches consistently and significantly.
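To make the over-counting effect concrete, the following toy sketch compares a raw match count with a damped score on a hypothetical list of matched visual phrase ids. It is not the Bayes pooling method of the paper, whose details are not given in this abstract. The damping used here is a simple square-root heuristic chosen only for illustration, and the function names and the example data are assumptions.

```python
from collections import Counter
from math import sqrt


def raw_score(matched_phrase_ids):
    """Similarity as a plain count of matched visual phrases."""
    return len(matched_phrase_ids)


def burst_damped_score(matched_phrase_ids):
    """Similarity with repeated phrases damped (illustrative heuristic).

    A phrase id that matches c times contributes sqrt(c) instead of c,
    so a single repetitive pattern cannot dominate the score.
    """
    counts = Counter(matched_phrase_ids)
    return sum(sqrt(c) for c in counts.values())


# Hypothetical matches: phrase 7 is bursty (e.g. a repeated window
# pattern on a facade), phrases 3 and 12 each match once.
matches = [7] * 9 + [3, 12]

print(raw_score(matches))           # 11
print(burst_damped_score(matches))  # 5.0
```

In this example the bursty phrase accounts for 9 of the 11 raw matches, while after damping it contributes 3 of 5, which shows how repetition can inflate a count-based similarity.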

Notes

Source code is available at: http://jiangwh.weebly.com/

Acknowledgments

This work is supported by the Chinese National Natural Science Foundation under Grants 61471049, 61532018 and 61372169, and the BUPT Excellent Ph.D. Students Foundation under Grant CX201425.

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, Beijing, China
Wenhui Jiang, Zhicheng Zhao & Fei Su

Corresponding author

Correspondence to Wenhui Jiang.

About this article

Cite this article

Jiang, W., Zhao, Z. & Su, F. Bayes pooling of visual phrases for object retrieval. Multimed Tools Appl 75, 9095–9119 (2016). https://doi.org/10.1007/s11042-015-2939-0

Received: 30 November 2014
Revised: 04 August 2015
Accepted: 08 September 2015
Published: 30 September 2015
Issue Date: August 2016
DOI: https://doi.org/10.1007/s11042-015-2939-0
