
Is visual saliency useful for content-based image retrieval?

Published in: Multimedia Tools and Applications

Abstract

In the real world, people tend to focus on the distinctive objects in a scene, i.e., the salient regions (SR). Accordingly, a number of saliency detection methods have been introduced into content-based image retrieval (CBIR), which is typically built on the Bag of Words (BoW) model. These methods use the saliency map to prune keypoints or to discard keypoints that fall in the background. However, they consider neither the background of the image nor the characteristics of the dataset itself. In this paper we focus on two issues: 1) whether saliency-based keypoint pruning is useful for image retrieval on different kinds of datasets (e.g., salient, cluttered, or mixed image databases); and 2) whether the discarded background regions (non-salient regions, Non-SR) are effective for different kinds of image databases. To assess the value of visual saliency, we conduct experiments on two publicly available databases (Ukbench and Holidays). The experiments reveal that using the saliency map to filter out a small number of keypoints can clearly improve CBIR performance, and that keypoints from the background are also useful for some kinds of image datasets.
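As a rough illustration of the kind of pipeline studied here, the sketch below splits local features into SR and Non-SR sets using a thresholded saliency map. This is a minimal sketch only, assuming OpenCV's SIFT detector and a saliency map resized to the image and normalized to [0, 1]; the threshold tau and the function name are illustrative and are not the authors' exact procedure.

```python
# Minimal sketch (not the authors' exact procedure) of saliency-based keypoint
# pruning for a BoW CBIR pipeline. Assumptions: the saliency map comes from any
# saliency detector, has been resized to the image size, and is normalized to
# [0, 1]; the threshold `tau` is illustrative.
import cv2
import numpy as np

def split_keypoints_by_saliency(image_gray, saliency_map, tau=0.5):
    """Detect SIFT keypoints and split them into salient-region (SR) and
    non-salient-region (Non-SR) sets based on the saliency value at each
    keypoint location."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(image_gray, None)
    if descriptors is None:                      # no keypoints detected
        return ([], None), ([], None)

    h, w = saliency_map.shape[:2]
    sr_idx, nonsr_idx = [], []
    for i, kp in enumerate(keypoints):
        x = int(np.clip(round(kp.pt[0]), 0, w - 1))
        y = int(np.clip(round(kp.pt[1]), 0, h - 1))
        # Keypoints in regions whose saliency exceeds tau form the SR set;
        # the rest form the Non-SR (background) set studied in the paper.
        (sr_idx if saliency_map[y, x] >= tau else nonsr_idx).append(i)

    sr = ([keypoints[i] for i in sr_idx], descriptors[sr_idx])
    non_sr = ([keypoints[i] for i in nonsr_idx], descriptors[nonsr_idx])
    return sr, non_sr
```

In a standard BoW pipeline, the SR descriptors would then be quantized against the visual vocabulary, while the Non-SR descriptors can be indexed separately to test their contribution on cluttered or mixed databases.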






Acknowledgements

This work was supported by the following projects: the National Natural Science Foundation of China (No. 61571045, No. 61372148); the Beijing Natural Science Foundation (4152016); and the National Key Technology R&D Program (2014BAK08B02, 2015BAH55F03).

Author information

Corresponding author

Correspondence to Hongzhe Liu.


About this article


Cite this article

Wu, Y., Liu, H., Yuan, J. et al. Is visual saliency useful for content-based image retrieval? Multimed Tools Appl 77, 13983–14006 (2018). https://doi.org/10.1007/s11042-017-5001-6



