SalientShape: group saliency in image collections

  • Original Article
The Visual Computer

Abstract

Efficiently identifying salient objects in large image collections is essential for many applications, including image retrieval, surveillance, image annotation, and object recognition. We propose a simple, fast, and effective algorithm for locating and segmenting salient objects by analysing image collections. As a key novelty, we introduce group saliency, which achieves superior unsupervised salient object segmentation by extracting salient objects (in collections of pre-filtered images) that maximize between-image similarities and within-image distinctness. To evaluate our method, we construct a large benchmark dataset consisting of 15K images across multiple categories, with 6000+ pixel-accurate ground truth annotations for salient object regions where applicable. In all our tests, group saliency consistently outperforms state-of-the-art single-image saliency algorithms, yielding both higher precision and better recall. Our algorithm successfully handles image collections an order of magnitude larger than any existing benchmark dataset, consisting of diverse and heterogeneous images from various internet sources.
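The abstract compares methods by precision and recall against pixel-accurate ground truth. As a minimal sketch of how such scores are commonly computed for a binary segmentation mask, the Python below counts pixel-wise agreements between a predicted mask and a ground-truth mask. The function name `precision_recall`, the list-of-rows mask representation, and the toy masks are assumptions made for illustration; the exact evaluation protocol used in the paper is not described in this section.

```python
def precision_recall(predicted, ground_truth):
    """Return (precision, recall) for two equally sized binary masks.

    Masks are lists of rows; truthy entries mark salient-object pixels.
    This is an illustrative helper, not code from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1      # pixel correctly marked as salient
            elif p and not g:
                fp += 1      # pixel marked salient but is background
            elif g and not p:
                fn += 1      # salient pixel that was missed
    # Guard against empty masks to avoid division by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy 3x4 masks (hypothetical data): 4 pixels agree, 1 is a false
# positive, and 1 is a false negative.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

precision, recall = precision_recall(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes background pixels wrongly included in the segmentation, while recall penalizes object pixels that were missed, so a method that improves both is producing tighter and more complete object masks.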

Notes

  1. Metadata is the current industry standard for image retrieval, as popularized by search engines such as Google Images, Flickr, etc.

  2. http://mmcheng.net/gsal/.

Acknowledgements

We would like to thank the anonymous reviewers for their constructive comments. This research was supported by the 973 Program (2011CB302205), the 863 Program (2009AA01Z327), the Key Project of S&T (2011ZX01042-001-002), and NSFC (U0735001). Ming-Ming Cheng was funded by a Google Ph.D. fellowship, an IBM Ph.D. fellowship, and a New Ph.D. Researcher Award (Ministry of Edu., CN).

Author information

Corresponding author

Correspondence to Ming-Ming Cheng.

About this article

Cite this article

Cheng, MM., Mitra, N.J., Huang, X. et al. SalientShape: group saliency in image collections. Vis Comput 30, 443–453 (2014). https://doi.org/10.1007/s00371-013-0867-4
