Learning Complementary Saliency Priors for Foreground Object Segmentation in Complex Scenes

Abstract

Object segmentation is widely recognized as one of the most challenging problems in computer vision. A major limitation of existing methods is that most of them are vulnerable to cluttered backgrounds. Moreover, human intervention is often required to specify foreground/background priors, which restricts the use of object segmentation in real-world scenarios. To address these problems, we propose a novel approach that learns complementary saliency priors for foreground object segmentation in complex scenes. Unlike existing saliency-based segmentation approaches, we learn two complementary saliency maps that reveal the most reliable foreground and background regions. Given such priors, foreground object segmentation is formulated as a binary pixel labelling problem that can be efficiently solved using graph cuts. In this way, the confident saliency priors can be utilized to extract the most salient objects while reducing the distraction of the cluttered background. Extensive experiments show that our approach remarkably outperforms 16 state-of-the-art methods on three public image benchmarks.
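The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names (`saliency_seeds`, `unary_costs`), the mean-based thresholding rule, and the epsilon constant are all assumptions for the sake of a runnable example; the actual pairwise graph-cut optimization is omitted.

```python
import numpy as np

def saliency_seeds(s_fg, s_bg, delta=1.0):
    """Turn two complementary saliency maps into confident
    foreground/background seeds (hypothetical helper; the paper's
    exact binarization rule may differ). Pixels claimed by both
    maps stay unlabelled and are left to the graph-cut step."""
    fg = s_fg > delta * s_fg.mean()   # confident foreground prior
    bg = s_bg > delta * s_bg.mean()   # confident background prior
    ambiguous = fg & bg
    return fg & ~ambiguous, bg & ~ambiguous

def unary_costs(p_fg, eps=1e-10):
    """Negative log-likelihood data terms for the binary labelling;
    eps guards against log(0), in the spirit of the paper's notes."""
    return -np.log(p_fg + eps), -np.log(1.0 - p_fg + eps)
```

The seeds would serve as hard constraints and the unary costs as data terms in a standard min-cut/max-flow solver.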


Notes

  1. In our implementation, we add a very small positive number to the value inside every \(\log\) function to avoid yielding infinity and to ensure the problems have feasible solutions.

  2. As in many previous works, we divide images into macro-blocks, and all pixels in a block are assumed to share the same parameter. In our experiments, each block covers \(4\times 4\) pixels for an image resized to the resolution \(320\times 240\).

  3. In our implementation, we use \(\delta_\perp \cdot avg(\mathcal{S}^{+})\) and \(\delta_\perp \cdot avg(\mathcal{S}^{-})\) to perform the binarization.

  4. The two thresholds are \(\delta_s \cdot avg(\mathcal{S})\) and \(\frac{1}{\delta_s} \cdot avg(\mathcal{S})\), where \(\delta_s \in (0,1]\) is learned via experiments on the validation set, in a similar way to \(\delta_\perp\) in our approach.
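Notes 1 and 2 above can be illustrated with a short sketch. The function names, the `1e-10` constant, and the divisibility assumption are illustrative choices, not details from the paper:

```python
import numpy as np

EPS = 1e-10  # "a very small positive number", as in Note 1

def safe_log(x):
    """log with a small positive offset so zero values never yield -inf."""
    return np.log(x + EPS)

def block_average(img, block=4):
    """Average over block x block macro-blocks so that all pixels in a
    block share one parameter (Note 2). Assumes both image dimensions
    are divisible by `block`."""
    h, w = img.shape
    return img.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
```

For an image resized to \(320\times 240\) as in Note 2, `block_average` with \(4\times 4\) blocks yields a \(60\times 80\) parameter grid.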


Author information


Corresponding authors

Correspondence to Yonghong Tian or Jia Li.

Additional information

This work was supported in part by grants from the Chinese National Natural Science Foundation under contract No. 61035001, No. 61370113, and No. 61390515, and the Supervisor Award Funding for Excellent Doctoral Dissertation of Beijing (No. 20128000103).

Communicated by M. Hebert.


About this article


Cite this article

Tian, Y., Li, J., Yu, S. et al. Learning Complementary Saliency Priors for Foreground Object Segmentation in Complex Scenes. Int J Comput Vis 111, 153–170 (2015). https://doi.org/10.1007/s11263-014-0737-1


Keywords

  • Foreground object segmentation
  • Visual saliency
  • Complementary saliency map
  • Graph cuts