
Two-stage salient object detection based on prior distribution learning and saliency consistency optimization

  • Original article
  • Published:
The Visual Computer

A Publisher Correction to this article was published on 13 November 2022

This article has been updated

Abstract

Although two-stage methods have markedly improved the accuracy and robustness of saliency detection, obtaining a saliency map with clear foreground boundaries and fine structure remains challenging. In this article, we propose a novel and effective two-stage method for detecting salient objects, which comprises coarse saliency map construction and fine saliency map generation. First, we develop a prior distribution learning (PDL) algorithm to explore the mapping relationship between an input superpixel and its corresponding superpixels in various prior maps. The PDL algorithm computes, for each region in the image, weights that reflect the contribution of each prior, and can therefore provide more reliable pseudo-labels for training the subsequent learning model. Second, by learning the implicit representation between reliable samples and multiple priors, the learning model can accurately predict the saliency values of regions whose saliency is difficult to judge, yielding an instructive coarse saliency map. Third, to refine the details of the coarse saliency map, we propose a framework called saliency consistency optimization, which produces clear foreground boundaries and effectively suppresses background noise. We compare the proposed algorithm with other state-of-the-art methods on four datasets. Experimental results demonstrate the effectiveness of our approach over the comparison methods, especially other two-stage methods.


Data Availability Statement

The datasets analyzed during the current study are available from the corresponding author on reasonable request.


References

  1. Dou, P., Shen, H., Li, Z., et al.: Time series remote sensing image classification framework using combination of deep learning and multiple classifiers system. Int. J. Appl. Earth Obs. Geoinf. 103(8), 102477 (2021)


  2. Luo, L., Wang, X., Hu, S., Hu, X., Zhang, H., Liu, Y., Zhang, J.: A unified framework for interactive image segmentation via Fisher rules. Vis. Comput. 35(12), 1869–1882 (2019)


  3. Lv, G., Dong, L., Zhang, W., Xu, W.: Region-based adaptive association learning for robust image scene recognition. Vis. Comput. 1–21 (2022)

  4. Chen, X., Wang, T., Zhu, Y., Jin, L., Luo, C.: Adaptive embedding gate for attention-based scene text recognition. Neurocomputing 381, 261–271 (2020)


  5. Wang, Q., Huang, Y., Jia, W., He, X., Blumenstein, M., Lyu, S., Lu, Y.: FACLSTM: ConvLSTM with focused attention for scene text recognition. Sci. China Inf. Sci. 63(2), 1–14 (2020)


  6. Xie, J., Ge, Y., Zhang, J., Huang, S., Chen, F., Wang, H.: Low-resolution assisted three-stream network for person re-identification. Vis. Comput. 38, 1–11 (2021)


  7. Zhang, P., Wang, D., Lu, H., Wang, H., Ruan, X.: Amulet: aggregating multi-level convolutional features for salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 202–211 (2017)

  8. He, S., Lau, R.W., Liu, W., Huang, Z., Yang, Q.: SuperCNN: a superpixelwise convolutional neural network for salient object detection. Int. J. Comput. Vis. 115(3), 330–344 (2015)


  9. Liu, N., Han, J., Yang, M.H.: PiCANet: pixel-wise contextual attention learning for accurate saliency detection. IEEE Trans. Image Process. 29, 6438–6451 (2020)


  10. Qin, X., Fan, D.P., Huang, C., Diagne, C., Zhang, Z.: Boundary-aware segmentation network for mobile and web applications. arXiv:2101.04704 (2021)

  11. Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.H.: Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3166–3173 (2013)

  12. Zhang, M., Pang, Y., Wu, Y., Du, Y., Sun, H., Zhang, K.: Saliency detection via local structure propagation. J. Vis. Commun. Image Represent. 52, 131–142 (2018)


  13. Wu, Y., Jia, T., Pang, Y., Sun, J., Xue, D.: Salient object detection via a boundary-guided graph structure. J. Vis. Commun. Image Represent. 75, 103048 (2021)


  14. Jian, M., Wang, J., Yu, H., Wang, G.G.: Integrating object proposal with attention networks for video saliency detection. Inf. Sci. 576, 819–830 (2021)


  15. Zhuge, M., Lu, X., Guo, Y., Cai, Z., Chen, S.: CubeNet: X-shape connection for camouflaged object detection. Pattern Recogn. 127, 108644 (2022)


  16. Zhuge, M., Fan, D.P., Liu, N., Zhang, D., Xu, D.: Salient object detection via integrity learning. arXiv:2101.07663 (2021)

  17. Islam, M.A., Kalash, M., Bruce, N.D.B.: Revisiting salient object detection: Simultaneous detection, ranking, and subitizing of multiple salient objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7142–7150 (2018)

  18. Cheng, M.M., Mitra, N.J., Huang, X., Torr, P.H., Hu, S.M.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2014)


  19. Achanta, R., Hemami, S., Estrada, F., Susstrunk, S.: Frequency-tuned salient region detection. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1597–1604 (2009)

  20. Li, H., Lu, H., Lin, Z., Shen, X., Price, B.: Inner and inter label propagation: salient object detection in the wild. IEEE Trans. Image Process. 24(10), 3176–3186 (2015)


  21. Zhang, M., Wu, Y., Du, Y., Fang, L., Pang, Y.: Saliency detection integrating global and local information. J. Vis. Commun. Image Represent. 53, 215–223 (2018)


  22. Jian, M., Wang, R., Xu, H., Yu, H., Dong, J., Li, G., Yin, Y., Lam, K.M.: Robust seed selection of foreground and background priors based on directional blocks for saliency-detection system. Multimed. Tools Appl. 1–25 (2022)

  23. Jian, M., Wang, J., Yu, H., Wang, G., Meng, X., Yang, L., Dong, J., Yin, Y.: Visual saliency detection by integrating spatial position prior of object with background cues. Expert Syst. Appl. 168, 114219 (2021)


  24. Qin, Y., Feng, M., Lu, H., Cottrell, G.W.: Hierarchical cellular automata for visual saliency. Int. J. Comput. Vis. 126(7), 751–770 (2018)


  25. Chen, S., Zheng, L., Hu, X., Zhou, P.: Discriminative saliency propagation with sink points. Pattern Recogn. 60, 2–12 (2016)


  26. Pang, Y., Yu, X., Wu, Y., Wu, C.: FSP: a feedback-based saliency propagation method for saliency detection. J. Electr. Imaging 29(1), 013011 (2020)


  27. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)


  28. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. Comput. Sci. 1409–1556 (2014)

  29. Geng, X.: Label distribution learning. IEEE Trans. Knowl. Data Eng. 28(7), 1734–1748 (2016)


  30. Yang, C., Zhang, L., Lu, H.: Graph-regularized saliency detection with convex-hull-based center prior. IEEE Signal Process. Lett. 20(7), 637–640 (2013)


  31. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)


  32. Hwang, I., Lee, S.H., Park, J.S., Cho, N.I.: Saliency detection based on seed propagation in a multilayer graph. Multimed. Tools Appl. 76(2), 2111–2129 (2017)


  33. Gong, C., Tao, D., Liu, W., Maybank, S.J., Fang, M., Fu, K., Yang, J.: Saliency propagation from simple to difficult. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2531–2539 (2015)

  34. Qin, Y., Lu, H., Xu, Y., Wang, H.: Saliency detection via cellular automata. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 110–119 (2015)

  35. Fang, S., Li, J., Tian, Y., Huang, T., Chen, X.: Learning discriminative subspaces on random contrasts for image saliency analysis. IEEE Trans. Neural Netw. Learn. Syst. 28(5), 1095–1108 (2016)


  36. Tu, W.C., He, S., Yang, Q., Chien, S.Y.: Real-time salient object detection with a minimum spanning tree. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2334–2342 (2016)

  37. Sun, J., Lu, H., Liu, X.: Saliency region detection based on Markov absorption probabilities. IEEE Trans. Image Process. 24(5), 1639–1649 (2015)


  38. Peng, H., Li, B., Ling, H., Hu, W., Xiong, W., Maybank, S.J.: Salient object detection via structured matrix decomposition. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 818–832 (2016)


  39. Wang, L., Lu, H., Ruan, X., Yang, M.H.: Deep networks for saliency detection via local estimation and global search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3183–3192 (2015)

  40. Liu, G.H., Yang, J.Y.: Exploiting color volume and color difference for salient region detection. IEEE Trans. Image Process. 28(1), 6–16 (2018)


  41. Zhang, L., Ai, J., Jiang, B., Lu, H., Li, X.: Saliency detection via absorbing Markov chain with learnt transition probability. IEEE Trans. Image Process. 27(2), 987–998 (2017)


  42. Wu, Y., Jia, T., Li, W., Chen, D.: RSF: a novel saliency fusion framework for image saliency detection. In: 2020 International Conference on Culture-oriented Science and Technology (ICCST), pp. 45–49 (2020)

  43. Zhang, L., Sun, J., Wang, T., Min, Y., Lu, H.: Visual saliency detection via kernelized subspace ranking with active learning. IEEE Trans. Image Process. 29, 2258–2270 (2019)


  44. Yan, Q., Xu, L., Shi, J., Jia, J.: Hierarchical saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1155–1162 (2013)

  45. Li, Y., Hou, X., Koch, C., Rehg, J.M., Yuille, A.L.: The secrets of salient object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 280–287 (2014)

  46. Movahedi, V., Elder, J.H.: Design and perceptual validation of performance measures for salient object segmentation. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops, pp. 49–56 (2010)

  47. Tong, N., Lu, H., Ruan, X., Yang, M.H.: Salient object detection via bootstrap learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1884–1892 (2015)

  48. Tong, N., Lu, H., Zhang, Y., Ruan, X.: Salient object detection via global and local cues. Pattern Recogn. 48(10), 3258–3267 (2015)


  49. Borji, A., Cheng, M.M., Jiang, H., Li, J.: Salient object detection: a benchmark. IEEE Trans. Image Process. 24(12), 5706–5722 (2015)


  50. Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2189–2202 (2012)



Acknowledgements

Thanks are due to Dr. Yu Pang for inspiring our work. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. U1613214 and 62173083, the Major Program of the National Natural Science Foundation of China under Grant No. 71790614, the 111 Project under Grant B16009, the National Key Research and Development Project under Grant No. 2018YFB1404101, and the Fundamental Research Funds for the Central Universities of China under Grants N170402008 and N2026004.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tong Jia.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial or non-financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: there were errors in the affiliations and in Equations 10 and 11.

Appendix

In the multi-prior distribution learning model, five types of prior knowledge are utilized to extract saliency cues: the background prior, objectness prior, color distribution prior, global contrast prior, and center prior. Here, we give more details about these five priors.

The background prior is a widely used prior in saliency detection. It assumes that image boundary superpixels serve as initial background seeds; the saliency value of each superpixel is then determined by its feature contrast with these background seeds. We therefore construct a background-based map \({\text{BG}} = \left[ {\text{bg}}_{1}, {\text{bg}}_{2}, \ldots, {\text{bg}}_{N} \right]^{T}\) as follows:

$$ {\text{bg}}_{i} = \frac{1}{n}\sum\limits_{j = 1}^{n} \exp \left( \frac{\left\| d_{i} - d_{j}^{b} \right\|}{\sigma } \right) $$
(13)

where \(d_{i}\) and \(d_{j}^{b}\) denote the deep features of the \(i\)-th superpixel and the \(j\)-th boundary superpixel, respectively, \(n\) is the total number of boundary superpixels, and the parameter \(\sigma\) is set to 0.1. A higher \({\text{bg}}_{i}\) indicates that superpixel \(m_{i}\) has higher contrast to the background seeds and is more likely to belong to the salient object; otherwise, superpixel \(m_{i}\) tends to belong to the background.
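As an illustrative sketch (not the authors' implementation), Eq. (13) can be computed directly from an array of per-superpixel deep features; the `features` array and `boundary_idx` list here are hypothetical inputs, e.g. SLIC superpixels paired with VGG features.

```python
import numpy as np

def background_prior(features, boundary_idx, sigma=0.1):
    """Eq. (13): mean exp-scaled feature distance of every superpixel
    to the boundary (background-seed) superpixels.

    features     : (N, d) deep feature per superpixel
    boundary_idx : indices of the n boundary superpixels
    """
    seeds = features[boundary_idx]  # (n, d) background-seed features
    # Pairwise Euclidean distances ||d_i - d_j^b||, shape (N, n)
    dists = np.linalg.norm(features[:, None, :] - seeds[None, :, :], axis=2)
    # Average the exp-scaled distances over the n boundary seeds
    return np.exp(dists / sigma).mean(axis=1)  # (N,) background map
```

Superpixels whose features sit far from every boundary seed receive large values, matching the interpretation of \({\text{bg}}_{i}\) above.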

The objectness prior is proposed in [50]. It first samples a large number of windows, each covering a part of the image, and then applies several cues to compute the likelihood that each window contains a salient object. The objectness prior is widely applied in saliency detection methods due to its effectiveness. We define the objectness map as \({\text{OB}} = \left[ {\text{ob}}_{1}, {\text{ob}}_{2}, \ldots, {\text{ob}}_{N} \right]^{T}\).

The color distribution prior was proposed in our previous work [12]. Image pixels are first divided into eight regions according to color cues; regions that have compact spatial structures and lie near the image's weighted center are then considered more likely to belong to salient objects. Following [12], we obtain the color distribution map \({\text{CD}} = \left[ {\text{cd}}_{1}, {\text{cd}}_{2}, \ldots, {\text{cd}}_{N} \right]^{T}\) without any modifications.

The global contrast prior is based on the observation that the human eye tends to be drawn to regions with higher contrast against the rest of the image. We construct the global contrast-based map \({\text{GC}} = \left[ {\text{gc}}_{1}, {\text{gc}}_{2}, \ldots, {\text{gc}}_{N} \right]^{T}\) as follows:

$$ {\text{gc}}_{i} = \frac{1}{N}\sum\limits_{j = 1, j \ne i}^{N} \exp \left( \frac{\left\| d_{i} - d_{j} \right\|}{\sigma } \right) $$
(14)

where \(d_{i}\) and \(d_{j}\) are the deep features of superpixels \(m_{i}\) and \(m_{j}\), respectively, \(N\) is the total number of superpixels, and the parameter \(\sigma\) is set to 0.1. The global contrast value of each superpixel is thus the mean contrast between it and all other superpixels.
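Equation (14) differs from Eq. (13) only in that all other superpixels, rather than the boundary seeds, serve as the reference set. A minimal sketch under the same hypothetical `features` input:

```python
import numpy as np

def global_contrast(features, sigma=0.1):
    """Eq. (14): mean exp-scaled feature distance of each superpixel
    to all other superpixels (the j != i sum, divided by N)."""
    # Full pairwise distance matrix ||d_i - d_j||, shape (N, N)
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    e = np.exp(d / sigma)
    # Subtract the j == i diagonal term (exp(0) = 1) to exclude self-contrast
    return (e.sum(axis=1) - 1.0) / len(features)  # (N,) global contrast map
```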

The center prior is another effective form of prior knowledge; it assumes that regions near the image center are more likely to belong to salient objects. To this end, we construct the center-based map \({\text{CB}} = \left[ {\text{cb}}_{1}, {\text{cb}}_{2}, \ldots, {\text{cb}}_{N} \right]^{T}\) as follows:

$$ {\text{cb}}_{i} = \exp \left( - \frac{\left\| {\mathbf{p}}_{i} - {\mathbf{p}}^{c} \right\|}{\sigma } \right) $$
(15)

where \({\mathbf{p}}_{i}\) is the position coordinate of superpixel \(m_{i}\), \({\mathbf{p}}^{c}\) is the image center position, and the parameter \(\sigma\) is set to 0.1.
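Equation (15) can be sketched in a few lines; here the superpixel positions and the image center are assumed to be given in the same (e.g. normalized) coordinate system:

```python
import numpy as np

def center_prior(positions, image_center, sigma=0.1):
    """Eq. (15): superpixels near the image center get values close to 1,
    decaying exponentially with distance from the center.

    positions    : (N, 2) position coordinate p_i per superpixel
    image_center : (2,) image center position p^c
    """
    d = np.linalg.norm(positions - np.asarray(image_center), axis=1)
    return np.exp(-d / sigma)  # (N,) center-based map
```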

In the multi-prior distribution learning algorithm, the five prior maps are denoted \({\text{pm}}_{1}, {\text{pm}}_{2}, \ldots, {\text{pm}}_{5}\), i.e., \({\text{pm}}_{1}\) for \({\text{BG}}\), \({\text{pm}}_{2}\) for \({\text{OB}}\), \({\text{pm}}_{3}\) for \({\text{CD}}\), \({\text{pm}}_{4}\) for \({\text{GC}}\), and \({\text{pm}}_{5}\) for \({\text{CB}}\).
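For illustration, the five prior maps can be assembled into a per-superpixel prior matrix to feed the distribution learning stage. This sketch, including the min–max normalization of each map to [0, 1], is an assumption on our part rather than the paper's stated preprocessing:

```python
import numpy as np

def stack_priors(bg, ob, cd, gc, cb):
    """Column-stack the five N-dim prior maps into an (N, 5) matrix
    [pm1, ..., pm5] = [BG, OB, CD, GC, CB], each map min-max
    normalized to [0, 1] so the priors share a common scale."""
    pm = np.column_stack([bg, ob, cd, gc, cb]).astype(float)
    lo, hi = pm.min(axis=0), pm.max(axis=0)
    # Guard against constant columns (hi == lo) to avoid division by zero
    return (pm - lo) / np.where(hi > lo, hi - lo, 1.0)
```

Row \(i\) of the result then holds the five prior values for superpixel \(m_{i}\).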

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Wu, Y., Chang, X., Chen, D. et al. Two-stage salient object detection based on prior distribution learning and saliency consistency optimization. Vis Comput 39, 5729–5745 (2023). https://doi.org/10.1007/s00371-022-02692-y

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-022-02692-y
