Skip to main content

Split-guidance network for salient object detection

Abstract

Due to the large-scale variation in objects in practical scenes, multi-scale representation is of critical importance for salient object detection (SOD). Recent advances in multi-level feature fusion also demonstrate its contribution in consistent performance gains. Different from the existing layer-wise methods, we propose a simple yet efficient split-guidance convolution block to improve the multi-scale representation ability at a granular level in this paper. Specifically, the input feature is first split into different subsets; each of them is guided by all the subsets in front of it, in this way to increase the range of receptive fields for each network layer. By embedding it into each side-output stage of the encoder, we build a unified decoder for both RGB SOD and RGB-D SOD. Experimental results on five RGB datasets, five RGB-D datasets and three RGB-T datasets demonstrate that the proposed method without any attention mechanisms and other complex designs performs favorably against state-of-the-art approaches and also shows advantages in simplicity, efficiency and compactness.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

References

  1. Ren, Z., Gao, S., Chia, L.T., Tsang, I.W.H.: Region-based saliency detection and its application in object recognition. IEEE Trans. Circuits Syst. Video Technol. 24(5), 769 (2013)

    Article  Google Scholar 

  2. Wei, Y., Liang, X., Chen, Y., Shen, X., Cheng, M.M., Feng, J., Zhao, Y., Yan, S.: STC: A simple to complex framework for weakly-supervised semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2314 (2016)

    Article  Google Scholar 

  3. Fang, H., Gupta, S., Iandola, F., Srivastava, R.K., Deng, L., Dollár, P., Gao, J., He, X., Mitchell, M., Platt, J.C., et al.: From captions to visual concepts and back. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1473–1482 (2015)

  4. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2117–2125 (2017)

  5. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241 (2015)

  6. Pang, Y., Zhao, X., Zhang, L., Lu, H.: Multi-scale interactive network for salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9413–9422 (2020)

  7. Chen, S., Tan, X., Wang, B., Lu, H., Hu, X., Fu, Y.: Reverse attention-based residual network for salient object detection. IEEE Trans. Image Process. 29, 3763 (2020)

    Article  Google Scholar 

  8. Wu, Z., Su, L., Huang, Q.: Cascaded partial decoder for fast and accurate salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3907–3916 (2019)

  9. Wei, J., Wang, S., Huang, Q.: F\(^3\)Net: fusion, feedback and focus for salient object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 12,321–12,328 (2020)

  10. Zhao, T., Wu, X.: Pyramid feature attention network for saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3085–3094 (2019)

  11. Liu, J.J., Hou, Q., Cheng, M.M., Feng, J., Jiang, J.: A simple pooling-based design for real-time salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3917–3926 (2019)

  12. Zhao, X., Pang, Y., Zhang, L., Lu, H., Zhang, L.: Suppress and balance: a simple gated network for salient object detection. In: Proceedings of the European Conference on Computer Vision, pp. 35–51 (2020)

  13. Feng, M., Lu, H., Ding, E.: Attentive feedback network for boundary-aware salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1623–1632 (2019)

  14. Liu, N., Zhang, N., Han, J.: Learning selective self-mutual attention for RGB-D saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 13,756–13,765 (2020)

  15. Piao, Y., Ji, W., Li, J., Zhang, M., Lu, H.: Depth-induced multi-scale recurrent attention network for saliency detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 7254–7263 (2019)

  16. Fan, D.P., Zhai, Y., Borji, A., Yang, J., Shao, L.: BBS-Net: RGB-D salient object detection with a bifurcated backbone strategy network. In: Proceedings of the European Conference on Computer Vision, pp. 275–292 (2020)

  17. Fan, D.P., Ji, G.P., Sun, G., Cheng, M.M., Shen, J., Shao, L.: Camouflaged object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2777–2787 (2020)

  18. Hou, Q., Cheng, M.M., Hu, X., Borji, A., Tu, Z., Torr, P.H.: Deeply supervised salient object detection with short connections. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3203–3212 (2017)

  19. Hou, Q., Cheng, M., Hu, X., Borji, A., Tu, Z., Torr, P.: Deeply supervised salient object detection with short connections. IEEE Trans. Pattern Anal. Mach. Intell. 41(4), 815 (2018)

    Article  Google Scholar 

  20. Xiao, H., Feng, J., Wei, Y., Zhang, M., Yan, S.: Deep salient object detection with dense connections and distraction diagnosis. IEEE Trans. Multimed. 20(12), 3239 (2018)

    Article  Google Scholar 

  21. Zhang, P., Wang, D., Lu, H., Wang, H., Ruan, X.: Amulet: aggregating multi-level convolutional features for salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 202–211 (2017)

  22. Nguyen, T.V., Nguyen, K., Do, T.T.: Semantic prior analysis for salient object detection. IEEE Trans. Image Process. 28(6), 3130 (2019)

    MathSciNet  Article  Google Scholar 

  23. Liu, N., Han, J., Yang, M.H.: Picanet: learning pixel-wise contextual attention for saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3089–3098 (2018)

  24. Nguyen, T.V., Zhao, Q., Yan, S.: Attentive systems: a survey. Int. J. Comput. Vis. 126(1), 86 (2018)

    Article  Google Scholar 

  25. Wang, W., Zhao, S., Shen, J., Hoi, S.C., Borji, A.: Salient object detection with pyramid attention and salient edges. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1448–1457 (2019)

  26. Zhao, J.X., Liu, J.J., Fan, D.P., Cao, Y., Yang, J., Cheng, M.M.: EGNet: Edge guidance network for salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 8779–8788 (2019)

  27. Wu, Z., Su, L., Huang, Q.: Stacked cross refinement network for edge-aware salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 7264–7273 (2019)

  28. Su, J., Li, J., Zhang, Y., Xia, C., Tian, Y.: Selectivity or invariance: boundary-aware salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3799–3808 (2019)

  29. Zhou, H., Xie, X., Lai, J.H., Chen, Z., Yang, L.: Interactive two-stream decoder for accurate and fast saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9141–9150 (2020)

  30. Liu, J.J., Hou, Q., Cheng, M.M.: Dynamic feature integration for simultaneous detection of salient object, edge, and skeleton. IEEE Trans. Image Process. 29, 8652 (2020)

    Article  Google Scholar 

  31. Qin, X., Zhang, Z., Huang, C., Gao, C., Dehghan, M., Jagersand, M.: Basnet: boundary-aware salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7479–7489 (2019)

  32. Wang, W., Lai, Q., Fu, H., Shen, J., Ling, H., Yang, R.: Salient object detection in the deep learning era: an in-depth survey. IEEE Trans. Pattern Anal. Mach. Intell. (2021). https://doi.org/10.1109/TPAMI.2021.3051099

  33. Chen, H., Li, Y.: Progressively complementarity-aware fusion network for RGB-D salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3051–3060 (2018)

  34. Li, G., Liu, Z., Ling, H.: ICNet: information conversion network for RGB-D based salient object detection. IEEE Trans. Image Process. 29, 4873 (2020)

    Article  Google Scholar 

  35. Chen, Z., Cong, R., Xu, Q., Huang, Q.: DPANet: depth potentiality-aware gated attention network for RGB-D salient object detection. IEEE Trans. Image Process. 30, 7012 (2020)

    Article  Google Scholar 

  36. Li, G., Liu, Z., Ye, L., Wang, Y., Ling, H.: Cross-modal weighting network for RGB-D salient object detection. In: European Conference on Computer Vision, pp. 665–681 (Springer, 2020)

  37. Pang, Y., Zhang, L., Zhao, X., Lu, H.: Hierarchical dynamic filtering network for RGB-D salient object detection. In: Proceedings of the European Conference on Computer Vision, pp. 235–252 (2020)

  38. Fu, K., Fan, D.P., Ji, G.P., Zhao, Q.: JL-DCF: joint learning and densely-cooperative fusion framework for RGB-D salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3052–3062 (2020)

  39. Zhao, X., Zhang, L., Pang, Y., Lu, H., Zhang, L.: A single stream network for robust and real-time RGB-D salient object detection. In: Proceedings of the European Conference on Computer Vision, pp. 646–662 (2020)

  40. Chen, Q., Liu, Z., Zhang, Y., Fu, K., Zhao, Q., Du, H.: RGB-D salient object detection via 3D convolutional neural networks. arXiv preprint arXiv:2101.10241 (2021)

  41. Piao, Y., Rong, Z., Zhang, M., Ren, W., Lu, H.: A2dele: adaptive and attentive depth distiller for efficient RGB-D salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9060–9069 (2020)

  42. Zhou, T., Fan, D.P., Cheng, M.M., Shen, J., Shao, L.: RGB-D salient object detection: a survey. Comput. Vis. 7(1), 37–69 (2021)

    Google Scholar 

  43. Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834 (2017)

    Article  Google Scholar 

  44. Yang, M., Yu, K., Zhang, C., Li, Z., Yang, K.: Denseaspp for semantic segmentation in street scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3684–3692 (2018)

  45. Zhang, J., Fan, D.P., Dai, Y., Anwar, S., Saleh, F.S., Zhang, T., Barnes, N.: UC-Net: uncertainty inspired RGB-D saliency detection via conditional variational autoencoders. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8582–8591 (2020)

  46. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)

  47. Liu, S., Huang, D., et al.: Receptive field block net for accurate and fast object detection. In: Proceedings of the European Conference on Computer Vision, pp. 385–400 (2018)

  48. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7794–7803 (2018)

  49. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492–1500 (2017)

  50. Gao, S., Cheng, M.M., Zhao, K., Zhang, X.Y., Yang, M.H., Torr, P.H.: Res2net: a new multi-scale backbone architecture. IEEE Trans. Pattern Anal. Mach. Intell. 43(2), 652 (2021)

    Article  Google Scholar 

  51. Zhang, H., Wu, C., Zhang, Z., Zhu, Y., Zhang, Z., Lin, H., Sun, Y., He, T., Mueller, J., Manmatha, R., et al.: Resnest: split-attention networks. arXiv preprint arXiv:2004.08955 (2020)

  52. Yuan, P., Lin, S., Cui, C., Du, Y., Guo, R., He, D., Ding, E., Han, S.: HS-ResNet: hierarchical-split block on convolutional neural network. arXiv preprint arXiv:2010.07621 (2020)

  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  54. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  55. Wang, W., Shen, J., Dong, X., Borji, A.: Salient object detection driven by fixation prediction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1711–1720 (2018)

  56. Le, T.N., Nguyen, T.V., Nie, Z., Tran, M.T., Sugimoto, A.: Anabranch network for camouflaged object segmentation. Comput. Vis. Image Underst. 184, 45 (2019)

    Article  Google Scholar 

  57. Wei, J., Wang, S., Wu, Z., Su, C., Huang, Q., Tian, Q.: Label decoupling framework for salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 13,025–13,034 (2020)

  58. Chen, Z., Xu, Q., Cong, R., Huang, Q.: Global context-aware progressive aggregation network for salient object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 10,599–10,606 (2020)

  59. Yan, Q., Xu, L., Shi, J., Jia, J.: Hierarchical saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1155–1162 (2013)

  60. Li, Y., Hou, X., Koch, C., Rehg, J.M., Yuille, A.L.: The secrets of salient object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 280–287 (2014)

  61. Li, G., Yu, Y.: Visual saliency based on multiscale deep features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5455–5463 (2015)

  62. Wang, L., Lu, H., Wang, Y., Feng, M., Wang, D., Yin, B., Ruan, X.: Learning to detect salient objects with image-level supervision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 136–145 (2017)

  63. Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.H.: Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3166–3173 (2013)

  64. Ju, R., Ge, L., Geng, W., Ren, T., Wu, G.: Depth saliency based on anisotropic center-surround difference. In: Proceedings of the IEEE International Conference on Image Processing, pp. 1115–1119 (2014)

  65. Peng, H., Li, B., Xiong, W., Hu, W., Ji, R.: Rgbd salient object detection: a benchmark and algorithms. In: Proceedings of the European Conference on Computer Vision, pp. 92–109 (2014)

  66. Fan, D.P., Lin, Z., Zhang, Z., Zhu, M., Cheng, M.M.: Rethinking RGB-D salient object detection: models, data sets, and large-scale benchmarks. IEEE Trans. Neural Netw. Learn. Syst. 32(5), 2075 (2020)

    Article  Google Scholar 

  67. Niu, Y., Geng, Y., Li, X., Liu, F.: Leveraging stereopsis for saliency analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 454–461 (2012)

  68. Ji, W., Li, J., Zhang, M., Piao, Y., Lu, H.: In: Accurate RGB-D salient object detection via collaborative learning. European Conference on Computer Vision, pp. 52–69 (Springer, 2020)

  69. Achanta, R., Hemami, S., Estrada, F., Susstrunk, S.: Frequency-tuned salient region detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1597–1604 (2009)

  70. Fan, D.P., Gong, C., Cao, Y., Ren, B., Cheng, M.M., Borji, A.: Enhanced-alignment measure for binary foreground map evaluation. In: Proceedings of the International Joint Conference on Artificial Intelligence (2018)

  71. Fan, D.P., Cheng, M.M., Liu, Y., Li, T., Borji, A.: Structure-measure: a new way to evaluate foreground maps. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4548–4557 (2017)

  72. Zhao, J.X., Cao, Y., Fan, D.P., Cheng, M.M., Li, X.Y., Zhang, L.: Contrast prior and fluid pyramid integration for RGBD salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3927–3936 (2019)

  73. Sun, P., Zhang, W., Wang, H., Li, S., Li, X.: Deep RGB-D saliency detection with depth-sensitive attention and automatic multi-modal fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1407–1417 (2021)

  74. Ji, W., Li, J., Yu, S., Zhang, M., Piao, Y., Yao, S., Bi, Q., Ma, K., Zheng, Y., Lu, H., Cheng, L.: Calibrated RGB-D salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9471–9481 (2021)

  75. Wang, G., Li, C., Ma, Y., Zheng, A., Tang, J.,Luo, B.: RGB-T saliency detection benchmark: dataset, baselines, analysis and a novel approach. In: Chinese Conference on Image and Graphics Technologies, pp. 359–369 (2018)

  76. Tu, Z., Xia, T., Li, C., Wang, X., Ma, Y., Tang, J.: RGB-T image saliency detection via collaborative graph learning. IEEE Trans. Multimed. 22(1), 160 (2019)

    Article  Google Scholar 

  77. Tu, Z., Ma, Y., Li, Z., Li, C., Xu, J., Liu, Y.: RGBT salient object detection: a large-scale dataset and benchmark. arXiv preprint arXiv:2007.03262 (2020)

  78. Tu, Z., Xia, T., Li, C., Lu, Y., Tang, J.: M3S-NIR: multi-modal multi-scale noise-insensitive ranking for RGB-T saliency detection. In: 2019 IEEE Conference on Multimedia Information Processing and Retrieval, pp. 141–146 (2019)

  79. Tu, Z., Li, Z., Li, C., Lang, Y., Tang, J.:Multi-interactive siamese decoder for RGBT salient object detection. arXiv preprint arXiv:2005.02315 (2021)

Download references

Acknowledgements

This work is partially supported by the Natural Science Foundation of China (No. 61802336, No. 61806175, No. 62073322) and Yangzhou University “Qinglan Project.”

Author information

Authors and Affiliations

Authors

Ethics declarations

Conflict of interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Chen, S., Yu, J., Xu, X. et al. Split-guidance network for salient object detection. Vis Comput (2022). https://doi.org/10.1007/s00371-022-02421-5

Download citation

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00371-022-02421-5

Keywords

  • Salient object detection
  • Split-guidance convolution
  • Unified decoder