BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12357)


Multi-level feature fusion is a fundamental topic in computer vision for detecting, segmenting, and classifying objects at various scales. When multi-level features meet multi-modal cues, finding the optimal fusion strategy becomes a difficult problem. In this paper, we make the first attempt to leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to develop a novel cascaded refinement network. In particular, we 1) propose a bifurcated backbone strategy (BBS) to split the multi-level features into teacher and student features, and 2) utilize a depth-enhanced module (DEM) to excavate informative parts of depth cues from the channel and spatial views. This design fuses the RGB and depth modalities in a complementary way. Our simple yet efficient architecture, dubbed Bifurcated Backbone Strategy Network (BBS-Net), is backbone independent and outperforms 18 state-of-the-art methods on seven challenging datasets under four evaluation metrics.
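
As a rough illustration of what enhancing depth cues "from the channel and spatial views" could look like, the following minimal sketch gates a depth feature map first per channel and then per pixel, and adds the result to the RGB features. It is not the BBS-Net implementation: the function names, the use of max pooling with a sigmoid gate, the fusion by element-wise addition, and the omission of all learned layers are assumptions made for illustration only. Features are plain nested lists indexed as [channel][row][column].

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    # Assumption: one gate per channel, the sigmoid of that channel's global max.
    return [sigmoid(max(max(row) for row in ch)) for ch in feat]


def spatial_attention(feat):
    # Assumption: one gate per pixel, the sigmoid of the max across channels.
    h, w = len(feat[0]), len(feat[0][0])
    return [[sigmoid(max(ch[i][j] for ch in feat)) for j in range(w)]
            for i in range(h)]


def depth_enhance(depth_feat):
    # Re-weight the depth features by channel gates, then by spatial gates.
    ca = channel_attention(depth_feat)
    gated = [[[v * ca[c] for v in row] for row in ch]
             for c, ch in enumerate(depth_feat)]
    sa = spatial_attention(gated)
    return [[[ch[i][j] * sa[i][j] for j in range(len(ch[0]))]
             for i in range(len(ch))] for ch in gated]


def fuse(rgb_feat, depth_feat):
    # Assumption: fusion is element-wise addition of the enhanced depth features.
    enhanced = depth_enhance(depth_feat)
    return [[[r + d for r, d in zip(r_row, d_row)]
             for r_row, d_row in zip(r_ch, d_ch)]
            for r_ch, d_ch in zip(rgb_feat, enhanced)]


# Toy input: 2 channels of 2x2 features.
rgb = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
depth = [[[2.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
fused = fuse(rgb, depth)
```

In this toy input only one depth response is non-zero, so only the corresponding position of the fused features changes; an all-zero depth map leaves the RGB features untouched.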


Keywords: RGB-D saliency detection · Bifurcated backbone strategy



This work was supported by the Major Project for New Generation of AI Grant (No. 2018AAA0100403), NSFC (No. 61876094, U1933114), Natural Science Foundation of Tianjin, China (No. 18JCYBJC15400, 18ZXZNGX00110), the Open Project Program of the National Laboratory of Pattern Recognition (NLPR), and the Fundamental Research Funds for the Central Universities.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
  2. Nankai University, Tianjin, China
  3. HCL America, Manhattan, USA
  4. Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
