Accurate RGB-D Salient Object Detection via Collaborative Learning

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12363)


Benefiting from the spatial cues embedded in depth images, recent progress on RGB-D saliency detection shows impressive ability on some challenge scenarios. However, there are still two limitations. One hand is that the pooling and upsampling operations in FCNs might cause blur object boundaries. On the other hand, using an additional depth-network to extract depth features might lead to high computation and storage cost. The reliance on depth inputs during testing also limits the practical applications of current RGB-D models. In this paper, we propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way, which solves those problems tactfully. The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries. Depth and saliency learning is innovatively integrated into the high-level feature learning process in a mutual-benefit manner. This strategy enables the network to be free of using extra depth networks and depth inputs to make inference. To this end, it makes our model more lightweight, faster and more versatile. Experiment results on seven benchmark datasets show its superior performance.



This work was supported by the Science and Technology Innovation Foundation of Dalian (2019J12GX034), the National Natural Science Foundation of China (61976035), and the Fundamental Research Funds for the Central Universities (DUT19JC58, DUT20JC42).


  1. 1.
    Achanta, R., Hemami, S.S., Estrada, F.J., Süsstrunk, S.: Frequency-tuned salient region detection. In: CVPR, pp. 1597–1604 (2009)Google Scholar
  2. 2.
    Borji, A., Cheng, M.M., Jiang, H., Li, J.: Salient object detection: a benchmark. TIP 24(12), 5706–5722 (2015)MathSciNetzbMATHGoogle Scholar
  3. 3.
    Borji, A., Sihite, D.N., Itti, L.: Salient object detection: a benchmark. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7573, pp. 414–429. Springer, Heidelberg (2012). Scholar
  4. 4.
    Canny, J.: A computational approach to edge detection. TPAMI 8(6), 679–698 (1986)CrossRefGoogle Scholar
  5. 5.
    Chen, H., Li, Y.: Progressively complementarity-aware fusion network for RGB-D salient object detection. In: CVPR, pp. 3051–3060 (2018)Google Scholar
  6. 6.
    Chen, H., Li, Y.: Three-stream attention-aware network for RGB-D salient object detection. TIP 28(6), 2825–2835 (2019)MathSciNetzbMATHGoogle Scholar
  7. 7.
    Chen, H., Li, Y., Su, D.: Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection. PR 86, 376–385 (2019)Google Scholar
  8. 8.
    Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. TPAMI 40(4), 834–848 (2018)CrossRefGoogle Scholar
  9. 9.
    Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). Scholar
  10. 10.
    Cheng, M.M., Zhang, G.X., Mitra, N.J., Huang, X., Hu, S.M.: Global contrast based salient region detection. TPAMI 37(3), 409–416 (2011)Google Scholar
  11. 11.
    Cheng, Y., Fu, H., Wei, X., Xiao, J., Cao, X.: Depth enhanced saliency detection method. In: ICIMCS, pp. 23–27 (2014)Google Scholar
  12. 12.
    Cong, R., Lei, J., Zhang, C., Huang, Q., Cao, X., Hou, C.: Saliency detection for stereoscopic images based on depth confidence analysis and multiple cues fusion. SPL 23(6), 819–823 (2016)Google Scholar
  13. 13.
    Craye, C., Filliat, D., Goudou, J.F.: Environment exploration for object-based visual saliency learning. In: ICRA, pp. 2303–2309 (2016)Google Scholar
  14. 14.
    Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: NIPS, pp. 379–387 (2016)Google Scholar
  15. 15.
    Deng, Z., et al.: R\(^3\)net: recurrent residual refinement network for saliency detection. In: IJCAI, pp. 684–690 (2018)Google Scholar
  16. 16.
    Fan, D.-P., Cheng, M.-M., Liu, J.-J., Gao, S.-H., Hou, Q., Borji, A.: Salient objects in clutter: bringing salient object detection to the foreground. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 196–212. Springer, Cham (2018). Scholar
  17. 17.
    Fan, D.P., Cheng, M.M., Liu, Y., Li, T., Borji, A.: Structure-measure: a new way to evaluate foreground maps. In: ICCV, pp. 4558–4567 (2017)Google Scholar
  18. 18.
    Fan, D.P., Gong, C., Cao, Y., Ren, B., Cheng, M.M., Borji, A.: Enhanced-alignment measure for binary foreground map evaluation. In: IJCAI, pp. 698–704 (2018)Google Scholar
  19. 19.
    Fan, D.P., et al.: Rethinking RGB-D salient object detection: models, datasets, and large-scale benchmarks. arXiv preprint arXiv:1907.06781 (2019)
  20. 20.
    Fan, D.P., Wang, W., Cheng, M.M., Shen, J.: Shifting more attention to video salient object detection. In: CVPR, pp. 8554–8564 (2019)Google Scholar
  21. 21.
    Feng, M., Lu, H., Ding, E.: Attentive feedback network for boundary-aware salient object detection. In: CVPR, pp. 1623–1632 (2019)Google Scholar
  22. 22.
    Girshick, R.: Fast R-CNN. In: ICCV, pp. 1440–1448 (2015)Google Scholar
  23. 23.
    Han, J., Chen, H., Liu, N., Yan, C., Li, X.: CNNs-based RGB-D saliency detection via cross-view transfer and multiview fusion. IEEE Trans. Syst. Man Cybern. 48(11), 3171–3183 (2018)Google Scholar
  24. 24.
    He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)Google Scholar
  25. 25.
    Hong, S., You, T., Kwak, S., Han, B.: Online tracking by learning discriminative saliency map with convolutional neural network. In: ICML, pp. 597–606 (2015)Google Scholar
  26. 26.
    Hou, Q., Cheng, M.M., Hu, X., Borji, A., Tu, Z., Torr, P.H.S.: Deeply supervised salient object detection with short connections. In: CVPR, pp. 815–828 (2017)Google Scholar
  27. 27.
    Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. TPAMI 20(11), 1254–1259 (1998)CrossRefGoogle Scholar
  28. 28.
    Ju, R., Ge, L., Geng, W., Ren, T., Wu, G.: Depth saliency based on anisotropic center-surround difference. In: ICIP, pp. 1115–1119 (2014)Google Scholar
  29. 29.
    Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with gaussian edge potentials. In: NIPS, pp. 109–117 (2011)Google Scholar
  30. 30.
    Lee, G., Tai, Y.W., Kim, J.: Deep saliency with encoded low level distance map and high level features. In: CVPR, pp. 660–668 (2016)Google Scholar
  31. 31.
    Li, G., Zhu, C.: A three-pathway psychobiological framework of salient object detection using stereoscopic technology. In: ICCVW, pp. 3008–3014 (2017)Google Scholar
  32. 32.
    Li, G., Yu, Y.: Visual saliency based on multiscale deep features. In: CVPR, pp. 5455–5463 (2015)Google Scholar
  33. 33.
    Li, G., Yu, Y.: Visual saliency detection based on multiscale deep CNN features. TIP 25(11), 5012–5024 (2016)MathSciNetzbMATHGoogle Scholar
  34. 34.
    Li, N., Ye, J., Ji, Y., Ling, H., Yu, J.: Saliency detection on light field. TPAMI 39(8), 1605–1616 (2017)CrossRefGoogle Scholar
  35. 35.
    Li, X., et al.: Deepsaliency: multi-task deep neural network model for salient object detection. TIP 25(8), 3919–3930 (2016)MathSciNetzbMATHGoogle Scholar
  36. 36.
    Li, Y., Hou, X., Koch, C., Rehg, J.M., Yuille, A.L.: The secrets of salient object segmentation. In: CVPR, pp. 280–287 (2014)Google Scholar
  37. 37.
    Lin, G., Milan, A., Shen, C., Reid, I.D.: RefineNet: multi-path refinement networks for high-resolution semantic segmentation. In: CVPR, pp. 5168–5177 (2017)Google Scholar
  38. 38.
    Liu, J.J., Hou, Q., Cheng, M.M., Feng, J., Jiang, J.: A simple pooling-based design for real-time salient object detection. In: CVPR, pp. 3917–3926 (2019)Google Scholar
  39. 39.
    Liu, N., Han, J., Yang, M.H.: PicaNet: learning pixel-wise contextual attention for saliency detection. In: CVPR, pp. 3089–3098 (2018)Google Scholar
  40. 40.
    Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)Google Scholar
  41. 41.
    Luo, Z., Mishra, A.K., Achkar, A., Eichel, J.A., Li, S., Jodoin, P.M.: Non-local deep features for salient object detection. In: CVPR, pp. 6593–6601 (2017)Google Scholar
  42. 42.
    Margolin, R., Zelnik-Manor, L., Tal, A.: How to evaluate foreground maps. In: CVPR, pp. 248–255 (2014)Google Scholar
  43. 43.
    Niu, Y., Geng, Y., Li, X., Liu, F.: Leveraging stereopsis for saliency analysis. In: CVPR, pp. 454–461 (2012)Google Scholar
  44. 44.
    Peng, H., Li, B., Xiong, W., Hu, W., Ji, R.: RGBD salient object detection: a benchmark and algorithms. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 92–109. Springer, Cham (2014). Scholar
  45. 45.
    Perazzi, F., Krähenbühl, P., Pritch, Y., Hornung, A.: Saliency filters: contrast based filtering for salient region detection. In: CVPR, pp. 733–740 (2012)Google Scholar
  46. 46.
    Piao, Y., Ji, W., Li, J., Zhang, M., Lu, H.: Depth-induced multi-scale recurrent attention network for saliency detection. In: ICCV (2019)Google Scholar
  47. 47.
    Qin, X., Zhang, Z., Huang, C., Gao, C., Dehghan, M., Jagersand, M.: BasNet: boundary-aware salient object detection. In: CVPR, pp. 7479–7489 (2019)Google Scholar
  48. 48.
    Qu, L., He, S., Zhang, J., Tian, J., Tang, Y., Yang, Q.: RGBD salient object detection via deep fusion. TIP 26(5), 2274–2285 (2017)MathSciNetzbMATHGoogle Scholar
  49. 49.
    Ren, J., Gong, X., Yu, L., Zhou, W., Yang, M.Y.: Exploiting global priors for RGB-D saliency detection. In: CVPRW, pp. 25–32 (2015)Google Scholar
  50. 50.
    Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS 2015, pp. 91–99 (2015)Google Scholar
  51. 51.
    Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: ICLR (2015)Google Scholar
  52. 52.
    Smeulders, A.W.M., Chu, D.M., Cucchiara, R., Calderara, S., Dehghan, A., Shah, M.: Visual tracking: an experimental survey. TPAMI 36(7), 1442–1468 (2014)CrossRefGoogle Scholar
  53. 53.
    Wang, L., Lu, H., Ruan, X., Yang, M.H.: Deep networks for saliency detection via local estimation and global search. In: CVPR, pp. 3183–3192 (2015)Google Scholar
  54. 54.
    Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B., Yuille, A.: Towards unified depth and semantic prediction from a single image. In: CVPR, pp. 2800–2809 (2015)Google Scholar
  55. 55.
    Wang, W., Lai, Q., Fu, H., Shen, J., Ling, H.: Salient object detection in the deep learning era: an in-depth survey. arXiv preprint arXiv:1904.09146 (2019)
  56. 56.
    Wang, W., Shen, J.: Deep visual attention prediction. TIP 27(5), 2368–2378 (2018)MathSciNetGoogle Scholar
  57. 57.
    Wang, W., Shen, J., Dong, X., Borji, A.: Salient object detection driven by fixation prediction. In: CVPR, pp. 1711–1720 (2018)Google Scholar
  58. 58.
    Wang, W., Shen, J., Xie, J., Cheng, M.M., Ling, H., Borji, A.: Revisiting video saliency prediction in the deep learning era. TPAMI 1 (2019)Google Scholar
  59. 59.
    Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). Scholar
  60. 60.
    Wu, Z., Su, L., Huang, Q.: Cascaded partial decoder for fast and accurate salient object detection. In: CVPR, pp. 3907–3916 (2019)Google Scholar
  61. 61.
    Yan, Q., Xu, L., Shi, J., Jia, J.: Hierarchical saliency detection. In: CVPR, pp. 1155–1162 (2013)Google Scholar
  62. 62.
    Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. In: ICLR (2016)Google Scholar
  63. 63.
    Zhang, P., Wang, D., Lu, H., Wang, H., Ruan, X.: Amulet: aggregating multi-level convolutional features for salient object detection. In: ICCV, pp. 202–211 (2017)Google Scholar
  64. 64.
    Zhang, X., Wang, T., Qi, J., Lu, H., Wang, G.: Progressive attention guided recurrent network for salient object detection. In: CVPR, pp. 714–722 (2018)Google Scholar
  65. 65.
    Zhao, J., Cao, Y., Fan, D., Cheng, M., LI, X., Zhang, L.: Contrast prior and fluid pyramid integration for RGBD salient object detection. In: CVPR (2019)Google Scholar
  66. 66.
    Zhao, J., Liu, J., Fan, D.P., Cao, Y., Yang, J., Cheng, M.M.: EGNet: edge guidance network for salient object detection. In: ICCV (2019)Google Scholar
  67. 67.
    Zhao, R., Ouyang, W., Li, H., Wang, X.: Saliency detection by multi-context deep learning. In: CVPR, pp. 1265–1274 (2015)Google Scholar
  68. 68.
    Zhu, C., Cai, X., Huang, K., Li, T.H., Li, G.: PDNet: prior-model guided depth-enhanced network for salient object detection. In: ICME, pp. 199–204 (2019)Google Scholar
  69. 69.
    Zhu, C., Li, G., Guo, X., Wang, W., Wang, R.: A multilayer backpropagation saliency detection algorithm based on depth mining. In: Felsberg, M., Heyden, A., Krüger, N. (eds.) CAIP 2017. LNCS, vol. 10425, pp. 14–23. Springer, Cham (2017). Scholar
  70. 70.
    Zhu, C., Li, G., Wang, W., Wang, R.: An innovative salient object detection using center-dark channel prior. In: ICCVW, pp. 1509–1515 (2017)Google Scholar

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. 1.Dalian University of TechnologyDalianChina
  2. 2.Pengcheng LabShenzhenChina

Personalised recommendations