
RGB-D saliency detection via complementary and selective learning

Published in: Applied Intelligence

Abstract

Previous RGB-D saliency detection methods adopt different schemes to fuse RGB images and depth maps, or their saliency maps. However, neither the feature maps from different modalities nor the different features within the same map are of equal importance. To address this problem, we present a new, precise RGB-D saliency detection framework that selectively fuses features of different resolutions from the two modalities, exploiting the complementarity between global location and local detail. Depth data offer superior positional discrimination, which has been shown to enhance saliency prediction. However, errors or missing regions in a depth map, or random noise along an object boundary, introduce negative effects. We therefore design a backbone network and an edge detection module that select useful representations from RGB images and depth maps with an attention mechanism and effectively integrate macroscopic and microscopic features from the two modalities. Cross-modal selective fusion and complementation yield accurate localization of salient objects with fine edge details. We also propose a triple loss function that improves the network's reliability on hard samples. Extensive quantitative and qualitative experiments on six benchmark datasets show that our method outperforms 11 existing state-of-the-art methods across various evaluation metrics.
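The abstract does not spell out the fusion mechanics, but the general idea of attention-based selective fusion can be sketched minimally. The NumPy fragment below is an illustrative reading only, not the authors' implementation: each modality's feature channels are re-weighted by a squeeze-style channel attention, and the depth stream is additionally gated by the RGB stream so that unreliable depth responses are suppressed rather than fused blindly. All function names and the specific gating choice are assumptions for exposition.

```python
import numpy as np

def channel_attention(feat):
    """Re-weight channels of a C x H x W feature map.

    Squeeze: global average pooling over the spatial dimensions gives
    one descriptor per channel; a softmax stands in for the learned
    MLP-plus-sigmoid excitation used in real networks.
    """
    w = feat.mean(axis=(1, 2))            # (C,)
    w = np.exp(w - w.max())
    w = w / w.sum()
    return feat * w[:, None, None]        # broadcast back over H, W

def selective_fusion(rgb_feat, depth_feat):
    """Fuse two modalities, gating depth by the (more reliable) RGB cue."""
    rgb_a = channel_attention(rgb_feat)
    depth_a = channel_attention(depth_feat)
    gate = 1.0 / (1.0 + np.exp(-rgb_a))   # sigmoid gate derived from RGB
    return rgb_a + gate * depth_a

# Toy C x H x W feature maps standing in for backbone activations.
rgb = np.random.rand(64, 16, 16)
depth = np.random.rand(64, 16, 16)
fused = selective_fusion(rgb, depth)
print(fused.shape)  # (64, 16, 16)
```

In a trained network the attention weights and gate would be learned, and the fusion would be applied at several resolutions before decoding; this sketch only shows why gating lets the model discount erroneous or missing depth regions instead of propagating them.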



Acknowledgments

This work is supported by the National College Student Innovation and Entrepreneurship Training Program of Zaozhuang University (No. 1022004) and the Key Support Project of the National Natural Science Foundation Joint Fund of China (No. U2141239).

Author information


Corresponding authors

Correspondence to Xiaofei Sun or Yunsheng Qian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Pan, W., Sun, X. & Qian, Y. RGB-D saliency detection via complementary and selective learning. Appl Intell 53, 7957–7969 (2023). https://doi.org/10.1007/s10489-022-03612-2

