
Light-TBFNet: RGB-D salient detection based on a lightweight two-branch fusion strategy

Published in Multimedia Tools and Applications

Abstract

Current models for salient object detection are typically large, which leads to a poor balance between performance and efficiency. We therefore propose a lightweight yet accurate salient object detection framework that adopts a dual-stream encoding network to extract depth and RGB features. In the depth stream, a depth feature enhancement module enhances the depth features and extracts valid information before the layer-by-layer fusion with the RGB stream, mitigating the influence of low-quality depth features on the fused features. From a lightweight perspective, semantic information is used to locate salient regions and spatial detail information is used to refine them: the decoder abandons the traditional top-down fusion of the U-shaped structure and is instead divided into a spatial detail branch and a semantic information branch. The first three layers of fused features from the encoder are used to extract spatial detail features, and the last three layers are used to extract semantic features. A two-branch fusion strategy is then proposed to fuse these two levels of features through feature interaction and reconstruction. This design avoids the drawbacks of top-down U-shaped fusion, where the large resolution of low-level features increases computational complexity and decreases inference speed, and high-level features are gradually diluted during top-down propagation. Finally, a hybrid loss function that introduces Dice and SSIM terms is proposed to supervise network training. Light-TBFNet performs favorably against state-of-the-art methods on six challenging RGB-D SOD datasets with much faster speed (30 FPS at an input size of 384 × 384) and fewer parameters (3.79 M).


Data Availability

The data are not publicly available due to privacy or ethical restrictions.


Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (NSFC, No. 62266011).

Author information


Corresponding author

Correspondence to Yucheng Shi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Yucheng Shi, Huaiyan Shen, Yaya Tan and Yu Wang contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wu, Y., Shi, Y., Shen, H. et al. Light-TBFNet: RGB-D salient detection based on a lightweight two-branch fusion strategy. Multimed Tools Appl 82, 26005–26035 (2023). https://doi.org/10.1007/s11042-022-14230-y

