
RGB-D Saliency Detection by Multi-stream Late Fusion Network

  • Hao Chen
  • Youfu Li
  • Dan Su
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10528)

Abstract

In this paper we address the problem of saliency detection on RGB-D image pairs with a multi-stream late fusion network. With the prevalence of RGB-D sensors, leveraging additional depth information to facilitate saliency detection has drawn increasing attention. However, the key challenge of how to fuse RGB and depth data in an optimal manner remains under-studied. Conventional approaches simply treat depth as an undifferentiated extra channel and apply existing RGB saliency detection models directly. This paradigm can neither capture representations specific to the depth modality nor effectively fuse multi-modal information. We address this problem by proposing a simple yet principled late fusion strategy carried out in conjunction with convolutional neural networks (CNNs). The proposed network learns discriminative representations and exploits the complementarity between the RGB and depth modalities. Comprehensive experiments on two public datasets demonstrate the benefits of the proposed RGB-D saliency detection network.
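To make the late fusion idea concrete, the sketch below shows one common way such an architecture is realized: separate convolutional streams extract modality-specific features from the RGB image and the depth map, and the two feature sets are only combined near the output to predict a saliency map. This is a minimal PyTorch illustration with assumed layer sizes and an assumed fusion head; it is not the exact network proposed in the paper.

```python
# Hypothetical two-stream late-fusion CNN for RGB-D saliency detection.
# Layer widths and the fusion head are illustrative assumptions.
import torch
import torch.nn as nn


def make_stream(in_channels):
    """A small convolutional encoder; one copy per modality."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(64, 128, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class LateFusionSaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_stream = make_stream(3)    # learns RGB-specific features
        self.depth_stream = make_stream(1)  # learns depth-specific features
        # Late fusion: concatenate high-level features from both streams,
        # then predict a single-channel saliency map.
        self.fusion = nn.Sequential(
            nn.Conv2d(256, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb, depth):
        f_rgb = self.rgb_stream(rgb)
        f_depth = self.depth_stream(depth)
        fused = torch.cat([f_rgb, f_depth], dim=1)
        return self.fusion(fused)


# Example: a 224x224 RGB-D pair yields a 224x224 saliency map.
net = LateFusionSaliencyNet()
saliency = net(torch.randn(1, 3, 224, 224), torch.randn(1, 1, 224, 224))
print(saliency.shape)  # torch.Size([1, 1, 224, 224])
```

The point of fusing late rather than stacking depth as a fourth input channel is that each stream can develop representations suited to its own modality before the network reasons about how the two agree or disagree on saliency.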

Keywords

RGB-D · Saliency detection · Convolutional neural networks

Acknowledgments

This work is funded by the Research Grants Council of Hong Kong (CityU 11205015) and the National Natural Science Foundation of China (NSFC) (61673329).


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Mechanical and Biomedical Engineering, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
  2. City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
