
An end-to-end network for co-saliency detection in one single image

  • Research Paper
Science China Information Sciences

Abstract

Co-saliency detection within a single image is a common vision problem that has not yet been well addressed. Existing methods often use a bottom-up strategy to infer co-saliency in an image: salient regions are first detected using visual primitives such as color and shape, and are then grouped and merged into a co-saliency map. However, human vision perceives co-saliency through a combination of bottom-up and top-down strategies. To address this gap, this study proposes a novel end-to-end trainable network comprising a backbone net and two branch nets. The backbone net uses ground-truth masks as top-down guidance for saliency prediction, whereas the two branch nets construct triplet proposals for regional feature mapping and clustering, which makes the network bottom-up sensitive to co-salient regions. To evaluate the proposed method, we construct a new dataset of 2019 natural images, each of which contains co-saliency. Experimental results show that the proposed method achieves state-of-the-art accuracy with a running speed of 28 fps.
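
The abstract does not state which loss is applied to the triplet proposals. As a minimal sketch, assuming a generic triplet margin loss over regional feature embeddings with squared Euclidean distance (an assumption for illustration, not the authors' confirmed formulation), the idea of pulling co-salient regions together and pushing other regions away might look like the following. The function names, the margin value, and the toy 2-D embeddings are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor/positive stand in for embeddings of two regions meant to be
    co-salient; negative stands in for a region that is not.
    The margin of 0.2 is an illustrative choice, not taken from the paper.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings of three region proposals.
anchor = [0.0, 0.0]
positive = [0.1, 0.0]   # close to the anchor in feature space
negative = [1.0, 0.0]   # far from the anchor in feature space

# Well-separated triplet: the loss vanishes.
loss_easy = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the margin, so the loss is positive.
loss_hard = triplet_margin_loss(anchor, negative, positive)
```

Under this assumed loss, a triplet contributes to training only while the negative region is not yet at least one margin farther from the anchor than the positive region, which is what would drive embeddings of co-salient regions to cluster.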

Acknowledgements

This work was supported by Key Research and Development Program of Hubei Province (Grant No. 2020BAB018), National Natural Science Foundation of China (Grant No. 62171324), and National Key R&D Program of China (Grant No. 2022YFF0901902).

Author information

Corresponding author

Correspondence to Qin Zou.

Cite this article

Yue, Y., Zou, Q., Yu, H. et al. An end-to-end network for co-saliency detection in one single image. Sci. China Inf. Sci. 66, 210101 (2023). https://doi.org/10.1007/s11432-022-3686-1

  • DOI: https://doi.org/10.1007/s11432-022-3686-1
