Abstract
Although salient object detection (SOD) methods based on fully convolutional networks have achieved remarkable results, accurately detecting salient objects with complicated structures in cluttered real-world scenes remains a challenge, because these methods rarely consider the effectiveness and correlation of the context captured at different scales, or how complementary information can interact efficiently. Motivated by this, we propose a novel Dense Multi-scale Inference Network (DMINet) for accurate SOD, which mainly consists of a dual-stream multi-receptive field module and a residual multi-mode interaction strategy. The former uses convolution operations with different receptive fields, together with dense guidance connections, to efficiently capture and utilize multi-scale contextual features for better inference of salient objects. The latter adopts diverse interaction manners to fully exploit complementary information from multi-level features, generating powerful feature representations for predicting high-quality saliency maps. Quantitative and qualitative comparisons on five SOD datasets demonstrate that DMINet performs favorably against 17 state-of-the-art SOD methods under different evaluation metrics.
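To make the idea of "convolution operations with different receptive fields" concrete, the following is a minimal sketch and not the DMINet module itself. It assumes that the receptive field is varied through dilation, which the abstract does not confirm, and it works on a 1-D signal for readability. The function names and the dilation rates (1, 2, 3) are hypothetical and chosen only for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input samples, covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated correlation: no padding, stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def multi_receptive_field(signal, kernel, dilations=(1, 2, 3)):
    """Run parallel branches that share a kernel but differ in dilation.

    The dilation rates are an illustrative assumption, not values
    taken from the paper. Returns one output list per branch.
    """
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}
```

With a kernel of size 3, the three branches cover 3, 5, and 7 input samples respectively, so each branch summarizes context at a different scale without adding weights. How DMINet combines such branches through its dense guidance connections is not specified in the abstract and is not modeled here.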
Acknowledgements
This work was supported by the National Science Foundation of China (6210071479), the Anhui Natural Science Foundation (2108085QF258), the Natural Science Research Project of Colleges and Universities in Anhui Province (KJ2020A0299), the University-level key projects of Anhui University of Science and Technology (QN2019102), and the University-level general projects of Anhui University of Science and Technology (xjyb2020-04).
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Xia, C., Sun, Y., Gao, X. et al. DMINet: dense multi-scale inference network for salient object detection. Vis Comput 38, 3059–3072 (2022). https://doi.org/10.1007/s00371-022-02561-8
DOI: https://doi.org/10.1007/s00371-022-02561-8