Video saliency detection via combining temporal difference and pixel gradient

Lu, Xiangwei; Jian, Muwei; Wang, Rui; Liu, Xiangyu; Lin, Peiguang; Yu, Hui

doi:10.1007/s11042-023-17128-5

Published: 02 October 2023

Volume 83, pages 37589–37602 (2024)

Multimedia Tools and Applications

Xiangwei Lu¹,
Muwei Jian ORCID: orcid.org/0000-0002-4249-2264^1,2,
Rui Wang¹,
Xiangyu Liu¹,
Peiguang Lin¹ &
…
Hui Yu³

Abstract

Although temporal information is important for video saliency detection, current network frameworks still suffer from problems such as poor spatiotemporal coherence and poor edge continuity. To address these problems, this paper proposes a fully convolutional neural network that integrates temporal difference and pixel gradient to refine the edges of salient targets. Because the features of neighboring frames are highly correlated owing to their proximity in location, a co-attention mechanism is used to apply pixel-wise weights to the saliency probability map after feature extraction with multi-scale pooling, so that attention is paid to both the edges and the center of images. The changes in the pixel gradients of the original images are used to recursively improve the continuity of target edges and the details of central areas. In addition, residual networks are used to integrate information between modules, ensuring stable connections between the backbone network and the modules as well as the propagation of pixel gradient changes. Furthermore, a self-adjustment strategy for loss functions is presented to address overfitting in experiments. The proposed method has been tested on three publicly available datasets, and comparison with six other typical state-of-the-art methods demonstrates its effectiveness.
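The two low-level cues named in the abstract, temporal difference and pixel gradient, can be illustrated with a minimal sketch. This is not the network proposed in the paper. It is a hand-crafted toy that assumes grayscale frames stored as nested lists with values in [0, 1] and at least 2×2 pixels. The function names (`temporal_difference`, `gradient_magnitude`, `edge_motion_cue`) and the choice to multiply the two cues are hypothetical, chosen only for illustration.

```python
def temporal_difference(prev_frame, next_frame):
    """Absolute per-pixel intensity change between two equally sized frames."""
    return [[abs(b - a) for a, b in zip(row_prev, row_next)]
            for row_prev, row_next in zip(prev_frame, next_frame)]


def gradient_magnitude(frame):
    """Per-pixel gradient magnitude: central differences inside the image,
    one-sided differences at the borders. Assumes at least 2x2 pixels."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (frame[y][x1] - frame[y][x0]) / (x1 - x0)
            gy = (frame[y1][x] - frame[y0][x]) / (y1 - y0)
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_motion_cue(prev_frame, next_frame):
    """Hypothetical combination: a pixel scores high only if it both changed
    over time and lies near an intensity edge of the newer frame."""
    motion = temporal_difference(prev_frame, next_frame)
    edges = gradient_magnitude(next_frame)
    return [[m * e for m, e in zip(row_m, row_e)]
            for row_m, row_e in zip(motion, edges)]


def make_frame(left_col, size=6):
    """Toy frame: a bright 2x2 square on a dark background."""
    frame = [[0.0] * size for _ in range(size)]
    for y in (2, 3):
        for x in (left_col, left_col + 1):
            frame[y][x] = 1.0
    return frame


# The square moves one pixel to the right between the two frames.
prev_frame = make_frame(1)
next_frame = make_frame(2)
cue = edge_motion_cue(prev_frame, next_frame)
```

In this toy, pixels that stay inside the square in both frames get a zero cue because their temporal difference is zero, while pixels on the leading and trailing edges of the motion get a positive value. A learned model would replace the fixed product with trained weights.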

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) (61976123, 61601427, 61876098); the Taishan Young Scholars Program of Shandong Province; and the Key Development Program for Basic Research of Shandong Province (ZR2020ZD44).

Author information

Authors and Affiliations

School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, China
Xiangwei Lu, Muwei Jian, Rui Wang, Xiangyu Liu & Peiguang Lin
School of Information Science and Engineering, Linyi University, Linyi, China
Muwei Jian
School of Creative Technologies, University of Portsmouth, Portsmouth, UK
Hui Yu

Corresponding authors

Correspondence to Muwei Jian or Hui Yu.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lu, X., Jian, M., Wang, R. et al. Video saliency detection via combining temporal difference and pixel gradient. Multimed Tools Appl 83, 37589–37602 (2024). https://doi.org/10.1007/s11042-023-17128-5

Received: 07 August 2022
Revised: 24 August 2023
Accepted: 15 September 2023
Published: 02 October 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11042-023-17128-5
