SRFCNM: Spatiotemporal recurrent fully convolutional network model for salient object detection

Arora, Ishita; Gangadharappa, M.

doi:10.1007/s11042-023-17009-x

Published: 03 October 2023

Volume 83, pages 38009–38036, (2024)

Multimedia Tools and Applications

Abstract

Video saliency detection is widely used because it can distinguish significant regions of interest. Its applications include video segmentation, abnormal activity detection, and video summarization. This paper develops a novel technique for video saliency detection, the Spatiotemporal Recurrent Fully Convolutional Network Model (SRFCNM). The model uses recurrent convolutional layers to represent the spatial and temporal features of superpixels for element uniqueness. It is trained in two phases: we first pre-train the model on the segmented datasets and then fine-tune it for saliency detection, which allows the network to learn salient objects more accurately. Integrating saliency maps with recurrent convolutional layers and spatiotemporal characteristics yields a robust representation of salient objects that captures the relevant features. The SRFCNM model is extensively evaluated on the challenging SegTrackV2, FBMS, and DAVIS datasets and compared with existing deep learning and convolutional neural network algorithms. The results show that SRFCNM considerably outperforms state-of-the-art saliency models on the three public datasets in terms of accuracy, recall, and mean absolute error (MAE). With hand-crafted color features, the proposed SRFCNM model produces the lowest MAE values among the compared models: 3.2%, 3.5%, and 7.5% on the SegTrackV2, DAVIS, and FBMS datasets, respectively.
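The abstract reports mean absolute error (MAE) without giving its formula here. The sketch below assumes the definition commonly used in salient object detection benchmarks: the per-pixel absolute difference between a predicted saliency map and a binary ground-truth mask, both scaled to [0, 1], averaged over all pixels. The function name and the nested-list input format are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def mean_absolute_error(
    saliency: Sequence[Sequence[float]],
    ground_truth: Sequence[Sequence[float]],
) -> float:
    """Average per-pixel |prediction - ground truth| for one frame.

    Both inputs are 2D grids (rows of pixel values) assumed to lie in [0, 1].
    This is the common benchmark definition, assumed rather than taken
    from the paper.
    """
    if len(saliency) != len(ground_truth):
        raise ValueError("maps must have the same number of rows")

    total = 0.0
    count = 0
    for pred_row, gt_row in zip(saliency, ground_truth):
        if len(pred_row) != len(gt_row):
            raise ValueError("maps must have the same number of columns")
        for pred, gt in zip(pred_row, gt_row):
            total += abs(pred - gt)
            count += 1

    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


if __name__ == "__main__":
    # Hypothetical 2x2 frame: the left column is salient in the ground truth.
    predicted = [[0.9, 0.1], [0.8, 0.0]]
    mask = [[1, 0], [1, 0]]
    # Absolute errors are 0.1, 0.1, 0.2, 0.0, so the MAE is about 0.1 (10%).
    print(f"MAE = {mean_absolute_error(predicted, mask):.3f}")
```

Under this definition, a reported MAE of 3.2% would correspond to an average per-pixel error of 0.032 on the [0, 1] scale. Averaging the per-frame score over all frames of a dataset is a typical way to obtain a dataset-level figure, though the exact aggregation used in the paper is not stated in this section.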

Data availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Funding

None.

Author information

Authors and Affiliations

Department of Electronics & Communication, AIACTR, Affiliated to GGSIPU, Delhi, India, 110031
Ishita Arora
Department of Electronics & Communication, NSUT East Campus, Delhi, India, 110031
M. Gangadharappa

Corresponding author

Correspondence to Ishita Arora.

Ethics declarations

Ethical approval

This research does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Arora, I., Gangadharappa, M. SRFCNM: Spatiotemporal recurrent fully convolutional network model for salient object detection. Multimed Tools Appl 83, 38009–38036 (2024). https://doi.org/10.1007/s11042-023-17009-x

Received: 05 August 2022
Revised: 28 June 2023
Accepted: 11 September 2023
Published: 03 October 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11042-023-17009-x
