IoT-based 3D convolution for video salient object detection

Dong, Shizhou; Gao, Zhifan; Pirbhulal, Sandeep; Bian, Gui-Bin; Zhang, Heye; Wu, Wanqing; Li, Shuo

doi:10.1007/s00521-018-03971-3

Intelligent Biomedical Data Analysis and Processing
Published: 02 January 2019

Neural Computing and Applications, volume 32, pages 735–746 (2020)

Shizhou Dong¹,²,
Zhifan Gao³,
Sandeep Pirbhulal¹,
Gui-Bin Bian⁴ (ORCID: orcid.org/0000-0003-4708-2245),
Heye Zhang⁵,
Wanqing Wu¹ &
Shuo Li³

Abstract

Video salient object detection (SOD) is the first step for devices in the Internet of Things (IoT) to understand the environment around them. Video SOD needs the objects' motion information across contiguous video frames as well as the spatial contrast information within a single video frame. Many IoT devices (e.g., surveillance cameras and smartphones) have low hardware configurations, so their computing power is not sufficient to support the expensive motion estimation of existing SOD methods. To model the objects' motion information efficiently for SOD, we propose an end-to-end video SOD algorithm with an efficient representation of that motion information. The algorithm contains two major parts: a 3D convolution-based X-shape structure that directly and efficiently represents the motion information in successive video frames, and 2D densely connected convolutional neural networks (DenseNet) with a pyramid structure that extract the rich spatial contrast information in a single video frame. Our method not only keeps the number of parameters as small as that of a 2D convolutional neural network but also represents spatiotemporal information uniformly, which enables end-to-end training. We evaluate the proposed method on four benchmark datasets. The results show that it achieves state-of-the-art performance compared with five other methods.
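To illustrate why a 3D convolution can represent motion across successive frames, the following minimal sketch applies a naive 3D convolution to two tiny synthetic clips. It is not the X-shape structure or the DenseNet pyramid described in the abstract. The helper names (`conv3d_valid`, `make_clip`, `energy`) and the hand-picked temporal-difference kernel are assumptions made for this example; a trained network would learn its kernels from data.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation of clip[t][y][x] with kernel[dt][dy][dx]."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        frame = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The sum runs over time as well as space, so one kernel
                # sees several frames at once.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            frame.append(row)
        out.append(frame)
    return out


def make_clip(x_positions, size=5):
    """Synthetic clip: one bright pixel per frame, on row 2, at the given x positions."""
    clip = []
    for x in x_positions:
        frame = [[0.0] * size for _ in range(size)]
        frame[2][x] = 1.0
        clip.append(frame)
    return clip


def energy(response):
    """Total absolute response over all output frames."""
    return sum(abs(v) for frame in response for row in frame for v in row)


# Hypothetical hand-picked kernel spanning 2 frames and 1x1 pixels:
# it computes (next frame) - (current frame) at each location.
temporal_diff = [[[-1.0]], [[1.0]]]

static_clip = make_clip([2, 2, 2])   # the pixel stays in place
moving_clip = make_clip([1, 2, 3])   # the pixel moves one step right per frame

static_resp = conv3d_valid(static_clip, temporal_diff)
moving_resp = conv3d_valid(moving_clip, temporal_diff)

static_energy = energy(static_resp)
moving_energy = energy(moving_resp)
print("static:", static_energy, "moving:", moving_energy)
```

In this toy setting the static clip produces no response while the moving clip does, which is the kind of temporal cue a 2D convolution applied to a single frame cannot produce on its own.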

Funding

This study was funded by the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Grant No. 218165) and the Shenzhen Key Laboratory of Neuropsychiatric Modulation (CN) (Grant No. JCYJ20170307165309009).

Author information

Authors and Affiliations

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Shizhou Dong, Sandeep Pirbhulal & Wanqing Wu
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences Shenzhen, Shenzhen, China
Shizhou Dong
Western University, London, Canada
Zhifan Gao & Shuo Li
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Gui-Bin Bian
Sun Yat-Sen University, Guangzhou, Guangdong, China
Heye Zhang

Corresponding author

Correspondence to Gui-Bin Bian.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dong, S., Gao, Z., Pirbhulal, S. et al. IoT-based 3D convolution for video salient object detection. Neural Comput & Applic 32, 735–746 (2020). https://doi.org/10.1007/s00521-018-03971-3

Received: 14 September 2018
Accepted: 20 December 2018
Published: 02 January 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00521-018-03971-3
