Abstract
Moving object detection is one of the key applications of video surveillance. Deep convolutional neural networks have gained increasing attention in video surveillance because of their effective feature learning ability. Their performance, however, is often degraded by video characteristics such as poor illumination and inclement weather. Handling such videos effectively calls for a suitably designed network architecture, and in particular for an appropriate number of convolutional layers, which has to be determined. In this study, we propose a customized deep convolutional encoder–decoder network, CEDSegNet, for moving object detection in video sequences. CEDSegNet is based on SegNet, with the number of encoder parts and the number of decoder parts each set to two. Customizing SegNet in this way improves detection performance while reducing the number of parameters. The two encoder parts and two decoder parts generate feature maps that preserve the fine details of object pixels in the videos. CEDSegNet is tested on multiple video sequences from the CDNet 2012 dataset. The detection results on the video frames are interpreted qualitatively, and performance is further evaluated using several quantitative indices. Both the qualitative and quantitative results demonstrate that CEDSegNet outperforms the state-of-the-art network models VGG16, VGG19, ResNet18 and ResNet50.
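The abstract does not name the quantitative indices. As a rough illustration only, the sketch below assumes the pixel-wise measures commonly reported for CDNet sequences (recall, precision, F-measure and percentage of wrong classifications) and computes them from a predicted foreground mask and a ground-truth mask. The function names `confusion_counts` and `detection_scores` and the toy masks are hypothetical and are not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized binary masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # foreground pixel detected correctly
            elif p and not t:
                fp += 1      # background pixel flagged as foreground
            elif t:
                fn += 1      # foreground pixel missed
            else:
                tn += 1      # background pixel left as background
    return tp, fp, fn, tn


def detection_scores(pred, truth):
    """Assumed indices: recall, precision, F-measure, PWC (in percent)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    pwc = 100.0 * (fp + fn) / total if total else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "pwc": pwc,
    }


# Toy 2x3 masks (1 = moving-object pixel, 0 = background); illustrative only.
example_truth = [[0, 1, 1],
                 [0, 1, 0]]
example_pred = [[0, 1, 0],
                [1, 1, 0]]
example_scores = detection_scores(example_pred, example_truth)
```

For the toy masks there are two true positives, one false positive and one false negative, so recall, precision and F-measure all come out at 2/3 and PWC at about 33.3%. On real CDNet frames the masks would first need the dataset's non-evaluated regions excluded, which this sketch leaves out.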
Availability of data:
A video database that supports the findings of this study is publicly available at http://jacarini.dinf.usherbrooke.ca/dataset2012.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Ganivada, A., Yara, S. A novel deep convolutional encoder–decoder network: application to moving object detection in videos. Neural Comput & Applic 35, 22027–22041 (2023). https://doi.org/10.1007/s00521-023-08956-5
DOI: https://doi.org/10.1007/s00521-023-08956-5