A novel deep convolutional encoder–decoder network: application to moving object detection in videos

  • Original Article
Neural Computing and Applications

Abstract

Moving object detection is one of the key applications of video surveillance. Deep convolutional neural networks have gained increasing attention in video surveillance because of their effective feature learning ability. Their performance, however, is often degraded by video characteristics such as poor illumination and inclement weather. Handling such videos effectively calls for a suitably designed network architecture; in particular, the number of convolutional layers must be chosen appropriately. In this study, we propose a customized deep convolutional encoder–decoder network, called CEDSegNet, for moving object detection in a video sequence. CEDSegNet is based on SegNet and uses two encoder parts and two decoder parts. This customization improves detection performance while reducing the number of parameters to some extent. The two encoder and decoder parts generate feature maps that preserve the fine details of object pixels in videos. CEDSegNet is tested on multiple video sequences of the CDNet 2012 dataset. The detection results on the video frames are interpreted qualitatively, and the performance is further evaluated using several quantitative indices. Both the qualitative and quantitative results demonstrate that CEDSegNet outperforms the state-of-the-art network models VGG16, VGG19, ResNet18 and ResNet50.
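
The abstract states that CEDSegNet is based on SegNet. In SegNet, each decoder stage upsamples using the max-pooling indices stored by the matching encoder stage, which is one way such networks retain boundary detail. The following is a minimal pure-Python sketch of that mechanism for a single 2x2 pooling stage. The function names are hypothetical, and nothing here reproduces the authors' layer configuration or training setup.

```python
# Illustrative sketch only: SegNet-style 2x2 max pooling that records the
# position of each maximum, and the matching decoder-side unpooling step.
# Function names are hypothetical; this is not the CEDSegNet implementation.

def max_pool_with_indices(fmap):
    """2x2 max pooling with stride 2. Returns (pooled, indices).

    indices[i][j] is the (row, col) in `fmap` of the maximum of window (i, j).
    Assumes `fmap` is a list of equal-length rows with even height and width.
    """
    pooled, indices = [], []
    for r in range(0, len(fmap), 2):
        pooled_row, index_row = [], []
        for c in range(0, len(fmap[0]), 2):
            # Collect the four values of the window with their positions.
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Put each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


feature_map = [
    [1, 3, 0, 2],
    [4, 2, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 1],
]
pooled, indices = max_pool_with_indices(feature_map)
restored = max_unpool(pooled, indices, 4, 4)
```

Here `restored` is a sparse map with each maximum at its original location. In a SegNet-style decoder, convolutions applied after this step would densify the map.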

Availability of data:

A video database that supports the findings of this study is publicly available at http://jacarini.dinf.usherbrooke.ca/dataset2012.
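
Scoring a detector on this dataset comes down to comparing predicted foreground masks with ground-truth masks pixel by pixel. The sketch below assumes the ground-truth label convention described in the dataset's documentation (0 static, 50 hard shadow, 85 outside the region of interest, 170 unknown motion, 255 motion) and computes precision, recall and F-measure, which are common choices for this benchmark. The abstract does not say which indices the authors used, and the function names are hypothetical.

```python
# Sketch of pixel-level scoring against a CDnet-style ground-truth mask.
# Assumed label convention (taken from the dataset documentation, not from
# this article): 0 static, 50 hard shadow, 85 outside ROI, 170 unknown
# motion, 255 motion. Function names are hypothetical.

MOTION = 255
IGNORED = {85, 170}  # pixels excluded from scoring under the assumed convention


def confusion_counts(predicted, ground_truth):
    """Count (TP, FP, FN, TN) over two equal-sized 2D lists.

    `predicted` holds 0 (background) or 1 (foreground) for each pixel.
    Shadow pixels (label 50) are treated as background.
    """
    tp = fp = fn = tn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, label in zip(pred_row, gt_row):
            if label in IGNORED:
                continue
            is_motion = label == MOTION
            if pred and is_motion:
                tp += 1
            elif pred and not is_motion:
                fp += 1
            elif not pred and is_motion:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F-measure), using 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


ground_truth = [
    [0, 255, 255],
    [50, 255, 85],
    [0, 170, 0],
]
predicted = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
tp, fp, fn, tn = confusion_counts(predicted, ground_truth)
precision, recall, f_score = precision_recall_f(tp, fp, fn)
```

In this toy case the detector is penalized for marking a shadow pixel as foreground, while its predictions on the two ignored pixels do not affect the score.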

Notes

  1. https://optiviewusa.com/cctv-video-resolutions/.

  2. http://jacarini.dinf.usherbrooke.ca/dataset2012/.

  3. http://jacarini.dinf.usherbrooke.ca/dataset2012/

Author information

Corresponding author

Correspondence to Srinivas Yara.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ganivada, A., Yara, S. A novel deep convolutional encoder–decoder network: application to moving object detection in videos. Neural Comput & Applic 35, 22027–22041 (2023). https://doi.org/10.1007/s00521-023-08956-5

  • DOI: https://doi.org/10.1007/s00521-023-08956-5
