Dynamic background modeling using deep learning autoencoder network

Multimedia Tools and Applications

Abstract

Background modeling is a major prerequisite for a variety of multimedia applications such as video surveillance and traffic monitoring. Numerous approaches have been proposed over the past few decades. However, the need for a real-time, low-cost approach based on artificial intelligence still exists. Moreover, a few recently proposed efficient approaches have not been validated on challenging applications, where their efficiency may degrade when tested. In this paper, an efficient deep learning technique based on an autoencoder network is used to model the background. The background model is obtained by training the deep learning network on the incoming frames of the surveillance video in an unsupervised manner. To optimize the weights of the network, a greedy layer-wise pre-training approach is used initially, and the network is then fine-tuned using a conjugate gradient based backpropagation algorithm. The performance of the algorithm is validated on the application of unattended object detection in a dynamic environment. A comprehensive assessment of the proposed method using the CDNET 2014 dataset and other datasets demonstrates the efficiency of the technique in background modeling.
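The training pipeline summarized in the abstract (unsupervised training on frames, greedy layer-wise pre-training, then fine-tuning by backpropagation) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The 16-pixel synthetic frames, the layer sizes `[8, 4]`, the learning rate, the epoch counts, the 0.3 detection threshold, and all function names are assumptions made for illustration. Plain full-batch gradient descent is used here as a stand-in for the conjugate gradient optimizer named in the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, layers):
    """Return the activations of every layer, input included."""
    acts = [X]
    for W, b in layers:
        acts.append(sigmoid(acts[-1] @ W + b))
    return acts

def loss(X, layers):
    """Mean over frames of half the squared reconstruction error."""
    return 0.5 * np.mean(np.sum((forward(X, layers)[-1] - X) ** 2, axis=1))

def descend(X, layers, epochs, lr):
    """Backpropagation with full-batch gradient descent.
    Assumption: this replaces the conjugate gradient step from the abstract."""
    layers = list(layers)
    for _ in range(epochs):
        acts = forward(X, layers)
        out = acts[-1]
        delta = (out - X) * out * (1.0 - out) / len(X)
        for i in reversed(range(len(layers))):
            W, b = layers[i]
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:  # propagate the error through the pre-update weights
                delta = (delta @ W.T) * acts[i] * (1.0 - acts[i])
            layers[i] = (W - lr * grad_W, b - lr * grad_b)
    return layers

def pretrain(X, hidden_sizes, rng, epochs=200, lr=0.5):
    """Greedy layer-wise pre-training: one shallow autoencoder per hidden layer."""
    encoders, decoders, data = [], [], X
    for n_hidden in hidden_sizes:
        n_in = data.shape[1]
        shallow = [(rng.normal(0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden)),
                   (rng.normal(0, 0.1, (n_hidden, n_in)), np.zeros(n_in))]
        enc, dec = descend(data, shallow, epochs, lr)
        encoders.append(enc)
        decoders.insert(0, dec)
        data = forward(data, [enc])[-1]  # codes become the next layer's input
    return encoders + decoders

# Hypothetical data: a static 16-pixel scene observed with small sensor noise.
rng = np.random.default_rng(0)
background = np.linspace(0.2, 0.8, 16)
frames = np.clip(background + rng.normal(0, 0.02, (200, 16)), 0, 1)

stack = pretrain(frames, hidden_sizes=[8, 4], rng=rng)
loss_pretrained = loss(frames, stack)
stack = descend(frames, stack, epochs=300, lr=0.5)  # fine-tune the whole stack
loss_finetuned = loss(frames, stack)

# A frame with a hypothetical unattended object covering the first three pixels.
test_frame = background.copy()
test_frame[:3] = 0.9
reconstruction = forward(test_frame[None, :], stack)[-1][0]
foreground = np.abs(test_frame - reconstruction) > 0.3
print(loss_pretrained, loss_finetuned, np.flatnonzero(foreground))
```

In this sketch the reconstruction plays the role of the background model: pixels whose value departs from the reconstruction by more than the chosen threshold are flagged as foreground. A real system would work on image patches or full frames and would need a threshold tuned to the scene.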

Author information

Corresponding author

Correspondence to Mala John.

About this article

Cite this article

Gracewell, J., John, M. Dynamic background modeling using deep learning autoencoder network. Multimed Tools Appl 79, 4639–4659 (2020). https://doi.org/10.1007/s11042-019-7411-0

  • DOI: https://doi.org/10.1007/s11042-019-7411-0
