
Background subtraction: separating the modeling and the inference

Machine Vision and Applications

Abstract

In its early implementations, background modeling was a process of building a model for the background of a video with a stationary camera, and identifying pixels that did not conform well to this model. The pixels that were not well-described by the background model were assumed to be moving objects. Many systems today maintain models for the foreground as well as the background, and these models compete to explain the pixels in a video. If the foreground model explains the pixels better, they are considered foreground. Otherwise they are considered background. In this paper, we argue that the logical endpoint of this evolution is to simply use Bayes’ rule to classify pixels. In particular, it is essential to have a background likelihood, a foreground likelihood, and a prior at each pixel. A simple application of Bayes’ rule then gives a posterior probability over the label. The only remaining question is the quality of the component models: the background likelihood, the foreground likelihood, and the prior. We describe a model for the likelihoods that is built by using not only the past observations at a given pixel location, but by also including observations in a spatial neighborhood around the location. This enables us to model the influence between neighboring pixels and is an improvement over earlier pixelwise models that do not allow for such influence. Although similar in spirit to the joint domain-range model, we show that our model overcomes certain deficiencies in that model. We use a spatially dependent prior for the background and foreground. The background and foreground labels from the previous frame, after spatial smoothing to account for movement of objects, are used to build the prior for the current frame. These components are, by themselves, not novel aspects in background modeling. As we will show, many existing systems account for these aspects in different ways. 
We argue that separating these components as suggested in this paper yields a very simple and effective model. Our intuitive description also isolates the model components from the classification or inference step. Improvements to each model component can be carried out without any changes to the inference or other components. The various components can hence be modeled effectively and their impact on the overall system understood more easily.
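The classification step described in the abstract is a direct application of Bayes' rule at each pixel. The following minimal sketch (not the authors' implementation) shows the inference step in isolation; the likelihood values here are made up, whereas in the paper they would come from kernel density estimates built over a spatial neighborhood of each pixel.

```python
import numpy as np

# Hypothetical per-pixel quantities for a small 4x4 frame. L_bg and L_fg
# stand in for the background and foreground likelihoods p(x | bg) and
# p(x | fg); here they are random placeholder values.
rng = np.random.default_rng(0)
L_bg = rng.uniform(0.05, 1.0, size=(4, 4))  # background likelihood
L_fg = rng.uniform(0.05, 1.0, size=(4, 4))  # foreground likelihood
P_bg = np.full((4, 4), 0.95)                # spatially dependent background prior
P_fg = 1.0 - P_bg                           # foreground prior

# Bayes' rule: posterior probability that each pixel is background.
evidence = L_bg * P_bg + L_fg * P_fg
post_bg = (L_bg * P_bg) / evidence

# Inference: label a pixel foreground when the posterior favors it.
labels = (post_bg < 0.5).astype(np.uint8)   # 1 = foreground, 0 = background
```

Because the model components (the two likelihoods and the prior) enter only through the three arrays above, any of them can be replaced without touching the inference code, which is the separation the paper advocates.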

Notes

  1. We have modified their equation to allow probabilistic contributions from the pixels and changed the notation to make it easily comparable to ours.

  2. Observations from the ground-truth labels from videos in the change detection data set [4] show that between 95 and 100% of all pixels labeled as background in each frame retain their background label in the next frame. We believe the use of the value \(0.95\) for the background prior is justified in light of this observation. The use of \(0.50\) for the background prior at pixel locations that were labeled as foreground in the previous frame essentially allows the likelihoods to decide the labels of these pixels in the current frame.

  3. The KDE and jKDE models are our own implementations and include spatially dependent priors and the Bayes classification criterion in order to make a fair comparison.

  4. For a detailed comparison of our model and the joint domain-range model, the reader is referred to our earlier paper [13].
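The prior construction described in note 2 (a background prior of 0.95 at previously background pixels, 0.50 at previously foreground pixels, followed by spatial smoothing to account for object motion) can be sketched as below. The function name and the box-blur smoothing radius are illustrative choices, not taken from the paper.

```python
import numpy as np

def background_prior(prev_fg_labels, smooth_radius=1):
    """Build a per-pixel background prior from the previous frame's labels.

    Previously background pixels get a high prior (0.95); previously
    foreground pixels get 0.50 so the likelihoods decide their label.
    A simple box blur spreads the low-prior region to account for object
    motion between frames (a stand-in for the paper's spatial smoothing).
    """
    prior = np.where(prev_fg_labels == 1, 0.50, 0.95).astype(np.float64)
    # Box blur via edge padding and summed shifts (numpy-only smoothing).
    k = 2 * smooth_radius + 1
    padded = np.pad(prior, smooth_radius, mode="edge")
    out = np.zeros_like(prior)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + prior.shape[0], dx:dx + prior.shape[1]]
    return out / (k * k)

prev = np.zeros((6, 6), dtype=np.uint8)
prev[2:4, 2:4] = 1                      # a small foreground blob last frame
prior_bg = background_prior(prev)       # lowered near the blob, 0.95 elsewhere
```

Pixels far from last frame's foreground keep the 0.95 prior, while pixels in and around the blob receive intermediate values, reflecting uncertainty about where the object has moved.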

References

  1. Aeschliman, C., Park, J., Kak, A.: A probabilistic framework for joint segmentation and tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1371–1378 (2010)

  2. Elgammal, A., Duraiswami, R., Davis, L.S.: Probabilistic tracking in joint feature-spatial spaces. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR '03), IEEE Computer Society, Washington, DC, USA, pp. 781–788 (2003). http://dl.acm.org/citation.cfm?id=1965841.1965943

  3. Elgammal, A.M., Harwood, D., Davis, L.S.: Non-parametric model for background subtraction. In: European Conference on Computer Vision, pp. 751–767 (2000)

  4. Goyette, N., Jodoin, P.M., Porikli, F., Konrad, J., Ishwar, P.: Changedetection.net: a new change detection benchmark dataset. In: IEEE Workshop on Change Detection (CDW 12) at CVPR (2012)

  5. Han, B., Davis, L.: On-line density-based appearance modeling for object tracking. In: Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV '05), vol. 2, IEEE Computer Society, Washington, DC, USA, pp. 1492–1499 (2005). doi:10.1109/ICCV.2005.181

  6. Kaewtrakulpong, P., Bowden, R.: An improved adaptive background mixture model for real-time tracking with shadow detection. In: Proceedings of 2nd European Workshop on Advanced Video Based Surveillance Systems, vol. 5308 (2001)

  7. Ko, T., Soatto, S., Estrin, D.: Background subtraction on distributions. In: European Conference on Computer Vision (ECCV '08), pp. 276–289. Springer, Berlin (2008)

  8. Li, L., Huang, W., Gu, I.Y.H., Tian, Q.: Foreground object detection from videos containing complex background. In: ACM International Conference on Multimedia, pp. 2–10 (2003)

  9. Liao, S., Zhao, G., Kellokumpu, V., Pietikäinen, M., Li, S.Z.: Modeling pixel process with scale invariant local patterns for background subtraction in complex scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1301–1306 (2010)

  10. Mittal, A., Paragios, N.: Motion-based background subtraction using adaptive kernel density estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. II-302–II-309 (2004)

  11. Narayana, M.: Automatic segmentation and tracking of moving objects in video for surveillance applications. Master’s thesis, University of Kansas, Lawrence, Kansas, USA (2007)

  12. Narayana, M., Hanson, A., Learned-Miller, E.: Background modeling using adaptive pixelwise kernel variances in a hybrid feature space. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  13. Narayana, M., Hanson, A., Learned-Miller, E.: Improvements in joint domain-range modeling for background subtraction. In: Proceedings of the British Machine Vision Conference. BMVA Press, pp. 115.1–115.11 (2012). http://dx.doi.org/10.5244/C.26.115

  14. Porikli, F., Tuzel, O.: Bayesian background modeling for foreground detection. In: Proceedings of the Third ACM International Workshop on Video Surveillance & Sensor Networks (VSSN '05), ACM, New York, NY, USA, pp. 55–58 (2005). doi:10.1145/1099396.1099407

  15. Sevilla-Lara, L., Learned-Miller, E.: Distribution fields for tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  16. Sheikh, Y., Shah, M.: Bayesian modeling of dynamic scenes for object detection. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1778–1792 (2005)

  17. Stauffer, C., Grimson, W.E.L.: Adaptive background mixture models for real-time tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, pp. 246–252 (1999)

  18. Tavakkoli, A., Nicolescu, M., Bebis, G., Nicolescu, M.: Non-parametric statistical background modeling for efficient foreground region detection. Mach. Vis. Appl. 7, 1–15 (2009)

  19. Toyama, K., Krumm, J., Brumitt, B., Meyers, B.: Wallflower: principles and practice of background maintenance. In: IEEE International Conference on Computer Vision, vol. 1, pp. 255–261 (1999). doi:10.1109/ICCV.1999.791228

  20. Turlach, B.A.: Bandwidth selection in kernel density estimation: a review. CORE and Institut de Statistique (1993)

  21. Wand, M.P., Jones, M.C.: Kernel smoothing. Chapman and Hall, London (1995)

  22. Wren, C.R., Azarbayejani, A., Darrell, T., Pentland, A.: Pfinder: real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intell. 19, 780–785 (1997). doi:10.1109/34.598236

  23. Yao, J., Odobez, J.M.: Multi-layer background subtraction based on color and texture. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8 (2007)

  24. Zivkovic, Z.: Improved adaptive Gaussian mixture model for background subtraction. In: International Conference on Pattern Recognition (ICPR), vol. 2, pp. 28–31 (2004)

  25. Zivkovic, Z., van der Heijden, F.: Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recognit. Lett. 27(7), 773–780 (2006). doi:10.1016/j.patrec.2005.11.005

Acknowledgments

This work was supported in part by the National Science Foundation under CAREER award IIS-0546666 and grant CNS-0619337. Any opinions, findings, conclusions, or recommendations expressed here are the authors’ and do not necessarily reflect those of the sponsors.

Author information

Corresponding author

Correspondence to Manjunath Narayana.

About this article

Cite this article

Narayana, M., Hanson, A. & Learned-Miller, E.G. Background subtraction: separating the modeling and the inference. Machine Vision and Applications 25, 1163–1174 (2014). https://doi.org/10.1007/s00138-013-0569-y
