CONDENSATION—Conditional Density Propagation for Visual Tracking

Abstract

The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time.
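
The propagation of a random sample set described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional state, a simple drift-plus-Gaussian-noise dynamical model in place of the learned dynamics, and a hypothetical sum-of-Gaussians observation likelihood standing in for image measurements along a contour. All function and parameter names (`condensation_step`, `mixture_likelihood`, `drift`, `diffusion`) are illustrative assumptions.

```python
import math
import random


def condensation_step(samples, weights, observe, drift=1.0, diffusion=0.5, rng=random):
    """One select/predict/measure iteration over a weighted sample set.

    samples, weights: the current random set and its normalized weights.
    observe: function mapping a state to an (unnormalized) likelihood.
    drift, diffusion: parameters of an ASSUMED dynamical model
    x_t = drift * x_{t-1} + N(0, diffusion^2), not the learned one.
    """
    n = len(samples)
    # Select: draw n samples with replacement, proportionally to their weights.
    chosen = rng.choices(samples, weights=weights, k=n)
    # Predict: deterministic drift followed by random diffusion.
    predicted = [drift * s + rng.gauss(0.0, diffusion) for s in chosen]
    # Measure: reweight each predicted sample by the observation likelihood.
    raw = [observe(s) for s in predicted]
    total = sum(raw)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in raw]
    return predicted, new_weights


def mixture_likelihood(peaks, sigma):
    """Hypothetical multi-modal likelihood: one Gaussian bump per candidate feature."""
    def observe(x):
        return sum(math.exp(-0.5 * ((x - z) / sigma) ** 2) for z in peaks)
    return observe


def weighted_mean(samples, weights):
    """Mean of the represented density, usable as a point estimate."""
    return sum(s * w for s, w in zip(samples, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 500
    samples = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    weights = [1.0 / n] * n
    # Two candidate features: the sample set can keep mass near both.
    observe = mixture_likelihood(peaks=[3.0, -4.0], sigma=1.0)
    for _ in range(5):
        samples, weights = condensation_step(samples, weights, observe, rng=rng)
    print("estimate:", weighted_mean(samples, weights))
```

Because the density is carried by the sample set rather than by a single mean and covariance, samples can cluster around several candidate features at once, which is the property the abstract contrasts with the unimodal Gaussian of a Kalman filter.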

Cite this article

Isard, M., Blake, A. CONDENSATION—Conditional Density Propagation for Visual Tracking. International Journal of Computer Vision 29, 5–28 (1998). https://doi.org/10.1023/A:1008078328650

Keywords

  • Probability Distribution
  • Computer Vision
  • Computer Image
  • Alternative Hypothesis
  • Visual Observation