International Journal of Computer Vision, Volume 29, Issue 1, pp 5–28

CONDENSATION—Conditional Density Propagation for Visual Tracking

  • Michael Isard
  • Andrew Blake

Abstract

The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time.
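As a rough illustration of what propagating a random set over time can look like, the following is a minimal sketch of one resample, predict, measure cycle on a weighted sample set. It is not the authors' implementation: the one-dimensional state, the random-walk dynamics (standing in for the learned dynamical models), the Gaussian observation likelihood, and all names and parameters (`condensation_step`, `track`, `drift`, `process_std`, `obs_std`) are assumptions made for illustration only.

```python
import math
import random


def condensation_step(samples, weights, observation, rng,
                      drift=0.0, process_std=1.0, obs_std=1.0):
    """One resample-predict-measure iteration over a weighted sample set."""
    n = len(samples)
    # 1. Resample: draw n samples with probability proportional to weight,
    #    so likely hypotheses are duplicated and unlikely ones die out.
    resampled = rng.choices(samples, weights=weights, k=n)
    # 2. Predict: push each sample through an ASSUMED dynamical model
    #    (deterministic drift plus Gaussian diffusion).
    predicted = [s + drift + rng.gauss(0.0, process_std) for s in resampled]
    # 3. Measure: reweight each sample by an ASSUMED Gaussian likelihood
    #    of the current observation given that sample.
    new_weights = [math.exp(-0.5 * ((observation - s) / obs_std) ** 2)
                   for s in predicted]
    total = sum(new_weights)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]
    return predicted, new_weights


def weighted_mean(samples, weights):
    """Point estimate taken as the mean of the weighted sample set."""
    return sum(s * w for s, w in zip(samples, weights))


def track(observations, n_samples=500, seed=0):
    """Run the step over a sequence of scalar observations."""
    rng = random.Random(seed)
    # Broad uniform prior over the hypothetical 1-D state.
    samples = [rng.uniform(-10.0, 10.0) for _ in range(n_samples)]
    weights = [1.0 / n_samples] * n_samples
    estimates = []
    for z in observations:
        samples, weights = condensation_step(samples, weights, z, rng)
        estimates.append(weighted_mean(samples, weights))
    return estimates


if __name__ == "__main__":
    print(track([2.0] * 10))
```

Because the density is carried by the sample set itself rather than by a mean and covariance, several separated clusters of samples can coexist, which is how alternative hypotheses are kept alive where a single Gaussian cannot represent them. In a contour tracker the scalar state would be replaced by curve shape parameters and the likelihood by an image-based observation model.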

Copyright information

© Kluwer Academic Publishers 1998

Authors and Affiliations

  • Michael Isard (1)
  • Andrew Blake (1)
  1. Department of Engineering Science, University of Oxford, Oxford, UK
