Stereo Based 3D Tracking and Scene Learning, Employing Particle Filtering within EM

  • Trausti Kristjansson
  • Hagai Attias
  • John Hershey
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3024)


We present a generative probabilistic model for 3D scenes with stereo views. With this model, we track an object in three dimensions while simultaneously learning its appearance and the appearance of the background. Because the model of the scene is generative, we are able to aggregate evidence over time. In addition, the probabilistic model naturally handles sources of variability.

For inference and learning in the model, we formulate an Expectation Maximization (EM) algorithm in which Rao-Blackwellized particle filtering is used in the E step. The stereo views of the scene are a strong source of disambiguating evidence and allow rapid convergence of the algorithm. The update equations have an appealing form and, as a side result, we give a generative probabilistic interpretation of the Sum of Squared Differences (SSD) metric known from the field of stereo vision.
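To illustrate how SSD can be read probabilistically, here is a minimal sketch. It does not reproduce the paper's update equations, which this page does not give. It assumes that each pixel of one view equals the matching pixel of the other view plus i.i.d. Gaussian noise of known variance `sigma2`; under that assumption the log-likelihood of a candidate match is `-SSD / (2 * sigma2)` plus a constant, so minimizing SSD is the same as maximizing likelihood. The function names, the 1-D signals, and the uniform prior over disparities are all illustrative assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally shaped patches."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(d * d))

def gaussian_log_likelihood(a, b, sigma2):
    """log N(a; b, sigma2 * I), which equals -SSD / (2 * sigma2) plus a constant."""
    n = np.asarray(a).size
    return -0.5 * ssd(a, b) / sigma2 - 0.5 * n * np.log(2.0 * np.pi * sigma2)

def disparity_posterior(left, right, x, width, disparities, sigma2):
    """Normalized weights over candidate disparities (uniform prior assumed)."""
    patch = left[x:x + width]
    log_w = np.array([
        gaussian_log_likelihood(patch, right[x - d:x - d + width], sigma2)
        for d in disparities
    ])
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

# Synthetic 1-D "stereo pair": the left signal is the right signal shifted
# by true_d samples, plus a small amount of Gaussian noise.
rng = np.random.default_rng(0)
right = rng.normal(size=64)
true_d = 3
left = np.roll(right, true_d) + 0.05 * rng.normal(size=64)

disparities = [0, 1, 2, 3, 4, 5]
weights = disparity_posterior(left, right, x=20, width=16,
                              disparities=disparities, sigma2=0.05 ** 2)
best = disparities[int(np.argmax(weights))]
```

In a particle-filter setting, weights of this kind could serve as per-hypothesis importance weights; how the paper actually combines them with the Rao-Blackwellized E step is not specified in this abstract.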





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Trausti Kristjansson (1)
  • Hagai Attias (1)
  • John Hershey (1)
  1. Microsoft Research, Redmond, USA
