Multiple Tree Models for Occlusion and Spatial Constraints in Human Pose Estimation

  • Yang Wang
  • Greg Mori
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5304)


Tree-structured models have been widely used for human pose estimation, in either 2D or 3D. While such models allow efficient learning and inference, they fail to capture dependencies between body parts other than the kinematic constraints between connected parts. In this paper, we consider the use of multiple tree models, rather than a single tree model, for human pose estimation. Our model alleviates the limitations of a single tree-structured model by combining information across different tree models. The parameters of each individual tree model are trained with the standard learning algorithms for a single tree-structured model. The different tree models are then combined discriminatively by a boosting procedure. We present experimental results showing the improvement of our approaches on two different datasets. On the first dataset, we use our multiple tree framework for occlusion reasoning. On the second dataset, we combine multiple deformable trees to capture spatial constraints between non-connected body parts.
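The idea of running exact inference in each tree and then pooling the results can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the combination rule, so the weighted sum of log-marginals used here is an assumption, and the weights are set by hand rather than learned by boosting. The function names `tree_marginals` and `combine_trees`, the potentials, and the part names are all hypothetical.

```python
import math
from collections import defaultdict


def tree_marginals(unary, edges, pairwise):
    """Exact per-part marginals of a tree-structured model via sum-product.

    unary:    {part: [potential for each state]} (non-negative values)
    edges:    list of (part_a, part_b) pairs forming a tree
    pairwise: {(part_a, part_b): matrix[state_a][state_b]}
    """
    neighbors = defaultdict(list)
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def potential(src, dst, s_src, s_dst):
        # Look up the pairwise potential regardless of stored edge direction.
        if (src, dst) in pairwise:
            return pairwise[(src, dst)][s_src][s_dst]
        return pairwise[(dst, src)][s_dst][s_src]

    messages = {}

    def message(src, dst):
        # Message from src to dst, computed recursively and cached.
        if (src, dst) not in messages:
            incoming = [message(n, src) for n in neighbors[src] if n != dst]
            out = []
            for s_dst in range(len(unary[dst])):
                total = 0.0
                for s_src in range(len(unary[src])):
                    belief = unary[src][s_src]
                    for m in incoming:
                        belief *= m[s_src]
                    total += belief * potential(src, dst, s_src, s_dst)
                out.append(total)
            messages[(src, dst)] = out
        return messages[(src, dst)]

    marginals = {}
    for part in unary:
        belief = list(unary[part])
        for n in neighbors[part]:
            belief = [b * x for b, x in zip(belief, message(n, part))]
        z = sum(belief)
        marginals[part] = [b / z for b in belief]
    return marginals


def combine_trees(marginals_per_tree, weights, eps=1e-12):
    """Assumed combination rule: weighted sum of log-marginals per part.

    Returns the highest-scoring state index for every part.
    """
    best = {}
    for part in marginals_per_tree[0]:
        n_states = len(marginals_per_tree[0][part])
        scores = [
            sum(w * math.log(m[part][s] + eps)
                for w, m in zip(weights, marginals_per_tree))
            for s in range(n_states)
        ]
        best[part] = max(range(n_states), key=scores.__getitem__)
    return best


# Toy example: three parts with two candidate states each (hypothetical values).
unary = {"torso": [1.0, 1.0], "arm": [1.0, 3.0], "leg": [2.0, 1.0]}
agree = [[2.0, 1.0], [1.0, 2.0]]  # potential favouring equal states

# Two different tree structures over the same parts.
tree_a = tree_marginals(unary, [("torso", "arm"), ("torso", "leg")],
                        {("torso", "arm"): agree, ("torso", "leg"): agree})
tree_b = tree_marginals(unary, [("torso", "arm"), ("arm", "leg")],
                        {("torso", "arm"): agree, ("arm", "leg"): agree})

# Hand-picked weights; in a boosting setting these would be learned.
estimate = combine_trees([tree_a, tree_b], weights=[0.6, 0.4])
```

Each tree alone sees only its own edges, so `tree_b` carries an arm-leg dependency that `tree_a` lacks. Pooling the per-part scores is one simple way to let both sets of constraints influence the final estimate.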


Keywords (machine-generated, not supplied by the authors): Tree Model, Markov Random Fields, Conditional Random Field, Spatial Constraint, Weak Learner



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yang Wang (1)
  • Greg Mori (1)
  1. School of Computing Science, Simon Fraser University, Canada
