Predicting 3D People from 2D Pictures

  • Leonid Sigal
  • Michael J. Black
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4069)


We propose a hierarchical process for inferring the 3D pose of a person from monocular images. First we infer a learned view-based 2D body model from a single image using non-parametric belief propagation. This approach integrates information from bottom-up body-part proposal processes and deals with self-occlusion to compute distributions over limb poses. Then, we exploit a learned Mixture of Experts model to infer a distribution of 3D poses conditioned on 2D poses. This approach is more general than recent work on inferring 3D pose directly from silhouettes since the 2D body model provides a richer representation that includes the 2D joint angles and the poses of limbs that may be unobserved in the silhouette. We demonstrate the method in a laboratory setting where we evaluate the accuracy of the 3D poses against ground truth data. We also estimate 3D body pose in a monocular image sequence. The resulting 3D estimates are sufficiently accurate to serve as proposals for the Bayesian inference of 3D human motion over time.
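To make the second stage of the pipeline concrete, below is a minimal sketch (Python/NumPy) of a Mixture of Experts regressor that maps a 2D pose vector to a multi-modal distribution over 3D poses. This is not the authors' implementation: the linear experts, softmax gating, random (untrained) parameters, and the 10-joint dimensionalities are illustrative assumptions.

import numpy as np

class MixtureOfExpertsRegressor:
    # Each expert is a linear map from the 2D pose vector to a 3D pose vector;
    # a softmax gate over the same input mixes the experts, so the prediction
    # is a multi-modal (mixture) distribution over 3D poses rather than a
    # single estimate. Parameters here are random, i.e. untrained.

    def __init__(self, n_experts, dim_2d, dim_3d, seed=0):
        rng = np.random.default_rng(seed)
        self.W_gate = rng.normal(scale=0.1, size=(n_experts, dim_2d + 1))            # gating weights
        self.W_expert = rng.normal(scale=0.1, size=(n_experts, dim_3d, dim_2d + 1))  # per-expert regressors

    def predict_mixture(self, x2d):
        # Returns (mixture weights, per-expert 3D mean poses) for one 2D pose.
        x = np.append(x2d, 1.0)                           # append bias term
        logits = self.W_gate @ x
        gates = np.exp(logits - logits.max())
        gates /= gates.sum()                              # softmax gating probabilities
        means = np.einsum('kij,j->ki', self.W_expert, x)  # each expert's 3D pose estimate
        return gates, means

# Toy usage: 10 joints in 2D (x, y) -> 10 joints in 3D (x, y, z).
moe = MixtureOfExpertsRegressor(n_experts=3, dim_2d=20, dim_3d=30)
weights, pose_means = moe.predict_mixture(np.zeros(20))
print(weights.shape, pose_means.shape)   # (3,) (3, 30)

In practice the gate and experts would be trained on paired 2D/3D poses; the multi-modal output is what allows the model to represent the depth ambiguities inherent in monocular images.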


Keywords: Ground Truth Data · Relevance Vector Regression · Graphical Model Representation · Monocular Image Sequence · Annealed Particle Filter



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Leonid Sigal (1)
  • Michael J. Black (1)

  1. Department of Computer Science, Brown University, Providence, USA
