Reconstructing 3D Human Pose from 2D Image Landmarks

Ramakrishna, Varun; Kanade, Takeo; Sheikh, Yaser

doi:10.1007/978-3-642-33765-9_41

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7575)

Abstract

Reconstructing an arbitrary configuration of 3D points from their projection in an image is an ill-posed problem. When the points hold semantic meaning, such as anatomical landmarks on a body, human observers can often infer a plausible 3D configuration, drawing on extensive visual memory. We present an activity-independent method to recover the 3D configuration of a human figure from 2D locations of anatomical landmarks in a single image, leveraging a large motion capture corpus as a proxy for visual memory. Our method solves for anthropometrically regular body pose and explicitly estimates the camera via a matching pursuit algorithm operating on the image projections. Anthropometric regularity (i.e., that limbs obey known proportions) is a highly informative prior, but directly applying such constraints is intractable. Instead, we enforce a necessary condition on the sum of squared limb-lengths that can be solved for in closed form to discourage implausible configurations in 3D. We evaluate performance on a wide variety of human poses captured from different viewpoints and show generalization to novel 3D configurations and robustness to missing data.
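
As a rough illustration of why a constraint on the sum of squared limb lengths is a necessary but not sufficient condition for anthropometric regularity, the following minimal Python sketch compares three toy poses. The four-joint chain, the joint coordinates, and all function names are hypothetical and chosen only for this example; the sketch does not reproduce the paper's solver, its matching pursuit step, or its camera estimation.

```python
# Hypothetical 4-joint kinematic chain; each pair indexes the two joints of a limb.
LIMBS = [(0, 1), (1, 2), (2, 3)]


def squared_limb_lengths(joints, limbs=LIMBS):
    """Squared Euclidean length of each limb for a list of 3D joint positions."""
    return [
        sum((p - q) ** 2 for p, q in zip(joints[a], joints[b]))
        for a, b in limbs
    ]


def has_regular_limbs(joints, target_squared, tol=1e-9):
    """Full condition: every limb matches its target squared length."""
    actual = squared_limb_lengths(joints)
    return all(abs(s - t) <= tol for s, t in zip(actual, target_squared))


def satisfies_relaxed_constraint(joints, target_squared, tol=1e-9):
    """Relaxed condition: only the sum of squared limb lengths must match."""
    return abs(sum(squared_limb_lengths(joints)) - sum(target_squared)) <= tol


# Reference pose with limb lengths 1, 2, 2 (squared: 1, 4, 4; sum: 9).
reference = [(0, 0, 0), (1, 0, 0), (1, 2, 0), (1, 2, 2)]
target = squared_limb_lengths(reference)

# Same limb lengths, different articulation: passes both checks.
bent = [(0, 0, 0), (0, 1, 0), (2, 1, 0), (2, 1, 2)]

# Every limb has squared length 3: wrong proportions, but the sum is still 9.
distorted = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3)]

for name, pose in [("reference", reference), ("bent", bent), ("distorted", distorted)]:
    print(name, has_regular_limbs(pose, target), satisfies_relaxed_constraint(pose, target))
```

Any pose with correct limb proportions passes the relaxed check, while the `distorted` pose shows that the converse does not hold. This is consistent with the abstract's wording that the condition discourages implausible configurations rather than ruling them out.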

Author information

Authors and Affiliations

Robotics Institute, Carnegie Mellon University, USA
Varun Ramakrishna, Takeo Kanade & Yaser Sheikh

Editor information

Editors and Affiliations

Microsoft Research Ltd., CB3 0FB, Cambridge, UK
Andrew Fitzgibbon
Dept. of Computer Science, University of North Carolina, 27599, Chapel Hill, NC, USA
Svetlana Lazebnik
California Institute of Technology, 91125, Pasadena, CA, USA
Pietro Perona
Institute of Industrial Science, The University of Tokyo, 153-8505, Tokyo, Japan
Yoichi Sato
INRIA, 38330, Montbonnot, France
Cordelia Schmid

About this paper

Cite this paper

Ramakrishna, V., Kanade, T., Sheikh, Y. (2012). Reconstructing 3D Human Pose from 2D Image Landmarks. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7575. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33765-9_41

DOI: https://doi.org/10.1007/978-3-642-33765-9_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33764-2
Online ISBN: 978-3-642-33765-9
eBook Packages: Computer Science, Computer Science (R0)