Human Pose Estimation

  • Reference work entry
Computer Vision

Synonyms

Articulated pose estimation; Body configuration recovery

Definition

Human pose estimation is the process of estimating the configuration of the body (pose) from a single, typically monocular, image or video. The pose can be expressed in a variety of ways (e.g., joint positions/keypoints or angles between body parts) in either the image (2D) or the world (3D) coordinate frame.

Note: The problem of human pose estimation can also be formulated from simultaneous observations from multiple camera views (or one or more RGBD cameras), which can yield higher-fidelity results or alleviate annotation [46]. Such formulations are substantially less common, as they require cumbersome hardware setups that make them inappropriate for many applications.
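The representations named in the definition can be illustrated with a minimal sketch. It is not taken from the entry: the joint names, coordinates, and camera parameters (focal, cx, cy) are hypothetical values chosen for illustration, and a simple pinhole camera is assumed. The sketch stores a pose as 3D joint positions, projects them to 2D image keypoints, and derives one angle between body parts.

```python
import math

# Hypothetical 3D pose: joint name -> (x, y, z) in metres, camera coordinate frame.
pose_3d = {
    "shoulder": (0.0, 0.0, 2.0),
    "elbow": (0.3, 0.0, 2.0),
    "wrist": (0.3, 0.3, 2.0),
}


def project(point, focal=1000.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates.

    focal, cx, and cy are assumed intrinsics, not values from the entry.
    """
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def joint_angle(a, b, c):
    """Angle in degrees at joint b between the segments b->a and b->c."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


# 2D representation: keypoints in the image coordinate frame.
pose_2d = {name: project(p) for name, p in pose_3d.items()}

# Angle-based representation: the elbow angle between upper arm and forearm.
elbow_angle = joint_angle(pose_3d["shoulder"], pose_3d["elbow"], pose_3d["wrist"])
```

The same pose thus has a keypoint form (pose_2d or pose_3d) and an angle form (elbow_angle). Recovering pose_3d from pose_2d alone is ambiguous, because the projection discards depth.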

Background

Human pose estimation is one of the fundamental problems in computer vision and has been studied for well over 20 years. The reason for its importance is the abundance of...

References

  1. Agarwal A, Triggs B (2006) Recovering 3D human pose from monocular images. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(1): 44–58

  2. Alp Guler R, Trigeorgis G, Antonakos E, Snape P, Zafeiriou S, Kokkinos I (2017) Densereg: fully convolutional dense shape regression in-the-wild. In: IEEE Conference on Computer Vision and Pattern Recognition

  3. Alp Guler R, Neverova N, Kokkinos I (2018) Densepose: dense human pose estimation in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition

  4. Andriluka M, Roth S, Schiele B (2009) Pictorial structures revisited: people detection and articulated pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition

  5. Andriluka M, Roth S, Schiele B (2010) Monocular 3D pose estimation and tracking by detection. In: IEEE Conference on Computer Vision and Pattern Recognition

  6. Andriluka M, Pishchulin L, Gehler P, Schiele B (2014) 2D human pose estimation: new benchmark and state of the art analysis. In: IEEE Conference on Computer Vision and Pattern Recognition

  7. Bergtholdt M, Kappes J, Schmidt S, Schnörr C (2010) A study of parts-based object class detection using complete graphs. International Journal of Computer Vision 87:93–117

  8. Bo L, Sminchisescu C (2010) Twin gaussian processes for structured prediction. International Journal of Computer Vision 87:28–52

  9. Bo L, Sminchisescu C, Kanaujia A, Metaxas D (2008) Fast algorithms for large scale conditional 3D prediction. In: IEEE Conference on Computer Vision and Pattern Recognition

  10. Cai Y, Ge L, Liu J, Cai J, Cham T-J, Yuan J, Magnenat Thalmann N (2019) Exploiting spatial-temporal relationships for 3D pose estimation via graph convolutional networks. In: IEEE International Conference on Computer Vision

  11. Cao Z, Hidalgo Martinez G, Simon T, Wei S, Sheikh YA (2019) Openpose: realtime multi-person 2D pose estimation using part affinity fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(1):1–1

  12. Carreira J, Fragkiadaki K, Agrawal P, Malik J (2016) Human pose estimation with iterative error feedback. In: IEEE Conference on Computer Vision and Pattern Recognition

  13. Chen C-H, Tyagi A, Agrawal A, Drover D, MV R, Stojanov S, Rehg JM (2019) Unsupervised 3D pose estimation with geometric self-supervision. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition

  14. de Bem R, Arnab A, Golodetz S, Sapienza M, Torr P (2018) Deep fully-connected part-based models for human pose estimation. Machine Learning Research 95:327–342

  15. Eichner M, Ferrari V (2010) We are family: joint pose estimation of multiple persons. In: European Conference on Computer Vision

  16. Fang H-S, Xie S, Tai Y-W, Lu C (2017) RMPE: regional multi-person pose estimation. In: IEEE International Conference on Computer Vision

  17. Felzenszwalb PF, Huttenlocher DP (2005) Pictorial structures for object recognition. International Journal of Computer Vision 61(1):55–79

  18. Ferrari V, Marín-Jiménez MJ, Zisserman A (2008) Progressive search space reduction for human pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition

  19. Gall J, Rosenhahn B, Brox T, Seidel H-P (2010) Optimization and filtering for human motion capture. International Journal of Computer Vision 87(1–2):75–92

  20. Girshick R, Iandola F, Darrell T, Malik J (2015) Deformable part models are convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition

  21. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition

  22. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask RCNN. In: IEEE International Conference on Computer Vision

  23. Ionescu C, Papava D, Olaru V, Sminchisescu C (2014) Human3.6m: large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Transactions on Pattern Analysis and Machine Intelligence 36:1325–1339

  24. Jiang H (2009) Human pose estimation using consistent max-covering. In: IEEE International Conference on Computer Vision

  25. Kanaujia A, Sminchisescu C, Metaxas D (2007) Semi-supervised hierarchical models for 3D human pose reconstruction. In: IEEE Conference on Computer Vision and Pattern Recognition

  26. Kanazawa A, Black MJ, Jacobs DW, Malik J (2018) End-to-end recovery of human shape and pose. In: IEEE Conference on Computer Vision and Pattern Recognition

  27. Kiciroglu S, Rhodin H, Sinha S, Salzmann M, Fua P (2020) Activemocap: optimized drone flight for active human motion capture. In: IEEE Conference on Computer Vision and Pattern Recognition

  28. Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT Press, Cambridge

  29. Kolotouros N, Pavlakos G, Black MJ, Daniilidis K (2019) Learning to reconstruct 3D human pose and shape via model-fitting in the loop. In: IEEE International Conference on Computer Vision

  30. Lee MW, Cohen I (2004) Proposal maps driven MCMC for estimating human body pose in static images. In: IEEE Conference on Computer Vision and Pattern Recognition

  31. Lin T-Y, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollár P (2014) Microsoft coco: common objects in context. In: European Conference on Computer Vision

  32. Loper M, Mahmood N, Romero J, Pons-Moll G, Black MJ (2015) SMPL: a skinned multi-person linear model. ACM SIGGRAPH Asia 34(6):1–16

  33. Martinez J, Hossain R, Romero J, Little JJ (2017) A simple yet effective baseline for 3D human pose estimation. In: IEEE International Conference on Computer Vision, pp 2640–2649

  34. Mehta D, Sotnychenko O, Mueller F, Xu W, Elgharib M, Fua P, Seidel H-P, Rhodin H, Pons-Moll G, Theobalt C (2020) XNect: real-time multi-person 3D human pose estimation with a single RGB camera. In: ACM SIGGRAPH

  35. Mori G, Ren X, Efros A, Malik J (2004) Recovering human body configurations: combining segmentation and recognition. In: IEEE Conference on Computer Vision and Pattern Recognition

  36. Navaratnam R, Fitzgibbon A, Cipolla R (2007) The joint manifold model for semi-supervised multi-valued regression. In: IEEE International Conference on Computer Vision

  37. Newell A, Yang K, Deng J (2016) Stacked hourglass networks for human pose estimation. In: European Conference on Computer Vision

  38. Papandreou G, Kanazawa N, Zhu T, Toshev A, Tompson J, Bregler C, Murphy K (2017) Towards accurate multi-person pose estimation in the wild. In: IEEE Conference on Computer Vision and Pattern Recognition

  39. Papandreou G, Zhu T, Chen L-C, Gidaris S, Tompson J, Murphy K (2018) Personlab: person pose estimation and instance segmentation with a bottom-up, part-based, geometric embedding model. In: European Conference on Computer Vision

  40. Pavlakos G, Zhou X, Derpanis K, Daniilidis K (2017) Coarse-to-fine volumetric prediction for single-image 3D human pose. In: IEEE Conference on Computer Vision and Pattern Recognition

  41. Peng XB, Kanazawa A, Malik J, Abbeel P, Levine S (2018) SFV: reinforcement learning of physical skills from videos. ACM Trans Graph 37:1–14

  42. Pishchulin L, Insafutdinov E, Tang S, Andres B, Andriluka M, Gehler P, Schiele B (2016) Deepcut: joint subset partition and labeling for multi-person pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition

  43. Ramanan D (2006) Learning to parse images of articulated bodies. Neural Information and Processing Systems 19:1129–1136

  44. Ren X, Berg AC, Malik J (2005) Recovering human body configurations using pair-wise constraints between parts. In: International Conference on Computer Vision

  45. Rhodin H, Salzmann M, Fua P (2018) Unsupervised geometry-aware representation for 3D human pose estimation. In: European Conference on Computer Vision

  46. Rhodin H, Spörri J, Katircioglu I, Constantin V, Meyer F, Müller E, Salzmann M, Fua P (2018) Learning monocular 3D human pose estimation from multi-view images. In: IEEE Conference on Computer Vision and Pattern Recognition

  47. Rhodin H, Constantin V, Katircioglu I, Salzmann M, Fua P (2019) Neural scene decomposition for multi-person motion capture. In: IEEE Conference on Computer Vision and Pattern Recognition

  48. Shakhnarovich G, Viola P, Darrell T (2003) Fast pose estimation with parameter sensitive hashing. In: International Conference on Computer Vision

  49. Sigal L, Black MJ (2006) Measure locally, reason globally: occlusion-sensitive articulated pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition

  50. Sigal L, Isard M, Sigelman BH, Black MJ (2003) Attractive people: assembling loose-limbed models using non-parametric belief propagation. Advances in Neural Information Processing Systems 16:1539–1546

  51. Sigal L, Balan A, Black MJ (2007) Combined discriminative and generative articulated pose and non-rigid shape estimation. In: Neural Information and Processing Systems

  52. Sigal L, Memisevic R, Fleet DJ (2009) Shared kernel information embedding for discriminative inference. In: IEEE Conference on Computer Vision and Pattern Recognition

  53. Singh VK, Nevatia R, Huang C (2010) Efficient inference with multiple heterogeneous part detectors for human pose estimation. In: European Conference on Computer Vision, pp 314–327

  54. Sun K, Xiao B, Liu D, Wang J (2019) Deep high-resolution representation learning for human pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition

  55. Tian T-P, Sclaroff S (2010) Fast globally optimal 2D human detection with loopy graph models. In: IEEE Conference on Computer Vision and Pattern Recognition

  56. Tompson JJ, Jain A, LeCun Y, Bregler C (2014) Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in Neural Information Processing Systems, pp 1799–1807

  57. Tompson J, Goroshin R, Jain A, LeCun Y, Bregler C (2015) Efficient object localization using convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition

  58. Toshev A, Szegedy C (2014) Deeppose: human pose estimation via deep neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition

  59. Urtasun R, Darrell T (2008) Sparse probabilistic regression for activity-independent human pose inference. In: IEEE Conference on Computer Vision and Pattern Recognition

  60. Wei S-E, Ramakrishna V, Kanade T, Sheikh Y (2016) Convolutional pose machines. In: IEEE Conference on Computer Vision and Pattern Recognition

  61. Xiao B, Wu H, Wei Y (2018) Simple baselines for human pose estimation and tracking. In: European Conference on Computer Vision

  62. Xu Y, Zhu S-C, Tung T (2019) Denserac: joint 3D pose and shape estimation by dense render-and-compare. In: IEEE International Conference on Computer Vision

  63. Yang Y, Ramanan D (2011) Articulated pose estimation with flexible mixture-of-parts. In: IEEE Conference on Computer Vision and Pattern Recognition

  64. Zhang J, Luo J, Collins R, Liu Y (2006) Body localization in still images using hierarchical models and hybrid search. In: IEEE Conference on Computer Vision and Pattern Recognition

  65. Zhang H, Ouyang H, Liu S, Qi X, Shen X, Yang R, Jia J (2019) Human pose estimation with spatial contextual information. arXiv preprint arXiv:1901.01760

  66. Zhang JY, Felsen P, Kanazawa A, Malik J (2019) Predicting 3D human dynamics from video. In: International Conference on Computer Vision

  67. Zhao L, Peng X, Tian Y, Kapadia M, Metaxas DN (2019) Semantic graph convolutional networks for 3D human pose regression. In: IEEE Conference on Computer Vision and Pattern Recognition

  68. Zuffi S, Black MJ (2015) The stitched puppet: a graphical model of 3d human shape and pose. In: IEEE Conference on Computer Vision and Pattern Recognition

Author information

Corresponding author

Correspondence to Leonid Sigal.

Copyright information

© 2021 Springer International Publishing

About this entry

Cite this entry

Sigal, L. (2021). Human Pose Estimation. In: Ikeuchi, K. (eds) Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-030-63416-2_584
