Real-time 3D human pose recovery from a single depth image using principal direction analysis

Abstract

In this paper, we present a novel approach to recover a 3D human pose in real-time from a single depth image using principal direction analysis (PDA). Human body parts are first recognized from a human depth silhouette via trained random forests (RFs). PDA is applied to each recognized body part, which is represented as a set of 3D points, to estimate its principal direction. Finally, a 3D human pose is recovered by mapping the principal direction to the corresponding body part of a 3D synthetic human model. We perform both quantitative and qualitative evaluations of our proposed 3D human pose recovery methodology. We show that our proposed approach has a low average reconstruction error of 7.07 degrees for four key joint angles and performs more reliably on a sequence of unconstrained poses than conventional methods. In addition, our methodology runs at 20 FPS on a standard PC, indicating that our system is suitable for real-time applications. Our 3D pose recovery methodology is applicable to applications ranging from human-computer interaction to human activity recognition.
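The core of PDA as described in the abstract reduces, for each recognized body part, to finding the axis of greatest variance of that part's 3D point set — the eigenvector of the points' covariance matrix with the largest eigenvalue. The sketch below is our own minimal illustration of this idea (not the authors' code); the synthetic "forearm" point cloud is a made-up example.

```python
import numpy as np

def principal_direction(points):
    """Estimate the principal direction of a 3D point cloud.

    points: (N, 3) array of 3D points belonging to one body part.
    Returns a unit vector along the axis of greatest variance,
    i.e. the eigenvector of the covariance matrix with the
    largest eigenvalue.
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)           # center on the centroid
    cov = centered.T @ centered / len(pts)      # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: symmetric input
    direction = eigvecs[:, np.argmax(eigvals)]  # largest-eigenvalue axis
    return direction / np.linalg.norm(direction)

# Synthetic "forearm": points scattered tightly around the z-axis
rng = np.random.default_rng(0)
part = rng.normal(scale=0.01, size=(500, 3))
part[:, 2] += np.linspace(0.0, 0.3, 500)

d = principal_direction(part)
```

Since an eigenvector's sign is arbitrary, `d` and `-d` describe the same axis; a real pipeline would fix the sign using the body's kinematic hierarchy (e.g. pointing from the part's proximal joint toward its distal joint) before mapping it onto the synthetic model.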



Acknowledgments

This research was supported by the MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) under grant NIPA-2013-(H0301-13-2001). This work was also supported by the Industrial Strategic Technology Development Program (10035348, Development of a Cognitive Planning and Learning Model for Mobile Platforms) funded by the Ministry of Knowledge Economy (MKE, Korea).

Author information


Correspondence to Sungyoung Lee or Tae-Seong Kim.


About this article

Cite this article

Dinh, D.L., Lim, M.J., Thang, N.D. et al. Real-time 3D human pose recovery from a single depth image using principal direction analysis. Appl Intell 41, 473–486 (2014). https://doi.org/10.1007/s10489-014-0535-z

Keywords

  • 3D human pose recovery
  • Depth image
  • Body part recognition
  • Principal direction analysis