Hand Pose Estimation from a Single RGB-D Image

  • Alina Kuznetsova
  • Bodo Rosenhahn
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8034)


Hand pose estimation is an important task in areas such as human-computer interaction (HCI), sign language recognition, and robotics. Due to the high variability in hand appearance and the many degrees of freedom (DoFs) of the hand, hand pose estimation and tracking are very challenging, and different sources of data and methods are used to solve this problem. In this paper, we propose a method for model-based full-DoF hand pose estimation from a single RGB-D image. The main advantage of the proposed method is that no prior manual initialization is required and only very general assumptions about the hand pose are made. Therefore, this method can be used for hand pose estimation from a single RGB-D image, as an initialization step for subsequent tracking, or for tracking recovery.


Keywords (added by machine, not by the authors): Point Cloud; Machine Learning Method; Depth Image; Kinect Sensor; Hand Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Alina Kuznetsova (1)
  • Bodo Rosenhahn (1)
  1. Institute for Information Processing (TNT), Leibniz University Hanover, Germany
