The Visual Computer, Volume 30, Issue 10, pp 1133–1144

Real-time and robust hand tracking with a single depth camera

  • Ziyang Ma
  • Enhua Wu
Original Article


In this paper, we introduce a novel, real-time and robust hand tracking system capable of tracking articulated hand motion in full degrees of freedom (DOF) using a single depth camera. Unlike most previous systems, our system is able to initialize and recover from tracking loss automatically. This is achieved through an efficient two-stage k-nearest neighbor database search method proposed in the paper. The method searches a pre-rendered database of small hand depth images and is designed to provide good initial guesses for model-based tracking. For model-based tracking, we also propose a robust objective function and improve the Particle Swarm Optimization algorithm with a resampling-based strategy. This approach provides continuous solutions in the full-DOF hand motion space more efficiently than previous methods. Our system runs at 40 fps on a GeForce GTX 580 GPU, and experimental results show that it outperforms state-of-the-art model-based hand tracking systems in terms of both speed and accuracy. This work is of significance to various applications in the fields of human–computer interaction and virtual reality.
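The abstract does not spell out the two stages of the database search, so the following is only a minimal sketch of one common way to structure a two-stage k-nearest neighbor lookup: a cheap coarse pass that shortlists candidates, followed by an exact re-ranking of the shortlist. The block-averaged descriptor, the squared Euclidean distance, and all names (`downsample`, `two_stage_knn`, `shortlist`, `factor`) are assumptions made for illustration and are not taken from the paper.

```python
import heapq


def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D list.

    Assumed coarse descriptor; the paper's actual features are not given here.
    """
    rows, cols = len(image), len(image[0])
    coarse = []
    for r in range(0, rows, factor):
        for c in range(0, cols, factor):
            block = [image[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse.append(sum(block) / len(block))
    return coarse


def flatten(image):
    """Turn a 2D depth image into a flat list of pixel values."""
    return [value for row in image for value in row]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_stage_knn(query, database, k=2, shortlist=4, factor=2):
    """Return indices of the k database images closest to the query.

    Stage 1 ranks every entry by a cheap coarse descriptor and keeps
    `shortlist` candidates. Stage 2 re-ranks only those candidates using
    the full-resolution distance.
    """
    q_coarse = downsample(query, factor)
    # In practice the coarse descriptors would be precomputed offline.
    coarse_db = [downsample(img, factor) for img in database]
    candidates = heapq.nsmallest(
        shortlist, range(len(database)),
        key=lambda i: sq_dist(q_coarse, coarse_db[i]))
    q_full = flatten(query)
    return heapq.nsmallest(
        k, candidates,
        key=lambda i: sq_dist(q_full, flatten(database[i])))
```

In a sketch like this, `shortlist` trades recall for speed: a larger shortlist makes the coarse pass less likely to discard a true neighbor, at the cost of more full-resolution comparisons in the second stage.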


Keywords: Hand tracking, Virtual reality, Motion capture, User interface, 3D interaction



The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This research is supported by the National 973 Program of Basic Research on Science and Technology (2009CB320800), NSFC (61272326), and a research grant from the University of Macau.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. University of Macau, Macau, China
