Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

Tzionas, Dimitrios; Srikantha, Abhilash; Aponte, Pablo; Gall, Juergen

doi:10.1007/978-3-319-11752-2_22

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8753)

Included in the following conference series:

German Conference on Pattern Recognition

Abstract

Hand motion capture has been an active research topic, following the success of full-body pose tracking. Despite similarities, hand tracking proves to be more challenging, characterized by a higher dimensionality, severe occlusions and self-similarity between fingers. For this reason, most approaches rely on strong assumptions, like hands in isolation or expensive multi-camera systems, that limit practical use. In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera. Our approach combines a generative model with collision detection and discriminatively learned salient points. We quantitatively evaluate our approach on 14 new sequences with challenging interactions.

Notes

1. The annotated dataset sequences and the supplementary material are available at http://files.is.tue.mpg.de/dtzionas/GCPR_2014.html.

References

1. Albrecht, I., Haber, J., Seidel, H.P.: Construction and animation of anatomically based human hand models. In: SCA, pp. 98–109 (2003)
2. Athitsos, V., Sclaroff, S.: Estimating 3d hand pose from a cluttered image. In: CVPR, pp. 432–439 (2003)
3. Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 640–653. Springer, Heidelberg (2012)
4. Baran, I., Popović, J.: Automatic rigging and animation of 3d characters. TOG 26(3), 72 (2007)
5. Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. PAMI 24(4), 509–522 (2002)
6. Bregler, C., Malik, J., Pullen, K.: Twist based acquisition and tracking of animal and human kinematics. IJCV 56(3), 179–194 (2004)
7. de Campos, T., Murray, D.: Regression-based hand pose estimation from multiple cameras. In: CVPR, pp. 782–789 (2006)
8. Canny, J.: A computational approach to edge detection. PAMI 8(6), 679–698 (1986)
9. Chen, Y., Medioni, G.: Object modeling by registration of multiple range images. In: ICRA, pp. 2724–2729 (1991)
10. Ekvall, S., Kragic, D.: Grasp recognition for programming by demonstration. In: ICRA, pp. 748–753 (2005)
11. Erol, A., Bebis, G., Nicolescu, M., Boyle, R.D., Twombly, X.: Vision-based hand pose estimation: a review. CVIU 108(1–2), 52–73 (2007)
12. Felzenszwalb, P.F., Huttenlocher, D.P.: Distance transforms of sampled functions. Technical report. Cornell Computing and Information Science (2004)
13. Gall, J., Fossati, A., Van Gool, L.: Functional categorization of objects using real-time markerless motion capture. In: CVPR, pp. 1969–1976 (2011)
14. Gall, J., Yao, A., Razavi, N., Van Gool, L., Lempitsky, V.: Hough forests for object detection, tracking, and action recognition. PAMI 33(11), 2188–2202 (2011)
15. Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: ICCV, pp. 1475–1482 (2009)
16. Hamer, H., Gall, J., Weise, T., Van Gool, L.: An object-dependent hand pose prior from sparse training data. In: CVPR, pp. 671–678 (2010)
17. Heap, T., Hogg, D.: Towards 3d hand tracking using a deformable model. In: FG, pp. 140–145 (1996)
18. Holzer, S., Rusu, R., Dixon, M., Gedikli, S., Navab, N.: Adaptive neighborhood selection for real-time surface normal estimation from organized point cloud data using integral images. In: IROS, pp. 2684–2689 (2012)
19. Jones, M.J., Rehg, J.M.: Statistical color models with application to skin detection. IJCV 46(1), 81–96 (2002)
20. Keskin, C., Kıraç, F., Kara, Y.E., Akarun, L.: Hand pose estimation and hand shape classification using multi-layered randomized decision forests. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 852–863. Springer, Heidelberg (2012)
21. Kim, D., Hilliges, O., Izadi, S., Butler, A.D., Chen, J., Oikonomidis, I., Olivier, P.: Digits: freehand 3d interactions anywhere using a wrist-worn gloveless sensor. In: UIST, pp. 167–176 (2012)
22. Kyriazis, N., Argyros, A.: Physically plausible 3d scene tracking: the single actor hypothesis. In: CVPR, pp. 9–16 (2013)
23. Lewis, J.P., Cordner, M., Fong, N.: Pose space deformation: a unified approach to shape interpolation and skeleton-driven deformation. In: SIGGRAPH, pp. 165–172 (2000)
24. MacCormick, J., Isard, M.: Partitioned sampling, articulated objects, and interface-quality hand tracking. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 3–19. Springer, Heidelberg (2000)
25. Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. CVIU 104(2), 90–126 (2006)
26. Murray, R.M., Sastry, S.S., Zexiang, L.: A Mathematical Introduction to Robotic Manipulation (1994)
27. Oikonomidis, I., Kyriazis, N., Argyros, A.: Efficient model-based 3d tracking of hand articulations using kinect. In: BMVC, pp. 101.1–101.11 (2011)
28. Oikonomidis, I., Kyriazis, N., Argyros, A.: Full dof tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: ICCV, pp. 2088–2095 (2011)
29. Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Tracking the articulated motion of two strongly interacting hands. In: CVPR, pp. 1862–1869 (2012)
30. Paris, S., Durand, F.: A fast approximation of the bilateral filter using a signal processing approach. IJCV 81(1), 24–52 (2009)
31. Pons-Moll, G., Rosenhahn, B.: Model-Based Pose Estimation, pp. 139–170 (2011)
32. Rehg, J.M., Kanade, T.: Visual tracking of high dof articulated structures: an application to human hand tracking. In: Eklundh, J.-O. (ed.) ECCV 1994. LNCS, vol. 801, pp. 35–46. Springer, Heidelberg (1994)
33. Rehg, J., Kanade, T.: Model-based tracking of self-occluding articulated objects. In: ICCV, pp. 612–617 (1995)
34. Romero, J., Kjellström, H., Kragic, D.: Monocular real-time 3d articulated hand pose estimation. In: HUMANOIDS, pp. 87–92 (2009)
35. Romero, J., Kjellström, H., Kragic, D.: Hands in action: real-time 3d reconstruction of hands in interaction with objects. In: ICRA, pp. 458–463 (2010)
36. Rosales, R., Athitsos, V., Sigal, L., Sclaroff, S.: 3d hand pose reconstruction using specialized mappings. In: ICCV, pp. 378–387 (2001)
37. Rosenhahn, B., Brox, T., Weickert, J.: Three-dimensional shape knowledge for joint image segmentation and pose tracking. IJCV 73(3), 243–262 (2007)
38. Rusinkiewicz, S., Levoy, M.: Efficient variants of the icp algorithm. In: 3DIM, pp. 145–152 (2001)
39. Rusinkiewicz, S., Hall-Holt, O., Levoy, M.: Real-time 3d model acquisition. TOG 21(3), 438–446 (2002)
40. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: CVPR, pp. 1297–1304 (2011)
41. Sridhar, S., Oulasvirta, A., Theobalt, C.: Interactive markerless articulated hand motion tracking using rgb and depth data. In: ICCV, pp. 2456–2463 (2013)
42. Stenger, B., Mendonca, P., Cipolla, R.: Model-based 3D tracking of an articulated hand. In: CVPR, pp. 310–315 (2001)
43. Stolfi, J.: Oriented Proj. Geometry: A Framework for Geom. Computation (1991)
44. Teschner, M., Kimmerle, S., Heidelberger, B., Zachmann, G., Raghupathi, L., Fuhrmann, A., Cani, M.P., Faure, F., Magnenat-Thalmann, N., Strasser, W.: Collision detection for deformable objects. In: Eurographics, pp. 119–139 (2004)
45. Thayananthan, A., Stenger, B., Torr, P.H.S., Cipolla, R.: Shape context and chamfer matching in cluttered scenes. In: CVPR, pp. 127–133 (2003)
46. Tzionas, D., Gall, J.: A comparison of directional distances for hand pose estimation. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 131–141. Springer, Heidelberg (2013)
47. Vaezi, M., Nekouie, M.A.: 3d human hand posture reconstruction using a single 2d image. IJHCI 1(4), 83–94 (2011)
48. Wang, R.Y., Popović, J.: Real-time hand-tracking with a color glove. TOG 28(3), 68:1–68:8 (2009)

Acknowledgments

The authors acknowledge the help of Javier Romero and Jessica Purmort of MPI-IS regarding the acquisition of the personalized hand model, the assistance of Philipp Rybalov with annotation and the public software release of the FORTH tracker by the CVRL lab of FORTH-ICS, enabling comparison to [27]. Financial support was provided by the DFG Emmy Noether program (GA 1927/1-1).

Author information

Authors and Affiliations

Perceiving Systems Department, MPI for Intelligent Systems, Stuttgart, Germany
Dimitrios Tzionas & Abhilash Srikantha
Computer Vision Group, University of Bonn, Bonn, Germany
Dimitrios Tzionas, Abhilash Srikantha, Pablo Aponte & Juergen Gall

Corresponding author

Correspondence to Dimitrios Tzionas.

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Münster, Münster, Germany
Xiaoyi Jiang
Computer Science Department 5, University of Erlangen-Nürnberg, Erlangen, Germany
Joachim Hornegger
Department of Computer Science, University of Kiel, Kiel, Germany
Reinhard Koch

Cite this paper

Tzionas, D., Srikantha, A., Aponte, P., Gall, J. (2014). Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points. In: Jiang, X., Hornegger, J., Koch, R. (eds) Pattern Recognition. GCPR 2014. Lecture Notes in Computer Science, vol 8753. Springer, Cham. https://doi.org/10.1007/978-3-319-11752-2_22

DOI: https://doi.org/10.1007/978-3-319-11752-2_22
Published: 15 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11751-5
Online ISBN: 978-3-319-11752-2
eBook Packages: Computer Science, Computer Science (R0)