Abstract
In this paper we present a novel vision-based markerless hand pose estimation scheme that takes depth image sequences as input. The proposed scheme exploits both temporal constraints and spatial features of the input sequence, and focuses on hand parsing and 3D fingertip localization for hand pose estimation. The hand parsing algorithm incorporates a novel spatial-temporal feature into a Bayesian inference framework to assign the correct label to each image pixel. The 3D fingertip localization algorithm adapts a recently developed geodesic extrema extraction method to fingertip detection by combining it with the hand parsing algorithm, a novel path-reweighting method, and K-means clustering in metric space. The detected 3D fingertip locations are finally used for hand pose estimation with an inverse kinematics solver. Quantitative experiments on synthetic data show that the proposed hand pose estimation scheme can accurately capture natural hand motion. A simulated water-oscillator application is also built to demonstrate the effectiveness of the proposed method in human-computer interaction scenarios.
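The fingertip localization step builds on geodesic extrema extraction. As a rough illustration only, the sketch below shows one common way such extrema can be found on a depth image: run Dijkstra over the grid of hand pixels from a palm seed, take the farthest pixel, add it to the seed set, and repeat. This is not the authors' implementation; it omits the hand parsing labels, the path-reweighting method, and the K-means clustering mentioned above. The function names, the 8-connected grid graph, and the use of `None` for background pixels are all assumptions made for the example.

```python
import heapq
import math


def geodesic_distances(depth, sources):
    """Multi-source Dijkstra over the 8-connected grid of valid pixels.

    depth: 2D list of floats; None marks background (assumed convention).
    sources: list of (row, col) seeds that start at distance zero.
    Returns a dict mapping each reachable pixel to its geodesic distance.
    """
    rows, cols = len(depth), len(depth[0])
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), math.inf):
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0):
                    continue
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if depth[nr][nc] is None:
                    continue
                # Edge weight: 3D step length between neighbouring pixels.
                dz = depth[nr][nc] - depth[r][c]
                nd = d + math.sqrt(dr * dr + dc * dc + dz * dz)
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist


def geodesic_extrema(depth, palm, k):
    """Return k extrema, each the farthest pixel from all earlier seeds."""
    sources = [palm]
    extrema = []
    for _ in range(k):
        dist = geodesic_distances(depth, sources)
        far = max(dist, key=dist.get)
        extrema.append(far)
        # Treating the new extremum as a seed keeps later extrema away from it.
        sources.append(far)
    return extrema
```

On a real depth map the candidates returned this way would still need filtering, since wrist or knuckle points can also be geodesically far from the palm; the scheme summarized above addresses this with its parsing and reweighting steps.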
Acknowledgements
This research, which is carried out at BeingThere Centre, is supported by the Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office.
About this article
Cite this article
Liang, H., Yuan, J., Thalmann, D. et al. Model-based hand pose estimation via spatial-temporal hand parsing and 3D fingertip localization. Vis Comput 29, 837–848 (2013). https://doi.org/10.1007/s00371-013-0822-4
DOI: https://doi.org/10.1007/s00371-013-0822-4