A Survey on Human Motion Analysis from Depth Data

  • Mao Ye
  • Qing Zhang
  • Liang Wang
  • Jiejie Zhu
  • Ruigang Yang
  • Juergen Gall
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8200)

Abstract

Human pose estimation has been actively studied for decades. While traditional approaches rely on 2D data such as images or videos, the development of Time-of-Flight cameras and other depth sensors has created new opportunities to advance the field. We give an overview of recent approaches to human motion analysis, including depth-based and skeleton-based activity recognition, head pose estimation, facial feature detection, facial performance capture, hand pose estimation, and hand gesture recognition. While the focus is on approaches using depth data, we also discuss traditional image-based methods to provide a broad overview of recent developments in these areas.

References

  1. 1.
    Klette, R., Tee, G.: Understanding human motion: A historic review. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds.) Human Motion. Computational Imaging and Vision, vol. 36, pp. 1–22. Springer, Netherlands (2008)CrossRefGoogle Scholar
  2. 2.
    Aggarwal, J.: Motion analysis: Past, present and future. In: Bhanu, B., Ravishankar, C.V., Roy-Chowdhury, A.K., Aghajan, H., Terzopoulos, D. (eds.) Distributed Video Sensor Networks, pp. 27–39. Springer, London (2011)CrossRefGoogle Scholar
  3. 3.
    Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)Google Scholar
  4. 4.
    Aggarwal, J., Ryoo, M.: Human activity analysis: A review. ACM Computing Surveys 43(2), 16:1–16:43 (2011)Google Scholar
  5. 5.
    Mitra, S., Acharya, T.: Gesture recognition: A survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 37(3), 311–324 (2007)CrossRefGoogle Scholar
  6. 6.
    Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding 104(2), 90–126 (2006)CrossRefGoogle Scholar
  7. 7.
    Poppe, R.: A survey on vision-based human action recognition. Image and Vision Computing 28(6), 976–990 (2010)CrossRefGoogle Scholar
  8. 8.
    Li, W., Zhang, Z., Liu, Z.: Action recognition based on a bag of 3d points. In: Workshop on Human Activity Understanding from 3D Data, pp. 9–14 (2010)Google Scholar
  9. 9.
    Wang, J., Liu, Z., Wu, Y., Yuan, J.: Mining actionlet ensemble for action recognition with depth cameras. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1290–1297 (2012)Google Scholar
  10. 10.
    Kurakin, A., Zhang, Z., Liu, Z.: A real time system for dynamic hand gesture recognition with a depth sensor. In: 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp. 1975–1979 (2012)Google Scholar
  11. 11.
    Oreifej, O., Liu, Z.: Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)Google Scholar
  12. 12.
    Li, W., Zhang, Z., Liu, Z.: Expandable data-driven graphical modeling of human actions based on salient postures. IEEE Transactions on Circuits and Systems for Video Technology 18(11), 1499–1510 (2008)CrossRefGoogle Scholar
  13. 13.
    Vieira, A.W., Nascimento, E.R., Oliveira, G.L., Liu, Z., Campos, M.F.M.: STOP: Space-time occupancy patterns for 3D action recognition from depth map sequences. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds.) CIARP 2012. LNCS, vol. 7441, pp. 252–259. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  14. 14.
    Wang, J., Liu, Z., Chorowski, J., Chen, Z., Wu, Y.: Robust 3D action recognition with random occupancy patterns. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 872–885. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  15. 15.
    Yang, X., Zhang, C., Tian, Y.: Recognizing actions using depth motion maps-based histograms of oriented gradients. In: ACM International Conference on Multimedia, pp. 1057–1060 (2012)Google Scholar
  16. 16.
    Zhang, H., Parker, L.: 4-dimensional local spatio-temporal features for human activity recognition. In: International Conference on Intelligent Robots and Systems, pp. 2044–2049 (2011)Google Scholar
  17. 17.
    Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America 101(suppl. 1), 5228–5235 (2004)CrossRefGoogle Scholar
  18. 18.
    Lei, J., Ren, X., Fox, D.: Fine-grained kitchen activity recognition using rgb-d. In: ACM Conference on Ubiquitous Computing (2012)Google Scholar
  19. 19.
    Jalal, A., Uddin, M.Z., Kim, J.T., Kim, T.S.: Recognition of human home activities via depth silhouettes and transformation for smart homes. Indoor and Built Environment 21(1), 184–190 (2011)CrossRefGoogle Scholar
  20. 20.
    Wang, Y., Huang, K., Tan, T.: Human activity recognition based on r transform. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)Google Scholar
  21. 21.
    Xia, L., Chen, C.C., Aggarwal, J.: View invariant human action recognition using histograms of 3d joints. In: Workshop on Human Activity Understanding from 3D Data, pp. 20–27 (2012)Google Scholar
  22. 22.
    Han, L., Wu, X., Liang, W., Hou, G., Jia, Y.: Discriminative human action recognition in the learned hierarchical manifold space. Image and Vision Computing 28(5), 836–849 (2010)CrossRefGoogle Scholar
  23. 23.
    Johansson, G.: Visual motion perception. Scientific American (1975)Google Scholar
  24. 24.
    Ye, M., Wang, X., Yang, R., Ren, L., Pollefeys, M.: Accurate 3d pose estimation from a single depth image. In: IEEE International Conference on Computer Vision, pp. 731–738 (2011)Google Scholar
  25. 25.
    Criminisi, A., Shotton, J., Robertson, D., Konukoglu, E.: Regression forests for efficient anatomy detection and localization in CT studies. In: Menze, B., Langs, G., Tu, Z., Criminisi, A. (eds.) MICCAI 2010. LNCS, vol. 6533, pp. 106–117. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  26. 26.
    Campbell, L., Bobick, A.: Recognition of human body motion using phase space constraints. In: IEEE International Conference on Computer Vision, pp. 624–630 (1995)Google Scholar
  27. 27.
    Lv, F., Nevatia, R.: Recognition and segmentation of 3-D human action using HMM and multi-class adaBoost. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3954, pp. 359–372. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  28. 28.
    Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)MathSciNetCrossRefMATHGoogle Scholar
  29. 29.
    Lee, M.W., Nevatia, R.: Dynamic human pose estimation using markov chain monte carlo approach. In: IEEE Workshops on Application of Computer Vision, pp. 168–175 (2005)Google Scholar
  30. 30.
    Koppula, H.S., Gupta, R., Saxena, A.: Human activity learning using object affordances from rgb-d videos. CoRR abs/1208.0967 (2012)Google Scholar
  31. 31.
    Koppula, H.S., Gupta, R., Saxena, A.: Learning human activities and object affordances from rgb-d videos. CoRR abs/1210.1207 (2012)Google Scholar
  32. 32.
    Lai, K., Bo, L., Ren, X., Fox, D.: Sparse distance learning for object recognition combining rgb and depth information. In: International Conferences on Robotics and Automation, pp. 4007–4013 (2011)Google Scholar
  33. 33.
    Yang, X., Tian, Y.: Eigenjoints-based action recognition using naive-bayes-nearest-neighbor. In: Workshop on Human Activity Understanding from 3D Data, pp. 14–19 (2012)Google Scholar
  34. 34.
    Sung, J., Ponce, C., Selman, B., Saxena, A.: Human activity detection from rgbd images. In: Plan, Activity, and Intent Recognition (2011)Google Scholar
  35. 35.
    Sung, J., Ponce, C., Selman, B., Saxena, A.: Unstructured human activity detection from rgbd images. In: IEEE International Conference on Robotics and Automation, pp. 842–849 (2012)Google Scholar
  36. 36.
    McCallum, A., Freitag, D., Pereira, F.C.N.: Maximum entropy markov models for information extraction and segmentation. In: International Conference on Machine Learning, pp. 591–598 (2000)Google Scholar
  37. 37.
    Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893 (2005)Google Scholar
  38. 38.
    Yao, A., Gall, J., Van Gool, L.: Coupled action recognition and pose estimation from multiple views. International Journal of Computer Vision 100(1), 16–37 (2012)CrossRefMATHGoogle Scholar
  39. 39.
    Müller, M., Röder, T., Clausen, M.: Efficient content-based retrieval of motion capture data. ACM Transactions on Graphics 24, 677–685 (2005)CrossRefGoogle Scholar
  40. 40.
    Gall, J., Yao, A., Razavi, N., Van Gool, L., Lempitsky, V.: Hough forests for object detection, tracking, and action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)Google Scholar
  41. 41.
    Tenorth, M., Bandouch, J., Beetz, M.: The TUM kitchen data set of everyday manipulation activities for motion tracking and action recognition. In: IEEE Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (2009)Google Scholar
  42. 42.
    Fanelli, G., Dantone, M., Gall, J., Fossati, A., Van Gool, L.: Random forests for real time 3d face analysis. International Journal of Computer Vision 101(3), 437–458 (2013)CrossRefGoogle Scholar
  43. 43.
    Murphy-Chutorian, E., Trivedi, M.: Head pose estimation in computer vision: A survey. Transactions on Pattern Analysis and Machine Intelligence 31(4), 607–626 (2009)CrossRefGoogle Scholar
  44. 44.
    Jones, M., Viola, P.: Fast multi-view face detection. Technical Report TR2003-096, Mitsubishi Electric Research Laboratories (2003)Google Scholar
  45. 45.
    Huang, C., Ding, X., Fang, C.: Head pose estimation based on random forests for multiclass classification. In: International Conference on Pattern Recognition (2010)Google Scholar
  46. 46.
    Chen, L., Zhang, L., Hu, Y., Li, M., Zhang, H.: Head pose estimation using fisher manifold learning. In: Analysis and Modeling of Faces and Gestures (2003)Google Scholar
  47. 47.
    Balasubramanian, V.N., Ye, J., Panchanathan, S.: Biased manifold embedding: A framework for person-independent head pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)Google Scholar
  48. 48.
    Osadchy, M., Miller, M.L., LeCun, Y.: Synergistic face detection and pose estimation with energy-based models. In: Neural Information Processing Systems (2005)Google Scholar
  49. 49.
    Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 681–685 (2001)CrossRefGoogle Scholar
  50. 50.
    Ramnath, K., Koterba, S., Xiao, J., Hu, C., Matthews, I., Baker, S., Cohn, J., Kanade, T.: Multi-view aam fitting and construction. International Journal of Computer Vision 76(2), 183–204 (2008)CrossRefGoogle Scholar
  51. 51.
    Blanz, V., Vetter, T.: A morphable model for the synthesis of 3d faces. In: ACM International Conference on Computer Graphics and Interactive Techniques, pp. 187–194 (1999)Google Scholar
  52. 52.
    Storer, M., Urschler, M., Bischof, H.: 3d-mam: 3d morphable appearance model for efficient fine head pose estimation from still images. In: Workshop on Subspace Methods (2009)Google Scholar
  53. 53.
    Martins, P., Batista, J.: Accurate single view model-based head pose estimation. In: Automatic Face and Gesture Recognition (2008)Google Scholar
  54. 54.
    Vatahska, T., Bennewitz, M., Behnke, S.: Feature-based head pose estimation from images. In: International Conference on Humanoid Robots (2007)Google Scholar
  55. 55.
    Whitehill, J., Movellan, J.R.: A discriminative approach to frame-by-frame head pose tracking. In: Automatic Face and Gesture Recognition (2008)Google Scholar
  56. 56.
    Morency, L.P., Whitehill, J., Movellan, J.R.: Generalized adaptive view-based appearance model: Integrated framework for monocular head pose estimation. In: Automatic Face and Gesture Recognition (2008)Google Scholar
  57. 57.
    Breitenstein, M.D., Kuettel, D., Weise, T., Van Gool, L., Pfister, H.: Real-time face pose estimation from single range images. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)Google Scholar
  58. 58.
    Cai, Q., Gallup, D., Zhang, C., Zhang, Z.: 3D deformable face tracking with a commodity depth camera. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 229–242. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  59. 59.
    Morency, L.P., Sundberg, P., Darrell, T.: Pose estimation using 3d view-based eigenspaces. In: Automatic Face and Gesture Recognition (2003)Google Scholar
  60. 60.
    Seemann, E., Nickel, K., Stiefelhagen, R.: Head pose estimation using stereo vision for human-robot interaction. In: Automatic Face and Gesture Recognition (2004)Google Scholar
  61. 61.
    Mian, A., Bennamoun, M., Owens, R.: Automatic 3d face detection, normalization and recognition. In: 3D Data Processing, Visualization, and Transmission (2006)Google Scholar
  62. 62.
    Lu, X., Jain, A.K.: Automatic feature extraction for multiview 3d face recognition. In: Automatic Face and Gesture Recognition (2006)Google Scholar
  63. 63.
    Weise, T., Leibe, B., Van Gool, L.: Fast 3d scanning with automatic motion compensation. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)Google Scholar
  64. 64.
    Weise, T., Bouaziz, S., Li, H., Pauly, M.: Realtime performance-based facial animation. ACM Transactions on Graphics 30(4) (2011)Google Scholar
  65. 65.
    Breitenstein, M.D., Jensen, J., Høilund, C., Moeslund, T.B., Van Gool, L.: Head pose estimation from passive stereo images. In: Salberg, A.-B., Hardeberg, J.Y., Jenssen, R. (eds.) SCIA 2009. LNCS, vol. 5575, pp. 219–228. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  66. 66.
    Fanelli, G., Gall, J., Van Gool, L.: Real time head pose estimation with random regression forests. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)Google Scholar
  67. 67.
    Fanelli, G., Weise, T., Gall, J., Van Gool, L.: Real time head pose estimation from consumer depth cameras. In: Mester, R., Felsberg, M. (eds.) DAGM 2011. LNCS, vol. 6835, pp. 101–110. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  68. 68.
    Paysan, P., Knothe, R., Amberg, B., Romdhani, S., Vetter, T.: A 3d face model for pose and illumination invariant face recognition. In: Advanced Video and Signal based Surveillance (2009)Google Scholar
  69. 69.
    Weise, T., Wismer, T., Leibe, B., Van Gool, L.: In-hand scanning with online loop closure. In: 3-D Digital Imaging and Modeling (2009)Google Scholar
  70. 70.
    Li, H., Adams, B., Guibas, L.J., Pauly, M.: Robust single-view geometry and motion reconstruction. ACM Transactions on Graphics 28(5) (2009)Google Scholar
  71. 71.
    Cootes, T.F., Wheeler, G.V., Walker, K.N., Taylor, C.J.: View-based active appearance models. Image and Vision Computing 20(9-10), 657–664 (2002)CrossRefGoogle Scholar
  72. 72.
    Matthews, I., Baker, S.: Active appearance models revisited. International Journal of Computer Vision 60(2), 135–164 (2003)CrossRefGoogle Scholar
  73. 73.
    Gross, R., Matthews, I., Baker, S.: Generic vs. person specific active appearance models. Image and Vision Computing 23(12), 1080–2093 (2005)CrossRefGoogle Scholar
  74. 74.
    Valstar, M., Martinez, B., Binefa, X., Pantic, M.: Facial point detection using boosted regression and graph models. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)Google Scholar
  75. 75.
    Amberg, B., Vetter, T.: Optimal landmark detection using shape models and branch and bound slides. In: IEEE International Conference on Computer Vision (2011)Google Scholar
  76. 76.
    Belhumeur, P.N., Jacobs, D.W., Kriegman, D.J., Kumar, N.: Localizing parts of faces using a consensus of exemplars. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)Google Scholar
  77. 77.
    Dantone, M., Gall, J., Fanelli, G., Van Gool, L.: Real-time facial feature detection using conditional regression forests. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)Google Scholar
  78. 78.
    Felzenszwalb, P.F., Huttenlocher, D.P.: Pictorial structures for object recognition. International Journal of Computer Vision 61(1), 55–79 (2005)CrossRefGoogle Scholar
  79. 79.
    Everingham, M., Sivic, J., Zisserman, A.: Hello! my name is... buffy - automatic naming of characters in tv video. In: British Machine Vision Conference (2006)Google Scholar
  80. 80.
    Cristinacce, D., Cootes, T.: Automatic feature localisation with constrained local models. Journal of Pattern Recognition 41(10), 3054–3067 (2008)CrossRefMATHGoogle Scholar
  81. 81.
    Mpiperis, I., Malassiotis, S., Strintzis, M.: Bilinear models for 3-d face and facial expression recognition. IEEE Transactions on Information Forensics and Security 3(3), 498–511 (2008)CrossRefGoogle Scholar
  82. 82.
    Kakadiaris, I.A., Passalis, G., Toderici, G., Murtuza, M.N., Lu, Y., Karampatziakis, N., Theoharis, T.: Three-dimensional face recognition in the presence of facial expressions: An annotated deformable model approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(4), 640–649 (2007)CrossRefGoogle Scholar
  83. 83.
    Weise, T., Li, H., Van Gool, L., Pauly, M.: Face/off: live facial puppetry. In: Symposium on Computer Animation, pp. 7–16 (2009)Google Scholar
  84. 84.
    Sun, Y., Yin, L.: Automatic pose estimation of 3d facial models. In: International Conference on Pattern Recognition (2008)Google Scholar
  85. 85.
    Segundo, M., Silva, L., Bellon, O., Queirolo, C.: Automatic face segmentation and facial landmark detection in range images. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40(5), 1319–1330 (2010)CrossRefGoogle Scholar
  86. 86.
    Chang, K.I., Bowyer, K.W., Flynn, P.J.: Multiple nose region matching for 3d face recognition under varying facial expression. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(10), 1695–1700 (2006)CrossRefGoogle Scholar
  87. 87.
    Mehryar, S., Martin, K., Plataniotis, K., Stergiopoulos, S.: Automatic landmark detection for 3d face image processing. In: Evolutionary Computation (2010)Google Scholar
  88. 88.
    Colbry, D., Stockman, G., Jain, A.: Detection of anchor points for 3d face verification. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)Google Scholar
  89. 89.
    Dorai, C., Jain, A.K.: COSMOS - A Representation Scheme for 3D Free-Form Objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(10), 1115–1130 (1997)CrossRefGoogle Scholar
  90. 90.
    Wang, Y., Chua, C., Ho, Y.: Facial feature detection and face recognition from 2d and 3d images. Pattern Recognition Letters 10(23), 1191–1202 (2002)CrossRefGoogle Scholar
  91. 91.
    Chua, C.S., Jarvis, R.: Point signatures: A new representation for 3d object recognition. International Journal of Computer Vision 25, 63–85 (1997)CrossRefGoogle Scholar
  92. 92.
    Yu, T.H., Moon, Y.S.: A novel genetic algorithm for 3d facial landmark localization. In: Biometrics: Theory, Applications and Systems (2008)Google Scholar
  93. 93.
    Ju, Q., O’keefe, S., Austin, J.: Binary neural network based 3d facial feature localization. In: International Joint Conference on Neural Networks (2009)Google Scholar
  94. 94.
    Zhao, X., Dellandréa, E., Chen, L., Kakadiaris, I.: Accurate landmarking of three-dimensional facial data in the presence of facial expressions and occlusions using a three-dimensional statistical facial feature model. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 41(5), 1417–1428 (2011)CrossRefGoogle Scholar
  95. 95.
    Nair, P., Cavallaro, A.: 3-d face detection, landmark localization, and registration using a point distribution model. IEEE Transactions on Multimedia 11(4), 611–623 (2009)CrossRefGoogle Scholar
  96. 96.
    Fanelli, G., Gall, J., Romsdorfer, H., Weise, T., Van Gool, L.: A 3-d audio-visual corpus of affective communication. IEEE Transactions on Multimedia 12(6), 591–598 (2010)CrossRefGoogle Scholar
  97. 97.
    Yin, L., Wei, X., Sun, Y., Wang, J., Rosato, M.J.: A 3d facial expression database for facial behavior research. In: International Conference on Automatic Face and Gesture Recognition (2006)Google Scholar
  98. 98.
    Lewis, J.P., Pighin, F.: Background mathematics. In: ACM SIGGRAPH Courses (2006)Google Scholar
  99. 99.
    Alexander, O., Rogers, M., Lambeth, W., Chiang, M., Debevec, P.: The digital emily project: photoreal facial modeling and animation. In: ACM SIGGRAPH Courses (2009)Google Scholar
  100. 100.
    Zhang, S., Huang, P.: High-resolution, real-time 3d shape acquisition. In: Workshop on Real-time 3D Sensors and Their Use (2004)Google Scholar
  101. 101.
    Zhang, L., Snavely, N., Curless, B., Seitz, S.M.: Spacetime faces: high resolution capture for modeling and animation. ACM Transactions on Graphics 23(3), 548–558 (2004)CrossRefGoogle Scholar
  102. 102.
    Borshukov, G., Piponi, D., Larsen, O., Lewis, J.P., Tempelaar-Lietz, C.: Universal capture - image-based facial animation for “the matrix reloaded”. In: ACM SIGGRAPH Courses (2005)Google Scholar
  103. 103.
    Ma, W.C., Hawkins, T., Peers, P., Chabert, C.F., Weiss, M., Debevec, P.: Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination. In: Eurographics Conference on Rendering Techniques, pp. 183–194 (2007)Google Scholar
  104. 104.
    Wilson, C.A., Ghosh, A., Peers, P., Chiang, J.Y., Busch, J., Debevec, P.: Temporal upsampling of performance geometry using photometric alignment. ACM Transactions on Graphics 29(2) (2010)Google Scholar
  105. 105.
    Beeler, T., Bickel, B., Beardsley, P., Sumner, B., Gross, M.: High-quality single-shot capture of facial geometry. ACM Transactions on Graphics 29 (2010)Google Scholar
  106. 106.
    Bradley, D., Heidrich, W., Popa, T., Sheffer, A.: High resolution passive facial performance capture. ACM Transactions on Graphics 29(4) (2010)Google Scholar
  107. 107.
    Furukawa, Y., Ponce, J.: Dense 3d motion capture from synchronized video streams. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)Google Scholar
  108. 108.
    Breidt, M., Buelthoff, H., Curio, C.: Robust semantic analysis by synthesis of 3d facial motion. In: Automatic Face and Gesture Recognition (2011)Google Scholar
  109. 109.
    Savran, A., Celiktutan, O., Akyol, A., Trojanová, J., Dibeklioglu, H., Esenlik, S., Bozkurt, N., Demirkir, C., Akagunduz, E., Caliskan, K., Alyuz, N., Sankur, B., Ulusoy, I., Akarun, L., Sezgin, T.M.: 3d face recognition performance under adversarial conditions. In: Workshop on Multimodal Interfaces, pp. 87–102 (2007)Google Scholar
  110. 110.
    Yin, L., Chen, X., Sun, Y., Worm, T., Reale, M.: A high-resolution 3d dynamic facial expression database. In: Automatic Face and Gesture Recognition (2008)Google Scholar
  111. 111.
    Gupta, S., Markey, M., Bovik, A.: Anthropometric 3d face recognition. International Journal of Computer Vision 90(3), 331–349 (2010)CrossRefGoogle Scholar
  112. 112.
    Colombo, A., Cusano, C., Schettini, R.: Umb-db: A database of partially occluded 3d faces. In: Workshop on Benchmarking Facial Image Analysis Technologies, pp. 2113–2119 (2011)Google Scholar
  113. 113.
    Huynh, T., Min, R., Dugelay, J.-L.: An efficient LBP-based descriptor for facial depth images applied to gender recognition using RGB-D face data. In: Park, J.-I., Kim, J. (eds.) ACCV Workshops 2012, Part I. LNCS, vol. 7728, pp. 133–145. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  114. 114.
    Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 640–653. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  115. 115.
    Erol, A., Bebis, G., Nicolescu, M., Boyle, R.D., Twombly, X.: Vision-based hand pose estimation: A review. Computer Vision and Image Understanding 108(1-2), 52–73 (2007)CrossRefMATHGoogle Scholar
  116. 116.
    Lu, S., Metaxas, D., Samaras, D., Oliensis, J.: Using multiple cues for hand tracking and model refinement. In: IEEE Conference on Computer Vision and Pattern Recognition (2003)Google Scholar
  117. 117.
    de La Gorce, M., Fleet, D.J., Paragios, N.: Model-based 3d hand pose estimation from monocular video. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(9), 1793–1805 (2011)CrossRefGoogle Scholar
  118. 118.
    Delamarre, Q., Faugeras, O.D.: 3d articulated models and multiview tracking with physical forces. Computer Vision and Image Understanding 81(3), 328–357 (2001)CrossRefMATHGoogle Scholar
  119. 119.
    Bray, M., Koller-Meier, E., Van Gool, L.: Smart particle filtering for high-dimensional tracking. Computer Vision and Image Understanding 106(1), 116–129 (2007)CrossRefGoogle Scholar
  120. 120.
    Oikonomidis, I., Kyriazis, N., Argyros, A.: Efficient model-based 3d tracking of hand articulations using kinect. In: British Machine Vision Conference (2011)Google Scholar
  121. 121.
    Rehg, J.M., Kanade, T.: Visual tracking of high dof articulated structures: an application to human hand tracking. In: Eklundh, J.-O. (ed.) ECCV 1994. LNCS, vol. 801, pp. 35–46. Springer, Heidelberg (1994)CrossRefGoogle Scholar
  122. 122.
    Stenger, B., Mendonca, P., Cipolla, R.: Model-based 3D tracking of an articulated hand. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 310–315 (2001)Google Scholar
  123. 123.
    MacCormick, J., Isard, M.: Partitioned sampling, articulated objects, and interface-quality hand tracking. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 3–19. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  124. 124.
    Heap, T., Hogg, D.: Towards 3d hand tracking using a deformable model. In: International Conference on Automatic Face and Gesture Recognition (1996)Google Scholar
  125. 125.
    Wu, Y., Lin, J., Huang, T.: Capturing natural hand articulation. In: IEEE International Conference on Computer Vision, pp. 426–432 (2001)Google Scholar
  126. 126.
    Sudderth, E., Mandel, M., Freeman, W., Willsky, A.: Visual Hand Tracking Using Nonparametric Belief Propagation. In: Workshop on Generative Model Based Vision, pp. 189–189 (2004)Google Scholar
  127. 127.
    Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: IEEE International Conference on Computer Vision, pp. 1475–1482 (2009)Google Scholar
  128. 128.
    Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Markerless and efficient 26-DOF hand pose recovery. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010, Part III. LNCS, vol. 6494, pp. 744–757. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  129. 129.
    Keskin, C., Kra, F., Kara, Y., Akarun, L.: Real time hand pose estimation using depth sensors. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds.) Consumer Depth Cameras for Computer Vision. Advances in Computer Vision and Pattern Recognition, pp. 119–137. Springer, London (2013)CrossRefGoogle Scholar
  130. 130.
    Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Tracking the articulated motion of two strongly interacting hands. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)Google Scholar
  131. 131.
    State, A., Coleca, F., Barth, E., Martinetz, T.: Hand tracking with an extended self-organizing map. In: Estevez, P.A., Principe, J.C., Zegers, P. (eds.) Advances in Self-Organizing Maps. AISC, vol. 198, pp. 115–124. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  132. 132.
    Rosales, R., Athitsos, V., Sigal, L., Sclaroff, S.: 3d hand pose reconstruction using specialized mappings. In: IEEE International Conference on Computer Vision, pp. 378–387 (2001)Google Scholar
  133. 133.
    Athitsos, V., Sclaroff, S.: Estimating 3d hand pose from a cluttered image. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 432–439 (2003)Google Scholar
  134. 134.
    de Campos, T., Murray, D.: Regression-based hand pose estimation from multiple cameras. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 782–789 (2006)Google Scholar
  135. 135.
    Stenger, B., Thayananthan, A., Torr, P.: Model-based hand tracking using a hierarchical bayesian filter. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(9), 1372–1384 (2006)CrossRefGoogle Scholar
  136. 136.
    Romero, J., Kjellström, H., Kragic, D.: Hands in action: Real-time 3d reconstruction of hands in interaction with objects. In: International Conferences on Robotics and Automation, pp. 458–463 (2010)Google Scholar
  137. 137.
    Lee, C.S., Chun, S.Y., Park, S.W.: Articulated hand configuration and rotation estimation using extended torus manifold embedding. In: International Conference on Pattern Recognition, pp. 441–444 (2012)Google Scholar
  138. 138.
    Hamer, H., Gall, J., Urtasun, R., Van Gool, L.: Data-driven animation of hand-object interactions. In: International Conference on Automatic Face and Gesture Recognition, pp. 360–367 (2011)Google Scholar
  139. 139.
    Hamer, H., Gall, J., Weise, T., Van Gool, L.: An object-dependent hand pose prior from sparse training data. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 671–678 (2010)Google Scholar
  140. 140.
    Uebersax, D., Gall, J., den Bergh, M.V., Van Gool, L.: Real-time sign language letter and word recognition from depth data. In: IEEE Workshop on Human Computer Interaction: Real-Time Vision Aspects of Natural User Interfaces (2011)Google Scholar
  141. 141.
    Ye, Y., Liu, C.K.: Synthesis of detailed hand manipulations using contact sampling. ACM Transactions on Graphics 31(4), 41 (2012)CrossRefGoogle Scholar
  142. 142.
    Oikonomidis, I., Kyriazis, N., Argyros, A.: Full dof tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011)Google Scholar
  143. 143.
    Kim, D., Hilliges, O., Izadi, S., Butler, A.D., Chen, J., Oikonomidis, I., Olivier, P.: Digits: Freehand 3d interactions anywhere using a wrist-worn gloveless sensor. In: ACM Symposium on User Interface Software and Technology, pp. 167–176 (2012)Google Scholar
  144. 144.
    Zhao, W., Chai, J., Xu, Y.Q.: Combining marker-based mocap and rgb-d camera for acquiring high-fidelity hand motion data. In: Symposium on Computer Animation, pp. 33–42 (2012)Google Scholar
  145. 145.
    Starner, T., Weaver, J., Pentland, A.: Real-time american sign language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(12), 1371–1375 (1998)CrossRefGoogle Scholar
  146. 146.
    Derpanis, K.G., Wildes, R.P., Tsotsos, J.K.: Hand gesture recognition within a linguistics-based framework. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3021, pp. 282–296. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  147. 147.
    Ong, S., Ranganath, S.: Automatic sign language analysis: A survey and the future beyond lexical meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(6), 873–891 (2005)CrossRefGoogle Scholar
  148.
    Pei, T., Starner, T., Hamilton, H., Essa, I., Rehg, J.: Learning the basic units in american sign language using discriminative segmental feature selection. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4757–4760 (2009)
  149.
    Yang, H.D., Sclaroff, S., Lee, S.W.: Sign language spotting with a threshold model based on conditional random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(7), 1264–1277 (2009)
  150.
    Theodorakis, S., Pitsikalis, V., Maragos, P.: Model-level data-driven sub-units for signs in videos of continuous sign language. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2262–2265 (2010)
  151.
    Zafrulla, Z., Brashear, H., Hamilton, H., Starner, T.: A novel approach to american sign language (asl) phrase verification using reversed signing. In: IEEE Workshop on CVPR for Human Communicative Behavior Analysis, pp. 48–55 (2010)
  152.
    Dreuw, P., Ney, H., Martinez, G., Crasborn, O., Piater, J., Moya, J.M., Wheatley, M.: The signspeak project - bridging the gap between signers and speakers. In: International Conference on Language Resources and Evaluation (2010)
  153.
    Liu, X., Fujimura, K.: Hand gesture recognition using depth data. In: International Conference on Automatic Face and Gesture Recognition (2004)
  154.
    Mo, Z., Neumann, U.: Real-time hand pose recognition using low-resolution depth images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1499–1505 (2006)
  155.
    Breuer, P., Eckes, C., Müller, S.: Hand gesture recognition with a novel IR time-of-flight range camera–A pilot study. In: Gagalowicz, A., Philips, W. (eds.) MIRAGE 2007. LNCS, vol. 4418, pp. 247–260. Springer, Heidelberg (2007)
  156.
    Soutschek, S., Penne, J., Hornegger, J., Kornhuber, J.: 3-d gesture-based scene navigation in medical imaging applications using time-of-flight cameras. In: Workshop on Time of Flight Camera based Computer Vision (2008)
  157.
    Kollorz, E., Penne, J., Hornegger, J., Barke, A.: Gesture recognition with a time-of-flight camera. International Journal of Intelligent Systems Technologies and Applications 5, 334–343 (2008)
  158.
    Penne, J., Soutschek, S., Fedorowicz, L., Hornegger, J.: Robust real-time 3d time-of-flight based gesture navigation. In: International Conference on Automatic Face and Gesture Recognition (2008)
  159.
    Li, Z., Jarvis, R.: Real time hand gesture recognition using a range camera. In: Australasian Conference on Robotics and Automation (2009)
  160.
    Takimoto, H., Yoshimori, S., Mitsukura, Y., Fukumi, M.: Classification of hand postures based on 3d vision model for human-robot interaction. In: International Symposium on Robot and Human Interactive Communication, pp. 292–297 (2010)
  161.
    Lahamy, H., Lichti, D.: Real-time hand gesture recognition using range cameras. In: Canadian Geomatics Conference (2010)
  162.
    Van den Bergh, M., Van Gool, L.: Combining rgb and tof cameras for real-time 3d hand gesture interaction. In: IEEE Workshop on Applications of Computer Vision (2011)
  163.
    Marnik, J.: The polish finger alphabet hand postures recognition using elastic graph matching. In: Kurzynski, M., Puchala, E., Wozniak, M., Zolnierek, A. (eds.) Computer Recognition Systems 2. ASC, vol. 45, pp. 454–461. Springer, Heidelberg (2007)
  164.
    Incertis, I., Garcia-Bermejo, J., Casanova, E.: Hand gesture recognition for deaf people interfacing. In: International Conference on Pattern Recognition, pp. 100–103 (2006)
  165.
    Lockton, R., Fitzgibbon, A.W.: Real-time gesture recognition using deterministic boosting. In: British Machine Vision Conference (2002)
  166.
    Liwicki, S., Everingham, M.: Automatic recognition of fingerspelled words in british sign language. In: IEEE Workshop on CVPR for Human Communicative Behavior Analysis (2009)
  167.
    Kelly, D., McDonald, J., Markham, C.: A person independent system for recognition of hand postures used in sign language. Pattern Recognition Letters 31, 1359–1368 (2010)
  168.
    Amin, M., Yan, H.: Sign language finger alphabet recognition from gabor-pca representation of hand gestures. In: Machine Learning and Cybernetics (2007)
  169.
    Munib, Q., Habeeb, M., Takruri, B., Al-Malik, H.: American sign language (asl) recognition based on hough transform and neural networks. Expert Systems with Applications 32(1), 24–37 (2007)
  170.
    Tzionas, D., Gall, J.: A comparison of directional distances for hand pose estimation. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 131–141. Springer, Heidelberg (2013)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Mao Ye (1)
  • Qing Zhang (1)
  • Liang Wang (2)
  • Jiejie Zhu (3)
  • Ruigang Yang (1)
  • Juergen Gall (4)
  1. University of Kentucky, Lexington, U.S.A.
  2. Microsoft, One Microsoft Way, Redmond, U.S.A.
  3. SRI International Sarnoff, Princeton, U.S.A.
  4. University of Bonn, Bonn, Germany
