Skip to main content

A Survey on Human Motion Analysis from Depth Data

  • Chapter

Part of the Lecture Notes in Computer Science book series (LNIP,volume 8200)

Abstract

Human pose estimation has been actively studied for decades. While traditional approaches rely on 2d data like images or videos, the development of Time-of-Flight cameras and other depth sensors created new opportunities to advance the field. We give an overview of recent approaches that perform human motion analysis which includes depth-based and skeleton-based activity recognition, head pose estimation, facial feature detection, facial performance capture, hand pose estimation and hand gesture recognition. While the focus is on approaches using depth data, we also discuss traditional image based methods to provide a broad overview of recent developments in these areas.

Keywords

  • Action Recognition
  • Sign Language
  • Gesture Recognition
  • Depth Data
  • Depth Sensor

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This is a preview of subscription content, access via your institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • DOI: 10.1007/978-3-642-44964-2_8
  • Chapter length: 39 pages
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
eBook
USD   64.99
Price excludes VAT (USA)
  • ISBN: 978-3-642-44964-2
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book
USD   84.99
Price excludes VAT (USA)

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Klette, R., Tee, G.: Understanding human motion: A historic review. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds.) Human Motion. Computational Imaging and Vision, vol. 36, pp. 1–22. Springer, Netherlands (2008)

    CrossRef  Google Scholar 

  2. Aggarwal, J.: Motion analysis: Past, present and future. In: Bhanu, B., Ravishankar, C.V., Roy-Chowdhury, A.K., Aghajan, H., Terzopoulos, D. (eds.) Distributed Video Sensor Networks, pp. 27–39. Springer, London (2011)

    CrossRef  Google Scholar 

  3. Shotton, J., Fitzgibbon, A., Cook, M., Sharp, T., Finocchio, M., Moore, R., Kipman, A., Blake, A.: Real-time human pose recognition in parts from single depth images. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)

    Google Scholar 

  4. Aggarwal, J., Ryoo, M.: Human activity analysis: A review. ACM Computing Surveys 43(2), 16:1–16:43 (2011)

    Google Scholar 

  5. Mitra, S., Acharya, T.: Gesture recognition: A survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 37(3), 311–324 (2007)

    CrossRef  Google Scholar 

  6. Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding 104(2), 90–126 (2006)

    CrossRef  Google Scholar 

  7. Poppe, R.: A survey on vision-based human action recognition. Image and Vision Computing 28(6), 976–990 (2010)

    CrossRef  Google Scholar 

  8. Li, W., Zhang, Z., Liu, Z.: Action recognition based on a bag of 3d points. In: Workshop on Human Activity Understanding from 3D Data, pp. 9–14 (2010)

    Google Scholar 

  9. Wang, J., Liu, Z., Wu, Y., Yuan, J.: Mining actionlet ensemble for action recognition with depth cameras. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1290–1297 (2012)

    Google Scholar 

  10. Kurakin, A., Zhang, Z., Liu, Z.: A real time system for dynamic hand gesture recognition with a depth sensor. In: 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp. 1975–1979 (2012)

    Google Scholar 

  11. Oreifej, O., Liu, Z.: Hon4d: Histogram of oriented 4d normals for activity recognition from depth sequences. In: IEEE Conference on Computer Vision and Pattern Recognition (2013)

    Google Scholar 

  12. Li, W., Zhang, Z., Liu, Z.: Expandable data-driven graphical modeling of human actions based on salient postures. IEEE Transactions on Circuits and Systems for Video Technology 18(11), 1499–1510 (2008)

    CrossRef  Google Scholar 

  13. Vieira, A.W., Nascimento, E.R., Oliveira, G.L., Liu, Z., Campos, M.F.M.: STOP: Space-time occupancy patterns for 3D action recognition from depth map sequences. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds.) CIARP 2012. LNCS, vol. 7441, pp. 252–259. Springer, Heidelberg (2012)

    CrossRef  Google Scholar 

  14. Wang, J., Liu, Z., Chorowski, J., Chen, Z., Wu, Y.: Robust 3D action recognition with random occupancy patterns. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 872–885. Springer, Heidelberg (2012)

    CrossRef  Google Scholar 

  15. Yang, X., Zhang, C., Tian, Y.: Recognizing actions using depth motion maps-based histograms of oriented gradients. In: ACM International Conference on Multimedia, pp. 1057–1060 (2012)

    Google Scholar 

  16. Zhang, H., Parker, L.: 4-dimensional local spatio-temporal features for human activity recognition. In: International Conference on Intelligent Robots and Systems, pp. 2044–2049 (2011)

    Google Scholar 

  17. Griffiths, T.L., Steyvers, M.: Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America 101(suppl. 1), 5228–5235 (2004)

    CrossRef  Google Scholar 

  18. Lei, J., Ren, X., Fox, D.: Fine-grained kitchen activity recognition using rgb-d. In: ACM Conference on Ubiquitous Computing (2012)

    Google Scholar 

  19. Jalal, A., Uddin, M.Z., Kim, J.T., Kim, T.S.: Recognition of human home activities via depth silhouettes and transformation for smart homes. Indoor and Built Environment 21(1), 184–190 (2011)

    CrossRef  Google Scholar 

  20. Wang, Y., Huang, K., Tan, T.: Human activity recognition based on r transform. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)

    Google Scholar 

  21. Xia, L., Chen, C.C., Aggarwal, J.: View invariant human action recognition using histograms of 3d joints. In: Workshop on Human Activity Understanding from 3D Data, pp. 20–27 (2012)

    Google Scholar 

  22. Han, L., Wu, X., Liang, W., Hou, G., Jia, Y.: Discriminative human action recognition in the learned hierarchical manifold space. Image and Vision Computing 28(5), 836–849 (2010)

    CrossRef  Google Scholar 

  23. Johansson, G.: Visual motion perception. Scientific American (1975)

    Google Scholar 

  24. Ye, M., Wang, X., Yang, R., Ren, L., Pollefeys, M.: Accurate 3d pose estimation from a single depth image. In: IEEE International Conference on Computer Vision, pp. 731–738 (2011)

    Google Scholar 

  25. Criminisi, A., Shotton, J., Robertson, D., Konukoglu, E.: Regression forests for efficient anatomy detection and localization in CT studies. In: Menze, B., Langs, G., Tu, Z., Criminisi, A. (eds.) MICCAI 2010. LNCS, vol. 6533, pp. 106–117. Springer, Heidelberg (2011)

    CrossRef  Google Scholar 

  26. Campbell, L., Bobick, A.: Recognition of human body motion using phase space constraints. In: IEEE International Conference on Computer Vision, pp. 624–630 (1995)

    Google Scholar 

  27. Lv, F., Nevatia, R.: Recognition and segmentation of 3-D human action using HMM and multi-class adaBoost. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3954, pp. 359–372. Springer, Heidelberg (2006)

    CrossRef  Google Scholar 

  28. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)

    MathSciNet  CrossRef  MATH  Google Scholar 

  29. Lee, M.W., Nevatia, R.: Dynamic human pose estimation using markov chain monte carlo approach. In: IEEE Workshops on Application of Computer Vision, pp. 168–175 (2005)

    Google Scholar 

  30. Koppula, H.S., Gupta, R., Saxena, A.: Human activity learning using object affordances from rgb-d videos. CoRR abs/1208.0967 (2012)

    Google Scholar 

  31. Koppula, H.S., Gupta, R., Saxena, A.: Learning human activities and object affordances from rgb-d videos. CoRR abs/1210.1207 (2012)

    Google Scholar 

  32. Lai, K., Bo, L., Ren, X., Fox, D.: Sparse distance learning for object recognition combining rgb and depth information. In: International Conferences on Robotics and Automation, pp. 4007–4013 (2011)

    Google Scholar 

  33. Yang, X., Tian, Y.: Eigenjoints-based action recognition using naive-bayes-nearest-neighbor. In: Workshop on Human Activity Understanding from 3D Data, pp. 14–19 (2012)

    Google Scholar 

  34. Sung, J., Ponce, C., Selman, B., Saxena, A.: Human activity detection from rgbd images. In: Plan, Activity, and Intent Recognition (2011)

    Google Scholar 

  35. Sung, J., Ponce, C., Selman, B., Saxena, A.: Unstructured human activity detection from rgbd images. In: IEEE International Conference on Robotics and Automation, pp. 842–849 (2012)

    Google Scholar 

  36. McCallum, A., Freitag, D., Pereira, F.C.N.: Maximum entropy markov models for information extraction and segmentation. In: International Conference on Machine Learning, pp. 591–598 (2000)

    Google Scholar 

  37. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 886–893 (2005)

    Google Scholar 

  38. Yao, A., Gall, J., Van Gool, L.: Coupled action recognition and pose estimation from multiple views. International Journal of Computer Vision 100(1), 16–37 (2012)

    CrossRef  MATH  Google Scholar 

  39. Müller, M., Röder, T., Clausen, M.: Efficient content-based retrieval of motion capture data. ACM Transactions on Graphics 24, 677–685 (2005)

    CrossRef  Google Scholar 

  40. Gall, J., Yao, A., Razavi, N., Van Gool, L., Lempitsky, V.: Hough forests for object detection, tracking, and action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (2011)

    Google Scholar 

  41. Tenorth, M., Bandouch, J., Beetz, M.: The TUM kitchen data set of everyday manipulation activities for motion tracking and action recognition. In: IEEE Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (2009)

    Google Scholar 

  42. Fanelli, G., Dantone, M., Gall, J., Fossati, A., Van Gool, L.: Random forests for real time 3d face analysis. International Journal of Computer Vision 101(3), 437–458 (2013)

    CrossRef  Google Scholar 

  43. Murphy-Chutorian, E., Trivedi, M.: Head pose estimation in computer vision: A survey. Transactions on Pattern Analysis and Machine Intelligence 31(4), 607–626 (2009)

    CrossRef  Google Scholar 

  44. Jones, M., Viola, P.: Fast multi-view face detection. Technical Report TR2003-096, Mitsubishi Electric Research Laboratories (2003)

    Google Scholar 

  45. Huang, C., Ding, X., Fang, C.: Head pose estimation based on random forests for multiclass classification. In: International Conference on Pattern Recognition (2010)

    Google Scholar 

  46. Chen, L., Zhang, L., Hu, Y., Li, M., Zhang, H.: Head pose estimation using fisher manifold learning. In: Analysis and Modeling of Faces and Gestures (2003)

    Google Scholar 

  47. Balasubramanian, V.N., Ye, J., Panchanathan, S.: Biased manifold embedding: A framework for person-independent head pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)

    Google Scholar 

  48. Osadchy, M., Miller, M.L., LeCun, Y.: Synergistic face detection and pose estimation with energy-based models. In: Neural Information Processing Systems (2005)

    Google Scholar 

  49. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, 681–685 (2001)

    CrossRef  Google Scholar 

  50. Ramnath, K., Koterba, S., Xiao, J., Hu, C., Matthews, I., Baker, S., Cohn, J., Kanade, T.: Multi-view aam fitting and construction. International Journal of Computer Vision 76(2), 183–204 (2008)

    CrossRef  Google Scholar 

  51. Blanz, V., Vetter, T.: A morphable model for the synthesis of 3d faces. In: ACM International Conference on Computer Graphics and Interactive Techniques, pp. 187–194 (1999)

    Google Scholar 

  52. Storer, M., Urschler, M., Bischof, H.: 3d-mam: 3d morphable appearance model for efficient fine head pose estimation from still images. In: Workshop on Subspace Methods (2009)

    Google Scholar 

  53. Martins, P., Batista, J.: Accurate single view model-based head pose estimation. In: Automatic Face and Gesture Recognition (2008)

    Google Scholar 

  54. Vatahska, T., Bennewitz, M., Behnke, S.: Feature-based head pose estimation from images. In: International Conference on Humanoid Robots (2007)

    Google Scholar 

  55. Whitehill, J., Movellan, J.R.: A discriminative approach to frame-by-frame head pose tracking. In: Automatic Face and Gesture Recognition (2008)

    Google Scholar 

  56. Morency, L.P., Whitehill, J., Movellan, J.R.: Generalized adaptive view-based appearance model: Integrated framework for monocular head pose estimation. In: Automatic Face and Gesture Recognition (2008)

    Google Scholar 

  57. Breitenstein, M.D., Kuettel, D., Weise, T., Van Gool, L., Pfister, H.: Real-time face pose estimation from single range images. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)

    Google Scholar 

  58. Cai, Q., Gallup, D., Zhang, C., Zhang, Z.: 3D deformable face tracking with a commodity depth camera. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 229–242. Springer, Heidelberg (2010)

    CrossRef  Google Scholar 

  59. Morency, L.P., Sundberg, P., Darrell, T.: Pose estimation using 3d view-based eigenspaces. In: Automatic Face and Gesture Recognition (2003)

    Google Scholar 

  60. Seemann, E., Nickel, K., Stiefelhagen, R.: Head pose estimation using stereo vision for human-robot interaction. In: Automatic Face and Gesture Recognition (2004)

    Google Scholar 

  61. Mian, A., Bennamoun, M., Owens, R.: Automatic 3d face detection, normalization and recognition. In: 3D Data Processing, Visualization, and Transmission (2006)

    Google Scholar 

  62. Lu, X., Jain, A.K.: Automatic feature extraction for multiview 3d face recognition. In: Automatic Face and Gesture Recognition (2006)

    Google Scholar 

  63. Weise, T., Leibe, B., Van Gool, L.: Fast 3d scanning with automatic motion compensation. In: IEEE Conference on Computer Vision and Pattern Recognition (2007)

    Google Scholar 

  64. Weise, T., Bouaziz, S., Li, H., Pauly, M.: Realtime performance-based facial animation. ACM Transactions on Graphics 30(4) (2011)

    Google Scholar 

  65. Breitenstein, M.D., Jensen, J., Høilund, C., Moeslund, T.B., Van Gool, L.: Head pose estimation from passive stereo images. In: Salberg, A.-B., Hardeberg, J.Y., Jenssen, R. (eds.) SCIA 2009. LNCS, vol. 5575, pp. 219–228. Springer, Heidelberg (2009)

    CrossRef  Google Scholar 

  66. Fanelli, G., Gall, J., Van Gool, L.: Real time head pose estimation with random regression forests. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)

    Google Scholar 

  67. Fanelli, G., Weise, T., Gall, J., Van Gool, L.: Real time head pose estimation from consumer depth cameras. In: Mester, R., Felsberg, M. (eds.) DAGM 2011. LNCS, vol. 6835, pp. 101–110. Springer, Heidelberg (2011)

    CrossRef  Google Scholar 

  68. Paysan, P., Knothe, R., Amberg, B., Romdhani, S., Vetter, T.: A 3d face model for pose and illumination invariant face recognition. In: Advanced Video and Signal based Surveillance (2009)

    Google Scholar 

  69. Weise, T., Wismer, T., Leibe, B., Van Gool, L.: In-hand scanning with online loop closure. In: 3-D Digital Imaging and Modeling (2009)

    Google Scholar 

  70. Li, H., Adams, B., Guibas, L.J., Pauly, M.: Robust single-view geometry and motion reconstruction. ACM Transactions on Graphics 28(5) (2009)

    Google Scholar 

  71. Cootes, T.F., Wheeler, G.V., Walker, K.N., Taylor, C.J.: View-based active appearance models. Image and Vision Computing 20(9-10), 657–664 (2002)

    CrossRef  Google Scholar 

  72. Matthews, I., Baker, S.: Active appearance models revisited. International Journal of Computer Vision 60(2), 135–164 (2003)

    CrossRef  Google Scholar 

  73. Gross, R., Matthews, I., Baker, S.: Generic vs. person specific active appearance models. Image and Vision Computing 23(12), 1080–2093 (2005)

    CrossRef  Google Scholar 

  74. Valstar, M., Martinez, B., Binefa, X., Pantic, M.: Facial point detection using boosted regression and graph models. In: IEEE Conference on Computer Vision and Pattern Recognition (2010)

    Google Scholar 

  75. Amberg, B., Vetter, T.: Optimal landmark detection using shape models and branch and bound slides. In: IEEE International Conference on Computer Vision (2011)

    Google Scholar 

  76. Belhumeur, P.N., Jacobs, D.W., Kriegman, D.J., Kumar, N.: Localizing parts of faces using a consensus of exemplars. In: IEEE Conference on Computer Vision and Pattern Recognition (2011)

    Google Scholar 

  77. Dantone, M., Gall, J., Fanelli, G., Van Gool, L.: Real-time facial feature detection using conditional regression forests. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)

    Google Scholar 

  78. Felzenszwalb, P.F., Huttenlocher, D.P.: Pictorial structures for object recognition. International Journal of Computer Vision 61(1), 55–79 (2005)

    CrossRef  Google Scholar 

  79. Everingham, M., Sivic, J., Zisserman, A.: Hello! my name is... buffy - automatic naming of characters in tv video. In: British Machine Vision Conference (2006)

    Google Scholar 

  80. Cristinacce, D., Cootes, T.: Automatic feature localisation with constrained local models. Journal of Pattern Recognition 41(10), 3054–3067 (2008)

    CrossRef  MATH  Google Scholar 

  81. Mpiperis, I., Malassiotis, S., Strintzis, M.: Bilinear models for 3-d face and facial expression recognition. IEEE Transactions on Information Forensics and Security 3(3), 498–511 (2008)

    CrossRef  Google Scholar 

  82. Kakadiaris, I.A., Passalis, G., Toderici, G., Murtuza, M.N., Lu, Y., Karampatziakis, N., Theoharis, T.: Three-dimensional face recognition in the presence of facial expressions: An annotated deformable model approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(4), 640–649 (2007)

    CrossRef  Google Scholar 

  83. Weise, T., Li, H., Van Gool, L., Pauly, M.: Face/off: live facial puppetry. In: Symposium on Computer Animation, pp. 7–16 (2009)

    Google Scholar 

  84. Sun, Y., Yin, L.: Automatic pose estimation of 3d facial models. In: International Conference on Pattern Recognition (2008)

    Google Scholar 

  85. Segundo, M., Silva, L., Bellon, O., Queirolo, C.: Automatic face segmentation and facial landmark detection in range images. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40(5), 1319–1330 (2010)

    CrossRef  Google Scholar 

  86. Chang, K.I., Bowyer, K.W., Flynn, P.J.: Multiple nose region matching for 3d face recognition under varying facial expression. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(10), 1695–1700 (2006)

    CrossRef  Google Scholar 

  87. Mehryar, S., Martin, K., Plataniotis, K., Stergiopoulos, S.: Automatic landmark detection for 3d face image processing. In: Evolutionary Computation (2010)

    Google Scholar 

  88. Colbry, D., Stockman, G., Jain, A.: Detection of anchor points for 3d face verification. In: IEEE Conference on Computer Vision and Pattern Recognition (2005)

    Google Scholar 

  89. Dorai, C., Jain, A.K.: COSMOS - A Representation Scheme for 3D Free-Form Objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(10), 1115–1130 (1997)

    CrossRef  Google Scholar 

  90. Wang, Y., Chua, C., Ho, Y.: Facial feature detection and face recognition from 2d and 3d images. Pattern Recognition Letters 10(23), 1191–1202 (2002)

    CrossRef  Google Scholar 

  91. Chua, C.S., Jarvis, R.: Point signatures: A new representation for 3d object recognition. International Journal of Computer Vision 25, 63–85 (1997)

    CrossRef  Google Scholar 

  92. Yu, T.H., Moon, Y.S.: A novel genetic algorithm for 3d facial landmark localization. In: Biometrics: Theory, Applications and Systems (2008)

    Google Scholar 

  93. Ju, Q., O’keefe, S., Austin, J.: Binary neural network based 3d facial feature localization. In: International Joint Conference on Neural Networks (2009)

    Google Scholar 

  94. Zhao, X., Dellandréa, E., Chen, L., Kakadiaris, I.: Accurate landmarking of three-dimensional facial data in the presence of facial expressions and occlusions using a three-dimensional statistical facial feature model. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 41(5), 1417–1428 (2011)

    CrossRef  Google Scholar 

  95. Nair, P., Cavallaro, A.: 3-d face detection, landmark localization, and registration using a point distribution model. IEEE Transactions on Multimedia 11(4), 611–623 (2009)

    CrossRef  Google Scholar 

  96. Fanelli, G., Gall, J., Romsdorfer, H., Weise, T., Van Gool, L.: A 3-d audio-visual corpus of affective communication. IEEE Transactions on Multimedia 12(6), 591–598 (2010)

    CrossRef  Google Scholar 

  97. Yin, L., Wei, X., Sun, Y., Wang, J., Rosato, M.J.: A 3d facial expression database for facial behavior research. In: International Conference on Automatic Face and Gesture Recognition (2006)

    Google Scholar 

  98. Lewis, J.P., Pighin, F.: Background mathematics. In: ACM SIGGRAPH Courses (2006)

    Google Scholar 

  99. Alexander, O., Rogers, M., Lambeth, W., Chiang, M., Debevec, P.: The digital emily project: photoreal facial modeling and animation. In: ACM SIGGRAPH Courses (2009)

    Google Scholar 

  100. Zhang, S., Huang, P.: High-resolution, real-time 3d shape acquisition. In: Workshop on Real-time 3D Sensors and Their Use (2004)

    Google Scholar 

  101. Zhang, L., Snavely, N., Curless, B., Seitz, S.M.: Spacetime faces: high resolution capture for modeling and animation. ACM Transactions on Graphics 23(3), 548–558 (2004)

    CrossRef  Google Scholar 

  102. Borshukov, G., Piponi, D., Larsen, O., Lewis, J.P., Tempelaar-Lietz, C.: Universal capture - image-based facial animation for “the matrix reloaded”. In: ACM SIGGRAPH Courses (2005)

    Google Scholar 

  103. Ma, W.C., Hawkins, T., Peers, P., Chabert, C.F., Weiss, M., Debevec, P.: Rapid acquisition of specular and diffuse normal maps from polarized spherical gradient illumination. In: Eurographics Conference on Rendering Techniques, pp. 183–194 (2007)

    Google Scholar 

  104. Wilson, C.A., Ghosh, A., Peers, P., Chiang, J.Y., Busch, J., Debevec, P.: Temporal upsampling of performance geometry using photometric alignment. ACM Transactions on Graphics 29(2) (2010)

    Google Scholar 

  105. Beeler, T., Bickel, B., Beardsley, P., Sumner, B., Gross, M.: High-quality single-shot capture of facial geometry. ACM Transactions on Graphics 29 (2010)

    Google Scholar 

  106. Bradley, D., Heidrich, W., Popa, T., Sheffer, A.: High resolution passive facial performance capture. ACM Transactions on Graphics 29(4) (2010)

    Google Scholar 

  107. Furukawa, Y., Ponce, J.: Dense 3d motion capture from synchronized video streams. In: IEEE Conference on Computer Vision and Pattern Recognition (2008)

    Google Scholar 

  108. Breidt, M., Buelthoff, H., Curio, C.: Robust semantic analysis by synthesis of 3d facial motion. In: Automatic Face and Gesture Recognition (2011)

    Google Scholar 

  109. Savran, A., Celiktutan, O., Akyol, A., Trojanová, J., Dibeklioglu, H., Esenlik, S., Bozkurt, N., Demirkir, C., Akagunduz, E., Caliskan, K., Alyuz, N., Sankur, B., Ulusoy, I., Akarun, L., Sezgin, T.M.: 3d face recognition performance under adversarial conditions. In: Workshop on Multimodal Interfaces, pp. 87–102 (2007)

    Google Scholar 

  110. Yin, L., Chen, X., Sun, Y., Worm, T., Reale, M.: A high-resolution 3d dynamic facial expression database. In: Automatic Face and Gesture Recognition (2008)

    Google Scholar 

  111. Gupta, S., Markey, M., Bovik, A.: Anthropometric 3d face recognition. International Journal of Computer Vision 90(3), 331–349 (2010)

    CrossRef  Google Scholar 

  112. Colombo, A., Cusano, C., Schettini, R.: Umb-db: A database of partially occluded 3d faces. In: Workshop on Benchmarking Facial Image Analysis Technologies, pp. 2113–2119 (2011)

    Google Scholar 

  113. Huynh, T., Min, R., Dugelay, J.-L.: An efficient LBP-based descriptor for facial depth images applied to gender recognition using RGB-D face data. In: Park, J.-I., Kim, J. (eds.) ACCV Workshops 2012, Part I. LNCS, vol. 7728, pp. 133–145. Springer, Heidelberg (2013)

    CrossRef  Google Scholar 

  114. Ballan, L., Taneja, A., Gall, J., Van Gool, L., Pollefeys, M.: Motion capture of hands in action using discriminative salient points. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VI. LNCS, vol. 7577, pp. 640–653. Springer, Heidelberg (2012)

    CrossRef  Google Scholar 

  115. Erol, A., Bebis, G., Nicolescu, M., Boyle, R.D., Twombly, X.: Vision-based hand pose estimation: A review. Computer Vision and Image Understanding 108(1-2), 52–73 (2007)

    CrossRef  MATH  Google Scholar 

  116. Lu, S., Metaxas, D., Samaras, D., Oliensis, J.: Using multiple cues for hand tracking and model refinement. In: IEEE Conference on Computer Vision and Pattern Recognition (2003)

    Google Scholar 

  117. de La Gorce, M., Fleet, D.J., Paragios, N.: Model-based 3d hand pose estimation from monocular video. IEEE Transactions on Pattern Analysis and Machine Intelligence 33(9), 1793–1805 (2011)

    CrossRef  Google Scholar 

  118. Delamarre, Q., Faugeras, O.D.: 3d articulated models and multiview tracking with physical forces. Computer Vision and Image Understanding 81(3), 328–357 (2001)

    CrossRef  MATH  Google Scholar 

  119. Bray, M., Koller-Meier, E., Van Gool, L.: Smart particle filtering for high-dimensional tracking. Computer Vision and Image Understanding 106(1), 116–129 (2007)

    CrossRef  Google Scholar 

  120. Oikonomidis, I., Kyriazis, N., Argyros, A.: Efficient model-based 3d tracking of hand articulations using kinect. In: British Machine Vision Conference (2011)

    Google Scholar 

  121. Rehg, J.M., Kanade, T.: Visual tracking of high dof articulated structures: an application to human hand tracking. In: Eklundh, J.-O. (ed.) ECCV 1994. LNCS, vol. 801, pp. 35–46. Springer, Heidelberg (1994)

    CrossRef  Google Scholar 

  122. Stenger, B., Mendonca, P., Cipolla, R.: Model-based 3D tracking of an articulated hand. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 310–315 (2001)

    Google Scholar 

  123. MacCormick, J., Isard, M.: Partitioned sampling, articulated objects, and interface-quality hand tracking. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 3–19. Springer, Heidelberg (2000)

    CrossRef  Google Scholar 

  124. Heap, T., Hogg, D.: Towards 3d hand tracking using a deformable model. In: International Conference on Automatic Face and Gesture Recognition (1996)

    Google Scholar 

  125. Wu, Y., Lin, J., Huang, T.: Capturing natural hand articulation. In: IEEE International Conference on Computer Vision, pp. 426–432 (2001)

    Google Scholar 

  126. Sudderth, E., Mandel, M., Freeman, W., Willsky, A.: Visual Hand Tracking Using Nonparametric Belief Propagation. In: Workshop on Generative Model Based Vision, pp. 189–189 (2004)

    Google Scholar 

  127. Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: IEEE International Conference on Computer Vision, pp. 1475–1482 (2009)

    Google Scholar 

  128. Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Markerless and efficient 26-DOF hand pose recovery. In: Kimmel, R., Klette, R., Sugimoto, A. (eds.) ACCV 2010, Part III. LNCS, vol. 6494, pp. 744–757. Springer, Heidelberg (2011)

    CrossRef  Google Scholar 

  129. Keskin, C., Kra, F., Kara, Y., Akarun, L.: Real time hand pose estimation using depth sensors. In: Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K. (eds.) Consumer Depth Cameras for Computer Vision. Advances in Computer Vision and Pattern Recognition, pp. 119–137. Springer, London (2013)

    CrossRef  Google Scholar 

  130. Oikonomidis, I., Kyriazis, N., Argyros, A.A.: Tracking the articulated motion of two strongly interacting hands. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)

    Google Scholar 

  131. State, A., Coleca, F., Barth, E., Martinetz, T.: Hand tracking with an extended self-organizing map. In: Estevez, P.A., Principe, J.C., Zegers, P. (eds.) Advances in Self-Organizing Maps. AISC, vol. 198, pp. 115–124. Springer, Heidelberg (2013)

    CrossRef  Google Scholar 

  132. Rosales, R., Athitsos, V., Sigal, L., Sclaroff, S.: 3d hand pose reconstruction using specialized mappings. In: IEEE International Conference on Computer Vision, pp. 378–387 (2001)

    Google Scholar 

  133. Athitsos, V., Sclaroff, S.: Estimating 3d hand pose from a cluttered image. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 432–439 (2003)

    Google Scholar 

  134. de Campos, T., Murray, D.: Regression-based hand pose estimation from multiple cameras. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 782–789 (2006)

    Google Scholar 

  135. Stenger, B., Thayananthan, A., Torr, P.: Model-based hand tracking using a hierarchical bayesian filter. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(9), 1372–1384 (2006)

    CrossRef  Google Scholar 

  136. Romero, J., Kjellström, H., Kragic, D.: Hands in action: Real-time 3d reconstruction of hands in interaction with objects. In: International Conferences on Robotics and Automation, pp. 458–463 (2010)

    Google Scholar 

  137. Lee, C.S., Chun, S.Y., Park, S.W.: Articulated hand configuration and rotation estimation using extended torus manifold embedding. In: International Conference on Pattern Recognition, pp. 441–444 (2012)

    Google Scholar 

  138. Hamer, H., Gall, J., Urtasun, R., Van Gool, L.: Data-driven animation of hand-object interactions. In: International Conference on Automatic Face and Gesture Recognition, pp. 360–367 (2011)

    Google Scholar 

  139. Hamer, H., Gall, J., Weise, T., Van Gool, L.: An object-dependent hand pose prior from sparse training data. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 671–678 (2010)

    Google Scholar 

  140. Uebersax, D., Gall, J., den Bergh, M.V., Van Gool, L.: Real-time sign language letter and word recognition from depth data. In: IEEE Workshop on Human Computer Interaction: Real-Time Vision Aspects of Natural User Interfaces (2011)

    Google Scholar 

  141. Ye, Y., Liu, C.K.: Synthesis of detailed hand manipulations using contact sampling. ACM Transactions on Graphics 31(4), 41 (2012)

    CrossRef  Google Scholar 

  142. Oikonomidis, I., Kyriazis, N., Argyros, A.: Full dof tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011)

    Google Scholar 

  143. Kim, D., Hilliges, O., Izadi, S., Butler, A.D., Chen, J., Oikonomidis, I., Olivier, P.: Digits: Freehand 3d interactions anywhere using a wrist-worn gloveless sensor. In: ACM Symposium on User Interface Software and Technology, pp. 167–176 (2012)

    Google Scholar 

  144. Zhao, W., Chai, J., Xu, Y.Q.: Combining marker-based mocap and rgb-d camera for acquiring high-fidelity hand motion data. In: Symposium on Computer Animation, pp. 33–42 (2012)

    Google Scholar 

  145. Starner, T., Weaver, J., Pentland, A.: Real-time american sign language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(12), 1371–1375 (1998)

    CrossRef  Google Scholar 

  146. Derpanis, K.G., Wildes, R.P., Tsotsos, J.K.: Hand gesture recognition within a linguistics-based framework. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004. LNCS, vol. 3021, pp. 282–296. Springer, Heidelberg (2004)

    CrossRef  Google Scholar 

  147. Ong, S., Ranganath, S.: Automatic sign language analysis: A survey and the future beyond lexical meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(6), 873–891 (2005)

    CrossRef  Google Scholar 

  148. Pei, T., Starner, T., Hamilton, H., Essa, I., Rehg, J.: Learnung the basic units in american sign language using discriminative segmental feature selection. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4757–4760 (2009)

    Google Scholar 

  149. Yang, H.D., Sclaroff, S., Lee, S.W.: Sign language spotting with a threshold model based on conditional random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(7), 1264–1277 (2009)

    CrossRef  Google Scholar 

  150. Theodorakis, S., Pitsikalis, V., Maragos, P.: Model-level data-driven sub-units for signs in videos of continuous sign language. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2262–2265 (2010)

    Google Scholar 

  151. Zafrulla, Z., Brashear, H., Hamilton, H., Starner, T.: A novel approach to american sign language (asl) phrase verification using reversed signing. In: IEEE Workshop on CVPR for Human Communicative Behavior Analysis, pp. 48–55 (2010)

    Google Scholar 

  152. Dreuw, P., Ney, H., Martinez, G., Crasborn, O., Piater, J., Moya, J.M., Wheatley, M.: The signspeak project - bridging the gap between signers and speakers. In: International Conference on Language Resources and Evaluation (2010)

    Google Scholar 

  153. Liu, X., Fujimura, K.: Hand gesture recognition using depth data. In: International Conference on Automatic Face and Gesture Recognition (2004)

    Google Scholar 

  154. Mo, Z., Neumann, U.: Real-time hand pose recognition using low-resolution depth images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1499–1505 (2006)

    Google Scholar 

  155. Breuer, P., Eckes, C., Müller, S.: Hand gesture recognition with a novel IR time-of-flight range camera–A pilot study. In: Gagalowicz, A., Philips, W. (eds.) MIRAGE 2007. LNCS, vol. 4418, pp. 247–260. Springer, Heidelberg (2007)

    CrossRef  Google Scholar 

  156. Soutschek, S., Penne, J., Hornegger, J., Kornhuber, J.: 3-d gesture-based scene navigation in medical imaging applications using time-of-flight cameras. In: Workshop on Time of Flight Camera based Computer Vision (2008)

    Google Scholar 

  157. Kollorz, E., Penne, J., Hornegger, J., Barke, A.: Gesture recognition with a time-of-flight camera. International Journal of Intelligent Systems Technologies and Applications 5, 334–343 (2008)

    CrossRef  Google Scholar 

  158. Penne, J., Soutschek, S., Fedorowicz, L., Hornegger, J.: Robust real-time 3d time-of-flight based gesture navigation. In: International Conference on Automatic Face and Gesture Recognition (2008)

    Google Scholar 

  159. Li, Z., Jarvis, R.: Real time hand gesture recognition using a range camera. In: Australasian Conference on Robotics and Automation (2009)

    Google Scholar 

  160. Takimoto, H., Yoshimori, S., Mitsukura, Y., Fukumi, M.: Classification of hand postures based on 3d vision model for human-robot interaction. In: International Symposium on Robot and Human Interactive Communication, pp. 292–297 (2010)

    Google Scholar 

  161. Lahamy, H., Litchi, D.: Real-time hand gesture recognition using range cameras. In: Canadian Geomatics Conference (2010)

    Google Scholar 

  162. Van den Bergh, M., Van Gool, L.: Combining rgb and tof cameras for real-time 3d hand gesture interaction. In: IEEE Workshop on Applications of Computer Vision (2011)

    Google Scholar 

  163. Marnik, J.: The polish finger alphabet hand postures recognition using elastic graph matching. In: Kurzynski, M., Puchala, E., Wozniak, M., Zolnierek, A. (eds.) Computer Recognition Systems 2. ASC, vol. 45, pp. 454–461. Springer, Heidelberg (2007)

    CrossRef  Google Scholar 

  164. Incertis, I., Garcia-Bermejo, J., Casanova, E.: Hand gesture recognition for deaf people interfacing. In: International Conference on Pattern Recognition, pp. 100–103 (2006)

    Google Scholar 

  165. Lockton, R., Fitzgibbon, A.W.: Real-time gesture recognition using deterministic boosting. In: British Machine Vision Conference (2002)

    Google Scholar 

  166. Liwicki, S., Everingham, M.: Automatic recognition of fingerspelled words in british sign language. In: IEEE Workshop on CVPR for Human Communicative Behavior Analysis (2009)

    Google Scholar 

  167. Kelly, D., Mc Donald, J., Markham, C.: A person independent system for recognition of hand postures used in sign language. Pattern Recognition Letters 31, 1359–1368 (2010)

    CrossRef  Google Scholar 

  168. Amin, M., Yan, H.: Sign language finger alphabet recognition from gabor-pca representation of hand gestures. In: Machine Learning and Cybernetics (2007)

    Google Scholar 

  169. Munib, Q., Habeeb, M., Takruri, B., Al-Malik, H.: American sign language (asl) recognition based on hough transform and neural networks. Expert Systems with Applications 32(1), 24–37 (2007)

    CrossRef  Google Scholar 

  170. Tzionas, D., Gall, J.: A comparison of directional distances for hand pose estimation. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 131–141. Springer, Heidelberg (2013)

    CrossRef  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and Permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ye, M., Zhang, Q., Wang, L., Zhu, J., Yang, R., Gall, J. (2013). A Survey on Human Motion Analysis from Depth Data. In: Grzegorzek, M., Theobalt, C., Koch, R., Kolb, A. (eds) Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications. Lecture Notes in Computer Science, vol 8200. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44964-2_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-44964-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-44963-5

  • Online ISBN: 978-3-642-44964-2

  • eBook Packages: Computer ScienceComputer Science (R0)