Abstract
Over the past decade, depth sensors have become one of the most popular means of capturing human facial and posture information. By coupling a depth camera with computer vision-based recognition algorithms, these sensors can detect human facial and body features in real time. This breakthrough has fueled many new research directions in animation creation and control, and has also opened up new challenges. In this chapter, we explain how depth sensors obtain human facial and body information. We then discuss the main challenge of depth sensor-based systems, namely the inaccuracy of the obtained data, and explain how this problem is tackled. Finally, we point out emerging applications in the field, in which modeling and understanding human facial and body features is a key research problem.
Acknowledgment
This work is supported by the Engineering and Physical Sciences Research Council (EPSRC) (Ref: EP/M002632/1).
© 2016 Springer International Publishing Switzerland
About this entry
Cite this entry
Shen, Y., Zhang, J., Yang, L., Shum, H.P.H. (2016). Depth Sensor-Based Facial and Body Animation Control. In: Müller, B., et al. Handbook of Human Motion. Springer, Cham. https://doi.org/10.1007/978-3-319-30808-1_7-1
DOI: https://doi.org/10.1007/978-3-319-30808-1_7-1
Publisher Name: Springer, Cham
Online ISBN: 978-3-319-30808-1
eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering