Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications

Volume 8200 of the series Lecture Notes in Computer Science pp 188-206

Full-Body Human Motion Capture from Monocular Depth Images

  • Thomas Helten (MPI Informatik)
  • Andreas Baak (MPI Informatik)
  • Meinard Müller (International Audio Laboratories)
  • Christian Theobalt (MPI Informatik)

Optical capture of human body motion has many practical applications, ranging from motion analysis in sports and medicine, through ergonomics research, to computer animation in game and movie production. Unfortunately, many existing approaches require expensive multi-camera systems and controlled recording studios, and expect the person to wear a special marker suit. Marker-less approaches, in turn, demand dense camera arrays and indoor recording. These requirements and the high cost of the equipment make such systems accessible to only a small number of people. This has changed in recent years, as the availability of inexpensive depth sensors, such as time-of-flight cameras or the Microsoft Kinect, has spawned new research on tracking human motion from monocular depth images. These approaches have the potential to make motion capture accessible to much larger user groups. However, despite significant progress in recent years, there are still unsolved challenges that limit the applicability of depth-based monocular full-body motion capture. Algorithms are challenged by very noisy sensor data, (self-)occlusions, and other ambiguities implied by the limited information that a depth sensor can extract from the scene. In this article, we give an overview of the state of the art in full-body human motion capture using depth cameras. In particular, we elaborate on the challenges that current algorithms face and discuss possible solutions. Furthermore, we investigate how integrating additional sensor modalities may help to resolve some of these ambiguities and improve tracking results.