Computer Vision and Machine Learning with RGB-D Sensors

Part of the series Advances in Computer Vision and Pattern Recognition, pp. 91–108


Human Performance Capture Using Multiple Handheld Kinects

  • Yebin Liu, Tsinghua University (corresponding author)
  • Genzhi Ye, Tsinghua University
  • Yangang Wang, Tsinghua University
  • Qionghai Dai, Tsinghua University
  • Christian Theobalt, Tsinghua University



Capturing real performances of human actors has been an important topic in computer graphics and computer vision over the last few decades. The reconstructed 3D performance can be used for character animation and free-viewpoint video. While most available performance capture approaches rely on a 3D video studio with tens of RGB cameras, this chapter presents a method for marker-less performance capture of single or multiple human characters using only three handheld Kinects. Compared with RGB-camera approaches, the proposed method is more convenient with respect to data acquisition: it requires far fewer cameras and allows capture with handheld, moving cameras. The method introduced in this chapter reconstructs human skeletal poses, deforming surface geometry, and camera poses for every time step of the depth video. It succeeds in general, uncontrolled indoor scenes with potentially dynamic backgrounds, and even in the reconstruction of multiple closely interacting characters.
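The per-frame pipeline described above — estimating the pose of each handheld Kinect, the skeletal pose, and the surface deformation at every time step of the depth video — can be sketched as a loop over synchronized depth frames. The sketch below is only a structural illustration: every function body is a hypothetical placeholder, and the names and data representations are assumptions, not the authors' actual algorithm.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FrameEstimate:
    camera_poses: List[List[float]]  # one 6-DoF pose per handheld Kinect (assumed representation)
    skeleton_pose: List[float]       # joint parameters of the tracked skeleton
    surface: List[float]             # per-vertex surface deformation parameters

def register_cameras(depth_frames, prev):
    # Placeholder: align each Kinect's depth map against the previously
    # reconstructed surface (stands in for an ICP-style registration step).
    return [pose[:] for pose in prev.camera_poses]

def track_skeleton(depth_frames, camera_poses, prev):
    # Placeholder: optimize the skeletal pose against the merged point cloud.
    return prev.skeleton_pose[:]

def deform_surface(skeleton_pose, prev):
    # Placeholder: refine the surface geometry driven by the new skeleton.
    return prev.surface[:]

def capture_sequence(depth_video, init):
    """Run the per-frame estimation loop over a sequence of depth frames.

    depth_video: iterable of per-time-step frame sets (one depth map per Kinect).
    init: FrameEstimate used to initialize tracking at the first time step.
    """
    estimates = [init]
    for frames in depth_video:
        prev = estimates[-1]
        cams = register_cameras(frames, prev)
        skel = track_skeleton(frames, cams, prev)
        surf = deform_surface(skel, prev)
        estimates.append(FrameEstimate(cams, skel, surf))
    return estimates[1:]  # one estimate per input time step
```

The key design point this loop illustrates is that camera poses must be re-estimated at every time step, since the Kinects are handheld and moving, before the skeleton and surface can be solved against the merged depth data.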