Machine Vision and Applications, Volume 26, Issue 1, pp 41–54

Tracking the articulated motion of the human body with two RGBD cameras

  • Damien Michel
  • Costas Panagiotakis
  • Antonis A. Argyros
Original Paper


We present a model-based, top-down solution to the problem of tracking the 3D position, orientation and full articulation of the human body from markerless visual observations obtained by two synchronized RGBD cameras. Inspired by recent advances in model-based hand tracking (Oikonomidis et al., Efficient Model-based 3D Tracking of Hand Articulations using Kinect, 2011), we treat human body tracking as an optimization problem that is solved using stochastic optimization techniques. We show that the proposed approach is more accurate than state-of-the-art methods that rely on a single RGBD camera. Thus, for applications that require increased accuracy and can afford the extra complexity introduced by the second sensor, the proposed approach is a viable solution to the problem of markerless human motion tracking. Our findings are supported by an extensive quantitative evaluation of the method on a publicly available, ground-truth-annotated data set.
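To illustrate what treating tracking as a stochastic optimization problem can look like, here is a minimal sketch. It assumes Particle Swarm Optimization (Kennedy and Eberhart, listed in the references) as the optimizer; the abstract itself names neither the optimizer nor the objective function. The names `pso_minimize` and `toy_discrepancy`, the parameter values, and the three-parameter toy "pose" are all hypothetical. In a real tracker the objective would presumably measure the discrepancy between a hypothesized body pose and the RGBD observations; here it is replaced by a simple squared distance to a hidden target.

```python
import random

def pso_minimize(objective, lower, upper, n_particles=32, n_iters=100, seed=0):
    """Minimize `objective` inside a box with basic PSO (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(lower)
    # Inertia and acceleration constants commonly used with PSO (assumed values).
    w, c1, c2 = 0.7298, 1.4962, 1.4962
    # Particles start at random positions in the box with zero velocity.
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best position
    best_val = [objective(p) for p in pos]      # per-particle best score
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull each particle toward its own best and the swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp the coordinate to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Toy stand-in for an observation-to-model discrepancy (assumption):
# squared distance between a candidate "pose" and a hidden true pose.
true_pose = [0.3, -1.2, 0.8]

def toy_discrepancy(pose):
    return sum((p - t) ** 2 for p, t in zip(pose, true_pose))

estimate, error = pso_minimize(toy_discrepancy, [-3.14] * 3, [3.14] * 3)
```

Under these assumptions, tracking over a sequence would amount to running such a search once per frame, with the search box centered on the previous frame's estimate.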


Keywords: Markerless human motion capture, 3D human tracking, 3D pose estimation, Articulated object tracking, 3D reconstruction



This work was partially funded by the European Commission under contract FP7-IST-288146 HOBBIT and by the European Union (European Social Fund—ESF) and Greek national funds through the Operational Program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF)—Research Funding Project: THALIS-UOA-ERASITECHNIS MIS 375435.


  1. Bisacco, A., Ming-Hsuan, Y., Soatto, S.: Fast human pose estimation using appearance and motion via multi-dimensional boosting regression. In: IEEE Computer Vision and Pattern Recognition (2007)
  2. Chen, L., Wei, H., Ferryman, J.: A survey of human motion analysis using depth imagery. Pattern Recognit. Lett. 34(15), 1995–2006 (2013)
  3. Corazza, S., Mundermann, L., Gambaretto, E., Ferrigno, G., Andriacchi, T.: Markerless motion capture through visual hull, articulated icp and subject specific model generation. Int. J. Comput. Vis. 87(1–2), 156–169 (2010)
  4. Deutscher, J., Reid, I.: Articulated body motion capture by stochastic search. Int. J. Comput. Vis. 61(2), 185–205 (2005)
  5. Gall, J., Rosenhahn, B., Brox, T., Seidel, H.-P.: Optimization and filtering for human motion capture. Int. J. Comput. Vis. 87(1–2), 75–92 (2010)
  6. Gall, J., Stoll, C., de Aguiar, E., Theobalt, C., Rosenhahn, B., Seidel, H.-P.: Motion capture using joint skeleton tracking and surface estimation. In: IEEE Computer Vision and Pattern Recognition, pp. 1746–1753 (2009)
  7. Hamer, H., Schindler, K., Koller-Meier, E., Van Gool, L.: Tracking a hand manipulating an object. In: IEEE International Conference on Computer Vision (2009)
  8. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
  9. Kennedy, J., Eberhart, R., Yuhui, S.: Swarm intelligence. Morgan Kaufmann (2001)
  10. Mikic, I., Trivedi, M., Hunter, E., Cosman, P.: Human body model acquisition and tracking using voxel data. Int. J. Comput. Vis. 53(3), 199–223 (2003)
  11. Moeslund, T.B., Hilton, A., Krüger, V.: A survey of advances in vision-based human motion capture and analysis. Comput. Vis. Image Underst. 104, 90–126 (2006)
  12. Mussi, L., Ivekovic, S., Cagnoni, S.: Markerless articulated human body tracking from multi-view video with gpu-pso. In: Tempesti, G., Tyrrell, A., Miller, J. (eds.) Evolvable Systems: From Biology to Hardware. Lecture Notes in Computer Science, vol. 6274, pp. 97–108 (2010)
  13. Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., Bajcsy, R.: Berkeley MHAD: a comprehensive multimodal human action database. In: IEEE Workshop on Applications on Computer Vision (WACV) (2013)
  14. Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Markerless and Efficient 26-DOF Hand Pose Recovery. Asian Conf. Comput. Vis. 6494, 744–757 (2010)
  15. Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Efficient Model-based 3D Tracking of Hand Articulations using Kinect. In: British Machine Vision Conference. Dundee, UK (2011)
  16. Oikonomidis, I., Kyriazis, N., Argyros, A. A.: Full DOF tracking of a hand interacting with an object by modeling occlusions and physical constraints. In: IEEE International Conference on Computer Vision (2011)
  17. OpenNI, November. OpenNI User Guide. OpenNI organization, last viewed 19-01-2011 pp. 11–32. (2010)
  18. Pons-Moll, G., Leal-Taixe, L., Truong, T., Rosenhahn, B.: Efficient and robust shape matching for model based human motion capture. In: Mester, R., Felsberg, M. (eds.) Pattern Recognition. Lecture Notes in Computer Science, vol. 6835, pp. 416–425. Springer, Berlin (2011)
  19. Poppe, R.: Vision-based human motion analysis: an overview. Comput. Vis. Image Underst. Vis. Hum. Comput. Interact. 108(1–2), 4–18 (2007)
  20. Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M., Moore, R.: Real-time human pose recognition in parts from single depth images. Commun. ACM 56(1), 116–124 (2013)
  21. Sigal, L., Isard, M., Haussecker, H., Black, M.: Loose-limbed people: estimating 3d human pose and motion using non-parametric belief propagation. Int. J. Comput. Vis. 98(1), 15–48 (2012)
  22. Sminchisescu, C., Kanaujia, A., Li, Z., Metaxas, D.: Discriminative density propagation for 3d human motion estimation. IEEE Comput. Vis. Pattern Recognit. 1, 390–397 (2005)
  23. Smisek, J., Jancosek, M., Pajdla, T.: 3d with kinect. In: IEEE ICCV Workshops, pp. 1154–1160 (2011)
  24. Tzevanidis, K., Zabulis, X., Sarmis, T., Koutlemanis, P., Kyriazis, N., Argyros, A.: From multiple views to textured 3d meshes: a gpu-powered approach. ECCV Workshops, pp. 5–11 (2010)
  25. Vicon.: Vicon: Motion capture systems. (2013)
  26. Vijay, J., Trucco, E., Ivekovic, S.: Markerless human articulated tracking using hierarchical particle swarm optimisation. Image Vis. Comput. 28(11), 1530–1547 (2010)
  27. Wilson, J. L.: Microsoft kinect for xbox 360. PC Magazine Communications (2010)
  28. Zhang, L., Sturm, J., Cremers, D., Lee, D.: Real-time human motion tracking using multiple depth cameras. In: Proceedings of the International Conference on Intelligent Robot Systems (IROS) (2012)

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Damien Michel (1)
  • Costas Panagiotakis (2)
  • Antonis A. Argyros (1, 3)
  1. Institute of Computer Science, FORTH, Heraklion, Greece
  2. Department of Business Administration, TEI of Crete, Agios Nikolaos, Greece
  3. Computer Science Department, University of Crete, Heraklion, Greece
