Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation

  • Stefano Corazza
  • Lars Mündermann
  • Emiliano Gambaretto
  • Giancarlo Ferrigno
  • Thomas P. Andriacchi


An approach for accurately measuring human motion through Markerless Motion Capture (MMC) is presented. The method uses multiple color cameras and combines an accurate and anatomically consistent tracking algorithm with a method for automatically generating subject-specific models. The tracking approach employs a Levenberg-Marquardt minimization scheme over an iterative closest point algorithm with six degrees of freedom for each body segment. Anatomical consistency is maintained by enforcing rotational and translational range-of-motion constraints for each specific joint. A subject-specific model is obtained through an automatic model generation algorithm (Corazza et al. in IEEE Trans. Biomed. Eng., 2009), which combines a space of human shapes (Anguelov et al. in Proceedings SIGGRAPH, 2005) with biomechanically consistent kinematic models and a pose-shape matching algorithm. The model has 15 anatomical body segments and 14 joints, each with six degrees of freedom (13 and 12, respectively, for the HumanEva II dataset). The overall method improves on (Mündermann et al. in Proceedings of CVPR, 2007) in both accuracy and robustness. Because the method was originally developed for ≥8 cameras, its performance was tested both (i) on the HumanEva II dataset (Sigal and Black, Technical Report CS-06-08, 2006) in a 4-camera configuration and (ii) on a series of motions including walking trials, a very challenging gymnastic motion, and a dataset with motions similar to HumanEva II but with a variable number of cameras.
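The tracking idea summarized above (Levenberg-Marquardt minimization over closest-point correspondences, six degrees of freedom per segment, with range-of-motion limits) can be illustrated with a minimal sketch for a single rigid segment. This is not the authors' implementation. It assumes NumPy, a rotation-vector pose parameterization, brute-force nearest-neighbor matching, a finite-difference Jacobian, and simple box clamping in place of the joint constraints described in the paper. All function and variable names are hypothetical.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix from a rotation vector (unit axis times angle)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * K @ K

def transform(pose, pts):
    """Apply a 6-DOF pose (3 rotation + 3 translation values) to Nx3 points."""
    return pts @ rodrigues(pose[:3]).T + pose[3:]

def closest_points(src, dst):
    """Brute-force closest point in dst for every point in src."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    return dst[d2.argmin(axis=1)]

def residuals(pose, model, targets):
    """Stacked coordinate differences between posed model and its targets."""
    return (transform(pose, model) - targets).ravel()

def numeric_jacobian(pose, model, targets, eps=1e-6):
    """Forward-difference Jacobian of the residuals w.r.t. the 6 pose values."""
    r0 = residuals(pose, model, targets)
    J = np.empty((r0.size, 6))
    for j in range(6):
        step = np.zeros(6)
        step[j] = eps
        J[:, j] = (residuals(pose + step, model, targets) - r0) / eps
    return J, r0

def fit_segment(model, data, lower, upper, iters=30, lam=1e-3):
    """ICP loop with one damped (Levenberg-Marquardt) step per iteration."""
    pose = np.zeros(6)
    for _ in range(iters):
        # Re-establish correspondences at the current pose estimate.
        targets = closest_points(transform(pose, model), data)
        J, r = numeric_jacobian(pose, model, targets)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(6), -J.T @ r)
        # Box clamp: a simplified stand-in for range-of-motion constraints.
        trial = np.clip(pose + delta, lower, upper)
        r_trial = residuals(trial, model, targets)
        if r_trial @ r_trial < r @ r:
            pose, lam = trial, lam * 0.5   # accept the step, relax damping
        else:
            lam *= 10.0                    # reject the step, increase damping
    return pose

# Synthetic check: a 3x3x3 grid of model points moved by a small known pose.
axis = np.array([-1.0, 0.0, 1.0])
model = np.array([[x, y, z] for x in axis for y in axis for z in axis])
true_pose = np.array([0.05, -0.03, 0.04, 0.02, 0.01, -0.03])
data = transform(true_pose, model)
bounds = np.full(6, 0.5)
estimate = fit_segment(model, data, -bounds, bounds)
print(np.round(estimate, 4))
```

In a full articulated tracker each body segment would carry its own six pose values and the limits would act on relative joint motion between neighboring segments. The sketch keeps a single segment only to show the interplay between correspondence search, the damped update, and the limits.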


Keywords: Markerless motion capture, Tracking, 3D reconstruction, Human body model, Shape from silhouette


References

  1. Aggarwal, J., & Cai, Q. (1999). Human motion analysis: a review. Computer Vision and Image Understanding, 73(3), 295–304.
  2. Andriacchi, T. P., Alexander, E. J., Toney, M. K., Dyrby, C. O., & Sum, J. A. (1998). A point cluster method for in vivo motion analysis: applied to a study of knee kinematics. Journal of Biomechanical Engineering, 120, 743–749.
  3. Anguelov, D., Koller, D., Pang, H., Srinivasan, P., & Thrun, S. (2004). Recovering articulated object models from 3D range data. In Proceedings UAI.
  4. Anguelov, D., Srinivasan, P., Koller, D., Thrun, S., Rodgers, J., & Davis, J. (2005). SCAPE: shape completion and animation of people. In Proceedings SIGGRAPH.
  5. Balan, A. O., Sigal, L., Black, M. J., Davis, J. E., & Haussecker, H. W. (2007). Detailed human shape and pose from images. In Proceedings CVPR.
  6. Baran, I., & Popovic, J. (2007). Automatic rigging and animation of 3D characters. In Proceedings of SIGGRAPH.
  7. Besl, P., & McKay, N. (1992). A method for registration of 3D shapes. Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239–256.
  8. Bharatkumar, A. G., Daigle, K. E., Pandy, M. G., Cai, Q., & Aggarwal, J. K. (1994). Lower limb kinematics of human walking with the medial axis transformation. In IEEE Workshop on Non-Rigid Motion, Austin, USA (pp. 70–76).
  9. Bottino, A., & Laurentini, A. (2001). A silhouette based technique for the reconstruction of human movement. Computer Vision and Image Understanding, 83, 79.
  10. Bregler, C., & Malik, J. (1997). Tracking people with twists and exponential maps. In Proceedings CVPR.
  11. Cedras, C., & Shah, M. (1995). Motion-based recognition: a survey. Image and Vision Computing, 13(2), 129–155.
  12. Cheung, K., Baker, S., & Kanade, T. (2005). Shape-from-silhouette across time part I: Theory and algorithm. International Journal of Computer Vision, 62, 221–247.
  13. Corazza, S., Mündermann, L., Chaudhari, A. M., Demattio, T., Cobelli, C., & Andriacchi, T. P. (2006). A markerless motion capture system to study musculoskeletal biomechanics: visual hull and simulated annealing approach. Annals of Biomedical Engineering, 34(6), 1019–1029.
  14. Corazza, S., Gambaretto, E., Mündermann, L., & Andriacchi, T. (2009). Automatic generation of a subject specific model for accurate markerless motion capture and biomechanical applications. IEEE Transactions on Biomedical Engineering, in press.
  15. Delamarre, Q., & Faugeras, O. (1999). 3D articulated models and multiview tracking with silhouettes. In Proceedings ICCV.
  16. Demirdjian, D. (2004). Combining geometric- and view-based approaches for articulated pose. In Proceedings ECCV04 (Vol. III, pp. 183–194).
  17. Deutscher, J., Blake, A., & Reid, I. (2000). Articulated body motion capture by annealed particle filtering. In Proceedings CVPR (pp. 2126–2133).
  18. Gavrila, D. (1999). The visual analysis of human movement: a survey. Computer Vision and Image Understanding, 73(3), 82–98.
  19. Gavrila, D., & Davis, L. (1996). 3-D model based tracking of humans in action: a multiview approach. In Proceedings CVPR (pp. 73–80).
  20. Hogg, D. (1983). Model-based vision: a program to see a walking person. Image and Vision Computing, 1, 5.
  21. Isard, M., & Blake, A. (1996). Estimating 3D hand pose using hierarchical multi-label classification. In Proceedings of 4th European Conference on Computer Vision, Cambridge, UK.
  22. Kakadiaris, I. A., & Metaxas, D. (1998). Three-dimensional human body model acquisition from multiple views. International Journal of Computer Vision, 30, 191.
  23. Kanade, T., Saito, H., & Vedula, S. (1998). The 3D Room: Digitizing time-varying 3D events by synchronized multiple video streams (Tech. report CMU-RI-TR-98-34). Robotics Institute, Carnegie Mellon University.
  24. Knossow, D., Ronfard, R., & Horaud, R. P. (2008). Human motion tracking with a kinematic parameterization of extremal contours. International Journal of Computer Vision, 79(2), 247–269.
  25. Kohli, P., Rihan, J., Bray, M., & Torr, P. H. S. (2008). Simultaneous segmentation and pose estimation of humans using dynamic graph cuts. International Journal of Computer Vision, 79(3), 285–298.
  26. Laurentini, A. (1994). The Visual Hull concept for silhouette base image understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 150–162.
  27. Leardini, A., Chiari, L., Della Croce, U., & Cappozzo, A. (2005). Human movement analysis using stereophotogrammetry. Part 3: Soft tissue artifact assessment and compensation. Gait and Posture, 21, 221–225.
  28. Lee, H. J., & Chen, Z. (1985). Determination of 3D human body posture from a single view. Computer Vision, Graphics, and Image Processing, 30, 148–168.
  29. Lee, W., Gu, J., & Magnenat-Thalmann, N. (2000). Generating animatable 3D virtual humans from photographs. In Proceedings Computer Graphics Forum, Eurographics (pp. 1–10).
  30. Legrand, L., Marzani, F., & Dusserre, L. (1998). A marker-free system for the analysis of movement disabilities. Medinfo, 9, 1066–1070.
  31. Liu, Q., & Prakash, E. C. (2003). The parametrization of joint rotation with the unit quaternion. In Proceedings of 7° Digital Image Computing.
  32. Marzani, F., Calais, E., & Legrand, L. (2001). A 3-D marker-free system for the analysis of movement disabilities: an application to the legs. IEEE Transactions on Information Technology in Biomedicine, 5(1), 18–26.
  33. Mikic, I., Trivedi, M., Hunter, E., & Cosman, P. (2003). Human body model acquisition and tracking using voxel data. International Journal of Computer Vision, 53, 199–223.
  34. Moeslund, T. B., Hilton, A., & Krüger, V. (2006). A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2), 90–126.
  35. Moon, H., Chellappa, R., & Rosenfeld, A. (2001). 3D object tracking using shape-encoded particle propagation. In Proceedings ICCV.
  36. Mündermann, L., Corazza, S., Chaudhari, A. M., Alexander, E. J., & Andriacchi, T. P. (2005). Most favorable camera configuration for a shape-from-silhouette markerless motion capture system for biomechanical analysis. Proceedings of SPIE-IS&T Electronic Imaging, 5665, 278–287.
  37. Mündermann, L., Corazza, S., & Andriacchi, T. P. (2006). The evolution of methods for the capture of human movement leading to markerless motion capture for biomechanical applications. Journal of Neuroengineering and Rehabilitation, 3(1).
  38. Mündermann, L., Corazza, S., & Andriacchi, T. (2007). Accurately measuring human movement using articulated ICP with soft-joint constraints and a repository of articulated models. In Proceedings of CVPR.
  39. Narayanan, P. J., Rander, P., & Kanade, T. (1995). Synchronous capture of image sequences from multiple cameras (Technical Report CMU-RI-TR-95-25). Robotics Institute, Carnegie Mellon University.
  40. Nielsen, H. B. (1999). Damping parameter in Marquardt’s method (Technical Report IMM-REP-1999-05). Technical University of Denmark.
  41. Niskanen, M., Boyer, E., & Horaud, R. (2005). Articulated motion capture from 3-D points and normals. In Proceedings of BMVC’05.
  42. O’Rourke, J., & Badler, N. I. (1980). Model-based image analysis of human motion using constraint propagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 522–536.
  43. Plankers, R., & Fua, P. (2003). Articulated soft objects for multiview shape and motion capture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 1182–1187.
  44. Rosenhahn, B., & Klette, R. (2005). Automatic human model generation. Computer Analysis of Images and Patterns, 230–237.
  45. Rosenhahn, B., Brox, T., Kersting, U. G., Smith, A. W., Gurney, J. K., & Klette, R. (2006). A system for marker-less motion capture. Künstliche Intelligenz (KI), 1, 45–51.
  46. Sigal, L., & Black, M. J. (2006). HumanEva: synchronized video and motion capture dataset for evaluation of articulated human motion (Technical Report CS-06-08). Brown University.
  47. Wagg, D. K., & Nixon, M. S. (2004). Automated markerless extraction of walking people using deformable contour models. Computer Animation and Virtual Worlds, 15, 399–406.
  48. Wren, C. R., Azarbayejani, A., Darrell, T., & Pentland, A. P. (1997). Pfinder: real-time tracking of the human body. Transactions on Pattern Analysis and Machine Intelligence, 19, 780–785.
  49. Yamamoto, M., & Koshikawa, K. (1991). Human motion analysis based on a robot arm model. In Proceedings Computer Vision and Pattern Recognition.

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Stefano Corazza (1), email author
  • Lars Mündermann (1)
  • Emiliano Gambaretto (2)
  • Giancarlo Ferrigno (2)
  • Thomas P. Andriacchi (1, 3)

  1. Stanford University, Stanford, USA
  2. Politecnico di Milano, Milano, Italy
  3. Bone and Joint RR&D, VA Palo Alto Hospital, Palo Alto, USA
