International Journal of Computer Vision, Volume 63, Issue 3, pp 225–245

Shape-From-Silhouette Across Time Part II: Applications to Human Modeling and Markerless Motion Tracking

  • Kong-man (German) Cheung
  • Simon Baker
  • Takeo Kanade

Abstract

In Part I of this paper, we developed the theory and algorithms for performing Shape-From-Silhouette (SFS) across time. In this second part, we show how our temporal SFS algorithms can be applied to human modeling and markerless motion tracking. First, we build a system to acquire human kinematic models consisting of precise shape (constructed using the temporal SFS algorithm for rigid objects), joint locations, and body part segmentation (estimated using the temporal SFS algorithm for articulated objects). Once the kinematic models have been built, we show how they can be used to track the motion of the person in new video sequences. This markerless tracking algorithm is based on the Visual Hull alignment algorithm used in both temporal SFS algorithms and utilizes both geometric (silhouette) and photometric (color) information.
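
For readers unfamiliar with the underlying idea, the following is a minimal sketch of basic single-frame Shape-From-Silhouette by voxel carving. It is not the temporal SFS or Visual Hull alignment algorithm of the paper. The orthographic cameras, the toy silhouettes, and every name in it (carve_visual_hull, project_along_z, and so on) are assumptions made up for illustration.

```python
from itertools import product
from typing import Callable, Sequence, Set, Tuple

Voxel = Tuple[int, int, int]
Pixel = Tuple[int, int]
# A view pairs a projection function (voxel -> pixel) with the set of
# foreground pixels in that camera's silhouette image.
View = Tuple[Callable[[Voxel], Pixel], Set[Pixel]]


def carve_visual_hull(grid_size: int, views: Sequence[View]) -> Set[Voxel]:
    """Keep every voxel whose projection is foreground in all views."""
    hull: Set[Voxel] = set()
    for voxel in product(range(grid_size), repeat=3):
        if all(project(voxel) in silhouette for project, silhouette in views):
            hull.add(voxel)
    return hull


# Hypothetical orthographic cameras looking along the z, y and x axes.
def project_along_z(v: Voxel) -> Pixel:
    return (v[0], v[1])


def project_along_y(v: Voxel) -> Pixel:
    return (v[0], v[2])


def project_along_x(v: Voxel) -> Pixel:
    return (v[1], v[2])


# Toy silhouette: the same 2x2 square seen from every direction.
square = {(i, j) for i in (1, 2) for j in (1, 2)}
views = [
    (project_along_z, square),
    (project_along_y, square),
    (project_along_x, square),
]

hull_one_view = carve_visual_hull(4, views[:1])  # a 2x2 column along z
hull_all_views = carve_visual_hull(4, views)     # a 2x2x2 cube
```

In this sketch each additional silhouette can only remove voxels, so the hull from three views is a subset of the hull from one view.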

Keywords

human kinematic modeling; markerless motion capture; articulated human tracking; 3D reconstruction; Shape-From-Silhouette; visual hull; stereo; temporal alignment

Supplementary material

11263_63_3 video file.mpg (4.6 MB)

Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • Kong-man (German) Cheung (1)
  • Simon Baker (2)
  • Takeo Kanade (2)
  1. Neven Vision, Santa Monica, USA
  2. The Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
