Human Body Model Acquisition and Tracking Using Voxel Data

Mikić, Ivana; Trivedi, Mohan; Hunter, Edward; Cosman, Pamela

doi:10.1023/A:1023012723347

Published: July 2003

International Journal of Computer Vision, Volume 53, pages 199–223 (2003)

Ivana Mikić¹, Mohan Trivedi², Edward Hunter¹ & Pamela Cosman²


Abstract

We present an integrated system for automatic acquisition of the human body model and motion tracking using input from multiple synchronized video streams. The video frames are segmented and the 3D voxel reconstructions of the human body shape in each frame are computed from the foreground silhouettes. These reconstructions are then used as input to the model acquisition and tracking algorithms.
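The reconstruction step described above can be illustrated with a minimal shape-from-silhouette sketch: a voxel is kept only if its center projects onto foreground in every view. This is an illustration under stated assumptions, not the authors' implementation. The function names (`project`, `is_foreground`, `carve`), the 3x4 camera matrices, and the toy 4x4 silhouettes are all hypothetical and chosen only to make the idea concrete.

```python
from itertools import product


def project(P, point):
    """Project a 3D point with a 3x4 camera matrix P; return (u, v) or None."""
    x, y, z = point
    h = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in P]
    if h[2] <= 0:  # point is not in front of the camera
        return None
    return h[0] / h[2], h[1] / h[2]


def is_foreground(P, silhouette, point):
    """True if the point projects onto a foreground pixel of the silhouette."""
    uv = project(P, point)
    if uv is None:
        return False
    u, v = int(round(uv[0])), int(round(uv[1]))
    inside = 0 <= v < len(silhouette) and 0 <= u < len(silhouette[0])
    return bool(inside and silhouette[v][u])


def carve(views, voxel_centers):
    """Keep the voxel centers that are foreground in every (P, silhouette) view."""
    return [c for c in voxel_centers
            if all(is_foreground(P, sil, c) for P, sil in views)]


# Toy setup (assumed): two orthographic-style cameras and one square silhouette.
mask = [[1 <= u <= 2 and 1 <= v <= 2 for u in range(4)] for v in range(4)]
cam_along_z = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # image (u, v) = (x, y)
cam_along_x = [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]  # image (u, v) = (z, y)
grid = list(product(range(4), repeat=3))
voxels = carve([(cam_along_z, mask), (cam_along_x, mask)], grid)
```

In this toy case the two views carve the 4x4x4 grid down to the 2x2x2 block in its middle. A real system would use calibrated perspective cameras and silhouettes obtained by segmenting each video frame.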

The human body model consists of ellipsoids and cylinders and is described using the twists framework, which yields a non-redundant set of model parameters. Model acquisition starts with a simple body part localization procedure based on template fitting and growing, which uses prior knowledge of average body part shapes and dimensions. The initial model is then refined using a Bayesian network that imposes human body proportions onto the body part size estimates. The tracker is an extended Kalman filter that estimates model parameters from measurements made on the labeled voxel data. A voxel labeling procedure that handles large frame-to-frame displacements was designed, resulting in very robust tracking performance.
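To make the twists framework mentioned above more concrete, the sketch below computes the rigid transform produced by a single revolute joint from its twist, using the standard exponential-map formula (Rodrigues' formula for the rotation part). The excerpt does not give the paper's exact parameterization, so this is a generic textbook construction; the function names and the example joint are assumptions made for illustration.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rodrigues(w, theta):
    """Rotation matrix for a rotation of theta radians about the unit axis w."""
    wx, wy, wz = w
    K = [[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]]  # skew-symmetric matrix of w
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + c * K2[i][j]
             for j in range(3)] for i in range(3)]


def revolute_twist(axis, point_on_axis):
    """Twist (v, w) of a revolute joint: w is the unit axis, v = -w x q."""
    v = tuple(-c for c in cross(axis, point_on_axis))
    return v, axis


def twist_exp(v, w, theta):
    """Rigid transform (R, t) from the exponential of a twist with unit w."""
    R = rodrigues(w, theta)
    wxv = cross(w, v)
    wv = sum(wi * vi for wi, vi in zip(w, v))
    # t = (I - R)(w x v) + w (w . v) theta
    t = [wxv[i] - sum(R[i][j] * wxv[j] for j in range(3)) + w[i] * wv * theta
         for i in range(3)]
    return R, t


def apply(R, t, p):
    """Apply the rigid transform (R, t) to the point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


# Hypothetical joint: rotation axis along z, passing through (1, 0, 0).
v, w = revolute_twist((0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
R, t = twist_exp(v, w, math.pi / 2)
moved = apply(R, t, (2.0, 0.0, 0.0))  # a point one unit away from the joint
```

Each joint contributes a single angle `theta`, which is the sense in which a twist-based description avoids redundant parameters. Here the point at (2, 0, 0) swings a quarter turn around the joint and lands at approximately (1, 1, 0), while the joint location itself stays fixed.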

Extensive evaluation shows that the system performs very reliably on sequences that include different types of motion, such as walking, sitting, dancing, running, and jumping, and on people of very different body sizes, from a nine-year-old girl to a tall adult male.



Author information

Authors and Affiliations

Q3DM, Inc., 10110 Sorrento Valley Rd., Suite B, San Diego, CA, 92121, USA
Ivana Mikić & Edward Hunter
Department of Electrical and Computer Engineering, University of California, San Diego, 9500 Gilman Drive 0434, La Jolla, CA, 92093-0434, USA
Mohan Trivedi & Pamela Cosman


About this article

Cite this article

Mikić, I., Trivedi, M., Hunter, E. et al. Human Body Model Acquisition and Tracking Using Voxel Data. International Journal of Computer Vision 53, 199–223 (2003). https://doi.org/10.1023/A:1023012723347


Issue Date: July 2003
DOI: https://doi.org/10.1023/A:1023012723347
