Abstract
A novel algorithm is presented for the 3D reconstruction of human action in long (>30 second) monocular image sequences. A sequence is represented by a small set of automatically found representative keyframes. The skeletal joint positions are manually located in each keyframe and mapped to all other frames in the sequence. For each keyframe a 3D key pose is created, and interpolation between these 3D body poses, together with the incorporation of limb length and symmetry constraints, provides a smooth initial approximation of the 3D motion. This is then fitted to the image data to generate a realistic 3D reconstruction. The amount of manual input required depends on the diversity of the sequence’s content. Sports footage is ideally suited to this approach as it frequently contains a limited number of repeated actions. Our method is demonstrated on a long (36 second) sequence of a woman playing tennis filmed with a non-stationary camera. This sequence required manual initialisation on <1.5% of the frames, and demonstrates that the system can deal with very rapid motion, severe self-occlusions, motion blur and clutter occurring over several consecutive frames. The monocular 3D reconstruction is verified by synthesising a view from the perspective of a ‘ground truth’ reference camera, and the result is seen to provide a qualitatively accurate 3D reconstruction of the motion.
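The interpolation step described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how poses are parameterised or interpolated, so the example assumes a hypothetical five-joint skeleton, linearly interpolates joint positions between two key poses, and then rescales each bone to a fixed length shared by left and right limbs. All names (PARENT, ORDER, SYMMETRIC, lerp_pose, enforce_lengths) are illustrative assumptions.

```python
import math

# Hypothetical toy skeleton (an assumption, not the paper's body model):
# each joint maps to its parent; the root has no parent.
PARENT = {"hip": None, "l_knee": "hip", "r_knee": "hip",
          "l_ankle": "l_knee", "r_ankle": "r_knee"}
ORDER = ["hip", "l_knee", "r_knee", "l_ankle", "r_ankle"]  # parents first
SYMMETRIC = [("l_knee", "r_knee"), ("l_ankle", "r_ankle")]  # bones named by child joint


def symmetric_lengths(pose):
    """Measure bone lengths in a pose and average each left/right pair."""
    lengths = {j: math.dist(pose[j], pose[PARENT[j]])
               for j in ORDER if PARENT[j] is not None}
    for left, right in SYMMETRIC:
        mean = 0.5 * (lengths[left] + lengths[right])
        lengths[left] = lengths[right] = mean
    return lengths


def lerp_pose(a, b, t):
    """Linearly interpolate joint positions; this alone can shrink bones."""
    return {j: tuple((1 - t) * pa + t * pb for pa, pb in zip(a[j], b[j]))
            for j in ORDER}


def enforce_lengths(pose, lengths):
    """Walk the skeleton from the root, keeping each bone's direction
    but rescaling it to the target length."""
    fixed = {}
    for j in ORDER:
        p = PARENT[j]
        if p is None:
            fixed[j] = tuple(pose[j])
            continue
        direction = [c - pc for c, pc in zip(pose[j], pose[p])]
        norm = math.hypot(*direction)
        if norm == 0.0:
            raise ValueError(f"degenerate bone at joint {j!r}")
        scale = lengths[j] / norm
        fixed[j] = tuple(fc + scale * d for fc, d in zip(fixed[p], direction))
    return fixed


# Two made-up key poses: standing, then right leg raised forward.
key_a = {"hip": (0.0, 1.0, 0.0),
         "l_knee": (-0.1, 0.5, 0.0), "r_knee": (0.1, 0.5, 0.0),
         "l_ankle": (-0.1, 0.0, 0.0), "r_ankle": (0.1, 0.0, 0.0)}
key_b = {"hip": (0.0, 1.0, 0.0),
         "l_knee": (-0.1, 0.5, 0.0), "r_knee": (0.1, 0.7, 0.4),
         "l_ankle": (-0.1, 0.0, 0.0), "r_ankle": (0.1, 0.3, 0.1)}

target = symmetric_lengths(key_a)
raw_mid = lerp_pose(key_a, key_b, 0.5)
mid = enforce_lengths(raw_mid, target)
```

Interpolating positions directly is the simplest choice for a sketch; a system like the one described might instead interpolate joint rotations, which preserves bone lengths by construction. The projection step here only shows how a length and symmetry constraint can be imposed on an interpolated pose.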
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Loy, G., Eriksson, M., Sullivan, J., Carlsson, S. (2004). Monocular 3D Reconstruction of Human Motion in Long Action Sequences. In: Pajdla, T., Matas, J. (eds) Computer Vision - ECCV 2004. ECCV 2004. Lecture Notes in Computer Science, vol 3024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24673-2_36
DOI: https://doi.org/10.1007/978-3-540-24673-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21981-1
Online ISBN: 978-3-540-24673-2