Machine Vision and Applications, Volume 22, Issue 6, pp 995–1008

A recognition-based motion capture baseline on the HumanEva II test data

  • Nicholas R. Howe
Original Paper

Abstract

The advent of the HumanEva standardized motion capture data sets has enabled quantitative evaluation of motion capture algorithms on comparable terms. This paper measures the performance of an existing monocular recognition-based pose recovery algorithm on selected HumanEva data, including all of the HumanEva II clips. The method uses a physically motivated Markov process to connect adjacent frames and achieves a 3D relative mean error of 8.9 cm per joint. The paper further investigates the factors contributing to this error and finds that research into better pose retrieval methods offers promise for improving both this technique and related approaches. Finally, it investigates the effect of local search optimization within the same recognition-based algorithm and finds no significant deterioration in the results, indicating that processing speed can be largely independent of the size of the recognition library for this approach.
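The 8.9 cm figure refers to the HumanEva-style 3D relative error: the mean Euclidean distance per joint after removing global translation by aligning a designated root joint. The Python sketch below illustrates that metric together with a generic Markov-chain (dynamic-programming) step that links per-frame retrieval candidates into a track; the joint layout, root index, candidate format, and the simple displacement-based transition penalty are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def relative_mean_joint_error(est, gt, root=0):
    """Mean per-joint 3D error after aligning the chosen root joint.

    est, gt: (J, 3) arrays of joint positions (same units, e.g. cm).
    root:    index of the joint used to remove global translation
             (the pelvis in a HumanEva-style marker set; index 0 is
             only a placeholder assumption here).
    """
    diff = (est - est[root]) - (gt - gt[root])
    return float(np.mean(np.linalg.norm(diff, axis=1)))

def chain_candidates(candidates, retrieval_costs, smoothness=1.0):
    """Choose one retrieved pose per frame by dynamic programming.

    candidates:      list of (K, J, 3) arrays, K candidate poses per frame.
    retrieval_costs: list of (K,) arrays scoring how poorly each candidate
                     matches the observed frame (silhouette, flow, etc.).
    smoothness:      weight on the inter-frame transition penalty.
    Returns the index of the selected candidate in each frame.
    """
    cost = retrieval_costs[0].astype(float).copy()
    backpointers = []
    for t in range(1, len(candidates)):
        prev, curr = candidates[t - 1], candidates[t]
        # Transition cost: mean joint displacement between candidate pairs,
        # a crude stand-in for the paper's physically motivated Markov model.
        trans = np.linalg.norm(prev[:, None] - curr[None, :], axis=-1).mean(axis=-1)
        total = cost[:, None] + smoothness * trans
        backpointers.append(np.argmin(total, axis=0))
        cost = total.min(axis=0) + retrieval_costs[t]
    # Trace back the lowest-cost chain of candidates.
    path = [int(np.argmin(cost))]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return path[::-1]

In this sketch the retrieval cost plays the role of the per-frame observation term and the displacement penalty plays the role of the transition term; the dynamic program then selects the lowest-cost chain of candidate poses across the clip.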

Keywords

Markerless motion capture · Human pose recovery · 3D tracking · HumanEva

Supplementary material

ESM 1 (AVI 10,193 kb)
ESM 2 (AVI 12,720 kb)
ESM 3 (AVI 13,725 kb)
ESM 4 (AVI 12,480 kb)
ESM 5 (AVI 12,988 kb)
ESM 6 (AVI 13,747 kb)
ESM 7 (AVI 13,890 kb)
ESM 8 (AVI 12,860 kb)
ESM 9 (AVI 14,602 kb)

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

Smith College, Northampton, USA
