A recognition-based motion capture baseline on the HumanEva II test data

Abstract

The advent of the HumanEva standardized motion capture data sets has enabled quantitative evaluation of motion capture algorithms on comparable terms. This paper measures the performance of an existing monocular recognition-based pose recovery algorithm on selected HumanEva data, including all the HumanEva II clips. The method uses a physically motivated Markov process to connect adjacent frames and achieves a 3D relative mean error of 8.9 cm per joint. The paper further investigates factors contributing to the error and finds that research into better pose retrieval methods offers promise for improving this technique and those related to it. Finally, it investigates the effects of local search optimization with the same recognition-based algorithm and finds no significant deterioration in the results, indicating that processing speed can be largely independent of the size of the recognition library for this approach.
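To make the reported figure concrete, the following is a minimal sketch of how a 3D relative mean error per joint could be computed. It assumes that "relative" means each pose is expressed relative to its own root joint before comparison, and that the error is the Euclidean distance averaged over all joints and frames. The function name, the `root` parameter, and the data layout are hypothetical and are not taken from the paper or from the HumanEva toolkit.

```python
import math


def relative_mean_joint_error(predicted, ground_truth, root=0):
    """Average root-relative Euclidean error per joint.

    Both arguments are sequences of frames; each frame is a sequence of
    (x, y, z) joint positions in the same units (e.g. cm). The joint at
    index `root` is subtracted from every joint in its own frame, so a
    global translation of the whole body does not count as error.
    Assumption: the root joint itself (always zero error) is included
    in the average.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("sequences must have the same number of frames")

    total, count = 0.0, 0
    for pred_frame, true_frame in zip(predicted, ground_truth):
        if len(pred_frame) != len(true_frame):
            raise ValueError("frames must have the same number of joints")
        pred_root, true_root = pred_frame[root], true_frame[root]
        for p, t in zip(pred_frame, true_frame):
            # Express each joint relative to the root of its own pose.
            p_rel = [a - b for a, b in zip(p, pred_root)]
            t_rel = [a - b for a, b in zip(t, true_root)]
            total += math.dist(p_rel, t_rel)
            count += 1

    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# One frame, two joints: the predicted pose is shifted globally by
# (5, 5, 5) and its second joint is 3 cm too far along z.
truth = [[(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]]
guess = [[(5.0, 5.0, 5.0), (5.0, 5.0, 18.0)]]
error_cm = relative_mean_joint_error(guess, truth)
```

In this toy case the global shift is ignored, so only the 3 cm offset of the second joint contributes to the average.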

Author information

Correspondence to Nicholas R. Howe.

Electronic Supplementary Material

Below is the Electronic Supplementary Material.

ESM 1 (AVI 10,193 kb)

ESM 2 (AVI 12,720 kb)

ESM 3 (AVI 13,725 kb)

ESM 4 (AVI 12,480 kb)

ESM 5 (AVI 12,988 kb)

ESM 6 (AVI 13,747 kb)

ESM 7 (AVI 13,890 kb)

ESM 8 (AVI 12,860 kb)

ESM 9 (AVI 14,602 kb)

About this article

Cite this article

Howe, N.R. A recognition-based motion capture baseline on the HumanEva II test data. Machine Vision and Applications 22, 995–1008 (2011). https://doi.org/10.1007/s00138-011-0344-x

Keywords

  • Markerless motion capture
  • Human pose recovery
  • 3D tracking
  • HumanEva