Model-Based Pose Estimation

Abstract

Model-based pose estimation algorithms aim to recover human motion from one or more camera views and a 3D model representation of the human body. The model pose is usually parameterized with a kinematic chain, so that the pose is represented by a vector of joint angles. The majority of algorithms are based on minimizing an error function that measures how well the 3D model fits the image. This category of algorithms usually has two main stages: defining the model and fitting the model to image observations. In the first section, the reader is introduced to the different kinematic parameterizations of human motion. In the second section, the most commonly used representations of the human shape are described. The third section is dedicated to the description of different error functions proposed in the literature and to common optimization techniques used for human pose estimation. Specifically, local optimization and particle-based optimization and filtering are discussed and compared. The chapter concludes with a discussion of the state of the art in model-based pose estimation, current limitations and future directions.
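
The two stages named in the abstract, defining the model and fitting it to observations, can be illustrated with a toy sketch. It is not taken from the chapter: it assumes a planar three-link kinematic chain in place of a 3D body model, noise-free 2D joint positions in place of image observations, and a damped Gauss-Newton iteration with a finite-difference Jacobian as one possible local optimizer. All function names and parameter values are hypothetical.

```python
import numpy as np

def forward_kinematics(angles, lengths):
    """Return the 2D end position of every link of a planar chain rooted at the origin."""
    cumulative = np.cumsum(angles)                 # absolute orientation of each link
    steps = np.stack([lengths * np.cos(cumulative),
                      lengths * np.sin(cumulative)], axis=1)
    return np.cumsum(steps, axis=0)                # shape (n_links, 2)

def residuals(angles, lengths, observed):
    """Stacked differences between model joints and observed joints."""
    return (forward_kinematics(angles, lengths) - observed).ravel()

def fit_pose(observed, lengths, init, iters=50, eps=1e-6, damping=1e-6):
    """Minimize the sum of squared residuals with damped Gauss-Newton steps."""
    angles = np.asarray(init, dtype=float).copy()
    for _ in range(iters):
        r = residuals(angles, lengths, observed)
        # Forward-difference Jacobian of the residuals w.r.t. the joint angles.
        J = np.empty((r.size, angles.size))
        for k in range(angles.size):
            step = np.zeros_like(angles)
            step[k] = eps
            J[:, k] = (residuals(angles + step, lengths, observed) - r) / eps
        delta = np.linalg.solve(J.T @ J + damping * np.eye(angles.size), -J.T @ r)
        angles += delta
        if np.linalg.norm(delta) < 1e-10:
            break
    return angles

# Assumed toy data: synthesize observations from a known pose, then recover it
# starting from a nearby initial guess (local optimization needs a good start).
lengths = np.array([1.0, 0.8, 0.5])
true_angles = np.array([0.4, -0.3, 0.6])
observed = forward_kinematics(true_angles, lengths)
estimate = fit_pose(observed, lengths, init=true_angles + 0.2)
final_error = float(np.sum(residuals(estimate, lengths, observed) ** 2))
```

Because the optimizer is local, the result depends on the initial guess. That dependence is one reason particle-based optimization and filtering are treated as alternatives.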

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Leibniz University, Hanover, Germany
