Optimization and Filtering for Human Motion Capture

A Multi-Layer Framework
  • Juergen Gall
  • Bodo Rosenhahn
  • Thomas Brox
  • Hans-Peter Seidel
Open Access
Article

Abstract

Local optimization and filtering have been widely applied to model-based 3D human motion capture. Global stochastic optimization has recently been proposed as a promising alternative for tracking and initialization. To benefit from both optimization and filtering, we introduce a multi-layer framework that combines stochastic optimization, filtering, and local optimization. The first layer relies on interacting simulated annealing and weak prior information on physical constraints. The second layer refines the estimates by filtering and local optimization, so that accuracy increases and ambiguities are resolved over time without imposing restrictions on the dynamics. In our experimental evaluation, we demonstrate the significant improvements of the multi-layer framework and provide quantitative 3D pose tracking results for the complete HumanEva-II dataset. The paper further comprises a comparison of global stochastic optimization with particle filtering, annealed particle filtering, and local optimization.
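To make the idea of annealed, particle-based stochastic optimization more concrete, the following is a minimal sketch on a one-dimensional toy energy. It is not the authors' implementation: the function names, the geometric annealing schedule, the Gaussian mutation kernel, and the toy energy are all assumptions chosen for illustration, and a real pose tracker would operate on a high-dimensional pose vector with an image-based energy.

```python
import math
import random


def annealed_particle_search(energy, init, n_iters=30, beta0=1.0, growth=1.3,
                             sigma0=1.0, shrink=0.9, seed=0):
    """Toy annealed particle optimization over scalar states (illustrative only).

    All parameter names and default values are hypothetical choices.
    """
    rng = random.Random(seed)
    particles = list(init)
    beta, sigma = beta0, sigma0
    for _ in range(n_iters):
        # Weighting: Boltzmann-Gibbs weights, sharpened as beta grows.
        energies = [energy(x) for x in particles]
        e_min = min(energies)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (e - e_min)) for e in energies]
        # Selection: resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=len(particles))
        # Mutation: diffuse with Gaussian noise whose spread shrinks over time.
        particles = [x + rng.gauss(0.0, sigma) for x in particles]
        beta *= growth
        sigma *= shrink
    # Report the lowest-energy particle as the estimate.
    return min(particles, key=energy)


def toy_energy(x):
    # Two basins near x = -2 and x = 2; the linear term makes the left one deeper.
    return (x * x - 4.0) ** 2 + x


# Start from particles spread uniformly over [-5, 5].
start_rng = random.Random(1)
start = [start_rng.uniform(-5.0, 5.0) for _ in range(200)]
best = annealed_particle_search(toy_energy, start)
```

In this sketch the particle set starts spread over both basins and is expected to concentrate around the deeper one near x = -2 as the weights sharpen. A purely local optimizer started in the right-hand basin would typically stay there.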

Keywords

Human motion capture, Stochastic optimization, Filtering, Tracking

Copyright information

© The Author(s) 2008

Authors and Affiliations

  • Juergen Gall (1)
  • Bodo Rosenhahn (1)
  • Thomas Brox (2)
  • Hans-Peter Seidel (1)
  1. Max-Planck-Institute for Computer Science, Saarbrücken, Germany
  2. Department of Computer Science, University of Dresden, Dresden, Germany
