International Journal of Computer Vision, Volume 122, Issue 2, pp 388–408

3D Human Pose Tracking Priors using Geodesic Mixture Models

  • Edgar Simo-Serra
  • Carme Torras
  • Francesc Moreno-Noguer
Article

Abstract

We present a novel approach for learning a finite mixture model on a Riemannian manifold, where Euclidean metrics are not applicable and one must resort to geodesic distances consistent with the manifold geometry. For this purpose, we draw inspiration from a variant of the expectation-maximization algorithm that uses a minimum message length criterion to automatically estimate the optimal number of components from multivariate data lying in a Euclidean space. To use this approach on Riemannian manifolds, we propose a formulation in which each component is defined on a different tangent space, thus avoiding the loss of accuracy incurred when the manifold is linearized with a single tangent space. Our approach can be applied to any type of manifold for which it is possible to estimate its tangent space. Additionally, we consider using shrinkage covariance estimation to improve the robustness of the method, especially when dealing with very sparsely distributed samples. We evaluate the approach in a number of settings, ranging from data clustering on manifolds to combining pose and kinematics of articulated bodies for 3D human pose tracking. In all cases, we demonstrate a marked improvement over the selected baselines.
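
As an illustration of the per-component tangent space idea described in the abstract, the following is a minimal sketch on the unit sphere S^2, not the authors' implementation. It assumes NumPy, and every function name (`log_map`, `exp_map`, `tangent_basis`, `component_density`, `mixture_density`) is hypothetical. Each component maps a point to the tangent plane at its own mean with the logarithmic map and evaluates an ordinary 2D Gaussian there. The sketch ignores any normalization correction for the bounded tangent domain and covers only density evaluation, not the expectation-maximization fitting or the minimum message length criterion.

```python
import numpy as np

def log_map(mu, x):
    """Log map on the unit sphere: tangent vector at mu pointing toward x."""
    cos_t = np.clip(np.dot(mu, x), -1.0, 1.0)
    theta = np.arccos(cos_t)            # geodesic distance between mu and x
    v = x - cos_t * mu                  # part of x orthogonal to mu
    norm = np.linalg.norm(v)
    if norm < 1e-12:                    # x coincides with mu
        return np.zeros(3)
    return theta * v / norm

def exp_map(mu, v):
    """Exp map on the unit sphere: follow the geodesic from mu along v."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return mu.copy()
    return np.cos(theta) * mu + np.sin(theta) * v / theta

def tangent_basis(mu):
    """Orthonormal basis (2 x 3) of the tangent plane at mu (arbitrary orientation)."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(mu[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = helper - np.dot(helper, mu) * mu
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(mu, e1)
    return np.stack([e1, e2])

def component_density(x, mu, cov):
    """2D Gaussian evaluated in the tangent plane at the component mean mu."""
    u = tangent_basis(mu) @ log_map(mu, x)      # 2D tangent coordinates of x
    quad = u @ np.linalg.solve(cov, u)
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))

def mixture_density(x, weights, means, covs):
    """Weighted sum of components, each using its own tangent space."""
    return sum(w * component_density(x, m, c)
               for w, m, c in zip(weights, means, covs))
```

For example, `mixture_density(x, [0.5, 0.5], [north, east], [np.eye(2), np.eye(2)])` evaluates a two-component mixture whose components are linearized at two different points of the sphere, rather than sharing a single tangent plane.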

Keywords

Probabilistic priors, Mixture modelling, Riemannian manifolds, 3D human pose, Human kinematics

Notes

Acknowledgments

We would like to thank the three anonymous reviewers for their insights and comments that have significantly contributed to improving this manuscript. This work was partly funded by the Spanish MINECO project RobInstruct TIN2014-58178-R and by the ERA-net CHISTERA project I-DRESS PCIN-2015-147.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Edgar Simo-Serra (1)
  • Carme Torras (2)
  • Francesc Moreno-Noguer (2)
  1. Waseda University, Tokyo, Japan
  2. Institut de Robòtica i Informàtica Industrial (CSIC-UPC), Barcelona, Spain
