Pose Estimation with Motionlet LLC Coding

  • Li Sun
  • Mingli Song
  • Jiajun Bu
  • Chun Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7674)


3D human pose estimation is a challenging but important research topic with many applications. In discriminative human pose estimation, the main goal is to learn a nonlinear mapping from image descriptors to 3D human pose configurations, which is difficult because of the high dimensionality of the human pose space and the multimodality of the pose distribution. To address these problems, we propose a novel motionlet LLC (Locality-constrained Linear Coding) method within a discriminative framework. A motionlet consists of training examples that cover a local area in image space, pose space, and time. We first group the most informative training examples into motionlets, then perform LLC coding to learn the nonlinear mapping and obtain candidate poses, and finally choose the most appropriate candidate as the estimate. To further eliminate ambiguities and improve robustness, we extend our framework to incorporate multiple views. We conduct a qualitative evaluation on our Taichi data set and a quantitative evaluation on the HumanEva data set, which show that our approach achieves state-of-the-art performance and a significant improvement over previous approaches.
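The coding step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code below uses the commonly cited approximate LLC solution (code a descriptor over a few nearby bases with weights that sum to one) and blends the poses attached to those bases. The function names (`solve_linear`, `llc_weights`, `estimate_pose`), the regularization value, and the use of plain nearest neighbours in descriptor space as a stand-in for a motionlet are all assumptions made for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def llc_weights(x, bases, reg=1e-4):
    """Approximate LLC weights of descriptor x over local bases (sum to 1)."""
    k = len(bases)
    # Shift the bases so that x becomes the origin.
    z = [[bj - xj for bj, xj in zip(b, x)] for b in bases]
    # Gram matrix of the shifted bases.
    c = [[sum(p * q for p, q in zip(z[i], z[j])) for j in range(k)]
         for i in range(k)]
    trace = sum(c[i][i] for i in range(k))
    for i in range(k):
        c[i][i] += reg * trace + 1e-12  # regularize (assumed value)
    w = solve_linear(c, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]  # enforce the sum-to-one constraint


def estimate_pose(x, descriptors, poses, k=3):
    """Code x over its k nearest training descriptors and blend their poses.

    Assumption: nearest neighbours in descriptor space stand in for a
    motionlet; the paper also uses pose-space and temporal locality.
    """
    order = sorted(range(len(descriptors)),
                   key=lambda i: math.dist(x, descriptors[i]))[:k]
    w = llc_weights(x, [descriptors[i] for i in order])
    dim = len(poses[0])
    return [sum(wi * poses[i][d] for wi, i in zip(w, order))
            for d in range(dim)]
```

Given toy training descriptors and their poses, a call such as `estimate_pose([0.25, 0.25], descriptors, poses, k=3)` returns one candidate pose. Choosing among several candidates and fusing multiple views, as the abstract describes, are outside this sketch.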


Keywords: human pose estimation, multimodality, multiview, motionlet, LLC coding





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Li Sun (1)
  • Mingli Song (1)
  • Jiajun Bu (1)
  • Chun Chen (1)
  1. Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, China
