Multimedia Tools and Applications, Volume 73, Issue 1, pp 327–344

Motionlet LLC coding for discriminative human pose estimation

  • Li Sun
  • Mingli Song
  • Dapeng Tao
  • Jiajun Bu
  • Chun Chen


3D human pose estimation is a challenging but important research topic with abundant applications. In discriminative human pose estimation, the main goal is to learn a nonlinear mapping from image descriptors to 3D human pose configurations, which is difficult because of the high dimensionality of the human pose space and the multimodality of the distribution. To address these problems, we propose a novel motionlet LLC coding within a discriminative framework. A motionlet consists of training examples that cover a local area in terms of image space, pose space, and time stream. We first group the most informative training examples into motionlets, then perform LLC coding to learn the nonlinear mapping and obtain candidate poses, and finally choose the most appropriate candidate as the estimated pose. To further eliminate ambiguities and improve robustness, we extend our framework to incorporate multiple views. We conduct a qualitative evaluation on our Taichi data set and a quantitative evaluation on the HumanEva data set; the results show that our approach achieves state-of-the-art performance and a significant improvement over previous approaches.
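The abstract does not spell out the coding step, so the following is only a rough sketch of how locality-constrained linear coding (LLC) can be used for regression, not the authors' implementation. It assumes the common approximated form of LLC: take the k nearest training descriptors, solve a small regularized least-squares problem for reconstruction weights that sum to one, and apply the same weights to the neighbors' poses. The function names, the regularization constant `reg`, and the choice of `k` are all hypothetical, and the motionlet grouping and candidate selection steps are not modeled here.

```python
import numpy as np


def llc_weights(x, neighbors, reg=1e-4):
    """Approximated LLC weights that reconstruct x from its neighbors.

    x: (d,) query descriptor; neighbors: (k, d) local bases.
    Returns a (k,) weight vector that sums to one.
    """
    k = len(neighbors)
    z = neighbors - x                      # shift bases so x is the origin
    cov = z @ z.T                          # (k, k) local Gram matrix
    # Assumed regularizer: scaled by the trace for numerical stability.
    cov = cov + reg * max(np.trace(cov), 1e-12) * np.eye(k)
    w = np.linalg.solve(cov, np.ones(k))
    return w / w.sum()                     # enforce the sum-to-one constraint


def estimate_pose(x, descriptors, poses, k=5):
    """Hypothetical regressor: code x locally, then blend neighbor poses.

    descriptors: (n, d) training descriptors; poses: (n, p) training poses.
    """
    dist = np.linalg.norm(descriptors - x, axis=1)
    idx = np.argsort(dist)[:k]             # indices of the k nearest examples
    w = llc_weights(x, descriptors[idx])
    return w @ poses[idx]                  # (p,) weighted pose estimate
```

In the framework described above, a step like this would presumably be run once per motionlet to produce one candidate pose each, with the final estimate chosen among the candidates; that selection rule is not given in this excerpt.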


Pose estimation · Multimodality · Motionlet · LLC coding · Multiview



This work was supported in part by the National Natural Science Foundation of China (61170142), the National Key Technology R&D Program (2011BAG05B04), the International Science & Technology Cooperation Program of China (2013DFG12840), and the Fundamental Research Funds for the Central Universities.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Li Sun (1)
  • Mingli Song (1)
  • Dapeng Tao (2)
  • Jiajun Bu (1)
  • Chun Chen (1)

  1. Zhejiang Provincial Key Laboratory of Service Robot, College of Computer Science, Zhejiang University, Hangzhou, China
  2. School of Electronic and Information Engineering, South China University of Technology, Guangzhou, China
