Multiple Human Pose Estimation with Temporally Consistent 3D Pictorial Structures

  • Vasileios Belagiannis
  • Xinchao Wang
  • Bernt Schiele
  • Pascal Fua
  • Slobodan Ilic
  • Nassir Navab
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8925)


Multiple human 3D pose estimation from multiple camera views is a challenging task in unconstrained environments. Each individual has to be matched across each view and then the body pose has to be estimated. Additionally, the body pose of every individual changes in a consistent manner over time. To address these challenges, we propose a temporally consistent 3D Pictorial Structures model (3DPS) for multiple human pose estimation from multiple camera views. Our model builds on the 3D Pictorial Structures to introduce the notion of temporal consistency between the inferred body poses. We derive this property by relying on multi-view human tracking. Identifying each individual before inference significantly reduces the size of the state space and positively influences the performance as well. To evaluate our method, we use two challenging multiple human datasets in unconstrained environments. We compare our method with the state-of-the-art approaches and achieve better results.
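The abstract's claim that identifying each individual before inference shrinks the state space can be illustrated with a toy count. The sketch below is not the authors' implementation. It assumes, purely for illustration, that 3D hypotheses for one body part are formed by pairing 2D detections from two different views, and that a tracker has already attached an identity label to each detection. The function name, the toy camera setup, and the one-detection-per-person simplification are all hypothetical.

```python
from itertools import combinations


def count_cross_view_pairs(detections_per_view, use_identity):
    """Count cross-view detection pairs that would become 3D hypotheses.

    detections_per_view: one list per camera view; each entry is the
        (hypothetical) identity label of a 2D detection of one body part.
    use_identity: if True, only pair detections that share an identity,
        as if a multi-view tracker had matched people across views first.
    """
    total = 0
    # Every unordered pair of distinct views contributes candidate pairs.
    for view_a, view_b in combinations(detections_per_view, 2):
        for id_a in view_a:
            for id_b in view_b:
                if not use_identity or id_a == id_b:
                    total += 1
    return total


# Assumed toy setup: 4 views, 3 people, one detection per person per view.
views = [[0, 1, 2] for _ in range(4)]

mixed = count_cross_view_pairs(views, use_identity=False)
per_person = count_cross_view_pairs(views, use_identity=True)
print(mixed, per_person)
```

In this toy setting the identity constraint removes every pair that mixes two different people, so the number of candidates grows linearly rather than quadratically with the number of people per view pair. Real detectors produce several noisy detections per part, so actual counts would differ.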


Keywords: Human pose estimation; 3D pictorial structures; Part-based pose estimation



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Vasileios Belagiannis (1)
  • Xinchao Wang (2)
  • Bernt Schiele (3)
  • Pascal Fua (2)
  • Slobodan Ilic (1, 4)
  • Nassir Navab (1, 5)

  1. Computer Aided Medical Procedures, Technische Universität München, München, Germany
  2. Computer Vision Laboratory, EPFL, Lausanne, Switzerland
  3. Max Planck Institute for Informatics, Saarbrücken, Germany
  4. Siemens AG, Munich, Germany
  5. Computer Aided Medical Procedures, Johns Hopkins University, Baltimore, USA
