Real-Time Marker-Less Multi-person 3D Pose Estimation in RGB-Depth Camera Networks

  • Marco Carraro
  • Matteo Munaro
  • Jeff Burke
  • Emanuele Menegatti
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 867)


This paper proposes a novel system to estimate and track the 3D poses of multiple persons in calibrated RGB-Depth camera networks. A central node computes the multi-view 3D pose of each person from the single-view outcomes it receives from each camera of the network. Each single-view outcome is computed by applying a CNN for 2D pose estimation and extending the resulting skeletons to 3D by means of the sensor depth. The proposed system is marker-less, multi-person, and independent of the background, and it makes no assumptions about people's appearance or initial pose. The system provides real-time outcomes, making it well suited for applications that require user interaction. Experimental results show the effectiveness of this work with respect to a baseline multi-view approach in different scenarios. To foster research and applications based on this work, we released the source code in OpenPTrack, an open source project for RGB-D people tracking.
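
The step of extending a 2D skeleton to 3D by means of the sensor depth can be illustrated with a minimal sketch. It assumes a standard pinhole camera model with a depth image registered to the color image; the function name `backproject_joint` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration and are not taken from the paper or from the OpenPTrack code base.

```python
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]


def backproject_joint(u: float, v: float, depth_m: float,
                      fx: float, fy: float,
                      cx: float, cy: float) -> Optional[Point3D]:
    """Lift a 2D joint at pixel (u, v) to a 3D point in the camera frame.

    Assumes a pinhole model: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. depth_m is the depth reading in
    meters at that pixel. Returns None when the reading is unusable.
    """
    # Depth sensors commonly report zero or NaN where no measurement exists.
    if depth_m != depth_m or depth_m <= 0.0:
        return None
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Illustrative values only: a joint 100 pixels right of the principal
# point, seen at 2 m by a camera with a 500-pixel focal length.
joint_3d = backproject_joint(420.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0)
missing = backproject_joint(420.0, 240.0, 0.0, 500.0, 500.0, 320.0, 240.0)
```

Under these assumptions, each camera would apply such a mapping to every joint of every detected skeleton, and the resulting points would then need to be expressed in a common world frame (using the network calibration) before a central node can combine them. How the actual system handles invalid depth and performs the multi-view combination is not specified here.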



This work was partially supported by U.S. National Science Foundation award IIS-1629302.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Marco Carraro (1)
  • Matteo Munaro (1)
  • Jeff Burke (2)
  • Emanuele Menegatti (1)

  1. Department of Information Engineering, University of Padova, Padova, Italy
  2. REMAP, School of Theater, Film and Television, UCLA, Los Angeles, USA
