Multimedia Tools and Applications, Volume 76, Issue 3, pp 4249–4271

Real-time human body tracking based on data fusion from multiple RGB-D sensors

  • Juan C. Núñez
  • Raúl Cabido
  • Antonio S. Montemayor
  • Juan J. Pantrigo


In this work we present a human pose estimation method based on skeleton fusion and tracking with multiple RGB-D sensors. The method takes the skeletons provided by each RGB-D device and constructs an improved skeleton, taking into account the quality measures reported by the sensors at two levels: the whole skeleton and each individual joint. Each joint is then tracked by a Kalman filter, which yields smooth trajectories. We have also built a new dataset of six subjects performing seven different gestures, recorded simultaneously with four Kinect devices. Experiments on this dataset show that the system produces smoother results than the most representative methods in the literature. The system runs at 25 frames per second for the whole algorithm loop (data acquisition and processing), without explicit use of the system's multithreading capabilities.
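The fuse-then-filter idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual fusion rule or filter model, so everything below is an assumption made for illustration: the weight is taken as the product of a per-skeleton and a per-joint quality value, the Kalman filter is a scalar random-walk filter run independently on each coordinate, and the names `fuse_joint`, `ScalarKalman`, `JointTracker` and the noise values `q` and `r` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
# (joint position, skeleton-level quality, joint-level quality) from one sensor
Observation = Tuple[Vec3, float, float]


def fuse_joint(observations: Sequence[Observation]) -> Optional[Vec3]:
    """Quality-weighted mean of one joint as seen by several sensors.

    Assumption: the weight is skeleton quality times joint quality.
    Returns None when no sensor contributes a positive weight.
    """
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for position, skeleton_q, joint_q in observations:
        weight = skeleton_q * joint_q
        total += weight
        for axis in range(3):
            acc[axis] += weight * position[axis]
    if total <= 0.0:
        return None
    return (acc[0] / total, acc[1] / total, acc[2] / total)


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk motion model."""
    q: float = 1e-3   # process noise variance (hypothetical value)
    r: float = 1e-2   # measurement noise variance (hypothetical value)
    x: float = 0.0    # current state estimate
    p: float = 1.0    # variance of the estimate
    initialized: bool = False

    def step(self, z: Optional[float]) -> float:
        if not self.initialized:
            # Start from the first available measurement.
            if z is not None:
                self.x, self.initialized = z, True
            return self.x
        self.p += self.q                       # predict: uncertainty grows
        if z is not None:                      # correct only if measured
            gain = self.p / (self.p + self.r)
            self.x += gain * (z - self.x)
            self.p *= 1.0 - gain
        return self.x


@dataclass
class JointTracker:
    """Fuses multi-sensor observations of one joint, then smooths each axis."""
    axes: List[ScalarKalman] = field(
        default_factory=lambda: [ScalarKalman() for _ in range(3)]
    )

    def step(self, observations: Sequence[Observation]) -> Vec3:
        fused = fuse_joint(observations)
        return tuple(
            f.step(None if fused is None else fused[i])
            for i, f in enumerate(self.axes)
        )
```

In this sketch a frame in which no sensor reports the joint simply keeps the previous estimate, while a new measurement pulls the estimate toward it by an amount set by the Kalman gain. A constant-velocity state model would be a natural alternative, but the abstract does not say which model the authors use.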


Human body tracking, Sensor fusion, RGBD sensors



This research has been partially supported by the Spanish Government research funding ref. MINECO/FEDER TIN2015-69542-C2-1 and the Banco de Santander and Universidad Rey Juan Carlos Funding Program for Excellence Research Groups ref. “Computer Vision and Image Processing (CVIP)”.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

All four authors: Universidad Rey Juan Carlos, Móstoles, Spain
