Abstract
In this work we present a human pose estimation method based on skeleton fusion and tracking using multiple RGB-D sensors. The proposed method takes the skeletons provided by each RGB-D device and constructs an improved skeleton, taking into account the quality measures provided by the sensors at two levels: the whole skeleton and each individual joint. Each joint is then tracked by a Kalman filter, which yields smooth tracking. We have also developed a new dataset consisting of six subjects performing seven different gestures, recorded with four Kinect devices simultaneously. Experiments on this dataset show that the system produces smoother tracking than the most representative methods found in the literature. The proposed system runs at 25 frames per second (measured over the whole algorithm loop, i.e., data acquisition and processing) without explicit use of the system's multithreading capabilities.
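The two stages described in the abstract, quality-weighted fusion followed by per-joint Kalman filtering, can be illustrated with a minimal Python sketch. The abstract does not give the fusion formula or the filter model, so everything specific here is an assumption made for illustration: the weight of each observation is taken to be the product of the skeleton-level and joint-level quality, the filter is a scalar random-walk Kalman filter applied to one coordinate at a time, and the names `fuse_joint` and `ScalarKalman` and the noise values are hypothetical.

```python
from dataclasses import dataclass


def fuse_joint(observations):
    """Quality-weighted average of one joint seen by several sensors.

    observations: list of (position, skeleton_quality, joint_quality),
    where position is an (x, y, z) tuple and both qualities are in [0, 1].
    The product weighting is an assumption, not the paper's formula.
    """
    weights = [sq * jq for _, sq, jq in observations]
    total = sum(weights)
    if total == 0.0:
        return None  # no sensor provides a usable estimate of this joint
    return tuple(
        sum(w * pos[axis] for w, (pos, _, _) in zip(weights, observations)) / total
        for axis in range(3)
    )


@dataclass
class ScalarKalman:
    """1-D Kalman filter with a random-walk motion model (assumed model)."""
    x: float          # current estimate of one joint coordinate
    p: float = 1.0    # variance of the estimate
    q: float = 1e-3   # process noise variance (hypothetical value)
    r: float = 1e-2   # measurement noise variance (hypothetical value)

    def step(self, z):
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct with the fused measurement
        self.p *= 1.0 - k               # shrink the variance after the update
        return self.x


# Usage: fuse one joint from two sensors, then smooth its x coordinate.
fused = fuse_joint([
    ((0.0, 0.0, 0.0), 1.0, 0.25),  # sensor A: low joint quality
    ((4.0, 0.0, 0.0), 1.0, 0.75),  # sensor B: high joint quality
])
kf_x = ScalarKalman(x=fused[0])
smoothed_x = kf_x.step(fused[0])
```

In a full system one filter per coordinate per joint would be updated on every frame with the fused position. A constant-velocity state model is a common alternative when joints move quickly, at the cost of a larger state per joint.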
Notes
This dataset, together with some illustrative performance examples, is publicly available at: http://www.etsii.urjc.es/jjpantrigo/paperMTAP/
Acknowledgments
This research has been partially supported by the Spanish Government research funding ref. MINECO/FEDER TIN2015-69542-C2-1 and the Banco de Santander and Universidad Rey Juan Carlos Funding Program for Excellence Research Groups ref. “Computer Vision and Image Processing (CVIP)”.
About this article
Cite this article
Núñez, J.C., Cabido, R., Montemayor, A.S. et al. Real-time human body tracking based on data fusion from multiple RGB-D sensors. Multimed Tools Appl 76, 4249–4271 (2017). https://doi.org/10.1007/s11042-016-3759-6
DOI: https://doi.org/10.1007/s11042-016-3759-6