Abstract
This paper addresses 3D tracking of human gestures for buying-behavior estimation. It exploits the top view of customers, a viewpoint that has received little attention in human tracking. This viewpoint avoids occlusions, except those caused by the arms. We propose a hybrid 3D–2D tracking method based on the particle filtering framework; it uses the exclusion principle to separate the observations related to each customer and thereby handles multi-person tracking. The head and shoulders are tracked in 2D space, while the arms are tracked in 3D space: these are the spaces in which each is most descriptive. We validate our method both experimentally, so as to obtain qualitative results, and on-site. We demonstrate that it gives good estimates in a variety of cases and situations in real time (\(\approx\)40 fps).
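As an illustration of the particle filtering framework mentioned above, the following is a minimal sketch of one generic predict, weight and resample step for a single 2D point such as a head centre. It is not the authors' method: the random-walk motion model, the Gaussian likelihood, the function name `particle_filter_step` and the parameters `motion_std` and `obs_std` are all assumptions made for the example, and the exclusion principle and the 3D arm model are left out.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         motion_std=1.0, obs_std=2.0):
    """One bootstrap particle filter step for a 2D point.

    particles   : list of (x, y) hypotheses
    observation : measured (x, y) position
    rng         : random.Random instance, passed in for reproducibility
    """
    # Predict: diffuse each particle with a random-walk motion model
    # (an assumption made for this sketch).
    predicted = [(x + rng.gauss(0.0, motion_std),
                  y + rng.gauss(0.0, motion_std))
                 for x, y in particles]

    # Weight: Gaussian likelihood of the observation given each particle
    # (also an assumption made for this sketch).
    weights = [math.exp(-((x - observation[0]) ** 2
                          + (y - observation[1]) ** 2)
                        / (2.0 * obs_std ** 2))
               for x, y in predicted]
    total = sum(weights)
    if total == 0.0:
        # All particles are far from the observation: fall back to uniform.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = (sum(w * x for w, (x, _) in zip(weights, predicted)),
                sum(w * y for w, (_, y) in zip(weights, predicted)))

    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

In use, the step would be called once per frame with the latest measurement, for instance starting from particles spread uniformly over the image and feeding the returned particles back into the next call.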
Acknowledgments
This work is supported by project ANR-10-CORD0016 ORIGAMI2.
Cite this article
Migniot, C., Ababsa, F. Hybrid 3D–2D human tracking in a top view. J Real-Time Image Proc 11, 769–784 (2016). https://doi.org/10.1007/s11554-014-0429-7
DOI: https://doi.org/10.1007/s11554-014-0429-7