Journal of Real-Time Image Processing, Volume 11, Issue 4, pp 769–784

Hybrid 3D–2D human tracking in a top view

  • Cyrille Migniot
  • Fakhreddine Ababsa
Original Research Paper


This paper addresses the problem of 3D tracking of human gestures for buying behavior estimation. The top view of the customers, which has received little attention in human tracking, is exploited in this particular context. This point of view avoids occlusions except those involving the arms. We propose a hybrid 3D–2D tracking method based on the particle filtering framework, which uses the exclusion principle to separate the observations related to each customer and thus handles multi-person tracking. The head and shoulders are tracked in 2D space, while the arms are tracked in 3D space: these are the spaces in which each is most descriptive. We validate our method both experimentally, so as to obtain qualitative results, and on-site. We demonstrate that it gives good estimates in a variety of cases and situations in real time (\(\approx\)40 fps).
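
To make the particle filtering framework mentioned above concrete, here is a minimal sketch of one resample, predict and weight cycle of a generic bootstrap particle filter on a 2D position. It is not the authors' hybrid 3D–2D tracker and does not implement the exclusion principle. The function names, the random-walk motion model, the Gaussian likelihood and the values of `motion_std` and `obs_std` are all assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=1.0, obs_std=2.0):
    """One resample / predict / weight cycle for a 2D position state.

    Hypothetical helper: the motion and observation models are
    illustrative assumptions, not taken from the paper.
    """
    n = len(particles)
    # Resample: draw n particles in proportion to the previous weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    # Predict: diffuse each particle with Gaussian random-walk noise.
    predicted = [(x + rng.gauss(0.0, motion_std),
                  y + rng.gauss(0.0, motion_std))
                 for x, y in resampled]
    # Weight: Gaussian likelihood of the observation given each particle.
    ox, oy = observation
    new_weights = [
        math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2.0 * obs_std ** 2))
        for x, y in predicted
    ]
    total = sum(new_weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]
    return predicted, new_weights


def estimate(particles, weights):
    """Weighted mean of the particle set, used as the state estimate."""
    ex = sum(w * x for (x, _), w in zip(particles, weights))
    ey = sum(w * y for (_, y), w in zip(particles, weights))
    return ex, ey
```

In a multi-person setting, one such filter per tracked person would be a natural starting point; how the observations are divided between the filters is exactly what the exclusion principle addresses, and that part is left out of this sketch.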


Human tracking, Particle filtering, Multi-target tracking, Buying behavior analysis, Xtion Pro-Live



This work is supported by project ANR-10-CORD0016 ORIGAMI2.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. IBISC laboratory, Evry, France
