Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes

  • Alexandre Robicquet (corresponding author)
  • Amir Sadeghian
  • Alexandre Alahi
  • Silvio Savarese
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9912)

Abstract

Humans navigate crowded spaces such as a university campus by following common-sense rules based on social etiquette. In this paper, we argue that in order to enable the design of new target tracking or trajectory forecasting methods that can take full advantage of these rules, we need access to better data in the first place. To that end, we contribute a new large-scale dataset that collects videos of various types of targets (not just pedestrians, but also bikers, skateboarders, cars, buses, and golf carts) navigating a real-world outdoor environment such as a university campus. Moreover, we introduce a new characterization that describes the “social sensitivity” at which two targets interact. We use this characterization to define “navigation styles” and to improve both forecasting models and state-of-the-art multi-target tracking, whereby the learnt forecasting models help the data association step.
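To make the idea of a “social sensitivity” concrete, the sketch below computes a social-force-style pairwise repulsion in the spirit of Helbing and Molnar's model, where the decay scale plays the role of a per-agent sensitivity to nearby targets. The function name, the exponential form, and the `sigma` parameter are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def social_repulsion(pos_i, pos_j, sigma=0.5):
    """Toy social-force-style repulsion felt by agent i from agent j.

    Repulsion decays exponentially with Euclidean distance, as in
    classic social force models. `sigma` is a hypothetical "social
    sensitivity" scale: a larger sigma means agent i reacts to
    neighbors from farther away (a different "navigation style").
    """
    d = np.linalg.norm(np.asarray(pos_j, dtype=float) -
                       np.asarray(pos_i, dtype=float))
    return float(np.exp(-d / sigma))

# Two agents 1 m apart: a high-sigma agent (e.g. a cautious
# pedestrian) feels a stronger influence than a low-sigma one
# (e.g. a fast biker weaving through the crowd).
cautious = social_repulsion([0.0, 0.0], [1.0, 0.0], sigma=1.0)
assertive = social_repulsion([0.0, 0.0], [1.0, 0.0], sigma=0.25)
```

Fitting such a scale per target class or per trajectory cluster is one plausible way to turn raw tracks into the navigation-style labels the abstract describes.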

Keywords

Trajectory forecasting · Multi-target tracking · Social forces · UAV


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Alexandre Robicquet (1) (corresponding author)
  • Amir Sadeghian (1)
  • Alexandre Alahi (1)
  • Silvio Savarese (1)

  1. CVGL, Stanford University, Stanford, USA