
Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Modeling the dynamics of people walking is a problem of long-standing interest in computer vision. Many previous works involving pedestrian trajectory prediction define a particular set of individual actions to implicitly model group actions. In this paper, we present a novel architecture named GP-Graph which has collective group representations for effective pedestrian trajectory prediction in crowded environments, and is compatible with all types of existing approaches. A key idea of GP-Graph is to model both individual-wise and group-wise relations as graph representations. To do this, GP-Graph first learns to assign each pedestrian to the most likely behavior group. Using this assignment information, GP-Graph then forms both intra- and inter-group interactions as graphs, accounting for human-human relations within a group and group-group relations, respectively. To be specific, for the intra-group interaction, we mask out pedestrian graph edges that fall outside the associated group. We also propose group pooling and unpooling operations to represent a group with multiple pedestrians as one graph node. Lastly, GP-Graph infers a probability map for socially-acceptable future trajectories from the integrated features of both group interactions. Moreover, we introduce group-level latent vector sampling to ensure collective inferences over a set of possible future trajectories. Extensive experiments are conducted to validate the effectiveness of our architecture, and they demonstrate consistent performance improvements on publicly available benchmarks. Code is publicly available at https://github.com/inhwanbae/GPGraph.
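
The grouping operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the group assignment is already given as integer labels (GP-Graph learns it), it assumes mean aggregation for group pooling, and every function name below is hypothetical rather than taken from the GPGraph repository.

```python
import random


def intra_group_mask(adj, groups):
    """Keep an edge (i, j) only if pedestrians i and j share a group label."""
    n = len(groups)
    return [
        [adj[i][j] if groups[i] == groups[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def group_pool(feats, groups, num_groups):
    """Collapse each group into one node; mean aggregation is an assumption."""
    dim = len(feats[0])
    sums = [[0.0] * dim for _ in range(num_groups)]
    counts = [0] * num_groups
    for feat, g in zip(feats, groups):
        counts[g] += 1
        for d in range(dim):
            sums[g][d] += feat[d]
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def group_unpool(group_feats, groups):
    """Copy each group-level feature back to every member of that group."""
    return [list(group_feats[g]) for g in groups]


def sample_group_latents(groups, num_groups, dim, rng):
    """Draw one latent vector per group and share it among the members."""
    z = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_groups)]
    return [list(z[g]) for g in groups]


# Toy scene: five pedestrians in three groups (labels are assumed inputs).
groups = [0, 0, 1, 2, 2]
feats = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0]]
adj = [[1.0] * 5 for _ in range(5)]

masked = intra_group_mask(adj, groups)      # intra-group graph
pooled = group_pool(feats, groups, 3)       # nodes of the inter-group graph
restored = group_unpool(pooled, groups)     # back to pedestrian level
latents = sample_group_latents(groups, 3, 4, random.Random(0))
```

In this toy scene, pedestrians 0 and 1 keep their mutual edge while the edge between pedestrians 0 and 2 is masked, and members of the same group receive identical pooled features and identical latent vectors, which is the sense in which the sampling is collective.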



Acknowledgement

This work is in part supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) (No. 2019-0-01842, Artificial Intelligence Graduate School Program (GIST), No. 2021-0-02068, Artificial Intelligence Innovation Hub), the National Research Foundation of Korea (NRF) (No. 2020R1C1C1012635) grant funded by the Korea government (MSIT), Vehicles AI Convergence Research & Development Program through the National IT Industry Promotion Agency of Korea (NIPA) funded by the Ministry of Science and ICT (No. S1602-20-1001), the GIST-MIT Collaboration grant and AI-based GIST Research Scientist Project funded by the GIST in 2022.

Author information

Corresponding author

Correspondence to Hae-Gon Jeon.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Bae, I., Park, JH., Jeon, HG. (2022). Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13682. Springer, Cham. https://doi.org/10.1007/978-3-031-20047-2_16

  • DOI: https://doi.org/10.1007/978-3-031-20047-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20046-5

  • Online ISBN: 978-3-031-20047-2

  • eBook Packages: Computer Science, Computer Science (R0)
