Abstract
Traditional methods for constructing crowd simulations often fall short in realism, and data-driven methods are an effective way to enhance the visual realism of crowd simulation. However, existing work mainly builds crowd simulations either through deep-learning-based prediction or by fitting the parameters of traditional methods, which limits model expressiveness. To address these limitations, this paper introduces a method for generating realistic pedestrian crowds. The approach uses a Generative Adversarial Network, complemented by transformer modules, to learn behavioral patterns from real crowd trajectories. One transformer module extracts each pedestrian's trajectory features; a dedicated data-processing mechanism then converts the spatial relationships between individuals into sequences, from which another transformer module extracts the crowd's social features, while each individual's movement is guided by its target direction. During training, the model learns simultaneously from real crowd data and from simulation data in which collisions have been resolved by traditional methods, enhancing the collision-avoidance behavior of virtual crowds while preserving the movement patterns of real crowds, and thereby yielding more general collision-avoidance behavior. The crowds generated by the model are not limited to specific scenarios and exhibit generalization capability. After training on publicly available large-scale pedestrian datasets, our method outperforms other models. Our code is publicly available at https://github.com/ydp91/NPCGAN.
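The data-processing step described above, converting spatial relationships between individuals into transformer-ready sequences, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the relative-position token format, and the nearest-first ordering are all assumptions for exposition.

```python
import math

def neighbors_to_sequence(agent_pos, others, max_neighbors=8):
    """Convert the spatial relationships between one agent and its
    neighbors into an ordered sequence of relative-position tokens,
    suitable as input to a transformer that models social features.

    agent_pos: (x, y) position of the agent of interest.
    others:    list of (x, y) positions of the other agents.
    Returns a list of (dx, dy, dist) tuples, nearest neighbor first,
    truncated or zero-padded to exactly max_neighbors entries.
    """
    tokens = []
    for ox, oy in others:
        dx, dy = ox - agent_pos[0], oy - agent_pos[1]
        tokens.append((dx, dy, math.hypot(dx, dy)))
    tokens.sort(key=lambda t: t[2])        # nearest neighbors first
    tokens = tokens[:max_neighbors]
    pad = (0.0, 0.0, 0.0)                  # padding token for sparse scenes
    tokens += [pad] * (max_neighbors - len(tokens))
    return tokens
```

A fixed-length, distance-ordered sequence like this lets a standard attention layer weigh nearby pedestrians without depending on agent indexing.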
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
This research has been supported by the National Key Research and Development Program of China (No. 2020YFC2007200).
Author information
Contributions
DY was responsible for the primary writing of the paper and coding the experiments. GD provided experimental equipment, financial support, and reviewed and revised the experimental section of the paper. KH handled the creation of the paper’s images and built the code for comparative experiments. TH conducted a comprehensive review of the paper, revised the logical structure of the experimental part, and enhanced the quality and readability of the article.
Ethics declarations
Conflict of interest
The authors have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, D., Ding, G., Huang, K. et al. Generating natural pedestrian crowds by learning real crowd trajectories through a transformer-based GAN. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03385-4