SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction

Sriram, N. N.; Liu, Buyu; Pittaluga, Francesco; Chandraker, Manmohan

doi:10.1007/978-3-030-58583-9_28

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12372)

Included in the following conference series: European Conference on Computer Vision

Abstract

We propose advances that address two key challenges in future trajectory prediction: (i) multimodality in both training data and predictions and (ii) constant-time inference regardless of the number of agents. Existing trajectory prediction methods are fundamentally limited by the lack of diversity in training data, which is difficult to acquire with sufficient coverage of possible modes. Our first contribution is an automatic method to simulate diverse trajectories in the top view. It uses pre-existing datasets and maps as initialization, mines existing trajectories to represent realistic driving behaviors, and uses a multi-agent vehicle dynamics simulator to generate diverse new trajectories that cover various modes and are consistent with scene layout constraints. Our second contribution is a novel method that generates diverse predictions while accounting for scene semantics and multi-agent interactions, with constant-time inference independent of the number of agents. We propose a convLSTM with novel state pooling operations and losses to predict scene-consistent states of multiple agents in a single forward pass, along with a CVAE for diversity. We validate our proposed multi-agent trajectory prediction approach by training and testing on the proposed simulated dataset and existing real datasets of traffic scenes. In both cases, our approach outperforms state-of-the-art methods by a large margin, highlighting the benefits of both our diverse dataset simulation and constant-time diverse trajectory prediction methods.
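One way to picture how predicting all agents "in a single forward pass" can make the costly step independent of the agent count is sketched below. This is an illustrative toy, not the authors' architecture: the 3x3 mean filter is only a stand-in for a learned convolutional recurrent layer, the per-agent state is a single number, and every name here (`scatter_states`, `box_filter`, `gather_states`, `step`) is hypothetical. The sketch assumes agents are placed on a shared top-view grid, the grid is processed once, and each agent's new state is read back at its cell.

```python
from typing import List, Tuple

Grid = List[List[float]]
Cell = Tuple[int, int]


def scatter_states(positions: List[Cell], states: List[float],
                   height: int, width: int) -> Grid:
    """Write each agent's scalar state into its grid cell (summed on collisions)."""
    grid = [[0.0] * width for _ in range(height)]
    for (r, c), s in zip(positions, states):
        grid[r][c] += s
    return grid


def box_filter(grid: Grid) -> Grid:
    """3x3 mean filter with zero padding; a stand-in for a learned conv layer."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        total += grid[rr][cc]
            out[r][c] = total / 9.0
    return out


def gather_states(grid: Grid, positions: List[Cell]) -> List[float]:
    """Read the updated state back at each agent's cell."""
    return [grid[r][c] for r, c in positions]


def step(positions: List[Cell], states: List[float],
         height: int = 8, width: int = 8) -> List[float]:
    """One shared pass over the grid updates every agent at once."""
    grid = scatter_states(positions, states, height, width)
    mixed = box_filter(grid)  # cost depends on grid size, not on agent count
    return gather_states(mixed, positions)


# Two neighbouring agents influence each other; the distant one does not.
print(step([(2, 2), (2, 3), (6, 6)], [9.0, 18.0, 4.5]))  # [3.0, 3.0, 0.5]
```

In this toy, adding more agents only adds cheap scatter and gather operations; the grid pass, which is where interactions between nearby agents are mixed, runs once per step whatever the number of agents.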

Notes

1. Generated 2044 scenes in total containing multiple trajectories for every scene.
2. Argoverse Forecasting for vehicle trajectory prediction is a large scale dataset containing 333,441 (5 s) trajectories captured from 320 h of driving.

Author information

Authors and Affiliations

NEC Laboratories America, San Jose, USA
N. N. Sriram, Buyu Liu, Francesco Pittaluga & Manmohan Chandraker
UC San Diego, San Diego, USA
Manmohan Chandraker

Corresponding author

Correspondence to N. N. Sriram.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Electronic supplementary material

Supplementary material 1 (mp4, 45267 KB)

Supplementary material 2 (pdf, 14370 KB)

Cite this paper

Sriram, N.N., Liu, B., Pittaluga, F., Chandraker, M. (2020). SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12372. Springer, Cham. https://doi.org/10.1007/978-3-030-58583-9_28

DOI: https://doi.org/10.1007/978-3-030-58583-9_28
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58582-2
Online ISBN: 978-3-030-58583-9
eBook Packages: Computer Science, Computer Science (R0)
