Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data

Salzmann, Tim; Ivanovic, Boris; Chakravarty, Punarjay; Pavone, Marco

doi:10.1007/978-3-030-58523-5_40

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12363)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.
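As a rough illustration of what "dynamically feasible" can mean in practice, the sketch below rolls a state forward by integrating control inputs through a dynamics model, so every predicted position is reachable under that model. The abstract does not say which dynamics model is used; the unicycle model, the forward-Euler integration, and the function name `integrate_unicycle` are assumptions made only for this example and are not taken from the authors' code.

```python
import math


def integrate_unicycle(state, controls, dt):
    """Roll a unicycle state forward with forward-Euler steps.

    state    -- (x, y, heading, speed), an assumed state layout
    controls -- iterable of (yaw_rate, acceleration) pairs, one per step
    dt       -- step length in seconds
    """
    x, y, heading, speed = state
    trajectory = []
    for yaw_rate, accel in controls:
        # Position advances along the current heading at the current speed.
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        # Controls change heading and speed, never position directly.
        heading += yaw_rate * dt
        speed += accel * dt
        trajectory.append((x, y, heading, speed))
    return trajectory


# Hypothetical input: start at 1 m/s, accelerate at 1 m/s^2, no turning.
traj = integrate_unicycle((0.0, 0.0, 0.0, 1.0), [(0.0, 1.0)] * 3, dt=0.5)
```

Because a model in this style outputs controls rather than raw positions, the resulting path cannot jump sideways or change speed instantaneously, which is the kind of constraint the abstract says most forecasting methods do not enforce.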

T. Salzmann and B. Ivanovic—Equal contribution.

T. Salzmann—Work done as a visiting student in the Autonomous Systems Lab.

Notes

1. All of our source code, trained models, and data can be found online at https://github.com/StanfordASL/Trajectron-plus-plus.

Acknowledgment

This work was supported in part by the Ford-Stanford Alliance. This article solely reflects the opinions and conclusions of its authors.

Author information

Authors and Affiliations

Autonomous Systems Lab, Stanford University, Stanford, USA
Tim Salzmann, Boris Ivanovic & Marco Pavone
Ford Greenfield Labs, Palo Alto, USA
Punarjay Chakravarty

Corresponding author

Correspondence to Boris Ivanovic.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Electronic supplementary material

Supplementary material 1 (PDF, 1132 KB)

About this paper

Cite this paper

Salzmann, T., Ivanovic, B., Chakravarty, P., Pavone, M. (2020). Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12363. Springer, Cham. https://doi.org/10.1007/978-3-030-58523-5_40

DOI: https://doi.org/10.1007/978-3-030-58523-5_40
Published: 04 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58522-8
Online ISBN: 978-3-030-58523-5
eBook Packages: Computer Science (R0)
