
Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12363)

Abstract

Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.
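To make the abstract's description concrete, below is a minimal, illustrative PyTorch sketch (not the authors' implementation) of a graph-structured recurrent predictor with a dynamics-integrating decoder: one recurrent encoder summarizes an agent's own history, another summarizes aggregated neighbor histories, and the decoder predicts velocities that are Euler-integrated under assumed single-integrator dynamics so outputs remain dynamically consistent. All module names, dimensions, and the time step are assumptions made for illustration.

```python
# Illustrative sketch only: a minimal graph-structured recurrent encoder with a
# dynamics-integrating decoder. Module names, sizes, and the single-integrator
# dynamics are assumptions for illustration, not the paper's actual code.
import torch
import torch.nn as nn


class NodeHistoryEncoder(nn.Module):
    """Encodes an agent's past state sequence with an LSTM."""
    def __init__(self, state_dim=6, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)

    def forward(self, history):                 # history: (B, T_hist, state_dim)
        _, (h, _) = self.lstm(history)
        return h[-1]                            # (B, hidden_dim)


class EdgeEncoder(nn.Module):
    """Aggregates neighboring agents' histories (summed, permutation-invariant) with an LSTM."""
    def __init__(self, state_dim=6, hidden_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)

    def forward(self, neighbor_states):         # (B, N_neighbors, T_hist, state_dim)
        pooled = neighbor_states.sum(dim=1)     # aggregate over neighbors
        _, (h, _) = self.lstm(pooled)
        return h[-1]


class SingleIntegratorDecoder(nn.Module):
    """Predicts velocities from the fused representation and integrates them,
    so every output trajectory is consistent with single-integrator dynamics."""
    def __init__(self, rep_dim=64, horizon=12, dt=0.4):
        super().__init__()
        self.horizon, self.dt = horizon, dt
        self.gru = nn.GRUCell(rep_dim, 64)
        self.to_velocity = nn.Linear(64, 2)

    def forward(self, rep, last_pos):           # rep: (B, rep_dim), last_pos: (B, 2)
        h = torch.zeros(rep.size(0), 64, device=rep.device)
        pos, out = last_pos, []
        for _ in range(self.horizon):
            h = self.gru(rep, h)
            vel = self.to_velocity(h)
            pos = pos + vel * self.dt           # Euler integration of predicted velocity
            out.append(pos)
        return torch.stack(out, dim=1)          # (B, horizon, 2)


if __name__ == "__main__":
    B, T = 4, 8
    node_enc, edge_enc, decoder = NodeHistoryEncoder(), EdgeEncoder(), SingleIntegratorDecoder()
    history = torch.randn(B, T, 6)              # one agent's past states
    neighbors = torch.randn(B, 3, T, 6)         # three neighbors' past states
    rep = torch.cat([node_enc(history), edge_enc(neighbors)], dim=-1)
    future = decoder(rep, history[:, -1, :2])
    print(future.shape)                         # torch.Size([4, 12, 2])
```

In the full model described by the abstract, heterogeneous inputs such as semantic maps and, optionally, candidate ego-agent motion plans serve as additional conditioning; in a sketch like this they would appear as further encodings concatenated into the fused representation before decoding.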

Keywords

Trajectory forecasting · Spatiotemporal graph modeling · Human-robot interaction · Autonomous driving

Notes

Acknowledgment

This work was supported in part by the Ford-Stanford Alliance. This article solely reflects the opinions and conclusions of its authors.

Supplementary material

Supplementary material 1: 504473_1_En_40_MOESM1_ESM.pdf (PDF, 1.1 MB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Autonomous Systems Lab, Stanford University, Stanford, USA
  2. Ford Greenfield Labs, Palo Alto, USA