Diverse and Admissible Trajectory Forecasting Through Multimodal Context Understanding

Park, Seong Hyeon; Lee, Gyubok; Seo, Jimin; Bhat, Manoj; Kang, Minseok; Francis, Jonathan; Jadhav, Ashwin; Liang, Paul Pu; Morency, Louis-Philippe

doi:10.1007/978-3-030-58621-8_17

Conference paper
First Online: 27 November 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12356)

Abstract

Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians, for safe and reliable decision-making. Due to partial observability in these dynamical scenes, directly obtaining the posterior distribution over future agent trajectories remains a challenging problem. In realistic embodied environments, each agent's future trajectories should be both diverse, since multiple plausible sequences of actions can be used to reach its intended goals, and admissible, since they must obey physical constraints and stay in drivable areas. In this paper, we propose a model that synthesizes multiple input signals from the multimodal world (the environment's scene context and interactions between multiple surrounding agents) to best model all diverse and admissible trajectories. We compare our model with strong baselines and ablations across two public datasets and show a significant performance improvement over previous state-of-the-art methods. Lastly, we offer new metrics incorporating admissibility criteria to further study and evaluate the diversity of predictions. Code is available at https://github.com/kami93/CMU-DATF.

J. Seo and M. Bhat contributed equally.

Acknowledgements

This work was supported in part by the Technology Innovation Program under Grant 10083646 (Development of Deep Learning-Based Future Prediction and Risk Assessment Technology considering Inter-vehicular Interaction in Cut-in Scenario), funded by the Ministry of Trade, Industry, and Energy, South Korea. We also acknowledge the anonymous reviewers for their constructive comments.

Author information

Authors and Affiliations

Hanyang University, Seoul, Korea
Seong Hyeon Park
Yonsei University, Seoul, Korea
Gyubok Lee
Korea University, Seoul, Korea
Jimin Seo
Carnegie Mellon University, Pittsburgh, PA, USA
Manoj Bhat, Jonathan Francis, Ashwin Jadhav, Paul Pu Liang & Louis-Philippe Morency
Sogang University, Seoul, Korea
Minseok Kang
Bosch Research Pittsburgh, Pittsburgh, PA, USA
Jonathan Francis

Corresponding author

Correspondence to Seong Hyeon Park.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Electronic supplementary material

Supplementary material 1 (PDF, 2434 KB)

About this paper

Cite this paper

Park, S.H. et al. (2020). Diverse and Admissible Trajectory Forecasting Through Multimodal Context Understanding. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol 12356. Springer, Cham. https://doi.org/10.1007/978-3-030-58621-8_17

DOI: https://doi.org/10.1007/978-3-030-58621-8_17
Published: 27 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58620-1
Online ISBN: 978-3-030-58621-8
eBook Packages: Computer Science, Computer Science (R0)
