R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting

  • Nicholas Rhinehart
  • Kris M. Kitani
  • Paul Vernaza
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11217)

Abstract

We propose a method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both “diverse” (produces most of the likely paths) and “precise” (mostly produces likely paths). This balance is achieved through minimization of a symmetrized cross-entropy between the distribution and demonstration data. By viewing the simulated-outcome distribution as the pushforward of a simple distribution under a simulation operator, we obtain expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization. We propose concrete policy architectures for this model, discuss our evaluation metrics relative to previously-used degenerate metrics, and demonstrate the superiority of our method relative to state-of-the-art methods on both the KITTI dataset and a similar but novel and larger real-world dataset explicitly designed for the vehicle forecasting domain.
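The symmetrized objective described above combines a forward cross-entropy H(p, q) (the model must cover the demonstrations, encouraging diversity) with a reverse cross-entropy H(q, p̃) against an approximate data density p̃ (model samples must stay in likely regions, encouraging precision), where q is the pushforward of a simple noise distribution. The following is a minimal one-dimensional sketch of that idea only, not the paper's policy architecture: the affine pushforward, the Gaussian stand-in for p̃, the hand-derived gradients, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Demonstration" data drawn from p = N(2, 0.5^2).
data = rng.normal(2.0, 0.5, size=4096)

# Approximate data density p~ used in the reverse term (here the true Gaussian
# pdf, standing in for a learned spatial prior).
def log_p_tilde(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2 - np.log(0.5 * np.sqrt(2 * np.pi))

# Pushforward model: x = mu + exp(log_sigma) * z with z ~ N(0, 1).
# The map is invertible, so the change-of-variables formula gives exact log q.
mu, log_sigma = 0.0, 0.0

lr = 0.05
for step in range(500):
    # Forward cross-entropy H(p, q) = -E_{x~p}[log q(x)]: rewards covering the
    # data. Gradients of -mean(log q(data)) w.r.t. mu and log_sigma, by hand:
    z = (data - mu) * np.exp(-log_sigma)
    g_mu = -np.mean(z) * np.exp(-log_sigma)
    g_ls = 1.0 - np.mean(z ** 2)

    # Reverse cross-entropy H(q, p~) = -E_z[log p~(mu + sigma * z)]: penalizes
    # samples landing in low-density regions. Differentiable through the
    # sampling step via the reparameterization trick.
    zs = rng.normal(size=1024)
    xs = mu + np.exp(log_sigma) * zs
    dlogp = -(xs - 2.0) / 0.25          # d/dx log p~(x)
    g_mu += -np.mean(dlogp)
    g_ls += -np.mean(dlogp * zs) * np.exp(log_sigma)

    mu -= lr * g_mu
    log_sigma -= lr * g_ls

print(mu, np.exp(log_sigma))
```

The forward term alone would fit the data moments; the reverse term additionally contracts the model toward high-density regions, so the learned scale ends up slightly below the data's standard deviation. The paper's actual model replaces the affine map with a learned simulation (policy rollout) operator over trajectories.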

Keywords

Trajectory forecasting · Imitation learning · Generative modeling · Self-driving vehicles

Notes

Acknowledgment

This work was sponsored in part by JST CREST (JPMJCR14E1) and IARPA (D17PC00340).

Supplementary material

474201_1_En_47_MOESM1_ESM.pdf (2.6 mb)
Supplementary material 1 (pdf 2617 KB)
474201_1_En_47_MOESM2_ESM.m4v (32.2 mb)
Supplementary material 2 (m4v 32933 KB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Carnegie Mellon University, Pittsburgh, USA
  2. NEC Labs America, Cupertino, USA