Trajectory Prediction by Coupling Scene-LSTM with Human Movement LSTM
Abstract
We develop a novel human trajectory prediction system for static crowded scenes that couples scene information (Scene-LSTM) with individual pedestrian movement (Pedestrian-LSTM); the two networks are trained simultaneously. We superimpose a two-level grid structure (grid cells and subgrids) on the scene to encode spatial granularity together with common human movements. The Scene-LSTM captures the commonly traveled paths, which can be used to significantly improve the accuracy of human trajectory prediction in local areas (i.e., grid cells). We further design scene data filters, consisting of a hard filter and a soft filter, that select the relevant scene information in a local region when necessary and combine it with the Pedestrian-LSTM to forecast a pedestrian’s future locations. Experimental results on several publicly available datasets show that our method outperforms related methods and produces more accurate predicted trajectories in different scene contexts.
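As a rough illustration of the two-level grid idea, the sketch below maps a pedestrian position to a grid cell and to a subgrid inside that cell. It is a minimal sketch, not the authors' implementation: the function name `locate`, the normalized coordinate range, the row-major indexing, and the default sizes `n_cells` and `n_sub` are all assumptions made for this example.

```python
from typing import Tuple


def locate(x: float, y: float, n_cells: int = 8, n_sub: int = 4) -> Tuple[int, int]:
    """Map a normalized position (x, y) in [0, 1) x [0, 1) to a
    (grid-cell index, subgrid index) pair, both in row-major order.

    n_cells and n_sub are hypothetical sizes: the scene is split into
    n_cells x n_cells grid cells, each split into n_sub x n_sub subgrids.
    """
    if not (0.0 <= x < 1.0 and 0.0 <= y < 1.0):
        raise ValueError("position must lie in [0, 1) x [0, 1)")

    # Finest resolution along each axis: n_cells * n_sub bins.
    fine = n_cells * n_sub
    fine_col, fine_row = int(x * fine), int(y * fine)

    # The coarse cell is the fine bin divided by n_sub;
    # the remainder is the subgrid offset inside that cell.
    cell_col, sub_col = divmod(fine_col, n_sub)
    cell_row, sub_row = divmod(fine_row, n_sub)

    cell_index = cell_row * n_cells + cell_col
    sub_index = sub_row * n_sub + sub_col
    return cell_index, sub_index
```

With such an index, scene-level information could be stored per grid cell, while the subgrid index gives a finer location within the cell. How the paper actually assigns the two levels may differ.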
Keywords
Human movement, Scene information, LSTM network