RED: A Simple but Effective Baseline Predictor for the TrajNet Benchmark

  • Stefan Becker
  • Ronny Hug
  • Wolfgang Hübner
  • Michael Arens
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)


In recent years, there has been a shift from modeling the tracking problem with Bayesian formulations towards using deep neural networks. To this end, this paper evaluates the effectiveness of various deep neural networks for predicting future pedestrian paths. Like traditional approaches, the analyzed networks rely solely on observed tracklets, without human-human interaction information. The evaluation is done on the publicly available TrajNet benchmark dataset [39], which builds up a repository of considerable and popular datasets for trajectory prediction. We show how a Recurrent-Encoder with a Dense layer stacked on top, referred to as the RED-predictor, achieves the top rank at the TrajNet 2018 challenge compared to more elaborate models. Further, we investigate failure cases, explain the observed phenomena, and give some recommendations for overcoming the demonstrated shortcomings.
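To make the architecture named in the abstract concrete, the following is a minimal forward-pass sketch of a recurrent encoder with a dense layer on top. It is an illustration under stated assumptions, not the authors' implementation: the observation and prediction lengths (8 and 12 steps), the hidden size, the use of position offsets as input and output, and the names `REDSketch`, `make_matrix`, and `matvec` are all hypothetical, a plain tanh recurrent cell stands in for whatever recurrent unit the paper uses, and the weights are random and untrained.

```python
import math
import random

# Assumed values for illustration; none of them are given in the text above.
OBS_LEN, PRED_LEN, HIDDEN = 8, 12, 16


def make_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix as a list of rows (untrained, illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python sequences."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class REDSketch:
    """Recurrent encoder plus dense layer; a tanh cell stands in for the recurrent unit."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(HIDDEN, 2, rng)               # (dx, dy) -> hidden
        self.w_rec = make_matrix(HIDDEN, HIDDEN, rng)         # hidden -> hidden
        self.w_out = make_matrix(2 * PRED_LEN, HIDDEN, rng)   # hidden -> all future offsets

    def encode(self, offsets):
        """Run the recurrent cell over the observed offsets and return the last state."""
        hidden = [0.0] * HIDDEN
        for step in offsets:
            pre = [a + b for a, b in zip(matvec(self.w_in, step),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(p) for p in pre]
        return hidden

    def predict(self, track):
        """track: observed (x, y) positions; returns PRED_LEN future positions."""
        offsets = [(x1 - x0, y1 - y0)
                   for (x0, y0), (x1, y1) in zip(track, track[1:])]
        # The dense layer emits every future offset at once; there is no decoder loop.
        flat = matvec(self.w_out, self.encode(offsets))
        x, y = track[-1]
        future = []
        for i in range(PRED_LEN):
            x, y = x + flat[2 * i], y + flat[2 * i + 1]
            future.append((x, y))
        return future


# Usage: a synthetic straight-line tracklet; the output is meaningless until trained.
observed = [(0.4 * t, 0.1 * t) for t in range(OBS_LEN)]
predicted = REDSketch().predict(observed)
print(len(predicted), predicted[0])
```

The point of the sketch is the data flow: a single tracklet goes in, with no information about other pedestrians, and the whole future path comes out of one dense layer applied to the final encoder state.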


Trajectory forecasting · Path prediction · Trajectory-based activity forecasting



The authors thank the organizers of the TrajNet challenge for providing a framework towards more meaningful, standardized trajectory prediction benchmarking.


  1. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015), software available from
  2. Akaike, H.: Fitting autoregressive models for prediction. Ann. Inst. Stat. Math. 21(1), 243–247 (1969)
  3. Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: human trajectory prediction in crowded spaces. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 961–971. IEEE (2016)
  4. Alahi, A., et al.: Learning to predict human behaviour in crowded scenes. In: Group and Crowd Behavior for Computer Vision. Elsevier (2017)
  5. Bai, S., Kolter, J.Z., Koltun, V.: An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint abs/1803.01271 (2018)
  6. Ballan, L., Castaldo, F., Alahi, A., Palmieri, F., Savarese, S.: Knowledge transfer for scene-specific motion prediction. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 697–713. Springer, Cham (2016)
  7. Brownlee, J.: Introduction to time series forecasting with Python: how to prepare data and develop models to predict the future (2017)
  8. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734. Association for Computational Linguistics, Doha, Qatar (2014)
  9. Chung, J., Kastner, K., Dinh, L., Goel, K., Courville, A., Bengio, Y.: A recurrent latent variable model for sequential data. In: Advances in Neural Information Processing Systems (NIPS) (2015)
  10. Coscia, P., Castaldo, F., Palmieri, F.A., Alahi, A., Savarese, S., Ballan, L.: Long-term path prediction in urban scenarios using circular distributions. Image Vis. Comput. 69, 81–91 (2018)
  11. Donahue, J., et al.: Long-term recurrent convolutional networks for visual recognition and description. In: Conference on Computer Vision and Pattern Recognition. IEEE (2015)
  12. Draper, N.R., Smith, H.: Applied Regression Analysis. Wiley Series in Probability and Mathematical Statistics. Wiley, New York (1966)
  13. Ellis, D., Sommerlade, E., Reid, I.: Modelling pedestrian trajectory patterns with Gaussian processes. In: International Conference on Computer Vision Workshops (ICCVW), pp. 1229–1234. IEEE (2009)
  14. Ferryman, J., Shahrokni, A.: PETS 2009: dataset and challenge. In: IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), pp. 1–6 (2009)
  15. Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (2013)
  16. Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: socially acceptable trajectories with generative adversarial networks. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2018)
  17. Hasan, I., Setti, F., Tsesmelis, T., Bue, A.D., Galasso, F., Cristani, M.: MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2018)
  18. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
  19. Helbing, D., Molnár, P.: Social force model for pedestrian dynamics. Phys. Rev. E 51, 4282–4286 (1995)
  20. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
  21. Huang, S., et al.: Deep learning driven visual path prediction from a single image. IEEE Trans. Image Process. 25(12), 5892–5904 (2016)
  22. Huber, M.: Nonlinear Gaussian filtering: theory, algorithms, and applications. Ph.D. thesis, Karlsruhe Institute of Technology (KIT) (2015)
  23. Hug, R., Becker, S., Hübner, W., Arens, M.: On the reliability of LSTM-MDL models for predicting pedestrian trajectories. In: Representations, Analysis and Recognition of Shape and Motion from Imaging Data (RFMI), Savoie, France (2017)
  24. Hug, R., Becker, S., Hübner, W., Arens, M.: Particle-based pedestrian path prediction using LSTM-MDL models. In: IEEE International Conference on Intelligent Transportation Systems (ITSC) (2018)
  25. Kalman, R.E.: A new approach to linear filtering and prediction problems. ASME J. Basic Eng. 82, 35–45 (1960)
  26. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: International Conference on Learning Representations (ICLR) (2015)
  27. Kitani, K.M., Ziebart, B.D., Bagnell, J.A., Hebert, M.: Activity forecasting. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7575, pp. 201–214. Springer, Heidelberg (2012)
  28. Kooij, J.F.P., Schneider, N., Flohr, F., Gavrila, D.M.: Context-based pedestrian path prediction. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8694, pp. 618–633. Springer, Cham (2014)
  29. Lee, N., Choi, W., Vernaza, P., Choy, C.B., Torr, P.H.S., Chandraker, M.: DESIRE: distant future prediction in dynamic scenes with interacting agents. In: Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2017)
  30. Lerner, A., Chrysanthou, Y., Lischinski, D.: Crowds by example. Comput. Graph. Forum 26(3), 655–664 (2007)
  31. Li, Z., Zhou, Y., Xiao, S., He, C., Li, H.: Auto-conditioned LSTM network for extended complex human motion synthesis. arXiv preprint abs/1707.05363 (2017)
  32. Ma, W., Huang, D., Lee, N., Kitani, K.M.: Forecasting interactive dynamics of pedestrians with fictitious play. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4636–4644. IEEE (2017)
  33. Martinez, J., Black, M.J., Romero, J.: On human motion prediction using recurrent neural networks. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4674–4683. IEEE (2017)
  34. McCullagh, P., Nelder, J.A.: Generalized Linear Models. Chapman & Hall, CRC, London (1989)
  35. van den Oord, A., et al.: WaveNet: a generative model for raw audio. arXiv preprint abs/1609.03499 (2016)
  36. Pellegrini, S., Ess, A., Schindler, K., van Gool, L.: You’ll never walk alone: modeling social behavior for multi-target tracking. In: International Conference on Computer Vision, pp. 261–268. IEEE (2009)
  37. Priestley, M.B.: Spectral Analysis and Time Series. Academic Press, London, New York (1981)
  38. Robicquet, A., Sadeghian, A., Alahi, A., Savarese, S.: Learning social etiquette: human trajectory understanding in crowded scenes. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 549–565. Springer, Cham (2016)
  39. Sadeghian, A., Kosaraju, V., Gupta, A., Savarese, S., Alahi, A.: TrajNet: towards a benchmark for human trajectory prediction. arXiv preprint (2018)
  40. Sadeghian, A., Kosaraju, V., Sadeghian, A., Hirose, N., Savarese, S.: SoPhie: an attentive GAN for predicting paths compliant to social and physical constraints. arXiv preprint arXiv:1806.01482 (2018)
  41. Vemula, A., Muelling, K., Oh, J.: Modeling cooperative navigation in dense human crowds. In: International Conference on Robotics and Automation (ICRA), pp. 1685–1692. IEEE, May 2017
  42. Williams, C.K.I.: Prediction with Gaussian processes: from linear regression to linear prediction and beyond. In: Jordan, M.I. (ed.) Learning in Graphical Models. NATO ASI Series, pp. 599–621. Springer, Dordrecht (1998)
  43. Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: Bach, F., Blei, D. (eds.) International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 2048–2057. PMLR, Lille, France (2015)
  44. Xue, H., Huynh, D.Q., Reynolds, H.M.: SS-LSTM: a hierarchical LSTM model for pedestrian trajectory prediction. In: Winter Conference on Applications of Computer Vision (WACV). IEEE (2018)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Fraunhofer Institute for Optronics, System Technologies, and Image Exploitation IOSB, Ettlingen, Germany
