
Revealing the hidden features in traffic prediction via entity embedding


Neural network (NN) models have been used widely and successfully in traffic prediction, improving the accuracy and efficiency of forecasts of traffic flow, speed, passenger flow, and delay. Input data include both continuous and discrete variables, which affect traffic changes internally and externally. However, few studies have focused on discrete traffic-related variables in NN-based forecasting models. Inappropriate handling of discrete variables may render useful factors insignificant and lead to an inefficient forecasting model. In this paper, an NN-based model is used to predict the traffic flow of a bike-sharing system in Suzhou, China. The model uses only external, discrete variables such as weather, places of interest (POIs), and holiday periods. We applied both entity embedding and one-hot encoding to preprocess these variables. The results show that (1) entity embedding can effectively increase the continuity of categorical variables and slightly improve the prediction efficiency of the NN model; and (2) hidden relationships among the variables can be identified through visual analysis, and the trained embedding vectors can also be used in traffic-related tasks.
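To make the contrast between the two preprocessing schemes concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the weather category names, the embedding dimension of 2, and all function names are assumptions chosen for illustration, and the embedding table is randomly initialized rather than trained. It shows that one-hot encoding maps a category to a sparse 0/1 vector, while an entity embedding maps it to a short dense vector, which is equivalent to multiplying the one-hot vector by a table of weights that a network would learn.

```python
import random

# Hypothetical weather categories (an assumption; the paper's actual
# category list is not given in this abstract).
WEATHER = ["sunny", "cloudy", "rainy", "snowy"]
INDEX = {name: i for i, name in enumerate(WEATHER)}


def one_hot(category):
    """Return a sparse 0/1 vector with a single 1 at the category's index."""
    vec = [0.0] * len(WEATHER)
    vec[INDEX[category]] = 1.0
    return vec


def init_embedding(num_categories, dim, seed=0):
    """Create a small random embedding table with one row per category.

    In a real model these values would be weights updated during training.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(num_categories)]


def embed(category, table):
    """Entity embedding lookup: return the table row for this category."""
    return list(table[INDEX[category]])


def one_hot_times_table(category, table):
    """Compute the same vector as embed() as a one-hot vector times the table."""
    oh = one_hot(category)
    dim = len(table[0])
    return [sum(oh[i] * table[i][j] for i in range(len(table)))
            for j in range(dim)]


table = init_embedding(len(WEATHER), dim=2)
rainy_onehot = one_hot("rainy")        # length 4, one entry per category
rainy_embedded = embed("rainy", table)  # length 2, dense
```

With one-hot encoding every pair of categories is equally far apart, so the input carries no notion of similarity. With an embedding, training can move the rows for similar categories closer together, which is what makes the kind of visual analysis mentioned in the abstract possible.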








Author information

Correspondence to Inhi Kim.


Cite this article

Wang, B., Shaaban, K. & Kim, I. Revealing the hidden features in traffic prediction via entity embedding. Pers Ubiquit Comput (2019). https://doi.org/10.1007/s00779-019-01333-x



  • Neural networks
  • Visualization
  • Traffic prediction
  • Entity embedding
  • One-hot encoding