
Residual stacked gated recurrent unit with encoder–decoder architecture and an attention mechanism for temporal traffic prediction

  • Data analytics and machine learning
  • Published in: Soft Computing

Abstract

The rapid growth of information and communication technology has broadened the application of deep learning to many areas. Traffic congestion has become one of the leading problems of modern urban life because of the steady growth in the number of vehicles, and tracking congestion across the road network is essential for intelligent transportation systems. Predicting traffic flow, however, is difficult because of its nonlinear characteristics. This study proposes a model that applies an attention mechanism to a modified recurrent neural network (RNN). The attention mechanism addresses the limitations of modeling long-range dependencies and makes more efficient use of memory during computation. The modified RNN, which combines a residual module with a deep stacked GRU-type RNN, serves as the encoder–decoder network to improve prediction performance by reducing the risk of vanishing gradients and enhancing the ability to capture longer dependencies. The proposed method was evaluated on two real-world road sensor datasets from the open-access PeMS database: the San Jose Bay Area and the Northbound Interstate 405 area. The results show that deep learning features combined with an attention mechanism provide precise short-term and long-term traffic predictions compared with classical and modern deep neural network baselines.
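The core ideas the abstract describes can be illustrated with a small numpy sketch, assuming toy dimensions and random weights: a GRU cell, a residual stack of GRU layers acting as the encoder, and an attention step that forms a context vector from the encoder's hidden states. This is not the authors' implementation; simple dot-product attention stands in for the paper's mechanism, and all names (`GRUCell`, `encode`, `attend`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell: update gate z, reset gate r, candidate state."""
    def __init__(self, dim):
        self.Wz = rng.normal(0, 0.1, (2 * dim, dim))
        self.Wr = rng.normal(0, 0.1, (2 * dim, dim))
        self.Wh = rng.normal(0, 0.1, (2 * dim, dim))

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ self.Wz)                      # update gate
        r = sigmoid(xh @ self.Wr)                      # reset gate
        h_cand = np.tanh(np.concatenate([x, r * h]) @ self.Wh)
        return (1 - z) * h + z * h_cand                # interpolate old/new

def run_layer(cell, xs):
    """Unroll one GRU layer over a sequence; return all hidden states."""
    h = np.zeros(xs.shape[1])
    out = []
    for x in xs:
        h = cell.step(x, h)
        out.append(h)
    return np.stack(out)

def encode(stack, xs):
    """Residual stacked encoder: each layer's output is added to its
    input (identity shortcut), easing gradient flow through depth."""
    for cell in stack:
        xs = xs + run_layer(cell, xs)
    return xs                                          # (T, dim) annotations

def attend(query, annotations):
    """Dot-product attention over encoder annotations (a simplified
    stand-in for the attention used in encoder-decoder models)."""
    scores = annotations @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ annotations, weights

# Toy usage: encode a 12-step, 4-feature traffic window, then let a
# decoder query attend over it to form a context vector.
dim, T = 4, 12
stack = [GRUCell(dim) for _ in range(3)]               # 3 stacked layers
window = rng.normal(0, 1, (T, dim))                    # fake sensor readings
annotations = encode(stack, window)
context, weights = attend(annotations[-1], annotations)
print(context.shape, round(weights.sum(), 6))          # (4,) 1.0
```

The residual addition `xs + run_layer(cell, xs)` is the identity-shortcut idea the paper borrows from ResNets, and the attention weights let the decoder focus on the most relevant time steps of the input window instead of compressing everything into a single fixed vector.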


Figures 1–7


Data availability

This work used the California Department of Transportation’s Performance Measurement System (PeMS), an open-access database available on the Internet.


Funding

This study was financially supported by the Ministry of Science and Technology of Taiwan under contract number MOST 109-2221-E-011-099-MY2. Its support is much appreciated.

Author information

Corresponding author

Correspondence to R. J. Kuo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Hyperparameter search statistical result

Figure a (hyperparameter search statistics)


About this article


Cite this article

Kuo, R.J., Kunarsito, D.A. Residual stacked gated recurrent unit with encoder–decoder architecture and an attention mechanism for temporal traffic prediction. Soft Comput 26, 8617–8633 (2022). https://doi.org/10.1007/s00500-022-07230-5

