Abstract
The rapid growth of information and communication technology has expanded the application of deep learning to many areas. Traffic has become one of the leading problems of modern urban life because of the steady growth in the number of vehicles. Tracking congestion throughout the road network is important for achieving intelligent transportation systems. However, predicting traffic flow is difficult because of its nonlinear characteristics. This study proposes a model that applies an attention mechanism to a modified recurrent neural network (RNN). The attention mechanism addresses the limitations of modeling long-term dependencies and of using memory efficiently during computation. The modified RNN, which combines a residual module with a deep stacked gated recurrent unit (GRU)-type RNN, serves as the encoder–decoder network; it improves prediction performance by reducing the risk of vanishing gradients and enhancing the ability to capture longer dependencies. The proposed method was evaluated on two real-world road sensor datasets from the open-access PeMS database, covering the San Jose Bay area and the northbound Interstate 405 area. The results show that deep learning features with attention mechanisms can provide more precise short-term and long-term traffic predictions than classical and modern deep neural network-based baselines.
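To make the architecture in the abstract more concrete, the following is a minimal NumPy sketch of its three ingredients: a GRU update, residual (skip) connections between stacked GRU layers, and an attention step over the encoder states. It is an illustration under stated assumptions, not the authors' implementation. The layer count, hidden size, random weights, omission of biases, and the use of dot-product scoring for attention are all choices made here for brevity, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, dim):
    """Random weights for one GRU layer whose input and hidden sizes are both `dim`."""
    shape = (2 * dim, dim)  # rows correspond to [input; previous hidden state]
    return {name: rng.normal(0.0, 0.1, shape) for name in ("z", "r", "h")}

def gru_step(params, x, h_prev):
    """One GRU update (biases omitted; an assumption made for brevity)."""
    xh = np.concatenate([x, h_prev])
    z = sigmoid(xh @ params["z"])  # update gate
    r = sigmoid(xh @ params["r"])  # reset gate
    cand = np.tanh(np.concatenate([x, r * h_prev]) @ params["h"])
    return (1.0 - z) * h_prev + z * cand

def residual_stacked_gru(layers, inputs):
    """Run stacked GRU layers over `inputs` of shape (T, dim).

    Each layer's output at a time step is its hidden state plus its input,
    which is the residual connection.
    """
    seq = inputs
    for params in layers:
        h = np.zeros(seq.shape[1])
        outputs = []
        for x in seq:
            h = gru_step(params, x, h)
            outputs.append(h + x)  # skip connection around the layer
        seq = np.stack(outputs)
    return seq

def attention_context(encoder_states, query):
    """Dot-product attention (assumed scoring function): softmax-weighted sum of states."""
    scores = encoder_states @ query
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ encoder_states, weights

rng = np.random.default_rng(0)
dim, steps = 4, 12
layers = [init_gru(rng, dim) for _ in range(3)]
history = rng.normal(size=(steps, dim))  # synthetic stand-in for sensor readings
encoded = residual_stacked_gru(layers, history)
context, weights = attention_context(encoded, encoded[-1])
```

Here `context` summarizes the whole input window with weights that depend on the most recent state, so a decoder step can draw on distant time steps directly instead of relying only on the final hidden state.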
Data availability
This work made use of the California Department of Transportation’s Performance Measurement System (PeMS), which is available online.
Funding
This study was financially supported by the Ministry of Science and Technology of the Taiwanese government under contract number MOST 109-2221-E-011-099-MY2. Its support is much appreciated.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix: Hyperparameter search statistical result
Cite this article
Kuo, R.J., Kunarsito, D.A. Residual stacked gated recurrent unit with encoder–decoder architecture and an attention mechanism for temporal traffic prediction. Soft Comput 26, 8617–8633 (2022). https://doi.org/10.1007/s00500-022-07230-5