
Graph cooperation deep reinforcement learning for ecological urban traffic signal control

Published in Applied Intelligence

Abstract

Cooperation between intersections in large-scale road networks is critical to alleviating traffic congestion. Currently, most traffic signals coordinate via pre-defined timing phases, which is highly inefficient under real-time traffic conditions. Most existing studies on multi-agent reinforcement learning (MARL) traffic signal control have focused on designing efficient communication methods but have neglected how agents should interact during cooperative communication. To achieve more efficient cooperation among traffic signals and alleviate urban traffic congestion, this study constructs a Graph Cooperation Q-learning Network Traffic Signal Control (GCQN-TSC) model: a graph cooperation network with an embedded self-attention mechanism that enables agents to adjust their attention in real time according to dynamic traffic flow information, perceive the traffic environment quickly and effectively over a larger range, and thereby collaborate more effectively. Moreover, a Deep Graph Q-learning (DGQ) algorithm is proposed within this model to optimize the traffic signal control strategy according to the spatio-temporal characteristics of different traffic scenes and to provide the optimal signal phase for each intersection. This study also integrates the ecological traffic concept into MARL traffic signal control, with the aim of reducing vehicle exhaust emissions. Finally, the proposed GCQN-TSC is validated experimentally on both a synthetic traffic grid and a real-world traffic network using the SUMO simulator. The experimental results show that GCQN-TSC outperforms other traffic signal control methods on almost all performance metrics, including average queue length and waiting time, because it can aggregate information acquired from collaborating agents and make network-level signal optimization decisions.
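The self-attention mechanism described in the abstract can be sketched as scaled dot-product attention over per-intersection observation vectors, so that each agent's attention over its peers shifts with the live traffic features rather than being fixed in advance. The following is a minimal illustrative sketch only: the dimensions, weight matrices, and function names are assumptions for exposition, not the paper's actual GCQN-TSC implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable row-wise softmax
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention_aggregate(obs, W_q, W_k, W_v):
    """Each agent attends over the observations of all agents, so its
    attention weights adjust with the current traffic-flow features."""
    Q = obs @ W_q                       # queries, shape (n_agents, d)
    K = obs @ W_k                       # keys
    V = obs @ W_v                       # values
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores)           # row i: agent i's attention distribution
    return weights @ V                  # aggregated neighbourhood context

rng = np.random.default_rng(0)
n_agents, d_obs, d = 4, 8, 16           # hypothetical sizes
obs = rng.normal(size=(n_agents, d_obs))            # per-intersection traffic features
W_q, W_k, W_v = (rng.normal(size=(d_obs, d)) for _ in range(3))
ctx = attention_aggregate(obs, W_q, W_k, W_v)
print(ctx.shape)  # (4, 16): one aggregated context vector per intersection
```

In a Q-learning setting such as the DGQ algorithm named above, a context vector like `ctx[i]` would typically be fed into agent *i*'s Q-network alongside its own observation, so the phase choice reflects network-level state rather than a single intersection.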




Acknowledgements

This work is partially supported by the National Natural Science Foundation of China (62002117, 61862023), the Key Project of the Jiangxi Natural Science Foundation (20202ACBL202009), and the Science and Technology Project of the Jiangxi Provincial Education Department (GJJ190325, GJJ200627). The authors also gratefully acknowledge the reviewers' helpful comments and suggestions, which have improved the presentation.

Author information


Corresponding author

Correspondence to Liping Yan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yan, L., Zhu, L., Song, K. et al. Graph cooperation deep reinforcement learning for ecological urban traffic signal control. Appl Intell 53, 6248–6265 (2023). https://doi.org/10.1007/s10489-022-03208-w

