Abstract
Traffic signal control (TSC) can be described as a multi-agent cooperative game. Multi-agent reinforcement learning (MARL) is a promising approach to realizing such cooperation, with communication as a core component. The large number of traffic signals and the partial observability of TSC make finding the optimal joint control policy challenging. This paper proposes a deep MARL model named the attentional graph relations communication network (AGRCNet). Built on the Actor-Critic framework, AGRCNet designs a communication network through which agents exchange observations to help obtain the optimal joint action, reducing the decision errors caused by partial observability. Specifically, within the communication network, the chained propagation of graph attention networks (GAT) and graph convolutional networks expands each agent's receptive field, improves communication efficiency, and promotes cooperative behavior. We simulate the traffic situation near the Nanjing Yangtze River Bridge in Simulation of Urban MObility (SUMO); with a compound reward, our method performs best. AGRCNet is also applied to two abstract environments, where the results show that our approach adapts to dynamic agent relationships and is more efficient than the comparison algorithms.
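The abstract does not specify AGRCNet's exact layer equations, but the core communication mechanism it describes, attention-weighted message passing among neighboring agents, can be illustrated with a minimal sketch. The function below is a hypothetical single GAT-style step in NumPy: each agent aggregates projected neighbor embeddings with softmax-normalized attention weights. All names (`attention_aggregate`, the shapes of `W` and `a`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def attention_aggregate(h, adj, W, a):
    """One hypothetical graph-attention message-passing step.

    h:   (N, F)  agent observation embeddings
    adj: (N, N)  binary adjacency (1 = agents may communicate)
    W:   (F, F') shared linear projection
    a:   (2*F',) attention parameter vector
    Returns an (N, F') attention-weighted neighbor aggregation.
    """
    z = h @ W                       # project all embeddings
    n = z.shape[0]
    e = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # attention logit e_ij = LeakyReLU(a^T [z_i || z_j])
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else 0.2 * s
    e = np.where(adj > 0, e, -1e9)  # mask non-neighbors before softmax
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z                # weighted sum of neighbor messages
```

Stacking k such layers lets information propagate k hops, which is one plausible reading of how chained GAT/GCN propagation "expands the receptive domain" of each agent.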
Data availability
Enquiries about data availability should be directed to the authors.
Acknowledgements
This work is supported in part by the National Key Research and Development Program of China (International Technology Cooperation Project No. 2021YFE014400), in part by the National Natural Science Foundation of China (No. 62102187, No. 42175194), and in part by the Jiangsu Provincial Graduate Research and Practical Innovation Program (No. KYCX23_1358).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by KP, TM, HR and YQ. The first draft of the manuscript was written by KP, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
The authors approve that the research presented in this paper is conducted following the principles of ethical and professional conduct.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, T., Peng, K., Rong, H. et al. AGRCNet: communicate by attentional graph relations in multi-agent reinforcement learning for traffic signal control. Neural Comput & Applic 35, 21007–21022 (2023). https://doi.org/10.1007/s00521-023-08875-5