Abstract
Traffic signal control (TSC) can be described as a multi-agent cooperative game. Multi-agent reinforcement learning (MARL) is a promising approach to realizing such cooperation, with communication as a core component. The large number of traffic signals and the partial observability of TSC make finding the optimal joint control policy challenging. This paper proposes a deep MARL model named the attentional graph relations communication network (AGRCNet). Built on the Actor-Critic framework, AGRCNet designs a communication network through which agents exchange observations to help obtain the optimal joint action, reducing the decision errors caused by partial observability. Specifically, within the communication network, the chained propagation of graph attention networks (GAT) and graph convolutional networks expands each agent's receptive field, improves communication efficiency, and promotes cooperative behavior. We simulate the traffic situation near the Nanjing Yangtze River Bridge in Simulation of Urban MObility (SUMO); with a compound reward, our method performs best. AGRCNet is also applied to two abstract environments, where the results show that our approach adapts to dynamic agent relationships and is more efficient than the comparison algorithms.
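The abstract does not specify AGRCNet's exact layer equations, but the core communication mechanism it describes, attention-weighted message passing among neighboring agents, can be illustrated with a minimal sketch. The function below is a hypothetical single GAT-style step in NumPy: each agent aggregates projected neighbor embeddings with softmax-normalized attention weights. All names (`attention_aggregate`, the shapes of `W` and `a`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def attention_aggregate(h, adj, W, a):
    """One hypothetical graph-attention message-passing step.

    h:   (N, F)  agent observation embeddings
    adj: (N, N)  binary adjacency (1 = agents may communicate)
    W:   (F, F') shared linear projection
    a:   (2*F',) attention parameter vector
    Returns an (N, F') attention-weighted neighbor aggregation.
    """
    z = h @ W                       # project all embeddings
    n = z.shape[0]
    e = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # attention logit e_ij = LeakyReLU(a^T [z_i || z_j])
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else 0.2 * s
    e = np.where(adj > 0, e, -1e9)  # mask non-neighbors before softmax
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z                # weighted sum of neighbor messages
```

Stacking k such layers lets information propagate k hops, which is one plausible reading of how chained GAT/GCN propagation "expands the receptive domain" of each agent.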
Data availability
Enquiries about data availability should be directed to the authors.
Acknowledgements
This work is supported in part by the National Key Research and Development Program of China (International Technology Cooperation Project No. 2021YFE014400), in part by the National Natural Science Foundation of China (No. 62102187, No. 42175194), and in part by the Jiangsu Provincial Graduate Research and Practical Innovation Program (No. KYCX23_1358).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by KP, TM, HR and YQ. The first draft of the manuscript was written by KP, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
The authors approve that the research presented in this paper is conducted following the principles of ethical and professional conduct.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, T., Peng, K., Rong, H. et al. AGRCNet: communicate by attentional graph relations in multi-agent reinforcement learning for traffic signal control. Neural Comput & Applic 35, 21007–21022 (2023). https://doi.org/10.1007/s00521-023-08875-5