Traffic signal control using a cooperative EWMA-based multi-agent reinforcement learning

Qiao, Zhimin; Ke, Liangjun; Wang, Xiaoqiang

doi:10.1007/s10489-022-03643-9

Published: 11 June 2022

Volume 53, pages 4483–4498, (2023)

Applied Intelligence

Abstract

In contemporary urban areas, traffic signal control remains enormously difficult. Multi-agent reinforcement learning (MARL) is a promising way to solve this problem. However, most MARL algorithms cannot effectively transfer learned strategies when the number of agents increases or decreases. This paper proposes a new MARL algorithm, cooperative dynamic delay updating twin delayed deep deterministic policy gradient based on the exponentially weighted moving average (CoTD3-EWMA), to address this limitation. By introducing mean-field theory, the algorithm implicitly models the interaction between agents and the environment, which reduces the dimension of the action space and improves the scalability of the algorithm. In addition, we propose a dynamic delay updating method based on the exponentially weighted moving average (EWMA), which mitigates the Q-value overestimation problem of the traditional TD3 algorithm. Moreover, a joint reward allocation mechanism and a state sharing mechanism are proposed to improve the agents' global strategy learning ability and robustness. Simulation results show that the new algorithm outperforms current state-of-the-art algorithms, effectively reducing vehicle delay time and improving the efficiency of the traffic network.
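The two building blocks named in the abstract, the EWMA and the mean-field treatment of other agents' actions, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `ewma` and `mean_action`, the smoothing factor `alpha`, and the choice to seed the average with the first observation are all assumptions made for illustration, and the sketch does not show how CoTD3-EWMA converts the EWMA into an update delay.

```python
from typing import Iterable, List, Sequence


def ewma(values: Iterable[float], alpha: float) -> List[float]:
    """Return the series s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    Assumption: the series is seeded with the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # seed with the first observation
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def mean_action(neighbor_actions: Sequence[Sequence[float]]) -> List[float]:
    """Average the neighbours' action vectors into one vector.

    The result has the dimension of a single action, regardless of
    how many neighbours contribute to it.
    """
    if not neighbor_actions:
        raise ValueError("need at least one neighbour action")
    n = len(neighbor_actions)
    dim = len(neighbor_actions[0])
    return [sum(a[i] for a in neighbor_actions) / n for i in range(dim)]
```

In a mean-field formulation, a critic would typically take the agent's own state and action together with this mean action, so its input size does not grow as intersections are added or removed. That is the property the abstract refers to as improved scalability.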

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61973244 and 72001214).

Author information

Authors and Affiliations

School of Automation Science and Engineering, Xi’an Jiaotong University, Xi’an, 710049, China
Zhimin Qiao, Liangjun Ke & Xiaoqiang Wang

Corresponding author

Correspondence to Liangjun Ke.

About this article

Cite this article

Qiao, Z., Ke, L. & Wang, X. Traffic signal control using a cooperative EWMA-based multi-agent reinforcement learning. Appl Intell 53, 4483–4498 (2023). https://doi.org/10.1007/s10489-022-03643-9

Accepted: 14 April 2022
Published: 11 June 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10489-022-03643-9
