Abstract
In a pursuit-evasion game, pursuers can usually capture the evaders when the deployment environment is similar to the one they were trained on. However, when some pursuers break down or new pursuers join, the number of agents in practice differs from the number used in training; in other words, the environment has changed. In multi-agent deep reinforcement learning algorithms, this means that the input and output dimensions of the network change, so the trained pursuers may fail to capture the evaders in real-world applications. To solve this problem, we propose a multi-agent reinforcement learning framework with which the pursuers can still capture the evaders when the number of pursuers changes. Based on the deep deterministic policy gradient (DDPG) framework and a bidirectional recurrent neural network (Bi-RNN), we propose a scalable deep reinforcement learning method for the pursuit-evasion game and apply it to a multi-agent pursuit-evasion game in a 2D dynamic environment. In this game, the evaders are faster than the pursuers but fewer in number. Our experimental results show that the algorithm improves the scalability and stability of the multi-agent pursuit-evasion game.
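The dimension problem described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each pursuer contributes one observation vector and that a bidirectional recurrent layer is scanned over the pursuers, so the same weights serve any team size. The class name `BiRNNActor`, the layer sizes, and the random weights are all hypothetical, and no DDPG training is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def vec_mat(vec, mat):
    """Multiply a row vector by a matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


class BiRNNActor:
    """Hypothetical deterministic policy shared by all pursuers."""

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One (input, recurrent) weight pair per direction, reused for every agent.
        self.fwd = (make_matrix(obs_dim, hidden_dim, rng),
                    make_matrix(hidden_dim, hidden_dim, rng))
        self.bwd = (make_matrix(obs_dim, hidden_dim, rng),
                    make_matrix(hidden_dim, hidden_dim, rng))
        self.w_out = make_matrix(2 * hidden_dim, act_dim, rng)

    def _scan(self, observations, weights):
        """Run a plain tanh RNN over the agents, returning one state per agent."""
        w_in, w_h = weights
        h = [0.0] * self.hidden_dim
        states = []
        for obs in observations:
            pre = [a + b for a, b in zip(vec_mat(obs, w_in), vec_mat(h, w_h))]
            h = [math.tanh(x) for x in pre]
            states.append(h)
        return states

    def act(self, observations):
        """Map a list of per-agent observations to a list of per-agent actions."""
        forward = self._scan(observations, self.fwd)
        # The backward pass reads the agents in reverse, then is re-aligned.
        backward = self._scan(observations[::-1], self.bwd)[::-1]
        return [
            [math.tanh(x) for x in vec_mat(f + b, self.w_out)]  # bounded actions
            for f, b in zip(forward, backward)
        ]


actor = BiRNNActor(obs_dim=4, hidden_dim=8, act_dim=2)
rng = random.Random(1)
obs_3 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
obs_5 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
actions_3 = actor.act(obs_3)  # three pursuers
actions_5 = actor.act(obs_5)  # five pursuers, same weights
print(len(actions_3), len(actions_5))  # prints "3 5"
```

Because the parameter shapes depend only on the per-agent observation and action sizes, the same actor can be queried with three or five pursuers without rebuilding the network. A policy that concatenated all observations into one fixed-length input would need a new input layer each time the team size changed.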
This work was supported in part by the National Natural Science Foundation of China under Grants 61672245, 61873287, 61672112, 61572210, and 61633011.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, L., Hu, B., Guan, Z., Cheng, X., Li, T., Xiao, J. (2020). Multi-agent Deep Reinforcement Learning for Pursuit-Evasion Game Scalability. In: Jia, Y., Du, J., Zhang, W. (eds) Proceedings of 2019 Chinese Intelligent Systems Conference. CISC 2019. Lecture Notes in Electrical Engineering, vol 592. Springer, Singapore. https://doi.org/10.1007/978-981-32-9682-4_69
DOI: https://doi.org/10.1007/978-981-32-9682-4_69
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9681-7
Online ISBN: 978-981-32-9682-4
eBook Packages: Intelligent Technologies and Robotics (R0)