Abstract
To address the difficulties of control-law design, poor portability, and poor stability in traditional multi-agent formation obstacle avoidance algorithms, a multi-agent formation obstacle avoidance method based on deep reinforcement learning (DRL) is proposed. The method combines the perception ability of convolutional neural networks (CNNs) with the decision-making ability of reinforcement learning, mapping the visual perception of the environment directly to control actions through end-to-end learning. A multi-agent system (MAS) model using the leader-follower formation method is designed, with the unicycle as the control object. An improved deep Q-network (DQN) algorithm is then designed: its discount factor and learning rate are adjusted, and a reward function is constructed that accounts for both the distance between each agent and the obstacles and a coordination factor among the agents, so that the formation can converge to the desired shape while avoiding obstacles and inter-agent collisions. Simulation results show that the proposed method achieves the expected goal of multi-agent formation obstacle avoidance and offers stronger portability than traditional algorithms.
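The two ingredients named in the abstract, a unicycle kinematic model for each agent and a shaped reward combining obstacle distance with an inter-agent coordination term, can be sketched as follows. This is an illustrative sketch only: the paper's exact reward function, weights (`w_goal`, `w_obs`, `w_coord`), safety radius, and desired spacing are not given in the abstract, so all of these names and values are assumptions.

```python
import math

def unicycle_step(x, y, theta, v, omega, dt=0.1):
    """One Euler step of the unicycle kinematic model:
    the agent moves with linear speed v along heading theta
    and turns with angular speed omega."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def shaped_reward(dist_to_goal, dist_to_obstacle, dist_to_neighbor,
                  safe_radius=0.5, desired_spacing=1.0,
                  w_goal=1.0, w_obs=2.0, w_coord=0.5):
    """Illustrative shaped reward with three terms:
    - progress: penalize remaining distance to the formation goal,
    - safety: penalize entering the safety radius around an obstacle,
    - coordination: penalize deviation from the desired inter-agent spacing."""
    r_goal = -w_goal * dist_to_goal
    r_obs = -w_obs * max(0.0, safe_radius - dist_to_obstacle)
    r_coord = -w_coord * abs(dist_to_neighbor - desired_spacing)
    return r_goal + r_obs + r_coord
```

In a DQN setting, `shaped_reward` would be evaluated after each `unicycle_step` to produce the scalar reward for the transition; the hyperparameters above would be tuned per environment.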
Foundation item
the National Natural Science Foundation of China (No. 61963006), and the Natural Science Foundation of Guangxi Province (Nos. 2020GXNSFDA238011, 2018GXNSFAA050029, and 2018GXNSFAA294085)
Cite this article
Ji, X., Hai, J., Luo, W. et al. Obstacle Avoidance in Multi-Agent Formation Process Based on Deep Reinforcement Learning. J. Shanghai Jiaotong Univ. (Sci.) 26, 680–685 (2021). https://doi.org/10.1007/s12204-021-2357-6