Abstract
Reinforcement learning is widely used to learn complex behaviors for robotics. However, because of the high-dimensional state and action spaces involved, it usually learns slowly in robotic control applications. A feasible solution is to exploit the structural decomposition of the control problem and use decentralized learning methods to speed up the overall learning process. In this paper, a multiagent reinforcement learning approach is proposed that enables decentralized learning of component behaviors for a robot decomposed as a coordination graph. With this approach, all component behaviors are learned in parallel by individual reinforcement learning agents, and these agents coordinate their behaviors to solve the global control problem. The approach is validated and analyzed on two benchmark robotic control problems. The experimental results provide evidence that the proposed approach achieves better performance than approaches without decomposition.
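To make the idea of a coordination-graph decomposition concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It assumes a hypothetical robot with three components connected in a chain, two actions per component, a made-up local reward on each edge, and a stateless (bandit-style) setting. All names (`EDGES`, `edge_reward`, `train`, and so on) are illustrative. The global value is taken to be the sum of per-edge values, each edge updates its own table from its own local reward, and the joint action is chosen by exhaustive maximization, which is only practical for a toy graph.

```python
import itertools
import random

# Hypothetical toy setup: three components (agents) in a chain, two actions each.
EDGES = [(0, 1), (1, 2)]
N_AGENTS = 3
N_ACTIONS = 2


def edge_reward(edge, a_i, a_j):
    """Made-up local reward: neighbouring components are rewarded for
    matching actions, with a larger reward when both pick action 1."""
    if a_i != a_j:
        return 0.0
    return 1.0 if a_i == 1 else 0.5


def global_value(q, joint):
    """Global value of a joint action: the sum of the local edge values."""
    return sum(q[e][(joint[e[0]], joint[e[1]])] for e in EDGES)


def best_joint_action(q):
    """Exhaustive maximisation over joint actions (toy-sized graphs only)."""
    joints = itertools.product(range(N_ACTIONS), repeat=N_AGENTS)
    return max(joints, key=lambda joint: global_value(q, joint))


def train(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # One small value table per edge instead of one table over all joint actions.
    q = {e: {pair: 0.0
             for pair in itertools.product(range(N_ACTIONS), repeat=2)}
         for e in EDGES}
    for _ in range(episodes):
        # Epsilon-greedy choice of the joint action.
        if rng.random() < epsilon:
            joint = tuple(rng.randrange(N_ACTIONS) for _ in range(N_AGENTS))
        else:
            joint = best_joint_action(q)
        # Each edge updates its own table from its own local reward.
        for e in EDGES:
            pair = (joint[e[0]], joint[e[1]])
            r = edge_reward(e, *pair)
            q[e][pair] += alpha * (r - q[e][pair])
    return q


if __name__ == "__main__":
    learned = train()
    print(best_joint_action(learned))
```

In this sketch each edge table has 4 entries, so the two edges store 8 values in total, the same as the 2^3 = 8 joint actions of this tiny example. The saving only appears on larger sparse graphs, where the number of edge entries grows with the number of edges rather than exponentially with the number of components.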
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants 61502072, 61572104, and 61403059, the Hong Kong Scholar Program under Grant XJ2017028, and the Dalian High Level Talent Innovation Support Program under Grant 2017RQ008.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, C., Wang, D., Ren, J., Ge, H., Sun, L. (2018). Decentralized Multiagent Reinforcement Learning for Efficient Robotic Control by Coordination Graphs. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science, vol 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_15
DOI: https://doi.org/10.1007/978-3-319-97304-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97303-6
Online ISBN: 978-3-319-97304-3
eBook Packages: Computer Science (R0)