Abstract
We propose a distributed, dynamic-correlation-matrix-based multi-Q (D-DCM-Multi-Q) learning method for multi-robot systems. First, we introduce a dynamic correlation matrix for multi-agent reinforcement learning that accounts not only for each robot’s own Q-value but also for the correlated Q-values of neighboring robots. We then provide a theoretical analysis of the convergence of the D-DCM-Multi-Q method. The method is evaluated in simulations of multi-robot foraging and in a proof-of-concept experiment with a physical multi-robot system. The simulation and experimental results indicate that the proposed method is effective, robust, and stable.
Cite this article
Guo, H., Meng, Y. Distributed Reinforcement Learning for Coordinate Multi-Robot Foraging. J Intell Robot Syst 60, 531–551 (2010). https://doi.org/10.1007/s10846-010-9429-4