Baselines for Joint-Action Reinforcement Learning of Coordination in Cooperative Multi-agent Systems
A common assumption in the study of reinforcement learning of coordination is that agents can observe each other's actions (so-called joint-action learning). In this paper we present a number of simple joint-action learning algorithms and show that they perform very well compared with more complex approaches such as OAL, while still maintaining convergence guarantees. Based on these empirical results, we argue that such simple algorithms should serve as baselines for any future research on joint-action learning of coordination.
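To illustrate the kind of simple joint-action learner the abstract refers to, the sketch below implements a minimal joint-action Q-learner for a repeated cooperative matrix game: each agent keeps Q-values over joint actions, maintains an empirical frequency model of the other agent's actions, and plays an epsilon-greedy best response to that model. The payoff matrix, hyperparameters, and function name are illustrative assumptions, not the paper's exact algorithms; the game is the "climbing game" studied by Claus and Boutilier.

```python
import random
from collections import defaultdict

# Cooperative "climbing game" payoff matrix (Claus & Boutilier, 1998);
# both agents receive the same reward, so one shared Q-table suffices.
PAYOFF = [[11, -30, 0],
          [-30, 7, 6],
          [0, 0, 5]]

def joint_action_q_learning(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Illustrative sketch of a simple joint-action learner:
    Q-values over joint actions plus a fictitious-play-style model
    of the other agent, with epsilon-greedy action selection."""
    rng = random.Random(seed)
    n = 3
    q = defaultdict(float)       # Q-values indexed by joint action (a0, a1)
    counts = [[1] * n, [1] * n]  # agent i's counts of the OTHER agent's actions

    def best_response(agent):
        # own action maximizing expected Q under the empirical model
        total = sum(counts[agent])
        best, best_val = 0, float("-inf")
        for a in range(n):
            ev = sum(counts[agent][b] / total *
                     q[(a, b) if agent == 0 else (b, a)] for b in range(n))
            if ev > best_val:
                best, best_val = a, ev
        return best

    for _ in range(episodes):
        acts = [rng.randrange(n) if rng.random() < epsilon else best_response(i)
                for i in (0, 1)]
        r = PAYOFF[acts[0]][acts[1]]
        ja = (acts[0], acts[1])
        q[ja] += alpha * (r - q[ja])   # stateless Q-update toward the team reward
        counts[0][acts[1]] += 1        # each agent observes the other's action
        counts[1][acts[0]] += 1
    return q
```

The climbing game is a standard testbed here because the large miscoordination penalties around the optimal joint action (11) make naive learners prone to settling on a safer but suboptimal equilibrium, which is exactly the behavior baselines like this are meant to expose.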
Keywords: Reinforcement Learning · Joint Action · Multiagent System · Optimal Action · Stochastic Game
- 1. Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Proceedings of the Sixteenth Conference on Neural Information Processing Systems (2002)
- 2. Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multiagent systems. In: Proceedings of the Eighteenth National Conference on Artificial Intelligence (2002)
- 3. Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence (1998)
- 4. Chalkiadakis, G., Boutilier, C.: Coordination in multiagent reinforcement learning: A Bayesian approach. In: Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems (2003)
- 5. Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the Eleventh International Conference on Machine Learning (1994)
- 6. Watkins, C.: Learning from Delayed Rewards. PhD thesis, King's College, Cambridge University (1989)