Abstract
In this article, we propose two methods for adapting parameters in multi-agent reinforcement learning (MARL) for repeated resource sharing problems (RRSP). Resource sharing problems (RSP) are an important and widely applicable framework in MARL; RRSP is a variation of RSP in which agents select resources repeatedly and periodically. We have previously proposed a learning method called Moderated Global Information (MGI) for MARL in RRSP. However, MGI requires careful tuning of several parameters, especially the temperature parameter T of the Boltzmann selection that governs agent behavior and the modification parameter L, for learning to converge to suitable states. To avoid this difficulty, we propose two methods that adjust these parameters according to the performance of each agent and the statistical behavior of the agent population. Results of several experiments show that the proposed methods are robust against changes of the environment and drive agent behavior toward the optimal situation.
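The abstract refers to the temperature parameter T of Boltzmann (softmax) action selection. The paper's specific MGI update and its adaptive tuning rules are not reproduced here; the following is only a minimal sketch of standard Boltzmann selection over estimated action values, assuming a list of Q-values per resource. The function names `boltzmann_probs` and `boltzmann_select` are illustrative, not from the paper.

```python
import math
import random

def boltzmann_probs(q_values, temperature):
    """Selection probabilities proportional to exp(Q / T).

    Subtracting the max Q-value before exponentiating avoids
    overflow without changing the resulting distribution.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def boltzmann_select(q_values, temperature):
    """Sample an action index from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return random.choices(range(len(q_values)), weights=probs)[0]
```

A low temperature makes selection near-greedy (exploitation), while a high temperature makes it near-uniform (exploration); this sensitivity is what makes T difficult to fix by hand and motivates adapting it during learning.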
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
Noda, I., Ohta, M. (2008). Meta-level Control of Multiagent Learning in Dynamic Repeated Resource Sharing Problems. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science(), vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_29
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0