Abstract
In this article, we propose two methods for adapting parameters in multi-agent reinforcement learning (MARL) for repeated resource sharing problems (RRSP). Resource sharing problems (RSP) are an important and widely applicable framework in MARL; RRSP is a variation of RSP in which agents select resources repeatedly and periodically. We have previously proposed a learning method called Moderated Global Information (MGI) for MARL in RRSP. However, MGI requires careful tuning of several parameters, especially the temperature parameter T of the Boltzmann selection that governs agent behavior and the modification parameter L, for learning to converge to suitable states. To avoid this difficulty, we propose two methods that adjust these parameters according to the performance of each agent and the statistical behavior of the agent population. Results of several experiments show that the proposed methods are robust against changes of the environment and drive agent behavior toward the optimal situation.
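The abstract refers to the temperature parameter T of Boltzmann (softmax) action selection. The paper's specific MGI update and its adaptive tuning rules are not reproduced here; the following is only a minimal sketch of standard Boltzmann selection over estimated action values, assuming a list of Q-values per resource. The function names `boltzmann_probs` and `boltzmann_select` are illustrative, not from the paper.

```python
import math
import random

def boltzmann_probs(q_values, temperature):
    """Selection probabilities proportional to exp(Q / T).

    Subtracting the max Q-value before exponentiating avoids
    overflow without changing the resulting distribution.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def boltzmann_select(q_values, temperature):
    """Sample an action index from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return random.choices(range(len(q_values)), weights=probs)[0]
```

A low temperature makes selection near-greedy (exploitation), while a high temperature makes it near-uniform (exploration); this sensitivity is what makes T difficult to fix by hand and motivates adapting it during learning.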
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
Noda, I., Ohta, M. (2008). Meta-level Control of Multiagent Learning in Dynamic Repeated Resource Sharing Problems. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science(), vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_29
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0