Abstract
Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to dynamic multiagent environments. A typical example is the RoboCup competition, where other agents and their behaviors easily cause an explosion of the state and action spaces. This paper presents a method of hierarchical modular learning in a multiagent environment by which the learning agent can acquire behaviors that are cooperative with its teammates and competitive against its opponents. The key ideas are as follows. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the state and action spaces: the state space of the top layer consists of the state values from the lower layer, and macro actions are used to reduce the size of the action space. Second, the degree to which another agent is close to its own goal is estimated by observation and used as a state value in the top-layer state space, enabling cooperative/competitive behaviors. The method is applied to a 4 (defense team) on 5 (offense team) game task, and the learning agent successfully acquired teamwork plays (pass and shoot) within a much shorter learning time (30 times faster than the earlier work).
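The two-layer architecture described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the module names (`pass`, `shoot`), the discretization by rounding, and all hyperparameters are hypothetical. Each lower-level module exposes a state value for its macro action, and the top layer runs Q-learning over the vector of those values plus the observed estimate of the other agent's progress toward its goal.

```python
import random

class LowerModule:
    """A lower-layer learning module: maps an observation to a state value
    under this module's policy and corresponds to one macro action
    (e.g. 'pass' or 'shoot'). Names here are illustrative assumptions."""
    def __init__(self, name):
        self.name = name
        self.values = {}  # observation -> estimated state value

    def state_value(self, obs):
        return self.values.get(obs, 0.0)

class TopLayer:
    """Top-layer Q-learner. Its state is the (discretized) vector of
    lower-module state values, extended with the estimated state value of
    the other agent; its actions are the macro actions, one per module."""
    def __init__(self, modules, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.modules = modules
        self.q = {}  # (abstract_state, action_index) -> Q value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def abstract_state(self, obs, other_value):
        # Discretize each module's state value, then append the observed
        # estimate of how close the other agent is to its own goal.
        vals = tuple(round(m.state_value(obs), 1) for m in self.modules)
        return vals + (round(other_value, 1),)

    def select(self, state):
        # Epsilon-greedy choice among macro actions.
        if random.random() < self.epsilon:
            return random.randrange(len(self.modules))
        qs = [self.q.get((state, a), 0.0) for a in range(len(self.modules))]
        return qs.index(max(qs))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update on the abstract state space.
        best_next = max(self.q.get((next_state, a), 0.0)
                        for a in range(len(self.modules)))
        key = (state, action)
        self.q[key] = self.q.get(key, 0.0) + self.alpha * (
            reward + self.gamma * best_next - self.q.get(key, 0.0))

# Toy usage: two macro actions, one learning step.
random.seed(0)
mods = [LowerModule("pass"), LowerModule("shoot")]
top = TopLayer(mods)
s = top.abstract_state("obs1", other_value=0.5)
a = top.select(s)
top.update(s, a, reward=1.0, next_state=s)
```

Because the top layer sees only a handful of discretized state values instead of the raw joint observation of all nine players, its state space stays small regardless of how the lower modules are implemented, which is the dimensionality reduction the paper claims.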
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Noma, K., Takahashi, Y., Asada, M. (2008). Cooperative/Competitive Behavior Acquisition Based on State Value Estimation of Others. In: Visser, U., Ribeiro, F., Ohashi, T., Dellaert, F. (eds) RoboCup 2007: Robot Soccer World Cup XI. RoboCup 2007. Lecture Notes in Computer Science(), vol 5001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68847-1_9
DOI: https://doi.org/10.1007/978-3-540-68847-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68846-4
Online ISBN: 978-3-540-68847-1
eBook Packages: Computer Science (R0)