Abstract
We present team-partitioned, opaque-transition reinforcement learning (TPOT-RL), a novel multi-agent learning paradigm. TPOT-RL introduces action-dependent features to generalize the state space; in our work, a learned action-dependent feature space aids higher-level reinforcement learning. TPOT-RL is an effective technique for allowing a team of agents to learn to cooperate towards a specific goal. It adapts traditional RL methods to complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. TPOT-RL is fully implemented and has been tested in robotic soccer, a complex multi-agent domain. This paper presents the algorithmic details of TPOT-RL as well as empirical results demonstrating the effectiveness of this multi-agent learning approach with learned features.
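The abstract's two key ideas, a learned action-dependent feature that collapses a large state space, and value updates that do not rely on observing state transitions, can be illustrated with a minimal sketch. This is not the authors' implementation; the class, method names, and the toy feature function are illustrative assumptions. Because transitions are opaque, the update moves each (feature, action) estimate directly toward an observed long-term reward, with no successor-state bootstrapping term as in standard Q-learning.

```python
# Minimal sketch of a TPOT-RL-style agent (illustrative, not the
# authors' code). State generalization comes from an action-dependent
# feature e(s, a); value updates use only a delayed reward, since the
# agent cannot observe state transitions.

from collections import defaultdict
import random


class TPOTRLAgent:
    def __init__(self, actions, alpha=0.1, epsilon=0.1):
        self.actions = actions
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration rate
        # Values indexed by (feature value, action): the coarse
        # action-dependent feature stands in for the raw state.
        self.q = defaultdict(float)

    def feature(self, state, action):
        # Placeholder for the learned action-dependent feature e(s, a);
        # in the paper this mapping is itself learned. Here a coarse
        # hash into a few feature values stands in for it.
        return hash((state, action)) % 3

    def select_action(self, state):
        # Epsilon-greedy choice over the feature-indexed values.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions,
                   key=lambda a: self.q[(self.feature(state, a), a)])

    def update(self, state, action, reward):
        # Opaque-transition update: move the estimate toward the
        # observed long-term reward; no successor-state term.
        key = (self.feature(state, action), action)
        self.q[key] += self.alpha * (reward - self.q[key])
```

As a usage sketch: repeatedly rewarding one action in a fixed state drives its estimate toward the observed reward, after which the greedy policy prefers that action.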
This research is sponsored in part by the DARPA/RL Knowledge Based Planning and Scheduling Initiative under grant number F30602-97-2-0250. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies or endorsements, either expressed or implied, of the U. S. Government.
© 1999 Springer-Verlag Berlin Heidelberg
Stone, P., Veloso, M. (1999). Team-Partitioned, Opaque-Transition Reinforcement Learning. In: Asada, M., Kitano, H. (eds) RoboCup-98: Robot Soccer World Cup II. RoboCup 1998. Lecture Notes in Computer Science, vol 1604. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48422-1_21