Abstract
Relational reinforcement learning is a promising direction within reinforcement learning research. It upgrades reinforcement learning techniques with relational representations of states, actions, and learned value functions or policies, allowing natural representations and abstractions of complex tasks. Multi-agent systems, characterized by their relational structure, are a good example of such complex tasks. In this article, we show how relational reinforcement learning can be a useful tool for learning in multi-agent systems. We study this approach in more detail for one important aspect of multi-agent systems, namely learning a communication policy for cooperative systems (e.g., resource distribution). Communication between agents in realistic multi-agent systems must be assumed costly, limited, and unreliable. We present a number of experiments that highlight the conditions under which relational representations are beneficial given these constraints.
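To make the setting concrete, the following is a minimal sketch (not the authors' implementation) of how an agent might learn with whom to communicate under a communication cost. States are sets of ground relational facts rather than flat indices, and actions are requests to individual neighbours; the predicates, cost, and reward values are invented for illustration, and the tabular learner below omits the generalization over relational facts (e.g., via first-order decision trees) that relational reinforcement learning provides.

```python
import random
from collections import defaultdict

random.seed(0)
NEIGHBOURS = ["a1", "a2", "a3"]
COMM_COST = 0.1          # every message sent is assumed costly
ALPHA, EPSILON = 0.5, 0.1

Q = defaultdict(float)   # Q[(state, action)], default 0.0

def relational_state(holder):
    # State as a frozenset of ground relational facts, not a flat index.
    return frozenset({("neighbour", n) for n in NEIGHBOURS}
                     | {("holds", holder, "resource")})

def reward(holder, action):
    # Asking the neighbour that holds the resource succeeds (+1);
    # the communication cost is paid either way.
    return (1.0 if action == ("ask", holder) else 0.0) - COMM_COST

actions = [("ask", n) for n in NEIGHBOURS]
for _ in range(500):
    holder = random.choice(NEIGHBOURS)       # resource moves between episodes
    s = relational_state(holder)
    if random.random() < EPSILON:            # epsilon-greedy exploration
        a = random.choice(actions)
    else:
        a = max(actions, key=lambda x: Q[(s, x)])
    r = reward(holder, a)
    Q[(s, a)] += ALPHA * (r - Q[(s, a)])     # one-step task, no bootstrap

# After training, the greedy policy asks whoever holds the resource.
s = relational_state("a2")
best = max(actions, key=lambda a: Q[(s, a)])
```

Because the state is a set of facts, a relational learner could in principle induce the rule "ask X if holds(X, resource)" once and apply it to any neighbour, which is the kind of abstraction the article argues for.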
© 2010 Springer-Verlag Berlin Heidelberg
Cite this chapter
Ponsen, M. et al. (2010). Learning with Whom to Communicate Using Relational Reinforcement Learning. In: Babuška, R., Groen, F.C.A. (eds) Interactive Collaborative Information Systems. Studies in Computational Intelligence, vol 281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11688-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11687-2
Online ISBN: 978-3-642-11688-9