
Solving Sparse Delayed Coordination Problems in Multi-Agent Reinforcement Learning

  • Conference paper
Adaptive and Learning Agents (ALA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7113)


Abstract

One of the main advantages of reinforcement learning is its ability to deal with a delayed reward signal. Using an appropriate backup diagram, rewards are propagated back through the state space. This allows agents to learn to take the action that yields the highest future (discounted) reward, even if that action results in a suboptimal immediate reward in the current state. In a multi-agent environment, agents can apply the same principles as in single-agent RL, but must do so in the complete joint-state-joint-action space to guarantee optimality. Learning in such a state space can, however, be very slow. In this paper we present our approach for mitigating this problem. Future Coordinating Q-learning (FCQ-learning) detects strategic interactions between agents several timesteps before these interactions occur. FCQ-learning uses the same principles as CQ-learning [3] to detect the states in which interaction is required, but several timesteps before this is reflected in the reward signal. In these states, the algorithm augments the state information with information about other agents, which is then used to select actions. The techniques presented in this paper are the first to explicitly deal with a delayed reward signal when learning with sparse interactions.
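The core idea in the abstract, learning over a local state by default and switching to an augmented state that includes other-agent information only in the few states where coordination matters, can be sketched as below. This is not the FCQ-learning algorithm itself: the paper's detection step, which identifies coordination states from the reward signal several timesteps ahead (building on CQ-learning [3]), is not reproduced here, and the class, its parameters, and the assumption that the coordination states are given up front are illustrative assumptions only.

```python
import random
from collections import defaultdict


class SparseInteractionAgent:
    """Q-learning agent that uses its local state by default and an
    augmented (local + other-agent) state only in flagged coordination states.

    Illustrative sketch: in FCQ-learning the coordination states are detected
    automatically from the reward signal; here they are supplied directly.
    """

    def __init__(self, actions, coordination_states,
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = actions
        self.coordination_states = coordination_states  # assumed known here
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_local = defaultdict(float)      # keyed on (local_state, action)
        self.q_augmented = defaultdict(float)  # keyed on ((local, other), action)

    def _key(self, local_state, other_state):
        # Augment the state only where coordination is (assumed to be) needed.
        if local_state in self.coordination_states:
            return (local_state, other_state), self.q_augmented
        return local_state, self.q_local

    def select_action(self, local_state, other_state):
        key, table = self._key(local_state, other_state)
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: table[(key, a)])

    def update(self, local_state, other_state, action, reward,
               next_local_state, next_other_state):
        key, table = self._key(local_state, other_state)
        next_key, next_table = self._key(next_local_state, next_other_state)
        best_next = max(next_table[(next_key, a)] for a in self.actions)
        # Standard one-step Q-learning backup; the (possibly delayed) reward
        # is propagated back through whichever representation is in use.
        td_target = reward + self.gamma * best_next
        table[(key, action)] += self.alpha * (td_target - table[(key, action)])
```

The point of such a sparse representation is that the augmented table only grows for the handful of states in which another agent's situation actually changes the optimal action; everywhere else each agent learns in its own, much smaller, local state space.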


References

  1. Boutilier, C.: Planning, learning and coordination in multiagent decision processes. In: Proceedings of the 6th Conference on Theoretical Aspects of Rationality and Knowledge, Renesse, Holland, pp. 195–210 (1996)

  2. Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the 15th National Conference on Artificial Intelligence, pp. 746–752. AAAI Press (1998)

  3. De Hauwere, Y.-M., Vrancx, P., Nowé, A.: Learning multi-agent state space representations. In: Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Toronto, Canada, pp. 715–722 (2010)

  4. De Hauwere, Y.-M., Vrancx, P., Nowé, A.: Adaptive state representations for multi-agent reinforcement learning. In: Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, Rome, Italy, pp. 181–189 (2011)

  5. Greenwald, A., Hall, K.: Correlated-Q learning. In: AAAI Spring Symposium, pp. 242–249. AAAI Press (2003)

  6. Hu, J., Wellman, M.: Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4, 1039–1069 (2003)

  7. Kok, J., 't Hoen, P., Bakker, B., Vlassis, N.: Utile coordination: Learning interdependencies among cooperative agents. In: Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG), pp. 29–36 (2005)

  8. Kok, J., Vlassis, N.: Sparse cooperative Q-learning. In: Proceedings of the 21st International Conference on Machine Learning (ICML). ACM, New York (2004)

  9. Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the 11th International Conference on Machine Learning (ICML), pp. 157–163. Morgan Kaufmann (1994)

  10. Melo, F.S., Veloso, M.: Learning of coordination: Exploiting sparse interactions in multiagent systems. In: Proceedings of the 8th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), pp. 773–780. International Foundation for Autonomous Agents and Multiagent Systems (2009)

  11. Melo, F., Veloso, M.: Local multiagent coordination in decentralized MDPs with sparse interactions. Tech. Rep. CMU-CS-10-133, School of Computer Science, Carnegie Mellon University (2010)

  12. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)

  13. Tsitsiklis, J.: Asynchronous stochastic approximation and Q-learning. Machine Learning 16(3), 185–202 (1994)

  14. Vrancx, P., Verbeeck, K., Nowé, A.: Decentralized learning in Markov games. IEEE Transactions on Systems, Man and Cybernetics (Part B: Cybernetics) 38(4), 976–981 (2008)

  15. Watkins, C.: Learning from Delayed Rewards. Ph.D. thesis, University of Cambridge (1989)



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Hauwere, Y.-M., Vrancx, P., Nowé, A. (2012). Solving Sparse Delayed Coordination Problems in Multi-Agent Reinforcement Learning. In: Vrancx, P., Knudson, M., Grześ, M. (eds) Adaptive and Learning Agents. ALA 2011. Lecture Notes in Computer Science, vol. 7113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28499-1_8


  • DOI: https://doi.org/10.1007/978-3-642-28499-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28498-4

  • Online ISBN: 978-3-642-28499-1

