Dynamic Partition of Collaborative Multiagent Based on Coordination Trees

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 194)

Abstract

In research on team Markov games, it is difficult for an individual agent to calculate the reward of collaborating agents dynamically. We present a coordination tree structure whose nodes are individual agents or agent subsets. Two kinds of weights are defined on the tree, describing the cost of an agent collaborating with an agent subset. Using these coordination trees, we can compute a collaborative agent subset and its minimal collaboration cost. Experiments on a Markov game show that this method outperforms related multi-agent reinforcement-learning methods based on alterable collaborative teams.
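The abstract describes the data structure only at a high level. A minimal Python sketch of one plausible reading — a tree whose nodes hold an agent or an agent subset, with edge weights standing in for collaboration costs, searched for the cheapest team of a required size — might look as follows. All names (`CoordNode`, `min_cost_team`) and the team-size parameter `k` are illustrative assumptions, not the paper's definitions.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple

@dataclass
class CoordNode:
    agents: FrozenSet[str]                    # an individual agent or an agent subset
    weight: float = 0.0                       # assumed: cost of these agents joining the team above
    children: List["CoordNode"] = field(default_factory=list)

def min_cost_team(node: CoordNode, k: int,
                  team: FrozenSet[str] = frozenset(),
                  cost: float = 0.0) -> Optional[Tuple[FrozenSet[str], float]]:
    """Cheapest team of at least k agents built along one root-to-node path.

    Accumulates agents and weights down the tree; returns the qualifying
    team with minimal total cost, or None if no path reaches k agents.
    """
    team = team | node.agents
    cost += node.weight
    best = (team, cost) if len(team) >= k else None
    for child in node.children:
        cand = min_cost_team(child, k, team, cost)
        if cand is not None and (best is None or cand[1] < best[1]):
            best = cand
    return best
```

For example, with a root agent `a` whose children `b` (weight 2, with child `c`, weight 1) and `d` (weight 4) represent candidate partners, asking for a team of two selects `{a, b}` at cost 2, since absorbing `d` would cost more.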

Keywords

reinforcement learning · multi-agent · coordination tree · Markov games


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Institute of Computer Science, Xidian University, Xi'an, China
  2. Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands
