Coordination with Collective and Individual Decisions

  • Paulo Trigo
  • Anders Jonsson
  • Helder Coelho
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4140)


The response to a large-scale disaster, e.g. an earthquake or a terrorist incident, calls for low-cost policies that coordinate the sequential decisions of multiple agents. Decisions range from collective (common good) to individual (self-interested) perspectives, which intuitively suggests a two-layer decision model. However, current decision-theoretic models are either purely collective or purely individual, and they seek optimal policies. We present a two-layer, collective versus individual (CvI) decision model and explore the tradeoff between cost reduction and loss of optimality while learning coordination skills. Experiments in a partially observable domain test our approach to learning a collective policy, and the results show near-optimal policies that exhibit coordinated behavior.


Multiagent System; Markov Decision Process; Coordination Policy; Individual Policy; Temporal Abstraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Paulo Trigo (1)
  • Anders Jonsson (2)
  • Helder Coelho (3)
  1. Dep. Eng. Elect. Telec. e Comp., Instituto Superior de Engenharia de Lisboa, Portugal
  2. Departamento de Tecnología, Universidad Pompeu Fabra, Barcelona, Spain
  3. Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Portugal
