Multiagent Planning with Trembling-Hand Perfect Equilibrium in Multiagent POMDPs

  • Yuichi Yabu
  • Makoto Yokoo
  • Atsushi Iwasaki
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5044)


Multiagent Partially Observable Markov Decision Processes (POMDPs) are a popular model of multiagent systems under uncertainty. Because computing an optimal joint policy is prohibitively expensive, Joint Equilibrium-based Search for Policies with Nash Equilibrium (JESP-NE) has been proposed. It finds a locally optimal joint policy in which each agent's policy is a best response to the policies of the other agents; i.e., the joint policy is a Nash equilibrium.
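The idea of reaching a Nash equilibrium by letting each agent best-respond in turn can be illustrated on a much smaller problem than a multiagent POMDP. The sketch below is not JESP-NE itself: it assumes a hypothetical two-agent, common-payoff matrix game, and the names `alternating_best_response` and `PAYOFF` are made up for illustration. It shows that the equilibrium reached is only locally optimal and depends on the starting point.

```python
from typing import List, Tuple

# Hypothetical common payoff: both agents receive PAYOFF[row][col].
PAYOFF = [
    [5.0, 0.0],
    [0.0, 10.0],
]


def alternating_best_response(
    payoff: List[List[float]], start: Tuple[int, int]
) -> Tuple[int, int]:
    """Alternate best responses until neither agent wants to deviate.

    Ties are broken in favour of the current action, so every change
    strictly increases the shared payoff and the loop terminates.
    """
    row, col = start
    n_rows, n_cols = len(payoff), len(payoff[0])
    while True:
        # Agent 1 best-responds to agent 2's fixed action.
        new_row = max(range(n_rows), key=lambda r: (payoff[r][col], r == row))
        # Agent 2 best-responds to agent 1's updated action.
        new_col = max(range(n_cols), key=lambda c: (payoff[new_row][c], c == col))
        if (new_row, new_col) == (row, col):
            return row, col  # mutual best responses: a Nash equilibrium
        row, col = new_row, new_col


if __name__ == "__main__":
    for start in [(0, 0), (0, 1), (1, 1)]:
        eq = alternating_best_response(PAYOFF, start)
        print(start, "->", eq, "payoff", PAYOFF[eq[0]][eq[1]])
```

Starting from `(0, 0)` the search stays at the equilibrium with payoff 5, even though the equilibrium at `(1, 1)` yields payoff 10; this is the sense in which the result is locally rather than globally optimal.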

One limitation of JESP-NE is that the quality of the obtained joint policy depends on a predefined default policy. More specifically, when computing a best response, JESP-NE falls back on this default policy for any observation that has zero probability. If the default policy is poor, JESP-NE tends to converge to a sub-optimal joint policy.

In this paper, we propose a method, JESP-TPE, that finds a locally optimal joint policy based on a concept called Trembling-hand Perfect Equilibrium (TPE). In finding a TPE, we assume that an agent might make a mistake in selecting its action with small probability. Thus, an observation that has zero probability in JESP-NE has non-zero probability, and the default policy is no longer needed. As a result, JESP-TPE can converge to a better joint policy than JESP-NE, which we confirm by experimental evaluation.
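The effect of a small mistake probability on observation probabilities can be shown with a toy calculation. The sketch below is an illustration only, not the perturbation scheme used in the paper: the action names, the observation model `OBS_MODEL`, and the rule of spreading the mistake probability `eps` uniformly over the non-chosen actions are all assumptions.

```python
from typing import Dict, List

ACTIONS: List[str] = ["listen", "open"]

# Hypothetical model: probability of the other agent's observation
# given this agent's action.
OBS_MODEL: Dict[str, Dict[str, float]] = {
    "listen": {"quiet": 1.0, "noise": 0.0},
    "open": {"quiet": 0.0, "noise": 1.0},
}


def trembling_policy(chosen: str, actions: List[str], eps: float) -> Dict[str, float]:
    """Play `chosen` with probability 1 - eps; otherwise slip to another
    action, chosen uniformly (an assumed tremble model)."""
    others = [a for a in actions if a != chosen]
    probs = {a: eps / len(others) for a in others}
    probs[chosen] = 1.0 - eps
    return probs


def observation_probs(
    action_probs: Dict[str, float], obs_model: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Marginalise the observation model over the action distribution."""
    out: Dict[str, float] = {}
    for action, p_action in action_probs.items():
        for obs, p_obs in obs_model[action].items():
            out[obs] = out.get(obs, 0.0) + p_action * p_obs
    return out


if __name__ == "__main__":
    for eps in (0.0, 0.01):
        policy = trembling_policy("listen", ACTIONS, eps)
        print(eps, observation_probs(policy, OBS_MODEL))
```

With `eps = 0.0` the observation `noise` has probability zero, so a best response is undefined there and a default policy would be needed. With any `eps > 0` every observation is reachable, so the best response is defined for every observation.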


Multiagent systems; Partially Observable Markov Decision Process; Nash equilibrium; Trembling-hand perfect equilibrium





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Yuichi Yabu (1)
  • Makoto Yokoo (1)
  • Atsushi Iwasaki (1)
  1. Graduate School of ISEE, Kyushu University, Fukuoka, Japan
