Partial Local FriendQ Multiagent Learning: Application to Team Automobile Coordination Problem

  • Julien Laumonier
  • Brahim Chaib-draa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4013)


Real-world multiagent coordination problems are an important challenge for reinforcement learning techniques. In general, these problems are partially observable, and this characteristic makes computing a solution intractable. Most existing approaches compute exact or approximate solutions using a world model for only one agent. To handle a special case of partial observability, this article presents an approach that approximates the policy by measuring a degree of observability for a purely cooperative vehicle coordination problem. We compare empirically the performance of the policy learned for the totally observable problem with the performance of policies learned for different degrees of observability. If each degree of observability is associated with a communication cost, multiagent system designers can choose a compromise between the performance of the policy and the cost of obtaining the associated degree of observability. Finally, we show how the free space surrounding an agent influences the degree of observability required for a near-optimal solution.
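The approach builds on Friend-Q learning, in which cooperating agents perform Q-learning over joint actions and back up the best joint action, assuming all teammates act as "friends". As a rough illustration only (not the authors' implementation), the following minimal Python sketch shows such a Friend-Q backup for a team game; the `env`-style inputs, the epsilon-greedy helper, and the idea of restricting `state` to a local view are hypothetical placeholders.

```python
import random
from collections import defaultdict

# Minimal sketch of a Friend-Q backup for a fully cooperative (team) game.
# Hypothetical placeholders: the state encoding, the set of joint actions,
# and the learning parameters below are illustrative, not from the paper.

ALPHA = 0.1   # learning rate
GAMMA = 0.9   # discount factor

# Q-values indexed by (state, joint_action); joint_action is a tuple with
# one action per agent.
Q = defaultdict(float)

def friend_q_update(state, joint_action, reward, next_state, joint_actions):
    """One Friend-Q backup: the 'friend' operator maximizes over joint
    actions, assuming all teammates cooperate toward the same reward."""
    best_next = max(Q[(next_state, a)] for a in joint_actions)
    td_target = reward + GAMMA * best_next
    Q[(state, joint_action)] += ALPHA * (td_target - Q[(state, joint_action)])

def greedy_joint_action(state, joint_actions, epsilon=0.1):
    """Epsilon-greedy selection over joint actions."""
    if random.random() < epsilon:
        return random.choice(joint_actions)
    return max(joint_actions, key=lambda a: Q[(state, a)])
```

In the partial local setting studied here, one would restrict `state` to the portion of the environment each agent can observe within a given radius, so that a larger degree of observability enlarges the state description (and the communication needed to build it) in exchange for a better policy.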


Keywords: Nash Equilibrium · Joint Action · Multiagent System · Stochastic Game · Partial Observability





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Julien Laumonier (1)
  • Brahim Chaib-draa (1)
  1. DAMAS Laboratory, Department of Computer Science and Software Engineering, Laval University, Canada
