Autonomous Agents and Multi-Agent Systems, Volume 27, Issue 3, pp 419–443

Coordinating actions in congestion games: impact of top–down and bottom–up utilities

  • Kagan Tumer
  • Scott Proper

Congestion games offer a perfect environment in which to study the impact of local decisions on global utilities in multiagent systems. What is particularly interesting in such problems is that no individual action is intrinsically "good" or "bad"; rather, combinations of actions lead to desirable or undesirable outcomes. As a consequence, agents need to learn how to coordinate their actions with those of other agents, rather than learn a particular set of "good" actions. A congestion game can be studied from two different perspectives: (i) from the top down, where a global utility (e.g., a system-centric view of congestion) specifies the task to be achieved; or (ii) from the bottom up, where each agent has its own intrinsic utility it wants to maximize. In many cases, these two approaches are at odds with one another: agents aiming to maximize their intrinsic utilities lead to poor values of a system-level utility. In this paper we extend results on difference utilities, a form of shaped utility that enables multiagent learning in congested, noisy conditions, to study the global behavior that arises from the agents' choices in two types of congestion games. Our key result is that agents that aim to maximize a modified version of their own intrinsic utilities not only perform well in terms of the global utility, but also, on average, perform better with respect to their own original utilities. In addition, we show that difference utilities are robust to agents "defecting" and using their own intrinsic utilities, and that performance degrades gracefully with the number of defectors.
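The shaped utilities discussed above can be illustrated with a minimal sketch. The standard form of a difference utility assigns agent i the quantity D_i = G(z) − G(z_{−i}), i.e., its marginal contribution to the global utility G. The toy congestion game below (a hypothetical quadratic route cost, not the paper's exact experimental setup) shows how D_i penalizes an agent on a crowded route far more than one on an empty route:

```python
from collections import Counter

def route_cost(n_users):
    # Hypothetical congestion cost per user: grows quadratically
    # with the number of agents sharing the route.
    return n_users ** 2

def global_utility(actions):
    # G(z): negative total congestion cost. Each of the n users of a
    # route pays route_cost(n), so a route with n users contributes
    # -n * route_cost(n).
    counts = Counter(actions)
    return -sum(n * route_cost(n) for n in counts.values())

def difference_utility(actions, i):
    # D_i = G(z) - G(z_{-i}): recompute G with agent i's action removed.
    without_i = actions[:i] + actions[i + 1:]
    return global_utility(actions) - global_utility(without_i)

# Three agents pile onto route A; one takes route B.
actions = ["A", "A", "A", "B"]
print(global_utility(actions))         # -(3*9 + 1*1) = -28
print(difference_utility(actions, 0))  # -28 - (-9)  = -19 (crowded route)
print(difference_utility(actions, 3))  # -28 - (-27) = -1  (empty route)
```

An agent maximizing D_i is therefore pushed toward less congested routes, and because D_i is aligned with G by construction, those local improvements also raise the global utility.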


Keywords: Multiagent systems · Reinforcement learning · Coordination · Congestion games





Copyright information

© The Author(s) 2012

Authors and Affiliations

  1. Oregon State University, Corvallis, USA
