
An Overview of Cooperative and Competitive Multiagent Learning

  • Pieter Jan ’t Hoen
  • Karl Tuyls
  • Liviu Panait
  • Sean Luke
  • J. A. La Poutré
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3898)

Abstract

Multi-agent systems (MASs) form an area of distributed artificial intelligence that emphasizes the joint behavior of agents with some degree of autonomy and the complexities arising from their interactions. Research on MASs is intensifying, as evidenced by a growing number of conferences, workshops, and journal papers. In this survey we give an overview of multi-agent learning research across a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics.

MASs range from cooperative to competitive in nature. To muddy the waters, competitive systems can exhibit apparently cooperative behavior, and vice versa. In practice, agents in a system can show a wide range of behaviors that may fit either the cooperative or the competitive label, depending on the circumstances. In this survey, we discuss current work on cooperative and competitive MASs and aim to make the distinctions and overlap between the two approaches more explicit.
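To make this ambiguity concrete, the following minimal sketch (our own illustration, not taken from the paper or from the works it surveys) pits two independent, stateless Q-learners against each other in a repeated 2x2 matrix game with Prisoner's Dilemma payoffs. The class names, payoff values, and parameter settings are illustrative assumptions only.

```python
"""Illustrative sketch: two independent Q-learners in a repeated 2x2 matrix
game (Prisoner's Dilemma payoffs). Depending on exploration and learning
parameters, the joint behavior can look cooperative or competitive."""
import random

# Row player's payoffs; the column player's payoffs are the transpose.
# Actions: 0 = cooperate, 1 = defect.
PAYOFF_ROW = [[3, 0],
              [5, 1]]

def payoffs(a_row, a_col):
    """Return (row reward, column reward) for a joint action."""
    return PAYOFF_ROW[a_row][a_col], PAYOFF_ROW[a_col][a_row]

class QLearner:
    def __init__(self, alpha=0.1, epsilon=0.1):
        self.q = [0.0, 0.0]     # one Q-value per action (stateless game)
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(2)          # explore
        return max(range(2), key=lambda a: self.q[a])  # exploit

    def update(self, action, reward):
        self.q[action] += self.alpha * (reward - self.q[action])

row, col = QLearner(), QLearner()
for _ in range(10000):
    a_r, a_c = row.act(), col.act()
    r_r, r_c = payoffs(a_r, a_c)
    row.update(a_r, r_r)
    col.update(a_c, r_c)

print("row Q-values:", row.q, "col Q-values:", col.q)
```

With these payoffs both learners typically drift toward mutual defection, the competitive reading of an otherwise symmetric game; changing the payoff matrix or the exploration schedule can make the same algorithm appear cooperative, which is precisely the overlap the survey discusses.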

Lastly, this paper summarizes the papers of the First International Workshop on Learning and Adaptation in Multi-Agent Systems (LAMAS 2005), hosted at the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2005), and places that work within the above survey.

Keywords

Nash Equilibrium · Multiagent System · Matrix Game · Market Game · Column Player



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Pieter Jan ’t Hoen (1)
  • Karl Tuyls (2)
  • Liviu Panait (3)
  • Sean Luke (3)
  • J. A. La Poutré (1, 4)

  1. Center for Mathematics and Computer Science (CWI), Amsterdam, The Netherlands
  2. Computer Science Department (IKAT), University of Maastricht, The Netherlands
  3. George Mason University, Fairfax, USA
  4. TU Eindhoven, Eindhoven, The Netherlands
