Distributed Learning of Best Response Behaviors in Concurrent Iterated Many-Object Negotiations

  • Jan Ole Berndt
  • Otthein Herzog
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7598)


Iterated negotiations are a well-established method for coordinating distributed activities in multiagent systems. However, if several such negotiations take place concurrently, the participants’ activities can mutually influence each other. In order to cope with the problem of interrelated interaction outcomes in partially observable environments, we apply distributed reinforcement learning to concurrent many-object negotiations. To this end, we discuss iterated negotiations from the perspective of repeated games, specify the agents’ learning behavior, and introduce decentralized decision-making criteria for terminating a negotiation. Furthermore, we empirically evaluate the approach in a multiagent resource allocation scenario. The results show that our method enables the agents to successfully learn mutual best-response behaviors which approximate Nash equilibrium allocations. Additionally, the learning constrains the interaction effort required to attain these results.
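The abstract does not spell out the learning rule; the following is a minimal, hypothetical sketch of the general idea of distributed reinforcement learning in a repeated negotiation. It assumes independent stateless Q-learners bidding for a single resource with illustrative payoffs and parameters; the paper's actual action space, reward model, and termination criteria are not reproduced here.

```python
import random


class NegotiatingAgent:
    """An independent learner over a discrete set of bids.

    This is a hypothetical model: each agent keeps one Q-value per bid
    and updates it toward the observed payoff (stateless Q-learning).
    """

    def __init__(self, actions, alpha=0.1, epsilon=0.2):
        self.actions = actions      # discrete bids the agent may submit
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration probability
        self.q = {a: 0.0 for a in actions}

    def choose(self):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[a])

    def update(self, action, reward):
        # Q(a) <- Q(a) + alpha * (r - Q(a))
        self.q[action] += self.alpha * (reward - self.q[action])


def simulate(rounds=5000, value=10):
    """Two agents repeatedly bid for one resource worth `value`.

    The strictly higher bid wins and pays its bid (payoff = value - bid);
    the loser, and both agents on a tie, receive 0. Each agent learns a
    best response to the other's evolving bidding behavior.
    """
    random.seed(0)
    bids = list(range(value + 1))
    a1, a2 = NegotiatingAgent(bids), NegotiatingAgent(bids)
    for _ in range(rounds):
        b1, b2 = a1.choose(), a2.choose()
        r1 = value - b1 if b1 > b2 else 0
        r2 = value - b2 if b2 > b1 else 0
        a1.update(b1, r1)
        a2.update(b2, r2)
    return a1, a2
```

Because each agent observes only its own bid and payoff, this is the "independent learners" setting discussed in the multiagent reinforcement learning literature; the mutual adaptation of the two agents is what drives the bids toward a best-response profile.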


Keywords: Nash Equilibrium · Multiagent System · Combinatorial Auction · Acceptance Level · Learning Agent




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Jan Ole Berndt¹
  • Otthein Herzog¹
  1. Center for Computing and Communication Technologies (TZI), Universität Bremen, Germany
