Abstract
During multi-agent interactions, agents need robust strategies to coordinate their actions on efficient outcomes. A large body of previous work focuses on designing strategies that converge to a Nash equilibrium under self-play, which can be extremely inefficient in many situations. Apart from performing well under self-play, however, a good strategy should also respond well against opponents adopting different strategies. In this paper, we consider the particular class of opponents whose strategies are based on a best-response policy, and we target the goal of social optimality. We propose a novel learning strategy, TaFSO, which exploits the characteristics of best-response learners to effectively steer the opponent's behavior towards socially optimal outcomes. Extensive simulations show that TaFSO achieves better performance than previous work both under self-play and against the class of best-response learners.
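The paper itself contains no code; as a minimal illustrative sketch only (not the TaFSO algorithm), the following shows the setting the abstract describes: a best-response opponent, here modeled as a Q-learner in the repeated Prisoner's Dilemma, whose behavior a committed reciprocal "teaching" strategy can steer toward the socially optimal outcome of mutual cooperation. The class names, payoff values, and parameters are assumptions chosen for illustration.

```python
import random

# Prisoner's Dilemma payoffs from the learner's point of view:
# LEARNER_PAYOFF[(teacher_action, learner_action)], with 0 = cooperate, 1 = defect.
# Mutual cooperation (3 each) maximizes the sum of payoffs, i.e. it is the
# socially optimal outcome; mutual defection yields only 1 each.
LEARNER_PAYOFF = {(0, 0): 3, (0, 1): 5, (1, 0): 0, (1, 1): 1}


class BestResponseLearner:
    """A best-response opponent: epsilon-greedy Q-learning whose state is its
    own previous action. Because the teacher reciprocates the learner's last
    move, that state also determines the teacher's current action, so the
    learner faces a small, fully deterministic MDP."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice((0, 1))          # occasional exploration
        return max((0, 1), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in (0, 1))
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


random.seed(0)
learner = BestResponseLearner()
state = 0  # start from mutual cooperation
for _ in range(20000):
    action = learner.act(state)
    teacher_action = state   # the teacher reciprocates the learner's last action
    reward = LEARNER_PAYOFF[(teacher_action, action)]
    next_state = action      # the learner's move becomes the next state
    learner.update(state, action, reward, next_state)
    state = next_state

# With a sufficiently patient learner (gamma close to 1), the immediate payoff
# of 5 for defecting is outweighed by the teacher's retaliation, so cooperation
# should end up with the higher Q-value in both states.
for s in (0, 1):
    print(s, round(learner.q[(s, 0)], 2), round(learner.q[(s, 1)], 2))
```

Under these assumptions, the teacher's commitment to reciprocate makes cooperation the learner's best response, so the opponent's own best-response dynamics carry the pair to the socially optimal outcome; this illustrates the general mechanism the abstract attributes to TaFSO, not its concrete strategy.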
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hao, J., Leung, H.F. (2012). Learning to Achieve Socially Optimal Solutions in General-Sum Games. In: Anthony, P., Ishizuka, M., Lukose, D. (eds.) PRICAI 2012: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol. 7458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32695-0_10
DOI: https://doi.org/10.1007/978-3-642-32695-0_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32694-3
Online ISBN: 978-3-642-32695-0