Strategy inference using the maximum likelihood estimation in the iterated prisoner’s dilemma game

Kim, Minjae

doi:10.1007/s40042-023-00954-z

Original Paper - General, Mathematical and Statistical Physics
Published: 29 November 2023

Journal of the Korean Physical Society, volume 84, pages 102–107 (2024)

Minjae Kim^1,2

An Erratum to this article was published on 25 January 2024

This article has been updated

Abstract

The iterated prisoner’s dilemma game has shown how cooperation evolves through direct reciprocity. In the course of iterated interactions, a player can observe the co-player’s past actions, infer his or her strategy, and react to it. Of these three processes (observation, inference, and reaction), the fidelity of inference is relatively unexplored. In this work, we explicitly construct an inference process between observation and reaction. Specifically, the focal player infers the co-player’s strategy by applying maximum likelihood estimation to the observed sequence of actions. Our first finding is that, when both players take only their last actions into consideration, the focal player’s inference is accurate provided that the observed sequence is sufficiently long. To examine a case in which inference must be inaccurate, we also set the focal player’s memory length to be shorter than the co-player’s. In this case, we choose a combination of Tit-for-tat and Anti-tit-for-tat (TA) as the co-player’s long-memory strategy. TA satisfies the following three conditions: (1) mutual cooperation is achieved when all players use the strategy, (2) the strategy exploits unconditional cooperation, and (3) a player using this strategy is not exploited repeatedly by any co-player. The short-memory player inaccurately infers TA as Tit-for-tat, Win–Stay–Lose–Shift, or Grim trigger, depending on his or her own strategy. This work shows how a long-memory strategy is projected onto a short-memory one by inference with information loss. In addition, we suggest that each of those three well-known strategies could be a facet of a single successful strategy with a higher cognitive capacity.
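The inference step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the state encoding (co-player's previous action, focal player's previous action), the function names `simulate` and `mle_memory_one`, the error rate, and the randomly acting focal player are all assumptions made for illustration. For a memory-one co-player who cooperates with probability p_s after state s, the likelihood of an observed sequence factorizes over states, so the maximum likelihood estimate of p_s is the fraction of visits to s that were followed by cooperation.

```python
import random
from collections import Counter

# Assumed encoding: a state is (co-player's previous action,
# focal player's previous action); "C" = cooperate, "D" = defect.
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]


def simulate(co_strategy, focal_strategy, rounds, seed=0):
    """Play `rounds` rounds between two memory-one players.

    Each strategy maps (own previous action, other's previous action)
    to a cooperation probability. Returns a list of
    (co-player action, focal action) pairs, starting from mutual cooperation.
    """
    rng = random.Random(seed)
    co, focal = "C", "C"
    history = [(co, focal)]
    for _ in range(rounds):
        next_co = "C" if rng.random() < co_strategy[(co, focal)] else "D"
        next_focal = "C" if rng.random() < focal_strategy[(focal, co)] else "D"
        co, focal = next_co, next_focal
        history.append((co, focal))
    return history


def mle_memory_one(history):
    """Maximum likelihood estimate of the co-player's memory-one strategy.

    For each state s, the estimate is (number of times the co-player
    cooperated right after s) / (number of visits to s).
    States that were never visited are reported as None.
    """
    visits, coops = Counter(), Counter()
    for prev, curr in zip(history, history[1:]):
        visits[prev] += 1
        if curr[0] == "C":
            coops[prev] += 1
    return {s: (coops[s] / visits[s] if visits[s] else None) for s in STATES}


# Hypothetical example: a Tit-for-tat co-player with a 5% implementation
# error, observed by a focal player who acts at random so that every
# state is visited.
EPS = 0.05
noisy_tft = {("C", "C"): 1 - EPS, ("C", "D"): EPS,
             ("D", "C"): 1 - EPS, ("D", "D"): EPS}
random_player = {s: 0.5 for s in STATES}

observed = simulate(noisy_tft, random_player, rounds=20000, seed=1)
estimate = mle_memory_one(observed)
for state in STATES:
    print(state, estimate[state])
```

With a long observed sequence the estimates should approach the true cooperation probabilities, which is the memory-one case the abstract reports as accurate. Applying the same memory-one estimator to a co-player with longer memory would force a projection onto four probabilities, which is the kind of information loss the abstract refers to.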

Change history

24 January 2024
Affiliation 2 has been corrected.
25 January 2024
An Erratum to this paper has been published: https://doi.org/10.1007/s40042-024-01015-9


Acknowledgements

We thank Seung Ki Baek for discussions. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2020R1I1A2071670) and the Ministry of Science and ICT (NRF-2019R1A2C2089463).

Author information

Authors and Affiliations

Department of Physics, Pukyong National University, Busan, 48513, Korea
Minjae Kim
Asia Pacific Center for Theoretical Physics, Pohang, 37673, Korea
Minjae Kim

Corresponding author

Correspondence to Minjae Kim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, M. Strategy inference using the maximum likelihood estimation in the iterated prisoner’s dilemma game. J. Korean Phys. Soc. 84, 102–107 (2024). https://doi.org/10.1007/s40042-023-00954-z

Received: 08 May 2023
Revised: 27 September 2023
Accepted: 04 October 2023
Published: 29 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s40042-023-00954-z
