Risk consideration and cooperation in the iterated prisoner’s dilemma

Zeng, Weijun; Li, Minqiang; Chen, Fuzan; Nan, Guofang

doi:10.1007/s00500-014-1523-2

Methodologies and Application
Published: 22 November 2014

Volume 20, pages 567–587, (2016)

Soft Computing

Abstract

This paper investigates cooperative behavior in the two-player iterated prisoner’s dilemma (IPD) game with the consideration of income stream risk. The standard deviation of a player’s one-move payoffs is defined as the measure of income stream risk, and the effect of risk on cooperation in the two-player IPD game is then examined. A two-population coevolutionary learning model, embedded with a niching technique, is developed to search for optimal strategies for the two players in the IPD game. As the experimental results illustrate, risk-averse players perform better than risk-seeking players in cooperating with opponents. In particular, in the case of short game encounters, in which previous work has shown cooperation to be difficult to achieve, a high level of cooperation can be obtained in the IPD if both players are risk-averse. The reason is that risk consideration induces players to negotiate for stable gains, which leads to steady mutual cooperation in the IPD. This cooperative pattern is found to be quite robust against low levels of noise. However, with increasingly higher levels of noise, only intermediate levels of cooperation can be achieved in games between two risk-averse players. Games with risk-seeking players reach even lower cooperation levels. By comparing the players’ strategies coevolved with and without a high level of noise, the main reason for the reduction in the extent of cooperation can be explained as the lack of contrition and forgiveness of players in the high-noise interactions. Moreover, although increasing the encounter length helps to improve cooperation in the noiseless and low-noise IPD, we find that it may reinforce the absence of contrition and forgiveness, and thus make cooperation even more difficult in the high-noise games.
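The risk measure described in the abstract, the standard deviation of a player's one-move payoffs, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the payoff values (the conventional T=5, R=3, P=1, S=0), the two example strategies, and the helper names are not taken from the paper. The `risk_adjusted_score` function is likewise a hypothetical mean-minus-risk form and should not be read as the authors' objective function.

```python
import statistics

# Assumed payoffs (T=5, R=3, P=1, S=0); the source page does not state them.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def always_defect(own_history, opp_history):
    """Defect on every move."""
    return "D"


def play_ipd(strategy_a, strategy_b, rounds):
    """Play a noiseless IPD and return both players' one-move payoff lists."""
    hist_a, hist_b = [], []
    payoffs_a, payoffs_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
        payoffs_a.append(pay_a)
        payoffs_b.append(pay_b)
    return payoffs_a, payoffs_b


def income_stream_risk(payoffs):
    """Population standard deviation of the one-move payoffs."""
    return statistics.pstdev(payoffs)


def risk_adjusted_score(payoffs, alpha):
    """Hypothetical score: mean payoff minus alpha times the risk."""
    return statistics.mean(payoffs) - alpha * income_stream_risk(payoffs)


if __name__ == "__main__":
    steady, _ = play_ipd(tit_for_tat, tit_for_tat, rounds=10)
    _, exploiting = play_ipd(tit_for_tat, always_defect, rounds=10)
    print("mutual cooperation risk:", income_stream_risk(steady))
    print("defector's risk:", income_stream_risk(exploiting))
```

Under these assumptions, steady mutual cooperation yields a constant payoff stream and therefore zero risk, while a defector facing a retaliating opponent receives one large payoff followed by small ones, which raises the standard deviation.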

Notes

http://www.wto.org/.
Risk-seeking players prefer high-risk strategies because these strategies are expected to potentially bring them high payoffs in each game. Thus, \(\alpha_i = 0\) indicates risk-seeking players who care about maximizing the expected total payoff in the IPD game, but the real returns are not guaranteed to be maximal due to non-cooperative behaviors.
Note that the strategies start from random historic positions when playing the IPD, which increases the bit-wise diversity of the genome under selective pressure and thus ensures that each bit is well trained.
In a round-robin tournament, each player plays an IPD with every player. The winner is the one who receives the highest average payoff over all IPDs. Observing that some risk-averse players may perform similarly well in the IPD game, we only consider a tournament with the 12 players to make a valid comparison.
AllD is a strategy that always defects. TF2T is the inverse tit-for-two-tats strategy, which cooperates except when the opponent makes two successive defections.
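The round-robin scoring and the two named strategies in the notes above can be sketched as follows. This is an illustrative reading, not the authors' code: the payoff values (T=5, R=3, P=1, S=0), the added tit-for-tat player, the exclusion of self-play, and all function names are assumptions. TF2T is implemented here as "defect only after two successive opponent defections", which is one interpretation of the note.

```python
from itertools import combinations

# Assumed payoffs (T=5, R=3, P=1, S=0); the source page does not state them.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def all_d(own_history, opp_history):
    """AllD: always defect."""
    return "D"


def tf2t(own_history, opp_history):
    """TF2T as read from the note: defect only after two successive defections."""
    return "D" if opp_history[-2:] == ["D", "D"] else "C"


def tit_for_tat(own_history, opp_history):
    """Extra illustrative player: cooperate first, then mirror the opponent."""
    return opp_history[-1] if opp_history else "C"


def average_payoffs(strategy_a, strategy_b, rounds):
    """Play one noiseless IPD and return each player's average one-move payoff."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
        total_a += pay_a
        total_b += pay_b
    return total_a / rounds, total_b / rounds


def round_robin(players, rounds):
    """Average payoff of each player over IPDs against every other player.

    Self-play is excluded here, which is an assumption of this sketch.
    """
    results = {name: [] for name in players}
    for name_a, name_b in combinations(players, 2):
        avg_a, avg_b = average_payoffs(players[name_a], players[name_b], rounds)
        results[name_a].append(avg_a)
        results[name_b].append(avg_b)
    return {name: sum(vals) / len(vals) for name, vals in results.items()}


if __name__ == "__main__":
    scores = round_robin({"AllD": all_d, "TF2T": tf2t, "TFT": tit_for_tat}, rounds=10)
    winner = max(scores, key=scores.get)
    print(scores, "winner:", winner)
```

The winner is taken to be the player with the highest average payoff across all of its IPDs, as the note describes; a tournament with the paper's 12 players would only require a larger `players` dictionary.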

Acknowledgments

This work was supported by the National Science Fund for Distinguished Young Scholars of China (Grant No. 70925005), the General Program of the National Science Foundation of China (No. 71101103, No. 71271148, No. 71371135), and the Research Fund for the Doctoral Program of Higher Education of China (No. 20110032110070). It was also supported by the Program for Changjiang Scholars and Innovative Research Teams in Universities of China (PCSIRT). The authors would also like to thank the High Performance Computing Centre (HPCC) of Tianjin University for providing computing support, and the editor and reviewers for their valuable comments.

Author information

Authors and Affiliations

College of Management and Economics, Tianjin University, Tianjin, 300072, People’s Republic of China
Weijun Zeng, Minqiang Li, Fuzan Chen & Guofang Nan

Corresponding author

Correspondence to Minqiang Li.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Zeng, W., Li, M., Chen, F. et al. Risk consideration and cooperation in the iterated prisoner’s dilemma. Soft Comput 20, 567–587 (2016). https://doi.org/10.1007/s00500-014-1523-2

Issue Date: February 2016
