Abstract
This paper investigates cooperative behavior in the two-player iterated prisoner’s dilemma (IPD) game when income stream risk is taken into account. Income stream risk is measured by the standard deviation of a player’s one-move payoffs, which allows the effect of risk on cooperation in the two-player IPD to be examined. A two-population coevolutionary learning model, embedded with a niching technique, is developed to search for optimal strategies for the two players. The experimental results show that risk-averse players cooperate with their opponents better than risk-seeking players do. In particular, with short game encounters, where previous work has shown cooperation to be difficult to achieve, a high level of cooperation can be obtained if both players are risk-averse. The reason is that risk consideration induces players to negotiate for stable gains, which leads to steady mutual cooperation. This cooperative pattern is quite robust against low levels of noise. As the noise level rises, however, only intermediate levels of cooperation can be achieved in games between two risk-averse players, and games with risk-seeking players reach even lower cooperation levels. Comparing the strategies coevolved with and without a high level of noise shows that the main reason for the reduced cooperation is the players’ lack of contrition and forgiveness in high-noise interactions. Moreover, although increasing the encounter length helps improve cooperation in the noiseless and low-noise IPD, we find that it may reinforce the absence of contrition and forgiveness and thus make cooperation even more difficult in high-noise games.
Notes
Risk-seeking players prefer high-risk strategies because such strategies may bring them high payoffs in each game. Thus, \(\alpha_i = 0\) indicates risk-seeking players who care only about maximizing the expected total payoff in the IPD game, although the realized returns are not guaranteed to be maximal because of non-cooperative behavior.
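The role of \(\alpha_i\) can be illustrated with a minimal sketch. It assumes a mean-minus-standard-deviation score, which is one common way to penalize payoff variability and is not necessarily the exact objective used in the paper; the function name `risk_adjusted_score` and the payoff values are hypothetical.

```python
from statistics import mean, pstdev

def risk_adjusted_score(payoffs, alpha):
    """Mean one-move payoff minus alpha times its standard deviation.

    Assumed form for illustration only: alpha = 0 ignores risk entirely,
    larger alpha penalizes an unstable income stream more heavily.
    """
    return mean(payoffs) - alpha * pstdev(payoffs)

# Two hypothetical income streams with the same mean payoff per move.
steady = [3, 3, 3, 3]     # e.g. mutual cooperation on every move
volatile = [5, 1, 5, 1]   # e.g. alternating high and low payoffs

for alpha in (0.0, 0.5):
    print(alpha,
          risk_adjusted_score(steady, alpha),
          risk_adjusted_score(volatile, alpha))
```

Under this assumed score, a player with \(\alpha_i = 0\) is indifferent between the two streams, while a player with a positive \(\alpha_i\) prefers the steady one.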
Note that the strategies start from random history positions when playing the IPD, which increases the bit-wise diversity of the genome under selective pressure and thus ensures that each bit is well trained.
In a round-robin tournament, each player plays an IPD with every player. The winner is the player who receives the highest average payoff over all IPDs. Because some risk-averse players may perform similarly well in the IPD game, we only consider a tournament with the 12 players to make a valid comparison.
AllD is a strategy that always defects. (TF2T) is the inverse tit-for-two-tats, which cooperates except when the opponent makes two successive defections.
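The round-robin procedure and the two named strategies can be sketched as follows. This is an illustrative sketch only: the payoff values (5, 3, 1, 0) are the conventional ones and are assumed here, `all_c` is a hypothetical extra player added so that the tournament has three entrants, `tf2t` follows the behavior described in the note (cooperate unless the opponent has just defected twice in a row), and each player is assumed to meet every other player once.

```python
from itertools import combinations

# Assumed payoffs: (row player, column player) for each pair of moves.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def all_d(own_history, opp_history):
    """AllD: always defect."""
    return "D"

def tf2t(own_history, opp_history):
    """Cooperate except after two successive opponent defections."""
    return "D" if opp_history[-2:] == ["D", "D"] else "C"

def all_c(own_history, opp_history):
    """Hypothetical extra player: always cooperate."""
    return "C"

def play_ipd(player_a, player_b, rounds=10):
    """Play one IPD and return the total payoff of each player."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = player_a(hist_a, hist_b)
        move_b = player_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b

def round_robin(players, rounds=10):
    """Average payoff per move for each player over all pairings."""
    totals = {name: 0 for name in players}
    for (name_a, fa), (name_b, fb) in combinations(players.items(), 2):
        pay_a, pay_b = play_ipd(fa, fb, rounds)
        totals[name_a] += pay_a
        totals[name_b] += pay_b
    games = len(players) - 1
    return {name: t / (games * rounds) for name, t in totals.items()}

scores = round_robin({"AllD": all_d, "TF2T": tf2t, "AllC": all_c})
print(max(scores, key=scores.get), scores)
```

In this small field AllD wins because it can exploit the unconditional cooperator, which shows that the tournament winner depends on the set of entrants.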
Acknowledgments
This work was supported by the National Science Fund for Distinguished Young Scholars of China (Grant No. 70925005), the General Program of the National Science Foundation of China (Nos. 71101103, 71271148, 71371135), and the Research Fund for the Doctoral Program of Higher Education of China (No. 20110032110070). It was also supported by the Program for Changjiang Scholars and Innovative Research Teams in Universities of China (PCSIRT). The authors would like to thank the High Performance Computing Centre (HPCC) of Tianjin University for providing computing support, and the editor and reviewers for their valuable comments.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Zeng, W., Li, M., Chen, F. et al. Risk consideration and cooperation in the iterated prisoner’s dilemma. Soft Comput 20, 567–587 (2016). https://doi.org/10.1007/s00500-014-1523-2
DOI: https://doi.org/10.1007/s00500-014-1523-2