Lévy noise promotes cooperation in the prisoner’s dilemma game with reinforcement learning

  • Original Paper
  • Nonlinear Dynamics

Abstract

Uncertainties are ubiquitous in everyday life, and it is thus important to explore their effects on the evolution of cooperation. In this paper, the prisoner’s dilemma game with reinforcement learning subject to Lévy noise is studied. Specifically, diverse fluctuations mimicked by Lévy distributed noise are reflected in the payoff matrix of each player. At the same time, the self-regarding Q-learning algorithm is considered as the strategy update rule to learn the behavior that achieves the highest payoff. The results show that not only does Lévy noise promote the evolution of cooperation with reinforcement learning, it does so comparatively better than Gaussian noise. We explain this with the iterative updating pattern of the self-regarding Q-learning algorithm, which has an accumulative effect on the noise entering the payoff matrix. It turns out that under Lévy noise, the Q-value of cooperative behavior becomes significantly larger than that of defective behavior when the current strategy is defection, which ultimately leads to the prevalence of cooperation, while this is absent with Gaussian noise or without noise. This research thus unveils a particular positive role of Lévy noise in the evolutionary dynamics of social dilemmas.
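The mechanism summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the weak prisoner's dilemma payoffs (R = 1, T = b, S = P = 0), additive noise on each received payoff, a symmetric stable law sampled with the Chambers-Mallows-Stuck method as a stand-in for Lévy noise, the epsilon-greedy action choice, the two-player setting, and all parameter values (`eta`, `gamma`, `epsilon`, `b`, `alpha`, `scale`). The Q-table is indexed by the player's own current strategy (state) and next strategy (action), which is one common reading of "self-regarding" Q-learning.

```python
import math
import random

C, D = 0, 1  # strategy indices: cooperate, defect


def symmetric_stable(alpha, rng):
    """Draw one symmetric alpha-stable variate (Chambers-Mallows-Stuck), 0 < alpha <= 2."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    while w == 0.0:  # guard against a zero exponential draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)  # Cauchy special case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def noisy_payoff(me, other, b, alpha, scale, rng):
    """Assumed weak prisoner's dilemma payoff (R=1, T=b, S=P=0) plus stable noise."""
    base = [[1.0, 0.0], [b, 0.0]][me][other]
    return base + scale * symmetric_stable(alpha, rng)


def q_update(q, state, action, reward, eta=0.8, gamma=0.8):
    """Standard Q-learning step; the next state is taken to be the action just played."""
    target = reward + gamma * max(q[action])
    q[state][action] = (1.0 - eta) * q[state][action] + eta * target
    return q[state][action]


def choose(q, state, epsilon, rng):
    """Epsilon-greedy choice over the two actions available in `state`."""
    if rng.random() < epsilon:
        return rng.choice((C, D))
    return C if q[state][C] >= q[state][D] else D


# Hypothetical usage: two players, each with its own 2x2 Q-table.
rng = random.Random(0)
q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
state = [C, D]
for _ in range(1000):
    act = [choose(q[i], state[i], 0.02, rng) for i in range(2)]
    for i in range(2):
        r = noisy_payoff(act[i], act[1 - i], b=1.2, alpha=1.5, scale=0.1, rng=rng)
        q_update(q[i], state[i], act[i], r)
    state = act
```

Because each update blends the new noisy reward into the stored Q-value, rare large draws from a heavy-tailed law can persist over many steps. This is the accumulative effect the abstract refers to. Setting `alpha=2.0` gives Gaussian noise for comparison.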

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This research was supported by the National Key R&D Program of China (Grant No. 2018AAA0100905), the National Science Fund for Distinguished Young Scholars (Grant No. 62025602), the National Natural Science Foundation of China (Grant Nos. 11931015, U1803263, 81961138010, and 62073263), the Fok Ying-Tong Education Foundation, China (Grant No. 171105), the Key Technology Research and Development Program of Science and Technology-Scientific and Technological Innovation Team of Shaanxi Province (Grant No. 2020TD-013), the Tencent Foundation and XPLORER PRIZE, the Slovenian Research Agency (Grant Nos. P1-0403 and J1-2457), and the Key Research and Development Program of Shaanxi Province (Grant No. 2022KW-26). Discussions with Hao Guo and Chen Chu are gratefully acknowledged.

Author information

Corresponding author

Correspondence to Zhen Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, L., Jia, D., Zhang, L. et al. Lévy noise promotes cooperation in the prisoner’s dilemma game with reinforcement learning. Nonlinear Dyn 108, 1837–1845 (2022). https://doi.org/10.1007/s11071-022-07289-7

  • DOI: https://doi.org/10.1007/s11071-022-07289-7
