Oscillatory evolution of collective behavior in evolutionary games played with reinforcement learning

Zhang, Si-Ping; Zhang, Ji-Qiang; Chen, Li; Liu, Xu-Dong

doi:10.1007/s11071-019-05398-4

Original paper
Published: 09 January 2020

Nonlinear Dynamics, Volume 99, pages 3301–3312 (2020)

Si-Ping Zhang¹, Ji-Qiang Zhang², Li Chen³ & Xu-Dong Liu²

886 Accesses
25 Citations

Abstract

Large-scale cooperation underpins the evolution of ecosystems and of human society, and the self-organized collective behaviors of multi-agent systems are key to understanding it. As artificial intelligence (AI) spreads into almost all branches of science, it is of great interest to see what new insights into collective behavior can be obtained from a multi-agent AI system. Here, we introduce a typical reinforcement learning (RL) algorithm, Q-learning, into evolutionary game dynamics, where agents pursue the optimal action through introspection rather than through outward-looking processes such as the birth–death or imitation processes of the traditional evolutionary game (EG). We numerically investigate the cooperation prevalence for a general \(2\times 2\) game setting. We find that, in most cases, the cooperation prevalence in the multi-agent AI system is unexpectedly at the same level as in the traditional EG. However, in snowdrift games with RL, we reveal that explosive cooperation appears in the form of periodic oscillation, and we study the impact of the payoff structure on its emergence. Finally, we show that periodic oscillation can also be observed in other EGs with the RL algorithm, such as the rock–paper–scissors game. Our results offer a reference point for understanding the emergence of cooperation and oscillatory behaviors in nature and society from an AI perspective.
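
To make the idea of Q-learning agents in a \(2\times 2\) game concrete, the following is a minimal sketch and not the authors' model. It assumes two agents that repeatedly play a snowdrift game, each keeping a Q-table indexed by the opponent's previous action. The payoff values, the learning parameters (`ALPHA`, `GAMMA`, `EPSILON`), the state definition, and all function names are illustrative assumptions; the abstract does not specify any of them.

```python
import random

# Snowdrift payoffs for the focal player, indexed [my_action][other_action].
# 0 = cooperate, 1 = defect. Values are assumed for illustration (T > R > S > P).
PAYOFF = [[3.0, 1.0],
          [5.0, 0.0]]

# Assumed learning rate, discount factor, and exploration rate.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.02


def choose_action(q_row, rng, epsilon=EPSILON):
    """Epsilon-greedy choice over one row of a Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties between equally valued actions at random.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def q_update(q, state, action, reward, next_state, alpha=ALPHA, gamma=GAMMA):
    """Standard tabular Q-learning update for one (state, action) entry."""
    target = reward + gamma * max(q[next_state])
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target


def simulate(rounds=1000, seed=0):
    """Two learners play repeatedly; state = the opponent's previous action (an assumption)."""
    rng = random.Random(seed)
    q_tables = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    states = [0, 0]
    coop_history = []
    for _ in range(rounds):
        actions = [choose_action(q_tables[i][states[i]], rng) for i in range(2)]
        for i in range(2):
            other = actions[1 - i]
            reward = PAYOFF[actions[i]][other]
            q_update(q_tables[i], states[i], actions[i], reward, other)
            states[i] = other
        # Fraction of the two agents that cooperated this round.
        coop_history.append(actions.count(0) / 2)
    return q_tables, coop_history
```

Calling `simulate()` returns the learned Q-tables and the per-round cooperation fraction, which can be plotted against the round index to look for oscillations. Whether any appear depends on the assumed payoffs and parameters, so this sketch should not be read as reproducing the paper's results.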



Acknowledgements

Si-Ping Zhang is supported by the National Natural Science Foundation of China (Grant Nos. 11975178 and 61431012). Li Chen and Ji-Qiang Zhang are supported by the National Natural Science Foundation of China (Grant No. 61703257).

Author information

Authors and Affiliations

The Key Laboratory of Biomedical Information Engineering of Ministry of Education, The Key Laboratory of Neuro-informatics & Rehabilitation Engineering of Ministry of Civil Affairs, and Institute of Health and Rehabilitation Science, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, China
Si-Ping Zhang
Beijing Advanced Innovation Center for Big Data and Brain Computing, and School of Computer Science and Engineering, Beihang University, Beijing, 100191, China
Ji-Qiang Zhang & Xu-Dong Liu
School of Physics and Information Technology, Shaanxi Normal University, Xi’an, 710062, China
Li Chen

Corresponding author

Correspondence to Ji-Qiang Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1943 KB)

About this article

Cite this article

Zhang, SP., Zhang, JQ., Chen, L. et al. Oscillatory evolution of collective behavior in evolutionary games played with reinforcement learning. Nonlinear Dyn 99, 3301–3312 (2020). https://doi.org/10.1007/s11071-019-05398-4

Received: 21 June 2019
Accepted: 26 November 2019
Published: 09 January 2020
Issue Date: March 2020
DOI: https://doi.org/10.1007/s11071-019-05398-4
