Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics

  • Interdisciplinary Physics
The European Physical Journal B

Abstract

A continuous-time model is developed for multiagent systems governed by reinforcement learning with scale-free memory. The agents are assumed to act independently of one another in optimizing their choice among possible actions via trial-and-error search. To become aware of an action's value, each agent accumulates in its memory the rewards obtained from taking that action at each moment of time. The contribution of past rewards to the agent's current perception of the action value is described by an integral operator with a power-law kernel. As a result, a fractional differential equation governing the system dynamics is obtained. The agents are considered to interact with one another implicitly, in that the reward of one agent depends on the choices of the other agents. A pairwise interaction model is adopted to describe this effect. As a specific example of systems with non-transitive interactions, two-agent and three-agent systems of the rock-paper-scissors type are analyzed in detail, including stability analysis and numerical simulation. Scale-free memory is demonstrated to cause complex dynamics in these systems. In particular, it is shown that two modes of system instability can exist simultaneously, undergoing subcritical and supercritical bifurcations, with the latter exhibiting anomalous oscillations whose amplitude and period grow with time. Moreover, the onset of instability via this supercritical mode may be regarded as “altruism self-organization”. For the three-agent system the instability dynamics is found to be rather irregular and can be composed of alternating fragments of oscillations that differ in their properties.
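
The idea of a power-law memory of past rewards in a two-agent rock-paper-scissors system can be illustrated with a rough discrete-time sketch. This is not the model of the article: the preview does not reproduce its equations, so the kernel form `(t0 + lag)**(-alpha)`, the softmax choice rule, the mean-field reward `p_i * (PAYOFF @ p_other)_i`, and all names and parameter values (`alpha`, `beta`, `t0`, `dt`, `simulate`) are assumptions made here purely for illustration.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors: +1 win, -1 loss, 0 draw.
# Action order: rock, paper, scissors.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])


def power_law_weights(n_steps, dt=1.0, alpha=0.5, t0=1.0):
    """Assumed kernel: weight (t0 + lag)^(-alpha) for a reward `lag` time units old."""
    lags = np.arange(n_steps) * dt
    return (t0 + lags) ** (-alpha)


def perceived_values(reward_history, dt=1.0, alpha=0.5, t0=1.0):
    """Power-law-weighted sum of past rewards per action.

    reward_history has shape (n_steps, n_actions), oldest entry first.
    """
    n = reward_history.shape[0]
    # Reverse so that the most recent reward gets lag 0 (largest weight).
    w = power_law_weights(n, dt, alpha, t0)[::-1]
    return dt * (w[:, None] * reward_history).sum(axis=0)


def softmax(q, beta=1.0):
    """Assumed choice rule: probabilities proportional to exp(beta * q)."""
    z = np.exp(beta * (q - q.max()))
    return z / z.sum()


def simulate(n_steps=200, dt=0.1, alpha=0.5, beta=2.0, q0_a=None, q0_b=None):
    """Mean-field dynamics of two agents; returns probabilities, shape (n_steps, 2, 3)."""
    q_init = [np.zeros(3) if q is None else np.asarray(q, dtype=float)
              for q in (q0_a, q0_b)]
    hist = [np.zeros((n_steps, 3)), np.zeros((n_steps, 3))]
    probs = np.zeros((n_steps, 2, 3))
    for t in range(n_steps):
        p = [softmax(q_init[i] + perceived_values(hist[i][:t], dt, alpha), beta)
             for i in range(2)]
        probs[t] = p
        # Assumed mean reward rate for action i: chance of taking it times
        # its expected payoff against the other agent's current mix.
        hist[0][t] = p[0] * (PAYOFF @ p[1])
        hist[1][t] = p[1] * (PAYOFF @ p[0])
    return probs


if __name__ == "__main__":
    out = simulate(q0_a=[0.2, 0.0, 0.0])
    print(out[-1])  # action probabilities of both agents at the final step
```

Because the whole reward history is re-weighted at every step, the cost of this sketch grows quadratically with the number of steps. That is the price of a kernel with no characteristic time scale: unlike exponential forgetting, it cannot be reduced to a one-step recursive update.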

Author information

Corresponding authors

Correspondence to I. Lubashevsky or S. Kanemoto.

About this article

Cite this article

Lubashevsky, I., Kanemoto, S. Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics. Eur. Phys. J. B 76, 69–85 (2010). https://doi.org/10.1140/epjb/e2010-00201-8

  • DOI: https://doi.org/10.1140/epjb/e2010-00201-8
