Can boundedly rational sellers learn to play Nash?


How does Nash pricing compare to pricing by adaptive sellers that use reinforcement learning (RL)? We consider a market game similar to Varian’s model (Am Econ Rev 70:651–659, 1980) with two types of consumers who differ in the size of their fixed-sample search rule, and we derive the Nash search equilibrium (NSE) strategy (the density, the mean and the variance of the posted price distribution). Our findings are twofold. First, the RL price distribution does not converge in a statistical sense to the NSE distribution, except when competition is à la Bertrand. Second, the qualitative properties of the NSE with respect to a change in buyers’ search behavior remain valid for the RL distribution: the average price and the variance of both price distributions respond similarly to a change in buyers’ search.
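The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: the abstract does not specify the learning rule, so it assumes a cumulative-payoff reinforcement rule in the spirit of Erev and Roth (1998), a discrete price grid, zero production cost, unit demand, and buyers who sample either one or two sellers. The names `simulate_rl_pricing`, `summarize`, `share_one_price` and all default values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_rl_pricing(n_sellers=2, n_buyers=50, share_one_price=0.5,
                        prices=None, rounds=2000, seed=0):
    """Simulate sellers that learn prices by cumulative reinforcement.

    Assumptions (not taken from the paper): n_sellers >= 2, zero cost,
    each buyer buys one unit at the lowest price in a fixed sample of
    size 1 (with probability share_one_price) or size 2.
    Returns the list of all posted prices over all rounds.
    """
    rng = random.Random(seed)
    if prices is None:
        prices = [round(0.1 * k, 1) for k in range(1, 11)]  # 0.1, ..., 1.0
    # propensities[s][j]: accumulated payoff of seller s for price index j
    propensities = [[1.0] * len(prices) for _ in range(n_sellers)]
    history = []
    for _ in range(rounds):
        # Each seller picks a price with probability proportional
        # to its current propensity for that price.
        choices = [
            rng.choices(range(len(prices)), weights=propensities[s])[0]
            for s in range(n_sellers)
        ]
        profits = [0.0] * n_sellers
        for _ in range(n_buyers):
            k = 1 if rng.random() < share_one_price else 2
            sample = rng.sample(range(n_sellers), k)  # random order breaks ties
            best = min(sample, key=lambda s: prices[choices[s]])
            profits[best] += prices[choices[best]]
        # Reinforce only the price actually played, by the per-buyer profit.
        for s in range(n_sellers):
            propensities[s][choices[s]] += profits[s] / n_buyers
        history.extend(prices[c] for c in choices)
    return history


def summarize(history):
    """Return the mean and (population) variance of the posted prices."""
    return statistics.mean(history), statistics.pvariance(history)


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        mean, var = summarize(simulate_rl_pricing(share_one_price=share))
        print(f"share sampling one price={share:.1f}: "
              f"mean={mean:.3f}, variance={var:.4f}")
```

Setting `share_one_price=0.0` makes every buyer compare two prices, which corresponds to the Bertrand-like case mentioned in the abstract; varying the share shows how the mean and variance of the simulated price distribution move with buyers' search behavior. Comparing these moments with the NSE values would require the equilibrium density derived in the paper, which is not reproduced here.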


References

  1. Beggs AW (2005) On the convergence of reinforcement learning. J Econ Theory 122:1–36

  2. Benaïm M, Hofbauer J, Hopkins E (2005) Learning in games with unstable equilibria. Unpublished paper

  3. Brenner T (2002) A behavioural learning approach to the dynamics of prices. Comput Econ 19:67–94

  4. Camerer C, Ho T-H (1999) Experience-weighted attraction learning in normal form games. Econometrica 67(4):827–874

  5. Capra MR, Goeree JK, Gomez R, Holt J (2002) Learning and noisy equilibrium behavior in an experimental study of imperfect price competition. Int Econ Rev 43(3):613–636

  6. Darmon E, Waldeck R (2005) Convergence of reinforcement learning to Nash equilibrium: a search-market experiment. Physica A 355:1

  7. De Palma A, Thisse J-F (1987) Les Modèles de Choix Discrets. Annales d’Economie et de Statistique 9:151–190

  8. Erev I, Roth AE (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria. Am Econ Rev 88(4):848–881

  9. Hopkins E, Seymour R (2002) The stability of price dispersion under seller and consumer learning. Int Econ Rev 19(1):46–76

  10. Kirman AP, Vriend NJ (2001) Evolving market structure: an ACE model of price dispersion and loyalty. J Econ Dyn Control 25(3–4):459–502

  11. Morgan J, Orzen H, Sefton M (2006) An experimental study of price dispersion. Games Econ Behav 54(1):134–158

  12. Posch M (1997) Cycling in a stochastic learning algorithm for normal form games. J Evol Econ 7(2):193–207

  13. Salmon TC (2001) An evaluation of econometric models of adaptive learning. Econometrica 69:1597–1628

  14. Simon H (1976) From substantive to procedural rationality. In: Latsis SJ (ed) Methods and appraisal in economics. Cambridge University Press, Cambridge, pp 129–148

  15. Stigler GJ (1961) The economics of information. J Polit Econ 69(6):213–225

  16. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

  17. Varian H (1980) A model of sales. Am Econ Rev 70:651–659

  18. Waldeck R (2006) Search and price competition. J Econ Behav Organ (forthcoming)

Author information

Correspondence to Roger Waldeck.

About this article

Cite this article

Waldeck, R., Darmon, E. Can boundedly rational sellers learn to play Nash? J Econ Interac Coord 1, 147–169 (2006). https://doi.org/10.1007/s11403-006-0009-4


Keywords

  • Imperfect information
  • Price competition
  • Price dispersion
  • Search market equilibrium
  • Reinforcement learning
  • Numerical computation

JEL Classifications

  • D43
  • D83
  • C63