# Convergence results on stochastic adaptive learning

Research Article

## Abstract

We investigate an adaptive learning model that nests several existing learning models, such as payoff assessment learning, valuation learning, stochastic fictitious play learning, experience-weighted attraction learning and delta learning with foregone payoff information in normal form games. In particular, we consider adaptive players, each of whom assigns payoff assessments to his own actions, chooses the action that has the highest assessment subject to some perturbations, and updates the assessments using observed payoffs, which may include payoffs from unchosen actions. We then provide conditions under which the learning process converges to a quantal response equilibrium in normal form games.
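The learning loop described in the abstract can be illustrated with a minimal sketch for a two-player game. This is not the article's model or proof setting, only one possible reading of it under stated assumptions: the perturbation is taken to produce a logit (softmax) choice rule with precision `lam`, the step size is `1/t`, and a hypothetical weight `delta` controls how strongly foregone payoffs from unchosen actions enter the update (`delta = 0` ignores them, `delta = 1` treats them like realised payoffs). The function names `logit_choice` and `simulate` are likewise illustrative.

```python
import math
import random


def logit_choice(assessments, lam):
    """Choice probabilities under an assumed logit (softmax) perturbation."""
    m = max(assessments)  # subtract the max for numerical stability
    weights = [math.exp(lam * (q - m)) for q in assessments]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(payoffs, periods=5000, lam=1.0, delta=1.0, seed=0):
    """Simulate assessment-based learning in a two-player normal form game.

    payoffs[i][a0][a1] is player i's payoff when player 0 plays a0 and
    player 1 plays a1. Returns the final assessments and the choice
    probabilities they imply.
    """
    rng = random.Random(seed)
    n_actions = [len(payoffs[0]), len(payoffs[0][0])]
    # One payoff assessment per action, per player (assumed to start at 0).
    q = [[0.0] * n_actions[0], [0.0] * n_actions[1]]
    for t in range(1, periods + 1):
        probs = [logit_choice(q[i], lam) for i in range(2)]
        actions = [
            rng.choices(range(n_actions[i]), weights=probs[i])[0]
            for i in range(2)
        ]
        step = 1.0 / t  # assumed decreasing step size
        for i in range(2):
            for a in range(n_actions[i]):
                # Payoff player i gets (or would have got) from action a
                # against the opponent's realised action.
                profile = list(actions)
                profile[i] = a
                payoff = payoffs[i][profile[0]][profile[1]]
                # Chosen action: full weight. Unchosen: foregone weight delta.
                weight = 1.0 if a == actions[i] else delta
                q[i][a] += step * weight * (payoff - q[i][a])
    return q, [logit_choice(q[i], lam) for i in range(2)]
```

For example, calling `simulate([[[1, -1], [-1, 1]], [[-1, 1], [1, -1]]])` runs the sketch on matching pennies. Whether and where the assessments settle depends on the game and on the assumed parameters; the article's convergence conditions are not reproduced here.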

## Keywords

- Adaptive learning
- Normal form games
- Asynchronous stochastic approximation
- Quantal response equilibrium

## JEL Classification

C72, D83

## Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018