Abstract
We investigate the impact of payoff shocks on the evolution of large populations of myopic players that employ simple strategy revision protocols such as the “imitation of success”. In the noiseless case, this process is governed by the standard (deterministic) replicator dynamics; in the presence of noise, however, the induced stochastic dynamics differ from previous versions of the stochastic replicator dynamics (such as the aggregate-shocks model of Fudenberg and Harris in J Econ Theory 57(2):420–441, 1992). In this context, we show that strict equilibria are always stochastically asymptotically stable, irrespective of the magnitude of the shocks; on the other hand, in the high-noise regime, non-equilibrium states may also become stochastically asymptotically stable and dominated strategies may survive in perpetuity (they become extinct if the noise is low). Such behavior is eliminated if players are less myopic and revise their strategies based on their cumulative payoffs. In this case, we obtain a second-order stochastic dynamical system where non-equilibrium states are no longer attracting and dominated strategies become extinct (a.s.), no matter the noise level.
Notes
Importantly, fluctuations due to randomized choices disappear in the large population limit (Benaïm and Weibull 2003).
In a deterministic setting, Sandholm (2010) calls a dynamical system imitative if (in addition to some monotonicity requirements) strategies that are initially absent from the population do not appear. The biological stochastic replicator dynamics satisfy this condition, but this does not mean that they are derived from a revision protocol based on imitation of other agents; by contrast, the dynamics that we study in this paper are derived from such an imitation model, hence the name “imitation dynamics with payoff shocks”.
In the example of the choice of an itinerary to go to work, the size of the commuting population remains roughly constant over time spans allowing significant evolutions of behavior. Moreover, in the short-term, users typically have a standard itinerary, but may revise their choice occasionally, and may then get information on travel times and travel comfort from discussions with other commuters. These features suggest that our imitation dynamics with payoff shocks are a better fit for this situation than the stochastic dynamics of Fudenberg and Harris (1992). Of course, an even more realistic model of itinerary choice should also allow for innovation, that is, the possibility of trying an itinerary not based on imitation, but on an otherwise informed guess that this itinerary might be appealing.
Note that we are considering general payoff functions and not only multilinear (resp. linear) payoffs arising from asymmetric (resp. symmetric) random matching in finite N-person (resp. 2-person) games. This distinction is important as it allows our model to cover e.g. general traffic games as in Sandholm (2010).
In other words, \(\rho _{\alpha \beta }\) is the probability of an \(\alpha \)-strategist becoming a \(\beta \)-strategist up to normalization by the alarm clocks’ rate.
Modulo an additive constant which ensures that \(\rho \) is positive but which cancels out when it comes to the dynamics.
An important special case where it makes sense to consider correlated shocks is if the payoff functions \(v_{\alpha }(x)\) are derived from random matchings in a finite game whose payoff matrix is subject to stochastic perturbations. This specific disturbance model is discussed in Sect. 5.
The intermediate variable \(y_{\alpha }\) should be thought of as an evaluation of how good the strategy \(\alpha \) is, and the formula for \(x_{\alpha }\) as a way of transforming these evaluations into a strategy.
Elimination is obvious; for survival, simply add \(\frac{1}{2}\sigma _{\min }^{2}t\) to the exponents of (2.18) and recall that any Wiener process has \(\limsup _{t} W(t) > 0\) and \(\liminf _{t} W(t) <0\) (a.s.).
We are implicitly assuming here deterministic initial conditions, i.e. \(X(0) = x\) (a.s.) for some \(x\in \mathcal {X}\).
If several strategies are unaffected by noise, that is, are such that \(\sigma _{\alpha }=0\), then their relative shares remain constant (that is, if \(\alpha \) and \(\beta \) are two such strategies, then \(X_{\alpha }(t)/X_{\beta }(t) = X_{\alpha }(0)/X_{\beta }(0)\) for all \(t\ge 0\)). It follows from this observation and the above result that, almost surely, all these strategies are eliminated or all these strategies survive (and only them).
In the pure noise case of the model of Fudenberg and Harris (1992), what remains constant is the expected number of individuals playing a strategy. A crucial point here is that this number may grow to infinity. For a strategy affected by large aggregate shocks, the total number of individuals playing it becomes huge with small probability, but becomes small (at least compared to the number of individuals playing other strategies) with large probability (going to 1). This can be seen as a gambler’s ruin phenomenon, and it explains why, even with a higher expected payoff than others (and hence a higher expected subpopulation size), the frequency of a strategy may go to zero almost surely (see e.g. Robson and Samuelson 2011, Sect. 3.1.1). This cannot happen in our model since noise is added directly to the frequencies (which are bounded).
Put differently, it is more probable for X(n) to decrease than to increase: \(X(n+2) > X(n)\) with probability 1/4 (i.e. if and only if \(\xi _{n}\) takes two positive steps), while \(X(n+2) < X(n)\) with probability 3/4.
Simply note that \(X_{\alpha ^{*}} = \big (1 + \sum _{\beta \in \mathcal {A}^{*}} \exp (Z_{\beta })\big )^{-1}\).
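The identity in the previous footnote is just the usual logit (exponential-weight) normalization written relative to \(\alpha ^{*}\). As a quick numerical sanity check, the sketch below uses hypothetical scores \(Y\) for four strategies, with strategy 0 in the role of \(\alpha ^{*}\) and \(Z_{\beta } = Y_{\beta } - Y_{\alpha ^{*}}\); the values are illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical scores Y for four strategies; strategy 0 plays the role
# of alpha* and the remaining strategies form the set A*.
Y = np.array([1.0, 0.2, -0.5, 0.8])
X = np.exp(Y) / np.exp(Y).sum()            # logit choice map

Z = Y[1:] - Y[0]                            # Z_beta = Y_beta - Y_{alpha*}
X_star = 1.0 / (1.0 + np.exp(Z).sum())      # closed form from the footnote

assert np.isclose(X[0], X_star)             # both expressions agree
```

The equality holds because dividing numerator and denominator of \(X_{\alpha ^{*}}\) by \(\exp (Y_{\alpha ^{*}})\) turns each remaining term into \(\exp (Z_{\beta })\).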
In a discrete-time setting, if \(Z(n+1)= g(n) Z(n)\) and \(g(n)=k_i\) with probability \(p_i\), what we mean is that the quantity that a.s. governs the long-term growth of Z is not \(E(g)=\sum _{i} p_i k_i\), but \(\exp (E (\ln g))= \prod _i k_i^{p_i}\).
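The gap between \(E(g)\) and \(\exp (E(\ln g))\) is easy to see numerically. The sketch below uses hypothetical growth factors \(k = (2, 1/4)\) with \(p = (1/2, 1/2)\) (values chosen for illustration only): the arithmetic mean exceeds 1 while the geometric mean is below 1, so \(Z(n)\) decays along almost every trajectory despite growing in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)
k = np.array([2.0, 0.25])   # hypothetical growth factors k_i
p = np.array([0.5, 0.5])    # probabilities p_i

arith = (p * k).sum()       # E(g) = 1.125 > 1
geom = np.prod(k ** p)      # exp(E ln g) = sqrt(1/2) < 1

# Simulate Z(n+1) = g(n) Z(n): by the law of large numbers,
# log Z(n) / n converges to E(ln g) < 0, so Z(n) -> 0 (a.s.).
n = 10_000
g = rng.choice(k, size=n, p=p)
log_growth_rate = np.log(g).mean()

assert arith > 1 > geom
assert log_growth_rate < 0
```

This is exactly the gambler's-ruin effect described two footnotes above: expected size grows, yet almost every sample path shrinks.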
Recall that \(\sum \nolimits _{\alpha } dV_{\alpha } = 0\) since \(\sum \nolimits _{\alpha } X_{\alpha } = 1\).
Recall that \(\int _{0}^{t} \sigma _{\alpha }(X(s)) \,dW_{\alpha }(s)\) is continuous, so there is no Itô correction.
Theorem 4.1 actually applies to mixed dominated strategies as well (even iteratively dominated ones). The proof is a simple adaptation of the pure strategies case, so we omit it.
Recall here that equilibria of \(\mathcal {G}\) are also equilibria of \(\mathcal {G}^{\sigma }\), but the converse need not hold.
References
Akin E (1980) Domination or equilibrium. Math Biosci 50(3–4):239–250
Benaïm M, Weibull JW (2003) Deterministic approximation of stochastic evolution in games. Econometrica 71(3):873–903
Bergstrom TC (2014) On the evolution of hoarding, risk-taking, and wealth distribution in nonhuman and human populations. Proc Natl Acad Sci USA 111(3):10860–10867
Bertsekas DP, Gallager R (1992) Data networks, 2nd edn. Prentice Hall, Englewood Cliffs
Björnerstedt J, Weibull JW (1996) Nash equilibrium and evolution by imitation. In: Arrow KJ, Colombatto E, Perlman M, Schmidt C (eds) The rational foundations of economic behavior. St. Martin’s Press, New York, pp 155–181
Bravo M, Mertikopoulos P (2014) On the robustness of learning in games with stochastically perturbed payoff observations. arXiv:1412.6565
Cabrales A (2000) Stochastic replicator dynamics. Int Econ Rev 41(2):451–481
Fudenberg D, Harris C (1992) Evolutionary dynamics with aggregate shocks. J Econ Theory 57(2):420–441
Hofbauer J, Imhof LA (2009) Time averages, recurrence and transience in the stochastic replicator dynamics. Ann Appl Probab 19(4):1347–1368
Hofbauer J, Sigmund K (2003) Evolutionary game dynamics. Bull Am Math Soc 40(4):479–519
Hofbauer J, Sorin S, Viossat Y (2009) Time average replicator and best reply dynamics. Math Oper Res 34(2):263–269
Imhof LA (2005) The long-run behavior of the stochastic replicator dynamics. Ann Appl Probab 15(1B):1019–1045
Karatzas I, Shreve SE (1998) Brownian motion and stochastic calculus. Springer-Verlag, Berlin
Khasminskii RZ (2012) Stochastic stability of differential equations, vol 66, 2nd edn. Stochastic modelling and applied probability. Springer-Verlag, Berlin
Khasminskii RZ, Potsepun N (2006) On the replicator dynamics behavior under Stratonovich type random perturbations. Stoch Dyn 6:197–211
Kuo HH (2006) Introduction to stochastic integration. Springer, Berlin
Laraki R, Mertikopoulos P (2013) Higher order game dynamics. J Econ Theory 148(6):2666–2695
Laraki R, Mertikopoulos P (2015) Inertial game dynamics and applications to constrained optimization. SIAM J Control Optim (to appear)
Littlestone N, Warmuth MK (1994) The weighted majority algorithm. Inf Comput 108(2):212–261
Mertikopoulos P, Moustakas AL (2009) Learning in the presence of noise. In: GameNets ’09: proceedings of the 1st international conference on game theory for networks
Mertikopoulos P, Moustakas AL (2010) The emergence of rational behavior in the presence of stochastic perturbations. Ann Appl Probab 20(4):1359–1388
Nachbar JH (1990) Evolutionary selection dynamics in games. Int J Game Theory 19:59–89
Øksendal B (2007) Stochastic differential equations, 6th edn. Springer-Verlag, Berlin
Robson AJ, Samuelson L (2011) The evolutionary foundations of preferences. In: Benhabib J, Bisin A, Jackson MO (eds) Handbook of social economics, vol 1, chap 7. North-Holland, Amsterdam, pp 221–310
Rustichini A (1999) Optimal properties of stimulus-response learning models. Games Econ Behav 29:230–244
Samuelson L, Zhang J (1992) Evolutionary stability in asymmetric games. J Econ Theory 57:363–391
Sandholm WH (2010) Population games and evolutionary dynamics. Economic learning and social evolution. MIT Press, Cambridge, MA
Schlag KH (1998) Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits. J Econ Theory 78(1):130–156
Sorin S (2009) Exponential weight algorithm in continuous time. Math Program 116(1):513–528
Taylor PD, Jonker LB (1978) Evolutionary stable strategies and game dynamics. Math Biosci 40(1–2):145–156
van Kampen NG (1981) Itô versus Stratonovich. J Stat Phys 24(1):175–187
Vlasic A (2012) Long-run analysis of the stochastic replicator dynamics in the presence of random jumps. arXiv:1206.0344
Vovk VG (1990) Aggregating strategies. In: COLT ’90: proceedings of the 3rd workshop on computational learning theory, pp 371–383
Weibull JW (1995) Evolutionary game theory. MIT Press, Cambridge, MA
Acknowledgments
Supported in part by the French National Research Agency under Grant No. GAGA–13–JS01–0004–01 and the French National Center for Scientific Research (CNRS) under Grant No. PEPS–GATHERING–2014. The authors are grateful to the associate editor in charge of the manuscript and to two anonymous referees for their insightful comments and remarks.
Dedicated to Abraham “Merale” Neyman on the occasion of his 66th birthday.
Appendix: Auxiliary results from stochastic analysis
In this appendix, we provide an asymptotic growth bound for Wiener processes relying on the law of the iterated logarithm. This result appears in a similar context in Bravo and Mertikopoulos (2014); the proof below is given only for completeness and ease of reference.
Lemma 6.1
Let \(W(t) = (W_{1}(t),\ldots ,W_{n}(t))\), \(t\ge 0\), be an n-dimensional Wiener process and let Z(t) be a bounded, continuous process in \(\mathbb {R}^{n}\). Then:
$$\begin{aligned} \lim _{t\rightarrow \infty } \frac{1}{f(t)} \int _{0}^{t} Z(s) \cdot dW(s) = 0 \quad \text {(a.s.)}, \end{aligned}$$
(6.1)
for any function \(f:[0,\infty )\rightarrow \mathbb {R}\) such that \(\lim _{t\rightarrow \infty } \left( t\log \log t\right) ^{-1/2} f(t) = +\infty \).
Proof
Let \(\xi (t) = \int _{0}^{t} Z(s) \cdot dW(s) = \sum _{i=1}^{n} \int _{0}^{t} Z_{i}(s) \,dW_{i}(s)\). Then, the quadratic variation \(\rho = [\xi ,\xi ]\) of \(\xi \) satisfies:
$$\begin{aligned} \rho (t) = \sum _{i=1}^{n} \int _{0}^{t} Z_{i}^{2}(s) \,ds = \int _{0}^{t} \left\| Z(s) \right\| ^{2} \,ds \le M t, \end{aligned}$$
(6.2)
where \(M = \sup _{t\ge 0} \left\| Z(t) \right\| ^{2} < +\infty \) (recall that Z(t) is bounded by assumption). On the other hand, by the time-change theorem for martingales (Øksendal 2007, Corollary 8.5.4), there exists a Wiener process \(\widetilde{W}(t)\) such that \(\xi (t) = \widetilde{W}(\rho (t))\), and hence:
$$\begin{aligned} \frac{\xi (t)}{f(t)} = \frac{\widetilde{W}(\rho (t))}{f(t)}. \end{aligned}$$
Obviously, if \(\lim _{t\rightarrow \infty } \rho (t) \equiv \rho (\infty ) < +\infty \), \(\widetilde{W}(\rho (\infty ))\) is normally distributed, so \(\widetilde{W}(\rho (t))/f(t) \rightarrow 0\) and there is nothing to show. Otherwise, if \(\lim _{t\rightarrow \infty } \rho (t) = +\infty \), the quadratic variation bound (6.2) and the law of the iterated logarithm yield:
$$\begin{aligned} \limsup _{t\rightarrow \infty } \frac{\big |\widetilde{W}(\rho (t))\big |}{f(t)} \le \limsup _{t\rightarrow \infty } \frac{\sqrt{2 \rho (t) \log \log \rho (t)}}{f(t)} \le \limsup _{t\rightarrow \infty } \frac{\sqrt{2 M t \log \log (M t)}}{f(t)} = 0, \end{aligned}$$
and our claim follows. \(\square \)
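Lemma 6.1 can also be sanity-checked by simulation. The sketch below discretizes \(\xi (t) = \int _{0}^{t} Z(s)\,dW(s)\) with a hypothetical bounded integrand \(Z(s) = \sin s\) (so \(M = 1\) in the lemma's notation) and the test function \(f(t) = t\), which satisfies the growth condition; the ratio \(\xi (t)/f(t)\) should already be tiny at moderate horizons.

```python
import numpy as np

rng = np.random.default_rng(1)

# Euler discretization of xi(t) = int_0^t Z(s) dW(s) with the bounded
# (illustrative) integrand Z(s) = sin(s), so sup |Z|^2 = M = 1.
T, dt = 1_000.0, 0.01
t = np.arange(0, T, dt)
dW = rng.normal(0.0, np.sqrt(dt), size=t.size)
xi = np.cumsum(np.sin(t) * dW)

# f(t) = t grows faster than sqrt(t log log t), so the lemma predicts
# xi(t)/f(t) -> 0 (a.s.); here Var xi(T) is about T/2, so the ratio at
# t = T has standard deviation of order 1/sqrt(2T) ~ 0.02.
ratio = xi[-1] / T
assert abs(ratio) < 0.2
```

The same experiment with \(f(t) = \sqrt{t}\) would not converge, since \(\xi (t)/\sqrt{t}\) stays of constant order, matching the sharpness of the growth condition.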
Mertikopoulos, P., Viossat, Y. Imitation dynamics with payoff shocks. Int J Game Theory 45, 291–320 (2016). https://doi.org/10.1007/s00182-015-0505-7