Abstract
We study the long-run outcomes of noisy asynchronous repeated games whose players are heterogeneous in their patience. The players repeatedly play a \(2\times 2\) coordination game with random pairwise matching. The games are noisy because the players may make mistakes when choosing their actions, and asynchronous because only one player can move in each period. We characterize the long-run outcomes of Markov perfect equilibria that are robust to the mistakes and show that if one player is sufficiently patient while the other players are not, the efficient state can be the unique robust outcome even if it is risk-dominated. Since heterogeneity is necessary for this result, we argue that it in effect enables the most patient player to act as a leader.
Notes
The concept of stochastic stability was first introduced by Foster and Young (1990).
We do not exclude the possibility that the discount factors are zero (i.e., the players are myopic).
Bhaskar and Vega-Redondo (2002) show that if players incur costs of memorizing past histories, every Nash equilibrium of the asynchronous repeated game involves only Markovian strategies.
Because we consider a dynamic equilibrium concept, our model is not an evolutionary model of a disequilibrium adaptation process toward a static equilibrium. This is why we do not adopt the term “stochastically stable outcome.”
If there are more than two players, we assume that only one player is forward-looking and the rest of the players are myopic. Hence, heterogeneity necessarily exists.
Besides considering forward-looking players, there have been two other approaches for bringing about the result that the efficient equilibrium is uniquely selected even if it is risk-dominated: one approach considers games with local interaction [see Weidenholzer (2010) and references therein] and the other one considers games with pre-play communication [e.g., Bhaskar (1998), Kim and Sobel (1995), Matsui (1991) and Sobel (1993)].
Fudenberg and Maskin (1990) consider finitely repeated games without discounting in which players may make mistakes when choosing their actions. The players move simultaneously in each period. Restricting attention to strategies of finite complexity, they show that the efficient outcome is uniquely evolutionarily stable (Smith 1982) for any two-player common interest game.
More interestingly, Dutta also shows that the result continues to hold even when we consider an asynchronizable game in which players can choose when to move.
Lagunoff and Matsui (1997) characterize the long-run outcomes of asynchronous repeated games in which the stage game is a pure coordination game. In pure coordination games, though, a strict Nash equilibrium cannot be both efficient and risk-dominated.
This is also true for reputation models which we mention below.
Fudenberg and Levine (1992) extend the result of their (1989) paper to the case in which actions are not perfectly observed.
While reputation models consider a belief about a commitment type, Sandroni (2000) considers a different kind of belief that generates cooperation in a two-player common interest game. He shows that cooperation can be achieved if players believe in strict reciprocity in the following sense: if the two players cooperate in the current period, each believes that the opponent is more likely to cooperate in the next period than in the current one; and if they do not cooperate in the current period, each believes that the opponent is more likely not to cooperate in the next period than in the current one.
Strictly speaking, player \(i\)’s equilibrium strategy should be defined as
$$\begin{aligned} \psi _i^*(x) = \arg \max _{y\in \{0, 1\}} W_i(y, x_{-i} ) \equiv (1-\varepsilon ) V_i(y, x_{-i}) +\varepsilon V_i (1-y, x_{-i}) \end{aligned}$$because \(V_i(x)\) is evaluated at the realized action profile. However, we may focus on \(V_i(\,\cdot \,)\) because \(W_i(y, x_{-i}) - W_i (1-y, x_{-i}) \propto V_i(y, x_{-i}) - V_i (1-y, x_{-i}) \).
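From the definition above, \(W_i(y, x_{-i}) - W_i(1-y, x_{-i}) = (1-2\varepsilon )\left( V_i(y, x_{-i}) - V_i(1-y, x_{-i}) \right) \), so the argmax is unaffected whenever \(\varepsilon <1/2\). A quick numeric check of this proportionality (my own sketch; the values of v0, v1, and eps are purely illustrative):

```python
# Check: W_i(y) - W_i(1-y) = (1 - 2*eps) * (V_i(y) - V_i(1-y)),
# so maximizing W_i agrees with maximizing V_i whenever eps < 1/2.
def w_diff(v0, v1, eps):
    w0 = (1 - eps) * v0 + eps * v1   # W_i(0, x_{-i})
    w1 = (1 - eps) * v1 + eps * v0   # W_i(1, x_{-i})
    return w0 - w1

v0, v1 = 3.7, 2.1                    # hypothetical continuation values
for eps in (0.0, 0.1, 0.3):
    assert abs(w_diff(v0, v1, eps) - (1 - 2 * eps) * (v0 - v1)) < 1e-12
```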
Unfortunately, since the detailed balance conditions [see, e.g., Blume (1997)] are not satisfied when the best responses are different among the players, it is difficult to explicitly derive the general formula for the invariant distribution in terms of the birth and death rates.
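Even without a closed form, the invariant distribution of a finite irreducible chain can still be computed numerically. A minimal Python sketch (my own illustration; the \(3\times 3\) transition matrix is hypothetical, not the chain studied in the paper):

```python
# Power iteration for the invariant distribution pi of a finite chain.
# This asymmetric cyclic matrix violates detailed balance, yet pi is
# still well defined and easy to approximate.
P = [
    [0.5, 0.4, 0.1],
    [0.1, 0.5, 0.4],
    [0.4, 0.1, 0.5],
]

def invariant(P, iters=5000):
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = invariant(P)
# Stationarity pi P = pi holds even though detailed balance fails.
assert all(abs(sum(pi[i] * P[i][j] for i in range(3)) - pi[j]) < 1e-9 for j in range(3))
assert abs(pi[0] * P[0][1] - pi[1] * P[1][0]) > 1e-3  # no detailed balance
```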
Because Lemma 1 holds when \(\varepsilon =0\), the efficient state is, in fact, the unique long-run outcome even without mistakes. Therefore, in the two-player case, noise is not necessary for the more patient player to take the leadership. However, we will see that this result does not extend to the case with more than two players.
Recall that, because people play the stage game through random pairwise matching in our model, \(N\) is assumed to be an even number.
If we assume instead that, in case of a tie, the myopic players stick to the strategy to which they have committed with probability \(1-\varepsilon \) and change their strategy with probability \(\varepsilon \), then allowing indifference will not affect our results.
Unlike the case with friction, it follows that the limit invariant distribution puts all the mass on the efficient state if and only if \(\psi _1^*=\psi _2^*= \mathbf {0}\). In particular, if player 1 always takes the efficient action whereas player 2 behaves myopically as in Lemma 1, the limit invariant distribution puts half of the mass on each of \((0, 0)\) and \((0, 1)\).
If \(\frac{2(a-c)}{a+d-2c}<\frac{2(d-b)}{a-c+d-b}\) and \(\delta \in \left( \frac{2(a-c)}{a+d-2c}, \frac{2(d-b)}{a-c+d-b} \right) \), the third MPE is the unique equilibrium. Thus, the efficient state is the unique robust equilibrium outcome even though the players are equally patient. The parameter region \(\{(a, b, c, d)\}\subseteq \mathbb {R}^4\) in which \(\frac{2(a-c)}{a+d-2c}<\frac{2(d-b)}{a-c+d-b}\) vanishes as \(d\rightarrow a\), i.e., as the degree of Pareto dominance goes to zero, because, normalizing \(c\) to zero, the condition is equivalent to \(( d-a <)\, b< \frac{d+a}{d} (d-a)\) where the first inequality follows from \(a-c > d-b\).
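The equivalence claimed in the note above can be checked numerically. A quick sketch (my own, with random illustrative payoffs; \(c\) is normalized to zero as in the note):

```python
import random

random.seed(0)
# With c = 0, check that 2a/(a+d) < 2(d-b)/(a+d-b)  <=>  b < (d+a)(d-a)/d
# on random payoffs with d > a > 0 (beta Pareto-dominant), d > b, and
# b > d - a (alpha risk-dominant, i.e. a - c > d - b).
for _ in range(10_000):
    a = random.uniform(0.1, 5.0)
    d = random.uniform(a, a + 5.0)
    b = random.uniform(d - a, d)
    lhs = 2 * a / (a + d) < 2 * (d - b) / (a + d - b)
    rhs = b < (d + a) * (d - a) / d
    assert lhs == rhs
```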
References
Aoyagi M (1996) Reputation and dynamic Stackelberg leadership in infinitely repeated games. J Econ Theory 71:378–393
Beggs A (2005) Waiting times and equilibrium selection. Econ Theory 25:599–628
Bhaskar V, Vega-Redondo F (2002) Asynchronous choice and Markov equilibria. J Econ Theory 103:334–350
Bhaskar V (1998) Noisy communication and the evolution of cooperation. J Econ Theory 82:110–131
Blume LE (2004) Evolutionary equilibrium with forward-looking players. The Santa Fe Institute, Santa Fe
Blume LE (1997) Population games. In: Arthur WB, Durlauf SN, Lane DA (eds) The economy as an evolving complex system II. Addison-Wesley, Reading
Camerer CF, Ho TH, Chong JK (2002) Sophisticated experience-weighted attraction learning and strategic teaching in repeated games. J Econ Theory 104:137–188
Carlsson H, van Damme E (1993) Global games and equilibrium selection. Econometrica 61:989–1018
Celentani M, Fudenberg D, Levine DK, Pesendorfer W (1996) Maintaining a reputation against a long-lived opponent. Econometrica 64:691–704
Cripps M, Schmidt K, Thomas J (1996) Reputation in perturbed repeated games. J Econ Theory 69:387–410
Dutta PK (2012) Coordination need not be a problem. Games Econ Behav 76:519–534
Ellison G (2000) Basins of attraction, long-run equilibria and the speed of step-by-step evolution. Rev Econ Stud 67:17–45
Ellison G (1997) Learning from personal experience: one rational guy and the justification of myopia. Games Econ Behav 19:180–210
Evans R, Thomas JP (1997) Reputation and experimentation in repeated games with two long-run players. Econometrica 65:1153–1173
Foster D, Young HP (1990) Stochastic evolutionary game dynamics. Theor Popul Biol 38:219–232
Freidlin M, Wentzell A (1984) Random perturbations of dynamical systems. Springer, New York
Fudenberg D, Levine DK (1992) Maintaining a reputation when strategies are imperfectly observed. Rev Econ Stud 59:561–579
Fudenberg D, Levine DK (1989) Reputation and equilibrium selection in games with a patient player. Econometrica 57:759–778
Fudenberg D, Maskin E (1990) Evolution and cooperation in noisy repeated games. Am Econ Rev 80:274–279
Harsanyi JC, Selten R (1988) A general theory of equilibrium selection in games. MIT Press, Cambridge
Hyndman K, Terracol A, Vaksmann J (2009) Learning and sophistication in coordination games. Exp Econ 12:450–472
Kajii A, Morris S (1997) The robustness of equilibria to incomplete information. Econometrica 65:1283–1309
Kandori M, Mailath GJ, Rob R (1993) Learning, mutation, and long run equilibria in games. Econometrica 61:29–56
Kim Y-G, Sobel J (1995) An evolutionary approach to pre-play communication. Econometrica 63:1181–1193
Lagunoff R (2000) On the evolution of Pareto-optimal behavior in repeated coordination problems. Int Econ Rev 41:273–293
Lagunoff R, Matsui A (1997) Asynchronous choice in repeated coordination games. Econometrica 65:1467–1477
Maskin E, Tirole J (1988a) A theory of dynamic oligopoly, I: overview and quantity competition with large fixed costs. Econometrica 56:549–570
Maskin E, Tirole J (1988b) A theory of dynamic oligopoly, II: price competition, kinked demand curves and fixed costs. Econometrica 56:571–600
Matsui A (1991) Cheap-talk and cooperation in a society. J Econ Theory 54:245–258
Matsui A, Matsuyama K (1995) An approach to equilibrium selection. J Econ Theory 65:415–434
McCannon BC (2011) Coordination between a sophisticated and fictitious player. J Econ 102:263–273
Morris S, Rob R, Shin HS (1995) \(p\)-dominance and belief potential. Econometrica 63:145–157
Noldeke G, Samuelson L (1993) An evolutionary analysis of backward and forward induction. Games Econ Behav 5:425–454
Norris JR (1998) Markov chains (Cambridge series in statistical and probabilistic mathematics). Cambridge University Press, Cambridge
Robles J (2001) Evolution in finitely repeated coordination games. Games Econ Behav 34:312–330
Sandroni A (2000) Reciprocity and cooperation in repeated coordination games: the principled-player approach. Games Econ Behav 32:157–182
Schmidt K (1993) Reputation and equilibrium characterization in repeated games with conflicting interests. Econometrica 61:325–351
Smith JM (1982) Evolution and the theory of games. Cambridge University Press, Cambridge
Sobel J (1993) Evolutionary stability and efficiency. Econ Lett 42:301–312
Weidenholzer S (2010) Coordination games and local interactions: a survey of the game theoretic literature. Games 1:551–585
Young HP (1993) The evolution of conventions. Econometrica 61:57–84
Acknowledgments
I am grateful to David Levine, John Nachbar, Marcus Berliant, Rohan Dutta, two anonymous referees, seminar participants at Washington University in St. Louis, and participants at the 2012 Game Theory Workshop in Japan for their helpful comments. Any remaining errors are my own.
Appendix
In this appendix, proofs omitted from the main text are provided.
Proof of Lemma 1
Since \(V_i(x) \le \frac{d}{1-\delta _i} <\infty \),
as \(\varepsilon \rightarrow 0\). Thus, we focus on the \(\Delta V_i (x_j)\)’s and invoke their continuity in \(\varepsilon \). Note that \(\Delta V_i(0) -\Delta V_i(1) = u_\beta (0) -u_\alpha (1) - u_\beta (1) +u_\alpha (2) = a-c +d-b >0\). Suppose an MPE \(\psi ^*\) exists. Let \(\psi ^*_i = (\psi _i^*(0), \psi ^*_i (1) )\). We first show, in the following two claims, that \(\Delta V_2(0) >0\) and \(\Delta V_2(1) <0\) in equilibrium.
Claim 1
\(\Delta V_2(0) >0\).
Proof
If \(\psi ^*_1= (0, 0)\), \(\Delta V_2 (0) = (1- \delta _2/2) ^{-1} (u_\beta (0)-u_\alpha (1) ) >0\). If \(\psi ^*_1 = (1, 1)\),
Therefore, \(\Delta V_2(0) = u_\beta (0) -u_\alpha (1) + \frac{\delta _2}{2-\delta _2} ( u_\beta (1)- u_\alpha (2) ) >0 \) where the last inequality follows because \(\delta _2 < \frac{2(d-b)}{a-c+d-b}\). Finally, suppose \(\psi ^*_1= (0, 1)\) but \(\Delta V_2(0) \le 0\). (Note that \(\psi _1^*\ne (1, 0)\) because, if so, \(\Delta V_1(0) \le \Delta V_1(1)\).) Then, \(\Delta V_2 (0) = u_\beta (0) -u_\alpha (1) + \frac{\delta _2}{2} \left( V_2(0, 0)- V_2(1, 1) \right) \). Because \(V_2(0, 0) = u_\beta (0) + \frac{\delta _2}{2} V_2(0, \psi _2^*(0) ) +\frac{\delta _2}{2} V_2( 0, 0)\) and \(V_2(0, \psi _2^*(0) ) \ge V_2(0, 0)\), we have \(V_2(0, 0) \ge \frac{u_\beta (0)}{ 1-\delta _2}\). Similarly, \(V_2(1, 1) = u_\alpha (2) + \frac{\delta _2}{2} V_2(1, \psi _2^*(1) ) +\frac{\delta _2}{2} V_2( 1, 1)\). But, because \(\Delta V_2 (1) < \Delta V_2 (0) \le 0\), \(\psi _2^*(1)=1\). Thus, \(V_2(1, 1)=\frac{u_\alpha (2)}{1-\delta _2}\). Therefore, \(\Delta V_2(0) \ge u_\beta (0)- u_\alpha (1) + \frac{\delta _2}{2(1-\delta _2)} ( u_\beta (0) - u_\alpha (2) ) >0\). \(\square \)
Claim 2
\(\Delta V_2(1) <0\).
Proof
If \(\psi ^*_1= (1, 1)\), \(\Delta V_2 (1) = (1-\delta _2/2)^{-1} ( u_\beta (1) -u_\alpha (2) )<0\). If \(\psi ^*_1 = (0, 0)\),
Thus, \(\Delta V_2 (1) = u_\beta (1)- u_\alpha (2) + \frac{\delta _2}{2-\delta _2} (u_\beta (0)-u_\alpha (1) ) < 0\). Finally, if \(\psi ^*_1 = (0, 1)\), \(\Delta V_2(1) \le u_\beta (1)- u_\alpha (2) + \frac{\delta _2}{2(1-\delta _2)} ( u_\beta (0) - u_\alpha (2) ) <0\) because \(V_2(0, 0) = \frac{u_\beta (0)}{ 1-\delta _2}\) (by Claim 1), \(V_2(1, 1)\ge \frac{u_\alpha (2)}{1-\delta _2}\), and \(\delta _2 < \frac{2(a-c)}{a+d-2c}\). \(\square \)
Now, suppose \(\Delta V_1(1) \le 0\). By Claims 1 and 2, \(\psi ^*_2 =(0, 1)\). Then, since \(V_1(0, 0)\ge \frac{u_\beta (0)}{1-\delta _1}, V_1(1, 1) = \frac{u_\alpha (2)}{1-\delta _1}\), and \(\delta _1 > \frac{2(a-c)}{a+d-2c}\),
Hence, we must have \(\Delta V_1(1) >0\) and therefore \(\Delta V_1 (0) > \Delta V_1(1) >0\). Thus, if an MPE \(\psi ^*\) exists, it is uniquely given by \(\psi ^*_1 = (0, 0)\) and \(\psi ^*_2 = (0, 1)\). Because our state and action spaces are finite, the existence of an MPE follows by a standard argument. \(\square \)
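As a numerical illustration of Lemma 1 (my own sketch, not code from the paper), the following enumerates all pure Markov strategy pairs in the two-player game, computes each player's values by iterating the Bellman recursion \(V_i(x) = u_i(x) + \delta _i \left[ \tfrac{1}{2} V_i(\psi _1(x_2), x_2) + \tfrac{1}{2} V_i(x_1, \psi _2(x_1)) \right] \) implicit in the proof, and keeps the pairs that are mutual best responses. The payoffs and discount factors are hypothetical values chosen to satisfy the lemma's hypotheses.

```python
# State x = (x1, x2); action 1 = alpha, 0 = beta; one randomly chosen
# player revises each period. Hypothetical payoffs with d > a (beta
# efficient) and a - c > d - b (alpha risk-dominant).
a, b, c, d = 3.0, 1.5, 0.0, 4.0
delta = {1: 0.95, 2: 0.30}  # thresholds: 6/7 ~ 0.857 and 5/5.5 ~ 0.909

def flow(own, opp):
    # Stage payoff: a or b when playing alpha, c or d when playing beta.
    if own == 1:
        return a if opp == 1 else b
    return c if opp == 1 else d

def values(i, psi1, psi2):
    # Player i's value V_i(x1, x2) under strategies psi_j : x_{-j} -> action.
    V = {(x1, x2): 0.0 for x1 in (0, 1) for x2 in (0, 1)}
    for _ in range(3000):  # value iteration; contraction with factor delta
        V = {(x1, x2): flow((x1, x2)[i - 1], (x1, x2)[2 - i])
             + delta[i] * (0.5 * V[(psi1[x2], x2)] + 0.5 * V[(x1, psi2[x1])])
             for x1 in (0, 1) for x2 in (0, 1)}
    return V

strategies = [(y0, y1) for y0 in (0, 1) for y1 in (0, 1)]
mpe = []
for psi1 in strategies:
    for psi2 in strategies:
        V1, V2 = values(1, psi1, psi2), values(2, psi1, psi2)
        ok1 = all(V1[(psi1[x2], x2)] >= V1[(1 - psi1[x2], x2)] for x2 in (0, 1))
        ok2 = all(V2[(x1, psi2[x1])] >= V2[(x1, 1 - psi2[x1])] for x1 in (0, 1))
        if ok1 and ok2:
            mpe.append((psi1, psi2))

# Lemma 1 predicts the unique MPE: psi1* = (0, 0), psi2* = (0, 1).
assert mpe == [((0, 0), (0, 1))]
```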
Proposition A1 Suppose \(N\!=\!2\) and \(\delta _1 \!=\! \delta _2\). If \(\frac{2(a-c)}{a+d-2c} \!\ge \! \frac{2(d-b)}{a-c+d-b}\), \(\lim _{\varepsilon \rightarrow 0} \pi ^\varepsilon ((0, 0)) \ne 1\) whenever an MPE uniquely exists.
Proof
Let \(\delta _1 =\delta _2= \delta \) and \(\varepsilon =0\). Suppose \(\psi _j^*= (0, 1)\). We first show \(\psi _i^*(1) =0 \Leftrightarrow \delta > \frac{2(a-c)}{a+d-2c}\). If \(1\in \psi _i^*(1) \), \(V_i (1, 1) = \frac{u_\alpha (2)}{1-\delta }\). Then, because \(V_i( 0, 0) = \frac{u_\beta (0)}{1-\delta }\), \(\Delta V_i(1) = u_\beta (1)- u_\alpha (2) + \frac{\delta }{2-\delta } (u_\beta (0)-u_\alpha (1) ) \le 0 \Rightarrow \delta \le \frac{2(a-c)}{a+d-2c}\). If \(\psi _i^*(1)= 0\), we have
Because \(V_i( 0, 0) = \frac{u_\beta (0)}{1-\delta }\), we can solve the above equations for \(V_i(1, 0)\) and \(V_i(1, 1)\). Then, it follows that \(\Delta V_i(1) >0 \Rightarrow \delta > \frac{2(a-c)}{a+d-2c}\).
Therefore, by the previous argument (see the proofs of Claims 1 and 2),
Note that \(\psi _i= (1, 0)\) is strictly dominated. Thus, three types of (pure) Markov perfect equilibria can exist: (i) \(\psi _i^*= \psi _j^*= (0, 1)\); (ii) \(\psi _i^*= \psi _j^*= (1, 1)\); and (iii) \(\psi _i^*= (0, 1), \psi _j^*= (0, 0)\). In the first MPE, the limit invariant distribution puts half of the mass on each of the static equilibria because both players behave as if they were myopic. In the second MPE, the limit invariant distribution puts all the mass on the risk-dominant state because both players always take action \(\alpha \). The third MPE is the same as the one we attained in Lemma 1, so the limit invariant distribution puts all the mass on the efficient state.
Now, suppose \(\delta \le \frac{2(d-b)}{a-c+d-b}\). Then, because \(\delta < \frac{2(a-c)}{a+d-2c}\) by assumption, the first MPE obtains. If instead \(\delta > \frac{2(d-b)}{a-c+d-b}\), the second MPE obtains. Therefore, there always exists an MPE under which \(\lim _{\varepsilon \rightarrow 0} \pi ^\varepsilon ((0, 0)) \ne 1\) (see footnote 24). \(\square \)
Proof of Lemma 2
As in Lemma 1, we let \(\varepsilon =0\) and invoke continuity in \(\varepsilon \). Suppose \(S_N>1\) so that there exists \(m \in \{0, 1, \ldots , N-1\}\) such that \(m < S_N-1\). Then, by (14),
for \(0\le m < S_N -1\) because \(\psi _\alpha ^*(x) = \psi _\beta ^*(x) =0\). Therefore,
where \(\Delta V_1(m) = V_1(0, m)- V_1(1, m)\). On the other hand, because \( S_N < N-2\) when \(N\ge 4\), there exists \(m \in \{0, 1, \ldots , N-1\}\) such that \(m > S_N+1\). Then, we get
for \(S_N +1 < m \le N-1\). But, by (20),
This, in turn, implies that
by (20). Continuing in this way, we obtain \(\Delta V_1(m) >0\) for \(0\le m < S_N-1\) by induction. The same argument applies to (21), showing that \(\Delta V_1(m) <0\) for \(S_N+1 < m \le N-1\). \(\square \)
Proof of Lemma 4
As in the previous lemmas, we let \(\varepsilon =0\) and invoke continuity in \(\varepsilon \). Then, we have
Suppose \(\Delta V_1(m^*+1) \le 0\). Then,
(25) holds with equality iff \(\Delta V_1(m^*) \ge 0\). If \(4\le N < \frac{2(a-c)}{a-c-(d-b)}\), it follows that \(m^*= \frac{N-2}{2}\). Therefore, (25) and (26) are combined and simplified to
where \(Q(n) = \left( 1 - \delta \frac{1}{N} - \delta \frac{m^*+n}{N} \right) ^{-1}\). Then, solving (27) iteratively, we get
where \(f(1)=1, f(2)=m^*, f(3)= m^*(m^*-1)\), and \(f(k)= m^*(m^*-1)\cdots ( m^*-k+2)\) for \(k \ge 4\). Then, since \(V_1(0, 0) - V_1(1, N-1) = \frac{d-a}{1-\delta }\), \(V_1(0, m^*)- V_1(1, m^*+1) \rightarrow \infty \) as \(\delta \rightarrow 1\). Hence, by (24), there exists \(\delta ^*\in (0, 1)\) such that \(\Delta V_1(m^*+1) >0\) for all \(\delta \in (\delta ^*, 1)\), which contradicts our supposition; we must therefore have \(\delta \in (0, \delta ^*]\). Equivalently, \(\Delta V_1(m^*+1)>0\) if \(\delta \in (\delta ^*, 1)\). Moreover, for such \(\delta \)’s,
since \(u_\beta (m^*) > u_\alpha ( m^*+1)\) and \(\Delta V_1 (m^*-1) >0\). \(\square \)
Fujishima, S. The emergence of cooperation through leadership. Int J Game Theory 44, 17–36 (2015). https://doi.org/10.1007/s00182-014-0417-y