Technical Developments in Sect. 4
The Prohorov Metric: The Prohorov metric \(\rho _A\) for a metric space A is such that, for any distributions \(\pi ,\pi '\in \mathcal{P}(A)\),
$$\begin{aligned} \rho _A(\pi ,\pi ')=\inf \left\{ \epsilon >0\mid \pi '((A')^\epsilon )+\epsilon \ge \pi (A'),\; \text{ for } \text{ all } A'\in {\mathscr {B}}(A)\right\} , \end{aligned}$$
(A.1)
where
$$\begin{aligned} (A')^\epsilon =\{a\in A\mid d_A(a,a')<\epsilon \text{ for } \text{ some } a'\in A'\}. \end{aligned}$$
(A.2)
The metric \(\rho _A\) is known to generate the weak topology on \(\mathcal{P}(A)\) when A is separable.
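To make (A.1) concrete, the following minimal Python sketch (brute force plus bisection of our own choosing, suitable only for tiny finitely supported examples) evaluates the infimum for two distributions on the real line; feasibility of a given \(\epsilon \) in (A.1) is monotone in \(\epsilon \), which justifies the bisection.

```python
from itertools import combinations

def prohorov_finite(xs, p, ys, q, tol=1e-9):
    """Evaluate the infimum in (A.1) for finitely supported distributions:
    p puts mass p[i] on xs[i], q puts mass q[j] on ys[j].  Brute force over
    subsets of supp(p), so only usable for tiny illustrative examples."""
    def feasible(eps):
        # check pi'((A')^eps) + eps >= pi(A') over all subsets of supp(p);
        # this suffices, since shrinking A' to supp(p) only helps the bound
        for r in range(1, len(xs) + 1):
            for sub in combinations(range(len(xs)), r):
                mass_p = sum(p[i] for i in sub)
                fat_q = sum(w for y, w in zip(ys, q)
                            if any(abs(y - xs[i]) < eps for i in sub))
                if fat_q + eps < mass_p - tol:
                    return False
        return True
    lo, hi = 0.0, 1.0          # the infimum always lies in [0, 1]
    for _ in range(50):        # feasibility is monotone in eps, so bisect
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if not feasible(mid) else (lo, mid)
    return hi

# two point masses at distance 0.3: the defining inequality forces eps > 0.3,
# so the computed value is (up to bisection accuracy) 0.3
print(prohorov_finite([0.0], [1.0], [0.3], [1.0]))
```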
According to Parthasarathy [33] (Theorem II.7.1), the strong LLN applies to the empirical distribution under the weak topology, and hence under the Prohorov metric. In the following, we state its weak version.
Lemma 1
Given separable metric spaces A and B, suppose we are given a distribution \(\pi _A\in \mathcal{P}(A)\) and a measurable mapping \(y\in \mathcal{M}(A,B)\). Then, for any \(\epsilon >0\), as long as n is large enough,
$$\begin{aligned} (\pi _A)^n\left( \left\{ a\equiv (a_m)_{m=1,\ldots ,n}\in A^n\mid \rho _B(\varepsilon (a) \cdot y^{-1},\pi _A \cdot y^{-1})<\epsilon \right\} \right) >1-\epsilon . \end{aligned}$$
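The following simulation sketch illustrates Lemma 1 under assumed primitives, namely a three-atom \(\pi _A\) and the map \(y(a)=a^2\); since the total variation distance over atoms dominates the Prohorov distance \(\rho _B\), its vanishing exhibits the claimed convergence.

```python
import numpy as np

rng = np.random.default_rng(1)
atoms = np.array([0.0, 1.0, 2.0])    # pi_A: uniform on three atoms (assumed)
y = lambda a: a**2                   # an illustrative measurable map A -> B

n = 10_000
a = rng.choice(atoms, size=n)        # one draw from (pi_A)^n
pushed = y(a)                        # y applied coordinate-wise

# compare eps(a) . y^{-1} with pi_A . y^{-1} atom by atom; the total
# variation computed below dominates the Prohorov distance rho_B
emp = np.array([(pushed == b).mean() for b in y(atoms)])
tv = 0.5 * np.abs(emp - 1/3).sum()
print(tv)    # tends to 0 as n grows, as Lemma 1 asserts
```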
For a separable metric space A, a point \(a\in A\), and an \((n-1)\)-point empirical distribution \(\pi \in \mathcal{P}_{n-1}(A)\), we use \((a,\pi )_n\) to denote the member of \(\mathcal{P}_n(A)\) that places an additional 1/n weight on the point a, with the probability masses in \(\pi \) reduced to \((n-1)/n\) times their original values. For \(a\in A^n\) and \(m=1,\ldots ,n\), we have \((a_m,\varepsilon (a_{-m}))_n=\varepsilon (a)\). Concerning the Prohorov metric, we also have a simple but useful observation.
Lemma 2
Let A be a separable metric space. Then, for any \(n=2,3,\ldots \), \(a\in A\), and \(\pi \in \mathcal{P}_{n-1}(A)\),
$$\begin{aligned} \rho _A\left( (a,\pi )_n,\pi \right) \le \frac{1}{n}. \end{aligned}$$
Proof
Let \(A'\in {\mathscr {B}}(A)\) be chosen. If \(a\notin A'\), then
$$\begin{aligned} (a,\pi )_n(A')\le \pi (A')\le (a,\pi )_n(A')+\frac{1}{n}; \end{aligned}$$
(A.3)
if \(a\in A'\), then
$$\begin{aligned} (a,\pi )_n(A')-\frac{1}{n}\le \pi (A')\le (a,\pi )_n(A'). \end{aligned}$$
(A.4)
Hence, it is always true that
$$\begin{aligned} \mid (a,\pi )_n(A')-\pi (A')\mid \le \frac{1}{n}. \end{aligned}$$
(A.5)
In view of (A.1) and (A.2), we have
$$\begin{aligned} \rho _A\left( (a,\pi )_n,\pi \right) \le \frac{1}{n}. \end{aligned}$$
(A.6)
We have thus completed the proof. \(\square \)
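A quick numerical check of the proof's key estimate (A.5), with the illustrative choice n = 10 and distinct atoms assumed; the total variation computed below dominates \(\rho _A((a,\pi )_n,\pi )\).

```python
import numpy as np

n = 10
# pi has n-1 atoms of mass 1/(n-1); (a, pi)_n keeps them at mass 1/n and
# adds the new atom a at mass 1/n (the atom locations do not matter here)
m_new = np.full(n, 1/n)                          # masses under (a, pi)_n
m_old = np.append(np.full(n - 1, 1/(n - 1)), 0)  # masses under pi itself

# sup over Borel A' of |(a,pi)_n(A') - pi(A')| is the total variation,
# which dominates the Prohorov distance rho_A((a,pi)_n, pi)
tv = np.maximum(m_new - m_old, 0).sum()
print(tv)   # exactly 1/n = 0.1, consistent with (A.5) and Lemma 2
```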
The following result is important for showing the near-trajectory evolution of aggregate environments in large multi-period games. Among other things, it relies on Lemma 1.
Lemma 3
Given a separable metric space A and complete separable metric spaces B and C, suppose \(y_n\in \mathcal{M}(A^n,B^n)\) for every \(n\in {\mathbb {N}}\), \(\pi _A\in \mathcal{P}(A)\), \(\pi _B\in \mathcal{P}(B)\), and \(\pi _C\in \mathcal{P}(C)\). If
$$\begin{aligned} (\pi _A)^n\left( \{a\in A^n\mid \rho _B(\varepsilon (y_n(a)),\pi _B)<\epsilon \}\right) >1-\epsilon , \end{aligned}$$
for any \(\epsilon >0\) and any n large enough, then
$$\begin{aligned} (\pi _A\otimes \pi _C)^n\left( \{(a,c)\in (A\times C)^n\mid \rho _{B\times C}(\varepsilon (y_n(a),c),\pi _B\otimes \pi _C)<\epsilon \}\right) >1-\epsilon , \end{aligned}$$
for any \(\epsilon >0\) and any n large enough.
Proof
Suppose the sequence \(\{\pi '_{B1},\pi '_{B2},\ldots \}\) weakly converges to the given probability measure \(\pi _B\), and the sequence \(\{\pi '_{C1},\pi '_{C2},\ldots \}\) weakly converges to the given probability measure \(\pi _C\). We are to show that the sequence \(\{\pi '_{B1}\otimes \pi '_{C1},\pi '_{B2}\otimes \pi '_{C2},\ldots \}\) weakly converges to \(\pi _B\otimes \pi _C\).
Let F(B) denote the family of uniformly continuous real-valued functions on B with bounded support. Let F(C) be similarly defined for C. We certainly have
$$\begin{aligned} \left\{ \begin{array}{l} \lim _{k\rightarrow +\infty }\int _B f(b)\cdot \pi '_{Bk}(db)=\int _B f(b)\cdot \pi _B(db),\quad \forall f\in F(B),\\ \lim _{k\rightarrow +\infty }\int _C f(c)\cdot \pi '_{Ck}(dc)=\int _C f(c)\cdot \pi _C(dc),\quad \forall f\in F(C). \end{array}\right. \end{aligned}$$
(A.7)
Define F so that
$$\begin{aligned} \begin{array}{l} F=\{f\mid f(b,c)=f_B(b)\cdot f_C(c)\; \text{ for } \text{ any } (b,c)\in B\times C,\\ \;\;\;\;\;\;\;\;\;\;\;\; \text{ where } f_B\in F(B)\cup \{\mathbf{1}\} \text{ and } f_C\in F(C)\cup \{\mathbf{1}\}\}, \end{array} \end{aligned}$$
(A.8)
where \(\mathbf{1}\) stands for the function whose value is 1 everywhere. By (A.7) and (A.8),
$$\begin{aligned} \lim _{k\rightarrow +\infty }\int _{B\times C}f(b,c)\cdot (\pi '_{Bk}\otimes \pi '_{Ck})(d(b,c))=\int _{B\times C}f(b,c)\cdot (\pi _B\otimes \pi _C)(d(b,c)). \end{aligned}$$
(A.9)
According to Ethier and Kurtz [14] (Proposition III.4.4), F(B) and F(C) are convergence determining families for \(\mathcal{P}(B)\) and \(\mathcal{P}(C)\), respectively. As B and C are complete, Ethier and Kurtz ([14], Proposition III.4.6, whose proof involves Prohorov's Theorem, i.e., the equivalence between tightness and relative compactness of a collection of probability measures defined on complete separable metric spaces) further shows that F as defined through (A.8) is convergence determining for \(\mathcal{P}(B\times C)\). Therefore, the desired weak convergence follows from (A.9).
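As a numerical aside, with the illustrative choices of \(\pi _B\) standard normal and \(\pi _C\) uniform on [0, 1] (and empirical measures standing in for \(\pi '_{Bk}\) and \(\pi '_{Ck}\)), the integral in (A.9) of a product test function against a product of empirical measures factorizes into a product of sample means, which is what drives the displayed convergence:

```python
import numpy as np

rng = np.random.default_rng(6)
fB = lambda b: np.exp(-b**2)       # bounded, uniformly continuous test functions,
fC = lambda c: np.cos(c)           # as the members of F(B) and F(C) are

for k in (10, 100, 10_000):
    bk = rng.normal(size=k)        # sample defining pi'_Bk (here pi_B = N(0,1))
    ck = rng.uniform(0, 1, size=k) # sample defining pi'_Ck (here pi_C = U[0,1])
    # integrating f(b,c) = fB(b)*fC(c) against pi'_Bk (x) pi'_Ck factorizes:
    print(k, fB(bk).mean() * fC(ck).mean())
# limit: (1/sqrt(3)) * sin(1) ~= 0.4859, the right-hand side of (A.9)
```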
Let \(\epsilon >0\) be given. In view of the above product-measure convergence and the equivalence between the weak topology and that induced by the Prohorov metric, there must be \(\delta _B>0\) and \(\delta _C>0\) such that \(\rho _B(\pi '_B,\pi _B)<\delta _B\) and \(\rho _C(\pi '_C,\pi _C)<\delta _C\) imply
$$\begin{aligned} \rho _{B\times C}(\pi '_B\otimes \pi '_C,\pi _B\otimes \pi _C)<\epsilon . \end{aligned}$$
(A.10)
By (A.1) and the given hypothesis, there is \({\bar{n}}^1\in {\mathbb {N}}\), so that for \(n={\bar{n}}^1,{\bar{n}}^1+1,\ldots \),
$$\begin{aligned} (\pi _A)^n({\tilde{A}}_n)>1-\frac{\epsilon }{2}, \end{aligned}$$
(A.11)
where \({\tilde{A}}_n\) contains all \(a\in A^n\) such that
$$\begin{aligned} \rho _B(\varepsilon (y_n(a)),\pi _B)<\delta _B. \end{aligned}$$
(A.12)
By (A.1) and Lemma 1, on the other hand, there is \({\bar{n}}^2\in {\mathbb {N}}\), so that for \(n={\bar{n}}^2,\bar{n}^2+1,\ldots \),
$$\begin{aligned} (\pi _C)^n({\tilde{C}}_n)>1-\frac{\epsilon }{2}, \end{aligned}$$
(A.13)
where \({\tilde{C}}_n\) contains all \(c\in C^n\) such that
$$\begin{aligned} \rho _C(\varepsilon (c),\pi _C)<\delta _C. \end{aligned}$$
(A.14)
For any \(n={\bar{n}}^1\vee {\bar{n}}^2,{\bar{n}}^1\vee {\bar{n}}^2+1,\ldots \), let (a, c) be an arbitrary member of \({\tilde{A}}_n\times {\tilde{C}}_n\). We have from (A.10), (A.12), and (A.14) that,
$$\begin{aligned} \rho _{B\times C}(\varepsilon (y_n(a),c),\pi _B\otimes \pi _C)<\epsilon . \end{aligned}$$
(A.15)
Since (a, c) is but an arbitrary member of \({\tilde{A}}_n\times {\tilde{C}}_n\), we see that
$$\begin{aligned} \begin{array}{l} (\pi _A\otimes \pi _C)^n\left( \{(a,c)\in (A\times C)^n\mid \rho _{B\times C}(\varepsilon (y_n(a),c),\pi _B\otimes \pi _C)<\epsilon \}\right) \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\ge (\pi _A)^n({\tilde{A}}_n)\times (\pi _C)^n({\tilde{C}}_n), \end{array} \end{aligned}$$
(A.16)
which, by (A.11) and (A.13), is greater than \(1-\epsilon \). \(\square \)
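A simulation sketch of Lemma 3's conclusion with assumed two-point marginals: pairing a sample whose empirical distribution approaches \(\pi _B\) with an independent \(\pi _C\)-sample yields a joint empirical measure close to the product \(\pi _B\otimes \pi _C\).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
b = rng.choice([0, 1], size=n, p=[0.3, 0.7])  # stand-in for y_n(a), law ~ pi_B
c = rng.choice([0, 1], size=n, p=[0.5, 0.5])  # an independent sample of pi_C

# joint empirical frequencies of eps(y_n(a), c) vs pi_B (x) pi_C
joint = np.array([[np.mean((b == i) & (c == j)) for j in (0, 1)] for i in (0, 1)])
prod = np.outer([0.3, 0.7], [0.5, 0.5])
print(0.5 * np.abs(joint - prod).sum())  # TV -> 0, dominating rho_{B x C}
```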
Because the proof of Lemma 3 indirectly relies on the equivalence between tightness and relative compactness of a collection of probability measures, we require B and C to be complete separable metric spaces.
Lemma 4
Given separable metric spaces A, B, C, and D, as well as distributions \(\pi _A\in \mathcal{P}(A)\), \(\pi _B\in \mathcal{P}(B)\), and \(\pi _C\in \mathcal{P}(C)\), suppose \(y_n\in \mathcal{M}(A^n,B^n)\) for every \(n\in {\mathbb {N}}\) and \(z\in \mathcal{K}(B,C,\pi _C,D)\). If
$$\begin{aligned} (\pi _A\otimes \pi _C)^n\left( \{a\in A^n,c\in C^n\mid \rho _{B\times C}(\varepsilon (y_n(a),c),\pi _B\otimes \pi _C)<\epsilon \}\right) >1-\epsilon , \end{aligned}$$
for any \(\epsilon >0\) and any n large enough, then
$$\begin{aligned} (\pi _A\otimes \pi _C)^n\left( \left\{ a\in A^n,c\in C^n\mid \rho _D(\varepsilon (y_n(a),c)\cdot z^{-1},(\pi _B\otimes \pi _C)\cdot z^{-1})<\epsilon \right\} \right) >1-\epsilon , \end{aligned}$$
for any \(\epsilon >0\) and any n large enough.
Proof
Let \(\epsilon >0\) be given. Since \(z\in \mathcal{K}(B,C,\pi _C,D)\), there exist \(C'\in {\mathscr {B}}(C)\) satisfying
$$\begin{aligned} \pi _C(C')>1-\frac{\epsilon }{2}, \end{aligned}$$
(A.17)
as well as
$$\begin{aligned} \delta \in (0,\epsilon /2], \end{aligned}$$
(A.18)
such that for any \(b,b'\in B\) and \(c,c'\in C'\) satisfying \(d_{B\times C}((b,c),(b',c'))<\delta \),
$$\begin{aligned} d_D(z(b,c),z(b',c'))<\epsilon . \end{aligned}$$
(A.19)
For any subset \(D'\) in \({\mathscr {B}}(D)\), we therefore have
$$\begin{aligned} (z^{-1}(D'))^\delta \cap (B\times C')\subseteq z^{-1}((D')^\epsilon ). \end{aligned}$$
(A.20)
This leads to \((z^{-1}(D'))^\delta \setminus (B\times (C \setminus C'))\subseteq z^{-1}((D')^\epsilon )\), and hence due to (A.17),
$$\begin{aligned} (\pi _B\otimes \pi _C)\left( z^{-1}((D')^\epsilon )\right) \ge (\pi _B\otimes \pi _C)\left( (z^{-1}(D'))^\delta \right) -\frac{\epsilon }{2}. \end{aligned}$$
(A.21)
On the other hand, by the hypothesis, we know for n large enough,
$$\begin{aligned} (\pi _A\otimes \pi _C)^n(E'_n)>1-\delta , \end{aligned}$$
(A.22)
where
$$\begin{aligned} E'_n=\{a\in A^n,c\in C^n\mid \rho _{B\times C}(\varepsilon (y_n(a),c),\pi _B\otimes \pi _C)<\delta \}\in {\mathscr {B}}^n(A\times C). \end{aligned}$$
(A.23)
By (A.23), for any \((a,c)\in E'_n\) and \(F'\in {\mathscr {B}}(B\times C)\),
$$\begin{aligned} (\pi _B\otimes \pi _C)((F')^\delta )\ge [\varepsilon (y_n(a),c)](F')-\delta . \end{aligned}$$
(A.24)
Combining the above, we have, for any \((a,c)\in E'_n\) and \(D'\in {\mathscr {B}}(D)\),
$$\begin{aligned} \begin{array}{l} [(\pi _B\otimes \pi _C)\cdot z^{-1}]((D')^\epsilon )=(\pi _B\otimes \pi _C)(z^{-1}((D')^\epsilon ))\\ \;\;\;\;\;\;\ge (\pi _B\otimes \pi _C)((z^{-1}(D'))^\delta )-\epsilon /2\ge [\varepsilon (y_n(a),c)](z^{-1}(D'))-\delta -\epsilon /2\\ \;\;\;\;\;\;\ge [\varepsilon (y_n(a),c)](z^{-1}(D'))-\epsilon =([\varepsilon (y_n(a),c)]\cdot z^{-1})(D')-\epsilon , \end{array} \end{aligned}$$
(A.25)
where the first inequality is due to (A.21), the second inequality is due to (A.24), and the third inequality is due to (A.18). That is, we have
$$\begin{aligned} \rho _D\left( \varepsilon (y_n(a),c)\cdot z^{-1},(\pi _B\otimes \pi _C)\cdot z^{-1}\right) \le \epsilon ,\quad \forall (a,c)\in E'_n. \end{aligned}$$
(A.26)
In view of (A.18) and (A.22), we have the desired result. \(\square \)
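A companion sketch for Lemma 4 under assumed primitives, with the uniformly continuous map z(b, c) = b + c standing in for a member of \(\mathcal{K}(B,C,\pi _C,D)\): closeness of the joint empirical measure to \(\pi _B\otimes \pi _C\) is inherited by the pushforwards under z.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
b = rng.choice([0, 1], size=n, p=[0.3, 0.7])  # joint sample close in law to
c = rng.choice([0, 1], size=n, p=[0.5, 0.5])  # pi_B (x) pi_C, as hypothesized
d = b + c                                     # z(b, c) = b + c, uniformly continuous

# empirical pushforward eps(b, c) . z^{-1} vs (pi_B (x) pi_C) . z^{-1}
emp = np.array([(d == v).mean() for v in (0, 1, 2)])
tru = np.array([0.3 * 0.5, 0.3 * 0.5 + 0.7 * 0.5, 0.7 * 0.5])
print(0.5 * np.abs(emp - tru).sum())          # small for large n (Lemma 4)
```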
With Lemmas 1 to 4 ready, we can now prove Proposition 1 and then Theorem 1.
Proof of Proposition 1:
Let \(t=1,\ldots ,{\bar{t}}-1\) and \(x\in \mathcal{K}(S,G,\gamma ,X)\) be given. Define a map \(z\in \mathcal{M}(S\times G\times I,S)\), such that
$$\begin{aligned} z(s,g,i)=\theta _t\left( s,x(s,g),M(\sigma ,x),i\right) ,\quad \forall s\in S,g\in G,i\in I, \end{aligned}$$
(A.27)
where \(M(\sigma ,x)\) is given in (7). In view of (10) and (A.27), we have, for any \(S'\in {\mathscr {B}}(S)\),
$$\begin{aligned} \begin{array}{l} [T_t(x)\circ \sigma ](S')=\int _S\int _G\int _I\mathbf{1}(z(s,g,i)\in S')\cdot \iota (di)\cdot \gamma (dg)\cdot \sigma (ds)\\ \;\;\;\;\;\;=(\sigma \otimes \gamma \otimes \iota )(\{(s,g,i)\in S\times G\times I\mid z(s,g,i)\in S'\})=(\sigma \otimes \gamma \otimes \iota )(z^{-1}(S')). \end{array} \end{aligned}$$
(A.28)
For \(n\in {\mathbb {N}}\), \(g\equiv (g_m)_{m=1,\ldots ,n}\in G^n\), and \(i\equiv (i_m)_{m=1,\ldots ,n}\in I^n\), also define an operator \(T'_n(g,i)\) on \(\mathcal{P}_n(S)\) so that \(T'_n(g,i)\circ \varepsilon (s)=\varepsilon (s')\), where for \(m=1,2,\ldots ,n\),
$$\begin{aligned} s'_m=z(s_m,g_m,i_m)=\theta _t\left( s_m,x(s_m,g_m),M(\sigma ,x),i_m\right) . \end{aligned}$$
(A.29)
It is worth noting that (A.29) is different from the earlier (15). In view of (A.27) and (A.29), we have, for \(S'\in {\mathscr {B}}(S)\), that \([T'_n(g,i)\circ \varepsilon (s)](S')\) equals
$$\begin{aligned} \frac{1}{n}\cdot \sum _{m=1}^n\mathbf{1}\left( z(s_m,g_m,i_m)\in S'\right) =\varepsilon ((s_1,g_1,i_1),\ldots ,(s_n,g_n,i_n))\left( z^{-1}(S')\right) . \end{aligned}$$
(A.30)
Combining (A.28) and (A.30), we arrive at the key observation that
$$\begin{aligned} T_t(x)\circ \sigma =(\sigma \otimes \gamma \otimes \iota )\cdot z^{-1},\quad \text{ while } \quad T'_n(g,i)\circ \varepsilon (s)=\varepsilon (s,g,i)\cdot z^{-1}. \end{aligned}$$
(A.31)
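The identity (A.31) can be checked numerically on a toy model; in the sketch below, the binary map z and all distributions are illustrative stand-ins for (A.27)'s primitives.

```python
import numpy as np

rng = np.random.default_rng(4)

def z(s, g, i):
    # illustrative stand-in for theta_t(s, x(s,g), M(sigma,x), i) in (A.27);
    # any measurable map of (s, g, i) works for this check
    return s | (g & i)

n = 100_000
s = rng.choice([0, 1], size=n, p=[0.6, 0.4])   # a sample of sigma
g = rng.choice([0, 1], size=n, p=[0.7, 0.3])   # pre-action shocks, law gamma
i = rng.choice([0, 1], size=n, p=[0.5, 0.5])   # post-action shocks, law iota

# right-hand identity of (A.31): the empirical measure pushed through z
emp = np.array([(z(s, g, i) == v).mean() for v in (0, 1)])

# left-hand identity: (sigma (x) gamma (x) iota) . z^{-1}, computed exactly
tru = np.zeros(2)
for sv, ps in ((0, 0.6), (1, 0.4)):
    for gv, pg in ((0, 0.7), (1, 0.3)):
        for iv, pi_ in ((0, 0.5), (1, 0.5)):
            tru[z(sv, gv, iv)] += ps * pg * pi_
print(emp, tru)   # the two sides of (A.31) agree up to sampling error
```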
In the rest of the proof, we first show the asymptotic closeness between \(T_t(x)\circ \sigma \) and \(T'_n(g,i)\circ \varepsilon (s_n(a))\), and then that between the latter and \(T_{nt}(x,g,i)\circ \varepsilon (s_n(a))\).
First, by the hypothesis on the convergence of \(\varepsilon (s_n(a))\) to \(\sigma \), the completeness of the spaces S, G, and I (and hence of \(G\times I\)), and Lemma 3,
$$\begin{aligned} (\pi \otimes \gamma \otimes \iota )^n(\{(a,g,i)\in (A\times G\times I)^n\mid \rho _{S\times G\times I}(\varepsilon (s_n(a),g,i),\sigma \otimes \gamma \otimes \iota )<\epsilon '\})>1-\epsilon ', \end{aligned}$$
(A.32)
for any \(\epsilon '>0\) and any n large enough. We next show that z as defined in (A.27) is a member of \(\mathcal{K}(S,G\times I,\gamma \otimes \iota ,S)\) in the sense of the definition around (17); this relies on (S1) and the fact that \(x\in \mathcal{K}(S,G,\gamma ,X)\).
Fix any \(\epsilon >0\). By (S1), there exist \(\delta >0\) and \(I'\in {\mathscr {B}}(I)\) with \(\iota (I')>1-\epsilon /2\) such that
$$\begin{aligned} d_S(\theta _t(s,y,M(\sigma ,x),i),\theta _t(s',y',M(\sigma ,x),i'))<\epsilon , \end{aligned}$$
(A.33)
for any \((s,y),(s',y')\in S\times X\) and \(i,i'\in I'\) satisfying \(d_{S\times X\times I}((s,y,i),(s',y',i'))<\delta \). Since \(x\in \mathcal{K}(S,G,\gamma ,X)\), there exist \(\delta '\in (0,\delta /2]\) and \(G'\in {\mathscr {B}}(G)\) with \(\gamma (G')>1-\epsilon /2\) such that
$$\begin{aligned} d_X(x(s,g),x(s',g'))<\frac{\delta }{2}, \end{aligned}$$
(A.34)
for any \(s,s'\in S\) and \(g,g'\in G'\) satisfying \(d_{S\times G}((s,g),(s',g'))<\delta '\).
Now suppose \(s,s'\in S\) and \((g,i),(g',i')\in G'\times I'\) satisfy
$$\begin{aligned} d_{S\times G\times I}((s,g,i),(s',g',i'))<\delta '. \end{aligned}$$
(A.35)
By the first inequality of (3), \(d_{S\times G}((s,g),(s',g'))<\delta '\). This and the fact that \(g,g'\in G'\) yield (A.34). Due to the first inequality of (3), another consequence of (A.35) is
$$\begin{aligned} d_{S\times I}((s,i),(s',i'))<\delta '\le \frac{\delta }{2}. \end{aligned}$$
(A.36)
With the second inequality of (3), we can conclude from (A.34) and (A.36) that
$$\begin{aligned} d_{S\times X\times I}((s,x(s,g),i),(s',x(s',g'),i'))<\delta . \end{aligned}$$
(A.37)
As \(i,i'\in I'\), we can see from (A.33) that
$$\begin{aligned} d_S(\theta _t(s,x(s,g),M(\sigma ,x),i),\theta _t(s',x(s',g') ,M(\sigma ,x),i'))<\epsilon . \end{aligned}$$
(A.38)
In addition, the measures of \(G'\) and \(I'\) would lead to
$$\begin{aligned} (\gamma \otimes \iota )(G'\times I')\ge 1-(1-\gamma (G'))-(1-\iota (I'))>1-\epsilon . \end{aligned}$$
(A.39)
Since \(\epsilon >0\) is arbitrary, (17), (A.35), (A.38), and (A.39) together imply that z as defined through (A.27) is a member of \(\mathcal{K}(S,G\times I,\gamma \otimes \iota ,S)\).
By Lemma 4, this fact along with (A.32) implies the strict dominance of \(1-\epsilon '\) by
$$\begin{aligned} (\pi \otimes \gamma \otimes \iota )^n(\{(a,g,i)\in (A\times G\times I)^n\mid \rho _S(\varepsilon (s_n(a),g,i)\cdot z^{-1},(\sigma \otimes \gamma \otimes \iota )\cdot z^{-1})<\epsilon '\}), \end{aligned}$$
(A.40)
for any \(\epsilon '>0\) and any n large enough. By (A.31), this is equivalent to saying that, given \(\epsilon >0\), there exists \({\bar{n}}^1\in {\mathbb {N}}\) so that for any \(n={\bar{n}}^1,{\bar{n}}^1+1,\ldots \),
$$\begin{aligned} (\pi \otimes \gamma \otimes \iota )^n \left( {\tilde{A}}_n(\epsilon )\right) >1-\frac{\epsilon }{2}, \end{aligned}$$
(A.41)
where \({\tilde{A}}_n(\epsilon )\in {\mathscr {B}}^n(A\times G\times I)\) is equal to
$$\begin{aligned} \left\{ (a,g,i)\in (A\times G\times I)^n\mid \rho _S\left( T_t(x)\circ \sigma ,T'_n(g,i)\circ \varepsilon (s_n(a))\right) <\frac{\epsilon }{2}\right\} . \end{aligned}$$
(A.42)
Next, note that the only difference between \(T_{nt}(x,g,i)\circ \varepsilon (s_n(a))\) and \(T'_n(g,i)\circ \varepsilon (s_n(a))\) is that \(\varepsilon (s_{n,-m}(a),g_{-m})\) is used in the former as in (15), whereas \(\sigma \otimes \gamma \) is used in the latter as in (A.29). Here, \(s_{n,-m}(a)\) refers to the vector \((s_{n1}(a),\ldots ,s_{n,m-1}(a),s_{n,m+1}(a),\ldots ,s_{nn}(a))\). By (S2), there exist \(\delta \in (0,\epsilon /4]\) and \(I'\in {\mathscr {B}}(I)\) with
$$\begin{aligned} \iota (I')>1-\frac{\epsilon }{4}, \end{aligned}$$
(A.43)
so that for any \((s,g,i)\in S\times G\times I'\) and any \(\mu '\in \mathcal{P}(S\times X)\) satisfying \(\rho _{S\times X}(M(\sigma ,x),\mu ')<\delta \),
$$\begin{aligned} d_S\left( \theta _t(s,x(s,g),M(\sigma ,x),i), \theta _t(s,x(s,g),\mu ',i)\right) <\frac{\epsilon }{2}. \end{aligned}$$
(A.44)
For each \(n\in {\mathbb {N}}\), define \(I'_n\) so that
$$\begin{aligned} I'_n=\left\{ i\equiv (i_m)_{m=1,\ldots ,n}\in I^n\mid \text{ more } \text{ than } \left( 1-\frac{\epsilon }{2}\right) \cdot n \text{ components } \text{ come } \text{ from } I'\right\} . \end{aligned}$$
(A.45)
It is also important that, by (A.44) and (A.45), for any \(S'\in {\mathscr {B}}(S)\) and \(i\equiv (i_m)_{m=1,\ldots ,n}\in I'_n\),
$$\begin{aligned} \left[ T_{nt}(x,g,i)\circ \varepsilon (s_n(a))\right] \left( (S')^{\epsilon /2}\right) +\frac{\epsilon }{2}\ge \left[ T'_n(g,i)\circ \varepsilon (s_n(a))\right] (S'), \end{aligned}$$
(A.46)
whenever
$$\begin{aligned} \rho _{S\times X}\left( M(\sigma ,x),M_n(\varepsilon (s_{n,-m}(a)),x,g_{-m})\right) <\delta . \end{aligned}$$
(A.47)
It can be shown that \(I'_n\) occupies most of \(I^n\), as measured by \(\iota ^n\), when n is large. Define a map q from I to \(\{0,1\}\) so that \(q(i)=1\) or 0 depending on whether or not \(i\in I'\). By (A.43), \(\iota \cdot q^{-1}\) is a Bernoulli distribution with \((\iota \cdot q^{-1})(\{1\})>1-\epsilon /4\). So by (A.45), \(I'_n\) contains all \(i\equiv (i_m)_{m=1,\ldots ,n}\in I^n\) that satisfy
$$\begin{aligned} \rho _{\{0,1\}}(\varepsilon (i)\cdot q^{-1},\iota \cdot q^{-1})<\frac{\epsilon }{4}. \end{aligned}$$
(A.48)
Therefore, by Lemma 1, there exists \({\bar{n}}^2\in {\mathbb {N}}\), so that for \(n={\bar{n}}^2,{\bar{n}}^2+1,\ldots \),
$$\begin{aligned} \iota ^n(I'_n)>1-\frac{\epsilon }{4}. \end{aligned}$$
(A.49)
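A quick sanity check of this Bernoulli step, with the illustrative values \(\epsilon =0.2\) and n = 10,000:

```python
import numpy as np

rng = np.random.default_rng(5)
eps, n = 0.2, 10_000
draws = rng.random(n) < 1 - eps / 4   # q(i_m) = 1 iff i_m lies in I', per (A.43)
print(draws.mean() > 1 - eps / 2)     # so i lands in I'_n of (A.45) w.h.p.
```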
We can also demonstrate that (A.47) holds with high probability when n is large. By Lemma 3 and the hypothesis on the convergence of \(\varepsilon (s_n(a))\) to \(\sigma \), we know \(\varepsilon (s_n(a),g)\) will converge to \(\sigma \otimes \gamma \) in probability. Due to Lemma 2, this conclusion applies to the sequence \(\varepsilon (s_{n,-m}(a),g_{-m})\) as well. The fact that \(x\in \mathcal{K}(S,G,\gamma ,X)\) certainly leads to \((\text{ prj }^{S\times G}_S,x)\in \mathcal{K}(S,G,\gamma ,S\times X)\). So by Lemma 4, there is \({\bar{n}}^3\in {\mathbb {N}}\), so that for \(n={\bar{n}}^3,{\bar{n}}^3+1,\ldots \),
$$\begin{aligned} (\pi ^n\otimes \gamma ^n)\left( {\tilde{B}}_n(\delta )\right) >1-\frac{\epsilon }{4}, \end{aligned}$$
(A.50)
where
$$\begin{aligned} {\tilde{B}}_n(\delta )=\{(a,g)\in A^n\times G^n\mid (A.47) \text{ is } \text{ true }\}\in {\mathscr {B}}^n(A\times G). \end{aligned}$$
(A.51)
Consider arbitrary \(n={\bar{n}}^1\vee {\bar{n}}^2\vee {\bar{n}}^3,\bar{n}^1\vee {\bar{n}}^2\vee {\bar{n}}^3+1,\ldots \), \((a,g,i)\in \tilde{A}_n(\epsilon )\cap ({\tilde{B}}_n(\delta )\times I'_n)\), and \(S'\in {\mathscr {B}}(S)\). By (A.1) and (A.42), we see that
$$\begin{aligned}{}[T'_n(g,i)\circ \varepsilon (s_n(a))]\left( (S')^{\epsilon /2}\right) +\frac{\epsilon }{2}\ge [T_t(x)\circ \sigma ](S'). \end{aligned}$$
(A.52)
Combining this with (A.46), (A.47), and (A.51), we obtain
$$\begin{aligned}{}[T_{nt}(x,g,i)\circ \varepsilon (s_n(a))]\left( (S')^\epsilon \right) +\epsilon \ge [T'_n(g,i)\circ \varepsilon (s_n(a))]\left( (S')^{\epsilon /2}\right) +\frac{\epsilon }{2}\ge [T_t(x)\circ \sigma ](S'). \end{aligned}$$
(A.53)
According to (A.1), this means
$$\begin{aligned} \rho _S\left( T_{nt}(x,g,i)\circ \varepsilon (s_n(a)),T_t(x)\circ \sigma \right) \le \epsilon . \end{aligned}$$
(A.54)
Therefore, for \(n\ge {\bar{n}}^1\vee {\bar{n}}^2\vee {\bar{n}}^3\),
$$\begin{aligned} \begin{array}{l} (\pi \otimes \gamma \otimes \iota )^n\left( \{(a,g,i)\in (A\times G\times I)^n\mid \rho _S(T_{nt}(x,g,i)\circ \varepsilon (s_n(a)),T_t(x)\circ \sigma )\le \epsilon \}\right) \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\ge (\pi \otimes \gamma \otimes \iota )^n\left( \tilde{A}_n(\epsilon )\cap ({\tilde{B}}_n(\delta )\times I'_n)\right) , \end{array}\end{aligned}$$
(A.55)
whereas the latter is, in view of (A.41), (A.49), and (A.50), greater than \(1-\epsilon \). \(\square \)
Proof of Theorem 1:
We use induction to show that, for each \(\tau =0,1,\ldots ,{\bar{t}}-t+1\),
$$\begin{aligned} \left( \sigma _t\otimes \gamma ^\tau \otimes \iota ^\tau \right) ^n\left( \tilde{A}_{n\tau }(\epsilon )\right) >1-\frac{\epsilon }{{\bar{t}}-t+2}, \end{aligned}$$
(A.56)
for any \(\epsilon >0\) and n large enough, where \(\tilde{A}_{n\tau }(\epsilon )\in {\mathscr {B}}^n(S\times G^\tau \times I^\tau )\) is such that, for any \((s_t,g_{[t,t+\tau -1]},i_{[t,t+\tau -1]})\in {\tilde{A}}_{n\tau }(\epsilon )\),
$$\begin{aligned} \rho _S\left( T_{n,[t,t+\tau -1]}(x_{[t,t+\tau -1]},g_{[t,t+\tau -1]}, i_{[t,t+\tau -1]})\circ \varepsilon (s_t),T_{[t,t+\tau -1]}(x_{[t,t+\tau -1]})\circ \sigma _t\right) <\epsilon . \end{aligned}$$
(A.57)
Once the above is achieved, we can then define \({\tilde{A}}_n(\epsilon )\) required in the theorem by
$$\begin{aligned} {\tilde{A}}_n(\epsilon )=\bigcap _{\tau =0}^{{\bar{t}}-t+1}\left[ {\tilde{A}}_{n\tau }(\epsilon )\times G^{n\cdot ({\bar{t}}-t+1-\tau )}\times I^{n\cdot ({\bar{t}}-t+1-\tau )}\right] . \end{aligned}$$
(A.58)
By this and (A.56), we have \(\left( \sigma _t\otimes \gamma ^{{\bar{t}}-t+1}\otimes \iota ^{{\bar{t}}-t+1}\right) ^n\left( \tilde{A}_n(\epsilon )\right) \) greater than
$$\begin{aligned} \begin{array}{l} 1-\sum _{\tau =0}^{{\bar{t}}-t+1}\left[ 1-\left( \sigma _t\otimes \gamma ^{{\bar{t}}-t+1}\otimes \iota ^{{\bar{t}}-t+1}\right) \left( {\tilde{A}}_{n\tau }(\epsilon )\times G^{n\cdot ({\bar{t}}-t+1-\tau )}\times I^{n\cdot ({\bar{t}}-t+1-\tau )}\right) \right] \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;=1-\sum _{\tau =0}^{{\bar{t}}-t+1}\left[ 1-(\sigma _t\otimes \gamma ^\tau \otimes \iota ^\tau )({\tilde{A}}_{n\tau }(\epsilon ))\right] \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;>1-({\bar{t}}-t+2)\cdot \left[ 1-(1-\epsilon /({\bar{t}}-t+2))\right] =1-\epsilon , \end{array}\end{aligned}$$
(A.59)
for any \(\epsilon >0\) and n large enough.
Now we proceed with the induction. First, note that \(T_{n,[t,t-1]}\circ \varepsilon (s_t)\) is merely \(\varepsilon (s_t)\) itself and \(T_{[t,t-1]}\circ \sigma _t\) is merely \(\sigma _t\) itself. Hence, (A.56) holds for \(\tau =0\), for any \(\epsilon >0\) and n large enough, by Lemma 1 alone. Then, for some \(\tau =1,2,\ldots ,{\bar{t}}-t+1\), suppose
$$\begin{aligned} \left( \sigma _t\otimes \gamma ^{\tau -1}\otimes \iota ^{\tau -1}\right) ^n\left( \tilde{A}_{n,\tau -1}(\epsilon )\right) >1-\frac{\epsilon }{{\bar{t}}-t+2}, \end{aligned}$$
(A.60)
for any \(\epsilon >0\) and n large enough. We may apply Proposition 1 to the above, identifying \(S\times G^{\tau -1}\times I^{\tau -1}\) with A, \(\sigma _t\otimes \gamma ^{\tau -1}\otimes \iota ^{\tau -1}\) with \(\pi \), \(x_{t+\tau -1}\) with x, \(T_{n,[t,t+\tau -2]}(x_{[t,t+\tau -2]},g_{[t,t+\tau -2]},i_{[t,t+\tau -2]})\circ \varepsilon (s_t)\) with \(\varepsilon (s_n(a))\), and \(T_{[t,t+\tau -2]}(x_{[t,t+\tau -2]})\circ \sigma _t\) with \(\sigma \). This verifies (A.56) for any \(\epsilon >0\) and n large enough, completing the induction. \(\square \)
Technical Developments in Sect. 5
Proof of Proposition 2:
Because payoffs are bounded, the value functions are bounded too. We prove the claim by induction on t. By (20), we know the result is true for \(t={\bar{t}}+1\). Suppose for some \(t={\bar{t}},{\bar{t}}-1,\ldots ,2\), we have the continuity of \(v_{t+1}(s_{t+1},\sigma _{t+1},x_{[t+1,{\bar{t}}]},x_{t+1})\) in \(s_{t+1}\). By this induction hypothesis, the distribution-wise uniform continuity of \(x_t\), (S1), (F1), and the boundedness of the value functions, the right-hand side of (21) is continuous in \(s_t\). So \(v_t(s_t,\sigma _t,x_{[t,{\bar{t}}]},x_t)\) is continuous in \(s_t\), and we have completed our induction process. \(\square \)
Proof of Proposition 3:
We prove by induction on t. By (20) and (24), we know the result is true for \(t={\bar{t}}+1\). Suppose for some \(t={\bar{t}},{\bar{t}}-1,\ldots ,2\), we have the convergence of \(v_{n,t+1}(s_{t+1,1},\varepsilon (s^n_{t+1,-1}),x_{[t+1,{\bar{t}}]},x_{t+1})\) to \(v_{t+1}(s_{t+1,1},\sigma _{t+1},x_{[t+1,{\bar{t}}]},x_{t+1})\) at an \(s_{t+1,1}\)-independent rate when \(s_{t+1,-1}\equiv (s_{t+1,2},s_{t+1,3},\ldots )\) is sampled from \(\sigma _{t+1}\). Now suppose \(s_{t,-1}\equiv (s_{t2},s_{t3},\ldots )\) is sampled from \(\sigma _t\). Also let \(g\equiv (g_1,g_2,\ldots )\) be generated through sampling on \((G,{\mathscr {B}}(G),\gamma )\) and \(i\equiv (i_1,i_2,\ldots )\) through sampling on \((I,{\mathscr {B}}(I),\iota )\). In the remainder of the proof, let \(s^n_t\equiv (s_{t1},s_{t2},\ldots ,s_{tn})\) for an arbitrary \(s_{t1}\in S\), \(g^n\equiv (g_1,\ldots ,g_n)\), and \(i^n\equiv (i_1,\ldots ,i_n)\).
Due to Lemma 1, \(\varepsilon (s^n_{t,-1})\) will converge to \(\sigma _t\). By Lemma 2, \(\varepsilon (s^n_t)\) will converge to \(\sigma _t\) at an \(s_{t1}\)-independent rate. By Proposition 1, we know that \(T_{nt}(x_t,g^n,i^n)\circ \varepsilon (s^n_t)\) will converge to \(T_t(x_t)\circ \sigma _t\) in probability at an \(s_{t1}\)-independent rate, and by Lemma 2 again, so will \([T_{nt}(x_t,g^n,i^n)\circ \varepsilon (s^n_t)]_{-1}\) to \(T_t(x_t)\circ \sigma _t\). Now Lemma 3 will lead to the convergence in probability of \(\varepsilon (s^n_{t,-1},g^n_{-1})\) to \(\sigma _t\otimes \gamma \). Due to \(x_t\)’s distribution-wise uniform continuity, Lemma 4 will lead to the convergence in probability of \(M_n(\varepsilon (s^n_{t,-1}),x_t,g^n_{-1})\) to \(M(\sigma _t,x_t)\). Thus,
1. \(\psi _t(s_{t1},x_t(s_{t1},g_1),M_n(\varepsilon (s^n_{t,-1}),x_t,g^n_{-1}))\) will converge to \(\psi _t(s_{t1},x_t(s_{t1},g_1), M(\sigma _t,x_t))\) in probability at an \(s_{t1}\)-independent rate due to (F2);
2. \(v_{n,t+1}(\theta _t(s_{t1},x_t(s_{t1},g_1),M_n(\varepsilon (s^n_{t,-1}),x_t,g^n_{-1}),i_1),[T_{nt}(x_t,g^n,i^n)\circ \varepsilon (s^n_t)]_{-1}, x_{[t+1,{\bar{t}}]},x_{t+1})\) will converge to \(v_{t+1}(\theta _t(s_{t1},x_t(s_{t1},g_1),M_n(\varepsilon (s^n_{t,-1}),x_t,g^n_{-1}),i_1),T_t(x_t)\circ \sigma _t,x_{[t+1,{\bar{t}}]},x_{t+1})\) in probability at an \(s_{t1}\)-independent rate due to the induction hypothesis; the latter will in turn converge to \(v_{t+1}(\theta _t(s_{t1},x_t(s_{t1},g_1),M(\sigma _t,x_t),i_1),T_t(x_t)\circ \sigma _t,x_{[t+1,{\bar{t}}]},x_{t+1})\) in probability at an \(s_{t1}\)-independent rate due to (S2) and Proposition 2.
As per-period payoffs are bounded, all value functions are bounded. The above convergences will then lead to the convergence of the right-hand side of (25) to the right-hand side of (21) at an \(s_{t1}\)-independent rate. That is, \(v_{nt}(s_{t1},\varepsilon (s^n_{t,-1}),x_{[t,{\bar{t}}]},x_t)\) will converge to \(v_t(s_{t1},\sigma _t,x_{[t,{\bar{t}}]},x_t)\) at a rate independent of \(s_{t1}\). We have completed the induction process. \(\square \)
Proof of Theorem 2:
Let us consider subgames starting with some time \(t=1,2,\ldots ,{\bar{t}}\). For convenience, we let \(\sigma _t=T_{[1,t-1]}(x^*_{[1,t-1]})\circ \sigma _1\). Now let \(s_t\equiv (s_{t1},s_{t2},\ldots )\) be generated through sampling on \((S,{\mathscr {B}}(S),\sigma _t)\), \(g\equiv (g_1,g_2,\ldots )\) be generated through sampling on \((G,{\mathscr {B}}(G),\gamma )\), and \(i\equiv (i_1,i_2,\ldots )\) be generated through sampling on \((I,{\mathscr {B}}(I),\iota )\). In the remainder of the proof, we let \(s^n_t\equiv (s_{t1},\ldots ,s_{tn})\), \(s^n_{t,-1}\equiv (s_{t2},\ldots ,s_{tn})\), \(g^n\equiv (g_1,\ldots ,g_n)\), and \(i^n\equiv (i_1,\ldots ,i_n)\).
By Lemma 1 and Proposition 1, we know that \(\varepsilon (s^n_t)=\varepsilon (s_{t1},\ldots ,s_{tn})\) converges to \(\sigma _t\) in probability, and also that \(T_{nt}(x^*_t,g^n,i^n)\circ \varepsilon (s^n_t)\) converges to \(T_t(x^*_t)\circ \sigma _t\) in probability. Due to Lemma 2, \(\varepsilon (s^n_{t,-1})\) and \([T_{nt}(x^*_t,g^n,i^n)\circ \varepsilon (s^n_t)]_{-1}\) will have the same respective convergences. Also, Lemma 3 will lead to the convergence in probability of \(\varepsilon (s^n_{t,-1},g^n_{-1})\) to \(\sigma _t\otimes \gamma \). Due to \(x^*_t\)’s distribution-wise uniform continuity, Lemma 4 will lead to the convergence in probability of \(M_n(\varepsilon (s^n_{t,-1}),x^*_t,g^n_{-1})\) to \(M(\sigma _t,x^*_t)\). Then,
1. \(\psi _t(s_{t1},y(s_{t1},g_1),M_n(\varepsilon (s^n_{t,-1}),x^*_t,g^n_{-1}))\) will converge to \(\psi _t(s_{t1},y(s_{t1},g_1), M(\sigma _t,x^*_t))\) in probability at a y-independent rate due to (F2);
2. \(v_{n,t+1}(\theta _t(s_{t1},y(s_{t1},g_1), M_n(\varepsilon (s^n_{t,-1}),x^*_t,g^n_{-1}),i_1),[T_{nt}(x^*_t,g^n,i^n)\circ \varepsilon (s^n_t)]_{-1}, x^*_{[t+1,{\bar{t}}]},x^*_{t+1})\) will converge to \(v_{t+1}(\theta _t(s_{t1},y(s_{t1},g_1),M_n(\varepsilon (s^n_{t,-1}),x^*_t,g^n_{-1}),i_1),T_t(x^*_t)\circ \sigma _t, x^*_{[t+1,{\bar{t}}]}, x^*_{t+1})\) in probability at a y-independent rate due to Proposition 3, which, due to (S2) and Proposition 2, will converge to \(v_{t+1}(\theta _t(s_{t1},y(s_{t1},g_1),M(\sigma _t,x^*_t),i_1),T_t(x^*_t)\circ \sigma _t,x^*_{[t+1,{\bar{t}}]},x^*_{t+1})\) in probability at a y-independent rate.
As per-period payoffs are bounded, all value functions are bounded. By (21) and (25), the above convergences will then lead to the convergence of the left-hand side of (31) to the left-hand side of (27). At the same time, the right-hand side of (31) plus \(\epsilon \) will converge to the right-hand side of (27) due to the convergence of \(\varepsilon (s^n_{t,-1})\) to \(\sigma _t\), Proposition 3, and the uniform boundedness of the value functions. Hence, by (27), for any \(\epsilon >0\), (31) holds for every \(y\in \mathcal{M}(S\times G,X)\) as long as n is large enough. This then leads to (32) due to Theorem 1 and the boundedness of payoff functions. \(\square \)
Technical Developments in Sect. 6
Value Functions for the Stationary Case: For \(t=0,1,\ldots \), we define \(v_t(s,\sigma ,x,y)\) as the total expected payoff a player can make from period 1 to period t, when he starts period 1 with a state \(s\in S\) and a state-variable profile \(\sigma \), while all players keep using the strategy x from period 1 to t, with the exception of the current player at the very beginning, who deviates to \(y\in \mathcal{M}(S\times G,X)\) then. As a terminal condition, we have
$$\begin{aligned} v_0(s,\sigma ,x,y)=0. \end{aligned}$$
(C.1)
Due to the stationarity of the setting, we have, for \(t=1,2,\ldots \),
$$\begin{aligned} \begin{array}{ll} v_t(s,\sigma ,x,y)=\int _G[\psi (s,y(s,g),M(\sigma ,x))\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;+\alpha \cdot \int _I v_{t-1}(\theta (s,y(s,g),M(\sigma ,x),i),\sigma ,x,x)\cdot \iota (di)]\cdot \gamma (dg). \end{array} \end{aligned}$$
(C.2)
This is much like (21); but with (39) being true, the last term in (C.2) actually appears simpler than its counterpart. Using (C.1) and (C.2), we can inductively show that
$$\begin{aligned} \mid v_{t+1}(s,\sigma ,x,y)-v_t(s,\sigma ,x,y)\mid \le \alpha ^t\cdot {{\bar{\psi }}},\quad t=0,1,\ldots . \end{aligned}$$
(C.3)
The sequence \(\{v_t(s,\sigma ,x,y)\}_{t=0,1,\ldots }\) is thus Cauchy, with limit \(v_\infty (s,\sigma ,x,y)\). This \(v_\infty (s,\sigma ,x,y)\) can be understood as the infinite-horizon total discounted expected payoff a player can obtain by starting with state s and environment \(\sigma \), while all players adhere to the action plan x except for the current player at the beginning, who deviates to y then.
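The Cauchy property (C.3) and the limit \(v_\infty \) can be visualized with a toy stationary recursion; in the sketch below, the two-state kernel, payoffs, and discount factor are all illustrative stand-ins for the objects in (C.1) and (C.2).

```python
import numpy as np

alpha = 0.9
psi = np.array([1.0, 0.5])        # bounded per-period payoffs, psi_bar = 1
P = np.array([[0.7, 0.3],         # transition kernel induced by theta under
              [0.4, 0.6]])        # the fixed strategy x and environment sigma

v = np.zeros(2)                   # v_0 = 0, as in (C.1)
for t in range(30):
    v_next = psi + alpha * P @ v  # the analogue of the recursion (C.2)
    gap = np.max(np.abs(v_next - v))
    assert gap <= alpha**t + 1e-12   # the geometric bound (C.3) with psi_bar = 1
    v = v_next
print(v)                          # approximates the limit v_infinity
```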
Now we move on to the n-player game \(\Gamma _n\) with the same stationary features provided by \(\psi \), \(\theta \), and \(\alpha \). The in-action environment experienced by player m will be \(M_n(\varepsilon (s_{-m}),x,g_{-m})\), as defined in (14), when the other players start with the state vector \(s_{-m}\equiv (s_l)_{l\ne m}\), all act according to some x strategy, and experience the pre-action shock vector \(g_{-m}\equiv (g_l)_{l\ne m}\). Given any strategy \(x\in \mathcal{M}(S\times G,X)\), pre-action shock vector \(g\equiv (g_m)_{m=1,\ldots ,n}\in G^n\), and post-action shock vector \(i\equiv (i_m)_{m=1,\ldots ,n}\in I^n\), we define \(T_n(x,g,i)\) as the operator on \(\mathcal{P}_n(S)\) that converts a period’s state-variable profile into that of the next period. Following the transient version (15), \(\varepsilon (s')=T_n(x,g,i)\circ \varepsilon (s)\) is such that
$$\begin{aligned} s'_m=\theta \left( s_m,x(s_m,g_m),M_n(\varepsilon (s_{-m}),x,g_{-m}), i_m\right) ,\quad \forall m=1,2,\ldots ,n. \end{aligned}$$
(C.4)
Let \(v_{nt}(s_1,\varepsilon (s_{-1}),x,y)\) be the total expected payoff player 1 can make from period 1 to t, when the player’s starting state is \(s_1\in S\), the other players’ initial states are given by the vector \(s_{-1}\equiv (s_m)_{m\ne 1}\), and all players adopt the strategy \(x\in \mathcal{M}(S\times G,X)\) with the exception of player 1, who adopts the strategy \(y\in \mathcal{M}(S\times G,X)\) in the beginning. We have
$$\begin{aligned} v_{n0}(s_1,\varepsilon (s_{-1}),x,y)=0. \end{aligned}$$
(C.5)
For \(t=1,2,\ldots \), similarly to (25), \(v_{nt}(s_1,\varepsilon (s_{-1}),x,y)\) is equal to
$$\begin{aligned} \begin{array}{l} \int _{G^n}\gamma ^n(dg)\times \{\psi \left( s_1,y(s_1,g_1), M_n(\varepsilon (s_{-1}),x,g_{-1})\right) +\alpha \cdot \int _{I^n}\iota ^n(di)\times \\ \;\;\;\;\;\;\;\;\;\;\;\;\times v_{n,t-1}\left( \theta (s_1,y(s_1,g_1),M_n (\varepsilon (s_{-1}),x,g_{-1}),i_1),[T_n(x,g,i)\circ \varepsilon (s)]_{-1},x,x\right) \}. \end{array} \end{aligned}$$
(C.6)
In (C.6), \([T_n(x,g,i)\circ \varepsilon (s)]_{-1}\) stands for \(\varepsilon (s'_{-1})\), while \(s'\) comes from \(\varepsilon (s')=T_n(x,g,i)\circ \varepsilon (s)\). Using (C.5) and (C.6), we can inductively show that
$$\begin{aligned} \mid v_{n,t+1}(s_1,\varepsilon (s_{-1}),x,y)-v_{nt}(s_1,\varepsilon (s_{-1}),x,y)\mid \le \alpha ^t\cdot {{\bar{\psi }}},\forall t=0,1,\ldots . \end{aligned}$$
(C.7)
Thus, the sequence \(\{v_{nt}(s_1,\varepsilon (s_{-1}),x,y)\}_{t=0,1,\ldots }\) is Cauchy with limit \(v_{n\infty }(s_1,\varepsilon (s_{-1}),x,y)\).
Proof of Theorem 3:
Let \(\epsilon >0\) be fixed. For \(t=1,2,\ldots \) satisfying \(t\ge \ln (6{{\bar{\psi }}}/(\epsilon \cdot (1-\alpha )))/\ln (1/\alpha )+1\), we have from (C.6) and (C.7),
$$\begin{aligned} \mid v_{n\infty }(s_1,\varepsilon (s_{-1}),x^*,y)-v_{nt} (s_1,\varepsilon (s_{-1}),x^*,y)\mid <\frac{\epsilon }{6}. \end{aligned}$$
(C.8)
Therefore, we need merely to select such a large t and show that, when n is large enough,
$$\begin{aligned} \int _{S^n} v_{n t}(s_1,\varepsilon (s_{-1}),x^*,x^*)\cdot (\sigma ^*)^n(ds)\ge \int _{S^n} v_{n t}(s_1,\varepsilon (s_{-1}),x^*,y)\cdot (\sigma ^*)^n(ds)-\frac{2\epsilon }{3}. \end{aligned}$$
(C.9)
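To make the horizon choice concrete, the following check with the assumed values \(\alpha =0.9\), \({{\bar{\psi }}}=1\), and \(\epsilon =0.3\) confirms that the stated threshold drives the truncation error in (C.8) below \(\epsilon /6\):

```python
import math

alpha, psi_bar, eps = 0.9, 1.0, 0.3      # illustrative values
t = math.ceil(math.log(6 * psi_bar / (eps * (1 - alpha))) / math.log(1 / alpha) + 1)
tail = alpha**t * psi_bar / (1 - alpha)  # geometric tail bound from (C.7)
print(t, tail, tail < eps / 6)           # the truncation error is below eps/6
```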
For \(t=1,2,\ldots \), since \((x^*,\sigma ^*)\) forms an equilibrium for \(\Gamma \), we know (42) is true. This, as well as (C.2) and (C.3), leads to
$$\begin{aligned} \alpha ^{t-\tau }\cdot \left[ \int _S v_\tau (s,\sigma ^*,x^*,y)\cdot \sigma ^*(ds)-\int _S v_\tau (s,\sigma ^*,x^*,x^*)\cdot \sigma ^*(ds)\right] \le \frac{2\alpha ^{t-1}\cdot {{\bar{\psi }}}}{1-\alpha }\le \frac{\epsilon }{3}, \end{aligned}$$
(C.10)
for \(\tau =1,2,\ldots ,t\), \(g\in G\), \(s\in S\), and \(y\in \mathcal{M}(S\times G,X)\).
We associate entities here with those defined in Sect. 5 when \({\bar{t}}\) there is fixed at the t here. To signify the difference in the two notational systems, we add superscript “K” to symbols defined in the previous section. For instance, we write \(v^K_\tau \) for the \(v_\tau \) defined in that section, which has a different meaning than the \(v_\tau \) here. Now, our \(\alpha ^{t-\tau }\cdot v_\tau (s,\sigma ^*,x^*,y)\) can be understood as \(v^K_{t+1-\tau }(s,\sigma ^*,x',y)\), with \(x'\equiv (x'_{t+1-\tau },\ldots ,x'_t)\in (\mathcal{M}(S\times G,X))^\tau \) being such that \(x'_{t'}=x^*\) for \(t'=t+1-\tau ,\ldots ,t\). Due to the consistency of \(\sigma ^*\) with \(x^*\) through the definition (39), we can understand \(\sigma ^*\) as \(T^K_{[1,\tau -1]}(x'_{[1,\tau -1]})\circ \sigma ^K_1\), where \(x'_{[1,\tau -1]}\equiv (x'_1,\ldots ,x'_{\tau -1})\in (\mathcal{M}(S\times G,X))^{\tau -1}\) is such that \(x'_{t'}=x^*\) for \(t'=1,2,\ldots ,\tau -1\).
With these correspondences, (C.10) can be translated into something akin to (27), with the only difference being that \(-\epsilon /3\) should be added to all the right-hand sides. That is, we now know that the current \((x^*,\sigma ^*)\) offers an \((\epsilon /3)\)-Markov equilibrium for the nonatomic game \(\Gamma ^{K}(\sigma ^*)\) with \({\bar{t}}=t\), \(\theta ^K_\tau =\theta \), and \(\psi ^K_\tau =\alpha ^{\tau -1}\cdot \psi \). Even though Theorem 2 is nominally about going from a 0-equilibrium for the nonatomic game to \(\epsilon \)-equilibria for finite games, we can follow exactly the same logic used to prove it to go from an \((\epsilon /3)\)-equilibrium for the nonatomic game to \((2\epsilon /3)\)-equilibria for finite games.
Thus, from one of the theorem’s claims, we can conclude that, for n large enough and any \(y\in \mathcal{M}(S\times G,X)\),
$$\begin{aligned} \int _{S^n}\left( \sigma ^K_1\right) ^n(ds)\cdot v^K_{nt}\left( s_{1},\varepsilon (s_{-1}),x'_{[1,t]},x'_1\right) \ge \int _{S^n}\left( \sigma ^K_1\right) ^n(ds)\cdot v^{K}_{nt}\left( s_{1},\varepsilon (s_{-1}),x'_{[1,t]},y\right) -\frac{2\epsilon }{3}, \end{aligned}$$
(C.11)
where \(x'_{[1,t]}\) is again to be understood as the strategy that takes action \(x^*(s,g)\) whenever the most immediate state–shock pair is (s, g). This translates into (C.9). \(\square \)