We assume now that M is primitive (that is, irreducible and aperiodic), which guarantees the existence of and convergence to a unique stable stationary distribution \(q = \big (q(\alpha ) \big )_{\alpha \in L} \in \mathbb {R}^L\) such that
$$\begin{aligned} q^{\mathsf {T}} = q^{\mathsf {T}} M, \end{aligned}$$
(10)
where \(\mathsf {T}\) denotes transpose.
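As a minimal numerical sketch of Eq. (10), the stationary distribution of a primitive migration matrix can be approximated by applying M repeatedly to an arbitrary initial distribution. The three-location matrix below is a made-up example, and `stationary_distribution` is a hypothetical helper name; neither is taken from the text.

```python
def stationary_distribution(M, tol=1e-13, max_iter=100_000):
    """Approximate q with q^T = q^T M by power iteration.

    Assumes M is row-stochastic and primitive, so the iteration converges
    geometrically from any starting distribution.
    """
    n = len(M)
    q = [1.0 / n] * n  # arbitrary initial distribution
    for _ in range(max_iter):
        # one step of the chain: (q^T M)(b) = sum_a q(a) M[a][b]
        nxt = [sum(q[a] * M[a][b] for a in range(n)) for b in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, q)) < tol:
            return nxt
        q = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical migration matrix on three locations (rows sum to 1, all entries > 0).
M = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
q = stationary_distribution(M)
print(q)
```

Because every entry of this M is positive, the matrix is primitive, which is the situation in which the iteration is guaranteed to converge.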
We also assume that
$$\begin{aligned} \bigwedge \{ \delta \in \mathbb {S}({[n]}) : r_\delta ^{} > 0 \} = \underline{0}. \end{aligned}$$
(11)
That is, the coarsest common refinement of all partitions with positive recombination probability is the minimal partition \(\underline{0}\) of [n] into singletons. This is only a matter of technical convenience; otherwise, we could simply consider as a single site any set of sites that are not separated by any partition \(\delta \) with \(r_\delta ^{} > 0\). Note that Eq. (11) implies that \(\underline{0}\) is the unique absorbing state of the unlabelled partitioning process. We can now explicitly state the asymptotic behaviour of the MRE (4).
Theorem 7.1
Under the above assumptions, one has
$$\begin{aligned} \lim _{t \rightarrow \infty } \mu _t = \mu _\infty = \big ( \mu _\infty (\alpha ) \big )_{\alpha \in L}, \end{aligned}$$
where
$$\begin{aligned} \mu _\infty (\alpha ) = \bigotimes _{i=1}^n \mu _\infty ^{\{i\}}(\alpha ) \end{aligned}$$
(12)
and
$$\begin{aligned} \mu _\infty ^{\{i\}}(\alpha ) \mathrel {\mathop :}=\sum _{\beta \in L} q(\beta ) \mu _0^{\{i\}}(\beta ) \end{aligned}$$
(13)
for all \(\alpha \in L\). The convergence is geometric, i.e. there is a \(\gamma \in (0,1)\) such that
$$\begin{aligned} \mu _t = \mu _\infty + \mathcal {O}(\gamma ^t) \end{aligned}$$
as \(t \rightarrow \infty \), uniformly in \(\mu _0\).
This is in line with Bürger (2009, Theorem 3.1), which states that the solution of the MRE (4) approaches (at a uniform geometric rate) the submanifold defined by spatial stationarity and linkage equilibrium. Spatial stationarity means that
$$\begin{aligned} \mu (\alpha ) = \sum _{\beta \in L} q(\beta )\mu (\beta ) \end{aligned}$$
with q of (10); and, under the assumption (11), linkage equilibrium means that \(\mu (\alpha )\) is the product of its one-dimensional marginals, as in Eq. (12). However, like the explicit time evolution in Theorem 5.1, the explicit expression in Eq. (13) seems to be new.
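To make Eqs. (12) and (13) concrete, the following sketch computes the q-weighted one-site marginals and their product for a toy example with two locations and two biallelic sites. The data layout (`marginals[beta][i]` as a dictionary from alleles to probabilities), the numbers, and the function name `limiting_distribution` are all assumptions made for illustration; in particular, each \(\mu _0(\beta )\) is assumed to be a probability distribution.

```python
from itertools import product


def limiting_distribution(q, marginals):
    """Return the mixed one-site marginals (Eq. 13) and their product (Eq. 12).

    marginals[beta][i] maps each allele at site i to its probability under
    the site-i marginal of mu_0 at location beta.
    """
    n_sites = len(marginals[0])
    mixed = []
    for i in range(n_sites):
        alleles = marginals[0][i].keys()
        # Eq. (13): mixture over locations, weighted by the stationary law q
        mixed.append(
            {a: sum(q[b] * marginals[b][i][a] for b in range(len(q))) for a in alleles}
        )
    # Eq. (12): product measure of the mixed one-site marginals
    limit = {}
    for combo in product(*(m.items() for m in mixed)):
        typ = tuple(a for a, _ in combo)
        p = 1.0
        for _, w in combo:
            p *= w
        limit[typ] = p
    return mixed, limit


q_toy = [0.25, 0.75]  # hypothetical stationary distribution on two locations
marginals = [
    [{"A": 1.0, "a": 0.0}, {"B": 0.5, "b": 0.5}],  # location 0
    [{"A": 0.0, "a": 1.0}, {"B": 0.5, "b": 0.5}],  # location 1
]
mixed, mu_inf = limiting_distribution(q_toy, marginals)
print(mixed)
print(mu_inf)
```

Note that the result does not depend on the location \(\alpha \), in agreement with the right-hand side of Eq. (13).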
In view of Theorem 6.2, this result is highly plausible: almost surely (at a uniform geometric rate), the partitioning process will enter its unique absorbing state where all blocks are singletons. In the sequel, independent migration processes will, for each block, converge to the unique stationary distribution q, again at a geometric rate and uniformly in the initial distribution. This behaviour is also clear in terms of the BRW picture. At some point, the type of each particle is a singleton, whence the particles stop branching and just keep performing independent random walks; see Remark 6.3.
For the formal proof, note that the uniform convergence of the migration processes follows directly from the primitivity of M via standard theory (Karlin and Taylor 1975, Appendix, Thm. 2.3). That the partitioning process enters its absorbing state at a uniform geometric rate is the content of the following lemma.
Lemma 7.2
Let
$$\begin{aligned} \eta \mathrel {\mathop :}=\max _{\delta \in \mathbb {S}({[n]}) \setminus \{ \underline{0}\}} T^\mathrm{{ul}}_{\delta \delta } < 1 \end{aligned}$$
be the maximal sojourn probability of the unlabelled partitioning process and let
$$\begin{aligned} \tau \mathrel {\mathop :}=\min \{ t \in \mathbb {N}_0 : \varSigma _t = \underline{0}\} \end{aligned}$$
be its time to absorption. Then, uniformly in the initial distribution,
$$\begin{aligned} \mathbb {P}(\tau > t) = \mathcal {O}\big ((\eta + \varepsilon )^t \big ) \end{aligned}$$
for any \(\varepsilon > 0\) as \(t \rightarrow \infty \).
Proof
Since the state space is finite and the partitioning process never returns to a state coarser than the current state, this Markov chain may jump at most a finite number of times, say m times, before it is absorbed in \(\underline{0}\). Thus, for any fixed \(\varepsilon > 0\),
$$\begin{aligned} \begin{aligned} \mathbb {P}(\tau > t)&\leqslant \mathbb {P}(\text {the chain has performed at most } m \text { jumps up to time } t) \\&\leqslant \sum _{j = 0}^m \binom{t}{j} (1 - \eta )^j \eta ^{t - j} \\&\leqslant \sum _{j = 0}^m \Big ( \frac{1 - \eta }{\eta } \Big )^j t^m \eta ^t = C' t^m \eta ^t \leqslant C \eta ^t \Big (\frac{\eta + \varepsilon }{\eta } \Big )^t = C(\eta + \varepsilon )^t, \end{aligned} \end{aligned}$$
where \(C' = \sum _{j = 0}^m \Big ( \frac{1 - \eta }{\eta } \Big )^j\) and C is sufficiently large. \(\square \)
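The last inequality in the proof rests on the fact that a polynomial factor is eventually dominated by any geometric gain, so the ratio of the binomial bound to \((\eta + \varepsilon )^t\) stays bounded. The sketch below evaluates this ratio exactly for a few values of t; the values of \(\eta \), \(\varepsilon \) and m, as well as the helper name `jump_bound`, are arbitrary choices made for illustration.

```python
from fractions import Fraction
from math import comb


def jump_bound(t, m, eta):
    """Sum_{j<=m} C(t,j) (1-eta)^j eta^(t-j), the binomial bound from the proof."""
    return sum(comb(t, j) * (1 - eta) ** j * eta ** (t - j) for j in range(m + 1))


# Illustrative values only; they are not taken from the text.
eta, eps, m = Fraction(1, 2), Fraction(1, 20), 3
times = (50, 100, 200, 400)
ratios = [jump_bound(t, m, eta) / (eta + eps) ** t for t in times]
print([float(r) for r in ratios])
```

For these values the ratio decreases along the chosen times and tends to zero, so any C at least as large as its maximum over t works in the lemma.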
Next, we investigate the asymptotic behaviour of the LPP.
Proposition 7.3
There exists a \(\gamma \in (0,1)\) such that
$$\begin{aligned} \mathbb {P}\big ( {\varvec{\varSigma }}_t = \big \{(\{1\},\alpha _1),\ldots ,(\{n\},\alpha _n)\big \} \big ) = \prod _{i = 1}^n q(\alpha _i) + \mathcal {O}(\gamma ^t) \end{aligned}$$
as \(t \rightarrow \infty \), uniformly in \(\alpha ^{}_1,\ldots ,\alpha ^{}_n \in L\) and the initial distribution of the LPP. For \({\varvec{\delta }}\in \mathbb {L}\mathbb {S}({[n]})\) with \(\delta \ne \underline{0}\),
$$\begin{aligned} \mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }}) = \mathcal {O}\big ((\eta + \varepsilon )^t \big ), \end{aligned}$$
for all \(\varepsilon > 0\), again uniformly in the initial distribution.
Proof
Let \(\tau \) be as in Lemma 7.2. The second statement follows immediately from Lemma 7.2 by noting that
$$\begin{aligned} \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}) \leqslant \mathbb {P}(\tau > t). \end{aligned}$$
Now, assume that \({\varvec{\delta }}\) is of the form
$$\begin{aligned} {\varvec{\delta }}= \big \{ (\{1\},\alpha ^{}_1),\ldots ,(\{n\},\alpha ^{}_n) \big \}. \end{aligned}$$
Then, for all \(\gamma _1 > \eta \),
$$\begin{aligned} \begin{aligned}&\mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }})\\&\quad = \mathbb {P}\Big ({\varvec{\varSigma }}_t = {\varvec{\delta }}\big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) \mathbb {P}\Big (\tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) \\&\qquad + \mathbb {P}\Big ({\varvec{\varSigma }}_t = {\varvec{\delta }}\big | \tau> \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) \mathbb {P}\Big (\tau > \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) \\&\quad = \mathbb {P}\Big ({\varvec{\varSigma }}_t = {\varvec{\delta }}\big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) + \mathcal {O}(\gamma _1^t) \end{aligned} \end{aligned}$$
(14)
as \(t \rightarrow \infty \), where the last step follows by an application of Lemma 7.2. Furthermore,
$$\begin{aligned} \begin{aligned} \mathbb {P}\Big ({\varvec{\varSigma }}_t = {\varvec{\delta }}\big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big )&= \mathbb {P}\Big (\Lambda _t^{(i)} = \alpha _i \text { for all } 1 \leqslant i \leqslant n \big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) \\&= \prod _{i = 1}^n \mathbb {P}\Big (\Lambda _t^{(i)} = \alpha _i \big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ). \end{aligned} \end{aligned}$$
(15)
Here, the \( \big ( \Lambda _t^{(i)} \big )_{t \geqslant \tau }\) for \(1 \leqslant i \leqslant n\) are the labels of the (singleton) blocks from time \(\tau \) onwards; they are independent L-valued Markov chains with transition matrix M. By standard theory, there is, regardless of the initial value, a \(\gamma _2 \in (0,1)\) such that
$$\begin{aligned} \mathbb {P}\Big (\Lambda _t^{(i)} = \alpha _i \big | \tau \leqslant \Big \lfloor \frac{t}{2} \Big \rfloor \Big ) = q(\alpha _i) + \mathcal {O}(\gamma _2^t), \end{aligned}$$
uniformly in \(\alpha _i\). Combining this with Eqs. (14) and (15) proves the proposition. \(\square \)
Proof of Theorem 7.1
By Theorem 6.2, Proposition 7.3, and Definition 3.2, we have for some \(\gamma \in (0,1)\), independent of \(\mu _0\),
$$\begin{aligned} \begin{aligned} \mu _t(\alpha )&= \mathbb {E}[\mathcal {R}^{}_{{\varvec{\varSigma }}_t} (\mu _0^{}) \mid {\varvec{\varSigma }}_0 = {\varvec{\underline{1}}}^\alpha ] \\&= \mathcal {O}(\gamma ^t) + \sum _{\beta _1,\ldots ,\beta _n \in L} \Big ( \prod _{i = 1}^n q(\beta _i) \Big )\\&\qquad \times \mathbb {E}[\mathcal {R}^{}_{{\varvec{\varSigma }}_t} (\mu _0^{}) \mid {\varvec{\varSigma }}_0 = {\varvec{\underline{1}}}^\alpha ,{\varvec{\varSigma }}_t = \{ ( \{1\},\beta _1),\ldots ,(\{n\},\beta _n) \} ] \\&= \sum _{\beta _1,\ldots ,\beta _n \in L} \bigotimes _{i = 1}^n q (\beta _i)\mu _0^{\{i\}}(\beta _i) + \mathcal {O}(\gamma ^t) \\&= \bigotimes _{i = 1}^n \sum _{\beta \in L} q(\beta ) \mu _0^{\{i\}}(\beta ) + \mathcal {O}(\gamma ^t) \\&= \bigotimes _{i = 1}^n \mu _\infty ^{\{i\}} (\alpha ) + \mathcal {O}(\gamma ^t) = \mu _\infty ^{}(\alpha ) + \mathcal {O}(\gamma ^t). \end{aligned} \end{aligned}$$
\(\square \)
Since the asymptotic behaviour of the LPP is so simple, we now go one step further and inquire about its quasi-limiting behaviour; that is, its asymptotic behaviour, conditioned on non-absorption of its base. Generally speaking, quasi-limiting distributions describe the first-order approximation of the deviation from the stationary behaviour. Recall that the partitioning process (labelled or unlabelled) is a process of progressive refinement, and never returns to a state coarser than the current state. This is very different from the situation considered by Collet et al. (2013), where the focus is on irreducible chains.
Unlike the limiting distribution, the quasi-limiting distribution will generally depend on the initial distribution. For convenience of notation, we let the LPP start from a maximal labelled partition \({\varvec{\underline{1}}}^\alpha \), consisting of a single block with label \(\alpha \). However, the following discussion can easily be adapted to the more general setting. In what follows, we will exclude the pathological case of \(r_{\underline{0}}^{} = 1\), where the probability of non-absorption is zero, and the conditional distribution we are interested in is not well defined.
We start by recalling the quasi-limiting behaviour of \(\varSigma \), which was already investigated by Martínez (2017). We posit throughout that \(\varSigma _0=\underline{1}\). To state the result, we need some additional notation. First, we define the set of states
$$\begin{aligned} \mathbb {S}^\downarrow ({[n]}) \mathrel {\mathop :}=\{ \delta \in \mathbb {S}({[n]}) : \exists \ell \in \mathbb {N}\text { s.t. }\big ( (T^{\text {ul}})^\ell \big )_{\underline{1}\delta } >0 \} \end{aligned}$$
that are reachable by \(\varSigma \) when starting in \(\underline{1}\). As before, \(\eta \) denotes the maximal sojourn probability of \(\varSigma \) (compare Lemma 7.2). We will also need the set
$$\begin{aligned} \mathcal {F}\mathrel {\mathop :}=\{\delta \in \mathbb {S}^\downarrow ({[n]}) : T^{\text {ul}}_{\delta \delta } = \eta \} \end{aligned}$$
of reachable states with maximal sojourn probability. Note that our assumption \(r_{\underline{0}}^{} \ne 1\) guarantees that \(\eta > 0\). Finally, we define the first hitting time of any given \(\delta \in \mathbb {S}({[n]})\),
$$\begin{aligned} \tau _\delta ^{} \mathrel {\mathop :}=\min \{ t \in \mathbb {N}_0 : \varSigma _t = \delta \}, \end{aligned}$$
and write \( \tau _\mathcal {F}^{} \mathrel {\mathop :}=\min _{\delta \in \mathcal {F}} \tau _\delta ^{} \) for the first hitting time of \(\mathcal {F}\) and, as before, \( \tau = \tau _{\underline{0}} \) for the time to absorption. The following result is known; see Martínez (2017, Theorem 5.5).
Theorem 7.4
For all \(\delta \in \mathcal {F}\), one has
$$\begin{aligned} 0< \mathbb {E}[ \eta ^{- \tau _\delta ^{}}; \tau _\delta ^{}< \infty ] \leqslant \mathbb {E}[ \eta ^{-\tau _{\mathcal {F}}^{}}; \tau ^{}_{\mathcal {F}}< \infty ] < \infty . \end{aligned}$$
For all \(\delta \in \mathbb {S}({[n]})\), the limit
$$\begin{aligned} \mathbb {P}_{\text {qlim}}^\varSigma (\delta ) \mathrel {\mathop :}=\lim _{t \rightarrow \infty } \mathbb {P}(\varSigma _t = \delta \mid \tau > t) \end{aligned}$$
exists and is given by
$$\begin{aligned} \mathbb {P}_{\text {qlim}}^\varSigma (\delta ) = \frac{\mathbb {E}[ \eta ^{- \tau _\delta ^{}}; \tau _\delta ^{}< \infty ]}{\mathbb {E}[ \eta ^{-\tau _{\mathcal {F}}^{}}; \tau ^{}_{\mathcal {F}} < \infty ]} \mathbb {1}_{\delta \in \mathcal {F}}. \end{aligned}$$
Thus defined, \(\mathbb {P}_{\text {qlim}}^\varSigma \) is a probability measure on \(\mathbb {S}({[n]})\), called the quasi-limiting distribution of \(\varSigma \) (starting from \(\underline{1})\).
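To see the conditioning in Theorem 7.4 at work, here is a minimal numerical sketch on a hand-made five-state refinement chain. The transition probabilities and the helper `conditional_law` are illustrative assumptions, not taken from the text. State 0 plays the role of \(\underline{1}\) and state 4 that of \(\underline{0}\); states 2 and 3 have the maximal sojourn probability 1/2 and can only stay put or be absorbed, so they play the role of \(\mathcal {F}\). By symmetry, one expects the conditional law to approach mass 1/2 on each of them.

```python
def conditional_law(T, start, absorbing, t):
    """Law of the chain at time t, conditioned on not being absorbed by then."""
    n = len(T)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(t):
        dist = [sum(dist[a] * T[a][b] for a in range(n)) for b in range(n)]
        # Discard the absorbed mass and renormalise on the transient states;
        # doing so at every step only rescales the vector and avoids underflow.
        alive = sum(p for s, p in enumerate(dist) if s != absorbing)
        dist = [0.0 if s == absorbing else p / alive for s, p in enumerate(dist)]
    return dist


# Hypothetical transition matrix of a chain that can only move to finer states.
T = [
    [0.4, 0.1, 0.0, 0.0, 0.5],      # coarsest state
    [0.0, 0.25, 0.25, 0.25, 0.25],  # intermediate state
    [0.0, 0.0, 0.5, 0.0, 0.5],      # maximal sojourn probability
    [0.0, 0.0, 0.0, 0.5, 0.5],      # maximal sojourn probability
    [0.0, 0.0, 0.0, 0.0, 1.0],      # absorbing state
]
law = conditional_law(T, start=0, absorbing=4, t=300)
print(law)
```

In this toy chain the states with smaller sojourn probability lose their conditional mass at a geometric rate, which is the mechanism behind the indicator \(\mathbb {1}_{\delta \in \mathcal {F}}\) in the theorem.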
Recall that the labels of the different blocks evolve conditionally independently. Thus, we expect the quasi-limiting distribution of the LPP to be similar to the quasi-limiting distribution from Theorem 7.4, garnished with the stationary distribution q of the migration process. To be more explicit, we will prove the following result.
Theorem 7.5
For all \({\varvec{\delta }}\in \mathbb {L}\mathbb {S}({[n]})\),
$$\begin{aligned} \lim _{t \rightarrow \infty } \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau > t) = \Big (\prod _{(d,\lambda ) \in {\varvec{\delta }}} q(\lambda ) \Big ) \mathbb {P}_{\text {qlim}}^\varSigma (\delta ), \end{aligned}$$
where q is the unique stationary distribution (10) of the migration process.
Remark 7.6
In Theorem 7.1, we have approximated the solution of the MRE (4) by using Proposition 7.3 to approximate the distribution of the labelled partitioning process by its limiting distribution. We can try to improve on this rather coarse estimate by also taking into account the quasi-limiting distribution; in principle, the disintegration
$$\begin{aligned} \mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }}) = \mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau \leqslant t) \mathbb {P}(\tau \leqslant t) + \mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau>t ) \mathbb {P}( \tau > t) \end{aligned}$$
allows us to express the error term in Theorem 7.1 via the quasi-limiting distribution, at least when migration is strong compared to recombination. Acquiring precise asymptotics, however, would require more detailed knowledge about the probability \(\mathbb {P}(\tau > t)\) and the rate of convergence of the conditional distribution \(\mathbb {P}( {\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau >t )\) to the quasi-limiting distribution.
At the heart of the proof of Theorem 7.5 is the observation that further refinement of any \(\delta \in \mathcal {F}\) immediately leads to absorption; this was also one of the crucial ingredients in the proof of Theorem 7.4, see Martínez (2017, Theorem 5.5) for the original reference.
Lemma 7.7
For all \(\delta \in \mathcal {F}\), we have
$$\begin{aligned} T^\mathrm{{ul}}_{\delta \delta } + T^\mathrm{{ul}}_{\delta \underline{0}} = 1. \end{aligned}$$
(16)
Proof
We show that, for all \(\delta \in \mathbb {S}^\downarrow ({[n]})\) with \(T^\mathrm{{ul}}_{\delta \delta } + T^\mathrm{{ul}}_{\delta \underline{0}} \ne 1\), one has \(\delta \notin \mathcal {F}\). Indeed, for any such \(\delta \), there is an \(\varepsilon \notin \{\underline{0}, \delta \}\) with \(T^\mathrm{{ul}}_{\delta \varepsilon } > 0\). For any such \(\varepsilon \), there is at least one block \(e \in \varepsilon \) with \(|e |>1\). For any such e, the partition
$$\begin{aligned} \varepsilon ' \mathrel {\mathop :}=\{e\} \cup \big \{ \{i \} : i \in {[n]} \setminus e \big \} \prec \delta \end{aligned}$$
is reachable by Assumption (11) (with [n] replaced by individual blocks of \(\delta \)). We then have
$$\begin{aligned} T^{\text {ul}}_{\varepsilon '\varepsilon '} = r^e_{\{e\}}> \, r^{\tilde{d}}_{\{\tilde{d}\}} \prod _{\begin{array}{c} d \in \delta \\ d \ne \tilde{d}, |d |>1 \end{array}} r^d_{\{d\}} = \prod _{d \in \delta } r^d_{\{d\}} = T^{\text {ul}}_{\delta \delta }, \end{aligned}$$
where \(\tilde{d}\) is the block in \(\delta \) that contains e. The inequality is true since \(\varepsilon ' \prec \delta \) implies that either \(|e |< |\tilde{d} |\), in which case \(r^e_{\{e\}} > r^{\tilde{d}}_{\{\tilde{d}\}}\); or \(|\{d \in \delta : |d |> 1 \}|> 1\), which entails that the constrained product is not empty (note that \(r^d_{\{d\}}<1\) for d with \(|d |> 1\)). We have thus proved that \(\delta \notin \mathcal {F}\). \(\square \)
Remark 7.8
One might be tempted to assume that the sojourn probability is nondecreasing along every path
$$\begin{aligned} \underline{1}\succcurlyeq \delta _1 \succcurlyeq \delta _2 \succcurlyeq \ldots \succcurlyeq \underline{0}\end{aligned}$$
from the maximal partition to the absorbing state. To illustrate that this is not true in general, consider the following setup. Let \(n = 4\) and assume the recombination distribution given by \(r^{}_{\underline{0}} = \frac{1}{2}\), \(r^{}_{\{\{1,2\},\{3,4\}\}} = \frac{1}{10}, r^{}_{\underline{1}} = \frac{2}{5}\) and \(r^{}_\delta = 0\) otherwise. Then, the sojourn probability of the state \(\underline{1}\) is \(r^{}_{\underline{1}} = \frac{2}{5}\), while the (finer) state \(\{\{1,2\},\{3,4\}\}\) has the smaller sojourn probability
$$\begin{aligned} r^{\{1,2\}}_{\{1,2\}} r^{\{3,4\}}_{\{3,4\}} = (1-r^{}_{\underline{0}})^2 = \frac{1}{4}. \end{aligned}$$
\(\diamondsuit \)
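The numbers in Remark 7.8 can be checked mechanically. The sketch below assumes that the marginal recombination probability \(r^d_{\{d\}}\) is obtained by summing \(r_\delta \) over all partitions \(\delta \) that do not split the block d, and that the sojourn probability of a state is the product of these quantities over its blocks (as in the proof of Lemma 7.7). The helper names `restrict`, `sojourn` and `partition` are hypothetical.

```python
from fractions import Fraction


def partition(*blocks):
    """Build a partition as a frozenset of frozensets."""
    return frozenset(frozenset(b) for b in blocks)


def restrict(delta, block):
    """Partition induced by delta on `block` (intersect blocks, drop empty ones)."""
    return frozenset(b & block for b in delta if b & block)


def sojourn(r, state):
    """Product over blocks d of `state` of the probability that d is not split."""
    prob = Fraction(1)
    for d in state:
        # assumed marginalisation: sum r over partitions that leave d intact
        prob *= sum(p for delta, p in r.items() if restrict(delta, d) == frozenset({d}))
    return prob


top = partition({1, 2, 3, 4})
mid = partition({1, 2}, {3, 4})
bottom = partition({1}, {2}, {3}, {4})

# Recombination distribution from Remark 7.8.
r = {bottom: Fraction(1, 2), mid: Fraction(1, 10), top: Fraction(2, 5)}

print(sojourn(r, top), sojourn(r, mid))  # prints: 2/5 1/4
```

Exact rational arithmetic is used so that the comparison between the two sojourn probabilities is free of rounding issues.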
The idea of the proof of Theorem 7.5 is simple. First, notice that Lemma 7.7 implies that, conditional on non-absorption, \(\varSigma \) remains constant after \(\tau _{\mathcal {F}}^{}\). From then on, the labels keep on evolving independently according to M, and their distributions converge to q. To make this rigorous, we just need to make sure that \(t - \tau _\mathcal {F}^{}\) is large enough (conditional on non-absorption). This is the content of the next lemma.
Lemma 7.9
(a) There exists \(c > 0\) such that \(\mathbb {P}(\tau > t) \geqslant c \eta ^t\) for all \(t \in \mathbb {N}\).
(b) Let \(\eta ' \mathrel {\mathop :}=\max _{\delta \in \mathbb {S}({[n]}) \setminus (\mathcal {F}\cup \{\underline{0}\})} T^\mathrm{{ul}}_{\delta \delta }\). Then, for all \(\eta '' > \eta '\), there exists \(C > 0\) such that \(\mathbb {P}(\tau ^{}_\mathcal {F}\wedge \tau > t) \leqslant C (\eta '')^t\) for all \(t \in \mathbb {N}\).
(c) There is a \(\gamma \in (0,1)\) such that \(\lim _{t \rightarrow \infty } \mathbb {P}(\tau _{\mathcal {F}}^{}> \gamma t \mid \tau > t) = 0\).
Proof
First, we show (a). By definition, \(\mathcal {F}\subseteq \mathbb {S}^\downarrow ({[n]})\). Thus, there exists a \(t_0 \in \mathbb {N}\) such that \(\mathbb {P}(\tau _\mathcal {F}^{} = t_0) > 0\). Then, we have for all \(t \geqslant t_0\) that
$$\begin{aligned} \mathbb {P}(\tau> t)&\geqslant \mathbb {P}(\tau> t, \tau _\mathcal {F}^{} = t_0) \\&= \mathbb {P}(\tau > t \mid \tau _\mathcal {F}^{} = t_0) \, \mathbb {P}(\tau _\mathcal {F}^{} = t_0) = c' \eta ^{t - t_0} = (c' \eta ^{-t_0} )\eta ^t \end{aligned}$$
with \(c' = \mathbb {P}(\tau _\mathcal {F}^{} = t_0)\). Note that we used Lemma 7.7 in the second-last step. Now, simply choose
$$\begin{aligned} c \mathrel {\mathop :}=\min \Big ( \Big \{ \frac{\mathbb {P}(\tau > t)}{\eta ^t} : 0 \leqslant t \leqslant t_0 \Big \}\cup \big \{c' \eta ^{-t_0} \big \} \Big ). \end{aligned}$$
For the proof of (b), we couple \((\varSigma _t)_{t \in \mathbb {N}_0}\) to another process \((N_t)_{t \in \mathbb {N}_0}\) with values in \(\mathbb {N}_0 \cup \{\infty \}\) and \(N_0=0\). It evolves as follows. When \(\varSigma _{t+1} = \varSigma _t\), then \(N_{t+1}\mathrel {\mathop :}=N_t\) and when \(\varSigma _{t+1} \in \mathcal {F}\cup \{ \underline{0}\}\), we set \(N_{t+1} \mathrel {\mathop :}=\infty \). In all other cases, we perform a Bernoulli experiment with success probability
$$\begin{aligned} \frac{1 - \eta '}{1 - T^{\text {ul}}_{\varSigma _t \varSigma _t}}. \end{aligned}$$
Upon success, we set \(N_{t+1} \mathrel {\mathop :}=N_t + 1\); otherwise, \(N_{t+1} \mathrel {\mathop :}=N_t\). Note that the marginal \((N_t)_{t \in \mathbb {N}_0}\) of the coupling \((\varSigma _t,N_t)_{t \in \mathbb {N}_0}\) stochastically dominates a process that has independent Bernoulli increments with parameter \(1 - \eta '\).
As we have argued before, the partitioning process can only jump a finite number of times before hitting either \(\underline{0}\) or \(\mathcal {F}\). Thus, there is a positive integer m such that, for all \(t \in \mathbb {N}\), \(\tau \wedge \tau ^{}_\mathcal {F}> t\) implies \(N_t \leqslant m\). Thus,
$$\begin{aligned} \mathbb {P}(\tau \wedge \tau _\mathcal {F}^{} > t) \leqslant \mathbb {P}(N_t \leqslant m) \leqslant \sum _{k = 0}^{m} \binom{t}{k} (1-\eta ')^k (\eta ')^{t - k} = P(t) (\eta ')^t < C (\eta '')^t, \end{aligned}$$
where P(t) is a polynomial with degree \(\leqslant m\), and C and \(\eta ''\) are as stated.
Finally, (c) is a straightforward consequence of (a) and (b); first, fix any \(\eta '' \in (\eta ', \eta )\). Then, choose \(\gamma \) such that \((\eta '')^\gamma < \eta \). \(\square \)
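For the reader's convenience, here is one way to spell out the estimate behind (c); this is a sketch of how we read the argument, with \(\lfloor \cdot \rfloor \) inserted because (b) is stated for integer times. Since \(\gamma < 1\), the event \(\{\tau _\mathcal {F}^{} > \gamma t, \, \tau > t\}\) is contained in \(\{\tau _\mathcal {F}^{} \wedge \tau > \lfloor \gamma t \rfloor \}\), so (a) and (b) give

$$\begin{aligned} \mathbb {P}(\tau _\mathcal {F}^{} > \gamma t \mid \tau > t) \leqslant \frac{\mathbb {P}(\tau _\mathcal {F}^{} \wedge \tau > \lfloor \gamma t \rfloor )}{\mathbb {P}(\tau > t)} \leqslant \frac{C (\eta '')^{\lfloor \gamma t \rfloor }}{c \, \eta ^t} \leqslant \frac{C}{c \, \eta ''} \Big ( \frac{(\eta '')^\gamma }{\eta } \Big )^t , \end{aligned}$$

which tends to zero as \(t \rightarrow \infty \) as soon as \((\eta '')^\gamma < \eta \). Because \(0< \eta ''< \eta < 1\), this condition is equivalent to \(\gamma > \log \eta / \log \eta ''\), and the right-hand side lies in (0, 1), so an admissible \(\gamma \in (0,1)\) exists.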
After these preparations, the proof of Theorem 7.5 is not difficult.
Proof of Theorem 7.5
Choose \(\gamma \) as in (c) of Lemma 7.9. We split
$$\begin{aligned} \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau> t) = \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}, \tau _\mathcal {F}^{}> \gamma t \mid \tau> t) + \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}, \tau _\mathcal {F}^{} \leqslant \gamma t \mid \tau > t). \end{aligned}$$
The first probability tends to zero as \(t \rightarrow \infty \), due to our choice of \(\gamma \). The second can be rewritten as
$$\begin{aligned} \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau> t, \tau _\delta ^{} \leqslant \gamma t) \mathbb {P}(\varSigma _t = \delta , \tau _\mathcal {F}^{} \leqslant \gamma t \mid \tau > t), \end{aligned}$$
where we have used that Lemma 7.7 implies
$$\begin{aligned} \{\tau>t,\tau ^{}_\mathcal {F}\leqslant \gamma t, \varSigma _t = \delta \}=\{\tau >t,\tau ^{}_\delta \leqslant \gamma t\}. \end{aligned}$$
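Here, the second factor differs from \(\mathbb {P}(\varSigma _t = \delta \mid \tau > t)\) by at most \(\mathbb {P}(\tau _\mathcal {F}^{} > \gamma t \mid \tau > t)\); it therefore converges to \(\mathbb {P}_{\text {qlim}}^\varSigma (\delta )\) by Theorem 7.4, the choice of \(\gamma \) and Lemma 7.9(c).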
Now consider the first factor. Together with \(\tau > t\) and Lemma 7.7, \(\tau _\delta ^{} \leqslant \gamma t\) implies that \(\varSigma _{s} = \delta \) for all s between \(\gamma t\) and t. During this period, the labels of the blocks of \(\delta \) evolve independently, and by the uniform convergence to the stationary distribution q, we obtain
$$\begin{aligned} \lim _{t \rightarrow \infty } \mathbb {P}({\varvec{\varSigma }}_t = {\varvec{\delta }}\mid \tau > t, \tau _\delta ^{} \leqslant \gamma t) = \prod _{(d,\lambda ) \in {\varvec{\delta }}} q(\lambda ), \end{aligned}$$
which completes the argument. For additional details, see also the proof of Proposition 7.3. \(\square \)