The Generalized Entropy Ergodic Theorem for Nonhomogeneous Bifurcating Markov Chains Indexed by a Binary Tree

Shi, Zhiyan; Wang, Zhongzhi; Zhong, Pingping; Fan, Yan

doi:10.1007/s10959-021-01117-1

Published: 03 August 2021

Volume 35, pages 1367–1390, (2022)

Journal of Theoretical Probability

Zhiyan Shi ORCID: orcid.org/0000-0002-3190-8121¹,
Zhongzhi Wang²,
Pingping Zhong¹ &
…
Yan Fan¹

Abstract

In this paper, we study the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, by constructing a class of random variables with a parameter and the mean value of one, we establish a strong limit theorem for delayed sums of the bivariate functions of such chains using the Borel–Cantelli lemma. Secondly, we prove the strong law of large numbers for the frequencies of occurrence of states of delayed sums and the generalized entropy ergodic theorem. As corollaries, we generalize some known results.

1 Introduction

Let $\{X_n,n \ge 0\}$ be an arbitrary information source taking values in a finite alphabet on the probability space $(\Omega ,{{{\mathcal {F}}}},\mathbf {P})$. Set $f_n(\omega )=-\frac{1}{n} \ln \mathbf {P}(X_0,X_1,\ldots ,X_n)$; in information theory, $f_n(\omega )$ is called the relative entropy density of $\{X_0,X_1,\ldots , X_n\}$. The convergence of $f_n(\omega )$ to a constant in some sense ($L_1$ convergence, convergence in probability, or a.e. convergence) is known as the entropy ergodic theorem, the asymptotic equipartition property (AEP), or the Shannon–McMillan–Breiman theorem, and it is a fundamental theorem of information theory. Shannon [22] first proved the entropy ergodic theorem for convergence in probability for stationary ergodic information sources with finite alphabet. The entropy ergodic theorem in $L_1$ and a.e. convergence, respectively, for stationary ergodic information sources was established by McMillan [20] and Breiman [6]. Chung [8] considered the case of a countable alphabet, and Billingsley [5] extended the result to stationary nonergodic sources. Gray and Kieffer [14] extended it to asymptotically mean stationary processes. The entropy ergodic theorem for general stochastic processes can be found, for example, in Barron [2] and Algoet and Cover [1]. Yang and Liu [19, 30] obtained the entropy ergodic theorem for nonhomogeneous Markov information sources with finite state space. Yang and Liu [32] studied the entropy ergodic theorem for mth-order nonhomogeneous Markov information sources.

Let $S_n=\sum _{k=1}^{n}X_k$, $S_0=0$ and for any $m,n \in \mathbb {N^{+}}$, define

$$\begin{aligned} T_{m,n}=S_{m+n}-S_{m}=\sum _{k=m+1}^{m+n}X_k, \end{aligned}$$

$\frac{T_{m,n}}{n}$ is called the moving average or delayed sum in probability theory. Moving averages have been studied extensively. Shepp [23] studied the limiting values of the averages $[S_{n+f(n)}-S_n]/f(n)$ for i.i.d. random variables. Gaposhkin [12] established the law of large numbers for moving averages of independent random variables. Lanzinger [17] studied an almost sure limit theorem for moving averages of random variables between the strong law of large numbers and the Erdős–Rényi law. Lai [16] gave a review of limit theorems for moving averages and described some recent developments motivated by applications to signal detection and change point problems. Recently, Wang and Yang [27] considered the entropy ergodic theorem in the moving average form and obtained the generalized entropy ergodic theorem for nonhomogeneous Markov chains.
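As a small illustration of the delayed sum defined above, the following sketch computes $T_{m,n}/n$ from prefix sums. The function name `delayed_average` and the sample data are hypothetical and serve only to make the indexing concrete.

```python
from itertools import accumulate

def delayed_average(xs, m, n):
    """Return T_{m,n}/n = (S_{m+n} - S_m)/n for xs = [X_1, X_2, ...]."""
    if n <= 0 or m < 0 or m + n > len(xs):
        raise ValueError("need n >= 1, m >= 0 and m + n <= len(xs)")
    # prefix[k] = S_k = X_1 + ... + X_k, with S_0 = 0
    prefix = [0] + list(accumulate(xs))
    return (prefix[m + n] - prefix[m]) / n

xs = [1, 0, 1, 1, 0, 1]            # hypothetical sample X_1, ..., X_6
avg = delayed_average(xs, 2, 3)    # averages X_3, X_4, X_5
```

Taking `m = 0` recovers the ordinary sample mean $S_n/n$, which is the sense in which the delayed average generalizes it.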

Tree-indexed stochastic processes have been an active research topic in the study of stochastic structures in recent years. They include tree-indexed random walks, random trees (such as Galton–Watson trees), and tree-indexed Markov chains, among others. There is a large body of work on probability limit theorems for tree-indexed stochastic processes, which we briefly list as follows. Chen [7] studied the average properties of random walks on Galton–Watson trees. Telcs and Wormald [26] studied the strong recurrence of tree-indexed random walks determined by the resistance properties of spherically symmetric graphs. Dembo et al. [10] extended the notions of shift-invariance and specific relative entropy, as typically understood for Markov fields on deterministic graphs such as ${\mathbb {Z}}^d$, to Markov fields on random trees, and also developed single-generation empirical measure large deviation principles for a more general class of random trees. Le Gall [18] considered Galton–Watson trees associated with a critical offspring distribution and conditioned to have exactly n vertices, and proved that these conditioned spatial trees converge as $n\rightarrow \infty $, modulo an appropriate rescaling, towards the conditioned Brownian tree under suitable assumptions on the offspring distribution and the spatial displacements. Guyon [13] studied the law of large numbers and central limit theorems for bifurcating Markov chains indexed by a binary tree, and applied these results to detect cellular aging in Escherichia coli, using the data of Stewart et al. and a bifurcating autoregressive model. Yamamoto [34] established a large deviation theorem for the number of branches of each order in a random binary tree, where the rate function associated with a large deviation was given by asymptotic forms of the rate function.

A significant line of progress on tree-indexed Markov chains concerns their entropy ergodic theorem. Benjamini and Peres [3] gave the definition of tree-indexed Markov chains and studied their recurrence and ray-recurrence. Berger and Ye [4] studied the existence of the entropy rate for some stationary random fields on a homogeneous tree. Ye and Berger [35, 36], using Pemantle’s [21] result and a combinatorial approach, obtained the entropy ergodic theorem in probability for a PPG-invariant and ergodic random field on a homogeneous tree. Yang and Liu [30] established the strong law of large numbers for the frequency of state occurrence for Markov chains indexed by a homogeneous tree (in fact, a special case of tree-indexed Markov chains and PPG-invariant random fields). Yang [31, 33] obtained the strong law of large numbers and the entropy ergodic theorem for tree-indexed Markov chains. Huang and Yang [15] studied the strong law of large numbers and the entropy ergodic theorem for Markov chains indexed by a uniformly bounded tree. Shi and Yang [25] studied the entropy ergodic theorem for mth-order nonhomogeneous Markov chains indexed by a tree. Recently, Dang et al. [9] defined a discrete form of nonhomogeneous bifurcating Markov chains indexed by a binary tree and discussed their equivalent properties; they also studied the strong law of large numbers and the entropy ergodic theorem for these Markov chains with finite state space. Shi et al. [24] studied the strong law of large numbers and the entropy ergodic theorem for Markov chains indexed by a Cayley tree in a Markovian environment with countable state space.

Inspired by Dang et al. [9] and Wang and Yang [27], and adding some new ideas, we study in this paper the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, we prove a strong limit theorem for moving averages of the bivariate functions of such chains. Secondly, we prove the strong law of large numbers for the frequencies of occurrence of states in the moving average form and the generalized entropy ergodic theorem. As corollaries, we generalize some known results. The main novelty of this paper lies in generalizing the entropy ergodic theorem to the moving average form. As the classical Doob martingale convergence theorem cannot be employed, the core technique in this paper is to construct a class of random variables with a parameter and mean value one, and to use the Borel–Cantelli lemma to prove the a.e. convergence of certain random variables.

The rest of this paper is organized as follows. Section 2 gives some preliminaries, reviewing concepts and properties of Markov chains indexed by a tree and of the entropy density. The main results of this article, i.e. the strong law of large numbers for the frequencies of occurrence of states and the generalized entropy ergodic theorem for finite nonhomogeneous bifurcating Markov chains indexed by a binary tree, are stated in Sect. 3. Finally, the proofs of the main results in Sect. 3 are provided in Sect. 4.

2 Preliminaries

A tree is a graph T which is connected and contains no circuits. Given any two vertices $\alpha \ne \beta \in T$, let $\overline{\alpha \beta }$ be the unique path connecting $\alpha $ and $\beta $, and define the distance $d(\alpha ,\beta )$ to be the number of edges contained in the path $\overline{\alpha \beta }$. Select a vertex as the root (denoted by o). For any two vertices $\sigma $ and t of tree T, we write $\sigma \le t$ if $\sigma $ is on the unique path from the root o to t. We denote by $\sigma \wedge t$ the vertex farthest from o satisfying $\sigma \wedge t\le t$ and $\sigma \wedge t\le \sigma $. The set of all vertices with distance n from the root o is called the n-th level of T. We denote by $L_n$ the set of all vertices on level n $(L_0 = \{o\})$, by $L_m^n$ the set of all vertices on the mth to nth levels of T, and in particular by $T^{(n)}$ the set of all vertices on level 0 (the root o) to level n. Let T be any tree and $t\in T\backslash \{o\}$. If a vertex of the tree is on the unique path from the root o to t and is the nearest to t, we call it the predecessor of t and denote it by $1_t$; we also call t a successor of $1_t$. If the root of a tree has N neighboring vertices and every other vertex has $N +1$ neighboring vertices, we call this type of tree a Cayley tree and denote it by $T_{C,N}$. That is, every vertex t of the Cayley tree $T_{C,N}$ has N successors on the next level. In this paper, we mainly investigate the binary tree $T_{C,2}$, in which each vertex has two successors on the next level. For simplicity, we denote $T_{C,2}$ by $T_2$ (see Fig. 1). For any vertex t of the binary tree $T_2$, we denote by $t^1$ and $t^2$ the two successors of t, and call them the first successor and the second successor of t, respectively.
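The tree notation above can be made concrete with a small sketch. It assumes a heap-style numbering of $T_2$ that is not part of the paper: the root o is vertex 1, and the successors $t^1$, $t^2$ of vertex t are $2t$ and $2t+1$. All function names are hypothetical.

```python
def level(t):
    """Distance d(o, t) from the root, with the root numbered 1."""
    return t.bit_length() - 1

def predecessor(t):
    """The predecessor 1_t of a non-root vertex t."""
    if t == 1:
        raise ValueError("the root has no predecessor")
    return t // 2

def successors(t):
    """The first and second successors (t^1, t^2) of t."""
    return 2 * t, 2 * t + 1

def levels(m, n):
    """The vertex set L_m^n, i.e. all vertices on levels m..n inclusive."""
    return range(2 ** m, 2 ** (n + 1))

def meet(s, t):
    """The vertex s ^ t: the deepest vertex lying on both root paths."""
    while level(s) > level(t):
        s //= 2
    while level(t) > level(s):
        t //= 2
    while s != t:
        s //= 2
        t //= 2
    return s
```

Under this numbering, $|L_n| = 2^n$, $|T^{(n)}| = 2^{n+1}-1$ and $|L_m^n| = 2^{n+1}-2^m$, which are the normalizing constants that appear in the limit theorems for the binary tree.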

Let $(\Omega ,{{{\mathcal {F}}}},{\mathbf {P}})$ be a probability space, T be any tree, and $\{X_{t},t \in T\}$ be a tree-indexed stochastic process defined on $(\Omega ,{{{\mathcal {F}}}},\mathbf {P})$. Let A be a subgraph of T and $X^{A} = \{X_{t},t \in A\}$. We denote by |A| the number of vertices of A, and by $x^{A}$ a realization of $X^{A}$. Dang et al. [9] defined the discrete form of nonhomogeneous bifurcating Markov chains indexed by a binary tree. We first review the definition of this process.

Definition 2.1

(Dang et al. [9]) Let $T_{2}$ be a binary tree, G a countable state space, $\{X_{t},t\in T_{2}\}$ be a collection of G-valued random variables defined on probability space $(\Omega ,\mathcal{F},\mathbf {P})$. Let

$$\begin{aligned} p=\{p(x),x\in G\} \end{aligned}$$

(1)

be a distribution on G, and

$$\begin{aligned} P_{t}=(P_{t}(y_{1},y_{2}|x)),\quad x,y_{1},y_{2}\in G,\quad t\in T_{2} \end{aligned}$$

(2)

be a collection of stochastic matrices (that is $P_{t}(y_{1},y_{2}|x)\ge 0,\forall y_{1},y_{2},x \in G$, and $\sum _{(y_{1},y_{2})\in G^{2}} P_{t}(y_{1},y_{2}|x)=1,\forall x\in G)$ on $G\times G^{2}$. If $\forall n\ge 1$,

$$\begin{aligned} {\mathbf {P}}(X^{L_{n}} = x^{L_{n}} | X^{T^{(n-1)}}=x^{T^{(n-1)}}) = \prod _{t\in L_{n-1}} P_{t}(x_{t^{1}},x_{t^{2}} | x_{t}), \end{aligned}$$

(3)

and

$$\begin{aligned} {\mathbf {P}}(X_{o} = x) = p(x),\quad \forall x\in G, \end{aligned}$$

(4)

$\{X_{t},t\in T_{2}\} $ will be called G-valued nonhomogeneous bifurcating Markov chains indexed by a binary tree $T_{2}$ with the initial distribution (1) and stochastic matrices (2). If $\forall t \in T_{2},P_{t}=P$, where $P=\big \{P(y_{1},y_{2}|x),x,y_{1},y_{2}\in G\big \}$ is a stochastic matrix on $G\times G^{2}$, $\{X_{t},t\in T_{2}\} $ will be called G-valued homogeneous bifurcating Markov chains indexed by a binary tree.
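To illustrate Definition 2.1, the following sketch samples a realization $x^{T^{(n)}}$ level by level, drawing each pair $(x_{t^1},x_{t^2})$ jointly from $P_t(\cdot ,\cdot |x_t)$ as in (3). It assumes a heap-style numbering of $T_2$ (root 1, successors $2t$ and $2t+1$) and a hypothetical two-state kernel `example_kernel`; neither comes from the paper.

```python
import random

def example_kernel(t, x):
    """Hypothetical two-state kernel P_t(., . | x) for heap-indexed vertex t."""
    eps = 0.1 / t  # vertex-dependent perturbation that vanishes as t grows
    return {
        (x, x): 0.6 + eps,            # both successors copy the parent state
        (x, 1 - x): 0.15 - eps / 2,
        (1 - x, x): 0.15 - eps / 2,
        (1 - x, 1 - x): 0.1,
    }

def sample_chain(p, kernel, depth, rng):
    """Sample {vertex: state} on levels 0..depth of the binary tree."""
    # Root state drawn from the initial distribution p, as in (4).
    x = {1: rng.choices(range(len(p)), weights=p)[0]}
    # Vertices 1 .. 2^depth - 1 are exactly levels 0 .. depth-1.
    for t in range(1, 2 ** depth):
        dist = kernel(t, x[t])
        pairs = list(dist)
        # The two successors are drawn jointly, not independently.
        y1, y2 = rng.choices(pairs, weights=[dist[q] for q in pairs])[0]
        x[2 * t], x[2 * t + 1] = y1, y2
    return x

x = sample_chain([0.5, 0.5], example_kernel, 4, random.Random(0))
```

Drawing the pair jointly is the point of the bifurcating model: the two successors of a vertex may be dependent given the parent, which a kernel of product form cannot express.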

Dang et al. [9] presented the following equivalent properties for nonhomogeneous bifurcating Markov chains indexed by a binary tree.

Property 2.1

(Dang et al. [9]) Let $T_{2}$ be a binary tree, G a countable state space, and $\{X_{t},t\in T_{2}\}$ be a collection of G-valued random variables defined on probability space $(\Omega ,\mathcal{F},\mathbf {P})$, then the three propositions below are equivalent:

(i)
$\{X_{t},t\in T_{2}\}$ is a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with the initial distribution (1) and stochastic matrices (2) defined by Definition 2.1;
(ii)
For $\forall n\ge 1$ and $\forall x^{T^{(n)}}\in G^{T^{(n)}}$, we have
$$\begin{aligned} {\mathbf {P}}(X^{T^{(n)}} = x^{T^{(n)}}) = p(x_{o})\prod _{ t\in T^{(n-1)}}P_{t}(x_{t^{1}},x_{t^{2}}|x_{t}); \end{aligned}$$
(5)
(iii)
For $\forall n\ge 1$ and $t,t_{1},t_{2},\ldots ,t_{n} \in T_{2}$, satisfying $t_{i} \wedge t^{1} \le t,t_{i} \wedge t^{2} \le t,1\le i \le n$, we have
$$\begin{aligned}&{\mathbf {P}}(X_{t^{1}}=y_{1},X_{t^{2}}=y_{2} | X_{t} = x,X_{t_{1}} = x_{t_{1}},\ldots ,X_{t_{n}}= x_{t_{n}}) \nonumber \\&\quad = P_{t}(y_{1},y_{2} | x)= {\mathbf {P}}(X_{t^{1}} = y_{1},X_{t^{2}}= y_{2} | X_{t} = x), \ \ \forall x,y_{1},y_{2} \in G, \end{aligned}$$
(6)
and
$$\begin{aligned} {\mathbf {P}}(X_{o} = x) = p(x),\ \ \forall x\in G. \end{aligned}$$

Remark 2.1

It is a consequence of the Kolmogorov extension theorem that there exists a collection of G-valued random variables $\{X_{t},t\in T_{2}\}$ on some probability space such that (5) holds.

Remark 2.2

By (5), we can easily obtain that for $\forall m,n \ge 1,n \ge m$ and $\forall x^{L^{n}_{m}} \in G^{L^{n}_{m}}$,

$$\begin{aligned} {\mathbf {P}}(X^{L^{n}_{m}}= x^{L^{n}_{m}}) = {\mathbf {P}}(X^{L_{m}} = x^{L_{m}})\prod _{t\in L^{n-1}_{m}} P_{t}(x_{t^{1}},x_{t^{2}} | x_{t}). \end{aligned}$$

(7)

Remark 2.3

Let $\{X_{t},t\in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with the stochastic matrices (2) defined by Definition 2.1. From the second equality of (6), we have that for any $t\in T_{2}$,

$$\begin{aligned} {\mathbf {P}}(X_{t^{1}} = y_{1},X_{t^{2}} = y_{2} | X_{t} = x) = P_{t}(y_{1},y_{2} | x). \end{aligned}$$

Below we will recall the definition of tree indexed nonhomogeneous Markov chains.

Definition 2.2

(Dong et al. [11]) Let T be a locally finite, infinite tree, G a countable state space, $\{X_{t},t\in T\}$ be a collection of G-valued random variables defined on probability space $(\Omega ,{{\mathcal {F}}},\mathbf {P})$. Let

$$\begin{aligned} p = \{p(x),x\in G\} \end{aligned}$$

(8)

be a distribution on G, and

$$\begin{aligned} Q_{t} = (Q_{t}(y | x)),\ \ x,y\in G,\ \ t\in T\backslash \{o\} \end{aligned}$$

(9)

be a collection of transition matrices on $G^{2}$. If $\forall n\ge 1$, and $t, t_{1}, t_{2},\ldots ,t_{n} \in T$, satisfying $t_{i}\wedge t \le 1_{t},1 \le i \le n$, we have

$$\begin{aligned} {\mathbf {P}}(X_{t}&= y | X_{1_{t}} = x,X_{t_{1}} = x_{t_{1}},\ldots ,X_{t_{n}} = x_{t_{n}})\nonumber \\&{=\mathbf {P}}(X_{t} = y | X_{1_{t}} = x) = Q_{t}(y | x),\ \ \ \forall x,y \in G, \end{aligned}$$

(10)

and

$$\begin{aligned} {\mathbf {P}}(X_{o} = x) = p(x),\ \ \ \forall x\in G, \end{aligned}$$

(11)

$\{X_{t},t \in T\}$ will be called G-valued nonhomogeneous Markov chains indexed by tree T with the initial distribution (8) and transition matrices (9), or called tree indexed nonhomogeneous Markov chains with state space G.

The above definition is the natural generalization of the definition of homogeneous Markov chains indexed by tree T (see Benjamini and Peres [3]). Similar to the equivalent property of nonhomogeneous bifurcating Markov chains indexed by a binary tree, by Property 2.1, we can immediately obtain the equivalent property of nonhomogeneous Markov chains indexed by a tree.

Property 2.2

(Dang et al. [9]) Let T be a locally finite, infinite tree, G a countable state space, and $\{X_{t},t\in T\}$ be a collection of G-valued random variables defined on probability space $(\Omega ,{{\mathcal {F}}},\mathbf {P})$. Then $\{X_{t},t\in T\}$ is a tree indexed nonhomogeneous Markov chain taking values in G defined by Definition 2.2 if and only if $\forall n \ge 1$ and $\forall x^{T^{(n)}}\in G^{T^{(n)}}$,

$$\begin{aligned} {\mathbf {P}}(X^{T^{(n)}} = x^{T^{(n)}}) = p(x_{o})\prod _{ t\in {T^{(n)}\backslash \{o\}}} Q_{t}(x_{t}|x_{1_{t}}). \end{aligned}$$

(12)

Remark 2.4

From Property 2.2, we know that $\{X_{t},t\in T_{2}\}$ is a tree indexed nonhomogeneous Markov chain if and only if, $\forall n\ge 1$ and $\forall x^{T^{(n)}}\in G^{T^{(n)}}$,

$$\begin{aligned} {\mathbf {P}}(X^{T^{(n)}} = x^{T^{(n)}}) = p(x_{o})\prod _ {t\in T^{(n-1)}} Q_{t^{1}}(x_{t^{1}}|x_{t})Q_{t^{2}}(x_{t^{2}}|x_{t}). \end{aligned}$$

(13)

Thus a nonhomogeneous bifurcating Markov chain indexed by a binary tree is the nonhomogeneous Markov chain indexed by a binary tree if and only if, for $\forall t\in T_{2}$ and $\forall x,y_{1},y_{2} \in G$,

$$\begin{aligned} P_{t}(y_{1},y_{2}|x) = Q_{t^{1}}(y_{1}|x)Q_{t^{2}}(y_{2}|x), \end{aligned}$$

(14)

that is $\forall t\in T_{2}$,

$$\begin{aligned} \mathbf {P}(X_{t^{1}} = y_{1},X_{t^{2}} = y_{2}|X_{t} = x) = \mathbf {P}(X_{t^{1}} = y_{1}|X_{t} = x)\mathbf {P}(X_{t^{2}} = y_{2}|X_{t} = x). \end{aligned}$$

The above equality means that a nonhomogeneous bifurcating Markov chain indexed by a binary tree is a nonhomogeneous Markov chain indexed by a binary tree if and only if, for any $t\in T_{2}$, the two successors $X_{t^{1}}$ and $X_{t^{2}}$ are conditionally independent given $X_{t}$.
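Condition (14) can be checked numerically for a given kernel. If $P_t(\cdot ,\cdot |x)$ factorizes, the factors must be its two marginals, so it suffices to compare the kernel with the product of its marginals. The sketch below does this for one fixed vertex and one fixed parent state; the dictionary representation and the function name are assumptions made for illustration.

```python
def is_conditionally_independent(P_x, states, tol=1e-12):
    """Check P(y1, y2 | x) == Q1(y1 | x) * Q2(y2 | x) for one parent state x.

    P_x maps pairs (y1, y2) to probabilities; missing pairs count as 0.
    """
    # Marginal laws of the first and second successor given the parent.
    q1 = {y: sum(P_x.get((y, z), 0.0) for z in states) for y in states}
    q2 = {z: sum(P_x.get((y, z), 0.0) for y in states) for z in states}
    return all(
        abs(P_x.get((y, z), 0.0) - q1[y] * q2[z]) <= tol
        for y in states
        for z in states
    )

# Hypothetical kernels on G = {0, 1} for a fixed parent state.
product_kernel = {(0, 0): 0.15, (0, 1): 0.15, (1, 0): 0.35, (1, 1): 0.35}
coupled_kernel = {(0, 0): 0.5, (1, 1): 0.5}   # successors always agree
```

The first kernel is the product of the marginals (0.3, 0.7) and (0.5, 0.5) and so corresponds to a tree-indexed Markov chain in the sense of Definition 2.2; the second is a genuinely bifurcating kernel that is not of that form.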

Let T be a tree, $\{X_{t},t\in T\}$ be a stochastic process indexed by tree T taking values in countable state space G. Denote $P(x^{L^{n}_{m}}) = {\mathbf {P}}(X^{L^{n}_{m}} = x^{L^{n}_{m}})$. Let $\{a_{n},n\ge 0\}$ and $\{\phi (n),n\ge 0\}$ be two sequences of nonnegative integers such that $\lim _{n\rightarrow \infty }\phi (n) = \infty $. Define

$$\begin{aligned} f_{a_{n},\phi (n)}(\omega ) = - \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}} |}\ln P(X^{L^{a_{n}+\phi (n)}_{a_{n}}} ), \end{aligned}$$

(15)

$f_{a_{n},\phi (n)}(\omega )$ will be called the generalized entropy density of $X^{L^{a_{n}+\phi (n)}_{a_{n}}}$. Particularly, if $a_{n} \equiv 0$ and $\phi (n) = n$, $f_{a_{n},\phi (n)}(\omega )$ will become the classical entropy density of $X^{T^{(n)}}$ defined as follows

$$\begin{aligned} f_{n}(\omega )\doteq f_{0,n}(\omega ) = - \frac{ 1}{|T^{(n)}|}\ln P(X^{T^{(n)}}). \end{aligned}$$

(16)

Obviously, if $\{X_{t},t\in T_{2}\}$ is a nonhomogeneous bifurcating Markov chain indexed by a binary tree defined by Definition 2.1, it follows from (7) that

$$\begin{aligned} f_{a_{n},\phi (n)}(\omega ) = - \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big [\ln P(X^{L_{a_{n}}})+ \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}} }\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t})\Big ], \end{aligned}$$

(17)

and

$$\begin{aligned} f_{n}(\omega ) = - \frac{ 1 }{\left| T^{(n)}\right| }\Big [\ln P(X_{o}) +\sum _{t\in T^{(n-1)}}\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t})\Big ]. \end{aligned}$$

(18)
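The classical entropy density (18) is a direct sum over the vertices of $T^{(n-1)}$ and can be evaluated as in the sketch below. It assumes a heap-style numbering of $T_2$ (root 1, successors $2t$ and $2t+1$) and a kernel given as a function `kernel(t, x)` returning a dictionary over pairs; these conventions and the uniform example kernel are hypothetical.

```python
import math

def entropy_density(x, p, kernel, n):
    """Evaluate f_n from (18) for a realization x on levels 0..n."""
    log_prob = math.log(p[x[1]])              # ln P(X_o)
    for t in range(1, 2 ** n):                # t ranges over T^(n-1)
        log_prob += math.log(kernel(t, x[t])[(x[2 * t], x[2 * t + 1])])
    return -log_prob / (2 ** (n + 1) - 1)     # divide by |T^(n)|

def uniform_kernel(t, x):
    """Hypothetical homogeneous kernel: all four successor pairs equally likely."""
    return {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}

n = 3
x = {t: 0 for t in range(1, 2 ** (n + 1))}    # any realization works here
f_n = entropy_density(x, [0.5, 0.5], uniform_kernel, n)
```

For the uniform kernel with uniform initial distribution, every realization has probability $2^{-|T^{(n)}|}$, so $f_n$ equals $\ln 2$ for every n, which gives a simple sanity check of the normalization.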

Property 2.3

(Yang and Yang [28]) Let $T_{2}$ be a binary tree, $G = \{0,1,\ldots ,b - 1\}$ a finite state space and $\{X_{t},t\in T_{2}\}$ a tree-indexed stochastic process taking values in G. Let $f_{a_{n},\phi (n)}(\omega )$ be defined by (15). Then $f_{a_{n},\phi (n)}(\omega )$ are uniformly integrable.

3 Main Results

Let $G=\{0,1,\ldots ,b-1\}$ be a finite state space, and let $\{X_{t},t \in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree defined as before. For $k\in G$, let $S_{k}(L^{a_{n}+\phi (n)}_{a_{n}})$ be the number of occurrences of k in the set of random variables $\{X_{t},t \in L_{a_{n}}^{a_n+\phi (n)}\}$, let $S_{k}(L_{a_{n}})$ be the number of occurrences of k in $\{X_{t},t \in L_{a_{n}}\}$, and let $S^{i}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})$ be the number of occurrences of k in $\{X_{t^{i}},t \in L^{a_{n}+\phi (n)-1}_{a_{n}}\}$, $i = 1,2$. That is,

$$\begin{aligned}&S_{k}(L^{a_{n}+\phi (n)}_{a_{n}}) =|\{t\in L^{a_{n}+\phi (n)}_{a_{n}}: X_{t} = k\}|; \\&S_{k}(L_{a_{n}}) = |\{t\in L_{a_{n}} : X_{t} = k\}|; \end{aligned}$$

and

$$\begin{aligned} S^{i}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}}) = |\{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}: X_{t^{i}}=k\}|, \quad i = 1,2. \end{aligned}$$

It follows that

$$\begin{aligned}&S_{k}(L^{a_{n}+\phi (n)}_{a_{n}}) =\sum _{t\in L^{a_{n}+\phi (n)}_{a_{n}}}I_{k}(X_{t}); \end{aligned}$$

(19)

$$\begin{aligned}&S_{k}(L_{a_{n}}) = \sum _{t\in L_{a_{n}}}I_{k}(X_{t}); \end{aligned}$$

(20)

$$\begin{aligned}&S^{1}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})=\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}I_{k}(X_{t^{1}}); \end{aligned}$$

(21)

$$\begin{aligned}&S^{2}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})=\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}I_{k}(X_{t^{2}}); \end{aligned}$$

(22)

and

$$\begin{aligned} S_{k}(L^{a_{n}+\phi (n)}_{a_{n}}) = S^{1}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})+S^{2}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})+S_{k}(L_{a_{n}}), \end{aligned}$$

(23)

where

$$\begin{aligned} I_{k}(i)=\left\{ \begin{array}{cc} 1,\ \ &{} i=k,\\ 0,\ \ &{} i\ne k. \end{array} \right. \end{aligned}$$
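The counting quantities in (19)–(22) and the decomposition (23) can be checked on a concrete realization. The sketch below assumes a heap-style numbering of $T_2$ (root 1, successors $2t$ and $2t+1$); the function names and the test realization are hypothetical.

```python
def count_on_levels(x, k, m, n):
    """S_k(L_m^n): number of vertices t on levels m..n with x[t] == k."""
    return sum(1 for t in range(2 ** m, 2 ** (n + 1)) if x[t] == k)

def count_successors(x, k, m, n, i):
    """S^i_k(L_m^n): number of t on levels m..n whose i-th successor equals k."""
    return sum(
        1 for t in range(2 ** m, 2 ** (n + 1)) if x[2 * t + (i - 1)] == k
    )

# Hypothetical realization on levels 0..4 with states in {0, 1, 2}.
x = {t: (t * t) % 3 for t in range(1, 2 ** 5)}
a, phi, k = 1, 3, 1

lhs = count_on_levels(x, k, a, a + phi)
rhs = (
    count_successors(x, k, a, a + phi - 1, 1)
    + count_successors(x, k, a, a + phi - 1, 2)
    + count_on_levels(x, k, a, a)
)
```

The identity holds because every vertex on levels $a_n+1$ to $a_n+\phi (n)$ is the first or second successor of exactly one vertex on levels $a_n$ to $a_n+\phi (n)-1$, and the remaining vertices are those on level $a_n$.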

In this section, we will establish the strong law of large numbers for the frequencies of occurrence of states and the generalized entropy ergodic theorem for finite nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, we give the strong law of large numbers for the frequencies of occurrence of states for these chains with finite state space.

Theorem 3.1

Let $G = \{0,1,\ldots ,b - 1\}$ be a finite state space, and $\{X_{t},t\in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with stochastic matrices $\{P_{t},t\in T_{2}\}$ defined by Definition 2.1, $S_{k}(L^{a_{n}+\phi (n)}_{a_{n}})$ be defined by (19). Let $P = (P(y_{1},y_{2}|x)),x,y_{1},y_{2}\in G$ be another stochastic matrix, and let $P_{1}(y_{1}|x) = \sum _{y_{2}\in G}P(y_{1},y_{2}|x)$,$P_{2}(y_{2}|x) = \sum _{y_{1}\in G}P(y_{1},y_{2}|x)$,$P_{1}= (P_{1}(y|x))$,$P_{2}= (P_{2}(y|x))$. Let $Q = \frac{1}{2}(P_{1} + P_{2})$, and assume that the transition matrix Q is ergodic. Let $\{a_{n},n\ge 0\}$ and $\{\phi (n),n\ge 0\}$ be two nonnegative integer sequences such that for any positive integers n, m

$$\begin{aligned} \phi (m + n)-\phi (n)\ge m. \end{aligned}$$

(24)

If $\forall x,y_{1},y_{2} \in G$,

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n-1)}}|P_{t}(y_{1},y_{2}|x)-P(y_{1},y_{2}|x)|=0, \end{aligned}$$

(25)

then

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{S_{k}(L^{a_{n}+\phi (n)}_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}= \pi (k) \quad \mathrm{a.e.} \quad \forall k \in G, \end{aligned}$$

(26)

where $\pi =\{\pi (0),\pi (1),\ldots ,\pi (b-1)\}$ is the unique stationary distribution determined by the transition matrix Q.

The proof of the above theorem will be given in Sect. 4.
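The limit $\pi $ in Theorem 3.1 depends on P only through the averaged marginal matrix $Q=\frac{1}{2}(P_1+P_2)$. The following sketch builds Q from a kernel stored as a nested list `P[x][y1][y2]` and approximates the stationary distribution by power iteration, which converges when Q is ergodic. The storage layout, function names and example kernel are assumptions made for illustration.

```python
def averaged_marginal(P):
    """Return Q = (P1 + P2) / 2 for a kernel stored as P[x][y1][y2]."""
    b = len(P)
    Q = [[0.0] * b for _ in range(b)]
    for x in range(b):
        for y1 in range(b):
            for y2 in range(b):
                Q[x][y1] += 0.5 * P[x][y1][y2]   # contribution to P1(y1 | x)
                Q[x][y2] += 0.5 * P[x][y1][y2]   # contribution to P2(y2 | x)
    return Q

def stationary(Q, iters=10_000, tol=1e-14):
    """Approximate the solution of pi = pi Q by power iteration."""
    b = len(Q)
    pi = [1.0 / b] * b
    for _ in range(iters):
        new = [sum(pi[x] * Q[x][y] for x in range(b)) for y in range(b)]
        if max(abs(u - v) for u, v in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical kernel on G = {0, 1}: P[x][y1][y2].
P = [
    [[0.7, 0.1], [0.1, 0.1]],   # parent state 0
    [[0.2, 0.2], [0.2, 0.4]],   # parent state 1
]
Q = averaged_marginal(P)
pi = stationary(Q)
```

For this example, Q has rows (0.8, 0.2) and (0.4, 0.6), and solving $\pi =\pi Q$ by hand gives $\pi =(2/3,1/3)$, so under the conditions of the theorem the delayed frequencies of state 0 would tend to 2/3.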

In the following, we will study the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree with finite state space $G=\{0,1,\ldots ,b-1\}$.

Theorem 3.2

Under the conditions of Theorem 3.1, let $f_{a_{n},\phi (n)}(\omega )$ be as defined in (17) and let $\{a_{n},n\ge 0\}$ be a bounded sequence of nonnegative integers. Then

$$\begin{aligned} \lim _{n\rightarrow \infty }f_{a_{n},\phi (n)}(\omega )=-\frac{1}{2}\sum ^{b-1}_{l=0}\pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l) \quad \mathrm{a.e.} \end{aligned}$$

(27)

The proof of the above theorem will be presented in Sect. 4.
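The limit in (27) is a finite sum and can be evaluated directly once P and $\pi $ are known. The sketch below does so for a kernel stored as `P[l][k1][k2]`, using the usual convention $0\ln 0=0$, which is an assumption added here to handle zero entries. The function names are hypothetical.

```python
import math

def limit_entropy(P, pi):
    """Right-hand side of (27) for a kernel stored as P[l][k1][k2]."""
    b = len(pi)
    total = 0.0
    for l in range(b):
        for k1 in range(b):
            for k2 in range(b):
                pr = P[l][k1][k2]
                if pr > 0:                      # convention: 0 * ln 0 = 0
                    total += pi[l] * pr * math.log(pr)
    return -0.5 * total

def markov_entropy(Q, pi):
    """Entropy rate -sum pi(l) Q(k|l) ln Q(k|l), the form in (30)."""
    return -sum(
        pi[l] * q * math.log(q) for l, row in enumerate(Q) for q in row if q > 0
    )

# Hypothetical product kernel P(k1, k2 | l) = Q(k1 | l) * Q(k2 | l).
Q = [[0.8, 0.2], [0.4, 0.6]]
pi = [2 / 3, 1 / 3]                             # stationary for this Q
P = [[[Q[l][k1] * Q[l][k2] for k2 in range(2)] for k1 in range(2)]
     for l in range(2)]
h_bifurcating = limit_entropy(P, pi)
h_markov = markov_entropy(Q, pi)
```

For a product kernel the two values agree, since $\ln P=\ln Q+\ln Q$ and the factor $\frac{1}{2}$ compensates for the two successors of each vertex; this is the reduction from (27) to the tree-indexed Markov chain formula.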

Remark 3.1

Let $a_{n}\equiv 0,\phi (n) = n$ in Theorem 3.2, we can immediately get the entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree with finite state space G (see Dang et al. [9]).

Remark 3.2

From Property 2.3, we know that $f_{a_{n},\phi (n)}(\omega )$ are uniformly integrable. Thus (27) also holds with $L_{1}$ convergence.

We denote by $ g_{a_{n},\phi (n)}(\omega )$ the generalized entropy density of nonhomogeneous Markov chains indexed by a tree with the initial distribution (8) and transition matrices (9). From (12), it is easy to see that

$$\begin{aligned} g_{a_{n},\phi (n)}(\omega )=- \frac{1}{|L^{a_{n}+\phi (n)}_{a_n}|}\Big [\ln P(X^{L_{a_{n}}}) +\sum _{t\in L^{a_{n}+\phi (n)}_{a_{n}+1}}\ln Q_{t}(X_{t}|X_{1_{t}})\Big ]. \end{aligned}$$

(28)

By Theorem 3.2, we can establish the generalized entropy ergodic theorem for nonhomogeneous Markov chains indexed by a binary tree.

Corollary 3.1

Let $T_{2}$ be a binary tree, $G =\{0,1,2,\ldots ,b-1\}$ be a finite state space, $\{X_{t},t\in T_{2}\}$ be a G-valued nonhomogeneous Markov chain indexed by $T_{2}$ with the transition matrices (9) defined by Definition 2.2. Let $Q=(Q(k|l)),k,l\in G$ be another transition matrix, and assume that Q is ergodic. Let $\{a_{n},n\ge 0\}$ be a bounded sequence of nonnegative integers and $\{\phi (n),n \ge 0\}$ be a sequence of nonnegative integers such that for any positive integers n, m,

$$\begin{aligned} \phi (m + n)- \phi (n)\ge m. \end{aligned}$$

If

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n)}\backslash \{o\}}|Q_{t}(k|l)-Q(k|l)| = 0,\quad \forall k,l\in G, \end{aligned}$$

(29)

then

$$\begin{aligned} \lim _{n\rightarrow \infty }g_{a_{n},\phi (n)}(\omega )= - \sum ^{b-1}_{l=0}\sum ^{b-1}_{k=0}\pi (l)Q(k|l)\ln Q(k|l) \quad \mathrm{a.e.},\quad \end{aligned}$$

(30)

where $\pi = \{\pi (0),\ldots ,\pi (b -1)\}$ is the unique stationary distribution determined by the transition matrix Q.

The proof of the above corollary will be given in Sect. 4.

Remark 3.3

Take $a_{n} \equiv 0,\phi (n) = n$ in Corollary 3.1, it is straightforward to obtain the entropy ergodic theorem for nonhomogeneous Markov chains indexed by a Cayley tree $T_{C,2}$ with finite state space G. The result is a special case of Dong, Yang and Bai [11] for $N=2$.

If each vertex of the tree has only one successor, nonhomogeneous Markov chains indexed by a binary tree degenerate into nonhomogeneous Markov chains. Similarly, we denote by $h_{a_{n},\phi (n)}(\omega ) $ the generalized entropy density of a nonhomogeneous Markov chain with the initial distribution $\big \{\mu _{0}(0),\ldots ,\mu _{0}(b-1)\big \}$ and transition matrices $P_{n} = (p_{n}(i,j)),\ \ i,j \in G$. It easily follows that

$$\begin{aligned} h_{a_{n},\phi (n)}(\omega ) =- \frac{1}{\phi (n)}\bigg \{\log \mu _{a_{n}}(X_{a_{n}}) + \sum ^{a_{n}+\phi (n)}_{k=a_{n}+1 }\log p_{k}(X_{k-1},X_{k})\bigg \}, \end{aligned}$$

(31)

where $\mu _{a_{n}}(x)$ is the distribution of $X_{a_{n}}$. Thus we can get the generalized entropy ergodic theorem for nonhomogeneous Markov chains.

Corollary 3.2

Suppose $\{X_{n},n\ge 0\}$ is a nonhomogeneous Markov chain taking values in a finite state space $G = \{0,1,\ldots ,b-1\}$ with the initial distribution $\big \{\mu _{0}(0),\ldots ,\mu _{0}(b-1)\big \}$ and the transition matrices $\big \{P_{n} = (p_{n}(i,j)),\ \ i,j \in G,n = 1,2,\ldots \big \}$, where $p_{n}(i,j) = {\mathbf {P}}(X_{n} = j | X_{n-1} = i)$. Let $\{a_{n},n\ge 0\}$ be a bounded sequence of nonnegative integers and $\{\phi (n),n \ge 0\}$ be a sequence of nonnegative integers such that for any positive integers n, m

$$\begin{aligned} \phi (m + n)- \phi (n)\ge m. \end{aligned}$$

Let $P = (p(i,j))$ be another transition matrix, and assume that P is irreducible. If

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\sum ^{n}_{k=1}|p_{k}(i,j)-p(i,j)|=0, \end{aligned}$$

(32)

then

$$\begin{aligned} \lim _{n\rightarrow \infty }h_{a_{n},\phi (n)}(\omega ) =- \sum ^{b-1}_{i=0}\sum ^{b-1}_{j=0}\pi _{i}p(i,j)\log p(i,j) \quad \mathrm{a.e.}. \end{aligned}$$

(33)

Proof

The corollary is a special case of Corollary 3.1, obtained by taking the index tree to be the set of nonnegative integers ${\mathbb {N}}$, in which each vertex has exactly one successor. $\square $

Remark 3.4

Note that

$$\begin{aligned}&\frac{1}{\phi (n)}\sum _{k=a_n+1}^{a_n+\phi (n)}|p_k(i,j)-p(i,j)|\\&\quad \le (1+\frac{a_n}{\phi (n)})\frac{1}{a_n+\phi (n)}\sum _{k=1}^{a_n+\phi (n)}|p_k(i,j)-p(i,j)|, \end{aligned}$$

and $\{a_n\}$ is bounded, by (32), we have that

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\frac{1}{\phi (n)}\sum _{k=a_n+1}^{a_n+\phi (n)}|p_k(i,j)-p(i,j)|=0. \end{aligned}$$

Thus, we can immediately obtain the results of Wang and Yang [27] on the generalized entropy ergodic theorem for delayed sums of nonhomogeneous Markov chains.

Remark 3.5

If $a_{n} \equiv 0,\phi (n) = n$ in Corollary 3.2, we can get the entropy ergodic theorem of nonhomogeneous Markov chains (see Yang, [29]).

4 The Proofs

Before providing the proofs of the main results in Sect. 3, we begin with some lemmas.

Lemma 4.1

Let $T_{2}$ be a binary tree and G be a countable state space. Let $\{X_{t},t\in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_2$ defined by Definition 2.1, and let $\{g_{t}(x,y_{1},y_{2}),t \in T_{2}\}$ be a collection of functions defined on $G^{3}$. Suppose that there exists $\alpha > 0$ such that $E[e^{\alpha |g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}] < \infty ,\forall t\in T_{2}$. Let $\{a_{n},n \ge 0\}$ and $\{\phi (n),n\ge 0\} $ be two sequences of nonnegative integers such that $\phi (n)$ tends to infinity as $n \rightarrow \infty $. Assume that for $\forall \varepsilon > 0$,

$$\begin{aligned} \sum ^{\infty }_{n=1} \exp (-|L^{a_{n}+\phi (n)}_{a_{n}}|\varepsilon ) < \infty . \end{aligned}$$

(34)

Let

$$\begin{aligned} H_{a_{n},\phi (n)}(\omega ) = \sum _{ t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}g_{t}(X_{t},X_{t^{1}},X_{t^{2}}), \end{aligned}$$

(35)

and

$$\begin{aligned} G_{a_{n},\phi (n)}(\omega ) = \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}E\big [g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_{t}\big ]. \end{aligned}$$

(36)

Let $\alpha > 0$, and set

$$\begin{aligned} D(\alpha )= & {} \Bigg \{\omega : \limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)}_{a_{n}+1} }E\big [g^{2}_{t}(X_{t},X_{t^{1}},X_{t^{2}})e^{\alpha |g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_{t}\big ]\nonumber \\= & {} M(\alpha ;\omega )< \infty \Bigg \}. \end{aligned}$$

(37)

Then we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{H_{a_{n},\phi (n)}(\omega )-G_{a_{n},\phi (n)}(\omega )}{|L^{a_{n}+\phi (n)} _{a_{n}}|}= 0 \quad \mathrm{a.e.} \quad \omega \in D(\alpha ). \end{aligned}$$

(38)

Remark 4.1

It is easy to see that if $\{g_{t}(x,y_{1},y_{2}),t \in T_{2}\}$ is a collection of uniformly bounded functions, then for any $\alpha > 0, D(\alpha ) = \Omega $, thus we can get

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{H_{a_{n},\phi (n)}(\omega )-G_{a_{n},\phi (n)}(\omega )}{|L^{a_{n}+\phi (n)}_{a_{n}}|}= 0 \quad \mathrm{a.e.}. \end{aligned}$$

Remark 4.2

Let $a_{n} = 0$ and $\phi (n) = [\log _{2}n^{\alpha }](\alpha > 0)$. Since $T_{2}$ is a binary tree, we have

$$\begin{aligned} |L^{a_{n}+\phi (n)}_{a_{n}}| = 2^{[\log _{2}n^{\alpha }]+1}-1 \ge 2^{\log _{2}n^{\alpha }-1+1}-1 = n^{\alpha }-1, \end{aligned}$$

where $[\cdot ]$ is the usual greatest integer function. In this case (34) holds.
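The bound in Remark 4.2 can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `level_block_size`, `phi_log` and `partial_sum` are hypothetical, and the values of $\alpha $ and $\varepsilon $ used with them are arbitrary illustrative choices.

```python
import math

def level_block_size(a: int, phi: int) -> int:
    """Number of vertices on levels a, ..., a + phi of the rooted binary tree."""
    return sum(2 ** k for k in range(a, a + phi + 1))

def phi_log(n: int, alpha: float) -> int:
    """phi(n) = [log2(n^alpha)], the choice made in Remark 4.2 (n >= 1)."""
    return math.floor(alpha * math.log2(n))

def partial_sum(alpha: float, eps: float, n_max: int) -> float:
    """Partial sum of the series in (34) with a_n = 0 and phi(n) = phi_log(n, alpha)."""
    return sum(math.exp(-eps * level_block_size(0, phi_log(n, alpha)))
               for n in range(1, n_max + 1))
```

For instance, with $\alpha = 2$ and $\varepsilon = 0.1$ the partial sums stop changing (to machine precision) well before $n = 1000$, which is consistent with the convergence required in (34).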

Proof

Let $\lambda $ be a nonzero real number. For fixed n, define

$$\begin{aligned} t_{a_{n},m}(\lambda ,\omega ) =\frac{e^{\lambda \sum _{t\in L^{a_{n}+m-1} _{a_{n}}}g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}}{\prod \limits _{t \in L^{a_{n}+m-1}_{a_{n}}}E\big [e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}\big ]},\ \ \ m = 1,2,\ldots ,\phi (n). \end{aligned}$$

(39)

Noticing that

$$\begin{aligned}&E\Big [e^{\lambda \sum _{t\in L_{a_{n}+\phi (n)-1}}g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X^{T^{(a_{n}+\phi (n)-1)}}\Big ]\nonumber \\&\quad = \sum _{t \in L_{a_{n}+\phi (n)-1},(x_{t^{1}},x_{t^{2}})\in G^{2}}e^{\lambda \sum _{t\in L_{a_{n}+\phi (n)-1}} g_{t}(X_{t},x_{t^{1}},x_{t^{2}})}\nonumber \\&\quad \quad \cdot \mathbf {P}(X^{L_{a_{n}+\phi (n)}}=x^{L_{a_{n}+\phi (n)}}|X^{T^{(a_{n}+\phi (n)-1)}})\nonumber \\&\quad = \sum _{t\in L_{a_{n}+\phi (n)-1},(x_{t^{1}},x_{t^{2}})\in G^{2}}e^{\lambda \sum _{t\in L_{a_{n}+\phi (n)-1}}g_{t}(X_{t},x_{t^{1}},x_{t^{2}})}\cdot \prod _{t\in L_{a_{n}+\phi (n)-1}}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\&\quad = \prod _{t\in L_{a_{n}+\phi (n)-1}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}e^{\lambda g_{t}(X_{t},x_{t^{1}},x_{t^{2}})}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\&\quad = \prod _{t\in L_{a_{n}+\phi (n)-1}}E\Big [e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_t\Big ]. \end{aligned}$$

(40)

It is easy to see that $E [t_{a_{n},1}(\lambda ,\omega )] = 1$. Hence by (40),

$$\begin{aligned}&E[t_{a_{n},\phi (n)}(\lambda ,\omega )]\nonumber \\&\quad = E\left[ E[t_{a_{n},\phi (n)}(\lambda ,\omega )|X^{T^{(a_{n}+\phi (n)-1)}}]\right] \nonumber \\&\quad = E\Bigg [E\Big [t_{a_{n},\phi (n)-1}(\lambda ,\omega )\frac{e^{\lambda \sum _{t\in L_{a_{n}+\phi (n)-1}}g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}}{\prod \limits _{t\in L_{a_{n}+\phi (n)-1}}E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]}|X^{T^{(a_{n}+\phi (n)-1)}}\Big ]\Bigg ]\nonumber \\&\quad = E\Bigg [t_{a_{n},\phi (n)-1}(\lambda ,\omega )\cdot \frac{E\Big [e ^{\lambda \sum _{t\in L_{a_{n}+\phi (n)-1}}g_{t}(X_{t},X_{t^{1}},X_{t^{2}})} \mid X^{T^{(a_{n}+\phi (n)-1)}}\Big ]}{\prod \limits _{t\in L_{a_{n}+\phi (n)-1}}E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]}\Bigg ]\nonumber \\&\quad = E[t_{a_{n},\phi (n)-1}(\lambda ,\omega )] =\cdots = E [t_{a_{n},1}(\lambda ,\omega )] = 1. \end{aligned}$$

(41)
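The identity $E[t_{a_{n},m}(\lambda ,\omega )] = 1$ can be confirmed by brute force on a toy model. The sketch below is only an illustration under stated assumptions: it takes $a_{n}=0$ and $G=\{0,1\}$, and the kernel `kernel`, the function `g` and the root law `p0` are hypothetical choices made up for the check, not objects from the paper. Vertices are labelled in heap order (root 1, children of t are 2t and 2t+1).

```python
import itertools
import math

STATES = (0, 1)

def kernel(t, x):
    """Hypothetical nonhomogeneous kernel P_t(y1, y2 | x) for vertex t."""
    weights = {(y1, y2): 1.0 + ((t + x + 2 * y1 + y2) % 3)
               for y1 in STATES for y2 in STATES}
    total = sum(weights.values())
    return {pair: w / total for pair, w in weights.items()}

def g(t, x, y1, y2):
    """Hypothetical bounded test function g_t(x, y1, y2)."""
    return math.sin(t) * (y1 - y2) + x * y1

def cond_mgf(t, x, lam):
    """E[exp(lam * g_t(X_t, X_{t^1}, X_{t^2})) | X_t = x]."""
    return sum(p * math.exp(lam * g(t, x, y1, y2))
               for (y1, y2), p in kernel(t, x).items())

def expected_t(m, lam, p0=(0.3, 0.7)):
    """Exact E[t_{0,m}(lam)], enumerating all configurations on levels 0..m."""
    n_nodes = 2 ** (m + 1) - 1
    parents = range(1, 2 ** m)          # vertices on levels 0..m-1
    total = 0.0
    for config in itertools.product(STATES, repeat=n_nodes):
        x = (None,) + config            # x[t] is the state at vertex t
        prob = p0[x[1]]
        log_num = 0.0
        denom = 1.0
        for t in parents:
            pair = (x[2 * t], x[2 * t + 1])
            prob *= kernel(t, x[t])[pair]
            log_num += lam * g(t, x[t], *pair)
            denom *= cond_mgf(t, x[t], lam)
        total += prob * math.exp(log_num) / denom
    return total
```

Calling `expected_t(2, 0.7)` or `expected_t(3, -1.3)` should return 1 up to rounding error, whatever kernel and test function are plugged in, since the computation only uses the conditional independence structure exploited in (40).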

By Markov's inequality, (34) and (41), for any $\varepsilon > 0$, we have

$$\begin{aligned}&\sum ^{\infty }_{n=1}P\Big [\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|} \ln t_{a_{n},\phi (n)}(\lambda ,\omega ) \ge \varepsilon \Big ] \nonumber \\&\quad = \sum ^{\infty }_{n=1} P\left[ t_{a_{n},\phi (n)}(\lambda ,\omega ) \ge \exp (|L^{a_{n}+\phi (n)}_{a_{n}}|\cdot \varepsilon )\right] \nonumber \\&\quad \le \sum ^{\infty }_{n=1} \exp (-|L^{a_{n}+\phi (n)}_{a_{n}}|\cdot \varepsilon ) <\infty . \end{aligned}$$

(42)

By the Borel–Cantelli lemma and the arbitrariness of $\varepsilon $, we have

$$\begin{aligned} \limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\ln t_{a_{n},\phi (n)}(\lambda ,\omega ) \le 0 \quad \mathrm{a.e.}. \end{aligned}$$

(43)

Noticing that

$$\begin{aligned} \frac{\ln t_{a_{n},\phi (n)}(\lambda ,\omega ) }{|L^{a_{n}+\phi (n)}_{a_{n}}|}&= \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}| }\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})\nonumber \\&\quad -\ln E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]\Big \}. \end{aligned}$$

(44)

By (43) and (44), we have

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|} \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})\nonumber \\&\quad -\, \ln E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]\Big \}\le 0\ \ \ \mathrm{a.e.}. \end{aligned}$$

(45)

Let $0 < \lambda \le \alpha $. Dividing both sides of (45) by $\lambda $, we have

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|} \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{g_{t}(X_{t},X_{t^{1}},X_{t^{2}})\nonumber \\&\quad -\,\frac{\ln E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]}{\lambda } \Big \}\le 0\quad \mathrm{a.e.}. \end{aligned}$$

(46)

By (37), (46), and inequalities $\ln x \le x-1 (x>0)$ and $0\le e^{x}-1-x \le \frac{1}{2}x^{2}e^{|x|}$, we get

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{g_{t}(X_{t},X_{t^{1}},X_{t^{2}})-E[g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_{t}]\Big \}\nonumber \\&\quad \le \limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}} \Big \{\frac{\ln E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_t]}{\lambda }\nonumber \\&\quad \quad -\,E[g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_t]\Big \}\nonumber \\&\quad \le \limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Bigg \{\frac{E[e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})}|X_{t}]-1 }{\lambda }\nonumber \\&\quad \quad -\, \frac{E[\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_{t}] }{\lambda }\Bigg \}\nonumber \\&\quad = \limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\nonumber \\&\quad \quad \frac{E\Big [e^{\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})} -1-\lambda g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_t\Big ]}{\lambda }\nonumber \\&\quad \le \frac{\lambda }{2}\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}E\Big [g^{2}_{t}(X_{t},X_{t^{1}},X_{t^{2}})e^{|\lambda |\cdot |g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_{t}\Big ]\nonumber \\&\quad \le \frac{\lambda }{2}\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}E\Big [g^{2}_{t}(X_{t},X_{t^{1}},X_{t^{2}})e^{|\alpha |\cdot |g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_{t}\Big ]\nonumber \\&\quad = \frac{\lambda }{2}M(\alpha ;\omega ) \quad a.e. \ \ \ \omega \in D(\alpha ). \end{aligned}$$

(47)
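The two elementary inequalities used to obtain (47), $\ln x \le x-1$ for $x>0$ and $0\le e^{x}-1-x \le \frac{1}{2}x^{2}e^{|x|}$, can be spot-checked on a grid. This is a small sketch with hypothetical helper names; it is a sanity check, not a proof.

```python
import math

def log_gap(x):
    """(x - 1) - ln x, which should be nonnegative for every x > 0."""
    return (x - 1.0) - math.log(x)

def exp_remainder(x):
    """e^x - 1 - x, computed with expm1 to limit cancellation near 0."""
    return math.expm1(x) - x

def exp_remainder_bound(x):
    """The upper bound (1/2) x^2 e^{|x|} used in (47)."""
    return 0.5 * x * x * math.exp(abs(x))
```

Evaluating `exp_remainder(x) / x` for small positive `x` also shows why the right-hand side of (47) is of order $\lambda $ and vanishes as $\lambda \rightarrow 0^{+}$.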

Letting $\lambda \rightarrow 0^{+}$ in (47) we have

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{g_{t}(X_{t},X_{t^{1}},X_{t^{2}})-E[g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_{t}]\Big \}\nonumber \\&\quad \le 0 \ \ \mathrm{a.e.} \ \ \omega \in D(\alpha ). \end{aligned}$$

(48)

Taking $-\alpha \le \lambda < 0$, we similarly get

$$\begin{aligned}&\liminf _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\Big \{g_{t}(X_{t},X_{t^{1}},X_{t^{2}})-E[g_{t}(X_{t},X_{t^{1}}, X_{t^{2}})|X_{t}]\Big \}\nonumber \\&\quad \ge 0 \ \ \mathrm{a.e.}\ \ \omega \in D(\alpha ). \end{aligned}$$

(49)

Combining (48) and (49), we obtain (38) directly. $\square $

Lemma 4.2

Let $T_{2}$ be a binary tree, and let $\{a_{n},n \ge 0\}$ and $\{\phi (n),n \ge 0\}$ be defined as in Lemma 4.1. Let $\{a_{t},t \in T_{2}\}$ be a collection of real numbers, and a be a real number. If

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|} \sum _{t\in T^{(n-1)}}|a_{t} -a| = 0, \end{aligned}$$

(50)

then

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}|a_{t} -a|= 0. \end{aligned}$$

(51)

Proof

Noticing that

$$\begin{aligned} \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}|a_{t}-a|\le \frac{|T^{(a_{n}+\phi (n))}|}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\frac{1}{|T^{(a_{n}+\phi (n))}|}\sum _{t\in T^{(a_{n}+\phi (n)-1)}}|a_{t}-a|.\nonumber \\ \end{aligned}$$

(52)

Since $T_{2}$ is a binary tree, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{|T^{(a_{n}+\phi (n))}|}{|L^{a_{n}+\phi (n)}_{a_{n}}|}=\lim _{n\rightarrow \infty }\frac{2^{a_{n}+\phi (n)+1} -1}{2^{a_{n}}(2^{\phi (n)+1} -1)} = 1. \end{aligned}$$

(53)

Equation (51) immediately follows from (50), (52) and (53). $\square $
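The limit (53) can be made quantitative with exact rational arithmetic: the ratio exceeds 1 by exactly $(2^{a_{n}}-1)/(2^{a_{n}}(2^{\phi (n)+1}-1))$, which is smaller than $1/(2^{\phi (n)+1}-1)$ uniformly in $a_{n}$. A short sketch (function names are hypothetical):

```python
from fractions import Fraction

def tree_size(n):
    """|T^(n)|: number of vertices on levels 0..n of the binary tree."""
    return 2 ** (n + 1) - 1

def block_size(a, phi):
    """|L_a^{a+phi}|: number of vertices on levels a..a+phi."""
    return 2 ** a * (2 ** (phi + 1) - 1)

def ratio(a, phi):
    """The quotient |T^(a+phi)| / |L_a^{a+phi}| appearing in (53)."""
    return Fraction(tree_size(a + phi), block_size(a, phi))
```

Because the excess over 1 is controlled by $\phi (n)$ alone, no assumption on the growth of $a_{n}$ is needed for (53).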

Now, we present the proof of Theorem 3.1 as follows.

Proof of Theorem 3.1

It is easy to see from (24) that $\lim \limits _{n\rightarrow \infty }\phi (n) = \infty $ and (34) is satisfied. By (25) and Lemma 4.2, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}|P_{t}(y_{1},y_{2}|x)-P(y_{1},y_{2}|x)|=0. \end{aligned}$$

(54)

Let $g_{t}(x,y_{1},y_{2}) = I_{k}(y_{1})$ in Lemma 4.1. Obviously, $\{g_{t}(x,y_{1},y_{2}),t \in T_{2}\}$ are uniformly bounded. In this case,

$$\begin{aligned} H_{a_{n},\phi (n)}(\omega ) = \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}I_{k}(X_{t^{1}}) = S^1_k(L^{a_{n}+\phi (n)-1}_{a_{n}} ), \end{aligned}$$

(55)

and

$$\begin{aligned} G_{a_{n},\phi (n)}(\omega )= & {} \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}E[g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|X_{t}]\nonumber \\= & {} \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}g_{t}(X_{t},x_{t^{1}},x_{t^{2}})\cdot P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\= & {} \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}I_{k}(x_{t^{1}})\cdot P_{t}(x_{t^{1}},x_{t^{2}}|X_{t}). \end{aligned}$$

(56)

From Lemma 4.1, we have

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big \{S^{1}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})\nonumber \\&\quad -\,\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}I_{k}(x_{t^{1}})P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\Big \}=0 \ \ \mathrm{a.e.}. \end{aligned}$$

(57)

From (54), it can be easily verified that

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big \{ \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}I_{k}(x_{t^{1}})P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\&\quad -\,\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}I_{k}(x_{t^{1}})P(x_{t^{1}},x_{t^{2}}|X_{t})\Big \}=0.\nonumber \\ \end{aligned}$$

(58)

Since $\sum _{x_{t^{2}}\in G}P(x_{t^{1}},x_{t^{2}}|X_{t}) = P_{1}(x_{t^{1}}|X_{t})$, we have

$$\begin{aligned}&\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}I_{k}(x_{t^{1}})P(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\&\quad = \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}P_{1}(k|X_{t})\nonumber \\&\quad = \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum ^{b-1}_{l=0}I_{l}(X_{t})P_{1}(k|l)\nonumber \\&\quad = \sum ^{b-1 }_{l=0}P_{1}(k|l)S_{l}(L^{a_{n}+\phi (n)-1}_{a_{n}}). \end{aligned}$$

(59)

By (57)–(59), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big \{S^{1}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})- \sum ^{b-1 }_{l=0}P_{1}(k|l)S_{l}(L^{a_{n}+\phi (n)-1}_{a_{n}})\Big \}=0 \quad \mathrm{a.e.}. \end{aligned}$$

(60)

Let $g_{t}(x,y_{1},y_{2}) = I_{k}(y_{2})$ in Lemma 4.1, similarly, we obtain that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big \{S^{2}_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})- \sum ^{b-1}_{l=0}P_{2}(k|l)S_{l}(L^{a_{n}+\phi (n)-1}_{a_{n}})\Big \}=0 \quad \mathrm{a.e.}. \end{aligned}$$

(61)

Adding (60) and (61), and noticing that

$$\begin{aligned} 0 \le \lim _{n\rightarrow \infty }\frac{S_{k}(L_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\le \lim _{n\rightarrow \infty }\frac{|L_{a_{n}}|}{|L^{a_{n}+\phi (n)}_{a_{n}}|}= \lim _{n\rightarrow \infty }\frac{2^{a_{n}}}{2^{a_{n}}(2^{\phi (n)+1}-1)}=0, \end{aligned}$$

that $\lim \limits _{n\rightarrow \infty }\frac{|L^{a_{n}+\phi (n)}_{a_{n}}|}{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}=2$, and that $Q = \frac{1}{2}(P_{1} + P_{2})$, we have by (23)

$$\begin{aligned} \lim _{n\rightarrow \infty }\Bigg \{\frac{S_{k}(L^{a_{n}+\phi (n)}_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}- \sum ^{b-1}_{l=0}Q(k|l)\frac{S_{l}(L^{a_{n}+\phi (n)-1}_{a_{n}})}{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}\Bigg \}=0 \quad a.e. \end{aligned}$$

(62)

Letting $\phi '(n) = \phi (n)-1$, it is easy to see that $\{\phi '(n),n \ge 0\}$ also satisfies (34). Using the same argument as that used to derive (62), we can prove that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Bigg \{\frac{S_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})}{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}- \sum ^{b-1}_{l=0}Q(k|l)\frac{S_{l}(L^{a_{n}+\phi (n)-2}_{a_{n}})}{|L^{a_{n}+\phi (n)-2}_{a_{n}}|}\Bigg \}=0 \quad \mathrm{a.e.}. \end{aligned}$$

(63)

Multiplying the k-th equality of (63) by Q(j|k), adding them together and using (62), we have

$$\begin{aligned} 0= & {} \lim _{n\rightarrow \infty }\left[ \sum ^{b-1}_{k=0}\frac{S_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})}{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}Q(j|k)- \sum ^{b-1}_{k=0}\sum ^{b-1}_{l=0}\frac{S_{l}(L^{a_{n}+\phi (n)-2}_{a_{n}})}{|L^{a_{n}+\phi (n)-2}_{a_{n}}|}Q(k|l)Q(j|k)\right] \nonumber \\= & {} \lim _{n\rightarrow \infty }\Bigg \{\left[ \sum ^{b-1}_{k=0}\frac{S_{k}(L^{a_{n}+\phi (n)-1}_{a_{n}})}{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}Q(j|k)- \frac{S_{j}(L^{a_{n}+\phi (n)}_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\right] \nonumber \\&+\left[ \frac{S_{j}(L^{a_{n}+\phi (n)}_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}- \sum ^{b-1}_{k=0}\sum ^{b-1}_{l=0}\frac{S_{l}(L^{a_{n}+\phi (n)-2}_{a_{n}})}{|L^{a_{n}+\phi (n)-2}_{a_{n}}|}Q(k|l)Q(j|k)\right] \Bigg \}\nonumber \\= & {} \lim _{n\rightarrow \infty }\left[ \frac{S_{j}(L^{a_{n}+\phi (n)}_{a_{n}})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}- \sum ^{b-1}_{l=0}\frac{S_{l}(L^{a_{n}+\phi (n)-2}_{a_{n}})}{|L^{a_{n}+\phi (n)-2}_{a_{n}}|}Q^{(2)}(j|l)\right] \quad \mathrm{a.e.}. \end{aligned}$$

(64)

where $Q^{(N)}(j|l)$ is the N-step transition probability determined by Q. By induction, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\left[ \frac{S_{j}(L^{a_n+\phi (n)}_{a_n})}{|L^{a_{n}+\phi (n)}_{a_{n}}|}- \sum ^{b-1}_{l=0}\frac{S_{l}(L^{a_{n}+\phi (n)-N}_{a_{n}})}{|L^{a_{n}+\phi (n)-N}_{a_{n}}|}Q^{(N)}(j|l)\right] =0 \quad \mathrm{a.e.}. \end{aligned}$$

(65)

Noticing that

$$\begin{aligned} \frac{1}{|L^{a_{n}+\phi (n)-N}_{a_{n}}|}\sum ^{b-1}_{l=0}S_{l}(L^{a_{n}+\phi (n)-N}_{a_{n}})=1, \end{aligned}$$

(66)

and

$$\begin{aligned} \lim _{N\rightarrow \infty }Q^{(N)}(j|l) = \pi (j),\ \ \ j\in G. \end{aligned}$$

(67)

Equation (26) follows from (65), (66) and (67). This completes the proof of Theorem 3.1.

$\square $
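The last step of the proof combines (65)–(67): any convex combination of the rows of $Q^{(N)}$ approaches $\pi $ as N grows. The following sketch illustrates this for a hypothetical two-state ergodic matrix `Q` with stationary distribution `PI`; both are example values chosen here, not taken from the paper.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(Q, N):
    """N-step transition matrix Q^(N), by repeated multiplication."""
    n = len(Q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(N):
        result = mat_mul(result, Q)
    return result

def mix(weights, QN, j):
    """sum_l weights[l] * Q^(N)(j | l), the form of the second term in (65)."""
    return sum(w * QN[l][j] for l, w in enumerate(weights))

Q = [[0.9, 0.1], [0.4, 0.6]]    # Q[l][j] = Q(j | l), hypothetical example
PI = (0.8, 0.2)                 # stationary distribution of this Q
```

Here the weights play the role of the empirical frequencies $S_{l}(\cdot )/|\cdot |$ in (65), which sum to one by (66); the limit of `mix` does not depend on them.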

Before presenting the proof of Theorem 3.2, we cite a lemma which will be used.

Lemma 4.3

(Dong et al. [11]) Let $T_{2}$ be a binary tree, $\varphi (x)$ be a bounded function defined on interval $\bigtriangleup $, and $\varphi $ be continuous at $x = b(b\in \bigtriangleup )$. Let $\{b_{t},t \in T_{2}\}$ be a collection of real numbers. If

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n-1)}}|b_{t}-b|=0, \end{aligned}$$

(68)

then

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n-1)}}|\varphi (b_{t})-\varphi (b)|=0. \end{aligned}$$

(69)

Proof of Theorem 3.2

Since $\{a_{n},n\ge 0\}$ is bounded, there exists $M\ge 0$ such that $|a_n|\le M$ for all $n\ge 0$. Note that $|L_{a_{n}}| = 2^{a_{n}}\le 2^{M}$ and

$$\begin{aligned} E[e^{|\ln P(X^{L_{a_{n}}})|}]=\sum _{x^{L_{a_{n}}}}e^{-\ln P(X^{L_{a_{n}}}=x^{L_{a_{n}}})}P(X^{L_{a_{n}}}=x^{L_{a_{n}}})\le b^{|L_{a_{n}}|}. \end{aligned}$$

It is easy to see from (24) that $\{\phi (n),n\ge 0\}$ satisfies (34). By Markov's inequality and (34), we have for every $\varepsilon > 0$,

$$\begin{aligned} \sum ^{\infty }_{n=1}P\left[ \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\big |\ln P(X^{L_{a_{n}}})\big |\ge \varepsilon \right] \le b^{2^{M}}\sum ^{\infty }_{n=1}\exp \{- \varepsilon |L^{a_{n}+\phi (n)}_{a_{n}}|\} <\infty . \end{aligned}$$

(70)

By the Borel–Cantelli lemma, we get

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\ln P(X^{L_{a_{n}}})=0\quad \mathrm{a.e.}. \end{aligned}$$

(71)

Let $\varphi (x) = x \ln x$ with $\varphi (0) = 0$. It is easy to see that $\varphi (x)$ is a continuous function on the interval [0, 1]. By Lemmas 4.2, 4.3 and (25), we have, for all $k_{1},k_{2},l\in G$,

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\left| P_{t}(k_{1},k_{2}|l)\ln P_{t}(k_{1},k_{2}|l)\right. \nonumber \\&\quad \left. -P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l)\right| =0. \end{aligned}$$

(72)

Let $g_{t}(x,y_{1},y_{2}) = \ln P_{t}(y_{1},y_{2}|x)$ for all $t\in T_2$ in Lemma 4.1. By (35) and (36), we have

$$\begin{aligned}&H_{a_{n},\phi (n)}(\omega ) =\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t}),\end{aligned}$$

(73)

$$\begin{aligned}&G_{a_{n},\phi (n)}(\omega ) = \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}} P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\ln P_{t}(x_{t^{1}},x_{t^{2}}|X_{t}). \end{aligned}$$

(74)

Let $\alpha =\frac{1}{2}$. For any $t\in T_{2}$, we have

$$\begin{aligned}&E\left[ g^{2}_{t}(X_{t},X_{t^{1}},X_{t^{2}})e^{\alpha |g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_{t}\right] \\&\quad = \sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}\ln ^{2}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\cdot e^{- \frac{1}{2}\ln P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\\&\quad = \sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}\ln ^{2}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})[P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})]^\frac{1}{2}\\&\quad \le 16b^{2}e^{-2}, \end{aligned}$$

and, for all $t\in T_{2}$,

$$\begin{aligned} E[e^{\frac{1}{2}|g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_{t}] < \infty . \end{aligned}$$

(75)

Thus

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}E[g^{2}_{t}(X_{t},X_{t^{1}},X_{t^{2}})\cdot e^{\frac{1}{2}|g_{t}(X_{t},X_{t^{1}},X_{t^{2}})|}|X_t]\nonumber \\&\quad \le 16b^{2}e^{-2}. \end{aligned}$$

(76)
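The constant $16b^{2}e^{-2}$ comes from maximising $h(x)=\ln ^{2}x\cdot x^{1/2}$ on (0, 1]: the derivative vanishes at $x=e^{-4}$, where $h(e^{-4})=16e^{-2}$, and the sum in question has $b^{2}$ terms. A brief numerical sketch of this (helper names are hypothetical):

```python
import math

def h(x):
    """h(x) = (ln x)^2 * sqrt(x) on (0, 1], extended by h(0) = 0."""
    return 0.0 if x == 0.0 else math.log(x) ** 2 * math.sqrt(x)

H_MAX = 16.0 * math.exp(-2.0)   # value of h at its maximiser x = e^{-4}

def second_moment_term(row):
    """Sum of ln^2(p) * p^(1/2) over the entries p of one conditional law."""
    return sum(h(p) for p in row)
```

For any conditional law $P_{t}(\cdot ,\cdot |x)$ on $G^{2}$ with $|G|=b$, `second_moment_term` is therefore at most `b * b * H_MAX`, uniformly in t and x.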

By (73)–(76) and Lemma 4.1, we have

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Bigg \{\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t})\nonumber \\&\quad - \,\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\cdot \ln P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\Bigg \} = 0 \quad \mathrm{a.e.}.\qquad \end{aligned}$$

(77)

Now, we have

$$\begin{aligned}&\Bigg |\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum _{(x_{t^{1}},x_{t^{2}})\in G^{2}}P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\cdot \ln P_{t}(x_{t^{1}},x_{t^{2}}|X_{t})\nonumber \\&\quad -\,\frac{1}{2}\sum ^{b-1}_{l=0}\pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l) \Bigg | \nonumber \\&\quad \le \Bigg |\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}I_{l}(X_{t})P_{t}(k_{1},k_{2}|l)\cdot \ln P_{t}(k_{1},k_{2}|l)\nonumber \\&\qquad - \,\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}I_{l}(X_{t})P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\Bigg |\nonumber \\&\qquad + \Bigg |\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}I_{l}(X_{t})P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\nonumber \\&\qquad -\,\frac{1}{2}\sum ^{b-1}_{l=0}\pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l)\Bigg |\nonumber \\&\quad \le \sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\big |P_{t}(k_{1},k_{2}|l)\cdot \ln P_{t}(k_{1},k_{2}|l) \nonumber \\&\qquad -\,P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\big |+\sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1 }_{k_{2}=0}P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\nonumber \\&\quad \quad \left| \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}I_{l}(X_{t})-\frac{1}{2}\pi (l)\right| \nonumber \\&\quad \le \sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\big |P_{t}(k_{1},k_{2}|l)\cdot \ln P_{t}(k_{1},k_{2}|l)\nonumber \\&\qquad -\,P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\big | + \sum ^{b-1}_{l=0}\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\cdot \ln P(k_{1},k_{2}|l)\nonumber \\&\quad \quad \left| \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}S_{l}(L^{a_{n}+\phi (n)-1}_{a_{n}})- \frac{1}{2}\pi (l)\right| \quad \mathrm{a.e.}. \end{aligned}$$

(78)

By Theorem 3.1, (72), (77) and (78), and noticing that $ \lim \limits _{n\rightarrow \infty }\frac{|L^{a_{n}+\phi (n)-1}_{a_{n}}|}{|L^{a_{n}+\phi (n)}_{a_{n}}|}=\frac{1}{2}$, we have

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t})\nonumber \\&\quad =\frac{1}{2}\sum ^{b-1}_{l=0}\pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l)\quad \mathrm{a.e.}. \end{aligned}$$

(79)

Equation (27) can be obtained from (17), (71) and (79), which completes the proof of Theorem 3.2. $\square $

Proof of Corollary 3.1

For all $t\in T_{2}$ and $x,y_{1},y_{2}\in G$, let $P_{t}(y_{1},y_{2}|x) = Q_{t^{1}}(y_{1}|x)Q_{t^{2}}(y_{2}|x)$. From Remark 2.4 we know that the nonhomogeneous Markov chain indexed by a binary tree given in this corollary is a nonhomogeneous bifurcating Markov chain indexed by a binary tree with the stochastic matrices $\{P_{t}= (P_{t}(y_{1},y_{2}|x)),t \in T_{2}\}$, and

$$\begin{aligned}&g_{a_{n},\phi (n)}(\omega ) = - \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big [\ln P(X^{L_{a_{n}}})+ \sum _{t\in L^{a_{n}+\phi (n)}_{a_{n}+1}}\ln Q_{t}(X_{t}|X_{1_{t}})\Big ]\nonumber \\&\quad = - \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|} \Big [\ln P(X^{L_{a_{n}}})+ \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln Q_{t^{1}}(X_{t^{1}}|X_{t}) \nonumber \\&\quad \quad +\, \sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln Q_{t^{2}}(X_{t^{2}}|X_{t})\Big ]\nonumber \\&\quad = - \frac{1}{|L^{a_{n}+\phi (n)}_{a_{n}}|}\Big [\ln P(X^{L_{a_{n}}}) +\sum _{t\in L^{a_{n}+\phi (n)-1}_{a_{n}}}\ln P_{t}(X_{t^{1}},X_{t^{2}}|X_{t})\Big ]\nonumber \\&\quad = f_{a_{n},\phi (n)}(\omega ). \end{aligned}$$

(80)

Let $P(y_{1},y_{2}|x) = Q(y_{1}|x)Q(y_{2}|x)$. It is easy to see that $P_{1} = Q,P_{2}= Q, \frac{1}{2}(P_{1}+P_{2}) = Q$, and Q is ergodic. We have

$$\begin{aligned}&|P_{t}(k_{1},k_{2}|l)-P(k_{1},k_{2}|l)|\nonumber \\&\quad = |Q_{t^{1}}(k_{1}|l)Q_{t^{2}}(k_{2}|l)-Q(k_{1}|l)Q(k_{2}|l)|\nonumber \\&\quad \le |Q_{t^{1}}(k_{1}|l)Q_{t^{2}}(k_{2}|l) - Q(k_{1}|l)Q_{t^{2}}(k_{2}|l)|+|Q(k_{1}|l)Q_{t^{2}}(k_{2}|l)\nonumber \\&\qquad -Q(k_{1}|l)Q(k_{2}|l)|\nonumber \\&\quad \le |Q_{t^{1}}(k_{1}|l)-Q(k_{1}|l)|+|Q_{t^{2}}(k_{2}|l)-Q(k_{2}|l)|, \end{aligned}$$

(81)

and by (29), for $i = 1,2,$

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n-1)}\backslash \{o\}}|Q_{t^{i}}(k_{1}|l)-Q(k_{1}|l)|\nonumber \\&\quad \le \lim _{n\rightarrow \infty }\frac{1}{|T^{(n)}|}\sum _{t\in T^{(n)}\backslash \{o\}}|Q_{t}(k_{1}|l)-Q(k_{1}|l)| = 0, \end{aligned}$$

(82)

thus (25) follows from (81) and (82). By Theorem 3.2 and (80),

$$\begin{aligned}&\lim _{n\rightarrow \infty }g_{a_{n},\phi (n)}(\omega )\nonumber \\&\quad = \lim _{n\rightarrow \infty }f_{a_{n},\phi (n)}(\omega )\nonumber \\&\quad = - \frac{1}{2}\sum ^{b-1}_{l=0}\pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}P(k_{1},k_{2}|l)\ln P(k_{1},k_{2}|l)\nonumber \\&\quad = - \frac{1}{2}\sum ^{b-1}_{l=0} \pi (l)\sum ^{b-1}_{k_{1}=0}\sum ^{b-1}_{k_{2}=0}Q(k_{1}|l)Q(k_{2}|l)\cdot \big [\ln Q(k_{1}|l) + \ln Q(k_{2}|l)\big ]\nonumber \\&\quad = - \sum ^{b-1}_{l=0}\sum ^{b-1}_{k=0}\pi (l)Q(k|l)\ln Q(k|l)\quad \mathrm{a.e.}. \end{aligned}$$

(83)

Thus, (30) holds. $\square $
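The last equality in (83) relies only on $P(k_{1},k_{2}|l)=Q(k_{1}|l)Q(k_{2}|l)$ and on the rows of Q summing to one. The sketch below compares both sides numerically; the function names and the matrices used to exercise them are hypothetical illustrations.

```python
import math

def xlogx(p):
    """p * ln p with the convention 0 * ln 0 = 0."""
    return 0.0 if p == 0.0 else p * math.log(p)

def entropy_rate_bifurcating(pi, Q):
    """-(1/2) sum_l pi(l) sum_{k1,k2} P(k1,k2|l) ln P(k1,k2|l), with P = Q x Q."""
    b = len(pi)
    total = 0.0
    for l in range(b):
        for k1 in range(b):
            for k2 in range(b):
                total += pi[l] * xlogx(Q[l][k1] * Q[l][k2])
    return -0.5 * total

def entropy_rate_chain(pi, Q):
    """-sum_l sum_k pi(l) Q(k|l) ln Q(k|l), the right-hand side of (83)."""
    b = len(pi)
    return -sum(pi[l] * xlogx(Q[l][k]) for l in range(b) for k in range(b))
```

The two functions agree for any probability vector `pi` and any stochastic matrix `Q` (here `Q[l][k]` stands for $Q(k|l)$), so the factor $\frac{1}{2}$ in Theorem 3.2 is exactly what recovers the usual entropy rate of the chain.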

References

1. Algoet, P.H., Cover, T.M.: A sandwich proof of the Shannon–McMillan–Breiman theorem. Ann. Probab. 16(2), 899–909 (1988)
2. Barron, A.R.: The strong ergodic theorem for densities: generalized Shannon–McMillan–Breiman theorem. Ann. Probab. 13(4), 1292–1303 (1985)
3. Benjamini, I., Peres, Y.: Markov chains indexed by trees. Ann. Probab. 22, 219–243 (1994)
4. Berger, T., Ye, Z.: Entropic aspects of random fields on trees. IEEE Trans. Inform. Theory 36, 1006–1018 (1990)
5. Billingsley, P.: Ergodic Theory and Information. Wiley, New York (1965)
6. Breiman, L.: The individual ergodic theorem of information theory. Ann. Math. Stat. 28(3), 809–811 (1957)
7. Chen, D.Y.: Average properties of random walks on Galton–Watson trees. Ann. Inst. Henri Poincare 33(3), 359–369 (1997)
8. Chung, K.L.: The ergodic theorem of information theory. Ann. Math. Stat. 32(2), 612–614 (1961)
9. Dang, H., Yang, W.G., Shi, Z.Y.: The strong law of large numbers and the entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. IEEE Trans. Inf. Theory 61(4), 1640–164 (2015)
10. Dembo, A., Mörters, P., Sheffield, S.: Large deviations of Markov chains indexed by random trees. Ann. I. H. Poincare-Fr. 41(6), 971–996 (2005)
11. Dong, Y., Yang, W.G., Bai, J.F.: The strong law of large numbers and the Shannon–McMillan theorem for nonhomogeneous Markov chains indexed by a Cayley tree. Stat. Probab. Lett. 81(12), 1883–1890 (2011)
12. Gaposhkin, V.F.: The law of large numbers for moving averages of independent random variables. Mathematical notes of the Academy of Sciences of the USSR 42, 579–583 (1987)
13. Guyon, J.: Limit theorems for bifurcating Markov chains. Application to the detection of cellular aging. Ann. Appl. Probab. 17(5–6), 1538–1569 (2007)
14. Gray, R.M., Kieffer, J.C.: Asymptotically mean stationary measures. Ann. Probab. 8, 962–973 (1980)
15. Huang, H.L., Yang, W.G.: Strong law of large number for Markov chains indexed by an infinite tree with uniformly bounded degree. Sci. China Ser. A 51(2), 195–202 (2008)
16. Lai, T.L.: Limit theorems for moving averages. Probab. Finance Insur 25(4), 1–14 (2004)
17. Lanzinger, H.: An almost sure limit theorem for moving averages of random variables between the strong law of large numbers and the Erdös–Rényi law. ESAIM-Probab. Stat. 2, 163–183 (1998)
18. Le Gall, J.F.: A conditional limit theorem for tree-indexed random walk. Stoch. Proc. Appl. 116, 539–567 (2006)
19. Liu, W., Yang, W.G.: An extension of Shannon–McMillan theorem and some limit properties for nonhomogeneous Markov chains. Stoch. Proc. Appl. 61, 129–145 (1996)
20. McMillan, B.: The basic theorems of information theory. Ann. Math. Statist. 24, 196–219 (1953)
21. Pemantle, R.: Automorphism invariant measure on trees. Ann. Probab. 20, 1549–1566 (1992)
22. Shannon, C.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423, 623–656 (1948)
23. Shepp, L.A.: A limit law concerning moving averages. Ann. Math. Statist. 35(1), 424–428 (1964)
24. Shi, Z.Y., Zhong, P.P., Fan, Y.: The Shannon–McMillan theorem for Markov chains indexed by a Cayley tree in random environment. Probab. Eng. Inform. Sc. 32(4), 626–639 (2018)
25. Shi, Z.Y., Yang, W.G.: Some limit properties for the mth-order nonhomogeneous Markov chains indexed by an rooted Cayley tree. Stat. Probab. Lett. 80, 1223–1233 (2010)
26. Telcs, A., Wormald, N.: Branching and tree indexed random walks on fractals. J. Appl. Prob. 36, 999–1011 (1999)
27. Wang, Z.Z., Yang, W.G.: The generalized entropy ergodic theorem for nonhomogeneous Markov chains. J. Theor. Probab. 29, 761–775 (2016)
28. Yang, J., Yang, W.G.: The generalized entropy ergodic theorem for nonhomogeneous Markov chains indexed by a Cayley tree. Chin. Ann. Math. A 41(1), 99–114 (2020). (In Chinese)
29. Yang, W.G.: The asymptotic equipartition property for nonhomogeneous Markov information sources. Probab. Eng. Inform. Sci. 12, 509–518 (1998)
30. Yang, W.G., Liu, W.: Strong law of large numbers for Markov chains fields on a Bethe tree. Statist. Probab. Lett. 49, 245–250 (2000)
31. Yang, W.G.: Some limit properties for Markov chains indexed by a homogeneous tree. Statist. Probab. Lett. 65, 241–250 (2003)
32. Yang, W.G., Liu, W.: The asymptotic equipartition property for mth-order nonhomogeneous Markov information sources. IEEE Trans. Inform. Theory 50(12), 3326–3330 (2004)
33. Yang, W.G., Ye, Z.: The asymptotic equipartition property for Markov chains indexed by a Homogeneous tree. IEEE Trans. Inf. Theory 53(9), 3275–3280 (2007)
34. Yamamoto, K.: Large deviation theorem for branches of the random binary tree in the Horton–Strahler analysis. SIAM J. Discrete Math. 34(1), 938–949 (2020)
35. Ye, Z., Berger, T.: Ergodic, regularity and asymptotic equipartition property of random fields on trees. J. Combin. Inform. System Sci. 21, 157–184 (1996)
36. Ye, Z., Berger, T.: Information measures for discrete random fields. Science, Beijing (1998)

Acknowledgements

The authors sincerely thank the editor and reviewers for their helpful and important comments, especially during the COVID-19 pandemic. The authors are also very thankful to Professor Keyue Ding, who greatly helped us improve the English of this paper. This work is supported by the National Natural Science Foundation of China (11971197, 11601191).

Author information

Authors and Affiliations

School of Mathematical Sciences, Jiangsu University, Zhenjiang, 212013, China
Zhiyan Shi, Pingping Zhong & Yan Fan
School of Mathematics & Physics and Engineering, Anhui University of Technology, Ma’anshan, 243002, China
Zhongzhi Wang

Corresponding author

Correspondence to Zhiyan Shi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shi, Z., Wang, Z., Zhong, P. et al. The Generalized Entropy Ergodic Theorem for Nonhomogeneous Bifurcating Markov Chains Indexed by a Binary Tree. J Theor Probab 35, 1367–1390 (2022). https://doi.org/10.1007/s10959-021-01117-1

Received: 14 July 2019
Revised: 19 May 2021
Accepted: 08 July 2021
Published: 03 August 2021
Issue Date: September 2022
DOI: https://doi.org/10.1007/s10959-021-01117-1
