1 Introduction

By a tree T we mean an infinite, locally finite, connected graph with a distinguished vertex o called the root and without loops or cycles. We only consider trees without leaves. That is, the degree (the number of neighboring vertices) of each vertex (except o) is required to be at least 2.

Let $T$ be an infinite tree with root $o$. The set of all vertices at distance $n$ from the root is called the $n$th generation (or $n$th level) of $T$. We denote by $T^{(n)}$ the union of the first $n$ generations of $T$ (generations $0$ through $n$, so that $o\in T^{(n)}$), by $T_{(m)}^{(n)}$ the union of the $m$th to $n$th generations of $T$, and by $L_n$ the subgraph of $T$ containing the vertices in the $n$th generation. For each vertex $t$ there is a unique path from $o$ to $t$; we write $|t|$ for the number of edges on this path. We denote the first predecessor of $t$ by $1_t$, the second predecessor of $t$ by $2_t$, and the $n$th predecessor of $t$ by $n_t$. We also call $t$ one of the sons of $1_t$. For any two vertices $s$ and $t$, we write $s\le t$ if $s$ is on the unique path from the root $o$ to $t$, and we denote by $s\wedge t$ the vertex farthest from $o$ satisfying $s\wedge t\le s$ and $s\wedge t\le t$. For a set of vertices $A$, we write $X^A=\{X_t,t\in A\}$ and denote by $|A|$ the number of vertices of $A$.
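The notation above can be made concrete on a small example. The following is a minimal sketch, assuming an $m$-ary tree whose vertices are numbered breadth-first (root $0$, sons of $t$ numbered $mt+1,\dots,mt+m$); this numbering and the function names `parent`, `level`, `predecessor`, `meet`, `precedes` and `generation` are illustrative choices, not notation from the paper.

```python
def parent(t, m):
    """First predecessor 1_t of a non-root vertex t (root = 0)."""
    if t == 0:
        raise ValueError("the root has no predecessor")
    return (t - 1) // m


def level(t, m):
    """|t|: number of edges on the path from the root to t."""
    edges = 0
    while t != 0:
        t = parent(t, m)
        edges += 1
    return edges


def predecessor(t, n, m):
    """n-th predecessor n_t of t (requires n <= |t|)."""
    for _ in range(n):
        t = parent(t, m)
    return t


def meet(s, t, m):
    """s ^ t: the vertex farthest from the root lying on both root paths."""
    ls, lt = level(s, m), level(t, m)
    while ls > lt:                      # lift the deeper vertex first
        s, ls = parent(s, m), ls - 1
    while lt > ls:
        t, lt = parent(t, m), lt - 1
    while s != t:                       # then lift both until they coincide
        s, t = parent(s, m), parent(t, m)
    return s


def precedes(s, t, m):
    """s <= t: s lies on the unique path from the root to t."""
    return meet(s, t, m) == s


def generation(n, m):
    """Vertices of the n-th generation L_n."""
    first = sum(m**j for j in range(n))
    return list(range(first, first + m**n))
```

In this numbering every non-root vertex has a smaller-numbered first predecessor, which is convenient whenever values must be assigned generation by generation.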

If each vertex on a tree $T$ has $m+1$ neighboring vertices, we call it a Bethe tree $T_{B,m}$; if the root has $m$ neighbors and the other vertices have $m+1$ neighbors on a tree $T$, we call it a Cayley tree $T_{C,m}$. Both the Bethe tree and the Cayley tree are called regular (or homogeneous) trees. If the degrees of all vertices on a tree $T$ are uniformly bounded, then we call $T$ a uniformly bounded degree tree (see [1] and [2]).

Definition 1 (see [3])

Let $T$ be a locally finite, infinite tree, $S$ be a finite state-space, and $\{X_t,t\in T\}$ be a collection of $S$-valued random variables defined on the probability space $(\Omega,\mathcal{F},P)$. Let

$$\{p(x),\ x\in S\} \tag{1}$$

be a distribution on $S$, and

$$(P(y\mid x)),\quad x,y\in S \tag{2}$$

be a stochastic matrix on $S^2$. If for any vertex $t$,

$$P(X_t=y\mid X_{1_t}=x \text{ and } X_s \text{ for } t\wedge s\le 1_t)=P(X_t=y\mid X_{1_t}=x)=P(y\mid x)\quad\forall x,y\in S \tag{3}$$

and

$$P(X_o=x)=p(x)\quad\forall x\in S,$$

then $\{X_t,t\in T\}$ will be called an $S$-valued Markov chain indexed by the infinite tree $T$ with initial distribution (1) and transition matrix (2), or a $T$-indexed Markov chain with state-space $S$.
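Definition 1 translates directly into a sampling procedure: draw the root from the initial distribution, then draw each vertex from the row of the transition matrix selected by its first predecessor. The sketch below assumes an $m$-ary Cayley tree with breadth-first numbering (first predecessor of $t\ge 1$ is `(t - 1) // m`); the function name `simulate_tree_chain` and the list-of-rows encoding of the transition matrix are illustrative assumptions.

```python
import random


def simulate_tree_chain(m, depth, p, P, seed=0):
    """Sample X_t for every t in T^(depth) of the m-ary Cayley tree.

    p is the initial distribution (a list), P the transition matrix as a list
    of rows with P[x][y] = P(y | x). Vertices are numbered breadth-first, so a
    first predecessor always has a smaller index than its sons.
    """
    rng = random.Random(seed)
    states = list(range(len(p)))
    size = sum(m**j for j in range(depth + 1))   # |T^(depth)|
    X = [None] * size
    X[0] = rng.choices(states, weights=p)[0]     # root drawn from p
    for t in range(1, size):
        x = X[(t - 1) // m]                      # state of the first predecessor
        X[t] = rng.choices(states, weights=P[x])[0]
    return X
```

Because each vertex depends on the rest of the tree only through its first predecessor, one pass in breadth-first order is enough to respect the Markov property (3).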

Benjamini and Peres [3] gave the notion of tree-indexed Markov chains and studied the recurrence and ray-recurrence for them. Berger and Ye [4] studied the existence of entropy rate for some stationary random fields on a homogeneous tree. Ye and Berger [5], by using Pemantle’s result [6] and a combinatorial approach, studied the Shannon-McMillan theorem with convergence in probability for a PPG-invariant and ergodic random field on a homogeneous tree. Yang and Liu [7] studied the strong law of large numbers and Shannon-McMillan theorems for Markov chain fields on the Cayley tree. Yang [8] studied some strong limit theorems for homogeneous Markov chains indexed by a homogeneous tree and the strong law of large numbers and the asymptotic equipartition property (AEP) for finite homogeneous Markov chains indexed by a homogeneous tree. Yang and Ye [9] studied strong theorems for countable nonhomogeneous Markov chains indexed by a homogeneous tree and the strong law of large numbers and the AEP for finite nonhomogeneous Markov chains indexed by a homogeneous tree. Bao and Ye [10] studied the strong law of large numbers and the asymptotic equipartition property for nonsymmetric Markov chain fields on Cayley trees. Takacs [1] studied the strong law of large numbers for the univariate functions of finite Markov chains indexed by an infinite tree with uniformly bounded degree. Huang and Yang [2] studied the strong law of large numbers for Markov chains indexed by uniformly bounded degree trees.

However, in all of these results the degrees of the vertices in the tree models are uniformly bounded. What if the degrees of the vertices are not uniformly bounded? In this paper, we drop the uniform boundedness restriction. We mainly study the strong law of large numbers and the AEP with a.e. convergence for finite Markov chains indexed by trees under the following assumption.

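For any integer $N\ge 0$, let $d^0(t):=1$ and denote by

$$d^N(t):=|\{\sigma\in T: N_\sigma=t\}| \tag{4}$$

the number of $N$th descendants of $t$. Denote

$$O(n)=\Bigl\{c_n: 0<\limsup_{n\to\infty}\frac{c_n}{n}\le c,\ c \text{ is a constant}\Bigr\},$$

with $O(a_n)$ defined analogously for any positive sequence $a_n$ in place of $n$.

We assume that for sufficiently large $n$ and any given integer $N\ge 0$,

$$\max\{d^N(t): t\in T^{(n)}\}\le O\Bigl(\ln\frac{|T^{(n+N)}|}{|T^{(n)}|}\Bigr). \tag{5}$$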

The following examples are used to explain assumption (5).

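Example 1 Both the Bethe tree $T_{B,m}$ and the Cayley tree $T_{C,m}$ (with $m\ge 2$) satisfy assumption (5). Indeed, $\max\{d^N(t): t\in T^{(n-N)}\}$ is a constant ($m^N$ for the Cayley tree; for the Bethe tree the root has $(m+1)m^{N-1}$ $N$th descendants when $N\ge 1$), and $\ln(|T^{(n)}|/|T^{(n-N)}|)\to N\ln m$ as $n\to\infty$.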
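The two quantities compared in assumption (5) can be checked numerically on a Cayley tree. The sketch below is only an illustration under stated assumptions: vertices are numbered breadth-first, the tree is truncated at a finite depth of at least $n+N$, and the helper names `cayley_children`, `vertex_levels`, `d` and `assumption_terms` are hypothetical. It uses the indexing of (5), that is, the maximum over $T^{(n)}$ against $\ln(|T^{(n+N)}|/|T^{(n)}|)$.

```python
import math


def cayley_children(m, depth):
    """Sons of each vertex of T^(depth); vertices in the last generation get []."""
    size = sum(m**j for j in range(depth + 1))
    return [[s for s in range(m * t + 1, m * t + m + 1) if s < size]
            for t in range(size)]


def vertex_levels(children):
    """|t| for every vertex, by breadth-first search from the root 0."""
    level = [0] * len(children)
    queue = [0]
    for t in queue:                 # the list grows while we iterate over it
        for s in children[t]:
            level[s] = level[t] + 1
            queue.append(s)
    return level


def d(children, t, N):
    """d^N(t): number of N-th descendants of t, with d^0(t) = 1."""
    if N == 0:
        return 1
    return sum(d(children, s, N - 1) for s in children[t])


def assumption_terms(children, n, N):
    """Return (max of d^N over T^(n), ln(|T^(n+N)| / |T^(n)|)).

    The truncated tree must reach generation n + N for the counts to be exact.
    """
    level = vertex_levels(children)
    size_n = sum(1 for l in level if l <= n)
    size_n_plus_N = sum(1 for l in level if l <= n + N)
    max_d = max(d(children, t, N) for t in range(len(children)) if level[t] <= n)
    return max_d, math.log(size_n_plus_N / size_n)
```

For the binary Cayley tree, `assumption_terms(cayley_children(2, 7), 5, 2)` compares $\max d^2(t)=4$ with $\ln(255/63)$, which is already close to $2\ln 2$.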

Example 2 A uniformly bounded degree tree satisfies assumption (5). In fact, if the tree $T$ is a uniformly bounded degree tree, then $\max\{d^N(t): t\in T^{(n-N)}\}$ is no more than a constant $a^N$, and

$$\ln\frac{|T^{(n)}|}{|T^{(n-N)}|}\le\ln\frac{|T^{(n-N)}|\times a^N}{|T^{(n-N)}|}=N\ln a$$

is also bounded by a constant.

Example 3 Define the lower growth rate of the tree to be $\operatorname{gr}T=\liminf_{n\to\infty}|T^{(n)}|^{1/n}$ and the upper growth rate of the tree to be $\operatorname{Gr}T=\limsup_{n\to\infty}|T^{(n)}|^{1/n}$.

If both $\operatorname{gr}T$ and $\operatorname{Gr}T$ are finite, then

$$\ln\frac{|T^{(n+N)}|}{|T^{(n)}|}\le\ln\frac{(\operatorname{Gr}T)^{n+N}}{(\operatorname{gr}T)^{n}}=n\ln\frac{\operatorname{Gr}T}{\operatorname{gr}T}+N\ln\operatorname{Gr}T,$$

hence (5) implies that

$$\max\{d^N(t): t\in T^{(n)}\}\le O\Bigl(\ln\frac{|T^{(n+N)}|}{|T^{(n)}|}\Bigr)\le O(n).$$

2 Some notations and lemmas

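In the following, let $\delta_k(\cdot)$ be the Kronecker $\delta$-function. For any given integer $N\ge 0$, denote

$$S_k^N(T^{(n)}):=\sum_{t\in T^{(n)}}\delta_k(X_t)\,d^N(t), \tag{6}$$

which can be considered as the number of $k$'s among the variables in $T^{(n)}$, weighted according to the number of $N$th descendants. This is the form in which $S_k^N(T^{(n)})$ is used in (15) and (16). By (6), we have

$$\sum_{k\in S}S_k^N(T^{(n)})=\sum_{t\in T^{(n)}}d^N(t)=|T^{(n+N)}|-|T^{(N-1)}|, \tag{7}$$

where $|T^{(-1)}|:=0$.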
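The weighted counts can be computed mechanically on a finite truncation of a tree. The sketch below is a hedged illustration: it takes the sum in (6) over $t\in T^{(n)}$, which is the reading used in the proof of Theorem 1, and checks that the counts summed over all states equal the number of vertices in generations $N$ through $n+N$. The helpers `build_tree`, `descendant_counts` and `weighted_counts`, and the rule that every vertex of generation $j$ has $j+1$ sons, are assumptions made only for this example.

```python
def build_tree(sons, depth):
    """Build generations 0..depth; sons(j) is the number of sons of each
    vertex in generation j. Returns (parent, level) lists in breadth-first order."""
    parent, level = [None], [0]
    frontier = [0]
    for j in range(depth):
        next_frontier = []
        for t in frontier:
            for _ in range(sons(j)):
                parent.append(t)
                level.append(j + 1)
                next_frontier.append(len(parent) - 1)
        frontier = next_frontier
    return parent, level


def descendant_counts(parent, N):
    """d^N(t) for every vertex t, counted inside the truncated tree.

    The value is exact for every t whose generation is at most depth - N.
    """
    d = [0] * len(parent)
    for s in range(len(parent)):
        t = s
        for _ in range(N):
            t = parent[t]
            if t is None:       # s has no N-th predecessor
                break
        else:
            d[t] += 1           # s is an N-th descendant of t
    return d


def weighted_counts(X, level, dN, n, num_states):
    """S_k^N(T^(n)) for k = 0..num_states-1, summing over vertices with |t| <= n."""
    S = [0] * num_states
    for t, x in enumerate(X):
        if level[t] <= n:
            S[x] += dN[t]
    return S
```

With `sons = lambda j: j + 1` and depth 5, the generation sizes are 1, 1, 2, 6, 24, 120, so for $n=3$ and $N=2$ the counts must add up to $2+6+24+120=152$ whatever the labels $X_t$ are.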

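For functions $\{g_t(x,y),t\in T\}$ defined on $S^2$, define

$$H_n(\omega)=\sum_{t\in T^{(n)}\setminus\{o\}}g_t(X_{1_t},X_t) \tag{8}$$

and

$$G_n(\omega)=\sum_{t\in T^{(n)}\setminus\{o\}}E[g_t(X_{1_t},X_t)\mid X_{1_t}]. \tag{9}$$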

Lemma 1 (see [2])

Let $T$ be an infinite tree for which assumption (5) holds. Let $(X_t)_{t\in T}$ be a $T$-indexed Markov chain with state-space $S$ defined as before, and let $\{g_t(x,y),t\in T\}$ be functions defined on $S^2$. Let $L_0=\{o\}$, $\mathcal{F}_n=\sigma(X^{T^{(n)}})$, and

$$t_n(\lambda,\omega)=\frac{e^{\lambda\sum_{t\in T^{(n)}\setminus\{o\}}g_t(X_{1_t},X_t)}}{\prod_{t\in T^{(n)}\setminus\{o\}}E[e^{\lambda g_t(X_{1_t},X_t)}\mid X_{1_t}]}, \tag{10}$$

where $\lambda$ is a real number. Then $\{t_n(\lambda,\omega),\mathcal{F}_n,n\ge 1\}$ is a nonnegative martingale.

Lemma 2 (see [2])

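Under the assumption of Lemma 1, let $\{a_n,n\ge 1\}$ be a sequence of nonnegative random variables and let $\alpha>0$. Set

$$B=\Bigl\{\lim_{n\to\infty}a_n=\infty\Bigr\} \tag{11}$$

and

$$D(\alpha)=\Bigl\{\limsup_{n\to\infty}\frac{1}{a_n}\sum_{t\in T^{(n)}\setminus\{o\}}E\bigl[g_t^2(X_{1_t},X_t)e^{\alpha|g_t(X_{1_t},X_t)|}\mid X_{1_t}\bigr]=M(\omega)<\infty\Bigr\}\cap B. \tag{12}$$

Then

$$\lim_{n\to\infty}\frac{H_n(\omega)-G_n(\omega)}{a_n}=0\quad\text{a.e. on } D(\alpha). \tag{13}$$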

3 Strong law of large numbers and Shannon-McMillan theorem

In this section, we study the strong law of large numbers and the Shannon-McMillan theorem for finite Markov chains indexed by an infinite tree satisfying assumption (5).

Theorem 1 Let $T$ be an infinite tree for which assumption (5) holds. Then under the assumption of Lemma 1, for all $k\in S$ and $N\ge 0$, we have

$$\lim_{n\to\infty}\frac{1}{|T^{(n+N)}|}\Bigl\{S_k^N(T^{(n)})-\sum_{l\in S}S_l^{N+1}(T^{(n-1)})P(k\mid l)\Bigr\}=0\quad\text{a.e.} \tag{14}$$

Proof Let $g_t(x,y)=d^N(t)\delta_k(y)$ and $a_n=|T^{(n+N)}|$. Then

$$\begin{aligned}
G_n(\omega)&=\sum_{t\in T^{(n)}\setminus\{o\}}E[g_t(X_{1_t},X_t)\mid X_{1_t}]
=\sum_{t\in T^{(n)}\setminus\{o\}}d^N(t)\sum_{x_t\in S}\delta_k(x_t)P(x_t\mid X_{1_t})\\
&=\sum_{t\in T^{(n)}\setminus\{o\}}d^N(t)P(k\mid X_{1_t})
=\sum_{l\in S}\sum_{t\in T^{(n)}\setminus\{o\}}\delta_l(X_{1_t})d^N(t)P(k\mid l)\\
&=\sum_{l\in S}\sum_{t\in T^{(n-1)}}\delta_l(X_t)d^{N+1}(t)P(k\mid l)
=\sum_{l\in S}S_l^{N+1}(T^{(n-1)})P(k\mid l)
\end{aligned} \tag{15}$$

and

$$H_n(\omega)=\sum_{t\in T^{(n)}\setminus\{o\}}g_t(X_{1_t},X_t)=\sum_{t\in T^{(n)}\setminus\{o\}}d^N(t)\delta_k(X_t)=S_k^N(T^{(n)})-\delta_k(X_o)d^N(o). \tag{16}$$

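By Lemma 1, we know that $\{t_n(\lambda,\omega),\mathcal{F}_n,n\ge 1\}$ is a nonnegative martingale. According to the Doob martingale convergence theorem, we have

$$\lim_{n\to\infty}t_n(\lambda,\omega)=t(\lambda,\omega)<\infty\quad\text{a.e.} \tag{17}$$

We have by (17)

$$\limsup_{n\to\infty}\frac{\ln t_n(\lambda,\omega)}{|T^{(n+N)}|}\le 0\quad\text{a.e. } \omega\in B. \tag{18}$$

By (10), (16) and (18), we get

$$\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\Bigl\{\lambda H_n(\omega)-\sum_{t\in T^{(n)}\setminus\{o\}}\ln\bigl[E[e^{\lambda g_t(X_{1_t},X_t)}\mid X_{1_t}]\bigr]\Bigr\}\le 0\quad\text{a.e. } \omega\in B. \tag{19}$$

Let $\lambda>0$. Dividing both sides of (19) by $\lambda$, we have

$$\limsup_{n\to\infty}\frac{1}{a_n}\Bigl\{H_n(\omega)-\sum_{t\in T^{(n)}\setminus\{o\}}\frac{\ln\bigl[E[e^{\lambda g_t(X_{1_t},X_t)}\mid X_{1_t}]\bigr]}{\lambda}\Bigr\}\le 0\quad\text{a.e. } \omega\in B. \tag{20}$$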

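The case in which $\{d^N(t): t\in T^{(n)}\}$ is uniformly bounded was considered in [2], so we only consider the case in which $\{d^N(t): t\in T^{(n)}\}$ is not uniformly bounded. By (20) and the inequalities $\ln x\le x-1$ $(x>0)$ and $0\le e^x-1-x\le 2^{-1}x^2e^{|x|}$, for $0<\lambda\le\alpha$ we have

$$\begin{aligned}
&\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\Bigl[H_n(\omega)-\sum_{t\in T^{(n)}\setminus\{o\}}E[g_t(X_{1_t},X_t)\mid X_{1_t}]\Bigr]\\
&\quad\le\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T^{(n)}\setminus\{o\}}\Bigl\{\frac{\ln\bigl[E[e^{\lambda g_t(X_{1_t},X_t)}\mid X_{1_t}]\bigr]}{\lambda}-E[g_t(X_{1_t},X_t)\mid X_{1_t}]\Bigr\}\\
&\quad\le\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T^{(n)}\setminus\{o\}}\Bigl\{\frac{E[e^{\lambda g_t(X_{1_t},X_t)}\mid X_{1_t}]-1}{\lambda}-E[g_t(X_{1_t},X_t)\mid X_{1_t}]\Bigr\}\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T^{(n)}\setminus\{o\}}E\bigl[g_t^2(X_{1_t},X_t)e^{\lambda|g_t(X_{1_t},X_t)|}\mid X_{1_t}\bigr]\\
&\quad=\frac{\lambda}{2}\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T_{(1)}^{(n)}}E\bigl[(d^N(t)\delta_k(X_t))^2e^{\lambda|d^N(t)\delta_k(X_t)|}\mid X_{1_t}\bigr]\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T_{(1)}^{(n)}}\bigl[(d^N(t))^2e^{\lambda d^N(t)}P(k\mid X_{1_t})\bigr]\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T_{(1)}^{(n)}}\bigl[(d^N(t))^2e^{\lambda d^N(t)}\bigr]\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\sum_{t\in T_{(1)}^{(n)}}e^{2\lambda d^N(t)}\quad(\text{for sufficiently large } d^N(t))\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{|T^{(n)}|-1}{|T^{(n+N)}|}\max\bigl\{e^{2\lambda d^N(t)},t\in T_{(1)}^{(n)}\bigr\}\\
&\quad\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{|T^{(n)}|-1}{|T^{(n+N)}|}\Bigl(e^{\max\{d^N(t),\,t\in T_{(1)}^{(n)}\}}\Bigr)^{2\lambda}.
\end{aligned} \tag{21}$$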

By (5), there exists a constant $\beta>0$ such that

$$\max\bigl\{d^N(t),t\in T_{(1)}^{(n)}\bigr\}\le\beta\ln\frac{|T^{(n+N)}|}{|T^{(n)}|},$$

hence,

$$\Bigl(e^{\max\{d^N(t),\,t\in T_{(1)}^{(n)}\}}\Bigr)^{2\lambda}\le\Bigl(\frac{|T^{(n+N)}|}{|T^{(n)}|}\Bigr)^{2\lambda\beta}. \tag{22}$$

Set $0<\lambda<\frac{1}{2\beta}$. By (21) and (22) we have

$$\limsup_{n\to\infty}\frac{H_n(\omega)-G_n(\omega)}{|T^{(n+N)}|}\le\frac{\lambda}{2}\limsup_{n\to\infty}\frac{|T^{(n)}|-1}{|T^{(n+N)}|}\times\Bigl(\frac{|T^{(n+N)}|}{|T^{(n)}|}\Bigr)^{2\lambda\beta}. \tag{23}$$

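Since $0<2\lambda\beta<1$ and $|T^{(n)}|\le|T^{(n+N)}|$, the right-hand side of (23) is at most $\lambda/2$. Letting $\lambda\to 0^+$ in (23), by (15) and (16) we have

$$\limsup_{n\to\infty}\frac{1}{|T^{(n+N)}|}\Bigl\{S_k^N(T^{(n)})-\sum_{l\in S}S_l^{N+1}(T^{(n-1)})P(k\mid l)\Bigr\}\le 0\quad\text{a.e.} \tag{24}$$

Let $-\frac{1}{2\beta}<\lambda<0$. By (19), we similarly get

$$\liminf_{n\to\infty}\frac{1}{|T^{(n+N)}|}\Bigl\{S_k^N(T^{(n)})-\sum_{l\in S}S_l^{N+1}(T^{(n-1)})P(k\mid l)\Bigr\}\ge 0\quad\text{a.e.} \tag{25}$$

Combining (24) and (25), we obtain (14) directly. □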

Let $T$ be a tree and $(X_t)_{t\in T}$ be a stochastic process indexed by the tree $T$ with state-space $S$. Denote

$$P(x^{T^{(n)}})=P(X^{T^{(n)}}=x^{T^{(n)}}).$$

Let

$$f_n(\omega)=-\frac{1}{|T^{(n)}|}\ln P(X^{T^{(n)}}); \tag{26}$$

$f_n(\omega)$ will be called the entropy density of $X^{T^{(n)}}$. If $(X_t)_{t\in T}$ is a $T$-indexed Markov chain with state-space $S$ defined by Definition 1, we have by (3)

$$f_n(\omega)=-\frac{1}{|T^{(n)}|}\Bigl[\ln P(X_o)+\sum_{t\in T^{(n)}\setminus\{o\}}\ln P(X_t\mid X_{1_t})\Bigr]. \tag{27}$$

The convergence of $f_n(\omega)$ to a constant in some sense ($L_1$ convergence, convergence in probability, a.e. convergence) is called the Shannon-McMillan theorem, the entropy theorem, or the AEP in information theory.

Theorem 2 Let $T$ be an infinite tree for which assumption (5) holds. Let $k\in S$, and let $P$ be an ergodic stochastic matrix. Denote the unique stationary distribution of $P$ by $\pi$. Let $(X_t)_{t\in T}$ be a $T$-indexed Markov chain with state-space $S$ generated by $P$. Then, for any given integer $N\ge 0$,

$$\lim_{n\to\infty}\frac{S_k^N(T^{(n)})}{|T^{(n+N)}|}=\pi(k)\quad\text{a.e.} \tag{28}$$

Let $S_{l,k}(T^{(n)}):=|\{t\in T^{(n)}:(X_{1_t},X_t)=(l,k)\}|$; then

$$\lim_{n\to\infty}\frac{S_{l,k}(T^{(n)})}{|T^{(n)}|}=\pi(l)P(k\mid l)\quad\text{a.e.} \tag{29}$$

Let $f_n(\omega)$ be defined as in (27); then

$$\lim_{n\to\infty}f_n(\omega)=-\sum_{l\in S}\sum_{k\in S}\pi(l)P(k\mid l)\ln P(k\mid l)\quad\text{a.e.} \tag{30}$$

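Proof The proofs of (28) and (29) are similar to those of Huang and Yang ([2], Theorem 2 and Corollary 3). Letting $g_t(x,y)=-\ln P(y\mid x)$ in Lemma 1, we have

$$\begin{aligned}
\lim_{n\to\infty}f_n(\omega)&=\lim_{n\to\infty}\frac{H_n(\omega)}{|T^{(n)}|}
=-\lim_{n\to\infty}\frac{1}{|T^{(n)}|}\sum_{t\in T^{(n)}\setminus\{o\}}\ln P(X_t\mid X_{1_t})\\
&=-\lim_{n\to\infty}\frac{1}{|T^{(n)}|}\sum_{t\in T^{(n)}\setminus\{o\}}\sum_{l\in S}\sum_{k\in S}\delta_l(X_{1_t})\delta_k(X_t)\ln P(k\mid l)\\
&=-\lim_{n\to\infty}\sum_{l\in S}\sum_{k\in S}\ln P(k\mid l)\frac{S_{l,k}(T^{(n)})}{|T^{(n)}|},
\end{aligned}$$

so, by (29), (30) holds. □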
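The limits (28) (with $N=0$) and (30) can be illustrated by simulation. The following is a rough numerical sketch, not part of the proof: it assumes a binary Cayley tree with breadth-first numbering, a two-state transition matrix chosen arbitrarily for the example, and hypothetical helper names (`simulate`, `stationary`, `entropy_rate`, `entropy_density`). The stationary distribution is approximated by repeated multiplication with the transition matrix.

```python
import math
import random


def simulate(m, depth, p, P, seed=0):
    """Sample a chain on T^(depth) of the m-ary Cayley tree (breadth-first numbering)."""
    rng = random.Random(seed)
    states = list(range(len(p)))
    size = sum(m**j for j in range(depth + 1))
    X = [rng.choices(states, weights=p)[0]]
    for t in range(1, size):
        X.append(rng.choices(states, weights=P[X[(t - 1) // m]])[0])
    return X


def stationary(P, iterations=500):
    """Approximate the stationary distribution pi by iterating pi <- pi P."""
    pi = [1.0 / len(P)] * len(P)
    for _ in range(iterations):
        pi = [sum(pi[l] * P[l][k] for l in range(len(P))) for k in range(len(P))]
    return pi


def entropy_rate(pi, P):
    """Right-hand side of (30): -sum_l sum_k pi(l) P(k|l) ln P(k|l)."""
    return -sum(pi[l] * P[l][k] * math.log(P[l][k])
                for l in range(len(P)) for k in range(len(P)) if P[l][k] > 0)


def entropy_density(X, m, p, P):
    """f_n as in (27), for the sampled configuration X."""
    log_prob = math.log(p[X[0]])
    for t in range(1, len(X)):
        log_prob += math.log(P[X[(t - 1) // m]][X[t]])
    return -log_prob / len(X)


# Illustrative parameters (assumptions of this example only).
m, depth = 2, 13
p = [0.5, 0.5]
P = [[0.7, 0.3], [0.4, 0.6]]

X = simulate(m, depth, p, P, seed=1)
pi = stationary(P)
freq = [X.count(k) / len(X) for k in range(len(p))]   # S_k^0(T^(n)) / |T^(n)|
f_n = entropy_density(X, m, p, P)
h = entropy_rate(pi, P)
print(freq, pi)
print(f_n, h)
```

For this matrix the stationary distribution is $(4/7, 3/7)$; on a tree with $2^{14}-1$ vertices the empirical frequencies and the entropy density should be close to $\pi$ and to the right-hand side of (30), although a single run only suggests, and does not establish, the almost-everywhere convergence.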