(Non)-escape of mass and equidistribution for horospherical actions on trees

Let G be a large group acting on a biregular tree T and Γ ≤ G a geometrically finite lattice. In an earlier work, the authors classified orbit closures of the action of the horospherical subgroups on G/Γ. In this article we show that there is no escape of mass and use this to prove that, in fact, dense orbits equidistribute to the Haar measure on G/Γ. On the other hand, we show that new dynamical phenomena for horospherical actions appear on quotients by non-geometrically finite lattices: we give examples of non-geometrically finite lattices where an escape of mass phenomenon occurs and where the orbital averages along a Følner sequence do not converge.
In the last part, as a by-product of our methods, we show that projections to Γ\T of the uniform distributions on large spheres in the tree T converge to a natural probability measure on Γ\T. Finally, we apply this equidistribution result to a lattice point counting problem to obtain counting asymptotics with exponential error term.


Introduction
Let T be a (d_1, d_2)-biregular tree with d_1, d_2 ≥ 3. Denote by Aut(T) the group of automorphisms acting without edge inversion. Let G be a non-compact, closed subgroup of Aut(T) acting transitively on the boundary ∂T of the tree. Let Γ ≤ G be a lattice and X = G/Γ.
This parallels the classical setting of homogeneous dynamics, where one studies the actions of certain subgroups on a quotient of a linear algebraic group by a lattice. These two worlds intersect, for example, when G = SL 2 (k), where k is a non-archimedean local field, in which case G naturally acts on the associated Bruhat-Tits tree. However, our geometric setting also comprises many groups G ≤ Aut(T ), including Aut(T ) itself, that are not linear [12].
We first focus on the homogeneous space X = G/Γ, where Γ is a geometrically finite lattice. The dynamics of the discrete geodesic flow on X was considered by Paulin in [41] and is related, among other things, to the theory of continued fractions in non-archimedean local fields. We recall that when G is linear, by works of Raghunathan and Lubotzky [35,43], any lattice therein is geometrically finite.
In our geometric setup, the role of Ad-unipotent subgroups in classical homogeneous dynamics is played by the horospherical subgroups G^0_η of G, for η ∈ ∂T. In the earlier work [13], the authors classified Borel probability measures on G/Γ invariant under the G^0_η-action for a large class of groups G and general lattices Γ, establishing an analogue of Dani's result in [15]. Moreover, it was shown that when Γ is geometrically finite, G^0_η-orbits are either compact or dense, as in the classical result of Hedlund [30] on the horocycle flow on finite volume hyperbolic surfaces.

Non-escape of mass
The horospherical group G^0_η is amenable and one can easily construct Følner sequences therein: let a ∈ G be a hyperbolic element that has η as its attracting fixed point on ∂T and let M be the compact subgroup of G^0_η that fixes pointwise the translation axis of a in T. Then for any M-invariant compact subset O with non-empty interior in G^0_η, the sequence (O_t := a^t O a^{−t})_{t∈N} constitutes a Følner sequence in G^0_η (see e.g. [13, Lemma 2.10]). In the sequel, we shall refer to such sequences O_t as good Følner sequences. Følner sequences allow one to average along larger and larger pieces of the orbits. For x ∈ X, we define ν_{x,t} = m_{O_t} * δ_x, where m_{O_t} is the normalized restriction of the Haar measure m_{G^0_η} to O_t; in other words, for f ∈ C_c(X),
$$\nu_{x,t}(f) = \frac{1}{m_{G^0_\eta}(O_t)} \int_{O_t} f(ux)\, dm_{G^0_\eta}(u).$$
The probability measures ν_{x,t} are called the orbital measures. In general, one can obtain qualitative information on the statistical behaviour of typical points x ∈ X by using the Howe-Moore property, established in our setting in [11], and the amenable ergodic theorem [34]. Our topological result in [13] says, however, that all points x ∈ X that do not lie in a compact G^0_η-orbit have dense orbits. Therefore, the immediate question arises whether every dense orbit equidistributes to the Haar measure on G/Γ. A first possible obstruction to this is the escape of mass phenomenon. Our first result states that this does not happen when Γ is a geometrically finite lattice.
Theorem A (Non-escape of mass) Let T be a (d_1, d_2)-biregular tree, with d_1, d_2 ≥ 3, and G a non-compact, closed subgroup of Aut(T) acting transitively on ∂T. Let Γ be a geometrically finite lattice in G, η ∈ ∂T and O_t a good Følner sequence in G^0_η. Then, for every ε > 0, there exists a compact set K = K(ε) ⊂ X such that for every x ∈ X not contained in a compact G^0_η-orbit, there exists a positive integer N = N(x, ε) with the property that for every t ≥ N, we have ν_{x,t}(K) > 1 − ε.
(1.1) the homogeneous orbit. This orbit supports a unique G 0 η -invariant measure and, hence, ν x,t equidistribute to the homogeneous measure supported on the orbit closure.
Under the additional topological simplicity assumption on G, our second result yields a complete qualitative description of the statistical behaviour of every x ∈ X not contained in a compact G^0_η-orbit: for such x ∈ X, it identifies the limit of ν_{x,t} as the Haar measure.
Theorem B (Equidistribution) Let T be a (d_1, d_2)-biregular tree, with d_1, d_2 ≥ 3, and G a non-compact, closed, topologically simple subgroup of Aut(T) acting transitively on ∂T. Let Γ be a geometrically finite lattice in G and O_t be a good Følner sequence in G^0_η. Assume x ∈ X does not belong to a compact G^0_η-orbit. Then, the orbital measures ν_{x,t} equidistribute to the normalized Haar measure m_X as t → ∞; in other words, for every f ∈ C_c(X), we have
$$\lim_{t \to \infty} \int_X f \, d\nu_{x,t} = \int_X f \, dm_X.$$
The previous theorem has the following immediate consequence on the statistical behaviour of G^0_η-orbits. Let L be a closed subgroup of G. A probability measure μ on X is called L-homogeneous if it is the unique L-invariant probability measure on a closed L-orbit. It is said to be homogeneous if it is L-homogeneous for some closed subgroup L < G.
A point x ∈ X is called generic for G 0 η (see [47,Definition 1]) if for some (equivalently for any) good Følner sequence O t , the sequence ν x,t of orbital measures equidistributes to a homogeneous measure.

Corollary 1.3 Keep the hypotheses of Theorem B. Any x ∈ X is generic for G^0_η.
In the context of unipotent flows on SL_2(R)/Γ, this result goes back to Dani-Smillie [17]. Since then, Ratner [46,48], Shah [51] and others have obtained very general results in Lie groups or algebraic groups over local fields of characteristic zero, but even in the case of a semisimple linear group G of rank one over a local field of positive characteristic, e.g. SL_2(k) with k = F_q((X^{−1})), this result does not appear in the literature. However, we remark that for arithmetic quotients of linear groups, one may deduce such an equidistribution result by combining the work of Mohammadi [38] and the result mentioned by Ghosh in [29, page 467]. In the linear setting, the previous results have the following immediate consequence:
Corollary 1.4 Keep the hypotheses and the notation of Corollary 1.1. The statement of Corollary 1.3 holds when G^0_η is replaced with the subgroup MU of H.
For example, for H = SL 2 (F q ((X −1 ))), one can take to be the non-uniform lattice SL 2 (F q [X ]) and the groups M and U to be We remark that for uniform lattices, one can use Margulis' orbit thickening argument to show that U -action is uniquely ergodic (see Mohammadi [38], Ellis-Perrizo [22] or [13,Lemma 6.3]). It is also worth noting that for non-uniform quotients, using our geometric approach, one can show the version of the previous corollary for the U -action (instead of MU ). Finally, we mention the work of Vatsal [56] in which the equidistribution results of Ratner [47,48] for unipotent dynamics in the p-adic case were applicable with a geometric approach similar to ours (see [56,). Regarding the proof of Theorem B, it is proven by using Theorem A, the classification of G 0 η -orbits, given in [13] and the Howe-Moore property established in [11].

New non-linear homogeneous dynamical phenomena
So far, the results obtained in Theorems A and B for geometrically finite lattices parallel the more classical results in linear homogeneous dynamics. However, the family of tree lattices is very rich and, as opposed to the linear setting, there exist many non-geometrically finite lattices. These exhibit wilder behaviors than their linear counterparts giving rise to several interesting phenomena that do not appear in the classical setting. Various aspects of these differences, as well as analogies, were studied by many, including Serre [50], Tits [55], Bass-Kulkarni [2], Burger-Mozes [10,11], Lubotzky [35], Bass-Lubotzky [3], Paulin [42], Bekka-Lubotzky [5] etc. The following results add a new dynamical aspect to these nonlinear phenomena showing that horospherical orbits on quotients by non-geometrically finite lattices can exhibit escape of mass, which does not occur in homogeneous dynamics in the linear setting.
Theorem C (Escape of mass) For any q ≥ 2, there exist a lattice Γ in G = Aut(T_{2q+2}) and η ∈ ∂T_{2q+2} such that for the trivial coset x = eΓ ∈ X, any compact K ⊂ X and any good Følner sequence O_t, we have
$$\lim_{t \to \infty} \nu_{x,t}(K) = 0.$$
Recall that in the setting of unipotent dynamics on linear homogeneous spaces, by now classical results of Ratner [45,46,48], Mozes, Shah [39,51] and others show that the orbital averages along unipotent group actions always converge towards an invariant probability measure. The following result contrasts with the classical situation by giving an example where we see not only an escape of mass phenomenon, but also a failure of convergence of the orbital averages along Følner sequences.
Theorem D (Escape of mass and equidistribution) There exists a non-uniform lattice Γ < Aut(T_6) with the property that for any η ∈ ∂T there exist points x ∈ X = Aut(T_6)/Γ such that for any good Følner sequence (O_t)_{t∈N} in G^0_η, the set of accumulation points of the sequence of orbital averages ν_{x,t} contains the zero measure and m_X.
The proof of this theorem is carried out in Sect. 5 and consists of several parts. In fact, it yields an uncountable number of non-isomorphic such lattices in Aut(T 6 ). The construction of these lattices has a similar flavor as the constructions of Bass-Lubotzky in [3] to show that there are lattices of arbitrarily small covolumes in Aut(T ). Once the candidate lattices are constructed, the escape of mass phenomenon is proven by exploiting further the aforementioned connection between the Markov chain theory and distributions of horospherical orbits. This step uses the relatively finer ingredient of subgaussian concentration estimates for geometrically ergodic Markov chains (see e.g. Dedecker-Gouëzel [18]). Finally, the proofs of the uniqueness of the G 0 η -invariant probability measure and the equidistribution along some orbital averages rely, among others, on the mixing of the discrete geodesic flow and the positive recurrence of the associated Markov chain.

Equidistribution of spheres
To describe the general problem that we study here, consider a morphism of graphs π : T → Q, where T is a biregular tree. For a vertex ṽ ∈ VT, let S(ṽ, n) be the set of vertices of T at distance n from ṽ. Let ρ_n be the uniform distribution on S(ṽ, n). We are interested in the distributions π_*ρ_n on VQ: do they have a limiting distribution and, if yes, can one identify it?
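To make the spheres S(ṽ, n) concrete, here is a minimal Python sketch that computes their cardinality in a biregular tree, so that ρ_n gives mass 1/|S(ṽ, n)| to each vertex of the sphere. The function name `sphere_size` and its parameters are our own illustrative choices, not notation from the article; the only input used is that the root has degree `d_root`, its neighbours have degree `d_other`, and degrees alternate along any path.

```python
def sphere_size(d_root: int, d_other: int, n: int) -> int:
    """Number of vertices at distance n from a root of degree d_root
    in a (d_root, d_other)-biregular tree (illustrative helper)."""
    if n == 0:
        return 1
    size = d_root  # the sphere of radius 1 consists of all neighbours of the root
    for k in range(1, n):
        # Vertices at odd distance k have degree d_other, at even distance d_root.
        # Each has exactly one neighbour closer to the root, so it contributes
        # (degree - 1) vertices to the sphere of radius k + 1.
        degree = d_other if k % 2 == 1 else d_root
        size *= degree - 1
    return size


# Sphere sizes around a degree-3 vertex of the (3, 4)-biregular tree.
sizes = [sphere_size(3, 4, n) for n in range(5)]
```

The exponential growth of these cardinalities is the reason why statements about π_*ρ_n are naturally phrased in terms of a Markov chain on the quotient rather than by enumerating the sphere.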
Questions about equidistribution of spheres are well-studied in many homogeneous quotients: Euclidean spheres in R^d/Z^d in [44] or hyperbolic spheres in quotients of hyperbolic space H^d/Γ, where Γ is a lattice in SO(d, 1) (see [6, Theorem 3.3], [21,25,44,52] for more general results with applications to various counting problems). In the following result, we answer such a question for the natural quotient Q of the tree associated to the Γ-action, where Γ is a general lattice in Aut(T).
Theorem E (Equidistribution of spheres in quotients by tree lattices) Let T be a biregular tree and Γ ≤ Aut(T) a tree lattice. Denote Q = Γ\T.
1. (Non-escape of mass) For any ε > 0, there exists a finite subset K ⊂ VQ such that for all n ∈ N we have π_*ρ_n(K) > 1 − ε.
2. (Limiting distribution) There exist an integer p and limiting probability distributions μ_0, ..., μ_{p−1} on VQ such that for all ṽ ∈ VT and for all 0 ≤ j < p we have π_*ρ_{pn+j} → μ_j as n → ∞.

3. (Exponential convergence) If, in addition, Γ is geometrically finite, we can take p = 2 and there exists r > 1 such that
$$\| \pi_* \rho_{2n+j} - \mu_j \| = O(r^{-n}) \qquad (j = 0, 1),$$
where ‖·‖ denotes the total variation norm.
In the geometrically finite case (3), the measures (μ_j)_{j=0,1} coincide with the projections of the Haar measure m_X by the natural map proj : Aut(T)/Γ → VQ with respect to two different base points. The exponential rate of convergence 1/r in this result can be made effective, using the effective version of the geometric ergodic theorem for Markov chains as in [4].
The proof of the previous result relies on the tools we develop to prove Theorem A. Indeed, the Markov chain that we construct to track the statistical behaviour of horospherical averages easily allows one to understand the spherical averages provided one proves a (positive) geometric recurrence property for (non-) geometrically finite lattice quotients. This is carried out in Sect. 6. To draw an analogy, the overall proof can be seen to parallel, in considerably simpler fashion, the deduction of Theorem [24,Theorem 4.4] from Theorem 4.1 in that work.

Remark 1.5 (Diophantine exponent vs. speed of equidistribution)
In fact, in the geometrically finite case, using the geometric recurrence of the associated Markov chain (Lemma 6.7), one can show the version of the equidistribution in Theorem B on the quotient V Q additionally with a speed as in (3) above. The equidistribution itself directly follows by projecting the measures m O t and m X in Theorem B by the map proj. The speed of equidistribution depends on a geometric diophantine exponent (see e.g. [53, (1.6)] and [26,42]) of the boundary point g −1 η where x = g . From this perspective, Theorem E (3) can also be seen as a particular case based on the fact of hyperbolic geometry that large circles are well-approximated by horocycles [24, p.116] (see also Remark 6.8).

Counting lattice points
Another classical question closely related to the equidistribution of spheres is the problem of counting lattice points. To describe the general problem, consider a lattice (or more generally a discrete subgroup) Γ in some locally compact topological group endowed with a non-negative functional ‖·‖. One is interested in describing the asymptotics of
$$N(R) := \#\{\gamma \in \Gamma : \|\gamma\| \le R\}$$
as R → ∞. This problem goes back to Gauss, who was interested in the case Z^d ≤ R^d with the Euclidean norm as the functional ‖·‖. This particular problem is known as the Gauss circle problem and the sharp error rates are still unknown. For Γ ≤ SL_2(R), one can take ‖·‖ to be the operator norm induced by the Euclidean norm on R^2, in which case we have ‖g‖ = exp(½ d_{H^2}(g·i, i)). This was already studied by Delsarte [19] in the 1940s, who obtained the first non-Euclidean counting results. In the same setting, the lattice point counting problem is also closely related to the counting of closed geodesics on hyperbolic surfaces. For an extensive historical survey and overview of the methods used, we refer to [28], where the authors also develop spectral techniques to study the lattice point counting problem in large generality.
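For orientation, the classical instance of the counting problem can be explored numerically. The following Python sketch counts the points of Z^2 of Euclidean norm at most R by brute force and compares the count with the area πR² of the disc. The function name `lattice_count` is hypothetical, and the sketch only illustrates the planar Gauss problem, not the tree setting of this article.

```python
import math


def lattice_count(R: int) -> int:
    """Number of integer points (x, y) with x^2 + y^2 <= R^2 (brute force)."""
    return sum(
        1
        for x in range(-R, R + 1)
        for y in range(-R, R + 1)
        if x * x + y * y <= R * R
    )


# Relative deviation of the count from the main term pi * R^2 for a few radii.
deviations = {
    R: abs(lattice_count(R) - math.pi * R * R) / (math.pi * R * R)
    for R in (5, 10, 20, 40)
}
```

In the Euclidean plane the main term is the area of the disc and the difficulty lies entirely in the error term, whereas on a tree both the main term and the error term grow exponentially in R.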
Coming back to our setting, in analogy with the real hyperbolic case, it is natural to consider the functional g = d(gõ,õ), whereõ ∈ V T is some basepoint and d the graph distance on the tree. Clearly, for a discrete , N (R) is finite and non-decreasing in R. The following result describes the growth asymptotics of N (R) with exponential error term for a geometrically finite lattice : Theorem F Let T be a biregular tree, ≤ Aut(T ) a geometrically finite tree lattice. Let m be an Haar measure on Aut(T ) and m X the induced finite measure on X = Aut(T )/ . Fix a basepointõ ∈ V T and for R ∈ N, let Denote by B T (R) the cardinality of the set of vertices at an even distance fromõ that is at most R. Then, there exists c ∈ (0, 1) such that We stress that unlike before, we do not normalize the measure m X to be a probability measure. We also remark that the main term m(Gõ) m X (X ) can alternatively be expressed as ( v∈V Q 1 | ∩Gṽ | ) −1 , where for every vertex v of Q = \T ,ṽ ∈ V T denotes a lift of v, Gṽ is the maximal compact subgroup of Aut(T ) fixingṽ and V Q denotes the set of vertices Q at even distance from π(õ). Finally, we note that Aut(T ) acts without edge inversion and this implies that for every g ∈ Aut(T ) andṽ ∈ V T , d(gṽ,ṽ) ∈ N is an even number. This is the reason why, in the previous statement, we only consider vertices at even distance from each other.
We remark that this theorem also follows from the main result of Kwon in [32] and from the work of Roblin [49, Chapitre 4, Corollaire 2]. Our proof relies on our previous result on the equidistribution of spheres (Theorem E) and is a relatively straightforward consequence thereof. An exponential error rate c ∈ (0, 1) can also be effectively calculated.
The article is organized as follows. We recall some preliminary material mostly on lattices in groups acting on trees and set our notation in Sect. 2. In Sect. 3, we associate a natural Markov chain to an edge-indexed graph, study its properties and use these to prove Theorem A for geometrically finite lattices. In Sect. 4, we prove Theorem B. Theorems C and D are proven in Sect. 5. In Sect. 6, we introduce an auxiliary Markov chain and use this to study the edge-indexed graph associated to a general lattice and prove Theorems E and F.

Basic notation
We denote by T a (d_1, d_2)-biregular tree, with d_1, d_2 ≥ 3, by VT its set of vertices and by ET its set of edges. All edges are directed and ∂_0, ∂_1 : ET → VT are, respectively, the initial and the terminal vertex maps. An (ordered) pair of edges e_1, e_2 is called consecutive if ∂_1(e_1) = ∂_0(e_2). A sequence of consecutive edges e_1, ..., e_n is called a path of length n. We also refer to it as a path between ∂_0(e_1) and ∂_1(e_n). The distance d(·, ·) between two vertices of the graph is defined as the minimal length of a path between these vertices.
We denote by Aut(T) the group of tree automorphisms acting without edge inversion, i.e. the group of automorphisms g such that d(gv, v) ≡ 0 (mod 2) for one (equivalently every) vertex v ∈ VT. When d_1 = d_2, this is an index two subgroup of the full group of automorphisms. Endowed with the topology of pointwise convergence, it is a locally compact, second countable group. In this article G always stands for a non-compact, closed subgroup of Aut(T) that acts transitively on the boundary ∂T of T.
Throughout the rest of the article, we fix a basepointõ ∈ V T and a distinguished end η ∈ ∂ T , and denote by (y 0 , y 1 , y 2 , ...) the vertices of the infinite path converging to η, where y 0 =õ.
For a subset S ⊂ T , and a subgroup H < Aut(T ), H S denotes the pointwise stabilizer of S in H . Given η ∈ ∂ T , we define The group G 0 η is called the horospherical subgroup (see [13, Section 2] for more details on horospherical subgroups). It is a closed and amenable subgroup of G and as mentioned in the introduction, one can construct many good Følner sequences in G 0 η . The following sequence of compact open subgroups of G 0 η yields a good and tempered Følner sequence that is particularly convenient for our geometric approach. For t ∈ N, we set In fact, as we shall see, thanks to the structure of good Følner sequences, it will be sufficient to prove our results only for the sequence F t . Denote by m G and m G 0 η the Haar measures on G and G 0 η , respectively. By m F t we denote the Haar probability measure on F t which clearly coincides with the normalized restriction of m G 0 η to F t .

Lattices and their associated edge-indexed graphs
It is well-known that a subgroup Γ ≤ G is discrete if and only if all vertex stabilizers Γ_v for v ∈ VT are finite. A discrete subgroup Γ ≤ G is called a lattice if X = G/Γ admits a G-invariant Borel probability measure, in which case we denote this measure by m_X. By our standing assumption of boundary transitivity of G, the quotient graph G\T has two vertices. Indeed, by [10, Lemma 3.1.1], G acts two-transitively on ∂T, which in turn implies that G has precisely two orbits on VT. Moreover, since G acts without edge inversions, it acts transitively on the set of vertices at even distance. In this case, Γ is a lattice in G if and only if it is a lattice in Aut(T). Therefore, all lattices we will consider are tree lattices, i.e. lattices in Aut(T). For convenience, we will often call them lattices without specifying the ambient group. We refer to [3] for more details on tree lattices and edge-indexed graphs.
Given a discrete subgroup Γ, there is a useful construction [2] of a graph Q and a map ind : EQ → N as follows: the graph Q is the quotient graph Γ\T, which is well-defined since Γ acts without edge inversion. Denote by π : T → Q the projection map. The index map ind : EQ → N is given by ind(e) = [Γ_{∂_0(ẽ)} : Γ_ẽ], where ẽ ∈ ET is any edge with π(ẽ) = e. This clearly does not depend on the choice of the lift ẽ. The pair (Q, ind) is called the edge-indexed graph associated to Γ < Aut(T).
For v ∈ VQ, we define deg(v) to be the valency of any of its lifts ṽ. By definition of the map ind,
$$\sum_{e \in EQ,\ \partial_0(e) = v} \operatorname{ind}(e) = \deg(v). \qquad (2.1)$$
where (e_1, ..., e_n) is a path from u to v. For an edge-indexed graph (Q, ind) associated with a discrete subgroup Γ, the value of N_u(v) does not depend on the choice of the path.
where d(., .) denotes the graph distance on Q. We shall refer to this quantity as the volume of the edge-indexed graph (Q, ind) based at o. We also remark that changing the base point from o to o has the effect of multiplying the previous sum by the rational number (o ) (o) , therefore does not affect its finiteness.
Conversely, one can define an abstract edge-indexed graph (Q, ind) as a tuple consisting of a graph Q and a map ind : EQ → N. Under natural assumptions on the associated maps and N as above, there exists a discrete subgroup Γ whose associated edge-indexed graph coincides with (Q, ind) and the function N is proportional to v → |Γ_ṽ|, where ṽ is any lift of v (see [3, page 23] or [2]).
For a discrete group ≤ G, we define the projection map proj : The map proj is clearly continuous and has compact fibers in G/ : for each v ∈ V Q and g ∈ G such that proj(g ) = v, we have proj −1 (v) = Gõg . Moreover, the measure of each fiber is In other words, using the Definition 2.2 of the map N o , we have

Geometrically finite lattices
Following [3,50], we define a Nagao ray to be an edge-indexed graph (Q, ind) whose underlying graph Q is an infinite ray and whose map ind takes value 1 on all edges directed towards infinity, except the edge emanating from the vertex o at the origin. All edges e directed away from infinity are indexed by deg(∂_0(e)) − 1, as forced by (2.1) at the initial vertex of e. Here, an edge e ∈ EQ is said to be directed towards infinity if d(∂_1(e), o) > d(∂_0(e), o), and directed away from infinity otherwise. See Fig. 1 for an example of a Nagao ray in a (q_1 + 1, q_2 + 1)-biregular tree. An open Nagao ray is obtained by removing the origin vertex from a Nagao ray. Following Paulin [42], a tree lattice Γ is called geometrically finite if its associated edge-indexed graph (Q, ind) contains a finite subgraph F whose set-theoretic complement in Q is a disjoint union of finitely many open Nagao rays. The finite part of (Q, ind) is the smallest non-empty finite subgraph F with this property. When T is a (q + 1)-regular tree, a tree lattice Γ is said to be of Nagao type if the associated edge-indexed graph is a Nagao ray (see [3, Chapter 10]). Fig. 2 illustrates the corresponding edge-indexed graph. Another example of a geometrically finite lattice, where T is a (3, 10)-biregular tree, is given in Fig. 3.
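The index pattern of a Nagao ray can be checked mechanically. The Python sketch below builds the indices on an initial segment of a Nagao ray in a (q + 1)-regular tree and verifies that the indices of the edges leaving each vertex add up to the valency q + 1. We assume here that this degree identity is the content of (2.1), which is how the identity is used later in the text. The vertex labels 0, 1, 2, ... and the names `nagao_indices` and `index_sum` are hypothetical encodings chosen for illustration.

```python
def nagao_indices(q: int, length: int) -> dict:
    """Indices on the first `length` pairs of directed edges of a Nagao ray
    in a (q+1)-regular tree. Vertices are 0, 1, 2, ... with 0 the origin;
    a directed edge is encoded as a pair (initial vertex, terminal vertex)."""
    ind = {}
    for k in range(length):
        # Edge directed towards infinity: index 1, except at the origin,
        # where the single outgoing edge carries the full valency q + 1.
        ind[(k, k + 1)] = q + 1 if k == 0 else 1
        # Edge directed away from infinity: index deg - 1 = q.
        ind[(k + 1, k)] = q
    return ind


def index_sum(ind: dict, v: int) -> int:
    """Sum of the indices of all edges with initial vertex v."""
    return sum(i for (start, _end), i in ind.items() if start == v)


ray = nagao_indices(q=4, length=10)
# Assumed degree identity, checked at every vertex whose outgoing edges are
# all present in the truncated ray (the last vertex is cut off).
degree_identity_holds = all(index_sum(ray, v) == 5 for v in range(10))
```

In the regular case the valencies of the two endpoints of an edge agree, so the check does not depend on which endpoint is used in the formula for the index of an edge directed away from infinity.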
Proposition 2.1 Let Γ ≤ G be a geometrically finite lattice. Let g ∈ G be such that the G^0_η-orbit of gΓ is not compact in G/Γ. Let F denote the finite part of Q = Γ\T. Then π(g^{−1}y_t) belongs to F for infinitely many values of t; in particular, t − d(π(g^{−1}y_t), F) is monotone non-decreasing and unbounded.

Markov chains
We recall some terminology and basic facts of the theory of Markov chains and set our notation. For more details, we refer the reader to [20,33,37].
Let S be a countable set and P : S × S → [0, 1] a Markov kernel, i.e. Σ_{y∈S} P(x, y) = 1 for every x ∈ S. By (standard) abuse of notation, we shall also denote the associated Markov operator and its dual by P: for a function f on S, Pf(x) = Σ_{y∈S} f(y)P(x, y), and for a measure μ on S, μP(·) = Σ_{x∈S} μ(x)P(x, ·). For n ∈ N, P^n denotes the n-th convolution power of P. For s ∈ S, we denote by δ_s the probability measure supported on {s}; for s_1, s_2 ∈ S, we write P^n(s_1, s_2) := δ_{s_1}P^n(s_2).
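The three operations just introduced (P acting on functions, P acting on measures, and the entries of P^n) can be sketched for a kernel with finitely supported rows as follows. The dictionary-of-dictionaries encoding and the function names are our own illustrative choices; exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction


def apply_to_function(P: dict, f: dict) -> dict:
    """(Pf)(x) = sum over y of f(y) P(x, y)."""
    return {x: sum(p * f[y] for y, p in row.items()) for x, row in P.items()}


def apply_to_measure(mu: dict, P: dict) -> dict:
    """(mu P)(y) = sum over x of mu(x) P(x, y)."""
    out = {}
    for x, mass in mu.items():
        for y, p in P[x].items():
            out[y] = out.get(y, Fraction(0)) + mass * p
    return out


def power_entry(P: dict, n: int, s1, s2) -> Fraction:
    """P^n(s1, s2), computed as (delta_{s1} P^n)(s2)."""
    mu = {s1: Fraction(1)}
    for _ in range(n):
        mu = apply_to_measure(mu, P)
    return mu.get(s2, Fraction(0))


# A toy kernel on the two-point state space {'a', 'b'}.
P = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"a": Fraction(1, 4), "b": Fraction(3, 4)},
}
two_step_return = power_entry(P, 2, "a", "a")
```

Only the support of each row is stored, so the same functions apply to kernels on an infinite state space as long as each state has finitely many successors, as is the case for the chains considered below.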
The Markov kernel P is called irreducible if for every s, t ∈ S, there exists n ∈ N with P^n(s, t) > 0. The period of an irreducible Markov kernel P is defined as gcd{n ∈ N | P^n(s, s) > 0} for some (or equivalently all) s ∈ S. If the period is 1, the Markov chain is called aperiodic. Denoting the period by p, there exists a partition Ω_0, ..., Ω_{p−1} of the state space S into cyclic classes Ω_i such that for every s ∈ Ω_i, P(s, Ω_{i+1}) = 1, where the indices are taken modulo p. If P is irreducible and has period p, then P^p restricted to each cyclic class is irreducible and aperiodic. In a standard manner [20, Section 3.1], a Markov kernel yields a canonical Markov chain on the state space S. Therefore, we shall equivalently speak of a Markov chain being irreducible, aperiodic etc.
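As an illustration of the definition of the period, the following Python sketch computes the gcd of the return times to a state s up to a cutoff. It tracks only the supports of the measures δ_s P^n, which is all the definition requires. The cutoff `max_n` and the function names are assumptions of this sketch: the result is the gcd of the return times up to `max_n`, which agrees with the period once `max_n` is large enough for the chain at hand.

```python
from math import gcd


def return_times(P: dict, s, max_n: int) -> list:
    """All n in 1..max_n with P^n(s, s) > 0, using supports only."""
    support = {s}
    times = []
    for n in range(1, max_n + 1):
        support = {y for x in support for y, p in P[x].items() if p > 0}
        if s in support:
            times.append(n)
    return times


def period(P: dict, s, max_n: int = 50) -> int:
    """gcd of the return times to s up to max_n (0 if s is never revisited)."""
    g = 0
    for n in return_times(P, s, max_n):
        g = gcd(g, n)
    return g


# Simple random walk on the path 0 - 1 - 2: bipartite, hence period 2.
path_walk = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 1.0}}
# Adding a self-loop at 0 makes the chain aperiodic.
lazy_walk = {0: {0: 0.5, 1: 0.5}, 1: {0: 1.0}}
periods = (period(path_walk, 0), period(lazy_walk, 0))
```

For the bipartite walk the cyclic classes are {0, 2} and {1}, and two steps of the walk preserve each class.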
A non-negative measure μ on S is said to be stationary for the Markov kernel P if μP = μ. An irreducible Markov kernel P is called positive recurrent if it admits a stationary probability measure μ, in which case this measure is unique. If, moreover, P has period p, then μ = (1/p) Σ_{i=0}^{p−1} μ_i, where μ_i is the unique P^p-stationary probability measure on the i-th cyclic class. For an irreducible aperiodic positive recurrent Markov chain and any initial distribution μ, μP^n converges to the stationary probability measure as n → ∞. In the case of an irreducible Markov chain that is not positive recurrent, μP^n(s) converges to 0 for every s ∈ S, regardless of the period.
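The convergence of μP^n to the stationary measure can be observed exactly on a toy example. The Python sketch below uses a two-state irreducible aperiodic kernel (our own choice, unrelated to the chains of this article) and measures the distance to the stationary measure in total variation, taken here to be half of the ℓ¹ distance; this normalization is an assumption of the sketch.

```python
from fractions import Fraction

# Toy kernel on {0, 1} and its stationary probability measure.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
stationary = [Fraction(1, 3), Fraction(2, 3)]


def push_forward(mu: list, P: list) -> list:
    """One step of the dual operator: (mu P)(y) = sum_x mu(x) P(x, y)."""
    return [sum(mu[x] * P[x][y] for x in range(len(mu))) for y in range(len(P[0]))]


def total_variation(mu: list, nu: list) -> Fraction:
    """Half of the l^1 distance between two measures on the same finite set."""
    return sum(abs(a - b) for a, b in zip(mu, nu)) / 2


def distance_after(n: int) -> Fraction:
    """Total variation distance between delta_0 P^n and the stationary measure."""
    mu = [Fraction(1), Fraction(0)]
    for _ in range(n):
        mu = push_forward(mu, P)
    return total_variation(mu, stationary)


distances = [distance_after(n) for n in range(6)]
```

In this example the distance decays geometrically with ratio 1/4, the second eigenvalue of the kernel, which is the simplest instance of the exponential convergence appearing in the geometrically finite case.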

Non-escape of mass
The aim of this section is to prove Theorem A. We start by associating a Markov chain with a tree lattice Γ, study its properties and eventually link the Markov chain to the study of orbital measures of horospherical subgroups in G/Γ. If Γ is a uniform lattice, there is nothing to prove in Theorem A, so throughout the proof, Γ is assumed to be non-uniform.

The Markov chain
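Let Γ be a tree lattice and (Q, ind) be the corresponding edge-indexed graph. Define the Markov chain M_n with state space EQ and transition probabilities given by
$$P(e, f) = \begin{cases} \dfrac{\operatorname{ind}(f)}{\deg(\partial_0(f)) - 1} & \text{if } \partial_1(e) = \partial_0(f) \text{ and } f \neq \bar{e}, \\[2mm] \dfrac{\operatorname{ind}(f) - 1}{\deg(\partial_0(f)) - 1} & \text{if } f = \bar{e}, \\[2mm] 0 & \text{otherwise,} \end{cases}$$
where ē denotes the edge e with reversed orientation. Note that by (2.1) the transition probabilities sum to 1, so that P is a Markov kernel. As the subsequent proofs will show, we are naturally led to the study of the Markov chain M_n, which can simply be seen as the image under the quotient map π of the simple (non-backtracking) random walk on the set of edges of the tree T. It came to our knowledge that this Markov chain was considered by Burger and Mozes [9] in the study of the notion of divergence groups in Aut(T) and by Kwon [32] in the study of mixing properties of the discrete geodesic flow.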
Let us illustrate the structure of this Markov chain, as well as our subsequent use of it, in a simple but important situation, namely when Γ is a lattice of Nagao type.

Example 3.1 Let Γ be a Nagao lattice (see Fig. 2 for the corresponding edge-indexed graph). In this case, the above construction of the Markov chain gives rise to a state space and transition probabilities as illustrated in Fig. 4.
Consider a random trajectory of this Markov chain on its state space as depicted in the previous figure. The key phenomenon for us in this example is that once the trajectory turns toward the finite part (here, this corresponds to the edges facing left or up), it must deterministically walk all the way toward the finite part without a chance to turn around. This feature entails very strong recurrence properties which will allow us to control hitting times and, eventually, deduce convergence of the Markov chain to the stationary measure (up to issues of periodicity) even with moving starting point. The latter property is crucial for Theorem A.
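The deterministic return to the finite part described above can be made explicit with a small Python sketch of the chain on a Nagao ray in a (q + 1)-regular tree. We stress the assumptions: the transition rule used here is the projection of the non-backtracking walk on the edges of T, namely weight ind(f) for a successor f different from the reversed edge and weight ind(f) − 1 for the reversed edge, normalized by q. This rule is our reading of the construction and is consistent with the description of the example. The encoding of an edge as a pair (k, '+') or (k, '-'), joining the vertices k and k + 1 and oriented toward the cusp or toward the origin respectively, is hypothetical.

```python
from fractions import Fraction


def ind(edge, q: int) -> int:
    """Index of a directed edge of the Nagao ray in a (q+1)-regular tree."""
    k, direction = edge
    if direction == "+":            # oriented toward the cusp
        return q + 1 if k == 0 else 1
    return q                        # oriented toward the origin


def reverse(edge):
    k, direction = edge
    return (k, "-" if direction == "+" else "+")


def terminal_vertex(edge) -> int:
    k, direction = edge
    return k + 1 if direction == "+" else k


def outgoing(v: int) -> list:
    """Directed edges of the ray with initial vertex v."""
    edges = [(v, "+")]
    if v >= 1:
        edges.append((v - 1, "-"))
    return edges


def transitions(edge, q: int) -> dict:
    """Assumed transition probabilities out of `edge` (zero entries omitted)."""
    probs = {}
    for f in outgoing(terminal_vertex(edge)):
        weight = ind(f, q) - (1 if f == reverse(edge) else 0)
        if weight > 0:
            probs[f] = Fraction(weight, q)   # every vertex has valency q + 1
    return probs


toward_cusp = transitions((3, "+"), q=2)     # two possible successors
toward_origin = transitions((3, "-"), q=2)   # a single, forced successor
```

Under this rule an edge oriented toward the origin has a unique successor, again oriented toward the origin, until the origin is reached; an edge oriented toward the cusp continues toward the cusp with probability 1/q and turns around with probability (q − 1)/q.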

Basic properties
Lemma 3.2 Let Γ be a tree lattice. The associated Markov chain M_n is irreducible.
Proof Since the graph Q is connected, it is sufficient to show that for any two edges e, f ∈ EQ such that ∂_1(e) = ∂_0(f), we have P^n(e, f) > 0 for some n ≥ 1. If f ≠ ē, this holds for n = 1 by definition of P.
We show the existence of a path as above by contradiction. Suppose for any nonbacktracking finite path starting at e we have ind(e n ) = 1. Such a path cannot end at a leaf, since then ind(e n ) = deg(∂ 0 e n ) = deg(∂ 1 e n ) > 2 by (2.1). Hence, we can extend it to produce an infinite non-backtracking path with ind(e i ) = 1 for all i ∈ N. In particular, N ∂ 0 (e) (e i ) ≤ 1 for all i, which contradicts the finiteness of the volume in (2.3).
In the case of geometrically finite lattices, we will prove positive recurrence of the associated Markov chain M n using Foster's drift criterion. Positive recurrence of M n in the setting of general tree lattices, which is required in the proof of Theorems C and E, is shown in Proposition 6.4 with a slightly more elaborate proof.
Assume is a geometrically finite tree lattice, (Q, ind) its associated edge-index graph and F the finite part of Q. For e ∈ E Q, we use the notation |e| := d(∂ 1 (e), F) to indicate the distance between an edge and the finite part F. For e / ∈ F, we say that e is oriented toward the finite part if d(∂ 1 (e), F) < d(∂ 0 (e), F), and oriented toward the cusp otherwise. In the case d 1 = d 2 = 3, a slightly different function V (which also works for the previous case) does the job: Let if e is oriented toward the finite part, 100(3/2) |e| otherwise.
We have A simple combinatorial observation allows us to show that when is geometrically finite, the period of the Markov chain M n is two. This is expressed in the following lemma: When is geometrically finite, one can simply take e, f to be two consecutive edges in a Nagao ray oriented toward the finite part so that the lemma applies.
Proof Let m − 1 be the length of a path from e toē along edges with positive transition probabilities. Since ind(e) > 1, P(e, e) > 0, hence there is a loop of length m with positive transition probabilities along all edges. On the other hand, after the previous loop, one can follow the path from e to e, continue to f , then to f and finally back to e. This is a loop of length m + 2. Hence, the period divides by m and m + 2, which forces it to be 1 or 2. On the other hand, since action on T preserves a partition into two sets of vertices (thanks to the assumption that Aut(T ) acts without edge inversion), hence the period cannot be 1, proving the claim.

Hitting time of the finite part
Let $F$ be the finite part of the graph $Q$. For the Markov chain $M_n$, we denote by $\tau$ the first hitting time of $F$, i.e. $\tau = \min\{n \in \mathbb{N} \mid \partial_1(M_n) \in F\}$. By positive recurrence, $\tau$ is finite almost surely. To deal with periodicity, define $\tilde\tau := \min\{n \in \mathbb{N} \mid n \ge \tau,\ 2 \mid n\}$.
We start with a lemma that controls the probabilities of long hitting times of the finite part.
Proof Clearly, the random walk starting at $e$ can never hit $F$ in fewer than $|e|$ steps. When $|e| = 0$, the claim is obvious. When $|e| > 0$, by definition of geometric finiteness, $\partial_1(e)$ belongs to some Nagao ray. Because of the structure of Nagao rays (see Example 3.1), a Markov trajectory starting at an edge oriented toward the finite part $F$ must necessarily take at least one step toward $F$. This can only change once the trajectory visits $F$. Hence, if $e$ is oriented toward the finite part, we deduce that $P_e(\tau = |e|) = 1$, matching the upper bound in the statement.
On the other hand, if $e$ is oriented toward the cusp, in order to avoid visiting $F$ in the first $i - 1$ steps, the walk must take at least $\frac{i - |e|}{2}$ steps toward the cusp, each with probability $q^{-1}$. This gives the bound in the lemma.
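The mechanism of this proof can be checked on a simplified model of a single Nagao ray. The following sketch makes several assumptions that are not stated at this point in the text: an edge oriented toward the cusp continues outward with probability $1/q$ and turns around with probability $(q-1)/q$, an edge oriented toward the finite part moves deterministically one step closer, and the chain is stopped as soon as $|e| = 0$. The function name `hitting_time_law` is hypothetical.

```python
from fractions import Fraction

def hitting_time_law(d, q, toward_cusp, horizon):
    """Exact law {i: P(tau = i)}, for i <= horizon, of the hitting time of
    the finite part in the model walk started at an edge with |e| = d."""
    if d == 0:
        return {0: Fraction(1)}
    # (distance |e|, oriented toward cusp?) -> probability not yet absorbed
    mass = {(d, toward_cusp): Fraction(1)}
    law = {}
    for step in range(1, horizon + 1):
        new_mass = {}
        for (k, outward), p in mass.items():
            if outward:
                moves = [((k + 1, True), Fraction(1, q)),
                         ((k - 1, False), Fraction(q - 1, q))]
            else:
                moves = [((k - 1, False), Fraction(1))]
            for state, w in moves:
                if state[0] == 0:          # the walk has reached F
                    law[step] = law.get(step, 0) + p * w
                else:
                    new_mass[state] = new_mass.get(state, 0) + p * w
        mass = new_mass
    return law

# Edge oriented toward the finite part: tau = |e| almost surely.
inward = hitting_time_law(5, 3, False, 10)

# Edge oriented toward the cusp with |e| = 3 and q = 4.
outward = hitting_time_law(3, 4, True, 40)
tail_9 = 1 - sum(p for i, p in outward.items() if i < 9)   # P(tau >= 9)
```

In this model the hitting time from a cusp-oriented edge is $|e| + 2k$, where $k$ is the number of outward steps, so that avoiding $F$ up to time $i - 1$ costs a factor $q^{-1}$ for each of the required outward steps.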

Convergence of the Markov chain with varying initial distribution
As before, let P be the Markov operator corresponding to M n . Let 0 , 1 ⊆ E Q be its cyclic classes and for j = 0, 1, denote by μ j the unique P 2 stationary probability measure on j .
The next lemma describes the convergence of the Markov chain with moving initial distributions. The condition on the initial distributions will be clear later on, as this convergence will play a crucial role in the proof of Theorem A.

Lemma 3.6 Let be a cyclic class of P and e(t) ⊂ be a sequence of edges in the same cyclic class, such that t
where $\|\cdot\|$ denotes the total variation norm (see e.g. [37, § D.1.2]).
In the proof, we control the distributions with non-constant starting points e(t) by studying the behaviour of the Markov chain conditioned on the hitting time of the finite part. This, together with the precise control on the hitting time as provided by Lemma 3.5, allows us to prove the required convergence.
Proof By conditioning the Markov chain on the hitting time τ (as defined in Sect. 3.1.2), we have Here, for every i ∈ 2N with P e(t) (τ = i) > 0, P e(t) (δ e(t) P 2n(t) ∈ ·|τ = i) denotes the probability measure on E Q given by It follows from strong Markov property that for 2n(t) ≥ i, we have where B (F, 2) is the set of all edges at distance ≤ 2 from F.
With this notation, splitting the right-hand side of (3.1) into three sums, we get that the left-hand side of (3.1) is bounded above by where we used (3.2) for the first two sums, and (1) for the third. We need to show that the above tends to 0 as $t \to \infty$.
By (3), the first sum is identically 0 and as t → ∞, the third sum tends to 0 by (4) and the fact that 2n(t) − |e(t)| tends to ∞.
We focus on the middle sum of (3.3), which after denoting N t = 2n(t)−|e(t)|, we rewrite as follows On the other hand, by (1), the first sum in (3.4) is bounded above by which converges to 0 as t → ∞ by (4). This concludes the proof.

Proof of Theorem A
We now link the Markov chain to the study of orbital measures of horospherical orbits and use the properties of $M_n$ to prove Theorem A. Before starting the proof, we remark that it suffices to prove the result only for the Følner sequence $F_t$. Indeed, let $O$ be an $M$-invariant compact subset with non-empty interior in $G^0_\eta$, let $a \in G$ be a hyperbolic element with attracting fixed point $\eta$ and of (minimal) translation distance 2, and let $O_t = a^t O a^{-t}$ be the associated good Følner sequence. It follows by compactness of $F_0$ and $O$ that for some $n_0 \in \mathbb{N}$ and every $t \in \mathbb{N}$, we have As a consequence, there exists $c \in (0, 1)$ such that for every $t \in \mathbb{N}$, the sequence $F_{2t} = a^t F_0 a^{-t}$ satisfies One easily sees from these inequalities that the orbital measures $\nu_{x,t}$ associated to $F_t$ have non-escape of mass if and only if those associated to $O_t$ have it.

Reduction to measures on the tree
For the rest of the section we fix $x = g\Gamma \in X$ with non-compact $G^0_\eta$-orbit. Recall that for $t \in \mathbb{N}$, $\nu_{x,t}$ denotes the probability measure on the orbit $F_t x$ obtained as the pushforward of the Haar probability measure on $F_t$ under the orbit map $u \mapsto ux$ for $u \in F_t$.
Denote by $\sigma_t$ the uniform probability measure on the finite set $g^{-1} F_t \tilde o \subset VT$. The following observation is the first step in reducing the proof of recurrence of horospherical orbits to studying recurrence properties of the Markov chain $M_n$ introduced earlier.

Lemma 3.7 For every $t \in \mathbb{N}^*$, we have
$\operatorname{proj}_* \nu_{x,t} = \pi_* \sigma_t$. (3.7)
Proof Recall that $x \in G/\Gamma$ is fixed and $g \in G$ is such that $x = g\Gamma$. Consider the map $f : G^0_\eta \to T$ given by $f(u) = g^{-1} u^{-1} \tilde o$. Denote by $O : u \mapsto ug\Gamma$ the orbit map. Then $\operatorname{proj} \circ O = \pi \circ f$, i.e. the corresponding diagram clearly commutes. By definition, $O_* m_{F_t} = \nu_{x,t}$ and hence it is enough to see that $f_* m_{F_t} = \sigma_t$. This is readily verified and we are done.

Further reduction to shadows and the Markov chain
Above, we related the orbital measures ν x,t to σ t -distributions on V T . The next lemmas will link σ t to the distributions of the Markov chain.
For $v \in VT$ and $n \in \mathbb{N}$, denote by $S(v, n)$ the set of vertices of $T$ at distance $n \ge 0$ from $v$. For $w$ a neighbor of $v$, let $S_w(v, n)$ be the subset of $S(v, n)$ consisting of vertices $z \in VT$ such that $d(z, w) < d(z, v)$. Thinking of $v$ as a light source at the center of the sphere, we call $S_w(v, n)$ the shadow of $w$ (see Fig. 5 for illustration). Denote by $\lambda_{(v,w),n}$ the uniform probability measure on the shadow $S_w(v, n)$.
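The following sketch enumerates shadows in a biregular tree directly, as a sanity check that the shadows of the different neighbors of $v$ partition the sphere into pieces of equal size. The function name `shadow_sizes` and the encoding of vertices as tuples of child indices are illustrative assumptions, not notation from the paper.

```python
def shadow_sizes(d_root, d_other, n):
    """Sizes of the shadows S_w(v, n), one per neighbour w of a vertex v of
    degree d_root in the (d_root, d_other)-biregular tree, for n >= 1."""
    sizes = []
    for w in range(d_root):
        level = [(w,)]                      # vertices at distance 1 behind w
        for depth in range(1, n):
            # vertices at odd distance from v have degree d_other,
            # vertices at even distance have degree d_root
            deg = d_other if depth % 2 == 1 else d_root
            level = [p + (c,) for p in level for c in range(deg - 1)]
        sizes.append(len(level))
    return sizes

sizes = shadow_sizes(3, 4, 3)   # (3, 4)-biregular tree, sphere of radius 3
sphere_size = sum(sizes)
```

Each shadow has the same cardinality, so the uniform measure on the sphere is the equal-weight average of the measures $\lambda_{(v,w),n}$ over the neighbors $w$ of $v$.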

Lemma 3.8 Let G be a non-compact closed subgroup of Aut(T ) that acts transitively on ∂ T .
For any t ∈ N * , we have where {ṽ i t } is the collection of vertices in T neighboring g −1 y t except g −1 y t+1 .
Proof The last equality directly results from the definition of the probability measure λ (v,w),t , therefore we focus on the first equality. Since all the shadows involved have the same cardinality, using the definitions of σ t and λ (v,w),t , the equality will follow if we show In other words, the set g −1 F tõ is the set of vertices on the sphere of radius t around g −1 y t except the shadow of g −1 y t+1 .
Since $g$ acts by isometry, it is enough to show this for $g = \mathrm{id}$. We clearly have To show the other inclusion, let $\xi_1, \xi_2 \in \partial T$ be such that $(\xi_i, \eta) \cap [y_0, \eta) \supset [y_t, \eta)$ for $i = 1, 2$. It clearly suffices to show that there exists a sequence $h_n \in F_t$ with $h_n \xi_1 \to \xi_2$ as $n \to \infty$. To see this, note that since $G$ is non-compact, closed and transitive on $\partial T$, by [10, Lemma 3.1.1] it acts doubly transitively on $\partial T$. Furthermore, since it is non-compact, it contains a hyperbolic element $a$ that, thanks to double transitivity, we can suppose to have attracting point $\eta$ and repelling point $\xi_1$ on $\partial T$. Similarly, up to conjugating $a$, let $b$ be a hyperbolic element with attracting fixed point $\eta$ and repelling fixed point $\xi_2$. The sequence $h_n = b^{-n} a^n$ does the job and this concludes the proof.
where deg(.) denotes the valency of the vertexõ. Denote by ρ n the uniform measure on the sphere S(õ, n). In the following lemma, we realize the probability measures π * λ (v,w),n and π * ρ n as the n th -step distribution of our Markov chain with appropriate initial distributions.
The fact that such a relation exists is not surprising as the Markov chain M n is obtained as a quotient of the simple random walk on the edges of the tree T .
To see the second claim, note that by construction of the Markov chain L n , we have Applying π * to both sides yields (3.10).
Remark 3.10 We remark here that the statements of Lemmas 3.7, 3.8 and 3.9 hold more generally for any lattice of $\operatorname{Aut}(T)$. Indeed, the proofs do not make use of the particular structure of a geometrically finite lattice.
Proposition 3.11 Let $x = g\Gamma$ with non-compact $G^0_\eta$-orbit. Then the set of weak-$*$ limit points of $\pi_* \sigma_t$ is $\{\partial_{1*} \mu_0, \partial_{1*} \mu_1\}$, where, for $i = 0, 1$, $\mu_i$ is the unique $P^2$-stationary probability measure on the $i$-th cyclic class of $P$, as before.

Remark 3.12
In this proposition, the measures $\pi_* \sigma_t$ depend on the point $x = g\Gamma$, but the set of limit points of $\pi_* \sigma_t$ does not.
Proof Combining Lemmas 3.8 and 3.9 and denoting $v^i_t := \pi(\tilde v^i_t)$, we have for any $t \in \mathbb{N}^*$ For a fixed $t \in \mathbb{N}$, the edges $(\pi(g^{-1} y_t), v^i_t)$ belong to the same cyclic class; denote its index by $j(t)$. Up to passing to a subsequence (i.e. considering even or odd $t$'s), which we also denote by $t$, we may assume that $j(t)$ is constant. For each $t$, choose one vertex $v(t) \in \{v^i_t\}$ and denote the edge $e(t) = (\pi(g^{-1} y_t), v(t))$. (3.12) Up to passing to a further subsequence of $t$'s, we may suppose that $\delta_{e(t)} P^{t-1}$ is supported in a single cyclic class. Therefore, for some $r \in \{0, 1\}$, every $t$ in this sequence can be written as $t - 1 = 2n(t) + r$, where $n(t) \in \mathbb{N}$. Thus we can write $\delta_{e(t)} P^{t-1} = \delta_{e(t)} P^{2n(t)} P^r$. (3.13) Now, since by Proposition 2.1 we have $t - |e(t)| \to \infty$, Lemma 3.6 applies and we deduce that $\|\delta_{e(t)} P^{2n(t)} - \mu_j\| \to 0$ as $t \to \infty$ for some $j \in \{0, 1\}$. Therefore, we have where $i = j + r \pmod 2$. This finishes the proof.

Proof of Theorem A
With the notation of this section, we want to show that for any $\varepsilon > 0$, there exists a compact set $K \subset G/\Gamma$ such that for every $x \in X$ with non-compact $G^0_\eta$-orbit Let $L \subset EQ$ be a finite set such that $\mu_j(L) > 1 - \varepsilon$ for $j \in \{0, 1\}$. The set $K = \operatorname{proj}^{-1}(\partial_1(L))$ is compact, since the map $\operatorname{proj} : G/\Gamma \to VQ$ has compact fibers. Using Lemma 3.7 and Proposition 3.11, we have for any and the claim follows.

Equidistribution
This section is devoted to the proof of Theorem B which we deduce from Theorem A and our previous work [13].
Fix a hyperbolic element $a \in G$ of translation length 2 with attracting fixed point $\eta$. Denote by $\eta^- \in \partial T$ the repelling point of $a$ and set M = G 0 η with non-empty interior. Let $O_t = a^t O a^{-t}$ be the associated good Følner sequence for $G^0_\eta$. As before, for $x \in X$, denote by $\nu_{x,t}$ the orbital measure $m_{O_t} * \delta_x$. Let $x \in X$ be such that the $G^0_\eta$-orbit of $x$ is not compact. By Theorem A, up to passing to a subsequence, we can suppose that $\nu_{x,t} \to m$ (4.1) for the weak-$*$ topology, where $m$ is a Borel probability measure on $X$. Furthermore, since $O_t$ is a Følner sequence, $m$ is $G^0_\eta$-invariant. We need to show that $m = m_X$. Recall that by [13, Theorem 1.6], there exist countably many closed $G^0_\eta$-orbits in $X$. These are all compact and for each cusp of $\Gamma$, there exists precisely a discrete one-parameter family of compact orbits. Denote by $k \in \mathbb{N}$ the number of cusps of $\Gamma$ and let $C_{i,j}$ be the collection of compact $G^0_\eta$-orbits, where $i = 1, \ldots, k$ and $j \in \mathbb{Z}$. By the same result, we have $a C_{i,j} = C_{i,j+1}$ and $a^{-\ell} C_{i,j}$ escapes to infinity as $\ell \to \infty$ in the sense that for any compact set $K$, we have $K \cap a^{-\ell} C_{i,j} = \emptyset$ for every $\ell$ large enough (see e.g. the proof of [13, Lemma 6.2]).
We first prove that $m(C_{i,j}) = 0$ for every $i = 1, \ldots, k$ and $j \in \mathbb{Z}$. For a contradiction, suppose $m(C_{i_0,j_0}) > 0$ for some $i_0, j_0$. Denote $\frac{1}{2} m(C_{i_0,j_0}) =: \varepsilon > 0$ and let $K = K(\varepsilon)$ be the compact subset of $X$ given by Theorem A. It follows by the latter result that we have $m(K) \ge 1 - \varepsilon$. (4.2) Choose an $\ell \in \mathbb{N}$ large enough so that $a^{-\ell} C_{i_0,j_0} \cap K = \emptyset$. Since $a^{-\ell} x$ does not lie on a compact $G^0_\eta$-orbit either, using Theorem A, by passing to a further subsequence in (4.1), we can suppose that $\nu_{a^{-\ell} x,t}$ also converges to a $G^0_\eta$-invariant probability measure that we denote by $m_{a^{-\ell}}$. As in (4.2), by Theorem A, we have $m_{a^{-\ell}}(K) \ge 1 - \varepsilon$.
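Using the relation $a^{-\ell} O_t a^{\ell} = O_{t-\ell}$, one verifies by a simple calculation that $m_{a^{-\ell}} = a^{-\ell}_* m$. Using this, we deduce $m_{a^{-\ell}}(a^{-\ell} C_{i_0,j_0}) = m(C_{i_0,j_0}) = 2\varepsilon$; since $a^{-\ell} C_{i_0,j_0}$ is disjoint from $K$, this gives $m_{a^{-\ell}}(K) \le 1 - 2\varepsilon$, a contradiction. Therefore, $m(C_{i,j}) = 0$ for all $i = 1, \ldots, k$ and $j \in \mathbb{Z}$.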
We mention that at this point, one could conclude the proof by appealing to the classification of ergodic $G^0_\eta$-invariant Borel probability measures [13, Theorem 1.1]. However, that result has extra hypotheses on $G$, namely Tits independence property and a certain transitivity condition. On the other hand, for a geometrically finite lattice $\Gamma$, it is possible to give a similar classification of ergodic $G^0_\eta$-invariant Borel probability measures on $G/\Gamma$ for a more general group $G$ as in Theorem B. We single this out in the next proposition which is essentially contained in [13].

Proposition 4.1 Let $T$ be a $(d_1, d_2)$-biregular tree, with $d_1, d_2 \ge 3$, and $G$ a non-compact, closed and topologically simple subgroup of $\operatorname{Aut}(T)$ acting transitively on $\partial T$. Let $\Gamma$ be a geometrically finite lattice in $G$ and $\eta \in \partial T$. Then, any $G^0_\eta$-invariant and ergodic Borel probability measure on $X = G/\Gamma$ is either $G^0_\eta$-homogeneous and compactly supported, or it is the Haar measure $m_X$.
To finish the proof of Theorem B, consider an ergodic decomposition of the G 0 η -invariant probability measure m. Since there are countably many closed G 0 η -orbits and each of them has zero measure with respect to m, the same holds for almost every ergodic component of m. Therefore by Proposition 4.1 almost every ergodic component of m is the Haar measure m X , hence m = m X completing the proof of Theorem B.
Proof of Proposition 4.1 We use the same notation introduced at the beginning of the proof of Theorem B, namely $a$ is a hyperbolic element with attracting fixed point $\eta \in \partial T$, and the group $M$ and the good Følner sequence $O_t$ are as defined there. Let $m_0$ be a $G^0_\eta$-invariant and ergodic probability measure on $X$. If $m_0$ gives positive mass to a compact $G^0_\eta$-orbit, then by ergodicity, it must be the homogeneous measure supported on that orbit. So let us suppose that $m_0$ gives zero mass to each compact $G^0_\eta$-orbit. By the pointwise ergodic theorem for amenable groups [34, Theorem 1.2], there exists a point $y \in X$ that is generic with respect to $m_0$ and the tempered Følner sequence $O_t$ (see [13, § 2.3]). By [13, Theorem 1.6], the $G^0_\eta$-orbit of $y$ is dense in $X$. Then, by [13, Lemma 6.2], there exist a compact set $K$ in $X$ and a sequence of integers $n_k \to \infty$ such that $a^{-n_k} y \in K$ for every $k \in \mathbb{N}$. For a function $\theta \in C_c(X)$, denote $\hat\theta(z) := \int_M \theta(mz)\,dm_M(m)$, where $m_M$ is the Haar probability measure on $M$. The function $\hat\theta$ is clearly $M$-invariant. Since $G$ is closed, transitive and topologically simple, it has the Howe-Moore property [11, Proposition 4.2] and in particular the action of the hyperbolic element $a$ on $X$ is mixing. Therefore we can apply [13, Lemma 6.3], where we can take $O^+$ to be $O_{t_0}$ for some $t_0 \in \mathbb{Z}$ small enough: for every $\theta \in C_c(X)$, we have On the other hand, by the choice of $y \in X$, the left-hand side above also converges to $\int \hat\theta(z)\,dm_0(z)$. It follows But since $m_0$ and $m_X$ are $G^0_\eta$-invariant, by Fubini's theorem, it follows that in other words, $m_0 = m_X$ as required.

Fig. 6 An edge-indexed graph $(Q, \operatorname{ind})$ of a non-geometrically finite lattice $\Gamma \le \operatorname{Aut}(T_{2q+2})$

Escape of mass phenomenon
This section contains the proofs of Theorems C and D. We start by proving an escape of mass result that implies Theorem C. Regarding the construction of a lattice $\Gamma < \operatorname{Aut}(T)$ that figures in the following result, we note that by [3, § 4.11, Example 1], for every $q \ge 2$, there exists a lattice $\Gamma \le G = \operatorname{Aut}(T_{2q+2})$ whose associated edge-indexed graph is as in Fig. 6. Clearly, this is not geometrically finite.

Theorem 5.1 Let $\Gamma$ be a tree lattice with associated edge-indexed graph $(Q, \operatorname{ind})$ as in Fig. 6. Let $x = e\Gamma \in X = G/\Gamma$ be the trivial coset. Let $\xi \in \partial T$ be the end corresponding to the sequence $\{\tilde x_i\}_{i \in \mathbb{N}}$ for some lifts $\tilde x_i$ of $x_i$ and $G^0_\xi$ be the corresponding horospherical subgroup. Then for any compact $K \subset X$, $\lim_{t \to \infty} \nu_{x,t}(K) = 0$.
Proof By (3.6), it clearly suffices to prove the statement for the orbital measures $\nu_{x,t}$ associated to the Følner sequence $F_t$. Let $\tilde o$ be a lift of the left-most vertex $x_0$ to $T = T_{2q+2}$. By Lemma 3.7 and the fact that proj has compact fibers, it is enough to show that if $\sigma_t$ is the uniform measure on $F_t \tilde o \subset VT$, then for any $x_l \in VQ$, we have $\pi_* \sigma_t(x_l) \to 0$.
The set $F_t \tilde o$ can be identified with all non-backtracking paths in $T$ of length $t$ that start at $\tilde x_t$ and do not contain $\tilde x_{t+1}$. A path from $\tilde x_t$ to a vertex $y \in F_t \tilde o$ with $\pi(y) = x_l$ projects to a path in $Q$ between $x_t$ and $x_l$. Note that the projection of such paths to $Q$ can only contain $x_0$ as endpoint. These observations will allow us to bound the number of such paths. Without loss of generality, assume that $t$ is even. For $l$ an even non-negative integer, we claim that the number of vertices in $F_t \tilde o$ that project to $x_l$ is bounded above by $\binom{t}{l/2}\, 2^{l/2}\, (2q)^{t - l/2}$. Indeed, any such path from $x_t$ to $x_l$ must take $t - l/2$ steps to the left and $l/2$ steps to the right in Fig. 6. The binomial coefficient counts the number of choices of when to take the right steps. Since the projection of the paths that we consider can only contain $x_0$ as endpoint, for any choice of such path, each edge taken to the right has at most 2 lifts to $T$, while each edge taken to the left has at most $2q$ lifts. Therefore, for any even $l \ge 0$

The rest of this section is devoted to the proof of Theorem D. Its proof consists of four parts. In the first part, we construct an uncountable family of lattices $\Gamma_\alpha$ in $\operatorname{Aut}(T_6)$. In the second part, thanks to an auxiliary Markov chain that we introduce, we obtain subgaussian concentration estimates on the Markov chain associated with the lattice $\Gamma_\alpha$ (see Sect. 3.1). In the third part, we show that the space $\operatorname{Aut}(T_6)/\Gamma_\alpha$ contains points $x$ which exhibit escape of mass along some subsequence of horospherical orbital averages and, as we show in the fourth part, equidistribute to the Haar measure along some other subsequences.
Proof of Theorem D First part: Construction of $\Gamma_\alpha$. For each $\alpha \in (1, 2)$, we will construct an edge-indexed graph $(Q_\alpha, \operatorname{ind})$ of finite volume, which will yield a lattice $\Gamma_\alpha \le \operatorname{Aut}(T_6)$. First, the underlying graph is a ray, with vertices $\{x_i\}_{i=0}^{\infty}$ and edges $e_i = (x_i, x_{i+1})$. Let $\alpha \in (1, 2)$ and for $i \ge 1$, let $n_i = \lfloor \alpha^i \rfloor$. We divide the vertices $(x_i)_{i \ge 1}$ of the ray into two types: $x_j$ is black if $j = i + n_1 + \cdots + n_i$ for some $i \ge 1$ and white otherwise. In other words, there are blocks of $n_i$ consecutive white vertices that are separated by single appearances of black vertices. A white vertex $x_j$ is said to belong to the $i$-th block if $i + n_1 + \cdots + n_i < j < i + 1 + n_1 + \cdots + n_{i+1}$.
We say that an edge e belongs to the i-th block if both ∂ 0 e and ∂ 1 e do.
We define the index map on E Q α : for i > 0, let ind(e i ) = 2 and ind(ē i−1 ) = 4 if x i is black and ind(e i ) = ind(ē i−1 ) = 3 if x i is white. Set ind(e 0 ) = 6. See Fig. 7 for illustration.
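The combinatorics of this construction can be sketched as follows. The rounding $n_i = \lfloor \alpha^i \rfloor$ used here is an assumption (the construction only needs integer block lengths growing like $\alpha^i$), and the helper names `block_lengths`, `black_indices` and `indices_at` are hypothetical.

```python
from math import floor

def block_lengths(alpha, blocks):
    """Block lengths n_1, ..., n_blocks, assuming n_i = floor(alpha**i)."""
    return [floor(alpha ** i) for i in range(1, blocks + 1)]

def black_indices(alpha, blocks):
    """Indices j = i + n_1 + ... + n_i of the first `blocks` black vertices."""
    n = block_lengths(alpha, blocks)
    return [i + sum(n[:i]) for i in range(1, blocks + 1)]

def indices_at(j, black):
    """Pair (ind(e_j), ind(reverse of e_{j-1})) at the vertex x_j, j >= 1."""
    return (2, 4) if j in black else (3, 3)

black = black_indices(1.5, 4)
# Number of white vertices strictly between consecutive black vertices.
gaps = [b2 - b1 - 1 for b1, b2 in zip(black, black[1:])]
# At every vertex x_j with j >= 1 the two indices add up to the degree 6.
index_sums = {sum(indices_at(j, black)) for j in range(1, black[-1] + 1)}
```

The last line reflects the fact that the indices of the two edges leaving $x_j$ must sum to 6, the degree of $T_6$; at $x_0$ this is ensured by $\operatorname{ind}(e_0) = 6$.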
One easily checks that $(Q_\alpha, \operatorname{ind})$ has bounded denominators [3, page 23] and, hence, has a faithful finite grouping which yields a discrete subgroup $\Gamma_\alpha$ in $\operatorname{Aut}(T_6)$. Moreover, $\operatorname{vol}(Q_\alpha) \le 1 + 4 \sum_{i \ge 1} (n_i + 1) \frac{1}{2^{i-1}} < \infty$ (see (2.3)), thus $\Gamma_\alpha$ is a lattice.

This auxiliary Markov chain records the behavior of $M_n$ along the edges within some block. It remembers the probabilities of an edge to turn around or to continue further in the same direction (see e.g. Fig. 4). Denote by $V_n$ the Markov chain associated to this auxiliary kernel. Given a word $u \in \{e, \bar e\}^n$ of length $n \ge 2$, for $s \in \{e, \bar e\}^2$ denote by $N_s(u)$ the number of occurrences of the word $s$ as a subword of $u$. For $n \ge 2$, define the function $f_n$ on $\{e, \bar e\}^n$ by $f_n(u) = N_{ee}(u) + N_{e\bar e}(u) - N_{\bar e e}(u) - N_{\bar e \bar e}(u)$. Denote by $Y_n$ the integer-valued random variable $f_n(V_0, V_1, \ldots, V_{n-1})$. Now, it is readily observed that for every $i \in \mathbb{N}$ large enough so that $n_i \ge 16$ and all integers $j, m \ge 2$ with we have $\partial_{0*} \delta_{e_j} P^m = \partial_{0*} e_{j + Y_m}$ in distribution. (5.4) Indeed, the inequalities (5.3) ensure that the starting edge $e_j$ in the $i$-th block is $n_i/4$ away from the boundary of the $i$-th block, so application of $P^m$ keeps the support of the distribution within the $i$-th block and, therefore, the transition probabilities at each step are given by (5.1). The relation with the auxiliary chain (5.2) is straightforward.
We now wish to use subgaussian concentration inequalities for the Markov chain $V_m$ (e.g. as discussed in [18]). To this end, we note that being an aperiodic irreducible Markov chain with finite state space $\{e, \bar e\}$, $V_m$ is geometrically ergodic and the state space is a small set in the sense of [ The path $e(t)$ as above comes back infinitely often to $x_0$, but makes longer and longer visits towards the cusp. Now, choose a lift of $e(t)$ in $T_6$, starting at some basepoint $\tilde o \in VT$, which is the lift of $x_0$. Let $\tilde o = y_0, y_1, y_2, \ldots$ be consecutive vertices converging to $\eta \in \partial T$. Let $g \in \operatorname{Aut}(T_6)$ be an automorphism such that $g^{-1}$ maps the edge $(y_i, y_{i+1})$ to the lift of the edge $e(i)$, and $x = g\Gamma_\alpha$.
Let i k and t i k be increasing sequences of N, such that e(t i k ) is an edge, pointing toward the cusp, that is exactly in the middle of i k -th block, namely and t β i k < c 1 |e(t i k )| for some c 1 > 0. Such infinite subsequences exist by property (3) in the choice of e(t). In the notation above, e d i k = |e(t i k )|.
We shall now show that the sequence of measures given by the orbital averages ν x,t i k converges weakly to 0 as k → ∞.
By Lemma 3.7 and (3.11), it suffices to show that the sequence $\delta_{e(t_{i_k})} P^{t_{i_k}}$ of distributions of the Markov chain $M_n$ converges weakly to zero. To do this, we would like to apply (5.5) to show that after $t_{i_k}$ iterations most of the mass of the Markov chain stays in the $i_k$-th block, which moves to the cusp in $Q_\alpha$ as $k \to \infty$. However, the constraints (5.3) are not satisfied since $t_{i_k} \ge d_{i_k} > n_{i_k}/8$. Recall that $n_{i_k} = \lfloor \alpha^{i_k} \rfloor$, $d_{i_k} \sim c_2 \alpha^{i_k}$ and, thus, $t_{i_k} \le c_3 (\alpha^{i_k})^{1/\beta}$ for some positive constants $c_2, c_3$.
To overcome this problem, we apply (5.6) several times for a small number of allowed iterations, each time dismissing an exponentially small proportion of trajectories that move more than a distance r k to be chosen below.
Let $m_k = n_{i_k}/8 \sim \alpha^{i_k}/8$. The number of times we wish to apply (5.6) is bounded above by The Markov property and the choices of $m_k$ and $r_k$ allow us to repeatedly apply (5.6) $N_k$ times (with $m = m_k$ and $r = r_k$), each time conditioning on trajectories that do not move more than $r_k$ in each $m_k$-iterate. We get that the proportion of trajectories starting at $e(t_{i_k})$ that move at most $N_k \cdot r_k \le n_{i_k}/4$ (in particular, do not leave the $i_k$-th block) is at least for some constant $c_4 > 0$. The above tends to 1 as $k \to \infty$, implying the escape of mass for the $\nu_{x,t}$'s when the underlying Følner sequence is $F_t$. By (3.6), this also implies the escape of mass for the $\nu_{x,t}$'s associated to any good Følner sequence.
Fourth part: Equidistribution. Recall that by our choice of x ∈ X , there exists an increasing sequence t k ∈ N such that |e(t k )| = 0 for every k ∈ N. The equidistribution statement follows from the following technical but more general result. This completes the proof of Theorem D.

Proposition 5.2 Let $G$ be a non-compact, closed, topologically simple subgroup of $\operatorname{Aut}(T)$ that acts transitively on $\partial T$. Let $\Gamma$ be a lattice in $G$, $\eta \in \partial T$ and $O_t$ a good Følner sequence in $G^0_\eta$. Let $g \in G$ and denote by $(\hat e(t))_{t \in \mathbb{N}}$ a sequence of consecutive edges in $T$ on a geodesic segment towards $g^{-1}\eta$. Assume that there exist a finite subset $F$ of $EQ$ and an increasing subsequence $t_k$ such that $\pi(\hat e(t_k)) =: e(t_k) \in F$ for every $k \in \mathbb{N}$. Then for $x = g\Gamma \in X$, the orbital measures $\nu_{x,t_k}$ converge towards the Haar measure $m_X$ on $X$.
Proof As before, fix a distinguished vertex $\tilde o$ in $T$ with respect to which the map $\operatorname{proj} : G/\Gamma \to VQ$ given by $\operatorname{proj}(h\Gamma) = \pi(h^{-1}\tilde o)$ is defined. Let $g \in G$ be as in the statement. Since the Markov chain $M_n$ associated with the lattice $\Gamma$ is positive recurrent (Proposition 6.4), it follows by (3.6) and the correspondence established in Lemmas 3.7, 3.8 and 3.9 (see also Remark 3.10) that $\operatorname{proj}_* \nu_{x,t_k}$ converges to a probability measure on $EQ$. Since the map proj is proper, this implies that the sequence $\nu_{x,t_k}$ is tight so that any subsequence of $(\nu_{x,t_k})_{k \in \mathbb{N}}$ has a limit point and any limit point is a probability measure on $X$. Let $m$ be such a limit point along a subsequence that we also denote by $t_k$. Since the $\nu_{x,t_k}$'s are orbital measures associated to a Følner sequence in $G^0_\eta$, the limit probability measure $m$ is $G^0_\eta$-invariant. Now, fix a hyperbolic element $a \in G$ with attracting point $\eta \in \partial T$ and such that the translation axis of $a$ contains $\tilde o$. Let $n_k = \lfloor t_k / \tau(a) \rfloor$, where $\tau(a) \in \mathbb{N}$ denotes the translation length of $a$, so that $|\tau(a^{n_k}) - t_k|$ is bounded. For every $k \in \mathbb{N}$, we have $\operatorname{proj}(a^{-n_k} g\Gamma) = \pi(g^{-1} a^{n_k} \tilde o) = \pi((g^{-1} a^{n_k} g) g^{-1} \tilde o)$. As $n \in \mathbb{N}$ varies, $(g^{-1} a^n g) g^{-1} \tilde o$ describes vertices on the geodesic ray between $g^{-1}\tilde o$ and $g^{-1}\eta$. Therefore it follows by the hypothesis $e(t_k) \in F$ that for some larger finite set $F' \subset EQ$, we have $\operatorname{proj}(a^{-n_k} g\Gamma) \in F'$ for every $k \in \mathbb{N}$. Since the map proj has compact fibres, this entails that there exists a compact set $K \subset G/\Gamma$ such that $a^{-n_k} g\Gamma \in K$, (5.7) for every $k \in \mathbb{N}$. Furthermore, since such a group $G$ as in the statement enjoys the Howe-Moore property [11] (see also [36]), the action of $a$ on $(X, m_X)$ is mixing so that we are in a position to apply [13, Lemma 6.3] as in the proof of Proposition 4.1. Now repeating the same argument as in the end of the proof of Proposition 4.1 (i.e. (4.3) and thereafter), one deduces that $m = m_X$ and this proves the proposition.

Limiting distributions of spheres in quotient graphs and lattice point counting
This section is devoted to the proof of Theorems E and F. Recall from Sect. 3.1 the irreducible Markov chain $M_n$ associated with a tree lattice $\Gamma$. We proved in Lemma 3.3 that it is positive recurrent when $\Gamma$ is a geometrically finite lattice. However, in Theorem E general lattices are considered. Here we will prove that, more generally, $M_n$ is positive recurrent for all lattices.
In order to do this we introduce another Markov chain that will serve as a tool to analyse further the chain M n .

Auxiliary chain
Consider the Markov chain $\hat M_n$ on the state space $VQ$, given by the transition kernel $\hat P(v, w) = \frac{\operatorname{ind}((v,w))}{\deg(v)}$ if $(v, w) \in EQ$ and 0 otherwise. Since the graph $Q$ is connected, $\hat M_n$ is irreducible.
Recall that $\tilde o \in VT$ is a fixed basepoint. Let $o = \pi(\tilde o) \in VQ$. Consider the positive function $\mu : VQ \to \mathbb{R}_+$ given by $\mu(w) = \deg(w) N_o(w)^{-1}$.
Proof It suffices to check that $\mu$ has finite $\ell^1$-norm and is reversible, i.e. satisfies $\mu(w_1) \hat P(w_1, w_2) = \mu(w_2) \hat P(w_2, w_1)$ for all $w_1, w_2 \in VQ$. It is enough to consider pairs of neighbors $w_1, w_2 \in VQ$. Indeed, for such a pair we have This shows that $\mu$ is a reversible measure on $VQ$. The fact that $\mu$ has finite $\ell^1$-norm is a direct consequence of the volume formula (2.3):

Remark 6.2
Recall from Sect. 2.2 that $G$ is transitive on the set of vertices of $T$ at even distance from $\tilde o$. The image of $\operatorname{proj} : G/\Gamma \to \Gamma \backslash T$ is the set of vertices at even distance from $o$. Moreover, from (2.4) it is clear that $\operatorname{proj}_* m_X$ is proportional to the restriction of $\mu$ to the image of proj. Hence, the measure $\mu$ can be thought of as the projection of the Haar measure on $G/\Gamma$.

Positive recurrence of the Markov chain M n
First, we wish to relate the Markov chainsM n and M n . Denote by R n the n th -step of the nearest-neighbor simple random walk on the vertices of the tree T and by δõ(R n ) its distribution when the initial vertex isõ ∈ V T , i.e. a.s. R 0 =õ. Note that since T is biregular, the restriction of δõ(R n ) to the spheres S(õ, m), for m ≤ n, is a multiple of the uniform measure on S(õ, m) which is denoted by ρ m as before. Let Dõ be the distribution on E Q given by Proof Let R be the transition kernel for simple random walk on the tree. For the first equality, one simply notes thatP(π(x), π(ỹ)) = R(x, π −1 (ỹ)).
The second equality follows from the fact that the distribution of the $n$-th step of the nearest-neighbor simple random walk on the tree is given by $\sum_{k=0}^{n} P_{\tilde o}(d(R_n, \tilde o) = k)\, \rho_k$. (6.2) The statement follows after applying $\pi_*$ and Lemma 3.9.
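The weights $P_{\tilde o}(d(R_n, \tilde o) = k)$ in this decomposition can be computed exactly, since the distance to the basepoint is itself a Markov chain on $\mathbb{N}$. The sketch below is an illustration under the standard assumption that, at a vertex of degree $d$ other than the basepoint, the simple random walk moves away from the basepoint with probability $(d-1)/d$ and toward it with probability $1/d$; the function name `distance_law` is hypothetical.

```python
from fractions import Fraction

def distance_law(d_root, d_other, n):
    """Exact law {k: P(d(R_n, o) = k)} of the distance to the start o after
    n steps of simple random walk on the (d_root, d_other)-biregular tree,
    where o has degree d_root."""
    law = {0: Fraction(1)}
    for _ in range(n):
        new_law = {}
        for k, p in law.items():
            if k == 0:
                # from the basepoint every step increases the distance
                new_law[1] = new_law.get(1, 0) + p
            else:
                deg = d_root if k % 2 == 0 else d_other
                new_law[k + 1] = new_law.get(k + 1, 0) + p * Fraction(deg - 1, deg)
                new_law[k - 1] = new_law.get(k - 1, 0) + p * Fraction(1, deg)
        law = new_law
    return law

law_2 = distance_law(3, 4, 2)      # two steps on the (3, 4)-biregular tree
law_200 = distance_law(3, 3, 200)  # 200 steps on the 3-regular tree
mean_speed = float(sum(k * p for k, p in law_200.items())) / 200
```

The support of the law has the parity of $n$, and `mean_speed` gives a finite-time approximation of the drift $r$ that appears in the proof of Proposition 6.4.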
In other words, the distribution of the chain $\hat M_n$ starting from $v \in VQ$ is given by a weighted average of the distributions given by $M_k$ with $k \le n$. We will use this relation to deduce the positive recurrence of $M_n$ from the positive recurrence of $\hat M_n$.

Proposition 6.4 The Markov chain M n is positive recurrent.
Proof By Kingman's subadditive ergodic theorem, there exists $r \in \mathbb{R}$ such that $\frac{1}{k} d(\tilde o, R_k) \to r$, $P_{\tilde o}$-almost surely, hence also in measure, as $k \to \infty$ (the value $r$ is called the drift of the random walk $R_k$). Since $\max\{d_1, d_2\} \ge 3$, it is easily seen that $r > 0$. Let $\varepsilon > 0$. Then for all $k \in \mathbb{N}$ large enough, we have By positive recurrence of the auxiliary chain $\hat M_n$ (Lemma 6.1), there exists a finite subset $K_1$ of $VQ$ such that for every $n$ large enough, $P_o(\hat M_n \in K_1) > 1 - \varepsilon$.
In view of Lemma 6.3 and (6.3), we deduce that there exists a sequence n k ∈ N with |n k − kr| ≤ kε such that for every k large enough which implies that the irreducible chain M n is positive recurrent.
An alternative and more conceptual proof of Proposition 6.4 was kindly suggested to us by an anonymous referee. We discuss it in the following remark. Like our proof above, it relies on the fact that the Markov chain $M_n$ can be seen as a quotient of the simple random walk on $ET$, the set of edges of the tree $T$.

Remark 6.5 (Alternative proof of Proposition 6.4) Let $\tilde P$ be the Markov operator associated to the simple random walk on $ET$. Considering two successive edges $x$ and $y$ in $ET$, we have $ET = Gx \cup Gy \simeq G/G_x \cup G/G_y$, where $G_x$ and $G_y$ denote the respective stabilizers and $G = \operatorname{Aut}(T)$. Using this and the fact that $G$ is unimodular [1, Proposition 6], one sees that $ET$ carries a $\tilde P$-stationary and $G$-invariant measure $\tilde\nu$. The restriction of $\tilde\nu$ to $Gx$ (respectively $Gy$) corresponds to the $G$-invariant measure on $G/G_x$ (respectively $G/G_y$). On the other hand, the Markov operator $P$ of the Markov chain $M_n$ on $EQ = \Gamma \backslash ET$ can be seen as the restriction of $\tilde P$ to $\Gamma$-invariant functions on $ET$ and the associated quotient measure $\nu$ of $\tilde\nu$ gives a $P$-stationary measure on $EQ$. But since $\Gamma < G$ is a lattice and $\nu$ is given by the quotient measure on $\Gamma \backslash G/G_x \cup \Gamma \backslash G/G_y$, we have that $\nu$ is finite, as required.

Proof of Theorem E
Here we prove parts 1. and 2. of Theorem E. Its third part about exponential equidistribution will be proven in Sect. 6.4.
If the irreducible and positive recurrent Markov chain $M_n$ has period $p \in \mathbb{N}$, then the sequence of distributions $D_{\tilde v} P^n$ has finitely many limit points $\{\mu_j\}_{j=0}^{p-1}$, corresponding to all possible convex combinations with coefficients $1/\deg(\tilde v)$ of the unique stationary probability measures of $M_n$ on each one of its cyclic classes (corresponding to the classes of Dirac measures constituting $D_{\tilde v}$). This implies the convergence along subsequences $pn + j$ and hence (2) of Theorem E.
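The phenomenon of several limit points along residue classes of $n$ can be seen on a toy chain. The example below is purely illustrative and is not the chain $M_n$ of the paper: it is the reflecting nearest-neighbor walk on the path $\{0, 1, 2, 3\}$, which has period 2 and cyclic classes $\{0, 2\}$ and $\{1, 3\}$.

```python
def step(dist, P):
    """One step of the chain: the row vector dist multiplied by the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def evolve(start, P, n):
    """Distribution after n steps when starting from the state `start`."""
    dist = [1.0 if i == start else 0.0 for i in range(len(P))]
    for _ in range(n):
        dist = step(dist, P)
    return dist

# Reflecting walk on a path with four states (a toy assumption).
P = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

even_limit = evolve(0, P, 60)   # approximates the limit along even times
odd_limit = evolve(0, P, 61)    # approximates the limit along odd times
```

Along even times the distribution converges to the $P^2$-stationary measure on $\{0, 2\}$ and along odd times to the one on $\{1, 3\}$, so the sequence $\delta_0 P^n$ has exactly two limit points.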

Exponential equidistribution of spheres in quotients by geometrically finite lattices
Previously, we established positive recurrence of M n , which is sufficient to prove the existence of limiting distributions of spheres in quotients of trees by action of tree lattices. However, in some cases, our Markov chain possesses a stronger property, namely that of geometric ergodicity. In these situations, the speed of convergence to the limiting distribution can be shown to be exponential and the exponential rate can even be made effective.
We begin by stating a version of Geometric Ergodic Theorem for Markov chains. Out of the equivalent definitions of geometric ergodicity, we conveniently choose one that uses the (Foster-Lyapunov) drift criteria. We then prove geometric ergodicity of the Markov chain M n associated to geometrically finite tree lattices and discuss the application for exponential equidistribution of spheres. We refer the reader to [20,37] for more on geometric ergodicity.
Let $M_n$ be an irreducible, aperiodic and positive recurrent Markov chain on a countable state space $S$ with the stationary probability measure $\mu$. Denote by $P$ the corresponding Markov operator. We call $M_n$ geometrically ergodic if there exists $r > 1$ such that for all $x \in S$, we have $\sum_{n \ge 0} r^n \|\delta_x P^n - \mu\| < \infty$, (6.5) where $\|\cdot\|$ denotes the total variation norm. In particular, for a geometrically ergodic chain $M_n$, we have $\|\delta_x P^n - \mu\| = o(r^{-n})$ for every $x \in S$.

Theorem 6.6 (Geometric Ergodic Theorem) Let $M_n$ be an irreducible aperiodic Markov chain on a countable state space $S$. Assume that there exist a finite set $K \subset S$, $b \in \mathbb{R}$, $\beta < 1$ and a function $V \ge 1$, which is finite at some $x_0 \in S$, satisfying the drift criterion: $PV(x) \le \beta V(x) + b \mathbf{1}_K(x)$ for any $x \in S$. (6.6) Then $M_n$ is geometrically ergodic.
Let us remark that the rate $r$ can be made explicit in terms of $\beta$ and $K$; see [4] for the treatment of the constant $r$. Finally, the aperiodicity hypothesis is only needed to have a simple expression as in (6.5); if the Markov chain is not aperiodic, we shall still speak of geometric ergodicity if its restrictions to its cyclic classes are geometrically ergodic.
Lemma 6.7 Let $T$ be a $(d_1, d_2)$-biregular tree with $d_1, d_2 \ge 3$ and $\Gamma$ a geometrically finite tree lattice. Then the associated Markov chain $M_n$ is geometrically ergodic.
Proof Let $F$ be the compact part of $Q$. For convenience of notation, we will assume $d_1 = d_2$. Let $q = d_1 + 1$.
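We define the function $V : EQ \to [1, \infty)$ by
$$V(e) = \begin{cases} q^{0.1|e|} & \text{if } e \notin EF \text{ and } e \text{ points toward the finite part}, \\ \dfrac{q^{0.9|e|}}{2^{|e|}} & \text{otherwise.} \end{cases}$$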
We claim that $V$ satisfies the drift criterion (6.6) with $\beta = q^{-0.1}$ and $b = q^5$.
Recall that we have positive transition probabilities only among neighboring edges in $EQ$. For $e \in EQ \setminus EF$, the edge $e$ belongs to a Nagao ray. If $e$ is oriented toward the finite part and $|e| > 5$, then $PV(e) = V(f)$, where $|f| = |e| - 1$. Hence, $PV(e) = q^{0.1(|e|-1)} \le q^{-0.1} V(e)$.
If $e$ is oriented toward the cusp, the transition probabilities are $1/q$ to jump one step further away from $EF$ to an edge pointing toward the cusp and $(q-1)/q$ to get one step closer and point toward the finite part (see Example 3.1). In other words, for each edge $e$ with $|e| > 5$ we have
$$PV(e) = \frac{1}{q} \cdot \frac{q^{0.9(|e|+1)}}{2^{|e|+1}} + \frac{q-1}{q} \cdot q^{0.1(|e|-1)} \le \frac{1}{2} \cdot q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}} + q^{0.1(|e|-1)} \le q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}} = q^{-0.1} V(e).$$
The last inequality holds since for any $q \ge 4$ and $|e| > 5$ we have $q^{0.1(|e|-1)} \le \frac{1}{2} \cdot q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}}$, that is, $2^{|e|+1} \le q^{0.8|e|}$.
The lemma follows by letting $K$ be the finite set of edges with $|e| \le 5$ (this also contains $EF$).
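The two drift estimates along a Nagao ray can be checked numerically. The following is a minimal sketch, not part of the proof: it models a single ray by pairs (distance $|e|$, orientation), uses the function $V$ and the transition probabilities described above, and evaluates the ratio $PV(e)/V(e)$ for a finite range of $q$ and $|e|$. The function names, the encoding of edges and the tested ranges are assumptions made for the illustration.

```python
def V(dist, toward_cusp, q):
    """Test function on a ray: q^(0.9|e|) / 2^|e| for edges pointing
    toward the cusp, q^(0.1|e|) for edges pointing toward the finite part."""
    if toward_cusp:
        return (q ** 0.9 / 2.0) ** dist
    return q ** (0.1 * dist)


def PV(dist, toward_cusp, q):
    """Apply the Markov operator to V at an edge with |e| = dist > 5."""
    if toward_cusp:
        # w.p. 1/q: one step further, still pointing toward the cusp;
        # w.p. (q-1)/q: one step closer, pointing toward the finite part.
        further = V(dist + 1, True, q)
        closer = V(dist - 1, False, q)
        return further / q + (q - 1.0) / q * closer
    # Pointing toward the finite part: deterministic step closer.
    return V(dist - 1, False, q)


def drift_ratio(dist, toward_cusp, q):
    """Ratio PV(e) / V(e), to be compared with beta = q^(-0.1)."""
    return PV(dist, toward_cusp, q) / V(dist, toward_cusp, q)


def worst_ratio(q, dist_min=6, dist_max=100):
    """Largest ratio over both orientations and dist_min <= |e| <= dist_max."""
    return max(
        drift_ratio(dist, toward_cusp, q)
        for dist in range(dist_min, dist_max + 1)
        for toward_cusp in (True, False)
    )


if __name__ == "__main__":
    for q in (4, 5, 8, 16):
        print(q, worst_ratio(q), q ** -0.1)
```

On the tested range the largest ratio matches $\beta = q^{-0.1}$ up to rounding and is attained on edges pointing toward the finite part, while edges pointing toward the cusp give a strictly smaller ratio.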
Finally, the description of the limit measures $\mu_j$ in the paragraph following the statement of Theorem E follows from the proof in Sect. 6.3 and from Lemma 3.4, which says that the period of the Markov chain $M_n$ is always two, so that the Dirac masses constituting each distribution $D_{\tilde v}$ all belong to a single cyclic class.

Remark 6.8
In the context of homogeneous dynamics, inequalities of type (6.6) are often referred to as Margulis inequalities. They were first used in the work of Eskin-Margulis-Mozes [24] and Eskin-Margulis [23]. After we completed the first version of this article, Katz [31] proved Margulis inequalities for horospherical averages on lattice quotients of real semisimple groups, using linear representations, to establish quantitative non-divergence of horospherical averages (as in Lemma 6.7). Combining this with a spectral gap, he also deduced an equidistribution result (as in Theorem B, but) with a rate depending, among other things, on certain diophantine parameters of the starting point $x \in G/\Gamma$ (cf. Remark 1.5). For $\mathrm{PSL}_2(\mathbb{R})$-quotients, more precise estimates were obtained earlier by Flaminio-Forni [27] and Strömbergsson [53], exploiting, among other things, the (unitary) representation theory of $\mathrm{PSL}_2(\mathbb{R})$.

Remark 6.9
We remark that the family of lattices for which the associated Markov chain is geometrically ergodic and, consequently, for which part (3) of Theorem E holds, contains many non-geometrically finite lattices. For example, the lattice associated with the edge-indexed graph from Fig. 6 is of this kind, with a Foster-Lyapunov function $V$ similar to the one in the proof of Lemma 6.7.

Proof of Theorem F
Let $\Gamma$ be a geometrically finite lattice in $\mathrm{Aut}(T) =: G$, denote by $m$ a Haar measure on $G$ and let $m_X$ be the induced $G$-invariant finite measure on $G/\Gamma$ obtained by choosing a Borel fundamental domain in $G$. Denote by $S_T(R)$ the cardinality of the sphere of radius $R$ around $\tilde o$ in $T$ and set $o = \pi(\tilde o)$. As before, $\pi$ is the natural projection $VT \to VQ$ and $\rho_n$ denotes the normalized probability measure on the sphere of radius $n$ on $VQ$ with center $o$. Recall that $G$ has precisely two orbits on $VT$ and that it acts transitively on each set of vertices of $T$ that are at even distance from each other, so that $2 \mid d(\gamma \tilde o, \tilde o)$ for every $\gamma \in \Gamma$. For every $R \in \mathbb{N}$, we have
$$N(2R) = \sum_{n \le R} S_T(2n) \cdot \pi_* \rho_{2n}(o) \cdot |\Gamma \cap G_{\tilde o}|. \tag{6.7}$$
Thanks to (3) of Theorem E (see also the paragraph following that theorem), for some constant $r > 1$, we have
Plugging (6.9) and (6.8) into (6.7) yields the desired statement. To see the alternative expression of the main term $\frac{m(G_{\tilde o})}{m_X(X)}$ given after the statement of Theorem F, observe first that, by unimodularity of $\mathrm{Aut}(T)$, for any two vertices $\tilde v, \tilde w \in VT$ with $2 \mid d(\tilde v, \tilde w)$ we have $m(G_{\tilde v}) = m(G_{\tilde w})$. Now fixing a lift $\tilde v$ for every $v \in VQ$ with $2 \mid d(o, v)$ and an element $g_v$ such that $g_v \tilde v = \tilde o$, we have