(Non)-escape of mass and equidistribution for horospherical actions on trees

Let $G$ be a large group acting on a biregular tree $T$ and $\Gamma \leq G$ a geometrically finite lattice. In an earlier work, the authors classified orbit closures of the action of the horospherical subgroups on $G/\Gamma$. In this article we show that there is no escape of mass and use this to prove that, in fact, dense orbits equidistribute to the Haar measure on $G/\Gamma$. On the other hand, we show that new dynamical phenomena for horospherical actions appear on quotients by non-geometrically finite lattices: we give examples of non-geometrically finite lattices where an escape of mass phenomenon occurs and where the orbital averages along a Følner sequence do not converge. In the last part, as a by-product of our methods, we show that projections to $\Gamma \backslash T$ of the uniform distributions on large spheres in the tree $T$ converge to a natural probability measure on $\Gamma \backslash T$. Finally, we apply this equidistribution result to a lattice point counting problem to obtain counting asymptotics with exponential error term.


Introduction
Let T be a (d 1 , d 2 )-biregular tree with d 1 , d 2 ≥ 3. Denote by Aut(T ) the group of automorphisms acting without edge inversion. Let G be a non-compact, closed subgroup of Aut(T ) acting transitively on the boundary of the tree ∂T . Let Γ ≤ G be a lattice and X = G/Γ. This parallels the classical setting of homogeneous dynamics, where one studies the actions of certain subgroups on a quotient of a linear algebraic group by a lattice. These two worlds intersect, for example, when G = SL 2 (k), where k is a nonarchimedean local field, in which case G naturally acts on the associated Bruhat-Tits tree. However, our geometric setting also comprises many groups G ≤ Aut(T ), including Aut(T ) itself, that are not linear [12].
We first focus on the homogeneous space X = G/Γ, where Γ is a geometrically finite lattice. The dynamics of the discrete geodesic flow on X was considered by Paulin in [41], and is related, among other things, to the theory of continued fractions in nonarchimedean local fields. We recall that when G is linear, by works of Raghunathan and Lubotzky [35,43], any lattice therein is geometrically finite.
In our geometric setup, the role of Ad-unipotent subgroups in classical homogeneous dynamics is played by the horospherical subgroups G 0 η of G, for η ∈ ∂T . In the earlier work [13], the authors classified Borel probability measures on G/Γ invariant under the G 0 η -action for a large class of groups G and general lattices Γ, establishing an analogue of Dani's result in [15]. Moreover, it was shown that when Γ is geometrically finite, G 0 η -orbits are either compact or dense, as in the classical result of Hedlund [30] on the horocycle flow on finite volume hyperbolic surfaces.
1.1. Non-escape of mass. The horospherical group G 0 η is amenable and one can easily construct Følner sequences therein: let a ∈ G be a hyperbolic element that has η as its attracting fixed point on ∂T and let M be the compact subgroup of G 0 η that fixes pointwise the translation axis of a in T . Then for any M -invariant compact subset O with non-empty interior in G 0 η , the sequence (O t := a t Oa −t ) t∈N constitutes a Følner sequence in G 0 η (see e.g. [13,Lemma 2.10]). In the sequel, we shall refer to such sequences O t as good Følner sequences. Følner sequences allow one to average along larger and larger pieces of the orbits. For x ∈ X, we define ν x,t = m Ot * δ x , where m Ot is the normalized restriction of the Haar measure m G 0 η to O t ; in other words, for f ∈ C c (X),
$$\nu_{x,t}(f) = \frac{1}{m_{G^0_\eta}(O_t)} \int_{O_t} f(ux)\, dm_{G^0_\eta}(u).$$
The probability measures ν x,t are called the orbital measures. In general, one can obtain qualitative information on the statistical behaviour of typical points x ∈ X, using the Howe-Moore property, established in our setting in [11], and the amenable ergodic theorem [34]. Our topological result in [13] says, however, that all points x ∈ X that do not lie in a compact G 0 η -orbit have dense orbits. Therefore, the immediate question arises whether every dense orbit equidistributes to the Haar measure on G/Γ. The first possible obstruction to this is the escape of mass phenomenon. Our first result states that this does not happen when Γ is a geometrically finite lattice.
Theorem A (Non-escape of mass). Let T be a (d 1 , d 2 )-biregular tree, with d 1 , d 2 ≥ 3, and G a non-compact, closed subgroup of Aut(T ) acting transitively on ∂T . Let Γ be a geometrically finite lattice in G, η ∈ ∂T and O t a good Følner sequence in G 0 η . Then, for every ε > 0, there exists a compact set K = K(ε) ⊂ X such that for every x ∈ X not contained in a compact G 0 η -orbit, there exists a positive integer N = N (x, ε) with the property that for every t ≥ N , we have
$$\nu_{x,t}(K) > 1 - \varepsilon. \tag{1.1}$$
The above is known as non-escape of mass. In the context of one-parameter unipotent flows on quotients of real Lie groups, it is due to Dani and Margulis [14,16]. Our result also applies to the linear setting, and we now describe this special case. Let k be a non-archimedean local field and H be the group of k-points of a connected semisimple linear algebraic k-group $\mathbf{H}$ of k-rank one. Let $\mathbf{A}$ be a maximal k-split torus in $\mathbf{H}$, $\mathbf{Z}$ its centralizer, $\mathbf{U}$ a maximal unipotent subgroup, $\mathbf{P}$ the normalizer of $\mathbf{U}$, and let A, Z, U, P be the respective groups of k-points. The group H acts by automorphisms [8] (see also [35, page 411]) on its Bruhat-Tits building, which is a biregular tree T . If $\mathbf{H}$ is simply connected, then H embeds as a closed subgroup of Aut(T ). In general, H might act with edge inversions, and in this case we replace it with an index two subgroup that acts without edge inversion. Moreover, let K be a good maximal compact subgroup of H. The group K is the stabilizer of a vertex of T , P = ZU is the stabilizer of an end η ∈ ∂T and we have the Iwasawa decomposition H = KP (see [8, §4] or [7, §8.2.1]). Finally, let M be the compact subgroup K ∩ Z of H. In our geometric setting, we have H 0 η = M U and the following result is an immediate consequence of the previous theorem:
Corollary 1.1. Let H and its subgroups M, U be as above. Let Λ be a lattice in H and O t be a good Følner sequence in M U .
Then, for every ε > 0, there exists a compact set K = K(ε) ⊂ X such that for every x ∈ X not contained in a compact M U -orbit, there exists a positive integer N = N (x, ε) with the property that for every t ≥ N , we have ν x,t (K) > 1 − ε.
This corollary is only relevant for fields k with char k > 0. Indeed, in the zero characteristic case, by a result of Tamagawa [54] (also observed in [47]), every lattice in H is uniform, so the quotient is compact and the statement is trivial. We also remark that the version of the previous corollary for U (instead of M U ) holds as well. Finally, we note that a related result which would imply the previous corollary was mentioned in [29, page 467].
An immediate general consequence of Theorem A is the following.
Corollary 1.2. For every x ∈ X, every weak- * limit of the sequence ν x,t as t → ∞ is a G 0 η -invariant probability measure on X.
In the proof of Theorem A, exploiting the underlying geometric setting, we translate the problem of understanding the distribution of G 0 η -orbits in G/Γ to the language of Markov chains, where it appears as a problem of controlling the distributions of a Markov chain with changing starting distributions. We then rely on two main ingredients: the first is a qualitative description of the behaviour of the discrete geodesic flow on G/Γ, as studied in [13]. This allows us to understand the behaviour of starting distributions of the Markov chain. The second ingredient is, naturally, a set of Markov chain theoretical tools. The proof is then carried out by combining the two ingredients.
1.2. Equidistribution of orbits. For example, when G = Aut(T ) and for x ∈ X lying in a compact G 0 η -orbit, by standard arguments, all weak- * limits of ν x,t are G 0 η -invariant probability measures supported on the homogeneous orbit. This orbit supports a unique G 0 η -invariant measure and, hence, ν x,t equidistribute to the homogeneous measure supported on the orbit closure.
Under the additional topological simplicity assumption on G, our second result yields a complete qualitative description of the statistical behaviour of every x ∈ X not contained in a compact G 0 η -orbit: for such x ∈ X, it identifies the limit of ν x,t as the Haar measure.
Theorem B. Let T be a (d 1 , d 2 )-biregular tree, with d 1 , d 2 ≥ 3, and G a non-compact, closed, topologically simple subgroup of Aut(T ) acting transitively on ∂T . Let Γ be a geometrically finite lattice in G and O t be a good Følner sequence in G 0 η . Assume x ∈ X does not belong to a compact G 0 η -orbit. Then, the orbital measures ν x,t equidistribute to the normalized Haar measure m X as t → ∞; in other words, for every f ∈ C c (X), we have
$$\lim_{t \to \infty} \nu_{x,t}(f) = \int_X f \, dm_X.$$
The previous theorem has the following immediate consequence on the statistical behaviour of G 0 η -orbits. Let L be a closed subgroup of G. A probability measure µ on X is called L-homogeneous if it is the unique L-invariant probability measure on a closed L-orbit. It is said to be homogeneous if it is L-homogeneous for some closed subgroup L < G. A point x ∈ X is called generic for G 0 η (see [47,Definition 1]) if for some (equivalently for any) good Følner sequence O t , the sequence ν x,t of orbital measures equidistributes to a homogeneous measure.
Corollary 1.3. Keep the hypotheses of Theorem B. Any x ∈ X is generic for G 0 η .
In the context of unipotent flows on SL 2 (R)/Γ, this result goes back to Dani-Smillie [17]. Since then, Ratner [46,48], Shah [51] and others have obtained very general results in Lie groups or algebraic groups over local fields of characteristic zero, but even in the case of a semisimple linear group G of rank one over a local field of positive characteristic, e.g. SL 2 (k) with k = F q ((X −1 )), this result does not appear in the literature. However, we remark that for arithmetic quotients of linear groups, one may deduce such an equidistribution result by combining the work of Mohammadi [38] and the result mentioned by Ghosh in [29, page 467]. In the linear setting, the previous results have the following immediate consequence:
Corollary 1.4. Keep the hypotheses and the notation of Corollary 1.1. The statement of Corollary 1.3 holds when G 0 η is replaced with the subgroup M U of H.
For example, for H = SL 2 (F q ((X −1 ))), one can take Γ to be the non-uniform lattice SL 2 (F q [X]) and the groups M and U to be
$$M = \left\{ \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} : a \in \mathbb{F}_q[[X^{-1}]]^{\times} \right\}, \qquad U = \left\{ \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} : x \in \mathbb{F}_q((X^{-1})) \right\}.$$
We remark that for uniform lattices, one can use Margulis' orbit thickening argument to show that the U -action is uniquely ergodic (see Mohammadi [38], Ellis-Perrizo [22] or [13,Lemma 6.3]). It is also worth noting that for non-uniform quotients, using our geometric approach, one can show the version of the previous corollary for the U -action (instead of M U ). Finally, we mention the work of Vatsal [56] in which the equidistribution results of Ratner [47,48] for unipotent dynamics in the p-adic case were applied with a geometric approach similar to ours (see [56]). The proof of Theorem B uses Theorem A, the classification of G 0 η -orbits given in [13], and the Howe-Moore property established in [11].
1.3. New non-linear homogeneous dynamical phenomena. So far, the results obtained in Theorems A and B for geometrically finite lattices parallel the more classical results in linear homogeneous dynamics. However, the family of tree lattices is very rich and, as opposed to the linear setting, there exist many non-geometrically finite lattices. These exhibit wilder behaviors than their linear counterparts giving rise to several interesting phenomena that do not appear in the classical setting. Various aspects of these differences, as well as analogies, were studied by many, including Serre [50], Tits [55], Bass-Kulkarni [2], Burger-Mozes [10,11], Lubotzky [35], Bass-Lubotzky [3], Paulin [42], Bekka-Lubotzky [5] etc. The following results add a new dynamical aspect to these non-linear phenomena showing that horospherical orbits on quotients by non-geometrically finite lattices can exhibit escape of mass, which does not occur in homogeneous dynamics in the linear setting.
Theorem C (Escape of mass). For any q ≥ 2, there exist a lattice Γ in G = Aut(T 2q+2 ) and η ∈ ∂T 2q+2 such that for the trivial coset x = eΓ ∈ X, any compact K ⊂ X and any good Følner sequence (O t ) t∈N in G 0 η , we have
$$\lim_{t \to \infty} \nu_{x,t}(K) = 0.$$
Recall that in the setting of unipotent dynamics on linear homogeneous spaces, by now classical results of Ratner [45,46,48], Mozes, Shah [39,51] and others show that the orbital averages along unipotent group actions always converge towards an invariant probability measure. The following result contrasts with the classical situation by giving an example where we see not only an escape of mass phenomenon, but also a failure of convergence of the orbital averages along Følner sequences.
Theorem D (Escape of mass and equidistribution). There exists a non-uniform lattice Γ < Aut(T 6 ) with the property that for any η ∈ ∂T there exist points x ∈ X = Aut(T 6 )/Γ such that for any good Følner sequence (O t ) t∈N in G 0 η , the set of accumulation points of the sequence of orbital averages ν x,t contains the zero measure and m X .
The proof of this theorem is carried out in Section 5 and consists of several parts. In fact, it yields an uncountable number of non-isomorphic such lattices in Aut(T 6 ). The construction of these lattices has a similar flavor as the constructions of Bass-Lubotzky in [3] to show that there are lattices of arbitrarily small covolumes in Aut(T ). Once the candidate lattices are constructed, the escape of mass phenomenon is proven by exploiting further the aforementioned connection between the Markov chain theory and distributions of horospherical orbits. This step uses the relatively finer ingredient of subgaussian concentration estimates for geometrically ergodic Markov chains (see e.g. Dedecker-Gouëzel [18]). Finally, the proofs of the uniqueness of the G 0 η -invariant probability measure and the equidistribution along some orbital averages rely, among others, on the mixing of the discrete geodesic flow and the positive recurrence of the associated Markov chain.

1.4. Equidistribution of spheres.
To describe the general problem that we study here, consider a morphism of graphs π : T → Q, where T is a biregular tree. For a vertex ṽ ∈ V T , let S(ṽ, n) be the set of vertices of T at distance n from ṽ. Let ρ n be the uniform distribution on S(ṽ, n). We are interested in the distributions π * ρ n on V Q: do they have a limiting distribution and, if yes, can one identify it? Questions about equidistribution of spheres are well-studied in many homogeneous quotients: Euclidean spheres in R d /Z d in [44] or hyperbolic spheres in quotients of hyperbolic space H d /Γ, where Γ is a lattice in SO(d, 1) (see [6,Theorem 3.3], [44] and [21,25,52] for more general results with applications to various counting problems). In the following result, we answer such a question for the natural quotient Q of the tree associated to the Γ-action, where Γ is a general lattice in Aut(T ).
Theorem E (Equidistribution of spheres in quotients by tree lattices). Let T be a biregular tree, Γ ≤ Aut(T ) a tree lattice. Denote by Q = Γ\T .
(1) (Non-escape of mass) For any ε > 0, there exists a finite subset K ⊂ V Q, such that for all n ∈ N we have π * ρ n (K) ≥ 1 − ε.
(2) (Limiting distribution) There exists an integer p, and limiting probability distributions µ 0 , ..., µ p−1 on V Q such that for all v ∈ V T and for all 0 ≤ j < p we have π * ρ pn+j → µ j , as n → ∞.
(3) (Exponential convergence) If, in addition, Γ is geometrically finite, we can take p = 2 and there exists r > 1 such that
$$\| \pi_* \rho_{2n+j} - \mu_j \| = O(r^{-n}) \qquad (j = 0, 1),$$
where ‖·‖ denotes the total variation norm.
In the geometrically finite case (3), the measures (µ j ) j=0,1 coincide with the projections of the Haar measure m X under the natural map proj : Aut(T )/Γ → V Q, taken with respect to two different base points. The exponential rate of convergence 1/r in this result can be made effective, using the effective version of the geometric ergodic theorem for Markov chains as in [4].
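To make the statement of Theorem E concrete, here is a minimal Python sketch that computes the projected sphere distributions π * ρ n exactly in the simplest non-uniform case: a Nagao ray quotient of a (q + 1)-regular tree, with the base point above the origin of the ray. It rests on two assumptions that are not spelled out at this point of the text: a uniform point of S(ṽ, n) is the endpoint of a uniformly chosen non-backtracking path of length n, and on the ray an edge pointing to infinity continues outward with probability 1/q and turns back with probability (q − 1)/q, while edges pointing to the origin move deterministically. The function names and the state encoding are illustrative choices.

```python
from fractions import Fraction


def sphere_projections(q, n_max):
    """Exact laws of pi_* rho_n (n = 1..n_max) on a Nagao ray, base point 0.

    A state (direction, k) is an edge of the ray whose terminal vertex
    sits at level k; "out" points to infinity, "in" points to the origin.
    """
    p_out = Fraction(1, q)      # keep moving toward infinity
    p_back = 1 - p_out          # turn around toward the origin
    state = {("out", 1): Fraction(1)}   # the only edge leaving the origin
    laws = []
    for _step in range(n_max):
        # Record the law of the terminal vertex (its level on the ray).
        law = {}
        for (_direction, k), w in state.items():
            law[k] = law.get(k, 0) + w
        laws.append(law)
        # Advance the edge chain by one non-backtracking step.
        nxt = {}
        for (direction, k), w in state.items():
            if direction == "out":
                moves = [(("out", k + 1), p_out), (("in", k - 1), p_back)]
            elif k == 0:
                moves = [(("out", 1), Fraction(1))]     # bounce at the origin
            else:
                moves = [(("in", k - 1), Fraction(1))]  # forced descent
            for s, p in moves:
                nxt[s] = nxt.get(s, 0) + w * p
        state = nxt
    return laws


def total_variation(a, b):
    """Total variation distance between two laws given as dicts."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys) / 2


if __name__ == "__main__":
    laws = sphere_projections(q=3, n_max=42)
    print("n = 2:", laws[1])
    print("TV(n = 40, n = 42):", float(total_variation(laws[39], laws[41])))
```

In this toy model the laws π * ρ n are supported on levels of the same parity as n, so one sees two limiting distributions (p = 2), and the total variation distance between π * ρ n and π * ρ n+2 decays geometrically in n, in line with items (2) and (3).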
The proof of the previous result relies on the tools we develop to prove Theorem A. Indeed, the Markov chain that we construct to track the statistical behaviour of horospherical averages easily allows one to understand the spherical averages provided one proves a (positive) geometric recurrence property for (non-) geometrically finite lattice quotients. This is carried out in Section 6. To draw an analogy, the overall proof can be seen to parallel, in a considerably simpler fashion, the deduction of [24,Theorem 4.4] from Theorem 4.1 in that work.
Remark 1.5 (Diophantine exponent vs. speed of equidistribution). In fact, in the geometrically finite case, using the geometric recurrence of the associated Markov chain (Lemma 6.7), one can show the version of the equidistribution in Theorem B on the quotient V Q additionally with a speed as in (3) above. The equidistribution itself directly follows by projecting the measures m Ot and m X in Theorem B by the map proj. The speed of equidistribution depends on a geometric diophantine exponent (see e.g. [53, (1.6)] and [26,42]) of the boundary point g −1 η where x = gΓ. From this perspective, Theorem E (3) can also be seen as a particular case, based on the fact of hyperbolic geometry that large circles are well-approximated by horocycles [24, p.116] (see also Remark 6.8).
1.5. Counting lattice points. Another classical question closely related to the equidistribution of spheres is the problem of counting lattice points. To describe the general problem, consider a lattice Γ (or more generally a discrete subgroup) in some locally compact topological group endowed with a non-negative functional ‖·‖. One is interested in describing the asymptotics of
$$N(R) := |\{ \gamma \in \Gamma : \|\gamma\| \leq R \}|$$
as R → ∞. This problem goes back to Gauss, who was interested in the case Z d ≤ R d with the Euclidean norm as the functional ‖·‖. This particular problem is known as the Gauss circle problem and the sharp error rates are still unknown. For Γ ≤ SL 2 (R), one can take ‖·‖ to be the operator norm induced by the Euclidean norm on R 2 , in which case we have $\|g\| = \exp(\tfrac{1}{2} d_{\mathbb{H}^2}(g.i, i))$. This was already studied by Delsarte [19] in the 1940s, who obtained the first non-euclidean counting results. In the same setting, the lattice point counting problem is also closely related to the counting of closed geodesics on hyperbolic surfaces. For an extensive historical survey and overview of methods used, we refer to [28], where the authors also develop spectral techniques to study the lattice point counting problem in large generality.
Coming back to our setting, in analogy with the real hyperbolic case, it is natural to consider the functional ‖g‖ = d(gõ, õ), where õ ∈ V T is some basepoint and d the graph distance on the tree. Clearly, for a discrete Γ, N (R) is finite and nondecreasing in R. The following result describes the growth asymptotics of N (R) with exponential error term for a geometrically finite lattice Γ:
Theorem F. Let T be a biregular tree, Γ ≤ Aut(T ) a geometrically finite tree lattice. Let m be a Haar measure on Aut(T ) and m X the induced finite measure on X = Aut(T )/Γ. Fix a basepoint õ ∈ V T and for R ∈ N, let
$$N(R) := |\{ \gamma \in \Gamma : d(\gamma \tilde{o}, \tilde{o}) \leq R \}|.$$
Denote by B T (R) the cardinality of the set of vertices at an even distance from õ that is at most R. Then, there exists c ∈ (0, 1) such that
$$N(R) = \frac{m(G_{\tilde{o}})}{m_X(X)} \, B_T(R) \, \bigl(1 + O(c^R)\bigr).$$
We stress that unlike before, we do not normalize the measure m X to be a probability measure. We also remark that the main term $\frac{m(G_{\tilde{o}})}{m_X(X)}$ can alternatively be expressed as $\bigl( \sum_{v \in VQ} \frac{1}{|\Gamma \cap G_{\tilde{v}}|} \bigr)^{-1}$, where for every vertex v of Q = Γ\T , ṽ ∈ V T denotes a lift of v, G ṽ is the maximal compact subgroup of Aut(T ) fixing ṽ, and V Q denotes the set of vertices of Q at even distance from π(õ). Finally, we note that Aut(T ) acts without edge inversion and this implies that for every g ∈ Aut(T ) and ṽ ∈ V T , d(gṽ, ṽ) ∈ N is an even number. This is the reason why, in the previous statement, we only consider vertices at even distance from each other.
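The comparison quantity B T (R) in Theorem F is elementary to compute. The following Python sketch, with hypothetical function names, counts vertices of a (d 1 , d 2 )-biregular tree sphere by sphere, assuming the basepoint has degree d_base and its neighbours have degree d_other.

```python
def sphere_size(d_base, d_other, n):
    """Number of vertices at distance n from a vertex of degree d_base."""
    if n == 0:
        return 1
    size = d_base
    for k in range(1, n):
        # Vertices at odd distance k have degree d_other, at even distance
        # d_base; each one has (degree - 1) neighbours further out.
        size *= (d_other - 1) if k % 2 == 1 else (d_base - 1)
    return size


def even_ball(d_base, d_other, R):
    """B_T(R): number of vertices at even distance at most R from the base."""
    return sum(sphere_size(d_base, d_other, n) for n in range(0, R + 1, 2))


if __name__ == "__main__":
    for R in (0, 2, 4, 6):
        print(R, even_ball(3, 4, R))
```

Consecutive even spheres grow by the factor (d 1 − 1)(d 2 − 1), so B T (R) grows exponentially in R and a relative error of order c R is a genuine power saving.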
We remark that this theorem also follows from the main result of Kwon in [32] and from the work of Roblin [49, Chapitre 4, Corollaire 2]. Our proof relies on our previous result on the equidistribution of spheres (Theorem E) and is a relatively straightforward consequence thereof. An exponential error rate c ∈ (0, 1) can also be effectively calculated.
The article is organized as follows. We recall some preliminary material mostly on lattices in groups acting on trees and set our notation in §2. In §3, we associate a natural Markov chain to an edge-indexed graph, study its properties and use these to prove Theorem A for geometrically finite lattices. In §4, we prove Theorem B. Theorems C and D are proven in §5. In §6, we introduce an auxiliary Markov chain and use this to study the edge-indexed graph associated to a general lattice and prove Theorems E and F.
Acknowledgements. The authors are thankful to Marc Burger and Manfred Einsiedler for helpful discussions. The authors also thank an anonymous referee for a careful reading, several remarks clarifying the exposition and helpful bibliographical suggestions. V.F. is supported by ERC Consolidator grant 648329 (GRANT). C.S. is supported by SNF grants 178958 and 182089.

Preliminaries
2.1. Basic notation. We denote by T a (d 1 , d 2 )-biregular tree, with d 1 , d 2 ≥ 3, by V T its set of vertices and by ET its set of edges. All edges are directed and ∂ 0 , ∂ 1 : ET → V T are, respectively, the initial and the terminal vertex maps. An (ordered) pair of edges e 1 , e 2 is called consecutive if ∂ 1 (e 1 ) = ∂ 0 (e 2 ). A sequence of consecutive edges e 1 , ..., e n is called a path of length n. We also refer to it as a path between ∂ 0 (e 1 ) and ∂ 1 (e n ). The distance d(·, ·) between two vertices of the graph is defined as the minimal length of a path between these vertices.
We denote by Aut(T ) the group of tree automorphisms acting without edge inversion, i.e. the group of automorphisms g such that d(gv, v) ≡ 0 (mod 2) for one (equivalently every) vertex v ∈ V T . When d 1 = d 2 , this is an index two subgroup of the full group of automorphisms. Endowed with the topology of pointwise convergence, it is a locally compact, second countable group. In this article G always stands for a non-compact, closed subgroup of Aut(T ) that acts transitively on the boundary ∂T of T .
Throughout the rest of the article, we fix a basepointõ ∈ V T and a distinguished end η ∈ ∂T , and denote by (y 0 , y 1 , y 2 , ...) the vertices of the infinite path converging to η, where y 0 =õ.
For a subset S ⊂ T , and a subgroup H < Aut(T ), H S denotes the pointwise stabilizer of S in H. Given η ∈ ∂T , we define
$$G_\eta := \{ g \in G : g\eta = \eta \}, \qquad G^0_\eta := \{ g \in G_\eta : g \text{ fixes some vertex of } T \}.$$
The group G 0 η is called the horospherical subgroup (see [13,Section 2] for more details on horospherical subgroups). It is a closed and amenable subgroup of G and, as mentioned in the introduction, one can construct many good Følner sequences in G 0 η . The following sequence of compact open subgroups of G 0 η yields a good and tempered Følner sequence that is particularly convenient for our geometric approach. For t ∈ N, we set
$$F_t := G^0_\eta \cap G_{y_t}.$$
In fact, as we shall see, thanks to the structure of good Følner sequences, it will be sufficient to prove our results only for the sequence F t . Denote by m G and m G 0 η the Haar measures on G and G 0 η , respectively. By m Ft we denote the Haar probability measure on F t , which clearly coincides with the normalized restriction of m G 0 η to F t .

2.2. Lattices and their associated edge-indexed graphs. It is well-known that a subgroup Γ ≤ G is discrete if and only if all vertex stabilizers Γ v for v ∈ V T are finite. A discrete subgroup Γ ≤ G is called a lattice if X = G/Γ admits a G-invariant Borel probability measure, in which case we denote this measure by m X . By our standing assumption of boundary transitivity of G, the quotient graph G\T has two vertices. Indeed, by [10, Lemma 3.1.1], G acts two-transitively on ∂T , which in turn implies that G has precisely two orbits on V T . Moreover, since G acts without edge inversions, it acts transitively on the set of vertices of even distance. In this case, Γ is a lattice in G if and only if it is a lattice in Aut(T ). Therefore, all lattices we will consider are tree lattices, i.e. lattices in Aut(T ). For convenience, we will often call them lattices without specifying the ambient group. We refer to [3] for more details on tree lattices and edge-indexed graphs. Given a discrete subgroup Γ, there is a useful construction [2] of a graph Q and a map ind : EQ → N as follows: the graph Q is the quotient graph Γ\T , which is well-defined, since Γ acts without edge inversion. Denote by π : T → Q the projection map. The index map ind : EQ → N is given by ind(e) = [Γ ∂ 0 (ẽ) : Γ ẽ ], where ẽ ∈ ET is any edge with π(ẽ) = e. This clearly does not depend on the choice of the lift ẽ. The pair (Q, ind) is called the edge-indexed graph associated to Γ < Aut(T ).
For v ∈ V Q, we define deg(v) to be the valency of any of its liftsṽ. By definition of the map ind, where (e 1 , ..., e n ) is a path from u to v. For an an edge-indexed graph (Q, ind) associated with a discrete subgroup Γ, the value of N u (v) does not depend on the choice of the path. Fixing a basepoint o ∈ V Q (for convenience, we use o = π(õ)), where d(., .) denotes the graph distance on Q. We shall refer to this quantity as the volume of the edge-indexed graph (Q, ind) based at o. We also remark that changing the base point from o to o has the effect of multiplying the previous sum by the rational number ∆(o ) ∆(o) , therefore does not affect its finiteness. Conversely, one can define an abstract edge-indexed graph (Q, ind) as a tuple consisting of a graph Q and map ind : EQ → N. Under natural assumptions on the associated maps ∆ and N as above, there exists a discrete subgroup Γ whose associated edge-indexed graph coincides with (Q, ind) and the function N is proportional to v → |Γṽ|, whereṽ is any lift of v (see [3, page 23] or [2]).
For a discrete group Γ ≤ G, we define the projection map proj : G/Γ → V Q by proj(gΓ) := π(g −1 õ) = Γg −1 õ. The map proj is clearly continuous and has compact fibers in G/Γ: for each v ∈ V Q and g ∈ G such that proj(gΓ) = v, we have proj −1 (v) = G õ gΓ. Moreover, the measure of each fiber proj −1 (v) is proportional to 1/|Γ ṽ |, where ṽ is any lift of v; in other words, in terms of the map N o defined in (2.2), it is proportional to 1/N o (v).

2.3. Geometrically finite lattices. Following [3,50], we define a Nagao ray to be an edge-indexed graph (Q, ind) whose underlying graph Q is an infinite ray and the map ind takes value 1 on all edges directed towards infinity except the edge emanating from the vertex o at the origin. All edges e directed away from infinity are indexed by deg(∂ 0 (e)) − 1, as forced by (2.1). Here, an edge e ∈ EQ is said to be directed towards infinity if d(∂ 1 (e), o) > d(∂ 0 (e), o), and directed away from infinity otherwise. See Fig. 1 for an example of a Nagao ray in a (q 1 + 1, q 2 + 1)-biregular tree. An open Nagao ray is obtained by removing the origin vertex from a Nagao ray. Following Paulin [42], a tree lattice Γ is called geometrically finite if its associated edge-indexed graph (Q, ind) contains a finite subgraph F whose set theoretic complement in Q is a disjoint union of finitely many open Nagao rays. The finite part of (Q, ind) is the smallest non-empty finite subgraph F with this property. When T is a (q + 1)-regular tree, a tree lattice Γ is called of Nagao type if the associated edge-indexed graph is a Nagao ray (see [3,Chapter 10]). Fig. 2 illustrates the corresponding edge-indexed graph. Another example of a geometrically finite lattice, where T is a (3, 10)-biregular tree, is given in Fig. 3.

Figure 1. Nagao ray, when T is (q 1 + 1, q 2 + 1)-biregular. By convention, for an edge e, the index ind(e) is written next to the vertex ∂ 0 (e).

When Γ is geometrically finite, we have a very useful characterization of compact G 0 η -orbits in G/Γ (see [13,Lemma 6.2] or [42, Proposition 3.1]).
Proposition 2.1. Let Γ ≤ G be a geometrically finite lattice. Let g ∈ G be such that the G 0 η -orbit of gΓ is not compact in G/Γ. Let F denote the finite part of Q = Γ\T . Then π(g −1 y t ) belongs to F for infinitely many values of t; in particular, t − d(π(g −1 y t ), F ) is monotone non-decreasing and unbounded.

2.4. Markov chains.
We recall some terminology and basic facts of the theory of Markov chains and set our notation. For more details, we refer the reader to [20,33,37].
Let S be a countable set, and P : S × S → [0, 1] a Markov kernel, i.e. $\sum_{y \in S} P(x, y) = 1$ for every x ∈ S. By (standard) abuse of notation, we shall also denote the associated Markov operator and its dual by P : for a function f on S, $Pf(x) = \sum_{y \in S} f(y) P(x, y)$. For a measure µ on S, $\mu P(\cdot) = \sum_{x \in S} \mu(x) P(x, \cdot)$. For n ∈ N, P n denotes the n th convolution power of P . For s ∈ S, we denote by δ s the probability measure supported on {s}: for s 1 , s 2 ∈ S, P n (s 1 , s 2 ) := δ s 1 P n (s 2 ).
The Markov kernel P is called irreducible if for every s, t ∈ S, there exists n ∈ N with P n (s, t) > 0. The period of an irreducible Markov kernel P is defined as gcd{n ∈ N | P n (s, s) > 0} for some (or equivalently all) s ∈ S. If the period is 1, the Markov chain is called aperiodic. Denoting the period by p, there exists a partition Ω 0 , . . . , Ω p−1 of the state space S into cyclic classes Ω i such that for every s ∈ Ω i , P (s, Ω i+1 ) = 1 (i mod p). If P is irreducible and has period p, then P p restricted to each cyclic class is irreducible and aperiodic. In a standard manner [20, Section 3.1], a Markov kernel yields a canonical Markov chain on the state space S. Therefore, we shall equivalently speak of a Markov chain being irreducible, aperiodic etc.
A non-negative measure µ on S is said to be stationary for the Markov kernel P if µP = µ. An irreducible Markov kernel P is called positive recurrent if it admits a stationary probability measure, in which case this measure is unique. If, moreover, P has period p then $\mu = \frac{1}{p} \sum_{i=0}^{p-1} \mu_{|\Omega_i}$ is a stationary measure of P , where µ |Ω i is the unique stationary probability measure of P p restricted to Ω i . We also have $\mu_{|\Omega_i} P = \mu_{|\Omega_{i+1}}$ for every i (indices mod p). For an irreducible aperiodic positive recurrent Markov chain and any initial distribution µ, µP n converges to the stationary probability measure as n → ∞. In the case of an irreducible Markov chain that is not positive recurrent, µP n converges pointwise to 0, regardless of the period.
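The notions of period, cyclic classes and stationary measure recalled above can be checked on a small example. The following Python sketch uses a hypothetical 4-state kernel with cyclic classes Ω 0 = {0, 1} and Ω 1 = {2, 3}; the kernel and the helper names are chosen purely for illustration and do not come from the text.

```python
from fractions import Fraction
from math import gcd

# Hypothetical irreducible kernel of period 2: {0, 1} -> {2, 3} -> {0, 1}.
P = [
    [0, 0, Fraction(1, 2), Fraction(1, 2)],
    [0, 0, Fraction(1, 4), Fraction(3, 4)],
    [Fraction(1, 3), Fraction(2, 3), 0, 0],
    [Fraction(1, 2), Fraction(1, 2), 0, 0],
]


def step(mu, P):
    """Dual operator: (mu P)(y) = sum over x of mu(x) P(x, y)."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def period(P, s, n_max=20):
    """gcd of the return times n <= n_max with P^n(s, s) > 0."""
    g = 0
    mu = [int(x == s) for x in range(len(P))]   # delta_s
    for n in range(1, n_max + 1):
        mu = step(mu, P)
        if mu[s] > 0:
            g = gcd(g, n)
    return g


def cyclic_limits(P, s, n=30):
    """Approximations of lim delta_s P^(2n) and lim delta_s P^(2n+1)."""
    mu = [int(x == s) for x in range(len(P))]
    for _ in range(2 * n):
        mu = step(mu, P)
    return mu, step(mu, P)


if __name__ == "__main__":
    even, odd = cyclic_limits(P, 0)
    mu = [(a + b) / 2 for a, b in zip(even, odd)]   # (1/p) * sum of class measures
    print("period:", period(P, 0))
    print("stationary measure:", [float(m) for m in mu])
```

Here δ 0 P 2n converges to the stationary probability measure of P 2 on Ω 0 , one further step of P transports it to the one on Ω 1 , and their average is stationary for P , matching the formula for µ above.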

Non-escape of mass
The aim of this section is to prove Theorem A. We start by associating a Markov chain with a tree lattice Γ, study its properties and eventually link the Markov chain to the study of orbital measures in G/Γ of horospherical subgroups. If Γ is a uniform lattice, there is nothing to prove in Theorem A, so throughout the proof, Γ is assumed to be non-uniform.
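3.1. The Markov chain. Let Γ be a tree lattice and (Q, ind) be the corresponding edge-indexed graph. Define the Markov chain M n with state space EQ and transition probabilities given by
$$P(e, f) = \begin{cases} \dfrac{\operatorname{ind}(f)}{\deg(\partial_0 f) - 1} & \text{if } \partial_1(e) = \partial_0(f) \text{ and } f \neq \bar{e}, \\[2mm] \dfrac{\operatorname{ind}(\bar{e}) - 1}{\deg(\partial_0 \bar{e}) - 1} & \text{if } f = \bar{e}, \\[2mm] 0 & \text{otherwise,} \end{cases}$$
where ē denotes the edge e with reversed orientation. Note that by (2.1) the transition probabilities sum to 1, so that P is a Markov kernel. As the subsequent proofs will show, we are naturally led to the study of the Markov chain M n , which can simply be seen as the image under the quotient map π of the simple (non-backtracking) random walk on the set of edges of the tree T . It came to our knowledge that this Markov chain was considered by Burger and Mozes [9] in the study of the notion of divergence groups in Aut(T ) and by Kwon [32] in the study of mixing properties of the discrete geodesic flow.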
Let us illustrate the structure of this Markov chain, as well as our subsequent use of it, in a simple but important situation, namely when Γ is a lattice of Nagao type.
Example 3.1. Let Γ be a Nagao lattice in G ≤ Aut(T ), where T is a (q + 1)-regular tree (see Fig. 2 for the corresponding edge-indexed graph). In this case, the above construction of the Markov chain gives rise to a state space and transition probabilities as illustrated in Fig. 4.

Figure 4. Transition probabilities of M n when Γ is a lattice of Nagao type (for the labeling of edges, see Fig. 2).
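The transition probabilities of Example 3.1 can also be tabulated directly. The following Python sketch is written under the assumption that the chain is the projection of the non-backtracking walk, so that P (e, f ) = ind(f )/(deg(∂ 0 f ) − 1) for f ≠ ē and P (e, ē) = (ind(ē) − 1)/(deg(∂ 0 ē) − 1); vertices of the Nagao ray are labelled 0, 1, 2, ... and an edge is a pair (a, b) of adjacent labels. These labels and function names are hypothetical conventions for the illustration.

```python
from fractions import Fraction


def ind(edge, q):
    """Index of an edge of the Nagao ray of a (q + 1)-regular tree."""
    a, b = edge
    if b == a + 1:                      # directed toward infinity
        return q + 1 if a == 0 else 1
    return q                            # directed away from infinity


def out_edges(v):
    """Edges of the ray with initial vertex v."""
    return [(0, 1)] if v == 0 else [(v, v + 1), (v, v - 1)]


def kernel_row(e, q):
    """Non-zero transition probabilities P(e, .) of the edge chain."""
    a, b = e
    row = {}
    for f in out_edges(b):
        # The reversed edge loses one lift: the walk may not backtrack.
        weight = ind(f, q) - (1 if f == (b, a) else 0)
        if weight > 0:
            row[f] = Fraction(weight, q)    # deg(b) - 1 = q at every vertex
    return row


if __name__ == "__main__":
    for e in [(0, 1), (1, 2), (2, 1), (1, 0)]:
        print(e, kernel_row(e, q=3))
```

Each row sums to 1 because the indices of the edges leaving a vertex sum to its degree. The rows of edges pointing away from infinity consist of a single entry equal to 1, which is the deterministic descent toward the finite part discussed in the example.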
Consider a random trajectory of this Markov chain on its state space as depicted in the previous figure. The key phenomenon for us in this example is that once the trajectory turns toward the finite part (here, this corresponds to the edges facing left or up), it must deterministically walk all the way toward the finite part without a chance to turn around. This feature entails very strong recurrence properties which will allow us to control hitting times and, eventually, deduce convergence of the Markov chain to the stationary measure (up to issues of periodicity) even with a moving starting point. The latter property is crucial for Theorem A.

Proof. Since the graph Q is connected, it is sufficient to show that for any two edges e, f ∈ EQ such that ∂ 1 e = ∂ 0 f , we have P n (e, f ) > 0 for some n ≥ 1. If f ≠ ē, this holds for n = 1 by definition of P .
We show the existence of a path as above by contradiction. Suppose for any non-backtracking finite path starting at e we have ind(e n ) = 1. Such a path cannot end at a leaf, since then ind(e n ) = deg(∂ 0 e n ) = deg(∂ 1 e n ) > 2 by (2.1). Hence, we can extend it to produce an infinite non-backtracking path with ind(e i ) = 1 for all i ∈ N. In particular, N ∂ 0 (e) (e i ) ≤ 1 for all i, which contradicts the finiteness of the volume in (2.3).
In the case of geometrically finite lattices, we will prove positive recurrence of the associated Markov chain M n using Foster's drift criterion. Positive recurrence of M n in the setting of general tree lattices, which is required in the proof of Theorems C and E, is shown in Proposition 6.4 with a slightly more elaborate proof.
Assume Γ is a geometrically finite tree lattice, (Q, ind) its associated edge-indexed graph and F the finite part of Q. For e ∈ EQ, we use the notation |e| := d(∂ 1 (e), F ) to indicate the distance between an edge and the finite part F . For e ∉ F , we say that e is oriented toward the finite part if d(∂ 1 (e), F ) < d(∂ 0 (e), F ), and oriented toward the cusp otherwise.

Proof. For d 1 , d 2 ≥ 4, one easily verifies that for any e ∈ EQ, setting $V(e) = (3/2)^{|e|}$ and letting P be the Markov operator corresponding to M n , we have P V (e) < ∞ for all e ∈ F and P V (e) ≤ V (e) − 1/8 for all e ∈ EQ \ F.
In the case d 1 = d 2 = 3, a slightly different function V (which also works for the previous case) does the job. It is defined piecewise, according to whether e is oriented toward the finite part or not, and takes the value $100(3/2)^{|e|}$ in the latter case.

A simple combinatorial observation allows us to show that when Γ is geometrically finite, the period of the Markov chain M n is two. This is expressed in the following lemma:

When Γ is geometrically finite, one can simply take e, f to be two consecutive edges in a Nagao ray oriented toward the finite part so that the lemma applies.
Proof. Let m − 1 be the length of a path from e to ē along edges with positive transition probabilities. Since ind(e) > 1, we have P (ē, e) > 0, hence there is a loop of length m with positive transition probabilities along all edges. On the other hand, after the previous loop, one can follow the path from e to e, continue to f , then to f and finally back to e. This is a loop of length m + 2. Hence, the period divides both m and m + 2, which forces it to be 1 or 2. Moreover, since the action of Γ on T preserves a partition of the vertices into two sets (thanks to the assumption that Aut(T ) acts without edge inversion), the period cannot be 1, proving the claim.
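The parity argument can be illustrated numerically. The sketch below assumes the transition rule suggested by the description of Nagao rays in the text (an edge oriented toward the cusp continues with probability 1/q and turns around with probability (q − 1)/q, while an edge oriented toward the finite part continues deterministically), with the finite part reduced to a single root vertex. The ray is truncated at a hypothetical level `top` so that the state space is finite; the truncation and all names are illustrative only.

```python
from math import gcd

def truncated_nagao_chain(q, top):
    """States ('up', j) / ('down', j): an edge whose endpoint is x_j,
    oriented toward the cusp / toward the root x_0."""
    states = [("up", j) for j in range(1, top + 1)]
    states += [("down", j) for j in range(top)]
    index = {s: i for i, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for (direction, j), i in index.items():
        if direction == "up" and j < top:
            P[i][index[("up", j + 1)]] = 1 / q          # continue toward the cusp
            P[i][index[("down", j - 1)]] = (q - 1) / q  # turn around
        elif direction == "up":
            P[i][index[("down", j - 1)]] = 1.0          # artificial truncation
        elif j >= 1:
            P[i][index[("down", j - 1)]] = 1.0          # forced descent
        else:
            P[i][index[("up", 1)]] = 1.0                # bounce at the root
    return states, index, P

def period(P, s, horizon):
    """gcd of all n <= horizon with P^n(s, s) > 0."""
    reach, g = {s}, 0
    for n in range(1, horizon + 1):
        reach = {j for i in reach for j in range(len(P)) if P[i][j] > 0}
        if s in reach:
            g = gcd(g, n)
    return g

states, index, P = truncated_nagao_chain(q=3, top=6)
chain_period = period(P, index[("up", 1)], horizon=40)
```

In this model every step changes the level of the endpoint by exactly one, so all loops have even length, and the loops of lengths 2 and 4 through `("up", 1)` show that the period is exactly 2.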

3.1.2. Hitting time of the finite part. Let F be the finite part of the graph Q. For the Markov chain M n , we denote by τ the first hitting time of F , i.e. τ = min{n ∈ N | ∂ 1 (M n ) ∈ F }. By positive recurrence, τ is finite almost surely. To deal with periodicity, define τ := min{n ∈ N | n ≥ τ, 2|n}.
We start with a lemma that controls the probabilities of long hitting times of the finite part.

Proof. Clearly, a random walk starting at e can never hit F in fewer than |e| steps. Similarly, when |e| = 0, the claim is obvious. When |e| > 0, by definition of geometric finiteness, ∂ 1 (e) belongs to some Nagao ray. Because of the structure of Nagao rays (see Example 3.1), a Markov trajectory starting at an edge oriented toward the finite part F must necessarily take at least one step toward F . This can only change once the trajectory visits F . Hence, if e is oriented toward the finite part, we deduce that P e (τ = |e|) = 1, matching the upper bound in the statement.
On the other hand, if e is oriented toward the cusp, in order to avoid visiting F in the first i − 1 steps, the walk must take at least $(i-|e|)/2$ steps toward the cusp, each with probability $q^{-1}$. This gives the bound in the lemma.
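The hitting time estimate can be made concrete on a single Nagao ray. The following sketch computes the exact law of τ for a walk started at an edge oriented toward the cusp with |e| = k ≥ 1, under the assumption (read off from the discussion above, not from a displayed formula) that such an edge continues toward the cusp with probability 1/q, turns around with probability (q − 1)/q, and that edges oriented toward the finite part continue deterministically. The finite part is modelled by level 0; the function name is hypothetical.

```python
from fractions import Fraction as Fr

def hitting_time_law(q, k, horizon):
    """Exact law {n: P(tau = n)} for n <= horizon, starting from an edge
    oriented toward the cusp whose endpoint is at level k >= 1."""
    up, turn = Fr(1, q), Fr(q - 1, q)
    mass = {("up", k): Fr(1)}   # (orientation, level of the endpoint) -> probability
    law = {}
    for n in range(1, horizon + 1):
        new_mass = {}
        for (direction, level), p in mass.items():
            if direction == "up":
                moves = [(("up", level + 1), p * up),
                         (("down", level - 1), p * turn)]
            else:
                moves = [(("down", level - 1), p)]
            for state, w in moves:
                if state[1] == 0:            # endpoint reached the finite part
                    law[n] = law.get(n, Fr(0)) + w
                else:
                    new_mass[state] = new_mass.get(state, Fr(0)) + w
        mass = new_mass
    return law

q, k = 3, 2
law = hitting_time_law(q, k, horizon=12)
# Tail P(tau >= k + 2m), computed from the exact law.
tails = {m: 1 - sum(p for n, p in law.items() if n < k + 2 * m) for m in range(6)}
```

In this model the walk takes m steps toward the cusp, turns once, and then descends, so τ = k + 2m with probability $q^{-m}(q-1)/q$ and $P(\tau \ge k + 2m) = q^{-m}$, in line with the exponent $(i - |e|)/2$ appearing in the proof.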

3.1.3. Convergence of the Markov chain with varying initial distribution. As before, let P be the Markov operator corresponding to M n . Let Ω 0 , Ω 1 ⊆ EQ be its cyclic classes and, for j = 0, 1, denote by µ Ω j the unique $P^2$-stationary probability measure on Ω j . The next lemma describes the convergence of the Markov chain with moving initial distributions. The condition on the initial distributions will become clear later on, as this convergence plays a crucial role in the proof of Theorem A.

Lemma 3.6. Let Ω be a cyclic class of P and e(t) ∈ Ω be a sequence of edges in the same cyclic class, such that t − |e(t)| → ∞. Let n(t) be such that |t − 2n(t)| is constant, so that δ e(t) P 2n(t) is supported in Ω. Then $\|\delta_{e(t)} P^{2n(t)} - \mu_{\Omega}\| \to 0$ as t → ∞, where ‖·‖ denotes the total variation norm (see e.g. [37, §D.1.2]).
In the proof, we control the distributions with non-constant starting points e(t) by studying the behaviour of the Markov chain conditioned on the hitting time of the finite part. This, together with the precise control on the hitting time as provided by Lemma 3.5, allows us to prove the required convergence.
Proof. By conditioning the Markov chain on the hitting time τ (as defined in §3.1.2), we have Here, for every i ∈ 2N with P e(t) (τ = i) > 0, P e(t) (δ e(t) P 2n(t) ∈ ·|τ = i) denotes the probability measure on EQ given by $e \mapsto \frac{P_{e(t)}(M_{2n(t)} = e \text{ and } \tau = i)}{P_{e(t)}(\tau = i)}$.
With this notation, splitting the right-hand side of (3.1) into three sums, we get that the left-hand side of (3.1) is bounded above by where we used (3.2) for the first two sums, and (1) for the third. We need to show that the above tends to 0 as t → ∞. By (3), the first sum is identically 0 and, as t → ∞, the third sum tends to 0 by (4) and the fact that 2n(t) − |e(t)| tends to ∞.
We focus on the middle sum of (3.3), which after denoting N t = 2n(t) − |e(t)|, we rewrite as follows which converges to 0 as t → ∞ by (4). This concludes the proof.

Proof of Theorem A.
We now link the Markov chain to the study of orbital measures of horospherical orbits and use the properties of M n to prove Theorem A. Before starting the proof, we remark that it suffices to prove the result only for the Følner sequence F t . Indeed, let O be an M -invariant compact subset with non-empty interior in G 0 η , let a ∈ G be a hyperbolic element with attracting fixed point η and of (minimal) translation distance 2, and let $O_t = a^t O a^{-t}$ be the associated good Følner sequence. It follows by compactness of F 0 and O that for some n 0 ∈ N and every t ∈ N, we have (3.5) As a consequence, there exists c ∈ (0, 1) such that for every t ∈ N, the sequence One easily sees from these inequalities that the orbital measures ν x,t associated to F t have non-escape of mass if and only if those associated to O t have it.

3.2.1. Reduction to measures on the tree. For the rest of the section we fix x = gΓ ∈ X with non-compact G 0 η -orbit. Recall that for t ∈ N, ν x,t denotes the probability measure on the orbit F t x obtained by pushforward of the Haar probability measure on F t under the orbit map u → ux for u ∈ F t .
Denote by σ t the uniform probability measure on the finite set g −1 F tõ ⊂ V T . The following observation is the first step in reducing the proof of recurrence of horospherical orbits to studying recurrence properties of the Markov chain M n introduced earlier.
Lemma 3.7. For every t ∈ N * , we have proj * ν x,t = π * σ t . (3.7) Proof. Recall that x ∈ G/Γ is fixed and g ∈ G is such that x = gΓ. Consider the map f : G 0 η → T given by f (u) = g −1 u −1õ . Denote by O : u → ugΓ the orbit map. Then the following diagram clearly commutes: By definition, O * m Ft = ν x,t and hence it is enough to see that f * m Ft = σ t . This is readily verified and we are done.

3.2.2. Further reduction to shadows and the Markov chain. Above, we related the orbital measures ν x,t to σ t -distributions on V T . The next lemmas will link σ t to the distributions of the Markov chain.
For v ∈ V T and n ∈ N, denote by S(v, n) the set of vertices of T at distance n ≥ 0 from v. For w a neighbor of v, let S w (v, n) be the subset of S(v, n) consisting of vertices z ∈ V T such that d(z, w) < d(z, v). Thinking of v as a light source at the center of the sphere, we call S w (v, n) the shadow of w (see Fig. 5 for illustration). Denote by λ (v,w),n the uniform probability measure on the shadow S w (v, n).

Lemma 3.8. Let G be a non-compact closed subgroup of Aut(T ) that acts transitively on ∂T . For any t ∈ N * , we have where {ṽ it } is the collection of vertices in T neighboring g −1 y t except g −1 y t+1 .
Proof. The last equality directly results from the definition of the probability measure λ (v,w),t , therefore we focus on the first equality. Since all the shadows involved have the same cardinality, using the definitions of σ t and λ (v,w),t , the equality will follow if we show In other words, the set g −1 F tõ is the set of vertices on the sphere of radius t around g −1 y t except the shadow of g −1 y t+1 .
Since g acts by isometry, it is enough to show this for g = id. We clearly have To show the other inclusion, let ξ 1 , ξ 2 ∈ ∂T be such that (ξ i , η)∩[y 0 , η) ⊃ [y t , η) for i = 1, 2. It clearly suffices to show that there exists a sequence h n ∈ F t with h n ξ 1 → ξ 2 as n → ∞. To see this, note that since G is non-compact, closed and transitive on ∂T , by [10, Lemma 3.1.1] it acts doubly transitively on ∂T . Furthermore, since it is non-compact, it contains a hyperbolic element a that, thanks to double transitivity, we can suppose to have attracting point η and repelling point ξ 1 on ∂T . Similarly, up to conjugating a, let b be a hyperbolic element with attracting fixed point η and repelling fixed point ξ 2 . The sequence h n = b −n a n does the job and this concludes the proof.
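The combinatorics behind Lemma 3.8 (a sphere with one shadow removed is a disjoint union of the remaining shadows, all of the same cardinality) can be checked by brute force on a small ball. The sketch below is illustrative only: vertices at distance n from a root of a biregular tree are encoded as tuples of child labels, and the function names and the choice of degrees (3, 4) are assumptions made for the example.

```python
from fractions import Fraction as Fr

def sphere(d_root, d_other, n):
    """Vertices at distance n from the root of a (d_root, d_other)-biregular
    tree, encoded as tuples of child labels along the geodesic from the root."""
    level = [()]
    for depth in range(n):
        deg = d_root if depth % 2 == 0 else d_other
        branching = deg if depth == 0 else deg - 1   # one neighbor points back
        level = [v + (i,) for v in level for i in range(branching)]
    return level

def shadow(d_root, d_other, n, w):
    """Shadow S_w(v, n), n >= 1: vertices of the sphere whose geodesic from
    the root v passes through the neighbor of v labelled w."""
    return [z for z in sphere(d_root, d_other, n) if z[0] == w]

d_root, d_other, n = 3, 4, 3
S = sphere(d_root, d_other, n)
removed = set(shadow(d_root, d_other, n, 0))      # plays the role of the excluded shadow
rest = [z for z in S if z not in removed]

# Uniform measure on the sphere minus one shadow ...
sigma = {z: Fr(1, len(rest)) for z in rest}
# ... versus the average of the uniform measures on the remaining shadows.
average = {}
others = range(1, d_root)
for w in others:
    sh = shadow(d_root, d_other, n, w)
    for z in sh:
        average[z] = average.get(z, Fr(0)) + Fr(1, len(others)) * Fr(1, len(sh))
```

With these degrees the sphere has 18 vertices, each shadow has 6, and the two measures `sigma` and `average` coincide.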
where deg(.) denotes the valency of the vertexõ. Denote by ρ n the uniform measure on the sphere S(õ, n). In the following lemma, we realize the probability measures π * λ (v,w),n and π * ρ n as the n th -step distribution of our Markov chain with appropriate initial distributions. The fact that such a relation exists is not surprising as the Markov chain M n is obtained as a quotient of the simple random walk on the edges of the tree T .
To see the second claim, note that by construction of the Markov chain L n , we have Applying π * to both sides yields (3.10).
Remark 3.10. We remark here that the statements of Lemmas 3.7, 3.8 and 3.9 hold more generally for any lattice Γ of Aut(T ). Indeed, the proofs do not make use of the particular structure of a geometrically finite lattice.

Proposition 3.11. Let x = gΓ with non-compact G 0 η -orbit. Then the set of weak- * limit points of π * σ t is {∂ 1 * µ Ω 0 , ∂ 1 * µ Ω 1 }, where Ω 0 and Ω 1 are the two cyclic classes of P and, for i = 0, 1, µ Ω i is the unique $P^2$-stationary probability measure on Ω i as before.
Remark 3.12. In this proposition, the measures π * σ t depend on the point x = gΓ, but the set of limit points of π * σ t does not.
Proof. Combining Lemmas 3.8 and 3.9 and denoting v it := π(ṽ it ), we have for any For a fixed t ∈ N, the edges (π(g −1 y t ), v it ) belong to the same cyclic class, denote it by Ω j(t) . Up to passing to a subsequence (i.e. considering even or odd t's), which we also denote by t, we may assume that j(t) is constant. For each t, choose one vertex v(t) ∈ {v it } and denote the edge e(t) = (π(g −1 y t ), v(t)). (3.12) Up to passing to a further subsequence of t's, we may suppose that δ e(t) P t−1 is supported in a single cyclic class. Therefore, for some r ∈ {0, 1}, every t in this sequence can be written as t − 1 = 2n(t) + r, where n(t) ∈ N. Thus we can write $\delta_{e(t)} P^{t-1} = \delta_{e(t)} P^{2n(t)} P^{r}$. (3.13) Now, since by Proposition 2.1 we have t − |e(t)| → ∞, Lemma 3.6 applies and we deduce that $\|\delta_{e(t)} P^{2n(t)} - \mu_{\Omega_j}\| \to 0$ as t → ∞ for some j ∈ {0, 1}. Therefore, we have $\|\delta_{e(t)} P^{t-1} - \mu_{\Omega_i}\| \to 0$ as t → ∞, where i = j + r (mod 2). This finishes the proof.

Equidistribution
This section is devoted to the proof of Theorem B which we deduce from Theorem A and our previous work [13].
Fix a hyperbolic element a ∈ G of translation length 2 with attracting fixed point η. Denote by η − ∈ ∂T the repelling point of a and set η with non-empty interior. Let O t = a t Oa −t be the associated good Følner sequence for G 0 η . As before, for x ∈ X, denote by ν x,t the orbital measure m Ot * δ x . Let x ∈ X be such that G 0 η -orbit of x is not compact. By Theorem A, up to passing to a subsequence, we can suppose that for the weak- * topology and where m is a Borel probability measure on X. Furthermore, since O t is a Følner sequence, m is G 0 η -invariant. We need to show that m = m X .
Recall that by [13, Theorem 1.6], there exist countably many closed G 0 η -orbits in X. These are all compact and, for each cusp of Γ, there exists precisely one discrete one-parameter family of compact orbits. Denote by k ∈ N the number of cusps of Γ and let C i,j be the collection of compact G 0 η -orbits, where i = 1, . . . , k and j ∈ Z.
By the same result, we have aC i,j = C i,j+1 and $a^{-\ell} C_{i,j}$ escapes to infinity as ℓ → ∞, in the sense that for any compact set K, we have $K \cap a^{-\ell} C_{i,j} = \emptyset$ for every ℓ large enough (see e.g. the proof of [13, Lemma 6.2]). We first prove that m(C i,j ) = 0 for every i = 1, . . . , k and j ∈ Z. For a contradiction, suppose m(C i 0 ,j 0 ) > 0 for some i 0 , j 0 . Denote $\frac{1}{2} m(C_{i_0,j_0}) =: \varepsilon > 0$ and let K = K(ε) be the compact subset of X given by Theorem A. It follows by the latter result that we have m(K) ≥ 1 − ε. (4.2) Choose ℓ ∈ N large enough so that $a^{-\ell} C_{i_0,j_0} \cap K = \emptyset$. Since $a^{-\ell} x$ does not lie on a compact G 0 η -orbit either, using Theorem A, by passing to a further subsequence in (4.1), we can suppose that $\nu_{a^{-\ell} x,t}$ also converges to a G 0 η -invariant probability measure that we denote by $m_{a^{-\ell}}$. As in (4.2), by Theorem A, we have $m_{a^{-\ell}}(K) \ge 1 - \varepsilon$.
Using the relation $a^{-\ell} O_t a^{\ell} = O_{t-\ell}$, one verifies by a simple calculation that $m_{a^{-\ell}} = a^{-\ell}_* m$. Using this, we deduce a contradiction: $m_{a^{-\ell}}(a^{-\ell} C_{i_0,j_0}) = m(C_{i_0,j_0}) = 2\varepsilon$ and $a^{-\ell} C_{i_0,j_0}$ is disjoint from K, so that $m_{a^{-\ell}}(K) \le 1 - 2\varepsilon$. Therefore, m(C i,j ) = 0 for all i = 1, . . . , k and j ∈ Z.
We mention that at this point, one could conclude the proof by appealing to the classification of ergodic G 0 η -invariant Borel probability measures [13, Theorem 1.1]. However, that result has extra hypotheses on G, namely Tits independence property and a certain transitivity condition. On the other hand, for a geometrically finite lattice Γ, it is possible to give a similar classification of ergodic G 0 η -invariant Borel probability measures on G/Γ for a more general group G as in Theorem B. We single this out in the next proposition which is essentially contained in [13].
Proposition 4.1. Let T be a (d 1 , d 2 )-biregular tree, with d 1 , d 2 ≥ 3, and G a noncompact, closed and topologically simple subgroup of Aut(T ) acting transitively on ∂T . Let Γ be a geometrically finite lattice in G and η ∈ ∂T . Then, any G 0 η -invariant and ergodic Borel probability measure on X = G/Γ is either G 0 η -homogeneous and compactly supported, or it is the Haar measure m X .
To finish the proof of Theorem B, consider an ergodic decomposition of the G 0 η -invariant probability measure m. Since there are countably many closed G 0 η -orbits and each of them has zero measure with respect to m, the same holds for almost every ergodic component of m. Therefore, by Proposition 4.1, almost every ergodic component of m is the Haar measure m X , hence m = m X , completing the proof of Theorem B.
Proof of Proposition 4.1. We use the same notation introduced in the beginning of the proof of Theorem B, namely a is a hyperbolic element with attracting fixed point η ∈ ∂T , and the group M and the good Følner sequence O t are as defined there. Let m 0 be a G 0 η -invariant and ergodic probability measure on X. If m 0 gives positive mass to a compact G 0 η -orbit, then by ergodicity, it must be the homogeneous measure supported on that orbit. So let us suppose that m 0 gives zero mass to each compact G 0 η -orbit. By the pointwise ergodic theorem for amenable groups ([34, Theorem 1.2]), there exists a point y ∈ X that is generic with respect to m 0 and the tempered Følner sequence O t (see [13, §2.3]). By [13, Theorem 1.6], the G 0 η -orbit of y is dense in X. Then, by [13, Lemma 6.2], there exist a compact set K in X and a sequence of integers n k → ∞ such that a −n k y ∈ K for every k ∈ N. For a function θ ∈ C c (X), denote $\tilde\theta(z) := \int_M \theta(mz)\, dm_M(m)$, where m M is the Haar probability measure on M . The function $\tilde\theta$ is clearly M -invariant. Since G is closed, transitive and topologically simple, it has the Howe-Moore property [11, Proposition 4.2] and in particular the action of the hyperbolic element a on X is mixing. Therefore we can apply [13, Lemma 6.3], where we can take O + to be O t 0 for some t 0 ∈ Z small enough: for every θ ∈ C c (X), we have On the other hand, by the choice of y ∈ X, the left-hand side above also converges to $\int \tilde\theta(z)\, dm_0(z)$. It follows But since m 0 and m X are G 0 η -invariant, by Fubini's theorem, it follows that in other words, m 0 = m X as required.

Escape of mass phenomenon
This section contains the proofs of Theorems C and D. We start by proving an escape of mass result that implies Theorem C. Regarding the construction of a lattice Γ < Aut(T ) that figures in the following result, we note that by [3, §4.11, Example 1], for every q ≥ 2, there exists a lattice Γ ≤ G = Aut(T 2q+2 ) whose associated edge-indexed graph is as in Fig. 6. Clearly, this Γ is not geometrically finite.
Theorem 5.1. Let Γ be a tree lattice with associated edge-indexed graph (Q, ind) as in Fig. 6. Let x = eΓ ∈ X = G/Γ be the trivial coset. Let ξ ∈ ∂T be the end corresponding to the sequence {x i } i∈N for some lifts of x i and G 0 ξ be the corresponding horospherical subgroup. Then for any compact K ⊂ X lim t→∞ ν x,t (K) = 0.
Proof. By (3.6), it clearly suffices to prove the statement for the orbital measures ν x,t associated to the Følner sequence F t . Letõ be a lift of the left-most vertex x 0 to T = T 2q+2 . By Lemma 3.7 and the fact that proj has compact fibers, it is enough to show that if σ t is the uniform measure on F tõ ⊂ V T , then for any x l ∈ V Q, we have π * σ t (x l ) → 0.
The set F tõ can be identified with all non-backtracking paths in T of length t that start atx t and do not containx t+1 . A path fromx t to a vertex y ∈ F tõ with π(y) = x l projects to a path in Q between x t and x l . Note that the projection of such paths to Q can only contain x 0 as endpoint. These will allow us to bound the number of such paths.
Without loss of generality, assume that t is even. For l an even non-negative integer, we claim that the number of vertices in F tõ that project to x l is bounded above by $\binom{t}{l/2} \cdot (2q)^{t-l/2} \cdot 2^{l/2}$.
Indeed, any such path from x t to x l must take t − l/2 steps to the left and l/2 steps to the right in Fig. 6. The binomial coefficient counts the number of choices of when to take the right steps. Since the projection of the paths that we consider can only contain x 0 as an endpoint, for any choice of such a path, each edge taken to the right has at most 2 lifts to T , while each edge taken to the left has at most 2q lifts. Therefore, for any even l ≥ 0

The rest of this section is devoted to the proof of Theorem D. Its proof consists of four parts. In the first part, we construct an uncountable family of lattices Γ α in Aut(T 6 ). In the second part, thanks to an auxiliary Markov chain that we introduce, we obtain subgaussian concentration estimates on the Markov chain associated with the lattice Γ α (see §3.1). In the third part, we show that the space Aut(T 6 )/Γ α contains points x which exhibit escape of mass along some subsequence of horospherical orbital averages. In the fourth part, we show that along some other subsequences these averages equidistribute to the Haar measure.
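Returning to the counting bound in the proof of Theorem 5.1, the following sketch compares it numerically with the size of F tõ. It assumes, following the description in the proof of Lemma 3.8, that F tõ is a sphere of radius t in the (2q + 2)-regular tree with one shadow removed, so that it has $(2q+2)(2q+1)^{t-1} - (2q+1)^{t-1} = (2q+1)^t$ elements. The function names are hypothetical.

```python
from fractions import Fraction as Fr
from math import comb

def path_bound(q, t, l):
    """Upper bound from the proof on the number of vertices of F_t õ that
    project to x_l (t and l even)."""
    return comb(t, l // 2) * (2 * q) ** (t - l // 2) * 2 ** (l // 2)

def mass_bound(q, t, l):
    """Resulting upper bound on pi_* sigma_t(x_l), assuming |F_t õ| = (2q+1)^t."""
    return Fr(path_bound(q, t, l), (2 * q + 1) ** t)

q, l = 2, 4
bounds = {t: float(mass_bound(q, t, l)) for t in (100, 200, 400)}
```

For fixed l the binomial factor grows only polynomially in t while $(2q/(2q+1))^t$ decays exponentially, so the bound tends to 0; this is the mechanism by which every vertex x l loses its mass as t → ∞.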
Proof of Theorem D. First part: Construction of Γ α . For each α ∈ (1, 2), we will construct an edge-indexed graph (Q α , ind) of finite volume, which will yield a lattice Γ α ≤ Aut(T 6 ). First, the underlying graph is a ray, with vertices $\{x_i\}_{i=0}^{\infty}$ and edges Let α ∈ (1, 2) and for i ≥ 1, let n i = α i . We divide the vertices (x i ) i≥1 of the ray into two types: x j is black if j = i + n 1 + · · · + n i for some i ≥ 1, and white otherwise. In other words, there are blocks of n i consecutive white vertices that are separated by single appearances of black vertices. A white vertex x j is said to belong to the i-th block if $i + n_1 + \cdots + n_i < j < i + 1 + n_1 + \cdots + n_{i+1}$.
We say that an edge e belongs to the i-th block if both ∂ 0 e and ∂ 1 e do.
Second part: An auxiliary chain and subgaussian estimates. Consider the edge-indexed graph Q α and the Markov chain M n on EQ as in §3.1. For an edge e j that belongs to some block i, the transition probabilities of M n are given by $P(e_j, e_{j+1}) = P(\bar e_j, \bar e_{j-1}) = \frac{3}{5}$, $P(e_j, \bar e_j) = P(\bar e_j, e_j) = \frac{2}{5}$. (5.1) In view of the reductions in Section 3, we are interested in understanding the distribution of ∂ 0 * δ e j P m . For this, consider an auxiliary Markov kernel $\tilde P$ on a state space consisting of two elements {e, ē} with transition probabilities $\tilde P(e, e) = \tilde P(\bar e, \bar e) = \frac{3}{5}$, $\tilde P(e, \bar e) = \tilde P(\bar e, e) = \frac{2}{5}$. (5.2)
This auxiliary Markov chain records the behavior of M n along the edges within some block. It remembers the probabilities of an edge to turn around or to continue further in the same direction (see e.g. Fig. 4). Denote by V n the Markov chain associated to the kernel $\tilde P$. Given a word u ∈ {e, ē} n of length n ≥ 2, for s ∈ {e, ē} 2 denote by N s (u) the number of occurrences of the word s as a subword of u. For n ≥ 2, define the function f n on {e, ē} n by $f_n(u) = N_{ee}(u) + N_{e\bar e}(u) - N_{\bar e e}(u) - N_{\bar e \bar e}(u)$. Denote by Y n the integer-valued random variable f n (V 0 , V 1 , . . . , V n−1 ). Now, it is readily observed that for every i ∈ N large enough so that n i ≥ 16 and all integers j, m ≥ 2 with we have ∂ 0 * δ e j P m = ∂ 0 * e j+Ym in distribution. (5.4) Indeed, the inequalities (5.3) make sure that the starting edge e j in the i-th block is n i /4 away from the boundary of the i-th block, so application of P m keeps the support of the distribution within the i-th block and, therefore, transition probabilities at each step are given by (5.1). The relation with the auxiliary chain (5.2) is straightforward.
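To get a feeling for the random variable Y n , its law can be computed exactly by dynamic programming. The sketch below assumes that "subword" means a pair of consecutive letters, so that f n adds +1 for every position k < n − 1 with u k = e and −1 for every such position with u k = ē; the state label `"ebar"` stands for ē and the function names are hypothetical.

```python
from fractions import Fraction as Fr

STAY, FLIP = Fr(3, 5), Fr(2, 5)   # transition probabilities of the two-state kernel

def law_of_Y(n, start="e"):
    """Exact law {y: P(Y_n = y)} of Y_n = f_n(V_0, ..., V_{n-1}) with V_0 = start."""
    mass = {(start, 0): Fr(1)}            # (current letter, running value of f)
    for _ in range(n - 1):                # one increment per consecutive pair
        new_mass = {}
        for (s, y), p in mass.items():
            inc = 1 if s == "e" else -1   # pairs starting with e count +1, with ebar -1
            other = "ebar" if s == "e" else "e"
            for nxt, w in ((s, STAY), (other, FLIP)):
                key = (nxt, y + inc)
                new_mass[key] = new_mass.get(key, Fr(0)) + p * w
        mass = new_mass
    law = {}
    for (_, y), p in mass.items():
        law[y] = law.get(y, Fr(0)) + p
    return law

def tail(n, r):
    """P(|Y_n| >= r) starting from e."""
    return sum(p for y, p in law_of_Y(n).items() if abs(y) >= r)

law10 = law_of_Y(10)
mean10 = sum(y * p for y, p in law10.items())
far_tail = float(tail(200, 100))
```

Since the kernel is symmetric with second eigenvalue 3/5 − 2/5 = 1/5, the mean of Y n started from e is $\sum_{k=0}^{n-2} 5^{-k}$, which stays bounded, while deviations of order n are extremely unlikely. No specific subgaussian constant is asserted here; the computation only illustrates the kind of concentration used in the text.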
We now wish to use subgaussian concentration inequalities for the Markov chain V m (e.g. as discussed in [18]). To this end, we note that being an aperiodic irreducible Markov chain with finite state space {e, ē}, V m is geometrically ergodic and the state space is a small set in the sense of [ Now using the relation (5.4) and slightly decreasing the constant c 0 to c 0 (depending only on C > 0), we deduce that for every i ∈ N large enough, j, m ∈ N as in (5.3) and r ≤ m, we have P e j (M m ∈ {e j−r , . . . , e j+r ,ē j−r , . . . ,ē j+r }) It follows immediately that if an initial distribution µ is supported on a set S of edges {e j } which satisfy (5.3) for some i large enough, then for m as in (5.3) and r ≤ m we have

Third part: Showing the escape of mass. Now we construct points x = gΓ α ∈ G/Γ α which, under the horospherical group action, exhibit the dynamical behavior described in the statement of Theorem D. For an edge e ∈ EQ α , denote by |e| = d(e, x 0 ) the graph distance in Q α . In particular, |e j | = j.
(2) ∂ 0 e t+1 = ∂ 1 e(t). The path e(t) as above comes back infinitely often to x 0 , but makes some longer and longer visits towards the cusp. Now, choose a lift of e(t) in T 6 , starting at some basepointõ ∈ V T , which is the lift of x 0 . Letõ = y 0 , y 1 , y 2 , ... be consecutive vertices converging to η ∈ ∂T . Let g ∈ Aut(T 6 ) be an automorphism such that g −1 maps the edge (y i , y i+1 ) to the lift of the edge e(i), and x = gΓ α .
Let i k and t i k be increasing sequences of N, such that e(t i k ) is an edge, pointing toward the cusp, that is exactly in the middle of i k -th block, namely and t β i k < c 1 |e(t i k )| for some c 1 > 0. Such infinite subsequences exist by property (3) in the choice of e(t). In the notation above, e d i k = |e(t i k )|.
We shall now show that the sequence of measures given by the orbital averages ν x,t i k converges weakly to 0 as k → ∞.
By Lemma 3.7 and (3.11), it suffices to show that the sequence δ e(t i k ) P t i k of distributions of the Markov chain M n converges weakly to zero. To do this, we would like to apply (5.5) to show that after t i k -iteration most of the mass of the Markov chain stays in i k -th block, which moves to the cusp in Q α as k → ∞. However, constraints (5.3) are not satisfied since t i k ≥ d i k > n i k /8. We remind that n i k = α i k , d i k ∼ c 2 α i k and, thus t i k ≤ c 3 α i k 1/β for some positive constants c 2 , c 3 .
To overcome this problem, we apply (5.6) several times for a small number of allowed iterations, each time dismissing an exponentially small proportion of trajectories that move more than a distance r k to be chosen below.
Let m k = n i k 8 The number of times we wish to apply (5.6) is bounded above by The Markov property and choices of m k and r k allow us to repeatedly apply (5.6) N k times (with m = m k and r = r k ), each time conditioning on trajectories that do not move more than r k in each m kiterate. We get that proportion of trajectories starting at e(t i k ) that move at most N k · r k ≤ n i k /4 (in particular, do not leave i k -th block) is at least for some constant c 4 > 0. The above tends to 1 as k → ∞, implying the escape of mass for ν x,t 's when the underlying Følner sequence is F t . By (3.6), this also implies the escape of mass for ν x,t 's associated to any good Følner sequence.
Fourth part: Equidistribution. Recall that by our choice of x ∈ X, there exists an increasing sequence t k ∈ N such that |e(t k )| = 0 for every k ∈ N. The equidistribution statement follows from the following technical but more general result. This completes the proof of Theorem D.
Proposition 5.2. Let G be a non-compact, closed, topologically simple subgroup of Aut(T ) that acts transitively on ∂T . Let Γ be a lattice in G, η ∈ ∂T and O t a good Følner sequence in G 0 η . Let g ∈ G and denote by (ê(t)) t∈N a sequence of consecutive edges in T on a geodesic segment towards g −1 η. Assume that there exist a finite subset F of EQ and an increasing subsequence t k such that π(ê(t k )) =: e(t k ) ∈ F for every k ∈ N. Then for x = gΓ ∈ X, the orbital measures ν x,t k converge towards the Haar measure m X on X.
Proof. As before, fix a distinguished vertex õ in T with respect to which the map proj : G/Γ → V Q given by proj(hΓ) = π(h −1õ ) is defined. Let g ∈ G be as in the statement. Since the Markov chain M n associated with the lattice Γ is positive recurrent (Proposition 6.4), it follows by (3.6) and the correspondence established in Lemmas 3.7, 3.8 and 3.9 (see also Remark 3.10) that proj * ν x,t k converges to a probability measure $\tilde m$ on EQ. Since the map proj is proper, this implies that the sequence ν x,t k is tight, so that any subsequence of (ν x,t k ) k∈N has a limit point and any limit point is a probability measure on X. Let m be such a limit point along a subsequence that we also denote by t k . Since the ν x,t k are orbital measures associated to a Følner sequence in G 0 η , the limit probability measure m is G 0 η -invariant. Now, fix a hyperbolic element a ∈ G with attracting point η ∈ ∂T and such that the translation axis of a contains õ. Let $n_k = \lfloor t_k/\tau(a) \rfloor$, where τ (a) ∈ N denotes the translation length of a, so that |τ (a n k ) − t k | is bounded. For every k ∈ N, we have proj(a −n k gΓ) = π(g −1 a n kõ ) = π((g −1 a n k g)g −1õ ). As n ∈ N varies, (g −1 a n g)g −1õ describes vertices on the geodesic ray between g −1õ and g −1 η. Therefore it follows by the hypothesis e(t k ) ∈ F that for some larger finite set F ⊂ EQ, we have proj(a −n k gΓ) ∈ F for every k ∈ N. Since the map proj has compact fibres, this entails that there exists a compact set K ⊂ G/Γ such that a −n k gΓ ∈ K, (5.7) for every k ∈ N. Furthermore, since such a group G as in the statement enjoys the Howe-Moore property [11] (see also [36]), the action of a on (X, m X ) is mixing, so that we are in a position to apply [13, Lemma 6.3] as in the proof of Proposition 4.1. Now, repeating the same argument as in the end of the proof of Proposition 4.1 (i.e. (4.3) and thereafter), one deduces that m = m X and this proves the proposition.

Limiting distributions of spheres in quotient graphs and lattice point counting
This section is devoted to the proof of Theorems E and F . Recall from §3.1, the irreducible Markov chain M n associated with a tree lattice Γ. We proved in Lemma 3.3 that it is positive recurrent when Γ is a geometrically finite lattice. However, in Theorem E general lattices are considered. Here we will prove that, more generally, M n is positive recurrent for all lattices. In order to do this we introduce another Markov chain that will serve as a tool to analyse further the chain M n . Proof. It suffices to check that µ has finite l 1 -norm and is reversible, i.e. satisfies µ(w 1 )P (w 1 , w 2 ) = µ(w 2 )P (w 2 , w 1 ) for all w 1 , w 2 ∈ V Q. It is enough to consider pairs of neighbors w 1 , w 2 ∈ V Q. Indeed, for such we havê This shows that µ is a reversible measure on V Q. The fact that µ has finite l 1 -norm is a direct consequence of the volume formula (2.3): Remark 6.2. Recall from §2.2 that G is transitive on the set of vertices of V T at even distance fromõ. The image of proj : G/Γ → Γ \ T is the set of vertices at even distance from o. Moreover, from (2.4) it is clear that proj * m X is proportional to the restriction of µ to the image of proj. Hence, the measure µ can be thought of as the projection of Haar measure on G/Γ.

6.2. Positive recurrence of the Markov chain M n . First, we wish to relate the Markov chains $\hat M_n$ and M n . Denote by R n the n th step of the nearest-neighbor simple random walk on the vertices of the tree T and by δõ(R n ) its distribution when the initial vertex is õ ∈ V T , i.e. R 0 = õ almost surely. Note that since T is biregular, the restriction of δõ(R n ) to the spheres S(õ, m), for m ≤ n, is a multiple of the uniform measure on S(õ, m), which is denoted by ρ m as before. Let Dõ be the distribution on EQ given by (6.1)

Lemma 6.3. For any n ≥ 1 we have

Proof. Let R be the transition kernel for the simple random walk on the tree. For the first equality, one simply notes that $\hat P$(π(x), π(ỹ)) = R(x, π −1 (ỹ)). The second equality follows from the fact that the distribution of the n th step of the nearest-neighbor simple random walk on the tree is given by The statement follows after applying π * and Lemma 3.9.
In other words, the distribution of the chainM n starting from v ∈ V Q is given by weighted average of distributions given by M k with k ≤ n. We will use this relation to deduce the positive recurrence of M n from the positive recurrence ofM n . Proposition 6.4. The Markov chain M n is positive recurrent.
Proof. By Kingman's subadditive ergodic theorem, there exists r ∈ R such that $\frac{1}{k} d(\tilde o, R_k) \to r$, $P_{\tilde o}$-almost surely, hence also in measure, as k → ∞ (the value r is called the drift of the random walk R k ). Since max{d 1 , d 2 } ≥ 3, it is easily seen that r > 0. Let ε > 0. Then for all k ∈ N large enough, we have By positive recurrence of the auxiliary chain $\hat M_n$ (Lemma 6.1), there exists a finite subset K 1 of V Q such that for every n large enough, $P_o(\hat M_n \in K_1) > 1 - \varepsilon$.
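The positivity of the drift can be seen concretely through the distance process d(õ, R k ), which is itself a Markov chain on the non-negative integers. The sketch below computes its expectation exactly. The closed form used for comparison, $r = \frac{1}{2}\left(\frac{d_1-2}{d_1} + \frac{d_2-2}{d_2}\right)$, is a standard computation for this birth-and-death chain and is stated here as an assumption; it does not appear in the text. Function names are hypothetical.

```python
from fractions import Fraction as Fr

def expected_distance(d_root, d_other, k):
    """E[d(o, R_k)] for the simple random walk on the (d_root, d_other)-biregular
    tree started at a vertex o of degree d_root."""
    dist = {0: Fr(1)}                       # law of the distance to o
    for _ in range(k):
        new_dist = {}
        for j, p in dist.items():
            if j == 0:
                moves = [(1, Fr(1))]        # every neighbor of o is at distance 1
            else:
                deg = d_root if j % 2 == 0 else d_other
                moves = [(j + 1, Fr(deg - 1, deg)), (j - 1, Fr(1, deg))]
            for nj, w in moves:
                new_dist[nj] = new_dist.get(nj, Fr(0)) + p * w
        dist = new_dist
    return sum(j * p for j, p in dist.items())

def drift(d1, d2):
    """Assumed closed form for the drift r (average of the two one-step biases)."""
    return Fr(1, 2) * (Fr(d1 - 2, d1) + Fr(d2 - 2, d2))

k = 200
empirical_33 = float(expected_distance(3, 3, k)) / k
empirical_34 = float(expected_distance(3, 4, k)) / k
```

Under this formula r vanishes only when d 1 = d 2 = 2, which is consistent with the condition max{d 1 , d 2 } ≥ 3 used in the proof; for example, `drift(2, 3)` equals 1/6.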
In view of Lemma 6.3 and (6.3), we deduce that there exists a sequence $n_k \in \mathbb{N}$ with $|n_k - kr| \le k\varepsilon$ such that for every $k$ large enough
which implies that the irreducible chain $M_n$ is positive recurrent.
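The positivity of the drift $r$ used in the proof above can be illustrated numerically. The following Python sketch is not part of the argument: it computes the exact law of $d(\tilde{o}, R_n)$, that is, the mass that $\delta_{\tilde{o}}(R_n)$ gives to each sphere $S(\tilde{o}, k)$, under the assumption that $\tilde{o}$ has degree $d_1$. It then compares the mean distance with the value $\frac{1}{2}\left(\frac{d_1-2}{d_1} + \frac{d_2-2}{d_2}\right)$, which is our own elementary computation of the drift and is not taken from the text. The function names are hypothetical.

```python
def distance_distribution(n, d1, d2):
    """Exact law of d(o, R_n) for simple random walk on the (d1, d2)-biregular
    tree, started at a vertex o of degree d1 (assumption of this sketch).
    Entry k of the result is the mass of the sphere S(o, k) at time n."""
    dist = [1.0]  # at time 0 the walk sits at o
    for _ in range(n):
        new = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            if p == 0.0:
                continue
            if k == 0:
                new[1] += p  # from o every neighbor is at distance 1
            else:
                # vertices at even distance from o have degree d1, others d2
                d = d1 if k % 2 == 0 else d2
                new[k + 1] += p * (d - 1) / d  # d - 1 neighbors are further away
                new[k - 1] += p / d            # one neighbor is closer to o
        dist = new
    return dist


def mean_distance(n, d1, d2):
    """Expected distance E[d(o, R_n)]."""
    return sum(k * p for k, p in enumerate(distance_distribution(n, d1, d2)))


def drift(d1, d2):
    """Candidate closed form for the drift r (assumption, see the lead-in)."""
    return 0.5 * ((d1 - 2) / d1 + (d2 - 2) / d2)


if __name__ == "__main__":
    for d1, d2 in [(3, 3), (3, 4), (5, 7)]:
        n = 1000
        print(d1, d2, mean_distance(n, d1, d2) / n, drift(d1, d2))
```

In this sketch the ratio `mean_distance(n, d1, d2) / n` approaches `drift(d1, d2)` as `n` grows, and the latter is positive as soon as one of the degrees is at least 3.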
An alternative and more conceptual proof of Proposition 6.4 was kindly suggested to us by an anonymous referee. We discuss it in the following remark. Like our proof above, it relies on the fact that the Markov chain $M_n$ can be seen as a quotient of the simple random walk on $ET$, the set of edges of the tree $T$.
Remark 6.5 (Alternative proof of Proposition 6.4). Let $\tilde{P}$ be the Markov operator associated to the simple random walk on $ET$. Considering two successive edges $x$ and $y$ in $ET$, we have $ET = Gx \cup Gy \simeq G/G_x \cup G/G_y$, where $G_x$ and $G_y$ denote the respective stabilizers and $G = \mathrm{Aut}(T)$. Using this and the fact that $G$ is unimodular [1, Proposition 6], one sees that $ET$ carries a $\tilde{P}$-stationary and $G$-invariant measure $\tilde{\nu}$. The restriction of $\tilde{\nu}$ to $Gx$ (respectively $Gy$) corresponds to the $G$-invariant measure on $G/G_x$ (respectively $G/G_y$). On the other hand, the Markov operator $P$ of the Markov chain $M_n$ on $EQ \simeq \Gamma \backslash ET$ can be seen as the restriction of $\tilde{P}$ to $\Gamma$-invariant functions on $ET$, and the associated quotient measure $\nu$ of $\tilde{\nu}$ gives a $P$-stationary measure on $EQ$. But since $\Gamma < G$ is a lattice and $\nu$ is given by the quotient measure on $\Gamma \backslash G/G_x \cup \Gamma \backslash G/G_y$, we have that $\nu$ is finite, as required.
6.3. Proof of Theorem E. Here we prove parts 1. and 2. of Theorem E. Its third part about exponential equidistribution will be proven in §6.4.
If the irreducible and positive recurrent Markov chain $M_n$ has period $p \in \mathbb{N}$, then the sequence of distributions $D_{\tilde{v}}P^n$ has finitely many limit points $\{\mu_j\}_{j=0}^{p-1}$, corresponding to all possible convex combinations with coefficients $1/\deg(\tilde{v})$ of the unique stationary probability measures of $M_n$ on each one of its cyclic classes (corresponding to the classes of Dirac measures constituting $D_{\tilde{v}}$). This implies the convergence along subsequences $pn + j$ and hence (2) of Theorem E.
6.4. Exponential equidistribution of spheres in quotients by geometrically finite lattices. Previously, we established positive recurrence of $M_n$, which is sufficient to prove the existence of limiting distributions of spheres in quotients of trees by the action of tree lattices. However, in some cases, our Markov chain possesses a stronger property, namely that of geometric ergodicity. In these situations, the speed of convergence to the limiting distribution can be shown to be exponential and the exponential rate can even be made effective.
We begin by stating a version of the Geometric Ergodic Theorem for Markov chains. Out of the equivalent definitions of geometric ergodicity, we choose, for convenience, one that uses the (Foster-Lyapunov) drift criteria. We then prove geometric ergodicity of the Markov chain $M_n$ associated to geometrically finite tree lattices and discuss the application to exponential equidistribution of spheres. We refer the reader to [20,37] for more on geometric ergodicity.
Let $M_n$ be an irreducible, aperiodic and positive recurrent Markov chain on a countable state space $S$ with stationary probability measure $\mu$. Denote by $P$ the corresponding Markov operator. We call $M_n$ geometrically ergodic if there exists $r > 1$ such that for all $x \in S$ we have
$$\sum_{n \ge 0} r^n \|\delta_x P^n - \mu\| < \infty, \tag{6.5}$$
where $\|\cdot\|$ denotes the total variation norm. In particular, for a geometrically ergodic chain $M_n$, we have $\|\delta_x P^n - \mu\| = o(r^{-n})$ for every $x \in S$.
Theorem 6.6 (Geometric Ergodic Theorem). Let $M_n$ be an irreducible aperiodic Markov chain on a countable state space $S$. Assume that there exist a finite set $K \subset S$, $b \in \mathbb{R}$, $\beta < 1$ and a function $V \ge 1$, finite at some $x_0 \in S$, satisfying the drift criteria
$$PV(x) \le \beta V(x) + b\,\mathbf{1}_K(x) \quad \text{for any } x \in S. \tag{6.6}$$
Then $M_n$ is geometrically ergodic.
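To make the definition (6.5) concrete, here is a minimal Python sketch on a toy example that does not come from the text: a two-state aperiodic chain with hypothetical transition parameters `a` and `b`. For this chain the signed measure $\delta_x P^n - \mu$ decays exactly like $(1 - a - b)^n$, so the weighted series in (6.5) converges for every $r < 1/|1 - a - b|$. The norm is computed here as the $\ell^1$-norm of the signed measure, which is one common convention for total variation.

```python
def step(dist, P):
    """One step of the chain: the row vector dist multiplied by the matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def tv_distances(P, mu, x, n_steps):
    """l^1-norms of delta_x P^n - mu for n = 0, ..., n_steps - 1."""
    dist = [1.0 if i == x else 0.0 for i in range(len(P))]
    out = []
    for _ in range(n_steps):
        out.append(sum(abs(p - m) for p, m in zip(dist, mu)))
        dist = step(dist, P)
    return out


# Toy two-state chain (illustrative parameters, not from the paper).
a, b = 0.3, 0.2
P = [[1 - a, a], [b, 1 - b]]
mu = [b / (a + b), a / (a + b)]  # stationary probability measure of P

tv = tv_distances(P, mu, 0, 30)
# Partial sum of the series in (6.5) with r = 1.5 < 1 / (1 - a - b) = 2.
weighted_sum = sum(1.5 ** n * t for n, t in enumerate(tv))

if __name__ == "__main__":
    print(tv[:5], weighted_sum)
```

For countable chains such as $M_n$ no explicit spectral computation of this kind is available in general, which is why the drift criteria of Theorem 6.6 are used instead.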
Let us remark that the rate $r$ can be made explicit in terms of $\beta$ and $K$; see [4] for the treatment of the constant $r$. Finally, the aperiodicity hypothesis is only required to have a simple expression as in (6.5); if the Markov chain is not aperiodic, we shall still speak of geometric ergodicity if its restrictions to its cyclic classes are geometrically ergodic.
Lemma 6.7. Let $T$ be a $(d_1, d_2)$-biregular tree with $d_1, d_2 \ge 3$ and $\Gamma$ a geometrically finite tree lattice. Then the associated Markov chain $M_n$ is geometrically ergodic.
Proof. Let $F$ be the compact part of $Q$. For convenience of notation, we will assume $d_1 = d_2$. Let $q = d_1 + 1$.
We define the function $V : EQ \to [1, \infty)$ by
$$V(e) = \begin{cases} q^{0.1|e|} & \text{if } e \notin EF \text{ and } e \text{ points toward the finite part}, \\ \dfrac{q^{0.9|e|}}{2^{|e|}} & \text{otherwise.} \end{cases}$$
We claim that $V$ satisfies the drift criteria (6.6) with $\beta = q^{-0.1}$ and $b = q^5$. Recall that we have positive transition probabilities only among neighboring edges in $EQ$. For $e \in EQ \setminus EF$, the edge $e$ belongs to a Nagao ray. If $e$ is oriented toward the finite part and $|e| > 5$, then $PV(e) = V(f)$, where $|f| = |e| - 1$. Hence, $PV(e) = q^{0.1(|e|-1)} \le q^{-0.1} V(e)$.
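As a numerical sanity check of the drift inequality along a cuspidal ray (a sketch only, not part of the proof), the following Python snippet evaluates $PV(e)$ for both orientations of an edge at depth $|e| = n > 5$. It assumes the function $V$ of this proof and the transition probabilities $1/q$ and $(q-1)/q$ used in this proof for edges pointing toward the cusp. The names `V`, `PV` and `drift_holds` are hypothetical, and a small relative tolerance absorbs floating-point rounding in the case where the inequality is an equality.

```python
def V(n, toward_finite, q):
    """Lyapunov function on a ray edge at depth n = |e| outside the compact part."""
    if toward_finite:
        return q ** (0.1 * n)
    return q ** (0.9 * n) / 2 ** n


def PV(n, toward_finite, q):
    """One-step expectation of V from a ray edge at depth n."""
    if toward_finite:
        # Deterministic move one step closer, still pointing toward the finite part.
        return V(n - 1, True, q)
    # Pointing toward the cusp: continue outward with probability 1/q,
    # otherwise turn around and move one step closer.
    return (1 / q) * V(n + 1, False, q) + ((q - 1) / q) * V(n - 1, True, q)


def drift_holds(q, n_max=100):
    """Check PV(e) <= q^(-0.1) V(e) for 5 < |e| <= n_max and both orientations."""
    beta = q ** -0.1
    return all(
        PV(n, o, q) <= beta * V(n, o, q) * (1 + 1e-12)
        for n in range(6, n_max + 1)
        for o in (True, False)
    )


if __name__ == "__main__":
    print([(q, drift_holds(q)) for q in range(4, 12)])
```

For edges pointing toward the finite part the inequality is an equality, so the contraction factor $\beta = q^{-0.1}$ cannot be improved for this particular choice of $V$.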
If $e$ is oriented toward the cusp, the transition probabilities are $1/q$ to jump one step further away from $EF$ to an edge pointing toward the cusp and $(q-1)/q$ to get one step closer and point toward the finite part (see Example 3.1). In other words, for each edge $e$ with $|e| > 5$ we have
$$PV(e) = \frac{1}{q} \cdot \frac{q^{0.9(|e|+1)}}{2^{|e|+1}} + \frac{q-1}{q} \cdot q^{0.1(|e|-1)} \le \frac{1}{2} \cdot q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}} + q^{0.1(|e|-1)} \le q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}} = q^{-0.1} V(e).$$
The last inequality holds since for any $q \ge 4$ and $|e| > 5$
$$q^{0.1(|e|-1)} \le \frac{1}{2} \cdot q^{-0.1} \cdot \frac{q^{0.9|e|}}{2^{|e|}}.$$
The lemma follows by letting $K$ be the finite set of edges with $|e| \le 5$ (this also contains $EF$).
Finally, the description of the limit measures $\mu_j$ in the paragraph following the statement of Theorem E follows from the proof in §6.3 and Lemma 3.4, which says that the period of the Markov chain $M_n$ is always two, so that the Dirac masses constituting each distribution $D_{\tilde{v}}$ all belong to a single cyclic class.
Remark 6.8. In the context of homogeneous dynamics, inequalities of type (6.6) are often referred to as Margulis inequalities. They were first used in the work of Eskin-Margulis-Mozes [24] and Eskin-Margulis [23]. After we completed the first version of this article, Katz [31] used linear representations to prove Margulis inequalities for horospherical averages on lattice quotients of real semisimple groups, establishing quantitative non-divergence of these averages (as in Lemma 6.7). Combining this with a spectral gap, he also deduced an equidistribution result (as in Theorem B, but with a rate depending, among others, on certain Diophantine parameters of the starting point $x \in G/\Gamma$; cf. Remark 1.5). For $\mathrm{PSL}_2(\mathbb{R})$-quotients, more precise estimates were obtained earlier by Flaminio-Forni [27] and Strömbergsson [53], exploiting, among others, the (unitary) representation theory of $\mathrm{PSL}_2(\mathbb{R})$.
Remark 6.9. We remark that the family of lattices for which the associated Markov chain is geometrically ergodic and, consequently, for which part 3.
of Theorem E holds, contains many non-geometrically finite lattices. For example, the lattice associated with the edge-indexed graph from Fig. 6 is of this kind, with a Foster-Lyapunov function $V(x)$ similar to the one in the proof of Lemma 6.7.
6.5. Proof of Theorem F. Let $\Gamma$ be a geometrically finite lattice in $\mathrm{Aut}(T) =: G$, denote by $m$ a Haar measure on $G$ and let $m_X$ be the induced $G$-invariant finite measure on $G/\Gamma$ by choice of a Borel fundamental domain in $G$. Denote by $S_T(R)$ the cardinality of the sphere of radius $R$ around $\tilde{o}$ in $T$ and set $o = \pi(\tilde{o})$. As before, $\pi$ is the natural projection $VT \to VQ$ and $\rho_n$ denotes the normalized probability measure on the sphere of radius $n$ on $VQ$ with center $o$. Recall that $G$ has precisely two orbits on $VT$ and it acts transitively on the set of vertices of $T$ that are of even distance to each other, so that for every $\gamma \in \Gamma$, $2 \mid d(\gamma\tilde{o}, \tilde{o})$. For every $R \in \mathbb{N}$, we have
$$N(2R) = \sum_{n \le R} S_T(2n) \cdot \pi_*\rho_{2n}(o) \cdot |\Gamma \cap G_{\tilde{o}}|. \tag{6.7}$$
Thanks to (3) of Theorem E (see also the paragraph following that theorem), for some constant $r > 1$, we have
On the other hand, in (6.8), the term $\mathrm{proj}_* m_X(o)$ can be rewritten as
$$\mathrm{proj}_* m_X(o) = m_X(\mathrm{proj}^{-1}(o)) = m_X(G_{\tilde{o}}\Gamma) = \frac{1}{|G_{\tilde{o}} \cap \Gamma|}\, m(G_{\tilde{o}}). \tag{6.9}$$
Plugging (6.9) and (6.8) in (6.7) yields the desired statement. To see the alternative expression of the main term $\frac{m(G_{\tilde{o}})}{m_X(X)}$ as expressed after the statement of Theorem F, observe first that it follows by unimodularity of $\mathrm{Aut}(T)$ that for any two vertices $\tilde{v}, \tilde{w} \in VT$ with $2 \mid d(\tilde{v}, \tilde{w})$, we have $m(G_{\tilde{v}}) = m(G_{\tilde{w}})$. Now fixing a lift $\tilde{v}$ for every $v \in VQ$ with $2 \mid d(o, v)$ and an element $g_v$ such that $g_v\tilde{v} = \tilde{o}$, we have