1 Introduction

Generalizing the classical ergodic theorem of Birkhoff, ergodic theorems for actions of hyperbolic groups have long been a subject of interest, starting with Nevo–Stein [25] for free groups, and then Fujiwara–Nevo [17], Bufetov [10, 11], Bowen [8], and Bufetov–Series [12], among others.

Given a hyperbolic group \(\Gamma \) with a measure preserving action on a probability space \((X, m)\), these authors consider Cesàro averages of the following type: let S be a finite generating set for \(\Gamma \), and let \(S_n\) denote the sphere of radius n in the Cayley graph of \((\Gamma , S)\). Then for any function \(f :X \rightarrow \mathbb {R}\), any \(x \in X\) and \(N \ge 1\), one defines the averaging operator

$$\begin{aligned} c_N(f) := \frac{1}{N} \sum _{n \le N} \frac{1}{\# S_n} \sum _{|w| = n} f(w^{-1} x). \end{aligned}$$
(1)

The most recent results in this vein are due to Bufetov–Khristoforov–Klimenko [13], Pollicott–Sharp [26], and Bowen–Nevo [9], who, for measure-preserving actions of hyperbolic groups, establish convergence of the Cesàro averages \((c_N(f))_{N \ge 0}\) for \(f \in L^p(X, m)\) \((1 < p \le \infty )\) and for almost every point \(x \in X\). In these cases, the identification of the limit is a well-known open problem (see e.g. [13, Question p. 4799]). In this paper, we prove the convergence of such Cesàro averages for every starting point \(x \in X\), provided that X is a homogeneous space, and we identify the limiting measure.

To recall the setting of homogeneous dynamics, consider a real Lie group G, a lattice \(\Lambda <G\) and a subgroup \(\Gamma <G\). The subgroup \(\Gamma \) acts on \(G/\Lambda \) by left multiplication. The distribution of orbits of \(\Gamma \) in the homogeneous space \(G/\Lambda \) has been the object of much research. When \(G=\textrm{SL}_{2}(\mathbb {R})\) and \(\Gamma \) is the diagonal subgroup, the orbits of \(\Gamma \) are precisely hyperbolic geodesics in the unit tangent bundle of \(\mathbb {H}^2/\Lambda \). By ergodicity of the geodesic flow, almost every \(x\in G/\Lambda \) (with respect to Riemannian measure m on \(\mathbb {H}^2/\Lambda \)) has a dense orbit that is equidistributed with respect to Haar measure, i.e. for any continuous \(f:G/\Lambda \rightarrow \mathbb {R}\) we have \(\frac{1}{T} \int ^T_{0}f(g_tx) \ dt \rightarrow \int f dm\). Nevertheless some orbits of \(\Gamma \) are closed geodesics, and others are very wild: indeed, for any \(c\in [1,2]\) there is an orbit whose image in \(\mathbb {H}^2/\Lambda \) has closure of Hausdorff dimension c. When \(\Gamma <\textrm{SL}_{2}(\mathbb {R})\) is instead the (one parameter continuous) upper triangular subgroup, the orbits of \(\Gamma \) are horocycles in \(\mathbb {H}^2/\Lambda \) and these orbits are much more regular: indeed, Ratner’s celebrated theorem implies any \(\Gamma \)-orbit is either closed or dense in \(G/\Lambda \) [27]. In the latter case, the orbit is equidistributed with respect to Haar measure.

Here we are concerned with finitely generated subgroups \(\Gamma <G\), in particular ones which are word hyperbolic. We will show that under suitable assumptions, Cesàro averages of spheres in their Cayley graphs become equidistributed with respect to the Haar measure on \(G/\Lambda \). Assume that G is connected semisimple and \(\Gamma \) is Ad-Zariski dense, meaning that the image of \(\Gamma \) by the adjoint representation \(\textrm{Ad} :G \rightarrow \textrm{GL}(\mathfrak {g})\) is Zariski dense. A breakthrough of Benoist and Quint [3, 5, 6] implies that every \(\Gamma \) orbit in \(G/\Lambda \) is either finite or dense.

If \(x\in G/\Lambda \) is such that \(\Gamma x\) is infinite, we prove that its orbit equidistributes with respect to the Haar measure \(\nu \) on \(X = G/\Lambda \). More precisely:

Theorem 1.1

Let G be a connected semisimple real Lie group without compact factors, and \(\Lambda < G\) an irreducible lattice. Let \(X:= G/\Lambda \) and \(\nu \) be the Haar measure on X. Let \(\Gamma \) be a hyperbolic group, and consider a representation \(\rho :\Gamma \rightarrow G\) with Ad-Zariski dense image, which defines an action of \(\Gamma \) on X. Fix any finite generating set S of \(\Gamma \), and let \(S_n\) be the sphere of radius n in the Cayley graph of \((\Gamma , S)\). Then for any continuous \(f :X \rightarrow \mathbb {R}\) with compact support, and any \(x \in X\) we have that either the orbit \(\Gamma x\) is finite, or

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} \frac{1}{\# S_n} \sum _{|w| = n} f(w^{-1} x)\rightarrow \int _X f \ d \nu . \end{aligned}$$
(2)

We remark that since spheres in the Cayley graph are symmetric, one can instead consider \(f(wx)\) in the theorem statement rather than \(f(w^{-1}x)\); however, this would be less natural for the proof.

Of course, Eq. (2) is equivalent to the statement that the measures on X induced by averaging over the images of uniform measure on \(S_n\), for \(n= 1,\ldots , N\), converge weakly to \(\nu \). In particular, for any \(x\in X\) with \(\Gamma x\) infinite and any compact \(A \subset X\) with \(\nu (\partial A) = 0\),

$$ \frac{1}{N}\sum _{n=1}^N\frac{\#\{g\in S_n: gx \in A\} }{\# S_n} \longrightarrow \nu (A), \quad \text {as } N\rightarrow \infty . $$

To consider more general triples \((G,\Lambda , \Gamma )\), the hypothesis that \(\Gamma \) is Ad-Zariski dense can be replaced with the hypothesis that \(\textrm{Ad}(\Gamma ) \subset \textrm{GL}(\mathfrak {g})\) is Zariski connected semisimple with no compact factor and that \(\Gamma x\) is dense in X. In fact, we can assume more weakly that \(\overline{\Gamma x}\) is connected, in which case \(\nu \) is replaced by the unique invariant (Haar) probability measure \(\nu _{\overline{\Gamma x}}\) on the homogeneous space \(\overline{\Gamma x}\), which exists by Benoist–Quint [5]. See Sect. 3 for details.

Theorem 1.2

Let G be a real Lie group, \(\Lambda < G\) a lattice, and let \(\rho :\Gamma \rightarrow G\) be a representation of a hyperbolic group \(\Gamma \) into G. Suppose that the Zariski closure of \(\textrm{Ad}( \rho (\Gamma ))\) is Zariski connected, semisimple, and without compact factors. Fix any finite generating set S of \(\Gamma \), and let \(x \in X = G/\Lambda \) be such that the orbit closure \(\overline{\Gamma x}\) is connected. Then for any continuous \(f :X \rightarrow \mathbb {R}\) with compact support, we have

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} \frac{1}{\# S_n} \sum _{|w| = n} f(w^{-1} x)\rightarrow \int _X f \ d \nu _{\overline{\Gamma x}}. \end{aligned}$$
(3)

We also show that orbits along randomly chosen geodesic rays in \(\Gamma \) equidistribute in X (Theorem 7.1); see Sect. 7 for details.

In fact, our methods apply beyond hyperbolic groups, to groups admitting a thick geodesic combing, as defined in [20] (see Sect. 2). This class of groups includes relatively hyperbolic groups and right-angled Artin and Coxeter groups, for certain natural generating sets. Representations of such groups into \(\textrm{SL}(n, \mathbb {R})\) are a topic of considerable recent interest, especially in the context of higher Teichmüller theory [21, 22, 31,32,33].

The most general version of the theorem we prove is the following:

Theorem 1.3

Let G be a real Lie group, \(\Lambda < G\) a lattice, and let \(\Gamma \) be a finitely generated group with generating set S, such that \((\Gamma , S)\) has a thick geodesic combing. Let \(\rho :\Gamma \rightarrow G\) be a representation, and suppose that the Zariski closure of \(\textrm{Ad} (\rho (\Gamma ))\) is Zariski connected, semisimple, and without compact factors. Let \(x \in X = G/\Lambda \) be such that the orbit closure \(\overline{\Gamma x}\) is connected. Then for any continuous \(f :X \rightarrow \mathbb {R}\) with compact support, we have

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} \frac{1}{\# S_n} \sum _{|w| = n} f(w^{-1} x)\rightarrow \int _X f \ d \nu _{\overline{\Gamma x}}. \end{aligned}$$
(4)

For instance, using [18, Lemma 8.1], the above theorem applies to the following situations:

  • If \(\Gamma \) is relatively hyperbolic with virtually abelian peripheral subgroups then there exists a generating set of \(\Gamma \) with thick geodesic combing (see also [20, Sections 2.3, 9]);

  • If \(\Gamma \) is a non-abelian, irreducible, right-angled Artin or Coxeter group, and S is the vertex generating set, then \((\Gamma , S)\) has a thick geodesic combing (see also [20, Section 10]).

One particularly concrete application of Theorem 1.2 is the following. Let \(M = \mathbb {H}^3/\Lambda \) be a finite volume hyperbolic 3–manifold. It is well known by Shah [28] and Ratner [27] that every totally geodesic hyperbolic plane in M is either closed or dense. More precisely, for any \(x \in G/\Lambda \) (where \(G = \textrm{PSL}_2(\mathbb {C})\) is the group of orientation preserving isometries of \(\mathbb {H}^3\)) the orbit \(\textrm{PSL}_2(\mathbb {R}) x\) is either closed or dense. In the latter case, Theorem 1.2 implies that if \(\Gamma \) is any discrete, nonelementary subgroup of \(\textrm{PSL}_2(\mathbb {R})\) and S is any generating set of \(\Gamma \), then either \(\Gamma x\) is finite or spheres \(S_n\) in the Cayley graph for \((\Gamma ,S)\) equidistribute; i.e. averages of the counting measures on \(S_n x\) converge to the invariant (Haar) measure on \(G/\Lambda \). In the case where \(\textrm{PSL}_2(\mathbb {R}) x\) is closed, the equidistribution occurs with respect to the Haar measure on the orbit closure.

Another application of our techniques is to actions on tori, where in fact we do not need to take the Cesàro average to guarantee convergence.

Theorem 1.4

Let \(\rho :\Gamma \rightarrow \textrm{SL}(d, \mathbb {Z})\) be a representation of a hyperbolic group with Zariski dense image. Let \(x\in \mathbb {T}^d\) be any irrational point. Then for any continuous \(f :\mathbb {T}^d \rightarrow \mathbb {R}\) we have

$$\begin{aligned} \frac{1}{\# S_n} \sum _{|w| = n} f(w^{-1} x)\rightarrow \int f dm \end{aligned}$$
(5)

where the integral is taken with respect to Haar measure on \(\mathbb {T}^d\).
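The sphere averages in Eq. (5) can be explored numerically. The following is a minimal sketch under assumptions that are ours and not taken from the text: \(\Gamma \) is the free group on two generators, mapped to \(\textrm{SL}(2, \mathbb {Z})\) by the matrices below (a standard pair generating a finite index free subgroup), the base point is \((\sqrt{2}, \sqrt{3})\) mod 1, and the test function is a single cosine. Since spheres are symmetric, averaging \(f(wx)\) over \(S_n\) gives the same value as averaging \(f(w^{-1}x)\).

```python
import math

# Hypothetical representation (an assumption for illustration): generators of
# the free group F_2 and their inverses as matrices in SL(2, Z).
GENS = {
    "a": ((1, 2), (0, 1)),
    "A": ((1, -2), (0, 1)),   # inverse of a
    "b": ((1, 0), (2, 1)),
    "B": ((1, 0), (-2, 1)),   # inverse of b
}
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def reduced_words(n):
    """Yield all reduced words of length n >= 1, i.e. the sphere S_n of F_2."""
    def extend(word):
        if len(word) == n:
            yield word
            return
        for s in GENS:
            # Never follow a letter by its inverse.
            if not word or INVERSE[word[-1]] != s:
                yield from extend(word + s)
    yield from extend("")


def act(word, x):
    """Apply the matrix product of `word` to a point x of the torus R^2/Z^2."""
    for s in reversed(word):          # the rightmost letter acts first
        (p, q), (r, t) = GENS[s]
        # Integer matrices commute with reduction mod 1, so reduce at each step.
        x = ((p * x[0] + q * x[1]) % 1.0, (r * x[0] + t * x[1]) % 1.0)
    return x


def sphere_average(f, x, n):
    """Average f(w x) over all w in the sphere S_n."""
    words = list(reduced_words(n))
    return sum(f(act(w, x)) for w in words) / len(words)


if __name__ == "__main__":
    x0 = (math.sqrt(2) % 1.0, math.sqrt(3) % 1.0)      # an irrational point
    f = lambda y: math.cos(2 * math.pi * y[0])         # Haar integral of f is 0
    for n in range(1, 7):
        print(n, sphere_average(f, x0, n))
```

The printed values are only a numerical illustration of the statement for small n; they are no substitute for the proof, and floating point error grows with the word length.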

Theorem 1.4 follows by using the recent work of He–de Saxcé [24] (extending work of [7]) in place of [5]. In fact, it extends to more general nilmanifolds, using [23]. This leads us to ask the following question:

Question 1.1

In the context of Theorem 1.1, for which representations \(\rho :\Gamma \rightarrow G\) can the Cesàro average in Eq. (2) be removed?

This is closely related to the well-known question of Benoist–Quint [4, Question 3] concerning whether the Cesàro average appearing in Theorem 3.1 can be removed. Although partial progress has been made in [2], the hypotheses there are incompatible with the case of interest here; the measures \(\mu _j\) appearing in the proof of Lemma 4.1 have the property that distinct convolution powers have disjoint support.

2 Geodesic combings

In this section, we recall some basic properties of graph structures and geodesic combings of groups. For hyperbolic groups, the essential features are due to Cannon [15] and Calegari–Fujiwara [14]. For the general case, we refer to [18, 20].

Fix a finitely generated group \(\Gamma \) and any finite subset \(S \subset \Gamma \), which we usually take to be a generating set of \(\Gamma \). A graph structure for \((\Gamma , S)\) is a triple \((D,v_0,\textrm{ev})\), where D is a finite directed graph, \(v_0\) is a vertex of D which we call its initial vertex, and \(\textrm{ev}:E(D) \rightarrow S\subset \Gamma \) is a map that labels the edges of D with elements from S. We extend the map \(\textrm{ev}\) by defining for each finite (always directed) path \(g = g_1 \dots g_n\) the group element \(\textrm{ev}(g) = \textrm{ev}(g_1) \dots \textrm{ev}(g_n)\) in \(\Gamma \). To simplify notation, we will use \(\overline{g} = \textrm{ev}(g)\) to denote the group element associated to the path g. Additionally, if there is an action \(\Gamma \curvearrowright X\), we write gx to mean \(\overline{g}x\) for a path g in D.

For a graph structure D, we define \(\Omega \) to be the set of all infinite paths starting at any vertex of D. By \(\Omega ^n\) we mean the set of all paths of length n and set \(\Omega ^* = \cup _{n\ge 1} \Omega ^n\). Further, if v is a vertex of D, then \(\Omega _v\) (or \(\Omega _i\) if \(v =v_i\)) is the set of infinite paths starting at v, and similarly for \(\Omega _v^n\).

The graph structure D is geodesic if the map \(\textrm{ev}:\Omega _{v_0}^* \rightarrow \Gamma \) is injective and length preserving when \(\Gamma \), or more precisely the subgroup generated by S, is given the word metric for the generating set S. If it is also surjective, then D is said to be a geodesic combing. In this case, evaluation induces a bijection from \(\Omega _{v_0}^n\) to \(S_n\) for each \(n\ge 1\), where \(S_n\) is the sphere of radius n with respect to S. In this paper, each graph structure will come from starting with a geodesic combing D for \(\Gamma \) and applying one or both of the following operations:

  (1) restrict the evaluation map to some subgraph \(D'\); if the subgraph does not contain \(v_0\) then choose an arbitrary vertex of \(D'\), or

  (2) replace the graph D with its associated p-step graph structure \(D_p\). The vertices of \(D_p\) are equal to those of D and each edge of \(D_p\) (and its label) naturally corresponds to a path of length p in D.

We observe that if D is any geodesic graph structure, then so are each of \(D'\) and \(D_p\) as defined in items (1) and (2) above.

According to Cannon [15], for any hyperbolic group and any finite generating set there is an associated geodesic combing.
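As a concrete instance of these definitions, here is a minimal sketch of a geodesic combing for the free group on two generators with its standard generating set. The example and all names in it are our own illustration, not a construction from the text: the initial vertex is 0, the other vertices remember the last letter read, and an edge is allowed unless it would cancel the previous letter. Paths of length n from the initial vertex then spell exactly the reduced words of length n.

```python
# Hypothetical example: an automaton recognising reduced words in F_2 = <a, b>.
# Capital letters denote inverses.
LETTERS = ["a", "A", "b", "B"]
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def build_graph():
    """Return the labelled directed graph as: vertex -> list of (label, target)."""
    # Vertex 0 is the initial vertex v_0; vertex 1 + k means "last letter was LETTERS[k]".
    edges = {0: [(s, 1 + k) for k, s in enumerate(LETTERS)]}
    for k, last in enumerate(LETTERS):
        edges[1 + k] = [(s, 1 + m) for m, s in enumerate(LETTERS)
                        if s != INVERSE[last]]
    return edges


def paths_from(edges, v, n):
    """List the label sequences of all directed paths of length n starting at v."""
    if n == 0:
        return [""]
    return [s + rest
            for s, target in edges[v]
            for rest in paths_from(edges, target, n - 1)]


if __name__ == "__main__":
    graph = build_graph()
    for n in range(1, 5):
        words = paths_from(graph, 0, n)
        # In a free group, distinct reduced words are distinct group elements,
        # so evaluation is a bijection onto the sphere of radius n.
        print(n, len(words), len(set(words)))
```

In this example the number of paths of length n from the initial vertex is \(4 \cdot 3^{n-1}\), which is the size of the sphere of radius n in the free group of rank 2.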

2.1 Structure of geodesic combings for hyperbolic groups

We define two vertices \(v_i, v_j\) to be equivalent if there is a path from \(v_i\) to \(v_j\) and a path from \(v_j\) to \(v_i\), and the (recurrent) components of D as the equivalence classes for this relation.

We denote by A the transition matrix for D. By Perron–Frobenius, A has a real eigenvalue of largest modulus, which we will denote by \(\lambda \). Following Calegari–Fujiwara [14], we say that the matrix A is almost semisimple if for any eigenvalue of maximal modulus, its geometric and algebraic multiplicity agree. For example, Calegari–Fujiwara prove that when D is a geodesic combing of a hyperbolic group, A always satisfies this property. We additionally call a geodesic combing (or more generally a graph structure) semisimple or primitive if its transition matrix has those properties. Recall that a matrix is semisimple if its only eigenvalue of maximal modulus is real positive and primitive if it has a positive power. In general, primitive \(\implies \) semisimple \(\implies \) almost semisimple.

Let D be almost semisimple, and let \(\lambda \) be the leading eigenvalue of A. Then we say a vertex v is of large growth if

$$\lim _{n \rightarrow \infty } \frac{1}{n} \log \# \{ \text {paths of length } n \text { starting at } v \} = \log \lambda $$

and of small growth otherwise (in which case the limit above is \(< \log \lambda \)). Furthermore, a component C of D is maximal if

$$\lim _{n \rightarrow \infty } \frac{1}{n} \log \# \{ \text {paths of length } n \text { inside } C \} = \log \lambda .$$

The component-wise structure of D is as follows: there is no path between maximal components and vertices of large growth are precisely the ones which have a path to a maximal component. See [14] or [20].
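The notions of component and growth can be made concrete on a small graph. The sketch below uses a hypothetical transition matrix of our own choosing, namely that of the standard five-vertex automaton for reduced words in the free group of rank 2 (vertex 0 is initial, vertices 1 to 4 record the last letter a, A, b, B). It counts paths by repeated multiplication and finds the recurrent components via reachability by nonempty paths; the latter convention is an assumption on our part.

```python
# Hypothetical transition matrix: A[i][j] = number of edges from vertex i to vertex j.
A = [
    [0, 1, 1, 1, 1],   # from the initial vertex: any letter
    [0, 1, 0, 1, 1],   # after a: a, b, B
    [0, 0, 1, 1, 1],   # after A: A, b, B
    [0, 1, 1, 1, 0],   # after b: a, A, b
    [0, 1, 1, 0, 1],   # after B: a, A, B
]


def path_counts(A, n):
    """For each vertex v, return the number of directed paths of length n from v."""
    k = len(A)
    counts = [1] * k                      # one path of length 0 from each vertex
    for _ in range(n):
        counts = [sum(A[i][j] * counts[j] for j in range(k)) for i in range(k)]
    return counts


def recurrent_components(A):
    """Classes of vertices joined in both directions by nonempty directed paths."""
    k = len(A)
    reach = [[A[i][j] > 0 for j in range(k)] for i in range(k)]
    for m in range(k):                    # Warshall's transitive closure
        for i in range(k):
            for j in range(k):
                reach[i][j] = reach[i][j] or (reach[i][m] and reach[m][j])
    components, seen = [], set()
    for i in range(k):
        if i in seen or not reach[i][i]:
            continue                      # skip vertices lying on no cycle
        comp = [j for j in range(k) if reach[i][j] and reach[j][i]]
        seen.update(comp)
        components.append(comp)
    return components


if __name__ == "__main__":
    print(recurrent_components(A))
    c10, c11 = path_counts(A, 10), path_counts(A, 11)
    print([c11[v] / c10[v] for v in range(len(A))])   # successive ratios of path counts
```

Here there is a single recurrent component, consisting of vertices 1 to 4, and the path counts from every vertex are multiplied by the leading eigenvalue 3 at each step, so in this example the component is maximal and every vertex, including the initial one, has large growth.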

2.1.1 Loop semigroups and thickness

Given a vertex v, we denote by \(D_v\) the loop semigroup of v, i.e. the set of all finite paths from v to itself. This is a semigroup under concatenation, and all its elements lie entirely in the component of v. The evaluation map embeds \(D_v\) into \(\Gamma \) as a semigroup which we denote by \(\Gamma _v\). Abusing terminology slightly, we also refer to \(\Gamma _v\) as the loop semigroup.

Finally, we recall that any geodesic combing of a hyperbolic group has a fundamental property which we call thickness: for any vertex v in a maximal component, there is a finite set \(B \subset \Gamma \) such that \(\Gamma = B \cdot \Gamma _v \cdot B\). Here, the equality is in the group \(\Gamma \). See [18, Lemma 8.1] and the references therein. In general, any thick geodesic combing of a finitely generated group is automatically almost semisimple [18, Lemma 2.3].

We conclude by remarking that if D is a thick geodesic combing, then the graph structures obtained by either restricting to a subgraph \(D'\) of maximal growth or taking the p-step graph structure \(D_p\) are themselves thick. See [18, Section 7] for details.

3 Random walks on \(\Gamma \) and passing to loop semigroups

Let G be a real Lie group and \(\Lambda \) a lattice in G. Let \(\Gamma \) be a subsemigroup of G which is generated by the support of a Borel probability measure \(\mu \).

A closed subspace \(Y \subset G/\Lambda \) is called homogeneous if its stabilizer \(G_Y \le G\) acts transitively on Y. If, in addition, \(G_Y\) preserves a Borel probability measure, Y is said to have finite volume. Such a measure is unique and is denoted by \(\nu _Y\). If the subsemigroup \(\Gamma \) is a subgroup of \(G_Y\), then Y is \(\Gamma \)–invariant and if the action \(\Gamma \curvearrowright (Y,\nu _Y)\) is ergodic, then Y is called \(\Gamma \)–ergodic.

The following theorem is due to Benoist–Quint in the case where \(\mu \) is compactly supported. The generalization stated here, required for our application, is due to Bénard–de Saxcé.

Theorem 3.1

(Benoist–Quint [5], Bénard–de Saxcé [1, Theorem C]) Suppose that the Zariski closure of \(\textrm{Ad}(\Gamma ) \le \textrm{GL}(\mathfrak {g})\) is Zariski connected and semisimple with no compact factors. Further assume that the measure \(\mu \) has finite first moment. Then, for every \(x \in G/\Lambda \):

  (1) The orbit closure \(Y = \overline{\Gamma x} \subset G/\Lambda \) is a \(\Gamma \)–invariant ergodic finite volume closed homogeneous subspace.

  (2) The sequence of measures \(\left( \frac{1}{n} \sum _{k=0}^{n-1} \mu ^{*k} *\delta _x \right) _{n\ge 1}\) converges to \(\nu _Y\) in the weak–\(*\) topology.

  (3) For \(\mu ^{\otimes \mathbb {N}^*}\)–almost every sequence \((g_i)_{i\ge 1}\), the sequence of empirical measures \(\left( \frac{1}{n} \sum _{k=0}^{n-1} \delta _{g_k \ldots g_1 x} \right) _{n\ge 1}\) converges to \(\nu _Y\) in the weak–\(*\) topology.

The following lemma allows us to pass conditions from \(\Gamma \) to loop semigroups of the geodesic combing. This is a fundamental step toward applying Theorem 3.1 in the proof of Theorem 1.3.

Lemma 3.2

Suppose that \(\Gamma \) is a group satisfying the hypotheses of Theorem 1.2 (or more generally Theorem 1.3) and that \(\Gamma _v\) is a subsemigroup of \(\Gamma \) with the property that the group \(\Gamma _v^\pm \) generated by \(\Gamma _v\) has finite index in \(\Gamma \). Then

  (1) the Zariski closures of \(\textrm{Ad}(\Gamma _v)\) and \(\textrm{Ad}(\Gamma )\) in \(\textrm{GL}(\mathfrak {g})\) are equal, and

  (2) if \(Y = \overline{\Gamma x} \subset G/\Lambda \) is connected, then Y is also the orbit closure of \(\Gamma _v x\).

Proof

For item (1), first note that by Goldsheid–Margulis [29, Lemma 3.3], the Zariski closure \(\mathcal Z(\Gamma _v)\) is a group, which implies \(\mathcal Z(\Gamma _v) \supseteq \Gamma _v^\pm \). Hence, \(\mathcal Z(\Gamma _v) = \mathcal Z( \Gamma _v^\pm )\) and since \(\Gamma _v^\pm \) is finite index in \(\Gamma \), the subgroup \(\mathcal Z( \Gamma _v)\) is finite index in \(\mathcal Z( \Gamma )\) and thus a finite union of components. But since \(\mathcal Z( \Gamma )\) is Zariski connected, this implies that \(\mathcal Z(\Gamma _v) = \mathcal Z( \Gamma )\) as claimed.

For item (2), let \(Y_v, Y_v^\pm , Y\) be the orbit closures of \(\Gamma _v, \Gamma _v^\pm , \Gamma \), respectively, based at x. By Theorem 3.1 and the first item, each of these is a finite volume homogeneous space. Since \(G_{Y_v}\) is a group containing \(\Gamma _v\), it also contains \(\Gamma _v^\pm \), hence \(\Gamma _v^\pm x \subseteq G_{Y_v} x\) and by taking the closures \(Y_v^\pm \subseteq Y_v\). Since \(Y_v \subseteq Y_v^\pm \) by definition, we obtain \(Y_v = Y_v^\pm \), hence also \(G_{Y_v^\pm } = G_{Y_v}\).

Write \(\Gamma = \bigcup _{b\in B} b\Gamma _v^\pm \) for a finite set \(B \subseteq \Gamma \), which we can assume to contain the identity, so that \(Y = \bigcup _{b \in B} b Y_v\). Hence the smooth properly embedded submanifold Y is a finite union of smooth properly embedded submanifolds diffeomorphic to \(Y_v\), and so each \(b Y_v\) is the union of connected components of Y. When Y is connected, we conclude that \(Y = Y_v\) and \(G_Y = G_{Y_v}\). \(\square \)

We will sometimes use the notation

$$\overline{f}(x):= \int f \ d \nu _{\overline{\Gamma x}},$$

which we observe determines a \(\Gamma \)–invariant function.

Lemma 3.3

Let \(\mu \) be a generating measure on \(\Gamma < G\) with finite first moment, and suppose that the Zariski closure of \(\textrm{Ad}(\Gamma )\) is Zariski connected and semisimple without compact factors. Let \(w_n = g_1 \dots g_n\) be the right random walk driven by \(\mu \). Let \(f :X \rightarrow \mathbb {R}\) be continuous, compactly supported. Then for any \(g, h \in \Gamma \) and any \(x \in X\), we have

$$\frac{1}{N} \sum _{n \le N} f(g w_n^{-1} h x) \rightarrow \overline{f}(x)$$

for almost every \((w_n)\).

Proof

We apply Theorem 3.1 to the measure \(\check{\mu }(g):= \mu (g^{-1})\). Then a sample path for the left random walk driven by \(\check{\mu }\) is given by \(h_n \dots h_1 = (g_n)^{-1} \dots (g_1)^{-1} = (g_1 \dots g_n)^{-1}\) where \(g_1 \dots g_n\) is a sample path for the right random walk driven by \(\mu \). Hence by Theorem 3.1 (3), for any \(y \in X \) and any continuous, compactly supported \(\varphi \) we have, for almost every sample path,

$$\frac{1}{N} \sum _{n \le N} \varphi (w_n^{-1} y) \rightarrow \int \varphi \ d \nu _{\overline{\Gamma y}}.$$

Then apply the above equation with \(y = h x\), \(\varphi (x) = f(g x)\), using that the action of g is measure-preserving and that \(\overline{\Gamma x} = \overline{\Gamma y}\).\(\square \)
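The algebraic identity behind this proof, \((g_1 \dots g_n)^{-1} = g_n^{-1} \dots g_1^{-1}\), can be checked on a sample path. The following is a small sketch with hypothetical steps in \(\textrm{SL}(2, \mathbb {Z})\) chosen by us for illustration: it builds the right random walk \(w_n\) and the left random walk with inverted steps, and confirms that the latter is the inverse of the former.

```python
import random

# Hypothetical step distribution: uniform on two matrices of SL(2, Z).
GENS = [((1, 2), (0, 1)), ((1, 0), (2, 1))]
IDENTITY = ((1, 0), (0, 1))


def mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return (
        (x[0][0] * y[0][0] + x[0][1] * y[1][0], x[0][0] * y[0][1] + x[0][1] * y[1][1]),
        (x[1][0] * y[0][0] + x[1][1] * y[1][0], x[1][0] * y[0][1] + x[1][1] * y[1][1]),
    )


def inv(m):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))


rng = random.Random(0)                               # fixed seed for reproducibility
steps = [rng.choice(GENS) for _ in range(10)]        # g_1, ..., g_10

right_walk = IDENTITY
for g in steps:
    right_walk = mul(right_walk, g)                  # w_n = g_1 g_2 ... g_n

left_walk = IDENTITY
for g in steps:
    left_walk = mul(inv(g), left_walk)               # h_n ... h_1 with h_k = g_k^{-1}

print(left_walk == inv(right_walk))
```

All arithmetic is exact integer arithmetic, so the comparison is an exact equality of matrices.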

4 Convergence for Markov chains

Once and for all, let us fix a countable group \(\Gamma \). Throughout, we consider various thick, geodesic graph structures for \(\Gamma \) whose properties are weakened over the next few sections, culminating in Sect. 6 where arbitrary thick geodesic combings are considered.

We also fix the hypotheses of Theorem 1.3. That is,

  • G is a real Lie group, \(\Lambda < G\) is a lattice, \(X = G/\Lambda \), and \(\Gamma \) is a finitely generated group with generating set S,

  • \(\rho :\Gamma \rightarrow G\) is a representation, inducing an action \(\Gamma \curvearrowright X\), such that the Zariski closure of \(\textrm{Ad} (\rho (\Gamma ))\) is Zariski connected, semisimple, and without compact factors,

  • \(x \in X\) is a point such that the orbit closure \(\overline{\Gamma x} \subset X\) is connected.

We also set \(\overline{f}(x):= \int f \ d \nu _{\overline{\Gamma x}}\), where \(\nu _{\overline{\Gamma x}}\) is the Haar measure as in Theorem 3.1. If \(\Gamma x\) is dense, we also write \(\nu _{\overline{\Gamma x}}\) as \(\nu _X\).

In this section, we let D be a thick, geodesic graph structure for \(\Gamma \) which is primitive, i.e. its transition matrix A has a positive power. Note that we do not assume that the evaluation map is surjective.

Let \((p_i)\) be a positive right eigenvector for A with eigenvalue \(\lambda \), and \((q_i)\) a positive left eigenvector with the same eigenvalue, normalized so that \(\sum _i p_i q_i = 1\). Then we define the stationary measure as \(\pi _i = p_i q_i\), and for any word w of length n from vertex i to vertex j we define

$$\begin{aligned} \mu (w) = \frac{q_i p_j}{\lambda ^{n}}. \end{aligned}$$

Moreover, let \(\mathbb {P}\) be the Markov measure on \(\Omega \) whose stationary measure is \((\pi _i)\) and such that the transition probability from \(v_i\) to \(v_j\) is \(\frac{A_{ij}p_j}{\lambda p_i}\). That is, if W is the set of paths starting with a fixed prefix w, then \(\mathbb {P}(W) = \mu (w)\).
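To see how these definitions fit together, the following sketch works through a small example in exact rational arithmetic. The matrix A, and hence \(\lambda \), p and q, are hypothetical choices of ours (A has a double edge from vertex 0 to vertex 1, and \(A^2\) is positive). The code checks that the transition probabilities sum to 1, that \(\pi \) is stationary, and that the probability of a cylinder set agrees with \(\mu (w) = q_i p_j / \lambda ^n\).

```python
from fractions import Fraction as F

# Hypothetical primitive transition matrix and its Perron data.
A = [[1, 2], [1, 0]]
lam = 2
p = [F(2), F(1)]            # right eigenvector: A p = lam * p
q = [F(1, 3), F(1, 3)]      # left eigenvector:  q A = lam * q, with sum_i p_i q_i = 1
pi = [p[i] * q[i] for i in range(2)]

# Transition probability from v_i to v_j, summed over parallel edges.
P = [[F(A[i][j]) * p[j] / (lam * p[i]) for j in range(2)] for i in range(2)]


def mu(i, j, n):
    """Weight of any single path of length n from vertex i to vertex j."""
    return q[i] * p[j] / F(lam) ** n


def cylinder_probability(vertices):
    """Probability that a random path starts with one fixed edge path through `vertices`."""
    prob = pi[vertices[0]]
    for u, v in zip(vertices, vertices[1:]):
        prob *= p[v] / (lam * p[u])      # probability of one particular edge u -> v
    return prob


if __name__ == "__main__":
    print(P, pi)
    print(cylinder_probability([0, 1, 0, 0]), mu(0, 0, 3))
```

The product of edge probabilities telescopes, which is why the weight of a path depends only on its endpoints and its length.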

Let \(\Omega ^n_{i,j}\) be the set of paths of length n from i to j. For any vertices \(v_i, v_j\) and any \(N \ge 0\), we define the modified Markov averaging operator

$$c_N^{\mu , i, j}(f):= \frac{1}{N} \sum _{n \le N} \sum _{w \in \Omega ^n_{i,j}} \mu (w) f(w^{-1} x).$$

Lemma 4.1

Let us consider a primitive graph structure on \(\Gamma \). Then for any continuous, compactly supported f on X and for any vertices \(v_i, v_j\),

$$c_N^{\mu , i, j}(f) \rightarrow \pi _i \pi _j \overline{f}(x).$$

Proof

Let us fix a vertex \(v_j\) of the graph. Then we can decompose almost every path \(\gamma \in \Omega \) as

$$\gamma = \alpha \cdot g_1 \cdot g_2 \cdot \ldots \cdot g_n \cdot \ldots $$

where \(\alpha \) does not pass through \(v_j\) except at its end, and each \(g_i\) is a loop based at \(v_j\). Let \(\mu _j\) be the measure induced by \(\mu \) on the loop semigroup \(\Gamma _j\) associated to \(v_j\).

Thus, we have, up to a set of \(\mathbb {P}\)-measure zero, the decomposition

$$\Omega = \bigsqcup _{\alpha } \alpha \cdot (\Gamma _j)^\mathbb {N}.$$

For each \(\alpha \), the conditional measure on the space \((\Gamma _j)^\mathbb {N}\) is the product measure \((\mu _j)^\mathbb {N}\), hence \(w_n:= g_1 g_2 \dots g_n\) is a random walk driven by \(\mu _j\). Let \(\mathbb {P}_j\) be the distribution of \((w_n)\).

Recall that the vertex \(v_j\) is fixed. By Lemma 3.2 we have \(\overline{\Gamma x} = \overline{\Gamma _j x}\), since thickness implies that the group generated by \(\Gamma _j\) is finite index in \(\Gamma \). Hence by Lemma 3.3 we have

$$\frac{1}{N} \sum _{n \le N} f( w_n^{-1} \alpha ^{-1} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}}$$

for \(\mathbb {P}_j\)-almost every \(w_n\).

Let now \(R(n, j): \Omega \rightarrow \mathbb {N}\) be the nth return time to \(v_j\) (it depends on the infinite path, but we will omit that dependence in the notation). Since by construction \(\alpha \cdot w_n = \gamma _{R(n, j)}\), we have

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} f(\gamma ^{-1}_{R(n, j)} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}} \end{aligned}$$
(6)

\(\mathbb {P}\)-almost surely.

Let \(T_j(N):= \max \{ k \, : \, R(k, j) \le N \}\). Then, if \([\gamma _n]\) denotes the end vertex of the path \((\gamma _n)\), we obtain

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} f(\gamma _n^{-1} x) \chi _{\{ [\gamma _n] = j \}}&= \frac{1}{N} \sum _{k \le T_j(N)} f(\gamma _{R(k, j)}^{-1} x) \\&= \frac{T_j(N)}{N} \cdot \frac{1}{T_j(N)} \sum _{k \le T_j(N)} f(\gamma _{R(k, j)}^{-1} x) \end{aligned}$$

hence by (6),

$$\begin{aligned} \frac{1}{T_j(N)} \sum _{k \le T_j(N)} f(\gamma _{R(k, j)}^{-1} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}} \qquad \mathbb {P}\text {-a.s.} \end{aligned}$$
(7)

Now, we also have \(\mathbb {P}\)–a.s. (e.g. [18, Lemma 4.6])

$$\lim _{N} \frac{T_j(N)}{N} = \pi _j$$

hence we have

$$\frac{1}{N} \sum _{n \le N} f(\gamma _n^{-1} x) \chi _{\{ [\gamma _n] = j\}} \rightarrow \pi _j \int f \ d\nu _{\overline{\Gamma x}} \qquad \mathbb {P}\text {-a.s.} $$
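The limit \(T_j(N)/N \rightarrow \pi _j\) used above is the usual law of large numbers for visit frequencies of an ergodic Markov chain. As an illustration only, the sketch below simulates a hypothetical two-state chain of our choosing, with transition matrix P and stationary measure \((2/3, 1/3)\), and counts the visits to a given state up to time N.

```python
import random

# Hypothetical two-state chain; its stationary measure is (2/3, 1/3).
P = [[0.5, 0.5], [1.0, 0.0]]


def visit_frequency(j, N, seed=0):
    """Fraction of the times 1..N at which a simulated path sits at state j."""
    rng = random.Random(seed)         # seeded so that the run is reproducible
    state, visits = 0, 0
    for _ in range(N):
        # Move to state 0 with probability P[state][0], otherwise to state 1.
        state = 0 if rng.random() < P[state][0] else 1
        visits += (state == j)
    return visits / N


if __name__ == "__main__":
    for N in (100, 10_000, 1_000_000):
        print(N, visit_frequency(0, N), visit_frequency(1, N))
```

For large N the two printed frequencies should be close to 2/3 and 1/3 respectively; the simulation illustrates the almost sure statement along a single sample path and does not prove it.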

Then, if we integrate over all paths that start in \(v_i\) and apply the dominated convergence theorem (the averages are bounded by \(\sup |f|\)),

$$\int d \mathbb {P}(\gamma ) \ \frac{1}{N} \sum _{n \le N} f(\gamma _n^{-1} x) \chi _{\{ [\gamma _n] = j \}} \chi _{\{ [\gamma _0] = i \}} \rightarrow \pi _j \int f \ d\nu _{\overline{\Gamma x}} \ \mathbb {P}(\chi _{\{ [\gamma _0] = i \}}) $$

and, since \(\mathbb {P}(\chi _{\{ [\gamma _0] = i \}}) = \pi _i\),

$$\frac{1}{N} \sum _{n \le N} \sum _{w \in \Omega ^n_{i,j}} \mu (w) f(w^{-1} x) \rightarrow \pi _i \pi _j \int f \ d\nu _{\overline{\Gamma x}}. $$

\(\square \)

Recall that \(\Omega ^n\) is the set of paths of length n starting at any vertex. Consider the counting operator

$$\kappa _N(f):= \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n} \sum _{|w| = n} f(w^{-1}x)$$

where the sum is over all paths of length n in the graph. Given i, j, we define the modified counting operator

$$\kappa _N^{i, j}(f):= \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n} \sum _{w \in \Omega ^n_{i,j} } f(w^{-1} x)$$

Since the graph structure is primitive, we know that

$$\begin{aligned} c := \lim _{n} \frac{\# \Omega ^n}{\lambda ^n} \end{aligned}$$
(8)

exists, and \(c > 0\).
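For a primitive matrix, the Perron–Frobenius theorem gives \(A^n / \lambda ^n \rightarrow p \, q^{T}\) under the normalization \(\sum _i p_i q_i = 1\), so the constant in (8) is \(c = (\sum _i p_i)(\sum _j q_j)\). The sketch below checks this on a hypothetical matrix chosen by us (with \(\lambda = 2\), \(p = (2, 1)\), \(q = (1/3, 1/3)\)), counting all paths of length n exactly.

```python
from fractions import Fraction

# Hypothetical primitive transition matrix and its Perron data.
A = [[1, 2], [1, 0]]
lam = 2
p = [Fraction(2), Fraction(1)]          # A p = lam * p
q = [Fraction(1, 3), Fraction(1, 3)]    # q A = lam * q, sum_i p_i q_i = 1


def total_paths(A, n):
    """Number of directed paths of length n starting at any vertex."""
    k = len(A)
    counts = [1] * k
    for _ in range(n):
        counts = [sum(A[i][j] * counts[j] for j in range(k)) for i in range(k)]
    return sum(counts)


c_predicted = sum(p) * sum(q)
ratios = [Fraction(total_paths(A, n), lam ** n) for n in (1, 5, 20)]

print(c_predicted, ratios)
```

In this particular example the ratio \(\# \Omega ^n / \lambda ^n\) equals the limit for every n, because the contribution of the second eigenvalue to the total count cancels; in general one only has convergence.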

Proposition 4.2

Suppose that the graph structure is primitive. Then for any two vertices \(v_i\) and \(v_j\) and any continuous, compactly supported \(f:X \rightarrow \mathbb {R}\), we have

$$\lim _{N \rightarrow \infty } \kappa _N^{i, j}(f) = \frac{p_i q_j}{c} \overline{f}(x).$$

As a consequence, we also have

$$\lim _{N \rightarrow \infty } \kappa _N(f) = \overline{f}(x).$$

Proof

Let us set for any \(n \ge 1\)

$$a_n:= \sum _{w \in \Omega ^n_{i,j}} \mu (w) f(w^{-1} x), \qquad b_n:= \frac{\lambda ^{n}}{q_i p_j \# \Omega ^n }$$

so that, since \(\mu (w) = q_i p_j / \lambda ^{n}\) for every \(w \in \Omega ^n_{i,j}\),

$$a_n b_n = \frac{1}{\# \Omega ^n} \sum _{w \in \Omega ^n_{i,j}} f(w^{-1} x).$$

Now, we have by Lemma 4.1

$$\frac{1}{N} \sum _{n \le N} a_n \rightarrow \pi _i \pi _j \overline{f}(x)$$

and by (8)

$$b_N \rightarrow \frac{1}{c q_i p_j }.$$

Hence, as in [10, Proposition 8], and using \(\pi _i = p_i q_i\),

$$\kappa _N^{i, j}(f) = \frac{1}{N} \sum _{n \le N} a_n b_n \rightarrow \frac{ \pi _i \pi _j }{c q_i p_j} \overline{f}(x) = \frac{p_i q_j}{c} \overline{f}(x).$$
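The passage to the limit in the last display rests on an elementary fact, which we spell out as a sketch in generic notation: if \((a_n)\) is bounded, \(\frac{1}{N} \sum _{n \le N} a_n \rightarrow a\) and \(b_n \rightarrow b\), then \(\frac{1}{N} \sum _{n \le N} a_n b_n \rightarrow ab\). Indeed,

$$\begin{aligned} \left| \frac{1}{N} \sum _{n \le N} a_n b_n - b \cdot \frac{1}{N} \sum _{n \le N} a_n \right| \le \sup _n |a_n| \cdot \frac{1}{N} \sum _{n \le N} |b_n - b| \rightarrow 0, \end{aligned}$$

since Cesàro averages of a sequence tending to zero also tend to zero. In the present setting \(|a_n| \le \sup |f|\), because the weights \(\mu (w)\) for \(w \in \Omega ^n_{i,j}\) are probabilities of disjoint cylinder sets and hence sum to at most 1.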

Then, by summing over all i, j,

$$\lim _{N \rightarrow \infty } \kappa _N(f) = \alpha \overline{f}(x)$$

where \(\alpha = \frac{\sum _i q_i \sum _j p_j}{c}\) is a constant which does not depend on f or x. To see that \(\alpha = 1\), we note that \(\int \kappa _N(f) \ d\nu _{\overline{\Gamma x}} = \int f \ d\nu _{\overline{\Gamma x}}\) for each \(N\ge 1\). This completes the proof.\(\square \)

5 From primitive to semisimple graph structures

In this section, we now allow the thick, geodesic graph structure D for \(\Gamma \) to be semisimple and generalize the results from the previous section.

Recall that \(\Omega ^n_0:= \Omega ^n_{v_0}\) denotes the set of paths of length n from the initial vertex \(v_0\). We now consider the operator

$$\begin{aligned} c_N(f) := \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n_0 } \sum _{w \in \Omega ^n_0 } f(w^{-1} x), \end{aligned}$$
(9)

and recall that we have set \(\overline{f}(x):= \int f \ d \nu _{\overline{\Gamma x}}\).

Proposition 5.1

Suppose that the graph structure is semisimple. Then for any continuous f with compact support we have

$$\lim _{N \rightarrow \infty } c_N(f) = \overline{f}(x).$$

Proof

Following Pollicott–Sharp [26], for any maximal component \(\mathcal {V}\), we divide the set of all paths intersecting \(\mathcal {V}\) into two subsets: for some fixed \(M\ge 1\), the ones spending at most time M outside the maximal component, and the ones spending more than time M outside the maximal component.

Denote as \(\Omega ^n_{i,j}\) the set of paths of length n between \(v_i\) and \(v_j\) (note that, if \(v_i\) and \(v_j\) both lie in the maximal component, so does the path); \(\Omega _{i,out}^n\) is the set of paths of length n from vertex \(v_i\) lying outside the maximal component, and \(\Omega _{in, i}^{n}\) is the set of paths of length n from the initial vertex to vertex \(v_i\) lying outside the maximal component. Let \(\Omega ^n_{0,\mathcal V+}\) be the set of paths of length n from the initial vertex \(v_0\) that intersect the component \(\mathcal {V}\). By abuse of notation, we shall write \(i \in \mathcal {V}\) to mean that the vertex \(v_i\) belongs to the component \(\mathcal {V}\).

Then if we set

$$s_{n, \mathcal {V}}(f):= \sum _{w \in \Omega ^n_{0,\mathcal V+}} f(w^{-1} x)$$

and

$$c_{N, \mathcal {V}}(f):= \frac{1}{N} \sum _{n \le N} \frac{s_{n, \mathcal {V}}(f)}{\# \Omega ^n_0 }$$

we have

$$s_{n, \mathcal {V}}(f) = \sum _{i, j \in \mathcal {V}} \sum _{a + b \le n} \sum _{g \in \Omega ^a_{in,i}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} \sum _{h \in \Omega ^b_{j,out}} f((gwh)^{-1}x).$$

Then, let us fix \(M \ge 0\). Define

$$s_{n, < M, \mathcal {V}}(f) = \sum _{i, j \in \mathcal {V}} \sum _{a + b \le \min \{ n, M\} } \sum _{g \in \Omega ^a_{in,i}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} \sum _{h \in \Omega ^b_{j,out}} f((gwh)^{-1}x)$$

and

$$s_{n, > M, \mathcal {V}}(f) = \sum _{i, j \in \mathcal {V}} \sum _{M < a + b \le n} \sum _{g \in \Omega ^a_{in,i}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} \sum _{h \in \Omega ^b_{j,out}} f((gwh)^{-1}x).$$

Let us look at the first term. Note that the restriction of the graph structure to the component \(\mathcal {V}\) is primitive; let \((p_i), (q_i)\) be right and left eigenvectors of the transition matrix of the subgraph corresponding to \(\mathcal {V}\), as in Sect. 4. Hence, for each pair of vertices \(v_i, v_j\) in \(\mathcal V\) and paths \(g, h\) we have by Proposition 4.2 that

$$\frac{1}{N} \sum _{m \le N} \frac{1}{\#\Omega ^m_{\mathcal V}} \sum _{w \in \Omega ^m_{i, j}} f((gwh)^{-1}x) \rightarrow \frac{q_i p_j}{c} \overline{f}(x)$$

where \(\Omega ^n_{\mathcal V}\) is the set of paths of length n that lie entirely inside the maximal component. Hence, by summing over all \(g, h\),

$$ \sum _{g \in \Omega ^a_{in,i}} \sum _{h \in \Omega ^b_{j,out}} \frac{1}{N} \sum _{m \le N} \frac{1}{\#\Omega ^m_{\mathcal V}} \sum _{w \in \Omega ^m_{i, j}} f((gwh)^{-1}x) \rightarrow \#\Omega ^a_{in,i} \#\Omega ^b_{j,out} \frac{q_i p_j}{c} \overline{f}(x).$$

Expanding, we have

$$\begin{aligned}&\frac{1}{N} \sum _{n \le N} \frac{s_{n, < M, \mathcal {V}}(f)}{\# \Omega ^n_0 } \\&= \frac{1}{N} \sum _{n \le N} \frac{1}{\# \Omega ^n_0 } \sum _{i, j \in \mathcal {V}} \sum _{a + b \le \min \{ n, M\} } \sum _{g \in \Omega ^a_{in,i}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} \sum _{h \in \Omega ^b_{j,out}} f((gwh)^{-1}x) \\&= \frac{1}{N} \sum _{n \le N} \sum _{i, j \in \mathcal {V}} \sum _{a + b \le \min \{ n, M\} } \frac{\# \Omega ^{n-a-b}_{\mathcal V}}{\# \Omega ^n_0 } \sum _{g \in \Omega ^a_{in,i}} \sum _{h \in \Omega ^b_{j,out}} \frac{1}{\#\Omega ^{n-a-b}_{\mathcal V}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} f((gwh)^{-1}x) \\&= \sum _{i, j \in \mathcal {V}} \sum _{a + b \le M } \sum _{g \in \Omega ^a_{in,i}} \sum _{h \in \Omega ^b_{j,out}} \frac{1}{N} \sum _{M \le n \le N} \frac{\# \Omega ^{n-a-b}_{\mathcal V}}{\# \Omega ^n_0 } \frac{1}{\#\Omega ^{n-a-b}_{\mathcal V}} \sum _{w \in \Omega ^{n-a-b}_{i, j}} f((gwh)^{-1}x) + O\left( \frac{M \Vert f \Vert _\infty }{N} \right) . \end{aligned}$$

Here the error term accounts for the indices \(n < M\), each of which contributes at most \(\Vert f \Vert _\infty / N\) in absolute value.

Now, note that since the graph structure is semisimple there exists a constant \(A(\mathcal {V})\) such that for any \(a, b \ge 0\)

$$\lim _{n \rightarrow \infty } \frac{\# \Omega ^{n-a-b}_{\mathcal V}}{\# \Omega ^n_0 } = A(\mathcal {V}) \lambda ^{-a-b}$$

hence by taking the limit as \(N \rightarrow \infty \)

$$c_{N,< M, \mathcal {V}}(f) := \frac{1}{N} \sum _{n \le N} \frac{s_{n, < M, \mathcal {V}}(f)}{\# \Omega ^n_0 } \rightarrow \sum _{a + b \le M } \sum _{i, j \in \mathcal {V}} \#\Omega ^a_{in,i} \#\Omega ^b_{j,out} A(\mathcal {V}) \lambda ^{-a-b} \frac{q_i p_j}{c} \overline{f}(x).$$

This implies that for fixed M there is a constant \(L_{M, \mathcal V}\) such that for any f we have

$$\lim _{N \rightarrow \infty } c_{N, < M, \mathcal {V}}(f) = L_{M,\mathcal V} \cdot \overline{f}(x)$$

and moreover

$$L_\mathcal {V}:= \lim _{M \rightarrow \infty } L_{M, \mathcal {V}}$$

exists. To prove the last claim, let

$$P_{1, n}:= \{ \text {paths } g_1 \text { of length } n \text { from the initial vertex to } \mathcal {V}\}$$

and

$$P_{2, n}:= \{ \text {paths } g_2 \text { of length } n \text { from } \mathcal {V} \text { to its complement} \}.$$

Now, note that there exist \(d > 0\) and \(\mu < \lambda \) with

$$\# P_{1, n} \le d \mu ^n \qquad \#P_{2, n} \le d\mu ^n$$

hence

$$\begin{aligned} L_{M, \mathcal {V}}&= \sum _{a + b \le M } \sum _{i, j \in \mathcal {V}} \#\Omega ^a_{in,i} \cdot \#\Omega ^b_{j,out} \cdot A(\mathcal {V}) \lambda ^{-a-b} \frac{q_i p_j}{c}\\&\le \sum _{i, j \in \mathcal {V}} \frac{q_i p_j}{c} A(\mathcal {V}) \sum _{a, b \ge 0} \left( \frac{\mu }{\lambda } \right) ^{a + b} d^2 < \infty \end{aligned}$$

so that \(L_{M, \mathcal {V}}\), which is nondecreasing in \(M\) since all terms are nonnegative, converges as \(M \rightarrow \infty \).
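The convergence of \(L_{M,\mathcal V}\) rests on the elementary fact that the partial sums of \((\mu /\lambda )^{a+b}\) over \(a + b \le M\) increase to a finite limit when \(\mu < \lambda \). The following minimal Python sketch checks this numerically; the function names and the sample ratio \(\rho = \mu /\lambda = 0.5\) are illustrative assumptions, not quantities taken from the paper.

```python
def partial_sum(rho, M):
    """Sum of rho**(a+b) over integers a, b >= 0 with a + b <= M."""
    return sum(rho ** (a + b) for a in range(M + 1) for b in range(M + 1 - a))


def full_sum(rho):
    """Closed form of the double geometric series, valid for 0 <= rho < 1."""
    return 1.0 / (1.0 - rho) ** 2


if __name__ == "__main__":
    rho = 0.5  # hypothetical value of mu / lambda (an assumption)
    for M in (0, 5, 10, 20, 40):
        # Partial sums increase with M and stay below the closed form.
        print(M, partial_sum(rho, M), full_sum(rho))
```

In this sketch the partial sums play the role of the upper bound for \(L_{M,\mathcal V}\), up to the constant factors \(d^2\), \(A(\mathcal V)\) and \(q_i p_j / c\).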

To estimate the second term, note that

$$c_{N, >M, \mathcal {V}}(f) := \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n_0} \sum _{M < a + b \le n} \sum _{i,j} \sum _{g \in \Omega ^a_{in, i}} \sum _{h \in \Omega ^b_{j, out}} \sum _{w \in \Omega ^{n-a-b}_{i,j}} f((gwh)^{-1}x)$$

satisfies

$$\begin{aligned} |c_{N, >M, \mathcal {V}}(f)|&\le \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n_0} \sum _{M < a + b \le n} \sum _{i, j} \#\Omega ^a_{in, i} \cdot \#\Omega ^b_{ j, out} \cdot \#\Omega ^{n-a-b}_{i,j} \Vert f \Vert _{\infty } \\&\le \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n_0} \sum _{M < a + b \le n} \sum _{i, j} d \mu ^a \cdot d \mu ^b \cdot d \lambda ^{n-a-b} \Vert f \Vert _{\infty } \\&\le \frac{1}{N} \sum _{n \le N} \frac{1}{c_1 \lambda ^n} \sum _{M < a + b \le n} (\# \mathcal {V})^2 d \mu ^a \cdot d \mu ^b \cdot d \lambda ^{n-a-b} \Vert f \Vert _{\infty } \\&\le \frac{1}{N} \sum _{n \le N} \sum _{M < a + b \le n} \frac{(\# \mathcal {V})^2 d^3}{c_1} \left( \frac{\mu }{\lambda } \right) ^{a+b} \Vert f \Vert _{\infty } \\&\le \frac{(\# \mathcal {V})^2 d^3}{c_1} \Vert f \Vert _{\infty } \sum _{\ell > M} (\ell + 1) \left( \frac{\mu }{\lambda } \right) ^{\ell }. \end{aligned}$$

Hence for any \(N \ge M \ge 0\) we have

$$|c_{N, >M, \mathcal {V}}(f)| \le r_{M, \mathcal {V}} \cdot \Vert f \Vert _\infty $$

with

$$\lim _{M \rightarrow \infty } r_{M, \mathcal {V}} = 0.$$

As a consequence,

$$\lim _{N \rightarrow \infty } c_{N, \mathcal {V}}(f) = L_\mathcal {V} \overline{f}(x).$$

Indeed, for any \(\epsilon > 0\) there exists \(M >0 \) such that \(r_{M, \mathcal {V}} < \epsilon \) and \(|L_{M, \mathcal {V}} - L_\mathcal {V}| < \epsilon \). Then

$$c_{N, \mathcal {V}}(f) = c_{N, < M, \mathcal {V}}(f) + c_{N, > M, \mathcal {V}}(f)$$

so

$$\begin{aligned} \limsup _N |c_{N, \mathcal {V}}(f) - L_{\mathcal {V}} \overline{f}(x)|&\le \limsup _N |c_{N, < M, \mathcal {V}}(f) - L_{M, \mathcal {V}} \overline{f}(x)| + |L_{M, \mathcal {V}} \overline{f}(x) - L_{\mathcal V} \overline{f}(x)| \\&\ \quad \quad \quad \quad \quad + \limsup _N |c_{N, > M, \mathcal {V}}(f)| \\&\le |L_{M, \mathcal {V}} - L_{\mathcal V}| \Vert f \Vert _\infty + r_{M, \mathcal {V}} \Vert f \Vert _\infty \le 2 \epsilon \Vert f \Vert _\infty \end{aligned}$$

which proves the claim.

Now, we have

$$c_N(f) = \sum _{\mathcal {V}} c_{N, \mathcal {V}}(f) + c_{N, nmax }(f)$$

where \(\mathcal {V}\) runs over all maximal components, and \(c_{N, nmax }\) takes into account all paths of length at most N that do not enter any maximal component. Since

$$\lim _{N \rightarrow \infty } c_{N, nmax }(f) = 0$$

we obtain for any compactly supported f,

$$\lim _{N \rightarrow \infty } c_{N}(f) = L_\infty \overline{f}(x)$$

with \(L_\infty := \sum _{\mathcal {V}} L_{\mathcal {V}}\). Noting again, as in Proposition 4.2, that \(\int c_N(f) \ d\nu _{\overline{\Gamma x}} = \int f \ d\nu _{\overline{\Gamma x}}\), we see that \(L_\infty = 1\). \(\square \)
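As a numerical illustration of the growth ratio \(\# \Omega ^{n-a-b}_{\mathcal V} / \# \Omega ^n_0 \rightarrow A(\mathcal {V}) \lambda ^{-a-b}\) used in the proof above, the following minimal Python sketch counts paths in a small directed graph. The graph, the matrices `A` and `B`, and all function names are hypothetical choices made for this example; it is a sanity check under these assumptions, not part of the argument.

```python
def count_paths(adj, n, start=None):
    """Number of paths with n edges in the directed graph with adjacency
    matrix adj, starting at vertex `start` (or at any vertex if None)."""
    size = len(adj)
    counts = [1] * size  # counts[i] = number of paths of length 0 from i
    for _ in range(n):
        counts = [sum(adj[i][j] * counts[j] for j in range(size))
                  for i in range(size)]
    return sum(counts) if start is None else counts[start]


# Hypothetical toy graph: v0 carries a loop and one edge into the component
# V = {v1, v2}; the adjacency matrix B of V is primitive and its top
# eigenvalue is the golden ratio.
A = [[1, 1, 0],
     [0, 1, 1],
     [0, 1, 0]]
B = [[1, 1],
     [1, 0]]
LAM = (1 + 5 ** 0.5) / 2


def ratio(n, k):
    """#(paths of length n - k inside V) / #(paths of length n from v0)."""
    return count_paths(B, n - k) / count_paths(A, n, start=0)


if __name__ == "__main__":
    for k in range(4):
        # For large n the rescaled ratio should be nearly independent of k.
        print(k, ratio(60, k) * LAM ** k)
```

Here `k` plays the role of \(a + b\): multiplying the ratio by \(\lambda ^{k}\) should give nearly the same value for every `k` once `n` is large.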

Next, for the fixed semisimple, thick graph structure D, we can restrict to the subgraph \(D^i\) obtained by considering only paths starting at the large growth vertex \(v_i\). This gives a new semisimple, thick graph structure and we have the corresponding counting function \(c_N^i\) as in Eq. (9). Applying the previous proposition then gives

Corollary 5.2

Suppose the graph structure is semisimple. For each large growth vertex \(v_i\) and any continuous function f with compact support, we have

$$\lim _{N \rightarrow \infty } c^i_N(f) = \overline{f}(x),$$

where \(c^i_N\) is the average restricted to the paths that start at \(v_i\).

6 From semisimple graph structures to the general case

We now come to the general case of our main theorem: suppose that \((D,v_0,\textrm{ev})\) is a thick geodesic combing for the pair \((\Gamma ,S)\). As previously discussed, the associated transition matrix A is almost semisimple. Hence, there is a \(p\ge 1\) so that \(A^p\) is semisimple. This leads us to consider the semisimple p-step graph structure \(D_p\), defined in Sect. 2, whose paths of length n starting at \(v_0\) are in natural bijective correspondence with the paths in \(\Omega ^{pn}_0\).

By Corollary 5.2 (applied to the semisimple p-step graph structure \(D_p\)), we have that for any \(h\in \Omega ^r_{0,i}\) with \(0\le r \le p-1\):

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^{pn}_i} \sum _{w\in \Omega ^{pn}_{i}} f((hw)^{-1}x) \rightarrow \overline{f}(x). \end{aligned}$$
(10)

Now we recall a trick from [18].

Let us fix \(0 \le r \le p-1\). Then we can approximate the counting measure on \(\Omega ^{pn+r}_0\), starting at the initial vertex \(v_0\), by first picking randomly a path \(g_0\) of length r from \(v_0\) with a certain probability \(\mu \), and then picking a random path starting at \(v_i = t(g_0)\) with respect to the counting measure on the set of paths of length pn starting at \(v_i\).

To compute \(\mu \), let us consider a path \(g_0\) of length r starting at \(v_0\) and ending at \(v_i\). Then, if \(v_i\) is of large growth for D (and hence also large growth for \(D_p\) by [18, Lemma 7.1]), we define

$$\mu (g_0):= \frac{e_i A_{\infty } 1}{e_0 A^r A_\infty 1},$$

and \(\mu (g_0) = 0\) if the end vertex of \(g_0\) has small growth. Here, \(A_\infty = \lim _{n\rightarrow \infty }A^{pn}/\lambda ^{pn}\), which exists since \(A^p\) is semisimple.
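To make the definition of \(\mu \) concrete, here is a minimal Python sketch that approximates \(A_\infty \) by a large power and checks that the weights \(\mu (g_0)\) sum to 1 over paths of length r from \(v_0\). The three-vertex graph of period 2, the truncation parameter `n=20`, and all function names are assumptions made for illustration only.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(X, e):
    """X raised to the nonnegative integer power e."""
    n = len(X)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        result = mat_mul(result, X)
    return result


def limit_matrix(A, p, lam, n=20):
    """Approximate A_inf = lim A^(p n) / lam^(p n) using one fixed large n."""
    power = mat_pow(A, p * n)
    scale = lam ** (p * n)
    return [[entry / scale for entry in row] for row in power]


def path_weights(A, p, lam, r, start=0):
    """Return (mu, multiplicity): mu[i] is the weight of a single path of
    length r from `start` ending at vertex i, and multiplicity[i] is the
    number of such paths."""
    A_inf = limit_matrix(A, p, lam)
    growth = [sum(row) for row in A_inf]         # e_i A_inf 1
    multiplicity = mat_pow(A, r)[start]          # (A^r)_{start, i}
    denom = sum(m * g for m, g in zip(multiplicity, growth))  # e_0 A^r A_inf 1
    return [g / denom for g in growth], multiplicity


# Hypothetical period-2 graph: v0 <-> v1 and v0 <-> v2, so lam = sqrt(2), p = 2.
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]

if __name__ == "__main__":
    mu, mult = path_weights(A, p=2, lam=2 ** 0.5, r=1)
    print(mu, mult, sum(m * w for m, w in zip(mult, mu)))
```

In this toy graph every vertex has large growth. For a vertex of small growth the quantity \(e_i A_\infty 1\) vanishes in the limit, so the approximation above would assign it a weight close to 0, in line with the convention \(\mu (g_0) = 0\).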

Let \(\lambda '_{pn+r}\) be the measure on \(\Omega ^{pn+r}_0\) given by first taking randomly a path \(g_0\) of length r from \(v_0\) with distribution \(\mu \) and then taking uniformly a path of length pn starting from \(t(g_0)\).

We previously proved ([18, Proof of Theorem 7.3])

$$\Vert \lambda '_{pn+r} - \lambda _{pn+r} \Vert _{TV} \rightarrow 0$$

as \(n \rightarrow \infty \), where \(\lambda _n\) is the counting probability measure on \(\Omega ^{n}_0\) and \(\Vert \cdot \Vert _{TV}\) denotes total variation. This implies that it suffices to show that for each r,

$$ \frac{1}{N} \sum _{n \le N} \int f(g^{-1}x) d\lambda '_{pn+r}(g) \rightarrow \overline{f}(x) $$

as \(N \rightarrow \infty \). But

$$\begin{aligned} \frac{1}{N} \sum _{n \le N} \int f(g^{-1}x) d\lambda '_{pn+r}(g)&= \frac{1}{N} \sum _{n \le N} \sum _i \sum _{h\in \Omega ^{r}_{0,i}} \mu (h) \frac{1}{\#\Omega ^{pn}_i} \sum _{w\in \Omega ^{pn}_{i}} f((hw)^{-1}x) \\&= \sum _i \sum _{h\in \Omega ^{r}_{0,i}} \mu (h) \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^{pn}_i} \sum _{w\in \Omega ^{pn}_{i}} f((hw)^{-1}x), \end{aligned}$$

which by Eq. (10) converges to

$$ \sum _i \sum _{h\in \Omega ^{r}_{0,i}} \mu (h) \overline{f}(x) = \sum _{g_0\in \Omega ^r_0}\mu (g_0) \overline{f}(x) = \overline{f}(x). $$

Applying this to each \(0\le r \le p-1\), we conclude that for any continuous f with compact support:

$$ \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n \le N} \frac{1}{\#\Omega ^n_0 } \sum _{w \in \Omega ^n_0 } f(w^{-1} x) = \overline{f}(x). $$

Since we have assumed that D is a geodesic combing for \((\Gamma ,S)\), so that \(\Omega ^n_0\) parameterizes the sphere \(S_n\) of radius n in the Cayley graph of \((\Gamma ,S)\), this completes the proof of Theorem 1.3.

The proof of Theorem 1.2 now follows using the fact, explained in Sect. 2, that if \(\Gamma \) is a hyperbolic group and S is any finite generating set of \(\Gamma \), there is a thick, geodesic combing of \((\Gamma ,S)\).

The proof of Theorem 1.1 then proceeds by replacing Theorem 3.1 with [5, Corollary 1.8], which applies when G is semisimple and \(\rho (\Gamma )\) is Ad-Zariski dense.

Proof of Theorem 1.4

The proof follows in a very similar way as the proof of Theorem 1.2, by replacing the use of Theorem 3.1 with the main theorem of [24], which establishes the limit for random walks \((w_n)\), without taking Cesàro averages; namely,

$$\lim _{n \rightarrow \infty } f(w_n^{-1} x) = \overline{f}(x),$$

for any irrational \(x \in \mathbb {T}^n\). The only modifications are in Lemma 4.1 and Proposition 4.2. In Lemma 4.1, we obtain

$$ f(\gamma ^{-1}_{R(n, j)} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}}$$

almost surely, for any \(v_j\); hence, since at every step the random walk lies at some vertex \(v_j\), this implies

$$ f(\gamma ^{-1}_{n} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}}$$

almost surely. Integrating on both sides yields the analog of Lemma 4.1. In Proposition 4.2, we already know the convergence of \((a_n)\) and \((b_n)\), which immediately implies the convergence of \((a_n b_n)\). The rest of the proof follows verbatim, removing the average over \(n \le N\) from all equations. \(\square \)

7 Equidistribution along random geodesics

For \(\Gamma \) hyperbolic with finite generating set S, we denote by \(\textrm{PS}\) any measure in the Patterson–Sullivan class on the hyperbolic boundary \(\partial \Gamma \) associated to the word metric \(d_S\). In this context, one can define this class as the class of any limit point of the sequence of spherical averages \(\nu _n:= \frac{1}{\#S_n} \sum _{|g| = n} \delta _g\) in the space of measures on \(\Gamma \cup \partial \Gamma \). See [14, 16, 19] for definitions and details. For a geodesic ray \(\gamma = [1, \eta )\) we write \(\gamma (n)\) for the element of \(\Gamma \) on \(\gamma \) such that \(d_S(1,\gamma (n)) = n\).

Theorem 7.1

With notation and hypotheses as in Theorem 1.2, if \(\Gamma x\) is infinite and has connected closure, then for \(\textrm{PS}\)–almost every \(\eta \in \partial \Gamma \), there is a geodesic \(\gamma = [1,\eta )\) such that

$$\begin{aligned} \frac{1}{N}\sum _{n=1}^N \delta _{\gamma (n)^{-1}x} \longrightarrow \nu _{\overline{\Gamma x}}. \end{aligned}$$
(11)

Proof

Let \(\Omega _{0}\) be the set of infinite paths starting at \(v_0\) and let \(\mathbb {P}_0\) be the Markov measure starting at \(v_0\), which is supported on \(\Omega _{0}\). There is a map \(\Omega _{0} \rightarrow \partial \Gamma \) sending each infinite path to the endpoint of the associated geodesic ray in \(\textrm{Cay}(\Gamma , S)\), and this map pushes \(\mathbb {P}_0\) forward to a Patterson–Sullivan measure \(\textrm{PS}\) on \(\partial \Gamma \) ([14, 19, Section 5]). Here we recall that any two Patterson–Sullivan measures are mutually absolutely continuous with bounded Radon–Nikodym derivatives.

Hence, it suffices to prove equation (11) for \(\mathbb {P}_0\)–almost every path \(\gamma \in \Omega _{0}\). This was essentially accomplished in Lemma 4.1 and we now make the idea explicit. Let \(v_i\) be a vertex in a maximal component, and let \(R(n, i): \Omega \rightarrow \mathbb {N}\) be the nth return time to \(v_i\), and \(T_i(N):= \max \{ k \, : \, R(k, i) \le N \}\). Now by equation (7), noting that its proof does not use that the structure is primitive, we have

$$\frac{1}{T_i(N)} \sum _{k \le T_i(N)} f(\gamma _{R(k, i)}^{-1} x) \rightarrow \int f \ d\nu _{\overline{\Gamma x}} \qquad \mathbb {P}_0\text {-a.s.} $$

Now since the above is true for any vertex \(v_i\) in a maximal component, we have, using also \(\sum _i \lim _{N \rightarrow \infty } \frac{T_i(N)}{N} = 1\), that

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{1}{N} \sum _{n \le N} f(\gamma _n^{-1} x)&= \lim _{N \rightarrow \infty } \sum _i \frac{T_i(N)}{N} \cdot \frac{1}{T_i(N)} \sum _{k \le T_i(N)} f(\gamma _{R(k, i)}^{-1} x) \\&= \int f \ d\nu _{\overline{\Gamma x}} \end{aligned}$$

for \(\mathbb {P}_0\)–almost every \(\gamma \in \Omega _{0}\), as desired. \(\square \)
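The last step of the proof regroups the time average along a path according to which vertex the path visits at each time. The following minimal Python sketch verifies this bookkeeping identity on a finite sample; the vertex sequence, the values of f, and the function names are hypothetical, and for simplicity the sketch sums over every visited vertex rather than only those in a maximal component.

```python
from collections import defaultdict


def time_average(values):
    """Plain Cesaro average (1/N) * sum of the values."""
    return sum(values) / len(values)


def average_by_return_times(vertices, values):
    """Recombine the average as sum_i (T_i / N) * (average over visits to i),
    where T_i is the number of visits to vertex i among the first N steps."""
    visits = defaultdict(list)
    for vertex, value in zip(vertices, values):
        visits[vertex].append(value)
    total_steps = len(values)
    return sum((len(vals) / total_steps) * (sum(vals) / len(vals))
               for vals in visits.values())


if __name__ == "__main__":
    # Hypothetical data: the vertex visited at each step and the value of f there.
    vertices = [1, 2, 1, 1, 2, 3, 1, 2]
    values = [0.5, 1.0, -0.25, 2.0, 0.0, 1.5, 0.75, -1.0]
    print(time_average(values), average_by_return_times(vertices, values))
```

The two printed numbers agree up to floating-point rounding, which is the finite-N form of the identity used in the displayed equation above.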