1 Introduction

Let (M,d) be a metric space and \({\text {Isom}}(M)\) the group of isometries of M. Consider a finitely supported probability measure \(\mu \) on \({\text {Isom}}(M)\), let \((X_i)_{i \in {\mathbb {N}}}\) be a sequence of independent random variables with distribution \(\mu \) and denote by \(R_n\) the random variable given by the product \(X_1 \ldots X_n\). Fix a basepoint \(o \in M\) and consider the random walk \(R_no\) on M. A straightforward application of Kingman’s subadditive ergodic theorem shows that there exists a constant \(\ell (\mu ) \geqslant 0\), called the drift of the random walk, such that

$$\begin{aligned} \frac{1}{n} d(R_n o, o) \underset{n \rightarrow \infty }{\overset{a.s.}{\longrightarrow }} \ell (\mu ). \end{aligned}$$
(1.1)

This can be seen as a generalization of the classical law of large numbers which corresponds to the case \(M={\mathbb {R}}\) and \(\mu \) supported on the translations \({\mathbb {R}}<{\text {Isom}}({\mathbb {R}})\).
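For a concrete feel for the convergence (1.1), the following minimal sketch (ours, not from the paper) simulates the simple random walk on the free group \(F_2=\langle a,b \rangle \), i.e. \(\mu \) uniform on the four standard generators, acting on its Cayley graph, a 4-regular tree and hence a proper geodesic 0-hyperbolic space; for this walk the drift is classically \(\ell (\mu )=1/2\). All names in the code are illustrative.

```python
import random

random.seed(0)

# Simple random walk on the free group F_2 = <a, b>: mu is uniform on the four
# standard generators.  On the Cayley graph (a 4-regular tree), kappa(R_n) =
# d(R_n o, o) is the length of the reduced word representing R_n.
GENS = ["a", "A", "b", "B"]                      # A = a^{-1}, B = b^{-1}
INV = {"a": "A", "A": "a", "b": "B", "B": "b"}

def append_letter(word, g):
    """Multiply the reduced word on the right by a generator, with cancellation."""
    if word and word[-1] == INV[g]:
        return word[:-1]
    return word + [g]

def drift_estimate(n):
    word = []                                    # R_0 = identity
    for _ in range(n):
        word = append_letter(word, random.choice(GENS))   # R_k = R_{k-1} X_k
    return len(word) / n                         # (1/n) d(R_n o, o)

for n in (100, 1_000, 10_000, 100_000):
    print(n, drift_estimate(n))                  # values approach l(mu) = 1/2
```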

Understanding various aspects of the convergence (1.1) (e.g. the central limit theorem (CLT), large deviation principles (LDP), Azuma–Hoeffding-type concentration inequalities) in the aforementioned special case constitutes a fundamental part of classical probability theory. Various other cases have attracted considerable attention more recently: starting in the 1960s with the work of Furstenberg, Kesten, Oseledets, Kaimanovich [22, 24, 40, 54] for symmetric spaces of non-compact type, and with Dynkin–Malyutov [20], Furstenberg [23], Kaimanovich–Vershik [41] and others for random walks on countable groups. More recently, for general metric spaces under an assumption of coarse negative curvature (namely Gromov hyperbolicity), a number of analogues of the classical results were proven, including CLTs [4, 51], local limit theorems [30] and, closer to our considerations, LDPs and exponential decay results [7, 31]. Our goal in this paper is to establish Hoeffding-type concentration inequalities in the general setting of random walks on hyperbolic spaces. To the best of our knowledge, this aspect of the classical theory is far less developed in our setting.

Concentration inequalities around the mean \(\ell (\mu )\) have two distinctive features compared to asymptotic large deviations estimates: on the one hand, they are large deviation bounds for the fluctuations of the distance of the random walk that are valid uniformly over all times, as opposed to asymptotic estimates. On the other hand, the exponential decay rate is expressed as an explicit function of the normalized deviation distance t. As such, these inequalities have been useful in the classical case from both pure-mathematical and applied or computational perspectives. Accordingly, one of the main reasons that we mostly focus our attention in this article on proper Gromov hyperbolic spaces is that, by following a geometric and harmonic analytic technique of Benoist–Quint [4], we are able to exploit their geometry and consequently obtain explicit concentration estimates. We also obtain subgaussian concentration estimates for non-proper Gromov hyperbolic spaces and random matrix products, but with less explicit bounds. These results are also new and are discussed later in the introduction.

Our approach consists of proving a general concentration-type result for cocycles satisfying a certain cohomological equation. This is in line with Gordin’s method for proving the central limit theorem, where the values of cocycles along random walks coming from group actions are related to martingales via a Poisson-type equation.

In particular, the solutions by Benoist–Quint of the associated cohomological equations for the Busemann and norm cocycles, respectively on the boundary of hyperbolic spaces [4] and on projective spaces [3], play a crucial role in the application of our general cocycle-concentration results to these settings. We slightly extend this solution to adapt it to our purposes, and in the case of proper hyperbolic spaces, we get explicit bounds on its size. These bounds involve the norm \(\Vert \lambda _G(\mu )\Vert _2\) of the image of the probability measure \(\mu \) under the regular representation \(\lambda _G\) of the isometry group \(G={\text {Isom}}(M)\). In a later part, we use various versions of uniform Tits alternatives to control the size of \(\Vert \lambda _G(\mu )\Vert _2\), which in turn yields effective constants, for example in the case of linear groups of rank one, thanks to the works of Breuillard [9, 10].

Finally, we give explicit finite-time estimates for the probability that two independent non-elementary random walks on a proper hyperbolic space generate a free subgroup. We deduce this result from our concentration bounds together with a more general statement linking uniform large deviations with free subgroups generated by samplings of random walks. Our result (Theorem 1.10) quantifies some cases of several known probabilistic Tits alternatives proven in [1, 29, 57].

Let us now state our first main result, some of its consequences and related remarks.

1.1 Subgaussian concentration estimates for random walks on hyperbolic spaces

We first introduce some notation and definitions.

Let (M,d) be a proper metric space and denote by G its group of isometries. It is a locally compact group and we denote by \(\mu _G\) a Haar measure on G. For every \(r \in [0,1]\), we denote \(\mu _{r,{\text {lazy}}}=r \delta _{{\text {id}}} + (1-r)\mu \). Furthermore, we denote by \(\lambda _G(\mu )\) the operator given by the image of the probability measure \(\mu \) under the left-regular representation of G on \(L^2(G)\). Finally, having fixed a basepoint \(o \in M\), for an element \(g \in G\), we set \(\kappa (g):=d(go,o)\) and for a set \(S \subset G\), \(\kappa _S:=\sup \{\kappa (g) : g\in S\}\). The set S is said to be bounded if \(\kappa _S<\infty \).

Given \(\delta \geqslant 0\), by a \(\delta \)-hyperbolic metric space M, we understand a metric space M such that for every \(x,y,z,o \in M\), we have \((x|y)_o \geqslant (x|z)_o \wedge (z|y)_o -\delta \), where \((.|.)_.\) is the Gromov product given by \((x|y)_o=\frac{1}{2}(d(x,o)+d(y,o)-d(x,y))\). A probability measure \(\mu \) is called non-elementary if its support S generates a semigroup that contains two independent loxodromic elements (see §3.2). We can now state

Theorem 1.1

Let (M,d) be a proper geodesic \(\delta \)-hyperbolic space and \(o\in M\). Assume that the group \(G={\text {Isom}}(M)\) acts cocompactly on M. Then, there exists an explicit positive function D(., .) with \(D(.,\lambda )<\infty \) for every \(\lambda \in (0,1)\) such that for every non-elementary probability measure \(\mu \) on G with bounded support S, for every \(t \geqslant 0\) and \(n \in {\mathbb {N}}\) we have

$$\begin{aligned} {\mathbb {P}}\left( |\kappa (R_n) - n \ell (\mu )|\geqslant nt \right) \leqslant 2 \exp \left( \frac{-nt^2}{ \kappa _S^2 D(\kappa _S,\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)} \right) \end{aligned}$$
(1.2)

for every \(r \in [0,1)\).

This statement will follow from a more general concentration result (Theorem 4.1) for the Busemann cocycle on the horofunction compactification of M.

To convey the dependence of this upper bound on the quantities involved, and for practical use, in the following remark we provide a function that one can substitute for the function D in the previous result.

Remark 1.2

(On the upper bound) One can take

$$\begin{aligned} D(\kappa , \lambda )= 32 \left( 16\ln ^+ (\kappa )+8A_0/3+33 \right) ^2 \frac{1}{(1-\sqrt{\lambda })^4}, \end{aligned}$$

where \(A_0= \left( \frac{\mu _G \left( B_{2R(\delta )+2D_0} \right) }{\mu _G \left( B_{R(\delta ) +D_0}\right) } \right) ^{1/2}\) with \(R(\delta )=14\delta +4\), for \(r \geqslant 0\), \(B_r:=\{g \in G : d(go,o) \leqslant r\}\), and \(D_0:=2\text {diam}( G\backslash M)\). We also set \(D(\kappa ,1)=\infty \). Note that if \(\mu \) is non-elementary, then for every \(r \in (0,1)\), we have \(\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2<1\) (see Remark 4.4). Also note that if \(\mu \) is symmetric, then \(\Vert \lambda _G(\mu )\Vert _2=\Vert \lambda _G(\mu _{0,{\text {lazy}}})\Vert _2<1\).
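For practical use, the right-hand side of (1.2) with this choice of D is straightforward to evaluate numerically. The sketch below (ours) does so; the constant \(A_0\) is left as an input since it depends on the Haar-measure ratio above, and the sample values of \(\kappa _S\), of the operator norm and of \(A_0\) are made up for illustration. Since the explicit constants are large, the bound becomes non-trivial only for fairly large n.

```python
import math

def D(kappa, lam, A0):
    """Explicit function of Remark 1.2; A0 is the Haar-measure ratio (an input here)."""
    if lam >= 1.0:
        return math.inf
    ln_plus = max(math.log(kappa), 0.0)                      # ln^+(kappa)
    return 32.0 * (16.0 * ln_plus + 8.0 * A0 / 3.0 + 33.0) ** 2 / (1.0 - math.sqrt(lam)) ** 4

def bound_1_2(n, t, kappa_S, lam, A0):
    """Right-hand side of (1.2)."""
    return 2.0 * math.exp(-n * t ** 2 / (kappa_S ** 2 * D(kappa_S, lam, A0)))

# Illustrative parameters only (kappa_S, lam and A0 below are made up).
for n in (10 ** 12, 10 ** 13, 10 ** 14):
    print(n, bound_1_2(n, t=0.1, kappa_S=2.0, lam=0.9, A0=5.0))
```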

Remark 1.3

  1. (Non-proper case) As mentioned earlier, we also obtain subgaussian concentration estimates without the properness assumption, but in this case the dependence on \(\mu \) on the right-hand side of (1.2) is less explicit (Proposition 3.4).

  2. (Random walks with unbounded support) It is possible to prove a version of our result in which the bounded support assumption on the probability measure \(\mu \) is replaced by a finite exponential moment assumption, yielding a Bennett–Bernstein type concentration inequality. However, the constants that appear in that version are more complicated to express. This point is discussed in more detail in Remark 4.9.

In the sequel, we will see that each of the two aspects of the upper bound in Theorem 1.1, namely its subgaussian form and its parameters of dependence, has implications and strengthenings. On the one hand, by combining this upper bound with versions of uniform Tits alternatives in various contexts (which entail uniform bounds for \(\Vert \lambda _G(\mu )\Vert _2\), see Lemma 5.2), we will obtain uniform concentration estimates for a class of driving probability measures, see Corollaries 1.4 and 1.6. On the other hand, the subgaussian character allows us, for instance, to provide a global quadratic lower bound (see Corollary 1.8) for the rate function of large deviations, recently studied in this setting in [7]. Let us now explain these consequences.

1.1.1 The case of hyperbolic and rank-one linear groups

Firstly, specifying Theorem 1.1 to hyperbolic groups, and using Koubi’s uniform Tits alternative [43, Theorem 5.1], we obtain the following more precise concentration result for random walks on hyperbolic spaces.

Corollary 1.4

Let (M,d) be a proper geodesic hyperbolic metric space and \(o\in M\). Then there exists a constant \(A_M>0\) such that for any group \(\Gamma <G\) that acts properly and cocompactly on M, there exist constants \(\alpha _\Gamma >0\) and \(N_\Gamma \in {\mathbb {N}}\) depending only on \(\Gamma \) such that for every non-elementary probability measure \(\mu \) of finite support S generating \(\Gamma \), for every \(t>0\) and \(n\in {\mathbb {N}}\), setting \(m_\mu =\min _{g \in S}\mu (g)\), we have

$$\begin{aligned} {\mathbb {P}}\left( |\kappa (R_n) - n\ell (\mu )|\geqslant nt \right) \leqslant 2 \exp \left( \frac{-nt^2}{m_\mu ^{N_\Gamma } \alpha _\Gamma \kappa _S^2 (\ln ^+(\kappa _S) + A_M)}\right) \end{aligned}$$

Remark 1.5

Using the quantitative Tits alternative in the recent work of Cavallucci–Sambusetti [15, Theorem 1.1], under additional assumptions on the hyperbolic space (M,d) (such as the existence of a convex geodesic bicombing with certain properties) and for a torsion-free group \(\Gamma \), one can provide a version of the previous corollary dropping the cocompactness assumption of the \(\Gamma \)-action and replacing the constants \(\alpha _\Gamma \) and \(N_\Gamma \) with constants depending only on a packing parameter of the hyperbolic space M (see [15, §2.2]).

Specifying Theorem 1.1 to rank-one matrix groups and using the strong Tits alternative of Breuillard [9, 10], we obtain concentration estimates for random matrix products in discrete non-amenable subgroups of rank-one semisimple algebraic groups. A further aspect of the following corollary is that, thanks to the work of Breuillard, the implied constants can be effectively calculated.

We need some notation to state the next corollary. Let \(\mathrm {k}\) be a local field (i.e. in characteristic zero, \({\mathbb {R}}\), \({\mathbb {C}}\) or a finite extension of \({\mathbb {Q}}_p\) for a prime number p, and in positive characteristic, a finite extension of \({\mathbb {F}}_p((T))\)). We denote by \(\Vert \cdot \Vert \) the canonical norm on \(\mathrm {k}^d\) for a fixed discrete valuation on \(\mathrm {k}\) and consider the associated operator norm on the space of \(d\times d\)-matrices. Moreover, if S is a finite subset of \({\text {Mat}}_d(\mathrm {k})\), we set \(\kappa _S:=\sup \{\ln \Vert g\Vert : g\in S\}\). Finally, if \(\mu \) is a probability measure with finite first order moment on \(\text {GL}_d(\mathrm {k})\), we denote by \(\ell (\mu )\) the top Lyapunov exponent, i.e. the almost sure limit of \( \frac{1}{n} \ln \Vert R_n\Vert \).

Corollary 1.6

Let \(\mathrm {k}\) be a local field and \({\mathbb {H}}\subseteq \mathrm {SL}_d\) a connected semisimple linear algebraic group of \(\mathrm {k}\)-rank one defined over \(\mathrm {k}\). There exist constants \(\alpha _d>0\) and \(N_d \in {\mathbb {N}}\) depending only on the dimension d, and a constant \(A_{{\mathbb {H}}, \mathrm {k}}>0\) depending only on \({\mathbb {H}}\) and \(\mathrm {k}\), such that for every finitely supported probability measure \(\mu \) whose support S generates a non-amenable discrete subgroup of \({\mathbb {H}}(\mathrm {k})\), for every \(t>0\) and \(n \in {\mathbb {N}}\), the following holds (with \(m_\mu =\min _{g \in S}\mu (g)\) as in Corollary 1.4):

$$\begin{aligned} {\mathbb {P}}\left( | \frac{1}{n} \ln \Vert R_n\Vert - \ell (\mu )| \geqslant t \right) \leqslant 2 \exp \left( \frac{- n t^2}{m_\mu ^{N_d} \alpha _d \kappa _S^2(\ln ^+(\kappa _S)+A_{{\mathbb {H}}, \mathrm {k}})^2} \right) . \end{aligned}$$
(1.3)

Remark 1.7

(About the discreteness assumption)

  1. Both of the above corollaries are obtained from Theorem 1.1 in the following way: the respective versions of Tits alternatives allow us to deduce bounds on the norm \(\Vert \lambda _\Gamma (\mu )\Vert \) of the regular representation on \(\ell ^2(\Gamma )\), which is equal to \(\Vert \lambda _G(\mu )\Vert \) thanks to the discreteness assumption. In general, even though we have uniform upper bounds for \(\Vert \lambda _\Gamma (\mu )\Vert \), we are not able to transfer these to a bound on \(\Vert \lambda _G(\mu )\Vert \) without the discreteness assumption. Indeed, by [13, 44], in any connected semisimple Lie group G, for any element \(g \in G\), one can find pairs of elements \(\{a_n,b_n\}\) that converge to g and that generate a non-abelian free group, so that for the uniform probability measure \(\mu _n\) supported on \(\{a_n,b_n,a_n^{-1},b_n^{-1}\}\), we have \(\frac{\sqrt{3}}{2}=\Vert \lambda _\Gamma (\mu _n)\Vert <\Vert \lambda _G(\mu _n)\Vert \rightarrow 1\).

  2. We also note that under the discreteness assumption, the fact that the support S generates a non-elementary group implies, thanks to various versions of the Margulis Lemma, a positive lower bound for \(\kappa _S\). This lower bound depends in Corollary 1.4 on some parameters of M and the group generated by S (see [6, Theorem 5.21]). In Corollary 1.6, it depends only on \({\mathbb {H}}(\mathrm {k})\) (see e.g. [2, Chapter 8]).

1.1.2 Rate function of LDP

We now mention a consequence of Theorem 1.1 concerning the rate function of the large deviation principle for random walks on hyperbolic spaces recently studied in [7]. The authors prove that the sequence of random variables \(\frac{\kappa (R_n)}{n}\) satisfies a large deviation principle with a proper convex rate function \(I_\mu : [0,\infty ) \rightarrow [0,+\infty ]\) vanishing only at the drift \(\ell (\mu )\). Recall that this means that \(I_\mu \) is a lower-semicontinuous function such that for every measurable subset J of \({\mathbb {R}}\), we have

$$\begin{aligned} -\inf _{\alpha \in {\text {int}}(J)} I_\mu (\alpha ) \leqslant \underset{n \rightarrow \infty }{\liminf } \frac{1}{n}\ln {\mathbb {P}}\left( \frac{\kappa (R_n)}{n} \in J\right) \leqslant \underset{n \rightarrow \infty }{\limsup } \frac{1}{n}\ln {\mathbb {P}}\left( \frac{\kappa (R_n)}{n} \in J\right) \leqslant -\inf _{\alpha \in {\overline{J}}} I_\mu (\alpha ) \end{aligned}$$
(1.4)

where \({\text {int}}(J)\) denotes the interior and \({\overline{J}}\) the closure of J. To the best of our knowledge, no explicit global estimate for the rate function exists in the literature. Theorem 1.1 allows us to give an explicit quadratic lower bound for the rate function \(I_\mu \) in our setting, i.e. when M is proper and the non-elementary probability measure \(\mu \) has a bounded support.

Corollary 1.8

(Quadratic lower bound) Under the assumptions of Theorem 1.1, for every \(t \in [0,\infty )\) the rate function \(I_\mu \) of the sequence \(\frac{1}{n}\kappa (R_n)\) satisfies

$$\begin{aligned} I_\mu (t)\geqslant \frac{(t-\ell (\mu ))^2}{\kappa _S^2 D\left( \kappa _S, \Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2 \right) }, \end{aligned}$$

for every \(r \in [0,1)\).

The proof of this corollary is immediate from the property (1.4) defining the function \(I_\mu \) and the estimate given by Theorem 1.1.
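Spelled out, the argument runs as follows: for \(t \ne \ell (\mu )\) and \(0<\varepsilon <|t-\ell (\mu )|\), applying the first inequality in (1.4) to the open interval \(J=(t-\varepsilon ,t+\varepsilon )\), then the inclusion \(\{\frac{\kappa (R_n)}{n} \in J\} \subseteq \{|\frac{\kappa (R_n)}{n}-\ell (\mu )| \geqslant |t-\ell (\mu )|-\varepsilon \}\) together with (1.2), one gets

$$\begin{aligned} -I_\mu (t) \leqslant -\inf _{\alpha \in J} I_\mu (\alpha ) \leqslant \underset{n \rightarrow \infty }{\liminf } \frac{1}{n}\ln {\mathbb {P}}\left( \frac{\kappa (R_n)}{n} \in J\right) \leqslant \frac{-\left( |t-\ell (\mu )|-\varepsilon \right) ^2}{\kappa _S^2 D\left( \kappa _S, \Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2 \right) }, \end{aligned}$$

and letting \(\varepsilon \rightarrow 0\) yields the claimed lower bound; for \(t=\ell (\mu )\) there is nothing to prove.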

Remark 1.9

In the general case of random walks with a finite exponential moment, one clearly cannot get such a quadratic lower bound; see Remark 4.9 for the type of global lower bound that one can obtain using our methods.

1.2 Quantitative probabilistic Tits alternative

It is known since the foundational work of Gromov [34] that groups acting non-elementarily on hyperbolic spaces contain non-abelian free subgroups. The main result of this part is a probabilistic quantification of this fact: if we sample two independent random walks at their \(n^{th}\) steps, the probability that the two elements generate a free group of rank two is exponentially close to one. Moreover, an important aspect is that this probability is explicitly described in terms of the norm of the driving measure \(\mu \) in the regular representation and the size of its support.

1.2.1 Probabilistic free-subgroup theorem

Theorem 1.10

Keep the assumptions of Theorem 1.1. Then, there exist explicit functions \(n_0(\cdot )\) and \(T(\cdot , \cdot )\), both with values in \((0,+\infty )\), such that for any non-elementary probability measure \(\mu \) on \(G={\text {Isom}}(M)\) with bounded support S, denoting by \((R_n)_{n\in {\mathbb {N}}}\) and \((R'_n)_{n\in {\mathbb {N}}}\) two independent random walks driven by \(\mu \), for every \(n> n_0 \left( \Vert \lambda _{G}(\mu _{1/2, {\text {lazy}}})\Vert _2 \right) \), we have

$$\begin{aligned} {\mathbb {P}}\left( \langle R_n, R'_n \rangle \,\text {is free} \right) \geqslant 1-50 \exp \left( -n\, T(\kappa _S, \Vert \lambda _{G}(\mu _{1/2, {\text {lazy}}})\Vert _2) \right) . \end{aligned}$$
(1.5)

We proceed with a few remarks on the statement and some consequences.

Remark 1.11

(The explicit estimate)

  (i) For the functions appearing in the above statement, one can take

    $$\begin{aligned} T(\kappa , \lambda )= \frac{1}{A_M} \frac{(\ln \lambda )^2 (1-\sqrt{\lambda })^4}{\kappa ^2(\ln ^+(\kappa ) +1)^2} \qquad \text {and} \qquad n_0(\lambda )=2-A_M \frac{1}{\ln \lambda }, \end{aligned}$$

    where the constant \(A_M>0\) is related only to a doubling constant of the Haar measure on G and to the diameter of \(G \backslash M\) (see (6.21) for its expression).

  (ii) Unlike in our previous results, the left-hand side of (1.5) is independent of the choice of basepoint o. One can therefore replace \(\kappa _S\) by the joint minimal displacement L(S) of S ([12]), given by \(\inf _{x \in M} \sup _{s \in S} d(sx,x)\), which is independent of any basepoint.

  (iii) Finally, the choice of 1/2 for the lazy random walk \(\mu _{1/2,{\text {lazy}}}\) is for convenience: it ensures that the associated operator norm is strictly less than one (which might not be the case for \(\mu \) itself, due to possible non-symmetry of \(\mu \); see Remarks 4.4 and 6.9).

Remark 1.12

Using similar techniques, one can also prove a more general version of this result where several (more than two) independent copies of random walks, even with different step-distributions, are considered.

1.2.2 Some consequences

  • For discrete subgroups of \({\text {Isom}}(M)\), in the respective settings, using Corollaries 1.4 and 1.6 (see also Remark 1.5), we can deduce an explicit expression for the right-hand side of (1.5) as well as for its range of validity controlled by \(n_0(\cdot )\) (see Remark 6.11).

  • Moreover, it is known that for a discrete subgroup of isometries \(\Gamma \) of a proper geodesic hyperbolic space M such that \({\text {Isom}}(M)\) acts cocompactly on M, the group \(\Gamma \) is either virtually nilpotent or non-elementary (see e.g. [15, Corollary 3.13]). Hence Theorem 1.10 can be seen as a quantitative probabilistic Tits alternative for discrete groups of isometries of M.

  • Theorem 1.10 gives an explicit version of a result by Taylor–Tiozzo [57, Corollary 1.6] under additional hypotheses. We also refer to Gilman–Miasnikov–Osin [29, Theorem 1.2] for a previous result in the particular case of Gromov-hyperbolic groups. Finally, in the setting of discrete subgroups of rank one semi-simple linear algebraic groups, Theorem 1.10 provides an effective version of the probabilistic Tits alternative proved by the first-named author (see [1, Theorem 1.1]).

1.3 Random matrix products

The concentration estimates that we obtain in Sect. 2 for general cocycles also allow us to deduce concentration estimates for random matrix products in arbitrary dimension, but these are less explicit compared to Theorem 1.1. Before stating the result, we recall some known facts; we refer to §3.1 for more details. Let \(\mu \) be a probability measure on \({\text {GL}}_d({\mathbb {C}})\) whose support generates a strongly irreducible and proximal subgroup; then there exists a unique \(\mu \)-stationary probability measure \(\nu \) on the projective space of \({\mathbb {C}}^d\) ([23, 37]). The stationary measure \(\nu \) enjoys some regularity properties. It is non-degenerate (i.e. it does not charge any proper hyperplane) [23], log-regular under a finite second order moment [3] and Hölder regular under a finite exponential moment assumption [36]. Suppose now that \(\mu \) has bounded support and consider \({\mathfrak {c}}(\mu ):= \sup _{x\in {\mathbb {C}}^d\setminus \{0\}} \int \ln \frac{\Vert x\Vert \,\Vert y\Vert }{|\langle x, y\rangle |} d\nu ({\mathbb {C}}y)\). It follows from the aforementioned regularity properties that this quantity is finite. Finally, we denote by \(\mu ^*\) the pushforward of \(\mu \) by the map \(g\mapsto g^{*}\), where \(g^*\) is the conjugate-transpose of g. With these at hand, we are now ready to state

Proposition 1.13

Let \(\mu \) be a boundedly supported probability measure on \({\text {GL}}_d({\mathbb {C}})\) such that the semigroup generated by the support S of \(\mu \) is strongly irreducible and proximal. Let \(\kappa _S:=\max \{\ln \Vert g\Vert \vee \ln \Vert g^{-1}\Vert ; g \in S\}\) and \({\mathfrak {c}}={\mathfrak {c}}(\mu ^*)\). Then, for every \(t>0\) and \(n \in {\mathbb {N}}\), we have

$$\begin{aligned} \sup _{v\in {\mathbb {C}}^d\setminus \{0\}}{\mathbb {P}}\left( \left| \ln \frac{\Vert R_n v \Vert }{ \Vert v\Vert } - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2\exp \left( -\frac{nt^2}{32 (\kappa _S +{\mathfrak {c}})^2}\right) . \end{aligned}$$

In particular, for every \(t>0\) and \(n \in {\mathbb {N}}\) such that \(nt \geqslant \ln d\), the following holds:

$$\begin{aligned} {\mathbb {P}}\left( \left| \ln \Vert R_n \Vert - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2d \exp \left( -\frac{nt^2}{128 (\kappa _S+ {\mathfrak {c}})^2}\right) . \end{aligned}$$

In this result, the fact that we have subgaussian estimates for every \(t>0\) small enough can also be deduced from the spectral gap result of Le Page [45] using analytic perturbation methods. We also refer to [3, 8] for exponential deviation estimates in a more general setting and to [19, Ch. 5] for local concentrations that are uniform over small neighborhoods of irreducible cocycles.
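As a quick numerical illustration of the concentration in Proposition 1.13, the following sketch (ours) estimates \(\frac{1}{n} \ln \Vert R_n\Vert \) over independent runs for \(\mu \) the uniform measure on the two Sanov generators of a free subgroup of \(\mathrm {SL}_2({\mathbb {Z}})\); the group they generate is free and Zariski dense, so \(\mu \) is strongly irreducible and proximal. All variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# mu = uniform measure on the two Sanov matrices; the subgroup they generate is
# free and Zariski dense in SL_2(R), hence mu is strongly irreducible and proximal.
S = [np.array([[1.0, 2.0], [0.0, 1.0]]),
     np.array([[1.0, 0.0], [2.0, 1.0]])]

def log_norm(n):
    """(1/n) ln ||R_n|| (operator norm), renormalizing to avoid overflow."""
    R = np.eye(2)
    log_scale = 0.0
    for _ in range(n):
        R = R @ S[rng.integers(len(S))]
        s = np.linalg.norm(R)                    # Frobenius norm, used as a scale only
        R = R / s
        log_scale += np.log(s)
    return (log_scale + np.log(np.linalg.norm(R, 2))) / n

n = 1_000
samples = np.array([log_norm(n) for _ in range(100)])
print("mean of (1/n) ln ||R_n|| (close to the top Lyapunov exponent):", samples.mean())
print("spread across runs (fluctuations of order n^{-1/2}):", samples.std())
```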

Remark 1.14

Similarly to Corollary 1.8, the estimate in Proposition 1.13 allows one to obtain a global lower bound (less explicit in its constants compared to the aforementioned corollary) for the rate function of log-norms of random matrix products studied in [55, 60] (see also [55, Corollary 4.17]).

We end the introduction by mentioning that

  • the methods we use to prove Theorem 1.1 allow us to provide an explicit lower bound for the bottom of the support of the Hausdorff spectrum of the harmonic measure, equivalently, for the exponent with which the Frostman property holds (see §4.2 and see also Tanaka [56] for a thorough discussion of multifractal analysis of the harmonic measure in the particular case of hyperbolic groups);

  • Theorem 1.1 itself has a direct application to the continuity of the drift (§4.3);

  • in view of Horbez’s work [39], it seems possible that our results in §2 can be used to obtain subgaussian concentration estimates in the setting of random walks on mapping class groups and on the group \(Out(F_N)\) of outer automorphisms of a non-abelian free group.

1.4 Organization

The article is organized as follows. In Sect. 2, we prove concentration estimates for a general cocycle that satisfies a certain cohomological equation (Proposition 2.1). In Sect. 3, we deduce non-explicit concentration estimates for random matrix products in arbitrary dimension (Proposition 1.13) and for random walks on hyperbolic spaces (Proposition 3.4). In Sect. 4, we prove Theorem 1.1. In Sect. 5, we prove Corollaries 1.4 and 1.6. Finally in Sect. 6, we deduce Theorem 1.10 from Theorem 1.1, a uniform positive lower bound on the drift (Proposition 6.8) and a general result estimating the likelihood of obtaining free subgroups from random walks based on uniform large deviation estimates (Proposition 6.1).

2 Concentration inequalities for cocycles satisfying a Poisson equation

The goal of this section is to prove Proposition 2.1, yielding concentration inequalities for the values of a cocycle for which the associated Poisson equation has a bounded measurable solution. This result will provide the basis for the rest of the article, where we will obtain more precise versions in the particular setups discussed in the Introduction. We note that this section is inspired by the work of Furstenberg–Kifer [25], of which it can be seen as a quantitative analogue under an additional assumption (see Remark 2.2).

We start by recalling some standard terminology. Let G be a Polish group (endowed with the Borel \(\sigma \)-algebra) and X a standard Borel space endowed with a measurable action of G. We shall refer to such a space as a G-space. A function \(\sigma :G \times X \rightarrow {\mathbb {R}}\) is said to be an additive cocycle if it satisfies \(\sigma (g_1g_2,x)=\sigma (g_1, g_2x)+ \sigma (g_2,x)\) for every \(g_1,g_2 \in G\) and \(x \in X\). All cocycles will be assumed to be measurable. Given a probability measure \(\mu \) on G, a probability measure \(\nu \) on X is said to be \(\mu \)-stationary if for every bounded measurable function \(\phi \), we have \(\int \int \phi (gx) d\mu (g) d\nu (x)=\int \phi (x) d\nu (x)\). We denote by \(P_\mu \) the Markov operator acting on bounded measurable functions on X by \(P_\mu \phi (x)=\int \phi (gx) d\mu (g)\). Finally, denoting by \((X_i)_{i \in {\mathbb {N}}}\) a sequence of independent G-valued random variables with distribution \(\mu \), we write \(L_n\) for the left product \(X_n\cdots X_1\). Although \(L_n\) and \(R_n\) have the same distribution, it will be more convenient in this section to work with the left random walk \(L_n\).

Proposition 2.1

Let G be a Polish group, X a G-space and \(\sigma : G\times X\rightarrow {\mathbb {R}}\) a bounded additive cocycle. Let \(\mu \) be a probability measure on G with support S. Denote by

$$\begin{aligned} \kappa _S:=\sup \{\sup _{x\in X}|\sigma (g, x)| : g\in S\}. \end{aligned}$$

Let \(\nu \) be a \(\mu \)-stationary probability measure on X and

$$\begin{aligned} \ell (\mu ):=\int _{G\times X}{\sigma (g,x) d\mu (g) d\nu (x)}. \end{aligned}$$

Assume that the set E of bounded measurable solutions \(\psi \) of the Poisson equation

$$\begin{aligned} \psi (x) - P_{\mu }(\psi )(x)=\int _{G}{\sigma (g,x) d\mu (g)} - \ell (\mu ) \end{aligned}$$
(2.1)

is non-empty and let \({\mathfrak {c}}:= \inf \{\Vert \psi \Vert _{\infty } : \psi \in E\}\). Then, for every \(t>0\), \(n\in {\mathbb {N}}\), and \(x\in X\) we have

$$\begin{aligned} {\mathbb {P}}\left( |\sigma (L_n, x) - n\ell (\mu )|\geqslant nt \right) \leqslant 2\exp \left( -\frac{nt^2}{32 (\kappa _S +{\mathfrak {c}})^2}\right) . \end{aligned}$$

Remark 2.2

  1. Our assumption (2.1) implies that there is a unique cocycle average in the sense of [3, §3].

  2. This result can be seen as an abstract quantitative refinement of [25, Theorem 2.1] under the assumption that the expected increase function is cohomologous to a constant.

The proof of the previous result is based on the following general probabilistic ingredient. We start by recalling some standard terminology on Markov chains. Let M be a standard Borel space and P a Markov operator on M, i.e. a measurable map \(x \mapsto P_x\) from M to the space of probability measures on M. This data naturally defines an operator on the space of bounded Borel functions on M by \(\phi \mapsto P\phi \), where \(P \phi (x)= \int \phi (y) dP_x(y)\). Given \(x \in M\), we denote by \({\mathbb {P}}_x\) the law of the Markov chain \((Z_n)_n\) on the space of trajectories, i.e. \(M^{\mathbb {N}}\), and by \({\mathbb {E}}_x\) the associated expectation operator. We say that a probability measure \(\pi \) is invariant (or stationary) under the Markov operator P if \(\int P\phi d\pi = \int \phi d\pi \) for every bounded measurable function \(\phi \) on M.

Proposition 2.3

Let \((Z_n)\) be a Markov chain on a standard Borel space M associated to the Markov operator P. Let \(\pi \) be a P-stationary probability measure on M. Let f be a bounded measurable function on M. We assume that f is cohomologous to \(\int _{M} {f\,d\pi }\), i.e.  there exists a bounded measurable solution \(\phi \) of the equation:

$$\begin{aligned} \phi - P\phi =f - \int _{M} {f\,d\pi }. \end{aligned}$$
(2.2)

Then, for every \(t>0\), \(n\in {\mathbb {N}}\) and \(x \in M\), the following inequality holds

$$\begin{aligned} {\mathbb {P}}_{x} \left( \Big | \sum _{i=1}^n{f(Z_i)} - n \int _{M} {f\,d\pi } \Big | \geqslant nt \right) \leqslant 2 \exp \left( -\frac{n t^2}{32 \Vert \phi \Vert _{\infty }^2}\right) . \end{aligned}$$

Remark 2.4

  1. Let M be a compact metric space and \(f: M\rightarrow {\mathbb {R}}\) a continuous function. Suppose, for simplicity, that the operator P is Markov-Feller and that f has a unique average \(\ell _f:=\int f d\pi \) for all P-stationary probability measures \(\pi \). Even though f may not be cohomologous to the constant \(\ell _f\), Furstenberg–Kifer [25, Lemma 3.1] showed that for every \(t>0\), there exists a continuous function \(h_{t}\) on M, cohomologous to f and such that \(\Vert h_{t}\Vert _{\infty }\leqslant \ell _f + t\). This can be used to show the exponential decay of \({\mathbb {P}}(|\sum _{i=1}^n{f(Z_i)} - n \ell _f|>nt)\). This is a particular case of Benoist–Quint’s [3, Proposition 3.1]. In Proposition 2.3, thanks to the stronger assumption (2.2), one obtains subgaussian exponential decay with explicit constants.

  2. In some particular cases, powerful concentration inequalities exist for the sums of any function along the Markov chain [18, 28]. They are not applicable here since our Markov chains are not geometrically ergodic. On the other hand, the particular requirement (2.2) on the function f allows us to use the usual Hoeffding inequality for martingales and thereby deduce the previous concentration estimates in the generality of Markov chains that we consider, as illustrated in the sketch below.
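To see Proposition 2.3, and the martingale decomposition (2.3) used in its proof, at work on a concrete finite-state example, one can solve the Poisson equation by linear algebra and verify the identity numerically. The chain P, the function f and all names below are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy finite-state illustration of Proposition 2.3 (not taken from the paper).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
f = np.array([1.0, -2.0, 0.5])

# Stationary distribution pi: left Perron eigenvector of P, normalized.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
pi = pi / pi.sum()
alpha = float(pi @ f)

# Bounded solution phi of the Poisson equation phi - P phi = f - alpha
# (solvable since pi @ (f - alpha) = 0); lstsq picks one representative.
phi, *_ = np.linalg.lstsq(np.eye(3) - P, f - alpha, rcond=None)

# One trajectory; check the decomposition (2.3):
#   sum_{i=1}^n f(Z_i) - n*alpha = M_n + [phi(Z_1) - phi(Z_{n+1})].
n = 10_000
Z = [0]
for _ in range(n + 1):
    Z.append(int(rng.choice(3, p=P[Z[-1]])))
Z = np.array(Z)

lhs = f[Z[1:n + 1]].sum() - n * alpha
M_n = sum(phi[Z[i + 1]] - P[Z[i]] @ phi for i in range(1, n + 1))
print(np.isclose(lhs, M_n + phi[Z[1]] - phi[Z[n + 1]]))         # True

# Subgaussian bound of Proposition 2.3 at normalized deviation t.
t = 0.2
print(2 * np.exp(-n * t ** 2 / (32 * np.max(np.abs(phi)) ** 2)))
```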

Proof of Proposition 2.1

We start by defining the appropriate objects to which we will apply Proposition 2.3. We take the standard Borel space M to be \(S \times X\) and P the Markov operator defined by

$$\begin{aligned} Pf\left( (g,x)\right) = \int _{G} { f(\gamma , \gamma x)\, d\mu (\gamma )} \end{aligned}$$

for every bounded measurable function f on M. The associated Markov chain \((Z_n)_{n\in {\mathbb {N}}}\) on M starting from \(Z_0=(e,x)\) is the process

$$\begin{aligned} Z_0=(e,x)\,,\, Z_1=(g_1, g_1\cdot x)\,,\, Z_2=(g_2, g_2 g_1 \cdot x)\,\cdots \, Z_n=(g_n, L_n\cdot x), \cdots , \end{aligned}$$

where the \(g_i\)’s are iid random variables on G with distribution \(\mu \). Let \(\pi \) be the probability measure on M defined by

$$\begin{aligned} \int _{M}\,{f\,d\pi } := \iint _{S\times X} {f(g,g\cdot x)\, d\mu (g)\,d\nu (x)} \end{aligned}$$

for every bounded measurable f on M. Since \(\nu \) is \(\mu \)-stationary, one readily checks that \(\pi \) is stationary for the Markov operator P. Let now

$$\begin{aligned} f: M \longrightarrow {\mathbb {R}}, (g,x) \longmapsto f(g,x):=\sigma (g,g^{-1} \cdot x). \end{aligned}$$

The following properties are immediate to check

  • Starting from \(Z_0=(e,x)\), we have \(\sum _{i=1}^n {f(Z_i)} = \sigma (L_n,x)\),

  • \(\int _{M} f\,d\pi = \iint f(g,g\cdot x)\,d\mu (g)\,d\nu (x) = \iint \sigma (g,x) \,d\mu (g)\,d\nu (x)=\ell (\mu )\)

  • \(\Vert f\Vert _{\infty } \leqslant \kappa _S\).

Finally, we check that if (2.1) holds for some \(\psi \), then (2.2) holds. Indeed, let

$$\begin{aligned} \phi : M \longrightarrow {\mathbb {R}}, (g,x) \longmapsto \phi (g,x):= \psi (x)+f(g,x). \end{aligned}$$

One readily checks that \(P\psi =P_{\mu }\psi \) and \(Pf(g,x)= \int _G \sigma (\gamma ,x)\,d\mu (\gamma )\). Thus, by (2.1), \(\phi - P \phi =f - \int _M{f \,d\pi }\), and (2.2) is fulfilled. Since \(\Vert \phi \Vert _{\infty }\leqslant \Vert \psi \Vert _{\infty }+\kappa _S\), Proposition 2.1 follows from Proposition 2.3. \(\square \)

Proof of Proposition 2.3

Let \(\alpha :=\int _{M}{f\,d\pi }\) and \(\phi \) as in the statement so that \(f-\alpha = \phi - P \phi \). We write

$$\begin{aligned} \sum _{i=1}^{n}{f(Z_i) - n \alpha } =\sum _{i=1}^{n}{\left[ \phi (Z_{i+1}) - P\phi (Z_i)\right] } + \left[ \phi (Z_1)-\phi (Z_{n+1})\right] . \end{aligned}$$
(2.3)

On the one hand, the sequence \(D_i:= \phi (Z_{i+1}) - P\phi (Z_i)\) is a martingale difference sequence with respect to the canonical filtration of \((Z_i)_i\). Moreover, \(|D_i|\leqslant 2 \Vert \phi \Vert _{\infty }\). Thus \(M_n:=\sum _{i=1}^n{\left[ \phi (Z_{i+1}) - P\phi (Z_i)\right] }\) is a martingale with bounded differences. Applying the Azuma–Hoeffding concentration inequality for martingales with bounded differences (see for instance [52, Lemma 4.1]), we get that for every \(t>0\) and \(n\in {\mathbb {N}}\),

$$\begin{aligned} {\mathbb {P}}\left( M_n \geqslant nt/2\right) \leqslant \exp \left( -\frac{nt^2}{32 \Vert \phi \Vert _{\infty }^2} \right) \,\,\,\text {and}\,\,\, {\mathbb {P}}\left( M_n \leqslant -nt/2\right) \leqslant \exp \left( -\frac{nt^2}{32 \Vert \phi \Vert _{\infty }^2} \right) . \end{aligned}$$
(2.4)

On the other hand, the following crude upper bound holds for \(V_n:= \phi (Z_1)-\phi (Z_{n+1})\); for every \(n\in {\mathbb {N}}\), we have \(|V_n|\leqslant 2\Vert \phi \Vert _{\infty }\). Hence, \(|V_n|\leqslant nt/2\) for every \(n\geqslant \frac{4\Vert \phi \Vert _{\infty }}{t}\). Combining this fact with (2.3) and (2.4), we get that for every \(t>0\) and every \(n\geqslant \frac{4\Vert \phi \Vert _{\infty }}{t}\),

$$\begin{aligned} {\mathbb {P}}\left( \sum _{i=1}^n{f(Z_i)} - n\int _{M}{f\,d\pi } \geqslant nt\right) \leqslant \exp \left( -\frac{nt^2}{32 \Vert \phi \Vert _{\infty }^2}\right) \end{aligned}$$

and

$$\begin{aligned} {\mathbb {P}}\left( \sum _{i=1}^n{f(Z_i)} - n\int _{M}{f\,d\pi } \leqslant -nt\right) \leqslant \exp \left( -\frac{nt^2}{32 \Vert \phi \Vert _{\infty }^2}\right) . \end{aligned}$$

Thus \({\mathbb {P}}\left( |\sum _{i=1}^n{f(Z_i)} - n\int _{M}{f\,d\pi } | \geqslant nt\right) \leqslant 2\exp \left( -\frac{nt^2}{32 \Vert \phi \Vert _{\infty }^2} \right) \). This shows the desired inequality in the case \(n \geqslant \frac{1}{t} 4\Vert \phi \Vert _{\infty }\). Suppose finally that \(n \leqslant \frac{1}{t} 4 \Vert \phi \Vert _{\infty }\). Note that we may assume \(t \leqslant 2\Vert \phi \Vert _{\infty }\): since \(|f-\alpha |=|\phi -P\phi |\leqslant 2\Vert \phi \Vert _{\infty }\), the probability in question vanishes when \(t>2\Vert \phi \Vert _{\infty }\). In this case, \(nt^2 \leqslant 8\Vert \phi \Vert _{\infty }^2 \leqslant 16\Vert \phi \Vert _{\infty }^2 \) and then \(\exp \left( -\frac{nt^2}{32\Vert \phi \Vert _{\infty }^2}\right) \geqslant \exp (-1/2)>\frac{1}{2}\). The desired estimate holds trivially in this case. \(\square \)

3 Applications to random matrix products and random walks on hyperbolic spaces

The goal of this section is to obtain two consequences of Proposition 2.1, in the settings of random matrix products and of random walks on hyperbolic spaces M. For the latter, in this section, we will not make any properness assumption and, relatedly, we are only able to obtain non-explicit concentration estimates. In §4, we will upgrade these to more explicit estimates in the case of proper hyperbolic spaces.

3.1 Subgaussian concentrations for random matrix products

Let \(d \geqslant 1\) be an integer. We consider \({\mathbb {C}}^d\) endowed with the canonical Hermitian structure and \(M_d({\mathbb {C}})\) with the induced operator norm. For simplicity, we denote by \(\Vert .\Vert \) both norms on \({\mathbb {C}}^d\) and \(M_d({\mathbb {C}})\). We denote by \(X=P({\mathbb {C}}^d)\) the projective space of \({\mathbb {C}}^d\) and we endow it with the standard metric given by

$$\begin{aligned} \delta ([x], [y]):=\frac{\Vert x\wedge y\Vert }{\Vert x\Vert \Vert y\Vert }, \end{aligned}$$

where the norm \(\Vert \cdot \Vert \) is the canonical norm on \(\bigwedge ^2 {\mathbb {C}}^d\), \([x]={\mathbb {C}}x\) and \([y]={\mathbb {C}}y\).
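In coordinates, since \(\Vert x\wedge y\Vert ^2=\Vert x\Vert ^2\Vert y\Vert ^2-|\langle x,y\rangle |^2\), this distance is readily computed from the Hermitian inner product. A small helper (ours) for reference:

```python
import numpy as np

def proj_dist(x, y):
    """delta([x], [y]) = ||x ^ y|| / (||x|| ||y||), via the Gram identity
    ||x ^ y||^2 = ||x||^2 ||y||^2 - |<x, y>|^2."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    inner = np.vdot(x, y)                            # Hermitian inner product
    val = nx ** 2 * ny ** 2 - abs(inner) ** 2
    return np.sqrt(max(val, 0.0)) / (nx * ny)        # clip tiny negative round-off

print(proj_dist(np.array([1.0, 0.0]), np.array([0.0, 1.0])))   # 1.0: orthogonal lines
print(proj_dist(np.array([1.0, 1.0]), np.array([2.0, 2.0])))   # 0.0: same line
```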

A probability measure \(\mu \) on \({\text {GL}}_d({\mathbb {C}})\) is said to be (strongly-)irreducible if the support S of \(\mu \) does not preserve a (finite union of) non-trivial proper subspace(s) of \({\mathbb {C}}^d\). An irreducible probability measure \(\mu \) is said to be proximal if the closure \(\overline{{\mathbb {C}}G_\mu ^+}\) in \(M_d({\mathbb {C}})\) of the semigroup \(G_\mu ^+\) generated by the support of \(\mu \) contains a rank-one linear transformation.

A probability measure \(\nu \) on X is said to be \(\mu \)-stationary if it is stationary for the Markov operator \(P_\mu \) associated to \(\mu \). We recall that for a strongly irreducible and proximal probability measure \(\mu \) on \({\text {GL}}_d({\mathbb {C}})\), there exists a unique \(\mu \)-stationary probability measure \(\nu \) on X [23, 37]. We denote by \(\mu ^*\) the image of \(\mu \) under the map \(g \mapsto g^*\), where \(g^*\) denotes the conjugate-transpose of g, and by \(\nu ^*\) the unique \(\mu ^*\)-stationary measure (the measure \(\mu ^*\) being also proximal and strongly irreducible).

We denote by \(\sigma :{\text {GL}}_d({\mathbb {C}}) \times X \rightarrow {\mathbb {R}}\) the (additive) norm-cocycle given by \(\sigma (g,[x])=\ln \frac{\Vert gx\Vert }{\Vert x\Vert }\). The solution of the Poisson equation (2.1) for the norm cocycle is closely related to regularity properties of the stationary measure \(\nu \) on X. Indeed, when \(\mu \) has an exponential moment, (2.1) can be solved using the result of Le Page [45] establishing a spectral gap for the Markov operator \(P_{\mu }\) acting on a space of Hölder functions on X. As proved by Guivarc’h [36], this spectral gap property implies the Hölder regularity of \(\nu \). When \(\mu \) has a finite second order moment, Benoist–Quint [3] solved the same equation by proving and using the log-regularity of the stationary measure \(\nu \). We will rely on their results.

By [3], the following quantity

$$\begin{aligned} \psi ([x]):=-\int {\ln \frac{\Vert x\Vert \,\Vert y\Vert }{|\langle x, y \rangle |} \,d\nu ^*([y])} \end{aligned}$$
(3.1)

is finite for every \(x\in X\) and defines a continuous function \(\psi \) on X. Moreover, \(\psi \) satisfies the cohomological equation

$$\begin{aligned} \psi - P_{\mu } \psi = \phi - \ell (\mu ), \end{aligned}$$
(3.2)

where \(\phi ([v]):=\int {\ln \frac{\Vert g v\Vert }{\Vert v\Vert }\, d\mu (g)}\) is the expected increase at [v]. This fact plays the key role in the proof of the following result:

Proof of Proposition 1.13

We will apply Proposition 2.1 with \(G={\text {GL}}_d({\mathbb {C}})\), \(X=P({\mathbb {C}}^d)\), and the norm-cocycle \(\sigma :G\times X\rightarrow {\mathbb {R}}\). Observe that for every \(g\in G\) and \(x\in X\),

$$\begin{aligned} |\sigma (g,x)| \leqslant \max \left\{ \ln \Vert g\Vert , \ln \Vert g^{-1}\Vert \right\} . \end{aligned}$$

Furthermore, the equation (3.2) shows that the hypothesis (2.1) of Proposition 2.1 holds, and consequently, we deduce that for every \(t>0\), \(n\in {\mathbb {N}}\), \(v\in {\mathbb {C}}^d\setminus \{0\}\) we have

$$\begin{aligned} {\mathbb {P}}\left( \left| \ln \frac{\Vert L_n v \Vert }{ \Vert v\Vert } - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2\exp \left( -\frac{nt^2}{32 (\kappa _S +{\mathfrak {c}})^2}\right) , \end{aligned}$$
(3.3)

where \({\mathfrak {c}}={\mathfrak {c}}(\mu ^*)=-\inf _{[x] \in P({\mathbb {C}}^d)}\psi ([x])\). This proves the first estimate. To get the concentration estimates for the matrix norm of \(L_n\), consider the canonical basis \(e_1, \ldots , e_d\) of \({\mathbb {C}}^d\). For every \(g\in G\), we have

$$\begin{aligned} \Vert g\Vert \leqslant \sqrt{d} \max \{\Vert g e_i\Vert : i=1, \cdots , d\}. \end{aligned}$$

Thus

$$\begin{aligned} {\mathbb {P}}\left( \ln \Vert L_n\Vert - n \ell (\mu ) \geqslant n t \right) \leqslant d\, \max _{i=1,\ldots ,d}{\mathbb {P}}\left( \ln \Vert L_n e_i\Vert - n \ell (\mu ) \geqslant n t - \frac{ \ln d}{2} \right) . \end{aligned}$$
(3.4)

Suppose that \(nt \geqslant \ln d\). Then \(nt - (\ln d)/2 \geqslant \frac{nt}{2}\) and hence, by combining (3.3) and (3.4), we get that

$$\begin{aligned} {\mathbb {P}}\left( \left| \ln \Vert L_n \Vert - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2 d \exp \left( -\frac{nt^2}{128 (\kappa _S +{\mathfrak {c}})^2}\right) , \end{aligned}$$

as claimed. \(\square \)

3.2 Application to random walks on hyperbolic spaces

The goal of this part is to deduce concentration estimates for non-elementary random walks on (not necessarily proper) geodesic hyperbolic spaces.

The main tool is Proposition 2.1, which we will apply to the horofunction compactification \({\overline{M}}^h\) and the Busemann cocycle \(\sigma \) of a separable geodesic hyperbolic metric space. The key point in this application is to solve the cohomological equation (2.1) in this setting. This was previously done by Benoist–Quint [4] when M is proper; they gave a solution \(\psi \) on \(\partial _h M\). A partial extension of this solution was used by Horbez [39] in the non-proper setting. We will observe here that \(\psi \) extends further to a solution on the full space \({\overline{M}}^h\); this will be more convenient for our purpose.

Let us start by recalling some definitions. Let (M,d) be a separable metric space and denote by \({\text {Lip}}^1(M)\) the set of real-valued Lipschitz functions on M with Lipschitz constant 1, endowed with the topology of pointwise convergence. Fixing \(o \in M\), for \(x \in M\), let \(h_x \in {\text {Lip}}_{o}^1(M)\) be the function defined by \(h_x(m)=d(x,m)-d(x,o)\), where \({\text {Lip}}_{o}^1(M)\) is the subspace of \({\text {Lip}}^1(M)\) consisting of functions f satisfying \(f(o)=0\). The closure of \(\{h_x : x \in M\}\) is a compact metrizable subset of \({\text {Lip}}^1_o(M)\), called the horofunction compactification of M (see e.g. [49, Proposition 3.1]). It will be denoted by \({\overline{M}}^h\). The map \(x \mapsto h_x\) is injective on M and we usually identify M with its image in \({\overline{M}}^h\). The horofunction boundary of M is defined as \(\partial _h M:={\overline{M}}^h\setminus M\). The group of isometries \({\text {Isom}}(M)\) acts on \({\overline{M}}^h\) by homeomorphisms given, for \(g \in {\text {Isom}}(M)\), \(h \in {\overline{M}}^h\) and \(m \in M\), by \((g.h)(m)=h(g^{-1}m)-h(g^{-1}o)\). This action extends equivariantly the isometric action of \({\text {Isom}}(M)\) on M, and the set \(\partial _h M \subset {\overline{M}}^h\) is invariant under \({\text {Isom}}(M)\). The Busemann cocycle \(\sigma : {\text {Isom}}(M) \times {\overline{M}}^h \rightarrow {\mathbb {R}}\) is defined by

$$\begin{aligned} \sigma (g,h)=h(g^{-1}o). \end{aligned}$$
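One checks directly that \(\sigma \) is an additive cocycle in the sense of §2: for \(g_1,g_2 \in {\text {Isom}}(M)\) and \(h \in {\overline{M}}^h\),

$$\begin{aligned} \sigma (g_1g_2,h)=h(g_2^{-1}g_1^{-1}o)=\left( h(g_2^{-1}g_1^{-1}o)-h(g_2^{-1}o)\right) +h(g_2^{-1}o)=(g_2.h)(g_1^{-1}o)+\sigma (g_2,h)=\sigma (g_1,g_2.h)+\sigma (g_2,h). \end{aligned}$$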

Now, let (M,d) be, moreover, a \(\delta \)-hyperbolic space. We recall that this means that for every \(x,y,z,o \in M\),

$$\begin{aligned} (x|y)_o \geqslant (x|z)_o \wedge (z|y)_o -\delta , \end{aligned}$$
(3.5)

where \((.|.)_.\) is the Gromov product given by \((x|y)_o=\frac{1}{2}(d(x,o)+d(y,o)-d(x,y))\). For simplicity, we will often omit the basepoint o from the notation. We refer to [16] for general properties of these spaces. An element \(\gamma \in {\text {Isom}}(M)\) is said to be loxodromic if for any \(x \in M\), the sequence \((\gamma ^nx)_{n \in {\mathbb {Z}}}\) constitutes a quasi-geodesic (see [16, Ch. 3]). Equivalently, \(\gamma \) is loxodromic if and only if it fixes precisely two points \(x_\gamma ^+,x_\gamma ^-\) on the Gromov boundary \(\partial M\) of M [16, Ch. 9 & 10]. Two loxodromic elements \(\gamma _1,\gamma _2\) are said to be independent if the sets of fixed points \(\{x^+_{\gamma _i} ,x^-_{\gamma _i}\}\) for \(i=1,2\) are disjoint. Finally, a set S, or equivalently a probability measure with support S, is said to be non-elementary if the semigroup generated by S contains at least two independent loxodromic elements.

For \(h_1,h_2 \in \partial _h M\), we set

$$\begin{aligned} (h_1 | h_2)_o=-\frac{1}{2} \inf _{m \in M} \left( h_1(m)+h_2(m) \right) . \end{aligned}$$

This extends the usual Gromov product on M based at \(o \in M\) to \(\partial _h M\). We note that \((h_1|h_2)_o=\infty \) if and only if \(h_1\) and \(h_2\) have the same projection to the Gromov boundary of M.

Let now \(\mu \) be a non-elementary probability measure on G. There might exist several \(\mu \)-stationary probability measures on \(\partial _h M\) but they all have the same Busemann cocycle average which is given by the drift of the \(\mu \)-random walk on M i.e. for any \(\mu \)-stationary \(\nu \) on \(\partial _h M\), \(\iint _{G\times \partial _h M}{\sigma (g,x)\,d\mu (g) d\nu (x)}=\ell (\mu )\) (see [4, Proposition 3.3] or [39, Corollary 2.7]). Recall that \(\mu \) is said to have a finite first order moment if \(\int \kappa (g) d\mu (g)<\infty \) and that the convergence (1.1) to the drift \(\ell (\mu ) \in {\mathbb {R}}\) is ensured under this moment assumption.

When \(\mu \) has a finite second order moment (i.e.  \(\int {\kappa (g)^2 d\mu (g)}<+\infty \)) and M is proper, Benoist–Quint showed that the function \(\psi \) defined on \(\partial _h M\) as

$$\begin{aligned} \psi (x):=-2 \int _{\partial _h M} {(x | y)_o \, d{\check{\nu }}(y)}, \end{aligned}$$
(3.6)

is bounded, measurable, and it satisfies the Poisson equation

$$\begin{aligned} \psi (x)-P_\mu \psi (x)=\int \sigma (g,x) d\mu (g)-\ell (\mu ), \end{aligned}$$
(3.7)

where \({\check{\nu }}\) is any stationary probability measure on \(\partial _h M\) for \(\mu ^{-1}\) and \(\mu ^{-1}\) is the non-elementary probability measure given by the image of \(\mu \) by the map \(g \mapsto g^{-1}\).

For our purposes in the sequel, it will be more convenient to consider the action of the Markov operator \(P_{\mu }\) on the space of bounded measurable functions defined on the whole compactification \({\overline{M}}^h\) in the case where M is only a separable and geodesic hyperbolic space. Accordingly, we will verify that the natural extension of the function \(\psi \) given by (3.6) to the space \({\overline{M}}^h\) yields a solution to the equation (3.7). We summarize these in the next

Lemma 3.1

Suppose that \(\mu \) has a finite second order moment. The function \(\psi : {\overline{M}}^h \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} \psi (x)=-2 \int _{{\overline{M}}^h}(x|y)_o d{\check{\nu }}(y) \end{aligned}$$
(3.8)

is a bounded measurable function that satisfies the equation

$$\begin{aligned} \psi (x)-P_\mu \psi (x)=\int \sigma (g,x) d\mu (g)-\ell (\mu ) \end{aligned}$$
(3.9)

for every \(x \in {\overline{M}}^h\).

The proof requires the following slight extension of [39, Lemma 2.4]:

Lemma 3.2

There exists \(C>0\) depending only on the hyperbolicity constant \(\delta \) of M such that for all \(g \in G\) and \(x \in {\overline{M}}^h\), we have

$$\begin{aligned}&|(go | gx)_o -\frac{1}{2}(\kappa (g)+\sigma (g,x))| \leqslant C, \end{aligned}$$
(3.10)
$$\begin{aligned}&|(go | x)_o -\frac{1}{2}(\kappa (g)-\sigma (g^{-1},x))|\leqslant C. \end{aligned}$$
(3.11)

Proof

When \(x \in M\), expanding the definitions, both inequalities are seen to hold with \(C=0\). To treat the case when \(x \in \partial _h M\), we follow [49, §3.2] and write the boundary \(\partial _h M\) as the union of two G-invariant subsets \(\partial ^\infty _h M\) and \(\partial ^f_h M \), where

$$\begin{aligned} \partial _h^\infty M=\{h \in \partial _h M : \inf _{m \in M} h(m)=-\infty \} \end{aligned}$$

and \(\partial _h^{f}M\) is defined similarly by \(\inf _{m\in M}{h(m)}>-\infty \). In case \(x \in \partial _h^\infty M\), the statement is precisely [39, Lemma 2.4]. On the other hand, for \(x=h \in \partial _h^f M\), it is not hard to see that for some constant \(C>0\) depending only on \(\delta \), the horofunction h stays C-close to a horofunction \(h_y\) for some \(y \in M\) chosen in the coarse minimizer of h (see [49, §3.3]) i.e.

$$\begin{aligned} |h(m)-h_y(m)|\leqslant C \end{aligned}$$
(3.12)

for every \(m \in M\). Therefore, when \(x \in \partial _h^f M\), the inequalities (3.10) and (3.11) follow from the first case above where \(x \in M\). \(\square \)

Proof of Lemma 3.1

The proof goes along the same lines as Benoist–Quint’s proof of [4, Propositions 4.2 & 4.6]; we indicate only the needed changes. Let \({\check{\nu }}\) be any \(\mu ^{-1}\)-stationary probability measure on \({\overline{M}}^h\). We first show that \(\psi (x)=-2 \int _{{\overline{M}}^h}(x|y)_o d{\check{\nu }}(y) \) is a bounded function on the whole compactification \({\overline{M}}^h\). Arguing precisely as in the proof of [4, Proposition 4.2] (namely, taking \(p=2\) in the authors’ proof), it suffices to show that there exists \(a>0\) such that

$$\begin{aligned} \sum _{n \geqslant 1}\sup _{x\in {\overline{M}}^h}{\check{\nu }}\{y : (x|y)_o\geqslant an\}<+\infty . \end{aligned}$$
(3.13)

Therefore, we now focus on obtaining (3.13). First, any \({\widetilde{\mu }}\)-stationary probability measure on \({\overline{M}}^h\) is supported on \(\partial _h^\infty M \subseteq \partial _h M\) ( [49, Proposition 4.4]) for \({\widetilde{\mu }} \in \{\mu ,\mu ^{-1}\}\). Since by [39, Corollary 2.7], the Busemann cocycle \(\sigma : G\times {\overline{M}}^h\rightarrow {\mathbb {R}}\) has a unique cocycle average on the boundary \(\partial _h M\), it follows that it has a unique cocycle average on all of the compactification \({\overline{M}}^h\). Then, using Benoist–Quint’s large deviation result for cocycles [3, Proposition 3.2] applied to the continuous Busemann cocycle on the compact metrizable space \({\overline{M}}^h\), we deduce that for every \(t>0\) and \({\widetilde{\mu }} \in \{\mu ,\mu ^{-1}\}\), we have

$$\begin{aligned} \sum _{n \geqslant 1}{\sup _{\xi \in {\overline{M}}^h} {{\widetilde{\mu }}^{*n}\left\{ g : |\sigma (g,\xi ) - n \ell ({\widetilde{\mu }})| > n t\right\} }}<+\infty . \end{aligned}$$

Now, using Lemma 3.2 — by substituting (3.10) for [4, (2.17)] and (3.11) for [4, (2.16)] — and following the same strategy as in the proof of [4, Lemma 4.5], we deduce that there exists a summable sequence \((C_n)\) of constants and a constant \(a>0\) such that for every \(x,y\in {\overline{M}}^h\),

$$\begin{aligned} (\mu ^{-1})^{*n}\{ g : (g y| x)_o \geqslant a n \}\leqslant C_n. \end{aligned}$$

By stationarity of \({\check{\nu }}\), we have \({\check{\nu }}\{y: (x|y)_o \geqslant a n\}=\int { (\mu ^{-1})^{*n}\{g : (g y|x)_o \geqslant a n \} d {\check{\nu }}(y)}\) for every \(n \in {\mathbb {N}}\). This implies (3.13) and shows that \(\psi \) is bounded.

Finally, we check the Poisson equation (3.9). We remark that the following key identity

$$\begin{aligned} \sigma (g, x) = -2(x|g^{-1}y)_o + 2(gx|y)_o+\sigma (g^{-1},y) \end{aligned}$$

used by Benoist–Quint also holds in our setting for every \(g\in {\text {Isom}}(M)\) and \(x,y \in {\overline{M}}^h\), provided gx and y do not project to the same point of the Gromov boundary \(\partial M\). Since the unique \(\mu ^{-1}\)-harmonic measure on \(\partial M\) is non-atomic ([49, Theorem 1.1]), we deduce (3.9) by integrating both sides of the previous identity with respect to \(d{\check{\nu }}(y) d\mu (g)\) and using the fact that \(\ell (\mu ^{-1})=\ell (\mu )\). \(\square \)

Remark 3.3

When \(\mu \) has a finite exponential moment, one can substitute Maher’s result on Hölder regularity of the harmonic measure [48, Lemma 2.10] (see also [7, Proposition 2.16]) for large deviation results of Benoist–Quint to prove that the function \(\psi \) is bounded.

Proposition 3.4

Let (M,d) be a separable, geodesic, \(\delta \)-hyperbolic space, \(o\in M\) and \(\mu \) a non-elementary probability measure on the group \(G= {\text {Isom}}(M)\) with countable bounded support S. Denoting \(\kappa _S=\sup _{g\in S} d(go,o)\) and \({\mathfrak {c}}=-\inf _{x \in {\overline{M}}^h}\psi (x)\) (see Lemma 3.1), for every \(t>0\) and \(n\in {\mathbb {N}}\), we have

$$\begin{aligned} \sup _{\xi \in {\overline{M}}^h}{\mathbb {P}}\left( \left| \sigma (L_n, \xi ) - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2 \exp \left( -\frac{nt^2}{32 (\kappa _S +{\mathfrak {c}})^2}\right) . \end{aligned}$$
(3.14)

In particular,

$$\begin{aligned} {\mathbb {P}}\left( \left| \kappa (L_n) - n\ell (\mu )\right| \geqslant nt \right) \leqslant 2 \exp \left( -\frac{nt^2}{32 (\kappa _S +{\mathfrak {c}})^2}\right) . \end{aligned}$$
(3.15)

Proof

In view of the inequality \(|\sigma (g,\xi )| \leqslant \kappa (g)\), valid for every \(g\in G\) and \(\xi \in {\overline{M}}^h\), the estimate (3.14) follows directly from Proposition 2.1 applied to \(G=\langle {\text {supp}}(\mu ) \rangle \), the group generated by the support of \(\mu \) endowed with the discrete topology, \(X={\overline{M}}^h\) and the Busemann cocycle \(\sigma : G \times X \rightarrow {\mathbb {R}}\). Finally, (3.15) follows directly by specializing to \(\xi =o\). \(\square \)

Remark 3.5

(Proper case) The countability assumption in Proposition 3.4 is not needed when M is proper. Indeed, it is known that in that case the full isometry group \(G={\text {Isom}}(M)\) is Polish and hence we may apply Proposition 2.1 for any (non-elementary) Borel probability measure \(\mu \) on G with bounded support.

4 Explicit estimates for random walks on proper hyperbolic spaces

In §4.1, we prove our main result on concentration inequalities around the drift for random walks on proper hyperbolic spaces M. Exploiting the locally compact structure of \({\text {Isom}}(M)\), the proof makes crucial use of the harmonic analytic and geometric approach and results of Benoist–Quint [4, §5]. Respectively in §4.2 and §4.3, we discuss the Frostman property of the harmonic measure and the continuity properties of the drift.

4.1 Main result on concentrations

Let (M,d) be a proper metric space and denote by G its group of isometries. It is a locally compact group [26, Theorem 6] and we denote by \(\mu _G\) a Haar measure on G. For a probability measure \(\mu \) on G, S denotes the support of \(\mu \), which is the smallest closed subset whose \(\mu \)-mass equals one. We recall that for every \(r \in [0,1)\), we denote by \(\mu _{r,{\text {lazy}}}=r \delta _{{\text {id}}} + (1-r)\mu \) the corresponding lazy measure. Having fixed a basepoint \(o \in M\), for an element \(g \in {\text {Isom}}(M)\), we write \(\kappa (g)=d(go,o)\) and for a bounded set S, we set \(\kappa _S:=\sup \{\kappa (g) : g\in S\}\).

The main result of this section is the following statement, which immediately implies Theorem 1.1 by specializing to \(\xi =o\).

Theorem 4.1

Let (M,d) be a proper geodesic hyperbolic space and \(o\in M\). Assume that the group \({\text {Isom}}(M)\) acts cocompactly on M. Then, there exists an explicit positive function D(., .) with \(D(.,\lambda )<\infty \) for every \(\lambda \in (0,1)\) such that for every non-elementary probability measure \(\mu \) on G with bounded support S, for every \(\xi \in {\overline{M}}^h\), \(t>0\) and \(n \in {\mathbb {N}}\), we have

$$\begin{aligned} {\mathbb {P}}\left( |\sigma (L_n,\xi ) - n \ell (\mu )|\geqslant nt \right) \leqslant 2 \exp \left( \frac{-nt^2}{ \kappa _S^2 D(\kappa _S,\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)}\right) \end{aligned}$$

for every \(r \in [0,1)\).

With Proposition 3.4 at hand, the main ingredient for the proof of Theorem 4.1 is the following

Proposition 4.2

Let (M,d) be a proper geodesic \(\delta \)-hyperbolic space such that the group G of isometries of M acts cocompactly on M. Then, there exists an explicit positive function C(., .) with \(C(.,\lambda )<\infty \) for every \(\lambda \in (0,1)\) such that for every boundedly supported non-elementary probability measure \(\mu \) on G, we have

$$\begin{aligned} \sup _{y\in {\overline{M}}^h } \int _{\partial _h M} {(x\,| y)_o\,d\nu (x)}\leqslant C(\kappa _{S},\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2). \end{aligned}$$
(4.1)

for every \(r \in [0,1)\).

Remark 4.3

  1.

    The function \(C(\kappa ,\lambda )\) is a function of \(\kappa >0\) and \(\lambda \in (0,1]\) and, for \(\lambda <1\), it can be given by

    $$\begin{aligned} \kappa \inf _{1<c<\lambda ^{-1/2}} \left[ 4\left( \frac{\ln \kappa }{\ln c} \vee \frac{1}{(\ln c)^2}\right) + \frac{A_0}{1-c^2 \lambda }\right] , \end{aligned}$$

    where \(A_0=\left( \frac{\mu _G(B_{2R(\delta )+2D_0})}{\mu _G(B_{R(\delta ) +D_0})}\right) ^{1/2}\) with \(R(\delta )=14\delta +4\), \(B_r:=\{g \in G : d(go,o) \leqslant r\}\) for \(r \geqslant 0\), and \(D_0\) is the quantity \(2\sup _{x,y\in M}\inf _{g\in G}{d(gx,y)}\), i.e. twice the diameter of the quotient \(G\backslash M\), which is finite by the cocompactness assumption. We set \(C(\kappa ,1)=\infty \). Finally, for concreteness, for \(\lambda <1\), specializing to \(c=(1+\lambda ^{-1/2})/2\) (a verification of this choice is sketched after this remark), one can get

    $$\begin{aligned} C(\kappa ,\lambda ) \leqslant \kappa \left( 8\ln ^+ (\kappa )+4A_0/3+16 \right) \frac{1}{(1-\sqrt{\lambda })^2}. \end{aligned}$$
  2.

    One can obtain a version of (4.1) with finite first order moment assumption replacing \(\kappa _S\) with the first order moment. However, we will not need this more general version.
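
For the reader’s convenience, here is the verification of the specialization \(c=(1+\lambda ^{-1/2})/2\) promised in item 1 above (recall \(\lambda <1\)): this choice of c is admissible, being the midpoint of 1 and \(\lambda ^{-1/2}\), and it satisfies

$$\begin{aligned} 1-c^2\lambda =\frac{(1-\sqrt{\lambda })(3+\sqrt{\lambda })}{4}\geqslant \frac{3(1-\sqrt{\lambda })}{4} \qquad \text {and} \qquad \ln c \geqslant 1-\frac{1}{c}=\frac{1-\sqrt{\lambda }}{1+\sqrt{\lambda }}\geqslant \frac{1-\sqrt{\lambda }}{2}, \end{aligned}$$

so that \(4\left( \frac{\ln \kappa }{\ln c} \vee \frac{1}{(\ln c)^2}\right) \leqslant \frac{8\ln ^+ \kappa +16}{(1-\sqrt{\lambda })^2}\) and \(\frac{A_0}{1-c^2\lambda }\leqslant \frac{4A_0/3}{(1-\sqrt{\lambda })^2}\). Multiplying by \(\kappa \) and summing gives the displayed bound on \(C(\kappa ,\lambda )\).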

Remark 4.4

(Why we also consider lazy random walks) Denoting by \(\rho (\mu )\) the spectral radius of \(\lambda _G(\mu )\), we have \(\Vert \lambda _G(\mu )\Vert _2=\sqrt{\rho (\mu *\mu ^{-1})}\), where \(\mu ^{-1}\) denotes the image of \(\mu \) by the map \(g \mapsto g^{-1}\). When \(\mu \) is symmetric, \(\Vert \lambda _G(\mu )\Vert _2=\rho (\mu )\). Moreover, thanks to [5, Theorem 4], \(\Vert \lambda _G(\mu )\Vert _2<1\) as soon as \(\mu *\mu ^{-1}\) is non-elementary and hence in this case, we may take \(r=0\) on the right-hand side of the inequality given by the previous result. Finally, for every non-elementary probability measure \(\mu \), for every \(r>0\), \(\mu _{r,{\text {lazy}}}\) satisfies the previous property and hence for \(r \in (0,1)\), \(\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2<1\). We refer to [5, p. 177] for an example of a non-elementary probability measure \(\mu \) with \(\Vert \lambda _G(\mu )\Vert _2=1\).
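
To spell out the last assertion, one may use the following elementary convolution identity:

$$\begin{aligned} \mu _{r,{\text {lazy}}} * (\mu _{r,{\text {lazy}}})^{-1} = r^2 \delta _{{\text {id}}} + r(1-r)\left( \mu +\mu ^{-1}\right) + (1-r)^2\, \mu *\mu ^{-1}. \end{aligned}$$

For \(r \in (0,1)\), the support of the right-hand side contains \({\text {supp}}(\mu )\), so the group generated by this support contains the non-elementary group generated by \({\text {supp}}(\mu )\); the claim \(\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2<1\) then follows from [5, Theorem 4] as quoted above.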

For the proof, we require the following version of [4, Lemma 5.2] where we highlight the constants that appear in the aforementioned lemma for our purposes. This is the crucial harmonic analytic ingredient of the proof where the additional hypothesis (compared to Proposition 3.4) on the cocompactness action of G on the space M is used.

Lemma 4.5

Let (M, d) be a proper metric space. Suppose that the group G of isometries of (M, d) acts cocompactly on M. Then, there exists a constant \(D_0>0\) depending only on M such that for every probability measure \(\mu \), for every \(R>0\), \(n \geqslant 1\) and \(m,m' \in M\), we have

$$\begin{aligned} {\mathbb {P}}(d(R_n m,m') \leqslant R-D_0) \leqslant A_0(R) \Vert \lambda _G(\mu )\Vert _2^n, \end{aligned}$$
(4.2)

where \(A_0(R)=\left( \frac{\mu _G(B_{2R})}{\mu _G(B_{R})}\right) ^{1/2}\).

Proof

The proof is similar to that of [4, Lemma 5.2]; we only indicate the necessary modifications.

  • We replace the estimate \(\Vert \lambda _G(\mu )^n\Vert \leqslant C_0a_0^n\) used in the proof of [4, Lemma 5.2], by \(\Vert \lambda _G(\mu )^n\Vert \leqslant \Vert \lambda _G(\mu )\Vert ^n\), and consequently, the constants \(C_0\) and \(a_0\) in [4, Lemma 5.2] can, respectively, be taken to be 1 and \(\Vert \lambda _G(\mu )\Vert \).

  • Regarding the constant \(A_0\) in [4, Lemma 5.2], in their proof, Benoist–Quint assume that m and \(m'\) belong to the same G-orbit. However, since the action of G on M is cocompact, there exists \(D_0\) such that for every \(x,y \in M\), there exists \(g \in G\) with \(d(gx,y) \leqslant D_0/2\), and this allows us to take \(A_0=\left( \frac{\mu _G(B_{2R})}{\mu _G(B_{R})}\right) ^{1/2}\).

\(\square \)

We will need the following geometric lemma, which is an adaptation of Inclusion (5.5) in the proof of [4, Lemma 5.3] to the horofunction compactification. We point out that this is the key point where the geometric assumption of hyperbolicity is used.

Lemma 4.6

Let (M, d) be a proper geodesic hyperbolic metric space and \(o \in M\). Then, there exists \(R(\delta )>0\) such that for any \(\xi \in \partial _h M\), \(y\in {\overline{M}}^h\), and constant \(D > 0\), there exists a finite subset \(C\subset M\times M\) with at most \(D^2\) elements such that for any \(g\in {\text {Isom}}(M)\), there exists \((x',y')\in C\) such that

$$\begin{aligned} (g \xi | y)_o\geqslant \kappa (g) \Longrightarrow [\kappa (g)\geqslant D] \vee [d(gx', y')\leqslant R(\delta )]. \end{aligned}$$

Moreover, all elements constituting the tuples in C are contained in a ball of radius D around o.

As will be shown, one can take the constant \(R(\delta )=14\delta +4\).

The proof will require some juggling between the horofunction boundary \(\partial _h M\) and the Gromov boundary \(\partial M\) to construct the set C. We therefore start by recalling some standard facts on the relation between \(\partial _h M\) and \(\partial M\).

First, there exists a natural G-equivariant surjective map from \({\overline{M}}^h\) to \(M \cup \partial M\). Namely, given \(h \in \partial _h M\), for any sequence \(x_n \in M\) such that \(h_{x_n} \rightarrow h\), the sequence \(x_n\) Gromov converges to infinity in the sense that \(\inf _{m,n \geqslant k} (x_n|x_m)_o \rightarrow \infty \) as \(k \rightarrow \infty \). When we endow \(M \cup \partial M\) with the usual topology [16, Ch. 2], this projection is the unique map that continuously extends the identity map \(m \mapsto m\) on M. For \(y \in {\overline{M}}^h\), we denote by \(\pi _y\) its image in \(M \cup \partial M\).

Given two points \(\pi _x \ne \pi _y \in \partial M\), by using the defining inequality (3.5) of a \(\delta \)-hyperbolic space, one checks that for any sequences \(x_n,x'_n\) and \(y_n,y_n'\) that Gromov converge to infinity and that are, respectively, in the equivalence class of \(\pi _x\) and \(\pi _y\), we have

$$\begin{aligned} \limsup _{n \rightarrow \infty } (x'_n|y'_n)_o -\liminf _{n \rightarrow \infty }(x_n|y_n)_o\leqslant 2\delta . \end{aligned}$$
(4.3)

For the reader’s convenience, we single out two basic geometric properties that are used in the proof of the previous lemma.

Lemma 4.7

(Thin triangles) Given a triple (x, y, z) with \(x,y \in M\) and \(z \in \partial M\), fix a geodesic segment [x, y] and geodesic rays [x, z] and [y, z]. Then, there exist \(a \in [x,y]\), \(b \in [x,z]\) and \(c \in [y,z]\) with the following property: denoting by \(\gamma _{ij}\) the corresponding geodesic (segment or ray) oriented from i to j, where \(i,j\in \{x,y,z,a,b,c\}\), for every \(t \geqslant 0\) in the respective interval of definition, the distances \(d(\gamma _{xb}(t),\gamma _{xa}(t)),d(\gamma _{ya}(t),\gamma _{yc}(t))\) and \(d(\gamma _{bz}(t),\gamma _{cz}(t))\) are all \(\leqslant 6\delta \).

This lemma can be deduced from standard facts in hyperbolic geometry. We include a brief proof for the reader’s convenience.

Proof

Consider the triangle (x, y, z) whose edges are as given in the statement. Let \(z_n\) be a sequence of points on the edge [y, z] that converge to z. Consider the segments \(\zeta _n\) from x to \(z_n\). Since M is proper, by the Arzelà–Ascoli theorem, up to passing to a subsequence, they converge to a ray \(\zeta \) between x and z.

For each triangle \((x,y,z_n)\), fix points \(a_n,b_n,c_n\) respectively on the edges [x, y], \([x,z_n]\) and \([y,z_n]\) that map to the junction point of the associated tripod [16, Ch. 1]. Using the fact that M is proper and passing to a further subsequence of \(z_n\), we may suppose that the sequences \(a_n,b_n,c_n\) converge, respectively, to points \(a\in [x,y]\), \(b' \in \zeta \) and \(c \in [y,z]\). Let b be the point on [x, z] at distance d(x, a) from x.

Now note that, by the tripod lemma [16, Proposition 3.1], we have the required property within each triangle \((x,y,z_n)\) with constant \(4\delta \). Since the points \(a_n,b_n,c_n\) converge, respectively, to \(a,b',c\), the same property is true for the limit triangle with [x, z] replaced by \(\zeta \). Now, since [x, z] and \(\zeta \) are at parametrized distance at most \(2\delta \) from each other, we get the required property with \(6\delta \). \(\square \)

Lemma 4.8

(Fellow travellers) For every \(\xi , \eta \in {\overline{M}}^h\), let \(\gamma _\xi \) and \(\gamma _\eta \) be geodesic rays such that \(\gamma _\xi (0)=\gamma _\eta (0)=o\) and \(\gamma _\zeta (t) \rightarrow \pi _\zeta \) as \(t \rightarrow \infty \) for \(\zeta \in \{\xi ,\eta \}\). Then for any \(r \geqslant 0\) such that \((\xi |\eta )_o \geqslant r\), we have \(d(\gamma _\xi (t), \gamma _\eta (t)) \leqslant 8\delta \) for every \(t \in [0,r]\).

Proof

We can suppose that \(\pi _\xi \ne \pi _\eta \) and \(r \geqslant 2\delta \). It follows from (4.3) that for every \(\epsilon >0\) and every \(s>0\) large enough, we have \((\gamma _\xi (s) | \gamma _\eta (s))_o \geqslant r-2\delta -\epsilon \). The statement now follows by expanding the inequality \((\gamma _\xi (t)| \gamma _\eta (t))_o \geqslant (\gamma _\xi (t)| \gamma _\xi (s))_o \wedge (\gamma _\xi (s)|\gamma _\eta (s))_o\wedge (\gamma _\eta (s)|\gamma _\eta (t))_o-2\delta \) for \(t \leqslant r\) and \(s \rightarrow \infty \). \(\square \)
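
For the reader’s convenience, the expansion at the end of the proof can be carried out as follows. For \(t \in [0,r]\) and s large, since \((\gamma _\zeta (t)|\gamma _\zeta (s))_o=t\) for \(\zeta \in \{\xi ,\eta \}\) (both points lie on a geodesic ray issued from o), the above inequality yields \((\gamma _\xi (t)|\gamma _\eta (t))_o \geqslant \min \{t, r-2\delta -\epsilon \}-2\delta \), and therefore

$$\begin{aligned} d(\gamma _\xi (t),\gamma _\eta (t)) = 2t-2(\gamma _\xi (t)|\gamma _\eta (t))_o \leqslant 2t-2\min \{t, r-2\delta -\epsilon \}+4\delta \leqslant 8\delta +2\epsilon , \end{aligned}$$

where the last inequality uses \(t \leqslant r\). Letting \(\epsilon \rightarrow 0\) gives the claim.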

We now give the

Proof of Lemma 4.6

We will prove the claim with \(R(\delta )=14\delta +4\). Let \(\xi \in \partial _h M\), \(y \in {\overline{M}}^h\) and \(D>0\) be given, and let \(g\in {\text {Isom}}(M)\). To construct the set C, fix three rays \([o,\pi _\xi ]\), \([o,\pi _y]\) and \([o,g\pi _\xi ]\) and for \(i=1,\ldots ,\lfloor D \rfloor -1\), let \(m_i\), \(m_i'\) and \(m_i''\) be points on the respective rays satisfying \(d(z_i,o)=i \wedge d(o,\zeta )\) for every couple \((z_i,\zeta ) \in \{(m_i,\pi _\xi ), (m_i',\pi _y), (m_i'',g\pi _\xi )\}\). We denote \(m_0=m'_0=m_0''=o\). Suppose now that \((g\xi |y)_o \geqslant \kappa (g)\) and \(\kappa (g)<D\). Using the thin triangles Lemma 4.7 for \((o,go,g\pi _\xi )\), since \(D >\kappa (g)\), we deduce that there exist \(i_0, j_0 \leqslant \kappa (g)\) such that \(d(gm_{i_0},m_{j_0}'') \leqslant 6 \delta +4\).

We now use the fellow-travellers Lemma 4.8 for \(g\xi ,y \in {\overline{M}}^h\) with the geodesic rays \([o,g\pi _\xi ]\) and \([o,\pi _y]\). Since \((g\xi |y)_o \geqslant r:=\kappa (g)\) and \(j_0 \leqslant \kappa (g)\), we find that there exists \(i_1 \leqslant \kappa (g)\) such that \(d(m''_{j_0}, m'_{i_1}) \leqslant 8\delta \).

We deduce that \(d(gm_{i_0},m_{i_1}') \leqslant 14\delta +4\). Therefore, the desired result holds with \(C:=\{(m_i, m'_j)| 0\leqslant i,j\leqslant \lfloor D \rfloor -1\}\). \(\square \)

Proof of Proposition 4.2

We fix \(r \in [0,1)\) such that \(\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert <1\). The latter may be equal to 1 only for \(r=0\) (see Remark 4.4) in which case the inequality holds trivially by setting \(C(.,1)\equiv \infty \). Note that the measure \(\nu \) is \(\mu _{r,{\text {lazy}}}\)-stationary and denoting by \(S_r\) the support of \(\mu _{r,{\text {lazy}}}\), we have \(\kappa _{S_r}=\kappa _S\). To ease the notation, in the proof, we write \(\mu \) for \(\mu _{r,{\text {lazy}}}\).

Let \(\nu \) be a \(\mu \)-stationary probability measure on \(\partial _h M\). Let \((B=G^{{\mathbb {N}}}, \beta =\mu ^{\otimes {\mathbb {N}}})\) be the Bernoulli space and \(T:B \rightarrow B\) the shift map. Since \(\partial _h M\) is compact, metrizable (see e.g. [49, Proposition 3.1]) and \({\text {Isom}}(M)\) acts continuously on \(\partial _h M\), by a result of Furstenberg [23], it follows that for \(\beta \)-almost every \(b\in B\), there exists a probability measure \(\nu _b\) on \(\partial _h M\) such that the following weak convergence holds

$$\begin{aligned} (b_1\cdots b_n)*\nu \overset{\text {weakly}}{\underset{n\rightarrow +\infty }{\longrightarrow }} \nu _b. \end{aligned}$$
(4.4)

Moreover, for every \(n \in {\mathbb {N}}\), we have

$$\begin{aligned} (b_1\cdots b_n)*\nu _{T^n b} =\nu _b \qquad \text {and} \qquad \nu =\int _{B} \nu _b d\beta (b). \end{aligned}$$
(4.5)

For every \(b\in B\) and \(n\in {\mathbb {N}}\), denote for simplicity \(R_n(b):=b_1 \cdots b_n\) and \(\kappa _n(b):=\kappa (R_n(b))\). Let now \(\eta \in {\overline{M}}^h\). Using (4.5) and Fubini–Tonelli, we have

$$\begin{aligned} \int _{\partial _h M} (\eta |\xi )_o d\nu (\xi )&= \int _B \int _{\partial _h M} (\eta |y)_o d\nu _b(y) d\beta (b) \\&= \int _B \int _0^\infty \nu _b((\eta |y)_o \geqslant t) dt d\beta (b)\\&\quad \leqslant \int _B \sum _{n=0}^\infty \int _0^\infty 1_{\kappa _n(b)\leqslant t <\kappa _{n+1}(b)} \nu _b((\eta |y)_o\geqslant t) dt d\beta (b) , \end{aligned}$$

where we used the fact that \(\kappa _n(b)\rightarrow +\infty \) almost surely and where, for every \(t>0\), we set \(1_{\kappa _n(b)\leqslant t <\kappa _{n+1}(b)}=0\) whenever \(\kappa _n(b)\geqslant \kappa _{n+1}(b)\). Using Fubini–Tonelli’s theorem, we deduce that

$$\begin{aligned} \int _{\partial _h M} (\eta |\xi )_o d\nu (\xi )&\leqslant \sum _{n=0}^\infty \int _B \int _{0}^{\infty } \nu _b((\eta |y)_o \geqslant t){\mathbf {1}}_{\kappa _n(b)\leqslant t < \kappa _{n+1}(b)} dt d\beta (b) \\&\leqslant \sum _{n=0}^\infty \kappa _S \int _B \nu _b((\eta |y)_o \geqslant \kappa _n(b)) d\beta (b). \end{aligned}$$

Now using the first equality of (4.5), we have

$$\begin{aligned} \begin{aligned}&\int _{\partial _h M} (\eta |\xi )_o d\nu (\xi ) \leqslant \sum _{n=0}^\infty \kappa _S \int _B \nu _{T^n b}\left( (\eta | b_1 \cdots b_n y)_o \geqslant \kappa _n(b)\right) d\beta (b) \\&\quad = \sum _{n=0}^\infty \kappa _S \int _B \int _G \nu _{b}((\eta |gy)_o \geqslant \kappa (g)) d\mu ^{*n}(g) d\beta (b)\\&= \sum _{n=0}^\infty \kappa _S \int _B \int _{\partial _h M} \mu ^{*n}((\eta |gy)_o \geqslant \kappa (g)) d\nu _{b}(y) d\beta (b), \end{aligned} \end{aligned}$$

where in the second line we used the fact that \(\beta =\mu ^{\otimes {\mathbb {N}}}\) is a product measure and in the last line we used Fubini–Tonelli’s theorem. We conclude that

$$\begin{aligned} \int _{X}{(\eta |x)_o\,d\nu (x)} \leqslant \sum _{n=0}^\infty \kappa _S \sup _{y\in \partial _h M}{\mathbb {P}}\left( (\eta | R_n y)_o \geqslant \kappa (R_n)\right) . \end{aligned}$$

Using now Lemma 4.6 (applied with \(D=c^n\)) together with a union bound over the at most \(c^{2n}\) pairs in the corresponding set C, we get that for every \(c>1\), \(n \in {\mathbb {N}}\), \(\xi \in \partial _h M\), \(y\in {\overline{M}}^h\),

$$\begin{aligned} {\mathbb {P}}\left( (R_n \xi | y)_o\geqslant \kappa (R_n)\right) \leqslant \underset{a_n}{\underbrace{{\mathbb {P}}(\kappa (R_n)\geqslant c^n)}} + \underset{b_n}{\underbrace{c^{2n} \sup _{x',y'\in M}{\mathbb {P}}( d(R_n x', y')\leqslant R(\delta ))}}. \end{aligned}$$

Thus

$$\begin{aligned} \int _{X}{(\eta |x)_o\,d\nu (x)}\leqslant \kappa _S \left( \sum _{n=0}^{+\infty }{a_n} + \sum _{n=0}^{+\infty }{b_n}\right) . \end{aligned}$$
(4.6)

On the one hand, since \(\kappa (R_n)\leqslant n \kappa _S\), we have \(\sum _{n=0}^{+\infty }{a_n} \leqslant \sum _{n=0}^{+\infty }{{\mathbf {1}}_{c^n \leqslant n \kappa _S}}\). Using this, it is not hard to deduce that

$$\begin{aligned} \sum _{n=0}^{+\infty }{a_n} \leqslant \max \left\{ \frac{2\ln \kappa _S}{\ln c} , \frac{4}{(\ln c)^2} \right\} . \end{aligned}$$
(4.7)
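
For completeness, one way to obtain (4.7) is the following elementary count (using \(\ln n \leqslant \sqrt{n}\) for \(n \geqslant 1\)): if \(n\geqslant 1\) satisfies \(c^n \leqslant n \kappa _S\), then

$$\begin{aligned} n\ln c \leqslant \ln n + \ln \kappa _S \leqslant \sqrt{n} + \ln ^+ \kappa _S \leqslant 2 \max \left\{ \sqrt{n},\, \ln ^+ \kappa _S\right\} , \end{aligned}$$

so that either \(n \leqslant 4/(\ln c)^2\) or \(n \leqslant 2\ln ^+ \kappa _S/\ln c\). Hence the number of such n is at most \(\max \left\{ \frac{2\ln ^+ \kappa _S}{\ln c}, \frac{4}{(\ln c)^2}\right\} \), which coincides with the right-hand side of (4.7).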

On the other hand, by Lemma 4.5, for every \(n\in {\mathbb {N}}\), we have for every \(x',y'\in M\),

$$\begin{aligned} {\mathbb {P}}( d(R_n x', y')\leqslant R(\delta )) \leqslant A_0(R(\delta ) +D_0) \Vert \lambda _G(\mu )\Vert _2^n, \end{aligned}$$

where \(A_0(.)\) is the function defined in that lemma.

We deduce that for every \(1<c<\Vert \lambda _G(\mu )\Vert _2^{-1/2}\),

$$\begin{aligned} \sum _{n=0}^{+\infty }{b_n} \leqslant A_0(R(\delta ) +D_0) \sum _{n=0}^{+\infty }{ (c^2 \Vert \lambda _G(\mu )\Vert _2)^n} \leqslant \frac{A_0(R(\delta ) +D_0)}{1-c^2 \Vert \lambda _G(\mu )\Vert _2}. \end{aligned}$$
(4.8)

The proof follows by combining (4.6), (4.7) and (4.8). \(\square \)

Theorem 4.1 now directly follows by putting together Propositions 3.4 and 4.2.

Proof of Theorem 4.1

Using the estimate (4.1) in combination with Lemma 3.1, one gets that in Proposition 3.4 (see also Remark 3.5), the constant \({\mathfrak {c}}\) is bounded above by \(2C(\kappa _S,\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)\). Since the right-hand side of the inequality (3.15) is increasing in \({\mathfrak {c}}\), substituting \(2C(\kappa _S,\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)\) for \({\mathfrak {c}}\), one gets that for every \(\xi \in {\overline{M}}^h\), \(n \in {\mathbb {N}}\) and \(t>0\),

$$\begin{aligned} {\mathbb {P}}(|\sigma (L_n,\xi ) - n \ell (\mu )|\geqslant nt) \leqslant 2 \exp \left( -\frac{n t^2}{32(\kappa _S + 2C\left( \kappa _S,\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)\right) ^2}\right) \end{aligned}$$

for every \(r \in [0,1)\). This yields the desired estimate. \(\square \)
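
Unwinding the argument, one admissible choice for the explicit function D of Theorem 4.1 is therefore

$$\begin{aligned} D(\kappa ,\lambda )=32\left( 1+\frac{2C(\kappa ,\lambda )}{\kappa }\right) ^2, \end{aligned}$$

and, inserting the bound in item 1 of Remark 4.3, one obtains the cruder but more explicit estimate

$$\begin{aligned} D(\kappa ,\lambda ) \leqslant \frac{32\left( 16 \ln ^+ (\kappa )+ 8A_0/3+33\right) ^2}{(1-\sqrt{\lambda })^4}, \end{aligned}$$

which is the form of the bound that appears, for instance, in (4.13) and (6.19) below.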

Remark 4.9

(Random walks with unbounded support) As mentioned in the introduction (Remark 1.3), one can have a version of Theorem 4.1 under the assumption that the probability measure \(\mu \) driving the random walk has a finite exponential moment, i.e.  there exists \(\alpha _0>0\) such that \(\int e^{\alpha _0 \kappa (g)} d\mu (g) < \infty \). Indeed, the use of Azuma–Hoeffding concentration inequality in §2 can be replaced for example by the result of Liu–Watbled [47, Theorem 1.1] adapted for martingale differences with conditionally bounded exponential moment. Using the latter, one obtains the following version of Theorem 4.1. Keep the assumptions of Theorem 4.1 and (instead of the bounded support assumption) suppose that there exists \(\alpha _0>0\) such that \(\int e^{\alpha _0 \kappa (g)} d\mu (g):=K<\infty \). Then, there exists a positive real c such that for every \(\xi \in {\overline{M}}^h\), \(t>0\) and \(n \in {\mathbb {N}}\), we have

$$\begin{aligned} {\mathbb {P}}\left( |\sigma (L_n,\xi ) - n \ell (\mu )|\geqslant nt \right) \leqslant {\left\{ \begin{array}{ll} 2e^{-nct^2}, &{} \text {if}\ t \in [0,1] \\ 2e^{-nct}, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

The constant c depends only on \(\alpha _0\), K and on the constants \(A_0\), \(R(\delta )\), \(D_0\) and \(\Vert \lambda _G(\mu _{1/2,{\text {lazy}}})\Vert _2\) as in Theorem 4.1. This statement has the obvious advantage of applying to random walks with unbounded support (with finite exponential moment) but it has the disadvantage that the dependence of the constant c on the aforementioned parameters of \(\mu \) is considerably more complicated. In line with our goals in this article, we have chosen not to give more details on this version of Theorem 4.1 for finite exponential moment random walks.

In the remainder of this section, we will single out two applications of the methods we used in this part. Namely, in §4.2, we will give an explicit bound for the bottom of the support of the Hausdorff spectrum of the harmonic measure, and in §4.3, we discuss an application to the continuity of the drift.

4.2 Application to the Frostman property of the harmonic measure

In this part, we keep the assumptions of Theorem 4.1. In particular, \(\mu \) is a non-elementary probability measure with finite support on \({\text {Isom}}(M)\) where M is a proper geodesic hyperbolic metric space.

Let B(xr) denote the ball of radius r around x for a natural metric coming from the Gromov product (we do not go into the details here since this metric will not be used, see [27, §8], [58, Proposition 5.16]). The following result provides an explicit constant \(s_0>0\) for which the Frostman type property

$$\begin{aligned} \nu (B(x,r)) \leqslant Cr^{s_0} \end{aligned}$$

holds for some constant \(C>0\) and every \(x \in \partial M\) and \(r \geqslant 0\).

Such a constant gives a lower bound for the bottom of the support of the Hausdorff spectrum of \(\nu \); see the work of Tanaka [56] who gives a thorough multifractal analysis of the harmonic measure in the special setting of hyperbolic groups. Finally, we mention that the existence of such a (non-explicit) constant for the harmonic measure is known in the more general setting of (not necessarily proper) hyperbolic spaces (see [48, Lemma 2.10] and [7, Corollary 2.17]).

Proposition 4.10

(Frostman property) Under the hypotheses of Theorem 4.1, for every

$$\begin{aligned} s< \sup _{r\in [0,1)} \frac{1}{\kappa _S}\ln \left( \frac{1}{\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2}\right) , \end{aligned}$$

we have

$$\begin{aligned} \sup _{x\in M \cup \partial M} \int _{\partial M} {e^{s\, (x|y)_o}\, d\nu (y)}=:K(\mu ,s)<+\infty .\end{aligned}$$
(4.9)

In particular, for each such \(s>0\), for every \(x\in M\),

$$\begin{aligned} \nu \left\{ y\in \partial M : (x|y)_o\geqslant t \right\} \leqslant K(\mu ,s) e^{-st}. \end{aligned}$$
(4.10)

In the above statement and hereafter, for \(x,y \in M \cup \partial M\), the Gromov product \((x|y)_o\) is defined as \(\inf \liminf _{n \rightarrow \infty }(x_n|y_n)_o\), where the infimum is taken over all sequences \(x_n\) and \(y_n\) that converge, respectively, to x and y.

Since the proof of the above result follows similar lines as the proof of Proposition 4.2, we will keep its notation and content ourselves with indicating the main lines and changes. We write \(B=G^{\mathbb {N}}\), \(\beta =\mu ^{\otimes {\mathbb {N}}}\). By T we denote the shift map on B and for \(b \in B\) and \(n \in {\mathbb {N}}\), we write \(R_n(b)=b_1\ldots b_n\). It is well-known that for \(\beta \)-a.e. \(b \in B\), the sequence \(R_n(b)o\) converges to an element of the Gromov boundary \(\partial M\) (see [4, Proposition 3.1.b] or [49, Theorem 1.1]). This defines a boundary map

$$\begin{aligned} \xi :B \rightarrow \partial M, \quad \text {given by} \quad \xi _b=\lim _{n \rightarrow \infty }R_n(b)o \end{aligned}$$

on a T-invariant \(\beta \)-full measure subset \(B'\) of B which does not depend on \(o \in M\). In what follows, we will ignore the distinction between \(B'\) and B; this should not cause confusion. The fact that in the following proof (as opposed to Proposition 4.2) we are working over the Gromov boundary \(\partial M\) brings additional simplifications (cf. [4, Prop 5.1]); recall that for the \(\mu \)-harmonic measure \(\nu \) on \(\partial M\), for \(\beta \)-a.e. \(b \in B\), the limit measure \(\nu _b=\lim _{n \rightarrow \infty }(b_1\cdots b_n)*\nu \) is a Dirac mass and we have \(\nu _b=\delta _{\xi _b}\).

Proof

Let \(s>0\). The beginning of the proof of Proposition 4.2 is replaced by the following series of identities obtained by multiple applications of Fubini–Tonelli’s theorem: We have

$$\begin{aligned} \int _{\partial M} \exp (s\, (\eta |\xi )_o) d\nu (\xi )&= \int _B \exp (s\,(\eta |\xi _b)_o ) d\beta (b) \\&= \int _B \int _0^\infty 1_{(\eta |\xi _b)_o \geqslant \frac{\ln t}{s}} dt d\beta (b)\\&=s \int _B \int _{-\infty }^\infty \exp (s t) 1_{(\eta |\xi _b)_o \geqslant t} dt d\beta (b)\\&\leqslant 1+s \sum _{n=0}^\infty \int _B \int _{0}^{\infty } \exp (st) 1_{(\eta |\xi _b)_o \geqslant t} {\mathbf {1}}_{\kappa _n(b)\leqslant t < \kappa _{n+1}(b)} dt d\beta (b) \\&\leqslant 1+s \sum _{n=0}^\infty \kappa _S \exp (s(n+1)\kappa _S) \int _B 1_{(\eta |\xi _b)_o \geqslant \kappa _n(b)} d\beta (b). \end{aligned}$$

Now, let \(c>1\) and denote by \(a_n\) and \(b_n\) the sequences (that depend on c) as in the proof of Proposition 4.2. Then

$$\begin{aligned} \int _{\partial M} \exp (s\, (\eta |\xi )_o) d\nu (\xi )&\leqslant 1+ s \kappa _S\sum _{n=0}^{\infty }{ a_n \exp (s(n+1)\kappa _S) } \\&\quad + s \kappa _S \sum _{n=0}^{\infty }{b_n \exp (s(n+1) \kappa _S)}. \end{aligned}$$

The sum in the middle of the right hand side is a finite sum. The last series is finite as soon as

$$\begin{aligned} \sum _{n=0}^{+\infty }{ \left( c^2 \exp (s\kappa _S) \Vert \lambda _G(\mu )\Vert _2 \right) ^n} \end{aligned}$$
(4.11)

is finite. Now, for any \(s<\frac{1}{\kappa _S}\ln \frac{1}{\Vert \lambda _G(\mu )\Vert _2}\), one can choose \(c>1\) so that (4.11) is finite, and this finishes the proof of (4.9). Finally, (4.10) follows from (4.9) by Markov’s inequality. \(\square \)

4.3 Applications to continuity of the drift

A consequence of Theorem 1.1 is a uniform control, over different driving probability measures with controlled parameters \(\kappa _S\) and \(\Vert \lambda _G(\mu )\Vert _2\), of large deviations of the displacement around the drift. In turn, this allows one to deduce that the drift varies continuously when one perturbs \(\mu \) in such a way that \(\kappa _S\) remains bounded and \(\Vert \lambda _G(\mu )\Vert _2\) remains away from 1, as we show in Corollary 4.11 below. The idea of such a deduction, of continuity from uniform large deviations, already appears in the literature, see e.g. Duarte–Klein [19, Ch. 3]. However, with our method, this way of deducing the continuity is not optimal as one can deduce a continuity result directly from the unique cocycle-average property for the Busemann cocycle (which was a key point in obtaining our concentration result). We refer to Proposition 4.13 for a general continuity statement.

Corollary 4.11

Let (M, d) be a proper geodesic hyperbolic metric space such that \({\text {Isom}}(M)\) acts cocompactly on M. Consider a sequence of non-elementary probability measures \((\mu _m)_{m\in {\mathbb {N}}}\) with bounded support \(S_m\) in the group \({\text {Isom}}(M)\) such that

$$\begin{aligned} \limsup _{m \rightarrow \infty } \inf _{r \in [0,1)} \Vert \lambda _G((\mu _m)_{r,{\text {lazy}}})\Vert _2<1 \qquad \text {and} \qquad \sup _{m \in {\mathbb {N}}}\kappa _{S_m}<\infty . \end{aligned}$$

Suppose that \(\mu _m\) converges weakly to some probability measure \(\mu _{\infty }\). Then, as \(m \rightarrow \infty \)

$$\begin{aligned} \ell (\mu _m) \rightarrow \ell (\mu _\infty ). \end{aligned}$$

Proof

Fix \(t_0>0\) and let \(\lambda <1\) be a constant such that for every \(m \in {\mathbb {N}}\), \(\inf _{r \in [0,1)}\Vert \lambda _G((\mu _{m})_{r,{\text {lazy}}})\Vert _2\) \(<\lambda \). Set \(\kappa _0=\sup _{m \in {\mathbb {N}}} \kappa _{S_m}\). Choose \(n_0\) large enough so that

$$\begin{aligned} \left| \frac{1}{n_0}{\mathbb {E}}_{\mu _\infty }[\kappa (L_{n_0})]-\ell (\mu _\infty )\right| <t_0, \end{aligned}$$
(4.12)

and

$$\begin{aligned} 2 \exp \left( \frac{-n_0 t_0^2\, (1-\sqrt{\lambda })^4 }{32\, \kappa _0^2 \left( 16\ln ^+(\kappa _0)+ 8A_0/3 +33\right) ^2} \right) <\frac{t_0}{\kappa _0}, \end{aligned}$$
(4.13)

where the constant \(A_0\) (depending only on M) is as in Remark 1.2.

The choice of \(n_0\) satisfying (4.13), together with Theorem 1.1 and the bound on the function D given in Remark 1.2 (which is non-decreasing in \(\kappa \) and in \(\lambda \)), implies that for every \(m \in {\mathbb {N}}\) large enough, we have

$$\begin{aligned} {\mathbb {P}}_{\mu _m}\left( |\kappa (L_{n_0})-n_0\ell (\mu _m)| \geqslant n_0 t_0 \right) \leqslant \frac{t_0}{\kappa _0}. \end{aligned}$$

Since \( |\frac{1}{n_0}\kappa (L_{n_0}) - \ell (\mu _m)|\leqslant \kappa _0\), splitting the expectation of this quantity according to whether it exceeds \(t_0\) or not, we obtain that for every m large enough, we have

$$\begin{aligned} \left| \frac{1}{n_0} {\mathbb {E}}_{\mu _m}[\kappa (L_{n_0})] -\ell (\mu _m)\right| \leqslant 2t_0. \end{aligned}$$
(4.14)

On the other hand, since \(\mu _m \rightarrow \mu _\infty \) weakly, we have that as \(m \rightarrow \infty \), \({\mathbb {E}}_{\mu _m}[\kappa (L_{n_0})] \rightarrow {\mathbb {E}}_{\mu _\infty }[\kappa (L_{n_0})]\). Therefore, combining (4.12) with (4.14), it follows that for every \(m \in {\mathbb {N}}\) large enough, we have \(|\ell (\mu _\infty )-\ell (\mu _m)|\leqslant 4t_0\) completing the proof. \(\square \)

Remark 4.12

A particular situation where the hypotheses of the previous result are satisfied is when there exists a finite set \(S \subset {\text {Isom}}(M)\) that contains the supports of all \(\mu _m\) for \(m \in {\mathbb {N}}\), \(\mu _m \rightarrow \mu _\infty \) weakly and \(\rho (\lambda _G(\mu _\infty ))<1\). This claim can easily be deduced from the results of Berg–Christensen [5]. We will omit the details as we will now prove a general continuity statement.

The following result is the one that can be deduced from the unique cocycle-average property similarly to Hennion [38] and Furstenberg–Kifer [25]. For a very similar proof closer to our setting and related remarks, see Gouëzel–Mathéus–Maucourant [33, Proposition 2.3]. In the following, for a probability measure \(\mu \) on \({\text {Isom}}(M)\), we denote \(L_1(\mu )=\int \kappa (g) d\mu (g)\).

Proposition 4.13

Let (M, d) be a proper geodesic hyperbolic metric space. Let \(\mu _n\) be a sequence of non-elementary probability measures that converges weakly to a non-elementary probability measure \(\mu \). Suppose furthermore that \(L_1(\mu _n) \rightarrow L_1(\mu )\) as \(n \rightarrow \infty \). Then,

$$\begin{aligned} \ell (\mu _n) \rightarrow \ell (\mu ). \end{aligned}$$

Proof

Let \(\nu _n\) be a \(\mu _n\)-stationary probability measure on the horofunction boundary X of M. By the unique cocycle-average property [4, Proposition 3.3(c)], we have

$$\begin{aligned} \ell (\mu _n)=\int _{G \times X} \sigma (g,\xi ) d\mu _n(g) d\nu _n(\xi ). \end{aligned}$$
(4.15)

Since X is compact, up to passing to a subsequence of \(\nu _n\), we can suppose that the sequence \(\nu _n\) converges to a probability measure \(\nu \) on X. Since \(\mu _n \rightarrow \mu \) weakly, one deduces from the continuity of the action of G on X that \(\nu \) is \(\mu \)-stationary. Using the hypothesis that \(L_1(\mu _n) \rightarrow L_1(\mu )\) and the fact that \(\kappa (g) \geqslant |\sigma (g,\xi )|\) for every \(g \in G\) and \(\xi \in X\), one gets by dominated convergence that the sequence of integrals in (4.15) converges to \(\int _{G \times X} \sigma (g,\xi ) d\mu (g) d\nu (\xi )\). But by unique cocycle-average property, the latter is equal to \(\ell (\mu )\). This implies the claimed convergence. \(\square \)

Finally, we mention that, in the case of a countable hyperbolic group, further regularity properties of the drift are known, see e.g. Erschler–Kaimanovich [21], Ledrappier [46], Gouëzel [31] and Mathieu–Sisto [51].

5 The case of Gromov hyperbolic groups and rank-one linear groups

The goal of this section is to prove Corollaries 1.4 and 1.6 using Theorem 4.1.

An important ingredient that allows us to obtain concentration inequalities with implied constants that depend, in a minimal fashion, on the probability measure \(\mu \) is a version of the uniform Tits alternative for groups of isometries of hyperbolic spaces. For hyperbolic groups, we will use Koubi’s results [43] and for linear groups the strong Tits alternative of Breuillard [9].

5.1 Concentration inequalities for random walks on Gromov hyperbolic groups

For the proof of Corollary 1.4, we will use the following result of Koubi:

Theorem 5.1

([43]) Let \(\Gamma \) be a finitely generated non-elementary hyperbolic group. There exists \(N_{\Gamma } \in {\mathbb {N}}\) such that for any finite subset S generating \(\Gamma \), there exist two elements \(a,b \in S^{N_\Gamma }\) that generate a free subgroup of rank two.

Here, by S-length of an element \(g \in \Gamma \), we mean the distance of g to the identity element in the word-metric induced by S.

The previous result will be useful to us in combination with the following straightforward observation (see e.g. [11, §8]).

Lemma 5.2

Let \(\Gamma \) be a countable group and \(S \subset \Gamma \) such that \(S^{N_0}\) contains a pair of elements that generates a free subgroup of rank two for some \(N_0 \in {\mathbb {N}}\). Let \(\mu \) be a probability measure with support \(S\cup \{{\text {id}}\}\) and set \(m_\mu =\min _{g \in S}\mu (g)\). Then,

$$\begin{aligned} \Vert \lambda _\Gamma (\mu )\Vert _2 \leqslant \left( 1- \left( 1-\frac{\sqrt{3}}{2}\right) m_{\mu }^{2{N_0}} \right) ^{\frac{1}{2{N_0}}} . \end{aligned}$$

Proof

Consider the probability measure \(\mu '=\mu *{\check{\mu }}\) and denote by \(S'\) its support. Since the support of \(\mu \) contains the identity, the set \(S'\) is symmetric and it contains S. It follows that \((S')^{N_0}\) contains a set \(\{a,b,a^{-1},b^{-1}\}\), where a, b are the generators of a free group of rank two.

Since \(\mu '\) is symmetric, the operator \(\lambda _\Gamma (\mu ')\) on \(\ell ^2(\Gamma )\) is self-adjoint, and since \(\lambda _\Gamma (\mu '^{*N_0})=\lambda _\Gamma (\mu ')^{N_0}\), we have \(\Vert \lambda _\Gamma (\mu ')\Vert _2= \Vert \lambda _\Gamma (\mu '^{*N_0})\Vert _2^{1/N_0}\). Therefore,

$$\begin{aligned} \Vert \lambda _\Gamma (\mu )\Vert _2=\Vert \lambda _\Gamma (\mu ')\Vert _2^{\frac{1}{2}}=\Vert \lambda _\Gamma (\mu '^{*N_0})\Vert _2^{\frac{1}{2N_0}}. \end{aligned}$$
(5.1)

On the other hand, we write \(\mu '^{ *N_0}=m_{\mu '}^{N_0} \eta + (1-m_{\mu '}^{N_0})\zeta \), where \(\eta \) is the uniform probability measure on \(\{a,b,a^{-1}, b^{-1}\}\) and \(\zeta \) some probability measure on \(\Gamma \). Using the trivial bound \(\Vert \lambda _\Gamma (\zeta )\Vert _2 \leqslant 1\), we deduce that

$$\begin{aligned} \Vert \lambda _\Gamma (\mu '^{*{N_0}})\Vert _2\leqslant 1-\kappa \,m_{\mu '}^{N_0}, \end{aligned}$$
(5.2)

where \(1-\kappa =\sqrt{3}/2\) is the spectral radius of the uniform probability measure on the free group [42, Theorem 3]. Combining (5.1) and (5.2), and using the fact that \(m_{\mu '}\geqslant m_{\mu }^2\), we deduce that

$$\begin{aligned} \Vert \lambda _\Gamma (\mu )\Vert _2 \leqslant \left( 1- \kappa \,m_{\mu }^{2{N_0}} \right) ^{\frac{1}{2{N_0}}}. \end{aligned}$$
(5.3)

\(\square \)
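
Spelled out, the passage from the decomposition \(\mu '^{ *N_0}=m_{\mu '}^{N_0} \eta + (1-m_{\mu '}^{N_0})\zeta \) to (5.2) in the above proof is the one-line estimate, using the triangle inequality, \(\Vert \lambda _\Gamma (\zeta )\Vert _2 \leqslant 1\) and the value \(\sqrt{3}/2\) for \(\Vert \lambda _\Gamma (\eta )\Vert _2\) recorded in [42, Theorem 3]:

$$\begin{aligned} \Vert \lambda _\Gamma (\mu '^{*N_0})\Vert _2 \leqslant m_{\mu '}^{N_0}\Vert \lambda _\Gamma (\eta )\Vert _2 + (1-m_{\mu '}^{N_0})\Vert \lambda _\Gamma (\zeta )\Vert _2 \leqslant \frac{\sqrt{3}}{2}\, m_{\mu '}^{N_0} + \left( 1-m_{\mu '}^{N_0}\right) = 1-\kappa \, m_{\mu '}^{N_0}. \end{aligned}$$

For a purely hypothetical numerical illustration of the resulting bound (5.3): if \(N_0=1\) (so that S itself contains a free pair) and \(m_\mu =1/5\), the right-hand side of the lemma evaluates to \(\left( 1-(1-\sqrt{3}/2)/25\right) ^{1/2}\approx 0.9973\).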

Proof of Corollary 1.4

Note first that by [14, Proposition 2.6], the group \(\Gamma \) is a non-elementary hyperbolic group. Therefore, the hypothesis of Lemma 5.2 is satisfied for every finite generating set of \(\Gamma \) with a uniform constant \(N_0=N'_\Gamma \) thanks to Koubi’s Theorem 5.1. Applying Lemma 5.2 to \(\mu _{1/2,{\text {lazy}}}\) and using the fact that \(m_{\mu _{1/2,{\text {lazy}}}} \geqslant \frac{1}{2} m_{\mu }\) yields that

$$\begin{aligned} \Vert \lambda _\Gamma (\mu _{1/2, {\text {lazy}}})\Vert _2\leqslant \left( 1- \frac{(1-\sqrt{3}/2)}{2^{N'_{\Gamma }}} \,m_{\mu }^{N'_\Gamma }\right) ^{\frac{1}{2 N'_{\Gamma }}} \leqslant \left( 1- \frac{(1-\sqrt{3}/2)}{N'_{\Gamma }2^{N'_{\Gamma }+1}} \,m_{\mu }^{N'_\Gamma }\right) .\nonumber \\ \end{aligned}$$
(5.4)

Observe finally that by the discreteness assumption on \(\Gamma \) and by Berg–Christensen’s corollary [5, Corollaire 3], one has that \(\Vert \lambda _\Gamma (\mu _{1/2, {\text {lazy}}})\Vert _2=\Vert \lambda _G(\mu _{1/2, {\text {lazy}}})\Vert _2\). Using now Theorem 1.1 and the expression of the function \(D(\cdot , \cdot )\) given in Remark 1.2, we get the desired result with

$$\begin{aligned} \alpha _{\Gamma }:=2^{21+4N'_{\Gamma }} \frac{(N'_\Gamma )^4}{(1-\sqrt{3}/2)^4}, \end{aligned}$$

\(A_M=A_0+3\) and \(N_\Gamma =4N'_\Gamma \). \(\square \)

5.2 Concentration inequalities for random walks on rank-one semisimple linear groups

For the proof of Corollary 1.6, we will use the following result of Breuillard [9, Theorem 1.1] and [10] (see [11] for the particular case of \({\text {SL}}_2\)).

Theorem 5.3

([9]) For every \(d \in {\mathbb {N}}\) there is \(N_d \in {\mathbb {N}}\) such that if \(\mathrm {k}\) is any field and S is a finite symmetric subset of \({\text {GL}}_d(\mathrm {k})\) containing identity, either \(S^{N_d}\) contains two elements which generate a non-abelian free group, or the group generated by S contains a finite-index solvable subgroup.

We are now able to give the

Proof of Corollary 1.6

Let \(d \in {\mathbb {N}}\) be given, and let \(\mathrm {k}\) and \({\mathbb {H}}\subseteq {{\mathbb {S}}}{{\mathbb {L}}}_d\) be as in the statement. Let the natural number \(N'_d\) (depending only on d) be as given by Theorem 5.3. Let \(\mu \) be a probability measure whose support S is a finite subset of \({\mathbb {H}}(\mathrm {k})\) that generates a discrete non-amenable subgroup \(\Gamma \) of \({\mathbb {H}}(\mathrm {k})\). Let \(\mu ':=\mu _{1/2, {\text {lazy}}}\) and \(\mu '':=\mu '*\mu '^{-1}\), and denote by \(S'\) the support of \(\mu '\) and by \(S''\) the support of \(\mu ''\). Notice that the finite set \(S''\) is symmetric, contains the identity and generates the group \(\Gamma \). Since \(\Gamma \) is non-amenable, it does not contain a finite-index solvable subgroup, and it therefore follows from Theorem 5.3 that \(S''^{N'_d}\) contains two elements that generate a free group of rank two, where the constant \(N'_d \in {\mathbb {N}}\) only depends on the dimension d. Applying Lemma 5.2, we get that

$$\begin{aligned} \Vert \lambda _\Gamma (\mu '')\Vert _2 \leqslant \left( 1- (1-\sqrt{3}/2) m_{\mu ''}^{2{N'_d}} \right) ^{\frac{1}{2{N'_d}}}. \end{aligned}$$

Since \(\Vert \lambda _\Gamma (\mu '')\Vert _2=\Vert \lambda _\Gamma (\mu ')\Vert _2^2\) and \(m_{\mu ''}\geqslant m_{\mu '}^2\geqslant m_{\mu }^2/4\), we deduce

$$\begin{aligned} \Vert \lambda _\Gamma (\mu ')\Vert _2 \leqslant 1-m_{\mu }^{4N'_d} \frac{1-\sqrt{3}/2}{4N'_d 2^{4N'_d}}. \end{aligned}$$
(5.5)

As in the proof of Corollary 1.4, by the discreteness assumption on \(\Gamma \) it follows that \(\Vert \lambda _\Gamma (\mu _{1/2, {\text {lazy}}})\Vert _2=\Vert \lambda _G(\mu _{1/2, {\text {lazy}}})\Vert _2\). Therefore, a direct application of Theorem 1.1 (with \(r=1/2\) on the right hand side of the theorem) and the expression of the function \(D(\cdot , \cdot )\) given in Remark 1.2 concludes the proof with \(N_d=16 N_d'\), \(\alpha _d=2^{25+N_d} N_d^4 \frac{1}{(1-\sqrt{3}/2)^4}\) and \(A_{{\mathbb {H}}, \mathrm {k}}=A_0/3+3\), where \(A_0\) is the constant defined in Remark 4.3 applied for the isometry group G of the symmetric space M associated to the rank-one group \({\mathbb {H}}(\mathrm {k})\). \(\square \)

6 Probabilistic free subgroup theorem

The goal of this section is to prove Theorem 1.10 from Introduction. To do this, we start by proving a general result which shows that uniform large deviation estimates for the Busemann cocycle together with positivity of the drift imply a probabilistic free subgroup theorem for isometries of Gromov hyperbolic spaces.

6.1 Free subgroups from uniform large deviations

Let (M, d) be a \(\delta \)-hyperbolic metric space and fix \(o \in M\). Let \(\mu \) be a Borel probability measure on \({\text {Isom}}(M)\) endowed with the topology of pointwise convergence.

We introduce the following uniform large deviation hypothesis for a probability measure \(\mu \) with finite first order moment on \({\text {Isom}}(M)\):

ULD: For every \(\epsilon >0\) and \(n \in {\mathbb {N}}\), there exists a non-negative constant \(p_n(\epsilon )\) such that \(p_n(\epsilon ) \rightarrow 0\) as \(n \rightarrow \infty \) and

$$\begin{aligned} \sup _{y \in M}{\mathbb {P}}(|\sigma (R_n^{\pm 1},y)-n\ell (\mu )| \geqslant n\epsilon ) \leqslant p_n(\epsilon ), \end{aligned}$$
(6.1)

where \(\sigma \) denotes the Busemann cocycle. Note that, whenever the ULD hypothesis is satisfied, by replacing, for every \(\epsilon >0\), \(p_n(\epsilon )\) by \(\sup _{m \geqslant n}p_m(\epsilon )\), we can and we will suppose that it is satisfied with a non-increasing sequence \(p_n(\epsilon )\).

The rest of §6.1 is devoted to the proof of the following

Proposition 6.1

Let (M, d) be a \(\delta \)-hyperbolic metric space and \(\mu \) a probability measure on \({\text {Isom}}(M)\) with finite first order moment. Suppose that \(\mu \) satisfies the hypothesis ULD and \(\ell (\mu )>0\). Then, for every integer \(n > 2+ \frac{16\delta }{\ell (\mu )} \), we have

$$\begin{aligned} (\mu ^{*n}\otimes \mu ^{*n} )\left\{ (\gamma _1, \gamma _2) : \langle \gamma _1, \gamma _2 \rangle \, \text {is free} \right\} > 1-25p_{\lfloor n/2 \rfloor }(\ell (\mu )/8). \end{aligned}$$

Before proceeding with the proof, we make a few remarks on its hypotheses.

Remark 6.2

(About ULD hypothesis)

  1.

    Theorem 4.1 shows that the ULD hypothesis is verified, with explicit constants, for random walks on a proper hyperbolic space M such that \({\text {Isom}}(M)\) acts cocompactly on M. This explicit aspect will be crucial for the quantitative probabilistic free subgroup Theorem 1.10.

  2.

    However, ULD (with qualitative constants) is also satisfied when M is not proper: using the cocycle large deviation results of [3], it can be shown (see [39, Proposition 2.8]) that if M is a separable geodesic hyperbolic space, then the ULD hypothesis holds for any countably supported non-elementary probability measure \(\mu \) with finite second order moment. Moreover, in this case, \(\ell (\mu )>0\) ([49, Theorem 1.2]).

We will show that with high probability, two independent random walks \(R_n\) and \(R'_n\) will play ping-pong on the space M. To set the random ping-pong table, we need some geometric lemmas. Let (M, d) be a \(\delta \)-hyperbolic space, fix \(o\in M\) and let \(C>0\). Recall that the shadow of \(y\in M\) seen from \(x\in M\) is the following subset of M:

$$\begin{aligned} {\mathcal {O}}_C(x,y)=\{z\in M : (z|y)_x\geqslant d(x,y)-C\}. \end{aligned}$$

It is immediate that

$$\begin{aligned} {\mathcal {O}}_C(x,y)=\{z\in M : (z|x)_y\leqslant C\}. \end{aligned}$$
(6.2)

Observe that \({\mathcal {O}}_C(x,y)=M\) when \(C\geqslant d(x,y)\).
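
Both (6.2) and the last observation follow from the elementary identity

$$\begin{aligned} (z|y)_x + (z|x)_y = d(x,y), \qquad z \in M, \end{aligned}$$

together with the bound \(0 \leqslant (z|x)_y \leqslant d(x,y)\); both facts are immediate from the definition of the Gromov product and the triangle inequality.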

We will use the following lemma to construct Schottky subgroups of \({\text {Isom}}(M)\).

Lemma 6.3

Let (M, d) be a \(\delta \)-hyperbolic space and \(o\in M\). Let \(\gamma _1, \gamma _2\in {\text {Isom}}(M)\). Suppose that there exists \(D>0\) such that

  (i)

    for every \(i \ne j \in \{1, 2\}\) and \(\epsilon _{1},\epsilon _2 \in \{-1, 1\}\), \((\gamma _i^{\epsilon _1} \cdot o | \gamma _j^{\epsilon _2}\cdot o)_o\leqslant D\).

  (ii)

    for every \(i\in \{1,2\}\), \((\gamma _i \cdot o | \gamma _i^{-1} \cdot o)_o\leqslant D\),

  (iii)

    \(0<\frac{1}{2}\max \{\kappa (\gamma _1), \kappa (\gamma _2)\}<\min \{\kappa (\gamma _1), \kappa (\gamma _2)\} -D-\delta \).

Then \(\langle \gamma _1, \gamma _2\rangle \) is a non-abelian free group.

Remark 6.4

Assumptions (ii) and (iii) have the consequence that \(\gamma _1\) and \(\gamma _2\) are both hyperbolic isometries. Indeed, it follows from (iii) that \(\kappa (\gamma _i) > 2D+2\delta \) for \(i=1,2\). Together with (ii), this implies \(\kappa (\gamma _i^2)>\kappa (\gamma _i)+2\delta \), which in turn implies that \(\gamma _i\) is a hyperbolic isometry by [16, Lemma 2.2].
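
For the reader’s convenience, the estimate \(\kappa (\gamma _i^2)>\kappa (\gamma _i)+2\delta \) can be obtained as follows: writing \(\gamma \) for \(\gamma _i\),

$$\begin{aligned} \kappa (\gamma ^2) = d(\gamma ^2 o, \gamma o) + d(\gamma o, o) - 2(\gamma ^2 o\,|\,o)_{\gamma o} = 2\kappa (\gamma ) - 2(\gamma o\,|\,\gamma ^{-1}o)_o \geqslant 2\kappa (\gamma )-2D > \kappa (\gamma )+2\delta , \end{aligned}$$

where the second equality uses the invariance of the Gromov product under the isometry \(\gamma ^{-1}\), the first inequality uses (ii), and the last one uses \(\kappa (\gamma )>2D+2\delta \).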

The proof of Lemma 6.3 will follow from intermediate lemmas inspired by [7, Appendix A].

Lemma 6.5

Let (M, d) be any metric space and \(\gamma \in {\text {Isom}}(M)\). Then for every constant \(C\geqslant 0\), we have

$$\begin{aligned} \gamma (M\setminus {\mathcal {O}}_C(o, \gamma ^{-1} \cdot o)) \subseteq {\mathcal {O}}_{\kappa (\gamma )-C}(o,\gamma \cdot o). \end{aligned}$$

Proof

We have clearly for every \(C\geqslant 0\),

$$\begin{aligned} \gamma \cdot (M\setminus {\mathcal {O}}_C(o, \gamma ^{-1}\cdot o))=M\setminus {\mathcal {O}}_C(\gamma \cdot o, o ). \end{aligned}$$

So let \(x\not \in {\mathcal {O}}_C(\gamma \cdot o, o)\). By (6.2) this means that \((x | \gamma \cdot o)_o>C\). Using the identity

$$\begin{aligned} \kappa (\gamma )=(x|o)_{\gamma \cdot o}+(\gamma \cdot o | x)_{o}, \end{aligned}$$

we deduce that \((x|o)_{\gamma \cdot o}<\kappa (\gamma )-C\). Hence by (6.2), we have \(x\in {\mathcal {O}}_{\kappa (\gamma )-C}(o, \gamma \cdot o)\) as desired. \(\square \)

Lemma 6.6

Let (M, d) be a \(\delta \)-hyperbolic space. Let \(\gamma _1, \gamma _2 \in {\text {Isom}}(M)\) and \(D>0\). Denote \(\kappa _{1,2}:=\min \{\kappa (\gamma _1), \kappa (\gamma _2)\}\). Then

$$\begin{aligned} \left. \begin{array}{l} (\gamma _1 \cdot o | \gamma _2 \cdot o)_o \leqslant D\\ \kappa _{1,2}> D+\delta \end{array}\right\} \Longrightarrow \forall \, 0<C<\kappa _{1,2}-D-\delta ,\quad {\mathcal {O}}_C(o, \gamma _1 \cdot o)\cap {\mathcal {O}}_C(o, \gamma _2 \cdot o)=\emptyset . \end{aligned}$$

Proof

Assume that \((\gamma _1 \cdot o | \gamma _2 \cdot o)_o \leqslant D\) and \(\kappa _{1,2}> D+C+\delta \). Let \(x\in {\mathcal {O}}_C(o, \gamma _1\cdot o)\). By definition, \((x|\gamma _1\cdot o)_o\geqslant \kappa (\gamma _1)-C> D+\delta \). But by \(\delta \)-hyperbolicity,

$$\begin{aligned} \min \{(x | \gamma _2\cdot o)_o, (x | \gamma _1\cdot o)_o\}\leqslant (\gamma _1 \cdot o | \gamma _2 \cdot o)_o+\delta \leqslant D+\delta . \end{aligned}$$

Thus \((x | \gamma _2 \cdot o)_o\leqslant D+\delta <\kappa (\gamma _2)-C\) and hence \(x\not \in {\mathcal {O}}_C(o, \gamma _2\cdot o)\) which proves the claim. \(\square \)

Now we are able to give

Proof of Lemma 6.3

Using the assumption (iii), fix any real C such that

$$\begin{aligned} \frac{1}{2}\max _{i=1,2}\{\kappa (\gamma _i)\}< C< \min _{i=1,2}\{\kappa (\gamma _i)\}-D-\delta . \end{aligned}$$

For \(i=1,2\), denote \({\mathcal {O}}_i={\mathcal {O}}_C(o, \gamma _i \cdot o)\) and \({\mathcal {O}}_i^{<}={\mathcal {O}}_C(o, \gamma _i^{-1} \cdot o)\). By Lemma 6.6, these are four disjoint subsets of M. Moreover, by Lemma 6.5 and the choice of the constant C, the following inclusions hold for every \(i=1,2\):

$$\begin{aligned} \gamma _i (M\setminus {\mathcal {O}}_i^{<}) \subseteq {\mathcal {O}}_i \end{aligned}$$

and

$$\begin{aligned} \gamma _i^{-1} (M\setminus {\mathcal {O}}_i) \subseteq {\mathcal {O}}_i^<. \end{aligned}$$

Thus the pair of elements \(\gamma _1, \gamma _2\) satisfies the hypotheses of the classical ping-pong lemma and therefore generates a free subgroup of \({\text {Isom}}(M)\). \(\square \)

With Lemma 6.3 at hand, we now focus on showing that the random walks \(R_n, R'_n, R_n^{-1}, {R'_n}^{-1}\) satisfy assumptions (i)–(iii) of Lemma 6.3 with \(D=n\ell (\mu )/8+2\delta \), with probability tending to one at a rate governed by the constants \(p_n(\epsilon )\) appearing in the hypothesis ULD. Before that, we provide some estimates on the random walk \(R_n\) based on uniform large deviation estimates.

Lemma 6.7

Let (M, d) be a \(\delta \)-hyperbolic metric space and let \(\mu \) be a probability measure on \({\text {Isom}}(M)\) with finite first order moment and satisfying the hypothesis ULD. Then, the following estimates hold.

  (i)

    For every \(\epsilon >0\) and every \(n\in {\mathbb {N}}\),

    $$\begin{aligned} \sup _{y\in M} {\mathbb {P}}\left( (R_n \cdot o| y)_o\geqslant \epsilon n\right) \leqslant 2p_n(\epsilon ). \end{aligned}$$
  (ii)

    For every \(0<\epsilon \leqslant \ell (\mu )/8\) and every \(n> 2+\frac{8 \delta }{\ell (\mu )}\),

    $$\begin{aligned} {\mathbb {P}}\left( (R_n \cdot o| R_n^{-1}\cdot o)_o\geqslant \epsilon n + 2\delta \right) \leqslant 8p_{\lfloor n/2 \rfloor }(\epsilon ) . \end{aligned}$$

Proof

  (i)

    Using the identity

    $$\begin{aligned} (go| y)_o = \frac{1}{2} (\kappa (g)-\sigma (g^{-1},y)) \end{aligned}$$

    which holds for any \(g\in {\text {Isom}}(M)\) and \(y\in M\), the desired inequality follows from the ULD hypothesis applied to both \(\kappa (R_n)=\sigma (R_n,o)\) and \(\sigma (R_n^{-1},y)\).

  (ii)

    Let \(\epsilon >0\) and \(n\in {\mathbb {N}}\). For every \(1\leqslant m< n\), we denote \(R_{m,n}:=X_m \cdots X_n\). By \(\delta \)-hyperbolicity, we have

    $$\begin{aligned} \begin{aligned} \min&\left\{ (R_n\cdot o | R_n^{-1} \cdot o)_o, (R_n \cdot o| R_{\lfloor n/2 \rfloor } \cdot o)_o, (R_n^{-1} \cdot o| R_{\lfloor n/2 \rfloor +1,n}^{-1} \cdot o)_o\right\} \\&\leqslant (R_{\lfloor n/2 \rfloor }\cdot o | R_{\lfloor n/2 \rfloor +1,n}^{-1} \cdot o)_o+2\delta . \end{aligned} \end{aligned}$$
    (6.3)

    On the one hand, since \(R_{\lfloor n/2 \rfloor }=X_1 \cdots X_{\lfloor n/2 \rfloor }\) and \(R_{\lfloor n/2 \rfloor +1,n}=X_{\lfloor n/2 \rfloor +1} \cdots X_{n}\) are independent random variables, we deduce from (i) that

    $$\begin{aligned} {\mathbb {P}}\left( (R_{\lfloor n/2 \rfloor }\cdot o | R_{\lfloor n/2 \rfloor +1,n}^{-1} \cdot o)_o\geqslant \epsilon n\right) \leqslant 2p_{\lfloor n/2\rfloor }(\epsilon ). \end{aligned}$$
    (6.4)

    On the other hand, we claim that if \(0<\epsilon \leqslant \ell (\mu )/8\) and \(n>2+8\delta /\ell (\mu )\), then the following holds:

    $$\begin{aligned}&{\mathbb {P}}\left( \min \left\{ (R_n^{-1}\cdot o | R_{\lfloor n/2 \rfloor +1,n}^{-1}\cdot o)_o, (R_n\cdot o | R_{\lfloor n/2 \rfloor }\cdot o)_o \right\} \leqslant \epsilon n+ 2\delta \right) \nonumber \\&\quad \leqslant 2p_n(\epsilon )+4p_{\lfloor n/2 \rfloor }(\epsilon ). \end{aligned}$$
    (6.5)

    This will finish the proof of (ii) by combining (6.3), (6.4) and (6.5). We now check (6.5). We have that

    $$\begin{aligned} (R_n\cdot o | R_{\lfloor n/2 \rfloor } \cdot o)_o=\frac{\kappa (R_n)+\kappa (R_{\lfloor n/2 \rfloor }) - \kappa (R_{\lfloor n/2 \rfloor +1,n})}{2}. \end{aligned}$$
    (6.6)

    Thanks to ULD, the following inequalities hold: \({\mathbb {P}}\left( \kappa (R_n)<n(\ell (\mu )-\epsilon )\right) \leqslant p_n(\epsilon )\) and \({\mathbb {P}}\left( \kappa (R_{\lfloor n/2 \rfloor })< \lfloor n/2 \rfloor (\ell (\mu )-\epsilon )\right) \leqslant p_{\lfloor n/2 \rfloor }(\epsilon )\). Moreover, since the \(X_i\)’s are iid, for each \(n \in {\mathbb {N}}\), the distribution of \(\kappa (R_{\lfloor n/2 \rfloor +1, n})\) is the same as \(\kappa (R_{n-\lfloor n/2 \rfloor })\). Thus by applying again the ULD hypothesis, we get that

    $$\begin{aligned} {\mathbb {P}}\left( \kappa (R_{\lfloor n/2 \rfloor +1,n})> (n-\lfloor n/2 \rfloor )(\ell (\mu )+\epsilon )\right) \leqslant p_{n-\lfloor n/2 \rfloor }(\epsilon )\leqslant p_{\lfloor n/2 \rfloor }(\epsilon ). \end{aligned}$$

    By (6.6) this yields that

    $$\begin{aligned} {\mathbb {P}}\left( (R_n\cdot o | R_{\lfloor n/2 \rfloor }\cdot o)_o\leqslant \lfloor n/2 \rfloor \ell (\mu )-n\epsilon \right) \leqslant p_n(\epsilon )+2p_{\lfloor n/2 \rfloor }(\epsilon ). \end{aligned}$$
    (6.7)

    A similar relation holds with the Gromov product \((R_n\cdot o | R_{\lfloor n/2 \rfloor }\cdot o)_o\) replaced by \((R_n^{-1} \cdot o| R_{\lfloor n/2 \rfloor +1,n}^{-1} \cdot o)_o\). Consequently, estimate (6.5) holds as soon as \(\lfloor n/2\rfloor \ell (\mu ) -n\epsilon > \epsilon n+2\delta \). This is for instance guaranteed if \(0<\epsilon \leqslant \ell (\mu )/8\) and \(n>2+8\delta /\ell (\mu )\). This shows (6.5). Since \(p_n(\epsilon )\) is non-increasing, this concludes the proof of estimate (ii). \(\square \)

We are finally ready to conclude

Proof of Proposition 6.1

Consider two independent random walks \((R_n)_{n\geqslant 1}\) and \( (R'_n)_{n\geqslant 1}\) driven by \(\mu \). We will check that \(R_n, R'_n, R_n^{-1}, {R'_n}^{-1}\) satisfy assumptions (i)–(iii) of Lemma 6.3 with \(D_n:=n\ell (\mu )/8+2\delta \), with probability tending to one. By (i) of Lemma 6.7 and the independence of the random variables \(R_n\) and \(R'_n\) we deduce that

$$\begin{aligned} {\mathbb {P}}\left( (R_n\cdot o | {R'_n} \cdot o)_o \geqslant n\ell (\mu )/8 \right) \leqslant 2 p_n(\ell (\mu )/8). \end{aligned}$$
(6.8)

Three other similar estimates hold by replacing the couple \((R_n, R'_n)\) with the couples \((R_n, {R'_n}^{-1})\), \((R_n^{-1}, R'_n)\), \((R_n^{-1}, {R'_n}^{-1})\). Also, by (ii) of Lemma 6.7, we have for \(n>2+8\delta /\ell (\mu ) \),

$$\begin{aligned} {\mathbb {P}}\left( (R_n \cdot o | R_n^{- 1} \cdot o)_o \geqslant D_n\right) \leqslant 8 p_{\lfloor n/2 \rfloor }(\ell (\mu )/8), \end{aligned}$$
(6.9)

and similarly

$$\begin{aligned} {\mathbb {P}}\left( (R'_n \cdot o| {R'_n}^{- 1} \cdot o)_o \geqslant D_n\right) \leqslant 8 p_{\lfloor n/2 \rfloor }(\ell (\mu )/8). \end{aligned}$$
(6.10)

Finally, using the hypothesis ULD, we have for every \(\epsilon >0\) and \(n\in {\mathbb {N}}\) that

$$\begin{aligned} {\mathbb {P}}\left( \kappa (R_n)\in [n \ell (\mu ) - n\epsilon , n \ell (\mu )+n\epsilon ]\right) \geqslant 1-p_n(\epsilon ), \end{aligned}$$

and similarly for \(\kappa (R'_n)\). Hence, with probability \(\geqslant 1-p_n(\epsilon )\),

$$\begin{aligned} 0<\frac{1}{2}\max \{\kappa (R_n), \kappa (R'_n)\}<\min \{\kappa (R_n), \kappa (R'_n)\} -D_n-\delta \end{aligned}$$
(6.11)

as soon as

$$\begin{aligned} (n \ell (\mu )+n\epsilon )/2< n\ell (\mu )-n\epsilon -D_n-\delta , \end{aligned}$$

and in particular as soon as \(0<\epsilon \leqslant \ell (\mu )/8\) and \(n>16\delta /\ell (\mu )\).

Finally, specializing to \(\epsilon =\ell (\mu )/8\), we conclude that the seven estimates (6.11), (6.10), (6.9), (6.8), and the three other inequalities similar to (6.8) hold simultaneously in an event of \({\mathbb {P}}\)-probability

$$\begin{aligned}>1-\left( 8p_n(\ell (\mu )/8)+16p_{\lfloor n/2 \rfloor }(\ell (\mu )/8) + p_n(\ell (\mu )/8)\right) >1-25 p_{\lfloor n/2 \rfloor }(\ell (\mu )/8),\nonumber \\ \end{aligned}$$
(6.12)

provided that \(n>2+16\delta /\ell (\mu )\). In other words, for every such \(n \in {\mathbb {N}}\), with probability at least the amount given by (6.12), the elements \(R_n\) and \(R_n'\) satisfy the hypotheses of Lemma 6.3 and this finishes the proof of Proposition 6.1. \(\square \)

6.2 A lower bound for the drift

In view of Theorem 4.1 and Proposition 6.1, the only remaining ingredient for the proof of Theorem 1.10 is a control of how small the drift \(\ell (\mu )\) of the random walk can be. The harmonic analytic approach of §4 allows one to deduce a lower bound on the drift as we now discuss. This sort of result should be known to the experts. Results of similar flavor appear in the works [35, 50, 53, 59].

Given \(R \geqslant 0\), as before, we set \(B_R=\{g \in G \; | \; d(go,o) \leqslant R\}\). Since M is proper, the sets \(B_R\) defined above are compact and they have non-empty interior if \(R>0\). In particular, there exist \(K_0 \in {\mathbb {N}}\) and \(g_1, \ldots , g_{K_0} \in B_{6D_1}\) such that \(B_{6D_1} \subseteq \cup _{i=1}^{K_0}g_i B_{D_1}\), where, as before, \(D_1 \in {\mathbb {R}}\) denotes the constant \(\max \{D_0,1\}\) and \(D_0:=2\text {diam}(G\backslash M)\). For convenience later on, we choose \(K_0\) to be the smallest such integer. An elementary covering argument allows one to get the bound \(K_0\leqslant \frac{\mu _G(B_{\frac{13}{2}D_1 })}{\mu _G(B_{\frac{1}{2}D_1})}\).
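
A sketch of this covering argument: writing \(d_G(g,h):=d(go,ho)\) for the pseudo-metric also used in the proof of Proposition 6.8 below, pick a maximal subset \(\{g_1, \ldots , g_K\}\subseteq B_{6D_1}\) whose points are pairwise at \(d_G\)-distance \(>D_1\). By maximality, the translates \(g_iB_{D_1}\) cover \(B_{6D_1}\), so \(K_0 \leqslant K\); on the other hand, the translates \(g_iB_{D_1/2}\) are pairwise disjoint, contained in \(B_{\frac{13}{2}D_1}\) and each of Haar measure \(\mu _G(B_{\frac{1}{2}D_1})\) by left invariance, whence

$$\begin{aligned} K_0 \leqslant K \leqslant \frac{\mu _G(B_{\frac{13}{2}D_1 })}{\mu _G(B_{\frac{1}{2}D_1})}. \end{aligned}$$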

Proposition 6.8

Let (M, d) be a proper geodesic metric space such that \(G={\text {Isom}}(M)\) acts cocompactly on M. Then, for every probability measure \(\mu \) on G with finite first order moment, the drift \(\ell (\mu ) \in {\mathbb {R}}\) satisfies

$$\begin{aligned} \ell (\mu ) \geqslant \frac{2D_1}{\ln K_0} \sup _{r \in [0,1)} \frac{1}{1-r} \ln \frac{1}{\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2}. \end{aligned}$$
(6.13)

Remark 6.9

The reason why we also include \(\mu _{r,{\text {lazy}}}\) in the conclusion of the previous Proposition 6.8 is that, as discussed in Remark 4.4, when \(\mu \) is non-symmetric, it might happen that the closed group \({\overline{\Gamma }}_\mu \) generated by the support of \(\mu \) is non-amenable whereas \(\Vert \lambda _G(\mu )\Vert _2=1\). However, in this case, for every \(r>0\), we have \(\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2<1\). Therefore, whenever \({\overline{\Gamma }}_\mu \) is non-amenable the lower bound provided by the proposition is strictly positive and it depends only on \(D_1,K_0\), and \(\Vert \lambda _G(\mu _{1/2,{\text {lazy}}})\Vert \).

Proof

We first prove that

$$\begin{aligned} \ell (\mu ) \geqslant \frac{2D_1}{\ln K_0} \ln \frac{1}{\Vert \lambda _G(\mu )\Vert _2}. \end{aligned}$$

The proposition then follows by applying the above for each \(\mu _{r,{\text {lazy}}}\) and noting that \(\ell (\mu _{r,{\text {lazy}}})=(1-r)\ell (\mu )\). A straightforward modification of the proof of Lemma 4.5 shows that for every \(R>0\) and \(n \in {\mathbb {N}}\), we have

$$\begin{aligned} {\mathbb {P}}(d(R_n o,o) \leqslant R) \leqslant \left( \frac{\mu _G(B_{2R})}{\mu _G(B_{R})}\right) ^{1/2} \Vert \lambda _G(\mu )\Vert _2^n, \end{aligned}$$
(6.14)

where \(\mu _G\) is a Haar measure on G. Indeed, the additional term \(D_0\) in the left hand side of (4.2) disappears since here we take \(m=m'=o\).

We now claim that for every \(r \geqslant D_1\),

$$\begin{aligned} \frac{\mu _G(B_{r+D_1})}{\mu _G(B_{r})} \leqslant K_0, \end{aligned}$$
(6.15)

where \(K_0 \in {\mathbb {N}}\) is the constant defined before the statement of Proposition 6.8. Indeed, given \(r \geqslant D_1\), let \(\{\gamma _1,\ldots ,\gamma _T\}\) be a maximal \(2D_1\)-separated set contained in \(B_{r-D_1}\) with respect to the left-invariant pseudo-metric \(d_G\) defined as \(d_G(g,h)=d(go,ho)\) for every \(g,h \in G\). Then the collection \(\gamma _iB_{D_1}\) for \(i=1,\ldots ,T\) consists of disjoint compact subsets of \(B_r\) of same Haar measure as \(B_{D_1}\) so that we have \(\mu _G(B_r) \geqslant T \mu _G(B_{D_1})\). On the other hand, since G acts co-compactly on M and M is geodesic, it is not hard to see that every element in \(B_{r+D_1}\) is \(2D_1\)-close for the pseudo-metric \(d_G\) to an element of \(B_r\) (in fact, \((G,d_G)\) is a large-scale geodesic space in the sense of [17, Definition 3.B.1]). Hence the collection \(\gamma _i B_{6D_1}\) for \(i=1,\ldots ,T\) is a covering of \(B_{r+D_1}\) by compacts having the same Haar measure as \(B_{6D_1}\) and therefore we have \(\mu _G(B_{r+D_1}) \leqslant T \mu _G(B_{6D_1})\). Therefore we deduce \(\frac{\mu _G(B_{r+D_1})}{\mu _G(B_r)} \leqslant \frac{\mu _G(B_{6D_1})}{\mu _G(B_{D_1})} \leqslant K_0\) proving (6.15).

Now, by using (6.15) iteratively and plugging it into (6.14), we deduce that for every \(\alpha <\frac{2D_1}{\ln K_0}\ln \frac{1}{\Vert \lambda _G(\mu )\Vert _2}\), we have

$$\begin{aligned} \limsup _{n \rightarrow \infty } {\mathbb {P}}(d(R_no,o) \leqslant \alpha n)^{\frac{1}{n}}<1. \end{aligned}$$
(6.16)
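To spell out how (6.16) is obtained (a sketch in the notation above): fix \(0<\alpha <\frac{2D_1}{\ln K_0}\ln \frac{1}{\Vert \lambda _G(\mu )\Vert _2}\), non-positive values of \(\alpha \) being handled by monotonicity. For n large enough that \(\alpha n \geqslant D_1\), applying (6.14) with \(R=\alpha n\) and iterating (6.15) at most \(\lceil \alpha n/D_1 \rceil \) times gives

$$\begin{aligned} {\mathbb {P}}(d(R_n o,o) \leqslant \alpha n) \leqslant \left( \frac{\mu _G(B_{2\alpha n})}{\mu _G(B_{\alpha n})}\right) ^{1/2} \Vert \lambda _G(\mu )\Vert _2^n \leqslant K_0^{\frac{1}{2}\left( \frac{\alpha n}{D_1}+1\right) }\Vert \lambda _G(\mu )\Vert _2^n, \end{aligned}$$

so that \(\limsup _{n \rightarrow \infty } {\mathbb {P}}(d(R_no,o) \leqslant \alpha n)^{\frac{1}{n}} \leqslant K_0^{\frac{\alpha }{2D_1}}\Vert \lambda _G(\mu )\Vert _2\), which is \(<1\) precisely when \(\alpha <\frac{2D_1}{\ln K_0}\ln \frac{1}{\Vert \lambda _G(\mu )\Vert _2}\).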

The result follows in view of Kingman's subadditive ergodic theorem: if we had \(\ell (\mu )<\alpha \) for some such \(\alpha \), then by (1.1) the probabilities \({\mathbb {P}}(d(R_no,o) \leqslant \alpha n)\) would converge to 1, contradicting (6.16). \(\square \)

Remark 6.10

In fact, the estimate (6.16) above provides a lower bound \(L_0>0\) for a region of type \([0,L_0)\) on which the large deviation rate function of the process \((\frac{1}{n}\kappa (R_n))_{n \geqslant 1}\) is positive. Such a lower bound is, a priori, stronger than a lower bound for the drift \(\ell (\mu )\). However, the recent works [7] (under a finite exponential moment assumption) and [32] (under a finite first order moment assumption) identify the drift \(\ell (\mu )\) as the smallest real r such that the rate function I is positive on [0, r).

On the other hand, the fact that Proposition 6.8 provides an explicit region of positivity of I allows one, for example, to obtain explicit constants in [49, Theorem 1.2] under our assumptions.

6.3 Proof of Theorem 1.10

We denote by \(D(\cdot , \cdot )\) the positive function given by Theorem 4.1. The hypotheses of Theorem 1.10 allow us to apply Theorem 4.1 to deduce that for every \(r \in [0,1)\) the probability measure \(\mu \) satisfies the ULD hypothesis with

$$\begin{aligned} p_n(\epsilon )=2\exp \left( \frac{-n\epsilon ^2}{\kappa _{S}^2D(\kappa _{S},\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2)}\right) . \end{aligned}$$
(6.17)

Applying Proposition 6.1, we deduce that for every integer

$$\begin{aligned} n > 2+ \frac{16\delta }{\ell (\mu )}, \end{aligned}$$
(6.18)

we have

$$\begin{aligned}&(\mu ^{*n}\otimes \mu ^{*n} )\left\{ (\gamma _1, \gamma _2)\,|\, \langle \gamma _1, \gamma _2 \rangle \, \text {is free} \right\}>\\&\quad 1-25p_{\lfloor n/2 \rfloor }(\ell (\mu )/8)>1-25p_{n/4}(\ell (\mu )/8). \end{aligned}$$

Therefore, using (6.17) and the bound provided by Proposition 6.8, setting \(\lambda _r=\Vert \lambda _G(\mu _{r,{\text {lazy}}})\Vert _2\) we obtain that for every \(r\in [0,1)\) and for every \(n> 2+\frac{8 \delta (1-r) \ln K_0 }{D_1 \ln \frac{1}{\lambda _r}}\), two independent random walks generate a free subgroup with probability

$$\begin{aligned} > 1-50\exp \left( \frac{-nD_1^2 (\ln \lambda _r)^2 (1-\sqrt{\lambda _r})^4}{2^{11} (\ln K_0)^2 (1-r)^2 \kappa _S^2 (16 \ln ^+(\kappa _S)+\frac{8A_0}{3}+33)^2}\right) . \end{aligned}$$
(6.19)
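To indicate where the range of n and the exponent in (6.19) come from (a sketch keeping \(D(\cdot ,\cdot )\) abstract): Proposition 6.8 gives \(\ell (\mu ) \geqslant \frac{2D_1}{(1-r)\ln K_0}\ln \frac{1}{\lambda _r}\), so that \(2+\frac{16\delta }{\ell (\mu )} \leqslant 2+\frac{8\delta (1-r)\ln K_0}{D_1 \ln \frac{1}{\lambda _r}}\), which accounts for the stated range of n. Moreover, by (6.17),

$$\begin{aligned} 25p_{n/4}(\ell (\mu )/8)=50\exp \left( \frac{-n\ell (\mu )^2}{2^{8}\kappa _S^2D(\kappa _S,\lambda _r)}\right) \leqslant 50\exp \left( \frac{-nD_1^2(\ln \lambda _r)^2}{2^{6}(1-r)^2(\ln K_0)^2\kappa _S^2D(\kappa _S,\lambda _r)}\right) , \end{aligned}$$

and (6.19) follows once the expression of \(D(\kappa _S,\lambda _r)\) coming from Theorem 4.1 is inserted.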

Specifying to \(r=1/2\), the result follows by taking the functions \(n_0(\cdot )\) and \(T(\cdot , \cdot )\) to be

$$\begin{aligned} T(\kappa , \lambda )&= {\tilde{B}}_M \frac{(\ln \lambda )^2 (1-\sqrt{\lambda })^4}{\kappa ^2(\ln ^+\kappa +{\tilde{A}}_M)^2},\\ n_0(\lambda )&= 2-\frac{{\tilde{C}}_M}{\ln \lambda }, \end{aligned}$$

where the constants \({\tilde{A}}_M,{\tilde{B}}_M,{\tilde{C}}_M\) are given by

$$\begin{aligned} {\tilde{A}}_M=A_0/6 + 33/16, \quad {\tilde{B}}_M=\frac{D_1^2}{2^{17} (\ln K_0)^2}, \quad {\tilde{C}}_M=\frac{4\delta \ln K_0}{D_1}, \end{aligned}$$
(6.20)

and for clarity, we recall that

  • (§6.2) \(D_1=\max \{1,2{\text {diam}}(G\backslash M)\}\),

  • (§6.2) \(K_0\) satisfies \(K_0\leqslant \frac{\mu _G(B_{13D_1/2})}{\mu _G(B_{D_1/2})}\), and

  • (§4.1) \(A_0\) is the doubling constant given in Remark 4.3.
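As a check of the arithmetic leading to (6.20), which uses nothing beyond (6.19): at \(r=1/2\) one has \((1-r)^2=\frac{1}{4}\), and factoring 16 out of the last bracket of (6.19) gives

$$\begin{aligned} 2^{11}\cdot \frac{1}{4}\cdot \left( 16 \ln ^+\kappa _S+\frac{8A_0}{3}+33\right) ^2 =2^{9}\cdot 2^{8}\left( \ln ^+\kappa _S+\frac{A_0}{6}+\frac{33}{16}\right) ^2 =2^{17}\left( \ln ^+\kappa _S+{\tilde{A}}_M\right) ^2, \end{aligned}$$

which accounts for \({\tilde{A}}_M\) and \({\tilde{B}}_M\). Similarly, at \(r=1/2\) the condition \(n> 2+\frac{8 \delta (1-r) \ln K_0 }{D_1 \ln \frac{1}{\lambda _r}}\) reads \(n>2-\frac{4\delta \ln K_0}{D_1 \ln \lambda _{1/2}}=n_0(\lambda _{1/2})\), which accounts for \({\tilde{C}}_M\).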

Finally, the expression in Remark 1.11 follows by taking

$$\begin{aligned} A_M=\max \left\{ {\tilde{C}}_M,\frac{{\tilde{A}}_M^2}{{\tilde{B}}_M}\right\} . \end{aligned}$$
(6.21)

Remark 6.11

The explicit bounds on the probability mentioned in §1.2.2 for hyperbolic groups and rank-one linear groups are obtained by plugging the upper bounds (5.4) and (5.5) on \(\lambda _r\) into (6.19) in the proof above. Similarly, for the range of validity of \(n \in {\mathbb {N}}\), one can plug (5.4) and (5.5) into (6.13) to get an explicit lower bound for \(\ell (\mu )\), which in turn provides an upper bound for the right-hand side of (6.18).