Random walks on hyperbolic spaces: Concentration inequalities and probabilistic Tits alternative

The goal of this article is two-fold: in a first part, we prove Azuma–Hoeffding type concentration inequalities around the drift for the displacement of non-elementary random walks on hyperbolic spaces. For a proper hyperbolic space M, we obtain explicit bounds that depend only on M, the size of support of the measure as in the classical case of sums of independent random variables, and on the norm of the driving probability measure in the left regular representation of the group of isometries. We obtain uniform bounds in the case of hyperbolic groups and effective bounds for simple linear groups of rank-one. In a second part, using our concentration inequalities, we give quantitative finite-time estimates on the probability that two independent random walks on the isometry group of a hyperbolic space generate a free non-abelian subgroup. Our concentration results follow from a more general, but less explicit statement that we prove for cocycles which satisfy a certain cohomological equation. For example, this also allows us to obtain subgaussian concentration bounds around the top Lyapunov exponent of random matrix products in arbitrary dimension.


Introduction
Let $(M, d)$ be a metric space and $\mathrm{Isom}(M)$ the group of isometries of $M$. Consider a finitely supported probability measure $\mu$ on $\mathrm{Isom}(M)$, let $(X_i)_{i \in \mathbb{N}}$ be a sequence of independent random variables with distribution $\mu$, and denote by $R_n$ the random variable given by the product $X_1 \cdots X_n$. For any basepoint $o \in M$, the subadditive ergodic theorem implies that, almost surely,
$$\frac{1}{n}\, d(R_n o, o) \xrightarrow[n \to \infty]{} \ell(\mu), \tag{1.1}$$
where the constant $\ell(\mu) \geq 0$ is called the drift of $\mu$. This can be seen as a generalization of the classical law of large numbers, which corresponds to the case $M = \mathbb{R}$ and $\mu$ supported on the translations $\mathbb{R} < \mathrm{Isom}(\mathbb{R})$. Understanding various aspects of the convergence (1.1) (e.g. central limit theorem (CLT), large deviation principles (LDP), Azuma-Hoeffding-type concentration inequalities) in the aforementioned special case constitutes a fundamental part of classical probability theory. Various other cases have attracted considerable attention more recently: starting in the 1960s with the work of Furstenberg, Kesten, Oseledets, Kaimanovich [22,24,40,54] for symmetric spaces of non-compact type and with Dynkin-Malyutov [20], Furstenberg [23], Kaimanovich-Vershik [41] and others for random walks on countable groups. More recently, for general metric spaces with an assumption of coarse negative curvature (namely Gromov hyperbolicity), a number of analogues of the classical results were proven, including CLTs [4,51], local limit theorems [30], and, closer to our considerations, LDPs and exponential decay results [7,31]. Our goal in this paper is to establish Hoeffding-type concentration inequalities in the general setting of random walks on hyperbolic spaces. To the best of our knowledge, this aspect of the classical theory is far less developed in our setting.
Concentration inequalities around the mean $\ell(\mu)$ have two distinctive features compared to asymptotic large deviation estimates. On the one hand, they are large deviation bounds for the fluctuations of the distance of the random walk that are valid uniformly over all times, as opposed to asymptotic estimates. On the other hand, the exponential decay rate is expressed as an explicit function of the normalized deviation distance $t$. As such, these inequalities have been useful in the classical case from both pure-mathematical and applied or computational perspectives. Accordingly, one of the main reasons that we mostly focus our attention in this article on proper Gromov hyperbolic spaces is that, by following a geometric and harmonic analytic technique of Benoist-Quint [4], we are able to exploit their geometry and consequently obtain explicit concentration estimates. We also obtain subgaussian concentration estimates for non-proper Gromov hyperbolic spaces and random matrix products, but with less explicit bounds. These results are also new and discussed later in the introduction.
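The convergence of the normalized displacement to the drift can be illustrated numerically. The following is a minimal sketch, assuming the simplest hyperbolic example: the free group on two generators acting on its Cayley tree, with basepoint the identity, so that the displacement of an element is its reduced word length. The names `multiply_right`, `displacement_after` and `estimate_drift` are hypothetical helpers introduced only for this illustration. For the uniform measure on the four generators, the distance to the identity increases with probability 3/4 and decreases with probability 1/4 away from the identity, so the drift is 1/2.

```python
import random

# Generators of the free group F_2; capital letters denote inverses.
GENERATORS = ["a", "A", "b", "B"]
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def multiply_right(word, letter):
    """Right-multiply a reduced word (a list of letters) by one generator."""
    if word and word[-1] == INVERSE[letter]:
        word.pop()            # the new letter cancels the last one
    else:
        word.append(letter)   # the word stays reduced
    return word


def displacement_after(n, rng):
    """Word length of R_n = X_1 ... X_n for i.i.d. uniform generators X_i."""
    word = []
    for _ in range(n):
        multiply_right(word, rng.choice(GENERATORS))
    return len(word)


def estimate_drift(n, samples, seed=0):
    """Monte Carlo average of the normalized displacement after n steps."""
    rng = random.Random(seed)
    total = sum(displacement_after(n, rng) for _ in range(samples))
    return total / (samples * n)


if __name__ == "__main__":
    print(estimate_drift(n=2000, samples=50))
```

The estimate should be close to 1/2 for large `n`; the concentration inequalities discussed in this article quantify how unlikely a deviation of size `t` from the drift is at a fixed time `n`.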
Our approach consists of proving a general concentration type result for cocycles satisfying a certain cohomological equation. This is in line with Gordin's method for proving the central limit theorem, where the values of cocycles along random walks coming from group actions are related to martingales via a Poisson type equation.
In particular, the solutions by Benoist-Quint of the associated cohomological equations for Busemann and norm cocycles, respectively on the boundary of hyperbolic spaces [4] and projective spaces [3], play a crucial role in the application of our general cocycle-concentration results to these settings. We slightly extend this solution to adapt it to our purposes, and in the case of proper hyperbolic spaces, we get explicit bounds on its size. These bounds involve the norm $\|\lambda_G(\mu)\|_2$ of the operator $\lambda_G(\mu)$ associated with a probability measure $\mu$ by the regular representation $\lambda_G$ of the isometry group $G = \mathrm{Isom}(M)$. In a later part, we use various versions of uniform Tits alternatives to control the size of $\|\lambda_G(\mu)\|_2$, which in turn yields effective constants, for example in the case of linear groups of rank one, thanks to the works of Breuillard [9,10].
Finally, we give explicit finite-time estimates for the probability that two independent non-elementary random walks on a proper hyperbolic space generate a free subgroup. We deduce this result from our concentration bounds together with a more general statement linking uniform large deviations with free subgroups generated by samplings of random walks. Our result (Theorem 1.10) quantifies some cases of several known probabilistic Tits alternatives proven in [1,29,57].
Let us now state our first main result, some of its consequences and related remarks.

Subgaussian concentration estimates for random walks on hyperbolic spaces
We first introduce some notation and definitions. Let $(M, d)$ be a proper metric space and denote by $G$ its group of isometries. It is a locally compact group and we denote by $\mu_G$ a Haar measure on $G$. For every $r \in [0, 1]$, we denote $\mu_{r,\mathrm{lazy}} = r\delta_{\mathrm{id}} + (1 - r)\mu$. Furthermore, we denote by $\lambda_G(\mu)$ the operator given by the image of the probability measure $\mu$ under the left-regular representation of $G$ on $L^2(G)$. Finally, having fixed a basepoint $o \in M$, for an element $g \in G$, we set $\kappa(g) := d(go, o)$ and for a set $S \subset G$, $\kappa_S := \sup\{\kappa(g) : g \in S\}$. The set $S$ is said to be bounded if $\kappa_S < \infty$.
) with $D(\cdot, \lambda) < \infty$ for every $\lambda \in (0, 1)$ such that for every non-elementary probability measure $\mu$ on $G$ with bounded support $S$, for every $t \geq 0$ and $n \in \mathbb{N}$ we have
for every $r \in [0, 1)$.
This statement will follow from a more general concentration result (Theorem 4.1) for the Busemann cocycle on the horofunction compactification of M.

Remark 1.3
1. (Non-proper case) As mentioned earlier, we also obtain subgaussian concentration estimates without the properness assumption, but in this case the dependence on $\mu$ on the right-hand side of (1.2) is less explicit (Proposition 3.4).
2. (Random walks with unbounded support) It is possible to have a version of our result where the bounded support assumption on the probability measure $\mu$ is replaced by a finite exponential moment assumption and to obtain a Bennett-Bernstein type concentration inequality. However, the constants that appear in that version are more complicated to express. This point is discussed in more detail in Remark 4.9.
In the sequel, we will see that each of the two aspects of the upper bound in Theorem 1.1, namely its subgaussian form and its parameters of dependence, has implications and strengthenings. On the one hand, by combining this upper bound with versions of uniform Tits alternatives in various contexts (which entail uniform bounds for $\|\lambda_G(\mu)\|_2$, see Lemma 5.2), we will obtain uniform concentration estimates for a class of driving probability measures, see Corollaries 1.4 and 1.6. On the other hand, the subgaussian character allows us, for instance, to provide a global quadratic lower bound (see Corollary 1.8) for the rate function of large deviations, recently studied in this setting by [7]. Let us now explain these consequences.

The case of hyperbolic and rank-one linear groups
Firstly, specializing Theorem 1.1 to hyperbolic groups, and using Koubi's uniform Tits alternative [43, Theorem 5.1], we obtain the following more precise concentration result for random walks on hyperbolic spaces. Then there exists a constant $A_M > 0$ such that for any group $\Gamma < G$ that acts properly and cocompactly on $M$, there exist constants $\alpha > 0$ and $N \in \mathbb{N}$ depending only on $\Gamma$ such that for every non-elementary probability measure $\mu$ of finite support $S$ generating $\Gamma$, for every $t > 0$ and $n \in \mathbb{N}$, setting $m_\mu = \min_{g \in S} \mu(g)$, we have
Specializing Theorem 1.1 to rank-one matrix groups and using the strong Tits alternative of Breuillard [9,10], we obtain concentration estimates for random matrix products in discrete non-amenable subgroups of rank-one semisimple algebraic groups. A further aspect of the following corollary is that, thanks to the work of Breuillard, the implied constants can be effectively calculated.
We need some notation to state the next corollary. Let $k$ be a local field (i.e. in characteristic zero, $\mathbb{R}$, $\mathbb{C}$ or a finite extension of $\mathbb{Q}_p$ for a prime number $p$, and in positive characteristic, a finite extension of $\mathbb{F}_p((T))$). We denote by $\|\cdot\|$ the canonical norm on $k^d$ for a fixed discrete valuation on $k$ and consider the associated operator norm on the space of $d \times d$ matrices. Moreover, if $S$ is a finite subset of $\mathrm{Mat}_d(k)$, we denote $\kappa_S := \sup\{\ln \|g\| : g \in S\}$. Finally, if $\mu$ is a probability measure with finite first order moment on $\mathrm{GL}_d(k)$, we denote by $\ell(\mu)$ the top Lyapunov exponent, i.e. the almost sure limit of $\frac{1}{n} \ln \|R_n\|$.
Corollary 1.6 Let $k$ be a local field and $H \subseteq \mathrm{SL}_d$ be a connected semisimple linear algebraic group of $k$-rank one defined over $k$. For every $d \in \mathbb{N}$, there exist constants $\alpha_d > 0$, $N_d \in \mathbb{N}$ depending only on the dimension $d$ and constants $A = A(H, k)$ such that for every finitely supported probability measure $\mu$ whose support generates a non-amenable discrete subgroup of $H(k)$, for every $t > 0$ and $n \in \mathbb{N}$, the following holds:

Remark 1.7 (About the discreteness assumption)
1. Both of the above corollaries are obtained from Theorem 1.1 in the following way: the respective versions of Tits alternatives allow us to deduce bounds on the norm $\|\lambda_\Gamma(\mu)\|$ of the regular representation on $\ell^2(\Gamma)$, which is equal to $\|\lambda_G(\mu)\|$ thanks to the discreteness assumption. In general, even though we have uniform upper bounds for $\|\lambda_\Gamma(\mu)\|$, we are not able to transfer this to a bound on $\|\lambda_G(\mu)\|$ without the discreteness assumption. Indeed, by [13,44], in any connected semisimple Lie group $G$, for any element $g \in G$, one can find pairs of elements $\{a_n, b_n\}$ that converge to $g$ and that generate a non-abelian free group, so that for the uniform probability measure $\mu_n$ supported on $\{a_n, b_n, a_n^{-1}$
2. We also note that under the discreteness assumption, the fact that the support $S$ generates a non-elementary group implies, thanks to various versions of the Margulis Lemma, a positive lower bound for $\kappa_S$. This lower bound depends in Corollary 1.4 on some parameters of $M$ and the group generated by $S$ (see [6, Theorem 5.21]). In Corollary 1.6, it depends only on $H(k)$ (see e.g. [2, Chapter 8]).

Rate function of LDP
We now mention a consequence of Theorem 1.1 concerning the rate function of large deviation principles of random walks on hyperbolic spaces recently studied by [7]. The authors prove that the sequence of random variables $\frac{1}{n}\kappa(R_n)$ satisfies a large deviation principle with proper convex rate function $I_\mu : [0, \infty) \to [0, +\infty]$ vanishing only at the drift $\ell(\mu)$. Recall that this means that $I_\mu$ is a lower-semicontinuous function such that for every measurable subset $J$ of $\mathbb{R}$, we have
$$-\inf_{\alpha \in \mathrm{int}(J)} I_\mu(\alpha) \leq \liminf_{n \to \infty} \frac{1}{n} \ln \mathbb{P}\Big(\tfrac{1}{n}\kappa(R_n) \in J\Big) \leq \limsup_{n \to \infty} \frac{1}{n} \ln \mathbb{P}\Big(\tfrac{1}{n}\kappa(R_n) \in J\Big) \leq -\inf_{\alpha \in \overline{J}} I_\mu(\alpha), \tag{1.4}$$
where $\mathrm{int}(J)$ denotes the interior and $\overline{J}$ the closure of $J$. To the best of our knowledge, no explicit global estimate for the rate function exists in the literature. Theorem 1.1 allows us to give an explicit quadratic lower bound for the rate function $I_\mu$ in our setting, i.e. when $M$ is proper and the non-elementary probability measure $\mu$ has bounded support.
The proof of this corollary is immediate from the property (1.4) defining the function I μ and the estimate given by Theorem 1.1.

Remark 1.9
In the general case of random walks with finite exponential moment, one can clearly not get such a quadratic lower bound, see Remark 4.9 for the type of global lower bound that one can obtain using our methods.

Quantitative probabilistic Tits alternative
It has been known since the foundational work of Gromov [34] that groups acting non-elementarily on hyperbolic spaces contain non-abelian free subgroups. The main result of this part is a probabilistic quantification of this fact: if we sample two independent random walks at their $n$-th steps, the probability that the two elements generate a free group of rank two is exponentially close to one. Moreover, an important aspect is that this probability is explicitly described in terms of the norm of the driving measure $\mu$ in the regular representation and the size of its support.

Probabilistic free-subgroup theorem
Theorem 1.10 Keep the assumptions of Theorem 1.1. Then, there exist explicit functions $n_0(\cdot)$ and $T(\cdot, \cdot)$, both with values in $(0, +\infty)$, such that for any non-elementary probability measure $\mu$ on $G = \mathrm{Isom}(M)$, denoting by $(R_n)_{n \in \mathbb{N}}$ and $(R'_n)_{n \in \mathbb{N}}$ two independent random walks driven by $\mu$, for every $n > n_0(\|\lambda_G(\mu_{1/2,\mathrm{lazy}})\|_2)$, we have
We proceed with a few remarks on the statement and some consequences.

Remark 1.11 (The explicit estimate)
(i) For the function appearing in the above statement, one can take
where the constant $A_M > 0$ is related only to a doubling constant of the Haar measure on $G$ and to the diameter of $G \backslash M$ (see (6.21) for its expression).
(ii) Unlike in our previous results, the left-hand side of (1.5) is independent of the choice of basepoint $o$. One can therefore replace $\kappa_S$ by the joint minimal displacement $L(S)$ of $S$ ([12]) given by $\inf_{x \in M} \sup_{s \in S} d(sx, x)$, which is independent of any basepoint.
(iii) Finally, the choice of $1/2$ for the lazy random walk $\mu_{1/2,\mathrm{lazy}}$ is for convenience: it ensures that the associated operator norm is strictly less than one (which might not be the case for $\mu$ due to the non-symmetry of $\mu$, see Remarks 4.4 and 6.9).

Remark 1.12
Using similar techniques, one can also prove a more general version of this result where several (more than two) independent copies of random walks, even with different step-distributions, are considered.

Some consequences
• For discrete subgroups of $\mathrm{Isom}(M)$, in the respective settings, using Corollaries 1.4 and 1.6 (see also Remark 1.5), we can deduce an explicit expression for the right-hand side of (1.5) as well as for its range of validity controlled by $n_0(\cdot)$ (see Remark 6.11).
• Moreover, it is known that for a discrete subgroup $\Gamma$ of isometries of a proper geodesic hyperbolic space $M$ such that $\mathrm{Isom}(M)$ acts cocompactly on $M$, the group $\Gamma$ is either virtually nilpotent or non-elementary (see e.g. [15, Corollary 3.13]). Hence Theorem 1.10 can be seen as a quantitative probabilistic Tits alternative for discrete groups of isometries of $M$.
• Theorem 1.10 gives an explicit version of a result by Taylor

Random matrix products
The concentration estimates that we obtain in Sect. 2 for general cocycles also allow us to deduce concentration estimates for random matrix products in arbitrary dimension, but these are less explicit compared to Theorem 1.1. Before stating the result we recall some known facts; we refer to §3.1 for more details. If $\mu$ is a probability measure on $\mathrm{GL}_d(\mathbb{C})$ whose support generates a strongly irreducible and proximal subgroup, then there exists a unique $\mu$-stationary probability measure $\nu$ on the projective space of $\mathbb{C}^d$ ([23,37]). The stationary measure $\nu$ enjoys some regularity properties. It is non-degenerate (i.e. does not charge any proper hyperplane) [23], log-regular under a finite second order moment [3] and Hölder regular under a finite exponential moment assumption [36]. Suppose now that $\mu$ has bounded support and consider
$$c(\mu) := \sup_{x \in \mathbb{C}^d \setminus \{0\}} \int \ln \frac{\|x\| \, \|y\|}{|\langle x, y \rangle|} \, d\nu(\mathbb{C}y).$$
It follows from the aforementioned regularity properties that this quantity is finite. Finally, we denote by $\mu^*$ the pushforward of $\mu$ by the map $g \mapsto g^*$, where $g^*$ is the conjugate-transpose of $g$. With these at hand, we are now ready to state
Proposition 1.13 Let $\mu$ be a boundedly supported probability measure on $\mathrm{GL}_d(\mathbb{C})$ such that the semigroup generated by the support $S$ of $\mu$ is strongly irreducible and proximal. Let $\kappa_S := \max\{\ln \|g\| \vee \ln \|g^{-1}\| ; g \in S\}$ and $c = c(\mu^*)$. Then, for every $t > 0$ and $n \in \mathbb{N}$, we have
In particular, for every $t > 0$ and $n \in \mathbb{N}$ such that $nt \geq \ln d$, the following holds:
In this result, the fact that we have subgaussian estimates for every $t > 0$ small enough can also be deduced from the spectral gap result of Le Page [45] using analytic perturbation methods. We also refer to [3,8] for exponential deviation estimates in a more general setting and to [19, Ch. 5] for local concentrations that are uniform over small neighborhoods of irreducible cocycles.
Remark 1.14 Similarly to Corollary 1.8, the estimate in Proposition 1.13 allows one to obtain a global lower bound (less explicit in its constants compared to the aforementioned corollary) for the rate function of log-norms of random matrix products studied in [55,60] (see also [55,Corollary 4.17]).
We end the introduction by mentioning that
• the methods we use to prove Theorem 1.1 allow us to provide an explicit lower bound for the bottom of the support of the Hausdorff spectrum of the harmonic measure, equivalently, for the exponent with which the Frostman property holds (see §4.2 and see also Tanaka [56] for a thorough discussion of multifractal analysis of the harmonic measure in the particular case of hyperbolic groups);
• Theorem 1.1 itself has a direct application to the continuity of the drift (§4.3);
• in view of Horbez's work [39], it seems possible that our results in §2 can be used to obtain subgaussian concentration estimates in the setting of random walks on mapping class groups and on the group $\mathrm{Out}(F_N)$ of outer automorphisms of a non-abelian free group.

Organization
The article is organized as follows. In Sect. 2, we prove concentration estimates for a general cocycle that satisfies a certain cohomological equation (Proposition 2.1). In Sect. 3, we deduce non-explicit concentration estimates for random matrix products in arbitrary dimension (Proposition 1.13) and for random walks on hyperbolic spaces (Proposition 3.4). In Sect. 4, we prove Theorem 1.1. In Sect. 5, we prove Corollaries 1.4 and 1.6. Finally in Sect. 6, we deduce Theorem 1.10 from Theorem 1.1, a uniform positive lower bound on the drift (Proposition 6.8) and a general result estimating the likelihood of obtaining free subgroups from random walks based on uniform large deviation estimates (Proposition 6.1).

Concentration inequalities for cocycles satisfying a Poisson equation
The goal of this section is to prove Proposition 2.1, which yields concentration inequalities for values of a cocycle for which the associated Poisson equation has a bounded measurable solution. This result will provide the basis for the rest of the article, where we will obtain more precise versions in the particular setups discussed in the Introduction. We note that this section is inspired by the work of Furstenberg-Kifer [25], of which it can be seen as a quantitative analogue under an additional assumption (see Remark 2.2).
We start by recalling some standard terminology. Let $G$ be a Polish group (endowed with the Borel $\sigma$-algebra) and $X$ a standard Borel space endowed with a measurable action of $G$. We shall refer to such a space as a $G$-space. A function $\sigma : G \times X \to \mathbb{R}$ is said to be an additive cocycle if it satisfies $\sigma(g_1 g_2, x) = \sigma(g_1, g_2 x) + \sigma(g_2, x)$ for every $g_1, g_2 \in G$ and $x \in X$. All cocycles are assumed to be measurable. Given a probability measure $\mu$ on $G$, a probability measure $\nu$ on $X$ is said to be $\mu$-stationary if for every bounded measurable function $\varphi$, we have
$$\int_X \int_G \varphi(gx) \, d\mu(g) \, d\nu(x) = \int_X \varphi(x) \, d\nu(x).$$
We denote by $P_\mu$ the Markov operator acting on bounded measurable functions on $X$ by $P_\mu \varphi(x) = \int_G \varphi(gx) \, d\mu(g)$. Finally, denoting by $(X_i)_{i \in \mathbb{N}}$ a sequence of independent $G$-valued random variables with distribution $\mu$, we write $L_n$ for the left product $X_n \cdots X_1$. Although $L_n$ and $R_n$ have the same distribution, it will be more convenient in this section to work with the left random walk $L_n$.
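The cocycle identity, and the way it turns the value of a cocycle along the left random walk into a sum of increments, can be checked on a concrete example. The following is a minimal sketch, assuming as an illustrative instance the log-norm cocycle $\sigma(g, [v]) = \ln(\|gv\|/\|v\|)$ for real invertible $2 \times 2$ matrices acting on nonzero vectors; the helper names `apply`, `matmul`, `sigma` and `cocycle_sum_along_left_walk` are hypothetical and not taken from the article.

```python
import math


def apply(g, v):
    """Apply a 2x2 matrix g (given as a tuple of rows) to a vector v."""
    return (g[0][0] * v[0] + g[0][1] * v[1],
            g[1][0] * v[0] + g[1][1] * v[1])


def matmul(g, h):
    """Matrix product g h of two 2x2 matrices."""
    return tuple(
        tuple(sum(g[i][k] * h[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def sigma(g, v):
    """Log-norm cocycle: sigma(g, [v]) = ln(|g v| / |v|)."""
    return math.log(math.hypot(*apply(g, v)) / math.hypot(*v))


def cocycle_sum_along_left_walk(steps, v):
    """Sum of increments sigma(X_k, L_{k-1} x) for steps = [X_1, ..., X_n].

    By the cocycle identity this equals sigma(L_n, x) with L_n = X_n ... X_1.
    """
    total = 0.0
    for g in steps:
        total += sigma(g, v)
        v = apply(g, v)  # representative of the point L_k x
    return total


g1 = ((2.0, 1.0), (1.0, 1.0))
g2 = ((1.0, 3.0), (0.0, 1.0))
x = (0.6, -0.8)

lhs = sigma(matmul(g1, g2), x)                # sigma(g1 g2, x)
rhs = sigma(g1, apply(g2, x)) + sigma(g2, x)  # sigma(g1, g2 x) + sigma(g2, x)
```

Here `lhs` and `rhs` agree up to floating-point error, and `cocycle_sum_along_left_walk([g2, g1], x)` returns the same value, since the left product of the steps `[g2, g1]` is `g1 g2`. This additive decomposition is what makes a martingale approach available.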

Proposition 2.1 Let G be a Polish group, X a G-space and σ : G × X → R a bounded additive cocycle. Let μ be a probability measure on G with support S. Denote by
Let ν be a μ-stationary probability measure on X and

Assume that the set E of bounded measurable solutions ψ of the Poisson equation
is non-empty and let $c := \inf\{\|\psi\|_\infty : \psi \in E\}$. Then, for every $t > 0$, $n \in \mathbb{N}$, and $x \in X$ we have

Remark 2.2 1. Our assumption (2.1) implies that there is a unique cocycle average
in the sense of [3, §3].
2. This result can be seen as an abstract quantitative refinement of [25, Theorem 2.1] under the assumption that the expected increase function is cohomologous to a constant.
The proof of the previous result is based on the following general probabilistic ingredient. We start by recalling some standard terminology on Markov chains. Let $M$ be a standard Borel space and $P$ a Markov operator on $M$, i.e. a measurable map $x \mapsto P_x$ from $M$ to the space of probability measures on $M$. This data naturally defines an operator on the space of bounded Borel functions on $M$ by $\varphi \mapsto P\varphi$, where $P\varphi(x) = \int_M \varphi(y) \, dP_x(y)$. Given $x \in M$, we denote by $\mathbb{P}_x$ the law of the Markov chain $(Z_n)_n$ on the space of trajectories, i.e. $M^{\mathbb{N}}$, and by $\mathbb{E}_x$ the associated expectation operator. We say that a probability measure $\pi$ is invariant (or stationary) under the Markov operator $P$ if $\int_M P\varphi \, d\pi = \int_M \varphi \, d\pi$ for every bounded measurable function $\varphi$ on $M$.
Then, for every $t > 0$, $n \in \mathbb{N}$ and $x \in M$, the following inequality holds
. This is a particular case of Benoist-Quint's [3, Proposition 3.1]. In Proposition 2.3, thanks to the stronger assumption (2.2), one obtains subgaussian exponential decay with explicit constants.
2. In some particular cases, powerful concentration inequalities exist for the sums of any function along the Markov chain [18,28]. They are not applicable here since our Markov chains are not geometrically ergodic. On the other hand, the particular requirement (2.2) on the function $f$ allows us to use the usual Hoeffding inequality for martingales and thereby deduce the previous concentration estimates in the generality of Markov chains that we consider.
Proof of Proposition 2.1 We start by defining the appropriate objects to which we will apply Proposition 2.3. We take the standard Borel space $M$ to be $S \times X$ and $P$ the Markov operator defined by for every bounded measurable function $f$ on $M$. The associated Markov chain $(Z_n)_{n \in \mathbb{N}}$ on $M$ starting from $Z_0 = (e, x)$ is the process where the $g_i$'s are iid random variables on $G$ with distribution $\mu$. Let $\pi$ be the probability measure on $M$ defined by for every bounded measurable $f$ on $M$. Since $\nu$ is $\mu$-stationary, one readily checks that $\pi$ is stationary for the Markov operator $P$. Let now The following properties are immediate to check Finally, we check that if (2.1) holds for some $\psi$, then (2.2) holds. Indeed, let One readily checks that $P\psi = P_\mu \psi$ and $Pf(g, x) = \int_G \sigma(g, x) \, d\mu(g)$. Thus, by

Proof of Proposition 2.3 Let α := M f dπ and φ as in the statement so that
On the one hand, the sequence is a martingale with bounded differences. Applying the Azuma-Hoeffding concentration inequality for martingales with bounded differences (see for instance [52, Lemma 4.1]), we get that for every $t > 0$ and $n \in \mathbb{N}$, On the other hand, the following crude upper bound holds for $V_n$: . Combining this fact with (2.3) and (2.4), we get that for every $t > 0$ and every . This shows the desired inequality in the case n The desired estimate holds trivially in this case.
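The martingale step can be made concrete with the textbook form of the Azuma-Hoeffding inequality: if a martingale has differences bounded in absolute value by $c_1, \dots, c_n$, then $\mathbb{P}(|M_n - M_0| \geq s) \leq 2\exp(-s^2 / (2\sum_k c_k^2))$. The following is a minimal sketch, assuming this standard form (which need not match the exact normalization of [52, Lemma 4.1]) and comparing it with the exact tail of a fair $\pm 1$ walk; the function names are hypothetical.

```python
import math


def azuma_hoeffding_bound(s, diff_bounds):
    """Two-sided Azuma-Hoeffding bound 2 exp(-s^2 / (2 sum c_k^2)), capped at 1."""
    total = sum(c * c for c in diff_bounds)
    return min(1.0, 2.0 * math.exp(-s * s / (2.0 * total)))


def exact_two_sided_tail(n, s):
    """Exact P(|S_n| >= s) for S_n a sum of n independent fair +1/-1 steps."""
    # S_n = 2K - n with K binomial(n, 1/2), so we count the admissible values of K.
    count = sum(math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= s)
    return count / 2 ** n


if __name__ == "__main__":
    n, s = 100, 30
    print(exact_two_sided_tail(n, s), azuma_hoeffding_bound(s, [1.0] * n))
```

For a deviation `s = n t`, the bound takes the subgaussian shape $2\exp(-n t^2 / (2c^2))$ when all differences are bounded by `c`, which is the kind of decay appearing in the estimates of this section.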

Applications to random matrix products and random walks on hyperbolic spaces
The goal of this section is to obtain two consequences of Proposition 2.1 in the settings of random matrix products and random walks on hyperbolic spaces M. For the latter, in this section, we will not suppose any properness assumption, and relatedly, we are only able to obtain non-explicit concentration estimates. In §4, we will upgrade those to more explicit estimates in the case of proper hyperbolic spaces.

Subgaussian concentrations for random matrix products
Let $d \geq 1$ be an integer. We consider $\mathbb{C}^d$ endowed with the canonical Hermitian structure and $M_d(\mathbb{C})$ with the induced operator norm. For simplicity, we denote by $\|\cdot\|$ both norms on $\mathbb{C}^d$ and $M_d(\mathbb{C})$. We denote by $X = P(\mathbb{C}^d)$ the projective space of $\mathbb{C}^d$ and we endow it with the standard metric given by
$$d([x], [y]) := \frac{\|x \wedge y\|}{\|x\| \, \|y\|},$$
where the norm $\|\cdot\|$ in the numerator is the canonical norm on $\bigwedge^2 \mathbb{C}^d$, $[x] = \mathbb{C}x$ and $[y] = \mathbb{C}y$. A probability measure $\mu$ on $\mathrm{GL}_d(\mathbb{C})$ is said to be (strongly-)irreducible if the support $S$ of $\mu$ does not fix a (finite union of) non-trivial proper subspace(s) of $\mathbb{C}^d$. An irreducible probability measure $\mu$ is said to be proximal if the closure $\overline{\mathbb{C}G_\mu^+}$ in $M_d(\mathbb{C})$ of $\mathbb{C}G_\mu^+$, where $G_\mu^+$ is the semigroup generated by the support of $\mu$, contains a rank-one linear transformation.
A probability measure $\nu$ on $X$ is said to be $\mu$-stationary if it is stationary for the Markov operator $P_\mu$ associated to $\mu$. We recall that for a strongly irreducible and proximal probability measure $\mu$ on $\mathrm{GL}_d(\mathbb{C})$, there exists a unique $\mu$-stationary probability measure $\nu$ on $X$ [23,37]. We denote by $\mu^*$ the image of $\mu$ under the map $g \mapsto g^*$, where $g^*$ denotes the conjugate-transpose of $g$, and by $\nu^*$ the unique stationary measure of $\mu^*$ (which is also proximal and strongly irreducible).
We denote by $\sigma : \mathrm{GL}_d(\mathbb{C}) \times X \to \mathbb{R}$ the norm cocycle given by $\sigma(g, [x]) = \ln \frac{\|gx\|}{\|x\|}$. The solution of the Poisson equation (2.1) for the norm cocycle is closely related to regularity properties of the stationary measure $\nu$ on $X$. Indeed, when $\mu$ has an exponential moment, (2.1) can be solved using the result of Le Page [45] establishing a spectral gap for the Markov operator $P_\mu$ acting on some Hölder functions on $X$. As proved by Guivarc'h [36], this spectral gap property implies the Hölder regularity of $\nu$. When $\mu$ has a finite second order moment, Benoist-Quint [3] solved the same equation by using and proving the log-regularity of the stationary measure $\nu$. We will rely on their results.
By [3], the following quantity is finite for every $x \in X$ and defines a continuous function $\psi$ on $X$. Moreover, $\psi$ satisfies the cohomological equation . This fact plays the key role in the proof of the following result:
Proof of Proposition 1.13 We will apply Proposition 2.1 with $G = \mathrm{GL}_d(\mathbb{C})$, $X = P(\mathbb{C}^d)$, and the norm cocycle $\sigma : G \times X \to \mathbb{R}$. Observe that for every $g \in G$ and $x \in X$, Furthermore, the equation (3.2) shows that the hypothesis (2.1) of Proposition 2.1 holds, and consequently, we deduce that for every ). This proves the first estimate. To get the concentration estimates for the matrix norm of $L_n$, consider the canonical basis $e_1, \dots, e_d$ of $\mathbb{C}^d$. For every $g \in G$, we have (3.4) Suppose that $nt \geq \ln d$. Then $nt - (\ln d)/2 \geq \frac{nt}{2}$ and hence, by combining (3.3) and (3.4), we get that as claimed.
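The passage from vector norms to the matrix norm can be illustrated with a standard comparison: the operator norm is at most the Frobenius norm, hence $\ln \|g\| \leq \max_i \ln \|g e_i\| + \frac{1}{2}\ln d$. The following is a minimal sketch, assuming that a comparison of this type is what produces the $(\ln d)/2$ term, and restricted to real $2 \times 2$ matrices so that the operator norm has a closed form; the helper names are hypothetical.

```python
import math


def column_norms(g):
    """Euclidean norms of g e_1 and g e_2 for a real 2x2 matrix g (tuple of rows)."""
    return [math.hypot(g[0][j], g[1][j]) for j in range(2)]


def operator_norm_2x2(g):
    """Largest singular value of a real 2x2 matrix.

    The eigenvalues of g^T g have sum equal to the squared Frobenius norm
    and product equal to det(g)^2, which gives a closed form.
    """
    frob_sq = sum(entry * entry for row in g for entry in row)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    disc = max(frob_sq * frob_sq - 4.0 * det * det, 0.0)
    return math.sqrt((frob_sq + math.sqrt(disc)) / 2.0)


def shifted_threshold(n, t, d):
    """The quantity n t - (ln d)/2 left after absorbing the dimension term."""
    return n * t - 0.5 * math.log(d)


g = ((1.0, 2.0), (3.0, 4.0))
log_norm = math.log(operator_norm_2x2(g))
log_columns = max(math.log(c) for c in column_norms(g))
```

With `d = 2`, one has `log_columns <= log_norm <= log_columns + 0.5 * math.log(2)`, and whenever `n * t >= math.log(d)` the value `shifted_threshold(n, t, d)` is at least `n * t / 2`, which is the elementary step used at the end of the proof.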

Application to random walks on hyperbolic spaces
The goal of this part is to deduce concentration estimates for non-elementary random walks on (not necessarily proper) geodesic hyperbolic spaces.
The main tool is Proposition 2.1 that we will apply to the horofunction compactification M h and Busemann cocycle σ of a separable geodesic hyperbolic metric space.
The key point in this application is to solve the cohomological equation (2.1) in this setting. This was previously done by Benoist-Quint [4] when M is proper; they gave a solution ψ on ∂ h M. A partial extension of this solution to ∂ h M was used by Horbez [39] in the non-proper setting. We will observe here that ψ extends further to a solution on the full space M h ; this will be more convenient for our purpose.
Let us start by recalling some definitions. Let (M, d) be a separable metric space and denote by Lip 1 (M) the set of real valued Lipschitz functions on M with Lipschitz constant 1, endowed with the topology of pointwise convergence.
where $(\cdot|\cdot)_\cdot$ is the Gromov product given by $(x|y)_o := \frac{1}{2}\big(d(x, o) + d(y, o) - d(x, y)\big)$. For simplicity, we will often omit the basepoint $o$ from the notation. We refer to [16] for general properties of these spaces. An element $\gamma \in \mathrm{Isom}(M)$ is said to be loxodromic if for any $x \in M$, the sequence $(\gamma^n x)_{n \in \mathbb{Z}}$ constitutes a quasi-geodesic (see [16, Ch. 3]). Finally, a set $S$, or equivalently a probability measure with support $S$, is said to be non-elementary if the semigroup generated by $S$ contains at least two independent loxodromic elements. For  [4, Proposition 3.3] or [39, Corollary 2.7]). Recall that $\mu$ is said to have a finite first order moment if $\int \kappa(g) \, d\mu(g) < \infty$ and that the convergence (1.1) to the drift $\ell(\mu) \in \mathbb{R}$ is ensured under this moment assumption.
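The Gromov product is easiest to visualize in a tree, where it measures how long two geodesics from the basepoint travel together. The following is a minimal sketch, assuming the Cayley tree of the free group on two generators with the word metric and basepoint the empty word; it checks the four-point inequality $(x|z)_o \geq \min\{(x|y)_o, (y|z)_o\} - \delta$ with $\delta = 0$, which is one common formulation of $\delta$-hyperbolicity and may differ by a constant from the normalization used in (3.5). All function names are hypothetical.

```python
from itertools import product

INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def reduced_words(max_len):
    """All reduced words of length <= max_len in the free group on a, b."""
    words, frontier = [""], [""]
    for _ in range(max_len):
        frontier = [w + s for w in frontier for s in INVERSE
                    if not w or w[-1] != INVERSE[s]]
        words.extend(frontier)
    return words


def common_prefix_length(x, y):
    """Number of initial letters shared by the reduced words x and y."""
    n = 0
    for s, t in zip(x, y):
        if s != t:
            break
        n += 1
    return n


def distance(x, y):
    """Word metric d(x, y) = |x^{-1} y| on reduced words (a tree metric)."""
    return len(x) + len(y) - 2 * common_prefix_length(x, y)


def gromov_product(x, y, o=""):
    """(x|y)_o = (d(x, o) + d(y, o) - d(x, y)) / 2."""
    return 0.5 * (distance(x, o) + distance(y, o) - distance(x, y))


def satisfies_four_point(words, delta):
    """Check (x|z) >= min((x|y), (y|z)) - delta over all triples of words."""
    return all(
        gromov_product(x, z) >= min(gromov_product(x, y), gromov_product(y, z)) - delta
        for x, y, z in product(words, repeat=3)
    )
```

In this example, with the empty word as basepoint, `gromov_product(x, y)` equals `common_prefix_length(x, y)`, and `satisfies_four_point(reduced_words(2), 0.0)` holds, reflecting that trees are 0-hyperbolic.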

This extends the usual Gromov product on
When $\mu$ has a finite second order moment (i.e. $\int \kappa(g)^2 \, d\mu(g) < +\infty$) and $M$ is proper, Benoist-Quint showed that the function $\psi$ defined on $\partial_h M$ as is bounded, measurable, and it satisfies the Poisson equation where ν is any stationary probability measure on $\partial_h M$ for $\mu^{-1}$ and $\mu^{-1}$ is the non-elementary probability measure given by the image of $\mu$ by the map $g \mapsto g^{-1}$.
For our purposes in the sequel, it will be more convenient to consider the action of the Markov operator $P_\mu$ on the space of bounded measurable functions defined on the whole compactification $M^h$ in the case where $M$ is only a separable and geodesic hyperbolic space. Accordingly, we will verify that the natural extension of the function $\psi$ given by (3.6) to the space $M^h$ yields a solution to the equation (3.7). We summarize these in the next
is a bounded measurable function that satisfies the equation
for every $x \in M^h$. (3.14)
In particular,
Proof In view of the inequality $|\sigma(g, \xi)| \leq \kappa(g)$, valid for every $g \in G$ and $\xi \in M^h$, the estimate (3.14) follows directly from Proposition 2.1 applied to $G = \langle \mathrm{supp}(\mu) \rangle$, the group generated by the support of $\mu$ endowed with the discrete topology, $X = M^h$ and the Busemann cocycle $\sigma : G \times X \to \mathbb{R}$. Finally, (3.15) follows directly by specializing to $\xi = o$.

Main result on concentrations
Let $(M, d)$ be a proper metric space and denote by $G$ its group of isometries. It is a locally compact group [26, Theorem 6] and we denote by $\mu_G$ a Haar measure on $G$. For a probability measure $\mu$ on $G$, $S$ denotes the support of $\mu$, which is the smallest closed subset whose $\mu$-mass equals one. We recall that for every $r \in [0, 1)$, we denote $\mu_{r,\mathrm{lazy}} = r\delta_{\mathrm{id}} + (1 - r)\mu$. Having fixed a basepoint $o \in M$, for an element $g \in \mathrm{Isom}(M)$, we write $\kappa(g) = d(go, o)$ and for a bounded set $S$, we set $\kappa_S := \sup\{\kappa(g) : g \in S\}$.
The main result of this section is the following theorem, which immediately implies Theorem 1.1 by specializing to $\xi = o$.
.) with $D(\cdot, \lambda) < \infty$ for every $\lambda \in (0, 1)$ such that for every non-elementary probability measure $\mu$ on $G$ with bounded support $S$, for every $\xi \in M^h$, $t > 0$ and $n \in \mathbb{N}$, we have
for every $r \in [0, 1)$.
With Proposition 3.4 at hand, the main ingredient for the proof of Theorem 4.1 is the following
for every $r \in [0, 1)$.
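The role of the operator norm and of the lazy measure can be made concrete in the discrete model case of the free group on two generators. The following is a minimal sketch under stated assumptions: for a symmetric measure on a discrete group the norm of the regular representation operator equals the spectral radius, which is the limit of $\mu^{*2n}(e)^{1/2n}$; for the uniform measure on the four generators this value is $\sqrt{3}/2$ by Kesten's theorem; and for symmetric $\mu$ the norm of the lazy version is $r + (1 - r)\rho$. The function names are hypothetical, and the sketch says nothing about non-discrete isometry groups.

```python
import math


def return_probability(steps):
    """P(walk is at the identity at time `steps`) for the simple walk on F_2.

    Only the distance to the identity is tracked: from 0 the walk moves to 1,
    and from k >= 1 it moves to k + 1 with probability 3/4 and to k - 1 with
    probability 1/4. Mass beyond distance steps // 2 cannot return in time.
    """
    half = steps // 2
    probs = [0.0] * (half + 2)
    probs[0] = 1.0
    for _ in range(steps):
        new = [0.0] * (half + 2)
        new[1] += probs[0]
        for k in range(1, half + 1):
            new[k - 1] += 0.25 * probs[k]
            new[k + 1] += 0.75 * probs[k]
        probs = new
    return probs[0]


def lazy_norm_symmetric(r, rho):
    """Norm of the lazy operator r*Id + (1 - r)*T for self-adjoint T whose
    spectrum lies in [-rho, rho] and contains rho."""
    return r + (1.0 - r) * rho


KESTEN_RHO = math.sqrt(3.0) / 2.0  # spectral radius of the simple walk on F_2

if __name__ == "__main__":
    n = 1000
    print(return_probability(n) ** (1.0 / n), KESTEN_RHO)
    print(lazy_norm_symmetric(0.5, KESTEN_RHO))
```

The printed root of the return probability approaches `KESTEN_RHO` from below as `n` grows, and the lazy norm stays strictly below one, which is the property the lazy walk is introduced to guarantee.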
For the proof, we require the following version of [4, Lemma 5.2], where we highlight the constants that appear in the aforementioned lemma for our purposes. This is the crucial harmonic analytic ingredient of the proof, where the additional hypothesis (compared to Proposition 3.4) on the cocompactness of the action of $G$ on the space $M$ is used.
The proof is similar to that of [4, Lemma 5.2]; we only indicate the necessary modifications.
• We replace the estimate $\|\lambda_G(\mu)^n\| \leq C_0 a_0^n$ used in the proof of [4, Lemma 5.2] by $\|\lambda_G(\mu)^n\| \leq \|\lambda_G(\mu)\|^n$, and consequently, the constants $C_0$ and $a_0$ in [4, Lemma 5.2] can, respectively, be taken to be $1$ and $\|\lambda_G(\mu)\|$.
We will need the following geometric lemma, which is an adaptation of Inclusion (5.5) in the proof of [4, Lemma 5.3] to the horofunction compactification. We point out that this is the key point where the geometric assumption of hyperbolicity is used.

Moreover, all elements constituting the tuples in C are contained in a ball of radius D around o.
As will be shown, one can take the constant $R(\delta) = 14\delta + 4$. The proof will require some juggling between the horofunction boundary $\partial_h M$ and the Gromov boundary $\partial M$ to construct the set $C$. We therefore start by recalling some standard facts on the relation between $\partial_h M$ and $\partial M$.
First, there exists a natural G-equivariant surjective map from M h to M ∪ ∂ M.
Namely, given $h \in \partial_h M$, for any sequence $x_n \in M$ such that $h_{x_n} \to h$, the sequence $x_n$ Gromov converges to infinity in the sense that $\inf_{m,n \geq k} (x_n | x_m)_o \to \infty$ as $k \to \infty$. Given two points $\pi_x \neq \pi_y \in \partial M$, by using the defining inequality (3.5) of a $\delta$-hyperbolic space, one checks that for any pair of sequences $x_n, x'_n$ and $y_n, y'_n$ that Gromov converge to infinity and that are, respectively, in the equivalence class of $\pi_x$ and $\pi_y$, we have

For the reader's convenience, we single out two basic geometric properties that are used in the proof of the previous lemma.

This lemma can be deduced from standard facts in hyperbolic geometry. We include a brief proof for the reader's convenience.
Proof Consider the triangle (x, y, z) whose edges are as given in the statement. Let z n be a sequence of points on the edge (y, z) that converge to z. Consider the segments ζ n from x to z n . Since M is proper, by the Arzelà-Ascoli theorem, up to passing to a subsequence, they converge to a ray ζ between x and z.
For each triangle (x, y, z n ), fix points a n , b n , c n respectively on the edges [x, y], [x, z n ] and [y, z n ] that map to the junction point of the associated tripod [16, Ch. 1]. Using the fact that M is proper and passing to a further subsequence of z n , we may suppose that the sequences a n , b n , c n converge, respectively, to the points a ∈ [x, y], b ∈ ζ and c ∈ [y, z]. Let b be the point on [x, z] at distance d(x, a) from x. Now note that by the tripod lemma [16, Proposition 3.1], we have the required property within each triangle (x, y, z n ) with 4δ. Since the points a n , b n , c n converge, respectively, to a, b , c, the same property holds in the limit triangle with [x, z] replaced by ζ . Now since [x, z] and ζ are at parametrized distance at most 2δ from each other, we get the required property with 6δ.
Proof of Lemma 4.6 We will prove the claim with $R = 14\delta$

We deduce that $d(g m_{i_0}, m_{i_1}) \leq 14\delta + 4$. Therefore, the desired result holds with

Proof of Proposition 4.2 We fix $r \in [0, 1)$ such that $\|\lambda_G(\mu_{r,\mathrm{lazy}})\| < 1$. The latter may be equal to 1 only for $r = 0$ (see Remark 4.4), in which case the inequality holds trivially by setting $C(\cdot, 1) \equiv \infty$. Note that the measure $\nu$ is $\mu_{r,\mathrm{lazy}}$-stationary and, denoting by $S_r$ the support of $\mu_{r,\mathrm{lazy}}$, we have $\kappa_{S_r} = \kappa_S$. To ease the notation, in the proof, we write $\mu$ for $\mu_{r,\mathrm{lazy}}$. Let $\nu$ be a $\mu$-stationary probability measure on $\partial_h M$. Let $(B = G^{\mathbb{N}}, \beta = \mu^{\otimes \mathbb{N}})$ be the Bernoulli space and $T : B \to B$ the shift map. Since $\partial_h M$ is compact, metrizable (see e.g. [49, Proposition 3.1]) and $\mathrm{Isom}(M)$ acts continuously on $\partial_h M$, by a result of Furstenberg [23], it follows that for $\beta$-almost every $b \in B$, there exists a probability measure $\nu_b$ on $\partial_h M$ such that the following weak convergence holds

Moreover, for every $n \in \mathbb{N}$, we have

For every $b \in B$ and $n \in \mathbb{N}$, denote for simplicity $R_n(b) := b_1 \cdots b_n$ and $\kappa_n(b) := \kappa(R_n(b))$. Let now $\eta \in \overline{M}^h$. Using (4.5) and Fubini-Tonelli, we have

where we used the fact that $\kappa_n(b) \to +\infty$ almost surely and where, for every $t > 0$, we

Now using the first equality of (4.5), we have

where in the second line we used the fact that $\beta = \mu^{\otimes \mathbb{N}}$ is a product measure and in the last line we used Fubini-Tonelli's theorem. We conclude that

Using now Lemma 4.6, we get that for every $c > 1$, $n \in \mathbb{N}$, $\xi \in \partial_h M$, $y \in \overline{M}^h$,

P ((R n ξ |y) o κ(R n )) P(κ(R n ) c n ) a n + c 2n sup

On the one hand, since $\kappa(R_n) \leq n\kappa_S$, we have $\sum_{n=0}^{+\infty} a_n \leq \sum_{n=0}^{+\infty} 1_{c^n \leq n\kappa_S}$. Using this, it is not hard to deduce that
$$\sum_{n=0}^{+\infty} a_n \leq \max\left\{ \frac{2 \ln \kappa_S}{\ln c}, \frac{4}{(\ln c)^2} \right\}. \tag{4.7}$$
On the other hand, by Lemma 4.5, for every $n \in \mathbb{N}$, we have

for every $x, y \in M$, where $A_0(\cdot)$ is the function defined in that lemma. We deduce that for every 1 < c < λ G (μ)

The proof follows by combining (4.6), (4.7) and (4.8).
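The elementary counting step behind (4.7) can be checked numerically. The sketch below assumes the reading that (4.7) bounds the number of integers $n \geq 1$ with $c^n \leq n\kappa_S$ by $\max\{2\ln\kappa_S/\ln c,\ 4/(\ln c)^2\}$; this reading is a reconstruction, and the function names are hypothetical. The underlying argument is: if $\ln n \leq \ln \kappa_S$ then $n \ln c \leq 2\ln\kappa_S$, and otherwise $n \ln c < 2\ln n \leq 2\sqrt{n}$, so that $n < 4/(\ln c)^2$.

```python
import math


def count_small_scales(c, kappa_S, n_max):
    """Count the integers n in [1, n_max] with c**n <= n * kappa_S.

    The comparison is done on logarithms to avoid overflow for large n.
    """
    return sum(
        1
        for n in range(1, n_max + 1)
        if n * math.log(c) <= math.log(n) + math.log(kappa_S)
    )


def bound_4_7(c, kappa_S):
    # Assumed right-hand side of (4.7): max{2 ln(kappa_S)/ln c, 4/(ln c)^2}.
    return max(2 * math.log(kappa_S) / math.log(c), 4 / math.log(c) ** 2)


def check(c, kappa_S):
    """Return (count, bound) for given c > 1 and kappa_S > 0."""
    bound = bound_4_7(c, kappa_S)
    # Scan well beyond the bound; by the argument above no admissible n
    # can exceed it, so the scan range is more than sufficient.
    n_max = 10 * math.ceil(bound) + 100
    return count_small_scales(c, kappa_S, n_max), bound


for c, k in [(1.1, 5), (2, 100), (1.5, 0.5), (3, 1000)]:
    count, bound = check(c, k)
    print(c, k, count, round(bound, 2))
```

In each of these parameter choices the count stays below the bound, often by a wide margin, which is consistent with (4.7) being a convenient rather than a sharp estimate.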
Proof of Theorem 4.1 Using the estimate (4.1) in combination with Lemma 3.1, one gets that in Proposition 3.4 (see also Remark 3.5), the constant $c$ is bounded above by $2C(\kappa_S, \|\lambda_G(\mu_{r,\mathrm{lazy}})\|_2)$. Since the right-hand side of the inequality (3.15) is increasing in $c$, we can substitute $2C(\kappa_S, \|\lambda_G(\mu_{r,\mathrm{lazy}})\|_2)$ for $c$ and get that for every $\xi \in \overline{M}^h$, $n \in \mathbb{N}$ and $t > 0$,

for every $r \in [0, 1)$. This yields the desired estimate.
The constant $c$ depends only on $\alpha_0$, $K$ and on the constants $A_0$, $R_\delta$, $D_0$ and $\|\lambda_G(\mu_{1/2,\mathrm{lazy}})\|_2$ as in Theorem 4.1. This statement has the obvious advantage of applying to random walks with unbounded support (with finite exponential moment), but it has the disadvantage that the dependence of the constant $c$ on the aforementioned parameters of $\mu$ is considerably more complicated. In line with our goals in this article, we have chosen not to give more details on this version of Theorem 4.1 for finite exponential moment random walks.
In the remainder of this section, we single out two applications of the methods we used in this part. Namely, in §4.2, we give an explicit bound for the bottom of the support of the Hausdorff spectrum of the harmonic measure, and in §4.3, we discuss an application to continuity of the drift.

Application to the Frostman property of the harmonic measure
In this part, we keep the assumptions of Theorem 4.1. In particular, μ is a nonelementary probability measure with finite support on Isom(M) where M is a proper geodesic hyperbolic metric space.
Let $B(x, r)$ denote the ball of radius $r$ around $x$ for a natural metric coming from the Gromov product (we do not go into the details here since this metric will not be used, see [27, §8], [58, Proposition 5.16]). The following result provides an explicit constant $s_0 > 0$ for which the Frostman type property $\nu(B(x, r)) \leq C r^{s_0}$ holds for some constant $C > 0$ and every $x \in \partial M$ and $r \geq 0$.
Such a constant gives a lower bound for the bottom of the support of the Hausdorff spectrum of $\nu$; see the work of Tanaka [56], who gives a thorough multifractal analysis of the harmonic measure in the special setting of hyperbolic groups. Finally, we mention that the existence of such a (non-explicit) constant for the harmonic measure is known in a more general setting of (not necessarily proper) hyperbolic spaces (see [

In particular, for each such $s > 0$, for every $x \in M$, (4.10)

In the above statement and hereafter, for $x, y \in M \cup \partial M$, the Gromov product $(x|y)_o$ is defined as $\inf \liminf_{n \to \infty} (x_n | y_n)_o$, where the infimum is taken over all sequences $x_n$ and $y_n$ that converge, respectively, to $x$ and $y$.
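For points of $M$ itself, the Gromov product can be computed directly. The sketch below is an illustration in the simplest hyperbolic example, the Cayley tree of the free group on two generators, and assumes the standard formula $(x|y)_o = \frac{1}{2}(d(x,o) + d(y,o) - d(x,y))$; the word encoding (lower case for generators, upper case for inverses) and all function names are choices made here, not the article's. In a tree the Gromov product equals the length of the common prefix of the two reduced words, and the $\delta$-hyperbolicity inequality holds with $\delta = 0$.

```python
import itertools


def reduce_word(word):
    """Freely reduce a word in a, b (inverses are written A, B)."""
    out = []
    for s in word:
        if out and out[-1] == s.swapcase():
            out.pop()  # cancel a letter against its inverse
        else:
            out.append(s)
    return "".join(out)


def inverse(word):
    return word[::-1].swapcase()


def dist(x, y):
    # Word metric on the free group: d(x, y) = |x^{-1} y|.
    return len(reduce_word(inverse(x) + y))


def gromov_product(x, y, o=""):
    # (x|y)_o = (d(x, o) + d(y, o) - d(x, y)) / 2
    return (dist(x, o) + dist(y, o) - dist(x, y)) / 2


def reduced_words(max_len):
    """All freely reduced words of length at most max_len."""
    words = []
    for n in range(max_len + 1):
        for letters in itertools.product("aAbB", repeat=n):
            w = "".join(letters)
            if reduce_word(w) == w:
                words.append(w)
    return words


# The words aab and aaB share the prefix aa.
print(gromov_product("aab", "aaB"))
```

Checking the inequality $(x|z)_o \geq \min\{(x|y)_o, (y|z)_o\}$ over all reduced words of length at most two confirms $0$-hyperbolicity on that finite sample.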
Since the proof of the above result follows similar lines as the proof of Proposition 4.2, we keep its notation and only indicate the main lines and changes. We write $B = G^{\mathbb{N}}$, $\beta = \mu^{\otimes \mathbb{N}}$. By $T$ we denote the shift map on $B$ and for $b \in B$ and $n \in \mathbb{N}$, we write

Now, let $c > 1$ and denote by $a_n$ and $b_n$ the sequences (that depend on $c$) as in the proof of Proposition 4.2. Then

The sum in the middle of the right-hand side is a finite sum. The last series is finite as soon as

is finite. Now, for any $s < \frac{1}{\kappa_S} \ln \frac{1}{\|\lambda_G(\mu)\|_2}$, one can choose $c > 1$ so that (4.11) is finite and this finishes the proof of (4.9). Finally, (4.10) follows from (4.9) by Markov's inequality.

Applications to continuity of the drift
A consequence of Theorem 1.1 is a uniform control, over different driving probability measures with controlled parameters $\kappa_S$ and $\|\lambda_G(\mu)\|_2$, of large deviations of the displacement around the drift $\ell(\mu)$. In turn, this allows one to deduce that the drift varies continuously when one perturbs $\mu$ in such a way that $\kappa_S$ remains bounded and $\|\lambda_G(\mu)\|_2$ remains away from 1, as we show in Corollary 4.11 below. The idea of such a deduction, of continuity from uniform large deviations, already appears in the literature, see e.g. Duarte-Klein [19, Ch. 3]. However, with our method, this way of deducing the continuity is not optimal, as one can deduce a continuity result directly from the unique cocycle-average property for the Busemann cocycle (which was a key point in obtaining our concentration result). We refer to Proposition 4.13 for a general continuity statement.

Suppose that $\mu_m$ converges weakly to some probability measure $\mu_\infty$. Then, as $m \to \infty$

Proof Fix $t_0 > 0$ and let $\lambda < 1$ be a constant such that for every $m \in \mathbb{N}$, $\inf_{r \in [0,1)} \|\lambda_G((\mu_m)_{r,\mathrm{lazy}})\|_2 < \lambda$. Set $\kappa_0 = \sup_{m \in \mathbb{N}} \kappa_{S_m}$. Choose $n_0$ large enough so that

where the constant $A_0$ (depending only on $M$) is as in Remark 1.2. The choice of $n_0$ satisfying (4.13) implies, by using Theorem 1.1 and the bound on the function $D$ given in Remark 1.2, which is non-decreasing in $\kappa$ and in $\lambda$, that for every $m \in \mathbb{N}$ large enough, we have

Since $\left| \frac{1}{n_0} \kappa(L_{n_0}) - \ell(\mu_m) \right| \leq \kappa_0$, this implies that for every $m$ large enough, we have

On the other hand, since $\mu_m \to \mu_\infty$ weakly, we have that as $m \to \infty$, $\mathbb{E}_{\mu_m}[\kappa(L_{n_0})] \to \mathbb{E}_{\mu_\infty}[\kappa(L_{n_0})]$. Therefore, combining (4.12) with (4.14), it follows that for every $m \in \mathbb{N}$ large enough, we have $|\ell(\mu_\infty) - \ell(\mu_m)| \leq 4t_0$, completing the proof.

Remark 4.12
A particular situation where the hypotheses of the previous result are satisfied is when there exists a finite set S ⊂ Isom(M) that contains the supports of all μ m for m ∈ N, μ m → μ ∞ weakly and ρ(λ G (μ ∞ )) < 1. This claim can easily be deduced from the results of Berg-Christensen [5]. We will omit the details as we will now prove a general continuity statement.
The following result is one that can be deduced from the unique cocycle-average property similarly to Hennion [38] and Furstenberg-Kifer [25]. For a very similar proof closer to our setting and related remarks, see Gouëzel-Mathéus-Maucourant [33, Proposition 2.3]. In the following, for a probability measure $\mu$ on $\mathrm{Isom}(M)$, we denote $L_1(\mu) = \int \kappa(g)\,d\mu(g)$.

Since $X$ is compact, up to passing to a subsequence of $\nu_n$, we can suppose that the sequence $\nu_n$ converges to a probability measure $\nu$ on $X$. Since $\mu_n \to \mu$ weakly, one deduces from the continuity of the action of $G$ on $X$ that $\nu$ is $\mu$-stationary. Using the hypothesis that $L_1(\mu_n) \to L_1(\mu)$ and the fact that $\kappa(g) \geq |\sigma(g, \xi)|$ for every $g \in G$ and $\xi \in X$, one gets by dominated convergence that the sequence of integrals in (4.15) converges to $\int_{G \times X} \sigma(g, \xi)\,d\mu(g)\,d\nu(\xi)$. But by the unique cocycle-average property, the latter is equal to the drift $\ell(\mu)$. This implies the claimed convergence.

The case of Gromov hyperbolic groups and rank-one linear groups
The goal of this section is to prove Corollaries 1.4 and 1.6 using Theorem 4.1.
An important ingredient that allows us to obtain concentration inequalities with implied constants that depend, in a minimal fashion, on the probability measure μ is a version of the uniform Tits alternative for groups of isometries of hyperbolic spaces. For hyperbolic groups, we will use Koubi's results [43] and for linear groups the strong Tits alternative of Breuillard [9].

Concentration inequalities for random walks on Gromov hyperbolic groups
For the proof of Corollary 1.4, we will use the following result of Koubi:

Theorem 5.1 ([43]) Let $\Gamma$ be a finitely generated non-elementary hyperbolic group. There exists $N \in \mathbb{N}$ such that for any finite subset $S$ generating $\Gamma$, there exist two elements $a, b \in S^N$ that generate a free subgroup of rank two.
Here, by $S$-length of an element $g \in \Gamma$, we mean the distance of $g$ to the identity element in the word-metric induced by $S$.
The previous result will be useful to us in combination with the following straightforward observation (see e.g. [11, §8]).

Lemma 5.2
Let $\Gamma$ be a countable group and $S \subset \Gamma$ such that $S^{N_0}$ contains a pair of elements that generates a free subgroup of rank two for some $N_0 \in \mathbb{N}$. Let $\mu$ be a probability measure with support $S \cup \{\mathrm{id}\}$ and set $m_\mu = \min_{g \in S} \mu(g)$. Then,

Proof Consider the probability measure $\mu' = \check{\mu} * \mu$, where $\check{\mu}$ denotes the image of $\mu$ under $g \mapsto g^{-1}$, and denote by $S'$ its support. Since the support of $\mu$ contains the identity, the set $S'$ is symmetric and it contains $S$. It follows that $(S')^{N_0}$ contains a set $\{a, b, a^{-1}, b^{-1}\}$, where $a, b$ are the generators of a free group of rank two.
Since $\mu'$ is symmetric, the operator $\lambda_\Gamma(\mu')$ on $\ell^2(\Gamma)$ is self-adjoint, and since

On the other hand, we write μ

where $\eta$ is the uniform probability measure on $\{a, b, a^{-1}, b^{-1}\}$ and $\zeta$ some probability measure on $\Gamma$. Using the trivial bound $\|\lambda_\Gamma(\zeta)\|_2 \leq 1$, we deduce that

where $1 - \kappa = \sqrt{3}/2$ is the spectral radius of the uniform probability measure on the free group [42, Theorem 3]. Combining (5.1) and (5.2), and using the fact that $m_{\mu'} \geq m_\mu^2$, we deduce that

Proof of Corollary 1.4 Note first that by [14, Proposition 2.6], the group $\Gamma$ is a non-elementary hyperbolic group. Therefore, the hypothesis of Lemma 5.2 is satisfied for every finite generating set of $\Gamma$ with a uniform constant $N_0 = N$ thanks to Koubi's Theorem 5.1. Applying Lemma 5.2 to $\mu_{1/2,\mathrm{lazy}}$ and using the fact that $m_{\mu_{1/2,\mathrm{lazy}}} \geq \frac{1}{2} m_\mu$ yields that $\|\lambda_\Gamma(\mu_{1/2,\mathrm{lazy}})\|_2 \leq 1 -$

Observe finally that by the discreteness assumption on $\Gamma$ and by Berg-Christensen's Corollary [5, Corollaire 3], one has that $\|\lambda_\Gamma(\mu_{1/2,\mathrm{lazy}})\|_2 = \|\lambda_G(\mu_{1/2,\mathrm{lazy}})\|_2$. Using now Theorem 1.1 and the expression of the function $D(\cdot, \cdot)$ given in Remark 1.2, we get the desired result with $A_M = A_0 + 3$ and N = 4N .
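The value $\sqrt{3}/2$ quoted from [42, Theorem 3] can be sanity-checked numerically. The sketch below assumes Kesten's formula $\sqrt{2k-1}/k$ for the spectral radius of the simple random walk on the free group of rank $k$, and compares it with the exponential decay rate of return probabilities, computed exactly through the distance-to-identity chain on the $2k$-regular tree. The function names are hypothetical and the finite-time estimate only approaches the spectral radius from below, slowly, because of a polynomial correction.

```python
import math


def kesten_spectral_radius(k):
    """Spectral radius of the simple random walk on the free group of rank k."""
    return math.sqrt(2 * k - 1) / k


def return_probability(steps, k=2):
    """Probability that the simple random walk is back at the identity.

    Uses the distance-to-identity chain on the 2k-regular tree: away from
    the identity the distance increases with probability (2k-1)/(2k) and
    decreases with probability 1/(2k).
    """
    up, down = (2 * k - 1) / (2 * k), 1 / (2 * k)
    dist = [0.0] * (steps + 2)
    dist[0] = 1.0
    for _ in range(steps):
        new = [0.0] * (steps + 2)
        new[1] += dist[0]  # from the identity every step moves away
        for d in range(1, steps + 1):
            if dist[d]:
                new[d + 1] += up * dist[d]
                new[d - 1] += down * dist[d]
        dist = new
    return dist[0]


rho = kesten_spectral_radius(2)
estimate = return_probability(400) ** (1 / 400)
print(rho, estimate)
```

For a symmetric walk the return probability after $2n$ steps is at most the $2n$-th power of the spectral radius, so the printed estimate sits slightly below $\sqrt{3}/2 \approx 0.866$.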

Concentration inequalities for random walks on rank-one semisimple linear groups
For the proof of Corollary 1.6, we will use the following result of Breuillard [9, Theorem 1.1] and [10] (see [11] for the particular case of SL 2 ).
Since λ (μ ) 2 = λ (μ ) 2 2 and m μ m 2 μ m 2 μ /4, we deduce

As in the proof of Corollary 1.4, by the discreteness assumption on $\Gamma$, it follows that $\|\lambda_\Gamma(\mu_{1/2,\mathrm{lazy}})\|_2 = \|\lambda_G(\mu_{1/2,\mathrm{lazy}})\|_2$. Therefore, a direct application of Theorem 1.1 (with $r = 1/2$ on the right-hand side of the theorem) and the expression of the function $D(\cdot, \cdot)$ given in Remark 1.2 concludes the proof with

where $A_0$ is the constant defined in Remark 4.3 applied for the isometry group $G$ of the symmetric space $M$ associated to the rank-one group $H(k)$.

Probabilistic free subgroup theorem
The goal of this section is to prove Theorem 1.10 from the Introduction. To do this, we start by proving a general result which shows that uniform large deviation estimates for the Busemann cocycle together with positivity of the drift imply a probabilistic free subgroup theorem for isometries of Gromov hyperbolic spaces.

Free subgroups from uniform large deviations
Let (M, d) be a δ-hyperbolic metric space and fix o ∈ M. Let μ be a Borel probability measure on Isom(M) endowed with the topology of pointwise convergence.
We introduce the following uniform large deviation hypothesis for a probability measure $\mu$ with finite first order moment on $\mathrm{Isom}(M)$:

ULD: For every $\varepsilon > 0$ and $n \in \mathbb{N}$, there exist non-negative constants $p_n(\varepsilon)$ such that for every $\varepsilon > 0$, $p_n(\varepsilon) \to 0$ as $n \to \infty$ and
$$\sup_{y \in M} \mathbb{P}\left( |\sigma(R_n^{\pm 1}, y) - n\ell(\mu)| \geq n\varepsilon \right) \leq p_n(\varepsilon), \tag{6.1}$$
where $\sigma$ denotes the Busemann cocycle and $\ell(\mu)$ the drift. Note that, whenever the ULD hypothesis is satisfied, by replacing, for every $\varepsilon > 0$, $p_n(\varepsilon)$ by $\sup_{m \geq n} p_m(\varepsilon)$, we can and we will suppose that it is satisfied with a non-increasing sequence $p_n(\varepsilon)$. The rest of §6.1 is devoted to the proof of the following

Before proceeding with the proof, we make a few remarks on its hypotheses.

We will show that with high probability, two independent random walks $R_n$ and $R'_n$ will play ping-pong on the space $M$. To set the random ping-pong table, we need some geometric lemmas. Let $(M, d)$ be a $\delta$-hyperbolic space, fix $o \in M$ and let $C > 0$. Recall that the shadow of $y \in M$ seen from $x \in M$ is the following subset of $M$:

It is immediate that

Observe that $O_C(x, y) = M$ when $C \geq d(x, y)$. We will use the following lemma to construct Schottky subgroups of $\mathrm{Isom}(M)$.

We are now able to give the proof of Lemma 6.3.
Proof of Lemma 6.3 Using the assumption (iii), fix any real $C$ such that

By Lemma 6.6, these are four disjoint subsets of $M$. Moreover, by Lemma 6.5 and the choice of the constant $C$, the following inclusions hold for every $i = 1, 2$,

Thus the pair of elements $\gamma_1, \gamma_2$ satisfies the hypotheses of the classical ping-pong lemma and therefore generates a free subgroup of $\mathrm{Isom}(M)$.
With Lemma 6.3 at hand, we now focus on showing that the random walks $R_n$, $R'_n$, $R_n^{-1}$, $R_n'^{-1}$ satisfy assumptions (i)-(iii) of Lemma 6.3 with $D = n\ell(\mu)/8 + 2\delta$, with probability tending to one at a rate depending on the constants $p_n(\varepsilon)$ appearing in the hypothesis ULD. Before that, we provide some estimates on the random walk $R_n$ based on uniform large deviation estimates.

(ii) For every $0 < \varepsilon \leq \ell(\mu)/8$ and every $n > \frac{2 + 8\delta}{\ell(\mu)}$,

Proof (i) Using the identity

which holds for any $g \in \mathrm{Isom}(M)$ and $y \in M$, the desired inequality follows from the ULD hypothesis applied to both $\kappa(R_n) = \sigma(R_n, o)$ and $\sigma(R_n^{-1}, y)$.

A lower bound for the drift
In view of Theorem 4.1 and Proposition 6.1, the only remaining ingredient for the proof of Theorem 1.10 is a control of how small the drift (μ) of the random walk can be. The harmonic analytic approach of §4 allows one to deduce a lower bound on the drift as we now discuss. This sort of result should be known to the experts. Results of similar flavor appear in the works [35,50,53,59].
Given $R \geq 0$, as before, we set $B_R = \{g \in G \mid d(go, o) \leq R\}$. Since $M$ is proper, the sets $B_R$ defined above are compact and they have non-empty interior if $R > 0$. In particular, there exist $K_0 \in \mathbb{N}$ and $g_1, \ldots, g_{K_0} \in B_{6D_1}$ such that $B_{6D_1} \subseteq \bigcup_{i=1}^{K_0} g_i B_{D_1}$, where, as before, $D_1 \in \mathbb{R}$ denotes the constant $\max\{D_0, 1\}$ and $D_0 := 2\,\mathrm{diam}(G \backslash M)$. For convenience later on, we choose $K_0$ to be the smallest such integer. An elementary covering argument allows one to get the bound $K_0$

Remark 6.9
The reason why we also include $\mu_{r,\mathrm{lazy}}$ in the conclusion of the previous Proposition 6.8 is that, as discussed in Remark 4.4, when $\mu$ is non-symmetric, it might happen that the closed group generated by the support of $\mu$ is non-amenable whereas $\|\lambda_G(\mu)\|_2 = 1$. However, in this case, for every $r > 0$, we have $\|\lambda_G(\mu_{r,\mathrm{lazy}})\|_2 < 1$. Therefore, whenever this group is non-amenable, the lower bound provided by the proposition is strictly positive and it depends only on $D_1$, $K_0$, and $\|\lambda_G(\mu_{1/2,\mathrm{lazy}})\|_2$.
The proposition then follows by applying the above for each $\mu_{r,\mathrm{lazy}}$ and noting that the drift satisfies $\ell(\mu_{r,\mathrm{lazy}}) = (1 - r)\ell(\mu)$. A straightforward modification of the proof of Lemma 4.5 shows that for every $R > 0$ and $n \in \mathbb{N}$, we have

where $\mu_G$ is a Haar measure on $G$. Indeed, the additional term $D_0$ in the left-hand side of (4.2) disappears since here we take $m = m' = o$. We now claim that for every $r \geq D_1$,

where $K_0 \in \mathbb{N}$ is the constant defined before the statement of Proposition 6.8. Indeed, given $r \geq D_1$, let $\{\gamma_1, \ldots, \gamma_T\}$ be a maximal $2D_1$-separated set contained in $B_{r - D_1}$ with respect to the left-invariant pseudo-metric $d_G$ defined as $d_G(g, h) = d(go, ho)$ for every $g, h \in G$. Then the collection $\gamma_i B_{D_1}$ for $i = 1, \ldots, T$ consists of disjoint compact subsets of $B_r$ of the same Haar measure as $B_{D_1}$, so that we have $\mu_G(B_r) \geq T \mu_G(B_{D_1})$. On the other hand, since $G$ acts co-compactly on $M$ and $M$ is geodesic, it is not hard to see that every element in $B_{r + D_1}$ is $2D_1$-close for the pseudo-metric $d_G$ to an element of $B_r$ (in fact, $(G, d_G)$ is a large-scale geodesic space in the sense of [17, Definition 3.B.1]). Hence the collection $\gamma_i B_{6D_1}$ for $i = 1, \ldots, T$ is a covering of $B_{r + D_1}$ by compact sets having the same Haar measure as $B_{6D_1}$ and therefore we have $\mu_G(B_{r + D_1}) \leq T \mu_G(B_{6D_1})$. Therefore we deduce

$K_0$ proving (6.15). Now, by using (6.15) iteratively and plugging it in (6.14), we deduce that for every $\alpha <$

The result follows in view of Kingman's subadditive ergodic theorem.
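The scaling of the drift under laziness, namely that replacing $\mu$ by $\mu_{r,\mathrm{lazy}}$ multiplies the drift by $1 - r$, can be illustrated on the simple random walk on the free group of rank two, whose drift is $1/2$. The following sketch computes the expected displacement exactly through the distance-to-identity chain on the 4-regular tree; the function name and the choice of example are assumptions made here for illustration, not part of the article's argument.

```python
def expected_displacement(n, r=0.0, k=2):
    """E[d(R_n, id)] for the r-lazy simple random walk on the free group of rank k.

    With probability r the walk stays put; otherwise it performs one step of
    the simple random walk, which moves away from the identity with
    probability (2k-1)/(2k) when it is not at the identity.
    """
    up, down = (2 * k - 1) / (2 * k), 1 / (2 * k)
    dist = [0.0] * (n + 2)
    dist[0] = 1.0
    for _ in range(n):
        new = [r * p for p in dist]          # lazy part: no move
        new[1] += (1 - r) * dist[0]          # at the identity every move goes up
        for d in range(1, n + 1):
            new[d + 1] += (1 - r) * up * dist[d]
            new[d - 1] += (1 - r) * down * dist[d]
        dist = new
    return sum(d * p for d, p in enumerate(dist))


n = 500
print(expected_displacement(n) / n)          # close to 1/2
print(expected_displacement(n, r=0.5) / n)   # close to (1 - 1/2) * 1/2
```

The finite-time averages differ from the limiting drift only by a term of order $1/n$, coming from the visits of the walk to the identity.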

Remark 6.10
In fact, the estimate (6.16) above provides a lower bound $L_0 > 0$ for a region of type $[0, L_0)$ on which the large deviation rate function of the process $\left(\frac{1}{n} \kappa(R_n)\right)_{n \geq 1}$ is positive. Such a lower bound is, a priori, stronger than a lower bound for the drift $\ell(\mu)$. However, the recent works [7] under finite exponential moment and [32] under finite first order moment assumptions identify the drift $\ell(\mu)$ as the largest real $r$ such that the rate function $I$ is positive on $[0, r)$.
On the other hand, the fact that Proposition 6.8 provides an explicit region of positivity of I allows, for example, to obtain explicit constants in [49, Theorem 1.2] under our assumptions.

Proof of Theorem 1.10
We denote by D(., .) the positive function given by Theorem 4.1. The hypotheses of Theorem 1.10 allow us to apply Theorem 4.1 to deduce that for every r ∈ [0, 1) the probability measure μ satisfies the ULD hypothesis with Therefore, using (6.17) and the bound provided by Proposition 6.8, setting λ r = λ G (μ r ,lazy ) 2 we obtain that for every r ∈ [0, 1) and for every n > 2 + 8δ(1−r ) ln K 0 D 1 ln 1 λr , two independent random walks generate a free subgroup with probability

Remark 6.11
The explicit bounds on the probability mentioned in §1.2.2 for hyperbolic groups and rank-one linear groups are obtained by plugging the upper bounds (5.4) and (5.5) on λ r into (6.19) in the proof above. Similarly, for the range of validity of n ∈ N, one can plug (5.4) and (5.5) in (6.13) to get an explicit lower bound for (μ) which then provides an upper bound for the right-hand-side of (6.18).