The large deviation behavior of lacunary sums

We study the large deviation behavior of lacunary sums $(S_n/n)_{n\in \mathbb{N} }$ with $S_n:= \sum_{k=1}^n f(a_kU)$, $n\in\mathbb{N}$, where $U$ is uniformly distributed on $[0,1]$, $(a_k)_{k\in\mathbb{N}}$ is an Hadamard gap sequence, and $f\colon \mathbb{R}\to \mathbb{R} $ is a $1$-periodic, (Lipschitz-)continuous mapping. In the case of large gaps, we show that the normalized partial sums satisfy a large deviation principle at speed $n$ and with a good rate function which is the same as in the case of independent and identically distributed random variables $U_k$, $k\in\mathbb{N}$, having uniform distribution on $[0,1]$. When the lacunary sequence $(a_k)_{k\in\mathbb{N}}$ is a geometric progression, then we also obtain large deviation principles at speed $n$, but with a good rate function that is different from the independent case, its form depending in a subtle way on the interplay between the function $f$ and the arithmetic properties of the gap sequence. Our work generalizes some results recently obtained by Aistleitner, Gantert, Kabluchko, Prochno, and Ramanan [Large deviation principles for lacunary sums, preprint, 2020] who initiated this line of research for the case of lacunary trigonometric sums.


Introduction & Main results
The study of lacunary (trigonometric) series is a classical but still flourishing topic in harmonic analysis. Its origins can be traced back to the work of Rademacher [17] in 1922, who studied the convergence of series of the form $\sum_{k=1}^{\infty} b_k r_k(\omega)$, (1) where $\omega\in[0,1]$, $b=(b_k)_{k\in\mathbb{N}}$, and $r_k$ denotes the $k$-th Rademacher function $r_k(\omega)=\operatorname{sgn}\sin(2^k\pi\omega)$. Rademacher proved that under the assumption of square summability $\sum_{k=1}^{\infty} b_k^2<\infty$ such a series converges for almost every $\omega\in[0,1]$. It was then in 1925 that Kolmogorov and Khinchin [14] discovered the necessity of square summability in an even more general setting. The observation that for Rademacher sums (1) one has the relation $\sum_{k=1}^{\infty} b_k r_k(\omega)=\sum_{k=1}^{\infty} b_k r_1(2^{k-1}\omega)$, where the sum on the right-hand side involves a fixed function $r_1$ with a sequence of exponentially growing dilation factors, marks the beginning of the study of lacunary trigonometric series of the form $\sum_{k=1}^{\infty} b_k\cos(2\pi a_k\omega)$, where $\omega\in[0,1]$, $(b_k)_{k\in\mathbb{N}}$ is a sequence of real numbers, and $(a_k)_{k\in\mathbb{N}}$ is a sequence of positive integers that is lacunary, in the sense that it satisfies the Hadamard gap condition $a_{k+1}/a_k\ge q>1$ for every $k\in\mathbb{N}$.
Kolmogorov [15] showed convergence of such series under the $\ell_2$-assumption on $(b_k)_{k\in\mathbb{N}}$, and a few years later Zygmund [21] showed that this assumption is necessary. Coming back to the Rademacher functions $(r_k)_{k\in\mathbb{N}}$, one can easily check that they form a system of independent random variables. Thus, it is natural to ask whether $\sum_{k\in\mathbb{N}} b_k r_k$ satisfies a central limit theorem (CLT). Under the assumptions that $(b_k)_{k\in\mathbb{N}}\notin\ell_2$ and $\max_{1\le k\le n}|b_k| = o(\|(b_k)_{k=1}^n\|_2)$ one can verify that Lindeberg's condition holds and thus we have that, for $t\in\mathbb{R}$, $\lim_{n\to\infty}\lambda(\{\omega\in[0,1] : \sum_{k=1}^n b_k r_k(\omega)\le t\,\|(b_k)_{k=1}^n\|_2\}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-y^2/2}\,dy$. In particular this is true when $b_k=1$ for all $k\in\mathbb{N}$. In 1939, Kac proved a similar CLT for lacunary trigonometric functions when the sequence $(a_k)_{k\in\mathbb{N}}$ has large gaps, that is, when $a_{k+1}/a_k\to\infty$ as $k\to\infty$. A few years later, in 1947, Salem and Zygmund [18] showed the CLT under the Hadamard gap condition. More precisely, they showed that if $(a_k)_{k\in\mathbb{N}}$ satisfies $a_{k+1}/a_k\ge q>1$, we have $\lim_{n\to\infty}\lambda(\{\omega\in[0,1] : \sum_{k=1}^n \cos(2\pi a_k\omega)\le t\sqrt{n/2}\}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-y^2/2}\,dy$. It became more and more evident that lacunary sums show a behavior similar to sums of truly independent random variables. Salem and Zygmund [19] in 1950 and Erdős and Gál [8] in 1955 were able to prove a law of the iterated logarithm (LIL) under the Hadamard gap condition, that is, for almost every $\omega\in[0,1]$ we have $\limsup_{n\to\infty}\frac{\sum_{k=1}^n\cos(2\pi a_k\omega)}{\sqrt{n\log\log n}}=1$.
It is therefore natural to ask whether results like these can be generalized to arbitrary periodic functions. This is not always possible, as a famous example of Erdős and Fortet (see, e.g., [12]) showed, where the law of the iterated logarithm fails even for very simple trigonometric polynomials. In 1946, Kac [11] considered functions $f\colon\mathbb{R}\to\mathbb{R}$ satisfying $f(\omega+1)=f(\omega)$ for all $\omega\in[0,1]$ and $\int_0^1 f(\omega)\,d\omega=0$. He was able to show that if one additionally assumes that $f$ is of bounded variation or Hölder-continuous, then $\frac{1}{\sqrt{n}}\sum_{k=1}^n f(2^{k-1}\omega)$ converges in distribution to a normal distribution with mean $0$ and variance $\sigma^2:=\int_0^1 f(\omega)^2\,d\omega+2\sum_{k=1}^{\infty}\int_0^1 f(\omega)f(2^k\omega)\,d\omega$, provided the latter exists. The form of the variance in the limit is somewhat unexpected, since one would naively assume the same variance as in the independent case, namely $\int_0^1 f(\omega)^2\,d\omega$. This observation shows that not only the regularity of the function $f$ plays a role; the arithmetic structure of the sequence $(a_k)_{k\in\mathbb{N}}$ also exerts its influence. This phenomenon became even more visible when Gapoškin [9] linked the existence of a CLT to the number of solutions of certain Diophantine equations. Only a few years ago Aistleitner and Berkes [1] improved his result. While both CLT and LIL are quite well understood for lacunary series, fluctuations on the scale of large deviations were not considered until very recently. In 2020, Aistleitner, Gantert, Kabluchko, Prochno, and Ramanan [2] initiated the study of large deviation principles (LDP) for lacunary trigonometric sums, obtaining a series of unexpected results that display a subtle dependence on the arithmetic structure of the gap sequence. This dependence is not visible in the trigonometric setting on the scales of a CLT and a LIL: for the latter two the arithmetic structure is irrelevant and they always display a behavior as in the case of independent and identically distributed random variables.
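To illustrate how the variance in Kac's theorem can differ from the independent value $\int_0^1 f(\omega)^2\,d\omega$, the following Python sketch evaluates both numerically. The test function $f(\omega)=\cos(2\pi\omega)+\cos(4\pi\omega)$, the helper names, the grid size and the truncation of the series are assumptions made for this illustration only; the variance formula used is the standard form of Kac's result, $\sigma^2=\int_0^1 f^2+2\sum_{k\ge1}\int_0^1 f(\omega)f(2^k\omega)\,d\omega$.

```python
import math

def integrate(g, n=4096):
    """Equispaced rule on [0, 1); exact (up to rounding) for trigonometric
    polynomials whose frequencies are all smaller than n."""
    return sum(g(i / n) for i in range(n)) / n

def f(w):
    # Illustrative mean-zero, 1-periodic test function (an assumption).
    return math.cos(2 * math.pi * w) + math.cos(4 * math.pi * w)

def kac_variance(f, terms=8):
    """int f^2 + 2 * sum_{k=1}^{terms} int f(w) f(2^k w) dw (truncated series)."""
    sigma2 = integrate(lambda w: f(w) ** 2)
    for k in range(1, terms + 1):
        sigma2 += 2 * integrate(lambda w, k=k: f(w) * f(2 ** k * w))
    return sigma2

iid_variance = integrate(lambda w: f(w) ** 2)
sigma2 = kac_variance(f)
print(iid_variance, sigma2)  # approximately 1.0 and 2.0
```

For this choice only the term $k=1$ contributes to the series, because $f(\omega)$ and $f(2\omega)$ share the frequency $2$; the limiting variance is therefore twice the independent one.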
In this paper we continue the study of large deviations for lacunary series and make progress on some of the open problems stated in [2]. More precisely, we will consider Hadamard gap sequences $(a_k)_{k\in\mathbb{N}}$ and study the tail behavior of partial sums of the form $S_n(\omega):=\sum_{k=1}^n f(a_k\omega)$, (2) where $\omega\in[0,1]$ and $f\colon\mathbb{R}\to\mathbb{R}$ is a general $1$-periodic function satisfying certain regularity assumptions. Functions like the one in (2) can be interpreted as random variables on $[0,1]$ equipped with the Borel $\sigma$-field and the Lebesgue measure $\lambda$ on it.

Main Results
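Before presenting the main results, recall that a sequence $(X_n)_{n\in\mathbb{N}}$ of random variables satisfies an LDP at speed $(s_n)_{n\in\mathbb{N}}$ and rate function $I\colon\mathbb{R}\to[0,\infty]$ if, for all Borel sets $A\subset\mathbb{R}$, $-\inf_{x\in A^\circ} I(x)\le\liminf_{n\to\infty}\frac{1}{s_n}\log\mathbb{P}(X_n\in A)\le\limsup_{n\to\infty}\frac{1}{s_n}\log\mathbb{P}(X_n\in A)\le-\inf_{x\in\overline{A}} I(x)$, where $A^\circ$ and $\overline{A}$ denote the interior and the closure of $A$, respectively. The speed $(s_n)_{n\in\mathbb{N}}$ is a sequence of positive real numbers tending to infinity and the rate function $I$ is lower-semicontinuous. If $I$ has compact level sets, we speak of a good rate function (GRF). The classical setting of independent and identically distributed (i.i.d.) random variables is dealt with in Cramér's theorem [5]: for a sequence of i.i.d. random variables $(X_n)_{n\in\mathbb{N}}$ with finite exponential moments, i.e., $\mathbb{E}[e^{uX_1}]<\infty$ for all $u$ in a neighborhood of $0$, the sequence of empirical means $\frac{1}{n}\sum_{k=1}^n X_k$ satisfies an LDP at speed $n$ with GRF given by the Legendre-Fenchel transform of the cumulant generating function $u\mapsto\log\mathbb{E}[e^{uX_1}]$. Our aim is to prove LDPs for the sequence $(S_n/n)_{n\in\mathbb{N}}$ in two different and natural settings, namely for large gaps and for gap sequences forming a geometric progression.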

The case of large gaps
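We first consider the case of independent and identically distributed random variables. Assume that for the function $f\colon\mathbb{R}\to\mathbb{R}$, $\Lambda_f(\theta):=\log\int_0^1 e^{\theta f(\omega)}\,d\omega$ (4) exists for all $\theta\in\mathbb{R}$. Let $(U_k)_{k\in\mathbb{N}}$ be an i.i.d. sequence of random variables with the same distribution as $U$ and define $\widetilde{S}_n:=\sum_{k=1}^n f(U_k)$. (5) By Cramér's theorem, $(\widetilde{S}_n/n)_{n\in\mathbb{N}}$ satisfies an LDP in $\mathbb{R}$ at speed $n$ and with GRF $I_f\colon\mathbb{R}\to[0,\infty]$ given by the Legendre-Fenchel transform of $\Lambda_f$, that is, $I_f(x):=\sup_{\theta\in\mathbb{R}}[\theta x-\Lambda_f(\theta)]$. (6) Our first result treats the case where the Hadamard gap sequence $(a_k)_{k\in\mathbb{N}}$ has "large gaps", i.e., $a_{k+1}/a_k\to\infty$ as $k\to\infty$.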
Theorem A. Let $f\colon\mathbb{R}\to\mathbb{R}$ be a $1$-periodic continuous function, let $U\sim\mathrm{Unif}([0,1])$ and let $(a_k)_{k\in\mathbb{N}}$ be a lacunary sequence with large gaps. Then the sequence $(S_n/n)_{n\in\mathbb{N}}$ satisfies an LDP in $\mathbb{R}$ at speed $n$ and with GRF $I_f$, where $S_n$ is defined as in Equation (3).

Remark 1.1. In essence, modulo several technicalities, the proof (given in Section 3) uses uniform approximation of continuous functions via trigonometric polynomials and the Gärtner-Ellis theorem.
We also provide a proof in a more restrictive setting using the Fourier expansion technique (see Proposition 3.2), where we put a growth condition on the Fourier coefficients of the function f . We do this with a view towards potential generalizations, e.g., for certain non-continuous f .
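The rate function $I_f$ appearing in Theorem A is the Legendre-Fenchel transform of $\theta\mapsto\log\int_0^1 e^{\theta f(\omega)}\,d\omega$ and can be evaluated numerically. The following Python sketch does this for the illustrative choice $f(\omega)=\cos(2\pi\omega)$; the helper names, the quadrature grid and the search interval for $\theta$ are assumptions of this sketch, not part of the paper.

```python
import math

GRID = 2000
# Values of the illustrative function f(w) = cos(2 pi w) at midpoints of [0, 1].
F_VALUES = [math.cos(2 * math.pi * (i + 0.5) / GRID) for i in range(GRID)]

def cgf(theta):
    """Lambda_f(theta) = log int_0^1 exp(theta f(w)) dw via the midpoint rule."""
    return math.log(sum(math.exp(theta * v) for v in F_VALUES) / GRID)

def rate(x, lo=-40.0, hi=40.0, iters=80):
    """I_f(x) = sup_theta [theta x - Lambda_f(theta)].

    The objective is concave in theta, so a ternary search on [lo, hi]
    (an assumed search window, adequate for moderate |x| < 1) suffices."""
    def objective(t):
        return t * x - cgf(t)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)

print(rate(0.0), rate(0.5))
```

The value at $x=0$ vanishes because $\int_0^1 f=0$, and the rate function is symmetric here because $\cos(2\pi U)$ has a symmetric distribution.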

The case of geometric progressions
We now consider a lacunary sequence $(a_k)_{k\in\mathbb{N}}$ of the form $a_k=q^k$ for some $q\in\mathbb{N}$ with $q\ge2$. In this case, the large deviation behavior changes dramatically, as is shown in the next theorem.
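Theorem B. Let $f\colon\mathbb{R}\to\mathbb{R}$ be $1$-periodic and Lipschitz continuous. Let $(a_k)_{k\in\mathbb{N}}$ be a geometric sequence, i.e., $a_k=q^k$ for some $q\in\mathbb{N}$ with $q\ge2$, and let $(S_n/n)_{n\in\mathbb{N}}$ be as defined in Equation (3). Then $(S_n/n)_{n\in\mathbb{N}}$ satisfies an LDP in $\mathbb{R}$ at speed $n$ with GRF $I_f^q$ given by the Legendre-Fenchel transform of $\Lambda_f^q(\theta):=\lim_{n\to\infty}\frac{1}{n}\log\mathbb{E}[e^{\theta S_n}]$. Furthermore, the limit $\lim_{q\to\infty} I_f^q(x)=I_f(x)$ holds uniformly on compact subsets of the interval $(-1,1)$.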
We are even able to strengthen Theorem B to arbitrary continuous functions.

Corollary 1.2.
Let $(a_k)_{k\in\mathbb{N}}$ be a geometric sequence, i.e., $a_k=q^k$ for some $q\in\mathbb{N}$ with $q\ge2$, and let $f\colon\mathbb{R}\to\mathbb{R}$ be a continuous $1$-periodic function. Then $(S_n/n)_{n\in\mathbb{N}}$ from Equation (3) satisfies an LDP in $\mathbb{R}$ at speed $n$ with some GRF $I$.
Before we present the necessary large deviation background and then the proofs of the main results, we close this section by providing some instructive examples.
Here the asymptotic cumulant generating function is Λ f 2 = Λ 2 as given in Theorem B in [2]. There is no closed expression for it, but the coefficients of its Taylor series may be computed explicitly; see Lemma 3.5 and Appendix A loc. cit. for more information.
Therefore, we get a markedly different rate function in this case (the trivial one, to be precise), although $f$ is nontrivial and seems to differ from the previous example only marginally.
4. $f(\omega)=\cos(2\pi\omega)+\cos(4\pi\omega)$ and $a_k=2^k+1$. Now $(a_k)_{k\in\mathbb{N}}$ is no longer a geometric sequence, yet because of $1$-periodicity we have which is exactly the same as in Example 2.
Together these examples demonstrate that the rate function, and hence the whole LDP, depends on the interplay between the function $f$ and the sequence $(a_k)_{k\in\mathbb{N}}$, and not on one of them alone.
In particular it is not only the arithmetic properties of (a k ) k∈N that influence the LDP: in the last example it is neither a geometric sequence nor does it have large gaps, so it does not fall into the framework of the present Theorems A and B, and still an LDP can be proved.

Elements of large deviations theory - some elementary background
Large deviations describe the decay of the probability of rare events on an exponential scale. In contrast to the law of large numbers and the CLT, large deviations behavior is highly non-universal. In 1938 Cramér [5] published his famous theorem (see also Theorem 2.2.3 in [6]): for a sequence of independent and identically distributed random variables $(X_n)_{n\in\mathbb{N}}$ with finite exponential moments, i.e., $\mathbb{E}[e^{uX_1}]<\infty$ for all $u$ in a neighborhood of $0$, one has that $\lim_{n\to\infty}\frac{1}{n}\log\mathbb{P}\big(\frac{1}{n}\sum_{k=1}^n X_k\ge x\big)=-\Lambda^*(x)$ for all $x>\mathbb{E}[X_1]$, where $\Lambda^*$ is the Legendre-Fenchel transform of the cumulant generating function $\Lambda(u):=\log\mathbb{E}[e^{uX_1}]$. A few decades later, Donsker and Varadhan initiated the systematic study of large deviations and generalized Cramér's idea (see [6] for more historical background). A sequence $(X_n)_{n\in\mathbb{N}}$ of random variables (not necessarily i.i.d.) satisfies an LDP at speed $(s_n)_{n\in\mathbb{N}}$ with rate function $I\colon\mathbb{R}\to[0,\infty]$ if, for all Borel sets $A\subset\mathbb{R}$, $-\inf_{x\in A^\circ} I(x)\le\liminf_{n\to\infty}\frac{1}{s_n}\log\mathbb{P}(X_n\in A)\le\limsup_{n\to\infty}\frac{1}{s_n}\log\mathbb{P}(X_n\in A)\le-\inf_{x\in\overline{A}} I(x)$, where $A^\circ$ and $\overline{A}$ denote the interior and the closure of $A$, respectively. The speed $(s_n)_{n\in\mathbb{N}}$ is a sequence of positive real numbers converging to infinity and the rate function $I$ is lower-semicontinuous. If $I$ has compact level sets, we speak of a good rate function (GRF). An important result from large deviations theory is the Gärtner-Ellis theorem, which we shall use several times in this paper. Having its roots in a paper of Gärtner from 1977 [10], this result was established by Ellis in 1984 [7]. The reader may also consult [6, Theorem 2.3.6].
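Theorem 2.1. Let $(Z_n)_{n\in\mathbb{N}}$ be a sequence of real-valued random variables and assume that the following limit exists and is differentiable on $\mathbb{R}$: $\Lambda(\theta):=\lim_{n\to\infty}\frac{1}{n}\log\mathbb{E}[e^{n\theta Z_n}]$. Then $(Z_n)_{n\in\mathbb{N}}$ satisfies an LDP at speed $n$ and with GRF $\Lambda^*\colon\mathbb{R}\to[0,\infty]$, where $\Lambda^*(x):=\sup_{\theta\in\mathbb{R}}[\theta x-\Lambda(\theta)]$ is the Legendre-Fenchel transform of $\Lambda$.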
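As a concrete check of these notions in the simplest i.i.d. setting, consider sums of independent fair signs $\pm1$. The Python sketch below is an illustration added here, not an example from the paper. It uses the classical facts that the cumulant generating function is $\log\cosh\theta$ and that its Legendre-Fenchel transform is $\frac{1}{2}[(1+x)\log(1+x)+(1-x)\log(1-x)]$ for $|x|<1$, and compares this rate with an exact binomial tail probability. The helper names and the choice $n=2000$, $x=0.2$ are assumptions.

```python
import math

def rate_signs(x):
    """Legendre-Fenchel transform of log cosh, valid for -1 < x < 1."""
    return 0.5 * ((1 + x) * math.log(1 + x) + (1 - x) * math.log(1 - x))

def empirical_rate(n, k):
    """-(1/n) log P(S_n / n >= 2k/n - 1) for S_n a sum of n fair +-1 signs.

    The event holds iff at least k of the n signs equal +1, so the
    probability is an exact binomial tail computed with integers."""
    tail = sum(math.comb(n, j) for j in range(k, n + 1))
    return -(math.log(tail) - n * math.log(2)) / n

n, k = 2000, 1200        # corresponds to x = 2k/n - 1 = 0.2
x = 2 * k / n - 1
print(rate_signs(x), empirical_rate(n, k))
```

The empirical value lies slightly above the rate function, with a gap of order $(\log n)/n$, which is the kind of subexponential correction that an LDP at speed $n$ does not resolve.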

Proofs
In this section we provide the proofs of all statements in Section 1.1. The central idea is to establish LDPs for trigonometric polynomials and use uniform approximation of continuous functions by such polynomials. We start with a lemma that will later facilitate the application of the Gärtner-Ellis theorem. In the following, if, e.g., j is an integer index, j ∈ [a, b] means j ∈ {k ∈ Z : a ≤ k ≤ b}, where a, b ∈ Z.
Furthermore, given a lacunary sequence (a k ) k∈N with large gaps, there exists k 0 = k 0 (d , m) ∈ N such that for every θ ∈ R and for all n > k 0 ,

Proof. A proof by induction reveals that, for all
with some numbers $c$, the precise knowledge of which is irrelevant. This yields A simple computation reveals that this implies the integral representation of $b_0(\theta,d)$ in the statement of the lemma, i.e., $b_0(\theta,d)=\int_0^1 p_d(\theta f(\omega))\,d\omega$. Because $p_d$ approximates the exponential function uniformly on bounded subsets of $\mathbb{C}$ (in particular on compact subsets) and since $\theta f([0,1])\subset\mathbb{C}$ is compact, integration and limit for $d\to\infty$ may be exchanged, which shows that $\lim_{d\to\infty} b_0(\theta,d)=\int_0^1 e^{\theta f(\omega)}\,d\omega$. Now let $\theta\in\mathbb{R}$ and $n>k_0$, where $k_0\in\mathbb{N}$ is such that It is left to show that if $j_k\neq0$ for some $k\in[k_0+1,n]$, then $\sum_{k=k_0+1}^n j_k a_k\neq0$, because in that case the integral of the exponential function evaluates to zero and the only nontrivial summand left is the one with $j_{k_0+1}=\dots=j_n=0$, which then yields the desired $b_0(\theta,d)^{n-k_0}$. So let $(j_{k_0+1},\dots,j_n)\neq0$ and let $\ell:=\max\{k\in[k_0+1,n] : j_k\neq0\}$. By the choice of $k_0$, $a_\ell\ge a_k(dm+1)^{\ell-k}$ for all $k\in[k_0+1,\ell]$. We now consider two cases according to the sign of $j_\ell$. First, suppose that $j_\ell>0$. Then, because $j_\ell\ge1$ and Using $a_k\le a_\ell(dm+1)^{k-\ell}$ for all $k\in[k_0+1,\ell]$, this can be estimated further and we obtain The case $j_\ell<0$ is argued analogously. This completes the proof.
We use the previous lemma to prove Theorem A by employing the Stone-Weierstraß theorem (see, e.g., [16,Chapter 15]).
Proof of Theorem A. The proof is done in two steps: first we assume that f is a trigonometric polynomial and then we use approximation for arbitrary f .
Step 1: The general idea is the same as in the proof of Theorem A in [2]; in particular, [2, Lemma 3.2] can be adapted to hold for any bounded function $f$ instead of $\omega\mapsto\cos(2\pi a_k\omega)$. Now let $f\colon\omega\mapsto\sum_{j=-m}^m c_j e^{2\pi i j\omega}$ be a trigonometric polynomial for some $m\in\mathbb{N}_0$ and $c_{-m},\dots,c_m\in\mathbb{C}$ (actually $c_{-j}=\overline{c_j}$ holds for all $j\in[0,m]$ because of $f(\mathbb{R})\subset\mathbb{R}$). For fixed $\theta\in\mathbb{R}$ and $\varepsilon>0$, we can find some $d_0=d_0(\varepsilon)\in\mathbb{N}$ such that $\lim_{\varepsilon\to0}d_0(\varepsilon)=\infty$ and for all $d\ge d_0$ because $p_d$, as defined in Lemma 3.1, is the $d$-th Taylor polynomial of the exponential function. Recall that, by Lemma 3.1, for fixed $d\ge d_0$ we can find $k_0\in\mathbb{N}$ such that, for all $\theta\in\mathbb{R}$ and $n>k_0$, Combining (11) and (12) results in the estimate Analogously one obtains the lower bound, where we have Hence, the Gärtner-Ellis limit exists for every $\theta\in\mathbb{R}$, is finite, and the map $\Lambda_f$ is clearly differentiable on $\mathbb{R}$. Therefore, the claim follows.
Step 2: Now let $f$ be an arbitrary continuous $1$-periodic function. By the Stone-Weierstraß theorem there exists a sequence $(f_m)_{m\in\mathbb{N}}$ of trigonometric polynomials with $\lim_{m\to\infty}\|f-f_m\|_\infty=0$. (13) Note that each $f_m$ can be chosen to be real-valued. Indeed, for any $\omega\in\mathbb{R}$, We are going to prove that, for any $\theta\in\mathbb{R}$, $\lim_{n\to\infty}\frac{1}{n}\log\mathbb{E}[e^{\theta S_n}]=\Lambda_f(\theta)$. Because obviously $\Lambda_f(\mathbb{R})\subset\mathbb{R}$ and $\Lambda_f$ is differentiable on $\mathbb{R}$, we can then apply the Gärtner-Ellis theorem to obtain the claimed LDP with the corresponding rate function. So let $S_n^m:=\sum_{k=1}^n f_m(a_kU)$, $n\in\mathbb{N}$, $\theta\in\mathbb{R}$, and assume $\varepsilon>0$. Then there exists $m_1\in\mathbb{N}$ such that, for all $m\ge m_1$, From the uniform approximation (13) we infer, for any $n\in\mathbb{N}$ and each realization, This in turn implies and the proof is complete.
In view of the classical work [11] of Kac, another possible approach uses Fourier analytic methods. Taking this route allows us to establish a large deviation principle under certain growth conditions on the Fourier coefficients. Although this result is obtained under slightly stronger assumptions, we present it here with a view towards potential improvements in the future, where such an approach might be useful.
we assume that $|c_k|\le M|k|^{-\beta}$ for some $\beta>1$ and some constant $M\in(0,\infty)$. Then, $(S_n/n)_{n\in\mathbb{N}}$ satisfies the LDP at speed $n$ with GRF $I_f$ from (6).

Remark 3.3.
Note that the assumptions of Proposition 3.2 imply continuity of the function $f$. To see this, we consider a sequence $(x_n)_{n\in\mathbb{N}}$ of real numbers converging to some $x\in\mathbb{R}$. Then, for fixed $N\in\mathbb{N}$, we have where we used our growth condition on the Fourier coefficients of $f$ in the last line. Now, if we let $n\to\infty$, we have For $N\to\infty$, the right-hand side tends to zero, since we assumed that $\beta>1$.
In the proof of Theorem A we saw that the LDP holds for $S_n^m:=\sum_{k=1}^n f_m(a_kU)$, $n\in\mathbb{N}$, with the i.i.d. rate function $I_{f_m}$. Now fix $\varepsilon>0$ and choose $m=m(\varepsilon)$ such that $\lim_{\varepsilon\to0}m(\varepsilon)=\infty$ and This approximation can be used to bound the Gärtner-Ellis limit from above: we get We note that the convergence of $(f_m)_{m\in\mathbb{N}}$ towards $f$ is uniform and that such an $f$ is bounded. This allows us to interchange integral and limit, and we end up with The latter function is differentiable in $\theta$ and thus by the Gärtner-Ellis theorem we obtain an LDP at speed $n$ for $(S_n/n)_{n\in\mathbb{N}}$ with GRF $I_f$.
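The continuity argument in Remark 3.3 rests on the tail of the Fourier series being uniformly small under the growth condition $|c_k|\le M|k|^{-\beta}$ with $\beta>1$. A minimal Python sketch of the underlying estimate follows. It uses the standard integral comparison $\sum_{|k|>N}M|k|^{-\beta}\le 2MN^{1-\beta}/(\beta-1)$, which is our reading of the bound and may differ in constants from the one in the paper; the helper names and the cutoff `K` are assumptions.

```python
def tail_sum(M, beta, N, K=200000):
    """Partial tail sum of M * |k|**(-beta) over N < |k| <= K (both signs of k)."""
    return 2 * M * sum(k ** (-beta) for k in range(N + 1, K + 1))

def tail_bound(M, beta, N):
    """Integral comparison: sum over |k| > N of M |k|^(-beta) <= 2 M N^(1-beta) / (beta - 1)."""
    return 2 * M * N ** (1 - beta) / (beta - 1)

for N in (1, 10, 100):
    print(N, tail_sum(1.0, 1.5, N), tail_bound(1.0, 1.5, N))
```

The bound tends to zero as $N\to\infty$ precisely because $\beta>1$, which is the step used at the end of the remark.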
Proof of Theorem B. The proof follows essentially the same steps as the proof of Theorem B and Proposition 3.4 in [2], where the LDP is proven using tools from hyperbolic dynamics and mixing processes. For more information, we refer the reader to the references given on p. 19 in [2]. First, we define the map $T\colon[0,1]\to[0,1]$ with $T(\omega):=q\omega-\lfloor q\omega\rfloor$.
Then, using that $a_k=q^k$ for $k\in\mathbb{N}$, the lacunary sum $S_n$ from (2) can be written as As in the proof of Theorem A, we use the Gärtner-Ellis theorem to show the LDP. Thus, we need to prove that the limit $\Lambda_f^q(\theta):=\lim_{n\to\infty}\frac{1}{n}\log\mathbb{E}[e^{\theta S_n}]$ exists for all $\theta\in\mathbb{R}$ and is differentiable in $\theta$. In order to do so, we express $e^{\theta S_n}$ in terms of a certain linear operator. Let $\mathrm{Lip}[0,1]$ be the Banach space of Lipschitz-continuous functions $g\colon[0,1]\to\mathbb{C}$, endowed with the norm $\|g\|:=\|g\|_\infty+L(g)$, where $L(g)$ is the Lipschitz constant of $g$. Then, for fixed $\theta\in\mathbb{R}$, define the linear operator $\Phi_{\theta,q}$: Next, we consider the Perron-Frobenius operator associated to $T$, i.e., $\Phi_q$: where we note that the operator from (15) can be interpreted as a perturbation of the Perron-Frobenius operator in (16). We have that In a moment we will need the following basic property of $\Phi_q$: for $g\in\mathrm{Lip}[0,1]$, where we used the variable substitution $(\omega+k)/q=x$ for $k=0,\dots,q-1$. Let $\Phi^n_{\theta,q}$ and $\Phi^n_q$ denote the $n$-fold composition of $\Phi_{\theta,q}$ and $\Phi_q$, respectively. Then, by Proposition 5.1 in [3], we have $\Phi^n_{\theta,q}[g]=\Phi^n_q[e^{\theta S_n}g]$ for every $n\in\mathbb{N}$. Using this, we can write where $1$ denotes the constant function with value $1$. The second equation in (18) holds due to the calculation in (17).
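The basic property of the Perron-Frobenius operator used above, namely that it preserves the Lebesgue integral, can be checked numerically. The Python sketch below assumes the standard form of the transfer operator of $\omega\mapsto q\omega-\lfloor q\omega\rfloor$, that is, $\Phi_q[g](\omega)=\frac{1}{q}\sum_{k=0}^{q-1}g((\omega+k)/q)$; this form is an assumption suggested by the substitution $(\omega+k)/q=x$ and should be compared with the paper's definition. The test function `g`, the helper names and the grid size are illustrative.

```python
import math

def T(w, q):
    """The map T(w) = q w - floor(q w) on [0, 1]."""
    return q * w - math.floor(q * w)

def transfer(g, q):
    """Assumed form: Phi_q[g](w) = (1/q) * sum_{k=0}^{q-1} g((w + k) / q)."""
    return lambda w: sum(g((w + k) / q) for k in range(q)) / q

def midpoint(g, n=1000):
    """Midpoint rule for the integral of g over [0, 1]."""
    return sum(g((i + 0.5) / n) for i in range(n)) / n

g = lambda x: x * x + math.sin(2 * math.pi * x)   # illustrative test function
q = 3
lhs = midpoint(transfer(g, q))   # integral of Phi_q[g]
rhs = midpoint(g)                # integral of g
print(lhs, rhs)
```

The points $(\omega+k)/q$, $k=0,\dots,q-1$, are exactly the preimages of $\omega$ under $T$, which is why averaging over them and integrating in $\omega$ recovers the integral of $g$.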
By assumption, $f$ is Lipschitz-continuous and hence Theorem 4.1 in [20] and Theorem 1.5 in [3] are applicable. We therefore get that $\Phi_{\theta,q}$ has a positive eigenvalue $\lambda_{\theta,q}=:\lambda_\theta$ with multiplicity $1$ and that all other eigenvalues of $\Phi_{\theta,q}$ have strictly smaller modulus than $\lambda_\theta$. We use the well-known decomposition $\Phi_{\theta,q}=\lambda_\theta Q_\theta+R_\theta$, (19) where $Q_\theta$ is a projection operator onto the line spanned by an eigenfunction $h_\theta>0$ associated to the eigenvalue $\lambda_\theta$, and $R_\theta$ is an operator whose spectral radius is strictly smaller than $\lambda_\theta$ and which is orthogonal to $Q_\theta$ in the sense $R_\theta Q_\theta=Q_\theta R_\theta=0$. Moreover, there exists a probability measure $\mu_\theta$ on $[0,1]$ such that for all $g\in\mathrm{Lip}[0,1]$ we have that
Using these quantities and the orthogonality of $R_\theta$ and $Q_\theta$ gives us, for $g\in\mathrm{Lip}[0,1]$, $\Phi^n_{\theta,q}[g]=\lambda_\theta^n Q_\theta[g]+R_\theta^n[g]$. (20) If we set $g=1$, using (18), we obtain $\mathbb{E}[e^{\theta S_n}]=\int_0^1\Phi^n_{\theta,q}[1](\omega)\,d\omega=\lambda_\theta^n\int_0^1 Q_\theta[1](\omega)\,d\omega+\int_0^1 R_\theta^n[1](\omega)\,d\omega$. Since the spectral radius of $R_\theta$ is strictly smaller than $\lambda_\theta$, we get $\lim_{n\to\infty}\lambda_\theta^{-n}\,\mathbb{E}[e^{\theta S_n}]=\int_0^1 Q_\theta[1](\omega)\,d\omega$. (21) In particular, we have that $\lim_{n\to\infty}\frac{1}{n}\log\mathbb{E}[e^{\theta S_n}]=\log\lambda_\theta$, which proves the existence of the Gärtner-Ellis limit. We now turn to the proof of the remaining assertions, which we claim (and justify below) can be deduced from the perturbation theory of linear operators (see Chapter 7 in [13]), in particular the Kato-Rellich theorem, as stated in Theorem 4.24 in [20]. Indeed, since the family of operators $\Phi_{\theta,q}$ depends on $\theta\in\mathbb{C}$ in an analytic way (see Proposition 5.1 (P3) in [4] and Theorem 1.7 in [13]), the decomposition (19) continues to hold in some neighborhood $D$ of the real axis (with $\lambda_\theta$, $h_\theta$ and $\mu_\theta$ becoming complex-valued), with $\lambda_\theta\neq0$ and $\lambda_\theta$ (as well as $h_\theta$, $\mu_\theta$, $R_\theta$) being analytic on $D$. Moreover, $|\lambda_\theta|$ stays strictly greater than the spectral radius of $R_\theta$ if $D$ is sufficiently small, which, looking at (20), shows that convergence in (21) is uniform on compact subsets of $D$.
For the second statement of Theorem B, we fix $\theta\in\mathbb{R}$ and let $q\to\infty$. We note that the operator $\Phi_{\theta,q}$ from (15) is a Riemann sum and converges to the corresponding Riemann integral. This implies that the sequence of operators $\Phi_{\theta,q}$ for $q\ge2$ converges to the operator $\widetilde{\Phi}_\theta[g]:=\int_0^1 e^{\theta f(x)}g(x)\,dx$, where $\widetilde{\lambda}_\theta:=\int_0^1 e^{\theta f(x)}\,dx=e^{\Lambda_f(\theta)}$ with the cumulant generating function $\Lambda_f$ defined in (4). Thus, $\widetilde{\lambda}_\theta$ is the Perron-Frobenius eigenvalue of $\widetilde{\Phi}_\theta$, since $\widetilde{\Phi}_\theta/\widetilde{\lambda}_\theta$ is a projection onto the line spanned by the function $1$. Now if $\theta\in\mathbb{R}$ stays constant and $q\to\infty$, we can view $\Phi_{\theta,q}$ as a perturbation of $\widetilde{\Phi}_\theta$. By perturbation theory (see, e.g., [13]), we have the convergence of the Perron-Frobenius eigenvalues, that is, $\lim_{q\to\infty}\lambda_{\theta,q}=\widetilde{\lambda}_\theta$ for every $\theta\in\mathbb{R}$. Taking the logarithm, we get $\lim_{q\to\infty}\Lambda_f^q(\theta)=\Lambda_f(\theta)$. Since the involved functions are convex, the convergence is in fact uniform on compact intervals. By taking the Legendre-Fenchel transform, it follows that $\lim_{q\to\infty}I_f^q(x)=I_f(x)$ locally uniformly on $(-1,1)$.
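The dependence on $q$ can already be seen at finite $n$. The following Python sketch, for the illustrative choice $f(\omega)=\cos(2\pi\omega)$ and $a_k=q^k$, computes $\frac{1}{n}\log\mathbb{E}[e^{\theta S_n}]$ by numerical integration for $q=2$ and $q=8$ and compares it with the i.i.d. value $\Lambda_f(\theta)$. The values $n=4$, $\theta=1$, the grid sizes and the helper names are assumptions of this sketch; a finite $n$ only hints at the limits $\Lambda_f^q(\theta)$ and does not compute them.

```python
import math

def finite_cgf(theta, q, n, grid=50001):
    """(1/n) log E[exp(theta * S_n)] with S_n = sum_{k=1}^n cos(2 pi q^k U),
    U uniform on [0, 1], via an equispaced rule on [0, 1)."""
    total = 0.0
    for i in range(grid):
        w = i / grid
        s = sum(math.cos(2 * math.pi * (q ** k) * w) for k in range(1, n + 1))
        total += math.exp(theta * s)
    return math.log(total / grid) / n

def iid_cgf(theta, grid=2000):
    """Lambda_f(theta) = log int_0^1 exp(theta cos(2 pi w)) dw (midpoint rule)."""
    values = (math.exp(theta * math.cos(2 * math.pi * (i + 0.5) / grid))
              for i in range(grid))
    return math.log(sum(values) / grid)

theta, n = 1.0, 4
lam_iid = iid_cgf(theta)
lam_q2 = finite_cgf(theta, 2, n)
lam_q8 = finite_cgf(theta, 8, n)
print(lam_iid, lam_q2, lam_q8)
```

For $q=8$ the finite-$n$ value is numerically indistinguishable from the i.i.d. one, while for $q=2$ it is visibly larger. This is consistent with the arithmetic relation $a_k+a_k=a_{k+1}$, which makes the third moment of $S_n$ positive when $q=2$.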