Speed of convergence of Weyl sums over Kronecker sequences

We study the speed of convergence in the numerical integration with Weyl sums over Kronecker sequences in the torus,
$$\dfrac{1}{N}\sum_{n=1}^{N}f\left(x+n\alpha\right)-\int_{\mathbb{T}^{d}}f(y)\,dy.$$

A classical result of Leopold Kronecker states that if the entries of the vector $\alpha$ in $\mathbb{R}^d$ together with the number 1 are linearly independent over the rationals, then the sequence $\{n\alpha\}_{n=1}^{+\infty}$ is dense in the $d$ dimensional torus $\mathbb{T}^d=\mathbb{R}^d/\mathbb{Z}^d$. Indeed Hermann Weyl proved that more is true: the sequence $\{n\alpha\}_{n=1}^{+\infty}$ is equidistributed, that is, for every Riemann integrable function $f(x)$ on the torus one has
$$\lim_{N\to+\infty}\dfrac{1}{N}\sum_{n=1}^{N}f(n\alpha)=\int_{\mathbb{T}^d}f(y)\,dy.$$
The speed of convergence can be estimated with the Koksma inequality, which in one dimension bounds the error by the product of the discrepancy of the sequence with respect to intervals and the total variation of the function. In the Koksma Hlawka inequality in several variables the discrepancy with respect to intervals is replaced by the discrepancy with respect to parallelepipeds with sides parallel to the axes, and the total variation is replaced by the Hardy Krause variation. See [15, Section 5 in Chapter 2] and [6, Definition 1.13 and Theorem 1.14].

In one dimension it is known that there exist absolute constants $c$ and $C$ such that for any infinite sequence $\{\alpha_n\}_{n=1}^{+\infty}$ one has $D(\{\alpha_n\}_{n=1}^{N})\ge cN^{-1}\log(1+N)$ for infinitely many values of $N$, but there are sequences for which $D(\{\alpha_n\}_{n=1}^{N})\le CN^{-1}\log(1+N)$ for every $N$. Examples are the van der Corput sequence and the Kronecker sequences $\{n\alpha\}_{n=1}^{+\infty}$ whose parameters $\alpha$ have continued fraction expansions with bounded partial quotients. In dimension $d>1$ the discrepancy of infinite sequences is larger than $cN^{-1}\log(1+N)$. It is a classical result in the metric theory of Diophantine approximation that for almost every $\alpha$ in the $d$ dimensional torus $\mathbb{T}^d$ the Kronecker sequence $\{n\alpha\}_{n=1}^{N}$ has a discrepancy between $cN^{-1}\log^{d}(1+N)\log(\log(2+N))$ and $CN^{-1}\log^{d}(1+N)\log^{1+\varepsilon}(\log(2+N))$, for every $\varepsilon>0$. The case $d=1$ has been proved with the use of continued fractions by Khintchin, while the case $d>1$ has been proved with Fourier analysis by Beck. See [2] and [6, Theorem 1.72 and Theorem 1.91]. Hence, if the function $f(x)$ has bounded Hardy Krause variation, then for almost every $\alpha$ one has
$$\left|\dfrac{1}{N}\sum_{n=1}^{N}f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy\right|\le c\,N^{-1}\log^{d}(1+N)\log^{1+\varepsilon}(\log(2+N)).$$
The purpose of this paper is to remove some of these logarithms.
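The one dimensional discrepancy bounds quoted above can be explored numerically. The following is a minimal sketch, not taken from the paper: it computes the star discrepancy (which is comparable, up to a factor 2, with the discrepancy with respect to intervals) of the first $N$ points of the base 2 van der Corput sequence and of the Kronecker sequence with the golden ratio, whose partial quotients are bounded. The function names and the choice of parameter are assumptions made for illustration.

```python
import math

def van_der_corput(n, base=2):
    """n-th van der Corput point: the radical inverse of n in the given base."""
    x, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        x += digit / denom
    return x

def kronecker(n, alpha):
    """n-th point of the Kronecker sequence, reduced modulo 1."""
    return (n * alpha) % 1.0

def star_discrepancy(points):
    """Exact one dimensional star discrepancy of a finite point set in [0, 1)."""
    xs = sorted(points)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

# Hypothetical choice: golden ratio, all partial quotients equal to 1.
GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0

def report(N):
    """Return (van der Corput discrepancy, Kronecker discrepancy, log(1+N)/N)."""
    vdc = star_discrepancy([van_der_corput(n) for n in range(1, N + 1)])
    kro = star_discrepancy([kronecker(n, GOLDEN) for n in range(1, N + 1)])
    return vdc, kro, math.log(1 + N) / N

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, report(N))
```

Both discrepancies should stay within a moderate multiple of $N^{-1}\log(1+N)$, in agreement with the upper bound recalled above.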
In [17] Owen and Pan raised the question of improving the error estimates obtained with Koksma Hlawka and discrepancy by fixing a single function, but in the end they proved that with functions in suitable reproducing kernel Hilbert spaces some powers of logarithms are indeed necessary for infinitely many $N$. On the other hand it is known that, under suitable Diophantine properties of the parameters $\alpha$ and smoothness properties of the functions $f(x)$, the speed of convergence of the numerical integration with Weyl sums on the left may be smaller than the discrepancy of the sequence $\{n\alpha\}_{n=1}^{+\infty}$ on the right. In particular, functions with speed of convergence $c/N$ are said to have bounded remainders. See e.g. [5, 8-10, 13, 16, 18, 19] and Remark 13 below.

The purpose of this paper is to show that if the Fourier expansion of the function $f(x)$ is absolutely convergent, then for almost every Kronecker sequence $\{n\alpha\}_{n=1}^{+\infty}$ in the torus $\mathbb{T}^d$ one can obtain similar estimates, with $N^{-1}\log^{d}(1+N)\log^{1+\varepsilon}(\log(2+N))$ replaced by $N^{-1}\log(1+N)\log^{1+\varepsilon}(\log(2+N))$, with the same exponents of the logarithms in any dimension $d$. Moreover, under the assumption that the Fourier transform of the function is in $L\log L(\mathbb{Z}^d)$, a bit more than absolutely convergent, for almost every $\alpha$ one can get rid of the logarithmic factor and obtain a speed of convergence $c/N$. On the other hand, an easy application of the triangle inequality shows that the speed of convergence $c/N$ is best possible and it cannot be improved by any infinite sequence $\{\alpha_n\}_{n=1}^{+\infty}$ and any non constant integrable function $f(x)$. In order to compare our results with others already in the literature, it should be pointed out that our results and the Koksma Hlawka inequality apply to different classes of functions.
The class of functions with absolutely convergent Fourier expansion, or with Fourier transform in $L\log L(\mathbb{Z}^d)$, is different from the class of functions with bounded variation. There are functions with absolutely convergent Fourier expansion and unbounded variation, and there are functions with bounded variation and non absolutely convergent Fourier expansion. It should also be pointed out that our results are different from results that involve the smoothness of the functions, for example the ones in [5, 9, 10]. Functions with suitable smoothness have absolutely convergent Fourier expansions, but not vice versa. Functions with absolutely convergent Fourier expansions may have a fractal nature and no smoothness. See Remark 14 below. The proofs of our results are based on fairly elementary tools of Fourier analysis and measure theory. The discrepancy of the sequences $\{n\alpha\}_{n=1}^{+\infty}$ does not enter explicitly into these proofs, and instead a major role is played by the measure properties of the functions $\alpha\to\|m\cdot\alpha\|^{-1}$, where $\alpha\in\mathbb{T}^d$, $m\in\mathbb{Z}^d-\{0\}$, and $\|m\cdot\alpha\|$ denotes the distance of $m\cdot\alpha$ to the nearest integer.
Since all functions considered in what follows are periodic, it is not a loss of generality to assume that in the definition of the Kronecker sequence $\{n\alpha\}_{n=1}^{+\infty}$ all entries of the vector $\alpha$ are reduced modulo 1, that is the parameter $\alpha$ is in the torus. In what follows the Fourier transform and Fourier expansion of an integrable function $f(x)$ on the torus are defined by
$$\hat f(m)=\int_{\mathbb{T}^d}f(x)\exp(-2\pi i m\cdot x)\,dx,\qquad f(x)\sim\sum_{m\in\mathbb{Z}^d}\hat f(m)\exp(2\pi i m\cdot x).$$

Also let $f(x)$ be an integrable function on the torus $\mathbb{T}^d$ with absolutely convergent Fourier expansion,
$$\sum_{m\in\mathbb{Z}^d}|\hat f(m)|<+\infty.$$
Then for almost all $\alpha$ in the torus, with a possible exception of a set of Lebesgue measure zero, there exist constants $c(f,\alpha)$ such that for every positive integer $N$ and every $x$ in $\mathbb{T}^d$ one has
$$\left|\dfrac{1}{N}\sum_{n=1}^{N}f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy\right|\le c(f,\alpha)\,N^{-1}\Phi(N).$$

For example, $\Phi(t)=\log(1+t)\log^{1+\varepsilon}(\log(2+t))$ with $\varepsilon>0$ is an admissible function for Theorem 1. Theorem 2 is similar. Roughly speaking it says that one can remove the factor $\Phi(N)$ from the speed of convergence by introducing an extra logarithmic decay in the Fourier transform.

Theorem 2 Let f (x) be an integrable function on the torus T d , and let
Then for almost all $\alpha$ in the torus, with a possible exception of a set of $\delta$ dimensional Hausdorff measure zero, there exist constants $c(f,\alpha)$ such that for every positive integer $N$ and every $x$ in $\mathbb{T}^d$ one has
$$\left|\dfrac{1}{N}\sum_{n=1}^{N}f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy\right|\le c(f,\alpha)\,N^{-1}.$$

The above theorems seem not too far from being sharp, as far as the requirement that the Fourier expansions of the functions be absolutely convergent is concerned.
Observe that, since the norm in $L^2(\mathbb{T})$ is dominated by the norm in $L^\infty(\mathbb{T})$, when the above theorem is applied to a continuous function one can replace the square norm with the supremum norm. In the above theorems one might speculate that the exceptional sets are related to the Diophantine properties of the entries of the vectors $\alpha$, but in fact every $\alpha$ can be exceptional, even with an arbitrarily low speed of convergence.

Theorem 4 Assume that $X(\mathbb{T}^d)$ is a Banach function space on the torus with a norm invariant under translations. Assume that $X(\mathbb{T}^d)$ is continuously embedded into a Banach function space $Y(\mathbb{T}^d)$ continuously embedded into the space of integrable functions. Assume that $X(\mathbb{T}^d)$ contains all the exponentials $\{\exp(2\pi i m\cdot x)\}_{m\in\mathbb{Z}^d}$ and that for some $K>0$ and for all $m$ in $\mathbb{Z}^d$ one has
$$\|\exp(2\pi i m\cdot x)\|_{Y(\mathbb{T}^d)}\ge K\,\|\exp(2\pi i m\cdot x)\|_{X(\mathbb{T}^d)}.$$
Also assume that $\{\Phi(N)\}_{N=1}^{+\infty}$ is an increasing sequence of positive numbers diverging to $+\infty$. Then for every countable set of points in the torus there exists a set $F$ of functions dense in $X(\mathbb{T}^d)$ such that for every $\alpha$ in this countable set and every $f(x)$ in $F$,

The above theorem is an easy corollary of the principle of condensation of singularities, and likely it is just a slight variation of known results. Indeed, it is known that no general statements can be made about the rate of convergence in ergodic theorems. In particular, since the functions with Fourier transforms in $L\log L(\mathbb{Z}^d)$ are a Banach space contained in the space of continuous functions, it follows that for every $\varepsilon>0$ and every $\alpha$, with rational or irrational entries, there exist functions with Fourier

Finally, Theorem 2 is best possible in the sense that for every infinite sequence of points $\{\alpha_n\}_{n=1}^{+\infty}$ and every non constant integrable function $f(x)$ the speed of convergence $c/N$ cannot be improved.

is an infinite sequence of points in the torus and if $f(x)$ is in $X(\mathbb{T}^d)$, then for the majority of positive values of $N$, at least one out of every two consecutive values, one has
The theorem is an easy application of the triangle inequality, and likely it is just a slight variation of known results, but we do not have precise references. It should be emphasized that in this theorem the sequence {α n } +∞ n=1 has to be infinite. For fixed N better estimates may hold.
The starting point of the proofs of the theorems is an elementary lemma.

Lemma 6
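For every sequence $\{\alpha_n\}_{n=1}^{+\infty}$ of points and every integrable function $f(x)$ one has the Fourier expansion
$$\dfrac{1}{N}\sum_{n=1}^{N}f(x+\alpha_n)-\int_{\mathbb{T}^d}f(y)\,dy\sim\sum_{m\in\mathbb{Z}^d-\{0\}}\left(\dfrac{1}{N}\sum_{n=1}^{N}\exp(2\pi i m\cdot\alpha_n)\right)\hat f(m)\exp(2\pi i m\cdot x).$$
In particular, for a Kronecker sequence $\{n\alpha\}_{n=1}^{+\infty}$,
$$\dfrac{1}{N}\sum_{n=1}^{N}f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy\sim\sum_{m\in\mathbb{Z}^d-\{0\}}\exp(\pi i(N+1)m\cdot\alpha)\,\dfrac{\sin(\pi N m\cdot\alpha)}{N\sin(\pi m\cdot\alpha)}\,\hat f(m)\exp(2\pi i m\cdot x).$$

Proof The first assertion follows from the Fourier expansion of $f(x+\alpha_n)$. Observe that one does not claim pointwise or norm convergence, but only that the function on the left has the Fourier expansion on the right. The second assertion is a sum of a geometric series.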
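The geometric series in the second assertion of Lemma 6 can be checked numerically. The sketch below, with hypothetical function names, compares the term by term average $\frac{1}{N}\sum_{n=1}^{N}\exp(2\pi i n\theta)$ with the closed form $\exp(\pi i(N+1)\theta)\sin(\pi N\theta)/(N\sin(\pi\theta))$, and also checks the elementary bound $\min\{1,(2N\|\theta\|)^{-1}\}$ for its modulus, which follows from $|\sin(\pi\theta)|\ge 2\|\theta\|$. Here $\theta$ plays the role of $m\cdot\alpha$ and is assumed not to be an integer.

```python
import cmath
import math

def weyl_average(theta, N):
    """(1/N) * sum_{n=1}^{N} exp(2 pi i n theta), computed term by term."""
    return sum(cmath.exp(2j * math.pi * n * theta) for n in range(1, N + 1)) / N

def closed_form(theta, N):
    """Sum of the geometric series; theta must not be an integer."""
    phase = cmath.exp(1j * math.pi * (N + 1) * theta)
    return phase * math.sin(math.pi * N * theta) / (N * math.sin(math.pi * theta))

def dist_to_nearest_integer(t):
    """The quantity ||t||, the distance of t to the nearest integer."""
    return abs(t - round(t))

def kernel_bound(theta, N):
    """Elementary bound min(1, 1 / (2 N ||theta||)) for the modulus of the average."""
    return min(1.0, 1.0 / (2.0 * N * dist_to_nearest_integer(theta)))

if __name__ == "__main__":
    theta = math.sqrt(2.0)
    for N in (1, 10, 100):
        print(N, abs(weyl_average(theta, N)), kernel_bound(theta, N))
```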

Lemma 7
Denote by $\|m\cdot\alpha\|$ the distance of $m\cdot\alpha$ to the nearest integer, and recall that one can assume that $\alpha$ varies in the torus. Let $\Phi(t)$ be a positive function defined in $\{1\le t<+\infty\}$, with $\Phi(t)$ and $t/\Phi(t)$ increasing.
(1) For every $m\in\mathbb{Z}^d-\{0\}$ and every positive integer $N$, If $\|m\cdot\alpha\|\ge 1/(2N)$ then, since $\Phi(t)$ and $t/\Phi(t)$ are increasing, Hence in all cases the desired estimate holds true.
(2) For every real number s and every non zero integer h the map t → ht + s is a measure preserving transformation of the one dimensional torus. In particular, for every function g (t) locally integrable and periodic with period 1,

This implies that for every
Since the distance to the nearest integer $\|t\|$ is a periodic function, it follows that

Proof of Theorem 1 We shall prove a slightly more general result. By Lemma 6, By the Hausdorff Young inequality and Lemma 7 (1) Hence, by Lemma 7 (2), Theorem 1 follows from the case $p=1$ and $q=+\infty$. Set Then, if $f(x)$ has an absolutely convergent Fourier expansion, Observe that the estimate in Lemma 7 (2)

When $1<p<+\infty$ this quasi norm is equivalent to a norm, but the case of interest in what follows is $0<p\le 1$. The following lemma describes the numerical sequences that sum functions in these spaces.

[11] for a sort of converse.
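The measure preserving property used in Lemma 7 (2) can be illustrated numerically in one variable. The sketch below, with hypothetical function names, approximates integrals over the torus with a midpoint rule and checks two consequences: the integral of a periodic function $g(ht+s)$ does not depend on the non zero integer $h$ and on $s$, and the set $\{t:\|ht+s\|<1/\lambda\}$ has measure $2/\lambda$ when $\lambda\ge 2$. The midpoint rule and the sample parameters are assumptions made for illustration only.

```python
import math

def dist_to_nearest_integer(t):
    """The quantity ||t||, the distance of t to the nearest integer."""
    return abs(t - round(t))

def midpoint_average(g, M=100_000):
    """Midpoint rule approximation of the integral of g over [0, 1)."""
    return sum(g((k + 0.5) / M) for k in range(M)) / M

def level_set_measure(h, s, lam, M=100_000):
    """Approximate Lebesgue measure of {t in [0, 1): ||h t + s|| < 1 / lam}."""
    def indicator(t):
        return 1.0 if dist_to_nearest_integer(h * t + s) < 1.0 / lam else 0.0
    return midpoint_average(indicator, M)

if __name__ == "__main__":
    g = lambda t: math.cos(2.0 * math.pi * t) ** 2
    print(midpoint_average(g), midpoint_average(lambda t: g(5 * t + 0.1)))
    print(level_set_measure(7, 0.3, 10.0))  # close to 2 / 10
```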

Lemma 9
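Let $d\mu(\alpha)$ be a Borel probability measure on the torus $\mathbb{T}^d$ with the property that there exist constants $c>0$ and $d-1<\delta\le d$ such that for every ball $B(y,r)$ with center $y$ and radius $r$ one has $\mu(B(y,r))\le cr^{\delta}$. Then, with $p=1+\delta-d$, for every $m$ in $\mathbb{Z}^d-\{0\}$ the function $|m|^{(p-1)/p}\|m\cdot\alpha\|^{-1}$ is in the space Weak-$L^p(\mathbb{T}^d,d\mu(\alpha))$, with quasi norm bounded independently of $m$.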
Proof Since $\mu(\mathbb{T}^d)=1$, for every $0<t\le 2$ one has the estimate These stripes have thickness $2(|m|t)^{-1}$, and each strip can be covered with at most $c(|m|t)^{d-1}$ balls with radius $c(|m|t)^{-1}$. Under the assumption that $\mu(B(y,r))\le cr^{\delta}$, it follows that Also observe that if $t>2$ the stripes $\{\alpha\in\mathbb{T}^d:|m\cdot\alpha-k|<t^{-1}\}$ are contained in the larger stripes These larger stripes have disjoint interiors and thickness $|m|^{-1}$. Since the unit cube has diameter $\sqrt{d}$, there are at most $\sqrt{d}\,|m|+2$ non empty stripes. Summing over these non empty stripes, one obtains

Observe that the proof of the above lemma gives an estimate that depends on the dimension $d$,
$$\left\|\,\|m\cdot\alpha\|^{-1}\right\|_{\text{Weak-}L^{p}(\mathbb{T}^d,d\mu(\alpha))}\le c\,d^{1/(2p)}\,|m|^{(1-p)/p}.$$
On the other hand, if $d\mu(\alpha)=d\alpha$ is the Lebesgue measure on the torus $\mathbb{T}^d$, then, as in Lemma 7, for every $0<p\le 1$ one obtains an estimate independent of the dimension,

Proof of Theorem 2 By
Since $|\sin(\pi m\cdot\alpha)|\ge 2\|m\cdot\alpha\|$, it follows that
$$\left|\dfrac{1}{N}\sum_{n=1}^{N}f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy\right|\le\dfrac{1}{2N}\sum_{m\in\mathbb{Z}^d-\{0\}}|\hat f(m)|\,\|m\cdot\alpha\|^{-1}.$$
In order to prove the theorem it suffices to show that the above series converges for almost every $\alpha$, with a possible exception of a set of $\delta$ dimensional Hausdorff measure zero. Assume that $A$ is a Borel set with positive $\delta$ dimensional Hausdorff measure. By Frostman's lemma there exists a probability measure $d\mu(\alpha)$ with $\mu(A)>0$ and with the property that there exists a constant $c>0$ such that for every ball $B(y,r)$ with center $y$ and radius $r$ one has $\mu(B(y,r))\le cr^{\delta}$. If $0<1+\delta-d=p\le 1$, then by Lemma 9 the functions $|m|^{(p-1)/p}\|m\cdot\alpha\|^{-1}$ are in the space Weak-$L^p(\mathbb{T}^d,d\mu(\alpha))$ with quasi norms bounded independently of $m$. By Lemma 8, under the assumption that the sequence $\{|m|^{(1-p)/p}\hat f(m)\}_{m\in\mathbb{Z}^d}$ is in $L\log L(\mathbb{Z}^d)$ when $p=1$, or in $L^p(\mathbb{Z}^d)$ when $0<p<1$, the sum of these functions is in Weak-$L^p(\mathbb{T}^d,d\mu(\alpha))$ and it is finite $d\mu(\alpha)$ almost everywhere. Hence the set $A$ of positive $\delta$ dimensional Hausdorff measure cannot be contained in the exceptional set where the theorem fails.
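For a trigonometric polynomial the series in the above proof is a finite sum, and the resulting bounded remainder can be observed directly. The sketch below is only an illustration under stated assumptions: the test function $f(x)=\sum_m a_m\cos(2\pi m\cdot x)$ in dimension $d=2$, its coefficients, and the vector $\alpha=(\sqrt2,\sqrt3)$ reduced modulo 1 are hypothetical choices. It checks that $N$ times the error of the Weyl sum stays below $\sum_m|\hat f(m)|/(2\|m\cdot\alpha\|)$, which follows from the geometric sum in Lemma 6 and from $|\sin(\pi t)|\ge 2\|t\|$.

```python
import math

def dist_to_nearest_integer(t):
    """The quantity ||t||, the distance of t to the nearest integer."""
    return abs(t - round(t))

# Hypothetical test function f(x) = sum_m a_m cos(2 pi m.x), with m in Z^2 - {0}.
# Each cosine contributes the Fourier coefficients a_m / 2 at the frequencies +m and -m.
COEFFS = {(1, 0): 1.0, (1, 1): 0.5, (2, -3): 0.25}
ALPHA = (math.sqrt(2.0) % 1.0, math.sqrt(3.0) % 1.0)

def f(x):
    return sum(a * math.cos(2.0 * math.pi * (m[0] * x[0] + m[1] * x[1]))
               for m, a in COEFFS.items())

def weyl_error(x, N):
    """|(1/N) sum_{n=1}^{N} f(x + n alpha) - integral of f|; here the integral is 0."""
    total = sum(f((x[0] + n * ALPHA[0], x[1] + n * ALPHA[1])) for n in range(1, N + 1))
    return abs(total / N)

def remainder_bound():
    """sum over all frequencies of |f^(m)| / (2 ||m.alpha||)."""
    return sum(a / (2.0 * dist_to_nearest_integer(m[0] * ALPHA[0] + m[1] * ALPHA[1]))
               for m, a in COEFFS.items())

if __name__ == "__main__":
    x = (0.2, 0.7)
    for N in (1, 10, 100, 1000):
        print(N, N * weyl_error(x, N), remainder_bound())
```

The scaled error $N\cdot|\text{error}|$ oscillates but does not grow with $N$, which is the bounded remainder behaviour described in Theorem 2.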
The proof of Theorem 3 is based on a couple of lemmas, both already known.

Lemma 10 If f (x) is in L 2 T d and if α is a point in the torus, the following are equivalent:
Proof This is proved in [18] and [16] when the function $f(x)$ is a characteristic function of an interval in the one dimensional torus, however the same proof works for square integrable functions on the $d$ dimensional torus. To be convinced and for completeness let us repeat the proof. By subtracting $\hat f(0)$ from $f(x)$ one can assume that $f(x)$ has mean zero. Denote by $F_N(x)$, with $N=0,1,2,3,\ldots$, the functions Also denote by $K$ the closure in $L^2(\mathbb{T}^d)$ of the convex hull of the sequence is bounded in $L^2(\mathbb{T}^d)$, the set $K$ is convex and weakly compact. Define the operator . This operator is continuous in $L^2(\mathbb{T}^d)$ and it maps $K$ into $K$. Then, by the theorem of Schauder Tychonoff, it has a fixed point, The Fourier transform of this last equation gives .
Hence (1) implies (2). In order to prove the converse, given $\alpha$ and $f(x)$, as before

In the following lemma, which is a sort of converse of Lemma 8, the space dimension is $d=1$.

Proof of Theorem 3 Assume that $f(x)$ is in $L^2(\mathbb{T})$ and that for a given $\alpha$ and every $N$ one has
Then, by Lemma 10, for the same $\alpha$ one has

If the above inequality holds true for all α in a set
By the assumptions of the theorem It follows that the operator norm of $S_{\alpha,N}$ from $X(\mathbb{T}^d)$ into $Y(\mathbb{T}^d)$ is at least as large as The last inequality follows from the fact that for every $\alpha$, with rational or irrational entries, one can choose $m$ in $\mathbb{Z}^d-\{0\}$ with $m\cdot\alpha$ arbitrarily close to an integer, so that $|\sin(\pi N m\cdot\alpha)/(N\sin(\pi m\cdot\alpha))|$ gets arbitrarily close to 1.
diverges to $+\infty$ the family of operators $\{\Phi(N)S_{\alpha,N}\}_{N=1}^{+\infty}$ is not uniformly bounded from $X(\mathbb{T}^d)$ into $Y(\mathbb{T}^d)$, and the principle of condensation of singularities, the Banach Steinhaus theorem, guarantees the existence of a set of functions $F(\alpha)$ of second category in $X(\mathbb{T}^d)$ with the property that for every Then, by the triangle inequality, for the next $N+1$ one has Observe that, under the assumption that the norm in $X(\mathbb{T}^d)$ is invariant under translation, .
Then apply Lemma 12

Remark 13
The motivation for this work comes from an attempt to extend the Koksma Hlawka inequality and improve, if possible, the speed of convergence of Weyl sums. In the literature there are several results on irrational rotations on the torus and functions with bounded remainders. We quote only a few. In [14] Krengel and in [12] Kakutani and Petersen proved that no general statements can be made about the rate of convergence in ergodic theorems. In particular, if T is an ergodic measure preserving transformation of the interval {0 ≤ x ≤ 1} and if {ε n } +∞ Moreover, for every 1 ≤ p ≤ +∞, Theorem 4 gives an easy proof of these results, in the particular case of a translation T (x) = x + α.
In [1] Bayart, Buczolich and Heurteaux proved that the expected speed of convergence of Weyl sums of continuous, or more generally square integrable functions, is slightly less than $N^{-1/2}$. More precisely, they proved that if $f(x)$ is square integrable and if $\nu<1/2$ then for almost every (α, They also proved that if $\nu=1/2$ then there exist continuous functions $f(x)$ such that for almost every (α,

Confirming a conjecture of Erdős and Szüsz, Kesten in [13], and with a different proof Petersen in [18], proved that if $f(x)$ is the characteristic function of an interval $\{a\le x\le b\}$, then the remainder is bounded if and only if $b-a=n\alpha-k$ for some integers $n$ and $k$. This result for one dimensional intervals has been generalized to Riemann measurable sets in several dimensions in [8] by Grepstad and Lev. Hence for a characteristic function a bounded remainder $c(f,\alpha)$ is the exception, not the rule.
In [9] and [10] Hellekalek and Larcher proved that if $f(x)$ is a continuously differentiable function on $\{0\le x\le 1\}$ with $df(x)/dx$ Lipschitz continuous and with $f(0)\ne f(1)$, hence $f(x)$ has a jump discontinuity as a function on the torus $\mathbb{T}$, then for every $\alpha$ and every $x$ one has lim sup They also proved that if $f(0)=f(1)$, hence $f(x)$ is continuous as a function on the torus $\mathbb{T}$ and the derivative may have at most a jump discontinuity, then for almost every $\alpha$ with respect to Lebesgue measure one has Observe that a discontinuous function cannot have an absolutely convergent Fourier expansion. On the other hand, if $f(x)$ is continuous in the torus and $df(x)/dx$ is Hölder continuous with exponent $\varepsilon>0$, then $|\hat f(m)|\le c|m|^{-1-\varepsilon}$. Hence Theorem 2 applies with $2/(2+\varepsilon)<p\le 1$.
In [5] Dick and Pillichshammer proved that if $\{\alpha_n\}_{n=0}^{+\infty}$ is a van der Corput sequence on the interval $\{0\le x\le 1\}$, and if the Fourier coefficients of the function $f(x)$ have decay $|\hat f(m)|\le c|m|^{-1-\varepsilon}$ for some $\varepsilon>0$, then Observe that if $|\hat f(m)|\le c|m|^{-1-\varepsilon}$, then $f(x)$ is in the Sobolev space $W^{\nu}(\mathbb{T})$ of functions with square integrable fractional order derivatives up to the order $\nu<1/2+\varepsilon$. Hence the assumption $|\hat f(m)|\le c|m|^{-1-\varepsilon}$ is a sort of smoothness assumption on the function $f(x)$. However, it should be remarked that the assumptions in the above quoted papers are quite different from the ones in this paper.
The assumption that $|\hat f(m)|\le c|m|^{-1-\varepsilon}$ is not only on the size of the Fourier transform, but also on the location of the masses. On the contrary, the assumption that the Fourier transform is summable, or that it is in $L\log L(\mathbb{Z})$, is invariant under rearrangement of the Fourier transform. Finally, the assumption that the Fourier transform is summable guarantees that the function is bounded, and with unbounded functions the conclusions of Theorem 1 and Theorem 2 fail.

Set In [16] Liardet, and again in [19] Schoissengeier, proved that if $c(f,\alpha,x)$ is finite at a particular point $x_0$, $c(f,\alpha,x_0)<+\infty$, then $c(f,\alpha,x)$ is finite at every other point $x$, and $c(f,\alpha,x)\le 2c(f,\alpha,x_0)$. Indeed the proof holds true also in several variables. In [19] there is also a study of properties of the set $B(f)=\{\alpha\in\mathbb{T},\ c(f,\alpha,x)<+\infty\}$. In particular, it is proved that if the complement of $B(f)$ in the torus has a cardinality smaller than the continuum, then $f(x)$ is a trigonometric polynomial. The results, stated in one variable, easily extend to several variables. Hence, if $f(x)$ is not a trigonometric polynomial, in Theorem 2 the exceptional set has measure zero but the cardinality of the continuum.
In [17] Owen and Pan proved that there exist reproducing kernel Hilbert spaces of functions on $\mathbb{T}^d$ with bounded Hardy Krause variation with the property that for every sequence $\{\alpha_n\}_{n=1}^{+\infty}$ in $\mathbb{T}^d$ there exist functions in these spaces with Hence in the Koksma Hlawka inequality some powers of logarithms are indeed necessary.

Remark 14
As already mentioned, Theorems 1, 2, and the classical Koksma Hlawka inequality apply to quite different classes of functions. The Fourier coefficients of functions with bounded variation on the one dimensional torus have decay $|\hat f(m)|\le c|m|^{-1}$, but functions with bounded variation may be discontinuous, and discontinuous functions cannot have absolutely convergent Fourier expansions. On the other hand, by a theorem of Zygmund, a function with bounded variation and Hölder continuous of any positive exponent has an absolutely convergent Fourier expansion. By a theorem of Bernstein, a function Hölder continuous with exponent $\delta>1/2$ has an absolutely convergent Fourier expansion, and a look at the proof shows that the Fourier transform is in $L\log L(\mathbb{Z})$, and also that |m| [22, Chapter VI.3]. On the other hand, there are functions with Fourier transforms in $L\log L(\mathbb{Z})$ which do not have bounded variation or do not have the smoothness required by the theorem of Bernstein. An example is the Weierstrass function $f(x)=\sum_{n=0}^{+\infty}a^{n}\cos(2\pi b^{n}x)$, with $0<a<1<b$ and $ab>1$. Observe that if $|m|=b^{n}$ then $\hat f(m)=2^{-1}|m|^{\log(a)/\log(b)}$, and that $-1<\log(a)/\log(b)<0$. Also observe that this function is Hölder continuous with exponent $-\log(a)/\log(b)$ and it is roughly self affine, $f(x)=\cos(2\pi x)+af(bx)$, and this suggests that its graph has fractal dimension $2+\log(a)/\log(b)$.
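The identities for the Weierstrass function in this remark are easy to check numerically. The following sketch uses the hypothetical parameters $a=1/2$ and $b=3$, which satisfy $0<a<1<b$ and $ab>1$, and works with partial sums up to a hypothetical truncation index $K$. It verifies that $a^{n}=|m|^{\log(a)/\log(b)}$ when $|m|=b^{n}$, and that the partial sums satisfy the exact relation $f_K(x)=\cos(2\pi x)+af_{K-1}(bx)$, the finite version of the self affinity.

```python
import math

# Hypothetical parameters with 0 < a < 1 < b and a * b > 1.
A, B = 0.5, 3

def weierstrass_partial(x, K):
    """Partial sum sum_{n=0}^{K} a^n cos(2 pi b^n x) of the Weierstrass function."""
    return sum(A ** n * math.cos(2.0 * math.pi * B ** n * x) for n in range(K + 1))

def coefficient_from_decay(n):
    """Fourier coefficient at |m| = b^n written as |m|^(log a / log b) / 2."""
    m = B ** n
    return 0.5 * m ** (math.log(A) / math.log(B))

def holder_exponent():
    """The exponent -log(a) / log(b), which lies strictly between 0 and 1."""
    return -math.log(A) / math.log(B)

if __name__ == "__main__":
    print(holder_exponent(), 2.0 - holder_exponent())
    for n in range(5):
        print(n, 0.5 * A ** n, coefficient_from_decay(n))
```

Since the non zero coefficients are $a^{n}/2$ at the frequencies $\pm b^{n}$, their sum is the geometric series $\sum_n a^{n}=1/(1-a)$, so the Fourier expansion is absolutely convergent for every admissible choice of $a$ and $b$.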

Remark 15
The assumption $\|\exp(2\pi i m\cdot x)\|_{Y(\mathbb{T}^d)}\ge K\|\exp(2\pi i m\cdot x)\|_{X(\mathbb{T}^d)}$ in Theorem 5 is necessary. Indeed it is possible to find Banach function spaces $X(\mathbb{T}^d)$ contained in the space of continuous functions $Y(\mathbb{T}^d)=C(\mathbb{T}^d)$ such that the assumption on the norms of the exponentials is not satisfied, and such that under suitable Diophantine assumptions on the vector $\alpha$ for any function $f(x)$ in $X(\mathbb{T}^d)$ and every $N\ge 1$ one has Examples are the spaces $X(\mathbb{T}^d)=C^{d,\delta}(\mathbb{T}^d)$ of functions with derivatives up to the order $d$ Hölder continuous of exponent $\delta>0$. We hope to return to this matter in another paper.

Remark 16
The decay of the Fourier transform can be related to the smoothness of the function. In particular, functions in Sobolev spaces $W^{\nu}(\mathbb{T}^d)$ with square integrable fractional order derivatives up to the order $\nu>d/2$ have absolutely convergent Fourier expansions. Quadrature rules for functions with different degrees of smoothness have been extensively studied. See e.g. [4] for quadrature rules on compact manifolds, the torus included. In particular, it is known that for every $\nu>d/2$ and every $N>0$ there are distributions of points $\{\alpha_n\}_{n=1}^{N}$ in $\mathbb{T}^d$ and weights $\{c_n\}_{n=1}^{N}$ with the property that for every function in the Sobolev space This order of approximation in the Sobolev norm is optimal. The Kronecker sequences $\{n\alpha\}_{n=1}^{+\infty}$ in Theorem 1 are quite explicit distributions of points on the torus, but do not give such optimal order of approximation. Observe that these estimates are not in contradiction with Theorem 5, since these distributions of points $\{\alpha_n\}_{n=1}^{N}$ are finite, while Theorem 5 applies only to infinite sequences.

Remark 17
The paper of Beck [3] has developed a new chapter of uniform distribution, the strong uniformity, with the discrete samplings $\{n\alpha\}_{n=1}^{+\infty}$ replaced by continuous rotations $\{t\alpha\}_{0<t<+\infty}$, and in [7] Grepstad and Larcher studied bounded remainder sets for these continuous rotations. Indeed, the above theorems have an analog with Weyl sums replaced by integrals and the discrete parameter $N$ replaced by a continuous parameter $T$,
$$\dfrac{1}{T}\int_{0}^{T}f(x+t\alpha)\,dt-\int_{\mathbb{T}^d}f(y)\,dy\sim\sum_{m\in\mathbb{Z}^d-\{0\}}\exp(\pi i T m\cdot\alpha)\,\dfrac{\sin(\pi T m\cdot\alpha)}{\pi T m\cdot\alpha}\,\hat f(m)\exp(2\pi i m\cdot x).$$
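The kernel in the continuous analog comes from the elementary integral $\frac{1}{T}\int_0^T\exp(2\pi i t\theta)\,dt=\exp(\pi i T\theta)\sin(\pi T\theta)/(\pi T\theta)$, with $\theta$ in the role of $m\cdot\alpha$. The sketch below, with hypothetical function names and a plain midpoint rule chosen only for illustration, compares a numerical approximation of the integral with this closed form, and checks that the modulus is at most $\min\{1,(\pi T|\theta|)^{-1}\}$.

```python
import cmath
import math

def continuous_average(theta, T, steps=20_000):
    """Midpoint rule approximation of (1/T) * integral_0^T exp(2 pi i t theta) dt."""
    h = T / steps
    total = sum(cmath.exp(2j * math.pi * (k + 0.5) * h * theta) for k in range(steps))
    return total * h / T

def closed_form(theta, T):
    """exp(pi i T theta) * sin(pi T theta) / (pi T theta); theta must be non zero."""
    u = math.pi * T * theta
    return cmath.exp(1j * u) * math.sin(u) / u

if __name__ == "__main__":
    for theta, T in ((0.37, 50.0), (-1.3, 7.5)):
        print(theta, T, continuous_average(theta, T), closed_form(theta, T))
```

In contrast with the discrete kernel $\sin(\pi N\theta)/(N\sin(\pi\theta))$, the denominator here vanishes only at $\theta=0$ and not at every integer, so the size of $|m\cdot\alpha|$ itself, rather than the distance $\|m\cdot\alpha\|$ to the nearest integer, controls the continuous averages.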