Lattice-based integration algorithms: Kronecker sequences and rank-1 lattices

We prove upper bounds on the order of convergence of lattice-based algorithms for numerical integration in function spaces of dominating mixed smoothness on the unit cube with homogeneous boundary condition. More precisely, we study worst-case integration errors for Besov spaces of dominating mixed smoothness B˚p,θs\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathring{\mathbf {B}}^s_{p,\theta }$$\end{document}, which also comprise the concept of Sobolev spaces of dominating mixed smoothness H˚ps\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathring{\mathbf {H}}^s_{p}$$\end{document} as special cases. The considered algorithms are quasi-Monte Carlo rules with underlying nodes from TNZd∩[0,1)d\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_N\left( \mathbb {Z}^d\right) \cap [0,1)^d$$\end{document}, where TN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_N$$\end{document} is a real invertible generator matrix of size d. For such rules, the worst-case error can be bounded in terms of the Zaremba index of the lattice XN=TNZd\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbb {X}_N=T_N\left( \mathbb {Z}^d\right) $$\end{document}. We apply this result to Kronecker lattices and to rank-1 lattice point sets, which both lead to optimal error bounds up to logN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\log N$$\end{document}-factors for arbitrary smoothness s. The advantage of Kronecker lattices and classical lattice point sets is that the run-time of algorithms generating these point sets is very short.


Introduction
In this paper, we study numerical integration of smooth functions on the d-dimensional unit cube which satisfy homogeneous boundary conditions. More precisely, we consider Besov spaces of dominating mixed smoothnessB s p,θ which also comprise the concept of Sobolev spaces of dominating mixed smoothnessH s p as special cases. The exact definition of these spaces requires some preparation and will be given in Sect. 2.2. For the moment we just note that the parameter s denotes the underlying smoothness of the functions and that for f ∈B s p,θ we have supp( f ) ⊂ [0, 1] d . We study numerical integration where · B s p,θ denotes the norm inB s p,θ . As integration rules we use what we call general lattice rules which are of the form with an invertible matrix T ∈ R d×d . Note that is a d-dimensional lattice in R d , i.e., a discrete additive subgroup of R d not contained in a proper linear subspace of R d . A specific example of such a rule is Frolov's cubature formula introduced in [1] (see also [2] and the survey paper by Temlyakov [3,Section 4.3]). Here a suitable generating matrix T for the underlying lattice X is found as follows: Define the polynomial p d ∈ R[x], p d (x) = −1 + d j=1 (x − 2 j + 1). Then p d has only integer coefficients, it is irreducible over the rationals and has d different real roots, say ξ 1 , . . . , ξ d ∈ R. With these roots define the d × d Vandermonde matrix B by Then the generator of the lattice X used in Frolov's cubature formula is where a > 1 is a shrinking factor. It was shown by Dubinin [4], see also [5] and [6], that Frolov's cubature formula achieves the optimal convergence rate for the worst-case error inB s p,θ , which is where N is the number of elements of T Z d that belong to the unit cube [0, 1] d . The lower bound for these spaces was proven in [2]. See also [7] for techniques to transfer such results to spaces without (or with periodic) boundary conditions. For the Sobolev(-Hilbert) spaces H s 2 =B s 2,2 the result reads The problem with Frolov's cubature formula is that one needs to determine which points from the shrunk lattice belong to the unit cube [0, 1] d . This is in general a very difficult task, especially when the dimension d is large. 1 It is the aim of this paper to find other general lattice rules whose lattice points are faster to generate on a computer also for large dimensions d and which also achieve optimal convergence rate for the worst-case error with respect to the smoothness s, up to log N -terms.
We will prove a very general estimate for the worst-case error in terms of the Zaremba index of the lattice X = T Z d , see Theorem 14 in Sect. 3. Then we apply this error estimate to two examples of general lattice rules, to so-called Kronecker lattices (Sect. 4.1) and to rank-1 lattice point sets (Sect. 4.2). In both cases we achieve a convergence rate of order O N −s for the worst-case error up to log N -terms. The major advantage of these constructions is that it is automatically clear which points belong to the unit cube. Hence, the proposed integration algorithms (which are in fact QMC rules) can be implemented very efficiently.

Basic notation
Throughout the paper d ∈ N denotes the dimension. By log we denote the dyadic logarithm. For x ∈ R we denote by {x} the fractional part of x and by x the distance of x to the nearest integer, i.e., x := min m∈Z |x − m|.
where ·, · denotes the usual inner product in R d . Before we formulate the main results of this work in more detail we need some preparation, which is presented in Sect. 2.

Basics of Fourier analysis
with the usual modification if p = ∞. We will also need L p -spaces on compact domains ⊂ R d instead of the entire R d . We write f L p ( ) for the corresponding (restricted) L p -norm.
For f ∈ L 1 R d we define the Fourier transform and the corresponding inverse Fourier transform F −1 f (ξ ) = F f (−ξ ). Additionally, we define the spaces of continuous functions C R d , infinitely differentiable functions C ∞ R d and infinitely differentiable functions with compact support C ∞ 0 R d as well as the Schwartz space S = S R d of all rapidly decaying infinitely differentiable functions on R d , i.e., The space S R d , the topological dual of S R d , is also referred to as the set of tempered distributions on R d . Indeed, a linear mapping f : S R d → C belongs to S R d if and only if there exist vectors k, ∈ N d 0 and a constant c = c f such that for all ϕ ∈ S R d . The space S R d is equipped with the weak * -topology. The convolution ϕ * ψ of two square-integrable functions ϕ, ψ is defined via the integral If ϕ, ψ ∈ S R d , then ϕ * ψ still belongs to S R d . In fact, the convolution operator can be extended to S R d × L 1 , in which case we have ϕ * ψ ∈ S R d , and ). It makes sense point-wise and is a C ∞ -function in R d . As usual, the Fourier transform can be extended to S R d by

Function spaces
In this section we introduce the function spaces under consideration, namely the Besov and Sobolev spaces of dominating mixed smoothness. There are several equivalent characterizations of these spaces, see [9]. For our purposes, the most suitable is the characterization by local means (see [9,Theorem 1.23] or [6,Section 4]).
Let us continue with the definition of the Besov spaces B s p,θ = B s p,θ R d defined on the entire R d .
with the usual modification for θ = ∞.

Remark 3 Other important function spaces are Sobolev spaces of dominating mixed smoothness, which are denoted by H
In the case s ∈ N these spaces can be normed by Note that H s 2 = B s 2,2 , i.e., the Sobolev(-Hilbert) spaces H s 2 appear as the special case p = θ = 2 in the Besov space scale B s p,θ , see e.g., [11,Chapt. 2]. Moreover, the spaces B s p, p with 1 ≤ p < ∞ and s / ∈ N are called Sobolev-Slobodeckij spaces.
Remark 4 Different choices of 0 , 1 in Definition 2 lead to equivalent (quasi-)norms. In fact, it is not even necessary that 0 and 1 have compact support. However, for the proof of our results, this specific choice is crucial.
In this article we are interested in classes of functions which are supported in the unit cube The next lemma collects some frequently used embeddings between Besov and Sobolev spaces. They will be useful in obtaining our results for Sobolev spaces directly from the results for Besov spaces.
Lemma 5 Let p ∈ (0, ∞], and let s ∈ R. (i) We have the chain of embeddings Proof For a proof we refer to [11,Chapt. 2]. Note that for (ii) the compact support of the functions is necessary to ensure the corresponding embeddings of the L p spaces.
In the sequel we will always assume that s >

Useful lemmas
Here we collect some lemmas that will be essential for our analysis. For proofs and further literature for these results, we refer to [6, Sections 3.2 and 3.3].
The first lemma is a variant of Poisson's summation formula, see [12,Thm. VII.2.4]. Although this equality looks more technical than the original summation formula, this variant is exactly in the form we need and it comes with less assumptions on the involved functions.

Lemma 6 (Poisson summation formula) Let f ∈ L 1 R d be continuous with compact support, T ∈ R d×d be an invertible matrix, and B
In particular, the limit on the right-hand side exists.
The lemma above holds for quite general choices of ϕ 0 . However, we choose a certain ϕ 0 (and hence ϕ m ) that is related to the definition of our spaces, i.e., the functions 0 , 1 from Sect. 2.2 are involved. This is because, given 0 and 1 , we can construct a suitable decomposition of unity, i.e., a set of functions Proof Following [9, Thm. 1.20], see also [6, Lemma 3.6], we define some function ϕ ∈ Then (i) and (ii) follow from the support of F 0 = ϕ and F 1 , (iii) comes from ϕ 0 (0) = F 0 (0) = ϕ(0) = 1, and (iv) is easily shown by using the definition of j , j ≥ 1.
We define the d-fold tensorized functions where j , j , j ∈ N 0 , are defined in Lemma 7. We obtain from Lemma 7 that we can use the functions in Lemma 6. Moreover, by the construction of the tensorized functions and Lemma 7, we know that the support of F m , m ∈ N d 0 , is of the form For this, recall that we choose 0 , 1 such that ε = 1, see Remark 1.
As one can see in the lemmas above, we are concerned with sums of certain function evaluations of Fourier transforms. Finally, we want to bound such sums by the norm of the involved functions, i.e., we have to control the dependence on the matrix B, see Lemma 6. For this we need the following two lemmas, see [6, Lemmas 3.3 and 3.5]. Lemma 10 Let B ∈ R d×d be an invertible matrix and ∈ R d be a bounded set. Furthermore, let g ∈ S R d with supp(F g) ⊂ . Then, for 1 ≤ p ≤ ∞, we have

General error bound
In this section, we provide a general upper bound on the error of general lattice rules for the integration in the spacesB s p,θ of functions with support inside the unit cube. Here we mean by general lattice rules all algorithms of the form with an invertible matrix T ∈ R d×d . Clearly, due to supp( f ) ⊂ [0, 1] d , we could also sum over all x ∈ T Z d without changing the value of Q T ( f ).
For an invertible matrix T the dual lattice of the lattice X = T Z d is defined as X * = B Z d := T − Z d .

Remark 11
In the literature usually only those algorithms of the form (8) are called lattice rule that satisfy T Z d ⊃ Z d , see e.g., [13,14]. Such rules are a priori also suited for the integration of periodic functions, in contrast to the general algorithms of the form (8). We decided to use this nomenclature with the prefix "general" since it seems to be adequate for cubature rules that use function evaluations on a lattice.
We prove the following theorem.

Theorem 12
Let T ∈ R d×d be invertible, X = T Z d , X * its dual lattice, I m from (7) and Q T as in (8). Then, for 1 ≤ p, θ ≤ ∞ and s > 1/ p, we have Proof From Lemma 6 with ϕ m from (6) we have where B = T − . Actually, the outer sum is defined as a certain limit; however, we will see that this sum converges absolutely, which justifies this notation.
By Hölder's inequality as well as Lemmas 8 and 10 we have This shows that, in order to prove bounds on the worst-case error, it is enough to study the numbers |X * ∩ I m \{0}| for m ∈ N d 0 . Moreover, as the next lemma shows, this quantity can be bounded by the Zaremba index of the lattice X, which is, loosely speaking, the largest ∈ N such that |X * ∩ I m \{0}| = 0 for all m with |m| 1 ≤ . More precisely, we define the Zaremba index of the lattice X by where, for z = (z 1 , . . . , z d ), r (z) := d j=1 z j with z := max{1, |z|}. The connection of the Zaremba index to numerical integration is well known, see e.g., [13][14][15][16][17] and the references therein. However, these classical results are for Korobov spaces or star discrepancy. In this paper we are concerned with Besov spaces of dominating mixed smoothness. We repeat the (relatively short) proofs here for convenience of the reader.

Lemma 13
Let T ∈ R d×d be invertible, X = T Z d , X * its dual lattice and I m from (7). Then, for all m ∈ N d 0 , we have where log denotes the dyadic logarithm.
Proof Let M := log ρ(X) . For x ∈ I m , we have r (x) ≤ 2 |m| 1 . This shows that, for |m| 1 < M, there is no x ∈ X * in I m , except possibly the origin. This proves the first bound.
By the same argument it is clear, that also boxes of the form {x ∈ R d : |x i | ≤ 2 i , i = 1, . . . , d} with | | 1 < M do not contain points from X * \{0}. If we halve all sides of this set, i.e., if we consider the sets it is easy to see that all translates of J , | | 1 < M, contain at most one x ∈ X * . Now consider |m| 1 ≥ M. Then there exists an ∈ N d . . , d} can be covered by 2 |m− | 1 +d = 2 |m| 1 − M +d+1 ≤ 2 |m| 1 +d+1 /ρ(X) translates of J , each containing at most one x ∈ X * . This proves the second bound.
Together with Theorem 12 this implies the following.

Application to specific point sets
In this section we apply the general results from the last section to specific point sets. More precisely, we study Kronecker lattices and rank-1 lattice point sets in dimensions d ≥ 2. By Theorem 14 it is enough to bound the Zaremba index of these lattices. However, since the bounds on the Zaremba index are worse in higher dimensions, we treat the lower-dimensional cases separately.

Kronecker lattices
We study Kronecker lattices which are point sets of the form where α = (α 1 , . . . , α d−1 ) ∈ R d−1 . These point sets can be written as where X N (α) = T N (Z d ) and Note that det(T N ) = (−1) d+1 /N and hence d T N = N . Hence, we can use the results from the last section to prove upper bounds on the error of the cubature rule Q T N . Recall that the dual lattice of  Since all lattices that follow will be of this form we use the notation Moreover, we write In the following we fix a vector α ∈ R d−1 and just let N grow in (10). Given a "good" α, this makes the point set P N (α) particularly easy to implement, since one needs only N d arithmetic operations.
It is not surprising, and well known, that bounds on the Zaremba index of lattices X N (α) depend on Diophantine approximation properties of the vector α. More precisely, a lattice has large Zaremba index if the involved numbers α 1 , . . . , α d−1 are badly approximable (in a certain sense). This is reflected by the following lemma.
First assume that k 1 = · · · = k d−1 = 0 and hence k d = 0. This already implies r (z) ≥ N . Next assume that |k j | ≥ N for some j = 1, . . . , d − 1. Then it follows from the definition that r (z) ≥ k j ≥ N .
We now treat the cases d = 2 and d ≥ 3 separately, since the known results on the existence of (simultaneously) badly approximable numbers differ in these cases.

The case d = 2
We say that a real number α is badly approximable, if there is a positive constant c 0 = c 0 (α) > 0 such that k kα ≥ c 0 > 0 for all integers k ≥ 1.
It is well known that an irrational number α is badly approximable if and only if the sequence a 1 , a 2 , a 3 , . . . of partial quotients in the continued fraction expansion of α = [ a 0 ; a 1 , a 2 , a 3 , . . .] is bounded, i.e., there is some M = M(α) > 0 such that a j ≤ M for all integers j ≥ 1, see e.g., [18]. For example for the golden ratio α = (1 + √ 5)/2 we have that the continued fraction coefficients are all 1 and hence of course are also bounded.
It is easily seen from the definition of badly approximable numbers that we can apply Lemma 16 with d = 2 and ψ ≡ 1, which proves that, for the constant c 0 = min(1, c 0 ), ρ 2 (N , α) ≥ c 0 N for all N ∈ N. This implies the following theorem.

Theorem 17
Let α be a badly approximable number, N ∈ N and P N (α) as in (10). Then, for 1 ≤ p, θ ≤ ∞ and s > 1/ p, Proof The upper bound follows from Theorem 14 and the lower bound was proven, e.g., in [6,Theorem 7.3].

The case d ≥ 3
Unfortunately, in dimensions greater than two the results are not as satisfactory as for d = 2.
That is, we do not know if vectors α exist that give the optimal order of convergence of the corresponding (general) lattice rule. Assume for the moment that we have a vector α ∈ R d−1 for d ≥ 3 such that In this case we could show that wce(P N (α),B s p,θ ) (log N ) (d−1)(1−1/θ ) /N s , which is the optimal order of convergence. However, a famous conjecture of Littlewood states that there is no α ∈ R d−1 , d ≥ 3, with this property, see e.g., [19]. See also [20] for a discussion of this Diophantine problem in the context of the discrepancy of (nα)-sequences.
The best we can hope for at the moment for our problem are metrical results. These are based on the following lemma. Proof From [21, Lemma 5] we obtain that, under the assumptions of the lemma, we can apply Lemma 16 for almost every α ∈ R d−1 .
This implies the following result.

Theorem 19
Let ψ : N → R + be non-decreasing such that n≥1 1 nψ(n) < ∞. Then, for almost all α ∈ R d−1 and every N ≥ 1 we have For example, for δ > 0 for almost all α ∈ R d−1 we have Remark 20 In dimension d = 3 the metrical result can be slightly improved. It follows from results in [19] that there exist (α 1 , α 2 ) ∈ R 2 such that the assumption of Lemma 16 holds with ψ(N ) = (log N ) log log N . Hence, for d = 3, the second statement of Theorem 19 holds with δ = 0.
However, if we want a result for concrete α ∈ R d−1 for d ≥ 3, the situation is even worse. Recall that for a real number η, a (d − 1)-tuple α ∈ (R\Q) d−1 is said to be of approximation type η, if η is the infimum of all numbers σ for which there exists a positive constant c = c(σ, α) such that It is well known that the type η of an irrational vector α is at least one. On the other hand it has been shown by Schmidt [22]

Rank-1 lattice point sets
A rank-1 lattice point set is given by the points { n N g} for n = 0, 1, . . . , N − 1, where g = (g 0 , g 1 , . . . , g d−1 ) is a lattice point in Z d and where the fractional part is applied componentwise. We restrict ourselves to the case where g 0 = 1 (if N is a prime number, this still covers all possible rank-1 lattice point sets). Then we can write a rank-1 lattice point set as In view of Sect. 4.1 we see that we replace the possibly irrational point α by the rational point g/N . So, for consistent notation, we should have used, e.g., the denotation P N (g/N ) for the point set. However, we use P N (g) etc. for simplicity. For the same reasoning we let ρ d (N , g) := ρ (X N (g/N )). The Zaremba index of rank-1 lattice point sets is a well studied quantity, and we can use the known results in order to apply them in Theorem 14. Since the statement of Remark 15 holds also in this case, we can ignore the hidden constants in Theorem 14. Again we treat the cases d = 2 and d ≥ 3 separately.

Remark 22
For d = 2 the construction is again based on the boundedness of the partial quotients of g 1 /N . So, given a badly approximable number α, see Sect. 4.1.1, one can use its convergents p k /q k , k = 1, 2, . . ., to construct the (optimal) sequence of lattices P q k (1, p k ) with respect to the worst-case error. To find an analogous construction in higher dimensions is a challenging open problem.

The case d = 2
Let g ∈ {1, . . . , N − 1} with gcd(g, N ) = 1. Let a 1 , a 2 , . . . , a l be the partial quotients in the continued fraction expansion of g/N and let K ( g N ) = max 1≤ j≤l a j . Then it was shown by Zaremba [23] (see also [13,Theorem 5.17]) that the Zaremba index ρ 2 (N , g) for g = (1, g) can be bounded in terms of K (g/N ), more precisely, that .

Remark 24
In particular the result holds for Fibonacci rules, where N = F n and g = F n−1 , the nth and (n − 1)st Fibonacci numbers, respectively. In this case the continued fraction coefficients of g/N = F n−1 /F n are all exactly 1. Fibonacci rules were also used by Temlyakov, see [24] and [3, Section 4.1].

Remark 25
Note that a famous conjecture of Zaremba [25, p. 76] states that for every integer N ≥ 2 one can find a g ∈ {1, . . . , N } with gcd(g, N ) = 1 such that the continued fraction coefficients of g/N are bounded by some constant K (in fact, he conjectured that K = 5).
Niederreiter [26] established this conjecture for all N of the form 2 m , 3 m or 5 m for m ∈ N. Bourgain and Kontorovich [27] proved Zaremba's conjecture for almost all N with a constant K = 50. Huang [28] improved this result to show Zaremba's conjecture for almost all N with constant K = 5.

The case d ≥ 3
It follows from a result of Zaremba [29] that for every d ≥ 2 and N ≥ 2 there exists a lattice point g ∈ Z d such that In fact, one can choose C d = (d − 1)!/2 d−1 (see also [13,Theorem 5.12]). From this result together with Theorem 14 we obtain the following theorem. However, it remains an open question how to construct, for given d and N , lattice points g ∈ Z d−1 which achieve the lower bound (16). Without loss of generality one can restrict oneself to the search space {g ∈ {1, 2, . . . , N − 1} : gcd(g, N ) = 1} d−1 of size ϕ(N ) d−1 , where ϕ denotes Euler's totient function. This is too large for a full search already for moderately large N and d ≥ 3. So far one relies on computer search to find good generating vectors, usually based on the fast component-by-component construction [30,31].
Korobov [32] suggested considering lattice point sets with generating vectors g = (1, g, g 2 , . . . , g d−1 ) in Z d with g ∈ {1, 2, . . . , N − 1} such that gcd(g, N ) = 1. The size of the search space for lattice points of this form reduces to ϕ(N ). At least for prime powers N and in dimension d = 3, there is an existence result of Larcher and Niederreiter [33] for g = 1, g, g 2 with For d > 3 this is open.