Abstract
The number of lattice points in d-dimensional hyperbolic or elliptic shells \(\{m : a<Q[m]<b\}\), which are restricted to rescaled and growing domains \(r\,\Omega \), is approximated by the volume. An effective error bound of order \(o(r^{d-2})\) for this approximation is proved based on Diophantine approximation properties of the quadratic form Q. These results yield effective variants of previous non-effective results in the quantitative Oppenheim problem and extend known effective results in dimension \(d \ge 9\) to dimension \(d \ge 5\). They apply to wide shells when \(b-a\) is growing with r and to positive definite forms Q. For indefinite forms they provide explicit bounds (depending on the signature or Diophantine properties of Q) for the size of non-zero integral points m in dimension \(d\ge 5\) solving the Diophantine inequality \(|Q[m] |< \varepsilon \) and provide error bounds comparable with those for positive forms up to powers of \(\log r\).
1 Introduction
Let Q[x] denote an indefinite quadratic form in d variables. We say that the form Q is rational, if it is proportional to a form with integer coefficients; otherwise it is called irrational. The Oppenheim conjecture, proved by G. Margulis [40] in 1986, states that \(Q[{\mathbb {Z}}^d]\) is dense in \({\mathbb {R}}\) if \(d \ge 3\) and Q is irrational. Initially this was conjectured for \(d \ge 5\) by A. Oppenheim [47, 48] in 1929 and in 1946 strengthened (for diagonal forms) to \(d\ge 3\) by H. Davenport [18]. The proof given in 1986 uses a connection, noticed by M. S. Raghunathan, between the Oppenheim conjecture and questions concerning closures in \(\mathrm {SL}(3, {\mathbb {R}}) / \mathrm {SL}(3, {\mathbb {Z}})\) of orbits of certain subgroups of \(\mathrm {SL}(3, {\mathbb {R}})\). It is based on the study of minimal invariant sets and the limits of orbits of sequences of points tending to a minimal invariant set. Previous studies have mostly used analytic number theory methods. In fact, B. J. Birch, H. Davenport and D. Ridout proved in a series of papers that \(Q[{\mathbb {Z}}^d]\) is dense in \({\mathbb {R}}\) if \(d \ge 21\) provided that Q is irrational, see [39, 41] for a complete historical overview until 1997.
For a measurable set \(B \subset {\mathbb {R}}^d\) let \(\mathrm {vol}\,\, B\) denote the Lebesgue measure of B and let \(\mathrm {vol}\,_{\mathbb {Z}}\, B \,:=\,\#(B \cap {\mathbb {Z}}^d)\) denote the number of integer points in B. We define for \(a, b \in {\mathbb {R}}\) with \(a < b\) the hyperbolic shell
\[E_{a,b} :=\{x \in {\mathbb {R}}^d \,{:}\, a< Q[x] < b\}.\]
The Oppenheim conjecture is equivalent to the statement that if \(d \ge 3\) and Q is irrational, then \(\mathrm {vol}_{\mathbb {Z}}\,E_{a,b} = \infty \) whenever \(a < b\). We would like to study the distribution of values of Q at integer points, often referred to as the “quantitative Oppenheim conjecture”, with an emphasis on establishing effective error bounds for the approximation of the number of lattice points restricted to growing domains. Our methods rely mainly on Götze’s Fourier approach [28] via theta series, translating the lattice point counting problem into averages of certain functions on the space of lattices, for which we extend the mean-value estimates obtained by Eskin–Margulis–Mozes [23].
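To make the counting problem concrete, the following is a minimal brute-force sketch and not the method of this paper. It assumes a diagonal form \(Q[m]=\sum _j q_j m_j^2\) and the cube \(\Omega =[-1,1]^d\); the function name `count_shell_points` and its parameters are hypothetical and chosen only for illustration.

```python
from itertools import product


def count_shell_points(coeffs, a, b, r):
    """Count m in Z^d with |m_i| <= r and a < Q[m] < b.

    Assumption: Q is diagonal, Q[m] = sum_j coeffs[j] * m_j**2,
    and the domain is the dilated cube r * [-1, 1]^d.
    """
    d = len(coeffs)
    bound = int(r)  # largest integer coordinate inside the cube (r >= 0)
    count = 0
    for m in product(range(-bound, bound + 1), repeat=d):
        value = sum(q * x * x for q, x in zip(coeffs, m))
        if a < value < b:
            count += 1
    return count


# Toy check with the rational form x^2 + y^2 - z^2 and the shell (-1/2, 1/2):
# only the integer zeros of the form in the cube [-2, 2]^3 are counted.
print(count_shell_points([1, 1, -1], -0.5, 0.5, 2))
```

The running time grows like \(r^d\), so this is only suitable for very small r and d; for an irrational example one might pass coefficients such as `[1, 1, 1, 1, -2 ** 0.5]`.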
1.1 Related results
Let \({\mathcal {R}}\) be a continuous positive function on the sphere \(\{v \in {\mathbb {R}}^d {:} \Vert v\Vert =1\}\) and let \(\Omega = \{v\in {\mathbb {R}}^d\,{:}\,\Vert v\Vert \le 1/ {\mathcal {R}}(v/\Vert v\Vert )\}\). Note that the Minkowski functional of \(\Omega \), that is \(M(v)=\inf \{r>0 \,{:}\, v \in r \Omega \}\), may be rewritten as \(M(v)=\Vert v\Vert \,{\mathcal {R}}(v/\Vert v\Vert )\) and therefore \(\Omega =\{v\in {\mathbb {R}}^d \,{:}\, M(v) \le 1\}\). Without loss of generality we may assume that \(\Omega \subset [-1,1]^d\). We denote by \(r\Omega \) the dilate of \(\Omega \) by \(r > 1\). In [20] S. G. Dani and G. Margulis obtained the following asymptotic exact lower bound under the same assumptions that Q is irrational and \(d\ge 3\):
Remark 1.1
It is not difficult to prove (see Lemma 3.8 in [23]) that as \(r\rightarrow \infty \),
where
L is the light cone \(Q=0\) and \(\mathrm {d}A\) is the area element on L.
The situation with asymptotics and upper bounds is more subtle. It was proved in [23] that if Q is an irrational indefinite quadratic form of signature (p, q) with \(p+q=d\), \(p \ge 3\) and \(q \ge 1\), then for any \(a<b\)
or, equivalently, as \(r\rightarrow \infty \)
where \(\lambda _{Q, \Omega }\) is as in (1.2).
If the signature of Q is (2, 1) or (2, 2), then no universal formula like (1.4) holds. In fact, one can show (see Theorem 2.2 in [23]) that if \(\Omega \) is the unit ball and \(q=1\) or \(q=2\), then for every \(\varepsilon >0\) and every \(a<b\) there exists an irrational quadratic form Q of signature (2, q) and a constant \(c>0\) such that for an infinite sequence \(r_j\rightarrow \infty \)
While the asymptotics as in (1.4) do not hold in the case of signatures (2, 1) and (2, 2), one can show (see [23]) that in these cases there is an upper bound of the form \(r^{d-2}\log r\). This upper bound is effective and it is uniform over compact sets in the space of quadratic forms. In addition, there is an effective uniform upper bound (see [23]) of the form \(cr^{d-2}\) for the case \(p\ge 3\), \(q\ge 1\).
The examples in [23] for the cases of signatures (2, 1) and (2, 2) are obtained by considering irrational forms which are very well approximated by split rational forms. More precisely, a quadratic form Q is called extremely well approximable by split rational forms (EWAS) if for any \(N>0\) there exists a split integral form \(Q^\prime \) and a real number \(t \ge 2\) such that
where \(||\cdot ||\) denotes a norm on the linear space of quadratic forms. It is shown in [22] that if Q is an indefinite quadratic form of signature (2, 2), which is not (EWAS), then for any interval (a, b), as \(r\rightarrow \infty \),
where \(\lambda _{Q, \Omega }\) is the same as in (1.2) and \({\widetilde{N}}_{Q,\Omega }(a,b,r)\) counts all the integral points in \(E_{a,b}\cap r\Omega \) not contained in rational subspaces isotropic with respect to Q. It should be noted that
(i) an irrational quadratic form of signature (2, 2) may have at most four rational isotropic subspaces,
(ii) if \(0\not \in (a,b)\), then \({\widetilde{N}}_{Q,\Omega }(a,b,r)=\mathrm {vol}_{\mathbb {Z}}\,(E_{a,b}\cap r\Omega ).\)
The above mentioned results have analogs for inhomogeneous quadratic forms
We define for \(a,b\in {\mathbb {R}}\) with \(a<b\) the shifted hyperbolic shell
We say that \(Q_\xi \) is rational if there exists \(t>0\) such that the coefficients of tQ and the coordinates of \(t\xi \) are integers; otherwise \(Q_\xi \) is irrational. Then, under the assumptions that \(Q_\xi \) is irrational and \(d\ge 3\), we have that (see [45])
The proof of (1.6) is similar to the proof of (1.1).
Let (p, q) be the signature of Q. If \(p \ge 3\), \(q \ge 1\) and \(Q_\xi \) is irrational then
or, equivalently, as \(r\rightarrow \infty \),
The proof of (1.7) is similar to the proof of (1.3), see [45]. The latter paper [45] also contains an analog of (1.5) for inhomogeneous forms in the case of signature (2, 2). One should also mention related results of Marklof [42, 43].
Remark 1.2
The proofs of the above mentioned results use such notions as a minimal invariant set (in the case of the Oppenheim conjecture) and an ergodic invariant measure. These notions do not have in general effective analogs. Because of that it is very difficult to get ‘good’ estimates for the size of the smallest non-trivial integral solution of the inequality \(|Q[m]|<\varepsilon \) and ‘good’ error terms in the quantitative Oppenheim conjecture by applying dynamical and ergodic methods.
1.2 Diophantine inequalities
One of our main objectives is to develop effective analogs of (1.8) and show that all indefinite quadratic forms Q of rank at least 5 admit a non-trivial integral solution to the Diophantine inequality \(|Q[m]|<\varepsilon \) whose size can be bounded effectively in terms of \(\varepsilon ^{-1}\). On the one hand, we will exploit Schlickewei’s results [51] on small zeros of integral forms (see Sect. 8.1) in order to establish effective bounds depending on the signature (r, s) of Q. On the other hand, we will introduce an appropriate Diophantine condition on the space of quadratic forms, which will enable us to significantly improve our effective bounds due to the exponents appearing in the Diophantine approximation of Q. To state these bounds we need to introduce notation.
Denote by Q also the symmetric matrix in \(\mathrm {GL}(d, {\mathbb {R}})\) associated with the form \(Q[x] :=\langle x, Q\, x\rangle \), where \(\langle \, \cdot \,, \,\cdot \, \rangle \) is the standard Euclidean scalar product on \({\mathbb {R}}^d\). Let \(Q_{+}\) denote the unique positive symmetric matrix such that \(Q_{+}^2=Q^2\) and let \(Q_{+}[x]=\langle x, Q_{+}\, x\rangle \) denote the associated positive form with eigenvalues being the eigenvalues of Q in absolute value. Let q, resp. \(q_0\), denote the largest, resp. smallest, of the absolute value of the eigenvalues of Q and assume \(q_0 \ge 1\). In the first case, where we compare Q with rational forms, we can replace the form Q by \(Q / \varepsilon \) and consider the solubility of the inequality \(|Q[m]| <1\). Since this Diophantine inequality includes the case of integral-valued indefinite forms, we shall appeal to Corollary 8.4 (a variant of Folgerung 3 in [51]) on the size of non-trivial integral solutions. Combining this result with our effective bounds we arrive at the following size estimate for a non-trivial solution of this Diophantine inequality.
Theorem 1.3
For all indefinite and non-degenerate quadratic forms Q of dimension \(d \ge 5\) and signature (r, s) there exists for any \(\delta >0\) a non-trivial integral solution \(m \in {\mathbb {Z}}^d {\setminus }\{0\}\) to the Diophantine inequality \(|Q[m]| < 1\) satisfying
where the dependency on the signature (r, s) is given by
In particular, for indefinite non-degenerate forms in \(d\ge 5\) variables of signature (r, s) and eigenvalues in absolute value contained in a compact set [1, C], i.e. \(1 \le q_0 \le q \le C\), Theorem 1.3 yields non-trivial solutions \(m \in {\mathbb {Z}}^d\) of \(|Q[m]| <\varepsilon \) of size bounded by
As an example, we obtain solutions of order \(\ll _{C,\delta } \varepsilon ^{-1- \frac{5}{(d-4)}-\delta }\) for the special case \(r= s+3\) and \(d \ge 12\). More generally, we may embed \({\mathbb {Z}}^{d_1} \subset {\mathbb {Z}}^d\) for dimensions \(d \ge d_1 \ge 5\), in such a way that the restricted form is indefinite and of rank \(d_1\), and apply Theorem 1.3 to this form in \(d_1\) dimensions. Since \((Q^*)^2 \le Q^2\) in the ordering of positive forms, we get \(q\ge q^{*} \ge q_0^* \ge q_0 \ge 1\) and \(|\det Q^*| \le |\det Q|\), and consequently obtain the following corollary.
Corollary 1.4
For all indefinite and non-degenerate quadratic forms Q in \(d \ge 5\) variables there exists for any \(\varepsilon >0\) at least one non-trivial integral solution \(m \in {\mathbb {Z}}^d\) of
for any \(\delta >0\), where \(f_d=12,\,8\frac{1}{2},\,7\frac{2}{3}\) for \(d=5,\,6,\,7\) respectively and \(f_d=7\frac{1}{2}\) for all \(d\ge 8\). The constant \(c_{C,\delta }\) depends only on \(\delta \) and \(C>0\) for forms Q satisfying \(1 \le q_0 \le q \le C\).
Remark 1.5
(a) For the special case of diagonal indefinite forms \(Q[x]=\sum _{j=1}^5 q_j x_j^2\) with \(\min |q_j| \ge 1\) Birch and Davenport [3] obtained a sharper bound. They showed for arbitrarily small \(\delta >0\) that there exists an \(m\in {\mathbb {Z}}^5 {\setminus } \{0\}\) with \(|Q[m]|<1\) and \(Q_{+}[m]\ll _{d,\delta } |\det {Q}|^{1+\delta }\). This implies (as above) for a compact set of forms Q that there exists an integral vector m satisfying \(|Q[m]|< \varepsilon \) and \(\Vert m\Vert \le c_{d,\delta }\, \varepsilon ^{-2+\delta }\) for any fixed \(\delta >0\). In [7] Buterus, Götze and Hille extended the approach of Birch and Davenport to improve the size of a solution by using Schlickewei’s result [51] on small zeros of integral forms: Let \(Q[x] = \sum _{j=1}^d q_j x_j^2\) be an indefinite form of signature (r, s) in \(d = r+s \ge 5\) variables. Then for any \(\varepsilon >0\) the Diophantine inequality \(|Q[m]| < \varepsilon \) admits a non-trivial solution \(m \in {\mathbb {Z}}^d\), whose size is bounded by \(\ll \varepsilon ^{-\rho +\delta }\) for any fixed \(\delta >0\).
(b) Recently, quantitative versions of the Oppenheim conjecture were studied by Bourgain [9], Athreya and Margulis [1], and Ghosh and Kelmer [26]. Bourgain [9] proves essentially optimal results for one-parameter families of diagonal ternary indefinite quadratic forms under the Lindelöf hypothesis, also using a Fourier approach, based on Epstein zeta functions. In contrast, Ghosh and Kelmer [26] consider the space of all indefinite ternary quadratic forms and use spectral methods (an effective mean ergodic theorem). Lastly, Athreya and Margulis apply classical bounds of Rogers for the \(L^2\)-norm of Siegel transforms in order to prove that for every \(\delta >0\) and almost every Q (with respect to the Lebesgue measure) with signature (r, s), there exists a non-trivial integral solution \(m \in {\mathbb {Z}}^d\) to the Diophantine inequality \(|Q[m] |< \varepsilon \) whose size is bounded by \(\Vert m\Vert \ll _{\delta ,Q} \varepsilon ^{-\frac{1}{d-2}-\delta }\) if \(d \ge 3\).
As mentioned above let us introduce a class of Diophantine forms as follows.
Definition 1.6
We call Q Diophantine of type \((\kappa ,A)\), where \(\kappa , A >0\), if for any \(m\in {\mathbb {Z}}{\setminus } \{0\}\) and \(M \in M(d,{\mathbb {Z}})\) we have
where \(\Vert \,\cdot \,\Vert \) denotes the operator norm induced by the Euclidean norm on \({\mathbb {R}}^n\).
We shall see in Sect. 4.3 that almost every form satisfies this property for some \(\kappa \) and A. In particular, fixing an integer k such that \(1 \le k \le \frac{d(d+1)}{2}-1\), we shall show that a form Q for which \(k+1\) non-zero entries \(y,x_1,\ldots , x_k\) exist such that \(x_1/y,\ldots , x_k/y\) are algebraic and \(1, x_1/y,\ldots , x_k/y\) are linearly independent over \({\mathbb {Q}}\) is Diophantine in this sense and admits a non-trivial solution to the Diophantine inequality \(|Q[m]|<\varepsilon \) of order \(\ll _{Q,d, \delta } \varepsilon ^{-\frac{d(3+2k)-4}{2k(d-4)}-\delta }\) for any \(\delta >0\). In particular, for \(k = \frac{d(d+1)}{2}-1\) we can give a bound for the size of the least solution of order \(\ll _{Q,d,\delta } \varepsilon ^{-\frac{d^3+d^2+d-4}{(d^2+d-2)(d-4)}-\delta }\) and in this case for \(d=5\) of order \(\ll _{Q,\delta } \varepsilon ^{-151/28-\delta }\).
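The specialization of the exponent \(\frac{d(3+2k)-4}{2k(d-4)}\) to the maximal value \(k=\frac{d(d+1)}{2}-1\) can be checked with exact rational arithmetic. The short sketch below only evaluates the formulas quoted in the preceding paragraph; the helper names `max_k` and `solution_exponent` are hypothetical.

```python
from fractions import Fraction


def max_k(d):
    """Largest admissible k, namely d(d+1)/2 - 1."""
    return d * (d + 1) // 2 - 1


def solution_exponent(d, k):
    """Exponent (d(3+2k) - 4) / (2k(d-4)), before the extra delta loss."""
    if d < 5:
        raise ValueError("the bound is stated for d >= 5")
    if not 1 <= k <= max_k(d):
        raise ValueError("k must satisfy 1 <= k <= d(d+1)/2 - 1")
    return Fraction(d * (3 + 2 * k) - 4, 2 * k * (d - 4))


def closed_form_exponent(d):
    """Exponent (d^3 + d^2 + d - 4) / ((d^2 + d - 2)(d - 4)) for maximal k."""
    return Fraction(d**3 + d**2 + d - 4, (d**2 + d - 2) * (d - 4))


# For d = 5 both expressions reduce to 151/28.
print(solution_exponent(5, max_k(5)), closed_form_exponent(5))
```

Since \(2k = d^2+d-2\) for the maximal k, the two expressions agree for every \(d \ge 5\), which the sketch confirms for any dimension one cares to try.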
Corollary 1.7
Let Q be an indefinite quadratic form in \(d \ge 5\) variables and of Diophantine type \((\kappa ,A)\) and fix \(\delta >0\). Then for any \(\varepsilon >0\) there exists a non-trivial lattice point \(m \in {\mathbb {Z}}^d {\setminus } \{0\}\) satisfying
For irrational indefinite quadratic forms we may quantify the density of values Q[m], \(m \in r\Omega \cap {\mathbb {Z}}^d\), where \(\Omega \) denotes a (not necessarily admissible) parallelepiped satisfying (7.1) (see Sect. 7.3) as follows: Consider the set
of values of Q[x], \(x \in r \Omega \cap {\mathbb {Z}}^d\) lying in the interval \([-c_0\,r^2, c_0\,r^2]\), where \(c_0\) denotes the constant introduced in Lemma 7.1. For each \(r \ge 1\) we arrange the values V(r) in increasing order \(v_0(r)< \ldots < v_k(r)\), \(k=k(r)\), and define the maximal gap between successive values of V(r) as
As a consequence of our technical quantitative bounds we obtain
Corollary 1.8
Let Q denote a non-degenerate indefinite form in \(d \ge 5\) variables and of Diophantine type \((\kappa ,A)\). For \(\delta >0\) we obtain for the maximal gap d(r) between successive values of the quadratic form in the set V(r)
for sufficiently large \(r \ge c_{\delta ,d,\Omega ,\kappa ,A,Q}\), where \(\nu _0 :=\frac{2d-8}{2d + 3 \kappa d - 4 \kappa }\) and \(c_{\delta ,d,\Omega ,\kappa ,A,Q}>0\) denotes a constant depending on \(\kappa , A, Q, \Omega , d\) and \(0<\delta <1/10\) (here we omit a description of the explicit dependence).
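For small parameters, the maximal gap d(r) and the exponent \(\nu _0\) in Corollary 1.8 can be explored numerically. The sketch below is an illustration under simplifying assumptions, not the construction used in the paper: it takes a diagonal form, replaces the parallelepiped \(\Omega \) by the cube \([-1,1]^d\), and treats the constant \(c_0\) of Lemma 7.1 as a user-supplied parameter. The names `max_gap` and `nu0` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def max_gap(coeffs, r, c0):
    """Largest gap between successive values of Q[x] in [-c0 r^2, c0 r^2].

    Assumptions: Q[x] = sum_j coeffs[j] * x_j**2 is diagonal, x ranges over
    the integer points of the cube r * [-1, 1]^d, and c0 is supplied by hand.
    """
    bound = int(r)
    limit = c0 * r * r
    values = set()
    for x in product(range(-bound, bound + 1), repeat=len(coeffs)):
        v = sum(q * t * t for q, t in zip(coeffs, x))
        if -limit <= v <= limit:
            values.add(v)
    ordered = sorted(values)
    if len(ordered) < 2:
        return 0  # no pair of successive values exists
    return max(hi - lo for lo, hi in zip(ordered, ordered[1:]))


def nu0(d, kappa):
    """nu_0 = (2d - 8) / (2d + 3*kappa*d - 4*kappa) for integer or Fraction kappa."""
    return Fraction(2 * d - 8) / (2 * d + 3 * kappa * d - 4 * kappa)


# Toy check with the integral form x^2 - 2 y^2 on the cube [-1, 1]^2.
print(max_gap([1, -2], 1, 10), nu0(5, 1))
```

With floating-point coefficients, nearly equal values may appear as distinct entries, but this can only create additional tiny gaps and does not change the maximum.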
For positive definite quadratic forms Davenport and Lewis (see [19]) conjectured that the distance between successive values \(v_n\) of the quadratic form Q[x] on \({\mathbb {Z}}^d\) converges to zero as \(n\rightarrow \infty \), provided that the dimension d is at least five and Q is irrational. This conjecture was proved by Götze in [28]. It also follows from the results of the present paper, which provides error bounds for the lattice point counting problem for the indefinite case as well as the positive definite case.
The proof is similar to that in the case of positive forms solved in [28]: For any \(\varepsilon >0\) and any interval \([b, b+\varepsilon ]\), we find at least two lattice points in the shell \(E_{b, b+\varepsilon }\) (and the box of size \(r= \sqrt{2b}\)) by Corollary 2.4, provided that b is larger than a threshold \(b(\varepsilon )\). Here \(b(\varepsilon )\) and consequently the distance between successive values (as a function of b) depends on the rate of convergence of the Diophantine characteristic \(\rho _Q^{\mathrm {ell}}(r)\) in the bound of Corollary 2.4 towards zero. For quadratic forms of Diophantine type \((\kappa ,A)\) this dependency can be stated explicitly.
1.3 Discussion of effective bounds and outline of the proofs
In order to prove an effective result like Theorem 1.3 we need an explicit bound for the error, say \(R( I_{E_{a,b} \cap r\Omega })\) (for a formal definition see (1.15) below) with \(I_B\) denoting the indicator of a set B, of approximating the number of integral points \(m \in E_{a,b}\) in a bounded domain \(r\,\Omega \) by the volume \(\mathrm {vol}\,(E_{a,b}\cap r\Omega )\), compare Remark 1.1. First, we simplify the problem by replacing the weights \(I_{r\Omega }(m)=1\) of integral points \(m\,\in r\,\Omega \) by suitable smoothly changing weights \(\text {v}(m/r)\) (for notational simplicity, we will write \(\text {v}_r(m) := \text {v}(m/r)\)), which tend to zero as m/r tends to infinity. This smoothing (together with a smoothing of the indicator function of [a, b]) allows us to use techniques from Fourier analysis, but we are forced to restrict the region \(\Omega \) to parallelepipeds in order to ensure that the corresponding error has logarithmic growth only.
1.3.1 Fourier analysis
Starting with smooth weight functions \(\text {v}_r\) (which depend on the dilation parameter r), we also construct a w-smoothing g of the indicator function of [a, b] via convolution with an appropriate kernel k whose Fourier transform decays like \(|\widehat{k}(t)| \ll \exp \{-\sqrt{|wt|}\}\). This allows us to replace the indicator function of [a, b] in the lattice point counting problem by a smooth function, gaining an error bounded in Corollary 3.2. After this smoothing procedure, writing \(g^Q(x) :=g(Q[x])\), our main objective will be to estimate the weighted lattice remainder
where g and \(\text {v}\) are smooth functions whose Fourier transforms decay fast enough as well. More precisely, we will assume that \(\text {v}\) satisfies (2.4). (At this point we should note that the abbreviation introduced in (1.15) will frequently be used to denote remainder terms.) Next we shall use inverse Fourier transforms in order to express the weights as
where \(\zeta (x) = \text {v}(x) \exp \{Q_{+}[x]\}\). Combining the resulting factors \(\exp \{ 2 \pi \mathrm {i}\,t \, Q[m] \}\), \(\exp \{ 2 \pi \mathrm {i}\langle v,m\rangle \}\) and \(\exp \{-Q_{+}[\frac{x}{r}]\}\) in (1.15) into terms of the generalized theta series
one arrives at an expression for the sum \( V_r :=\sum _{m\in {\mathbb {Z}}^{d}} \text {v}(\tfrac{m}{r}) \,g(Q[m])\) by the following integral (in t and v) over \(\theta _v(t)\):
The approximating integral \(W_r :=\int _{{\mathbb {R}}^d} \text {v}(\tfrac{x}{r})\,g(Q[x]) \, \mathrm {d}x\) to this sum \(V_r\) can be rewritten in exactly the same way by means of the theta integral
replacing the theta sum \(\theta _v(t)\). Thus, in order to estimate the error \(| R( g^Q \,\text {v}_r )| =|V_r-W_r|\), the integral over t and v of \(|\theta _v(t)- \vartheta _v(t)| |\widehat{g}(t) \widehat{\zeta }(v)|\) has to be estimated.
For \(|t| \le q_0^{-1/2}r^{-1}\) and \(\Vert x\Vert \ll r\) the functions \( x\mapsto \exp \{ 2 \pi \mathrm {i}\,t\,Q[x]\}\) are sufficiently smooth, so that the sum \(\theta _v(t)\) is well approximable by the first term of its Fourier series, that is the corresponding integral \(\vartheta _v(t)\), see (3.16) and (3.33). The error of this approximation, after integration over v, yields the second error term in (1.26), which does not depend on the Diophantine properties of Q. Additionally, we may restrict the integration to \(|t| \le T_{+}\) for an appropriate choice of \(T_{+}\) (depending on the width of the shell) by using the decay rate of the kernel k. So we end up with the remaining error term
which we estimate as follows
The second factor in the bound of I in (1.18) encodes both the Diophantine behavior of Q as described above as well as the growth rate with respect to r. We shall describe in the next subsection our method to extract out of this factor the correct rate of growth, while simultaneously avoiding the loss of information on the Diophantine properties of Q, provided that \(d >4\). However, let us first state that the resulting bound (the choice of \(T_+\) depending on the width of the shell) is an error bound depending on characteristics of \(\widehat{\zeta } (v)\) of the form (see Theorem 2.2)
which has to be optimized in the smoothing size w (compare e.g. Corollary 2.4) and \(\rho _{Q,b-a}^w(r)\) depends on the Diophantine properties of Q and r (see Theorem 2.2).
1.3.2 Mean-value estimates
In order to describe the second term in (1.19), we follow [28] (by using a modified Weyl differencing argument) to show in Lemma 3.3 that uniformly in v and pointwise in t
where \(\{\Lambda _t\}_{t \in {\mathbb {R}}}\) is a family of 2d-dimensional unimodular lattices generated by orbits of one-parameter subgroups of \(\mathrm {SL}(2,{\mathbb {R}})\) indexed by t and r, see (3.47) for the precise definition. It is well-known that the expression \(\psi (r,t) :=\sum _{v \in \Lambda _t} \exp \{- \Vert v\Vert ^2\}\) can be bounded by the number of lattice points \(v \in \Lambda _t\) satisfying \(\Vert v\Vert _\infty \ll 1\). Combining this estimate together with the symplectic structure of \(\Lambda _t\) (see Sect. 4.1) yields the estimate
where \(M_i(\Lambda _t)\) denotes the i-th successive minimum of \(\Lambda _t\) and \(\alpha _d(\Lambda _t)\) the d-th \(\alpha \)-characteristic of \(\Lambda _t\), that is \(\alpha _d(\Lambda _t) = \sup \{|\det (\Lambda ')|^{-1} \,{:}\, \Lambda ' \text { is a } d\text {-dimensional sublattice of } \Lambda _t \}\). After a local approximation of a certain one-parameter unipotent subgroup by the compact group \(\mathrm {SO}(2)\) (see Sect. 4.2), we estimate the average of \(\alpha _d(\Lambda _t)^\beta \) over t for \(0 < \beta \le 1/2\) in Lemmas 5.12, 6.1 and 6.2. This argument involves a recursion in the size of r and builds upon a method developed in [23] on upper estimates of averages of certain functions on the space of lattices along translates of orbits of compact subgroups.
Let us give a brief sketch of the main ideas involved in this argument. Let \({\mathrm {G}}= \mathrm {SL}(2,{\mathbb {R}}), \, {\mathrm {K}} = \mathrm {SO}(2)\) endowed with the probability Haar measure \(\mathrm {d}k\) and denote by \(A_r\) the mean-value operator on \({\mathrm {K}} \backslash {\mathrm {G}}\) defined by
where f is any continuous function on \({\mathrm {K}} \backslash {\mathrm {G}}\), \(g \in {\mathrm {G}}\) denotes any element for which \(\Vert g\Vert =r\) and \(\Vert \, \cdot \, \Vert \) denotes the operator norm induced by the standard Euclidean norm. Fixing \(2/d < \beta \le 1/2\), we shall show that uniformly in v and for all intervals I of fixed bounded length there exists a positive function f depending only on Q and \(\beta \) such that
where \(\gamma _{I,\beta }(r)\) contains information on the Diophantine properties of Q and tends to zero for irrational forms as r tends to infinity (see Corollary 4.11).
The function f does not appear isolated but emerges as the maximum of a family of positive functions \(f_1, \dots , f_{2d}\). For a positive number \(r_0>0\) and any \(g_0 \in {\mathrm {G}}\) such that \(\Vert g_0\Vert =r_0\) we show that this family satisfies two main properties. First, the value of each \(f_i\) on any orbit of the form \(g_0 {\mathrm {K}} h\) is bounded (up to a constant depending only on \(r_0\)) by its value \(f_i(h)\) at h. Second, the mean-value \(A_{r_0}(f_i)\) of any \(f_i\) satisfies the following functional inequality (see Lemma 5.11)
where we set \({\bar{i}} = \min \{i,2d-i\}\), \(\lambda _i := \max \{2, \beta {\bar{i}}\}\) and \(\tau _{\lambda _i}\) denotes the spherical function
where \(e_1=(1,0)\) denotes the first standard unit vector on \({\mathbb {R}}^2\).
The asymptotic growth of spherical functions is well-understood and in our case \(\tau _{\lambda }(g) \asymp \Vert g\Vert ^{\lambda -2}\) whenever \(\lambda >2\) and \(g \not \in {\mathrm {K}}\). Here spherical functions are crucial precisely because they are the eigenfunctions of the mean-value operator. We show, in a first instance, that any positive function f satisfying an inequality of the form
for \(\lambda >2\) and \(0<\eta <\lambda \) satisfies
for any \(r>0\), where \(g \in {\mathrm {G}}\) is any element for which \(\Vert g\Vert =r\). In other words, the mean value at \(\mathbb {1}\) grows at most as fast as the associated spherical function. In a second instance we obtain, after radializing the family, a preliminary estimate of the form
for any fixed \(\mu > \lambda _d\). We then show inductively, using repeatedly (1.21), (1.22) and (1.23) that
for an appropriate sequence \(\lambda _d>\mu _i >\lambda _i\). Combining these estimates again with (1.21) in the case \(i =d\) then yields the inequality \(A_{r_0}f_d \ll \tau _{\lambda _d}(g_0)f_d + f(\mathbb {1})\tau _\eta \), for some \(\eta <\lambda _d\), which implies together with (1.21) and (1.24) the desired and expected estimate (see Theorem 5.12), namely that \(A_r(f)(\mathbb {1}) \ll \tau _{\lambda _d}(g) f(\mathbb {1}) \asymp r^{\beta d -2} f(\mathbb {1})\) for any \(r \gg 1\) and any \(g \in {\mathrm {G}}\) for which \(\Vert g\Vert =r\). In particular for any such interval I we obtain the following bound
At this point the current approach is fundamentally different from the approach of previous effective bounds for \(R(I_{E_{a,b} \cap r\Omega })\) by Bentkus and Götze [6] (see also [5]) valid for \(d \ge 9\) and positive as well as indefinite forms. The reduction to (1.20) and the Diophantine factor \(\rho _{Q,b-a}^{w}(r)\) follows the approach used by Götze in [28], where the average on the right-hand side of (1.20) was estimated for \(d \ge 5\) by methods from the Geometry of Numbers and essentially required positive definite forms. A variant of that method was applied to split indefinite forms in a PhD thesis by G. Elsner [21].
1.3.3 Smooth weights on \({\mathbb {Z}}^d\)
For the Gaussian weights \(\text {v}_r(x) = \exp \{-2 \,Q_{+}[x]/r^2\}\) our techniques yield effective bounds for the approximation of a weighted count of lattice points \(m \in {\mathbb {Z}}^d\) with \(Q[m] \in [a,b]\) by a corresponding integral with an error
The following bounds for \(R(I_{E_{a,b}} \,\text {v}_r)\) are identical for the case of positive and indefinite d-dimensional forms Q, provided that \(d \ge 5\). Using Vinogradov’s notation \(A \ll _B C\), meaning that \(A< c_B \, C\) with a constant \(c_B>0\) depending on B, we have
Theorem 1.9
Let Q be a non-degenerate quadratic form in \(d\ge 5\) variables. Choose \(\beta =\tfrac{2}{d} + \tfrac{\delta }{d}\) for some arbitrarily small \(\delta \in (0,\tfrac{1}{10})\). Then for any \(r \ge q^{1/2}\), where q denotes the maximal eigenvalue of Q, \(b>a\) and \( 0< w < (b-a)/4\) we have
provided that \(b-a \le r\). If \(r < b-a \ll r^2\) the second term in the bound has to be replaced by \( r^{d/2} \log {r}\).
In Theorem 2.2 an explicit description of the Diophantine factor \(\rho _{Q,b-a}^{w}(r)\) will be provided. Depending on whether Q is definite or indefinite, this factor will be further refined in Corollary 2.4, resp. Corollary 2.5. Moreover, the function \(\rho _{Q,b-a}^{w}(r)\) tends to zero as r tends to infinity if Q is irrational. Additionally, if Q is Diophantine of type \((\kappa ,A)\), as introduced in Definition 1.6, we find a polynomial decay \(\rho _{Q,b-a}^{w}(r) \ll _{Q,d,A} r^{-\nu }\) for an appropriate choice of \(0<w<(b-a)/4\), where \(\nu \in (0,\infty )\) depends on d, \(\kappa \) and A, see Corollary 2.6. These results follow from Theorem 2.2 with parameters chosen for the indefinite, positive and effective Diophantine cases in the proofs in Sect. 7.4.
1.3.4 The role of the region \(\Omega \)
In order to estimate the lattice point deficiency \(R(I_{E_{a,b} \cap r\Omega })\) we have to \(\varepsilon \)-smooth the indicator function of \(\Omega \) which yields weights \(\zeta =\zeta _{\varepsilon }\) and an additional error of order \(\varepsilon (b-a) r^{d-2}\) in case of indefinite forms due to the intersection of \(E_{a,b}\) with the boundary \(\partial r\Omega \). For positive definite forms, \(r \Omega \) contains \(E_{a,b}\), that is \(\varepsilon >0 \) could be fixed independent of r, since this boundary intersection term is not present here.
In the indefinite case one needs to match the actual size of the error by choosing \(\varepsilon \) small enough in (1.19). This leads to a critical dependence on \(\varepsilon \) through the Fourier transform of \(\zeta _{\varepsilon }\) and its characteristics (see (2.6)). Here \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _1\) grows only moderately, like \((\log 1/\varepsilon )^d\), for arbitrarily small \(\varepsilon \), and only in the case of polyhedra, see Lemma 7.2. The quantity \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _{1,*}\), see (2.6), again depends critically on \(\Omega \) and the width \(b-a\) of the hyperbolic shell \(E_{a,b}\). For \(b-a \gg r\) the boundary of \( r\Omega \cap E_{a,b}\) will contain a larger segment of \(\partial r\Omega \). For a sequence of scalings r these segments of the \((d-1)\)-polytope potentially contain a large number of lattice points which induce large errors in the lattice point approximation, for which the technical restriction to the region \(\Omega \) is solely responsible. In order to avoid this artefact, which is reflected by a large growth of \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _{1,*}\) when \(\varepsilon \) is small, we restrict ourselves to special admissible regions \(r\Omega \), where \(\Omega =B^{-1}[-1,1]^d\), and \(B\in {\text {GL}}(d, {\mathbb {R}})\) is chosen such that the lattice \(\Gamma = B {\mathbb {Z}}^d\) is admissible in the sense of Sect. 7.3, i.e. both (7.1) and (7.29) are satisfied. This ensures that the lattice point remainder of \(r \Omega \) satisfies \(|\mathrm {vol}_{\mathbb {Z}}\,r\Omega - \mathrm {vol}\,r\Omega | \ll _{\Omega } (\log r)^{d-1}\) uniformly, which is ‘abnormally’ small. Likewise \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _{1,*}\) grows only at order \((\log 1/\varepsilon )^d\).
The resulting error bounds in Corollary 2.5 for wide shells with \(\max \{|a|,|b|\} \ll _{B} r^2\) are then comparable up to at most \((\log 1/\varepsilon )^d\) factors to the case of positive forms in Corollary 2.4.
1.4 Organization of this paper
The paper is organized mostly in logical order. In Sect. 2 we describe the explicit technical estimates on lattice point remainders for both positive definite and indefinite forms. In the following Sect. 3 we transfer the problem to Fourier transforms of the error starting with a first smoothing step and rewrite the lattice remainder in terms of integrals over d-dimensional theta sums. Section 4 provides a reformulation of the problem via upper bounds in terms of integrals over the absolute value of other theta sums with an underlying symplectic structure on \({\mathbb {R}}^{2d}\) which, in turn, are estimated using basic arguments from the Geometry of Numbers. Section 5 contains crucial estimates for averages of functions on the space of lattices. Finally, in Sect. 6 all these results are combined to prove Theorem 2.2. Starting with the applications, we collect in Sect. 7 the geometric bounds related to parallelepiped regions \(\Omega \) used in this paper and afterwards conclude (in Sect. 7.4) the results of Sect. 2. In the last Sect. 8 we focus on small values of indefinite quadratic forms: After recollecting and refining some results due to Schlickewei [51] on the size of small zeros of integral quadratic forms, we shall prove Theorem 1.3.
Compared to an earlier preprint [27] this version has been rewritten so that it allows one to separate the error contributions due to the Diophantine properties of Q and the influence of weights for the lattice points in Theorem 2.2. The latter has been developed for special choices of regions \(\Omega \) which are particularly relevant for wide shells \(E_{a,b}\) in Sect. 7. Moreover, the effective bounds for non-trivial solutions of the Diophantine inequality \(|Q[m]| < \varepsilon \) have been improved in terms of the signature (r, s) by using Schlickewei’s result [51] on small zeros of quadratic forms. In addition, we included a number of corrections concerning the explicit dependence on Q (resp. \(\Omega \)) and the dimensions, and corrected typos as well.
2 Effective estimates
We consider the quadratic form \(Q[x] :=\langle Q\,x, x\rangle \), \(x \in {\mathbb {R}}^d\),
where \(\langle \cdot ,\cdot \rangle \) resp. \(\Vert \,\cdot \,\Vert \) denote the standard Euclidean scalar product and norm, \(Q :{\mathbb {R}}^d\rightarrow {\mathbb {R}}^d\) denotes a symmetric linear operator in \(\mathrm {GL}(d, {\mathbb {R}})\) with eigenvalues \(q_1,\dots , q_d\). Write \(q_0 :=\min _{1 \le j \le d} |q_j|\), \(q :=\max _{1 \le j \le d} |q_j|\) and \(d_Q :=|\det Q|^{-1/2}\) (2.1).
In what follows we shall always assume that the form is non-degenerate, that is \(q_0>0\). In order to describe the explicit bounds we need to introduce some more notation. Let \(\beta > \tfrac{2}{d}\) such that \(0< \tfrac{1}{2} -\beta <\tfrac{1}{2} - \tfrac{2}{d}\) for \(d>4\). For a lattice \(\Lambda \subset {\mathbb {R}}^n\), \(n \in {\mathbb {N}}\), with \(\dim \Lambda = n\) we define for \(1\le l\le n\) its \(\alpha _l\)-characteristic by
Here \(\Lambda '=B\,{\mathbb {Z}}^l\) is determined by an \(n \times l\)-matrix B and \(\det (\Lambda ')= \det (B^T\, B)^{1/2}\) is the volume of a fundamental domain.
Remark 2.1
Given \(\Lambda = g {\mathbb {Z}}^n\) with \(g \in \mathrm {GL}(n,{\mathbb {R}})\), then any l-dimensional sublattice \(\Delta \subset \Lambda \) is spanned by \(g n_1,\ldots ,g n_l\), where \(n_i \in {\mathbb {Z}}^n\) and \(\det (\Delta ) = \Vert g n_1 \wedge \ldots \wedge g n_l\Vert \). If \(\Delta ' \subset \Lambda \) is a sublattice distinct from \(\Delta \) with basis \(g n_1',\ldots ,g n_l'\), \(n_i' \in {\mathbb {Z}}^n\), then
since the l-th exterior product of g is invertible. This argument shows that the \(\alpha _l\)-characteristic is attained at some l-dimensional sublattice \(\Lambda ' \subset \Lambda \).
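The two descriptions of the covolume used above, \(\det (B^T B)^{1/2}\) and the Euclidean norm of the wedge product of a basis, can be compared numerically. The following is a minimal sketch and not part of the argument: the helper names and the matrix B are hypothetical, and the wedge norm is computed, by the Cauchy-Binet formula, as the root of the sum of the squared \(l \times l\) minors of B.

```python
from itertools import combinations
from math import isclose, sqrt


def det(m):
    """Determinant of a square matrix (list of rows) by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def covolume_gram(B):
    """det(B^T B)^{1/2} for an n x l matrix B, given as a list of n rows."""
    n, l = len(B), len(B[0])
    gram = [[sum(B[k][i] * B[k][j] for k in range(n)) for j in range(l)]
            for i in range(l)]
    return sqrt(det(gram))


def covolume_wedge(B):
    """Norm of b_1 ^ ... ^ b_l: root of the sum of squared l x l minors of B."""
    n, l = len(B), len(B[0])
    return sqrt(sum(det([B[i] for i in rows]) ** 2
                    for rows in combinations(range(n), l)))


# Hypothetical data: the columns of B are the basis vectors (2, 0, 0) and (1, 1, 0).
B = [[2.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
# Same sublattice, basis changed by the unimodular matrix [[1, 1], [0, 1]].
B_alt = [[2.0, 3.0], [0.0, 1.0], [0.0, 0.0]]

print(covolume_gram(B), covolume_wedge(B), covolume_gram(B_alt))
```

Both functions return the same value, and the value does not change under a unimodular change of basis, which is the independence of the basis used throughout.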
In the special case \(n=2d\) we also introduce
where \(\Lambda _t = d_r u_t \Lambda _Q\) denotes a 2d-dimensional lattice obtained by an appropriate action of \(d_r, u_t \in \mathrm {SL}(2,{\mathbb {R}})\) on \({\mathbb {R}}^{2d}\) (see (4.25)), where \(d_r\) and \(u_t\) denote the usual diagonal and unipotent elements and \(\Lambda _Q\) denotes a fixed 2d-dimensional lattice depending on Q (see (4.28)). Recall that \(E_{a,b} = \{ x \in {\mathbb {R}}^d \,: \,a< Q[x] < b \}\) and let \(\text {v}(x)\) denote a smooth weight function such that \(\zeta (x) :=\text {v}(x) \exp \{Q_{+}[x]\}\) satisfies
An explicit construction of weight functions for parallelepiped regions will be given in Sect. 7. Nevertheless, as a simple example, one can take the Gaussian weights \(\text {v}(x) = \exp \{-2 Q_{+}[x]\}\).
Theorem 2.2
Let Q be a non-degenerate quadratic form in \(d\ge 5\) variables with \(q_0\ge 1\). Choose \(\beta =\tfrac{2}{d} + \tfrac{\delta }{d}\) for some arbitrarily small \(\delta \in (0,\tfrac{1}{10})\). Write \((b-a)_q :=b-a\) if \(b-a \le q\) and \((b-a)_q:=q^{\beta d -1/2}\) if \(b-a >q\), and \((b-a)^* :=(b-a)\) if \(b-a \le 1\) and \((b-a)^* :=1\) if \(b -a >1\). Then for any \(r \ge q^{1/2}\), \(b>a\) and \( 0< w < (b-a)/4\) we have
where \(C_Q :=q \,|\det {Q}|^{-1/4-\beta /2}\) and \(\Vert \text {v}\Vert _Q\) is defined in Lemma 7.1 (the quantity \(\Vert \text {v}\Vert _Q\) depends additionally on r, a, b and w, but we will suppress this dependence),
and \(c_Q :=|\det {Q}|^{1/4-\beta /2}\). Furthermore
and here \(\Vert v\Vert _{{\mathbb {Z}}} :=\min _{m \in {\mathbb {Z}}^d} \Vert v-m\Vert _{\infty }\).
We use the notation \(A \asymp _d B\) for quantities of equivalent size up to constants depending on d only, i.e. \(A \ll _d B \ll _d A\).
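The two truncated widths \((b-a)_q\) and \((b-a)^*\) from Theorem 2.2 are simple piecewise expressions. As a reading aid only, they can be written out as follows; the function names are hypothetical and the parameters mirror the symbols a, b, q, d and \(\delta \) of the theorem.

```python
def width_q(a, b, q, d, delta):
    """(b-a)_q: equals b-a if b-a <= q, and q^(beta*d - 1/2) otherwise."""
    beta = 2.0 / d + delta / d          # beta = 2/d + delta/d as in Theorem 2.2
    width = b - a
    return width if width <= q else q ** (beta * d - 0.5)


def width_star(a, b):
    """(b-a)^*: equals b-a if b-a <= 1, and 1 otherwise."""
    return min(b - a, 1.0)


# Hypothetical sample values: a thin shell and a wide shell for q = 4, d = 5.
print(width_q(0.0, 0.5, 4.0, 5, 0.05), width_star(0.0, 0.5))
print(width_q(0.0, 10.0, 4.0, 5, 0.05), width_star(0.0, 10.0))
```

Since \(\beta d = 2 + \delta \), the wide-shell value is \(q^{3/2+\delta }\), independent of the actual width \(b-a\).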
Remark 2.3
Note that
a) Theorem 2.2 extends to affine quadratic forms \(Q[x+\xi ]\) uniformly in \(|\xi |_{\infty } \le 1\).
b) Depending on the application, the lattice remainder (2.5) will be optimized in the parameters w, \(\varepsilon \) and \(T_{+}\) differently: For thin shells the error should also scale with the length \(b-a\). This forces \(T_{+}\) to be large and requires ‘strong’ Diophantine assumptions. In the case of wide shells it is possible to choose w relatively large.
c) If Q is irrational, then Corollary 4.11 implies that \(\rho _{Q,b-a}^{w}(r) \rightarrow 0\) for \(r \rightarrow \infty \), provided that w and \((b-a)\) are fixed. The first factor in the definition of \(\rho _{Q,b-a}^{w}\) corresponds to small values of t on the Fourier side and the last factor to the decay rate of the w-smoothing of the interval [a, b].
With these notations we state a result providing quantitative bounds for the difference between the volume and the lattice point volume in \(E_{a,b}\).
2.1 Ellipsoids \(E_{0,b}\)
Here Q is positive definite and we may assume that b tends to infinity. Let \(r=\sqrt{2b}\) in Theorem 2.2. Then the ellipsoid \(E_{0,b}=\{ x \in {\mathbb {R}}^d\;:\; Q[x] \le b \}\) is contained in \(r\Omega = Q_+^{-1/2}[-r,r]^d\). Choosing in Theorem 2.2 a smoothing of \(I_{\Omega }\), say \(\text {v}_{\varepsilon }\) of width \(\varepsilon =\tfrac{1}{15}\), which equals 1 on \(E_{0,b}\), and the smoothing parameter w in terms of \(T_+\), such that the right-hand side in (2.5) is minimal, will lead to
Corollary 2.4
Let Q denote a non-degenerate d-dimensional positive definite form with \(d \ge 5\) and \(q_0 \ge 1\). For any \(r \ge q^{1/2}\) and \(r= \sqrt{2b}\) we have with \(H_r :=E_{0,b}\)
where
and the infimum is taken over \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_+ \ge 1\), where \(a_Q = q |\det {Q}|^{\frac{1}{4}-\frac{\beta }{2}}\), \(c_Q = |\det {Q}|^{1/4-\beta /2}\). Furthermore, \(\lim _{r\rightarrow \infty } \rho _Q^{\mathrm {ell}}(r) =0\), provided that Q is irrational.
Compared to the quantitative results in [5, 6], this bound holds already for \(d \ge 5\). Moreover, Corollary 2.4 refines the estimates obtained in [28].
2.2 Hyperboloid shells \(E_{a,b}\)
If Q is indefinite, we distinguish, depending on \(b-a\), between ‘small’ and ‘wide’ shells \(E_{a,b}\). Here we restrict ourselves to a special class of rescaled admissible parallelepipeds \(r\Omega \) for \(r>0\): We suppose that \(\Omega =B^{-1}[-1,1]^d\) is determined by some \(B\in {\text {GL}}(d, {\mathbb {R}})\) such that the lattice \(\Gamma = B {\mathbb {Z}}^d\) is admissible in the sense of Sect. 7.3, i.e. both (7.29) and (7.1) should be satisfied (for examples, see Remark 7.4 and Example 7.6). Note that the latter condition (7.1), that is \(Q_{+} \le B^T B \le c_B Q_{+}\) with \(c_B \ge 1\), ensures that the region \(\Omega \) is rescaled with respect to the quadratic form Q.
To estimate the lattice point remainder for this restriction of \(E_{a,b}\) given by \(H_{r} :=E_{a,b}\cap r\Omega \) we smooth the indicator function \(I_{\Omega }\) in an \(\varepsilon \)-neighborhood with an error of order \({\mathcal {O}}(\varepsilon (b-a) r^{d-2})\) using Lemma 7.1. This yields a smooth function \(\text {v}_{\varepsilon }\) and a final weight function \(\zeta _{\varepsilon }\), according to (2.4) in Theorem 2.2. Since \(\Omega \) is admissible, both \(\Vert \zeta _{\varepsilon }\Vert _1\) and \(\Vert \zeta _{\varepsilon }\Vert _{*,r}\) in (2.6) are growing with a power of \(|\log \varepsilon |\) only, see Lemmas 7.2 and 7.8.
In the next step we calibrate both smoothing parameters w and \(\varepsilon \) in order to get Corollary 2.5 below for ‘wide’ and ‘thin’ shells. The actual choice of \(\varepsilon \) is then determined by calibrating the main terms \(\varepsilon r^{d-2}\) and \(\Vert \zeta _{\varepsilon }\Vert _1 \rho _{Q,b-a}^{w}(r) r^{d-2}\) depending on the speed of convergence of \(\lim _{r\rightarrow \infty } \rho _{Q,b-a}^{w}(r)=0\). The resulting error bound for indefinite forms will then differ at most by some \(|\log \varepsilon |\)-factors from the positive definite case, and is thus dominantly influenced by the Diophantine properties reflected in the decay of the \(\gamma _{[T_{-},T_{+}],\beta }\), resp. the \(\rho _{Q,b-a}^{w}\)-characteristic of irrationality. In particular we have uniformly for ‘small’ and ‘wide’ shells \(E_{a,b}\) and admissible regions \(\Omega \) the following bound:
Corollary 2.5
Under the assumptions of Theorem 2.2 we get for an admissible region \(\Omega \), all \(\max \{|a|,|b|\} \le c_0 r^2\), where \(c_0>0\) is chosen as in Lemma 7.1, and \(b-a \ge q\)
where
\({{\,\mathrm{{\text {Nm}}}\,}}(\Gamma ) :=\inf _{\gamma \in \Gamma {\setminus } \{0\}} |\gamma _1 \ldots \gamma _d|\) in standard coordinates \(\gamma = (\gamma _1,\ldots ,\gamma _d)\) and
where the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\). If \(b-a \le q\), then (2.8) holds, too, whereby the Diophantine factor \(\rho _{Q,b-a}^{\mathrm {hyp}+}(r)\) has to be replaced by
In the last equation the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\) with
These bounds refine the results obtained in [6] providing explicit estimates in terms of Q and are valid for \(d \ge 5\). Note that, due to the ‘uncertainty principle’ for the Fourier transform, we need to choose \(T_{+}\) at least as large as in (2.10) if \(E_{a,b}\) is ‘thin’ in order to control the factor \(\exp \{- |T_{+} w|^{1/2} \}\) (occurring in the definition of \(\rho _{Q,b-a}^{w}\)) which scales with \(b-a\). In Sect. 7.4 we prove a variant of Corollary 2.5 for thin shells and non-admissible regions \(\Omega \) as well, see Corollary 7.10.
2.3 Quadratic forms of Diophantine type \((\kappa ,A)\)
For any fixed \(T_{+}>1> T_{-}>0\) and irrational Q it is shown in Corollary 4.11 that
with a speed depending on the Diophantine properties of Q. For indefinite forms Q, this implies for fixed \(b-a >0\) that
and hence \(\Delta _r = o(r^{d-2})\) as \(r\rightarrow \infty \). This holds uniformly for all intervals [a, b] with \(0<u_r \le b-a \le v_r\le c_0 r^2\) and sequences \(\lim _r u_r=0\), \(\lim _r v_r = \infty \), \(r \rightarrow \infty \) depending on Q. For the special class of quadratic forms of Diophantine type \((\kappa ,A)\), as introduced in Definition 1.6, we may apply Corollary 4.11 to obtain explicit bounds on the Diophantine factors in the previous theorems as follows.
Corollary 2.6
Consider an indefinite quadratic form Q that is Diophantine of type \((\kappa ,A)\). Moreover, let \(\beta = 2/d + \delta /d \) for some sufficiently small \(0<\delta < \tfrac{1}{10}\). Then for the case of wide shells \(b-a \ge q\) in Corollary 2.5 we have
where \(h_Q = q \,|\det {Q}|^{1/2-\beta }\), \(\nu = (1-2\beta )/(2\kappa + 2)\) and \(\sigma = d(1/2-\beta )\). Thus for an admissible region \(\Omega \) satisfying (7.1) we have for all \(r \ge q^{1/2}\) and \(\max \{ |a|,|b|\} \le c_0 r^2\)
where the implied constant in (2.14) can be explicitly determined. For thin shells, i.e. \(b-a \le q\), we have
where the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\) restricted to
3 Fourier analysis
3.1 Smoothing
The first step in the proof of Theorem 2.2 is to rewrite the lattice point counting error (i.e. the left-hand side of (2.5)) in terms of integrals over appropriate smooth functions. To this end, we introduce smooth approximations of the indicator functions of \(E_{a,b}\) and \(\Omega \) constructed as follows. Denote by \(k=k(x) \,\mathrm {d}x\) a probability measure (symmetric around 0) with compact support satisfying \(k([-1,1])=1\) and \( |\widehat{k}(t)| \le C \exp \{-|t|^{1/2}\}\) for all \(t\in {\mathbb {R}}\) and a positive constant \(C>0\), where \(\widehat{k}(t) :=\int k(x) \exp \{ - 2 \pi \mathrm {i}\,t\,x \} \, \mathrm {d}x\) denotes the Fourier transform of the measure k. For an example of k we refer to Corollary 10.4 in [10]. More generally, by a result of Ingham [31] (see e.g. Theorem 10.2 in [10]) there is a probability density k such that \(|\widehat{k}(t)| \le C \exp \{ - u(|t|) |t| \}\), where u is a continuous, non-negative, non-increasing function on \([0,\infty )\) satisfying \(\int _1^\infty u(t) \,t^{-1} \, \mathrm {d}t < \infty \), and this condition is also necessary. However, we will not need this improved decay rate. For \(\tau >0\) let \(k_{\tau }\) denote the rescaled measures \(k_{\tau }(A) :=k(\tau ^{-1}A)\) for any \(A\in {\mathcal {B}}^d\), where \({\mathcal {B}}^d\) denotes the Borel \(\sigma \)-algebra. Using the same notation, let \(k_{\tau }(x) = k_{\tau }(x_1)\ldots k_{\tau }(x_d)\), \(x=(x_1,\ldots ,x_d)\), denote its multivariate extension on \({\mathbb {R}}^d\), \(d \ge 1\). Furthermore, let \(f*k_{\tau }\) denote the convolution of a function f on \({\mathbb {R}}^d\) and \(k_{\tau }\). We need the following standard estimate for smooth approximations.
Lemma 3.1
Let \(\mu \) and \(\nu \) be (positive) finite measures on \({\mathbb {R}}^d\), let f and \(f_{\tau }^{\pm }, \tau >0\), denote bounded real-valued Borel-measurable functions on \({\mathbb {R}}^d\) such that for any \(\tau >0\)
Then
Proof
Note that \(k_{\tau }\) is a probability measure with support contained in a \(\Vert \cdot \Vert _\infty \)-ball of radius \(\tau \). Hence, (3.1) implies the following chain of inequalities
which leads to
together with a similar lower bound. Since by (3.3) \(f \le f^{+}_{\tau }*k_{\tau }\le f^{+}_{2\tau }\) and \(f \ge f^{-}_{\tau }*k_{\tau }\ge f^{-}_{2\tau }\), the upper bound (3.4) together with the corresponding lower bound proves the lemma. \(\square \)
First we shall investigate approximations to the sum under consideration, counting the lattice points in \(E_{a,b}\) with weights \(\text {v}_r(x) :=\text {v}(x/r)\). In accordance with the notation introduced in (1.15) at the beginning of Sect. 1.3.1, we write
where \(\text {v}(x)\) is a sufficiently fast decreasing smooth function such that the function
satisfies (2.4). For such weights both sides of (3.5) are well defined and \(R(I_{E_{a,b}} \text {v}_r)\) may be estimated by Poisson’s formula, see [8], §46. By means of Lemma 3.1 we now replace the indicator \(I_{[a,b]}\) by a smooth approximation.
Corollary 3.2
Let \([a,b]_{\tau } :=[a-\tau ,b +\tau ]\) and write
where \(0< w < (b-a)/4\). Then
where \(R(g^Q_{\pm w}\text {v}_r)\) is defined in accordance with (3.5), \(\Vert \text {v}\Vert _Q\) is defined in Lemma 7.1 and \(c_d\) is a positive constant depending on d only.
Proof
In Lemma 3.1 we choose the measure \(\mu \), resp. \(\nu \), on \({\mathbb {R}}\) as the induced measure under the map \(x\mapsto Q[x]\) of the counting measure with weights \(\text {v}_r(m)\), resp. the measure \(\text {v}_r(x)\, \mathrm {d}x\). Let \(f(z)=I_{[a,b]}(z)\) and \(f^{\pm }_{\tau }(z)=I_{[a,b]_{\pm \tau }}(z)\). Then (3.1) is satisfied and (3.2) applies with \(\tau =w\). In order to bound the remainder term in (3.2) observe that
and apply the geometric estimate of Lemma 7.1; that is (7.10) of Sect. 7.1. \(\square \)
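The sandwich \(f^{-}_{2\tau } \le f^{-}_{\tau }*k_{\tau } \le f \le f^{+}_{\tau }*k_{\tau } \le f^{+}_{2\tau }\) behind Lemma 3.1 can be checked directly for \(f = I_{[a,b]}\) and \(f^{\pm }_{\tau } = I_{[a,b]_{\pm \tau }}\). The sketch below is only an illustration under a simplifying assumption: k is taken to be the uniform probability density on \([-1,1]\), which has the required support but not the Fourier decay demanded of k in the text. Exact rational arithmetic avoids rounding at the interval endpoints; all names are hypothetical.

```python
from fractions import Fraction


def indicator(c, d, x):
    """Indicator of the closed interval [c, d] (identically zero if c > d)."""
    return 1 if c <= x <= d else 0


def smoothed(c, d, tau, x):
    """(I_[c,d] * k_tau)(x) for k_tau the uniform probability density on [-tau, tau]."""
    overlap = min(d, x + tau) - max(c, x - tau)   # length of [x-tau, x+tau] within [c, d]
    return max(overlap, 0) / (2 * tau)


def sandwich_holds(a, b, w, x):
    """Check f^-_{2w} <= f^-_w * k_w <= f <= f^+_w * k_w <= f^+_{2w} at the point x."""
    upper = smoothed(a - w, b + w, w, x)          # f^+_w * k_w
    lower = smoothed(a + w, b - w, w, x)          # f^-_w * k_w
    return (indicator(a + 2 * w, b - 2 * w, x) <= lower
            <= indicator(a, b, x)
            <= upper <= indicator(a - 2 * w, b + 2 * w, x))


# Hypothetical parameters with 0 < w < (b - a)/4, tested on a grid of step 1/50.
a, b, w = Fraction(0), Fraction(1), Fraction(1, 5)
grid = [Fraction(k, 50) for k in range(-50, 101)]
print(all(sandwich_holds(a, b, w, x) for x in grid))
```

The condition \(w < (b-a)/4\) guarantees that the innermost interval \([a+2w, b-2w]\) is non-empty, so the lower chain is not vacuous.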
Thus we have reduced the determination of the lattice point remainder \(R(I_{E_{a,b}} \,\text {v}_r)\) to the remainder \(R(g^Q_{\pm w}\text {v}_r)\) for smooth weights. In the next subsection we shall rewrite the latter by means of the corresponding Fourier transforms.
3.2 Fourier transforms and theta-series
Rewrite the weight factor \(\text {v}\) in (3.5) as \(\text {v}(x) = \exp \{- Q_{+}[x] \} \,\zeta (x)\). Since by definition (see the previous Sect. 3.1)
where
we may express the weight functions \(g_{\pm w}\) and \(\zeta \) by their Fourier transforms
This yields
Using (3.10) we obtain by interchanging summation and integration in (3.5)
with \(e_{tQ}(x) :=\exp \{ 2 \pi \mathrm {i}\,t \,Q[x]\}\). (Here \(R(e_{tQ}\,\text {v}_r)\) denotes the inner integral with respect to the variable v.) In the same way, writing \({\tilde{e}}_{v,r}(x) :=\exp \{ - Q_{+}[x/r] + 2 \pi \mathrm {i}\,\langle x,v \,r^{-1} \rangle \}\), we derive by (3.11) the remainder
The sum \(R(e_{tQ}\,{\tilde{e}}_{v,r}) \) is the remainder between the generalized theta series and its corresponding theta integral, that is \(R(e_{tQ}\,{\tilde{e}}_{v,r})= \theta _v(t)- {\vartheta }_{v}(t)\), where
Let us note that both \(\vartheta _v(t)\) and \(\theta _v(t)\) depend on the dilating variable r. However, we shall suppress this underlying dependency in order to reduce the notational burden. For \(|t| \le q_0^{-1/2}r^{-1}\) we shall use the following representation of \( R(e_{tQ}\,{\tilde{e}}_{v,r})= \theta _v(t)- {\vartheta }_{v}(t)\) in (3.12) by means of Poisson’s formula (see [8], §46), which obviously applies here:
Note that by definition (3.14) the Fourier transform of \( x\mapsto \exp \{Q_{r,v}(t,x)\}\) at \(u \in {\mathbb {R}}^d\) is given by \(\vartheta _{v-r \,u}(t)\), where
In view of (3.13) and (3.16) we have
From here we only consider the weight \(g_w\). The same inequalities hold also for \(g_w\) replaced with \(g_{-w}\). Next, we decompose the integral over t in (3.12) into the segments \(J_0 :=[ - q_0^{-1/2}r^{-1}, q_0^{-1/2}r^{-1}] \) and \(J_1 :={\mathbb {R}}{\setminus } J_0\) and obtain
where
We start with the integral over the segment \(J_1\). In the term \(I_\theta \) we separate the t and v integrals via
where the estimation of the latter integral will be done in Sects. 4–6. In order to estimate the terms \(I_\Delta \) and \(I_\vartheta \) we need to estimate \(|\vartheta _v(t)|\) first:
3.2.1 Estimates for \(|\vartheta _v(t)|\)
For any symmetric complex \(d\times d\)-matrix \(\Xi \), whose imaginary part is positive definite, we have
where we choose the branch of the square root which takes positive values on purely imaginary \(\Xi \), \(v\in {\mathbb {R}}^d\) and \(\Xi ^{-1}[x]\) denotes the quadratic form \(\langle \Xi ^{-1}x,x\rangle \), defined by the inverse operator \(\Xi ^{-1} :{\mathbb {C}}^d\rightarrow {\mathbb {C}}^d\) whose imaginary part is negative definite (see [46], p. 195, Lemma 5.8 and (5.6)). We shall apply (3.24) in the case \(\Xi _t :=\mathrm {i}\pi ^{-1} {\tilde{Q}}_t = 2tQ+\mathrm {i}\pi ^{-1} r^{-2}Q_+\) in order to obtain the following expression for \(\vartheta _v\) in (3.14) (see also (3.17))
Hence, the Fourier transform of \(x \mapsto \exp \{Q_{r,v}(t,x)\}\) takes the following shape
A short calculation shows that \({\tilde{Q}}_t^{-1}=(4 \pi ^2 t^2 +r^{-4})^{-1}(2 \pi \mathrm {i}t Q^{-1}+r^{-2}Q_+^{-1})\) and it follows immediately that
Taking the absolute value of (3.25) and (3.27) we conclude that
where \(r_t :=r (4 \pi ^2 t^2 r^4 +1)^{-1/2}\) and \(d_Q :=|\det Q|^{-1/2}\) as already defined in (2.1).
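The closed form for \({\tilde{Q}}_t^{-1}\) and the role of \(r_t\) can be checked eigenvalue by eigenvalue. The sketch below rests on two assumptions that are not spelled out in this section: \(Q_+\) is taken to have the same eigenvectors as Q with eigenvalues \(|q_j|\), and \({\tilde{Q}}_t = r^{-2}Q_+ - 2\pi \mathrm {i}\,t\,Q\), which is what \(\Xi _t = \mathrm {i}\pi ^{-1}{\tilde{Q}}_t = 2tQ + \mathrm {i}\pi ^{-1}r^{-2}Q_+\) gives after multiplying by \(-\mathrm {i}\pi \). The function names are hypothetical.

```python
from math import isclose, pi


def q_tilde_eigenvalue(q, t, r):
    """Eigenvalue of Q~_t = r^{-2} Q_+ - 2 pi i t Q on an eigenvector of Q with eigenvalue q."""
    return abs(q) / r ** 2 - 2j * pi * t * q


def q_tilde_inverse_formula(q, t, r):
    """Same eigenvalue of (4 pi^2 t^2 + r^{-4})^{-1} (2 pi i t Q^{-1} + r^{-2} Q_+^{-1})."""
    return (2j * pi * t / q + 1.0 / (r ** 2 * abs(q))) / (4 * pi ** 2 * t ** 2 + r ** -4)


def r_t_squared(t, r):
    """r_t^2 with r_t = r (4 pi^2 t^2 r^4 + 1)^{-1/2}."""
    return r ** 2 / (4 * pi ** 2 * t ** 2 * r ** 4 + 1)


# Hypothetical eigenvalues (both signs), frequencies t and dilations r.
for q in (3.0, -0.5, 1.0):
    for t in (0.0, 0.01, -0.7):
        for r in (1.0, 4.0):
            inv = q_tilde_inverse_formula(q, t, r)
            # The stated formula is indeed the inverse ...
            assert abs(q_tilde_eigenvalue(q, t, r) * inv - 1.0) < 1e-12
            # ... and its real part is r_t^2 times the eigenvalue of Q_+^{-1}.
            assert isclose(inv.real, r_t_squared(t, r) / abs(q), rel_tol=1e-12)
```

Under these assumptions the real part of \({\tilde{Q}}_t^{-1}\) equals \(r_t^2\,Q_+^{-1}\), which explains why the Gaussian factors in the estimates of this section are governed by \(r_t\).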
3.2.2 Estimation of \(I_\vartheta \)
By (3.28) with \(v=ur\) we have \(|\vartheta _v(t)| \ll _d d_Q\,r^{d/2} \,r_t^{d/2}\) and therefore we obtain by using (3.8) after integrating over v in (3.21)
If \(|b-a|^{-1} \le q_0^{-1/2}r^{-1}\), then we use \(s_{[a,b]_w}(t) \le |t|^{-1}\) and \(r_t \le (rt)^{-1}\) to get the bound
In the case \(|b-a|^{-1} > q_0^{-1/2}r^{-1}\) we shall estimate the t-integral in (3.29) by means of \(s_{[a,b]_w}(t) \le |b-a+2w|/2\). Using \(|w| < (b-a)/4\) additionally leads to
Summarizing, we have established the bound
provided that \(d>2\).
3.2.3 Estimation of \(I_\Delta \)
According to (3.20), (3.13) and (3.16) we may write
In order to use the estimate (3.28) let \(v \in {\mathbb {R}}^d\) and write \(v = ru\) with \(u = u_0 + m_u\), where \(u_0 \in [-1/2,1/2]^d\) and \(m_u \in {\mathbb {Z}}^d\). Then
Note that \(\Vert m+u_0\Vert \ge \Vert m+u_0\Vert _\infty \ge \frac{1}{2}\) for any \(m \in {\mathbb {Z}}^d {\setminus } \{0\}\) and therefore \(\frac{\pi ^2}{2}Q_+^{-1}[u_0 +m] \ge \frac{\pi ^2}{8}q^{-1} \ge q^{-1}\) which yields the bound
where \(I_r(v) :=I_{[r/2,\infty )}(\Vert v\Vert _{\infty })\) and \(K_{u_0} :=\sum _{m \in {\mathbb {Z}}^d} \exp \{-\frac{\pi ^2}{2}r_t^2 Q_+ ^{-1}[m+u_0]\}\). The sum \(K_{u_0}\) may be estimated by an integral as follows: Since the map \(t\mapsto r_t^2 = r^2 (4 \pi ^2 t^2 r^4 +1)^{-1}\) is strictly monotone increasing on \(t < 0\) and decreasing on \(t >0\), we find that \(r_t^2 \ge q_0/(4 \pi ^2+1)\) for \(|t| \le q_0^{-1/2}r^{-1}\) as \(r \ge q^\frac{1}{2}\) and thus \(\exp \{- \pi ^2 r_t^2 Q_+^{-1}[u]\} \le \exp \{- \frac{q_0}{5} Q_{+}^{-1}[u]\}\). Let \(I :=[-\frac{1}{2} , \frac{1}{2}]^d\) and note that \( Q_+^{-1}[x] \le \tfrac{d}{4q_0}\) for \(x\in I\), from which we deduce that
where the integral on the right-hand side is at least one by Jensen’s inequality. Hence
Using (3.31) together with (3.33) and (3.34), we may now estimate \(I_\Delta \) by the following integrals. Writing \(v_0 = v- r m\), \(\Vert v_0\Vert _{\infty } \le \frac{r}{2}\), \(m \in {\mathbb {Z}}^d\), we have
where
If we write \(h(s;x) :=s^{d/4} \mathrm {e}^{-s \,x}\) with \(s,x>0\), then the maximum of \(s \mapsto h(s;x)\) is attained at \(s_0=d/(4\,x)\). Hence, \(\max _{t\in J_0} h(r_t^2;x)\ll _d \min (x^{-d/4}, r^{d/2})\ll _d (x+ \tfrac{1}{r^2})^{-d/4}\). Thus, we obtain with \(x=1/q\)
Note that the value \(x=1/q\) is within the range of \(t \mapsto r_t^2\), \(t \in J_0\), since its maximum is \(r_0^2=r^2\) and its minimum is \(q_0/(4 \pi ^2+1) \le r_{t^*}^2 \le q_0\), where \(t^*=\pm q_0^{-1/2}r^{-1}\). In order to estimate \(\Theta _{t,2}\), we choose \(x=Q_{+}^{-1}[v_0/r]/4\) and get
Now we integrate the bounds (3.36) and (3.37) in \(t \in J_0\) weighted with \(|\widehat{g}_w(t)|\): In view of (3.8) we have \(\int _{J_0} |\widehat{g}_w(t)| \, \mathrm {d}t \ll \log (1+|b-a| \,q_0^{-1/2}r^{-1})\) and thus we finally get, using the quantity \(\Vert \widehat{\zeta }\Vert _{*,r}\) as defined in (2.6) for the weights \(\zeta (x)\), the estimate
Applying (3.7) of Corollary 3.2 with (3.19), (3.30) and (3.38) we may now collect the results obtained so far as follows for the lattice point remainder of (3.5). We have
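Two elementary facts used in this subsection lend themselves to a numerical sanity check: the lower bound \(r_t^2 \ge q_0/(4\pi ^2+1)\) on \(J_0\), and the bound \(\max _{t \in J_0} h(r_t^2;x) \ll _d (x + r^{-2})^{-d/4}\) for \(h(s;x) = s^{d/4}\mathrm {e}^{-sx}\). The sketch below is not part of the proof; it uses the explicit constant \(\max \{(d/(4\mathrm {e}))^{d/4},1\}\,2^{d/4}\), which is one admissible choice (an assumption of this illustration), and hypothetical sample parameters.

```python
from math import e, exp, pi, sqrt


def h(s, x, d):
    """h(s; x) = s^{d/4} exp(-s x)."""
    return s ** (d / 4) * exp(-s * x)


def r_t_squared(t, r):
    """r_t^2 = r^2 / (4 pi^2 t^2 r^4 + 1)."""
    return r ** 2 / (4 * pi ** 2 * t ** 2 * r ** 4 + 1)


def envelope(x, r, d):
    """Explicit majorant max(C, 1) 2^{d/4} (x + r^{-2})^{-d/4} with C = (d/(4e))^{d/4}."""
    c = max((d / (4 * e)) ** (d / 4), 1.0)
    return c * 2 ** (d / 4) * (x + r ** -2) ** (-d / 4)


def check(d, q0, r, x, steps=200):
    """Verify both facts on a grid of t in J_0 = [-T, T] with T = q0^{-1/2} / r."""
    T = 1.0 / (sqrt(q0) * r)
    ts = [-T + 2 * T * k / steps for k in range(steps + 1)]
    lower_ok = all(r_t_squared(t, r) >= q0 / (4 * pi ** 2 + 1) * (1 - 1e-12) for t in ts)
    peak = h(d / (4 * x), x, d)                   # maximum of s -> h(s; x), at s_0 = d/(4x)
    bound = min(peak, envelope(x, r, d)) * (1 + 1e-12)
    upper_ok = all(h(r_t_squared(t, r), x, d) <= bound for t in ts)
    return lower_ok and upper_ok


# Hypothetical sample: d = 5, q_0 = 1, q = 4, r = 3 >= q^{1/2}, and x = 1/q.
print(check(5, 1.0, 3.0, 0.25))
```

The lower bound uses \(r \ge q^{1/2} \ge q_0^{1/2}\); for smaller r the check of the first fact can fail, in line with the hypothesis of Theorem 2.2.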
3.2.4 Estimation of \(I_\theta \)
We shall now estimate the crucial error term \(I_\theta \), see (3.22) and (3.23). At first we shall bound the theta series \(\theta _v(t)\) uniformly in v by another theta series in dimension \(2\,d\) in order to transform the problem to averages over functions on the space of lattices subject to an appropriate action of \(\mathrm {SL}(2,{\mathbb {R}})\). We have
Lemma 3.3
Let \(\theta _v(t)\) denote the theta function in (3.14) depending on Q, \(r \in {\mathbb {R}}\) and \(v\in {\mathbb {R}}^d\). For \(r \ge 1\), \(t \in {\mathbb {R}}\) the following bound holds uniformly in \(v \in {\mathbb {R}}^d\)
and \(H_t(m,n)\) is a positive quadratic form on \({\mathbb {Z}}^{2d}\). Note that \(H_t(m,n)\) depends as well on the currently fixed dilating variable r which we suppress here.
Proof
For any \(x,y \in {\mathbb {R}}^d\) the equalities
hold. Rearranging \(\theta _v(t)\, \overline{\theta _v(t)}\) and using (3.43), we would like to use \(m+n\) and \(m-n\) as new summation variables on a lattice. But both vectors have the same parity, that is \(m + n \equiv m - n \mod 2\). Since they are dependent one has to consider the \(2^d\) affine sublattices indexed by \( \alpha = (\alpha _1, \dots , \alpha _d)\) with \(\alpha _j \in \{0,1\}\) for \(1 \le j \le d\):
where, for \( m = (m_1, \dots , m_d)\), \(m \equiv \alpha \mod 2\) means \(m_j \equiv \alpha _j \mod 2\) for all \( 1\le j \le d\). Thus writing
we obtain \(\theta _v(t) = \sum _{\alpha } \theta _{v,\alpha }(t)\) and hence by the Cauchy-Schwarz inequality
Using (3.43) and the absolute convergence of \(\theta _{v,\alpha }(t)\), we can write
where \(\bar{m} = \frac{m+n}{2}\), \(\bar{n} = \frac{m - n}{2}\). Note that the map
is a bijection. Therefore we get by (3.44)
In this double sum fix \(\bar{n}\) and sum over \(\bar{m} \in {\mathbb {Z}}^d\) first, and call the inner sum \(\theta _v(t, \bar{n})\). Using (3.24) with \(\Xi = 2\mathrm {i}Q_+ r^{-2}/\pi \) and \(v = -4 \,t \,Q \,\bar{n} +m\), we get for \(\delta :=\left( \det \left( \frac{2}{\pi r^2} \,Q_{+}\right) \right) ^{-1/2}\) by the symmetry of Q and Poisson’s formula (see [8], §46)
Thus, we have uniformly in \(v \in {\mathbb {R}}^d\)
Hence we obtain by (3.45) and (3.46)
where \(G_t(m,n) :=\frac{\pi ^2 r^2}{2} Q_{+}^{-1}[\,m - 4 \,t \,Q\,n \,] +\frac{2}{r^2} Q_{+}[n]\). Since \(\pi ^2/2 >1 \) we may bound \(G_t(m,n)\) from below as follows:
which proves the claimed estimate (3.40). Finally, observe that we can write
which shows that \(H_t(m,n)\) is a positive definite quadratic form on \({\mathbb {Z}}^{2d}\). \(\square \)
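The change of variables \({\bar{m}} = (m+n)/2\), \({\bar{n}} = (m-n)/2\) in the proof above can be illustrated on a finite box. The sketch below reflects one reading of the argument, stated here as an assumption: for each parity class \(\alpha \), the pairs (m, n) with \(m \equiv n \equiv \alpha \mod 2\) are mapped injectively to integer pairs, and the images for different \(\alpha \) are disjoint, so that summing over all \(2^d\) classes runs over pairs \(({\bar{m}},{\bar{n}}) \in {\mathbb {Z}}^{2d}\) without repetition. All function names are hypothetical.

```python
from itertools import product


def class_points(alpha, K):
    """Points m in [-K, K]^d with m = alpha (mod 2) componentwise."""
    return [m for m in product(range(-K, K + 1), repeat=len(alpha))
            if all((mj - aj) % 2 == 0 for mj, aj in zip(m, alpha))]


def transform(m, n):
    """(m, n) -> ((m + n)/2, (m - n)/2); the division is exact because m = n (mod 2)."""
    m_bar = tuple((mj + nj) // 2 for mj, nj in zip(m, n))
    n_bar = tuple((mj - nj) // 2 for mj, nj in zip(m, n))
    return m_bar, n_bar


def inverse(m_bar, n_bar):
    """Inverse map: m = m_bar + n_bar, n = m_bar - n_bar."""
    m = tuple(x + y for x, y in zip(m_bar, n_bar))
    n = tuple(x - y for x, y in zip(m_bar, n_bar))
    return m, n


def images_by_class(d, K):
    """For each parity class alpha, the image of {(m, n) : m = n = alpha mod 2} in the box."""
    images = {}
    for alpha in product((0, 1), repeat=d):
        pts = class_points(alpha, K)
        images[alpha] = {transform(m, n) for m in pts for n in pts}
    return images


images = images_by_class(2, 3)
total = sum(len(s) for s in images.values())
print(total == len(set().union(*images.values())))   # images of different classes are disjoint
```

The class of an image pair can be read off from \({\bar{m}} + {\bar{n}} = m \equiv \alpha \mod 2\), which is why no pair is counted twice.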
In view of Lemma 3.3 we can introduce the 2d-dimensional lattice
where
in order to write \(\psi (r,t)= \sum _{v \in \Lambda _t} \exp \{ -\Vert v\Vert ^2\}\) as the Siegel transform of \(\exp \{-\Vert x\Vert ^2\}\) evaluated at the lattice \(\Lambda _t\). According to the Lipschitz principle in the Geometry of Numbers (see [50], Lemma 2, or [23], Lemma 3.1) one can show that \(\psi (r,t) \ll _d \alpha (\Lambda _{t})\), where \(\alpha \) is the maximum over all \(\alpha _l\)-characteristics (see (2.2)). However, we choose to follow a more direct and transparent argument for the sake of clarity and motivate the relation between the \(\alpha _i\)-characteristics and the successive minima of a lattice for the convenience of the reader. The following Lemma 3.4 (with \(\varepsilon =1\)) reduces the problem of estimating the theta series (3.41) to the problem of counting lattice points as follows
Lemma 3.4
Let \(\Lambda \) be a lattice in \({\mathbb {R}}^d\). Assume that \(0<\varepsilon \le 1\), then
where \({\mathcal {H}} :=\bigl \{\,v \in \Lambda \, : \, \Vert v\Vert _\infty < 1 \,\bigr \}\).
Proof
The lower bound for the sum is obvious by restricting summation to the set of elements in \({\mathcal {H}}\). As for the upper bound introduce for \(\mu = (\mu _1, \dots , \mu _d)\in {\mathbb {Z}}^d\) the sets
such that \({\mathbb {R}}^d = \bigcup _{\mu \in {\mathbb {Z}}^d} B_\mu \). For any fixed \(w^* \in {\mathcal {H}}_\mu :=\Lambda \cap B_\mu \) we have \(w - w^*\in {\mathcal {H}}\) for all \(w\in {\mathcal {H}}_\mu \). Hence we conclude for any \(\mu \in {\mathbb {Z}}^d\)
Since \(x \in B_\mu \) implies \(\Vert x\Vert _\infty \ge \Vert \mu \Vert _{\infty }/2\), we obtain
This concludes the proof of Lemma 3.4. \(\square \)
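The counting step in the proof, namely that each box \(B_\mu \) contains at most \(\# {\mathcal {H}}\) lattice points because \(w - w^* \in {\mathcal {H}}\) for w, \(w^*\) in the same box, can be observed on a small example. The sketch assumes the half-open unit boxes \(B_\mu = \prod _j [\mu _j - \tfrac{1}{2}, \mu _j + \tfrac{1}{2})\), which have the two properties quoted in the proof but need not coincide with the sets in the omitted display, and a hypothetical planar lattice with rational basis so that the arithmetic is exact.

```python
from collections import Counter
from fractions import Fraction
from math import floor


def lattice_points(g, K):
    """Points a*(g11, g21) + b*(g12, g22) of a planar lattice for |a|, |b| <= K."""
    (g11, g12), (g21, g22) = g
    return [(a * g11 + b * g12, a * g21 + b * g22)
            for a in range(-K, K + 1) for b in range(-K, K + 1)]


def box_index(v):
    """Index mu of the half-open box prod_j [mu_j - 1/2, mu_j + 1/2) containing v."""
    return tuple(floor(x + Fraction(1, 2)) for x in v)


def count_H(points):
    """Number of lattice points with sup-norm strictly less than 1."""
    return sum(1 for v in points if max(abs(x) for x in v) < 1)


def max_points_per_box(points):
    """Largest number of the given lattice points falling into a single box."""
    return max(Counter(box_index(v) for v in points).values())


# Hypothetical lattice g Z^2 with g = [[3/10, 1/10], [0, 2/5]] (rows of g).
g = ((Fraction(3, 10), Fraction(1, 10)), (Fraction(0), Fraction(2, 5)))
points = lattice_points(g, 20)
print(max_points_per_box(points), count_H(points))
```

The enumeration range K = 20 is large enough to contain all of \({\mathcal {H}}\) for this lattice; boxes near the edge of the enumerated region are only partially filled, which does not affect the inequality.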
4 Functions on the space of lattices and geometry of numbers
Let \(n \in {\mathbb {N}}^+\) be fixed (later to be chosen as \(n=2d\)) and for every integer l with \(1\le l\le n\) we fix a quasinorm \(\vert \cdot \vert _l\) on the exterior product \(\bigwedge ^l {\mathbb {R}}^n\). Let L be a subspace of \({\mathbb {R}}^n\) and \(\Delta \) a lattice in L (i.e. \(\Delta \) is a free \({\mathbb {Z}}\)-module of full rank \(\dim L\)). Then any two bases of \(\Delta \) are related by a unimodular transformation, that is, if \(u_1, \dots , u_l\) and \(v_1,\dots , v_l\) are two bases of \(\Delta \), where \(l = \dim L\), then \(v_1 \wedge \dots \wedge v_l = \pm u_1 \wedge \dots \wedge u_l\), which implies that the expression \(|v_1 \wedge \dots \wedge v_l|_l\) is independent of the choice of basis.
Let \(\Delta \) be a lattice in \({\mathbb {R}}^n\). We say that a subspace L of \({\mathbb {R}}^n\) is \(\Delta \)-rational if \(L\cap \Delta \) is a lattice in L. For any \(\Delta \)-rational subspace L, we denote by \(d_\Delta (L)\), or simply by d(L), the quasinorm \(\vert u_1\wedge \ldots \wedge u_l\vert _l\) where \(\{ u_1,\ldots ,u_l\}\), \(l =\dim L\), is a basis of \(L\cap \Delta \) over \({\mathbb {Z}}\). For \(L = \{ 0\}\) we write \(d(L) :=1\). If the quasinorms \(\vert \cdot \vert _{l}\) are the norms on \(\bigwedge ^l {\mathbb {R}}^n\) induced from the standard Euclidean norm on \({\mathbb {R}}^n\), then d(L) is equal to the determinant (or discriminant) \(\det (L\cap \Delta )\) of the lattice \(L \cap \Delta \), that is the volume of \(L/(L\cap \Delta )\). In particular, in this case the lattice \(\Delta \) is said to be unimodular if and only if \(d_\Delta ({\mathbb {R}}^n) = 1\). Also in this case \(d(L)d(M)\ge d(L\cap M)d(L+M)\) for any two \(\Delta \)-rational subspaces L and M (see Lemma 5.6 in [23]), but any two quasinorms on \(\bigwedge ^l {\mathbb {R}}^n\) are equivalent, which proves
Lemma 4.1
There is a constant \(C\ge 1\) depending only on the quasinorm \(\vert \,\cdot \,\vert _l\) and not on \(\Delta \) such that
for any two \(\Delta \)-rational subspaces L and M.
Let us introduce the following notations for \(0\le l \le n\),
This extends the earlier definition (2.2) of \(\alpha _l(\Delta )\) in the introduction of Sect. 2 to the case of general quasinorms on \(\bigwedge ^l {\mathbb {R}}^n\). In this section the functions \(\alpha _l\) and \(\alpha \) will be based on standard Euclidean norms, that is, we have \(d(L)=\det (L \cap \Delta )\).
In the following we shall use some facts from the Geometry of Numbers and the classical reduction theory for lattices in \({\mathbb {R}}^n\), see Davenport [17], Siegel [52], Cassels [14] and Einsiedler-Ward [24]. The successive minima of a lattice \(\Lambda \) are the numbers \(M_1(\Lambda ) \le \dots \le M_n(\Lambda )\) defined as follows: \(M_j(\Lambda )\) is the infimum of \(\lambda > 0\) such that the set \(\{v \in \Lambda : \Vert v\Vert < \lambda \}\) contains j linearly independent vectors and in particular \(M_1(\Lambda )\) is the length of the shortest non-zero vector of the lattice \(\Lambda \). It is easy to see that these infima are attained, that is, there exist linearly independent vectors \(v_1,\dots , v_n \in \Lambda \) such that \(\Vert v_j\Vert = M_j(\Lambda )\) for all \(j=1,\ldots , n\). Moreover, as a consequence of the reduction algorithm of Korkine and Zolotareff (see [35,36,37]) the \(\alpha _l\)-characteristic and the successive minima are related according to \(\alpha _l(\Lambda ) \asymp _d (M_1(\Lambda ) \dots M_l(\Lambda ))^{-1}\) (see [24], Chapter 1, Theorem 15).
Lemma 4.2
Let F be a norm in \({\mathbb {R}}^n\) and denote by \(M_1 \le \dots \le M_n\) the successive minima with respect to F. Let \(\Lambda \) be a lattice in \({\mathbb {R}}^n\), then
Moreover, for any \(\mu >0\), if \(1\le j\le n\) is such that \(M_j(\Lambda ) \le \mu <M_{j+1}(\Lambda )\), where the right-hand side is omitted if \(j=n\), then
Proof
First we prove the lower bound. We may assume that \(M_j(\Lambda ) \le \mu < M_{j+1}(\Lambda )\), the right-hand side being omitted if \(j=n\). Let \(v_1,\dots , v_n\) denote the elements in \(\Lambda \) corresponding to the successive minima \(M_i(\Lambda )\), \(i=1, \ldots , n\). For \(m_1, \ldots , m_j \in {\mathbb {Z}}\) with \(|m_i| \le j^{-1}\,\mu \,F(v_i)^{-1}\) notice that \(v=m_1\,v_1+ \ldots + m_j\,v_j\) satisfies \(F(v) \le \mu \), thus
The upper bound is also proven in Davenport [17] (see Lemma 1). We include the short argument here for the sake of completeness: Let \(w_1,\dots , w_n\) be an integral basis of \(\Lambda \) such that \(v_i\) is linearly dependent on \(w_1,\dots , w_i\) for any \(i=1,\dots , n\). Consequently any lattice point \(v \in \Lambda \) with \(F(v) < M_{j+1}\) is linearly dependent on \(w_1,\ldots , w_j\) and hence any element \(v \in \Lambda \) with \(F(v)\le \mu \) can be written as \(v= m_1\,w_1+ \ldots + m_j\,w_j\) with \(m_i \in {\mathbb {Z}}\). Suppose \(v' \in \Lambda \) is another element with \(F(v') \le \mu \) and write \(v'= m_1'\,w_1+ \ldots + m_j'\,w_j\) with \(m_i' \in {\mathbb {Z}}\). Now define positive integers \(\nu _1, \ldots , \nu _j\) by
and observe that \(\nu _1 \ge \nu _2 \ge \ldots \ge \nu _j\). Assume for the moment that \(v \ne v'\) and that \(m_i \equiv m_i' \,\mod 2^{\nu _i}\) for every \(i=1,\ldots , j\), and let \(i_0\) denote the largest index such that \(m_{i_0} \ne m'_{i_0}\). Then \(x :=2^{-\nu _{i_0}}\,(v-v')\) is an element of \(\Lambda \) and linearly independent of \(w_1, \dots , w_{{i_0}-1}\). This implies \(F(x) \ge M_{i_0}(\Lambda )\). On the other hand we have
by (4.7). This contradiction shows that there is at most one lattice point \(v \in \Lambda \) with \(F(v) \le \mu \) whose coordinates \(m_1, \ldots , m_j\) lie in prescribed residue classes modulo \(2^{\nu _1}, \,2^{\nu _2}, \dots , 2^{\nu _j}\) respectively. Hence, the number of lattice points \(N(\mu )\) in (4.6) is bounded from above by the number of all residue classes, i.e. by \(2^{\nu _1}\,2^{\nu _2}\ldots \,2^{\nu _j} \le (4 \,\mu )^j (M_1(\Lambda )\ldots M_j(\Lambda ))^{-1}\). This shows the upper bound in (4.5). \(\square \)
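The two counting bounds that come out of this proof can be tested by brute force in the plane. The sketch below takes F to be the Euclidean norm and a hypothetical lattice; it compares the number \(N(\mu )\) of lattice points with \(F(v) \le \mu \) against the upper bound \((4\mu )^j (M_1 \cdots M_j)^{-1}\) and against the lower bound \(\prod _{i \le j} (2\lfloor \mu /(j M_i) \rfloor + 1)\), which is the number of coefficient vectors admitted in the first part of the proof. The helper names are assumptions of this illustration.

```python
from math import floor, hypot


def lattice_vectors(basis, K):
    """Vectors a*b1 + b*b2 of a planar lattice for |a|, |b| <= K."""
    (b1x, b1y), (b2x, b2y) = basis
    return [(a * b1x + b * b2x, a * b1y + b * b2y)
            for a in range(-K, K + 1) for b in range(-K, K + 1)]


def successive_minima(vectors):
    """Brute-force (M_1, M_2) from a finite list of lattice vectors (Euclidean norm)."""
    nonzero = sorted((v for v in vectors if hypot(*v) > 1e-12), key=lambda v: hypot(*v))
    v1 = nonzero[0]
    # First vector in order of length that is linearly independent of v1.
    v2 = next(v for v in nonzero if abs(v1[0] * v[1] - v1[1] * v[0]) > 1e-9)
    return hypot(*v1), hypot(*v2)


def count_and_bounds(basis, mu, K=12):
    """Return (lower bound, N(mu), upper bound) for the lattice spanned by basis."""
    vectors = lattice_vectors(basis, K)
    minima = [m for m in successive_minima(vectors) if m <= mu]   # M_1, ..., M_j
    j = len(minima)
    count = sum(1 for v in vectors if hypot(*v) <= mu)            # includes the zero vector
    lower, upper = 1, (4 * mu) ** j
    for m in minima:
        lower *= 2 * floor(mu / (j * m)) + 1
        upper /= m
    return lower, count, upper


# Hypothetical lattice with basis vectors (1, 0) and (0.4, 2.5).
basis = ((1.0, 0.0), (0.4, 2.5))
for mu in (0.5, 2.0, 3.0, 5.0):
    print(mu, count_and_bounds(basis, mu))
```

The enumeration range K = 12 suffices for \(\mu \le 5\) with this basis; for larger \(\mu \) or more skewed bases it would have to be increased.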
Lemma 4.3
(Davenport [17]) Let \(\Lambda = g \,{\mathbb {Z}}^n\) and \(\Lambda ' = (g^{-1})^T \,{\mathbb {Z}}^n\) denote dual lattices of rank n, then for all \(j=1,\ldots , n\) we have
This is a variant of Lemma 2 of Davenport [17] for the Euclidean norm. Again, for the reader’s convenience, we include the short argument here.
Proof
Let \(v_1, \ldots ,v_n \in \Lambda \), resp. \(v_1',\ldots ,v_n' \in \Lambda '\), be linearly independent such that \(\Vert v_i\Vert = M_i(\Lambda )\), resp. \(\Vert v_i'\Vert = M_i(\Lambda ')\). Then \(v_1,\ldots ,v_j\) cannot be orthogonal to all lattice points \(v_1',\ldots ,v_{n+1-j}'\), otherwise they would fail to be independent. Thus, we have \(\langle v_i, v_k' \rangle \ne 0\) for some \(i =1,\ldots ,j\) and \(k=1,\ldots , n+1-j\), which implies that
because of duality. The right-hand side of (4.8) follows from (4.4) with \(l=n\), which is known as Minkowski’s inequality. Indeed, \(\det (\Lambda )= \alpha _n(\Lambda )^{-1} \asymp _n M_1(\Lambda ) \dots M_n(\Lambda )\) and since \(\det (\Lambda )\det (\Lambda ')=1\) we conclude that
\(\square \)
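The duality relation of Lemma 4.3 can be observed numerically in the plane. The sketch below, with a hypothetical basis and brute-force search, computes the successive minima of \(\Lambda = g\,{\mathbb {Z}}^2\) and of \(\Lambda ' = (g^{-1})^T {\mathbb {Z}}^2\) and checks that the products \(M_j(\Lambda )\,M_{3-j}(\Lambda ')\) are at least 1 and of order 1. It only illustrates the statement for \(n=2\) and is no substitute for the proof.

```python
import itertools
import math


def successive_minima_2d(b1, b2, coeff_range=30):
    """Brute-force successive minima (Euclidean norm) of the lattice Z*b1 + Z*b2."""
    vecs = []
    for m1, m2 in itertools.product(range(-coeff_range, coeff_range + 1), repeat=2):
        if (m1, m2) != (0, 0):
            vecs.append((m1 * b1[0] + m2 * b2[0], m1 * b1[1] + m2 * b2[1]))
    vecs.sort(key=lambda v: math.hypot(*v))
    v1 = vecs[0]
    # Shortest vector that is linearly independent of v1.
    v2 = next(v for v in vecs if abs(v1[0] * v[1] - v1[1] * v[0]) > 1e-9)
    return math.hypot(*v1), math.hypot(*v2)


def dual_basis(b1, b2):
    """Columns of (g^{-1})^T, where g has columns b1 and b2."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    return (b2[1] / det, -b2[0] / det), (-b1[1] / det, b1[0] / det)


# Hypothetical example basis.
b1, b2 = (1.0, 0.0), (0.4, 1.7)
d1, d2 = dual_basis(b1, b2)

M = successive_minima_2d(b1, b2)
M_dual = successive_minima_2d(d1, d2)

products = (M[0] * M_dual[1], M[1] * M_dual[0])
print("minima:", M, "dual minima:", M_dual, "products:", products)
```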
4.1 Symplectic structure of \(\Lambda _t\)
In the following we shall apply the previous results from the Geometry of Numbers to the special 2d-dimensional lattice \(\Lambda _t\) introduced in (3.47). The symplectic structure of \(\Lambda _t\) will allow us to establish a majorizing relation between the theta series (3.41) and the \(\alpha _d\)-characteristic of \(\Lambda _t\), see (4.14). To do this, we shall apply Lemma 4.2 combined with Lemma 4.3 as follows. (We note that the results of this section remain valid regardless of whether \(r \ge q^{1/2}\) or not.)
Lemma 4.4
Let \(\Lambda _t\) be the lattice defined in (3.47). Then we have for any \(t \in {\mathbb {R}}\)
and the lower bound
Corollary 4.5
As a consequence, we find for \(\mu \ge 1\)
and
Proof of Lemma 4.4
First we prove (4.9). Let
and consider the lattice
Then \(J D_{rQ} U_{4tQ} J^{-1} = D_{rQ}^{-1}U_{-4tQ}^T\) and hence \(\Lambda _t '\) is the lattice dual to \(\Lambda _t\) in the sense of Lemma 4.3. We claim that they have identical successive minima. To this end, note that for any \(N = (m, {\bar{m}})^T \in {\mathbb {Z}}^{2d}\)
where we use that J is an orthogonal matrix. Since \(J {\mathbb {Z}}^{2d} = {\mathbb {Z}}^{2d}\), Eq. (4.15) implies that the successive minima of \(\Lambda _t\) and \(\Lambda '_t\) are identical and by Lemma 4.3 we conclude \( M_j(\Lambda _t) M_{2d+1-j}(\Lambda _t) \asymp _d 1\) for \(j=1,\ldots ,d\).
To prove (4.10) we note that \(M_d \le M_{d+1}\) and \(1 \le M_d(\Lambda _t)\,M_{d+1}(\Lambda _t) \ll _d 1\) implies
for all \(j=1,\ldots ,d\). Thus, it remains to show the lower bound (4.11) for \(M_1(\Lambda _t)\): Take \(m,\bar{m} \in {\mathbb {Z}}^d\) with \(M_1(\Lambda _t) = \Vert D_{rQ} U_{4tQ} (m, \bar{m})\Vert = H_t(m,\bar{m})^{1/2}\), where \(H_t\) denotes the special norm (3.42) in the theta series (3.41). If \(\bar{m} \ne 0\), then we have \(M_1(\Lambda _t) \ge r^{-1} \Vert Q_+^{1/2}\,\bar{m}\Vert \ge q_0^{1/2}r^{-1}\), but otherwise \(M_1(\Lambda _t) = r \Vert Q_+^{-1/2}m\Vert \ge r q^{-1/2}\). \(\square \)
Proof of Corollary 4.5
We begin with proving (4.12) as follows. Recall that \(\mu \ge 1\) and let \(2d \ge j\ge 1\) denote the maximal integer with \(M_j(\Lambda _t) \le \mu \). Then Lemma 4.2 implies
since we have \(M_j(\Lambda _t) \ge \ldots \ge M_{d+1}(\Lambda _t) \gg 1\) if \(j>d\) and \(\mu < M_{j+1}(\Lambda _t)\le \ldots \le M_d(\Lambda _t) \ll _d 1\) if \(j < d\). In the case \(\mu < M_1(\Lambda _t)\) the inequality in (4.12) holds trivially. Moreover, this argument also proves (4.13). Finally, the estimate (4.14) follows from the relation (3.49) combined with (4.12) for \(\mu =d^{1/2}\). \(\square \)
For arbitrary \(t \in {\mathbb {R}}\) the following bounds hold independently of the Diophantine properties of Q.
Lemma 4.6
Denote by \(\Delta \) the lattice \(Q_+ ^{1/2} {\mathbb {Z}}^d\), then
where \(D_{sQ}\) and \(U_{4tQ}\) are defined as in (3.48) and
In particular, it follows that
and for small t we get
We emphasize that these estimates will be used for a wide range of \(s>0\) (depending on the blow-up parameter \(r \ge q^{1/2}\)), see e.g. the proof of Lemma 6.2, and for small t as well (by which we mean \(r^{-1} q_0^{-1/2}< t < T_{-}\) as stated in Theorem 2.2).
Proof
In this proof we replace the definition of \(\Lambda _t\), see (3.47), by \(\Lambda _t = D_{sQ} \,U_{4tQ} {\mathbb {Z}}^{2d}\), i.e. r has to be replaced by s. If \(1/8 < M_1(\Lambda _t)\), then we have
Otherwise, there exists an integer \(j=1,\ldots ,d\) with \(M_j(\Lambda _t) \le 1/8 < M_{j+1}(\Lambda _t)\), since \(1 \le M_{d+1}(\Lambda _t)\) holds by (4.10). Now, taking \(\mu =1/8\) in (4.5) of Lemma 4.2 shows that
i.e. (4.21) holds also in the second case. Recalling again (3.42), we see that the right-hand side of (4.21) is the same as the number of all lattice points \(m,\bar{m} \in {\mathbb {Z}}^d\) satisfying
where the positive form \(H_t[\cdot ,\cdot ]\) is defined as in (3.42), but here again r has to be replaced by s. \(\square \)
Proof of (4.16)
If (4.22) holds, then \(\Vert Q_+^{1/2} \bar{m}\Vert \le s/2\), which has again by Lemma 4.2 at most \(\ll _d \prod _{j\,:\, M_{j}(\Delta )\le s} (s \,M_{j}(\Delta )^{-1})\) integral solutions. Similarly, for fixed \(\bar{m}\) the triangle inequality combined with (4.22) implies
Thus, for fixed \(\bar{m}\), the number of pairs \((m, \bar{m})\) for which (4.22) holds is bounded by the number of elements v in the dual lattice \(\Delta '=Q_{+}^{-1/2}{\mathbb {Z}}^d\) to \(\Delta \) such that \(\Vert v\Vert \le s^{-1}\). Since the successive minima for this dual lattice are determined by Lemma 4.3, we may use Lemma 4.2, inequality (4.5), again to determine the upper bound
for this number as well. The product of both numbers yields the bound
Finally, using Lemma 4.2 in form of \((\prod _{j=1}^d M_{j}(\Delta ))^{-1} \asymp _d \alpha _d(\Delta ) = |\det \,Q|^{1/2}\) shows the claimed bound in (4.16). Also the inequality (4.18) follows immediately from (4.17). \(\square \)
Proof of (4.19)
Assume \(q_0^{1/2}|t \,s| \ge 1\) and \(q_0\ge 1\). If \(m=0\) we conclude that \(\Vert \bar{m}\Vert \le |4t\,s| \Vert Q_+^{1/2} \bar{m}\Vert \le 1/8\). Hence \(\bar{m}=0\). For any fixed \(m \ne 0\) the triangle inequality implies that there is at most one element \(\bar{m}\in {\mathbb {Z}}^d\) with (4.22). Furthermore, we get \((\Vert Q_+^{-1/2} m\Vert - 1/(8 \,s)) \le \Vert 4t\,Q_+^{1/2}\, \bar{m}\Vert \) for that pair \((m,\bar{m})\). This implies
and hence \(\Vert Q_+^{-1/2} m\Vert \le (s^{-1}+|4t\,s|)/8\). Thus
Proof of (4.20)
As in the previous case, (4.22) implies by the triangle inequality that
and together with \(q^{1/2}\,|t \,s| \le 1\) also \(|4t\,s|\,s^{-1} \Vert Q_+^{1/2}\bar{m}\Vert \le |4t\,s|/8 \le (2 q)^{-1/2}\). Moreover one of these inequalities is strict and therefore we have
If \(s \ge q^{1/2}\), this leads to a contradiction unless \(m=0\). Hence, the possible solutions for \(\bar{m}\) in (4.23) satisfy \(\Vert Q_+^{1/2}\bar{m}\Vert \le |32 t \,s|^{-1}\) which, as in the proof of (4.16), has at most \(\ll _d |\det \,Q|^{-1/2} |t\,s|^{-d}\) solutions. In the second case, i.e. if \(s < q^{1/2}\), the inequality (4.24) has at most \(\ll _d (q^{1/2}/s)^d\) solutions for m. Now any possible \(\bar{m}\) must satisfy
again, which completes the proof of (4.20) in view of (4.21). \(\square \)
4.2 Approximation by compact subgroups
In Sect. 5 we shall develop mean-value estimates for fractional moments of the \(\alpha _d\)-characteristic of the lattice \(\Lambda _t\) introduced in (3.47). In order to apply techniques from harmonic analysis, we will rewrite the family \(\{\Lambda _t\}_{t \in {\mathbb {R}}}\) as an orbit of a single lattice by means of elements of the one-parameter subgroups \({\mathrm {D}} :=\{d_r : r>0\}\) and \({\mathrm {U}} :=\{u_t : t\in {\mathbb {R}}\}\) of \(\mathrm {SL}(2,{\mathbb {R}})\), where
and then approximate the subgroup \({\mathrm {U}}\) locally by the compact subgroup \({\mathrm {K}}= \mathrm {SO}(2)=\{ k_\theta : \theta \in [0,2\pi ]\}\) parameterized, as usual, by elements
Let S be an orthogonal matrix such that \(SQQ_+^{-1}S^T = Q_0\), where \(Q_0\) denotes the signature matrix corresponding to Q, that is \(Q_0 = \text {diag}(1,\ldots , 1,-1, \ldots ,-1)\). A short computation shows that
where we embed \(\mathrm {SL}(2,{\mathbb {R}})\) into \(\mathrm {SL}(2d,{\mathbb {R}})\) according to the following action
Define the 2d-dimensional lattice
then as claimed,
Moreover, since S is orthogonal and \(\alpha _i\) is invariant under left multiplication by orthogonal matrices we observe for any \(i = 1,\dots ,2d\) that
Lemma 4.7
With respect to the embedding of \(\mathrm {SL}(2,{\mathbb {R}})\) defined in (4.27) we have for \(t \in {\mathbb {R}}\), \(s \ge 1\) and any 2d-dimensional lattice \(\Lambda \) in \({\mathbb {R}}^{2d}\)
where \(\theta = \arctan t\).
Proof
Suppose the signature of Q is (p, q) and let \((v,w) \in {\mathbb {R}}^d \times {\mathbb {R}}^d\), thought of as a column vector with coordinates \(v_1,\ldots ,v_d,w_1,\ldots , w_d\), then
Let \(x,y \in {\mathbb {R}}\). Note that \(y+t\,x=(1 +t^2)\,y+t\, (x-t\,y)\), which implies that
and therefore we find
provided that \(s \ge 1\). Taking \(\theta =\arctan t\) and noting that \(\cos (\theta ) = (t^2+1)^{-1/2}\), resp. \(\sin (\theta ) = t (t^2+1)^{-1/2}\), we see that (4.33) can be written as
and it is easy to see, along the same lines as before, that
Hence, we obtain in view of (4.32) that
from which we deduce that \((1+t^2)^{i/2} M_i(d_s u_t \Lambda ) \gg M_i(d_s k_\theta \Lambda )\) for any \(i =1, \dots ,2d\). The claim follows now from (4.4). \(\square \)
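The two elementary facts used in this proof, namely the identity \(y+t\,x=(1 +t^2)\,y+t\, (x-t\,y)\) and the expressions for \(\cos (\theta )\) and \(\sin (\theta )\) when \(\theta =\arctan t\), are easy to confirm numerically. The following minimal Python check uses arbitrary sample values; the function name is hypothetical.

```python
import math


def residuals(t, x, y):
    """Absolute errors of the algebraic identity and of the two trigonometric formulas."""
    theta = math.atan(t)
    identity = abs((y + t * x) - ((1 + t * t) * y + t * (x - t * y)))
    cos_err = abs(math.cos(theta) - (t * t + 1) ** -0.5)
    sin_err = abs(math.sin(theta) - t * (t * t + 1) ** -0.5)
    return identity, cos_err, sin_err


for t, x, y in [(0.0, 1.0, 2.0), (0.5, -1.3, 0.7), (-3.0, 2.0, 5.0), (10.0, 0.1, -0.2)]:
    print(t, residuals(t, x, y))
```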
4.3 Irrational and diophantine lattices
The purpose of this section is to relate the \(\alpha _d\)-characteristic of \(\Lambda _t\) to the Diophantine approximation of tQ by symmetric integral matrices. We begin by motivating Definition 1.6: Recall that Q is said to be Diophantine of type \((\kappa ,A)\), where \(\kappa >0\) and \(A>0\), if
or equivalently if we introduce the truncated rational approximation error
we require Q to satisfy
Remark 4.8
As an aside, we remark that the property of Q being Diophantine in the above sense is equivalent to the requirement that for some \({\tilde{\kappa }}>0\)
which was introduced in [23] in the context of forms that are (EWAS). However, this formulation is not optimal because \({\tilde{\kappa }}\) must be chosen larger than \(\kappa \) depending on A. Moreover, in most applications the constant A cannot be determined explicitly due to non-effective methods in Diophantine approximation.
The following lemma justifies calling such forms Diophantine:
Lemma 4.9
Let k be an integer in the range \(1 \le k \le \frac{d(d+1)}{2}-1\) and let Q be a form such that \(k+1\) non-zero entries \(y, x_1, \dots , x_k\) satisfy the property that
for all k-tuples \((p_1/q,\dots ,p_k/q)\) of rationals. Then Q is Diophantine of type \((\kappa ,A')\), where \(A'\) depends on \(A, \kappa , y, x_1/y, \ldots , x_k/y\) only (see (4.36)).
Proof
Let \(M \in \text {Sym}(d,{\mathbb {Z}})\), \(m \in {\mathbb {Z}}{\setminus } \! \{0\}\) and \(t \in [1,2]\). Denoting the entries in M corresponding to the coordinates of Q in which \(y, x_1, \dots , x_k\) appear by \(q, p_1, \dots , p_k\), we find the inequality
Suppose that the expression on the right-hand side is strictly less than \(A'm^{-\kappa }\), where
Note first that \(|m| \ge |m \,t y|/(2 y) > q/(4 y)\) and hence
for all \(i=1,\ldots ,k\), which yields a contradiction. \(\square \)
Recall that a number \(\theta \in {\mathbb {R}}\) is called Diophantine of type \(\kappa >0\) if there exists \(c_\kappa >0\) such that \(|q \theta -a| \ge c_\kappa |q|^{-\kappa }\) for every rational number a/q. In particular any form Q for which one ratio of two of its entries is a Diophantine number, is Diophantine in the sense of Definition 1.6 and hence almost all forms are Diophantine in this sense. An example of Diophantine forms for which we can control the exponent \(\kappa \) is the following: Suppose Q is a form with \(k+1\) entries \(y, x_1, \dots ,x_k\) such that \(x_1/y, \dots , x_k/ y\) are algebraic and \(1, x_1/y, \dots , x_k/ y\) are linearly independent over \({\mathbb {Q}}\), then Schmidt’s Subspace Theorem together with Lemma 4.9 implies that for any \(\eta >0\) the form Q is Diophantine of type \((1/k +\eta ,A')\), where \(A'\) is a constant depending only on \(\eta ,A,y, x_1/y, \dots , x_k/y\). However, as is usually the case in Diophantine approximation, the constant A and hence \(A'\) is ineffective in the sense that these constants cannot be determined explicitly.
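To make the notion of a Diophantine number concrete, consider \(\theta =\sqrt{2}\), which is Diophantine of type \(\kappa =1\): since \(|2q^2-a^2| \ge 1\) for integers \(a, q\) with \(q \ne 0\), one has \(|q\sqrt{2}-a| \ge (q\sqrt{2}+a)^{-1}\). The Python sketch below estimates the constant \(c_\kappa \) empirically by minimizing \(q^{\kappa }\,|q\theta -a|\) over a finite range of q, with a the nearest integer to \(q\theta \). The function name and the cut-off are hypothetical, and a finite search can of course only suggest, not prove, a lower bound.

```python
import math


def diophantine_constant(theta, kappa, q_max):
    """min over 1 <= q <= q_max of q**kappa * |q*theta - a|, a = nearest integer."""
    return min(
        q ** kappa * abs(q * theta - round(q * theta))
        for q in range(1, q_max + 1)
    )


c = diophantine_constant(math.sqrt(2), 1.0, 10_000)
print(c)  # stays bounded away from zero; the minimum is attained at q = 2
```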
After the previous motivation, we shall state the main result of this section. In particular, we will see that larger values of \(\beta _{t;r}\) (see (4.38)) enforce smaller values of the truncated rational approximation error \(\delta _{4tQ;R}\) as follows
Lemma 4.10
Assume that \(q_0\ge 1\). Then we have for all \(t \in {\mathbb {R}}\) and \(r\ge q^{1/2}\)
where
Note that this bound is non-trivial for \(\beta _{t;r} > q \,r^{-2}\) only, due to the uniform bound \(\beta _{t;r}\ll _d 1\) for \(r \ge q^{1/2}\) established in Lemma 4.6.
Before proving (4.37), we shall state some important consequences.
Corollary 4.11
Consider any interval \([T_-,T_+]\) with \(T_- \in (0,1]\) and \(T_+ \ge 1\).
(i) If Q is irrational, then
$$\begin{aligned} \lim _{r \rightarrow \infty } \big ( \sup _{T_- \le t \le T_+} \alpha _d(\Lambda _t) \,r^{-d} \, \big ) = 0. \end{aligned}$$ (4.39)
(ii) If Q is Diophantine of type \((\kappa ,A)\), then
$$\begin{aligned} \sup _{T_- \le t \le T_+} \alpha _d(\Lambda _t)\,r^{-d} \ll _d |\det {Q}|^{-1/2} (q \,A^{-1} r^{-2} )^{\frac{1}{\kappa +1}} \,\max \big \{ (T_-)^{-\frac{1}{\kappa +1}}, (T_+)^{\frac{\kappa }{\kappa +1}}\big \}\,. \end{aligned}$$ (4.40)
A variant of (i) in terms of the successive minima of \(\Lambda _t\) can also be found in [28], see Lemma 3.11, yielding an alternative proof of (4.39) when combined with (4.4).
Proof
(i) We argue by contraposition: Assume that there exists an \(\varepsilon >0\) and sequences \((r_j)_j\), \((t_j)_j\) such that \(\lim _{j \rightarrow \infty } r_j = \infty \) and \(\beta _{t_j;r_j} > \varepsilon \). Passing to a subsequence we may assume that \(\lim _{j \rightarrow \infty } t_j =t\) for some \(t \in [T_-,T_+]\). Thus (4.37) yields \(\lim _{j \rightarrow \infty } \delta _{4t_jQ; R_j^*}=0\) with \(R^*_j :=\beta _{t_j;r_j}^{-1} < \varepsilon ^{-1}\). By definition, this means that \(\lim _{j \rightarrow \infty } \Vert M_j - 4 t_j m_j Q\Vert =0\) for some \(M_j\in \mathrm {Sym}(d,{\mathbb {Z}})\) and \(m_j \in {\mathbb {Z}}\) with \(|m_j| \le \varepsilon ^{-1}\). Clearly both \(\Vert M_j\Vert \) and \(|m_j|\) are bounded. Hence there exist integral elements M, m and an infinite subsequence \(j'\) of j with \(M_{j'}=M\), \(m_{j'}=m\) and by construction \(\lim _{j'} t_{j'}=t\). These limit values satisfy \(\Vert M - 4 \,m\,t\,Q\Vert =0\), i.e. Q is a multiple of a rational form.
(ii) First we note that for any \(t \in [1,T_{+}]\) we have by (4.35)
and similarly for \(t \in [T_{-},1]\)
Thus, the relation (4.37), established in Lemma 4.10, implies for any \(t \in [T_-,T_+]\) that
where we used (4.37). Therefore we conclude (4.40) as claimed. \(\square \)
Proof of Lemma 4.10
We begin by recalling that \(\Lambda _t = D_{rQ} \,U_{4tQ} \,{\mathbb {Z}}^{2d}\) (see (3.47)), where
As noted in Remark 2.1 the \(\alpha _d\)-characteristic of \(\Lambda _t\) is attained at some sublattice, that is we can write \(\alpha _d(\Lambda _t) = \Vert w_1 \wedge \ldots \wedge w_d\Vert ^{-1}\) by means of vectors \(w_j :=D_{rQ} U_{4t Q} l_j\) with linearly independent points \(l_1,\ldots ,l_d \in {\mathbb {Z}}^{2d}\) depending on t. Here we use the standard Euclidean norm on the exterior product. Moreover, we write \(l_j=(m_j,n_j)\), where \(m_j,n_j \in {\mathbb {Z}}^d\) and the coordinates of \((m_j,n_j)\) are the coordinates of the vectors \(m_j\) and \(n_j\) in the corresponding order. Additionally, we introduce the \(d\times d\) integer matrices N and M with columns \(n_1, \ldots , n_d\) and \(m_1,\ldots ,m_d\) as well. Using this notation, we may write
First, we shall prove that
Note that the left-hand side of (4.42) can be rewritten as \(\beta _{t;r} > q \,r^{-2}\) and we may assume that this inequality holds, since otherwise the bound (4.37) is trivial.
Let us show that \({\text {rank}}(N)=d\). To this end, we write \(k=d-{\text {rank}}(N)\). According to elementary divisor theory (for matrices with entries in a principal ideal domain) there exist \(P, P' \in \mathrm {GL}(d,{\mathbb {Z}})\) such that \(P'N P\) is a diagonal matrix of the form \(\text {diag}(0,\dots ,0,a_{k+1},\dots ,a_d)\) with \(a_i \mid a_{i+1}\), \(a_i \in {\mathbb {N}}\). In particular NP is a matrix whose first k columns are zero. Moreover, since \(\det {P} = \pm 1\), we conclude that
and hence we can assume from now on that \(N = (0,\ldots ,0,n_{k+1},\ldots ,n_{d})\) with linearly independent vectors \(n_{k+1},\ldots ,n_d \in {\mathbb {Z}}^d\). Since \(l_1,\ldots ,l_d\) constitute a basis of a d-dimensional lattice, we note that \(m_1,\ldots ,m_k\) are necessarily linearly independent. Now we shall express \(w_1\wedge \ldots \wedge w_d\) in terms of the standard basis \(e_{I} \wedge e_{J}\) indexed by pairs of subsets \(I \subset \{1,\ldots , d\}\) and \(J \subset \{d+1, \ldots , 2d\}\) with \(|I|+|J|=d\), i.e. we write
Let \(I = \{i_1,\ldots ,i_m\}\) and \(J = \{j_1,\ldots ,j_{d-m}\}\), then the coefficients \(\omega _{I,J}\) are given by
where
Since the matrix in (4.43) is of block-type, we find
Without loss of generality assume that the eigenvalues of Q are indexed such that \(|q_1| \le \dots \le |q_d|\). Since \(q_0 \ge 1\), note that the minimal eigenvalue of the k-th exterior power of \(Q_{+}^{-1/2}\) is given by \(|q_{d-k+1} \ldots q_d|^{-1/2}\) and that of the \((d{-}k)\)-th exterior power of \(Q_+^{1/2}\) is precisely \(|q_1 \ldots q_{d-k}|^{1/2}\). Hence, since \(m_1,\ldots , m_k\) and \(n_{k+1}, \ldots , n_d\) are linearly independent and integral, we obtain the following lower bound
where we used that \(r \ge q^{1/2}\). In view of (4.42), this strict inequality yields a contradiction unless \(k=0\). Thus, we proved that \(k=0\), i.e. \(|\det N| > 0\). Now (4.44) also implies \(\beta _{t;r}^{-1} \ge |\det N|\). Hence, the upper bound for \(|\det N|\) in (4.42) holds as well.
Finally, we shall prove (4.37). Since N is invertible, we can rewrite \(w_1 \wedge \ldots \wedge w_d\) by
i.e. we parametrized the subspace spanned by \(l_1,\ldots ,l_d\). Introduce also the \(2d {\times } d\) matrix
and note that \(W^T W\) is a positive definite symmetric \(d \times d\) matrix. Thus, there exists an orthogonal matrix \(V \in O(d)\) such that \(D :=V^T W^T W V\) is diagonal with positive entries. Since \((\det {V}) (e_1 \wedge \ldots \wedge e_d) = V (e_1 \wedge \ldots \wedge e_d)\) it follows that
where \(v_1,\ldots ,v_d\) denote the columns of V. Next observe that
Now let \(i_0\) be a subscript for which \(\Vert Wv_i\Vert \) is maximal. Similar to the proof of (4.44) we may write \( W ( \wedge _{i \ne i_0} v_i) = \sum \omega _{I,J} e_{I} \wedge e_J\), where the sum is taken over subsets \(I \subset \{1,\ldots ,d\}\) and \(J \subset \{d+1,\ldots ,2d\}\) with \(|I|+|J| = d-1\), and find that
Combining (4.45) together with (4.46)–(4.48) yields
Since \((\det N)\,N^{-1}\) is an integral matrix, the last line together with (4.42) implies
and, since Q is symmetric, we may take \(\bar{M}\) symmetric as well, which proves (4.37). \(\square \)
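The step above uses that \((\det N)\,N^{-1}\) is an integral matrix whenever N is an invertible integer matrix: it coincides with the adjugate of N, whose entries are signed minors. The following self-contained Python sketch illustrates this on a hypothetical \(3\times 3\) example; the helper names are chosen only for this illustration.

```python
def minor(mat, i, j):
    """Matrix obtained from mat by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(mat) if k != i]


def det(mat):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not mat:
        return 1
    return sum((-1) ** j * mat[0][j] * det(minor(mat, 0, j)) for j in range(len(mat)))


def adjugate(mat):
    """Transpose of the cofactor matrix; equals det(mat) * mat^{-1} if det != 0."""
    n = len(mat)
    return [[(-1) ** (i + j) * det(minor(mat, j, i)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Hypothetical integer matrix with non-zero determinant.
N = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
adj_N = adjugate(N)
print(det(N), adj_N, matmul(adj_N, N))
```

All entries of `adj_N` are integers, and the product `adj_N * N` equals \(\det (N)\) times the identity matrix.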
5 Averages along translates of orbits of \(\mathrm {SO}(2)\)
5.1 Application of geometry of numbers
In view of the bound (3.39) we need to estimate the error term \(I_\theta \), that is (3.22). Proceeding as in (3.23) combined with the estimates \(|\theta _v(t)| \ll _d | \det Q|^{-1/4}\,r^{d/2}\,\psi (r,t)^{1/2}\) and \(\psi (r,t) \ll _d \alpha _d(\Lambda _t)\), obtained in Lemma 3.3 respectively (4.14) of Corollary 4.5, leads to
where \(\Lambda _t\) denotes the lattice defined in (3.47) and \(g_w\) the smoothed indicator function of [a, b] with \(0< w < (b-a)/4\), see Corollary 3.2. Since Lemma 7.2 provides estimates for \(\Vert \widehat{\zeta }\Vert _1\) in the case of both admissible and non-admissible regions \(\Omega \), it remains to estimate the integral in (5.1). We shall start with bounding this integral over an interval I of length at most 1/q. For this, we introduce the maximum value over I of the \(\alpha _d\)-characteristic for the lattice \(\Lambda _t\) via
and the following family of lattices
where \(\Lambda _Q\) is as defined in (4.29). Here \(\gamma _{I, \beta }(r)\) depends on the Diophantine properties of Q and, for irrational Q, tends to zero as \(r \rightarrow \infty \) by Corollary 4.11.
Lemma 5.1
Let \(r \ge q^{1/2}\), \(0<\beta \le 1/2\) and fix an interval \(I=[\tau _1,\tau _2]\) of length at most 1/q. Then we have
where \(r_* :=r \,q^{-1/2}\) and \(\widehat{g}_I :=\max \{ |\widehat{g}_w(t)| : t \in I\}\).
Proof
Using the trivial bound \(\alpha _d(\Lambda _{t}) \le r^{d- 2 \,\beta \,d} \gamma _{I, \beta }(r)^2 \,\alpha _d(\Lambda _{t})^{2\beta }\) and estimating \(|\widehat{g}_w|\) by its maximum \(\widehat{g}_I\) on I yields
Since the group \(\mathrm {D}\) normalizes \(\mathrm {U}\), a computation shows that \(d_r\,u_{4t} = d_r \,u_{4(t-\tau _1)} \,u_{4 \tau _1}=d_{r_*}\,u_{\tau } \,d_{q^{1/2}} \,u_{4\tau _1}\), where \(\tau :=4 \,(t-\tau _1)\,q\). Changing variables from t to \(\tau \) we obtain in terms of the lattices \(\Lambda _{Q,s}\), defined in (5.3),
Finally, we estimate the last average with the help of Lemma 4.7 by the average over the group \(\text {K} = \mathrm {SO}(2)\). Changing variables \(\theta (s) = \arctan (\tau )\), \(\tau \in [0,4]\), and noting that \(|\theta | < \pi \) and \(\mathrm {d}\tau = (1+\tau ^2) \, \mathrm {d}\theta \), we get by (4.31) of Lemma 4.7 that
Now note that \(\alpha _d(\Lambda ) \le \alpha (\Lambda )\) holds for any lattice \(\Lambda \) in \({\mathbb {R}}^{2d}\). Thus, the last inequality together with (5.5) and (5.6) completes the proof. \(\square \)
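The factorization \(d_r\,u_{4t} = d_{r_*}\,u_{\tau } \,d_{q^{1/2}} \,u_{4\tau _1}\) with \(\tau = 4\,(t-\tau _1)\,q\) used in this proof can be verified on \(2\times 2\) matrices. The sketch below assumes the conventions \(d_s = \mathrm {diag}(s, s^{-1})\) and \(u_t\) upper unipotent with off-diagonal entry t; the defining display (4.25) is not reproduced here, so these conventions are an assumption, although the identity is insensitive to the sign of the off-diagonal entry. The numerical values are arbitrary.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def d(s):
    """Assumed convention: d_s = diag(s, 1/s)."""
    return [[s, 0.0], [0.0, 1.0 / s]]


def u(t):
    """Assumed convention: u_t = [[1, t], [0, 1]]."""
    return [[1.0, t], [0.0, 1.0]]


def left_side(r, t):
    return matmul(d(r), u(4 * t))


def right_side(r, t, tau1, q):
    r_star = r / q ** 0.5
    tau = 4 * (t - tau1) * q
    product = matmul(d(r_star), u(tau))
    product = matmul(product, d(q ** 0.5))
    return matmul(product, u(4 * tau1))


# Arbitrary sample parameters with t in an interval starting at tau1.
r, t, tau1, q = 5.0, 0.37, 0.3, 2.5
print(left_side(r, t))
print(right_side(r, t, tau1, q))
```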
In the following paragraphs we shall develop explicit bounds for averages over the group \({\mathrm {K}}\) of type \(\int _{{\mathrm {K}}} \alpha _d(d_r\,k \,\Lambda )^{\beta } \, \mathrm {d}k\).
5.2 Operators \(A_g\) and functions \(\tau _\lambda \) on \(\mathrm {SL}(2,{\mathbb {R}})\)
Let \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\). We consider the following two subgroups of \({\mathrm {G}}\):
where \(k_\theta \) is defined in (4.26). According to the Iwasawa decomposition, any \(g\in G\) can be uniquely represented as a product of elements from \(\mathrm {K}\) and \(\mathrm {T}\), that is
Now let
According to the Cartan decomposition, we have
In this decomposition d(g) is determined by g, and if \(g\notin {\mathrm {K}}\) then \(k_1(g)\) and \(k_2(g)\) are also determined by g up to a factor of \(\pm 1\) on \(k_1\) and \(k_2\). It is clear that \(\Vert g\Vert = \Vert d(g)\Vert \), where \(\Vert \,\cdot \,\Vert \) denotes the operator norm induced by the standard Euclidean norm on \({\mathbb {R}}^2\). Note that, in the simple case \(g= d_a\), this norm is given by \(\Vert d_a\Vert =a\). Since \(d_a\) is the conjugate of \(d_{a^{-1}}\) by \(k_{\pi /2}\), we see that \(g^{-1}\in {\mathrm {K}}g {\mathrm {K}}\) or equivalently, \(d(g) = d(g^{-1})\) for any \(g\in {\mathrm {G}}\). Therefore, \(\Vert g\Vert = \Vert g^{-1}\Vert \), \(g\in {\mathrm {G}}\).
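The equality \(\Vert g\Vert = \Vert g^{-1}\Vert \) on \(\mathrm {SL}(2,{\mathbb {R}})\) and the value \(\Vert d_a\Vert =a\) for \(a \ge 1\) can be illustrated with the closed formula for the largest singular value of a \(2\times 2\) matrix. The Python sketch below is a numerical check on sample matrices only; it assumes \(d_a = \mathrm {diag}(a, a^{-1})\), and the function names are hypothetical.

```python
import math


def op_norm(g):
    """Operator norm (largest singular value) of a real 2x2 matrix."""
    (a, b), (c, d) = g
    s = a * a + b * b + c * c + d * d       # trace of g^T g
    det = a * d - b * c                     # det(g^T g) = det^2
    return math.sqrt((s + math.sqrt(max(s * s - 4 * det * det, 0.0))) / 2)


def inverse(g):
    """Inverse of an invertible real 2x2 matrix."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


g = ((2.0, 3.0), (1.0, 2.0))                # determinant 1, so g lies in SL(2, R)
d_3 = ((3.0, 0.0), (0.0, 1.0 / 3.0))        # assumed form of d_a with a = 3

print(op_norm(g), op_norm(inverse(g)), op_norm(d_3))
```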
We say that a function f on \({\mathrm {G}}\) is left \({\mathrm {K}}\)-invariant (resp. right \({\mathrm {K}}\)-invariant, resp. bi-\({\mathrm {K}}\)-invariant) if \(f({\mathrm {K}}g) = f(g)\) (resp. \(f(g{\mathrm {K}}) = f(g)\), resp. \(f({\mathrm {K}}g {\mathrm {K}}) = f(g)\)). Any bi-\({\mathrm {K}}\)-invariant function on \({\mathrm {G}}\) is completely determined by its restriction to \({\mathrm {D}}^+\). Hence for any bi-\({\mathrm {K}}\)-invariant function f on \({\mathrm {G}}\), there is a function \(f^*\) on \([1,\infty )\) such that \(f(g) = f^*(\Vert g\Vert )\), \(g\in {\mathrm {G}}\).
For any \(\lambda \in {\mathbb {R}}\) we define a character \(\chi _\lambda \) of \({\mathrm {T}}\) by
and the function \(\varphi _\lambda :{\mathrm {G}}\rightarrow {\mathbb {R}}^+\) by
The function \(\varphi _\lambda \) has the property
and it is completely determined by this property and the condition \(\varphi _\lambda (1) = 1\).
For \(g\in {\mathrm {G}}\) and a continuous action of \({\mathrm {G}}\) on a topological space X, we define the operator \(A_g\) on the space of continuous functions on X by
where \(\sigma \) is the normalized Haar measure on \({\mathrm {K}}\), or, using the parametrization of \({\mathrm {K}}\), by
The operator \(A_g\) is a linear map into the space of left \(\mathrm {K}\)-invariant functions on X. If \(X=\mathrm {G}\) and \(\mathrm {G}\) acts on itself by left translations, then \(A_g\) commutes with right translations. From these two remarks, or using a direct computation, we get that \(A_g\varphi _\lambda \) has the property (5.7). Hence \(\varphi _\lambda \) is an eigenfunction for \(A_g\) with the eigenvalue
We see from (5.9) that \(\tau _\lambda \) is obtained from \(\varphi _\lambda \) by averaging over right translations by elements of \({\mathrm {K}}\). But \(\varphi _\lambda \) is left \({\mathrm {K}}\)-invariant and \(A_g\) commutes with right translations. Hence the function \(\tau _\lambda \) is bi-\({\mathrm {K}}\)-invariant and it is an eigenfunction for \(A_g\) with the eigenvalue \(\tau _\lambda (g)\), that is
We have that
where \(\Vert \cdot \Vert \) denotes the usual Euclidean norm on \({\mathbb {R}}^2\). Indeed
where \(S^1\) is the unit circle in \({\mathbb {R}}^2\) and \(\ell \) denotes the normalized rotation invariant measure on \(S^1\). One can easily see that \(\Vert gu\Vert ^{-2}\), \(g\in G\), \(u\in S^1\), is equal to the Jacobian at u of the diffeomorphism \(v\mapsto gv/\Vert g v\Vert \) of \(S^1\) onto \(S^1\). On the other hand, it follows from the change of variables formula that
where \(f :M \rightarrow M\) is a diffeomorphism of a compact differentiable manifold M and \(J_f\) (resp. \(J_{f^{-1}}\)) denotes the Jacobian of f (resp. \(f^{-1}\)). Now using (5.12) we get
The second equality in (5.13) is true because \(\tau _\lambda \) is bi-\({\mathrm {K}}\)-invariant and \(g^{-1}\in {\mathrm {K}}g {\mathrm {K}}\). Since, obviously, \(\tau _0 (g) = 1\), it follows that
Since \(t^{-\lambda }\) is a strictly convex function of \(\lambda \) for any \(t > 0, t\ne 1\), it follows from (5.12) that \(\tau _\lambda (g)\) is a strictly convex function of \(\lambda \) for any \(g\in G\). From this, (5.13) and (5.14) we deduce that
Since the function \(\tau _\lambda (g)\) is bi-\({\mathrm {K}}\)-invariant, it depends only on the norm \(\Vert g\Vert \) of g. Thus, we can write
where for \(a\ge 1\)
In view of (5.10) and the definition of \(A_g\), we get
Since \(\Vert g\Vert = \Vert g^{-1}\Vert \) for all \(g\in {\mathrm {G}}\),
for all \(k\in {\mathrm {K}}\) and \(g\in {\mathrm {G}}\). From this, (5.15) and (5.19) we deduce that, for any \(\lambda > 2\), the continuous function \(\tau _\lambda ^* (a), a\ge 1\), does not have a local maximum. Hence \(\tau _\lambda ^*\) is strictly increasing for all \(\lambda > 2\) or, equivalently,
Using (5.13) and (5.18) yields
Since \(a^2\cos ^2\theta \le a^2\cos ^2 \theta + a^{-2} \sin ^2 \theta \le a^2\), we deduce from (5.21) the estimates
where
\(\mathrm {B}\) denotes the beta function and we use the identity \(\mathrm {B}(x,y) = \Gamma (x)\Gamma (y)/\Gamma (x+y)\) as well as \(\Gamma (1/2)=\sqrt{\pi }\). From (5.21) we also conclude that for any \(\lambda > 2\) the ratio \(\frac{\tau _{\lambda }^*(a)}{a^{\lambda -2}}\) is a strictly decreasing function of \(a\ge 1\) and
Remark 5.2
The function \(\tau _\lambda \) can be viewed as a spherical function on the upper-half plane \({\mathbb {H}}\) (see [29] Chapter IV Proposition 2.9) and all spherical functions on \({\mathbb {H}}\) are of this form for some \(\lambda \in {\mathbb {C}}\). In particular, it is not difficult to see that \(\tau _\lambda \) can also be represented as
Moreover, for \(\text {Re}(\lambda )>1\) it is well-known that \(c(\lambda )\), which is usually referred to as Harish-Chandra’s c-function, as defined in (5.24) exists and its value is given by (5.23) (see [29] Introduction Theorem 4.5 or [38] Chapter V §5).
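The properties of \(\tau _\lambda \) collected in this subsection lend themselves to a numerical sanity check. The sketch below assumes that (5.12) reads \(\tau _\lambda (g) = \int _{S^1} \Vert gu\Vert ^{-\lambda }\,\mathrm {d}\ell (u)\), so that \(\tau _\lambda ^*(a) = \frac{1}{2\pi }\int _0^{2\pi } (a^2\cos ^2\theta + a^{-2}\sin ^2\theta )^{-\lambda /2}\,\mathrm {d}\theta \), and that the constant in (5.23) is \(c(\lambda ) = \Gamma ((\lambda -1)/2)/(\sqrt{\pi }\,\Gamma (\lambda /2))\); both are readings of displays not reproduced here and should be treated as assumptions. The integral is approximated by a midpoint rule.

```python
import math


def tau_star(lam, a, n=20000):
    """Midpoint-rule approximation of the assumed integral formula for tau_lambda^*(a)."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        norm_sq = a * a * math.cos(theta) ** 2 + math.sin(theta) ** 2 / (a * a)
        total += norm_sq ** (-lam / 2)
    return total / n


def c_function(lam):
    """Assumed form of the constant c(lambda) for lambda > 1."""
    return math.gamma((lam - 1) / 2) / (math.sqrt(math.pi) * math.gamma(lam / 2))


a = 3.0
print("tau_0, tau_2:", tau_star(0.0, a), tau_star(2.0, a))       # both equal 1
print("symmetry:", tau_star(0.5, a), tau_star(1.5, a))          # tau_lambda = tau_{2-lambda}
print("inside (0,2):", tau_star(1.0, a))                        # below 1
print("lambda = 3:", tau_star(3.0, a))                          # above 1
print("ratio:", c_function(3.0), tau_star(3.0, a) / a, 1.0)     # c <= tau / a^(lambda-2) <= 1
```

For \(\lambda = 4\) the integral can be evaluated in closed form, \(\tau _4^*(a) = (a^2+a^{-2})/2\), which gives an independent check of the quadrature.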
Lemma 5.3
Let \(g\in {\mathrm {G}}, g\notin {\mathrm {K}}\), \(\lambda > 2\), \(0< \eta < \lambda \), \(b\ge 0\), \(B > 1\), and let f be a left \({\mathrm {K}}\)-invariant positive continuous function on \({\mathrm {G}}\). Assume that
and that
Then for all \(h\in G\)
where
Proof
We define
Since \(A_g\) commutes with right translations, and \(\tau _\eta \) is right \({\mathrm {K}}\)-invariant, it follows from (5.25) that \(A_g f_{{\mathrm {K}}}\le \tau _\lambda (g) f_{{\mathrm {K}}} + b\tau _\eta \). If h and y are as in (5.26), then \(f(yhk) \le B f(hk)\) for every \(k\in {\mathrm {K}}\) and therefore \(f_{{\mathrm {K}}}(yh) \le B f_{{\mathrm {K}}}(h)\). On the other hand, it is clear that
Thus we can replace f by \(f_{{\mathrm {K}}}\) and assume that f is bi-\({\mathrm {K}}\)-invariant. Then we have to prove that \(f\le s\tau _\lambda \). Assume the contrary, then \(f(h) > s'\tau _\lambda (h)\) for some \(h\in G\) and \(s' > s\). In view of (5.16) and (5.27), \(s' > s\ge Bf(1)\). From this, (5.20) and (5.26) we get that \(\Vert h\Vert > \Vert g\Vert \) and
Using the Cartan decomposition, we see that any \(x\in {\mathrm {G}}\) with \(\frac{\Vert h\Vert }{\Vert g\Vert } \le \Vert x\Vert \le \Vert h\Vert \) can be written as \(x = k_1 yh k_2\), where \(k_1, k_2\in {\mathrm {K}}\), \(\Vert y\Vert \le \Vert g\Vert \) and \(\Vert yh\Vert \le \Vert h\Vert \). But the functions f and \(\tau _\lambda \) are bi-\({\mathrm {K}}\)-invariant. Therefore it follows from (5.28) that
Let
In view of (5.10) and (5.25), we see that
Since \(\tau _\lambda (1) = \tau _\eta (1) = 1\), we have
It follows from (5.16) that \(a_2\ge 0\). Using additionally (5.27) and (5.29), we get that
Let \(v\in {\mathrm {G}}\), satisfying \(\Vert v\Vert \le \Vert h\Vert \), be a point where the continuous function \(\omega \) attains its minimum on the set \(\{ x\in {\mathrm {G}}: \Vert x\Vert \le \Vert h\Vert \}\). It follows from (5.31) and (5.32) that
Because of \(\tau _\lambda (g) > 1\) and \(\Vert gkv\Vert \le \Vert g\Vert \Vert v\Vert \) for all \(k\in {\mathrm {K}}\) we conclude
Thus, we get a contradiction with (5.30). \(\square \)
As a special case (\(\eta = 2\) and \(b =0\)) of Lemma 5.3, we have the following
Corollary 5.4
Let \(g\in {\mathrm {G}}\), \(g\notin {\mathrm {K}}\), \(\lambda > 2\), \(B > 1\), and let f be a left \({\mathrm {K}}\)-invariant positive continuous function on \({\mathrm {G}}\) satisfying the inequality (5.26). Assume that
Then for all \(h\in G\)
Lemma 5.5
Let \(g\in {\mathrm {G}}\), \(g\notin {\mathrm {K}}\), \(2< \lambda < \mu \), \(B > 1\), \(M > 1\), \(n\in {\mathbb {N}}^{+}\) and let \(f_i\), \(0\le i \le n\), be left \({\mathrm {K}}\)-invariant positive continuous functions on \({\mathrm {G}}\). We denote \(\min \{i, n-i\}\) by \({\bar{i}}\) and \(\sum _{0\le i\le n} f_i\) by f. Assume that
so in particular \(A_g f_0 \le \tau _\lambda (g)f_0\) and \(A_gf_n \le \tau _\lambda (g) f_n\). Then there is a constant \(C = C(g, \lambda ,\mu , B, M, n)\) such that for all \(h\in {\mathrm {G}}\),
Proof
For any \(0 < \varepsilon \le 1\) and \(0\le i \le n\) we define
Using the inequality (5.33) for all i, \(0\le i\le n\), we see that
Direct computation shows that
Hence for all i, \(0\le i\le n\),
Let \(f_\varepsilon :=\sum _{0\le i\le n} f_{i,\varepsilon }\). Summing (5.35) over all i, \(0\le i\le n\), and using the inequalities \(f_\varepsilon > \sqrt{f_{i-j,\varepsilon } \, f_{i+j,\varepsilon }}\), which are satisfied for any \(1\le i\le n-1\), \(0 < j \le {\bar{i}}\), we get
Write
in order to get from (5.36) that
Since \(f_\varepsilon \) also satisfies (5.26), we can apply Corollary 5.4 to \(f_{\varepsilon _{0}}\) and get that
for all \(h\in {\mathrm {G}}\). Hence (5.34) is true with \(C = \varepsilon _0^{-n^{2}} B\). \(\square \)
Proposition 5.6
Let \(g\in {\mathrm {G}}\), \(g\notin {\mathrm {K}}\), \(d\in {\mathbb {N}}^+\), \(B>1\), \(M>1\). For every \(0\le i\le 2d\), let \(\lambda _i\ge 2\) and let \(f_i\) be a left \({\mathrm {K}}\)-invariant positive continuous function on \({\mathrm {G}}\). We denote \(\min \{ i, 2d-i\}\) by \(\bar{i}\) and \(\sum _{0\le i\le 2d} f_i\) by f. Assume that
in particular,
Then, using the notation \(\ll \) (which until the end of the proof of this proposition means that the left-hand side is bounded from above by the right-hand side multiplied by a constant which depends on \(g, \lambda _0, \ldots , \lambda _{2d}, B\) and M, and does not depend on \(f_0,\ldots ,f_{2d}\)), we have that
(a) For all \(h\in {\mathrm {G}}\) and \(0\le i\le 2d,\; i\ne d\),
where
(b) For all \(h\in {\mathrm {G}}\)
(c) For all \(h\in {\mathrm {G}}\)
Proof
(a) Let
$$\begin{aligned} f_{i,{\mathrm {K}}}(h) \, {\mathop {=}\limits ^{\mathrm {def}}} \,\int _{{\mathrm {K}}} f_i(hk) \, \mathrm {d}\sigma (k), \quad h\in {\mathrm {G}}. \end{aligned}$$
The Cauchy-Schwarz inequality implies
$$\begin{aligned} \int _{{\mathrm {K}}} \sqrt{f_{i-j} (hk) f_{i+j} (hk)} \, \mathrm {d}\sigma (k)&\le \sqrt{\int _{{\mathrm {K}}} f_{i-j}(hk) \, \mathrm {d}\sigma (k)} \, \sqrt{\int _{{\mathrm {K}}} f_{i+j} (hk) \, \mathrm {d}\sigma (k)} \\&= \sqrt{f_{i-j,{\mathrm {K}}} (h) f_{i+j,{\mathrm {K}}} (h)}. \end{aligned}$$
Hence
$$\begin{aligned} \int _{{\mathrm {K}}} \max _{0<j\le {\bar{i}}} \sqrt{f_{i-j} (hk) f_{i+j}(hk)} \, \mathrm {d}\sigma (k)&\le \sum _{0<j\le {\bar{i}}} \int _{{\mathrm {K}}} \sqrt{f_{i-j} (hk) f_{i+j}(hk)} \, \mathrm {d}\sigma (k) \\&\le \sum _{0<j\le {\bar{i}}} \sqrt{f_{i-j, {\mathrm {K}}}(h) f_{i+j, {\mathrm {K}}}(h)} \\&\le d \max _{0<j\le {\bar{i}}} \sqrt{f_{i-j, {\mathrm {K}}} (h) f_{i+j,{\mathrm {K}}} (h)}. \end{aligned}$$
On the other hand, we have
$$\begin{aligned} (A_g f_{i,{\mathrm {K}}})(h) = \int _{{\mathrm {K}}} (A_g f_i)(hk) \, \mathrm {d}\sigma (k) \end{aligned}$$
and according to (5.38)
$$\begin{aligned} (A_g f_i)(hk)\le \tau _{\lambda _{i}} (g) f_i(hk) + M \max _{0<j\le \bar{i}} \sqrt{f_{i-j} (hk) f_{i+j}(hk)}. \end{aligned}$$
Therefore
$$\begin{aligned} A_g f_{i,{\mathrm {K}}} \le \tau _{\lambda _{i}} (g) f_{i,{\mathrm {K}}} + dM \max _{0<j\le {\bar{i}}} \sqrt{f_{i-j,{\mathrm {K}}}f_{i+j,{\mathrm {K}}}}. \end{aligned}$$
But \(f_{{\mathrm {K}}}(1) = f(1)\),
$$\begin{aligned} f_{i,{\mathrm {K}}} (h) = (A_hf_{i,{\mathrm {K}}}) (1) = (A_hf_i)(1) \end{aligned}$$
and, as easily follows from (5.37), we have
$$\begin{aligned} f_{i,{\mathrm {K}}} (yh)\le Bf_{i,{\mathrm {K}}} (h) \end{aligned}$$
if \(h,y\in {\mathrm {G}}\), and \(\Vert y\Vert \le \Vert g\Vert \). Thus, replacing \(f_i\) by \(f_{i,{\mathrm {K}}}\) and M by dM, we can assume that the functions \(f_i\) are bi-\({\mathrm {K}}\)-invariant. Then we have to prove that
$$\begin{aligned} f_i \ll f(1) \tau _\eta \quad \text {for all} \,\, 0\le i\le 2d, \; i\ne d. \end{aligned}$$ (5.40)
Let \(\eta ' = \max \{ \lambda _i : 0\le i\le 2d, i\ne d\}\), as in (5.39). We define \(\mu _i\), \(0\le i\le 2d\), by
$$\begin{aligned} \mu _d&= \lambda _d + 3^{-(d+1)} (\lambda _d -\eta ') \quad \text {and} \end{aligned}$$ (5.41)
$$\begin{aligned} \mu _i&= \mu _d -3^{-{\bar{i}}} (\lambda _d - \eta '), \ 0\le i\le 2d, \ i\ne d. \end{aligned}$$ (5.42)
Since (5.16) implies \(\tau _{\lambda _{i}} (g)\le \tau _{\mu _{d}} (g)\), it follows from (5.16) and Lemma 5.5 that
$$\begin{aligned} f_i \ll f(1)\tau _{\mu _{d}}, \quad 0\le i\le 2d. \end{aligned}$$ (5.43)
One can easily check that \(\eta> \mu _i > \lambda _i\ge 2\) and therefore \(\tau _\eta \ge \tau _{\mu _{i}}\) for all \(0\le i\le 2d, i\ne d\). Thus, to prove (5.40), it is enough to show that
$$\begin{aligned} f_i \ll f(1) \tau _{\mu _{i}} \quad \text {for all} \ 0\le i\le 2d, \ i\ne d. \end{aligned}$$ (5.44)
We will prove (5.44) for \(i\le d-1\) by induction on i; the proof in the case \(i\ge d+1\) is similar. For \(i=0\) we have \(\tau _{\mu _{0}}(g) > \tau _{\lambda _{0}}(g)\) because of (5.16) and thus it is enough to use Corollary 5.4. Let \(1\le m\le d-1\) and assume that (5.44) is proved for all \(i < m\). Using the induction hypothesis for \(f_{m-j}\) and (5.43) for \(f_{m+j}\), we find for all \(0 < j\le m\) that
$$\begin{aligned} \sqrt{f_{m-j}f_{m+j}} \ll f(1)\sqrt{\tau _{\mu _{m-j}} \tau _{\mu _{d}}} \le f(1) \sqrt{\tau _{\mu _{m-1}} \tau _{\mu _{d}}} \ll f(1) \tau _{(\mu _{m-1}+\mu _{d})/2}. \end{aligned}$$ (5.45)
Note that the second inequality in (5.45) follows from (5.16) and (5.42), and the third one follows from (5.17) and (5.22). Combining (5.38) and (5.45) we get
$$\begin{aligned} A_gf_m \le \tau _{\lambda _{m}} (g) f_m + Cf(1) \tau _{(\mu _{m-1}+\mu _{d})/2}, \end{aligned}$$
where \(C \ll 1\). On the other hand, we have \(\lambda _m < \mu _m\) and \((\mu _{m-1} + \mu _d)/2 < \mu _m\) by (5.41) and (5.42). Now, to prove that \(f_m \ll f(1)\tau _{\mu _{m}}\), it remains to apply Lemma 5.3 combined with (5.16).
(b) As in the proof of (a), we can assume that the functions \(f_i\) are bi-\({\mathrm {K}}\)-invariant. Then we get from (5.38) and (5.40) that
$$\begin{aligned} A_gf_d\le \tau _{\lambda _{d}} (g) f_d + Df(1) \tau _\eta , \end{aligned}$$
where \(D \ll 1\). Since \(\eta < \lambda _d\), Lemma 5.3 implies that \(f_d \ll f(1)\tau _{\lambda _{d}}\), which proves (b).
(c) This follows from (a), (b), (5.16), (5.17) and (5.22). \(\square \)
5.3 Quasinorms and representations of \(\mathrm {SL}(2,{\mathbb {R}})\)
We say that a continuous function \(v\mapsto |v|\) on a real topological vector space V is a quasinorm if it satisfies the following properties:
(i) \(|v| \ge 0\) and \(|v| = 0\) if and only if \(v=0\),
(ii) \(|\lambda v| = |\lambda | {\cdot } |v|\) for all \(\lambda \in {\mathbb {R}}\) and \(v\in V\).
If V is finite dimensional, then any two quasinorms on V are equivalent in the sense that their ratio lies between two positive constants.
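The equivalence of quasinorms rests on homogeneity and on compactness of the unit sphere: the ratio of two quasinorms is continuous, positive and invariant under scaling, so it attains a positive minimum and a finite maximum on the sphere. The following Python sketch illustrates this on \({\mathbb {R}}^2\) and is not part of the argument. The quasinorm `half_power` is a hypothetical example chosen for illustration; it is homogeneous of degree one but violates the triangle inequality.

```python
import math

def euclid(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

def half_power(v):
    """(|x|^(1/2) + |y|^(1/2))^2: a quasinorm in the sense above, but not a norm."""
    return (math.sqrt(abs(v[0])) + math.sqrt(abs(v[1]))) ** 2

def ratio_bounds(q1, q2, samples=3600):
    """Sample q1/q2 on the unit circle; by homogeneity this covers all v != 0."""
    ratios = []
    for j in range(samples):
        t = 2.0 * math.pi * j / samples
        v = (math.cos(t), math.sin(t))
        ratios.append(q1(v) / q2(v))
    return min(ratios), max(ratios)

lo, hi = ratio_bounds(half_power, euclid)
```

For this pair the ratio equals \(|\cos t| + |\sin t| + 2\sqrt{|\cos t \sin t|}\) on the unit circle, so it stays between 1 and \(2\sqrt{2}\).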
Lemma 5.7
Let \(\rho \) be a (continuous) representation of \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) in a real topological vector space V, let \(|\, \cdot \,|\) be a \(\rho ({\mathrm {K}})\)-invariant quasinorm on V and let \(v\in V, v\ne 0\), be an eigenvector for \(\rho \) corresponding to the character \(\chi _{-r}, r\in {\mathbb {R}}\), that is
Then for any \(g\in {\mathrm {G}}\) and \(\beta \in {\mathbb {R}}\)
and
Proof
Using the \({\mathrm {K}}\)-invariance of \(\vert \cdot \vert \) we get that
The equality (5.47) follows from (5.46) and from the definition of \(\tau _{\beta r}(g)\). \(\square \)
Let \(\Vert z\Vert \) denote the norm of \(z\in {\mathbb {C}}^2\) corresponding to the standard Hermitian inner product on \({\mathbb {C}}^2\), that is, \(\Vert z\Vert ^2 = |z_1|^2 + |z_2|^2\) for \(z = (z_1,z_2)\).
Lemma 5.8
For any \(z\in {\mathbb {C}}^2\), \(z \ne 0\), \(g\in G\) and \(\beta > 0\), we have
Proof
Since the measure \(\sigma \) on \({\mathrm {K}}\) is translation invariant, we have
Also for all \(\lambda \in {\mathbb {C}}\), \(\lambda \ne 0\), and \(z\in {\mathbb {C}}^2, z\ne 0\),
because \(\Vert \lambda v\Vert = |\lambda | {\cdot } \Vert v\Vert \), \(v\in {\mathbb {C}}^2\), and because \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) acts \({\mathbb {C}}\)-linearly on \({\mathbb {C}}^2\). Any non-zero vector \(x\in {\mathbb {R}}^2\) can be represented as \(x=\lambda ke_1\) with \(\lambda \in {\mathbb {R}}\), \(k\in {\mathrm {K}}\), \(e_1 = (1,0)\). Then, using (5.12) from Sect. 5.2, we get from (5.49) and (5.50) that
Let now \(z = x+iy\), \(x,y\in {\mathbb {R}}^2\), \(z\ne 0\). We write \(e^{i\theta } z = x_\theta + iy_\theta \), \(x_\theta , y_\theta \in {\mathbb {R}}^2\). Then \(\frac{\Vert x_\theta \Vert }{\Vert y_\theta \Vert }\) is a continuous function of \(\theta \) with values in \({\mathbb {R}}_{\ge 0}\cup \{\infty \}\). But \(e^{i\pi /2}z = iz = -y+ ix\) and therefore \(\frac{\Vert x_{\pi /2}\Vert }{\Vert y_{\pi /2}\Vert } = \left( \frac{\Vert x_0\Vert }{\Vert y_0\Vert }\right) ^{-1}\). Hence there exists \(\theta \) such that \(\Vert x_\theta \Vert = \Vert y_\theta \Vert \). Replacing then z by \(e^{i\theta }z\) and using (5.50) we can assume that \(\Vert x_\theta \Vert = \Vert y_\theta \Vert \). Now using the convexity of the function \(t\rightarrow t^{-\beta /2}\), \(t > 0\), and the identity (5.51) we get that
Clearly the last inequality (5.52) implies (5.48). \(\square \)
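The rotation step in the preceding proof, namely the existence of \(\theta \) with \(\Vert x_\theta \Vert = \Vert y_\theta \Vert \), can be made concrete. Since \(e^{i\pi /2}z = -y + ix\), the difference \(\Vert x_\theta \Vert ^2 - \Vert y_\theta \Vert ^2\) changes sign on \([0,\pi /2]\), so a bisection finds such an angle. The following Python sketch illustrates this under these assumptions; the function names and the test vector are ours.

```python
import cmath
import math

def split_norm_gap(z, theta):
    """Return ||x_theta||^2 - ||y_theta||^2 for e^{i theta} z = x_theta + i y_theta."""
    w = [cmath.exp(1j * theta) * c for c in z]
    return sum(c.real ** 2 for c in w) - sum(c.imag ** 2 for c in w)

def balancing_angle(z, tol=1e-14):
    """Bisection on [0, pi/2]; the gap at pi/2 is minus the gap at 0."""
    lo, hi = 0.0, math.pi / 2.0
    g_lo = split_norm_gap(z, lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = split_norm_gap(z, mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo = mid   # same sign as at the left end: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

z = (3 + 1j, -2 + 0.5j)   # arbitrary non-zero vector in C^2
theta = balancing_angle(z)
```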
Let us recall some basic facts of the finite-dimensional representation theory of \(\mathrm {G} = \mathrm {SL}(2,{\mathbb {R}})\). Let W be a finite-dimensional complex vector space. There is a correspondence between complex-linear representations of \({\mathfrak {s}}{\mathfrak {l}}(2,{\mathbb {C}})\) on W and representations of \({\mathrm {G}}\) on W, under which invariant subspaces and equivalences are preserved (see [32] Proposition 2.1). It is well-known that any finite-dimensional representation of \({\mathfrak {s}}{\mathfrak {l}}(2,{\mathbb {C}})\) is fully reducible, that is, it can be decomposed into the direct sum of irreducible representations (see [33] Corollary 1.70). Moreover, for each \(m \ge 1\) there exists up to equivalence a unique irreducible complex-linear representation of \({\mathfrak {s}}{\mathfrak {l}}(2,{\mathbb {C}})\) on a complex vector space of dimension m (see [33] Corollary 1.63). Hence, any finite-dimensional representation of \({\mathrm {G}}\) is fully reducible and any two irreducible finite-dimensional representations of the same degree must be isomorphic. Let \({\mathcal {P}}_m\) denote the \((m+1)\)-dimensional complex vector space of complex polynomials in two variables homogeneous of degree m, and let \(\psi _m\) denote the regular representation of \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) on \({\mathcal {P}}_m\) defined by \((\psi _m(g)P)(z) = P(g^{-1}z)\), for \(g\in {\mathrm {G}}, z\in {\mathbb {C}}^2\) and \(P\in {\mathcal {P}}_m\). It is well-known that the representation \(\psi _m\) is irreducible for any m (see [34] Example 2.7.11) and hence it is, up to isomorphism, the unique irreducible finite-dimensional representation of \({\mathrm {G}}\) of degree \(m+1\). We define
Proposition 5.9
Let \(\rho \) be a representation of \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) on a finite-dimensional space W. Then there exists a \(\rho ({\mathrm {K}})\)-invariant quasinorm \(|\;\cdot \;| = |\;\cdot \;|_\rho \) on W such that for any \(w\in W, w\ne 0\), \(g\in {\mathrm {G}}\) and \(\beta > 0\),
Proof
Let \(W =\bigoplus _{i=1}^n W_i\) be the decomposition of W into the direct sum of \(\rho ({\mathrm {G}})\)-irreducible subspaces, and let \(\pi _i :W\rightarrow W_i\) denote the natural projection. Suppose that we constructed for each i a \({\mathrm {K}}\)-invariant quasinorm \(\vert \cdot \vert _i = \vert \cdot \vert _{\rho _{i}}\) on \(W_i\) such that for any \(w\in W_i, w\ne 0, g\in {\mathrm {G}}\), and \(\beta > 0\),
where \(\rho _i\) denotes the restriction of \(\rho \) to \(W_i\) and \(m(i)\in I(\rho )\) is defined by the condition that \(\psi _{m(i)}\) is isomorphic to \(\rho _i\). Then we define \(\vert w\vert = \vert w\vert _\rho \) by
Clearly \(\vert \cdot \vert _\rho \) is a \({\mathrm {K}}\)-invariant quasinorm. Let us fix now \(w\in W, w\ne 0\). Then
Thus, it is enough to prove the proposition for representations \(\psi _m\). For this, let \(P\in {\mathcal {P}}_m, P\ne 0\). We consider P as a polynomial on \({\mathbb {C}}^2\) and decompose P, using the fundamental theorem of algebra, into the product of m linear forms
There is a natural \({\mathrm {K}}\)-invariant norm on the space of linear forms on \({\mathbb {C}}^2\):
Now we define a quasinorm on \({\mathcal {P}}_m\) by the equation
This quasinorm is well defined because the factorization (5.55) is unique up to the order of factors and the multiplication of \(\ell _i\), \(1\le i\le m\), by constants. We denote by \({{\tilde{\psi }}}_1\) the extension of \(\psi _1\) to the space of linear forms on \({\mathbb {C}}^2\). It is isomorphic to the standard representation of \({\mathrm {G}}\) on \({\mathbb {C}}^2\). Then using Lemma 5.8 and the generalized Hölder inequality, we get that
Since \(I(\psi _m) = \{ m\}\), (5.56) implies (5.53) for \(\rho = \psi _m\). \(\square \)
We recall from Sect. 5.2, see (5.15) and (5.16), that \(\tau _\mu (g) < 1\) and \(\tau _\eta (g)< \tau _\lambda (g)\) for any \(g\notin {\mathrm {K}}\), \(0< \mu < 2\), \(\lambda \ge 2\) and \(0< \eta < \lambda \). Using this, we deduce from the previous Proposition 5.9 the following corollary.
Corollary 5.10
Let \(\rho \) be a representation of \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) in a finite dimensional space W, and let m be the largest number in \(I(\rho )\). Then there exists a \(\rho ({\mathrm {K}})\)-invariant quasinorm \(\vert \cdot \vert = \vert \cdot \vert _\rho \) on W such that
(i) if \(\beta > 0\) and \(\beta m\ge 2\) then for any \(w\in W\), \(w\ne 0\), and \(g\in {\mathrm {G}}\)
$$\begin{aligned} \int _{{\mathrm {K}}} \frac{\mathrm {d}\sigma (k)}{\vert \rho (gk)w\vert ^\beta } \le \tau _{\beta m} (g) \frac{1}{\vert w\vert ^\beta }, \end{aligned}$$
(ii) if \(\beta > 0\) and \(\beta m < 2\) then for any \(w\in W\), \(w\ne 0\), and \(g\in {\mathrm {G}}\), \(g\notin {\mathrm {K}}\),
$$\begin{aligned} \int _{{\mathrm {K}}} \frac{\mathrm {d}\sigma (k)}{\vert \rho (gk)w\vert ^\beta } < \frac{1}{\vert w\vert ^\beta }. \end{aligned}$$
5.4 Functions \(\alpha _i\) on the space of lattices and estimates for \(A_h\alpha _i\)
Let \(\rho \) be a representation of \({\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) on \({\mathbb {R}}^n\) and for each \(1 \le i \le n\) let \(| \, \cdot \, |_i\) be a \((\wedge ^i\rho )({\mathrm {K}})\)-invariant quasinorm on the exterior product \(\wedge ^i{\mathbb {R}}^n\). Throughout this section the underlying quasinorms in the definition of the lattice functions \(\alpha _i\) and \(\alpha \) are taken to be with respect to this particular choice of quasinorms (see (4.2) and (4.3)). For every compact subset \(A \subset {\mathrm {G}}\) note that
is finite for every i, \(1\le i\le n\). Hence, if we fix \(g\in {\mathrm {G}}, g\notin {\mathrm {K}}\), then there exists some \(B > 1\) such that for any i, \(1\le i\le n\), and \(v\in \wedge ^i{\mathbb {R}}^n\), \(v\ne 0\),
where \(\Vert h\Vert = \Vert h^{-1}\Vert \) denotes the norm of \(h \in {\mathrm {G}} = \mathrm {SL}(2,{\mathbb {R}})\) with respect to the standard Euclidean norm on \({\mathbb {R}}^2\). Now, let \(\Delta \) be a lattice in \({\mathbb {R}}^n\) and L a \(\Delta \)-rational subspace. For any \(h \in \mathrm {SL}(2,{\mathbb {R}})\) observe that hL is an \(h\Delta \)-rational subspace and if \(v_1, \dots , v_i\) is a basis of \(\Delta \cap L\) then \(h v_1, \dots , h v_i\) is a basis of \(h\Delta \cap h L\). This observation together with (5.57) implies that
Hence, for any \(i \in \{ 0, \ldots , n\}\) it follows that
For any \(\beta > 0\) and \(1\le i\le n\) we define the functions \(F_{i,\beta }\) on \(\wedge ^i{\mathbb {R}}^n\setminus \{0\}\) by
It is clear that the functions \(F_{i,\beta }\) are continuous and that \(F_{i,\beta }(\lambda w) = F_{i,\beta }(w)\) for any \(\lambda \in {\mathbb {R}}\), \(\lambda \ne 0\). Let \(c_{0,\beta }:=1\) and for \(1 \le i \le n\)
We note that \(c_{n,\beta } = 1\), since the image of any continuous homomorphism \(\mathrm {SL}(2,{\mathbb {R}}) \rightarrow \mathrm {GL}(n,{\mathbb {R}})\) is contained in \(\mathrm {SL}(n,{\mathbb {R}})\) and thus \(| (\wedge ^n\rho )(gk)w|_n = |\det (\wedge ^n\rho (gk))| |w|_n = |w|_n\).
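The determinant statement can be checked numerically in a small case. The following Python sketch takes as \(\rho \) the symmetric square of the standard representation, so \(n = 3\); this choice is ours and serves only as an illustration. It evaluates \(\det \rho (g)\) for one element \(g\in \mathrm {SL}(2,{\mathbb {R}})\). The determinant equals \((\det g)^3 = 1\), which is the scalar by which \(\wedge ^3\rho (g)\) acts.

```python
def sym2(a, b, c, d):
    """Matrix of the symmetric square of [[a, b], [c, d]] in the coordinates (v1^2, v1*v2, v2^2)."""
    return [
        [a * a, 2.0 * a * b, b * b],
        [a * c, a * d + b * c, b * d],
        [c * c, 2.0 * c * d, d * d],
    ]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical element of SL(2, R): choose a, b, c freely and solve a*d - b*c = 1 for d.
a, b, c = 1.5, -0.7, 2.0
d = (1.0 + b * c) / a
top_power_scalar = det3(sym2(a, b, c, d))
```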
Lemma 5.11
For any i, \(0\le i\le n\),
where \(\bar{i} = \min \{ i,n-i\}\), the constant \(C\ge 1\) is from Lemma 4.1 and the operator \(A_g\) is defined by (5.8) from Sect. 5.2.
Proof
Let \(\Delta \) be a lattice in \({\mathbb {R}}^n\). We have to prove that
According to Remark 2.1 there exists a \(\Delta \)-rational subspace L of dimension i such that
Let us denote the set of \(\Delta \)-rational subspaces M of dimension i with \(d_\Delta (M) < B^2 d_\Delta (L)\) by \(\Psi _i\). For a \(\Delta \)-rational i-dimensional subspace \(M\notin \Psi _i\) we get from (5.58) that
If \(\Psi _i = \{ L\}\), then it follows from this and the definitions of \(\alpha _i\) and \(c_{i,\beta }\) that
Assume now that \(\Psi _i\ne \{ L\}\). Let \(M\in \Psi _i\), \(M\ne L\). Then \(\dim (M+L) = i+j\), \(0 < j\le {\bar{i}}\). Now we obtain by (5.58), (5.63) and Lemma 4.1 for any \(k\in K\) that
Hence, if \(\Psi _i\ne \{ L\}\),
Combining (5.64) and (5.65), we get (5.62). \(\square \)
Theorem 5.12
Let \(d\in {\mathbb {N}}^+\) and let \(\rho _d\) be a representation of \({\mathrm {G}} =\mathrm {SL}(2,{\mathbb {R}})\) isomorphic to the direct sum of d copies of the standard 2-dimensional representation. Let \(\beta \) be a positive number such that \(\beta d > 2\). Then there is a constant R, depending only on \(\beta \) and the choice of the \({\mathrm {K}}\)-invariant quasinorms \(| \, \cdot \, |_i\) involved in the definition of \(\alpha _i\), such that for any \(h\in {\mathrm {G}}\) and any lattice \(\Delta \) in \({\mathbb {R}}^{2d}\)
Proof
As in Sect. 5.3, we define for a finite dimensional representation \(\rho \) of \({\mathrm {G}}\)
where \(\psi _m\) denotes the regular representation of \({\mathrm {G}}\) in the space of complex polynomials in two variables homogeneous of degree m. Let \(m_i\) be the largest number in \(I(\wedge ^i\rho _d)\), \(1\le i\le 2d\). It is well known that
We fix \(g\in {\mathrm {G}}, g\notin {\mathrm {K}}\). It follows from (5.66) and from Corollary 5.10 that we can choose quasinorms \(\vert \cdot \vert _i\) on \(\wedge ^i{\mathbb {R}}^{2d}\) in such a way that for \(w\in \wedge ^i{\mathbb {R}}^{2d}\), \(w\ne 0\),
Hence
where \(c_{i,\beta }\), \(1\le i\le 2d\), is defined by (5.60) and \(c_{0,\beta } = 1\). As a remark, we notice that \(c_{i,\beta } = \tau _{\beta {\bar{i}}} (g)\) if \(\beta \bar{i} \ge 2\).
According to Lemma 5.11, the functions \(\alpha ^\beta _i\), \( 0\le i\le 2d\), satisfy the following system of inequalities
Let
Since \(\tau _2(g) = 1\), see (5.14) in Sect. 5.2, it follows from (5.67)-(5.69) that
Now we fix a lattice \(\Delta \) in \({\mathbb {R}}^{2d}\) and define functions \(f_i\), \( 0\le i\le 2d\), on \({\mathrm {G}}\) by
Then it follows from (5.70) that
On the other hand, in view of (5.59),
Since \(\beta d > 2\), we have that \(\beta d = \lambda _d > \lambda _i\) for any \(i\ne d\). Now we can apply Proposition 5.6 (c) in order to get that
The inequality (5.71) proves the theorem for our specific choice of the quasinorms \(\vert \cdot \vert _i\). Now it remains to notice that any two quasinorms on \(\wedge ^i{\mathbb {R}}^{2d}\) are equivalent. \(\square \)
6 Proofs of Theorems 2.2 and 1.9
In this section we shall prove our main theorem, giving effective estimates on the lattice remainder. But, before doing this, we have to establish mean-value estimates for the \(\alpha _d\)-characteristics of \(\Lambda _t\) by applying Theorem 5.12 combined with Lemma 5.1.
Corollary 6.1
Let \(r \ge q^{1/2}\), \(I=[t_0,t_0+1]\) with \(t_0 \in {\mathbb {R}}\), \(0< \beta \le 1/2\) with \(\beta d >2\) and \(\widehat{g}_I :=\max \{|\widehat{g}_w(t)|\, :\, t \in I\}\). Using the notation (5.2), we have
where \(\gamma _{I, \beta }(r) =1\) if \(\beta =1/2\). Note that we need at least \(d \ge 5\).
Based on our variant of Weyl’s inequality (see Lemma 3.3 and Corollary 4.5), the \(\alpha \)-characteristic enters (6.1) with the power 1/2. Even after splitting off a maximum of the \(\alpha \)-characteristic, its average still enters with an exponent \(0 < \beta \le 1/2\) (compare Lemma 5.1). Since the crucial averaging recursion (Theorem 5.12) fails unless \(\beta d >2\), the proof requires \(d >4\), that is, \(d \ge 5\).
Proof
In order to apply Lemma 5.1, we cover I by intervals \(I_j = [s_j,s_{j+1}]\) of length at most 1/q, where \(s_j = t_0 + j/q\) with \(j \in J :=\{0, \ldots , \lceil q \rceil \}\). This implies
Now, we shall apply Theorem 5.12 with \(h=d_{r_*}\), \(r_*=r/q^{1/2}\) and the lattices \(\Lambda _{Q,s_j} = d_{q^{1/2}} \,u_{s_j} \Lambda _Q\), as defined in (5.3), and obtain
where we have used \(\Vert d_{r_*}\Vert = r_* = r/q^{1/2}\) and (4.18) in the form
Note that we have applied Corollary 4.5 with \(r = q^{1/2}\) and \(t= s_j\) in order to get \(\alpha (\Lambda _{Q,s_j} ) \asymp _d\alpha _d(\Lambda _{Q,s_j})\). Finally, in view of (6.2), this concludes the proof of (6.1). \(\square \)
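The covering used at the start of the preceding proof can be written out explicitly. The following Python sketch treats q as a positive real number and uses arbitrary sample values for \(t_0\) and q; both are assumptions made for illustration. It builds the grid points \(s_j = t_0 + j/q\) and the intervals \(I_j = [s_j, s_{j+1}]\), \(j \in J\). There are \(\lceil q \rceil + 1\) such intervals, each of length 1/q, and their union contains \(I = [t_0, t_0+1]\).

```python
import math

def grid(t0, q):
    """Grid points s_j = t0 + j/q for j = 0, ..., ceil(q) + 1."""
    n = math.ceil(q)
    return [t0 + j / q for j in range(n + 2)]

t0, q = 0.3, 7.5                        # sample values, not taken from the paper
s = grid(t0, q)
intervals = list(zip(s[:-1], s[1:]))    # I_j = [s_j, s_{j+1}] for j in J
covers = s[0] <= t0 and s[-1] >= t0 + 1.0
```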
In order to bound the lattice point remainder for ‘wide shells’, that is \(b-a> q^{1/2}\), we need to extend the averaging result, established in Corollary 6.1, to small values of t. To do this, we recall the bound
for the integrand \(\widehat{g}_w(t)\) in (5.4), provided that \(0< w < (b-a)/4\). Note that it is of size \(b-a\) for \(|t| \le 1/(b-a)\) and changes rapidly if \(|b-a|>1 \) grows with r.
Lemma 6.2
If \(r \ge q^{1/2}\), \(\beta d >2\) and \(0< w < |b-a|/4\), then
where \(I = [q_0^{-1/2}r^{-1}, q^{-1/2}]\).
Proof
Proceeding first as in the proof of Lemma 5.1 and changing variables to \(s= t^{-1}\) it is plain to see that
Let \(N = \lceil r(q_0/q)^{1/2} \rceil \). Then the integral on the right-hand side is bounded by \(\sum _{j=2} ^N I_j\), where
For \(2 \le j \le N\) write \(t_j = q^{-1/2}j^{-1}\), then using that
together with the change of variables \(v = 4^{-1} j^2 (s^{-1}- t_j)\) yields
where the last inequality is a consequence of \(|\widehat{g}_w(t)| \ll |t|^{-1}\). Hence, since \( 4 rj^{-1} \ge 1\) and \(q^{1/2}j t_j =1\), we deduce from Lemma 4.7, Theorem 5.12 and (4.20) of Lemma 4.6 that
Summing the last inequality over \(2 \le j \le N\), we observe that it suffices to show that the following estimate holds
Indeed, split the previous sum according to whether \(j \le 4q^{1/2}\) or \(j > 4q^{1/2}\). The sum over \(j > 4 q^{1/2}\) can be bounded by
and the sum over \(2 \le j \le 4 q^{1/2} \) by
\(\square \)
Proof of Theorem 2.2
In view of (3.39), it remains to estimate \(I_\theta \). By (5.1), with \(K_0 :=[q_0^{-1/2}r^{-1},1]\) and \(K_j :=(j,j+1]\), \(j \ge 1\), we have
For fixed \(r \ge q^{1/2}\) we may choose
For notational simplicity, we write \(C_Q :=q \,|\det {Q}|^{-1/4-\beta /2}\).
Step 1: Estimate of \(I_{\theta ,0}\). We consider the case \(b-a \le q\) first. Here we apply Corollary 6.1 to bound the integral over \(K_0\) combined with \(\widehat{g}_{K_0} \ll s_{[a,b]_{\pm w}}(t) \ll b-a\), compare (3.8) and (3.9). Note that we did not use the restriction \(b-a \le q\) at all. For wide shells, i.e. in the case \(b-a > q\), we use Lemma 6.2 for \(t\in K_0\), \(q_0^{-1/2}r^{-1}\le |t| \le q^{-1/2}\) and Corollary 6.1 for the other t in \(K_0\) together with \(\widehat{g}_{[q^{-1/2},1]} \ll q^{1/2}\). Furthermore, for both cases of \(b-a\), split \(K_0=K_{00} \cup K_{01}\), where \(K_{00} :=[q_0^{-1/2}r^{-1}, T_{-}]\) and \(K_{01}:=(T_{-},1]\). Then (4.19) of Lemma 4.6 yields
with the notation (5.2). Using \(C_Q q^{(2\beta d -1)/2} =\bar{C}_Q\), we may bound \(I_{\theta ,0}\) as
As a side remark, we note that the above splitting of the interval \(K_0 = [q_0^{-1/2}r^{-1},1]\) is required for our later applications - especially, Corollary 4.11 is only valid for fixed intervals \([T_{-},T_{+}]\).
Step 2: Estimate of \(I_{\theta ,j}\) for \(j \ge 1\). As before, applying Corollary 6.1 (with \(\beta =1/2\)), while noting that \(\gamma _{I, \beta }(r) =1\) if \(\beta =1/2\), yields
We recall the bound (6.3) for \(\widehat{g}_w\) and the choices of \(T_+\) and w in (6.6) in order to get
Thus, we obtain
Furthermore, for \(b-a > 1\) we can use \(|\widehat{g}_{K_j}|\ll j^{-1}\) to bound the remaining sum, whereas for \(b-a \le 1\) we use \(|\widehat{g}_{K_j}| \ll b-a\) for \(1 \le j \le S-1\) and \(|\widehat{g}_{K_j}|\ll j^{-1}\) for \(S \le j \le T_+-1\) and minimize the resulting expression in S. In both cases this leads to
where
Hence, using (6.5) combined with (6.8), (6.11) and (6.12) with (6.10), we get
where \(c_Q = |\det {Q}|^{\frac{1}{4}-\frac{\beta }{2}}\). Together with the inequality (3.39) we obtain
where
under the condition \(0< w < (b-a)/4\). This completes the proof of Theorem 2.2. \(\square \)
Proof of Theorem 1.9
We only have to apply Theorem 2.2 to the Gaussian weights \(\text {v}(x) = \exp \{-2 \,Q_{+}[x]\}\), noting that \(\zeta (x) = \exp \{ - Q_{+}[x] \}\) satisfies the integrability condition (2.4). This yields
In view of (7.9) and (7.8), we see that \(\Vert \text {v}\Vert _Q \ll d_Q\). Here we used that \(\varphi _{\text {v}}(v,\sqrt{u^2-v}) = \exp \{-2u^2\}\), if Q is indefinite; and \(\varphi _{\text {v}}(v) = \exp \{-2v^2\}\) if Q is positive definite. Moreover, a simple calculation shows that \(\Vert \widehat{\zeta }\Vert _1 \ll _d 1\) and by following the arguments in the proof of (7.31) we get \(\Vert {\hat{\zeta }}\Vert _{*,r} \ll _{d} q^{d/4} ( (q/q_0)^{d/2} + d_Q q^{d/2})\) as well.
\(\square \)
7 Lattice point deficiency for admissible regions and applications
Before we can apply Theorem 2.2, we have to construct smooth bump functions approximating the indicator function of special parallelepiped regions, and to control the additional error produced by this smoothing step. In Lemma 7.1 below we bound the volume of \(\varepsilon \)-boundaries of \(r\Omega \cap E_{a,b}\), and in Lemma 7.2 we estimate integrals of the Fourier transform of the region \(\Omega \). For wide shells the lattice point counting remainders will reflect the Diophantine properties of Q more directly when using counting regions \(\Omega \) which are ‘admissible’ convex polyhedra.
7.1 Smoothing of special parallelepiped regions
Here we confine ourselves to study a specially oriented parallelepiped \(\Omega = B^{-1} [-1,1]^d\) with
for a suitable \(B \in \mathrm {GL}(d,{\mathbb {R}})\) and a positive constant \(c_B \ge 1\) depending on B. In this case, the Minkowski functional of \(\Omega \) is given by \(M(x)= \max ( \langle g_{i,\pm } ,x \rangle \,: \,i=1,\ldots ,d \,)\), where \(g_{i,\pm } = \pm B^T e_i\) are 2d outward normal vectors of the faces of \(\Omega \). Note that the inequalities in (7.1) imply the norm equivalence
We now approximate \(I_{\Omega }\) by smooth weight functions. For this, introduce
where \(k_{B,\varepsilon }(A) = k_{\varepsilon }(BA)\) for any \(A\in {\mathcal {B}}^d\) and \(k_{\varepsilon }\) denotes the rescaled measure on \({\mathbb {R}}^d\) introduced in the beginning of Sect. 3.1. Moreover, we need the technical restriction \(0 < \varepsilon \le \varepsilon _0\) with \(\varepsilon _0 :=1/15\). Since Lemma 3.1 can be adapted to this situation, taking \(\text {v}_{\pm \varepsilon ,r}(x) :=\text {v}_{\pm \varepsilon }(x/r)\), we get for the lattice point remainder (3.5)
where, in view of (3.2), the remainder term is given by
For hyperbolic shells the latter term (7.5) will be absent, but for elliptic shells we shall find that
This estimate will be proven in the following Lemma 7.1, but first we need to introduce some notations: For a measurable, non-negative, bounded weight function \(\text {v}\) on \({\mathbb {R}}^d\) we shall define the spherical mean by
where \(r_1,r_2 \ge 0\), \(\sigma \) denotes the unique normalized Haar measure on the sphere \(S^{p-1}\) resp. \(S^{q-1}\), (p, q) denotes the signature of Q (with \(p+ q =d\)) and U a rotation in \({\mathbb {R}}^d\) such that \(U Q U^{-1}\) is a diagonal matrix whose first p entries are positive and the latter q are negative. Note that in the case of positive definite forms Q (i.e. \(q=0\)), the double integral must be replaced by a single one.
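For the parallelepiped \(\Omega = B^{-1}[-1,1]^d\) of this subsection, the Minkowski functional can be evaluated directly. Since \(\langle \pm B^T e_i, x\rangle = \pm (Bx)_i\), we have \(M(x) = \max _i |(Bx)_i|\), and \(x\in \Omega \) if and only if \(M(x)\le 1\). The following Python sketch checks this for \(d = 2\); the matrix `B` and the test points are hypothetical.

```python
def mat_vec(B, x):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) for i in range(len(B))]

def minkowski(B, x):
    """M(x) = max over i and signs of <+-B^T e_i, x>, which equals max_i |(Bx)_i|."""
    return max(abs(c) for c in mat_vec(B, x))

B = [[2.0, 1.0], [0.5, 3.0]]            # hypothetical invertible matrix
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
B_inv = [[B[1][1] / det, -B[0][1] / det],
         [-B[1][0] / det, B[0][0] / det]]

corner = mat_vec(B_inv, [1.0, -1.0])    # a vertex of Omega
inside = mat_vec(B_inv, [0.25, -0.5])   # a point of Omega
outside = mat_vec(B_inv, [1.5, 0.0])    # a point outside Omega
```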
Lemma 7.1
Let \(\varphi _{\text {v}}\) be defined as in (7.7). If Q is indefinite, define also
and suppose that the latter integral exists. Otherwise, if Q is positive definite, define
and assume that the latter supremum is bounded. Under these conditions, writing \(\partial _{w}[a,b] :=[a-2w,a+2w] \cup [b-2w,b+2w]\), we have for \(0< w < (b-a)/4\)
Assuming additionally \(\max \{|a|,|b|\} \le c_0 r^2\) with \(c_0 = (c_B)^{-1} /5\), the estimates
hold for indefinite forms Q, provided that \(\varepsilon \in (0,\varepsilon _0]\). Moreover, for the special choice \(\text {v}= \text {v}_{\pm \varepsilon }\), as defined in (7.3), we have
whereby the condition \(\max \{|a|,|b|\} \le c_0 r^2\) can be dropped if Q is positive definite.
The lower bound (7.12) can also be found in [6], see Lemma 8.2. Moreover, Lemma 3.8 in [23] provides an asymptotic formula for the volume of \(H_r\).
Proof
For a bounded measurable function g on \({\mathbb {R}}\) with compact support we introduce
Let \(S_Q=Q\,Q_{+}^{-1}, L_Q=Q_{+}^{1/2}\) and let U denote the rotation stated in the lemma. In particular, \(UQU^{-1}\) and \(UL_QU^{-1}\) are diagonal. Changing variables via \(x= r L_Q^{-1} U^{-1} \,y \) in \({\mathbb {R}}^d\) with \(y\in {\mathbb {R}}^p\times {\mathbb {R}}^q\), \(d=p+q\) and using polar coordinates, \(y=\,(r_1\,\eta _1,r_2\eta _2)\), where \(r_1,r_2 >0\) and \(\eta _1\in S^{p-1}\), \(\eta _2\in S^{q-1}\), that is \(\Vert \eta _1\Vert =\Vert \eta _2\Vert =1\), we may write \(Q[x]= r^2(r_1^2-r_2^2)\) and obtain by Fubini’s theorem
where \(\varphi _{\text {v}}(r_1,r_2)\) is defined as in (7.7) for suitable weight functions \(\text {v}\). (As already noted, in the case of positive definite forms Q, the double integral in (7.14) must be replaced by a single one.) Next, we change variables via \(v:=r_1^2-r_2^2\) and \(u:=r_1\), so that \(r_1^2+r_2^2=2u^2-v\) and \(r_2=\sqrt{u^2-v}\). Thus, we get
In order to prove (7.10), we choose \(g=I_{\partial _{w}[a,b]}\) in (7.15). Since the length of \(r^{-2} \, \mathrm {supp} \, g\) is \(\ll |w| r^{-2}\), we get \(R_g \ll _d |w| r^{d-2} \Vert \text {v}\Vert _Q\), where \(\Vert \text {v}\Vert _Q\) is defined as in (7.8) if Q is indefinite, resp. as in (7.9) if Q is positive definite.
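The substitution \(v = r_1^2 - r_2^2\), \(u = r_1\) used in this proof has the inverse \((r_1, r_2) = (u, \sqrt{u^2-v})\) on the region \(u^2 > v\). Its Jacobian determinant has absolute value \(1/(2\sqrt{u^2-v})\), so \(\mathrm {d}r_1\,\mathrm {d}r_2 = \mathrm {d}u\,\mathrm {d}v/(2\sqrt{u^2-v})\). The following Python sketch compares this closed form with a finite-difference approximation at one sample point; the sample values are arbitrary.

```python
import math

def radii(u, v):
    """Invert (r1, r2) -> (u, v) = (r1, r1^2 - r2^2) on the region u^2 > v."""
    return u, math.sqrt(u * u - v)

def jacobian_fd(u, v, h=1e-6):
    """Central-difference approximation of |det d(r1, r2) / d(u, v)|."""
    r1_up, r2_up = radii(u + h, v)
    r1_um, r2_um = radii(u - h, v)
    r1_vp, r2_vp = radii(u, v + h)
    r1_vm, r2_vm = radii(u, v - h)
    d_r1_du = (r1_up - r1_um) / (2.0 * h)
    d_r2_du = (r2_up - r2_um) / (2.0 * h)
    d_r1_dv = (r1_vp - r1_vm) / (2.0 * h)
    d_r2_dv = (r2_vp - r2_vm) / (2.0 * h)
    return abs(d_r1_du * d_r2_dv - d_r1_dv * d_r2_du)

u, v = 1.3, 0.4                          # sample point with u^2 > v
r1, r2 = radii(u, v)
approx = jacobian_fd(u, v)
exact = 1.0 / (2.0 * math.sqrt(u * u - v))
```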
Next we prove (7.12): Taking \(g= I_{[a,b]}\), \(\text {v}(x)= I_{\Omega }(x) = I(M(x) \le 1)\) and using
gives the lower bound
Thus, we find
Proof of (7.11)
In (7.15) we choose \(g=I_{[a,b]}\) and \(\text {v}=I_{(\partial \Omega )_{2\varepsilon }}\) with \(0<\varepsilon \le \varepsilon _0\). By the properties of the polyhedron \(\Omega \), see (7.2), we have \(I_{ (\partial \Omega )_{2\varepsilon }}(x) \le I(M(x)\in J_{1,2\varepsilon })\), where \(J_{1,2\varepsilon }:=[1-2\varepsilon , 1+ 2 \varepsilon ]\). Let \(g_1,\ldots ,g_{2d}\) denote the 2d-tuple of normal vectors defining \(\Omega \) and let \(f_m = U L_Q^{-1} g_m\), \(m=1,\ldots ,2d\), be the transformed vectors. Since
we may bound \(\varphi _{\text {v}}(r_1,r_2)\) in (7.15) as follows
where
Recall \(|v| \le c_0\), \(v= r_1^2-r_2^2\), \(u=r_1\) and \(r_2=\sqrt{u^2-v}\). The inequality (7.16) implies
Therefore \(\varphi _{\text {v}}(u,\sqrt{u^2-v})=0\) if
Because of
and \(u^2-v \ge 17c_0/45>0\), we get
By interchanging the variables \(r_1\) and \(r_2\) we can suppose that \(q \ge 2\). Thus, since \(u \ll _d 1\) and \(\sqrt{u^2-v} \ll _d 1\), we see that
We claim that
holds. In view of (7.17) and (7.18), the estimates
for all \(m=1,\dots ,2d\) will prove the bound (7.19).
Thus let \(F_m(u):=\langle (u\,\eta _1,(u^2-v)^{1/2} \,\eta _2),f_m \rangle \) for fixed \(|v| \le c_0\) and \((\eta _1,\eta _2)\). If
for all \(c_\Omega \le u\le C_\Omega \) with \(F_m(u)\in [1-2\varepsilon ,1+2\varepsilon ]\) uniformly in \((\eta _1,\eta _2)\) and v, then
and hence \(R_m \ll _d c_1^{-1} \varepsilon \) for all \(m=1,\dots ,2d\). Note that
and because of \(\Vert L_Q^{-1} B^T\Vert = \Vert B \,L_Q^{-1}\Vert \le \sqrt{c_B}\) we see that
Note that here it is important that \(\varepsilon >0\) is not too large, i.e. \(\varepsilon \in (0,\varepsilon _0]\). Thus, (7.20) holds and the assertion (7.19) is proved. This yields the claimed bound for \(R_{\varepsilon ,r}\), compare (7.5). \(\square \)
Finally, we prove (7.13). Here we have \(\text {v}= \text {v}_{\pm \varepsilon }\) and \(\text {v}_{\pm \varepsilon }(x) \le I(M(x) \le 1 + 2 \varepsilon )\). In view of (7.16), we find that the u-integral in (7.8) can be restricted to \(2u^2 \le 2d +v\). Hence
because \(|v| \le r^{-2}(|a|+|b|) \le 2 c_0 \le 1\). Since \(\varphi _\text {v}\) is supported in a \(\Vert \cdot \Vert \)-ball of radius \(2d^{1/2}\), we also get in the case of positive definite forms that (7.9) is \(\ll _d d_Q\). \(\square \)
7.2 Fourier transform of weights for polyhedra
Here we continue to estimate the remainder terms in (7.6). Since the bounds for \(R(g^Q_w \,\text {v}_{-\varepsilon ,r})\) are exactly the same as for \(R(g^Q_w \,\text {v}_{+\varepsilon ,r})\) we shall consider the latter only. We shall now modify the weight \(\text {v}_{\varepsilon }\), defined in (7.3), as follows. Define \(\varphi =I_{[-2,2]} *k\), where k is again the probability measure from Sect. 3.1. Of course, \(\varphi \) is smooth and \(\varphi (u)=1\) if \(|u| \le 1\) and \(\varphi (u)=0\) if \(|u| \ge 3\). Let \(s_d :=d(1+2\varepsilon _0)^2\). Now, by construction \(\varphi (Q_{+}[x] s_d^{-1})\) is identical to 1 on the support of the \(\varepsilon \)-smoothed indicator of \(\Omega _{\varepsilon } = B^{-1}[-(1+\varepsilon ),(1+\varepsilon )]^d\), that is \(\text {v}_{\varepsilon }(x)\). Hence we may rewrite the weights \(\zeta \) of (3.6) via
using the \(C^{\infty }\) function \(\psi (x):=\exp \{ Q_{+}[x] \} \varphi (Q_{+}[x] s_d^{-1})\) of bounded support, whose Fourier transform can easily be estimated, see (7.24). In particular, the weights \(\zeta _{\varepsilon }\) satisfy the integrability condition (2.4), i.e. \(\sup _{x\in {\mathbb {R}}^d}\big (|\zeta _{\varepsilon }(x)| + |\widehat{\zeta }_{\varepsilon }(x)|\big ) (1+\Vert x\Vert )^{d+1} < \infty \).
Lemma 7.2
The following estimate holds
Remark 7.3
In the general case, when \(\Omega \) has finite Minkowski surface measure \(c_{\Omega }\) only, defined via \(\mathrm {meas}(\partial _{\varepsilon } \Omega ) \le c_{\Omega } \varepsilon \), we have
as can be deduced from the bound in Theorem 2.9 of [2], that is
This estimate is sharp, as shown by the explicit example of a unit ball, see [2] for more details. That paper also contains bounds on the average \(\eta \mapsto |\widehat{I}_{\Omega }(s \eta )|\) over the unit sphere \(S^{d-1}\) for polyhedra, which are usually of smaller order than pointwise bounds. In fact, the pointwise decay of \(\widehat{I}_\Omega (v)\) may depend crucially on the direction of v. In our setting (finding \(L^1\)-estimates for specially oriented parallelepipeds \(\Omega \)) more elementary arguments can be used.
Proof
Note that by definition
Since
we easily conclude that
Defining \(Z :=(B^{-1})^T\) and changing variables shows also that
and
Thus we get for \(\text {v}_{\varepsilon }=I_{\Omega _{\varepsilon }}*k_{B,\varepsilon }\)
Finally, using \(\widehat{I}_{[-1,1]^d}(v) = \prod _{j=1}^d \sin ( 2 \pi v_j)/( \pi v_j)\) together with (7.27) gives the estimate
We now obtain the estimate (7.22) from (7.23) combined with (7.24) and (7.28). \(\square \)
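The one-dimensional factor \(\sin (2\pi v)/(\pi v)\) in the product formula for \(\widehat{I}_{[-1,1]^d}\) can be checked numerically. The following is a minimal sketch, assuming the Fourier convention \(\widehat{f}(v)=\int f(x) \exp (-2\pi \mathrm {i} x v)\,dx\) (which is the convention consistent with the displayed formula); the function names are hypothetical and chosen for illustration only.

```python
import math

def ft_interval_closed_form(v: float) -> float:
    """Closed form of the Fourier transform of I_[-1,1] at frequency v,
    assuming the convention f^(v) = integral of f(x) exp(-2 pi i x v) dx."""
    if v == 0.0:
        return 2.0  # limiting value: the length of [-1, 1]
    return math.sin(2.0 * math.pi * v) / (math.pi * v)

def ft_interval_numeric(v: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the same integral.
    The imaginary part vanishes by symmetry, so only the cosine part is summed."""
    h = 2.0 / n
    return h * sum(
        math.cos(2.0 * math.pi * v * (-1.0 + (i + 0.5) * h)) for i in range(n)
    )

def ft_cube_closed_form(v) -> float:
    """Product formula for the cube [-1,1]^d, one factor per coordinate of v."""
    return math.prod(ft_interval_closed_form(vj) for vj in v)
```

The product structure is what makes the coordinatewise \(L^1\)-estimates of this section possible: each factor decays only like \(|v_j|^{-1}\), which is the source of the logarithmic terms.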
7.3 Lattice point remainders for admissible parallelepipeds
Now we restrict the parallelepiped \(\Omega = B^{-1} [-1,1]^d\), as defined in (7.1), such that its faces are in a general position relative to the standard lattice \({\mathbb {Z}}^d\). This ensures that the lattice point remainder for \(r \Omega \) is of ‘abnormally’ small error uniformly in r. To construct it, we may alternatively construct lattices \(B \,{\mathbb {Z}}^d\) such that the faces of \([-1, 1]^d\) have this property. Following Skriganov [53], we call a lattice \(\Gamma \subset {\mathbb {R}}^d\) of full rank, and likewise \(\Omega \), ‘admissible’ if
where \({{\,\mathrm{{\text {Nm}}}\,}}\gamma = |\gamma _1 \cdots \gamma _d|\) in standard coordinates \(\gamma =(\gamma _1, \ldots , \gamma _d)\).
Remark 7.4
The set of all admissible lattices is dense in the space of lattices (see [54]). Hence, for any \(\eta >0\), if \(D_\eta \) denotes the set of diagonal matrices with entries in \([1,1+\eta )\), then \(\text {O}(d)D_\eta \text {O}(d) \Gamma \) contains an admissible lattice. In particular, if \(\Gamma = Q_+^{1/2} {\mathbb {Z}}^d\), then there exist orthogonal matrices \(k,l \in \text {O}(d)\) and a diagonal matrix \(d \in D_\eta \) such that \(B {\mathbb {Z}}^{d}\) is admissible, where \(B = k d l \, Q_+^{1/2}\) satisfies property (7.1) with a constant \(c_B\) depending only on \(\eta \).
Remark 7.5
This definition is a special case of ‘admissible lattices’ for star-bodies, see Chapter IV.4 in [14]. Here, the star-body is given by \(\{F <1\}\) with the distance function \(F(x) = |x_1 \cdots x_d|^{1/d}\).
As shown in Lemma 3.1 of [53], the dual lattice \(\Gamma ^*=Z{\mathbb {Z}}^d\) of \(\Gamma \), where \(Z^TB=\mathrm {Id}\), is admissible as well. Another property of admissible lattices is that there exists a cube \([-r_0,r_0]^d\) containing a fundamental domain F of \(\Gamma \) such that \(r_0>0\) depends only on the invariants \(\det {\Gamma }\) and \({{\,\mathrm{{\text {Nm}}}\,}}\Gamma \).
Example 7.6
Well known examples are provided by the Minkowski embedding of a totally real algebraic number field \({\mathbb {F}}\) of degree d into \({\mathbb {R}}^d\). Given all embeddings \(\sigma _1,\ldots ,\sigma _d\) of \({\mathbb {F}}\), the Minkowski embedding \(\sigma :{\mathbb {F}} \rightarrow {\mathbb {R}}^d\) is defined by \(\sigma = (\sigma _1,\ldots ,\sigma _d)\). In this case \({{\,\mathrm{{\text {Nm}}}\,}}\sigma (\alpha ) = |N_{{\mathbb {F}}/{\mathbb {Q}}}(\alpha )|\) is the field norm of any \(\alpha \in {\mathbb {F}}\), where we interpret multiplication by \(\alpha \) as a \({\mathbb {Q}}\)-linear map. Thus, the image of the ring of integers \({\mathcal {O}}_{\mathbb {F}}\) is an admissible lattice \(\Gamma \) with \({{\,\mathrm{{\text {Nm}}}\,}}\Gamma \ge 1\). For more information, see Chapter 2.3 in [11].
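As a concrete instance of Example 7.6, one may take \({\mathbb {F}}={\mathbb {Q}}(\sqrt{2})\) with ring of integers \({\mathbb {Z}}[\sqrt{2}]\). The sketch below (helper names are hypothetical) computes the Minkowski embedding of \(\alpha = a + b\sqrt{2}\) and checks on a finite box that \(|N(\alpha )| = |a^2-2b^2| \ge 1\) for \(\alpha \ne 0\). The finite search is only an illustration; the general statement follows because the norm of a non-zero algebraic integer is a non-zero rational integer.

```python
import math

SQRT2 = math.sqrt(2.0)

def minkowski_embedding(a: int, b: int):
    """Image of alpha = a + b*sqrt(2) under the two real embeddings of Q(sqrt(2))."""
    return (a + b * SQRT2, a - b * SQRT2)

def field_norm(a: int, b: int) -> int:
    """Field norm N(a + b*sqrt(2)) = a^2 - 2*b^2, computed exactly in integers."""
    return a * a - 2 * b * b

def min_norm_in_box(bound: int) -> int:
    """Smallest |N(alpha)| over non-zero alpha = a + b*sqrt(2) with |a|, |b| <= bound."""
    return min(
        abs(field_norm(a, b))
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if (a, b) != (0, 0)
    )
```

The product of the two embedded coordinates reproduces the field norm, so \({{\,\mathrm{{\text {Nm}}}\,}}\sigma (\alpha )\) never drops below 1 on this lattice even though individual coordinates can be arbitrarily small.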
Remark 7.7
We also note that for any natural number \(n \in {\mathbb {N}}\) we may choose a real number field of degree n which is normal over the rational numbers. In fact, let \(m \in {\mathbb {N}}\) be chosen such that \(2n \mid \varphi (m)\) and let \(\xi _m\) be a primitive m-th root of unity. Then \({\mathbb {Q}}(\xi _m + \xi _m^{-1})\) is a real number field of degree \(\varphi (m)/2\), which is also normal and its Galois group G is abelian. Since G contains a subgroup H of order \(\varphi (m)/(2n)\), the fixed field of H is real, normal and of degree n. Thus, there exists an admissible region \(\Omega \) satisfying (7.1) with \(c_B \asymp _d q/q_0\) and \({{\,\mathrm{{\text {Nm}}}\,}}(B) \asymp _d q^{d/2}\).
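The choice of m in Remark 7.7 is easy to make explicit. The following sketch (function names are hypothetical) searches for the smallest \(m \ge 3\) with \(2n \mid \varphi (m)\) and returns the order \(\varphi (m)/(2n)\) of the subgroup H whose fixed field has degree n. It only illustrates the arithmetic condition and does not construct the number field itself.

```python
def euler_phi(m: int) -> int:
    """Euler's totient function, computed by trial division."""
    result, k, p = m, m, 2
    while p * p <= k:
        if k % p == 0:
            while k % p == 0:
                k //= p
            result -= result // p
        p += 1
    if k > 1:
        result -= result // k
    return result

def smallest_conductor(n: int):
    """Smallest m >= 3 with 2n | phi(m), together with the order of H.

    The maximal real subfield of the m-th cyclotomic field then has degree
    phi(m)/2, and a subgroup H of order phi(m)/(2n) has a fixed field of degree n.
    """
    m = 3
    while euler_phi(m) % (2 * n) != 0:
        m += 1
    return m, euler_phi(m) // (2 * n)
```

For example, for n = 3 the search stops at m = 7, where \(\varphi (7)/2 = 3\), so the maximal real subfield of the 7th cyclotomic field already has the required degree.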
Lemma 7.8
Assume that the lattice \(\Gamma =B{\mathbb {Z}}^d\) is admissible and B satisfies (7.1). For \(0< \varepsilon \le \varepsilon _0\) and \(r \ge 1\) we get for the parallelepiped \(\Omega =B^{-1} [-1,1]^d\) and the corresponding weights \(\zeta _{\varepsilon }(x)= \text {v}_{\varepsilon }(x) \psi (x)\) introduced in Sect. 7.2
where \(\lambda _{r,\varepsilon } :=\min \{ \log (r+1), \log (\varepsilon ^{-1}) \}\) and \({\bar{\lambda }}_{r,\varepsilon ,\Gamma } :=\max \{ \lambda _{r,\varepsilon },\log (2+\tfrac{1}{{{\,\mathrm{{\text {Nm}}}\,}}(\Gamma ) \,r\varepsilon })\}\). For any non-admissible parallelepiped \(\Omega \) only the estimate
holds. Additionally, we also have \(d_Q |\det {B}| \le (c_B)^{d/2}\).
Proof
We start by making the change of variables \(w=r^{-1} Z\,v\) in (7.30) and then splitting \(I_\zeta \) into integrals over cells \(C^*:= Z [-\frac{1}{2},\frac{1}{2})^d\), where \(\Gamma ^* :=Z{\mathbb {Z}}^d\) denotes the dual lattice to \(\Gamma \), that is \(Z= (B^T)^{-1}\), in order to get
Note that \(\Gamma ^*\) satisfies \(\Vert Z\Vert \le \Vert Q_{+}^{-1/2}\Vert \le q_0^{-1/2}\), since the first inequality in (7.1) implies
In particular, the fundamental domain \(C^*\) is contained in \(q_0^{-1/2} \sqrt{d} [-\frac{1}{2},\frac{1}{2}]^d\). Next, we shall bound the Fourier transform of \(\zeta _\varepsilon \). Recall that by definition
As verified in (7.25), we have in coordinates \(u=(u_1, \ldots , u_d)\)
Since (7.33) also implies \(\Vert Q_{+}^{-1/2}(Z^{-1}u)\Vert \ge \Vert u\Vert \), we can rewrite (7.24) by
where we applied the AM-GM inequality. In view of (7.26) we have the bound
as well. Combining these estimates yields
Thus, we get for a fixed lattice point \(\gamma ^* = (\gamma ^*_1,\ldots ,\gamma ^*_d) \in \Gamma ^*\)
where \({\bar{\omega }}(x) :=(1 + x^2)^{-k/d}\) and \(\omega (x) :=\exp \{- |x|^{1/2} \}\). We now estimate the last double integral coordinatewise: Note that we have \(|v_i| \le \bar{v} :=\sqrt{d}/2\) and
since \(\Vert Z^{-1}v\Vert _\infty \gg _d \Vert Z\Vert ^{-1} \Vert v\Vert _\infty \ge q_0^{1/2} \Vert v\Vert _\infty \). Hence, we find
where
In order to estimate \(J_{\zeta }(\gamma ^*_j;{\mathbb {R}})\), we decompose the integral into parts corresponding to the extremal points of the integrands. Defining \(D_j :=\{|u| \ge r|\gamma ^*_j +v|/2\}\), we get
In the case \(|\gamma _j^*| \ge \sqrt{d}\), we have \(|\gamma ^*_j+v| \ge |\gamma _j^*|/2\) and hence
if we take \(k = d (d+3)\). In the other case \(|\gamma _j^*| < \sqrt{d}\), we split the v-integral into two parts as follows in order to find the estimate
In the complement \(u \in D_j^c\) we have \(|\gamma _j^* + v-\frac{u}{r}| \ge |\gamma _j^* + v|/2\) and thus
If \(|\gamma _j^*| \ge \sqrt{d}\), then we easily conclude that \(J_{\zeta }(\gamma ^*_j;D_j^c) \ll _d \omega ( \varepsilon r \gamma _j^* /4 ) |\gamma _j^*|^{-1}\). Finally, we consider the case \(|\gamma _j^*| < \sqrt{d}\). The v-integral over the region \(\{\bar{v} \ge |v| \ge |\gamma ^*_j|/2\}\) can be bounded by
and similarly over the complement by
Hence we conclude that
where
In view of the following Lemma 7.9 this concludes the proof of the bound (7.30).
If the region \(\Omega \) is not admissible, then we change variables to \(w = r^{-1}v\) and split the left-hand side of (7.30) into integrals over unit cells \(E:= [-\frac{1}{2},\frac{1}{2})^d\) in order to find
Because of \(\sum _{j=1}^d |u_j|^{1/2} \ge \Vert u\Vert ^{1/2}\) we can further estimate (7.37) by
Recalling the definition (7.34) and the estimates (7.35)–(7.36) for \(u =Z w\) shows that
Thus, taking \(k=d+1\) we find
The last remark easily follows by comparing the volume of the bodies \(\{\Vert Bx\Vert \le 1\}\) and \(\{ \Vert Q_{+}^{1/2}x\Vert \le 1\}\): Using (7.1) leads to \(|\det {Q}|^{1/2} \le |\det {B}| \le (c_B)^{d/2} |\det {Q}|^{1/2}\). \(\square \)
Lemma 7.9
For an admissible lattice \(\Gamma \) we have for any weight function \(\omega (x)>0\) on \({\mathbb {R}}\), such that \(\omega _\infty :=1+ \max _x\omega (x) (1+ |x|)^p < \infty \), where \(p \in {\mathbb {N}}\) and \(\varepsilon >0\), the bound
where \(\omega _{r,\varepsilon }(x) :=\lambda _{r,\varepsilon } |x|^{\frac{1}{2}} I(|x| < \sqrt{d})+ \omega ( \varepsilon r x) I(|x| \ge \sqrt{d})\) and \( \lambda _{r,\varepsilon }\), \({\bar{\lambda }}_{r,\varepsilon ,\Gamma }\) are as introduced in Lemma 7.8.
Proof
First, we make a decomposition of \(\Gamma \) as follows. For any \((x_1,\ldots ,x_d) \in {\mathbb {R}}^d\) with \(|x_1 \cdots x_d|\ge {{\,\mathrm{{\text {Nm}}}\,}}(\Gamma )\) let \(m_j \in {\mathbb {Z}}\) be the unique integers satisfying \(2> | 2^{m_j} x_j| d^{-1/2} \ge 1\) for \(j=2, \ldots ,d\). We have \(|x_1| \ge {{\,\mathrm{{\text {Nm}}}\,}}(x) |x_2\ldots x_d|^{-1} \ge {{\,\mathrm{{\text {Nm}}}\,}}(\Gamma ) d^{(1-d)/2} \prod _{j=2}^d 2^{m_j -1}\) and this implies that \(|2^{m_1}x_1| \in [k\,c_{\Gamma },(k+1)c_{\Gamma })\) for a unique integer \(k \ge 1\), where \(m_1 \in {\mathbb {Z}}\) is determined by \(m_1 + m_2 +\ldots + m_d =0\) and \(c_{\Gamma }= d^{(1-d)/2} 2^{-d+1}{{\,\mathrm{{\text {Nm}}}\,}}(\Gamma )\). Introducing the lattice
and the interval \(B_k:=[k\,c_{\Gamma },(k+1)c_{\Gamma })\), we can write
and hence
We also introduce the obvious notations \({{\,\mathrm{{\text {Nm}}}\,}}(x) :=|x_1 \cdots x_d |\), \(2^m x= ( 2^{m_1}x_1, \ldots 2^{m_d} x_d ), m \in E_d\) and \(2^m \Gamma \) for the rescaled lattice \(\{ 2^m \gamma \,: \,\gamma \in \Gamma \}\). Note that \({{\,\mathrm{{\text {Nm}}}\,}}(2^m\gamma )={{\,\mathrm{{\text {Nm}}}\,}}(\gamma )\) and hence \({{\,\mathrm{{\text {Nm}}}\,}}(\Gamma )={{\,\mathrm{{\text {Nm}}}\,}}(2^m\Gamma )\). Defining \(C_k:=B_k\times [\sqrt{d},2\sqrt{d})^{d-1}\) and \(h(x) :=( 1+ |x|)^{-p}\) (where \(p \in {\mathbb {N}}\) is the same as in the assumptions of the lemma), we may rewrite and bound (7.41) by
where \(h_{r,\varepsilon }(x) :=\lambda _{r,\varepsilon } |x|^{\frac{1}{2}} I(|x| < 1)+ h( \varepsilon r x) I(|x| \ge 1)\). In order to perform the summation in k and \(\eta \) in (7.42) we first observe that
Proof of (7.43): Assume that two different lattice points \(\eta , \eta '\in 2^m\Gamma \) lie in \(C_k\). Then we have \(|\eta _1-\eta '_1| < c_{\Gamma }\) and \(\max _{2\le j\le d}|\eta _j-\eta '_j| < \sqrt{d}\). Since \(\eta -\eta '\in 2^m \Gamma {\setminus } \{0\}\) implies \(|\eta _2- \eta '_2 |\cdots |\eta _d- \eta '_d |\ge ({{\,\mathrm{{\text {Nm}}}\,}}\Gamma )/c_{\Gamma }=d^{(d-1)/2}2^{(d-1)}\) and hence \(|\eta _j- \eta '_j | \ge 2 \sqrt{d}\) for some \(j\ge 2\), we arrive at a contradiction, which proves (7.43).
Estimating the following sum in k by an integral, we obtain
Hence, making use of (7.43) and (7.44) in (7.42), shows that
where \(2^{m} :=(2^{m_1},\ldots ,2^{m_d})\) and \(H(x):=\bar{h}_{r,\varepsilon }(c_\Gamma x_1) h_{r,\varepsilon }(x_2)\cdots h_{r,\varepsilon }(x_d)\).
Let \(E_d'\) denote the subset of \(E_d\) consisting of all lattice points \((m_1,\ldots ,m_d) \in E_d\) with \(m_1 \le 0\). We claim that
Proof of (7.46)
Let \(m \in E_d' {\setminus } \{0\}\). Assume for definiteness that \(m_1, \ldots , m_{l-1} \le 0\) and \(m_l, \ldots , m_d > 0\). By definition of \(E_d\) we get \(2 \sum _{j=l}^d m_j= \sum _{j=1}^d |m_j| \ge \Vert m\Vert _2\). Since \(h_{r,\varepsilon }(2^{-k}) \le 1\) for \(k \le 0\) and otherwise \(h_{r,\varepsilon }(2^{-k}) = \lambda _{r,\varepsilon } 2^{-k/2}\), we obtain
Thus, splitting the sum according to the number of positive coordinates and then summing over the \((d-1)\)-dimensional lattice \(E_d\) yields (7.46). \(\square \)
In order to bound the sum over the complement of \(E_d'\), we again split the sum according to the number of positive coordinates. For simplicity, we may assume that \(m_1,m_2,\ldots ,m_l >0\) and \(m_{l+1},\ldots ,m_d \le 0\). Similar to the previous case, we find that
If we parameterize the \((d-1)\)-dimensional lattice \(E_d\) by \((m_1,\bar{m})\), where \(m_1 = - (m_2+ \ldots + m_d)\) and \(\bar{m}=(m_2,\ldots ,m_d) \in {\mathbb {Z}}^{d-1}\), and split the summation into a ball of radius \(\Vert \bar{m}\Vert _2 \le R_{\varepsilon }:=3d\,\log (2+(r\varepsilon )^{-1})\) and its complement, where \((r\varepsilon )^{-dp}\,2^{-p\Vert m\Vert _2/2} \le (r\varepsilon )^{-dp}\,2^{-p\Vert \bar{m}\Vert _2/2} \le 1\), we can bound the sum corresponding to a fixed l by
where we have estimated the sums by comparison with the corresponding integrals. Using this estimate for each \(l=1,\ldots ,d-1\) together with (7.46) in (7.45) yields the bound (7.40). \(\square \)
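The dyadic rescaling at the start of the proof of Lemma 7.9 can be made concrete. The sketch below (the function name is hypothetical) takes a point x with \(|x_1 \cdots x_d| \ge {{\,\mathrm{{\text {Nm}}}\,}}(\Gamma )\), determines the integers \(m_2,\ldots ,m_d\) with \(\sqrt{d} \le |2^{m_j}x_j| < 2\sqrt{d}\), sets \(m_1 = -(m_2+\ldots +m_d)\) and returns the index \(k \ge 1\) with \(|2^{m_1}x_1| \in [k\,c_\Gamma ,(k+1)c_\Gamma )\). It is meant as a numerical illustration in floating-point arithmetic, not as part of the proof.

```python
import math

def dyadic_rescaling(x, nm_gamma: float):
    """Return (m, k, c_gamma) for a point x with |x_1 ... x_d| >= nm_gamma.

    m   : integer vector with m_1 + ... + m_d = 0,
    k   : index of the interval [k*c_gamma, (k+1)*c_gamma) containing |2^{m_1} x_1|,
    c_gamma = d^{(1-d)/2} * 2^{1-d} * nm_gamma.
    """
    d = len(x)
    sqrt_d = math.sqrt(d)
    m = [0] * d
    for j in range(1, d):
        mj = math.ceil(math.log2(sqrt_d / abs(x[j])))
        # Guard against floating-point rounding at the interval endpoints.
        while abs(x[j]) * 2.0 ** mj < sqrt_d:
            mj += 1
        while abs(x[j]) * 2.0 ** mj >= 2.0 * sqrt_d:
            mj -= 1
        m[j] = mj
    m[0] = -sum(m[1:])  # forces m to lie in the hyperplane lattice E_d
    c_gamma = d ** ((1 - d) / 2) * 2.0 ** (1 - d) * nm_gamma
    k = math.floor(abs(x[0]) * 2.0 ** m[0] / c_gamma)
    return m, k, c_gamma
```

For the point \(x = (1+\sqrt{2}, 1-\sqrt{2})\) of the lattice coming from \({\mathbb {Z}}[\sqrt{2}]\) (with norm 1) this gives \(m = (-2, 2)\) and \(k = 1\): the small second coordinate is scaled up into \([\sqrt{2}, 2\sqrt{2})\), and the first coordinate is scaled down by the same factor.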
7.4 Applications of Theorem 2.2
We start by smoothing the indicator function of the region \(\Omega \). We choose weights \(\text {v}=\text {v}_{\pm \varepsilon }\) as defined in (7.3) with \(\varepsilon \in (0,\varepsilon _0]\) and the related \(\zeta =\zeta _{\varepsilon }\), see Sect. 7.2, corresponding to parallelepipeds \(\Omega =B^{-1}[-1,1]^d\) satisfying \(Q_+ \le B^T B \le c_B Q_+\), compare (7.1). Recalling (7.6), where we have used Lemma 7.1 to estimate the \(\varepsilon \)-smoothing error, yields a total error
Now we can apply Theorem 2.2 in order to bound the latter remainder \(|R( I_{E_{a,b}} \text {v}_{\pm \varepsilon ,r})|\) as follows. In (6.14) we shall estimate \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _{*,r}\) by using \(\Vert \text {v}_{\varepsilon }\Vert _Q \ll _d d_Q\) of Lemma 7.1, \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _1\ll _d (\log {\varepsilon ^{-1}})^d\) of Lemma 7.2 and
of Lemma 7.8 for admissible regions \(\Omega \), i.e. (7.29) holds, to get
where \(a_Q :=q \,c_Q = q |\det {Q}|^{1/4-\beta /2} = C_Q (d_Q)^{-1}\), provided that \(0<w < (b-a)/4\). This bound holds for admissible parallelepipeds \(\Omega \) only. If \(\Omega \) is not admissible, then we have to replace the smoothing error (7.48) by
that is (7.31) of Lemma 7.8. With these bounds we are ready to prove the main statements on the lattice point remainder for hyperbolic shells.
Proof of Corollary 2.5
For wide shells, i.e. \(b-a > q\), we optimize (7.49) in the smoothing parameter w first by choosing \(w = \mathrm {W}(qT_{+}/2)^2/T_{+}\), where \(\mathrm {W}\) denotes the upper branch, defined on the interval \((-\mathrm {e}^{-1},\infty )\), of the inverse function of \(x \mapsto x e^x\). (The function W is also known as the Lambert-W-function, see [15] for more details and some applications.)
Since \(x \mapsto W(x)^2/x\) has a global maximum at \(x= \mathrm {e}\) with value \(\mathrm {e}^{-1}\), we find \(w \le q/(2 \mathrm {e}) < (b-a)/4\) as required in the restrictions (6.6). This leads to the partial bound
where we used that \(W(x) \le \log (x+1)\) and \(W(x)^{-1} \exp (-W(x)) = x^{-1}\). Next, we calibrate the \(\varepsilon \)-dependent terms in (7.49) by choosing \(\varepsilon =T_{-}^{\frac{d}{2}-2-\delta } \,(b-a)^{-1} /15\). Again, this choice satisfies the required restrictions, i.e. \(\varepsilon \le \varepsilon _0 = 1/15\). Because of
compare the definition in Lemma 7.8, we can simplify (7.49) to
where
and the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\), \(T_{+} \ge 1\). This proves the first part of Corollary 2.5. Next, we consider the case of thin shells, i.e. \(b-a \le q\). Here we take \(\varepsilon = T_{-}^{\frac{d}{2}-2-\delta }/15\) and \(w= T_{-}^{\frac{d}{2}-2-\delta } (b-a)/4\) in (7.49), noting that \(d_Q ( w + \varepsilon \,(b-a)) \le a_Q (b-a) c_Q T_{-}^{\frac{d}{2}-2-\delta }\), in order to get the bound (7.51), whereby the factor \(\rho _{Q,b-a}^{\mathrm {hyp}+}(r)\), depending on the Diophantine properties of Q, has to be replaced by
In the last equation the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\) with
where the last condition ensures that
Finally, we note that Corollary 4.11 implies that \(\gamma _{[T_{-},1],\beta }(r)\rightarrow 0\) and also \(\gamma _{[1,T_{+}],\beta }(r)\rightarrow 0\) for \(r\rightarrow \infty \) and any fixed \(T_{-} \in [q_0^{-1/2}r^{-1},1]\), \(T_{+} \ge 1\), when Q is irrational. Thus, we conclude that \(\rho _{Q,b-a}^{\mathrm {hyp}+}(r)\rightarrow 0\), resp. \(\rho _{Q,b-a}^{\mathrm {hyp}-}(r)\rightarrow 0\), for \(r\rightarrow \infty \) and fixed \(b-a\). \(\square \)
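The elementary properties of the Lambert-W-function used in the proof above, namely \(W(x) \le \log (x+1)\), \(W(x)^{-1}\exp (-W(x)) = x^{-1}\) and \(\max _x W(x)^2/x = \mathrm {e}^{-1}\) attained at \(x=\mathrm {e}\), can be verified numerically. The following is a minimal sketch using a plain Newton iteration for the upper branch on \(x \ge 0\); the helper names are hypothetical, and a library routine could be used instead.

```python
import math

def lambert_w(x: float, tol: float = 1e-14) -> float:
    """Upper branch W(x) for x >= 0, via Newton's method on w*exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting guess to the right of the root
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def smoothing_width(q: float, t_plus: float) -> float:
    """The choice w = W(q*T_+/2)^2 / T_+ made for wide shells."""
    return lambert_w(q * t_plus / 2.0) ** 2 / t_plus
```

Writing \(x = qT_{+}/2\), the width equals \((q/2)\,W(x)^2/x\), which makes the bound \(w \le q/(2\mathrm {e})\) a direct consequence of the stated maximum.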
Corollary 7.10
Consider an indefinite quadratic form Q in \(d \ge 5\) variables and a (not necessarily admissible) parallelepiped \(\Omega \) satisfying (7.1) and \(\max \{|a|,|b|\} \le c_0 r^2\), where \(c_0>0\) is chosen as in Lemma 7.1. Then for all \(b-a \le 1\)
where \(\rho _{Q,b-a}^{\mathrm {hyp}*}\) is defined in (7.53). In particular, for irrational Q we have \(\rho _{Q,b-a}^{\mathrm {hyp}*}(r) \rightarrow 0\) for \(r \rightarrow \infty \), provided that \(b-a\) is fixed.
Proof
We argue similarly to the previous proof of Corollary 2.5, but here we can only use (7.50) to bound \(\Vert \widehat{\zeta }_{\varepsilon }\Vert _{*,r}\), since \(\Omega \) is not necessarily admissible. Thus, we have to replace the error bound (7.49) for the lattice remainder by
Now the right-hand side can be optimized by taking
and this leads to the bound
where
and the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and
\(\square \)
The next corollary provides a lower bound for the number of lattice points and is useful for proving quantitative bounds in the Oppenheim conjecture.
Corollary 7.11
For the special choice \(B= Q_{+}^{1/2}\), i.e. \(\Omega = Q_{+}^{-1/2} [-1,1]^d\) and \(c_B=1\), and all \(\max \{|a|,|b|\} \le r^2/5\) and \(b-a \le 1\) there exist constants \(b_{\beta ,d} >0\) and \({\tilde{b}}_{\beta ,d}>0\), depending on \(\beta \) and d only, such that for all \(r \ge {\tilde{b}}_{\beta ,d} \,q^{1/2} (q/q_0)^{(d+1)/(d-2)}\)
where \(c_Q = |\det {Q}|^{1/4-\beta /2}\), \(a_Q = q \,c_Q\) and
and the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\) with
and \(C_{\beta ,d}, c_{\beta ,d} \ge 1\) are constants depending on d and \(\beta \) only.
Proof
Here we only consider the special region \(\Omega = Q_{+}^{-1/2} [-1,1]^d\), i.e. \(B = Q_{+}^{1/2}\) and thus (7.1) is valid with \(c_B =1\). Since \(\Omega \) is not necessarily admissible, we have to argue as in the previous proof (of Corollary 7.10): Starting with the estimate (7.52), we can take \(\varepsilon = (30 \, a_{d} \, b_{\beta ,d})^{-1}\) and \(w= (b-a) \varepsilon \) in the optimization procedure, where \(a_d \ge 1\), resp. \(b_{\beta ,d} \ge 1\), denotes the implicit constant in (7.12) (see Lemma 7.1), resp. (7.52). (Of course, we have \(\varepsilon \in (0,\varepsilon _0]\) and \(0< w < (b-a)/4\) as required.) This yields
where \(\bar{b}_{\beta ,d} :=b_{\beta ,d} ( \varepsilon ^{-d} + \log (\varepsilon ^{-1})^d)\) depends on \(\beta \) and d only. Again referring to Lemma 7.1, we also see that
if we choose \(r \ge {\tilde{b}}_{\beta ,d} \,q^{1/2} (q/q_0)^{(d+1)/(d-2)}\) with \({\tilde{b}}_{\beta ,d} = (15 a_d \bar{b}_{\beta ,d})^{-1}\). Finally, we make the restriction \(T_+ \ge w^{-1} \max \{\log ( (15 a_d \bar{b}_{\beta ,d})^{-1} q^{-1} (b-a) )^2,1\}\) to ensure that
Collecting the remaining terms proves (7.54). \(\square \)
Now we consider elliptic shells as well and optimize the lattice remainder as in the case of ‘wide shells’. In contrast to the previous cases, the error caused by the smoothing of the region \(\Omega \) is not present here.
Proof of Corollary 2.4
In the case of ellipsoids, i.e. Q is a positive definite form, we choose the (not necessarily admissible) parallelepiped \(\Omega :=B^{-1} [-1,1]^d\) with \(B= Q_+^{1/2}\) and \(r=\sqrt{2b} \ge q^{1/2}\), resp. \(2b=r^2\), \(a=0\) and \(\varepsilon = 1/15\). Then (7.1) is satisfied with \(c_B =1\) and \(E_{0,b} \subset r\Omega \), i.e. \(H_r := E_{a,b} \cap r \Omega = E_{a,b}\). Moreover, since \(E_{0,b}\) does not intersect \(r (\partial \Omega )_{2 \varepsilon }\) (the \(2 \varepsilon r\)-boundary of \(r \Omega \) as defined in (7.3)), we get an error \(R_{\varepsilon ,r}=0\) for smoothing the indicator function of \(r \Omega \). Hence, we may remove the term proportional to \((b-a)\varepsilon \) in (7.47). Note that apart from Lemma 7.1 the indefiniteness of Q has not been used in any of the arguments so far. In contrast to the case of hyperbolic shells, we optimize (6.14) in w first. Again including the bound \(\Vert \text {v}_{\varepsilon }\Vert _Q \ll _d d_Q\) of Lemma 7.1 and here taking \(w=\mathrm {W}(q T_+/4)^2/T_+\), where \(\mathrm {W}\) denotes the upper branch of the Lambert-W-function (for more details on the Lambert-W-function see the proof of Corollary 2.5 above), and noting that \(w \le q/(4e) < (b-a)/4\), leads (as in the proof of Corollary 2.5) to the bound
where \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_+ \ge 1\). This can be rewritten as
with
where the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_+ \ge 1\). Note that as in the indefinite case \(\lim _{r\rightarrow \infty } \rho _Q^{\mathrm {ell}}(r)=0\) if Q is irrational by Corollary 4.11. This proves Corollary 2.4. Furthermore, we remark that \({{\,\mathrm{{\text {vol}}}\,}}H_r ={{\,\mathrm{{\text {vol}}}\,}}(r\Omega \cap E_{0,b}) = d_Q\,\omega _d \,r^{d}\), where \(\omega _d\) denotes the volume of the unit d-ball. \(\square \)
Similar arguments can be used in order to obtain related bounds for both wide (\(b-a >r\)) and narrow (\(b-a < r\)) shells in the case of ellipsoidal shells \(E_{a,b}\).
Given a quadratic form Q of Diophantine type \((\kappa ,A)\), i.e. Q satisfies (1.12), we shall apply Corollary 4.11 in order to estimate the Diophantine factors explicitly. Hereby, we prove quantitative bounds in the Oppenheim conjecture (for indefinite quadratic forms Q of Diophantine type \((\kappa ,A)\)) by comparing the volume with the corresponding lattice sum.
Proof of Corollary 1.7
We begin by applying Corollary 7.11 with \(b = - a = \varepsilon \) and \(\beta = 2/d+\delta '/d\) for an appropriate \(\delta '>0\): Taking \(T_{-} \asymp _{\beta ,d} q^{-1/(d(1/2-\beta )) } \,|\det {Q}|^{-1/d}\), so that \(b_{\beta ,d} (b-a) d_Q \,r^{d-2} a_Q \,c_Q \,T_{-}^{d(1/2-\beta )} \le ({{\,\mathrm{{\text {vol}}}\,}}H_r)/5\) holds, yields the lattice remainder bound
This estimate is valid provided that \(r \gg _{\beta ,d} (q/q_0)^{(d+1)/(d-2)} q^{1/2+2/(d-4)+\delta }\). Note that we have \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) as required and that the assumptions of Corollary 7.11 are satisfied as well. Next we calibrate the parameter \(T_{+}\) by taking
Since Q is of Diophantine type \((\kappa ,A)\), we can use Corollary 4.11 in order to find that
and also that
In view of (7.12), we may increase \(r \gg _{Q,\beta ,d} \max \{A^{-1},1\}\) to get
Now, we choose \(r \asymp _{A,Q,\delta ,d} \varepsilon ^{-(2d + 3 \kappa d - 4 \kappa )/(2d-8) - \delta } \) in order to obtain
All in all, we have
Since \((2d + 3 \kappa d - 4 \kappa )/(2d-8) \ge 1/(d-2)\) holds if \(d \ge 5\), we find that \(\mathrm {vol}_{\mathbb {Z}}\,H_r > 1\). This means that there exists at least one non-zero lattice point \(m \in {\mathbb {Z}}^d\) satisfying both \(|Q[m]| < \varepsilon \) and also \(\Vert Q_{+}^{1/2}m\Vert \ll _d r\). \(\square \)
We can argue similarly to investigate the density of values of a quadratic form:
Proof of Corollary 1.8
It is sufficient to prove that \({{\,\mathrm{{\text {vol}}}\,}}_{{\mathbb {Z}}^d}(r\Omega \cap E_{a,b})>0\) for any \(\max \{|a|, |b|\} \le c_0 r^2/2 \), where \(c_0\) is as in Lemma 7.1, with \(r^{-\nu _0+ \delta } = b-a\) for \(r \ge c_{\delta ,d,\Omega ,Q,A,\kappa }\) and a sufficiently large constant \(c_{\delta ,d,\Omega ,Q,A,\kappa } >1\). In particular, we consider small shells, i.e. \(b-a \le 1\). Repeating the proof of Corollary 7.11, we see that Corollary 7.11 is also valid for arbitrary parallelepipeds satisfying (7.1), but then the constants depend additionally on the scaling parameter \(c_B \ge 1\). Also repeating the previous proof (of Corollary 1.7) in this situation shows that we can take \(r = c_{\delta ,d,\Omega ,Q,A,\kappa } (b-a)^{-1/\nu _0}\), where \(\nu _0 :=\frac{2(d-4)}{2d + 3 \kappa d - 4 \kappa }\), to ensure that \({{\,\mathrm{{\text {vol}}}\,}}_{{\mathbb {Z}}^d}(r\Omega \cap E_{a,b})>0\). \(\square \)
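The exponents appearing in the proofs of Corollaries 1.7 and 1.8 are simple rational functions of d and \(\kappa \). The sketch below (function names are hypothetical) evaluates the exponent \((2d + 3 \kappa d - 4 \kappa )/(2d-8)\) governing the size of r in terms of \(\varepsilon ^{-1}\), with the additional \(\delta \) dropped, and its reciprocal \(\nu _0\), using exact rational arithmetic. It also checks the inequality against \(1/(d-2)\) on a range of sample values, which illustrates the claim for \(d \ge 5\) but does not replace the argument.

```python
from fractions import Fraction

def size_exponent(d: int, kappa) -> Fraction:
    """Exponent (2d + 3*kappa*d - 4*kappa)/(2d - 8) from the proof of Corollary 1.7."""
    kappa = Fraction(kappa)
    return (2 * d + 3 * kappa * d - 4 * kappa) / Fraction(2 * d - 8)

def nu_0(d: int, kappa) -> Fraction:
    """Density exponent nu_0 = 2(d - 4)/(2d + 3*kappa*d - 4*kappa) from Corollary 1.8."""
    kappa = Fraction(kappa)
    return Fraction(2 * (d - 4)) / (2 * d + 3 * kappa * d - 4 * kappa)
```

For instance, for \(d=5\) and \(\kappa =1\) the size exponent is 21/2 and \(\nu _0 = 2/21\); the two quantities are always reciprocal to each other.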
Using the Diophantine estimates for quadratic forms Q of Diophantine type \((\kappa ,A)\), we can estimate \(\rho _{Q,b-a}^{\mathrm {hyp}+}(r)\) and \(\rho _{Q,b-a}^{\mathrm {hyp}-}(r)\) in Corollary 2.5 explicitly as follows.
Proof of Corollary 2.6
First, we consider ‘wide shells’, i.e. \(b-a \ge q\). By applying Corollary 4.11, we can bound the Diophantine factor from Corollary 2.5 by
where \(\nu :=(1-2\beta )/(2\kappa +2)\) and the infimum is taken over all \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and \(T_{+} \ge 1\). Next we optimize this expression by taking \(T_{-} = r^{- 2\nu /(\nu +\sigma )}\) and \(T_{+} = r^{(2\nu )/(\kappa \nu +1)}\), where \(\sigma :=d(1/2-\beta )\): This parameter choice is permissible, since \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) holds (because of \(\sigma \ge \nu \)), and thus we obtain
where \(h_Q :=q \,|\det {Q}|^{1/2-\beta }\) (here we refrained from giving an optimal estimate in terms of \(|\det {Q}|\) to reduce the notational burden). In view of the bound from Corollary 2.5 and (7.12) we get the relative lattice error
For ‘thin shells’, i.e. \(b-a \le q\), we have
where the infimum is taken over all \(T_{-} \in [r^{-1},1]\) and \(T_{+} \ge 1\) satisfying
\(\square \)
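The parameter choice for wide shells in the proof above can be examined at the level of exponents. With \(\nu = (1-2\beta )/(2\kappa +2)\) and \(\sigma = d(1/2-\beta )\) one has \(\sigma = d(\kappa +1)\,\nu \), so \(\sigma \ge \nu \) and the exponent \(-2\nu /(\nu +\sigma )\) of \(T_{-}\) lies in \([-1,0]\). The following sketch (the function name is hypothetical) computes these exponents exactly; it assumes \(\beta < 1/2\) and ignores the factor \(q_0^{-1/2}\) in the lower endpoint of the admissible range for \(T_{-}\).

```python
from fractions import Fraction

def shell_exponents(d: int, kappa, beta):
    """Exponents from the wide-shell optimization in the proof of Corollary 2.6.

    Returns (nu, sigma, e_minus, e_plus), where T_- = r**e_minus and T_+ = r**e_plus.
    Assumes 0 < beta < 1/2 and kappa > 0.
    """
    kappa, beta = Fraction(kappa), Fraction(beta)
    nu = (1 - 2 * beta) / (2 * kappa + 2)
    sigma = d * (Fraction(1, 2) - beta)
    e_minus = -2 * nu / (nu + sigma)
    e_plus = 2 * nu / (kappa * nu + 1)
    return nu, sigma, e_minus, e_plus
```

As an example, \(d=5\), \(\kappa =1\) and \(\beta = 9/20\) give \(\nu = 1/40\), \(\sigma = 1/4\), \(T_{-} = r^{-2/11}\) and \(T_{+} = r^{2/41}\).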
8 Small values of quadratic forms at integer points
Finally we shall prove Theorem 1.3 by using our effective equidistribution results (in form of Corollary 7.11) together with bounds on small zeros of indefinite integral quadratic forms. Our proof is based on the following strategy: If Q has ‘good’ Diophantine properties, we can compare the volume with the number of lattice points to establish bounds for non-trivial lattice points \(m \in {\mathbb {Z}}^d {\setminus } \{0\}\) satisfying the Diophantine inequality \(|Q[m]| < \varepsilon \). Otherwise Q is near a rational form and here we shall use Schlickewei’s bound [51] for small zeros of integral quadratic forms.
8.1 Integer-valued quadratic forms
In this section we summarize some essential results on small zeros of integer-valued quadratic forms. Here A[m] denotes an integer-valued indefinite quadratic form on a lattice \(\Lambda \) in \({\mathbb {R}}^d\) of full rank. Meyer [44] proved in 1884 that such a form represents zero non-trivially on \(\Lambda \) if \(d \ge 5\). Nowadays, this result is usually deduced from the Hasse-Minkowski theorem, which is a local-global principle (see [25], Theorem 5.7, Corollary 5.10).
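Meyer's theorem is purely existential, whereas the bounds discussed below control the size of a zero. For a small diagonal example one can simply search. The following sketch (the function name and the example form are chosen here for illustration and do not appear in the text) finds a non-trivial zero of smallest Euclidean norm in a box for a diagonal integral form on \({\mathbb {Z}}^d\) by brute force. This is not the method of any of the cited results.

```python
import itertools

def smallest_isotropic_vector(coeffs, bound: int):
    """Brute-force search for a non-zero m in [-bound, bound]^d with
    sum_j coeffs[j] * m_j^2 == 0 and minimal squared Euclidean norm.

    Returns (squared_norm, m), or None if the box contains no non-trivial zero.
    """
    best = None
    for m in itertools.product(range(-bound, bound + 1), repeat=len(coeffs)):
        if not any(m):
            continue  # skip the trivial zero
        if sum(c * x * x for c, x in zip(coeffs, m)) != 0:
            continue
        norm_sq = sum(x * x for x in m)
        if best is None or norm_sq < best[0]:
            best = (norm_sq, m)
    return best
```

For the indefinite form \(x_1^2+x_2^2+x_3^2+x_4^2-7x_5^2\) the search with bound 2 returns a zero of squared norm 8, corresponding to \(7 = 4+1+1+1\), while a positive definite form yields no zero at all.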
Similarly to the result of Birch and Davenport [3] on diagonal forms in five variables, our quantitative bounds in Theorem 1.3 depend essentially on explicit bounds for small zeros of integral forms (see Corollary 8.4). First bounds of this kind were proved by Cassels [12], based on a geometric argument. Birch and Davenport improved Cassels’ result as follows: If \(d \ge 3\) and A[m] admits a non-trivial zero on the lattice \(\Lambda \), then there exists an isotropic lattice point \(m \in \Lambda {\setminus } \{0\}\) with Euclidean norm
where \(\gamma _d\) denotes the Hermite constant in dimension d (see [4, 16]). This bound is essentially best possible in view of an example by M. Kneser, see [13], if A has signature \((d-1,1)\). In 1985 Schlickewei [51] extended Cassels’ argument non-trivially by showing that the dimension, say \(d_0\), of a maximal rational isotropic subspace has an essential impact on the size of small zeros, rather than mere indefiniteness (i.e. \(d_0 \ge 1\)). He established the following relation between small zeros of integral forms and the dimension \(d_0\).
Theorem 8.1
(Schlickewei [51]) Let \(\Lambda \) be a d-dimensional lattice and A a non-trivial quadratic form in d variables taking integral values on \(\Lambda \). Also let \(d_0 \ge 1\) be maximal such that there exists a \(d_0\)-dimensional sublattice of \(\Lambda \) on which A vanishes. Then there exist linearly independent lattice points \(m_1,\ldots ,m_{d_0} \in \Lambda \), spanning an isotropic subspace, of size
In the same way as Birch and Davenport [4] deduce their Theorem B from their Theorem A, we may conclude
Theorem 8.2
(Schlickewei [51]) Let \(F,G \ne 0\) be quadratic forms in d variables and suppose in addition that G is positive definite. Let \(d_0\) be maximal such that F vanishes on a rational subspace of dimension \(d_0\). Then there exist \(d_0\) linearly independent lattice points \(m_1,\ldots ,m_{d_0} \in {\mathbb {Z}}^d\) such that F vanishes on the corresponding subspace and
where the implicit constant depends on d only.
Using an induction argument combined with Meyer’s theorem, Schlickewei also derived the following lower bound (8.3), which we only state for non-singular forms, for the dimension of a maximal rational isotropic subspace in terms of the signature (r, s). For notational convenience, we may suppose that \(r \ge s\). Then the Hilfssatz of Section 4 in [51] reads
Remark 8.3
One can complement Schlickewei’s lower bound (8.3) with the upper bound \(d_0 \le \min \{r,s\}\), which follows immediately by a dimension argument: If we decompose \({\mathbb {R}}^d = V_{+} \oplus V_{-}\) into subspaces \(V_{+}\), \(V_{-}\), on which Q is positive or negative definite, and if \(V_{\mathrm {iso}}\) denotes an isotropic subspace, then \(V_\mathrm {iso} \cap V_{\pm } = \{0\}\) and thus
In particular, the lower bound (8.3) is essentially optimal.
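The upper bound \(d_0 \le \min \{r,s\}\) of Remark 8.3 is attained for simple diagonal forms. As an illustration (the example form and the names below are chosen here and are not taken from the text), the form \(x_1^2+x_2^2+x_3^2-x_4^2-x_5^2\) of signature (3, 2) vanishes identically on the rational plane spanned by \(e_1+e_4\) and \(e_2+e_5\), so \(d_0 = 2\) in this case. The sketch checks this on a grid of integer combinations.

```python
COEFFS = (1, 1, 1, -1, -1)  # diagonal form of signature (3, 2)
U = (1, 0, 0, 1, 0)         # e_1 + e_4
W = (0, 1, 0, 0, 1)         # e_2 + e_5

def quadratic_form(x) -> int:
    """Value Q[x] of the diagonal form with coefficients COEFFS."""
    return sum(c * t * t for c, t in zip(COEFFS, x))

def bilinear_form(x, y) -> int:
    """Associated symmetric bilinear form, so that Q[x] = bilinear_form(x, x)."""
    return sum(c * s * t for c, s, t in zip(COEFFS, x, y))

def vanishes_on_span(u, w, bound: int = 5) -> bool:
    """Check Q[a*u + b*w] == 0 for all integers |a|, |b| <= bound."""
    return all(
        quadratic_form(tuple(a * s + b * t for s, t in zip(u, w))) == 0
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
    )
```

Vanishing on the whole span is equivalent to \(Q[u] = Q[w] = 0\) together with the vanishing of the bilinear form on (u, w), which is why the finite grid check is already conclusive for this pair.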
Obviously, a straightforward combination of the lower bound (8.3) together with Theorem 8.1 yields explicit bounds on the smallest non-trivial isotropic vector. However, this application can be improved in the cases \(r=s+2\) and \(r=s\) by reducing the problem to dimension \(d-1\) as done by Schlickewei in Folgerung 3 of [51], where he proved that for any integral quadratic form A of signature (r, s) there exists an isotropic lattice point \(m \in {\mathbb {Z}}^d {\setminus } \{0\}\) such that \(\Vert m\Vert ^2 \ll _d ({{\,\mathrm{{\text {Tr}}}\,}}A^2 )^\rho \), where
as defined in (1.10) (see Sect. 1.2). We shall extend this result to general lattices leading to the following strengthening of (8.1).
Corollary 8.4
Suppose that A is a non-singular quadratic form of signature (r, s) in \(r+s=d \ge 5\) variables, which takes integral values on \(\Lambda \). Suppose additionally that \(|\det (\Lambda )| \ge 1\). Then the smallest non-trivial isotropic vector \(m \in \Lambda \) of A satisfies
where \(\rho \) is as defined in (1.10).
Compared to (8.1), the exponent in (8.4) is considerably smaller for a wide range of signatures (r, s). Especially, if \(r \sim s\), then \(\rho \sim 1/2\) and therefore \((2\rho +1)/d \sim 2/d\).
Proof
As can be checked easily, in the cases \(r \ge s+3\) and \(r=s+1\) the bound (8.4) follows immediately from Theorem 8.1 together with (8.3), since \(d/d_0 \le 2 \rho +1\) and \(2 \le d/d_0\) (by Remark 8.3) in both cases. (Here we estimate \(({{\,\mathrm{{\text {Tr}}}\,}}A^2)^{(d-d_0)/2}\) by \(({{\,\mathrm{{\text {Tr}}}\,}}A^2)^{1/2}\) if \({{\,\mathrm{{\text {Tr}}}\,}}A^2 < 1\) and by \(({{\,\mathrm{{\text {Tr}}}\,}}A^2)^{\rho }\) if \({{\,\mathrm{{\text {Tr}}}\,}}A^2 \ge 1\).) If \(r=s\) or \(r=s+2\), then the first relation does not hold. Here we fix a reduced basis \(v_1,\ldots ,v_d\) of \(\Lambda \) with
Let \(\Lambda _0 := {\mathbb {Z}}v_1 + \ldots + {\mathbb {Z}}v_{d-1}\), which is a \(d{-}1\) dimensional sublattice of \(\Lambda \), and note that Hadamard’s inequality shows that \(\det (\Lambda _0) = \Vert v_1 \wedge \ldots \wedge v_{d-1}\Vert \le \Vert v_1\Vert \ldots \Vert v_{d-1}\Vert \). Thus
Now denote by \(A_0\) the restriction of A to the subspace generated by \(v_1,\ldots ,v_{d-1}\). It follows that \(A_0\) has signature either \((r,s-1)\) or \((r-1,s)\) and, since \(({{\,\mathrm{{\text {Tr}}}\,}}A^2)^{1/2} = \Vert A\Vert _{\mathrm {HS}}\), also that \({{\,\mathrm{{\text {Tr}}}\,}}A_0^2 \le {{\,\mathrm{{\text {Tr}}}\,}}A^2\). Applying Theorem 8.1 (resp. Theorem 8.2 after a coordinate change) to \(A_0\) and \(\Lambda _0\) shows that there exists an isotropic lattice point \(m \in \Lambda _0 {\setminus } \{0\}\) such that
where \(d_0\) denotes the dimension of a maximal isotropic subspace of \(A_0\) (instead of A). Completing the proof, we note that in both cases \(r=s+2\) and \(r=s\) one has
as can be readily seen. \(\square \)
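The trace inequality \({\mathrm {Tr}}\,A_0^2 \le {\mathrm {Tr}}\,A^2\) used in the proof can be checked as follows. This is a sketch under the assumption that \(A_0\) is written in an orthonormal basis of the subspace spanned by \(v_1,\ldots ,v_{d-1}\); the \(d \times (d-1)\) matrix U below, whose columns form such a basis, is notation introduced here only for illustration.
\[
A_0 = U^T A U, \qquad U^T U = I_{d-1}, \qquad \Vert U\Vert _{\mathrm {op}} = \Vert U^T\Vert _{\mathrm {op}} = 1 ,
\]
\[
{\mathrm {Tr}}\,A_0^2 = \Vert U^T A U\Vert _{\mathrm {HS}}^2 \le \Vert U^T\Vert _{\mathrm {op}}^2 \, \Vert A\Vert _{\mathrm {HS}}^2 \, \Vert U\Vert _{\mathrm {op}}^2 = {\mathrm {Tr}}\,A^2 .
\]
The middle step uses \(\Vert XY\Vert _{\mathrm {HS}} \le \Vert X\Vert _{\mathrm {op}} \Vert Y\Vert _{\mathrm {HS}}\) together with its transposed version.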
Remark 8.5
In 1988 Schlickewei and Schmidt [56] complemented their work [55] on isotropic subspaces of quadratic forms showing that Schlickewei’s bound in terms of \(d_0\) is best possible. Additionally, one can also ask if Schlickewei’s bound (8.3) in terms of (r, s) is best possible, as was already conjectured by Schlickewei himself in [51]. At least for the cases \(r \ge s + 3\) and (3, 2) this is known and due to Schmidt, see [49].
Remark 8.6
As a final remark we note that in the Geometry of Numbers it is often the case that one can use the existence of a lattice point satisfying some inequality in order to get several independent points satisfying a joint inequality. This argument was used by Schlickewei and Schmidt [55, 57] to prove an extension of Theorem 8.1, in which they considered several isotropic subspaces and their relative position.
8.2 Proof of Theorem 1.3
Now we are in a position to prove the second main theorem of this paper. To simplify the notation we may replace Q by \(Q/\varepsilon \) and consider the solubility of the Diophantine inequality \(|Q[m]| < 1\). Notice that this rescaling does not change the constant \(c_B =1\) occurring in Corollary 7.11.
Proof of Theorem 1.3
Let \(d \ge 5\), \(q_0 \ge 1\) and
as in Corollary 7.11 and \(\beta =2/d +\delta '/d\) with fixed \(\delta ' >0\) depending on \(\delta >0\). Applying Corollary 7.11 with \(b=-a=1/5\) (note that both conditions \(\max \{|a|,|b|\} \le r^2/5\) and \(b-a \le 1\) are satisfied) gives the bound
for any \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) and
Hence, we can take \(T_{+} \asymp _{\beta ,d} \log (q \! + \! 1)^2\). Additionally, by taking
we can also ensure that
compare the lower bound (7.12) of Lemma 7.1. At this step we have to choose
in order to guarantee that \(T_{-} \in [q_0^{-1/2}r^{-1},1]\) is satisfied.
First Case: We consider first classes of quadratic forms Q for which the lattice remainder is ‘small’: Corresponding to Diophantine properties of Q, we assume that
with some constant \(h_{\beta ,d}>0\) depending on d and \(\beta \) only (compare again with (7.12)) such that \(5 \mathrm {vol}_{\mathbb {Z}}\,H_r \ge {{\,\mathrm{{\text {vol}}}\,}}H_r\). Note that \(r \ge q^{1/2}\) is fixed here. According to Corollary 7.11 and (8.6) we shall take a priori
Increasing the implicit constant guarantees that \(\mathrm {vol}_{\mathbb {Z}}\,H_r \ge 2\), i.e. there exists at least one non-zero lattice point \(m \in {\mathbb {Z}}^d {\setminus } \{0\}\) satisfying both \(|Q[m]| \le 1\) and \(\Vert Q_{+}^{1/2}m\Vert \le r\). Because of \(\rho \ge 1/2\), it is easy to see that the right-hand side of (8.8) is bounded, up to absolute constants, by the right-hand side of (1.9).
Second Case: Now we assume that one of the inequalities in (8.7) fails. Then there exists a \(t_0\in [T_{-}, T_{+}] \) such that the reciprocal \(\alpha _d\)-characteristic satisfies at least
Following the proof of Lemma 4.10, we see that there exists a d-dimensional sublattice \(\Lambda ' \subset \Lambda _{t_0}\) with \(\alpha _d(\Lambda _{t_0}) = |\det {\Lambda '}|^{-1} = \Vert w_1 \wedge \ldots \wedge w_d\Vert ^{-1}\), where
is a basis of \(\Lambda '\) determined by integral vectors \(m_j,n_j \in {\mathbb {Z}}^d\), \(j=1,\ldots ,d\). We have also proven, writing \(N = (n_1,\ldots ,n_d), M = (m_1,\ldots ,m_d) \in \mathrm {M}(d,{\mathbb {Z}})\), that N is invertible with \(\beta _{t_0;r}^{-1} > |\det {N}|\) and that the estimate
holds, provided that \(\alpha _d(\Lambda _{t_0}) > q d_Q r^{d-2}\). In view of (8.9) the last condition is satisfied if we take a priori
Now we are in a position to apply Corollary 8.4 with the rescaled lattice \(\Lambda = r \Lambda '\), noting that \(\det (\Lambda ) = r^d \det (\Lambda ') \ge |\det {Q}|^{1/2} |\det {N}| \ge 1\), and the quadratic form \(A[x] = \langle x , A x \rangle \) induced by the symmetric matrix
with \(\langle w_i, A w_j \rangle = \langle m_i, n_j \rangle + \langle m_j,n_i \rangle \). In other words, the quadratic form A is represented by the symmetric matrix \(A_0 :=N^T M + M^T N\) in coordinates \(w_1,\ldots ,w_d\). In particular, A is integer-valued on \(\Lambda \). Since \(A_1[n] := A_0[N^{-1}n]\), i.e. \(A_1 = M N^{-1} + (M N^{-1})^T\), has the same signature as \(A_0\), we need to check that the signature of \(A_1\) is (r, s). Because of
we may choose a priori \(r \gg _{\beta ,d} (q/q_0)^{1/2} \max \{1,t_0^{-1/2}\} q^{d/(d-4) +\delta }\), i.e.
to ensure that \(A_1\) and \(t_0 Q\) have the same number of eigenvalues with the same sign, i.e. the same signature (e.g. apply the Hoffman-Wielandt inequality, see Theorem 6.3.5 in [30]). Thus, there exists a non-trivial lattice point \(w = a_1 rw_1+\ldots + a_d rw_d \in \Lambda \), where \((a_1,\ldots ,a_d) \in {\mathbb {Z}}^d {\setminus } \{0\}\), which satisfies \(A[w]=0\) and, writing \(n_0 = a_1 n_1 + \ldots + a_d n_d \in {\mathbb {Z}}^d {\setminus } \{0\}\), is of size
where we used \({{\,\mathrm{{\text {Tr}}}\,}}A^2 \ll _d (r^{-2} + t_0)^2 \ll t_0^2 \ll _{\beta ,d} \log (q+1)^4\) and (8.9). Writing \(w = (w_1,w_2) \in {\mathbb {R}}^d \times {\mathbb {R}}^d\) we also see that \(0 = A[w] = r^{-2} \langle w_1, w_2 \rangle + 8 t_0 Q[n_0]\) and thus
Hence, requiring in addition
it follows from (8.13) that \(|Q[n_0]| \ll _{\beta ,d} 1\), which in turn guarantees \(|Q[n_0]| < 1\) as long as r is taken large enough in terms of \(\beta \) and d. Combining this choice with the lower bounds on r already required in (8.5), (8.6), (8.10), (8.11) and (8.14), we observe that an appropriate choice for r is given by
where the implicit constant is chosen large enough depending on \(\beta \) and d only. This concludes the proof of Theorem 1.3. \(\square \)
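The formula for \(A_1\) in the second case of the proof above, and the fact that \(A_1\) and \(A_0\) share a signature, can be verified directly. The following short check uses only the definitions \(A_0 = N^T M + M^T N\) and \(A_1[n] = A_0[N^{-1}n]\) and the invertibility of N; the appeal to Sylvester's law of inertia is the standard justification and is supplied here as an assumption about the intended argument.
\[
A_1 = (N^{-1})^T A_0 N^{-1} = (N^{-1})^T \big ( N^T M + M^T N \big ) N^{-1} = M N^{-1} + (N^{-1})^T M^T = M N^{-1} + (M N^{-1})^T .
\]
Thus \(A_1\) is congruent to \(A_0\) via the invertible matrix \(N^{-1}\), and by Sylvester's law of inertia congruent symmetric matrices have the same numbers of positive, negative and zero eigenvalues.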
Notes
The first of these notations will be used throughout this section only and should not be confused with the notation \(r_* :=r q^{-1/2}\) which will be introduced later in Lemma 5.1.
References
Athreya, J.S., Margulis, G.A.: Values of random polynomials at integer points. J. Mod. Dyn. 12, 9–16 (2018)
Brandolini, L., Colzani, L., Travaglini, G.: Average decay of Fourier transforms and integer points in polyhedra. Ark. Mat. 35(2), 253–275 (1997)
Birch, B.J., Davenport, H.: On a theorem of Davenport and Heilbronn. Acta Math. 100, 259–279 (1958)
Birch, B.J., Davenport, H.: Quadratic equations in several variables. Proc. Camb. Philos. Soc. 54, 135–138 (1958)
Bentkus, V., Götze, F.: On the lattice point problem for ellipsoids. Acta Arith. 80(2), 101–125 (1997)
Bentkus, V., Götze, F.: Lattice point problems and distribution of values of quadratic forms. Ann. Math. (2) 150(3), 977–1027 (1999)
Buterus, P., Götze, F., Hille, T.: On small values of indefinite diagonal quadratic forms at integer points in at least five variables. Trans. Am. Math. Soc. Ser. B. (2022) (to appear). arXiv:1810.11898
Bochner, S.: Vorlesungen über Fouriersche Integrale. Chelsea Publ. Co., New York (1948)
Bourgain, J.: A quantitative Oppenheim theorem for generic diagonal quadratic forms. Isr. J. Math. 215(1), 503–512 (2016)
Bhattacharya, R.N., Rao, R.R.: Normal approximation and asymptotic expansions. Reprint of the 1976 original. Robert E. Krieger Publishing Co., Inc., Melbourne (1986)
Borevich, A.I., Shafarevich, I.R.: Number theory. Translated from the Russian by Newcomb Greenleaf. Pure and Applied Mathematics, Vol. 20. Academic Press, New York (1966)
Cassels, J.W.S.: Bounds for the least solutions of homogeneous quadratic equations. Proc. Camb. Philos. Soc. 51, 262–264 (1955)
Cassels, J.W.S.: Addendum to the paper “Bounds for the least solutions of homogeneous quadratic equations”. Proc. Camb. Philos. Soc. 52, 602 (1956)
Cassels, J.W.S.: An introduction to the geometry of numbers. Classics in Mathematics. Corrected reprint of the 1971 edition. Springer, Berlin (1997)
Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Adv. Comput. Math. 5(4), 329–359 (1996)
Davenport, H.: Note on a theorem of Cassels. Proc. Camb. Philos. Soc. 53, 539–540 (1957)
Davenport, H.: Indefinite quadratic forms in many variables. II. Proc. Lond. Math. Soc. (3) 8, 109–126 (1958)
Davenport, H., Heilbronn, H.: On indefinite quadratic forms in five variables. J. Lond. Math. Soc. 21, 185–193 (1946)
Davenport, H., Lewis, D.J.: Gaps between values of positive definite quadratic forms. Acta Arith. 22, 87–105 (1972)
Dani, S.G., Margulis, G.A.: Limit distributions of orbits of unipotent flows and values of quadratic forms. In: I. M. Gelfand Seminar. Adv. Soviet Math. Amer. Math. Soc., vol. 16, pp. 91–137 (1993)
Elsner, G.: Values of special indefinite quadratic forms. Acta Arith. 138(3), 201–237 (2009)
Eskin, A., Margulis, G., Mozes, S.: Quadratic forms of signature (2; 2) and eigenvalue spacings on rectangular 2-tori. Ann. Math. (2) 161(2), 679–725 (2005)
Eskin, A., Margulis, G., Mozes, S.: Upper bounds and asymptotics in a quantitative version of the Oppenheim conjecture. Ann. Math. (2) 147(1), 93–141 (1998)
Einsiedler, M., Ward, T.: Homogeneous Dynamics and Applications (to appear) (2019)
Gerstein, L.J.: Basic Quadratic Forms. Graduate Studies in Mathematics, vol. 90. American Mathematical Society, Providence (2008)
Ghosh, A., Kelmer, D.: A quantitative Oppenheim theorem for generic ternary quadratic forms. J. Mod. Dyn. 12, 1–8 (2018)
Götze, F., Margulis, G.: Distribution of values of quadratic forms at integral points. Preprint (2010). http://arxiv.org/abs/1004.5123v2
Götze, F.: Lattice point problems and values of quadratic forms. Invent. Math. 157(1), 195–226 (2004)
Helgason, S.: Groups and geometric analysis. Vol. 83. Mathematical Surveys and Monographs. Integral geometry, invariant differential operators, and spherical functions, Corrected reprint of the 1984 original. American Mathematical Society (2000)
Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn., p. xviii+643. Cambridge University Press, Cambridge (2013)
Ingham, A.E.: A note on Fourier transforms. J. Lond. Math. Soc. 9(1), 29–32 (1934)
Knapp, A.W.: Representation theory of semisimple groups. An overview based on examples, Reprint of the 1986 original. Princeton University Press (2001)
Knapp, A.W.: Lie Groups Beyond an Introduction. Progress in Mathematics. Birkhäuser, Boston (2002)
Kowalski, E.: An Introduction to the Representation Theory of Groups. Graduate Studies in Mathematics, vol. 155. American Mathematical Society, Providence (2014)
Korkine, A., Zolotareff, G.: Sur les formes quadratiques positives quaternaires. Math. Ann. 5(4), 581–583 (1872)
Korkine, A., Zolotareff, G.: Sur les formes quadratiques. Math. Ann. 6(3), 366–389 (1873)
Korkine, A., Zolotareff, G.: Sur les formes quadratiques positives. Math. Ann. 11(2), 242–292 (1877)
Lang, S.: \({\rm SL}_2({\mathbf{R}})\). Graduate Texts in Mathematics, Vol. 105. Reprint of the 1975 edition. Springer, New York (1985)
Lewis, D.J.: The distribution of the values of real quadratic forms at integer points. In: Analytic Number Theory (Proc. Sympos. Pure Math., Vol. XXIV, St. Louis Univ., St. Louis, Mo., 1972). American Mathematical Society, pp. 159–174 (1973)
Margulis, G.A.: Discrete subgroups and ergodic theory. In: Number Theory, Trace Formulas and Discrete Groups (Oslo, 1987), pp. 377–398. Academic Press, Boston (1989)
Margulis, G.A.: Oppenheim conjecture. In: Fields Medallists’ Lectures. World Sci. Ser. 20th Century Math. (5), pp. 272–327 (1997)
Marklof, J.: Pair correlation densities of inhomogeneous quadratic forms II. Duke Math. J. 115(3), 409–434 (2002)
Marklof, J.: Pair correlation densities of inhomogeneous quadratic forms. Ann. Math. (2) 158(2), 419–471 (2003)
Meyer, A.: Ueber die Aufloesung der Gleichung \(ax^2+by^2+cz^2+du^2+ev^2\) in ganzen Zahlen. Vierteljahresschrift der Naturforschenden Gesellschaft in Zürich 29, 209–222 (1884)
Margulis, G., Mohammadi, A.: Quantitative version of the Oppenheim conjecture for inhomogeneous quadratic forms. Duke Math. J. 158(1), 121–160 (2011)
Mumford, D.: Tata lectures on theta. I. Progress in Mathematics, Vol. 28. With the assistance of C. Musili, M. Nori, E. Previato and M. Stillman. Birkhäuser Boston, Inc., Boston (1983)
Oppenheim, A.: The minima of indefinite quaternary quadratic forms. Proc. Nat. Acad. Sci. USA 15(9), 724–727 (1929)
Oppenheim, A.: The minima of indefinite quaternary quadratic forms. Ann. Math. (2) 32(2), 271–298 (1931)
Schmidt, W.M.: Small zeros of quadratic forms. Trans. Am. Math. Soc. 291(1), 87–102 (1985)
Schmidt, W.M.: Asymptotic formulae for point lattices of bounded determinant and subspaces of bounded height. Duke Math. J. 35, 327–339 (1968)
Schlickewei, H.P.: Kleine Nullstellen homogener quadratischer Gleichungen. Monatsh. Math. 100(1), 35–45 (1985)
Siegel, C.L.: Lectures on the geometry of numbers. Notes by B. Friedman, Rewritten by Komaravolu Chandrasekharan with the assistance of Rudolf Suter, With a preface by Chandrasekharan. Springer, Berlin (1989)
Skriganov, M.M.: Constructions of uniform distributions in terms of geometry of numbers. Algebra i Analiz 6(3), 200–230 (1994)
Skriganov, M.M.: Ergodic theory on SL(n), Diophantine approximations and anomalies in the lattice point problem. Invent. Math. 132(1), 1–72 (1998)
Schlickewei, H.P., Schmidt, W.M.: Quadratic geometry of numbers. Trans. Am. Math. Soc. 301(2), 679–690 (1987)
Schlickewei, H.P., Schmidt, W.M.: Quadratic forms which have only large zeros. Monatsh. Math. 105(4), 295–311 (1988)
Schlickewei, H.P., Schmidt, W.M.: Isotrope Unterräume rationaler quadratischer Formen. Math. Z. 201(2), 191–208 (1989)
Funding
Open Access funding enabled and organized by Projekt DEAL.
Research supported by the DFG, CRC 701.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Buterus, P., Götze, F., Hille, T. et al. Distribution of values of quadratic forms at integral points. Invent. math. 227, 857–961 (2022). https://doi.org/10.1007/s00222-021-01086-6
DOI: https://doi.org/10.1007/s00222-021-01086-6