On a sharp lemma of Cassels and Montgomery on manifolds

Let $\left( \mathcal{M},g\right) $ be a $d$-dimensional compact connected Riemannian manifold and let $\left\{ \varphi_{m}\right\}_{m=0}^{+\infty}$ be a complete sequence of orthonormal eigenfunctions of the Laplace-Beltrami operator on $\mathcal{M}$. We show that there exists a positive constant $C$ such that for all integers $N$ and $X$ and for all finite sequences of $N$ points in $\mathcal{M}$, $\left\{ x\left( j\right) \right\}_{j=1}^{N}$, and positive weights $\left\{ a_{j}\right\}_{j=1}^{N}$ we have \[ \sum_{m=0}^{X} | \sum_{j=1}^{N} a_{j} \varphi_{m} ( x( j) ) | ^{2}\geq \max \{ CX\sum_{j=1}^{N}a_{j}^{2},( \sum_{j=1}^{N}a_{j}) ^{2}\}.\]

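The main result of this paper is the following theorem.
Theorem 1. There exists a positive constant $C$ such that for all integers $N$ and $X$ and for all finite sequences of $N$ points in $\mathcal{M}$, $\{x(j)\}_{j=1}^{N}$, and positive weights $\{a_{j}\}_{j=1}^{N}$ we have \[ \sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(x(j))\Big|^{2}\geq\max\Big\{CX\sum_{j=1}^{N}a_{j}^{2},\Big(\sum_{j=1}^{N}a_{j}\Big)^{2}\Big\}. \]
Notice that the estimate \[ \sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(x(j))\Big|^{2}\geq\Big(\sum_{j=1}^{N}a_{j}\Big)^{2} \] is immediately obtained since for $m=0$ one has $\varphi_{0}(x)=1$ for all $x$ in $\mathcal{M}$. The essential part of the theorem is therefore the estimate \[ \sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(x(j))\Big|^{2}\geq CX\sum_{j=1}^{N}a_{j}^{2}. \]
When $\mathcal{M}$ is the one-dimensional torus, the above theorem is classical and goes back to the work of J. W. S. Cassels [3]. This was later extended to the higher dimensional torus by H. L. Montgomery, see e.g. his book [9], or [14]. Recently, D. Bilyk, F. Dai and S. Steinerberger [2] extended the Cassels-Montgomery inequality to the case of smooth compact $d$-dimensional Riemannian manifolds without boundary. More precisely, they show that for all integers $N$ and $X$, for all finite sequences of $N$ points in $\mathcal{M}$, $\{x(j)\}_{j=1}^{N}$, and for all positive weights $\{a_{j}\}_{j=1}^{N}$, the sum $\sum_{m=0}^{X}\big|\sum_{j=1}^{N}a_{j}\varphi_{m}(x(j))\big|^{2}$ is bounded below by $CX\sum_{j=1}^{N}a_{j}^{2}$ up to a logarithmic loss in $X$, for some positive constant $C$. This result should be compared with the following simple proposition.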
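Proposition 2. Let $X$ and $N$ be positive integers. For all positive weights $\{a_{j}\}_{j=1}^{N}$ there exist points $\{x(j)\}_{j=1}^{N}$ in $\mathcal{M}$ such that \[ \sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(x(j))\Big|^{2}\leq X\sum_{j=1}^{N}a_{j}^{2}+\Big(\sum_{j=1}^{N}a_{j}\Big)^{2}. \]
Proof. For all points $y_{1},\ldots,y_{N}$ in $\mathcal{M}$, \[ \sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(y_{j})\Big|^{2}=\sum_{m=0}^{X}\sum_{j=1}^{N}\sum_{k=1}^{N}a_{j}a_{k}\varphi_{m}(y_{j})\overline{\varphi_{m}(y_{k})}. \] Hence, integrating over $\mathcal{M}^{N}$ and using the orthonormality of the eigenfunctions together with $\varphi_{0}\equiv1$ (so that the terms with $j\neq k$ vanish for every $m\geq1$), \[ \int_{\mathcal{M}}\cdots\int_{\mathcal{M}}\sum_{m=0}^{X}\Big|\sum_{j=1}^{N}a_{j}\varphi_{m}(y_{j})\Big|^{2}dy_{1}\cdots dy_{N}=X\sum_{j=1}^{N}a_{j}^{2}+\Big(\sum_{j=1}^{N}a_{j}\Big)^{2}. \] Therefore there exist points $\{x(j)\}_{j=1}^{N}$ such that the sum on the left does not exceed this average value.
Our goal is therefore to remove the logarithmic loss in the above result of Bilyk, Dai and Steinerberger, thus obtaining a sharp estimate.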
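The averaging argument behind this proposition can be checked numerically in the simplest setting. The following is a minimal sketch, assuming the one-dimensional torus with eigenfunctions $e^{2\pi ikx}$, $|k|\leq K$ (so that there are $X=2K$ nonconstant eigenfunctions), and replacing the integral over $\mathcal{M}^{N}$ by an exact average over an equispaced grid with more than $K$ nodes. The function names `spectral_sum` and `grid_average_and_min` are illustrative and do not come from the paper.

```python
import itertools
import numpy as np

def spectral_sum(points, weights, K):
    """Sum over |k| <= K of |sum_j a_j exp(2 pi i k x_j)|^2 on the circle."""
    k = np.arange(-K, K + 1)
    S = np.exp(2j * np.pi * np.outer(k, points)) @ weights
    return float(np.sum(np.abs(S) ** 2))

def grid_average_and_min(weights, K, G):
    """Average and minimum of spectral_sum over all point tuples on a G-node grid.

    Assumption: G > K, so the grid average of exp(2 pi i k x) vanishes
    for 0 < |k| <= K, exactly as the integral over the circle does.
    """
    grid = np.arange(G) / G
    values = [spectral_sum(np.array(p), weights, K)
              for p in itertools.product(grid, repeat=len(weights))]
    return float(np.mean(values)), float(np.min(values))

weights = np.array([1.0, 2.0, 0.5])
K = 4
average, minimum = grid_average_and_min(weights, K, G=10)
predicted = 2 * K * np.sum(weights ** 2) + np.sum(weights) ** 2
print(average, predicted, minimum)
```

In this sketch the average coincides with $X\sum_{j}a_{j}^{2}+(\sum_{j}a_{j})^{2}$ for $X=2K$, and the minimum over the grid is necessarily no larger than the average, which is the existence statement in miniature.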
The original proof by Montgomery in the case of the torus uses the Fejér kernel. A direct adaptation of this proof to the case of a general manifold would require the construction of a positive kernel of the form $\sum_{m=0}^{X}c_{m}\varphi_{m}(x)\varphi_{m}(y)$, but unfortunately kernels of this type are not available on a general manifold. One could therefore drop, for example, the requirement that the spectrum of the kernel be contained in the set $\{\lambda_{0}^{2},\ldots,\lambda_{X}^{2}\}$. This is the strategy followed by Bilyk, Dai and Steinerberger, who use the heat kernel. Our strategy here is, on the contrary, to use a kernel which is positive up to a negligible error, without dropping the spectrum condition. The existence of such a kernel can be proved by means of the Hadamard parametrix for the wave operator on the manifold. In the next section we introduce this construction.
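The Fejér kernel argument on the one-dimensional torus can be illustrated numerically. The sketch below assumes the eigenfunctions $e^{2\pi ikx}$ with $|k|\leq K$ and uses the standard facts that the Fejér kernel $F_{K}(t)=\sum_{|k|\leq K}(1-|k|/(K+1))e^{2\pi ikt}$ is nonnegative and satisfies $F_{K}(0)=K+1$. Discarding the off-diagonal terms of $\sum_{j,l}a_{j}a_{l}F_{K}(x_{j}-x_{l})$ then gives the lower bound $(K+1)\sum_{j}a_{j}^{2}$. The function names are illustrative, not taken from the paper.

```python
import numpy as np

def exponential_sums(points, weights, K):
    """Return k = -K..K and S_k = sum_j a_j exp(2 pi i k x_j)."""
    k = np.arange(-K, K + 1)
    phases = np.exp(2j * np.pi * np.outer(k, points))  # shape (2K+1, N)
    return k, phases @ weights

def cassels_montgomery_check(points, weights, K):
    """Compute the full spectral sum, its Fejer-weighted version, and the bound."""
    k, S = exponential_sums(points, weights, K)
    full_sum = float(np.sum(np.abs(S) ** 2))
    # Fejer weights lie in (0, 1], so the weighted sum is at most the full sum.
    fejer_sum = float(np.sum((1 - np.abs(k) / (K + 1)) * np.abs(S) ** 2))
    # Diagonal contribution: F_K(0) * sum_j a_j^2 = (K + 1) * sum_j a_j^2.
    lower_bound = float((K + 1) * np.sum(weights ** 2))
    return full_sum, fejer_sum, lower_bound

rng = np.random.default_rng(0)
points = rng.random(50)           # arbitrary points on the circle [0, 1)
weights = rng.random(50) + 0.1    # arbitrary positive weights
full_sum, fejer_sum, lower_bound = cassels_montgomery_check(points, weights, K=20)
print(full_sum, fejer_sum, lower_bound)
```

The chain `full_sum >= fejer_sum >= lower_bound` holds for every choice of points and positive weights. On a general manifold no analogue of the weights $1-|k|/(K+1)$ with a positive kernel is known, which is the obstruction described above.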
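Let us clarify the meaning of the objects that appear in this proposition. Let $\alpha\in\mathbb{C}$ be such that $\operatorname{Re}\alpha>-1$ and for every test function $\phi\in C_{0}^{\infty}(\mathbb{R})$ define the distribution $\chi_{+}^{\alpha}$ as \[ \langle\chi_{+}^{\alpha},\phi\rangle=\frac{1}{\Gamma(\alpha+1)}\int_{0}^{+\infty}x^{\alpha}\phi(x)\,dx. \] Integration by parts immediately gives $\langle\chi_{+}^{\alpha},\phi\rangle=-\langle\chi_{+}^{\alpha+1},\phi^{\prime}\rangle$, so that $\chi_{+}^{\alpha}$ can be extended to all $\alpha$ with $\operatorname{Re}\alpha>-2$ and, repeating the argument, to the whole complex plane (see [8, I, §3.2] for the details).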
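The integration by parts identity can be verified numerically. The following is a minimal sketch, assuming the normalization $\langle\chi_{+}^{\alpha},\phi\rangle=\Gamma(\alpha+1)^{-1}\int_{0}^{+\infty}x^{\alpha}\phi(x)\,dx$ of [8, I, §3.2], and using the Gaussian $\phi(x)=e^{-x^{2}}$ in place of a compactly supported test function (an assumption made only so that a closed form is available for comparison). The helper `chi_plus_pairing` is hypothetical and implements a plain midpoint rule.

```python
import math
import numpy as np

def chi_plus_pairing(alpha, phi, upper=10.0, n=200_000):
    """Midpoint-rule approximation of <chi_+^alpha, phi> for real alpha > -1.

    Assumption: phi is negligible beyond `upper`, so truncating the
    integral over (0, +infinity) introduces no visible error.
    """
    h = upper / n
    x = (np.arange(n) + 0.5) * h
    return float(np.sum(x ** alpha * phi(x)) * h / math.gamma(alpha + 1))

def phi(x):
    return np.exp(-x ** 2)

def dphi(x):
    return -2.0 * x * np.exp(-x ** 2)

alpha = 0.5
lhs = chi_plus_pairing(alpha, phi)           # <chi_+^alpha, phi>
rhs = -chi_plus_pairing(alpha + 1.0, dphi)   # -<chi_+^(alpha+1), phi'>
# Closed form: int_0^inf x^s exp(-x^2) dx = Gamma((s + 1) / 2) / 2.
exact = math.gamma((alpha + 1) / 2) / (2 * math.gamma(alpha + 1))
print(lhs, rhs, exact)
```

The same identity, read from right to left, is what defines $\chi_{+}^{\alpha}$ when $-2<\operatorname{Re}\alpha\leq-1$, where the integral on the left no longer converges.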
Also, since the function f (x, t) = t 2 −|x| 2 is a submersion of R d+1 \{0} in R, then the pull-back χ α We observe that by [8, I, Theorem 3.23] the distribution χ Recall that distributions in D ′ (M) can always be written as u = +∞ m=0 c m ϕ m , where the sequence {c m } is slowly increasing. Their action on smooth functions is given by Consider the continuous linear map K t : D(M) → D ′ (M) defined by Observe that K t φ is in fact a smooth function and it is the solution of the following Cauchy problem for the wave equation By the Schwartz kernel Theorem (see [ This immediately implies that Hadamard's construction of the parametrix for the wave operator allows to describe for small values of time t the singularities of cos t √ ∆ (x, y).
Theorem 4 (see [11,Theorem 3.1.5]). Given a d-dimensional Riemannian manifold (M, g), there exists ε > 0 and functions α ν ∈ C ∞ (M×M), so that if Q > d+3 the following holds. Let and Observe that K Q (t, x, y), by Proposition 3 (iii), defines a distribution on R×M× M via the identity However this distribution describes the singularities of the kernel cos t √ ∆ (x, y) only for small time.

Notations and Fourier transforms
Let us introduce some notation. If f and g are integrable functions on R d , we shall denote their convolution by We define the cosine transform of smooth even functions on R as For smooth functions on R d we will use a slightly different normalization, and we define the Fourier transform and its inverse as For radial functions f (x) = f 0 (|x|), the above Fourier transform reduces essentially to the Hankel transform, given by (see. [12,Chapter 4,Theorem 3.3]) In the future, with an abuse of notation, we will identify the function f with its radial profile f 0 and write F d f (|ξ|) instead of F d f (ξ) . One can easily show that In the proof of Theorem 1 we need the inverse cosine transform of the distribution ∂ t E ν −Ě ν . By Proposition 3 (iii), ∂ t E ν −Ě ν (t, z) can be seen as a continuous function of z into D ′ (R) . In the following Lemma we compute for every fixed z the inverse cosine transform of this distribution.
Proof. Since by Proposition 3 (i) and (iii) is an even, locally integrable function in t, vanishing at ∞, so that its cosine transform is (see [4,Formula 11, Table 1.3, Chapter 1, page 12]), Observe now that the distribution χ This implies that also the cosine transform can be analytically extended to all complex values of ν (see [6, Note 1, page 171]). This analytic extension coincides therefore with the analytic extension of the distribution Observe that this is the product of the locally integrable function |s| −2ν−1+d (re- Thus, the identity holds for all ν < d/2.

Proof of the main result
It suffices to show the main inequality (1) for any positive integer N and for any integer X sufficiently large. Indeed, if 1 ≤ X < X 0 Let κ be a positive integer that we will choose later and let Y = κX. By [7, with measure |R i | = 1/Y and such that each region contains a ball of radius c 1 Y −1/d and is contained in a ball of radius c 2 Y −1/d , for appropriate values of c 1 and c 2 independent of Y . Let us call {B r } R r=1 the sequence of all the regions of the above which contain at least one of the points x (j). We call K r the cardinality of the set {j = 1, . . . , N : x (j) ∈ B r } and S r the sum of the weights {a j } corresponding to points x (j) ∈ B r . Without loss of generality we can assume that We rename the sequence {x (j)} (4)
Let $\psi$ be a smooth radial function on $\mathbb{R}^{d}$, compactly supported in the ball $B(0,1/2)=\{x\in\mathbb{R}^{d}:|x|<1/2\}$, such that $\|\psi\|_{2}=1$ and $\psi\geq0$, and set $H(x)=\psi\ast_{d}\psi(x)$. Then clearly $H$ is radial, compactly supported in $B(0,1)$, $H(x)\leq1$ for all $x\in\mathbb{R}^{d}$, and $H(0)=1$. Moreover its Fourier transform is $\mathcal{F}_{d}H(\xi)=(\mathcal{F}_{d}\psi(\xi))^{2}\geq0$ for all $\xi\in\mathbb{R}^{d}$, and has fast decay at infinity together with all its derivatives.
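The stated properties of $H=\psi\ast\psi$ can be observed on a discretized example. The sketch below works in dimension $d=1$ only, takes the bump $\exp(-1/(1/4-x^{2}))$ as one possible choice of $\psi$, and replaces integrals by Riemann sums on a uniform grid. All of these are assumptions made for illustration, and the names `bump`, `build_H` and `fourier_transform` are hypothetical.

```python
import numpy as np

def bump(x):
    """Smooth even bump supported in (-1/2, 1/2); one possible choice of psi."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 0.5
    out[inside] = np.exp(-1.0 / (0.25 - x[inside] ** 2))
    return out

def build_H(n=400):
    """Sample psi on [-1/2, 1/2] and H = psi * psi on [-1, 1] with step h."""
    h = 0.5 / n
    x_psi = np.arange(-n, n + 1) * h
    psi = bump(x_psi)
    psi /= np.sqrt(np.sum(psi ** 2) * h)   # discrete L^2 norm equal to 1
    H = np.convolve(psi, psi) * h          # Riemann sum for the convolution
    x_H = np.arange(-2 * n, 2 * n + 1) * h
    return x_psi, psi, x_H, H, h

def fourier_transform(x, f, h, xi):
    """Riemann-sum approximation of int f(x) exp(-2 pi i x xi) dx."""
    return (np.exp(-2j * np.pi * np.outer(xi, x)) @ f) * h

x_psi, psi, x_H, H, h = build_H()
xi = np.linspace(0.0, 20.0, 101)
FH = fourier_transform(x_H, H, h, xi)
Fpsi = fourier_transform(x_psi, psi, h, xi)
print(H[len(H) // 2], H.max(), FH.real.min())
```

In this discrete model $H(0)=\|\psi\|_{2}^{2}=1$, the bound $H\leq1$ is the Cauchy-Schwarz inequality, and the transform of $H$ equals the square of the real-valued transform of the even function $\psi$, hence it is nonnegative up to rounding.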
If we now identify H (x) with its radial profile, we can write Let us define the kernel We will estimate F X (x, y) using the parametrix for the wave operator described in the previous section. For this, one would need that the Fourier cosine transform of H · λX have small support. This of course cannot be achieved, having H itself compact support. For this reason we pick η = F d φ where φ (ξ) is a nonnegative smooth radial function supported in B (0, ε/2π) and such that φ (ξ) = 1 in B (0, ε/4π) and define The reason for taking a d-dimensional convolution will be clarified in Lemma 7 where we use the fact that F d H ≥ 0.
Observe that supp F d H ⊂ B (0, ε/2π). It is remarkable that the cosine transform of H has support in [0, ε] and is nonnegative.
Proof. It is known (see [10, eq. (3.9)] ) that for d > d ′ ≥ 1, Let now g (r) = F d H (r). Since H (s) = F d g (s) and the cosine transform is essentially F 1 we obtain 2π and the thesis follows immediately from (7), the fact that g (r) ≥ 0 and the fact that g (r) = 0 for r > ε/2π.
Let us go back to the kernel F X , We can therefore decompose the kernel F X as follows Recalling (5) and (6) we have a r,j a s,i F n (x r,j , x s,i ) .
We start estimating the term with F 1 which is the positive part of the kernel and gives the main contribution. Proof. First of all we show that Ω 0 (x, y) is positive. Indeed, by Lemma 5 and (2), for every x, y ∈ M, Since also α 0 (x, y) is positive, we can disregard off-diagonal terms, a r,j a r,i α 0 (x r,j , x r,i ) Ω 0 (x r,j , x r,i ) By Weyl's estimate (see e.g. [8,III,Corollary 17.5.8]) The following lemmas show that the contributions given by the terms with F 2 , F 3 , F 4 , F 5 are negligible. Lemma 8. There exist C > 0 and X 0 > 0 such that for every X > X 0 Proof. We will show that for every integer ν, By Lemma 5, for every x, y ∈ M, Using (7) and the fast decay at infinity of F d ψ, for any positive M there exist positive constants C and G such that for every ρ ≥ 0 Therefore, using the symmetry of Ω ν (x, y), for any integer ν with 1 ≤ ν < d/2 we obtain In order to estimate the above sum recall that every region B r is contained in a ball centered at a point z r ∈ B r of radius c 2 Y −1/d and let c 3 = 10c 2 . For every fixed r = 1, . . . , R we will consider separately the contribution of those values of s for which the B s is near B r , in the sense that B s is contained in the ball centered at z r and with radius c 3 Y −1/d , and the contribution of the remaining values of s, for which we will say that B s is far from B r . Notice that there are at most regions B s near B r . Thus, using again that λ X ∼ X 1/d and that for r s we have Using again that for r ≤ s we have Kr j=1 a r,j ≥ Ks i=1 a s,i , s>r: Lemma 9. There exist C > 0 and X 0 such that for every X > X 0 Proof. We will show that for every integer ν ≥ d/2, Ks i=1 a r,j a s,i α ν (x r,j , x s,i ) Ω ν (x r,j , x s,i ) Observe that for ν ≥ d/2, the distribution ∂ t E ν −Ě ν (t, d (x, y)) can be identified with the locally integrable function for an appropriate value of C ν . Therefore, using the symmetry of Ω ν (x, y), where we use the fact that C −1 H (t) ≥ 0, by Lemma 6.
Assume first that d = 1 and let D = d (x r,j , x s,i ), then A similar estimate can be obtained for d ≥ 2. Indeed, by formula (7) Thus, for all d ≥ 1, Finally, arguing as in the previous lemma, Lemma 10. There exist C > 0 and X 0 > 0 such that for every X > X 0 Proof. Recall that for every x, y ∈ M, As before, if d = 1, then and, by Theorem 4, A similar estimate holds for d ≥ 2. Indeed, as in the previous lemma, so that, again by Theorem 4, By Weyl's estimates, which say that the number of eigenvalues λ 2 m ≤ T 2 is asymptotic to cT d , and taking M such that −M + 2d − 1 < −d gives the result.

Final remarks
A simple consequence of Theorem 1 is the following estimate on the maximum degree X of linear combinations of eigenfunctions of the Laplacian up to the eigenvalue λ X that a quadrature rule can integrate exactly. This is a well known result for equal weights, see e.g. Then there exists a constant C > 0 independent of X and N such that In particular CX N.
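Proof. Since $\varphi_{0}(x)\equiv1$ we must have \[ \sum_{i=1}^{N}a_{i}=1. \] Applying the Cauchy-Schwarz inequality to $1=\sum_{i=1}^{N}a_{i}$ we easily obtain \[ 1=\Big(\sum_{i=1}^{N}a_{i}\Big)^{2}\leq N\sum_{i=1}^{N}a_{i}^{2}. \]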