On the variance of squarefree integers in short intervals and arithmetic progressions

We evaluate asymptotically the variance of the number of squarefree integers up to x in short intervals of length H < x^{6/11−ε} and the variance of the number of squarefree integers up to x in arithmetic progressions modulo q with q > x^{5/11+ε}. On the assumption of respectively the Lindelöf Hypothesis and the Generalized Lindelöf Hypothesis we show that these ranges can be improved to respectively H < x^{2/3−ε} and q > x^{1/3+ε}.
Furthermore we show that obtaining a bound sharp up to factors of H^ε in the full range H < x^{1−ε} is equivalent to the Riemann Hypothesis. These results improve on a result of Hall (Mathematika 29(1):7–17, 1982) for short intervals, and earlier results of Warlimont, Vaughan, Blomer, Nunes and Le Boudec in the case of arithmetic progressions.

1. Introduction
1.1. Main results. An integer n ≥ 1 is squarefree if it is not divisible by the square of a prime. By analogy with questions about prime numbers, a basic problem in analytic number theory is to understand the distribution of squarefree numbers in arithmetic progressions and in short intervals. Squarefree numbers ought to be a simpler, more regular sequence than primes, and yet they present distinct challenges; for instance we can determine whether n is prime in polynomial time [AKS04], but there is no known polynomial time algorithm to determine whether n is squarefree.
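To make the definition concrete, the following is a minimal sketch (not part of the original argument) of two elementary ways to detect squarefree integers. The function names `is_squarefree` and `squarefree_sieve` are ours and purely illustrative. The trial-division test runs in time about √n, which is exponential in the number of digits of n, consistent with the remark above that no polynomial time test is known.

```python
from math import isqrt


def is_squarefree(n: int) -> bool:
    """Return True if no square d*d with d >= 2 divides n (trial division)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def squarefree_sieve(limit: int) -> list[bool]:
    """flags[n] is True exactly when 1 <= n <= limit and n is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for d in range(2, isqrt(limit) + 1):
        # Strike out every multiple of d^2.
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
    return flags
```

Summing `squarefree_sieve(N)` over a window of indices gives the counts in short intervals studied below; filtering by `n % q` gives the counts in arithmetic progressions.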
It was conjectured by Montgomery (see [Cro75]) that for any given ε ∈ (0, 1/100) and (a, q) = 1, the estimate (1) holds uniformly in 1 ≤ q ≤ x^{1−ε}. This conjecture is difficult for two reasons. In the regime of large q of size roughly x^{1−ε} the left-hand side contains only x^ε terms and even establishing an asymptotic is open (see the work of Nunes [Nun17] for the best result in this direction). In the regime of small q of size about x^ε establishing an asymptotic is easy but obtaining an error term as good as O_ε((x/q)^{1/4+ε}) is an open problem, even conditionally on the Generalized Riemann Hypothesis.
Analogously we conjecture that for any given ε ∈ (0, 1/100), the estimate (2) holds uniformly in x^ε ≤ H ≤ x. Similarly to the case of arithmetic progressions, when H is close to x^ε no asymptotic estimates are known (see the work of Tolev [Tol06] and Filaseta-Trifonov [FT92] for the best unconditional results in this direction and [CE19, Thm. A.1], [Gra98] for results conditional on the ABC conjecture). Meanwhile for large H, say H = x, estimating (2) asymptotically is straightforward, but obtaining an error term of the strength conjectured in (2) is an open problem, even conditionally on the Riemann Hypothesis (see [Liu16] for the best result in this direction).
An important feature of both conjectures (1) and (2) is that the error term is significantly smaller than the square-root of the number of terms being summed, in contrast to what a naive probabilistic model predicts.
The conjectures (1) and (2) imply the Riemann Hypothesis, and they are almost certainly deeper than the Riemann Hypothesis. Nonetheless one can still hope to investigate them on average over residue classes for (1) or on average over short intervals for (2). Importantly, establishing (1) on average is easier when q is large than when q is small, since a large q allows for more averaging over the residue classes a (mod q). Similarly establishing (2) on average is easier when H is small, since there are more non-overlapping short intervals [x, x + H] to average over compared to the case when H is large. In fact when there is little averaging (i.e. q small or H large), the averaged versions of (1) and (2) are not significantly easier than the non-averaged version; see Theorem 3 for a concrete manifestation of this.
In our first result we compute the variance of (2) on average over short intervals. We estimate the variance asymptotically, thus making (on average) the error term in (2) more precise.
We recall that the Lindelöf Hypothesis follows from the Riemann Hypothesis and asserts that for any given ε > 0 we have |ζ(1/2 + it)| ≪_ε 1 + |t|^ε for all t ∈ R. In Theorem 3 we will show that if we had (3) in the full range H ≤ X^{1−ε} then the Riemann Hypothesis would ensue.
Theorem 1 extends a theorem of Hall [Hal82] who showed that the asymptotic formula (3) holds in the range H ≤ X^{2/9−ε}. We will now explain why the range H = X^{1/2} can be considered a threshold in this problem. It is reasonable to conjecture that given ε > 0, for any 1 ≤ h ≤ x^{1−ε}, (5) with C(h) a constant depending only on h. Summing this conjectural estimate over h recovers Theorem 1 but only in the range H < X^{1/2−ε}. Thus Theorem 1 exploits (unconditionally!) additional cancellations between the error terms in (5).
We now describe the analogue of Theorem 1 for the distribution of squarefree numbers in arithmetic progressions with a given modulus. In this case for a given modulus q the parameter x/q has the same role as the length H of the short interval in Theorem 1. While the results are analogous they are harder to prove, as is often the case with q-analogues.
Theorem 2. Let ε ∈ (0, 1/100) be given. Let q be a prime with x ≥ q ≥ x^{5/11+ε}. Then the asymptotic formula (6) holds, where C is the same constant as in Theorem 1. Assuming the Generalized Lindelöf Hypothesis the claim holds in the wider range q > x^{1/3+30ε}.
We recall that the Generalized Lindelöf Hypothesis follows from the Generalized Riemann Hypothesis and asserts that for any given ε > 0 we have |L(1/2 + it, χ)| ≪_ε (q(1 + |t|))^ε for all t ∈ R and all characters χ (mod q).
For simplicity we have assumed in Theorem 2 that q is prime, but our methods are amenable to handling the general case of composite q with a bit more effort.
Once extended to composite q our Theorem 2 improves on results by Warlimont [War80] and Vaughan [Vau05] who obtain an asymptotic formula with an additional averaging over q ≤ Q in the range x^{2/3} ≤ Q = o(x). Moreover, for prime values of q, Theorem 2 improves on a succession of results by Blomer [Blo08], Nunes [Nun15] (see also [Par19]) and Le Boudec [LB18] who considered individual averages over (a, q) = 1 as we do in Theorem 2. In particular Nunes showed that (6) holds in the range x^{31/41+ε} ≤ q = o(x) and Le Boudec showed that the left-hand side of (6) is O_ε((x/q)^{1/2+ε}) for all ε > 0 in the range x^{1/2} ≤ q ≤ x.
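As a rough numerical companion to Theorem 2 (an illustration only, not part of the proof), one can tabulate the number of squarefree n ≤ x in each residue class modulo a prime q and measure how much these counts fluctuate over the reduced classes. In the sketch below the names `squarefree_flags`, `progression_counts` and `progression_variance` are ours, and, as a simplifying assumption, the variance is centred at the empirical mean over the classes a with (a, q) = 1 rather than at the main term appearing in (1).

```python
from math import isqrt


def squarefree_flags(limit: int) -> list[bool]:
    """flags[n] is True exactly when 1 <= n <= limit and n is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for d in range(2, isqrt(limit) + 1):
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = False
    return flags


def progression_counts(x: int, q: int) -> list[int]:
    """counts[a] = number of squarefree n <= x with n congruent to a mod q."""
    flags = squarefree_flags(x)
    counts = [0] * q
    for n in range(1, x + 1):
        if flags[n]:
            counts[n % q] += 1
    return counts


def progression_variance(x: int, q: int) -> float:
    """Empirical variance of the counts over a = 1, ..., q - 1 (q assumed prime)."""
    reduced = progression_counts(x, q)[1:]  # for prime q these are the classes with (a, q) = 1
    mean = sum(reduced) / len(reduced)
    return sum((c - mean) ** 2 for c in reduced) / len(reduced)
```

Comparing `progression_variance(x, q)` with (x/q)^{1/2} for several primes q gives a rough feel for the order of magnitude discussed above; the sketch makes no attempt to reproduce the constant C.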
Keating and Rudnick [KR16] obtained Theorem 1 and Theorem 2 in the context of function fields in the limit of a large field size. Their results hold in (the analogues of) the ranges X^ε ≤ H ≤ X^{1−ε} and x^ε ≤ q ≤ x^{1−ε}. Our proofs of Theorem 1 and Theorem 2 can be adapted to the setting of a fixed base field and large degree limit. In fact our proofs of Theorems 1 and 2 were originally motivated by analogies with the function field setting. Since we ended up obtaining equally strong results in the setting of number fields we do not include the proofs in the function field setting.
Finally the next result shows that obtaining nearly optimal upper bounds for (3) in a complete range is equivalent to the Riemann Hypothesis.
Theorem 3. The Riemann Hypothesis holds if and only if for every ε ∈ (0, 1 100 ) and every for every δ > 0.
Following the proof of Theorem 3 one can show that for any smooth compactly supported Φ, conditionally on the Generalized Riemann Hypothesis, the bound (8) for the average (1/ϕ(q)) ∑_{(a,q)=1} over the residue classes m ≡ a (mod q) holds for all δ > 0 and uniformly in 1 ≤ q ≤ x^{1−ε} for any given ε ∈ (0, 1/100). However it is not clear whether (8) implies the Generalized Riemann Hypothesis. Moreover replacing the smoothing Φ by sharp cut-offs appears to be difficult. For these reasons we decided not to pursue this further in the present paper.
Finally, we note that we have made no effort to optimize the exponents of error terms O_ε(H^{1/2−ε/16}) and O_ε((x/q)^{1/2−ε/16}) in Theorems 1 and 2. Better power saving estimates, in more restricted ranges, can be found in the papers [Hal82] and [Nun15].
1.2. Fractional Brownian motion. One notable feature of Theorems 1 and 2 is that while the expected count of squarefrees in a short interval (or likewise arithmetic progression) is of order H, the variance of these counts is of order H^{1/2}. For many other natural arithmetic sequences (e.g. primes) one conjectures that the variance of counts is of the same order of magnitude as the expected value of counts.
That the variance is of order H^{1/2} in Theorems 1 and 2 speaks to the idea that the squarefree numbers are "less random" than (for example) the primes (cf. [CS13]).
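The H^{1/2} scaling can be observed numerically. The following sketch is an illustration under stated assumptions, not a substitute for Theorem 1: it replaces the average over real x by an average over integers x ∈ [X, 2X), centres the counts at the expected value 6H/π² (the density of squarefree integers is 6/π² = 1/ζ(2)), and uses function names of our own choosing.

```python
from math import isqrt, pi


def squarefree_prefix_counts(limit: int) -> list[int]:
    """prefix[n] = number of squarefree integers in [1, n]."""
    flags = [1] * (limit + 1)
    flags[0] = 0
    for d in range(2, isqrt(limit) + 1):
        for multiple in range(d * d, limit + 1, d * d):
            flags[multiple] = 0
    prefix = [0] * (limit + 1)
    for n in range(1, limit + 1):
        prefix[n] = prefix[n - 1] + flags[n]
    return prefix


def short_interval_variance(X: int, H: int) -> float:
    """Mean over integers x in [X, 2X) of (#{squarefree n in (x, x+H]} - 6H/pi^2)^2."""
    prefix = squarefree_prefix_counts(2 * X + H)
    expected = 6 * H / pi ** 2
    total = 0.0
    for x in range(X, 2 * X):
        count = prefix[x + H] - prefix[x]
        total += (count - expected) ** 2
    return total / X
```

Evaluating `short_interval_variance(X, H) / H ** 0.5` for a few values of H that are small compared with X should give ratios of comparable size, whereas dividing by H instead gives ratios that decay as H grows.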
One may conjecture that higher moments are gaussian (see [ACS17] for numerical evidence). For x drawn uniformly at random from [X, 2X], one may even make the stronger conjecture that the process tends weakly, when suitably normalized by H^{1/4}, to a fractional Brownian motion with Hurst parameter 1/4. See Figure 1 for an illustration of the evolution of the partial sums (9). A formulation of this perspective seems to have been first made in [GH91]. This is in contrast to the analogous process generated by prime-counting, where one may conjecture the appearance of Hurst parameter 1/2, that is, usual Brownian motion. (See [She14] for a survey on fractional Brownian motion.) The evolution of the process is depicted in Figure 2. Both Figure 1 and Figure 2 depict the same range of parameters to make the comparison easier. The dots on Figure 1 and Figure 2 correspond to lattice points on the positive x-axis and on the (positive and negative) y-axis and indicate the difference in scales.
Parts of this research were done during visits to Centre de Recherches Mathématiques and Oberwolfach and we thank these institutions for their hospitality.
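For readers less familiar with the fractional Brownian motion appearing in Section 1.2, we recall that it is the centred Gaussian process whose covariance is (|s|^{2h} + |t|^{2h} − |t − s|^{2h})/2, where h denotes the Hurst parameter. The short sketch below simply evaluates this standard formula; the function name `fbm_covariance` is ours. With h = 1/4 the variance at time t is t^{1/2}, matching the H^{1/2} order of the variance in Theorems 1 and 2, while h = 1/2 recovers the covariance min(s, t) of usual Brownian motion.

```python
def fbm_covariance(s: float, t: float, hurst: float) -> float:
    """Covariance of fractional Brownian motion with Hurst parameter `hurst` at times s, t."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("the Hurst parameter must lie in (0, 1)")
    power = 2.0 * hurst
    return 0.5 * (abs(s) ** power + abs(t) ** power - abs(t - s) ** power)


def increment_covariance(hurst: float) -> float:
    """Covariance between the increments over [0, 1] and [1, 2]."""
    return fbm_covariance(1.0, 2.0, hurst) - fbm_covariance(1.0, 1.0, hurst)
```

For h < 1/2 the value of `increment_covariance(h)` is negative: consecutive increments are negatively correlated, which is one way to read the statement that the squarefree numbers are "less random" than the primes.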
1.4. Conventions and Notations. Throughout the rest of the paper we will allow the implicit constants in ≪ and O(·) to depend on ε. Furthermore the notation n ∼ N in the subscript of a sum will mean that N ≤ n < 2N.

Proofs of Theorems 1 and 2
We will show in this section how Theorems 1 and 2 follow from a number of technical propositions that are proven in Sections 4-7.
The proof of Theorem 1 splits into two steps and depends on the identity and the following two propositions.
Under the assumption of the Lindelöf Hypothesis, the above propositions cover all the possible values of d^2 for X^ε ≤ H ≤ X^{2/3−ε}. However, unconditionally they cover all the possible values of d^2 only for X^ε ≤ H ≤ X^{6/(11+12ε)}. It would be possible to improve on the exponent 4/3 in Proposition 2, but this would not help. Similarly it should be possible to prove Proposition 1 only with the condition H^{1+ε} ≤ z ≤ X/H^{1/2+ε} by adapting the proof of Proposition 3 below.
We note that only the terms d with d^2 ∈ [H^{1−ε}, H^{1+ε}] contribute to the main term C√H in Proposition 1. Roughly speaking Proposition 1 depends only on "convex" inputs such as a Fourier expansion and a point-counting lemma, whereas Proposition 2 exploits large value estimates of Huxley and subconvexity and fourth moment estimates for the Riemann zeta-function.
Proof of Theorem 1 assuming Proposition 1 and Proposition 2. Let ε ∈ (0, 1/100). If H ≤ X^ε then the result already follows from Hall's theorem. We can therefore assume that H > X^ε.
For H ∈ [X^ε, X^{6/11−ε}], take z = min{X/H^{1/2+ε}, H^{1/2−ε} X^{1/2}}. Note that z ≥ H^{4/3+ε}. Denoting by I_1 the left-hand side of (11) and by I_2 the left-hand side of (12), we get, using Cauchy-Schwarz, that Using the bounds in (11) and (12), we conclude that Notice that the tail Applying Cauchy-Schwarz and (13) we see that the left hand side is
Likewise the proof of Theorem 2 splits into two steps and depends on the following propositions.
The proof of Proposition 3 depends once again only on "convex" inputs: in this case Poisson summation and results on integer solutions to binary quadratic forms with positive discriminant.However the proof of Proposition 3 is more intricate than that of Proposition 1 due to a number of technical issues.The proof of Proposition 4 is similar to the proof of Proposition 2 and uses hybrid versions of Huxley's large value estimates, subconvexity estimates for L(s, χ) and a hybrid fourth moment estimate.
The deduction of Theorem 2 from the above two propositions is identical to the deduction of Theorem 1 from Proposition 1 and Proposition 2. The only difference is that we use the result of Nunes to handle the case when q > x^{1−ε} and we notice that for prime q, and the total error term incurred is x/q^2 which is ≤ x^{−ε} x/q for q > x^{1/3+ε}.
Theorem 3 depends upon similar principles as Propositions 2 and 4. We prove Theorem 3 in Section 8.
Finally let us make a few remarks on the bottleneck that prevents us from pushing our result further. Taking H = X^{6/11}, we are unable to show the following estimate, Specifically opening µ(d) using Heath-Brown's identity (see [HB82]) the only situation that we are not able to estimate is the one in which µ(d) is replaced by two smooth sums of equal length. Roughly speaking this corresponds to estimating, Opening the above expression into Dirichlet polynomials this is roughly equivalent to Applying the functional equation on the Dirichlet polynomial over n, and setting Y = X^{12/11} we then see that obtaining the above estimate is equivalent to showing that, If the 2it in the Dirichlet polynomial over a and b were replaced by it then we would be facing exactly the same bottleneck as in the case of improving Huxley's prime number theorem in short intervals (by a variant of the computations in [HB82], see also [Har07, Chapter 7]). In particular to make further progress we either need to find a way to improve Huxley's estimate or find a way to exploit the fact that the phases in two of the Dirichlet polynomials are 2it and not it. Unfortunately we do not see how to make progress on either of these questions.

Lemmas
3.1.Dirichlet polynomials and L-functions.Let us first collect some standard results on large values of Dirichlet polynomials and L-functions.
Proof. This follows from the mean-value theorem and Huxley's large value theorem, see e.g. [IK04, Theorem 9.7 and Corollary 9.9].
We will say that a set of tuples (t, χ) with χ a Dirichlet character and t a real number is well-spaced whenever it holds that if (t, χ) ≠ (u, χ′) then either χ ≠ χ′ or |t − u| ≥ 1.
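The well-spacing condition is easy to check mechanically, and the following sketch may help fix the definition. It is illustrative only: the function name `is_well_spaced` is ours, and we assume that each character is represented by some hashable label, so that equality of labels stands in for equality of characters.

```python
from typing import Hashable, Iterable, Tuple


def is_well_spaced(points: Iterable[Tuple[float, Hashable]]) -> bool:
    """Check that distinct tuples (t, chi) sharing the same chi have |t - u| >= 1."""
    distinct = list(set(points))  # the condition only concerns distinct tuples
    for i, (t, chi) in enumerate(distinct):
        for u, psi in distinct[i + 1:]:
            if chi == psi and abs(t - u) < 1:
                return False
    return True
```

For instance the tuples (0, χ_1), (0.5, χ_2), (1, χ_1) form a well-spaced set, since the two tuples sharing χ_1 have ordinates at distance 1, whereas (0, χ_1), (0.5, χ_1) do not.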
Lemma 2 (Hybrid large-value theorem). Let q ∈ N, N, T ≥ 1 and V > 0. Let F(s, χ) = ∑_{n≤N} a_n χ(n) n^{−s} be a Dirichlet polynomial, and let G = ∑_{n≤N} |a_n|^2. Let 𝒯 be a set of well-spaced tuples (t_r, χ) with t_r ∈ [−T, T] and with χ a primitive character of modulus
Proof. This follows e.g. from [IK04, Theorems 9.16 and 9.18 with k = q and Q = 1].
Lemma 4 (Hybrid fourth moment estimate).Let T ≥ 2 and q ≥ 2. Then where the sum is over all characters modulo q.
Lemma 6 (Hybrid Weyl subconvexity). For cube-free q, primitive characters χ (mod q) and |t| ≥ 2,
Of course the Lindelöf and Generalized Lindelöf Hypotheses would give us respectively that for any ε > 0, for |t| ≥ 2 and any character χ (mod q),
Lemma 7 (Hybrid mean-value theorem). Let a(n) be an arbitrary sequence of coefficients and N, q ≥ 1 be integers and T ≥ 1 real. Then, for any given ε > 0,
Proof. We notice that given a character ψ (mod q) there are ≪ q^ε characters χ such that χ^2 = ψ. Therefore the left-hand side of the claim is bounded by and the result follows from the standard hybrid mean-value theorem, see e.g. [Mon71, Theorem 6.4].
Proof. The proof consists of two steps. The first step is to show that the sum in (17) can be completed into a sum over all d_1, d_2 without affecting the claimed asymptotic. We use the information (16) for k = 0 and ℓ = 0, 1 to see that, for any ν > 0, (18) Writing (d_1, d_2) = d_0 and d_i = δ_i d_0 and utilizing symmetry and the lower bound for z, we see that this is Thus the left-hand side of (17) is The second step is to use contour integration to simplify (19). Define g(x) = |W(e^x)|^2 e^x, which is smooth and decays exponentially as |x| → ∞. Now and standard partial integration arguments show that (i) ĝ(ξ) is entire and (ii) ĝ(ξ) = O(H^{2ε}/(|ξ| + 1)^3) uniformly for |ℑ(ξ)| < 1/(2π). Fourier inversion implies, for r > 0, where the integral is over ℜ(s) = c, and The range of s is such that the sums over both λ and d_1, d_2 can be taken inside the integral, and the above simplifies to
The Euler product in the last line converges absolutely for ℜs > −3/4.Therefore using Lemma 5 (noting that the same bound holds also for ζ(c + it) with c ≥ 1/2) and bounds on ĝ we can push the contour integral above to the left to an integral over ℜ(s) = −3/4 + ε.Because of the singularity from ζ(2 + 2s) at s = −1/2 the above then simplifies to This verifies the lemma.
We also have a minor variant:
Lemma 9. Fix ε ∈ (0, 1/100). Let S(x) = sin(πx)/(πx), defined by continuity at x = 0, and let z ≥ H^{1+ε}. Then
Proof. We first note that (21) This identity follows from [GR14, formula 3.823]. Thus (20) is a variant of (17). Lemma 8 does not apply directly because S does not decay quickly enough. To overcome this issue, we let h be a smooth bump function such that h(x) = 1 for |x| ≤ 1 and h(x) = 0 for |x| ≥ 2. We introduce the function W(y) = S(y)h(y/H^{ε/4}) which satisfies the hypothesis of Lemma 8 for our ε. On the other hand for such W We split the sum over d_1 and d_2 into the complementary ranges, and we see that the above is On the other hand, Combining this with the bound (22) and the identity (21) verifies (20) with error term of order
3.3. Initial reductions on second moments. The following lemma will be used in the proof of Proposition 2.
Lemma 10. If F : R → C is square-integrable and H ≤ X, then
Proof. The proof can be found in a paper by Saffari and Vaughan [SV77, Page 25] but for the convenience of the reader we include the proof here. First note that by the triangle inequality we have, for any v ≥ H, Integrating this over x ∈ [X, 2X] and v ∈ [2H, 3H], By a change of variables the right-hand side is equal to Changing the order of integration was justified by Fubini's theorem. Letting v = θx in the inner integral of the last expression above, we see the right-hand side is equal to Collecting everything and swapping the order of integration again, we obtain which immediately implies the claim.
We will frequently use the following immediate consequences of the orthogonality of characters: For any sequence b_n of complex numbers,
Proof. We can clearly assume that η^{1/2}(ab)^{1/4} ≤ M^{1/2} since otherwise the claim is trivial. Assume we have a (reduced) rational approximation r/q with r ∈ Z and Now, writing each m ∈ (M, 2M] as m = kq + ℓ with 0 ≤ ℓ ≤ q − 1, we see that the number of solutions to (25) with m ∼ M is at most Now since √(b/a) = √(ab)/a is a quadratic irrational, the partial denominators in its continued fraction expansion have size at most 2√(ab) (see for instance [RS92, p. 44]). In particular this means that for any given R ≥ 1, we can find q ∈ [R, 3√(ab)R] such that (26) holds for some r coprime to q. Taking R = M^{1/2}/(η^{1/2}(ab)^{1/4}) ≥ 1, we see that the number of solutions is indeed
Lemma 12. Let a, b ∈ N be such that √(b/a) is irrational, and let M_1, M_2, T ≥ 1. The number of solutions to
Proof. Dividing by b and factoring, we see that we need to count the number of solutions to Dividing by the second factor, we see that it suffices to count the number of solutions to If M_2 ≥ T, we have M_1 choices for m_1 and after that O(M_2/T) choices for m_2, so in total O(M_1 M_2/T) solutions which is fine.
If M_2 < T, then once m_1 is chosen there are at most two choices for m_2. Therefore it suffices to count the number of integers m_1 ∼ M_1 such that The result now follows from Lemma 11.
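The bound 2√(ab) on the partial denominators of √(ab)/a, used in the proof of Lemma 11, can be checked experimentally. The sketch below is our own illustration (the function name `partial_quotients` is hypothetical): it runs the classical recurrence for the continued fraction of a quadratic irrational (P + √D)/Q with Q dividing D − P^2, started from P = 0, Q = a and D = ab, and uses only integer arithmetic.

```python
from math import isqrt, sqrt


def partial_quotients(a: int, b: int, count: int) -> list[int]:
    """First `count` partial quotients of sqrt(a*b)/a, assumed irrational."""
    D = a * b
    root = isqrt(D)
    if root * root == D:
        raise ValueError("sqrt(a*b)/a is rational")
    # Complete quotients have the form (P + sqrt(D)) / Q with Q > 0 and Q | D - P^2.
    P, Q = 0, a
    quotients = []
    for _ in range(count):
        k = (P + root) // Q  # floor of (P + sqrt(D)) / Q, valid since Q > 0
        quotients.append(k)
        P = k * Q - P
        Q = (D - P * P) // Q
    return quotients
```

For example, with a = 2 and b = 3 the expansion of √6/2 begins 1, 4, 2, 4, 2, and every partial quotient is at most 2√6.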

The range
In what follows we let S be the sinc function as defined in Lemma 9. Proposition 1 follows immediately by combining the following proposition with Lemma 9.
Proof. We prove a smoothed version of the claim first. Let σ : R → R be an absolutely integrable function such that the Fourier transform σ̂ is supported in the interval [−BH^{ε/2}, BH^{ε/2}] for some constant B to be specified later. We first show that as X → ∞, Here (28) We take N = X^{10} and plug (29) into (28). The arising error term is O(1/X^5) unless ‖x/d^2‖ < X^{−5} or ‖(x + H)/d^2‖ < X^{−5}, where ‖·‖ denotes the distance to the nearest integer. Given this, it is easy to see that the error term leads to acceptable contribution to the left hand side (27). Hence, the left hand side of (27) can be replaced by Expanding, this equals (30) Owing to the support of σ̂ this implies that we may restrict the sum in (30) to those integers for which We consider separately those (n_1, n_2, d_1, d_2) for which n_1 d_2^2 = n_2 d_1^2 and those for which this does not hold. In the first case parameterizing solutions in n_1 and n_2 by where the error term comes from adding |n_i| > N (for which surely |λ| > X^8). Here , so we get the desired main term involving S(λH/(d_1^2, d_2^2)). Therefore it remains to show that the contribution of terms with n_1 d_2^2 ≠ n_2 d_1^2 is negligible. Splitting n_j and d_j dyadically, we need to bound, for any Notice that there are no solutions unless (34) We split into two cases according to whether √(n_2/n_1) is quadratic irrational or instead rational. In the first case we can apply Lemma 12, which shows that the number of solutions (33) is By (34) we can multiply the first term by (D_1/D_2)(N_2/N_1)^{1/2} and the third term by X^{1/2}. Using this bound in (32), and summing over n_1 and n_2, we note that the maximum is attained for N_j = D_j^2/H and thus the contribution to (32) from √(n_2/n_1) quadratic irrational is bounded by In case √(n_2/n_1) is rational, there exist m, ℓ_1, ℓ_2 ∈ Z such that n_1 = mℓ_1^2 and n_2 = mℓ_2^2. Hence, writing r_1^2 = ℓ_1^2 d_2^2 and r_2^2 = ℓ_2^2 d_1^2, we see that the contribution to (32) for √(n_2/n_1) rational is bounded by Factoring r_1^2 − r_2^2 = (r_1 − r_2)(r_1 + r_2) and dividing by the second factor, we see that the number of solutions (r_1, r_2) is Summing over m ≪ min{N_1, N_2} and using this bound in (35), the maximum in the resulting bound for (35) is attained for N_j = D_j^2/H. Hence we obtain that (35) is at most H^{2+ε/2+ε/500}/X ≤ H^{1/2−ε/2} since H ≤ X^{2/3−ε}.
Let us now dispose of the smoothing σ: Take B to be a sufficiently large absolute constant so that there exist integrable functions σ_− and σ_+ such that the Fourier transforms of σ_− and σ_+ have support [−BH^{ε/2}, BH^{ε/2}], and (We allow σ_− and σ_+ to take negative values.) An explicit construction of such functions is given by the Beurling-Selberg majorant and minorant [Mon01, p. 273]. Applying (27) and these bounds,
5. The range z ≥ H^{4/3+ε} in the t-aspect: Proof of Proposition 2
We would like to establish that Splitting into dyadic ranges according to the size of d, it essentially suffices to show that, for each D ∈ [z^{1/2}, (2X)^{1/2}], we have Using this definition and Lemma 10, we see that the left-hand side of (36) is . Choose w such that e^w = 1 + θ, so that w ≍ H/X. By contour integration (38) Moving the contour to the line ℜs = 1/2 we notice that the residue from s = 1 cancels with the second term on the right-hand side of (38), and we obtain Combining (37) and (39) we get after a change of variable, On the other hand, the contribution of |t| ≤ X^2 to the right-hand side of (40) is at most Let us now prove the claim on the assumption of the Lindelöf Hypothesis. Applying Lindelöf and then the mean-value theorem (Lemma 7 with q = 1), we have for any choice of δ > 0, for δ sufficiently small. Applying this bound to (41) yields the claim.
Let us now prove the unconditional part of the proposition. First notice that the values of t for which |M(1 + 2it)| ≤ D^{−1/2+ε/16} contribute to (41) by Cauchy-Schwarz and the fourth moment bound (Lemma 3) O(H^{1+ε/16} D^{−1+ε/8}) = O(H^{1/2−ε/4}), and therefore their contribution is always acceptable. Writing Consider first the case when the first term dominates here. Then by Lemma 5 we have Consider now the case that the second term dominates in (42). Then, by Cauchy-Schwarz and the fourth moment estimate (Lemma 3), since z ≥ H^{4/3+ε}. This finishes the proof of Proposition 2.
The proof of Proposition 6 is based on two propositions that we now describe. Proposition 7 below will be used to introduce a smoothing into (43). Note that it gives an upper bound that is o(√(qx)) whenever z = o(√(qx)/(log x)^6) and the interval I has length o(x/(log x)^{12}).
Proposition 7. Let q be prime with q ≤ x, let z ≤ x. Let I ⊂ [1, 2x] be an interval. Then
We will use the following proposition to evaluate the smoothed analogue of (43).
One way to construct f satisfying the assumptions of the proposition is to take φ(t) to be a smooth function which vanishes for negative t and has φ(t) = 1 for t greater than 1, and then set f(u) = φ((x/q)^{ε/4} u) φ((x/q)^{ε/4} (1 − u)).
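The construction just described can be made explicit. The sketch below is one possible choice and is not claimed to be the one used in the proofs: it takes φ(t) = g(t)/(g(t) + g(1 − t)) with g(t) = exp(−1/t) for t > 0 and g(t) = 0 otherwise, a standard smooth step function, and the parameter `scale` plays the role of (x/q)^{ε/4}. The function names are ours.

```python
from math import exp


def _g(t: float) -> float:
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return exp(-1.0 / t) if t > 0 else 0.0


def phi(t: float) -> float:
    """Smooth step function: 0 for t <= 0, 1 for t >= 1, strictly between in (0, 1)."""
    # The denominator never vanishes because t > 0 or 1 - t > 0 always holds.
    return _g(t) / (_g(t) + _g(1.0 - t))


def smooth_indicator(u: float, scale: float) -> float:
    """f(u) = phi(scale * u) * phi(scale * (1 - u)), a smoothed indicator of [0, 1]."""
    return phi(scale * u) * phi(scale * (1.0 - u))
```

When `scale` is at least 2 the function `smooth_indicator` equals 1 on [1/scale, 1 − 1/scale], vanishes outside (0, 1), and has transition regions of width 1/scale at both ends of the interval.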
With these two propositions at hand we are ready to prove Proposition 6.
6.1. Proof of Proposition 6. For m ∈ N, set and let f be as described below Proposition 8. Then where By partial summation, Now by Proposition 7 we have, for Therefore (45) is as well. By (a + b)^2 ≪ |a|^2 + |b|^2 we conclude that S_2 ≪ (log x)^6 (x/q)^{−ε/8} √(qx) as needed. On the other hand we can compute S_1 by using Proposition 8 and this yields the claimed estimate.
6.2. Proof of Proposition 7. By Pólya's formula (see [MV77, Lemma 1]) for I = [a, b] and any χ ≠ χ_0 of modulus q, where We split d and n into dyadic intervals and bound the left-hand side of (44) by The error term is clearly acceptable. We bound the main term of (46) using a majorant principle: by going through the first equality in (24) we can replace coefficients µ(d) and f_{I/d^2}(n) by their majorants. Hence we get the bound The contribution of the principal character and quadratic character is ≪ z(log x)^4 which is acceptable. On the remaining non-principal and non-quadratic characters we apply Cauchy-Schwarz giving the upper bound We claim that (48) We explain the second bound in (48); the first bound is similar. Let Ṽ be the Mellin transform of V. Using contour integration, the decay of Ṽ, and Hölder, we get, for every A ≥ 1, A dyadic decomposition of the integration range and the fourth moment bound for Dirichlet L-functions (Lemma 4) yield the second part of (48). Using (48) in (47), we obtain an upper bound Recalling the definition of g_{I/D^2}(N), we see that on the last line, the first N-supremum is attained for N = 1 and the second N-supremum is attained for N = D^2 q/|I|, and we get the bound Therefore we have to asymptotically estimate and where O(zx^{2ε/3}) comes from the principal character and from replacing ϕ(q) by q. We note that since z ≤ x^{−ε}√(qx) this contribution is acceptable. Notice that we can add and remove the restrictions at will because they cost us a negligible error term that is ≪_A x^{−A} for any given A > 0. Moreover note that n_1 and n_2 now traverse all of Z. We now separate the set of tuples (n_1, n_2) into and the complement. The (n_1, n_2) ∈ M contribute to a main term that is relatively easy to compute. On the other hand we will bound the contribution of (n_1, n_2) ∉ M.
We now parametrize the equation d_1 k_1 = d_2 k_2 by dividing by (d_1, d_2) on both sides so that and with ℓ ∈ N.
Plugging this and noticing that each positive integer can be written uniquely as ℓ^2 m with m squarefree, we can re-write (51) as Note that we can drop the condition (d_1 d_2, q) = 1 as q is prime and d_1, d_2 < q.
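The unique decomposition of a positive integer as ℓ^2 m with m squarefree, used in the step above, can be computed by stripping square factors one at a time. The following sketch is illustrative only and the function name `square_times_squarefree` is ours.

```python
def square_times_squarefree(n: int) -> tuple[int, int]:
    """Return (ell, m) with n == ell * ell * m and m squarefree."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    ell, m = 1, n
    d = 2
    while d * d <= m:
        # Remove every factor d^2; for composite d none remain, since the
        # squares of the prime factors of d were removed at earlier steps.
        while m % (d * d) == 0:
            m //= d * d
            ell *= d
        d += 1
    return ell, m
```

For example 72 = 6^2 · 2, so the function returns (6, 2) on input 72, and it returns (1, n) exactly when n itself is squarefree.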
Likewise since ℓ ≥ q contribute O_A(x^{−A}), we can drop the condition (ℓ, q) = 1 and apply Lemma 8 with W = f and H = x/q to see that the above is with the bound (x/q)^{−ε/4} for the difference between these two Fourier transforms following from the fact that ‖f − F‖_{L^1} ≪ (x/q)^{−ε/4}, and the bound 1/|y| following from the fact that the total variation of the function f − F is bounded by an absolute constant. Likewise Putting these estimates together, and using the relation |F̂(ξ)| = |S(ξ)| and the integral identity (21), we see that (51) is
6.3.2. The off-diagonal (n_1, n_2) ∉ M. Let us focus on bounding the contribution of (n_1, n_2) ∉ M. We recall that the contribution of d_1 ≤ x^{1/2−ε/6}/q^{1/2} to (49) is negligible and likewise the contribution of d_2 ≤ x^{1/2−ε/6}/q^{1/2} is negligible. We now partition to (49) is bounded by with V a smooth non-negative compactly supported function such that V(x) ≥ 1 for x ∈ [−2, 2] and D_1, D_2 > x^{1/2−ε/6}/q^{1/2} and N_1 ≤ x^{ε/3} D_2^2 q/x and N_2 ≤ x^{ε/3} D_1^2 q/x. We now split into two cases according to the size of D_1 D_2:
6.3.3. Case D_1 D_2 ≥ x^{1+2ε}/q. In this case we do not use the condition (n_1, n_2) ∉ M. Dropping this condition and using Dirichlet characters we can re-write (52) as Now we express each of the sums in (53) using a contour integral, and using Hölder's inequality this allows us to bound (53) by By the fourth moment bound (Lemma 4) the first term is
6.3.4. Case D_1 D_2 < x^{1+2ε}/q. In this case we notice that since + qℓ with 0 < |ℓ| ≪ x^{1+13ε}/q^2. We now fix n_1, n_2, ℓ; there are ≪ x^{1+27ε}/q^2 possible choices. We shall show that the number of solutions in First of all note that we can assume that (n_1, n_2, qℓ) = 1. Indeed, q cannot divide n_1 n_2 as n_1 n_2 = o(q), and so letting g = (n_1, n_2, qℓ) we have g | ℓ and the problem reduces to one where (n_1 is a primitive binary quadratic form with discriminant d = 4n_1 n_2 > 0.
Denote by ε_{n_1 n_2} the real number x_0/2 + y_0√(n_1 n_2) where (x_0, y_0) is the solution in positive integers to the equation x_0^2 − 4n_1 n_2 y_0^2 = 4 for which x_0 + y_0√d is least. Note that ε_{n_1 n_2} ≥ 3/2. Let (x_1, y_1) be a solution to f(x_1, y_1) = qℓ with x_1, y_1 ≪ (x/q)^{1/2} x^{3ε}. We notice that in this situation (x_1, y_1) ∈ The reason for this is that Moreover by Lemma 13 of [MW02] we have #T_m^+ = #T_1^+ for all m ≥ 1, and trivially #T_m^− = #T_m^+ for all m ≥ 1. The solutions belonging to T_1^+ are primary for the quadratic form n_1 x_1^2 − n_2 x_2^2 of discriminant 4n_1 n_2 (see p. 101 of [SW06] for the definition of primary). By Theorem 4.1 of [SW06] the number of (x_1, y_1) for which there exists a quadratic form g of discriminant 4n_1 n_2 such that g(x_1, y_1) = qℓ and such that (x_1, y_1) is primary for g, is either 0 or given by for particular integers m and d_0 with m^2 | (qℓ, 4n_1 n_2). Using the divisor bound #{k : k | n} ≪_ε n^{ε/100}, we find that this is We conclude therefore that #T_1^+ ≪ x^{8ε} and therefore the number of solutions (x_1, y_1) with |x_1|, |y_1| ≪ (x/q)^{1/2} x^{3ε} to the equation f(x_1, y_1) = qℓ is bounded by ≪ log x · #T_1^+ ≪ x^{9ε} as claimed. It follows therefore that the total number of solutions to
7. The range z ≥ (x/q)^{4/3+ε} in the q-aspect: Proof of Proposition 4
Splitting into dyadic segments and recalling (23), we can bound the left-hand side of (15) by a constant times log x sup Expressing the condition nd^2 ≤ x using a contour integral (see [MV07, Cor. 5.3]) the above is bounded by (in fact a better error term can be obtained but we do not need to keep track of it) where Applying Cauchy-Schwarz and splitting according to the values of t we can bound the main term above as Let us now prove the claim on the assumption of the Generalized Lindelöf Hypothesis. Applying Generalized Lindelöf and then the hybrid mean-value theorem (Lemma 7) we have for any choice of δ > 0, Since q, T ≤ x ≤ (x/q)^{O(1)}, for sufficiently small δ we have T^δ q^{2δ} ≤ (x/q)^{ε/100}. Recalling also that D ≥ z^{1/2} ≥ (x/q)^{(1+ε)/2} and q ≥ x^{1/3+ε}, we see that (55) is Applying this estimate to (54) yields the claim.

Conditional estimates: Proof of Theorem 3
The proof of Theorem 3 splits into two parts since two assertions are made.