1 Introduction

The sequence of numbers that are the sum of two positive cubes of integers, with or without multiplicities, has been studied from various perspectives. Hooley [5, 7] showed that high multiplicities are rare, and his theme was taken up by Wooley [14] and Heath-Brown [4]. The distribution of the gaps between numbers that are sums of two cubes is the subject of recent work by Brüdern and Wooley [1]. Here we discuss the distribution in arithmetic progressions to provide an asymptotic formula for a certain variance that is analogous to the theorems of Montgomery [9] and Hooley [6] in the distribution of primes.

We require some notation to formulate our results. Let r(n) denote the number of solutions of \(x^3+y^3=n\) in natural numbers x, y, and let \(\rho (q,a)\) denote the number of incongruent solutions of the congruence \(x^3+y^3\equiv a \,(\mathrm{mod}\, q)\). Finally, let \(C= \Gamma \bigl (\frac{4}{3}\bigr )^2/ \Gamma \bigl (\frac{5}{3}\bigr )\) denote the area of the domain \(\{(x,y)\in {\mathbb {R}}^2 : x\geqslant 0,\, y\geqslant 0,\, x^3+y^3\leqslant 1\}\).

For \(N\geqslant 1\), \(Q\geqslant 1\) we are interested in establishing an asymptotic formula for

(1.1)

The average spacing between sums of two cubes suggests that such a formula should be available when Q is not too far from .
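Before proceeding, the notation may be made concrete by a short numerical sketch. The following is illustrative only and not part of the formal development; it computes r(n) by brute force and checks that the stated value of C agrees with the area of the region \(x,y\geqslant 0\), \(x^3+y^3\leqslant 1\), the natural candidate for the domain in question.

```python
from math import gamma

def r(n):
    """r(n): ordered pairs (x, y) of positive integers with x**3 + y**3 == n."""
    count, x = 0, 1
    while x**3 < n:
        y = round((n - x**3) ** (1 / 3))
        # guard against floating-point error in the cube root
        for cand in (y - 1, y, y + 1):
            if cand >= 1 and cand**3 == n - x**3:
                count += 1
        x += 1
    return count

def r0(n):
    return 1 if r(n) >= 1 else 0

C = gamma(4 / 3)**2 / gamma(5 / 3)
# Left Riemann sum for the area of {x, y >= 0 : x**3 + y**3 <= 1};
# the integrand is decreasing, so the error is at most 1/M.
M = 100000
area = sum((1 - (k / M)**3) ** (1 / 3) for k in range(M)) / M
```

For instance, \(r(2)=1\) (the solution \(1^3+1^3\)), \(r(9)=2\), and \(r(1729)=4\), since 1729 has the two essentially different representations \(1^3+12^3\) and \(9^3+10^3\).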

Theorem 1.1

Suppose that \(Q\leqslant N\). Then

In stating this theorem, and elsewhere in this memoir, we adopt the convention that whenever the letter \(\varepsilon \) occurs in a statement, the statement is true for any positive real number assigned to \(\varepsilon \). Constants implicit in the familiar symbols of Landau and Vinogradov depend on the particular choice of \(\varepsilon \).

A result similar to Theorem 1.1 is available when sums of two cubes are counted without multiplicity. Thus, we define \(r_0(n)=1\) whenever \(r(n)\geqslant 1\) and let \(r_0(n)=0\) otherwise. As a consequence of the work of Hooley, we know that for a typical natural number n one has \(r(n)=2r_0(n)\). It is then natural to examine

(1.2)

Theorem 1.2

Suppose that \(Q\leqslant N\). Then

When studied in its own right, the ordering of sums of two cubes by size is certainly the most natural one. However, in the additive theory of numbers variables typically range over intervals independently, and it is then desirable to have at hand variance estimates that cover such situations. The following variant of Theorem 1.1 is directly applicable to the treatment of major arcs for additive problems involving cubes. Note that the range for Q in which a valid asymptotic formula is obtained is larger than in Theorem 1.1. Let

Theorem 1.3

Suppose that \(Q\leqslant N\). Then

There is a rich literature on analogues of the Montgomery–Hooley formula for arithmetical functions. For example, the k-free numbers and other sifted sequences have been studied from this perspective (most recently Vaughan [13]) as well as other sequences that are reasonably well-behaved in arithmetic progressions (e.g. Hooley [8], Vaughan [12]). A common feature of existing work seems to be that the underlying sequence is fairly dense. In our situation, however, the expectation of r(n) grows like \(N^{2/3}\). We are not aware of other examples where a Montgomery–Hooley theorem is known and the growth rate for the expectation is as small as \(N^\theta \) with some \(\theta <1\). Our method requires, at the very least, that the expectation grows faster than \(\sqrt{N}\). In fact, as we shall demonstrate in a sequel to this paper, it is possible to establish results of the type considered here for the sequence of numbers that are representable as the sum of a square and a positive k-th power. Yet, a Montgomery–Hooley theorem for sums of two biquadrates seems to require new ideas.

The core argument of this memoir is directed at an imperfect version of V where we withdraw from r(n) the Hardy–Littlewood expectation. To define the latter, let \(e(\theta )=\mathrm {e}^{2\pi i\theta }\), write

$$\begin{aligned} S(q,a)=\sum _{x=1}^q e(ax^3/q) \end{aligned}$$

and, whenever \(T\geqslant 1\), put

(1.3)

We consider

(1.4)

Theorem 1.4

Suppose that \(1\leqslant T\leqslant N^{3/5}\) and \(Q\geqslant 1\). Then

In Sect. 3, we establish Theorem 1.4 by the Hardy–Littlewood method. This approach to Montgomery–Hooley formulæ  originates in the work of Goldston and Vaughan [3]. Our application prominently features the exponential sum

(1.5)

that is less common than the cognate Weyl sum

$$\begin{aligned} f(\alpha ,X) =\sum _{x\leqslant X} e(\alpha x^3). \end{aligned}$$
(1.6)

A discussion of these exponential sums as well as some other auxiliary estimates can be found in Sect. 3. In Sect. 4 we deduce Theorem 1.1 from Theorem 1.4. In Sect. 5 we briefly describe alterations required to prove Theorem 1.3. Finally, in Sect. 6 we deduce Theorem 1.2 from Theorem 1.1.

2 Lemmata

We begin with some results concerning the function r(n).

Lemma 2.1

One has

$$\begin{aligned} r(n) \ll n^\varepsilon \end{aligned}$$

and the number \({\mathscr {N}}(N)\) of natural numbers n not exceeding N where \(r(n)\ne 2r_0(n)\) satisfies

$$\begin{aligned} {\mathscr {N}}(N) \ll N^{4/9+\varepsilon }. \end{aligned}$$
(2.1)

Further,

Proof

The first claim is a trivial consequence of a familiar bound for the divisor function. It follows from Theorem 1 of Heath-Brown [4] that there are no more than \(N^{4/9+\varepsilon }\) natural numbers \(n\leqslant N\) with \(r(n)>2\). Moreover, \(r(n)=1\) gives \(n=2m^3\) for some \(m\in {\mathbb {N}}\), and there are no more than \(N^{1/3}\) such n. This confirms the second claim. The linear average of r(n) is an instance of the Lipschitz principle (Davenport [2]), and the formula for the quadratic average follows from the assertions of the lemma that we have already confirmed.\(\square \)
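The counts entering this lemma are easy to probe numerically. The following sketch (illustrative only; the tolerances are generous and carry no implied constants) tabulates all representations up to \(N=10^6\), compares the total with \(CN^{2/3}\), and confirms that the n with \(r(n)\ne 2r_0(n)\) are sparse.

```python
from collections import Counter
from math import gamma

N = 10**6
C = gamma(4 / 3)**2 / gamma(5 / 3)

# Tabulate r(n) for all n <= N by listing the pairs (x, y) directly.
reps = Counter()
x = 1
while x**3 + 1 <= N:
    y = 1
    while x**3 + y**3 <= N:
        reps[x**3 + y**3] += 1
        y += 1
    x += 1

total = sum(reps.values())  # sum of r(n) over n <= N, about C * N^(2/3)
# n <= N with r(n) != 2*r0(n): either r(n) = 1 (so n = 2m^3) or r(n) > 2
exceptional = sum(1 for c in reps.values() if c != 2)
```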

Our next lemma is the dual of the large sieve, and is a simple application of Theorem 1 of Montgomery and Vaughan [10].

Lemma 2.2

Suppose that \(M\in {\mathbb {Z}}\), \(N\in {\mathbb {N}}\) and the \(\eta (t,c)\) are arbitrary complex numbers. Then

Let \(\kappa \) be the multiplicative function defined for \(\nu \geqslant 0\) and primes p by

$$\begin{aligned} \kappa (p^{3\nu })= p^{-\nu } , \quad \kappa (p^{3\nu +1})= 2p^{-\nu -1/2} , \quad \kappa (p^{3\nu +2}) = p^{-\nu -1}. \end{aligned}$$

One routinely confirms the estimates

$$\begin{aligned} \sum _{t\leqslant T} \kappa (t)^2 \ll T^\varepsilon , \quad \sum _{t\leqslant T} \,t\kappa (t)^4 \ll T^\varepsilon . \end{aligned}$$
(2.2)

Lemma 2.3

Whenever \((c,t)=1\) we have

$$\begin{aligned} t^{-1}S(t,c)\ll \kappa (t) \quad \text {and}\quad t^{-1}S(t,c) \ll t^{-1/3} . \end{aligned}$$

Proof

This follows from Lemmas 4.2, 4.3, 4.4 and Theorem 4.2 of Vaughan [11]. \(\square \)
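The bounds of Lemma 2.3 can be probed numerically. The sketch below (an illustration, with S(q, a) implemented directly from its definition) confirms that S(p, a) vanishes at primes \(p\not \equiv 1\,(\mathrm{mod}\,3)\), where cubing permutes the residue classes, that the Weil bound \(|S(p,a)|\leqslant 2\sqrt{p}\) holds at primes \(p\equiv 1\,(\mathrm{mod}\,3)\), and that \(|S(p^2,a)|=p\) for \(p\ne 3\), so that the bound \(t^{-1}S(t,c)\ll \kappa (t)\) is sharp at \(t=p^2\), where \(\kappa (p^2)=p^{-1}\).

```python
from cmath import exp
from math import pi, sqrt

def S(q, a):
    """The complete cubic exponential sum S(q, a) = sum_{x=1}^{q} e(a x^3 / q)."""
    return sum(exp(2j * pi * a * x**3 / q) for x in range(1, q + 1))

# For p = 3 and for primes p = 2 (mod 3), cubing is a bijection on the
# residues mod p, so S(p, a) = 0 whenever (a, p) = 1.
vanishing = max(abs(S(p, a)) for p in (2, 3, 5, 11, 17, 23) for a in range(1, p))

# For primes p = 1 (mod 3) one has the Weil bound |S(p, a)| <= 2 sqrt(p).
weil_ok = all(abs(S(p, a)) <= 2 * sqrt(p) + 1e-9
              for p in (7, 13, 19, 31) for a in range(1, p))

# |S(p^2, 1)| = p for p != 3, matching kappa(p^2) = 1/p exactly.
square_values = [abs(S(p * p, 1)) for p in (2, 5, 7, 11)]
```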

Let

(2.3)

Then, by Lemma 2.8 of Vaughan [11], whenever \(|\beta |\leqslant 1/2\), we have

(2.4)

We note that if \(j=0\) then this bound holds for all real numbers \(\beta \).

It is convenient to introduce the notation

The next lemma is Theorem 4.1 of Vaughan [11].

Lemma 2.4

Suppose \((t,c)=1\). Then, for \(j=0\) and 1 one has

$$\begin{aligned} f(\alpha , X) - V_j(\alpha ;t,c) \ll t^{1/2+\varepsilon } (1+X^3|\alpha -c/t|)^{1/2} . \end{aligned}$$

We are ready to investigate the exponential sum \(g(\alpha )\), and begin by noting that by orthogonality and Lemma 2.1, one has

$$\begin{aligned} \int _0^1\! |g(\alpha )|^2\,{\mathrm {d}} \alpha \ll N^{2/3} . \end{aligned}$$
(2.5)

Next, we turn to an analogue of Lemma 2.4 for the exponential sum \(g(\alpha )\). Our result is relatively weak but suffices for our immediate needs. Improvements in the q-aspect would allow wider ranges for Q in Theorems 1.1 and 1.2. We define the sum

$$\begin{aligned} w(\beta ) = \frac{2}{3}\,C\! \sum _{n\leqslant N} n^{-1/3}\, e(n\beta ) \end{aligned}$$

and observe that the proof of Lemma 2.8 of Vaughan [11] provides the estimate

$$\begin{aligned} w(\beta ) \ll N^{2/3} (1+ N \Vert \beta \Vert )^{-2/3} , \end{aligned}$$
(2.6)

where \(\Vert \beta \Vert \) denotes the distance of the real number \(\beta \) to the nearest integer. Further, we write

$$\begin{aligned} W(\alpha ; q,a) = q^{-2} S(q,a)^2\, w(\alpha - a/q). \end{aligned}$$
(2.7)

Lemma 2.5

Let \(\alpha \in {\mathbb {R}}\), \(q\in {\mathbb {N}}\) and \(a\in {\mathbb {Z}}\). Then

Proof

By (1.5) we have

We write \(\beta = \alpha -a/q\) and apply Lemma 2.4 to the sum over y to conclude that

By (2.3),

$$\begin{aligned} v_0\bigl (\beta , (N-x^3)^{1/3}\bigr ) = \frac{1}{3} \int _0^{N-x^3}\!\! t^{-2/3} \,e(\beta t)\,{\mathrm {d}} t, \end{aligned}$$

whence, on exchanging sum and integral, we infer

We apply Lemma 2.4 again, this time to the sum over x. This yields

and we find that

where

Next, the substitution \(t=us\) in the inner integral produces

where

is a special value of Euler’s Beta function. In particular, we see that \(B\bigl ({\textstyle \frac{1}{3}}, {\textstyle \frac{1}{3}}\bigr )=6C\). Further, by Euler’s summation formula,

On collecting together, the lemma now follows.\(\square \)
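For completeness, we note that the evaluation \(B\bigl (\frac{1}{3},\frac{1}{3}\bigr )=6C\) used above is a direct consequence of the functional equation \(\Gamma (s+1)=s\Gamma (s)\): since \(\Gamma \bigl (\frac{1}{3}\bigr )=3\,\Gamma \bigl (\frac{4}{3}\bigr )\) and \(\Gamma \bigl (\frac{2}{3}\bigr )=\frac{3}{2}\,\Gamma \bigl (\frac{5}{3}\bigr )\), one has

$$\begin{aligned} B\bigl ({\textstyle \frac{1}{3}}, {\textstyle \frac{1}{3}}\bigr ) = \frac{\Gamma \bigl (\frac{1}{3}\bigr )^2}{\Gamma \bigl (\frac{2}{3}\bigr )} = \frac{9\,\Gamma \bigl (\frac{4}{3}\bigr )^2}{\frac{3}{2}\,\Gamma \bigl (\frac{5}{3}\bigr )} = 6\,\frac{\Gamma \bigl (\frac{4}{3}\bigr )^2}{\Gamma \bigl (\frac{5}{3}\bigr )} = 6C. \end{aligned}$$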

It is convenient to estimate the mean square of the exponential sum

(2.8)

Lemma 2.6

Let \(T\geqslant 1\) and \(N\geqslant 1\). Then

Proof

By orthogonality,

(2.9)

Now let \(1\leqslant M\leqslant N\). Then, by Lemmata 2.2 and 2.3,

The lemma follows immediately, on summing over \(M=2^\nu \leqslant N\).\(\square \)

Lemma 2.7

Let \(T\geqslant 1\). Then

Proof

We square out and recall the bound

that is available from Lemma 2.6 and (2.9). For the cross product term, Cauchy’s inequality together with Lemma 2.1 yields

and the lemma follows.\(\square \)

3 The proof of Theorem 1.4

We begin by squaring out (1.4). This yields

(3.1)

where

Now form the exponential sums

and

Then

For later use, we note that

$$\begin{aligned} G(\alpha ) = g(\alpha ) - \frac{2}{3}\,C h(\alpha ) \end{aligned}$$
(3.2)

so that

$$\begin{aligned} |G(\alpha )|^2 \ll |g(\alpha )|^2 + |h(\alpha )|^2. \end{aligned}$$

By (2.5) and Lemma 2.6, it follows that

(3.3)

We now follow Section 3 of Goldston and Vaughan [3]. By Dirichlet’s method of the hyperbola,

Given a number r, on the right-hand side we separate terms with \(r\,{\mid }\, q\) from terms where \(r\not \mid q\). Then

$$\begin{aligned} F(\alpha )=F_r(\alpha )+H_r(\alpha ) \end{aligned}$$

where

(3.4)

and \(H_r(\alpha )\) is the corresponding multiple sum with \(r\not \mid q\). On performing the inner summation in \(H_r(\alpha )\) we have

For given b and r with \((b,r)=1\) we put

$$\begin{aligned} \beta =\alpha -b/r. \end{aligned}$$

Then and so when

$$\begin{aligned} q\leqslant \sqrt{N},\quad r\not \mid q\quad \text {and}\quad |\beta |\leqslant \frac{1}{2}\,r^{-1}N^{-1/2} \end{aligned}$$

we have

and so

$$\begin{aligned} H_r(\alpha )\ll \bigl (\sqrt{N}+r\bigr )\log 2r\quad \text {when}\quad |\beta |\leqslant \frac{1}{2}\, r^{-1}N^{-{1/2}}. \end{aligned}$$

We suppose that R and T satisfy

$$\begin{aligned} 2N^{1/2}\leqslant R\leqslant N^{2/3},\quad 1\leqslant T\leqslant R, \end{aligned}$$
(3.5)

and consider a typical interval \({{\mathfrak {M}}}(r,b)\) associated with the element b/r of the Farey dissection of order R, namely, when \(1\leqslant b\leqslant r\leqslant R\) and \((b,r)=1\),

where \(r_\pm \) is defined by \(br_\pm \equiv \mp 1\,(\mathrm{mod}\, r)\) and \(R-r<r_\pm \leqslant R\) and \(b_\pm \) is defined by \(b_\pm =(br_\pm \pm 1)/r\). We observe that

$$\begin{aligned} \biggl | {\frac{b+b_\pm }{r+r_\pm }}-{\frac{b}{r}}\biggr | = {\frac{1}{r(r+r_\pm )}} \end{aligned}$$

lies in [1/(2rR), 1/(rR)).

We have

and

We note also that \(F_r(\alpha )=0\) when \(r>N^{1/2}\). Hence, by (3.3),

(3.6)

where

$$\begin{aligned} S_1=\sum _{r\leqslant \sqrt{N}}\sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r\int _{{\mathfrak M}(r,b)}F_r(\alpha )|G(\alpha )|^2\,{\mathrm {d}}\alpha . \end{aligned}$$

Recalling that \(\beta =\alpha -b/r\), we infer from (3.4) that

Hence

(3.7)

Suppose that \(r\leqslant \sqrt{N}\) and define the major arc \({{\mathfrak {N}}}(r,b)\) by

Then \({{\mathfrak {N}}}(r,b)\subset {{\mathfrak {M}}}(r,b)\) and for \(\alpha \in {{\mathfrak {M}}}(r,b)\backslash {{\mathfrak {N}}}(r,b)\) we have

$$\begin{aligned} F_r(\alpha )\ll R\log N. \end{aligned}$$

Moreover, the same conclusion holds when \(\alpha \in {{\mathfrak {N}}}(r,b)\) and \(r>N/R\). Thus, by (3.3) and (3.6),

(3.8)

where

$$\begin{aligned} S_2\,=\!\sum _{r\leqslant N/R}\sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r\int _{{{\mathfrak {N}}}(r,b)}F_r(\alpha )|G(\alpha )|^2\,{\mathrm {d}}\alpha . \end{aligned}$$
(3.9)

We now apply major arcs arguments to the function \(G(\alpha )\). By (2.8) and (1.3),

whence by (3.2) and (2.7),

(3.10)

We wish to use this within (3.9). Hence, suppose that \(\alpha \in {\mathfrak {N}}(r,b)\) where \(1\leqslant b\leqslant r\leqslant N/R\) and \((b,r)=1\). Then

$$\begin{aligned} G(\alpha ) = g(\alpha ) - W(\alpha ;r,b) + D(\alpha ;r,b) \end{aligned}$$
(3.11)

where in view of (3.10) we have

$$\begin{aligned} D(\alpha ;r,b) = W(\alpha ;r,b) - \sum _{t\leqslant T} \sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \!\! W(\alpha ;t,c). \end{aligned}$$
(3.12)

Here we estimate a typical summand when \(c/t \ne b/r\). If \(\alpha \in {\mathfrak {N}}(r,b)\), then \(|\alpha -b/r|\leqslant (2rR)^{-1}\), and therefore, by (3.5),

$$\begin{aligned} \biggl \Vert \alpha -\frac{c}{t}\biggr \Vert \geqslant \biggl \Vert \frac{b}{r}-\frac{c}{t}\biggr \Vert -\frac{1}{2rR} \geqslant \frac{1}{2}\, \biggl \Vert \frac{b}{r}-\frac{c}{t}\biggr \Vert . \end{aligned}$$

Hence, by Lemma 2.3, (2.6) and (2.7), we have the bounds

(3.13)

We wish to sum this over all c and t with \(c/t\ne b/r\). For a given t there is at most one c with \(1\leqslant c \leqslant t\), \((c,t)=1\) and

(3.14)

If such a c exists we denote it by \(c_0(t)\). The fractions c/t are spaced at least 1/t apart, modulo 1. Hence, by (3.13) and (3.14),

By (2.2) it follows that

$$\begin{aligned} \sum _{t\leqslant T}\sum _{\begin{array}{c} c=1\\ (c,t)=1\\ c\ne c_0(t) \end{array}}^t\!\! W(\alpha ;t,c) \ll \sum _{t\leqslant T} \,t\kappa (t)^2 \ll T^{1+\varepsilon }. \end{aligned}$$
(3.15)

Let \({\mathscr {T}}\) be the set of all t with \(1\leqslant t\leqslant T\) where \(c_0(t)\) exists and satisfies \(c_0(t)/t \ne b/r\). Then

We split into dyadic ranges. For \(1\leqslant Z\leqslant T\) let \({{\mathscr {T}}}(Z)={{\mathscr {T}}}\cap [Z,2Z)\). Then the fractions \(c_0(t)/t\) are spaced at least \((2Z)^{-2}\) apart as t varies over \({{\mathscr {T}}}(Z)\), and they stay at least \((2rZ)^{-1}\) away from b/r. By (3.13), we infer that

$$\begin{aligned} \sum _{t\in {{\mathscr {T}}}(Z)}\!\! W(\alpha ;t,c_0(t)) \,\ll \, Z^{-2/3} \!\sum _{0\leqslant l\leqslant Z} \biggl (\frac{1}{rZ} + \frac{l}{Z^2}\biggr )^{-2/3} \!\!\!\ll \, r^{2/3} + Z. \end{aligned}$$

We sum over \(Z=2^\nu \leqslant T\) and combine the result with (3.15). This yields

(3.16)

Now, inspecting (3.12), we see that if \(r\leqslant T\), then the sum on the left-hand side of (3.16) is \(-D(\alpha ;r,b)\) while in the case \(r>T\) the condition \(c/t\ne b/r\) is always satisfied. Consequently, by (3.16), whenever \(\alpha \in {\mathfrak {N}}(r,b)\) we have the estimates

(3.17)

and

(3.18)

Lemma 3.1

Suppose that (3.5) holds. Then

Proof

We begin with the set of all \(\alpha \) where holds. This set makes a contribution to the integral in question that in view of (3.7) does not exceed

By (3.17) and (3.18), on the remaining set we have \(r>T\) and \(D(\alpha ;r,b) \ll |W(\alpha ;r,b)|\). Then, by (3.14) and (3.13), together with (2.2), (2.6), (2.7) and Lemma 2.3 we see that

(3.19)

This confirms the estimate proposed in Lemma 3.1.\(\square \)

Lemma 3.2

Suppose that (3.5) holds. Then

$$\begin{aligned} \sum _{r\leqslant N/R} \sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r \int _{{{\mathfrak {N}}}(r,b)}|F_r(\alpha )W(\alpha ;r,b)| \,{\mathrm {d}}\alpha \ll N^{2/3+\varepsilon }. \end{aligned}$$

Proof

The reasoning leading to (3.19) applies here as well, and estimates the integral in question by

$$\begin{aligned}\ll N^{5/3+\varepsilon } \!\sum _{r\leqslant N/R}\! \kappa (r)^2 \int _0^1 (1+N|\beta |)^{-5/3}\,{\mathrm {d}}\beta \ll N^{2/3+\varepsilon }. \end{aligned}$$

\(\square \)

Lemma 3.3

Suppose that (3.5) holds. Then

$$\begin{aligned} \sum _{r\leqslant N/R} \sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r \int _{{{\mathfrak {N}}}(r,b)}|F_r(\alpha )g(\alpha )| \,{\mathrm {d}}\alpha \ll N^{4/3+\varepsilon }R^{-1}. \end{aligned}$$

Proof

Within this proof, we abbreviate \(f(\alpha , N^{1/3})\) to \(f(\alpha )\). The main observation is that in a suitable averaged sense one may replace \(|g(\alpha )|\) by \(|f(\alpha )|^2\). Let

$$\begin{aligned} K(\alpha ) = \sum _{n\leqslant N} e(-\alpha n). \end{aligned}$$

By orthogonality,

$$\begin{aligned} g(\alpha ) = \int _0^1\!\! f(\alpha +\gamma )^2 K(\gamma )\,{\mathrm {d}}\gamma . \end{aligned}$$
(3.20)
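Identity (3.20) is exact and can be confirmed numerically for small N. In the sketch below (illustrative only) we take g to be the generating function \(\sum _{n\leqslant N} r(n)\,e(n\alpha )\), which is what the right-hand side produces; since the integrand is a trigonometric polynomial all of whose frequencies have modulus below 2N, the integral over [0, 1] equals the average over 2N equally spaced points.

```python
from cmath import exp
from math import pi

def e(t):
    return exp(2j * pi * t)

N = 40
X = 1
while (X + 1)**3 <= N:  # X = floor(N^(1/3))
    X += 1

def f(theta):
    return sum(e(theta * x**3) for x in range(1, X + 1))

def K(gamma):
    return sum(e(-gamma * n) for n in range(1, N + 1))

def g(alpha):
    # generating function of r(n): sum over pairs with x^3 + y^3 <= N
    return sum(e(alpha * (x**3 + y**3))
               for x in range(1, X + 1) for y in range(1, X + 1)
               if x**3 + y**3 <= N)

def convolution(alpha, M=2 * N):
    # exact quadrature: every frequency of f(alpha+gamma)^2 K(gamma)
    # has modulus below M, so the M-point average equals the integral
    return sum(f(alpha + k / M)**2 * K(k / M) for k in range(M)) / M
```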

Let J denote the sum on the left-hand side of the estimate claimed in Lemma 3.3. Then, by (3.20),

$$\begin{aligned} J\leqslant \int _0^1 \!|K(\gamma )| \sum _{r\leqslant N/R} \sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r \int _{{{\mathfrak {N}}}(r,b)}|F_r(\alpha )||f(\alpha +\gamma )|^2 \,{\mathrm {d}}\alpha \,{\mathrm {d}}\gamma . \end{aligned}$$
(3.21)

We have

$$\begin{aligned} |f(\theta )|^2 = \sum _{h\in {\mathbb {Z}}} \psi _h e(\theta h) \end{aligned}$$
(3.22)

where \(\psi _h\) is the number of \((x,y)\in {\mathbb {Z}}^2 \) with \(x^3-y^3=h\) and \(1\leqslant x,y\leqslant N^{1/3}\). By (3.7), we have

$$\begin{aligned} \int _{{\mathfrak N}(r,b)}|F_r(\alpha )|&|f(\alpha +\gamma )|^2 \,\mathrm d\alpha \\&\ll N^{1+\varepsilon }r^{-1} \int _{-1}^{1} (1+N|\beta |)^{-1} \biggl |f\biggl (\frac{b}{r}+\beta +\gamma \biggr )\biggr |^2\,\mathrm d\beta . \end{aligned}$$

We now use (3.22) and sum over b to bring in Ramanujan’s sum

This produces

Now note that \(\psi _h\geqslant 0\). Hence, the classical bound \(|c_r(h)|\leqslant (r,h)\) suffices to conclude that the above expression does not exceed

Observe that this bound is uniform in \(\gamma \). We sum over r and deduce from (3.21) that

The elementary bound \(K(\gamma )\ll \min \bigl (N, \Vert \gamma \Vert ^{-1}\bigr )\) shows that the first factor on the right is \(O(\log N)\). This yields

$$\begin{aligned} J\ll \psi _0 N^{1+\varepsilon } R^{-1} + N^\varepsilon \sum _h \psi _h. \end{aligned}$$

The lemma follows immediately.\(\square \)
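The classical bound \(|c_r(h)|\leqslant (r,h)\) invoked above is easily checked by brute force. The following sketch is an illustration only, using the standard definition of Ramanujan's sum; recall that \((r,0)=r\), consistent with \(c_r(0)=\phi (r)\).

```python
from cmath import exp
from math import gcd, pi

def ramanujan(r, h):
    """Ramanujan's sum c_r(h): sum of e(b h / r) over 1 <= b <= r, (b, r) = 1."""
    return sum(exp(2j * pi * b * h / r)
               for b in range(1, r + 1) if gcd(b, r) == 1)

# c_r(h) is a rational integer satisfying |c_r(h)| <= gcd(r, h).
checks = [(r, h, ramanujan(r, h)) for r in range(1, 25) for h in range(0, 25)]
```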

The proof of Theorem 1.4 is now readily completed. We return to (3.11) and apply the elementary inequality \(|\alpha +\beta |^2\leqslant 2|\alpha |^2 + 2|\beta |^2\) to confirm that

$$\begin{aligned} |G(\alpha )|^2&\leqslant 2|g(\alpha )-W(\alpha ;r,b)|^2 + 2|D(\alpha ;r,b)|^2 \\&\leqslant 2|g(\alpha )-W(\alpha ;r,b)|(|g(\alpha )|+|W(\alpha ;r,b)|) + 2|D(\alpha ;r,b)|^2. \end{aligned}$$

We multiply by \(|F_r(\alpha )|\), integrate over \({\mathfrak {N}}(r,b)\) and sum over \(1\leqslant b\leqslant r\), \((b,r)=1\) and \(r\leqslant N/R\). The factor \(g-W\) is bounded by \(N^{5/6+\varepsilon }R^{-1/2}\), as can be seen from Lemma 2.5. Then, by (3.9) and Lemmata 3.1, 3.2 and 3.3, we find that

$$\begin{aligned} S_2 \ll N^\varepsilon \bigl ( N^{13/6}R^{-3/2} +(N/R)^{7/3} + NR^{-1} T^2 + N^{4/3} T^{-1} + N^{3/2}R^{-1/2} \bigr ). \end{aligned}$$

This bound combines with (3.8) to an estimate for S. The choice \(R=N^{3/5}\) balances the term \(N^{13/6}R^{-3/2}\) in the preceding display with the term \(RN^{2/3}\) in (3.8). The resulting bound for S in conjunction with (3.1) establishes Theorem 1.4 when \(Q\leqslant N\). When \(Q>N\) the extra terms compared with the case \(Q=N\) contribute trivially

and the proof is completed by an appeal to Lemma 2.7.

4 The proof of Theorem 1.1

In this section we compare V(NQ) with U(NQT), thereby establishing Theorem 1.1. At the core of our argument is a mean square estimate for the difference between the approximants to the expectation used in U and V. In U we encountered the term

(4.1)

while in V the arithmetically obvious expectation is used. Thus we are led to consider

(4.2)

Lemma 4.1

Let \(1\leqslant T\leqslant Q\leqslant N\). Then

Proof

The main observation underpinning the proof of Lemma 4.1 is the identity

$$\begin{aligned} \frac{\rho (q,a)}{q^2} = \frac{1}{q}\sum _{t|q}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \!\frac{S(t,c)^2}{t^2}\, e(-ca/t) \end{aligned}$$
(4.3)

that is an instance of [11, Lemma 2.12]. This suggests splitting the partial singular series \({\mathfrak {s}}(n;T)\) into

$$\begin{aligned} {\mathfrak {s}}^*_q(n;T) = \sum _{\begin{array}{c} t\leqslant T\\ t\mid q \end{array}}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \!\frac{S(t,c)^2}{t^2}\, e(-cn/t) \end{aligned}$$
(4.4)

and its complement

(4.5)
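Identity (4.3) is exact, and a brute-force check over small moduli illustrates it. In the sketch below (illustrative only) \(\rho (q,a)\) is taken to count incongruent solutions of \(x^3+y^3\equiv a\,(\mathrm{mod}\,q)\), consistent with its role in (4.3).

```python
from cmath import exp
from math import gcd, pi

def e(t):
    return exp(2j * pi * t)

def S(q, a):
    # the complete cubic exponential sum of (1.5)'s cognate S(q, a)
    return sum(e(a * x**3 / q) for x in range(1, q + 1))

def rho(q, a):
    """Incongruent solutions of x^3 + y^3 = a (mod q)."""
    cubes = [x**3 % q for x in range(q)]
    return sum(1 for u in cubes for v in cubes if (u + v - a) % q == 0)

def rhs(q, a):
    # right-hand side of (4.3): (1/q) * sum over t | q of the twisted sums
    total = sum((S(t, c) / t)**2 * e(-c * a / t)
                for t in range(1, q + 1) if q % t == 0
                for c in range(1, t + 1) if gcd(c, t) == 1)
    return total / q
```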

Let \(\Xi ^\dagger (N,T;q,a)\) be the sum defined in (4.1), but with \({\mathfrak {s}}(n;T)\) replaced by \({\mathfrak {s}}^\dagger _q(n;T)\). Later, we shall use this notation also in situations where \(\Xi \) and \({\mathfrak {s}}\) are decorated by symbols other than \(\dagger \). We expect \(\Xi ^\dagger (N,T;q,a)\) to be small on average. Temporarily we suppose that \(1\leqslant a\leqslant q\). Then

Note that \(q\leqslant N\), so the sums over h always include the term with \(h=0\). Their total contribution to \(\Xi ^\dagger (N,T;q,a)\) is

(4.6)

For \(h\geqslant 1\) we have \(a+hq \gg hq\), and so, by partial summation,

Hence the contribution to \(\Xi ^\dagger (N,T;q,a)\) from terms with \(h\geqslant 1\) is bounded by

Applying similar estimates within (4.6), we first find

and then see that

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q \,|\Xi ^\dagger (N,T;q,a)|^2 \ll Q^{4/3} T^{2+\varepsilon }. \end{aligned}$$
(4.7)

A similar argument applies to a tail version of the sum in (4.4). This we define by

$$\begin{aligned} {\mathfrak {s}}^\ddagger _q(n;T) = \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \!\frac{S(t,c)^2}{t^2}\, e(-cn/t). \end{aligned}$$
(4.8)

This sum depends on n only modulo q. Therefore, with the convention concerning decorations of \(\Xi \) in mind,

$$\begin{aligned} \Xi ^\ddagger (N,T;q,a) \,=\, {\mathfrak {s}}^\ddagger _q(a;T)\!\!\! \sum _{\begin{array}{c} n\leqslant N \\ n\equiv a \,(\mathrm{mod}\, q) \end{array}} \!\!\!\!\! n^{-1/3}. \end{aligned}$$

Much as before, we find that

and consequently,

$$\begin{aligned} \sum _{a=1}^q\, |\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{4/3}q^{-2} \sum _{a=1}^q \, |{\mathfrak {s}}^\ddagger _q(a;T)|^2 + \sum _{a=1}^q\, a^{-2/3}\, |{\mathfrak {s}}^\ddagger _q(a;T)|^2 .\nonumber \\ \end{aligned}$$
(4.9)

By (4.8) and Cauchy’s inequality, and then by orthogonality,

$$\begin{aligned} \sum _{a=1}^q \,|{\mathfrak {s}}^\ddagger _q(a;T)|^2&\ll q^\varepsilon \sum _{a=1}^q \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\,\biggl | \sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t\!\! \frac{S(t,c)^2}{t^2} \, e\biggl (-\frac{ca}{t}\biggr )\biggr |^2 \\&\ll q^{1+\varepsilon } \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \biggl |\frac{S(t,c)}{t}\biggr |^4 \!\!\ll q^{1+\varepsilon } \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\, t \kappa (t)^4. \end{aligned}$$

Similarly, again via (4.8), we have

$$\begin{aligned} {\mathfrak {s}}^\ddagger _q(a;T) \ll \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\,t\kappa (t)^2, \end{aligned}$$

implying that

$$\begin{aligned} \sum _{a=1}^q a^{-2/3}\, |{\mathfrak {s}}^\ddagger _q(a;T)|^2\ll q^{1/3} \biggl (\, \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}}\,t\kappa (t)^2\biggr )^2. \end{aligned}$$

From (4.9) we now see

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q\, |\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{4/3+\varepsilon }\,Y_1 + Y_2 \end{aligned}$$

where

To bound \(Y_1\), write \(q=ts\) and reverse the order of summation. Via (2.2), we find that

$$\begin{aligned} Y_1 \ll \sum _{T<t\leqslant Q} \kappa (t)^4 \sum _{s\leqslant Q/t} s^{-1} \ll (\log Q) \sum _{T<t\leqslant Q} \kappa (t)^4 \ll N^\varepsilon T^{-1}. \end{aligned}$$

The estimation of \(Y_2\) begins with Cauchy’s inequality, applied to the sum over t. Then, by the same procedure that was successful with \(Y_1\), we find that

This proves that

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q \,|\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{4/3+\varepsilon }\,T^{-1}+Q^{4/3+\varepsilon }. \end{aligned}$$
(4.10)

We collect the results obtained so far in a single estimate. By (4.3), (4.4), (4.5) and (4.8), we have

It follows that

and from (4.7) and (4.10) we then conclude

(4.11)

By (4.11) and (4.2), we are reduced to comparing

By orthogonality, we see that

Here, by (2.6), the sum on the far right over n is . Further, by (4.3) and Lemma 2.3,

$$\begin{aligned} \frac{\rho (q,a)}{q} \ll \sum _{t\mid q} \,t \kappa (t)^2. \end{aligned}$$
(4.12)

It now follows that the sum

(4.13)

is bounded above by

$$\begin{aligned}&\ll \sum _{q\leqslant Q} \biggl ( \sum _{t\mid q} \,t \kappa (t)^2 \biggr )^2 \frac{1}{q^2} \sum _{a=1}^q \,\biggl | \sum _{b=1}^{q-1} \, e\biggl (-\frac{ab}{q}\biggr ) \sum _{n\leqslant N} n^{-1/3}\, e\biggl (\frac{bn}{q}\biggr ) \biggr |^2 \\&\ll \sum _{q\leqslant Q} \biggl ( \sum _{t\mid q}\, t \kappa (t)^2 \biggr )^2\, \frac{1}{q}\, \sum _{b=1}^{q-1} \,\biggl \Vert \frac{b}{q}\biggr \Vert ^{-4/3} \ll \sum _{q\leqslant Q} q^{1/3} \biggl ( \sum _{t\mid q}\, t \kappa (t)^2 \biggr )^2 \!\!\ll Q^{4/3+\varepsilon }, \end{aligned}$$

where the very last bound has been confirmed earlier, in our estimate of \(Y_2\) above.

We now invoke the elementary evaluation

(4.14)

together with (4.12) to confirm the bounds

Equipped with the estimate obtained for the sum in (4.13), we deduce that

This bound coupled with (4.11) implies Lemma 4.1.\(\square \)

The endgame begins by writing (1.1) as

$$\begin{aligned} V(N,Q) = \sum _{q\leqslant Q} \sum _{a=1}^q \,( A(N,q,a) + B(N,q,a) )^2 \end{aligned}$$

where

Then, squaring out and estimating the cross term by Cauchy’s inequality, we obtain, by (1.4), (4.1) and (4.2),

First suppose that \(N^{3/5}\leqslant Q\leqslant N^{17/20}\), and take \(T=N^{1/15}\). Then, by Lemmata 2.7 and 2.1, we have

and consequently, by Theorem 1.4,

Lemma 4.1 yields \( \Delta (N,Q,T) \ll N^{19/15+\varepsilon }\), and we see that

Here the first summand in the error term is dominated by the last summand, so this reduces to the estimate claimed in Theorem 1.1.

Next suppose that \(N^{17/20} < Q \leqslant N\), and take \(T=(N/Q)^{4/9}\). Then \(1\leqslant T\leqslant N^{1/15}\), Lemma 4.1 gives \(\Delta \ll N^{8/9+\varepsilon }Q^{4/9}\) and as above, Theorem 1.4 delivers

Proceeding as before, we obtain

Here, in the error term, the last summand dominates the others, so this also reduces to the estimate claimed in Theorem 1.1.

In the range \(Q\leqslant N^{3/5}\) the theorem only asserts that \(V(N,Q)\ll N^{19/15+\varepsilon }\) which follows from the bound for \(V(N,N^{3/5})\) that we have already established.

5 The proof of Theorem 1.3

The proof of Theorem 1.3 is in principle simpler than that of Theorem 1.1, as it does not require the convolution device embedded in the proof of Lemma 3.3. One follows the argument of Sect. 3 and replaces \(g(\alpha )\) with \(f(\alpha )^2\), where \(f(\alpha )=f(\alpha , X)\) is as in (1.6). Then, by Lemma 2.4, (2.4) and Lemma 2.3, the estimate given in Lemma 2.5 can be replaced by

in the final stages of the proof of Theorem 1.4. For easier comparison, we put \(W^*(\alpha ;q,a) = V_1(\alpha ;q,a)^2\) and then have, in the context of the last paragraph of Sect. 3,

$$\begin{aligned} f(\alpha )^2 - W^*(\alpha ;r,b) \ll X^{3/2+\varepsilon }R^{-1/6} +X^{3+\varepsilon }R^{-1} \end{aligned}$$
(5.1)

which can be used in place of the bound \(g-W\ll N^{5/6+\varepsilon }R^{-1/2}\) and results in the possibility of taking a smaller value for R.

To be more precise, to effect the translation let \(N=2X^3\), and take

and

Then following the argument of Sect. 3 we have

where

and

$$\begin{aligned} G^*(\alpha )=f(\alpha )^2 - h^*(\alpha ). \end{aligned}$$

Then it follows that, cf. (3.8),

where

Proceeding to use (5.1) in place of Lemma 2.5 we have

The choice \(R=X^{5/3}\) gives

The analogue of Lemma 2.7 gives

Thus

We define

Then, for \(1\leqslant T\leqslant Q\leqslant X^3\), as in Lemma 4.1, one has

(5.2)

Some justification is in order at this stage. Indeed, for a proof of (5.2), in the arguments of Sect. 4, one needs to replace the sum (4.1) by

in which \(\tau (n,X^3)\) replaces the monotonic weight \(\frac{2}{3}\,Cn^{-1/3}\). While the function \(\tau (n,X^3)\) is no longer decreasing from the outset, mundane asymptotic analysis reveals that

where

is positive. Hence, there is a number \(n_0\) such that \(\tau (n,n)\) is decreasing for \(n\geqslant n_0\). Furthermore, a rather more direct argument shows that \(\tau (n,X^3)\) is also decreasing for \(X^3<n\leqslant N\), and one immediately has . These properties suffice to check that the initial phase of the proof of Lemma 4.1 still applies in the new set-up, and one confirms the bound

as the appropriate counterpart of (4.11). The next step is to estimate

and progress then depends on the bound

This is readily established by applying Abel summation on the interval \( \Vert b/q\Vert ^{-1} \leqslant n\leqslant N\), using the monotonicity of \(\tau (n,X^3)\), while in the range \(n\leqslant \Vert b/q\Vert ^{-1}\) the trivial \(\tau (n,X^3)\ll n^{-1/3}\) is enough. Finally, we note that

This is weaker than the bound (4.14) but suffices to mimic the work after (4.11) successfully. However, the error term in the previous display now introduces an extra term of size \(X^2\) into the final bound for \(\Delta ^*\). This can be absorbed into the expression \(Q^{4/3}T^2 + X^4T^{-1}\), and this proves (5.2).

Now choose \(T=X^{1/3}\) for \(Q\leqslant X^{9/4}\), \(T=X^{4/3}Q^{-4/9}\) for \(X^{9/4}\leqslant Q\leqslant 2X^3\). Then, for \(Q\leqslant X^{9/4}\), we find that

and

$$\begin{aligned} \Delta ^*(X,Q,T)\ll X^{11/3+\varepsilon }. \end{aligned}$$

As in the proof of Theorem 1.1 we have

and this reduces to

and gives the first two cases of the theorem.

When \(X^{9/4}\leqslant Q\leqslant 2X^3\) we instead have

and

$$\begin{aligned} \Delta ^*(X,Q,T)\ll Q^{4/9}X^{8/3+\varepsilon } \end{aligned}$$

and the final case follows.

6 The proof of Theorem 1.2

The deduction of Theorem 1.2 from Theorem 1.1 is based on Lemma 2.1, and is straightforward. We define \(\eta (n)\) through

$$\begin{aligned} 2r_0(n)= r(n) + \eta (n). \end{aligned}$$

Then \(|\eta (n)|\leqslant r(n) \ll n^\varepsilon \), and (2.1) yields

$$\begin{aligned} \sum _{n\leqslant N} |\eta (n)|^\nu \ll N^{4/9+\varepsilon } \quad (\nu = 1, 2). \end{aligned}$$

By (1.1) and (1.2),

$$\begin{aligned} 4V_0 (N,Q) = V(N,Q)+ E_0 +2E_1 \end{aligned}$$
(6.1)

where

(6.2)
(6.3)

To bound \(E_0\), one opens the square and finds that

Hence

Here the terms with \(n=m\) contribute

$$\begin{aligned} \ll Q \sum _{n\leqslant N} |\eta (n)|^2 \ll QN^{4/9+\varepsilon } \end{aligned}$$

while the terms with \(n\ne m\) contribute no more than

This shows that

$$\begin{aligned} E_0 \ll QN^{4/9+\varepsilon }+N^{8/9+\varepsilon }. \end{aligned}$$

Further, by (6.2), (6.3), (1.1) and Cauchy’s inequality,

$$\begin{aligned} E_1^2 \ll V(N,Q) E_0. \end{aligned}$$

We now temporarily suppose that \(N^{3/5}\leqslant Q\leqslant N\). Then \(E_0 \ll QN^{4/9+\varepsilon }\), and Theorem 1.1 gives . Hence . A short calculation now shows that the estimate in Theorem 1.2 follows from (6.1) and Theorem 1.1. As in the proof of Theorem 1.1, the case \(Q\leqslant N^{3/5}\) follows by considering the upper bound implied by the estimate for \(V_0(N,N^{3/5})\) that is already established.