1 Introduction

Recently the authors [2] studied the distribution in arithmetic progressions of the numbers that are the sum of two positive cubes of integers, and established an asymptotic formula of Montgomery–Hooley type for the associated variance. As indicated on that occasion, a further development of our method supplies related results for sums of a square and an h-th power, for any \(h\geqslant 3\). Here we discuss the problem in broader generality and consider, for given numbers \(h\geqslant k\geqslant 2\), the sequence of numbers of the shape \(x^k+y^h\) as x and y range over the natural numbers. Our method is successful whenever the number

$$\begin{aligned} \theta = \frac{1}{k}+\frac{1}{h} \end{aligned}$$
(1.1)

exceeds 1/2. Given such a pair k, h, let r(n) denote the number of solutions of \(x^k+y^h =n\) in natural numbers x and y, and let \(\rho (q,a)\) denote the number of incongruent solutions of the congruence \(x^k+y^h\equiv a \,(\mathrm{mod}\, q)\). Finally, let

$$\begin{aligned} C \end{aligned}$$
(1.2)

denote the area of the domain

$$\begin{aligned} \bigl \{(\xi ,\eta )\in {\mathbb {R}}^2: \xi> 0,\, \eta > 0,\, \xi ^k+\eta ^h< 1\bigr \}. \end{aligned}$$
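Although it is not needed in what follows, the area C admits a classical closed form. The following evaluation (a standard exercise with Euler's integrals, supplied here for the reader's convenience rather than taken from the text) may help calibrate the constants:

```latex
% Integrate out \eta, substitute u = \xi^k, and recognise a Beta integral:
C = \int_0^1 (1-\xi^k)^{1/h}\,\mathrm{d}\xi
  = \frac{1}{k}\int_0^1 u^{1/k-1}(1-u)^{1/h}\,\mathrm{d}u
  = \frac{1}{k}\,B\!\left(\tfrac{1}{k},\,1+\tfrac{1}{h}\right)
  = \frac{\Gamma(1+1/k)\,\Gamma(1+1/h)}{\Gamma(1+\theta)} .
```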

For \(N\geqslant 1\), \(Q\geqslant 1\) we consider the variance

$$\begin{aligned} V(N,Q) = \sum _{q\leqslant Q}\, \sum _{a=1}^{q} \biggl (\, \sum _{\begin{array}{c} n\leqslant N\\ n\equiv a \,(\mathrm{mod}\, q) \end{array}}\!\!\! r(n) \,-\, \frac{\rho (q,a)}{q^2}\, CN^{\theta }\biggr )^{\!2}. \end{aligned}$$
(1.3)

An asymptotic formula for \(V(N,Q)\) is expected to hold when Q is not too far from \(N^{\theta }\). One possible approach is a dispersion argument. Opening the square in (1.3), the expression

$$\begin{aligned} \lfloor Q \rfloor \sum _{n\leqslant N} r(n)^2 \end{aligned}$$
(1.4)

arises naturally and prominently impacts the behaviour of \(V(N,Q)\). It transpires that the case \(k=h=2\) is peculiar because, in marked contrast to all other cases, here r(n) is often so large that the order of magnitude of (1.4) is \(QN \log N\). This atypical case has been analysed by Dancs [3] in his thesis, in a slightly different setting; he replaces our r(n) with the number of solutions of \(n=x^2+y^2\) in integers x and y. Translated to our language, his main result asserts that there are real numbers \(c,c'\) such that whenever \(1\leqslant Q\leqslant N\) then

$$\begin{aligned} V(N,Q) = \frac{1}{2}\, QN\biggl (\log \frac{N}{Q} + c\biggr ) + \frac{1}{4}\, Q^2 \log Q + c'Q^2 +O\bigl (N^{5/3+\varepsilon }\bigr ). \end{aligned}$$

Here and later in this paper, we apply the following convention concerning the letter \(\varepsilon \): whenever \(\varepsilon \) occurs in a statement, it is asserted that the statement is true for any positive value assigned to \(\varepsilon \).

For sums of two cubes, the case \(k=h=3\), the situation is rather different. There are only about \(N^{2/3}\) numbers n not exceeding N that are the sum of two positive cubes, and for a typical such number one has \(r(n)=2\). Therefore, the sum \(\sum _{n\leqslant N} r(n)^2\) will be of size \(N^{2/3}\). This is reflected in the shape of the asymptotic expansion of the variance, for which we obtained ([2,  Theorem 1.1])

$$\begin{aligned} V(N,Q) = 2CQN^{2/3} +O\bigl (Q^{1/2}N^{{29}/{30}+\varepsilon } + Q^{13/18}N^{7/9+\varepsilon } + N^{19/15+\varepsilon }\bigr ) \end{aligned}$$

uniformly in the range \(1\leqslant Q\leqslant N\).

The remaining pairs k, h with \(h\geqslant k\geqslant 2\) and \(\theta >1/2\) are

$$\begin{aligned} k=2, \quad h\geqslant 3 \qquad \text {and} \qquad k=3, \quad h=4 \text {\, or \,} 5. \end{aligned}$$
(1.5)

Thus, in all these cases we have \(h>k\), and as we shall see in Lemma 2.1, this implies that for the typical number that is the sum of a k-th power and an h-th power, one has \(r(n)=1\). Again, this changes the leading term in an asymptotic formula for \(V(N,Q)\).

Theorem 1.1

Suppose that k, h is one of the pairs satisfying (1.5), and let \(1\leqslant Q\leqslant N^{\theta }\). If \(k=3\), then

$$\begin{aligned} V(N,Q) = CQN^{\theta } +O\bigl (N^{2\theta -1/(8h) +\varepsilon } + Q^{1/2}N^{{3\theta }/{2}-1/(16h)+\varepsilon }\bigr ). \end{aligned}$$

If \(k=2\) and \(h\geqslant 7\), then

$$\begin{aligned} V(N,Q) = CQN^{\theta } + O\bigl (N^{2\theta -1/h +\varepsilon } + Q^{1/2}N^{{3\theta }/{2}-1/(2h)+\varepsilon }\bigr ). \end{aligned}$$
(1.6)

If \(k=2\) and \(4\leqslant h\leqslant 6\), then the error term in (1.6) is to be replaced by

$$\begin{aligned} N^{21/16+\varepsilon }+Q^{1/2}N^{1+\varepsilon } +QN^{5/8+\varepsilon } \end{aligned}$$

while for \(k=2\), \(h=3\) this error is

$$\begin{aligned} \bigl (N^{17/12}+Q^{3/4}N+Q^{1/2}N^{7/6} + Q^{1/4}N^{31/24}\bigr )N^\varepsilon . \end{aligned}$$

There is a large body of work concerned with the distribution of arithmetic sequences in residue classes, with a view toward an asymptotic formula for the associated variance. The historic papers of Montgomery [9] and Hooley [6] on the von Mangoldt function triggered interest in analogous results for other arithmetic functions of great familiarity in multiplicative number theory, such as the indicator function of the k-free numbers [17] and their l-tuples [10], or the divisor function [11,12,13]. There are also axiomatic studies in work of Hooley [7, 8] and Vaughan [15, 16]. Any attempt to review all examples that have been detailed hitherto would take us far afield, but a common feature of previous work is that in all cases that we are aware of, the order of magnitude of the crucial expression (1.4) deviates only mildly from QN. In particular we do not know a single instance where the appropriate analogue of \(\sum _{n\leqslant N} r(n)^2\) is bounded above by \(N^{1-\delta }\), for some \(\delta >0\). This paper and its companion [2] provide a family of such examples, with \(\delta \) approaching 1/2. At \(\delta =1/2\), however, essential obstacles arise on which we comment in more detail below.

Our approach to estimates of the type provided by Theorem 1.1 uses a variant of the dispersion argument proposed by Goldston and Vaughan [4]. One is required to evaluate the sum

asymptotically, and this is within the competence of the circle method. In our earlier work on sums of two cubes, this approach was modified, and the circle method was brought into play only after r(n) was replaced by an arithmetic function that resembles the minor arcs contribution in the integral

$$\begin{aligned} r(n) = \int _0^1\! \sum _{x^k+y^h\leqslant N}\!\!\!\!\!\! e(\alpha (x^k+y^h-n))\,\mathrm {d}\alpha , \end{aligned}$$

the latter being valid for all n not exceeding N. This led to considerable technical simplifications. Similar ideas apply in the more general set-up of this paper as well. As is often the case with mixed exponents in representation problems, a proportion of the Farey intervals in our application of the Hardy–Littlewood method have to be treated as major for the smaller exponent, but as minor for the larger one. This new hurdle is overcome by a succession of pruning exercises that we execute in §4. We then obtain, in §5, an imperfect version of Theorem 1.1. This is of some interest in its own right; see Theorem 5.4 below. The deduction of Theorem 1.1 from Theorem 5.4 is then achieved in §6 by following the routines developed in [2,  Section 4].

The Fourier integral that we estimate by the circle method appears in (5.4) below. The square root cancellation barrier for this integral is \(N^{\theta + 1/2}\), and it therefore appears to be very difficult to find asymptotic relations for \(V(N,Q)\) with an error estimate superior to \(N^{\theta + 1/2}\). An error of this size is dominated by the leading \(QN^{\theta }\) only if \(Q\geqslant N^{1/2}\). It should therefore be noted that in the cases \(k=2\), \(h\geqslant 7\) Theorem 1.1 indeed supplies a valid asymptotic formula whenever \(Q\geqslant N^{1/2+\varepsilon }\), and achieves square root cancellation in a certain range for Q. In the remaining cases, the result is somewhat weaker but then \(\theta \) is rather larger than 1/2. In fact, our methods are tuned to perform optimally for the smaller values of \(\theta \), leaving the other cases susceptible to some small improvement.

In §7 we consider the numbers representable as sums of a k-th power and an h-th power without multiplicities. Define \(r_0(n)=1\) whenever \(r(n)\geqslant 1\) and let \(r_0(n)=0\) otherwise. Then, for a typical natural number n one has \(r(n)=r_0(n)\). It is now natural to examine the corresponding variance \(V_0(N,Q)\), formed with \(r_0\) in place of r.

Theorem 1.2

Suppose that k, h is one of the pairs satisfying (1.5). Let \(Q\leqslant N^{\theta }\). Then

$$\begin{aligned} V_0(N,Q)= & {} V(N,Q) \\&+O\bigl ( QN^{2/h+\varepsilon } + N^{4/h+\varepsilon } +V(N,Q)^{1/2}\bigl ( Q^{1/2}N^{1/h+\varepsilon } + N^{2/h+\varepsilon }\bigr )\bigr ). \end{aligned}$$

This combines easily with the results of Theorem 1.1, and provides asymptotic formulae for \(V_0(N,Q)\). In particular, one finds that (1.6) holds with V(NQ) replaced by \(V_0(N,Q)\). For an analogous result in the case \(k=h=3\) see [2,  Theorem 1.4]. Perhaps surprisingly, for the numbers that are the sum of two squares, an asymptotic formula for \(V_0(N,Q)\) is not yet known.

2 Auxiliaries

We begin with elementary mean value estimates for r(n).

Lemma 2.1

Let k, h be a pair satisfying (1.5). Then

$$\begin{aligned} \sum _{n\leqslant N} r(n) = CN^\theta + O\bigl (N^{1/k}\bigr ), \end{aligned}$$

and

$$\begin{aligned} \sum _{n\leqslant N} r(n)^2 = \sum _{n\leqslant N} r(n) + O\bigl (N^{2/h+\varepsilon }\bigr ). \end{aligned}$$

Proof

The linear mean of r(n) follows by the standard lattice point argument of Gauß. The sum of \(r(n)^2\) equals the number of solutions of

$$\begin{aligned} x^k+y^h = u^k+v^h \leqslant N \end{aligned}$$

in positive integers x, y, u, v. The solutions with \(y=v\) (and a fortiori \(x=u\)) contribute \(\sum r(n)\). There are \(O(N^{2/h})\) choices for the pair y, v with \(y\ne v\), and once these are chosen, a divisor argument shows that there are no more than \(O(N^\varepsilon )\) choices for x and u.\(\square \)
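Lemma 2.1 is easy to test numerically. The following sketch (our own illustration, with hypothetical helper names; the pair \(k=2\), \(h=3\) and the cutoff N are chosen arbitrarily) counts r(n) by direct enumeration, evaluates C through the classical closed form \(C=\Gamma (1+1/k)\Gamma (1+1/h)/\Gamma (1+\theta )\) for the area in (1.2), and confirms that \(\sum r(n)^2\) exceeds \(\sum r(n)\) only slightly:

```python
from math import gamma

def r_counts(N, k, h):
    """r[n] = number of pairs (x, y) of positive integers with x**k + y**h == n."""
    r = [0] * (N + 1)
    x = 1
    while x**k + 1 <= N:
        y = 1
        while x**k + y**h <= N:
            r[x**k + y**h] += 1
            y += 1
        x += 1
    return r

k, h, N = 2, 3, 200000
theta = 1 / k + 1 / h
C = gamma(1 + 1 / k) * gamma(1 + 1 / h) / gamma(1 + theta)  # area of the domain in (1.2)

r = r_counts(N, k, h)
S1 = sum(r)                 # Lemma 2.1: C*N**theta + O(N**(1/k))
S2 = sum(v * v for v in r)  # Lemma 2.1: exceeds S1 only by O(N**(2/h + eps))
print(S1, C * N**theta, S2 - S1)
```

The printed values illustrate that almost every represented n has exactly one representation, in line with the remark preceding Theorem 1.1.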

We frequently encounter a family of multiplicative functions that we now describe. Let \(l\geqslant 2\) be a natural number, and let \(\kappa _l\) be the multiplicative function that for primes p and integers \(\nu ,\lambda \) with \(\nu \geqslant 0\) and \(2\leqslant \lambda \leqslant l\) is defined by

$$\begin{aligned} \kappa _l\bigl (p^{l\nu +1}\bigr ) = p^{-\nu -1/2},\quad \kappa _l\bigl (p^{l\nu +\lambda }\bigr ) = p^{-\nu -1}. \end{aligned}$$
(2.1)

We then have the immediate bounds

$$\begin{aligned} \kappa _l(q)\leqslant q^{-1/l} \end{aligned}$$
(2.2)

for all \(q\in {\mathbb {N}}\), and the estimate

$$\begin{aligned} \sum _{q\leqslant Q} \kappa _l(q)^2 \leqslant \prod _{p\leqslant Q} \biggl ( 1+ \frac{1}{p} + O\biggl (\frac{1}{p^2}\biggr )\biggr ) \ll \log Q. \end{aligned}$$
(2.3)
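The definition (2.1) and the bounds (2.2), (2.3) can be checked mechanically. A small sketch (our own, using trial-division factorisation; not part of the argument):

```python
from math import log

def kappa(l, q):
    """The multiplicative function of (2.1): kappa_l(p^(l*nu+1)) = p^(-nu-1/2)
    and kappa_l(p^(l*nu+lam)) = p^(-nu-1) for 2 <= lam <= l."""
    val, p = 1.0, 2
    while p * p <= q:
        if q % p == 0:
            e = 0
            while q % p == 0:
                q //= p
                e += 1
            nu, lam = divmod(e, l)
            if lam == 0:               # e = l*(nu-1) + l, so lam = l
                nu, lam = nu - 1, l
            val *= p ** (-nu - 0.5) if lam == 1 else p ** (-nu - 1)
        p += 1
    if q > 1:                          # leftover prime factor, exponent 1
        val *= q ** -0.5
    return val

Q = 1000
# (2.2): kappa_l(q) <= q^(-1/l) for every q.
for l in (2, 3, 5):
    assert all(kappa(l, q) <= q ** (-1 / l) + 1e-12 for q in range(1, Q + 1))
# (2.3): the mean square of kappa_2 grows no faster than log Q.
s = sum(kappa(2, q) ** 2 for q in range(1, Q + 1))
print(s, log(Q))
```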

Lemma 2.2

Let k, h be a pair satisfying (1.5). Then

(2.4)

and

(2.5)

Proof

By (2.2) we have \(q \kappa _2(q)^2\leqslant 1\). Hence, the cases of (2.4) where \(k=2\) are immediate from (2.3). If \(k=3\) and \(h=4\) or 5, then one checks from (2.1) that holds for all \(\nu \geqslant 1\) while (2.2) yields the bounds

that are superior when \(\nu \) is large. Similar to the argument in (2.3), the estimate (2.4) now follows after turning the sum into an Euler product.

Next we establish (2.5) in the case where \(k=3\), \(h=5\). Let

By (2.2),

and hence,

$$\begin{aligned} p^{-\nu } K(p^\nu )^2 \leqslant 4 p^{-\nu /15}. \end{aligned}$$

For \(1\leqslant \nu \leqslant 14\) one checks from (2.1) that , and so, for the same \(\nu \), we have \(K(p^\nu ) \leqslant (\nu +1) p^{(\nu -1)/2}\) and \(p^{-\nu } K(p^\nu )^2\leqslant (\nu +1)^2 p^{-1}\). It follows that the expression on the left-hand side of (2.5) does not exceed

$$\begin{aligned} \prod _{p\leqslant Q} \sum _{\nu =0}^\infty p^{-\nu } K(p^\nu )^2 \leqslant \prod _{p\leqslant Q} \biggl (1+ O\biggl (\frac{1}{p}\biggr )\biggr ). \end{aligned}$$

This establishes (2.5) in the case \(k=3\), \(h=5\). By the obvious inequality \(\kappa _4(q) \leqslant \kappa _5(q)\) the case \(k=3\), \(h=4\) also follows.

This leaves the cases \(k=2\), \(h\geqslant 3\). Here, by (2.2) and (2.1), we have

for all \(\nu \geqslant 1\), and also

The proof of (2.5) in these cases now proceeds as above.\(\square \)

3 Gauß and Weyl sums

For \(l\geqslant 2\) let

$$\begin{aligned} S_l(q,a)=\sum _{x=1}^q e(ax^l\!/q) \end{aligned}$$

be the l-th power Gauß sum. By [14,  Lemmata 4.3, 4.4 and 4.5], the bound

$$\begin{aligned} q^{-1} S_l(q,a) \ll q^\varepsilon \kappa _l(q) \end{aligned}$$
(3.1)

holds whenever \((a,q)=1\). The partial singular series relative to the parameter \(T\geqslant 1\) for the sum of a k-th power and an h-th power is the sum

(3.2)

We require the following mean value estimate.

Lemma 3.1

Let \(N\geqslant 1\), \(T\geqslant 1\). Then, for pairs k, h satisfying (1.5),

Proof

One opens the square and the definition (3.2). Then, by the dual of the large sieve inequality (see [2,  Lemma 2.2], for example),

Via (3.1) and (2.4), we infer that

To deduce Lemma 3.1, split the sum over n into intervals \(M\leqslant n <2M\) and sum over \(M=2^\mu \). Since \(1/2<\theta <1\), the desired estimate is immediate.\(\square \)
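For \(l=2\) the bound (3.1) reflects the classical evaluation of quadratic Gauss sums, which gives \(|S_2(q,a)|\leqslant \sqrt{2q}\) whenever \((a,q)=1\). This special case is easy to confirm by brute force (our own sketch, not part of the argument):

```python
import cmath
from math import gcd, pi, sqrt

def gauss_sum(l, q, a):
    """S_l(q, a) = sum over x = 1..q of e(a x^l / q), computed with x^l reduced mod q."""
    return sum(cmath.exp(2j * pi * a * pow(x, l, q) / q) for x in range(1, q + 1))

# Classical fact: |S_2(q, a)| equals sqrt(q), sqrt(2q) or 0, so sqrt(2q) always bounds it.
for q in range(1, 120):
    for a in range(1, q + 1):
        if gcd(a, q) == 1:
            assert abs(gauss_sum(2, q, a)) <= sqrt(2 * q) + 1e-6
print("quadratic Gauss sum bound verified for q < 120")
```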

The remainder of this section is primarily concerned with the exponential sum

(3.3)

that we examine by relating it to the more familiar Weyl sums

$$\begin{aligned} f_l(\alpha ,X) =\sum _{x\leqslant X} e(\alpha x^l). \end{aligned}$$
(3.4)

For the latter, we now define their major arc approximation. This entails the integral

$$\begin{aligned} v_l(\beta ,X) = \int _0^X\!\! e(\beta t^l)\,\mathrm {d} t \end{aligned}$$
(3.5)

for which partial integration provides the estimate

$$\begin{aligned} v_l(\beta ,X) \ll X(1+X^l|\beta |)^{-1/l}. \end{aligned}$$
(3.6)
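The estimate (3.6) follows from a standard splitting argument; a brief sketch (our own, not spelled out in the text):

```latex
% If |\beta| \leq X^{-l}, the trivial bound |v_l(\beta,X)| \leq X suffices.
% Otherwise substitute u = t^l and split the range at u = |\beta|^{-1}:
v_l(\beta,X) = \frac{1}{l}\int_0^{X^l} u^{1/l-1} e(\beta u)\,\mathrm{d}u ,
\qquad
\frac{1}{l}\int_0^{|\beta|^{-1}}\! u^{1/l-1}\,\mathrm{d}u \ll |\beta|^{-1/l},
% and partial integration over the remaining range gives
\frac{1}{l}\int_{|\beta|^{-1}}^{X^l}\! u^{1/l-1} e(\beta u)\,\mathrm{d}u
  \ll \frac{|\beta|^{-(1/l-1)}}{|\beta|} = |\beta|^{-1/l}.
% Both cases are subsumed in v_l(\beta,X) \ll X(1+X^l|\beta|)^{-1/l}.
```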

The next lemma is [14,  Theorem 4.1].

Lemma 3.2

Let \(a\in {\mathbb {N}}\), \(q\in {\mathbb {N}}\), \(\alpha \in {\mathbb {R}}\) and write \(\beta =\alpha -a/q\). Then

From now on, let k, h be a pair satisfying (1.5). We require appropriate analogues of Lemma 3.2 for the sum \(g(\alpha )\). By (3.3),

$$\begin{aligned} g(\alpha )\, =\!\!\!\sum _{x^k+y^h\leqslant N}\!\!\!\! e(\alpha (x^k+y^h)). \end{aligned}$$

We apply Lemma 3.2 to the sum over x. In the notation of that lemma, this yields

$$\begin{aligned} g(\alpha )\, = \!\!\!\sum _{y\leqslant N^{1/h}}\!\!\! e(\alpha y^h) \biggl (\frac{S_k(q,a)}{q}\, v_k\bigl (\beta ,(N-y^h)^{1/k}\bigr ) +O\bigl ( q^{1/2+\varepsilon } (1+N|\beta |)^{1/2}\bigr )\biggr ). \end{aligned}$$

We define the function

(3.7)

and arrive at the following imperfect approximation for \(g(\alpha )\).

Lemma 3.3

Let \(a\in {\mathbb {N}}\), \(q\in {\mathbb {N}}\), \(\alpha \in {\mathbb {R}}\) and write \(\beta =\alpha -a/q\). Then

$$\begin{aligned} g(\alpha )= g^*(\alpha ;q,a) +O\bigl (N^{1/h}q^{1/2+\varepsilon } (1+N|\beta |)^{1/2}\bigr ). \end{aligned}$$

To proceed further, we apply an obvious substitution in (3.5) to infer

We apply Lemma 3.2 again to see that the above expression equals

$$\begin{aligned} \frac{1}{k}&\int _0^{N} \!\! t^{(1-k)/k} e(\beta t)\nonumber \\&\qquad \times \biggl (\frac{S_h(q,a)}{q}\, v_h\bigl (\beta ,(N-t)^{1/h}\bigr ) +O\bigl (q^{1/2+\varepsilon } (1+N|\beta |)^{1/2}\bigr )\biggr )\,\mathrm d t \nonumber \\&= \frac{S_h(q,a)}{khq} \int _0^N\!\! t^{(1-k)/k} \int _0^{N-t}\!\!\! s^{(1-h)/h} e(\beta (t+s))\,\mathrm d s\, \mathrm dt \nonumber \\&\qquad \qquad \qquad \qquad \qquad \qquad + O\bigl ( N^{1/k} q^{1/2+\varepsilon } (1+N|\beta |)^{1/2}\bigr ). \end{aligned}$$
(3.8)

Once more by obvious substitutions, the double integral here simplifies to

where \(\theta \) is defined by (1.1) and

$$\begin{aligned} B= B(1/k,1/h) = \int _0^1\!\! t^{(1-k)/k}(1-t)^{(1-h)/h}\, \mathrm {d} t \end{aligned}$$

is a special value of Euler’s Beta function. By (1.2) and a mundane computation, one finds that \(B=kh \theta C\) and then concludes from (3.1), (3.7) and (3.8) that

(3.9)
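The "mundane computation" giving \(B=kh\theta C\) can be made explicit via Euler's integrals. Our own sketch follows; it uses the classical closed form of the area C, which is not stated in the text:

```latex
% A Beta-integral evaluation of the area C from (1.2), and of B:
C = \int_0^1 (1-\xi^k)^{1/h}\,\mathrm{d}\xi
  = \frac{\Gamma(1+1/k)\,\Gamma(1+1/h)}{\Gamma(1+\theta)} ,
\qquad
B = B(1/k,1/h) = \frac{\Gamma(1/k)\,\Gamma(1/h)}{\Gamma(\theta)} .
% Since \Gamma(1+s) = s\,\Gamma(s), the quotient collapses:
\frac{B}{C} = \frac{\Gamma(1/k)\,\Gamma(1/h)\,\Gamma(1+\theta)}
                   {\Gamma(\theta)\,\Gamma(1+1/k)\,\Gamma(1+1/h)}
            = \frac{\theta}{(1/k)(1/h)} = kh\theta .
```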

By Euler’s summation formula,

$$\begin{aligned} \sum _{n\leqslant N} n^{\theta -1} e(\beta n) = \int _0^N \!\! u^{\theta -1} e(\beta u)\,\mathrm {d} u + O(1+|\beta | N^{\theta }). \end{aligned}$$

We define the sum

$$\begin{aligned} w(\beta ) = \theta C \sum _{n\leqslant N} n^{\theta -1} e(n\beta ) \end{aligned}$$
(3.10)

and observe, with later applications in mind, that the proof of [14,  Lemma 2.8] provides the estimate

$$\begin{aligned} w(\beta ) \ll N^{\theta } (1+ N \Vert \beta \Vert )^{-\theta }. \end{aligned}$$
(3.11)

Further, we write

(3.12)

Then, by (3.9), (3.10), (3.12) and Lemma 3.3, we conclude as follows.

Lemma 3.4

Let \(\alpha \in {\mathbb {R}}\), \(q\in {\mathbb {N}}\) and \(a\in {\mathbb {Z}}\) with \(|\alpha -a/q|\leqslant 1\). Then

4 Pruning exercises

Let \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\), and let \({\mathfrak {N}}(q,a;X)\) denote the interval of all real \(\alpha \) with \(|q\alpha -a|\leqslant X/N\). Further, let \({\mathfrak {N}}(X)\) denote the union of \({\mathfrak {N}}(q,a;X)\) with \(1\leqslant a\leqslant q\leqslant X\) and \((a,q)=1\). Note that this union is disjoint. For convenience, we write \({\mathfrak {N}} = {\mathfrak {N}}\bigl (\frac{1}{4} N^{1/2}\bigr )\). When \(1\leqslant a\leqslant q\leqslant \frac{1}{4} N^{1/2}\), \((a,q)=1\) and \(\alpha \in {\mathfrak {N}}\bigl (q,a;\frac{1}{4} N^{1/2}\bigr )\), put

$$\begin{aligned} \Phi (\alpha ) = (q+N|q\alpha -a|)^{-1}. \end{aligned}$$

This defines a function \(\Phi :{\mathfrak {N}} \rightarrow (0,\infty )\). Our basic tool is a development of [1,  Lemma 1].

Lemma 4.1

Let \(\Psi :{\mathbb {R}} \rightarrow [0, \infty )\) be a trigonometric polynomial

$$\begin{aligned} \Psi (\alpha )\,=\! \sum _{|m|\leqslant M} \! \psi _m e(\alpha m) \end{aligned}$$

with real non-negative coefficients \(\psi _m\). Then, uniformly for \(\gamma \in {\mathbb {R}}\) and \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\), one has

Proof

Let \({\mathfrak {I}}\) denote the integral to be estimated. Since \(\Psi \) is a non-negative function, we have

The classical bound for Ramanujan’s sum [5,  Theorem 272]

$$\begin{aligned} \biggl | \sum _{\begin{array}{c} a=1\\ (a,q)=1 \end{array}}^q\!\! e\biggl (\frac{am}{q}\biggr )\biggr | \leqslant (q,m) \end{aligned}$$
(4.1)

now shows that

$$\begin{aligned} {\mathfrak {I}}&\leqslant \biggl (\int _{-1}^1 (1+N|\beta |)^{-1}\,\mathrm {d}\beta \biggr ) \sum _{|m|\leqslant M}\! \psi _m \sum _{q\leqslant X} \frac{(q,m)}{q} \\&\ll \frac{\log N}{N} \biggl ( X\psi _0 \,+\!\! \sum _{1\leqslant |m|\leqslant M} \!\!\psi _m \sum _{q\leqslant X} \frac{(q,m)}{q}\biggr ). \end{aligned}$$

For non-zero integers m one routinely finds that

$$\begin{aligned} \sum _{q\leqslant X} \frac{(q,m)}{q} \leqslant \sum _{\begin{array}{c} d\leqslant X\\ d\mid m \end{array}} \sum _{r\leqslant X/d} \frac{1}{r} \ll |m|^\varepsilon \log X, \end{aligned}$$
(4.2)

and the lemma follows immediately.\(\square \)
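The inequality (4.1) states that Ramanujan's sum is bounded in absolute value by (q, m); indeed it equals \(\sum _{d\mid (q,m)} d\,\mu (q/d)\). A quick numerical confirmation (our own sketch, with a hypothetical helper name):

```python
import cmath
from math import gcd, pi

def ramanujan_sum(q, m):
    """c_q(m) = sum of e(am/q) over 1 <= a <= q with (a, q) = 1."""
    return sum(cmath.exp(2j * pi * a * m / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)

# Check |c_q(m)| <= (q, m); the case m = 0 gives c_q(0) = phi(q) <= q = (q, 0).
for q in range(1, 80):
    for m in range(-40, 41):
        assert abs(ramanujan_sum(q, m)) <= gcd(q, abs(m)) + 1e-6
print("Ramanujan sum bound (4.1) verified for q < 80, |m| <= 40")
```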

Within this section we abbreviate \(f_l(\alpha ,N^{1/l})\) to \(f_l(\alpha )\). As a first application of Lemma 4.1, we take \(\Psi (\alpha )=|f_l(\alpha )|^2\), in which case \(\psi _m\) is the number of solutions of \(x^l-y^l = m\) with \(1\leqslant x,y\leqslant N^{1/l}\). Thus, the hypotheses of that lemma are satisfied with \(M=N\), so uniformly in \(\gamma \in {\mathbb {R}}\) we infer the estimate

$$\begin{aligned} \int _{{\mathfrak {N}}(X)} \Phi (\alpha )|f_l(\alpha +\gamma ) |^2\, \mathrm {d}\alpha \ll N^{1/l - 1 + \varepsilon } X + N^{2/l -1 +\varepsilon }. \end{aligned}$$
(4.3)

Performing the same argument with \(\Psi (\alpha )=|f_l(\alpha )|^4\) yields

$$\begin{aligned} \int _{{\mathfrak {N}}(X)} \Phi (\alpha )|f_l(\alpha +\gamma ) |^4\, \mathrm {d}\alpha \ll N^{2/l - 1 + \varepsilon } X + N^{4/l -1 +\varepsilon }. \end{aligned}$$
(4.4)

The principal object in this section is to estimate the integral

(4.5)

where \(g(\alpha )\) is the sum defined in (3.3) with k, h chosen in accordance with (1.5). There are several approaches, depending on the relative size of k and h, and on the size of X. For convenience, we put

and note at once that

$$\begin{aligned} \Phi (\alpha )\ll X^{-1} \quad (\alpha \in {\mathfrak {L}}(X)). \end{aligned}$$
(4.6)

If we pair this bound with the mean value

$$\begin{aligned} \int _0^1 |g(\alpha )|^2\,\mathrm {d}\alpha = \sum _{n\leqslant N} r(n)^2 \ll N^\theta \end{aligned}$$
(4.7)

that in turn is implied by (3.3), Lemma 2.1 and orthogonality, we deduce a first result concerning J(X), namely

$$\begin{aligned} J(X) \ll N^\theta X^{-1}\!. \end{aligned}$$
(4.8)

More sophisticated bounds for J(X) depend on (4.3) or (4.4).

Lemma 4.2

Let k, h be one of the pairs satisfying (1.5), and let \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Then

$$\begin{aligned} J(X)\ll N^{\theta -1/2+\varepsilon } +X^{-1/2}N^{\theta -(1/2)+(1/h)+\varepsilon }. \end{aligned}$$

Proof

Let

$$\begin{aligned} K(\gamma ) = \sum _{n\leqslant N} e(-\gamma n). \end{aligned}$$

Then, by (3.3) and (3.4),

Hence, by (4.5),

where we wrote

$$\begin{aligned} F(\alpha ,\gamma ,\gamma ') = f_k(\alpha +\gamma )f_h(\alpha +\gamma ) f_k(-\alpha -\gamma ')f_h(-\alpha -\gamma '). \end{aligned}$$

For any complex numbers \(z, z'\) one has \(2|zz'| \leqslant |z|^2+ |z'|^2\), and so,

$$\begin{aligned} 2|F(\alpha ,\gamma ,\gamma ')|\leqslant |f_k(\alpha +\gamma )f_h(\alpha +\gamma )|^2 + |f_k(\alpha +\gamma ')f_h(\alpha +\gamma ')|^2 \end{aligned}$$

We put

(4.9)

By symmetry in \(\gamma \) and \(\gamma '\) it now follows that

The trivial bound \(K(\gamma )\ll N(1+N\Vert \gamma \Vert )^{-1}\) implies that

$$\begin{aligned} \int _0^1 \!|K(\gamma )|\, \mathrm {d}\gamma \ll \log N, \end{aligned}$$

and we arrive at the preparatory bound

$$\begin{aligned} J(X) \ll (\log N)^2\! \sup _{0\leqslant \gamma \leqslant 1}\!\! J(X,\gamma ). \end{aligned}$$
(4.10)

By (4.9) and Schwarz’s inequality,

$$\begin{aligned} J(X,\gamma ) \leqslant \biggl (\int _0^1\! |f_k(\alpha +\gamma )|^4\, \mathrm {d}\alpha \biggr )^{\!1/2} \biggl (\int _{{\mathfrak {L}}(X)} \Phi (\alpha )^2 |f_h(\alpha +\gamma ) |^4\,\mathrm {d}\alpha \biggr )^{\!1/2}. \end{aligned}$$

By (4.4) and (4.6), we have

$$\begin{aligned} \int _{{\mathfrak {N}}(X)} \Phi (\alpha )^2|f_h(\alpha +\gamma )|^4\, \mathrm {d}\alpha \ll N^{2/h - 1 + \varepsilon } + X^{-1}N^{4/h -1 +\varepsilon } \end{aligned}$$

while Hua’s Lemma [14,  Lemma 2.5] yields

$$\begin{aligned} \int _0^1\! |f_k(\alpha +\gamma )|^4\,\mathrm {d}\alpha \ll N^{2/k+\varepsilon }, \end{aligned}$$

both bounds being valid uniformly in \(\gamma \). This shows that

$$\begin{aligned} J(X,\gamma )\ll N^{\theta -1/2+\varepsilon } + X^{-1/2}N^{\theta -(1/2)+(1/h)+\varepsilon }, \end{aligned}$$

and the lemma follows from (4.10).\(\square \)

The bounds obtained so far are useful for large X. The next two lemmata are of a preparatory nature for an argument that gives good bounds for J(X) when X is smaller.

Lemma 4.3

Let k, h be a pair satisfying (1.5) and let \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Then

$$\begin{aligned} \int _{{\mathfrak {L}}(X)} \Phi (\alpha )|g(\alpha )|\,\mathrm {d}\alpha \ll N^{\theta -1+\varepsilon } \bigl (1+ X^{1/2}N^{-1/(2h)} + XN^{-\theta /2}\bigr ). \end{aligned}$$

Proof

We follow through the initial phase of the proof of Lemma 4.2 leading to (4.10). In this way, we arrive at the provisional bound

By Schwarz’s inequality, the integral on the right-hand side is reduced to the integrals in (4.3) with \(l=k\) and \(l=h\), and the lemma follows immediately.\(\square \)

Define the function \(g^*:{\mathfrak {N}}\rightarrow {\mathbb {C}}\) by taking \(g^*(\alpha )= g^*(\alpha ;q,a)\) whenever \(\alpha \in {\mathfrak {N}}(q,a;\frac{1}{4}N^{1/2})\) with \(1\leqslant a\leqslant q\leqslant \frac{1}{4} N^{1/2}\) and \((a,q)=1\).

Lemma 4.4

Let k, h be a pair satisfying (1.5), and let \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Then

Proof

Let \({\mathscr {B}}(q)= [-1,1]\) when \(\frac{1}{2} X< q\leqslant X\), and when \(q\leqslant \frac{1}{2} X\) put

$$\begin{aligned} {\mathscr {B}}(q) = \{\alpha \in {\mathbb {R}}: X/(2qN) \leqslant |\beta |\leqslant 1\}. \end{aligned}$$

Then, writing I for the integral to be estimated,

$$\begin{aligned} I \leqslant \sum _{q\leqslant X}\, \frac{1}{q}\! \sum _{\begin{array}{c} a=1\\ (a,q)=1 \end{array}}^q\! \int _{{\mathscr {B}}(q)} (1+N|\beta |)^{-1} \biggl |g^*\biggl (\frac{a}{q}+\beta ;q,a\biggr )\biggr |^2\,\mathrm {d}\beta . \end{aligned}$$
(4.11)

By (3.1) and (3.7),

so that

the expression on the right being real and non-negative. We insert this in (4.11), bring the sum over a inside and estimate it by (4.1). This manoeuvre yields

(4.12)

First consider the portion of the sum on the right where \(\frac{1}{2} X<q\leqslant X\). Here \({\mathscr {B}}(q)=[-1,1]\), so we can bring the sum over q inside the integral and use the trivial uniform bound \(|v_k(\beta , (N-y^h)^{1/k})|\leqslant N^{1/k}\). We then see that this portion of (4.12) is bounded above by

$$\begin{aligned}&\ll N^{2/k+\varepsilon } \int _{-1}^1 (1+N|\beta |)^{-1} \!\sum _{X/2<q \leqslant X}\!\!\frac{\kappa _k(q)^2}{q}\! \sum _{y_1,y_2\leqslant N^{1/h}}\!\!\!\! \bigl (q,y_1^h-y_2^h\bigr )\,\mathrm {d}\beta \\&\ll N^{2/k-1+2\varepsilon } \!\!\sum _{X/2<q\leqslant X}\!\!\!\frac{\kappa _k(q)^2}{q}\!\! \sum _{y_1,y_2\leqslant N^{1/h}} \!\!\!\!\bigl (q,y_1^h-y_2^h\bigr ). \end{aligned}$$

Here we single out terms with \(y_1=y_2\) and apply (4.2) for the remaining choices of \(y_1,y_2\). Then, by (2.2) and (2.3) we see that the above expression is bounded by

$$\begin{aligned}&\ll N^{2/k-1+\varepsilon }\biggl ( N^{1/h} \sum _{q\leqslant X} \kappa _k(q)^2 + X^{-2/k}\!\! \sum _{y_1\ne y_2} \sum _{X/2< q\leqslant X}\!\! \frac{(q,y_1^h-y_2^h)}{q} \biggr ) \nonumber \\&\ll N^{2\theta -1+\varepsilon } \bigl (N^{-1/h} + X^{-2/k}\bigr ) \end{aligned}$$
(4.13)

which is sufficient.

Our treatment of the portion where \(q\leqslant \frac{1}{2} X\) is similar but relies on the bound \(v_k(\beta , (N-y^h)^{1/k})\ll |\beta |^{-1/k}\) that is again uniform in y, and which follows from (3.6). This portion of (4.12) therefore does not exceed

We can now proceed as in (4.13) to obtain the same final estimate. \(\square \)

Lemma 4.5

Let k, h be a pair satisfying (1.5) and \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Then

$$\begin{aligned} J(X) \ll N^{2\theta -1+\varepsilon } \bigl ( N^{-1/h}&+ X^{-2/k} + N^{-1/k} X^{1/2} \\&+ N^{-(1/k)-1/(2h)}X + N^{(1/h)-3\theta /2}X^{3/2}\bigr ). \end{aligned}$$

Proof

Directly from (4.5) we have

while Schwarz’s inequality shows that

Combining the last two inequalities implies that

The second integral on the right-hand side is estimated in Lemma 4.4, and contributes an acceptable amount. To bound the first integral on the right-hand side, note that Lemma 3.3 yields \(g(\alpha )-g^*(\alpha ) \ll N^{1/h}X^{1/2+\varepsilon }\) for \(\alpha \in {\mathfrak {L}}(X)\), and then we apply Lemma 4.3 to find that

This establishes the lemma.\(\square \)

We have completed the estimation of J(X), but pause to simplify the results for readier use.

Lemma 4.6

Let \(k=2\), \(h\geqslant 3\) and \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Put \(\sigma (3)=1/12\), \(\sigma (4)=1/16\) and \(\sigma (h)=0\) for \(h\geqslant 5\). Then

Proof

First suppose that \(h\geqslant 5\). For \(X\geqslant N^{2/h}\) Lemma 4.2 gives \(J(X)\ll N^{1/h}\). For \(X\leqslant N^{2/h}\) the desired estimate is contained in Lemma 4.5.

Next suppose that \(h=3\) or 4. If \(X\geqslant N^{1/2-\sigma (h)}\), we use (4.8) and obtain \(J(X) \ll N^{1/h+\sigma (h)+\varepsilon }\). For \(X\leqslant N^{1/2-\sigma (h)}\) the desired bound is again a consequence of Lemma 4.5.\(\square \)

Lemma 4.7

Let \(k=3\) and \(h=4\) or 5. Further let \(1\leqslant X\leqslant \frac{1}{4} N^{1/2}\). Then

$$\begin{aligned} J(X) \ll N^{2\theta -1+\varepsilon }\bigl ( N^{-1/(8h)} + X^{-2/3}\bigr ). \end{aligned}$$

Proof

This follows from Lemma 4.2 for \(X\geqslant N^{1/3+ 1/(4h)}\), and from Lemma 4.5 for the remaining X.\(\square \)

5 Imperfect variance

We launch our attack on Theorem 1.1 by first considering the expression

that may be viewed as an imperfect version of the variance \(V(N,Q)\). One opens the square and finds that

(5.1)

where

(5.2)

and

Our ultimate goal in this section is an asymptotic formula for \(U(N,Q,T)\). It is easy to extract a main term from \(U_0(N,T)\).

Lemma 5.1

Let \(T\geqslant 1\). Then

$$\begin{aligned} U_0(N,T)= \sum _{n\leqslant N} r(n)^2 + O\bigl (T^{1+\varepsilon } N^{2\theta -1} +T^{2+\varepsilon }\bigr ). \end{aligned}$$

Proof

We square out the expression in (5.2), and first consider the cross term

Here, by (3.1), (3.2) and (2.3), we have the trivial bound

Hence, by Lemma 2.1 and partial summation, we see that the cross term is bounded by

$$\begin{aligned} \ll T^{1+\varepsilon } \sum _{n\leqslant N} n^{\theta -1} r(n) \ll T^{1+\varepsilon } N^{2\theta -1}, \end{aligned}$$

which is acceptable. This leaves the sum involving \(|\mathfrak {s}(n;T)|^2\), and here Lemma 3.1 provides an acceptable estimate. \(\square \)

The next theorem provides an estimate for S. Recall the exponents \(\sigma (h)\) defined in Lemma 4.6.

Theorem 5.2

Let k, h be one of the pairs satisfying (1.5) and suppose that \(1\leqslant Q\leqslant N^\theta \). If \(k=2\) and \(1\leqslant T\leqslant N^{1/h}\), then

$$\begin{aligned} S\ll N^{1+1/h+\sigma +\varepsilon } + N^{1+2/h+\varepsilon } T^{-1}. \end{aligned}$$

If \(k=3\), \(h=4\) and \(1\leqslant T\leqslant N^{1/8}\), then

$$\begin{aligned} S\ll N^{2\theta - 1/32 +\varepsilon } + N^{2\theta +\varepsilon } T^{-1} + N^{7/8+\varepsilon }T^2, \end{aligned}$$

and if \(k=3\), \(h=5\) and \(1\leqslant T\leqslant N^{1/20}\), then

$$\begin{aligned} S\ll N^{2\theta - 1/40 +\varepsilon } + N^{2\theta +\varepsilon } T^{-1} + N^{19/20+\varepsilon }T^2. \end{aligned}$$

The initial steps in the proof of this theorem are identical to the work in [2,  Section 3]. Form the exponential sums

(5.3)

and

$$\begin{aligned} F(\alpha )=\sum _{q\leqslant Q}\sum _{r\leqslant N/q}\!\! e(\alpha qr). \end{aligned}$$

Then

(5.4)

We follow Goldston and Vaughan [4, Section 3] and examine the integral in (5.4) by the circle method. This depends on a mean square estimate for \(G(\alpha )\). By (5.3) and (3.3),

so that

(5.5)

By Lemma 3.1 and orthogonality, one finds that

(5.6)

Then, by (4.7), (5.5) and (5.6), it follows that

(5.7)

Consider a typical interval \({{\mathfrak {M}}}(r,b)\) associated with the element b/r of the Farey dissection of order \(2N^{1/2}\), namely, when \(1\leqslant b\leqslant r\leqslant 2N^{1/2}\) and \((b,r)=1\),

$$\begin{aligned} {{\mathfrak {M}}}(r,b)= \biggl ( {\frac{b+b_-}{r+r_-}},{\frac{b+b_+}{r+r_+}}\biggr ] \end{aligned}$$

where \(r_\pm \) is defined by \(br_\pm \equiv \mp 1\pmod r\) and \(2N^{1/2}-r<r_\pm \leqslant 2N^{1/2}\) and \(b_\pm \) is defined by \(b_\pm =(br_\pm \pm 1)/r\). We observe that

$$\begin{aligned} \biggl | {\frac{b+b_\pm }{r+r_\pm }}-{\frac{b}{r}}\biggr | = {\frac{1}{r(r+r_\pm )}} \end{aligned}$$

lies in \([1/(4rN^{1/2}),1/(2rN^{1/2}))\). For \(r\leqslant \frac{1}{4} N^{1/2}\) this implies that

$$\begin{aligned} {\mathfrak {N}}\bigl (r,b; {\textstyle \frac{1}{4}}N^{1/2}\bigr ) \subset {{\mathfrak {M}}}(r,b). \end{aligned}$$
(5.8)
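The construction of the Farey neighbours \(r_\pm , b_\pm \) and the gap computation underlying (5.8) can be verified exactly with rational arithmetic. A sketch (our own, with hypothetical helper names; here the Farey order \(R=2N^{1/2}\) is taken to be the even integer 20):

```python
from fractions import Fraction
from math import gcd

def neighbour(b, r, R, sign):
    """r_pm with b*r_pm == -sign (mod r) and R - r < r_pm <= R,
    together with b_pm = (b*r_pm + sign)/r, for sign = +1 or -1."""
    if r == 1:
        rp = R
    else:
        t = (-sign * pow(b, -1, r)) % r
        rp = t + ((R - t) // r) * r        # unique representative in (R - r, R]
    return rp, (b * rp + sign) // r

R = 20                                     # Farey order 2*N^(1/2), so N^(1/2) = R/2
for r in range(1, R + 1):
    for b in range(1, r + 1):
        if gcd(b, r) != 1:
            continue
        for sign in (+1, -1):
            rp, bp = neighbour(b, r, R, sign)
            assert R - r < rp <= R and (b * rp + sign) % r == 0
            gap = abs(Fraction(b + bp, r + rp) - Fraction(b, r))
            assert gap == Fraction(1, r * (r + rp))
            # gap lies in [1/(4 r N^(1/2)), 1/(2 r N^(1/2))) = [1/(2rR), 1/(rR))
            assert Fraction(1, 2 * r * R) <= gap < Fraction(1, r * R)
print("Farey gap bounds verified for R =", R)
```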

We invoke the analysis of the function F relative to the Farey intervals performed in [2, Section 3], with \(R=2N^{1/2}\) (in the notation of [2]). Then, by (5.8), we infer the bound

for all \(\alpha \) covered by the Farey intervals. Consequently, by (5.4) and (5.7),

(5.9)

We are reduced to estimating the integral on the right hand side. We choose a parameter Y with \(1\leqslant Y\leqslant \frac{1}{4}N^{1/2}\), write \({\mathfrak {K}}= {\mathfrak {N}}(Y)\) for the core major arcs, and put \({\mathfrak {k}} = {\mathfrak {N}}{\setminus }{\mathfrak {K}}\). By (4.6) we see that \(\Phi (\alpha )\ll Y^{-1}\) holds uniformly on \({\mathfrak {k}}\), so that (5.6) provides us with the bound

Further, since \({\mathfrak {k}}\) is covered by no more than \(\log N\) sets \({\mathfrak {L}}(X)\) with \(Y\leqslant X\leqslant \frac{1}{4} N^{1/2}\), it follows from (4.5) that

If \(k=2\), then by (5.5) and Lemma 4.6, the last two bounds combine to give

(5.10)

When \(k=3\), the same argument based on Lemma 4.7 yields

(5.11)

On the core major arcs, we use approximations to \(G(\alpha )\) provided by Lemma 3.4. Here we follow the pattern of our previous work [2, Section 3] quite closely, beginning with (3.12) of that memoir. In the wider context of our current analysis, this formula still reads

$$\begin{aligned} G(\alpha ) = g(\alpha )- \sum _{t\leqslant T} \sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t\!\! W(\alpha ;t,c), \end{aligned}$$
(5.12)

but now stems from (5.3), (3.12) and (3.2).

We wish to use this for \(\alpha \in {\mathfrak {K}}\) and therefore write \({\mathfrak {K}}\) as the disjoint union of intervals \({\mathfrak {K}}(r,b)= \{\alpha : |r\alpha - b|\leqslant Y/N\}\) with \(1\leqslant b\leqslant r\leqslant Y\) and \((b,r)=1\). Suppose that \(\alpha \) is in one of these arcs \({\mathfrak {K}}(r,b)\). Then

$$\begin{aligned} G(\alpha ) = g(\alpha ) - W(\alpha ;r,b) + D(\alpha ;r,b) \end{aligned}$$
(5.13)

where in view of (5.12) we have

$$\begin{aligned} D(\alpha ;r,b) = W(\alpha ;r,b) - \sum _{t\leqslant T} \sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t\!\! W(\alpha ;t,c). \end{aligned}$$

Here we estimate a typical summand on the far right when \(c/t \ne b/r\). If \(\alpha \in {\mathfrak {K}}(r,b)\), then \(|\alpha -b/r|\leqslant Y(rN)^{-1}\). We proceed subject to the condition that \(T\leqslant Y\); this will turn out to be the case later. It then follows that

$$\begin{aligned} \bigg \Vert \alpha -\frac{c}{t}\bigg \Vert \geqslant \bigg \Vert \frac{b}{r}-\frac{c}{t}\bigg \Vert -\frac{Y}{rN} \geqslant \frac{1}{2}\, \biggl \Vert \frac{b}{r}-\frac{c}{t}\biggr \Vert . \end{aligned}$$

Hence, by (3.1), (3.11) and (3.12), we have
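
Incidentally, the displayed chain of inequalities for \(\Vert \alpha - c/t\Vert \) can be justified in full as follows (a sketch, using only \(c/t\ne b/r\), \(t\leqslant T\leqslant Y\) and \(Y\leqslant \frac{1}{4}N^{1/2}\)): since \(bt-cr\) is a non-zero integer and \(|b/r-c/t|<1\), one has

$$\begin{aligned} \biggl \Vert \frac{b}{r}-\frac{c}{t}\biggr \Vert \geqslant \frac{1}{rt} \geqslant \frac{1}{rY} \geqslant \frac{2Y}{rN}, \end{aligned}$$

the last step because \(2Y^2\leqslant N\). Hence the perturbation \(|\alpha -b/r|\leqslant Y(rN)^{-1}\) is at most half of \(\Vert b/r-c/t\Vert \), as required.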

At this stage, we may follow through the argument in [2] that leads from (3.15) to (3.20) of that paper. Instead of [2, (3.17)] we then encounter the sum

and by (2.3) and Cauchy’s inequality, this sum is \(O(T^{1+\varepsilon })\), as is required to proceed to [2, (3.18)], and we then arrive at the estimates

$$\begin{aligned} D(\alpha ;r,b) \ll r^{\theta } + T^{1+\varepsilon } \quad (r\leqslant T) \end{aligned}$$
(5.14)

and

$$\begin{aligned} D(\alpha ;r,b) \ll |W(\alpha ;r,b)|+r^{\theta } + T^{1+\varepsilon } \quad (r> T). \end{aligned}$$
(5.15)

Lemma 5.3

One has

Proof

We begin with the set of all \(\alpha \) where \(D(\alpha ;r,b)\ll r^{\theta }+T^{1+\varepsilon }\) holds. This set makes a contribution to the integral in question that does not exceed

$$\begin{aligned} \sum _{r\leqslant Y} \sum _{\begin{array}{c} b=1\\ (b,r)=1 \end{array}}^r \int _{{{\mathfrak {K}}}(r,b)}\!\! r^{-1}(1+N|\alpha -b/r|)^{-1} \bigl (r^{2\theta }+T^{2+\varepsilon }\bigr )\,\mathrm {d}\alpha \ll N^{\varepsilon -1} Y \bigl (Y^{2\theta } +T^2\bigr ). \end{aligned}$$
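
For completeness, the final bound here may be verified directly (a sketch): on \({\mathfrak {K}}(r,b)\) one has \(|\alpha -b/r|\leqslant Y(rN)^{-1}\), whence

$$\begin{aligned} \int _{{\mathfrak {K}}(r,b)} (1+N|\alpha -b/r|)^{-1}\,\mathrm {d}\alpha = \frac{2}{N}\,\log \Bigl (1+\frac{Y}{r}\Bigr ) \ll \frac{\log N}{N}, \end{aligned}$$

and summing over the at most r admissible values of b removes the factor \(r^{-1}\), so the whole expression is \(\ll N^{-1}\log N \sum _{r\leqslant Y} \bigl (r^{2\theta }+T^{2+\varepsilon }\bigr ) \ll N^{\varepsilon -1} Y\bigl (Y^{2\theta }+T^2\bigr )\).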

On the remaining set we have \(r>T\) and \(D(\alpha ;r,b) \ll |W(\alpha ;r,b)|\). This follows from (5.14) and (5.15). By (3.1), (3.11), (3.12) and (2.4) we see that

This confirms the estimate proposed in Lemma 5.3.\(\square \)

Our last auxiliary estimate is now almost immediate. For \(\alpha \in {\mathfrak {K}}(r,b)\) we see from (5.13) that

$$\begin{aligned} |G(\alpha )|^2 \ll |g(\alpha )- W(\alpha ;r,b)|^2 + |D(\alpha ;r,b)|^2. \end{aligned}$$

We multiply by \(\Phi (\alpha )\). Then, by Lemma 3.4,

Now we integrate over \({\mathfrak {K}}\) and obtain

We apply (2.3) and combine the result with Lemma 5.3 to confirm the bound

(5.16)

We are ready to derive an estimate for S. Indeed, \({\mathfrak {N}}\) is the union of \({\mathfrak {k}}\) and \({\mathfrak {K}}\), so we have to combine the results in (5.9) and (5.16) with either (5.10) or (5.11). We first consider the cases with \(k=2\) and choose \(Y=N^{1/h}\). Then, we are allowed to take \(1\leqslant T\leqslant N^{1/h}\), and find that

$$\begin{aligned} S\ll N^{1+1/h+ \sigma (h)+\varepsilon } + N^{1+2/h +\varepsilon }T^{-1}. \end{aligned}$$

For \(k=3\), the choices \(Y=Y_5=N^{1/20}\) and \(Y=Y_4=N^{1/8}\) give

in the range \(1\leqslant T\leqslant Y_h\). In both cases, Theorem 5.2 is now available.

Asymptotic formulae for U(NQT) are also available.

Theorem 5.4

Let kh be one of the pairs satisfying (1.5) and suppose that \(1\leqslant Q\leqslant N^\theta \). If \(k=2\) and \(1\leqslant T\leqslant N^{1/h}\), then

$$\begin{aligned} U(N,Q,T) = CQN^\theta +O\bigl (N^{1+1/h+ \sigma (h)+\varepsilon } + N^{1+2/h +\varepsilon }T^{-1} +QTN^{2/h+\varepsilon }\bigr ). \end{aligned}$$

If \(k=3\) and \(1\leqslant T \leqslant N^{1/(8h)}\), then

$$\begin{aligned} U(N,Q,T) = CQN^\theta +O\bigl (N^{2\theta +\varepsilon }T^{-1}\bigr ). \end{aligned}$$

We remark here that for \(Q\leqslant N^\theta \) and \(k=2\), \(h\geqslant 6\) one has \(QTN^{2/h} \leqslant N^{(1/2)+(4/h)} \leqslant N^{1+1/h}\), so the term \(QTN^{2/h}\) can be ignored except when \(h\leqslant 5\). Echoing an earlier comment on the error term in Theorem 1.1, note here that for \(k=2\) errors of size \(N^{1+1/h}\) correspond to square root cancellation in the integral representation of S, and are probably hard to improve.

For a proof of Theorem 5.4, use Lemma 2.1 within Lemma 5.1 to see that

$$\begin{aligned} U_0(N,T) = CN^\theta + O\bigl (N^{1/k} + N^{2/h+\varepsilon } + TN^{2\theta -1+\varepsilon } + T^{2}N^\varepsilon \bigr ) \end{aligned}$$

where we choose T in accordance with Theorem 5.2. Then \(T\leqslant N^{1/h}\) in all cases, so the term \(T^2\) in the error term is redundant. By (5.1) we get

$$\begin{aligned} U(N,Q,T) = CQN^\theta +2S + O\bigl (N^\theta + Q\bigl (N^{1/k}+N^{2/h+\varepsilon } +TN^{2\theta -1+\varepsilon }\bigr )\bigr ).\nonumber \\ \end{aligned}$$
(5.17)

First suppose that \(k=3\) and \(1\leqslant T \leqslant N^{1/(8h)}\). Then, by Theorem 5.2, one has \(S\ll N^{2\theta +\varepsilon }T^{-1}\), and Theorem 5.4 follows from (5.17).

Next suppose that \(k=2\) and \(1\leqslant T\leqslant N^{1/h}\). Now \(2\theta -1= 2/h\), and the term \(O(N^{2/h+\varepsilon })\) in (5.17) can be ignored. Further, one has \(QN^{1/2}\leqslant N^{1+1/h}\), and the cases with \(k=2\) of Theorem 5.4 again follow from (5.17) and Theorem 5.2.

6 From the imperfect to the perfect

In this section we establish Theorem 1.1. The argument is very similar to the one presented in §4 of our previous communication [2] but there are some subtleties, and we therefore proceed in moderate detail. Let

(6.1)

and consider

$$\begin{aligned} \Delta (N,Q,T) = \sum _{q\leqslant Q} \sum _{a=1}^q\,\bigg | \Xi (N,T;q,a) - \frac{\rho (q,a)}{\theta q^2}\, N^{\theta }\bigg |^2 . \end{aligned}$$
(6.2)

Lemma 6.1

Let \(1\leqslant T\leqslant Q\leqslant N^\theta \). Then

Proof

The method of proof of [14, Lemma 2.12] yields

$$\begin{aligned} \frac{\rho (q,a)}{q^2} = \frac{1}{q}\sum _{t|q}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t \!\frac{S_k(t,c)S_h(t,c)}{t^2}\, e(-ca/t) . \end{aligned}$$
(6.3)
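
Identity (6.3) can be confirmed numerically for small moduli. In the sketch below (Python) we write \(S_k(t,c)\) for the complete exponential sum \(\sum _{x=1}^{t} e(cx^k/t)\), which we assume matches the definition used in Section 3, and take \(k=2\), \(h=3\) purely for illustration:

```python
import cmath
from math import gcd

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def S(k, t, c):
    """Complete exponential sum sum_{x=1}^{t} e(c x^k / t)."""
    return sum(e(c * pow(x, k, t) / t) for x in range(1, t + 1))

def rho(k, h, q, a):
    """Number of incongruent solutions of x^k + y^h = a (mod q)."""
    return sum((pow(x, k, q) + pow(y, h, q) - a) % q == 0
               for x in range(q) for y in range(q))

def rho_via_6_3(k, h, q, a):
    """rho(q,a) recovered from the right-hand side of (6.3)."""
    total = 0.0
    for t in range(1, q + 1):
        if q % t == 0:                     # sum over divisors t of q
            for c in range(1, t + 1):
                if gcd(c, t) == 1:
                    total += (S(k, t, c) * S(h, t, c) / t**2 * e(-c * a / t)).real
    return q * total

# Compare the direct count with the divisor-sum formula.
for q in (2, 3, 4, 6):
    for a in range(1, q + 1):
        assert abs(rho(2, 3, q, a) - rho_via_6_3(2, 3, q, a)) < 1e-9
```

The brute-force count is quadratic in q, so this is only a check for tiny moduli, not an efficient way to compute \(\rho \).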

We split the partial singular series as \({\mathfrak {s}}(n;T) = {\mathfrak {s}}^*_q(n;T) + {\mathfrak {s}}^\dagger _q(n;T)\) where

$$\begin{aligned} {\mathfrak {s}}^*_q(n;T) = \sum _{\begin{array}{c} t\leqslant T\\ t\mid q \end{array}}\sum _{\begin{array}{c} c=1\\ (c,t)=1 \end{array}}^t\! \frac{S_k(t,c)S_h(t,c)}{t^2}\, e(-ca/t) \end{aligned}$$
(6.4)

and \({\mathfrak {s}}^\dagger _q(n;T)\) is defined in the same way, but with the complementary condition \(t\not \mid q\).

Let \(\Xi ^\dagger (N,T;q,a)\) be the sum defined in (6.1), but with \({\mathfrak {s}}(n;T)\) replaced by \({\mathfrak {s}}^\dagger _q(n;T)\). We shall use this notation also in situations where \(\Xi \) and \({\mathfrak {s}}\) are decorated by symbols other than \(\dagger \). We show that \(\Xi ^\dagger (N,T;q,a)\) is small on average. Let \(1\leqslant a\leqslant q\). Then

The total contribution from terms with \(m=0\) is

(6.5)

For \(m\geqslant 1\) we have \(a+mq \gg mq\), and so, by partial summation,

Hence the contribution to \(\Xi ^\dagger (N,T;q,a)\) from terms with \(m\geqslant 1\) is bounded by

Here we used (2.3) in the final estimate. By similar estimates within (6.5), we also see \( \Xi ^\dagger (N,T;q,a) \ll (a^{\theta -1} + q^{\theta -1}) T^{1+\varepsilon } \) and then have

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q\, |\Xi ^\dagger (N,T;q,a)|^2\! \ll Q^{2\theta } T^{2+\varepsilon }. \end{aligned}$$
(6.6)

Now let

(6.7)

This sum depends on n only modulo q. Thus, with the convention concerning decorations of \(\Xi \) in mind,

Much as before, we find that whenever \(q\leqslant N^\theta \) then

and consequently,

$$\begin{aligned} \sum _{a=1}^q \,|\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{2\theta }q^{-2} \sum _{a=1}^q \,|{\mathfrak {s}}^\ddagger _q(a,T)|^2. \end{aligned}$$

By (6.7) and Cauchy’s inequality, and then by orthogonality,

Recall that \(Q\leqslant N^{\theta }\). Hence

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q |\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{2\theta +\varepsilon } \sum _{q\leqslant Q} q^{-1} \sum _{\begin{array}{c} t> T\\ t\mid q \end{array}} t \kappa _k(t)^2\kappa _h(t)^2. \end{aligned}$$

In the sums over q and t we write \(q=ts\). By (2.4) the double sum is

$$\begin{aligned} \sum _{T<t\leqslant Q}\!\! \kappa _k(t)^2\kappa _h(t)^2 \sum _{s\leqslant Q/t}\!\! s^{-1} \ll (\log Q) \sum _{T<t\leqslant Q}\!\! \kappa _k(t)^2\kappa _h(t)^2 \ll N^\varepsilon T^{-1}. \end{aligned}$$

This proves that

$$\begin{aligned} \sum _{q\leqslant Q} \sum _{a=1}^q |\Xi ^\ddagger (N,T;q,a)|^2 \ll N^{2\theta +\varepsilon }T^{-1}. \end{aligned}$$
(6.8)

We collect the results obtained so far in a single estimate. By (6.3), (6.4) and (6.7), we have

It follows that

and from (6.6) and (6.8) we then conclude

(6.9)

By (6.9) and (6.2), we are reduced to replacing

To realize this, we may follow the argument given in [2] very closely. By orthogonality, we see that

Here, by (3.11), the sum on the far right over n is \( O(\Vert b/q\Vert ^{-\theta })\). Further, by (6.3) and (3.1),

(6.10)

By orthogonality, it now follows that the sum

(6.11)

is bounded above by

as is readily confirmed from (2.5).

The elementary evaluation

$$\begin{aligned} \sum _{n\leqslant N} n^{\theta -1} = \theta ^{-1} N^{\theta } + O(1) \end{aligned}$$

together with (6.10) and (2.5) implies the bound
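
As a numerical sanity check on the elementary evaluation above, one can compare the partial sum with \(\theta ^{-1}N^{\theta }\) directly (a sketch in Python; the value \(\theta =5/6\), corresponding to \(k=2\), \(h=3\), is chosen purely for illustration):

```python
# Compare sum_{n <= N} n^{theta - 1} with theta^{-1} N^theta; by the
# evaluation above the discrepancy is O(1), i.e. it stays bounded as
# N grows.  Here theta = 5/6 (the case k = 2, h = 3).
theta = 5 / 6

def discrepancy(N):
    partial = sum(n ** (theta - 1) for n in range(1, N + 1))
    return partial - N ** theta / theta

for N in (10**3, 10**4, 10**5):
    assert abs(discrepancy(N)) < 2.0   # bounded uniformly in N
```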

Equipped with the estimate obtained for the sum in (6.11), we deduce that

This bound coupled with (6.9) implies Lemma 6.1.\(\square \)

The endgame begins by writing (1.3) as

$$\begin{aligned} V(N,Q) = \sum _{q\leqslant Q} \sum _{a=1}^q\, \bigl ( A(N,q,a) + B(N,q,a) \bigr )^2 \end{aligned}$$

where

Then, squaring out and estimating the cross term by Cauchy’s inequality, we find

$$\begin{aligned} V(N,Q) = U(N,Q,T) + (\theta C)^2\Delta (N,Q,T) + O\bigl ((U\Delta )^{1/2}\bigr ). \end{aligned}$$
(6.12)

Theorem 1.1 is now easily deduced. When \(k=3\) we choose \(T=N^{1/(8h)}\). Then, for \(T\leqslant Q\leqslant N^\theta \), Lemma 6.1 gives \(\Delta \ll N^{2\theta }T^{-1}\), and Theorem 5.4 yields \(U=CQN^{\theta } +O(N^{2\theta }T^{-1})\). Now (6.12) gives

$$\begin{aligned} V(N,Q) = CQN^\theta + O\bigl (N^{2\theta -1/(8h)+\varepsilon } + Q^{1/2}N^{(3\theta /2) - 1/(16h)+\varepsilon }\bigr ) \end{aligned}$$
(6.13)

in the range \(N^{1/(8h)}\leqslant Q\leqslant N^\theta \). For \(1\leqslant Q\leqslant N^{1/(8h)}\) the asymptotic formula (6.13) reduces to the upper bound \(V(N,Q)\ll N^{2\theta -1/(8h)+\varepsilon }\) that we have just confirmed for \(Q= N^{1/(8h)}\), and hence remains true for smaller values of Q. This completes the proof of Theorem 1.1 in the case \(k=3\).

The other cases are similar, and we merely indicate the choices of parameters. For \(k=2\), \(h\geqslant 7\), we take \(T=N^{1/h}\). Then, for \(T\leqslant Q\leqslant N^{\theta }\) we have \(T^3 \leqslant (N/Q)^{2\theta } \) so that Lemma 6.1 again gives \(\Delta \ll N^{2\theta }T^{-1}\ll N^{1+1/h}\). As above, we first deduce that

$$\begin{aligned} V(N,Q) = CQN^\theta + O\bigl (N^{1+1/h+\varepsilon } + Q^{1/2}N^{3/4 - 3/(2h)+\varepsilon }\bigr ) \end{aligned}$$

holds for \(T\leqslant Q\leqslant N^\theta \), and then see that the range \(1\leqslant Q\leqslant T\) is covered for trivial reasons.

Next, consider the cases \(k=2\), \(4\leqslant h\leqslant 6\). We wish to conclude from Lemma 6.1 that \(\Delta \ll N^{2\theta }T^{-1}\) with T as large as possible. This requires us to take \(T\leqslant N^{1/h}\) and \(T\leqslant (N/Q)^{2\theta /3}\). However, at least when \(h\leqslant 5\) the term \(QTN^{2/h}\) needs attention, and we require that \(QTN^{2/h}\leqslant N^{2\theta }T^{-1}\) as well. This reduces to \(T\leqslant (N/Q)^{1/2}\). Since for \(h\geqslant 4\) we have \(\frac{2}{3} \theta \leqslant \frac{1}{2}\), this last condition on T is implied by the previous ones, and we can choose

Then, again via Theorem 5.4 and (6.12), we conclude that

$$\begin{aligned} V(N,Q) =CQN^\theta + O\bigl (N^{2\theta -1/h+\sigma (h)+\varepsilon } +N^{2\theta +\varepsilon }T^{-1} + Q^{1/2}N^{3\theta /2+\varepsilon }T^{-1/2}\bigr ) \end{aligned}$$

holds for \(T\leqslant Q\leqslant N^\theta \), and this yields the bound in Theorem 1.1. The smaller values of Q are again a trivial range.

This leaves the case \(k=2\), \(h=3\). We follow the same strategy but this time have \(\frac{2}{3} \theta > 1/2\) and therefore choose

Theorem 5.4 gives

$$\begin{aligned} U= CQN^{5/6} + O\bigl (N^{17/12+\varepsilon }+ Q^{1/2}N^{7/6+\varepsilon }\bigr ), \end{aligned}$$

and Lemma 6.1 asserts that

$$\begin{aligned} \Delta \ll N^{2\theta +\varepsilon }T^{-1} \ll N^{4/3+\varepsilon } + N^{7/6+\varepsilon }Q^{1/2}. \end{aligned}$$

Theorem 1.1 now follows from (6.12).

7 The proof of Theorem 1.2

We define \(\eta (n)\) through

$$\begin{aligned} r_0(n)= r(n) - \eta (n). \end{aligned}$$

Then \(\eta (n)\) is a non-negative integer, so that

$$\begin{aligned} \eta (n) \leqslant \eta (n)^2 = (r(n)-r_0(n))^2 \leqslant r(n)^2-r(n). \end{aligned}$$

By Lemma 2.1, this implies that

$$\begin{aligned} \sum _{n\leqslant N} \eta (n) \leqslant \sum _{n\leqslant N} \eta (n)^2 \leqslant \sum _{n\leqslant N} \bigl (r(n)^2-r(n)\bigr ) \ll N^{2/h+\varepsilon }. \end{aligned}$$
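
The bound above reflects the fact that very few integers possess more than one representation. A quick numerical illustration (a sketch in Python, with \(k=2\), \(h=3\) taken for concreteness; the helper name is ours):

```python
from collections import Counter

def repr_counts(N, k=2, h=3):
    """r(n) = number of solutions of x^k + y^h = n in natural
    numbers, recorded for every n <= N."""
    r = Counter()
    x = 1
    while x**k + 1 <= N:
        y = 1
        while x**k + y**h <= N:
            r[x**k + y**h] += 1
            y += 1
        x += 1
    return r

N = 10**4
r = repr_counts(N)
excess = sum(v * (v - 1) for v in r.values())   # sum of r(n)^2 - r(n)
# Most n <= N have at most one representation, so the excess is far
# smaller than N, consistent with the bound O(N^{2/h + eps}).
assert r[17] == 2          # 17 = 4^2 + 1^3 = 3^2 + 2^3
assert 0 <= excess < N
```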

Now

where

(7.1)
(7.2)

To bound \(E_0\), one opens the square and finds that

Hence

Here the terms with \(n=m\) contribute

$$\begin{aligned} \ll Q \sum _{n\leqslant N} \eta (n)^2 \ll QN^{2/h+\varepsilon } \end{aligned}$$

while the terms with \(n\ne m\) contribute no more than

$$\begin{aligned} \ll Q^\varepsilon \biggl (\sum _{n\leqslant N} \eta (n)\biggr )^2 \ll N^{4/h+\varepsilon }. \end{aligned}$$

This shows that

$$\begin{aligned} E_0 \ll QN^{2/h+\varepsilon }+N^{4/h+\varepsilon }. \end{aligned}$$

Further, by (7.1), (7.2) and Cauchy’s inequality,

$$\begin{aligned} E_1^2 \ll V(N,Q) E_0. \end{aligned}$$

Theorem 1.2 now follows.