1 Introduction

We say that a set \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) of distinct integers is admissible if \(\#\{\mathcal {H}\,\,(\text {mod}\,\,p)\}<p\) for every prime p. An outstanding problem in analytic number theory is the prime k-tuples conjecture, which asserts the following.

Conjecture 1.1

Let \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) be admissible. Then there exist infinitely many integers n such that the translates \(n+h_1,\ldots ,n+h_k\) are all prime.

A proof of this conjecture is far out of reach of current techniques. However, we have been successful in establishing various weak versions of this result using sieve methods. For example, the Maynard–Tao sieve can be used to show that \(\gg \log {k}\) of the translates are simultaneously prime infinitely often, when k is sufficiently large (cf. [9, 11]).
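Admissibility of a finite set is a finite check: only primes \(p\le k\) can obstruct, since k distinct residues can never cover all p classes when \(p>k.\) A minimal Python sketch of the definition (the function name is ours, for illustration only):

```python
def is_admissible(H):
    """Return True if the finite set H of distinct integers misses at
    least one residue class mod p for every prime p. Only p <= len(H)
    can obstruct, since len(H) residues cannot cover p > len(H) classes."""
    k = len(H)
    for p in range(2, k + 1):
        if any(p % q == 0 for q in range(2, p)):
            continue  # p is composite, skip
        if len({h % p for h in H}) == p:
            return False  # H covers every residue class mod p
    return True

print(is_admissible([0, 2, 6]))  # True: misses 1 (mod 3)
print(is_admissible([0, 2, 4]))  # False: covers all residues mod 3
```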

We extend the definition of admissibility to infinite ordered sets and say \(\mathcal {H}^*=\{h_1,h_2,\ldots \}\) is admissible if the finite truncation \(\{h_1,\ldots ,h_k\}\subseteq \mathcal {H}^*\) is admissible for every \(k\ge 1.\) In this paper we are interested in the following variation of this conjecture, for numbers representable as a sum of two squares.

Conjecture 1.2

Let \(\mathcal {H}^*=\{h_1,h_2,\ldots \}\) be admissible. Then there exists an increasing sequence of integers \(n_k\) such that, for every \(k\ge 1,\) the translates \(n_k+h_1,\ldots ,n_k+h_k\) are sums of two squares.

We remark that if we replaced “sums of two squares” with “prime” here, then this would simply be a reformulation of Conjecture 1.1. (It is easy to show that any finite admissible set can be extended to an infinite admissible set.)

Our interest in this version of the conjecture stems from a problem which appears towards the end of D. Jakobson’s “Quantum limits on flat tori” paper [8]. In this paper Jakobson is concerned with characterising the possible quantum limits that can arise on the standard flat d-dimensional torus \(\mathbb {T}^d=\mathbb {R}^d/\mathbb {Z}^d.\) A complete classification of such objects is established in two dimensions, with possible behaviours in higher dimensions described unconditionally for \(d\ge 4,\) and conditionally for \(d=3\) on a weak version of Conjecture 1.2 (cf. [8, Conjecture 8.2]).

In this paper we establish Jakobson’s conjecture.

Theorem 1.3

There exist increasing sequences of natural numbers \(a_j\) and \(M_k\) such that \(M_k-a_j^2\) is a sum of two squares for every \(k\ge 1\) and \(1\le j \le k.\) Moreover, the sequence \(a_j\) is such that:

  1. (1)

    \(r_{2}(a_j)<r_{2}(a_{j+1})\) for all \(j\ge 1.\)

  2. (2)

    The even parts are uniformly bounded; that is to say, if we write \(a_j = 2^{b_j}m_j\) where \((m_j,2)=1,\) then \(b_j = O(1)\) uniformly for \(j\ge 1\).

Here, \(r_2(n)\) denotes the number of representations of n as a sum of two squares. We deduce Theorem 1.3 from the following general result.

Theorem 1.4

Let \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) be admissible such that each \(h_i\) is divisible by 4. Then there exist increasing sequences of natural numbers \(a_j\) and \(n_k\) such that \(n_k+h_{a_j}\) is a sum of two squares for every \(k\ge 1\) and \(1\le j\le k.\)

For example, applying Theorem 1.4 to the admissible set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) with elements \(h_i=-(2\cdot 5^{i})^2\) yields Theorem 1.3: taking \(M_k=n_k\) and \(2\cdot 5^{a_j}\) as the sequence of Theorem 1.3, the numbers \(M_k-(2\cdot 5^{a_j})^2=n_k+h_{a_j}\) are sums of two squares, the values \(r_2(2\cdot 5^{a_j})=4(a_j+1)\) are strictly increasing, and the even part of \(2\cdot 5^{a_j}\) is always 2.
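To see why this set qualifies, note that each \(h_i\) is divisible by 4; modulo 2 every element is 0, while modulo an odd prime p the elements lie in \(-4\cdot \{\text {squares}\},\) a set of at most \((p+1)/2<p\) residue classes, so every truncation is admissible. A quick numerical confirmation (a sketch reusing the is_admissible function from above; the assertions are ours, for illustration):

```python
# Truncations of h_i = -(2*5**i)**2 are admissible and divisible by 4;
# properties (1) and (2) of Theorem 1.3 then follow from the facts
# r(2*5**j) = 4*(j+1) (strictly increasing) and even part always 2.
for k in range(1, 9):
    H = [-(2 * 5 ** i) ** 2 for i in range(1, k + 1)]
    assert is_admissible(H) and all(h % 4 == 0 for h in H)
print("all truncations admissible and divisible by 4")
```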

As mentioned above, Theorem 1.3 allows us to deduce results about quantum limits on flat tori. Let \((\lambda _j)_{j\ge 1}\) be a sequence of eigenvalues of the Laplace operator \(\nabla \) on \(\mathbb {T}^d\) such that \(\lambda _j\rightarrow \infty ,\) and let \(\varphi _j\) be corresponding eigenfunctions with \(\left\Vert \varphi _j\right\Vert _2=1\). If the sequence of probability measures \(\mathrm {d}\mu _j=|\varphi _j|^2\mathrm {d}x\) has a weak-\(*\) limit \(\mathrm {d}v\), then we call \(\mathrm {d}v\) a quantum limit. (Here \(\mathrm {d}x\) is the normalised Riemannian volume.)

It can be shown that all limits of such sequences \(\mathrm {d}\mu _j\) are absolutely continuous with respect to the Lebesgue measure on \(\mathbb {T}^d\) (cf. [8, Theorem 1.3]), and so one can consider the Fourier expansion

$$\begin{aligned} \mathrm {d}v = \sum _{\tau \in \mathbb {Z}^d} c_{\tau } e^{2\pi i \langle \tau , x \rangle }\mathrm {d}x. \end{aligned}$$
(1.1)

Among other things, Jakobson shows that in two dimensions all quantum limits are necessarily trigonometric polynomials (cf. [8, Theorem 1.2]). The same result is false for \(d\ge 4\), and conjecturally false for \(d=3\) as well (cf. [8, Conjecture 8.2] and the following discussion). With Theorem 1.3, we can now complete this aspect of the classification of quantum limits on flat tori.

Theorem 1.5

There exist quantum limits on \(\mathbb {T}^3\) that are not trigonometric polynomials.

As further consequences of Theorem 1.3, we are able to show the following results for quantum limits whose Fourier expansions are written as in (1.1).

Theorem 1.6

Let \(\epsilon >0.\) We have the following.

  1. (i)

    For \(d \ge 4\) there exist quantum limits \(\mathrm {d}v\) on \(\mathbb {T}^d\) with densities that are not in \(l^{2-\epsilon }\) (i.e. for which \(\sum _{\tau }|c_{\tau }|^{2-\epsilon }\) diverges).

  2. (ii)

    For \(d\ge 5\) there exist quantum limits \(\mathrm {d}v\) on \(\mathbb {T}^d\) for which

    $$\begin{aligned} \limsup _{\rho \rightarrow \infty } \frac{\Sigma (\rho )}{\rho ^{d-4-\epsilon }}=+\infty , \end{aligned}$$

    where \(\Sigma (\rho )\) is defined as

    $$\begin{aligned} \Sigma (\rho ) = \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | < \rho \end{array}}|c_{\tau }|. \end{aligned}$$
    (1.2)

The results contained in Theorem 1.6 improve upon various results found in [8]. Part (i) was previously shown for \(d\ge 5\) (cf. [8, Theorem 1.4]), and has now been extended to the case \(d=4,\) where it is optimal. Part (ii) improves on the weaker lower bound

$$\begin{aligned} \limsup _{\rho \rightarrow \infty } \frac{\Sigma (\rho )}{\rho ^{d-5-\epsilon }}=+\infty \end{aligned}$$

which was shown for \(d\ge 6\). The lower bound we prove is believed to be optimal for all \(d\ge 5\) (cf. [8, Proposition 1.2] and comments shortly after).

Remark 1.7

It is well-known that the eigenvalues of \(\nabla \) on \(\mathbb {T}^d\) are the numbers \(4\pi ^2 k\) for non-negative integers k, and they occur with multiplicity \(r_d(k)\) (the number of representations of k as the sum of d squares). This means various constructions associated to quantum limits on flat tori can be translated to problems in number theory involving sums of squares.
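For small parameters these multiplicities can be checked directly. A brute-force Python sketch (exponential in d, so suitable only for tiny inputs; the function and its name are ours):

```python
from itertools import product

def r_d(n, d):
    """Number of representations of n as an ordered sum of d squares,
    counting signs and order, i.e. the multiplicity in Remark 1.7."""
    m = int(n ** 0.5) + 1
    return sum(1 for v in product(range(-m, m + 1), repeat=d)
               if sum(x * x for x in v) == n)

print(r_d(1, 2), r_d(5, 2))  # 4 and 8
print(r_d(25, 3))            # 30: eigenvalue 4*pi^2*25 on T^3 has multiplicity 30
```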

Remark 1.8

Jakobson shows how Theorem 1.3 follows from a weak form of the prime k-tuples conjecture, essentially by using the fact that primes \(p\equiv 1\,\,(\text {mod}\,\,4)\) are sums of two squares (cf. the discussion at the end of [8, Section 8]). We note that the weak form of the conjecture Jakobson uses is still far out of reach of current methods.

2 Outline of new sieve ideas

In this section, let \(\mathcal {A}\subseteq \mathbb {N}\) denote a set of arithmetic interest, which for our purposes is the set of numbers representable as a sum of two squares (though the following discussion holds more generally). We will denote random variables by boldface letters, for example \(\mathbf {X}\). We will let \(\mathbb {P}(\cdot )\) denote a probability measure and \(\mathbb {E}[\cdot ]\) the corresponding expectation operator.

2.1 A model problem

Our aim is to prove Theorem 1.4. By a pigeonhole argument (see Proposition 5.1), it suffices to consider the following model problem.

Model Problem

Fix an admissible set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) of integers and a partition \(\mathcal {H}^{*}=B_1\cup B_2\cup \ldots \) where each bin \(B_i\) has a fixed, finite size \(k_i\). Is it the case that for every \(M\ge 1\) there exist elements \(h_{a_1},\ldots ,h_{a_M}\) and infinitely many integers n such that \(h_{a_j}\in B_j\) and \(n+h_{a_j}\in \mathcal {A}\) for \(1\le j\le M\)?

We realise the above set-up as the output of a sieving process. For notational purposes we order \(B_{i}=\{h_{k_0+\cdots +k_{i-1}+1},\ldots ,h_{k_0+\cdots +k_{i}}\}\) for \(i\ge 1,\) with the convention that \(k_0=0.\) Let \(k=k_{0}+\cdots +k_M\) for some large M. Given \(n\in [N,2N)\) for some large N, let \(\mathbf {X}_i\) denote the random variable that counts the number of \(h\in B_i\) such that \(n+h\in \mathcal {A}\), and let \(\mathbf {X}=\mathbf {X}_1+\cdots +\mathbf {X}_{M}.\)

The current method we use to detect primes in k-tuples is the GPY method. For general sets \(\mathcal {A},\) the aim is to show the first moment inequality

$$\begin{aligned} S_{\mathcal {A}}=\sum _{N\le n< 2N} \Bigg (\sum _{i=1}^{k}\mathbb {1}_{\mathcal {A}}(n+h_i)-m\Bigg )w(n)>0 \end{aligned}$$
(2.1)

holds for some integer \(m\ge 1\), where \(\mathbb {1}_{\mathcal {A}}\) denotes the indicator function of the set \(\mathcal {A}\) and the w(n) are non-negative weights (cf. [4, 9, 11]). If we normalise the weights to sum to 1, then this is saying “if we choose n randomly from the interval [N, 2N) with probability w(n), then \(\mathbb {E}[\mathbf {X}]>m\).” From this we can deduce the existence of an \(n\in [N,2N)\) for which at least \(m+1\) of the translates \(n+h_i\) lie in \(\mathcal {A}\). We say such a translate has been “accepted.” Exactly which translates are accepted is unknown; this is a limitation of the first moment method.
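Spelling out this deduction: by non-negativity of the weights, (2.1) forces the existence of some \(n\in [N,2N)\) with \(w(n)>0\) and

$$\begin{aligned} \sum _{i=1}^{k}\mathbb {1}_{\mathcal {A}}(n+h_i)-m>0, \end{aligned}$$

and since the left-hand side is an integer, at least \(m+1\) of the translates \(n+h_i\) lie in \(\mathcal {A}.\)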

It is clear that for our model problem we require more information about which translates \(n+h\) are accepted. Namely, we need to obtain an accepted translate from each of the M bins \(B_1,\ldots ,B_M\) (recall \(k=k_0+\cdots +k_M\)). This presents two obvious difficulties.

  1. (1)

    For any \(1\le i\le k\) the probability of the event \(n+h_i\in \mathcal {A}\) depends on k, and tends to 0 as \(k\rightarrow \infty \). This would mean any bin of fixed size expects to receive fewer and fewer accepted translates as k gets large. In particular, we cannot hope that the hypotheses of the model problem hold for every \(M\ge 1.\)

  2. (2)

    Even in the situation where \(\mathbb {E}[\mathbf {X}_i]>1\) holds for each \(1\le i\le M\), we cannot conclude anything about \(\mathbb {P}((\mathbf {X}_1>0)\cap \cdots \cap (\mathbf {X}_M>0))\) unless we input some information about the joint distribution of the bins.

We are able to overcome these issues by modifying the sieve weights and using a second moment estimate.

2.2 Choice of sieve weights

We solve the first problem by modifying the sieve weights to put more emphasis on the earlier bins. This way, we can guarantee that \(\mathbb {P}(n+h\in \mathcal {A}|h\in B_i)=c_i\) where the constant \(c_i\) depends solely on the bin. This also means that we can guarantee \(\mathbb {E}[\mathbf {X}_i]\) is large for each i (provided \(k_i\) is large enough in terms of \(c_i\)). We will consider Maynard–Tao sieve weights with a fixed factorisation

$$\begin{aligned} w(n) = \Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i \end{array}} \prod _{i=1}^{M}\lambda _{d_{k_0+\cdots +k_{i-1}+1},\ldots ,d_{k_0+\cdots +k_{i}}}^{(i)}\Bigg )^2, \end{aligned}$$
(2.2)

where

$$\begin{aligned} \lambda _{d_{k_0+\cdots +k_{i-1}+1},\ldots ,d_{k_0+\cdots +k_i}}^{(i)} \approx \Bigg (\prod _{j=k_0+\cdots +k_{i-1}+1}^{k_0+\cdots +k_i}\mu (d_{j})\Bigg )f_i(d_{k_0+\cdots +k_{i-1}+1},\ldots ,d_{k_0+\cdots +k_i}), \end{aligned}$$
(2.3)

and \(f_i\) is a suitable smooth function supported on the simplex

$$\begin{aligned} R_{B_i,\beta _i}=\{(x_{k_0+\cdots +k_{i-1}+1},\ldots ,x_{k_0+\cdots +k_i})\in [0,1]^{k_i}: 0\le \sum _{j=k_0+\cdots +k_{i-1}+1}^{k_0+\cdots +k_i}x_j \le \beta _i\}. \end{aligned}$$
(2.4)

Here \((\beta _i)_{i\ge 1}\) is a sequence of real numbers such that \(\sum _{i=1}^{\infty }\beta _i \le 1\) (cf. the sieve weights defined in [9, Proposition 4.1]). We will take \(\beta _i=2^{-i},\) and in this instance one might say “we have allocated 50% of the sieve power to \(B_1\).”

2.3 Concentration of measure

We can deal with the second problem by showing that the random variables \(\mathbf {X}_i\) exhibit “enough” independence. This is precisely what concentration of measure arguments are designed for. For example, an application of the union bound and Chebyshev’s inequality tells us that

$$\begin{aligned} \mathbb {P}(|\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i]| < t_i \text { for all } i) \ge 1 - \sum _{i=1}^{M} \frac{\mathbb {E}[(\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i])^2]}{t_i^2} \end{aligned}$$
(2.5)

where \(t_i \ge 1\) are concentration parameters. Thus, if we can show the variances \(\mathbb {E}[(\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i])^2]\) are small, then each random variable concentrates in a (small) interval about its mean with high probability. In particular, we should be able to show that we obtain (at least) one accepted translate from each bin after the sieving process. We implement this analytically via a second moment estimate (see Proposition 5.2).
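For completeness, (2.5) is simply the union bound followed by Chebyshev’s inequality applied to each bin:

$$\begin{aligned} \mathbb {P}(\exists \, i: |\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i]| \ge t_i) \le \sum _{i=1}^{M}\mathbb {P}(|\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i]| \ge t_i) \le \sum _{i=1}^{M}\frac{\mathbb {E}[(\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i])^2]}{t_i^2}, \end{aligned}$$

and taking complements gives (2.5).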

Remark 2.1

A similar second moment estimate was considered by Banks, Freiberg and Maynard in their paper [1]. They showed a partition result (for primes) where the bins are allowed to grow with k. A key aspect of our work is that the bin sizes are fixed. Moreover, they only needed upper bounds of the correct order of magnitude for the sieve sums, whereas we require precise asymptotics.

2.4 Hooley’s \(\rho \) function

In practice, utilising a second moment estimate requires an understanding of the two-point correlations

$$\begin{aligned} \sum _{N\le n<2N} \rho _{\mathcal {A}}(n+h) \rho _{\mathcal {A}}(n+h') \end{aligned}$$
(2.6)

where \(\rho _{\mathcal {A}}\) is a non-negative function supported on \(\mathcal {A}.\) This means our methods are limited to cases in which estimates of the above type are known. In particular we cannot deal with the case of primes, as evaluating the above sum asymptotically with \(\mathbb {1}_{\mathbb {P}}\) or the von Mangoldt function \(\Lambda \) (say) is equivalent to the twin prime conjecture.

One can do much better when working with sums of two squares. We cannot evaluate (2.6) asymptotically using the indicator function, but we can if we work with the representation function \(r_2(n)\) instead. Unfortunately \(r_2(n)\) is too large for our purposes, and it proves necessary to consider a weighted version. In his work [7] on the distribution of numbers representable as the sum of two squares, Hooley considers a weighted representation function \(\rho (n)=t(n)r_2(n)\) where

$$\begin{aligned} t(n) = t_{N,\theta _1}(n) = \sum _{\begin{array}{c} a|n,\,\,a\le v \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{g_2(a)}\Bigg (1-\frac{\log {a}}{\log {v}}\Bigg ),\quad (v=N^{\theta _1}) \end{aligned}$$
(2.7)

and \(\theta _1\) is a suitably small, fixed constant (for example Hooley takes \(\theta _1=1/20\)). Here \(g_2(p)\) is the multiplicative function defined on primes by

$$\begin{aligned} g_2(p) = {\left\{ \begin{array}{ll} 2-\frac{1}{p}\,\,&{}\text {if }p\equiv 1\,\,(\text {mod}\,\,4),\\ \frac{1}{p}\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4). \end{array}\right. } \end{aligned}$$
(2.8)

The t(n) factor acts to dampen the oscillations of \(r_2(n).\) Thus \(\rho (n)\) acts as a proxy for the indicator function \(\mathbb {1}_{n=\Box +\Box },\) and moreover asymptotics for (2.6) are available for \(\rho (n).\) This is the function we will be working with.
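A direct numerical sketch of (2.7) and (2.8) may help fix ideas (naive trial division throughout; this is an illustration under our own naming, not the implementation used in the sieve):

```python
import math

def factorise(n):
    """Trial-division factorisation: returns {prime: exponent}."""
    f, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def g2(a):
    """The multiplicative function g_2 of (2.8), on square-free odd a."""
    val = 1.0
    for p in factorise(a):
        val *= (2 - 1 / p) if p % 4 == 1 else 1 / p
    return val

def t(n, N, theta1=1 / 20):
    """Hooley's damping factor t_{N,theta1}(n) from (2.7)."""
    v = N ** theta1
    total = 0.0
    for a in range(1, int(v) + 1):
        f = factorise(a)
        # need a | n, a square-free (else mu(a) = 0), p | a => p = 1 (mod 4)
        if n % a == 0 and all(e == 1 and p % 4 == 1 for p, e in f.items()):
            total += (-1) ** len(f) / g2(a) * (1 - math.log(a) / math.log(v))
    return total

# rho(n) = t(n) * r_2(n), with r_2 as in the brute-force r_d of Sect. 1
print(t(5, 10 ** 30))  # only the divisors a = 1 and a = 5 contribute
```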

2.5 Outline of the paper

In Sect. 3 we deduce the results about quantum limits contained in Theorems 1.5 and 1.6 from Theorem 1.3. In Sect. 5 we state a few preliminary lemmas that will be needed in the sieve calculations, deferring their proofs to the Appendix. In Sect. 6 we state our main sieve results, and from them we deduce Theorem 1.4. We isolate a key lemma (see Lemma 6.6) from which all of our sieve estimates follow. Sections 7 and 8 are dedicated to proving this lemma.

3 Proofs of quantum limit results

In this section we deduce the results of Theorem 1.5 and Theorem 1.6 from Theorem 1.3. Following [8], we note that \(\varphi _k\) is an eigenfunction of the Laplacian on \(\mathbb {T}^d\) with eigenvalue \(\lambda _k = 4\pi ^2 n_k\) for some \(n_k\in \mathbb {N}\) if and only if its Fourier expansion is of the form

$$\begin{aligned} \varphi _k(x) = \sum _{\begin{array}{c} \xi \in \mathbb {Z}^d \\ |\xi |^2=n_k \end{array}}a_{\xi }e^{2\pi i \langle \xi ,x \rangle }, \end{aligned}$$
(3.1)

for \(a_{\xi }\in \mathbb {C}.\) Moreover \(\left\Vert \varphi _k \right\Vert _2=1\) if and only if \(\sum _{\xi }|a_{\xi }|^2=1\). It follows that

$$\begin{aligned} |\varphi _k(x)|^2&= \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \end{array}}b_{\tau }(k)e^{2\pi i \langle \tau ,x \rangle },\nonumber \\ b_{\tau }(k)&= \sum _{\begin{array}{c} \xi -\eta =\tau \\ |\xi |^2=|\eta |^2=n_k \end{array}}a_{\xi }\overline{a_{\eta }}. \end{aligned}$$
(3.2)

Let \(\mathrm {d}v\) be a quantum limit on \(\mathbb {T}^d\) with Fourier expansion as in (1.1). By \(|\varphi _k|^2\mathrm {d}x \rightarrow \mathrm {d}v\) weak-\(*\) as \(k\rightarrow \infty \) we mean that for every \(\tau \in \mathbb {Z}^d\) we have \( c_{\tau } = \lim _{k\rightarrow \infty } b_{\tau }(k).\)

Fix \(a_1<a_2<\cdots \) and \(M_1<M_2<\cdots \) as in the statement of Theorem 1.3, and let \(b_j^{(k)},c_{j}^{(k)}\in \mathbb {Z}\) be such that

$$\begin{aligned} M_k = a_j^2 + (b_{j}^{(k)})^2 + (c_{j}^{(k)})^2,\quad (1\le j\le k). \end{aligned}$$
(3.3)

Let \(0<\epsilon <2\) and let \(F=F_{\epsilon }:\mathbb {N}\rightarrow \mathbb {N}\) be a rapidly increasing function whose rate of growth will be specified later. As we are assuming both \(a_i\rightarrow \infty \) and \(r(a_i)\rightarrow \infty \), by passing to a subsequence if necessary (and relabelling the indices of the sequence \(M_k\)), we may suppose \(a_i,r(a_i) \gg F(i)\).

We will require information about the number of integer points on the surface of the d-dimensional sphere. For this we recall the following results: writing \(n=2^km\) with m odd, and letting \(\sigma (n)=\sum _{d|n}d\) denote the sum-of-divisors function, we have the identities

$$\begin{aligned} r_3(n^2)&= 6 \prod _{p^a||m}(\sigma (p^{a})-(-1)^{\frac{p-1}{2}}\sigma (p^{a-1})), \\ r_4(n^2)&= 24 \sigma (m^2), \\ r_d(n^2)&= C_d(n^2) n^{d-2}\quad \text { for }\quad d\ge 5. \end{aligned}$$

Here \(C_d(n^2)\) is a singular series which satisfies \(C_d(n^2) \asymp _d 1.\)
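As a sanity check, the \(r_4\) identity can be verified numerically for small even n (for odd n Jacobi's formula gives \(8\sigma (m^2)\) instead; in our application the relevant integers are even, since the \(a_j\) of Theorem 1.3 are even). A sketch reusing the brute-force r_d from Sect. 1:

```python
def sigma(n):
    """Naive sum-of-divisors function."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

for n in (2, 4, 6, 10):
    m = n
    while m % 2 == 0:  # strip the even part, leaving the odd part m
        m //= 2
    assert r_d(n * n, 4) == 24 * sigma(m * m)
print("r_4(n^2) = 24*sigma(m^2) verified for n = 2, 4, 6, 10")
```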

We prove each statement similarly: in each case we consider a suitable sequence of \(L^2\)-normalised eigenfunctions with eigenvalues \(\lambda _k = 4\pi ^2 M_k\) and show that the limit has the desired property.

Proof (Theorem 1.3 \(\Rightarrow \) Theorem 1.5) Consider the sequence of \(L^{2}\)-normalised eigenfunctions on \(\mathbb {T}^3\) that arise by choosing coefficients

$$\begin{aligned} a_{\xi } = {\left\{ \begin{array}{ll} \sqrt{\frac{2^{k}}{2^k-1}}\cdot \frac{1}{2^{(j+1)/2}}\,\,\,&{}\text {if }\xi =(\pm a_j, b_j^{(k)},c_j^{(k)})\text { for some }j, \\ 0\,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Fix \(i\ge 1.\) With this choice, for any \(k\ge i\) we obtain

$$\begin{aligned} b_{(2a_i,0,0)}(k) = \sum _{\begin{array}{c} \xi -\eta = (2a_i,0,0) \\ |\xi |^2=|\eta |^2=M_k \end{array}}a_{\xi }\overline{a_{\eta }} = \frac{2^k}{2^k-1}\cdot \frac{1}{2^{i+1}}, \end{aligned}$$

because the \(a_i\) are distinct and so the only contribution to the sum comes from \(\xi =(a_i,b_i^{(k)},c_i^{(k)})\) and \(\eta =(-a_i,b_i^{(k)},c_i^{(k)})\). Hence

$$\begin{aligned} c_{(2a_i,0,0)} = \lim _{k\rightarrow \infty } b_{(2a_i,0,0)}(k) = \frac{1}{2^{i+1}} > 0, \end{aligned}$$

which proves the theorem. \(\square \)

Proof (Theorem 1.3 \(\Rightarrow \) Theorem 1.6) It suffices to prove part (i) for \(d=4,\) since we may identify eigenfunctions on \(\mathbb {T}^d\) with eigenfunctions on \(\mathbb {T}^{d+l}\) all of whose non-zero frequencies lie in the subspace \(\{(x_1,\ldots ,x_{d+l}):x_{d+1}=\cdots =x_{d+l}=0\}\subseteq \mathbb {Z}^{d+l}\).

Consider the sequence of \(L^{2}\)-normalised eigenfunctions on \(\mathbb {T}^4\) that arise by choosing

$$\begin{aligned} a_{\xi } = {\left\{ \begin{array}{ll} \sqrt{\frac{2^{k}}{2^k-1}}\cdot \frac{1}{(2^jr(a_j^2))^{1/2}}\,\,\,&{}\text {if } \xi =(X,Y,b_j^{(k)},c_j^{(k)})\text { for some }j\text { and }X^2+Y^2=a_j^2, \\ 0\,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Fix i and suppose \(k\ge i\). Given two non-zero coefficients \(a_{\xi },a_{\xi '}\), corresponding to vectors of the form \(\xi =(X,Y,b_i^{(k)},c_i^{(k)})\) and \(\xi '=(X',Y',b_i^{(k)},c_i^{(k)}),\) we see the difference vector is

$$\begin{aligned} \xi -\xi ' = (X-X',Y-Y',0,0), \end{aligned}$$

and the norm of this vector is \(\le 2a_i\) by the triangle inequality. From (3.2), it follows that if we sum \(b_{\tau }(k)\) over all \(|\tau |\le 2a_i\) then we pick up all such differences. There are \(r(a_i^2)^2\) of them, leading to

$$\begin{aligned} \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | \le 2a_i \end{array}}|b_{\tau }(k)|^{2-\epsilon } \ge \Bigg (\frac{2^k}{2^k-1}\Bigg )^{2-\epsilon } \cdot \frac{(2^{i}r(a_i^2))^{\epsilon }}{4^{i}}. \end{aligned}$$

Taking the limit as \(k\rightarrow \infty \) we conclude that

$$\begin{aligned} \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | \le 2a_i \end{array}}|c_{\tau }|^{2-\epsilon } \ge \frac{(2^ir(a_i^2))^{\epsilon }}{4^i} \ge \frac{(2^ir(a_i))^{\epsilon }}{4^i} \gg \frac{(2^iF(i))^{\epsilon }}{4^i}. \end{aligned}$$

Now we can choose \(F=F_{\epsilon }\) so that the expression on the right-hand side is unbounded as \(i\rightarrow \infty .\) It follows that \(\sum _{\tau }|c_{\tau }|^{2-\epsilon }\) does not converge, proving part (i).

For part (ii), fix \(d\ge 5.\) We proceed as in part (i), except this time because \(d\ge 5\) we have the lower bound \(r_{d-2}(a_i^2) \gg _{d} a_i^{d-4}\). We remark that to obtain this bound for \(d\in \{5,6\}\) we are using property (2) given by Theorem 1.3. For \(d\ge 7\) the bound holds without this extra assumption on our sequence.

Now consider the sequence of \(L^2\)-normalised eigenfunctions on \(\mathbb {T}^d\) that arise by choosing coefficients

$$\begin{aligned}a_{\xi } = {\left\{ \begin{array}{ll} \sqrt{\frac{2^{k}}{2^k-1}}\cdot \frac{1}{(2^jr_{d-2}(a_j^2))^{1/2}}\,\,\, &{}\hbox {if } \xi =(X_1,\ldots ,X_{d-2},b_j^{(k)},c_j^{(k)})\hbox { for some }j \hbox { and }X_1^2+\cdots +X_{d-2}^2=a_j^2, \\ 0\,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Fix i and suppose \(k\ge i\). As above we can conclude

$$\begin{aligned} \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | \le 2a_i \end{array}}|b_{\tau }(k)| \ge \Bigg (\frac{2^k}{2^k-1}\Bigg ) \cdot \frac{r_{d-2}(a_i^2)}{2^{i}}. \end{aligned}$$

Taking the limit as \(k\rightarrow \infty \) we conclude that

$$\begin{aligned} \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | \le 2a_i \end{array}}|c_{\tau }| \ge \frac{r_{d-2}(a_i^2)}{2^i} \gg _{d} \frac{a_{i}^{d-4}}{2^i}. \end{aligned}$$

It follows that

$$\begin{aligned} \frac{\sum _{\begin{array}{c} |\tau | \le 2a_i \end{array}}|c_{\tau }|}{(2a_i)^{d-4-\epsilon }} \gg _{d} \frac{(2a_{i})^{\epsilon }}{2^i} \gg _d \frac{(2F(i))^{\epsilon }}{2^i}. \end{aligned}$$

Choosing \(F=F_{\epsilon }\) appropriately and letting \(i\rightarrow \infty ,\) we see that for this choice of quantum limit we have

$$\begin{aligned} \limsup _{\rho \rightarrow \infty } \frac{\Sigma (\rho )}{\rho ^{d-4-\epsilon }}=+\infty , \end{aligned}$$

where \(\Sigma (\rho )\) is defined as in (1.2). This proves part (ii). \(\square \)

4 Notation

We will use both Landau and Vinogradov asymptotic notation throughout the paper. N will denote a large integer, and all asymptotic notation is to be understood as referring to the limit \(N\rightarrow \infty .\) Any dependence of the implied constants on other parameters A will be denoted by a subscript, for example \(X\ll _{A} Y\) or \(X=O_{A}(Y),\) unless stated otherwise. We let \(\epsilon \) denote a small positive constant, and we adopt the convention that it is allowed to change at each occurrence, even within a line.

We will denote the non-trivial Dirichlet character \((\text {mod}\,\,4)\) by \(\chi _4,\) and we may omit the subscript and simply write \(\chi \). As usual, we let \(\varphi (n)\) denote the Euler totient function, \(\tau _r(n)\) the number of ways of writing n as a product of r natural numbers, \(\mu (n)\) the Möbius function, and \(r_d(n)\) the number of representations of n as a sum of d squares. For the rest of the paper we will write r(n) when \(d=2.\) For integers a, b we let (a, b) denote their highest common factor and [a, b] their lowest common multiple.

We define the Landau–Ramanujan constant

$$\begin{aligned} A= \frac{1}{\sqrt{2}}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1-\frac{1}{p^2}\Bigg )^{-\frac{1}{2}} \end{aligned}$$
(4.1)

which will appear in many of our results.

5 Preliminaries

In this section, we formalise some of the notions discussed in Sect. 2, and state a few key estimates that will be required later in the sieve calculations.

5.1 A pigeonhole argument

The following proposition allows us to go from the set-up in Theorem 1.4 to the model problem discussed in Sect. 2.

Proposition 5.1

(Pigeonhole argument for infinite bin set-up). Fix \(\mathcal {A}\subseteq \mathbb {N}\) and a set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) of integers. Suppose that there exists a partition \(\mathcal {H}^{*}=B_1\cup B_2\cup \ldots \) where each bin \(B_i\) has fixed, finite size, such that for every \(M\ge 1\) there exist infinitely many n and M translates \(n+h_{i,M}\in \mathcal {A}\) with \(h_{i,M}\in B_i\) for \(1\le i \le M.\) Then there exist increasing sequences \(a_j\) and \(n_k\) such that for every \(k\ge 1\) we have \(n_k+h_{a_j} \in \mathcal {A}\) for \(1\le j\le k,\) and moreover \(h_{a_j}\in B_j\) for all j.

Proof

With the above set-up, obtain translates \(n+h_{i,M}\in \mathcal {A}\) with \(h_{i,M}\in B_i\) for \(1\le i \le M\) for each \(M\ge 1.\) Record this process in the following infinite table:

$$\begin{aligned} \begin{array}{ccccccc} B_1 &{} B_2 &{} B_3 &{} \cdots &{} B_M &{} B_{M+1} &{} \cdots \\ \hline h_{1,1} &{} &{} &{} &{} &{} &{} \\ h_{1,2} &{} h_{2,2} &{} &{} &{} &{} &{} \\ h_{1,3} &{} h_{2,3} &{} h_{3,3} &{} &{} &{} &{} \\ \vdots &{} \vdots &{} \vdots &{} &{} &{} &{} \\ h_{1,M} &{} h_{2,M} &{} h_{3,M} &{} \cdots &{} h_{M,M} &{} &{} \\ h_{1,M+1} &{} h_{2,M+1} &{} h_{3,M+1} &{} \cdots &{} h_{M,M+1} &{} h_{M+1,M+1} &{} \\ \vdots &{} \vdots &{} \vdots &{} &{} \vdots &{} \vdots &{} \ddots \end{array} \end{aligned}$$

Look at the first column. By the pigeonhole principle, since \(B_1\) is finite, there must exist an element \(h_{a_1}\in B_1\) which appears infinitely many times. Choose the smallest such \(h_{a_1},\) and choose any \(n_1\in \mathbb {N}\) for which \(n_1+h_{a_1}\in \mathcal {A}\). Now erase all the rows that do not start with \(h_{a_1},\) and look at the remaining (infinite) table. Again, since \(B_2\) is finite, some element \(h_{a_2}\in B_2\) must occur infinitely many times in the second column. Choose the smallest such \(h_{a_2},\) and choose any \(n_2>n_1\) such that \(n_2+h_{a_2}\in \mathcal {A}\) (which we can do because there are infinitely many such \(n_2\)). By construction this \(n_2\) will be such that \(n_2+h_{a_1}\in \mathcal {A}.\) Now erase all rows that do not start with \(h_{a_1},h_{a_2},\) and repeat this process for \(B_3,\) and so on. We end up with increasing sequences \(a_j\) and \(n_k\) which by construction satisfy the required conditions. \(\square \)
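The extraction in this proof is effectively a diagonal argument. A toy finite Python sketch (a stand-in only: “appears infinitely often” is replaced by “appears most often among the surviving rows”, and all names are ours):

```python
from collections import Counter

def diagonal_extract(table, num_bins):
    """table[r] is the finite row [h_{1,M}, ..., h_{M,M}] with M = r + 1.
    Column by column, keep the most frequent entry and erase the rows
    that disagree, mimicking the pigeonhole selection in the proof."""
    rows, choice = table, []
    for j in range(num_bins):
        col = [r[j] for r in rows if len(r) > j]
        h = Counter(col).most_common(1)[0][0]
        choice.append(h)
        rows = [r for r in rows if len(r) > j and r[j] == h]
    return choice

table = [[1], [1, 4], [2, 4, 7], [1, 4, 7], [1, 4, 8], [1, 4, 7]]
print(diagonal_extract(table, 3))  # [1, 4, 7]
```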

5.2 A second moment estimate

As discussed in Sect. 2, our work will require input about the joint distribution of the bins, which we will achieve via concentration of measure arguments. The following second moment estimate will suffice for our purposes.

Proposition 5.2

(Second moment estimate). Fix \(\mathcal {A}\subset \mathbb {N}\) and a set \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) of integers. Suppose we have a partition \(\mathcal {H}=B_1\cup \ldots \cup B_M.\) Let \(\mu _i, t_i\ge 1\) be real numbers for \(1\le i\le M.\) Let \(\rho _{\mathcal {A}}\) be a non-negative function supported on \(\mathcal {A}\) and let w(n) be non-negative weights for each integer n. If

$$\begin{aligned} \sum _{\begin{array}{c} N\le n<2N \end{array}}\Bigg [\min _{j=1,\ldots ,M}\frac{\mu _j^2}{t_j^2}-\sum _{i=1}^{M}\Bigg (\frac{\sum _{h\in B_i}\rho _{\mathcal {A}}(n+h)-\mu _i}{t_i}\Bigg )^2\Bigg ]w(n)>0, \end{aligned}$$
(5.1)

then there exists an \(n\in [N,2N)\) and elements \(h_{a_i}\in B_i\) such that \(n+h_{a_i}\in \mathcal {A}\) for \(1\le i\le M.\)

Proof

By positivity we deduce the existence of an \(n\in [N,2N)\) such that

$$\begin{aligned} \sum _{i=1}^{M}\Bigg (\frac{\sum _{h\in B_i}\rho _{\mathcal {A}}(n+h)-\mu _i}{t_i}\Bigg )^2<\min _{j=1,\ldots ,M}\frac{\mu _j^2}{t_j^2}. \end{aligned}$$

If \(n+h\notin \mathcal {A}\) for all \(h\in B_i,\) then by assumption on the support of \(\rho _{\mathcal {A}}\) the left hand side of the above expression is \(\ge \mu _i^2/t_i^2,\) a contradiction. \(\square \)

Thus, if the second moment estimate (5.1) holds for all \(M\ge 1\) and sufficiently large N, then we are in a situation where the hypotheses of Proposition 5.1 are satisfied.

5.3 Estimates in arithmetic progressions

We require an understanding of how \(\rho (n)\) and \(\rho (n)\rho (n+h)\) behave in arithmetic progressions for our sieve calculations. Essentially this reduces to understanding the corresponding sums for r(n) and \(r(n)r(n+h),\) where the estimates we need are known with power-saving error terms. This means that the error terms in the sieve calculations can be bounded trivially (cf. the case of primes [4, 9], where one has to use equidistribution results such as the Bombieri–Vinogradov theorem to bound the error terms that arise).

We have the following lemmas. We note that the functions \(g_1,\ldots ,g_7\) defined in this section will be used frequently throughout the rest of the paper.

Lemma 5.3

Suppose \((a,q)=(d,q)=1\) where d, q are square-free and odd, of size \(\ll N^{O(1)}.\) Then we have

$$\begin{aligned} \sum _{\begin{array}{c} n\le N\\ n\equiv a\,\,(\text {mod}\,\,q) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ d|n \end{array}}r(n) = \frac{g_1(q)g_2(d)}{2qd}\pi N+R_1(N;d,q), \end{aligned}$$

where \(g_2\) is defined as in (2.8), \(g_1\) is the multiplicative function defined on primes by \(g_1(p)=1-\chi (p)/p,\) and

$$\begin{aligned} R_1(N;d,q)\ll _{\epsilon } ((qd)^{\frac{1}{2}}+N^{\frac{1}{3}})d^{\frac{1}{2}}N^{\epsilon }. \end{aligned}$$

Lemma 5.4

Suppose that \((a,q)=(a+h,q)=(d_1,q)=(d_2,q)=(d_1,d_2)=1\) and 4|h, where \(d_1,d_2,q\) are square-free and odd, of size \(\ll N^{O(1)}.\) Moreover suppose \(h>0\) is fixed such that \(p|h\Rightarrow p|2q\). Then we have

$$\begin{aligned} \sum _{\begin{array}{c} n\le N\\ n\equiv a\,\,(\text {mod}\,\,q) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ d_1|n \\ d_2|n+h \end{array}}r(n)r(n+h) = \frac{g_1(q)^2\Gamma (d_1,d_2,q)}{q}\pi ^2N+R_2(N;d_1,d_2,q), \end{aligned}$$

where

$$\begin{aligned} \Gamma (d_1,d_2,q) = \frac{g_2(d_1)g_2(d_2)}{d_1d_2}\sum _{\begin{array}{c} (r,2q)=1 \end{array}}\frac{\mu (r)(d_1,r)(d_2,r)\chi [(d_1^2,r)]\chi [(d_2^2,r)]}{r^2}, \end{aligned}$$

and

$$\begin{aligned} R_2(N;d_1,d_2,q)\ll _{\epsilon } q^{\frac{1}{2}}d_1d_2N^{\frac{3}{4}+\epsilon }+d_1^{\frac{1}{2}}d_2^{\frac{1}{2}}N^{\frac{5}{6}+\epsilon }. \end{aligned}$$

Lemma 5.5

Let \((a,q)=(d,q)=1\) where d, q are square-free and odd, of size \(\ll N^{O(1)}.\) Then we have

$$\begin{aligned} \sum _{\begin{array}{c} n\le N \\ n\equiv a\,\,(\text {mod}\,\,q) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ d|n \end{array}}r^2(n)&= \frac{g_3(q)g_4(d)}{qd}\Bigg (\log {N}+A_2+2\sum _{p|q}g_5(p)-2\sum _{p|d}g_6(p)\Bigg )N\\&\quad +O_{\epsilon }(qN^{\frac{3}{4}+\epsilon }), \end{aligned}$$

where

$$\begin{aligned} A_2=2\gamma -1+2\frac{L'(1,\chi _4)}{L(1,\chi _4)}-2\frac{\zeta '(2)}{\zeta (2)}+\frac{4}{3}\log {2}. \end{aligned}$$

Here \(g_3,g_4\) are the multiplicative functions defined on primes by

$$\begin{aligned} g_3(p)&= {\left\{ \begin{array}{ll} \frac{(p-1)^2}{p(p+1)}\,\,&{}\text {if }p\equiv 1\,\,(\text {mod}\,\,4), \\ g_1(p)\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4), \end{array}\right. } \,\,\,\,\,\,\,\,\,\,\,\,\, g_4(p) = {\left\{ \begin{array}{ll} \frac{4p^2-3p+1}{p(p+1)}\,\,&{}\text {if }p\equiv 1\,\,(\text {mod}\,\,4), \\ g_2(p)\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4), \end{array}\right. } \end{aligned}$$

and \(g_5(p),g_6(p)\) are defined by

$$\begin{aligned} g_5(p)&= {\left\{ \begin{array}{ll} \frac{(2p+1)\log {p}}{p^2-1}\,\,&{}\text {if }p\equiv 1\,\,(\text {mod}\,\,4), \\ \frac{\log {p}}{p^2-1}\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4), \end{array}\right. } \\ g_6(p)&= {\left\{ \begin{array}{ll} \frac{(p-1)^2(2p+1)\log {p}}{(p+1)(4p^2-3p+1)}\,\,&{}\text {if }p\equiv 1\,\,(\text {mod}\,\,4), \\ \log {p}\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4). \end{array}\right. } \end{aligned}$$

We will prove each of these results in Appendix A. Lemma 5.3 follows from two known results. Lemma 5.4 follows by adapting the method used by Plaksin in [10], where a similar sum is considered. Finally, Lemma 5.5 can be shown using a standard Perron's formula argument, together with a fourth moment estimate for Dirichlet L-functions.

When finding the corresponding estimates for \(\rho (n)\) the following sums naturally appear (for the definitions of \(W,W_1\) and \(D_0\) see (6.1) and (6.2) in Sect. 6):

$$\begin{aligned} X_{N,W}&=\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{a}\log {\frac{v}{a}}, \end{aligned}$$
(5.2)
$$\begin{aligned} Y_{N,W}&=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ (a,b)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_7(a)g_7(b)}\log {\frac{v}{a}}\log {\frac{v}{b}}, \end{aligned}$$
(5.3)
$$\begin{aligned} Z_{N,W}^{(1)}&=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}, \end{aligned}$$
(5.4)
$$\begin{aligned} Z_{N,W}^{(2)}&=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b) g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}\sum _{p|[a,b]}g_6(p). \end{aligned}$$
(5.5)

Here \(g_7\) is the multiplicative function defined on primes by \(g_7(p)=p+1.\) The following lemma evaluates the auxiliary sums above.

Lemma 5.6

(Auxiliary estimates for \(\rho (n)\)). We have

$$\begin{aligned} X_{N,W}&= (1+o(1))\frac{8A\log ^{\frac{1}{2}}{v}}{\pi g_1(W_1)}, \\ Y_{N,W}&= (1+o(1))\frac{64A^2\log {v}}{\pi ^2g_1(W_1)^2}, \\ Z_{N,W}^{(1)}&=(1+o(1))\frac{32A^3\log ^{\frac{1}{2}}{v}}{\pi ^2g_1(W_1)^3}, \\ Z_{N,W}^{(2)}&=-(1+o(1))\frac{16A^3\log ^{\frac{3}{2}}{v}}{\pi ^2g_1(W_1)^3}, \end{aligned}$$

where A is defined as in (4.1). In each case one may take the o(1) term to be \(O(D_0^{-1}).\)

We remark that the estimate for \(X_{N,1}\) appears in Hooley’s work, after correcting a misprint (cf. [7, Lemma 5], and note his slightly different definition of A). We prove Lemma 5.6 in Appendix B. Each sum can be evaluated by the Selberg–Delange method.

6 The sieve set-up

We now state our sieve results and use them to deduce Theorem 1.4. For the rest of the paper k is fixed, \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) is a fixed admissible set such that \(4|h_i\) for each i,  and N is sufficiently large in terms of any fixed quantity. We allow any of the constants hidden in the Landau notation to depend on k,  without explicitly specifying so.

We will employ a 4W-trick in our sieve calculations. Let

$$\begin{aligned} W=\prod _{2<p\le D_0} p, \end{aligned}$$
(6.1)

where \(D_0=(\log \log {N})^3,\) so that \(W\ll (\log {N})^{2(\log \log {N})^2} \ll _{\epsilon } N^{\epsilon }\) for any fixed \(\epsilon >0\) by the prime number theorem. It will prove useful to define

$$\begin{aligned} W_1=\prod _{\begin{array}{c} p\le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}p,\,\,\,\,\,\,\quad W_3=\prod _{\begin{array}{c} p\le D_0 \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}p, \end{aligned}$$
(6.2)

so that \(W=W_1W_3.\)

By admissibility of \(\mathcal {H}\) there exists a fixed residue class \(v_0\,\,(\text {mod}\,\,W)\) such that \((v_0+h_i,W)=1\) for each i. Fix \(1\le m,l\le k\) with \(m\ne l.\) We consider four types of sums:

$$\begin{aligned} S_{1}&= \sum _{\begin{array}{c} N\le n< 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i \,\,\forall i \end{array}}\lambda _{d_1,\ldots ,d_k}\Bigg )^{2}, \end{aligned}$$
(6.3)
$$\begin{aligned} S_{2}^{(m)}&= \sum _{\begin{array}{c} N\le n< 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \rho (n+h_m) \Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i \,\,\forall i \end{array}}\lambda _{d_1,\ldots ,d_k}\Bigg )^{2}, \end{aligned}$$
(6.4)
$$\begin{aligned} S_{3}^{(m,l)}&= \sum _{\begin{array}{c} N\le n< 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \rho (n+h_m)\rho (n+h_l) \Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i \,\,\forall i \end{array}}\lambda _{d_1,\ldots ,d_k}\Bigg )^{2}, \end{aligned}$$
(6.5)
$$\begin{aligned} S_4^{(m)}&= \sum _{\begin{array}{c} N\le n< 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \rho ^{2}(n+h_m) \Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i \,\,\forall i \end{array}}\lambda _{d_1,\ldots ,d_k}\Bigg )^{2}. \end{aligned}$$
(6.6)

Because k is fixed, we may assume that \(D_0\) is sufficiently large so that

$$\begin{aligned} p|h_i-h_j \Rightarrow p|2W. \end{aligned}$$
(6.7)

Remark 6.1

For the second moment estimate, it proves important to control the residue classes of the translates \(n+h\,\,(\text {mod}\,\,4),\) hence the condition \(n\equiv 1\,\,(\text {mod}\,\,4)\) in our sieve sums and the assumption 4|h for our admissible set. This is because of the inherent bias that numbers representable as a sum of two squares have modulo 4.

Our first proposition evaluates these sums for general half-dimensional Maynard–Tao sieve weights. Fix \(0<\theta _1<1/18\) in the definition of \(\rho (n)\) (see (2.7)). We also define the normalisation constant (the second equality below uses \(\Gamma (1/2)=\sqrt{\pi }\) and \(L(1,\chi _4)=\pi /4\))

$$\begin{aligned} B= \frac{A}{\Gamma (1/2)\sqrt{L(1,\chi _4)}}\cdot \frac{\varphi (W_3)(\log {R})^{\frac{1}{2}}}{W_3} = \frac{2A}{\pi }\cdot \frac{\varphi (W_3)(\log {R})^{\frac{1}{2}}}{W_3}. \end{aligned}$$
(6.8)

Proposition 6.2

(Half-dimensional Maynard–Tao sieve estimates). Let \(R=N^{\theta _2/2}\) for some small fixed positive constant \(\theta _2\) such that \(0<\theta _1+\theta _2<1/18\). Let \(\lambda _{d_1,\ldots ,d_k}\) be defined in terms of a fixed smooth function F by

$$\begin{aligned} \lambda _{d_1,\ldots ,d_k} = \Bigg (\prod _{i=1}^{k}\mu (d_i)d_i\Bigg )\sum _{\begin{array}{c} r_1,\ldots ,r_k \\ d_i|r_i \forall i \\ (r_i,W)=1\forall i \\ p|r_i \Rightarrow p\equiv 3\,\,(\text {mod}\,\,4)\forall i \end{array}} \frac{\mu (\prod _{i=1}^{k}r_i)^{2}}{\prod _{i=1}^{k}\varphi (r_i)}F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg ), \end{aligned}$$

whenever \(\prod _{i=1}^{k}d_i\le R\) is squarefree, \((\prod _{i=1}^{k}d_i,W)=1\) and \(p|\prod _{i=1}^{k}d_i\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4),\) and let \(\lambda _{d_1,\ldots ,d_k}=0\) otherwise. Moreover let F be supported on \(R_{k}=\{(x_1,\ldots ,x_k)\in [0,1]^{k}: \sum _{i=1}^{k}x_i \le 1\}.\) Then we have

$$\begin{aligned} S_{1}&=(1+o(1)) \frac{B^{k}N}{4W}L_{k}(F), \\ S_2^{(m)}&= (1+o(1)) \frac{4\sqrt{\frac{\log {R}}{\log {v}}}B^kN}{\pi W}L_{k;m}(F), \\ S_{3}^{(m,l)}&= (1+o(1)) \frac{64(\frac{\log {R}}{\log {v}})B^{k}N}{\pi ^2 W}L_{k;m,l}(F), \\ S_{4}^{(m)}&= (1+o(1))\frac{2\sqrt{\frac{\log {R}}{\log {v}}}(\frac{\log {N}}{\log {v}}+1)B^{k}N}{\pi W}L_{k;m}(F) \end{aligned}$$

provided \(L_k(F),L_{k;m}(F)\) and \(L_{k;m,l}(F)\) are non-zero, where

$$\begin{aligned} L_{k}(F)&= \int _0^1\ldots \int _0^1 \Bigg [F(x_1,\ldots ,x_k)\Bigg ]^2 \prod _{i=1}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}, \\ L_{k;m}(F)&= \int _0^1\ldots \int _0^1 \Bigg [\int _0^1F(x_1,\ldots ,x_k)\frac{\mathrm {d}x_m}{\sqrt{x_m}}\Bigg ]^2 \prod _{\begin{array}{c} i=1 \\ i\ne m \end{array}}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}, \\ L_{k;m,l}(F)&= \int _0^1\ldots \int _0^1 \Bigg [\int _0^1\Bigg (\int _0^1F(x_1,\ldots ,x_k)\frac{\mathrm {d}x_m}{\sqrt{x_m}}\Bigg )\frac{\mathrm {d}x_l}{\sqrt{x_l}}\Bigg ]^2 \prod _{\begin{array}{c} i=1 \\ i\ne m,l \end{array}}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}. \end{aligned}$$

From this, one can deduce the corresponding results for the modification of the Maynard–Tao sieve described in Sect. 2.

Proposition 6.3

(Modified Maynard–Tao sieve estimates). Suppose in addition to the hypotheses of Proposition 6.2 we have a partition \(\mathcal {H}=\{h_1,\ldots ,h_k\}=B_1\cup \ldots \cup B_M\) into bins \(B_i\) of fixed and finite size \(k_i.\) Write \(B_i=\{h_{k_0+\ldots +k_{i-1}+1},\ldots ,h_{k_0+\ldots +k_i}\}\) with the convention that \(k_0=0\). Suppose further we have a corresponding factorisation

$$\begin{aligned} F(x_1,\ldots ,x_k)=\prod _{i=1}^{M}F_i(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i}), \end{aligned}$$

where each \(F_i\) is smooth and supported on the simplex

$$\begin{aligned} R_{B_i,\beta _i}=\{(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i})\in [0,1]^{k_i}:0\le \sum _{j=k_0+\ldots +k_{i-1}+1}^{k_0+\ldots +k_i}x_j\le \beta _i\}. \end{aligned}$$

Here \((\beta _i)_{i=1}^{\infty }\) is a sequence of real numbers such that \(\sum _{i=1}^{\infty }\beta _i\le 1.\) Then for \(h_m,h_l\in B_j\) we have

$$\begin{aligned} S_{1}&=(1+o(1)) \frac{B^{k}N}{4W}\Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg ), \\ S_2^{(m)}&= (1+o(1)) \frac{4\sqrt{\frac{\log {R}}{\log {v}}}B^kN}{\pi W}\Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg )\frac{L_{|B_j|;m}(F_j)}{L_{|B_j|}(F_j)}, \\ S_{3}^{(m,l)}&= (1+o(1)) \frac{64(\frac{\log {R}}{\log {v}})B^{k}N}{\pi ^2W}\Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg )\frac{L_{|B_j|;m,l}(F_j)}{L_{|B_j|}(F_j)}, \\ S_{4}^{(m)}&= (1+o(1))\frac{2\sqrt{\frac{\log {R}}{\log {v}}}(\frac{\log {N}}{\log {v}}+1)B^{k}N}{\pi W}\Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg )\frac{L_{|B_j|;m}(F_j)}{L_{|B_j|}(F_j)}. \end{aligned}$$

Proof

The hypotheses imply \(F=\prod _{i=1}^{M}F_i\) is also smooth and supported on \(R_{k},\) and hence the results of Proposition 6.2 apply. It suffices to show that the functionals factorise in the forms stated. Because our set-up ensures that \(\text {supp}(F)= \text {supp}(F_1)\times \cdots \times \text {supp}(F_M),\) one can easily check that if \(h_m,h_l\in B_j\) then

$$\begin{aligned} L_k(F)&= \prod _{i=1}^{M} L_{|B_i|}(F_i),\\ L_{k;m}(F)&= \Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg )\frac{L_{|B_j|;m}(F_j)}{L_{|B_j|}(F_j)}, \\ L_{k;m,l}(F)&= \Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg )\frac{L_{|B_j|;m,l}(F_j)}{L_{|B_j|}(F_j)}. \end{aligned}$$

\(\square \)

With the following lemma we will be in a position to prove Theorem 1.4.

Lemma 6.4

(Evaluation of sieve functionals). Let \(F(t_1,\ldots ,t_k)=\prod _{i=1}^{k}g(kt_i)\) where

$$\begin{aligned} g(t)= {\left\{ \begin{array}{ll} \frac{1}{1+\frac{t}{\beta }},\,\,\,&{}\text {if }t\le \beta , \\ 0,\,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Then for any \(m\ne l\) we have

$$\begin{aligned} \frac{L_{k;m}(F)}{L_k(F)}&= \frac{\pi ^2}{\pi +2}\cdot \sqrt{\frac{\beta }{k}}, \\ \frac{L_{k;m,l}(F)}{L_k(F)}&= \Bigg (\frac{\pi ^2}{\pi +2}\Bigg )^2\cdot \frac{\beta }{k}. \end{aligned}$$

Proof

The definition implies F is supported on the cube \([0,\frac{\beta }{k}]^k\subseteq R_{k,\beta }.\) In this case the functionals factorise completely and the lemma follows from the evaluations

$$\begin{aligned} \int _0^{\frac{\beta }{k}}\frac{\mathrm {d}t}{\sqrt{t}(1+\frac{kt}{\beta })^2}&=\frac{\pi +2}{4}\cdot \sqrt{\frac{\beta }{k}}, \\ \int _0^{\frac{\beta }{k}}\frac{\mathrm {d}t}{\sqrt{t}(1+\frac{kt}{\beta })}&=\frac{\pi }{2}\cdot \sqrt{\frac{\beta }{k}}. \end{aligned}$$
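Indeed, the substitution \(t=(\beta /k)u^2\) gives \(\mathrm {d}t/\sqrt{t}=2\sqrt{\beta /k}\,\mathrm {d}u\) and turns the two integrals into

$$\begin{aligned} 2\sqrt{\frac{\beta }{k}}\int _0^{1}\frac{\mathrm {d}u}{(1+u^2)^2}=2\sqrt{\frac{\beta }{k}}\Bigg [\frac{u}{2(1+u^2)}+\frac{1}{2}\arctan {u}\Bigg ]_0^1=\frac{\pi +2}{4}\cdot \sqrt{\frac{\beta }{k}} \end{aligned}$$

and

$$\begin{aligned} 2\sqrt{\frac{\beta }{k}}\int _0^{1}\frac{\mathrm {d}u}{1+u^2}=2\sqrt{\frac{\beta }{k}}\arctan {1}=\frac{\pi }{2}\cdot \sqrt{\frac{\beta }{k}}. \end{aligned}$$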

\(\square \)

Remark 6.5

We have restricted the support of our functions to the cube \([0,\frac{\beta }{k}]^k\subseteq R_{k,\beta }\) so that the integrals can be evaluated exactly. This essentially means we are using weights of similar strength to the original GPY weights (cf. [4]). For the half-dimensional case one can show that for large k these weights are essentially optimal. (In particular, following a similar optimisation process as in [9, Section 7] one arrives at the same results as above.)

We are now in a position to prove Theorem 1.4.

Proof of Theorem 1.4

Let \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) be a fixed admissible set with \(4|h_i\) for each i. Fix real numbers \(\theta _1,\theta _2\) subject to \(0<\theta _1+\theta _2 < 1/18\) and define the constant

$$\begin{aligned} \Delta = \Delta (\theta _1,\theta _2) = \frac{\sqrt{2}(\pi +2)}{32\pi }\frac{1+\theta _1}{\sqrt{\theta _1\theta _2}}. \end{aligned}$$

With notation as above, consider a partition \(\mathcal {H}^{*}=B_1\cup B_2\cup \ldots \) where we choose the bins \(B_i\) of size \(k_i > 2^{7i}\) for every \(i\ge 1,\) and additionally require \(k_1>2\Delta ^3\). By Proposition 5.2 we will be done if we can show, for every \(M\ge 1,\) the inequality

$$\begin{aligned}&\sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg [\min _{j=1,\ldots ,M}\frac{\mu _j^2}{t_j^2}-\sum _{i=1}^{M}\Bigg (\frac{\sum _{h\in B_i}\rho (n+h)-\mu _i}{t_i}\Bigg )^2\Bigg ]\Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ d_i|n+h_i\,\,\forall i \end{array}}\lambda _{d_1,\ldots ,d_k}\Bigg )^2 >0\nonumber \\ \end{aligned}$$
(6.9)

for all sufficiently large N and some choice of real numbers \(\mu _i,t_i \ge 1.\) Choose weights \(\lambda _{d_1,\ldots ,d_k}\) as in Proposition 6.2, and let \(F(x_1,\ldots ,x_k)=\prod _{i=1}^{M}F_i(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i})\) where each \(F_i\) is supported on \(R_{B_i,2^{-i}}.\) Let \(F_i(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i}) = \prod _{j=k_0+\ldots +k_{i-1}+1}^{k_0+\ldots +k_i}g_i(k_ix_j),\) where \(g_i\) is the function g of Lemma 6.4 with \(\beta =2^{-i}\):

$$\begin{aligned} g_i(t)= {\left\{ \begin{array}{ll} \frac{1}{1+2^it},\,\,\,&{}\text {if }t\le 2^{-i}, \\ 0,\,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Expanding out (6.9), we have to evaluate the expression

$$\begin{aligned} \Bigg (\min _{j=1,\ldots ,M}\frac{\mu _j^2}{t_j^2}\Bigg ) S_1 - \sum _{i=1}^{M} \frac{1}{t_i^2}\Bigg [\sum _{\begin{array}{c} h,h'\in B_i \\ h\ne h' \end{array}}S_3^{(h,h')} + \sum _{h\in B_i} S_4^{(h)} - 2\mu _i\sum _{h\in B_i}S_2^{(h)}+\mu _i^2S_1\Bigg ] \end{aligned}$$

(where by abuse of notation we have written \(S_2^{(h)}\) for \(S_2^{(i)}\) where \(h=h_i,\) say). A convenient choice of \(\mu _i, t_i\) is

$$\begin{aligned} \mu _i&=c\Bigg (\frac{k_i}{2^i}\Bigg )^{\frac{1}{2}},\,\,\,\,\,t_i =c\Bigg (\frac{k_i}{2^i}\Bigg )^{\frac{1}{3}}, \end{aligned}$$

where

$$\begin{aligned} c = c(\theta _1,\theta _2) = \frac{16\sqrt{\theta _2/(2\theta _1)}}{\pi }\Bigg (\frac{\pi ^2}{\pi +2}\Bigg ). \end{aligned}$$

Evaluating these sums using Proposition 6.3 and Lemma 6.4 we see that this is asymptotically

$$\begin{aligned}&\frac{B^k N}{4W}\Bigg (\prod _{i=1}^{M}L_{|B_i|}(F_i)\Bigg ) \Bigg \{ \Bigg (\frac{k_1}{2}\Bigg )^{\frac{1}{3}} - \sum _{i=1}^{M}\Bigg [\Delta \Bigg (\frac{2^i}{k_i}\Bigg )^{\frac{1}{6}} -\frac{1}{2^i} \Bigg (\frac{2^i}{k_i}\Bigg )^{\frac{2}{3}}\Bigg ] \Bigg \}. \end{aligned}$$

Hence (6.9) will be satisfied for all sufficiently large N provided

$$\begin{aligned} \Delta \sum _{i=1}^{M}\Bigg (\frac{2^i}{k_i}\Bigg )^{\frac{1}{6}} < \Bigg (\frac{k_1}{2}\Bigg )^{\frac{1}{3}}. \end{aligned}$$
(6.10)

But now our choice of bins ensures that

$$\begin{aligned} \Delta \sum _{i=1}^{M}\Bigg (\frac{2^i}{k_i}\Bigg )^{\frac{1}{6}} \le \Delta \sum _{i=1}^{M}\frac{1}{2^i} \le \Delta < \Bigg (\frac{k_1}{2}\Bigg )^{\frac{1}{3}}, \end{aligned}$$

and so (6.10) is satisfied for all \(M \ge 1.\) \(\square \)
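As a concrete sanity check of (6.10), the following numerical sketch verifies the inequality for one admissible choice of parameters (the values \(\theta _1=\theta _2=1/40\) and the explicit bin sizes are our own sample choices):

```python
import math

# Sample parameters with theta1 + theta2 < 1/18; bins of size
# k_i > 2^(7i) for all i >= 1, with additionally k_1 > 2*Delta^3.
theta1 = theta2 = 1 / 40
Delta = (math.sqrt(2) * (math.pi + 2) / (32 * math.pi)
         * (1 + theta1) / math.sqrt(theta1 * theta2))
k = {i: 2 ** (7 * i) + 1 for i in range(1, 40)}
k[1] = max(k[1], int(2 * Delta ** 3) + 1)
lhs = Delta * sum((2 ** i / k[i]) ** (1 / 6) for i in k)  # grows with M, but bounded
rhs = (k[1] / 2) ** (1 / 3)
assert lhs < rhs
print(f"{lhs:.3f} < {rhs:.3f}: (6.10) holds uniformly in M")
```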

It remains to prove Proposition 6.2. Each sum can be treated similarly. The following lemma handles all of them at once. First, given a function F satisfying the hypotheses of Proposition 6.2, we define

$$\begin{aligned} F_{\text {max}} = \sup _{(t_1,\ldots ,t_k)\in [0,1]^k}\Bigg (|F(t_1,\ldots ,t_k)|+\sum _{i=1}^{k}|\frac{\partial F}{\partial t_i}(t_1,\ldots ,t_k)|\Bigg ). \end{aligned}$$
(6.11)

The lemma can now be stated as follows.

Lemma 6.6

(General sieve lemma). Let \(J\subseteq \{1,\ldots ,k\}\) (possibly empty) and \(p_1,p_2\in \mathbb {P}\cup \{1\}\) be fixed. Write \(I=\{1,\ldots ,k\}\backslash J.\) Define the sieve sum \(S_{J,p_1,p_2,m}=S_{J,p_1,p_2,m,f,g}\) by

$$\begin{aligned} S_{J,p_1,p_2,m}=\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \\ p_1|d_m,p_2|e_m \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i\in I}f([d_i,e_i])\prod _{j\in J}g([d_j,e_j]), \end{aligned}$$

with weights \(\lambda _{d_1,\ldots ,d_k}\) defined as in Proposition 6.2. If \(J=\emptyset \) we define \(f(p)=1/p\) (and there is no dependence on g in the sum). Otherwise, f and g are non-zero multiplicative functions defined on primes by

$$\begin{aligned} f(p) = \frac{1}{p}+O\Bigg (\frac{1}{p^2}\Bigg ),\,\,\,g(p) = \frac{1}{p^2}+O\Bigg (\frac{1}{p^3}\Bigg ), \end{aligned}$$

and moreover we assume that \(f(p)\ne 1/p.\) We write \(S_{J}\) for \(S_{J,1,1,m}\). Then for \(|J|\in \{0,1,2\}\) we have the following:

  1. (i)

    If \(m\in J\) then

    $$\begin{aligned} S_{J,p_1,p_2,m} \ll \frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{(p_1p_2/(p_1,p_2))^2} . \end{aligned}$$
  2. (ii)

    If \(m\notin J\) then

    $$\begin{aligned} S_{J,p_1,p_2,m} \ll \frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{p_1p_2/(p_1,p_2)} . \end{aligned}$$
  3. (iii)

    We have

    $$\begin{aligned} S_{J} = (1+o(1))B^{k+|J|}L_J(F), \end{aligned}$$

    where the integral operators are defined by Proposition 6.2 above, and we write \(L_J(F)\) as shorthand for \(L_{k;j\in J}(F).\)

We now show how this implies Proposition 6.2.

Proof (Lemma 6.6 \(\Rightarrow \) Proposition 6.2) We consider each sum in turn. First we note that, using the definition of \(\lambda _{d_1,\ldots ,d_k},\) the exact same calculation as in [9, p. 394] gives

$$\begin{aligned} \sup _{d_1,\ldots ,d_k}|\lambda _{d_1,\ldots ,d_k}| \ll F_{\text {max}} \sum _{\begin{array}{c} u\le R \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)\tau _k(u)}{\varphi (u)} \ll F_{\text {max}} (\log {R})^{\frac{k}{2}}, \end{aligned}$$

and so we have a trivial bound

$$\begin{aligned} \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}|\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}|&\ll \Bigg (\sup _{d_1,\ldots ,d_k}|\lambda _{d_1,\ldots ,d_k}|\Bigg )^2 \Bigg (\sum _{\begin{array}{c} d\le R \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \tau _k(d)\Bigg )^2 \nonumber \\&\ll F_{\text {max}}^2R^2(\log {R})^{2k} \ll _{\epsilon } F_{\text {max}}^2N^{\theta _2+\epsilon }. \end{aligned}$$
(6.12)

As mentioned in Sect. 5.3, because we can obtain power-saving error terms in the formulae stated there, this trivial bound will suffice for our purposes.

  1. (i)

    Rewrite \(S_1\) in the form

    $$\begin{aligned} S_1 = \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i]) \end{array}}1. \end{aligned}$$

    We may assume \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) are pairwise coprime, as otherwise the inner sum is empty. In this case, by the Chinese Remainder Theorem, these congruences are equivalent to a single congruence (mod q) where \(q=4W\prod _{i=1}^{k}[d_i,e_i]\). The inner sum evaluates to

    $$\begin{aligned} \frac{N}{q}+O(1). \end{aligned}$$

    The error term contributes \(O_{\epsilon }(F_{\text {max}}^2N^{\theta _2+\epsilon })\) which is negligible. The main term is

    $$\begin{aligned} \frac{N}{4W} \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i=1}^{k}\frac{1}{[d_i,e_i]}. \end{aligned}$$

    This is of the form \(S_{J}\) where \(|J|=0\). Evaluating it according to Lemma 6.6 we obtain

    $$\begin{aligned} S_1 = (1+o(1))\frac{B^kN}{4W}L_k(F). \end{aligned}$$
  2. (ii)

    Rewrite \(S_2^{(m)}\) in the form

    $$\begin{aligned} S_2^{(m)} = \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} N\le n < 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho (n+h_m). \end{aligned}$$

    By definition of \(\rho (n+h_m)\) this is equal to

    $$\begin{aligned} \frac{1}{\log {v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} a\le v \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{g_2(a)}\log {\frac{v}{a}}\sum _{\begin{array}{c} N\le n < 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_m\,\,(\text {mod}\,\,a) \end{array}}r(n+h_m). \end{aligned}$$

    From considering the support of \(\lambda _{d_1,\ldots ,d_k}\) we see that for non-zero contribution we may assume \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) and a are pairwise coprime. In this case the inner sum can be evaluated according to Lemma 5.3, taking \(q=W\prod _{i\ne m}[d_i,e_i]\) and \(d=a[d_m,e_m]\). As \(q \ll WR^2 \ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d \ll vR^2 \ll N^{\theta _1+\theta _2}\) we see that the inner sum evaluates to

    $$\begin{aligned} \frac{g_1(q)g_2(d)}{2qd}\pi N +O_{\epsilon }(N^{\frac{1}{3}+\frac{1}{2}(\theta _1+\theta _2)+\epsilon }). \end{aligned}$$

    Bounding the sum over a trivially by \(v\log {v}\) and using (6.12), we see the error term contributes \(O_{\epsilon }(N^{\frac{1}{3}+\frac{3}{2}(\theta _1+\theta _2)+\epsilon })\) which is negligible. We obtain a main term

    $$\begin{aligned}&\frac{X_{N,W}g_1(W)\pi N}{2W\log {v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{\begin{array}{c} i\ne m \end{array}}\frac{g_1([d_i,e_i])}{[d_i,e_i]} \frac{1}{[d_m,e_m]^2}, \end{aligned}$$

    where we have defined

    $$\begin{aligned} X_{N,W}=\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{a}\log {\frac{v}{a}} \end{aligned}$$

    as in (5.2). The sieve sum above is of the form \(S_{J}\) with \(|J|=1.\) We can evaluate this by Lemma 6.6 to obtain

    $$\begin{aligned} S_2^{(m)} = (1+o(1))\frac{X_{N,W}g_1(W)\pi B^{k+1}N}{2W\log {v}}L_{k;m}(F). \end{aligned}$$

    Recalling the definition of B in (6.8), evaluating \(X_{N,W}\) according to Lemma 5.6, and using the fact

    $$\begin{aligned} \frac{g_1(W_3)\varphi (W_3)}{W_3} = \prod _{\begin{array}{c} p|W_3 \end{array}}\Bigg (1-\frac{1}{p^2}\Bigg ) = \frac{1}{2A^2}+O(D_0^{-1}), \end{aligned}$$

    we obtain

    $$\begin{aligned} S_2^{(m)} = (1+o(1))\frac{4 \sqrt{\frac{\log {R}}{\log {v}}}B^kN}{\pi W}L_{k;m}(F). \end{aligned}$$
  3. (iii)

    Rewrite \(S_3^{(m,l)}\) in the form

    $$\begin{aligned} S_3^{(m,l)}=\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho (n+h_{m})\rho (n+h_l). \end{aligned}$$

    Expanding out the definition of \(\rho \) this is

    $$\begin{aligned} \frac{1}{\log ^2{v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}&\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} a,b\le v \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \\&\cdot \sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_{m}\,\,(\text {mod}\,\,a) \\ n\equiv -h_{l}\,\,(\text {mod}\,\,b) \end{array}}r(n+h_{m})r(n+h_l). \end{aligned}$$

    Similarly to the above, for non-zero contribution we may restrict to the case \(W,[d_1,e_1],\ldots ,[d_k,e_k],a,b\) are pairwise coprime (note that the last two congruences are solvable if and only if \((a,b)|h_l-h_m,\) and in the case \((a,2W)=(b,2W)=1\) this is true if and only if \((a,b)=1\)). We evaluate the inner sum according to Lemma 5.4, taking \(q=W\prod _{i\ne m,l}[d_i,e_i],\) \(d_1 = a[d_m,e_m]\) and \(d_2=b[d_l,e_l].\) We note that \(q\ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d_1,d_2\ll N^{\theta _1+\theta _2}.\) Using the fact \(\theta _1+\theta _2<1/18,\) we see the second error term in the definition of \(R_2(N;d_1,d_2,q)\) dominates, and so the inner sum evaluates to

    $$\begin{aligned} \frac{g_1(q)^2\Gamma (d_1,d_2,q)}{q}\pi ^2N+ O_{\epsilon }(N^{\frac{5}{6}+\theta _1+\theta _2+\epsilon }). \end{aligned}$$

    Bounding the rest of the sum trivially, we obtain a total error of size \(O_{\epsilon }(N^{\frac{5}{6}+3\theta _1+2\theta _2+\epsilon })\) which, again, is negligible in the range \(\theta _1+\theta _2<1/18.\) We obtain a main term

    $$\begin{aligned}&\frac{g_1(W)^2\pi ^2N}{W\log ^{2}{v}}\sum _{\begin{array}{c} a,b\le v \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \\&\cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m,l}\frac{g_1([d_i,e_i])^2}{[d_i,e_i]} \Gamma ([d_m,e_m]a,[d_l,e_l]b,W\prod _{i\ne m,l}[d_i,e_i]). \end{aligned}$$

For arbitrary (square-free) moduli \(d_1,d_2\) and q, we can write \(\Gamma (d_1,d_2,q)\) as a product over primes (cf. the definition of \(\Gamma (d_1,d_2,q)\) given in Lemma 5.4 and note that we are summing a multiplicative function). By considering the Euler product and the various support restrictions on the variables \(d_i,e_i,a,b\), one can write \(\Gamma ([d_m,e_m]a,[d_l,e_l]b,W\prod _{i\ne m,l}[d_i,e_i])\) in the form

$$\begin{aligned} \prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}\frac{g_2(a)g_2(b)}{g_7(a)g_7(b)}\prod _{i\ne m,l} \frac{[d_i,e_i]}{g_1([d_i,e_i])\varphi ([d_i,e_i])} \prod _{j=m,l} \frac{1}{[d_j,e_j]\varphi ([d_j,e_j])}, \end{aligned}$$

    leaving us with a main term

    $$\begin{aligned}&\prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}\frac{g_1(W)^2Y_{N,W}\pi ^2N}{W\log ^{2}{v}} \\&\cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m,l}\frac{g_1([d_i,e_i])}{\varphi ([d_i,e_i])}\prod _{j=m,l} \frac{1}{[d_j,e_j]\varphi ([d_{j},e_{j}])}. \end{aligned}$$

    Here we have defined

    $$\begin{aligned} Y_{N,W}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ (a,b)=1 \\ p|a,b \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_7(a)g_7(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \end{aligned}$$

    as in (5.3). The main term is of the form \(S_J\) for \(|J|=2.\) By Lemma 6.6 it can be evaluated as

$$\begin{aligned} S_{3}^{(m,l)} = (1+o(1))\frac{Y_{N,W}g_1(W)^2\pi ^2B^{k+2}N}{W\log ^2{v}}L_{k;m,l}^{(2)}(F), \end{aligned}$$

where we have used the estimate

    $$\begin{aligned} \prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}=1+O(D_0^{-1}). \end{aligned}$$

    Evaluating \(Y_{N,W}\) as in Lemma 5.6, this simplifies to

$$\begin{aligned} S_{3}^{(m,l)} = (1+o(1))\frac{64(\frac{\log {R}}{\log {v}}) B^{k}N}{\pi ^2 W}L_{k;m,l}^{(2)}(F). \end{aligned}$$
  4. (iv)

    Rewrite \(S_4^{(m)}\) in the form

    $$\begin{aligned} \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho ^2(n+h_m). \end{aligned}$$

    Expanding out the definition of \(\rho ^2(n)\) we see this is equal to

    $$\begin{aligned} \frac{1}{\log ^2{v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{a,b\le v}\frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}}\sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_m\,\,(\text {mod}\,\,[a,b]) \end{array}}r^2(n+h_m). \end{aligned}$$

    Again, we may restrict to the case \(W,[d_1,e_1],\ldots ,[d_k,e_k],[a,b]\) are pairwise coprime. In this case the inner sum can be evaluated according to Lemma 5.5, taking \(q=W\prod _{i\ne m}[d_i,e_i]\) and \(d=[a,b][d_m,e_m].\) We note that \(q\ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d\ll N^{2\theta _1+\theta _2},\) and so the inner sum becomes

    $$\begin{aligned} \frac{g_3(q)g_4(d)}{qd}\Bigg (\log {N}+A_2+2\sum _{p|q}g_5(p)-2\sum _{p|d}g_6(p)\Bigg )N+O_{\epsilon }(N^{\frac{3}{4}+\theta _2+\epsilon }). \end{aligned}$$

    Bounding the rest of the sum trivially, we see the error term contributes \(O_{\epsilon }(N^{\frac{3}{4}+2(\theta _1+\theta _2)+\epsilon })\) which is small. For the main term, let

    $$\begin{aligned} Z_{N,W}^{(1)}&= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}},\\ Z_{N,W}^{(2)}&= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}} \sum _{p|[a,b]}g_6(p) \end{aligned}$$

    be as in (5.4) and (5.5). We can express \(S_4^{(m)}=\Lambda _1+\Lambda _2+\Lambda _3+\Lambda _4\) where

    $$\begin{aligned} \Lambda _1&= \frac{g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}}\Bigg (\log {N}+A_2+2\sum _{p|W}g_5(p)\Bigg )T,\\ \Lambda _2&= \frac{2g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}} \sum _{i\ne m} \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}g_5(p)T^{(p,i)}, \\ \Lambda _3&= -\frac{2g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}}\sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}g_6(p)T^{(p,m)}, \\ \Lambda _4&=-\frac{2g_3(W)Z_{N,W}^{(2)}N}{W\log ^2{v}}T \end{aligned}$$

    and

    $$\begin{aligned} T&= \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m}\frac{g_1([d_i,e_i])}{[d_i,e_i]}\frac{1}{[d_m,e_m]^2}, \\ T^{(p,i)}&= \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ p|[d_i,e_i] \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m}\frac{g_1([d_i,e_i])}{[d_i,e_i]}\frac{1}{[d_m,e_m]^2}. \end{aligned}$$

    T is of the form \(S_{J}\) for \(|J|=1,\) and so by Lemma 6.6 it can be evaluated as

    $$\begin{aligned} T = (1+o(1))B^{k+1}L_{k;m}^{(1)}(F). \end{aligned}$$

To evaluate \(T^{(p,i)},\) note that by inclusion-exclusion (splitting the condition \(p|[d_i,e_i]\) into \(p|d_i,\) or \(p|e_i,\) minus \(p|d_i\) and \(p|e_i\)) we can write it as

    $$\begin{aligned} T^{(p,i)}= \Bigg (\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ p|d_i \end{array}}^{*}+\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ p|e_i \end{array}}^{*}-\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ p|d_i,\,p|e_i \end{array}}^{*}\Bigg )\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{j\ne m}\frac{g_1([d_j,e_j])}{[d_j,e_j]}\frac{1}{[d_m,e_m]^2}, \end{aligned}$$

    where \(\sum ^{*}\) denotes the condition \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) are pairwise coprime. Thus we see it is of the form \(S_{J,p,1,i}+S_{J,1,p,i}-S_{J,p,p,i}\) for \(|J|=1.\) By Lemma 6.6 we conclude

    $$\begin{aligned} T^{(p,i)} \ll {\left\{ \begin{array}{ll} \frac{F_{\text {max}}^2B^{k+1}(\log \log {R})^2}{p},\,\,&{}\text {if }i\ne m \\ \frac{F_{\text {max}}^2B^{k+1}(\log \log {R})^2}{p^2},\,\,&{}\text {if } i=m \end{array}\right. } \end{aligned}$$

Now we note that, by comparison with the integral \(\int _{D_0}^{\infty }\frac{\log {t}}{t^2}\,\mathrm {d}t = \frac{1+\log {D_0}}{D_0},\) we have

    $$\begin{aligned} \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{g_5(p)}{p}&\ll \sum _{p>D_0} \frac{\log {p}}{p^2} \ll \frac{\log {D_0}}{D_0},\\ \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{g_6(p)}{p^2}&\ll \sum _{p>D_0} \frac{\log {p}}{p^2} \ll \frac{\log {D_0}}{D_0}, \end{aligned}$$

    and so, with our choice \(D_0 = (\log \log {N})^3,\) the contributions from \(\Lambda _2\) and \(\Lambda _3\) are negligible. Because

    $$\begin{aligned} \sum _{\begin{array}{c} p|W \end{array}}g_5(p)&\ll \sum _{p<D_0} \frac{\log {p}}{p} \ll \log {D_0}, \end{aligned}$$

we see that the only contributions to the main term come from the part of \(\Lambda _1\) corresponding to \(\log {N}\) and from \(\Lambda _4,\) leaving us with

    $$\begin{aligned} S_4^{(m)} = (1+o(1)) \frac{g_3(W)B^{k+1}N}{W\log ^2{v}}\Bigg [Z_{N,W}^{(1)}\log {N}-2Z_{N,W}^{(2)}\Bigg ]L_{k;m}^{(1)}(F). \end{aligned}$$

    Evaluating these according to Lemma 5.6, and using the fact

    $$\begin{aligned} \frac{g_3(W_1)}{g_1(W_1)^3}=\prod _{\begin{array}{c} p<D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1} = \frac{3\zeta (2)}{8A^2}+O(D_0^{-1}) = \frac{\pi ^2}{16A^2}+O(D_0^{-1}), \end{aligned}$$

    we obtain

    $$\begin{aligned} S_4^{(m)} = (1+o(1)) \frac{2 \sqrt{\frac{\log {R}}{\log {v}}}(\frac{\log {N}}{\log {v}}+1)B^k N}{\pi W}L_{k;m}^{(1)}(F). \end{aligned}$$

This finishes the proof of Proposition 6.2. \(\square \)

Thus it remains to establish Lemma 6.6. First we require a few technical sieve lemmas. We list these in the following section.

7 Technical sieve sum lemmas

In the various sieve calculations that appear in the proof of Lemma 6.6, we will frequently encounter sums of the form

$$\begin{aligned} \sum _{\begin{array}{c} n\le X \\ p|n\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(n)f(n), \end{aligned}$$

where f is a multiplicative function satisfying \(f(p)=O(1/p).\) We can evaluate sums of this type with the following lemmas.

Lemma 7.1

(Technical sieve sum lemma). Let \(A_1,A_2,L>0.\) Let \(\gamma \) be a multiplicative function satisfying the sieve axioms

$$\begin{aligned} 0\le \frac{\gamma (p)}{p} \le 1-A_1, \end{aligned}$$

and

$$\begin{aligned} -L\le \sum _{w\le p \le z} \frac{\gamma (p)\log {p}}{p} - \frac{1}{2}\log {\frac{z}{w}} < A_2 \end{aligned}$$

for any \(2\le w\le z.\) Let g be the totally multiplicative function defined on primes by \(g(p) = \frac{\gamma (p)}{p-\gamma (p)}\). Finally, let \(G:[0,1]\rightarrow \mathbb {R}\) be a piecewise differentiable function, and let \(G_{\text {max}}=\sup _{t\in [0,1]} (|G(t)|+|G'(t)|)\). Then

$$\begin{aligned} \sum _{d<z}\mu ^2(d)g(d)G\Bigg (\frac{\log {d}}{\log {z}}\Bigg ) = c_{\gamma } \frac{(\log {z})^{\frac{1}{2}}}{\Gamma (1/2)} \int _0^1 G(x)\frac{\mathrm {d}x}{\sqrt{x}}+O_{A_1,A_2}(c_{\gamma }LG_{\text {max}}(\log {z})^{-\frac{1}{2}}), \end{aligned}$$

where

$$\begin{aligned} c_{\gamma } = \prod _{p}\Bigg (1-\frac{\gamma (p)}{p}\Bigg )^{-1}\Bigg (1-\frac{1}{p}\Bigg )^{\frac{1}{2}}. \end{aligned}$$

Here, the implied constant in the Landau notation is independent of G and L.

Proof

This is [5, Lemma 4] with slight changes to notation. \(\square \)
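To orient the reader, we record the model case which recurs below (an illustration; it is not needed in the sequel). Take \(\gamma (p)=1\) for \(p\equiv 3\,\,(\text {mod}\,\,4)\) and \(\gamma (p)=0\) otherwise, so that \(g(p)=1/(p-1)=1/\varphi (p),\) and take \(G\equiv 1.\) Then, since \(\int _0^1 x^{-1/2}\,\mathrm {d}x=2,\) Lemma 7.1 reads

$$\begin{aligned} \sum _{\begin{array}{c} d<z \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(d)}{\varphi (d)} = \frac{2c_{\gamma }}{\Gamma (1/2)}(\log {z})^{\frac{1}{2}}+O((\log {z})^{-\frac{1}{2}}) = \frac{4A}{\pi }(\log {z})^{\frac{1}{2}}+O((\log {z})^{-\frac{1}{2}}), \end{aligned}$$

where the evaluation \(c_{\gamma }=A/\sqrt{L(1)}\) is anticipated from Lemma 7.2 below, together with the identity \(\Gamma (1/2)\sqrt{L(1)}=\pi /2\) used at the end of this section.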

To use this lemma in practice, we need to be able to evaluate the singular series \(c_{\gamma }\) which appears. In the next lemma we do this for a function \(\gamma (p)\) which covers the cases of interest to us.

Lemma 7.2

(Evaluation of singular series). Let

$$\begin{aligned} \gamma (p)= {\left\{ \begin{array}{ll} 1+O(1/p)\,\,&{}\text { if }p\not \mid W, p\equiv 3\,\,(\text {mod}\,\,4), \\ 0\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

With the notation of Lemma 7.1, we have

$$\begin{aligned} c_{\gamma }&= \frac{A}{\sqrt{L(1)}}\cdot \frac{\varphi (W_3)}{W_3}(1+O(D_0^{-1})) \end{aligned}$$

where A is the Ramanujan–Landau constant defined in (4.1).

Proof

Let \(\gamma (p)=1+\alpha (p)\) where \(\alpha (p)=O(1/p).\) Define the auxiliary function

$$\begin{aligned} \delta (p)= {\left\{ \begin{array}{ll} 1\,\,&{}\text {if }p\equiv 3\,\,(\text {mod}\,\,4), \\ 0\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

One can easily show \(c_{\delta } = A/\sqrt{L(1)}.\) The result follows because

$$\begin{aligned} c_{\gamma } = c_{\delta } \prod _{\begin{array}{c} p|W \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p}\Bigg )\prod _{\begin{array}{c} p\not \mid W \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{\alpha (p)}{p-1}\Bigg )^{-1}. \end{aligned}$$

The latter product is \(1+O(D_0^{-1})\): by our assumption \(\alpha (p)=O(1/p)\) each factor is \(1+O(1/p^2),\) and every prime \(p\not \mid W\) exceeds \(D_0,\) so the product is \(1+O\big (\sum _{p>D_0}p^{-2}\big )=1+O(D_0^{-1}).\) \(\square \)
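For completeness, here is a sketch (ours) of the computation behind \(c_{\delta } = A/\sqrt{L(1)}\); we assume (4.1) gives the usual product representation \(A=\frac{1}{\sqrt{2}}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}(1-p^{-2})^{-1/2},\) and write \(\chi \) for the non-principal character modulo 4. Pairing the Euler factors of \(L(1)=\prod _{p}(1-\chi (p)p^{-1})^{-1}\) with those of \(c_{\delta }^2,\) the factors at primes \(p\equiv 1\,\,(\text {mod}\,\,4)\) cancel, and we are left with

$$\begin{aligned} L(1)\,c_{\delta }^{2} = \Bigg (1-\frac{1}{2}\Bigg )\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{1}{p}\Bigg )^{-1}\Bigg (1-\frac{1}{p}\Bigg )^{-1} = \frac{1}{2}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1} = A^{2}, \end{aligned}$$

which rearranges to the stated value.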

The next lemma collects both of these results together. First we recall the definition of the normalising constant from (6.8):

$$\begin{aligned} B= \frac{2A \varphi (W_3) (\log {R})^{\frac{1}{2}}}{\pi W_3}. \end{aligned}$$
(7.1)

Lemma 7.3

(Evaluation of sieve sums). Let f be a multiplicative function such that

$$\begin{aligned} f(p)=\frac{1}{p}+O\Bigg (\frac{1}{p^2}\Bigg ). \end{aligned}$$

Then for any piecewise smooth function G we have

$$\begin{aligned}&\sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d \Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d)G\Bigg (\frac{\log {d}}{\log {R}}\Bigg )= B \int _0^1G(x)\frac{\mathrm {d}x}{\sqrt{x}} +O\Bigg (\frac{G_{\text {max}}B}{D_0}\Bigg ). \end{aligned}$$

Proof

Let \(f(p)=1/p+g(p)\) where \(g(p)=O(1/p^2),\) and consider the function \(\gamma \) defined on primes by

$$\begin{aligned} \gamma (p)= {\left\{ \begin{array}{ll} 1-\frac{1}{p+1+pg(p)}\,\,&{}\text {if }p\not \mid W, p\equiv 3\,\,(\text {mod}\,\,4), \\ 0,\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

With this choice of \(\gamma (p)\) we have

$$\begin{aligned} \frac{\gamma (p)}{p-\gamma (p)} = f(p). \end{aligned}$$

Note that

$$\begin{aligned} \sum _{w\le p \le z} \frac{\gamma (p)\log {p}}{p}&= \sum _{\begin{array}{c} w\le p \le z \\ p\equiv 3\,\,(\text {mod}\,\,4) \\ p\not \mid W \end{array}} \frac{\log {p}}{p} +O\Bigg ( \sum _{\begin{array}{c} w\le p \le z \\ p\equiv 3\,\,(\text {mod}\,\,4) \\ p\not \mid W \end{array}} \frac{\log {p}}{p^2}\Bigg )\\&=\sum _{\begin{array}{c} w\le p \le z \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {p}}{p}+O\Bigg (\sum _{\begin{array}{c} w\le p \le z \\ p\equiv 3\,\,(\text {mod}\,\,4) \\ p|W \end{array}} \frac{\log {p}}{p}\Bigg )+O(1) \\&=\frac{1}{2}\log {\frac{z}{w}} + O(\log {D_0})+O(1). \end{aligned}$$

Here the final equality is the Mertens-type estimate for primes in the progression \(3\,\,(\text {mod}\,\,4),\) together with the fact that every prime dividing W is less than \(D_0.\) Therefore we can apply Lemma 7.1 with \(\gamma (p),\) taking \(L\ll 1+\log {D_0}\) and \(A_2\) a suitable constant. We obtain

$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d)G\Bigg (\frac{\log {d}}{\log {R}}\Bigg )= & {} \frac{c_{\gamma }(\log {R})^{\frac{1}{2}}}{\Gamma (1/2)}\int _0^1G(x)\frac{\mathrm {d}x}{\sqrt{x}}\\&+O\Bigg (\frac{G_{\text {max}}c_{\gamma }(1+\log {D_0})}{(\log {R})^{\frac{1}{2}}}\Bigg ). \end{aligned}$$

We can evaluate \(c_{\gamma }\) by Lemma 7.2 to find

$$\begin{aligned} c_{\gamma } = \frac{A}{\sqrt{L(1)}}\cdot \frac{\varphi (W_3)}{W_3}(1+O(D_0^{-1})). \end{aligned}$$

When we substitute this back into our expression we see the error incurred here contributes \(O(G_{\text {max}}B/D_0)\) and dominates the error term already present. The result follows as \(\Gamma (1/2)\sqrt{L(1)} = \pi /2.\) \(\square \)
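To make the final identity explicit (assuming, as the notation suggests, that \(L(s)\) is the Dirichlet L-function of the non-principal character modulo 4): \(\Gamma (1/2)=\sqrt{\pi },\) and the Leibniz formula gives

$$\begin{aligned} L(1) = 1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\cdots = \frac{\pi }{4}, \qquad \text {so that}\qquad \Gamma (1/2)\sqrt{L(1)} = \sqrt{\pi }\cdot \frac{\sqrt{\pi }}{2} = \frac{\pi }{2}. \end{aligned}$$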

We highlight the following two results, the first of which follows immediately from Lemma 7.3, and the second of which is trivial.

  1. (1)

    For multiplicative functions f satisfying \(f(p)=1/p+O(1/p^2)\) we have the upper bound

    $$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d) \ll B. \end{aligned}$$
    (7.2)
  2. (2)

    For multiplicative functions g satisfying \(g(p)=O(1/p^2)\) we have the upper bound

    $$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)g(d) \ll 1. \end{aligned}$$
    (7.3)

These sums will appear frequently in our calculations, and we will use these bounds without comment in the arguments which follow.
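Both bounds are quick to verify, as the following sketch (ours) indicates: taking \(G\equiv 1\) in Lemma 7.3 gives (7.2), since \(\int _0^1 x^{-1/2}\,\mathrm {d}x=2,\) while (7.3) follows from a convergent Euler product, C denoting the implied constant in \(g(p)=O(1/p^2)\):

$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d) = 2B+O\Bigg (\frac{B}{D_0}\Bigg ) \ll B, \qquad \sum _{d\ge 1}\mu ^2(d)|g(d)| \le \prod _{p}\Bigg (1+\frac{C}{p^2}\Bigg ) \ll 1. \end{aligned}$$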

8 Establishing Lemma 6.6

Our attention now turns to establishing Lemma 6.6. We follow the combinatorial arguments used by Maynard—the steps which follow mirror those found in [9].

8.1 Change of variables

Our first step to evaluating the sums appearing in Lemma 6.6 is to make a change of variables. We do so in the following proposition.

Proposition 8.1

(Diagonalising the sieve sum). With notation as in Lemma 6.6, denote by \(f^*,g^*\) the convolutions

$$\begin{aligned} f^*=\mu *\frac{1}{f},\quad g^*=\mu *\frac{1}{g}. \end{aligned}$$

Define the diagonalising vectors \(y_{r_1,\ldots ,r_k}^{(J,p,m)} = y_{r_1,\ldots ,r_k}^{(J,p,m,f,g)}\) by

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p,m)}= \Bigg (\prod _{i\in I}\mu (r_i)f^*(r_i)\Bigg )\Bigg (\prod _{j\in J}\mu (r_j)g^*(r_j)\Bigg ) \sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \\ p|d_m \end{array}}\lambda _{d_1,\ldots ,d_k}\prod _{i\in I}f(d_i)\prod _{j\in J}g(d_j). \end{aligned}$$

Let \(y_{\text {max}}^{(J,p,m)}=\sup _{r_1,\ldots ,r_k} |y_{r_1,\ldots ,r_k}^{(J,p,m)}|\) and \(\tilde{y}_{\text {max}}^{(J,p,m)}=\sup _{\begin{array}{c} r_1,\ldots ,r_k \\ (r_m,p)=1 \end{array}} |y_{r_1,\ldots ,r_k}^{(J,p,m)}|.\) (Note that these coincide if \(p=1.\)) Then we have

$$\begin{aligned} S_{J,p_1,p_2,m}&= \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ (u_m,p_1p_2)=1 \\ u_j=1\,\,\forall j\in J \end{array}}\frac{(y_{u_1,\ldots ,u_k}^{(J,p_1,m)})(y_{u_1,\ldots ,u_k}^{(J,p_2,m)})}{\prod _{i\in I}f^*(u_i)\prod _{j\in J}g^*(u_j)}+E. \end{aligned}$$

If \(m\in J\) then the (error) term E satisfies

$$\begin{aligned} E&\ll B^{|I|}\Bigg [\frac{(\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})}{D_0}+\frac{(y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})}{(p_1p_2/(p_1,p_2))^2} \\&\quad +\frac{(y_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})}{p_1^2}+\frac{(\tilde{y}_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})}{p_2^2}\Bigg ]. \end{aligned}$$

If \(m\notin J\) then E satisfies a similar estimate, namely that which is obtained upon replacing all occurrences of \(p_i^2\) with \(p_i\) in the above expression, for \(i\in \{1,2\}\). Moreover, in both of these cases, we adopt the convention that if \(p_i=1\) then any term in our expression for E involving \(p_i\) in the denominator may be omitted.

Proof

Recall the definition of \(S_{J,p_1,p_2,m}\) given in Lemma 6.6:

$$\begin{aligned} S_{J,p_1,p_2,m}=\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \\ p_1|d_m,p_2|e_m \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i\in I}f([d_i,e_i])\prod _{j\in J}g([d_j,e_j]). \end{aligned}$$
(8.1)

We can write this in the form

$$\begin{aligned} S_{J,p_1,p_2,m}=\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \\ p_1|d_m, p_2|e_m \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i\in I}\frac{f(d_i)f(e_i)}{f((d_i,e_i))}\prod _{j\in J}\frac{g(d_j)g(e_j)}{g((d_j,e_j))}, \end{aligned}$$
(8.2)

using multiplicativity of the functions f and g, together with the fact \([d_i,e_i]\) is square-free for each i. We remark that because f and g are non-zero, the functions 1/f, 1/g are well-defined. We note the convolution identities

$$\begin{aligned} \frac{1}{f((d_i,e_i))} = \sum _{u_i|d_i,e_i}f^*(u_i),\,\,\,\,\,\,\,\quad \frac{1}{g((d_j,e_j))} = \sum _{u_j|d_j,e_j}g^*(u_j) \end{aligned}$$

for \(f^*\) and \(g^*\). Substituting these into (8.2) and swapping the order of summation, we obtain

$$\begin{aligned}&\sum _{\begin{array}{c} u_1,\ldots ,u_k \end{array}}\Bigg (\prod _{i\in I}f^*(u_i)\Bigg )\Bigg (\prod _{j\in J}g^*(u_j)\Bigg )\\&\cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \\ u_i|d_i,e_i\,\,\forall i\in I \\ u_j|d_j,e_j\,\,\forall j\in J \\ p_1|d_m,p_2|e_m \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i\in I}f(d_i)f(e_i)\prod _{j\in J}g(d_j)g(e_j). \end{aligned}$$

From the support of the \(\lambda _{d_1,\ldots ,d_k},\) we see the only restriction coming from the pairwise coprimality of \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) is from the possibility \((d_i,e_j)\ne 1\) for \(i\ne j\). We can take care of this constraint by Möbius inversion: multiplying by \(\sum _{s_{i,j}|d_i,e_j}\mu (s_{i,j})\) for all \(i\ne j,\) we obtain

$$\begin{aligned}&\sum _{\begin{array}{c} u_1,\ldots ,u_k \end{array}}\Bigg (\prod _{i\in I}f^*(u_i)\Bigg )\Bigg (\prod _{j\in J}g^*(u_j)\Bigg ) \sum _{s_{1,2},\ldots ,s_{k-1,k}}\Bigg (\prod _{\begin{array}{c} 1\le i,j\le k \\ i\ne j \end{array}}\mu (s_{i,j})\Bigg ) \nonumber \\&\quad \cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ u_i|d_i,e_i\,\,\forall i\in I \\ u_j|d_j,e_j\,\,\forall j\in J \\ s_{i,j}|d_i,e_j\,\,\forall i\ne j \\ p_1|d_m, p_2|e_m \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i\in I}f(d_i)f(e_i)\prod _{j\in J}g(d_j)g(e_j). \end{aligned}$$
(8.3)

We may restrict to the case where \(s_{i,j}\) is coprime to \(s_{i,a},s_{b,j}\) and \(u_i,u_j,\) for \(a\ne j\) and \(b\ne i,\) because the vectors \(\lambda _{d_1,\ldots ,d_k}\) are supported on square-free integers \(d=\prod _{i=1}^{k}d_i.\) Denote the sum over \(s_{i,j}\) with these conditions by \(\sum ^{*}.\) Define the diagonalising vectors

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p,m)}= \Bigg (\prod _{i\in I}\mu (r_i)f^*(r_i)\Bigg )\Bigg (\prod _{j\in J}\mu (r_j)g^*(r_j)\Bigg ) \sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \\ p|d_m \end{array}}\lambda _{d_1,\ldots ,d_k}\prod _{i\in I}f(d_i)\prod _{j\in J}g(d_j). \end{aligned}$$
(8.4)

From the support of \(\lambda _{d_1,\ldots ,d_k}\) we see that \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) is also supported on \(r_1,\ldots ,r_k\) with \(r=\prod _{i=1}^{k}r_i\) square-free, \((r,W)=1\) and \(p|r\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4).\) We claim this change of variables is invertible. Indeed, from the definition (8.4), for \(d_1,\ldots ,d_k\) with \(\prod _{i=1}^{k}d_i\) square-free, we have

$$\begin{aligned} \sum _{\begin{array}{c} r_1,\ldots ,r_k \\ d_i|r_i\,\,\forall i \end{array}}\frac{y_{r_1,\ldots ,r_k}^{(J,p,m)}}{\prod _{i\in I}f^*(r_i)\prod _{j\in J}g^*(r_j)}&=\sum _{\begin{array}{c} r_1,\ldots ,r_k \\ d_i|r_i\,\,\forall i \end{array}}\prod _{i=1}^{k}\mu (r_i) \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ r_i|e_i\,\,\forall i \\ p|e_m \end{array}}\lambda _{e_1,\ldots ,e_k}\prod _{i\in I}f(e_i)\prod _{j\in J}g(e_j) \nonumber \\&= \sum _{\begin{array}{c} e_1,\ldots ,e_k \end{array}}1_{p|e_m} \lambda _{e_1,\ldots ,e_k}\prod _{i\in I}f(e_i)\prod _{j\in J}g(e_j) \sum _{\begin{array}{c} r_1,\ldots ,r_k \\ r_i|e_i\,\,\forall i \\ d_i|r_i\,\,\forall i \end{array}}\prod _{i=1}^{k}\mu (r_i) \nonumber \\&= 1_{p|d_m}\lambda _{d_1,\ldots ,d_k}\prod _{i\in I}\mu (d_i)f(d_i)\prod _{j\in J}\mu (d_j)g(d_j). \end{aligned}$$
(8.5)
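The final equality here follows by factorising the innermost Möbius sum one coordinate at a time: since each \(e_i\) is square-free we have \((d_i,e_i/d_i)=1,\) and so

$$\begin{aligned} \sum _{\begin{array}{c} r_i \\ d_i|r_i,\, r_i|e_i \end{array}}\mu (r_i) = \mu (d_i)\sum _{t|e_i/d_i}\mu (t) = \mu (d_i)1_{e_i=d_i}. \end{aligned}$$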

With this transformation our sum (8.3) becomes

$$\begin{aligned}&\sum _{\begin{array}{c} u_1,\ldots ,u_k \end{array}}\Bigg (\prod _{i\in I}f^*(u_i)\Bigg )\Bigg (\prod _{j\in J}g^*(u_j)\Bigg ) \sum _{\begin{array}{c} s_{1,2},\ldots ,s_{k-1,k} \end{array}}^{*}\Bigg (\prod _{\begin{array}{c} 1\le i,j\le k \\ i\ne j \end{array}}\mu (s_{i,j})\Bigg )\\&\quad \cdot \frac{y_{a_1,\ldots ,a_k}^{(J,p_1,m)}}{\prod _{i\in I}\mu (a_i)f^*(a_i)\prod _{j\in J}\mu (a_j)g^*(a_j)}\cdot \frac{y_{b_1,\ldots ,b_k}^{(J,p_2,m)}}{\prod _{i\in I}\mu (b_i)f^*(b_i)\prod _{j\in J}\mu (b_j)g^*(b_j)}, \end{aligned}$$

where we have defined \(a_i = u_i \prod _{j\ne i}s_{i,j}\) and \(b_j = u_j\prod _{i\ne j}s_{i,j}.\) Because of our constraints on the \(s_{i,j}\) variables, we can use multiplicativity to write this as

$$\begin{aligned}&\sum _{\begin{array}{c} s_{1,2},\ldots ,s_{k-1,k} \end{array}}^{*}\Bigg (\prod _{\begin{array}{c} i,j\in I \\ i\ne j \end{array}}\frac{\mu (s_{i,j})}{f^*(s_{i,j})^2}\Bigg )\Bigg (\prod _{\begin{array}{c} i,j\in J \\ i\ne j \end{array}}\frac{\mu (s_{i,j})}{g^*(s_{i,j})^2}\Bigg )\Bigg (\prod _{\begin{array}{c} i\in I,\,j\in J\text { or} \\ i\in J,\,j\in I \end{array}}\frac{\mu (s_{i,j})}{f^*(s_{i,j})g^*(s_{i,j})}\Bigg )\\&\quad \cdot \sum _{\begin{array}{c} u_1,\ldots ,u_k \end{array}}\frac{(y_{a_1,\ldots ,a_k}^{(J,p_1,m)})(y_{b_1,\ldots ,b_k}^{(J,p_2,m)})}{\prod _{i\in I}f^*(u_i)\prod _{j\in J}g^*(u_j)}. \end{aligned}$$
(8.6)

We now wish to reduce to the case when \((a_m,p_1)=(b_m,p_2)=1.\) Indeed, we will show the contribution from the alternative cases is negligible. Of course, depending on whether or not \(p_1=1\) and/or \(p_2=1,\) some (or all) of the analysis which follows is not necessary, and this accounts for the convention we assert in the statement of the proposition. First let us note the estimate

$$\begin{aligned} \frac{1}{f^*(p)} = \frac{f(p)}{1-f(p)}=\frac{\frac{1}{p}+O\left( \frac{1}{p^2}\right) }{1-\frac{1}{p}+O\left( \frac{1}{p^2}\right) }=\frac{1}{p}+O\Bigg (\frac{1}{p^2}\Bigg ), \end{aligned}$$
(8.7)

and similarly

$$\begin{aligned} \frac{1}{g^{*}(p)}=\frac{1}{p^2}+O\Bigg (\frac{1}{p^3}\Bigg ). \end{aligned}$$
(8.8)
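Estimates (8.7) and (8.8) feed directly into the bounds (7.2) and (7.3): in the notation above,

$$\begin{aligned} \sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{f^*(u)} \ll B, \qquad \sum _{u\ge 1}\frac{\mu ^2(u)}{g^*(u)} \ll 1, \qquad \sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)^2} \ll 1, \end{aligned}$$

and similarly for the sums involving \(1/(f^*g^*)\) and \(1/(g^*)^2;\) this is the source of the factor \(B^{|I|}\) appearing in each of the case bounds below.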

Now, there are three cases to consider.

  1. (1)

    Suppose that \(p_1|a_m\) and \(p_2|b_m.\) This occurs if and only if \(p_1|u_m\) or \(p_1|s_{m,j}\) for some \(j\ne m,\) and \(p_2|u_m\) or \(p_2|s_{i,m}\) for some \(i\ne m.\) Suppose, for example, that \(p_1|u_m\) and \(p_2|u_m.\) Moreover let us assume that \(m\in J.\) Then one can bound the contribution as follows:

    $$\begin{aligned}&\ll (y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|}\\&\qquad \cdot \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{g^*(u)}\Bigg )^{|J|-1} \Bigg (\sum _{\begin{array}{c} u_m\le R \\ (u_m,W)=1 \\ p_1,p_2|u_m \\ p|u_m\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u_m)}{g^*(u_m)}\Bigg )\\&\qquad \times \Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)^2}\Bigg )^{|I|^2-|I|}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{g^*(s)^2}\Bigg )^{|J|^2-|J|}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)g^*(s)}\Bigg )^{2|I||J|} \\&\quad \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})B^{|I|}}{(p_1p_2/(p_1,p_2))^2}. \end{aligned}$$

    It is easy to see that this bound also holds in any of the other possible cases in which \(p_1|a_m\) and \(p_2|b_m\) and \(m\in J.\) If instead \(m\notin J,\) then again it is easy to see the contribution is

    $$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1p_2/(p_1,p_2)}, \end{aligned}$$

    in all possible cases.

  2. (2)

    Suppose that \(p_1|a_m\) and \(p_2\not \mid b_m.\) If \(m\in J,\) then similarly to the above, one can bound the contribution by

    $$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1^2}. \end{aligned}$$

    If \(m\notin J\) then likewise one obtains a contribution

    $$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1}. \end{aligned}$$
  3. (3)

    Finally, the case \(p_1\not \mid a_m\) and \(p_2|b_m\) proceeds as above, interchanging the roles of \(p_1\) and \(p_2.\)

Thus, we may now suppose that \((a_m,p_1)=(b_m,p_2)=1.\) From the support of the \(y_{a_1,\ldots ,a_k}^{(J,p_1,m)}\) we see there is no contribution from \((s_{i,j},W)\ne 1\) and so either \(s_{i,j}=1\) or \(s_{i,j}>D_0.\) The contribution from \(s_{i,j}>D_0\) with \(i,j\in I\) is

$$\begin{aligned}&\ll (\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|}\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{g^*(u)}\Bigg )^{|J|} \\&\qquad \times \Bigg (\sum _{\begin{array}{c} s_{i,j}\ge D_0 \end{array}}\frac{\mu ^2(s_{i,j})}{f^*(s_{i,j})^2}\Bigg ) \Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)^2}\Bigg )^{|I|^2-|I|-1}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{g^*(s)^2}\Bigg )^{|J|^2-|J|}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)g^*(s)}\Bigg )^{2|I||J|} \\&\quad \ll \frac{(\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{D_0}, \end{aligned}$$

This contribution will be negligible. The cases \(i,j\in J,\) and the mixed cases \(i\in I, j\in J\) (or vice versa), can be treated in the same way. This leaves us with a main term

$$\begin{aligned} \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ (u_m,p_1p_2)=1 \end{array}}\frac{(y_{u_1,\ldots ,u_k}^{(J,p_1,m)})(y_{u_1,\ldots ,u_k}^{(J,p_2,m)})}{\prod _{i\in I}f^*(u_i)\prod _{j\in J}g^*(u_j)}. \end{aligned}$$

To finish, we claim the contribution from \(u_j>1\) is small whenever \(j\in J\). Indeed if \(u_j>1\) then it must be divisible by a prime \(p>D_0\) (with \(p\equiv 3\,\,(\text {mod}\,\,4)\)). So suppose \(|J| \ge 1\) and let \(j\in J.\) If \(u_j>1\) we get a contribution

$$\begin{aligned}&\ll (\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)}) \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \end{array}}\frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|}\Bigg (\sum _{\begin{array}{c} u\le R \end{array}}\frac{\mu ^2(u)}{g^*(u)}\Bigg )^{|J|-1}\sum _{\begin{array}{c} p>D_0 \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\Bigg (\sum _{\begin{array}{c} u_j<R \\ p|u_j \end{array}}\frac{\mu ^2(u_j)}{g^*(u_j)}\Bigg )\\&\quad \ll (\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|} \sum _{\begin{array}{c} p>D_0 \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{1}{g^*(p)}\Bigg (\sum _{\begin{array}{c} u_j \end{array}}\frac{\mu ^2(u_j)}{g^*(u_j)}\Bigg ) \\&\quad \ll \frac{(\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{D_0}, \end{aligned}$$

which is small. Putting all of these facts together establishes Proposition 8.1. \(\square \)

8.2 Transformation for \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) and proof of Lemma 6.6 parts (i) and (ii)

Define

$$\begin{aligned} y_{r_1,\ldots ,r_k}= \Bigg (\prod _{i=1}^k\mu (r_i)\varphi (r_i)\Bigg )\sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \end{array}}\frac{\lambda _{d_1,\ldots ,d_k}}{\prod _{i=1}^{k}d_i}. \end{aligned}$$

and let \(y_{\text {max}}=\sup _{r_1,\ldots ,r_k}|y_{r_1,\ldots ,r_k}|.\) By the inversion formula (8.5), our definition of \(\lambda _{d_1,\ldots ,d_k}\) in Proposition 6.2 is equivalent to taking

$$\begin{aligned} y_{r_1,\ldots ,r_k}=F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg ). \end{aligned}$$
(8.9)

We now wish to relate the more complicated diagonalisation vectors \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) to these simpler vectors. We first deal with the case when \(J=\emptyset \), which is straightforward. By inspecting the proof of Proposition 6.2, it is clear that we only need to understand this case when \(f(p)=1/p.\)

Lemma 8.2

(Relating \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) to \(y_{r_1,\ldots ,r_k}\) when \(J=\emptyset \)). Suppose \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\ne 0, J=\emptyset \) and \(m\in \{1,\ldots ,k\}.\) Moreover suppose \(f(p)=1/p\) for all primes p. Then the following hold.

  1. (1)

    If \(p_1|r_m\) then

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)} = y_{r_1,\ldots ,r_k}. \end{aligned}$$
  2. (2)

    If \(p_1\not \mid r_m\) then

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)} = \frac{y_{r_1,\ldots ,p_1r_m,\ldots ,r_k}}{\mu (p_1)\varphi (p_1)}. \end{aligned}$$

Proof

If \(f(p)=1/p\) then \(f^{*}(p) = p-1 = \varphi (p).\) Hence we are assuming

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p,m)} = \Bigg (\prod _{i=1}^{k} \mu (r_i)\varphi (r_i)\Bigg ) \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ r_i|d_i\,\,\forall i \\ p|d_m \end{array}}\frac{\lambda _{d_1,\ldots ,d_k}}{\prod _{i=1}^{k}d_i}. \end{aligned}$$

The result then follows by comparing with the definition of \(y_{r_1,\ldots ,r_k}\) given above. \(\square \)

Thus, proceeding, we may suppose that \(|J| \ge 1.\) We may further suppose that \(f(p)\ne 1/p\) as this case is of no interest to us (again, this is clear by inspecting the proof of Proposition 6.2). The following proposition gives the result in full.

Proposition 8.3

(Relating \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) to \(y_{r_1,\ldots ,r_k}\) when \(|J| \ge 1\)). Suppose \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\ne 0, |J| \ge 1\) and \(f(p)\ne 1/p\). Then the following hold:

  1. (i)

If \(m\in J\) and \((r_m,p_1)=1\) we have

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1g(p_1)}{\varphi (p_1)}\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\quad \cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots ,e_k \\ r_i|e_i\,\,\forall i\ne m \\ r_m|e_m' \\ e_i=r_i\,\,\forall i\in I \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots ,e_k}\Bigg (\prod _{\begin{array}{c} j\in J \\ j\ne m \end{array}}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )\Bigg (\frac{g^{**}(e_m')}{\varphi (e_m')}\Bigg )\\&\quad +O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1^2}\Bigg ). \end{aligned}$$
  2. (ii)

If \(m\notin J\) and \((r_m,p_1)=1\) we have

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1f(p_1)}{\varphi (p_1)}\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots ,e_k \\ r_i|e_i\,\,\forall i\ne m \\ e_i=r_i\,\,\forall i\in I\backslash \{m\} \\ e'_m = r_m \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots ,e_k}\Bigg (\prod _{j\in J}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )+O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1}\Bigg ). \end{aligned}$$
  3. (iii)

    If \(p_1|r_m\) we have

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\cdot \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ r_i|e_i\,\,\forall i \\ e_i=r_i\,\,\forall i\in I \end{array}}y_{e_1,\ldots ,e_k}\Bigg (\prod _{j\in J}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )+O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0}\Bigg ). \end{aligned}$$

Here \(f^{**}\) and \(g^{**}\) are defined by the convolutions

$$\begin{aligned} f^{**}= \iota \mu f *1,\,\,\,\,\,g^{**} = \iota \mu g*1, \end{aligned}$$

where \(\iota \) is the identity function, \(\iota (p)=p\).

Proof

We prove (i), with the rest proved in exactly the same way. Directly from the definition (8.4) we have

$$\begin{aligned} \frac{y_{r_1,\ldots ,r_k}^{(J,p_1,m)}}{\prod _{i\in I}\mu (r_i)f^*(r_i)\prod _{j\in J}\mu (r_j)g^*(r_j)}&=\sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \\ p_1|d_m \end{array}}\lambda _{d_1,\ldots ,d_k}\prod _{i\in I}f(d_i)\prod _{j\in J}g(d_j). \end{aligned}$$
(8.10)

From the inversion formula (8.5) and the definition of \(y_{r_1,\ldots ,r_k},\) we see that the right hand side of (8.10) equals

$$\begin{aligned}&\sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \\ p_1|d_m \end{array}}\Bigg (\prod _{i\in I}\mu (d_i)d_if(d_i)\Bigg )\Bigg (\prod _{j\in J}\mu (d_j)d_jg(d_j)\Bigg ) \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ d_i|e_i\,\,\forall i \end{array}}\frac{y_{e_1,\ldots ,e_k}}{\prod _{i=1}^{k}\varphi (e_i)}. \end{aligned}$$

Swapping sums we obtain

$$\begin{aligned} \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ r_i|e_i\,\,\forall i \\ p_1|e_m \end{array}}\frac{y_{e_1,\ldots ,e_k}}{\prod _{i=1}^{k}\varphi (e_i)}\sum _{\begin{array}{c} d_1,\ldots , d_k \\ r_i|d_i\,\,\forall i \\ d_i|e_i\,\,\forall i\\ p_1|d_m \end{array}}\Bigg (\prod _{i\in I}\mu (d_i)d_if(d_i)\Bigg )\Bigg (\prod _{j\in J}\mu (d_j)d_jg(d_j)\Bigg ). \end{aligned}$$
(8.11)

We can evaluate the inner sums using the convolution identities

$$\begin{aligned} f^{**}(n)=\sum _{d|n}\mu (d)df(d),\,\,\,\,\,g^{**}(n)=\sum _{d|n}\mu (d)dg(d). \end{aligned}$$

We note that

$$\begin{aligned} f^{**}(p) = 1-pf(p)=O\Bigg (\frac{1}{p}\Bigg ), \end{aligned}$$
(8.12)

and similarly

$$\begin{aligned} g^{**}(p) = 1-pg(p)=1+O\Bigg (\frac{1}{p}\Bigg ). \end{aligned}$$
(8.13)

With our assumption \(f(p)\ne 1/p,\) we may suppose both of these functions are non-zero. Now, recall that we are assuming \(m\in J.\) Using these identities transforms  (8.11) into

$$\begin{aligned}&\frac{\mu (p_1)p_1g(p_1)}{g^{**}(p_1)} \Bigg (\prod _{i\in I}\frac{\mu (r_i)r_if(r_i)}{f^{**}(r_i)}\Bigg )\Bigg (\prod _{j\in J}\frac{\mu (r_j)r_jg(r_j)}{g^{**}(r_j)}\Bigg ) \\&\quad \cdot \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ r_i|e_i\,\,\forall i \\ p_1|e_m \end{array}}y_{e_1,\ldots ,e_k}\Bigg (\prod _{i\in I}\frac{f^{**}(e_i)}{\varphi (e_i)}\Bigg )\Bigg (\prod _{j\in J}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg ). \end{aligned}$$

Here we are using the fact \(y_{e_1,\ldots ,e_k}\) is supported on square-free integers \(e=\prod _{i=1}^{k}e_i.\) Hence, from (8.10), it follows that

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1g(p_1)}{\varphi (p_1)} \Bigg (\prod _{i\in I}\frac{r_if(r_i)f^{*}(r_i)}{f^{**}(r_i)}\Bigg )\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\quad \cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots e_k \\ r_i|e_i\,\,\forall i\ne m \\ r_m|e_m' \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots e_k}\Bigg (\prod _{i\in I}\frac{f^{**}(e_i)}{\varphi (e_i)}\Bigg )\Bigg (\prod _{\begin{array}{c} j\in J \\ j\ne m \end{array}}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )\frac{g^{**}(e_m')}{\varphi (e_m')}. \end{aligned}$$

Here we have substituted \(e_m= p_1 e_m'\) and used the fact \((r_m,p_1)=1.\) Now, since

$$\begin{aligned} \frac{pf(p)f^{*}(p)}{\varphi (p)}= \frac{p}{p-1}(1-f(p)) = \frac{p}{p-1}\Bigg [1-\frac{1}{p}+O\Bigg (\frac{1}{p^2}\Bigg )\Bigg ]= 1+O\Bigg (\frac{1}{p^2}\Bigg ), \end{aligned}$$

it follows that

$$\begin{aligned} \sup _{r_1,\ldots ,r_k}\prod _{i\in I}\frac{r_i f(r_i)f^{*}(r_i)}{\varphi (r_i)} \ll 1. \end{aligned}$$
(8.14)

Similarly, since

$$\begin{aligned} \frac{p g(p)g^{*}(p)}{\varphi (p)} = \frac{p}{p-1}(1-g(p)) = \frac{p}{p-1}\Bigg [1+O\Bigg (\frac{1}{p^2}\Bigg )\Bigg ], \end{aligned}$$

we have the bound

$$\begin{aligned} \sup _{r_1,\ldots ,r_k}\prod _{j\in J}\frac{r_j g(r_j)g^{*}(r_j)}{\varphi (r_j)}&\ll \sup _{r_1,\ldots ,r_k}\prod _{j\in J}\frac{r_j}{\varphi (r_j)} \ll \log \log {R}. \end{aligned}$$
(8.15)

Here we have used the fact \(r = \prod _{i=1}^{k}r_i \le R\) and the standard estimate

$$\begin{aligned} \frac{n}{\varphi (n)} \ll \log \log {n}. \end{aligned}$$
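This estimate is a standard consequence of Mertens' theorem; we sketch the deduction for completeness. If n has r distinct prime factors then \(r\ll \log {n},\) and replacing those primes by the first r primes (the largest of which is \(\ll r\log {r}\)) only increases the product, so for some absolute constant C,

$$\begin{aligned} \frac{n}{\varphi (n)} = \prod _{p|n}\Bigg (1-\frac{1}{p}\Bigg )^{-1} \le \prod _{p\le C\log {n}\log \log {n}}\Bigg (1-\frac{1}{p}\Bigg )^{-1} \ll \log \log {n}. \end{aligned}$$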

Now, if \(i\ne m\) then either \(e_i=r_i\) or \(e_i>D_0 r_i\) (any prime dividing \(e_i/r_i\) is coprime to W and hence exceeds \(D_0\)). Suppose \(e_{i_0}>D_0r_{i_0}\) for some \(i_0\in I.\) By first using multiplicativity of the sum over the \(e_i\) variables, and then using estimates (8.14) and (8.15), we see that this gives a contribution

$$\begin{aligned}&\ll \frac{y_{\text {max}}\log \log {R}}{p_1^2} \Bigg (\sum _{\begin{array}{c} e_{i_0}> D_0 \\ (e_{i_0},W)=1 \end{array}}\frac{\mu ^2(e_{i_0})f^{**}(e_{i_0})}{\varphi (e_{i_0})}\Bigg ) \\&\qquad \cdot \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)f^{**}(u)}{\varphi (u)}\Bigg )^{|I|-1}\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)g^{**}(u)}{\varphi (u)}\Bigg )^{|J|}\\&\quad \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1^2}, \end{aligned}$$

where we have used the fact \(f^{**}(p)/\varphi (p)=O(1/p^2)\) and \(g^{**}(p)/\varphi (p)=1/p+O(1/p^2),\) which follows from (8.12) and (8.13) respectively. This is small. Thus

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1g(p_1)}{\varphi (p_1)}\Bigg (\prod _{i\in I}\frac{r_if(r_i)f^{*}(r_i)}{\varphi (r_i)}\Bigg )\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg )\\&\quad \cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots ,e_k \\ r_i|e_i\,\,\forall i\ne m \\ r_m|e_m' \\ e_i=r_i\,\,\forall i\in I \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots e_k}\Bigg (\prod _{\begin{array}{c} j\in J \\ j\ne m \end{array}}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )\Bigg (\frac{g^{**}(e_m')}{\varphi (e_m')}\Bigg ) +O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1^2}\Bigg ). \end{aligned}$$

From the support restrictions on \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) we necessarily have \((r_i,W)=1.\) From the above, we see the first product may be replaced by \(1+O(D_0^{-1}).\) This incurs an acceptable error

$$\begin{aligned} \ll \frac{y_{\text {max}}\log \log {R}}{D_0p_1^2} \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p \equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u) g^{**}(u)}{\varphi (u)}\Bigg )^{|J|} \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1^2}, \end{aligned}$$

which gives the result stated. \(\square \)

We record the following useful corollary.

Corollary 8.4

With notation as in the statement of Proposition 8.1, the following estimates hold.

  1. (1)

    If \(m\in J\) then

    $$\begin{aligned} \tilde{y}_{\text {max}}^{(J,p_1,m)} \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{p_1^2}. \end{aligned}$$
  2. (2)

    If \(m\notin J\) then

    $$\begin{aligned} \tilde{y}_{\text {max}}^{(J,p_1,m)} \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{p_1}. \end{aligned}$$
  3. (3)

    We have

    $$\begin{aligned} y_{\text {max}}^{(J,p,m)} \ll y_{\text {max}}B^{|J|}\log \log {R}. \end{aligned}$$

Proof

This follows easily from Lemma 8.2 and Proposition 8.3. (We note that in the special case \(J=\emptyset ,\) in items (2) and (3) we could drop the extra \(\log \log {R}\) factor if required.) \(\square \)

We are now in a position to prove the first two parts of Lemma 6.6.

Proof of Lemma 6.6 parts (i) and (ii)

We prove part (i), in the case \(m\in J.\) The rest of the argument proceeds along the same lines. From Proposition 8.1 we have

$$\begin{aligned} S_{J,p_1,p_2,m} = \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ (u_m,p_1p_2)=1 \\ u_j=1\,\,\forall j\in J \end{array}}\frac{(y_{u_1,\ldots ,u_k}^{(J,p_1,m)})(y_{u_1,\ldots ,u_k}^{(J,p_2,m)})}{\prod _{i\in I}f^*(u_i)\prod _{j\in J}g^*(u_j)}+O\Bigg (\frac{y_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{(p_1p_2/(p_1,p_2))^2}\Bigg ). \end{aligned}$$

Here we have used Corollary 8.4 to control the various error terms in the statement of the proposition. We have also used the fact \(|I|+|J|=k.\) It follows that

$$\begin{aligned} S_{J,p_1,p_2,m}&\ll (\tilde{y}_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|}\\&\quad +\frac{y_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{(p_1p_2/(p_1,p_2))^2} \\&\ll \frac{y_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{(p_1p_2/(p_1,p_2))^2}, \end{aligned}$$

again using Corollary 8.4. This gives the result, as required. \(\square \)

From now on we are only interested in the sums \(S_{J} = S_{J,1,1,m},\) and in particular the cases \(|J|\in \{0,1, 2\}\). For ease of notation we let \(y_{r_1,\ldots ,r_k}^{(J)} = y_{r_1,\ldots ,r_k}^{(J,1,m)}\) and \(y_{\text {max}}^{(J)} = \sup _{r_1,\ldots ,r_k} |y_{r_1,\ldots ,r_k}^{(J)}|.\) For future reference we note the bound

$$\begin{aligned} y_{\text {max}}^{(J)} \ll y_{\text {max}} B^{|J|}\log \log {R}, \end{aligned}$$
(8.16)

proved above.

8.3 Relating vectors to functionals and proof of Lemma 6.6 part (iii)

We first record the following corollary of Proposition 8.3.

Corollary 8.5

(Relating \(y_{r_1,\ldots ,r_k}^{(J)}\) to integral operators). Let \(y_{r_1,\ldots ,r_k}\) be defined in terms of a fixed, smooth function F, supported on \({R}_k=\{\vec {x}\in [0,1]^k:\sum _{i=1}^{k}x_i\le 1\}\), by

$$\begin{aligned} y_{r_1,\ldots ,r_k}=F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg ). \end{aligned}$$

Let

$$\begin{aligned} F_{\text {max}} = \sup _{(t_1,\ldots ,t_k)\in [0,1]^k}|F(t_1,\ldots ,t_k)|+\sum _{i=1}^{k}|\frac{\partial F}{\partial t_i}(t_1,\ldots ,t_k)|. \end{aligned}$$

Define the integral operators

$$\begin{aligned} I_{r_1,\ldots ,r_k; m}(F)&= \int _0^1F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,x_m,\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg )\frac{\mathrm {d}x_m}{\sqrt{x_m}}, \\ I_{r_1,\ldots ,r_k; m,l}(F)&= \int _0^1\Bigg (\int _0^1F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,x_l,\ldots ,x_m,\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg )\frac{\mathrm {d}x_m}{\sqrt{x_m}}\Bigg )\frac{\mathrm {d}x_l}{\sqrt{x_l}}. \end{aligned}$$

Then

  1. (i)

    if \(J=\{m\}\) we have

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\Bigg |_{r_j=1\,\,\forall j\in J} = B\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )I_{r_1,\ldots ,r_k; m}(F)+ O\Bigg (\frac{F_{\text {max}}B}{D_0}\Bigg ). \end{aligned}$$
  2. (ii)

    if \(J=\{m,l\}\) we have

    $$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\Bigg |_{r_j=1\,\,\forall j\in J} = B^2\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )^2I_{r_1,\ldots ,r_k; m,l}(F)+ O\Bigg (\frac{F_{\text {max}}B^2}{D_0}\Bigg ). \end{aligned}$$

Proof

First suppose \(J=\{m\}.\) From Proposition 8.3 we have

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\big |_{r_j=1\,\,\forall j\in J} = \sum _{\begin{array}{c} e_m \end{array}}\frac{g^{**}(e_m)}{\varphi (e_m)}F\Bigg (\frac{\log {r_1}}{\log {R}}, \ldots ,\frac{\log {e_m}}{\log {R}},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg )+O\Bigg (\frac{F_{\text {max}}B}{D_0}\Bigg ). \end{aligned}$$
(8.17)

Consider the function

$$\begin{aligned} \gamma _1(p)= {\left\{ \begin{array}{ll} \frac{p}{1+\frac{\varphi (p)}{g^{**}(p)}}\,\,&{}\text {if }p\not \mid W\prod _{i\in I}r_i\text { and }p\equiv 3\,\,(\text {mod}\,\,4), \\ 0\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

With this choice of \(\gamma _1(p)\) we have

$$\begin{aligned} \frac{\gamma _1(p)}{p-\gamma _1(p)} = \frac{g^{**}(p)}{\varphi (p)} \end{aligned}$$

and one can easily check \(\gamma _1(p) = 1+O(1/p).\) By an argument identical to the proof of Lemma 7.3, we can evaluate the sum in (8.17) as

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\big |_{r_j=1\,\,\forall j\in J}&= B\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )\int _0^1 F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,t_m,\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg )\frac{\mathrm {d}t_m}{\sqrt{t_m}}\\&\quad +O\Bigg (\frac{F_{\text {max}}B}{D_0}\Bigg ), \end{aligned}$$

which proves (i). If \(J=\{m,l\}\) then, again using Proposition 8.3, we find

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\big |_{r_j=1\,\,\forall j\in J}&= \sum _{\begin{array}{c} e_{m},e_{l} \end{array}}\Bigg (\prod _{j=m,l}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {e_{m}}}{\log {R}},\ldots ,\frac{\log {e_{l}}}{\log {R}},\ldots ,\frac{\log {r_{k}}}{\log {R}}\Bigg )\nonumber \\&\quad +O\Bigg (\frac{F_{\text {max}}B^2}{D_0}\Bigg ). \end{aligned}$$
(8.18)

Take the sum over \(e_l\) first. Consider the function

$$\begin{aligned} \gamma _2(p)= {\left\{ \begin{array}{ll} \frac{p}{1+\frac{\varphi (p)}{g^{**}(p)}}\,\,&{}\text {if }p\not \mid W\prod _{i\in I}r_i e_m\text { and } p\equiv 3\,\,(\text {mod}\,\,4), \\ 0\,\,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Reasoning as above, the sum on the right hand side of (8.18) becomes

$$\begin{aligned}&B\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )\sum _{e_{m}}\frac{g^{**}(e_{m})}{e_{m}}\int _0^1 F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {e_{m}}}{\log {R}},\ldots ,t_{l},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg )\frac{\mathrm {d}t_{l}}{\sqrt{t_{l}}}\\&\quad +O\Bigg (\frac{F_{\text {max}}B^2}{D_0}\Bigg ). \end{aligned}$$

We can evaluate this sum in much the same way, this time using the function

$$\begin{aligned} \gamma _3(p)= {\left\{ \begin{array}{ll} \frac{p}{1+\frac{p}{g^{**}(p)}}\,\,&{}\text {if }p\not \mid W\prod _{i\in I}r_i\text { and }p\equiv 3\,\,(\text {mod}\,\,4), \\ 0\,\,&{}\text {otherwise,} \end{array}\right. } \end{aligned}$$

to get the stated result. \(\square \)
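As a quick check of this last step (our verification): with \(\gamma _3\) as above one computes

$$\begin{aligned} \frac{\gamma _3(p)}{p-\gamma _3(p)} = \frac{g^{**}(p)}{p}, \end{aligned}$$

which is precisely the weight attached to \(e_m\) in the remaining sum, so the argument of Lemma 7.3 applies once more and produces the second factor of B.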

If we define the (identity) operator

$$\begin{aligned} I_{r_1,\ldots ,r_k}(F)=F\Bigg (\frac{\log {r_1}}{\log {R}},\ldots ,\frac{\log {r_k}}{\log {R}}\Bigg ), \end{aligned}$$

then the results of Corollary 8.5 can be concisely written as

$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\Bigg |_{r_j=1\,\,\forall j\in J} = B^{|J|}\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )^{|J|}I_{r_1,\ldots ,r_k;J}(F)+ O\Bigg (\frac{F_{\text {max}}B^{|J|}}{D_0}\Bigg ) \end{aligned}$$

for \(|J|\in \{0,1,2\}.\) Here we have used \(I_{r_1,\ldots ,r_k;J}(F)\) to denote \(I_{r_1,\ldots ,r_k;j:j\in J}(F).\) We are now in a position to prove the remaining part of Lemma 6.6.

Proof of Lemma 6.6 part (iii)

We can write the operators in the statement of Lemma 6.6 as follows:

$$\begin{aligned} L(F)&= \int _0^1\ldots \int _0^1 \Bigg [I(F)\Bigg ]^2 \prod _{\begin{array}{c} i=1 \end{array}}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}, \\ L_{m}(F)&= \int _0^1\ldots \int _0^1 \Bigg [I_{m}(F)\Bigg ]^2 \prod _{\begin{array}{c} i=1 \\ i\ne m \end{array}}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}, \\ L_{m,l}(F)&= \int _0^1\ldots \int _0^1 \Bigg [I_{m,l}(F)\Bigg ]^2 \prod _{\begin{array}{c} i=1 \\ i\ne m,l \end{array}}^{k}\frac{\mathrm {d}x_i}{\sqrt{x_i}}, \end{aligned}$$

where we have defined

$$\begin{aligned} I(F)&= F(x_1,\ldots ,x_k), \\ I_m(F)&= \int _0^1F(x_1,\ldots ,x_k)\frac{\mathrm {d}x_m}{\sqrt{x_m}}, \\ I_{m,l}(F)&= \int _0^1 \Bigg (\int _0^1F(x_1,\ldots ,x_k)\frac{\mathrm {d}x_m}{\sqrt{x_m}}\Bigg )\frac{\mathrm {d}x_l}{\sqrt{x_l}}. \end{aligned}$$
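As a sanity check on the normalisations (an illustrative example of ours, not used elsewhere): take \(k=1\) and \(F(x)=1-x\) on \(R_1=[0,1].\) Then

$$\begin{aligned} L(F) = \int _0^1\frac{(1-x)^2}{\sqrt{x}}\,\mathrm {d}x = 2-\frac{4}{3}+\frac{2}{5} = \frac{16}{15}, \qquad L_{m}(F) = \Bigg (\int _0^1\frac{1-x}{\sqrt{x}}\,\mathrm {d}x\Bigg )^2 = \Bigg (\frac{4}{3}\Bigg )^2 = \frac{16}{9}, \end{aligned}$$

there being no outer integrals left in \(L_m(F)\) when \(k=1.\)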

From Proposition 8.1 we see that

$$\begin{aligned} S_J = \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ u_j=1\,\,\forall j\in J \end{array}}\frac{(y_{u_1,\ldots ,u_k}^{(J)})^2}{\prod _{i\in I}f^*(u_i)}+O\Bigg (\frac{(y_{\text {max}}^{(J)})^2B^{|I|}}{D_0}\Bigg ). \end{aligned}$$
(8.19)

From Corollary 8.5, for \(|J|\in \{0,1,2\},\) we have

$$\begin{aligned} (y_{r_1,\ldots ,r_k}^{(J)})^2\Bigg |_{r_j=1\,\,\forall j\in J} = B^{2|J|}\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )^{2|J|}\Bigg [I_{r_1,\ldots ,r_k;J}(F)\Bigg ]^2+ O\Bigg (\frac{F_{\text {max}}^2B^{2|J|}}{D_0}\Bigg ). \end{aligned}$$

Substituting this into (8.19), and using (8.16), yields

$$\begin{aligned} S_J&= B^{2|J|} \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ u_j=1\,\,\forall j\in J \\ (u_i,u_j)=1\,\,\forall i\ne j \\ (u_i,W)=1\,\,\forall i \\ p|u_i\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\prod _{i\in I}\frac{\mu ^2(u_i)\varphi (u_i)^{2|J|}}{f^*(u_i)u_i^{2|J|}}\Bigg [I_{u_1,\ldots ,u_k;J}(F)\Bigg ]^2 \\&\quad +O\Bigg (\frac{F_{\text {max}}^2B^{2|J|}}{D_0}\sum _{\begin{array}{c} u_1,\ldots ,u_k \\ u_j=1\,\,\forall j\in J \\ (u_i,W)=1\,\,\forall i \\ p|u_i\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\prod _{i\in I}\frac{\mu ^2(u_i)}{f^*(u_i)}\Bigg )+ O\Bigg (\frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{D_0}\Bigg ). \end{aligned}$$

The first error contributes

$$\begin{aligned} \ll \frac{F_{\text {max}}^2B^{2|J|}}{D_0}\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|} \ll \frac{F_{\text {max}}^2B^{k+|J|}}{D_0}. \end{aligned}$$

For the main term, if \((u_i,u_j)\ne 1\) then they must be divisible by a prime \(q>D_0\) with \(q\equiv 3\,\,(\text {mod}\,\,4)\). In this case we get a contribution

$$\begin{aligned}&\ll F_{\text {max}}^2B^{2|J|} \sum _{\begin{array}{c} q>D_0 \\ q\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\sum _{\begin{array}{c} u_1,\ldots ,u_k \\ u_j=1\,\,\forall j\in J \\ (u_i,W)=1\,\,\forall i \\ p|u_i \Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \\ q|u_i,u_j \end{array}}\prod _{i\in I}\frac{\mu ^2(u_i)\varphi (u_i)^{2|J|}}{f^*(u_i)u_i^{2|J|}} \\&\ll F_{\text {max}}^2B^{2|J|} \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)\varphi (u)^{2|J|}}{f^*(u)u^{2|J|}}\Bigg )^{|I|} \sum _{q>D_0}\frac{\varphi (q)^{4|J|}}{f^*(q)^2q^{4|J|}} \ll \frac{F_{\text {max}}^2B^{k+|J|}}{D_0}. \end{aligned}$$

Thus this constraint can be removed at the cost of a negligible error and we are left with

$$\begin{aligned} S_J= & {} B^{2|J|} \sum _{\begin{array}{c} u_1,\ldots ,u_k \\ u_j=1\,\,\forall j\in J \\ (u_i,W)=1\,\,\forall i \\ p|u_i\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\prod _{i\in I}\frac{\mu ^2(u_i)\varphi (u_i)^{2|J|}}{f^*(u_i)u_i^{2|J|}}\Bigg [I_{u_1,\ldots ,u_k;J}(F)\Bigg ]^2\\&+O\Bigg (\frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{D_0}\Bigg ). \end{aligned}$$

Now, since

$$\begin{aligned} \frac{\varphi (p)^{2|J|}}{f^*(p)p^{2|J|}} = \frac{1}{p}+O\Bigg (\frac{1}{p^2}\Bigg ), \end{aligned}$$

we can evaluate this multidimensional sum by applying Lemma 7.3 a total of \(|I|=k-|J|\) times. We obtain

$$\begin{aligned} S_J=B^{k+|J|}L_J(F)+O\Bigg (\frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{D_0}\Bigg ). \end{aligned}$$

This completes the proof of Lemma 6.6. \(\square \)