Abstract
Let \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) be an ordered set of integers. We give sufficient conditions for the existence of increasing sequences of natural numbers \(a_j\) and \(n_k\) such that \(n_k+h_{a_j}\) is a sum of two squares for every \(k\ge 1\) and \(1\le j\le k.\) Our method uses a novel modification of the Maynard–Tao sieve together with a second moment estimate. As a special case of our result, we deduce a conjecture due to D. Jakobson which has several implications for quantum limits on flat tori.
1 Introduction
We say that a set \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) of distinct integers is admissible if \(\#\{\mathcal {H}\,\,(\text {mod}\,\,p)\}<p\) for every prime p. An outstanding problem in analytic number theory is the prime k-tuples conjecture, which asserts the following.
Conjecture 1.1
Let \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) be admissible. Then there exist infinitely many integers n such that the translates \(n+h_1,\ldots ,n+h_k\) are all prime.
A proof of this conjecture is far out of reach of current techniques. However, we have been successful in establishing various weak versions of this result using sieve methods. For example, the Maynard–Tao sieve can be used to show that \(\gg \log {k}\) of the translates are simultaneously prime infinitely often, when k is sufficiently large (cf. [9, 11]).
We extend the definition of admissibility to infinite ordered sets and say \(\mathcal {H}^*=\{h_1,h_2,\ldots \}\) is admissible if the finite truncation \(\{h_1,\ldots ,h_k\}\subseteq \mathcal {H}^*\) is admissible for every \(k\ge 1.\) In this paper we are interested in the following variation of this conjecture, for numbers representable as a sum of two squares.
Conjecture 1.2
Let \(\mathcal {H}^*=\{h_1,h_2,\ldots \}\) be admissible. Then there exists an increasing sequence of integers \(n_k\) such that, for every \(k\ge 1,\) the translates \(n_k+h_1,\ldots ,n_k+h_k\) are sums of two squares.
We remark that if we replaced “sums of two squares” with “prime” here, then this would simply be a reformulation of Conjecture 1.1. (It is easy to show that any finite admissible set can be extended to an infinite admissible set.)
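Checking admissibility of a finite set is a finite computation: for any prime \(p>k\) the k residues cannot occupy all p classes, so only primes \(p\le k\) need testing. The following is a minimal illustrative sketch of this check (not part of the argument):

```python
def is_admissible(H):
    """Check whether a finite set H of distinct integers is admissible,
    i.e. H occupies fewer than p residue classes modulo every prime p.
    For p > len(H) this is automatic, so only primes p <= len(H) matter."""
    k = len(H)
    primes = [p for p in range(2, k + 1) if all(p % q for q in range(2, p))]
    return all(len({h % p for h in H}) < p for p in primes)

print(is_admissible([0, 2, 6]))  # True: an admissible triple
print(is_admissible([0, 2, 4]))  # False: covers every residue class mod 3
```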
Our interest in this version of the conjecture stems from a problem which appears towards the end of D. Jakobson’s “Quantum limits on flat tori” paper [8]. In this paper Jakobson is concerned with characterising the possible quantum limits that can arise on the standard flat d-dimensional torus \(\mathbb {T}^d=\mathbb {R}^d/\mathbb {Z}^d.\) A complete classification of such objects is established in two dimensions, with possible behaviours in higher dimensions described unconditionally for \(d\ge 4,\) and conditionally for \(d=3\) on a weak version of Conjecture 1.2 (cf. [8, Conjecture 8.2]).
In this paper we establish Jakobson’s conjecture.
Theorem 1.3
There exist increasing sequences of natural numbers \(a_j\) and \(M_k\) such that \(M_k-a_j^2\) is a sum of two squares for every \(k\ge 1\) and \(1\le j \le k.\) Moreover, the sequence \(a_j\) is such that:
(1) \(r_{2}(a_j)<r_{2}(a_{j+1})\) for all \(j\ge 1.\)

(2) The even parts are uniformly bounded; that is to say, if we write \(a_j = 2^{b_j}m_j\) where \((m_j,2)=1,\) then \(b_j = O(1)\) uniformly for \(j\ge 1\).
Here, \(r_2(n)\) denotes the number of representations of n as a sum of two squares. We deduce Theorem 1.3 from the following general result.
Theorem 1.4
Let \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) be admissible such that each \(h_i\) is divisible by 4. Then there exist increasing sequences of natural numbers \(a_j\) and \(n_k\) such that \(n_k+h_{a_j}\) is a sum of two squares for every \(k\ge 1\) and \(1\le j\le k.\)
For example, applying Theorem 1.4 to the admissible set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) with elements \(h_i=-(2\cdot 5^{i})^2\) yields Theorem 1.3.
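Indeed, for this choice one checks that \(r_2(2\cdot 5^{j})=4(j+1)\) is strictly increasing and that the even part of \(2\cdot 5^{j}\) is always \(2^1\), so properties (1) and (2) hold for any subsequence. A brute-force numerical sanity check (illustrative only):

```python
from math import isqrt

def r2(n):
    """Brute-force count of representations n = x^2 + y^2 over ordered
    pairs of integers (x, y), signs included."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        m = n - x * x
        y = isqrt(m)
        if y * y == m:
            count += 1 if y == 0 else 2  # y = 0, or the pair y and -y
    return count

a = [2 * 5**j for j in range(1, 7)]
print([r2(x) for x in a])  # [8, 12, 16, 20, 24, 28]: strictly increasing
print([x % 4 for x in a])  # all 2: the even part of each term is exactly 2
```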
As mentioned above, Theorem 1.3 allows us to conclude results about quantum limits on flat tori. Let \((\lambda _j)_{j\ge 1}\) be a sequence of eigenvalues of the Laplace operator \(\nabla \) on \(\mathbb {T}^d\) such that \(\lambda _j\rightarrow \infty ,\) and let \(\varphi _j\) be corresponding eigenfunctions with \(\left\Vert \varphi _j\right\Vert _2=1\). If the sequence of probability measures \(\mathrm {d}\mu _j=|\varphi _j|^2\mathrm {d}x\) has a weak-\(*\) limit \(\mathrm {d}v\), then we call \(\mathrm {d}v\) a quantum limit. (Here \(\mathrm {d}x\) is the normalised Riemannian volume.)
It can be shown that all limits of such sequences \(\mathrm {d}\mu _j\) are absolutely continuous with respect to the Lebesgue measure on \(\mathbb {T}^d\) (cf. [8, Theorem 1.3]), and so one can consider the Fourier expansion
$$\begin{aligned} \mathrm {d}v = \Bigg (\sum _{\tau \in \mathbb {Z}^d}c_{\tau }e^{2\pi i\langle \tau ,x\rangle }\Bigg )\mathrm {d}x. \end{aligned}$$(1.1)
Among other things, Jakobson shows that in two dimensions all quantum limits are necessarily trigonometric polynomials (cf. [8, Theorem 1.2]). The same result is not true for \(d\ge 4\), and conjecturally not true for \(d=3\) either (cf. [8, Conjecture 8.2] and the following discussion). With Theorem 1.3, we can now complete this aspect of the classification of quantum limits on flat tori.
Theorem 1.5
There exist quantum limits on \(\mathbb {T}^3\) that are not trigonometric polynomials.
As further consequences of Theorem 1.3 we are able to show the following results for quantum limits whose Fourier expansions are described as in (1.1).
Theorem 1.6
Let \(\epsilon >0.\) We have the following.
(i) For \(d \ge 4\) there exist quantum limits \(\mathrm {d}v\) on \(\mathbb {T}^d\) with densities that are not in \(l^{2-\epsilon }\) (i.e. for which \(\sum _{\tau }|c_{\tau }|^{2-\epsilon }\) diverges).

(ii) For \(d\ge 5\) there exist quantum limits \(\mathrm {d}v\) on \(\mathbb {T}^d\) for which
$$\begin{aligned} \limsup _{\rho \rightarrow \infty } \frac{\Sigma (\rho )}{\rho ^{d-4-\epsilon }}=+\infty , \end{aligned}$$
where \(\Sigma (\rho )\) is defined as
$$\begin{aligned} \Sigma (\rho ) = \sum _{\begin{array}{c} \tau \in \mathbb {Z}^d \\ |\tau | < \rho \end{array}}|c_{\tau }|. \end{aligned}$$(1.2)
The results contained in Theorem 1.6 improve upon various results found in [8]. Part (i) was previously shown for \(d\ge 5\), and has now been extended to the case \(d=4\), where it is optimal (cf. [8, Theorem 1.4]). Part (ii) improves on the weaker lower bound
which was shown for \(d\ge 6\). The lower bound we prove is believed to be optimal for all \(d\ge 5\) (cf. [8, Proposition 1.2] and comments shortly after).
Remark 1.7
It is well-known that the eigenvalues of \(\nabla \) on \(\mathbb {T}^d\) are the numbers \(4\pi ^2 k\) for non-negative integers k, and they occur with multiplicity \(r_d(k)\) (the number of representations of k as the sum of d squares). This means various constructions associated with quantum limits on flat tori can be translated into problems in number theory involving sums of squares.
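For small eigenvalues this dictionary can be tabulated directly; the sketch below (illustrative only) counts the lattice points \(\xi \in \mathbb {Z}^d\) with \(|\xi |^2=k\):

```python
from math import isqrt

def r_d(n, d):
    """Number of xi in Z^d with |xi|^2 = n; by Remark 1.7 this is the
    multiplicity of the Laplace eigenvalue 4*pi^2*n on the flat torus T^d."""
    if d == 1:
        if n == 0:
            return 1
        y = isqrt(n)
        return 2 if y * y == n else 0
    return sum(r_d(n - x * x, d - 1) for x in range(-isqrt(n), isqrt(n) + 1))

# Multiplicities of the first few eigenvalues on T^3:
print([r_d(n, 3) for n in range(9)])  # [1, 6, 12, 8, 6, 24, 24, 0, 12]
```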
Remark 1.8
Jakobson shows how Theorem 1.3 follows from a weak form of the prime k-tuples conjecture, essentially by using the fact that primes \(p\equiv 1\,\,(\text {mod}\,\,4)\) are sums of two squares (cf. discussion at the end of [8, Section 8]). We note that the weak form of the conjecture Jakobson uses is still far out of reach of current methods.
2 Outline of new sieve ideas
In this section, let \(\mathcal {A}\subseteq \mathbb {N}\) denote a set of arithmetic interest, which for our purposes is the set of numbers representable as a sum of two squares (but the following discussion holds more generally). We will denote random variables by boldfaced letters, for example \(\mathbf {X}\). We let \(\mathbb {P}(\cdot )\) denote a probability measure and \(\mathbb {E}[\cdot ]\) the corresponding expectation operator.
2.1 A model problem
Our aim is to prove Theorem 1.4. By a pigeonhole argument (see Proposition 5.1), it suffices to consider the following model problem.
Model Problem
Fix an admissible set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) of integers and a partition \(\mathcal {H}^{*}=B_1\cup B_2\cup \ldots \) where each bin \(B_i\) has a fixed, finite size \(k_i\). Is it the case that for every \(M\ge 1\) there exist elements \(h_{a_1},\ldots ,h_{a_M}\) and infinitely many integers n such that \(h_{a_j}\in B_j\) and \(n+h_{a_j}\in \mathcal {A}\) for \(1\le j\le M\)?
We realise the above set-up as the output of a sieving process. For notational purposes we order \(B_{i}=\{h_{k_0+\cdots +k_{i-1}+1},\ldots ,h_{k_0+\cdots +k_{i}}\}\) for \(i\ge 1,\) with the convention that \(k_0=0.\) Let \(k=k_{0}+\cdots +k_M\) for some large M. Given \(n\in [N,2N)\) for some large N, let \(\mathbf {X}_i\) denote the random variable that counts the number of \(h\in B_i\) such that \(n+h\in \mathcal {A}\), and let \(\mathbf {X}=\mathbf {X}_1+\cdots +\mathbf {X}_{M}.\)
The current method we use to detect primes in k-tuples is the GPY method. For general sets \(\mathcal {A},\) the aim is to show the first moment inequality
$$\begin{aligned} \sum _{N\le n<2N}\Bigg (\sum _{i=1}^{k}\mathbb {1}_{\mathcal {A}}(n+h_i)-m\Bigg )w(n)>0 \end{aligned}$$
holds for some integer \(m\ge 1\), where \(\mathbb {1}_{\mathcal {A}}\) denotes the indicator function of the set \(\mathcal {A}\) and w(n) are non-negative weights (cf. [4, 9, 11]). If we normalise the weights to sum to 1, then this is saying “if we choose n randomly from the interval [N, 2N) with probability w(n), then \(\mathbb {E}[\mathbf {X}]>m\).” From this we can deduce the existence of an \(n\in [N,2N)\) for which at least \(m+1\) of the translates \(n+h_i\) lie in \(\mathcal {A}\). We say such a translate has been “accepted.” Exactly which translates are accepted is unknown. This is a limitation of the first moment method.
It is clear that for our model problem, we require more information about which translates \(n+h\) appear. Namely, we need to obtain an accepted translate from each of the M bins \(B_1,\ldots ,B_M\) (recall \(k=k_0+\cdots +k_M\)). This presents two obvious difficulties.
(1) For any \(1\le i\le k\) the probability of the event \(n+h_i\in \mathcal {A}\) depends on k, and tends to 0 as \(k\rightarrow \infty \). This would mean any bin of fixed size expects to get fewer and fewer elements as k gets large. In particular, we cannot hope the hypotheses hold for every \(M\ge 1.\)

(2) Even in the situation where \(\mathbb {E}[\mathbf {X}_i]>1\) holds for each \(1\le i\le M\), we cannot conclude anything about \(\mathbb {P}((\mathbf {X}_1>0)\cap \cdots \cap (\mathbf {X}_M>0))\) unless we input some information about the joint distribution of the bins.
We are able to overcome these issues by modifying the sieve weights and using a second moment estimate.
2.2 Choice of sieve weights
We solve the first problem by modifying the sieve weights to put more emphasis on the earlier bins. This way, we can guarantee that \(\mathbb {P}(n+h\in \mathcal {A}|h\in B_i)=c_i\) where the constant \(c_i\) depends solely on the bin. This also means that we can guarantee \(\mathbb {E}[\mathbf {X}_i]\) is large for each i (provided \(k_i\) is large enough in terms of \(c_i\)). We will consider Maynard–Tao sieve weights built from a smooth function with a fixed factorisation
$$\begin{aligned} f(t_1,\ldots ,t_k)=\prod _{i=1}^{M}f_i(t_{k_0+\cdots +k_{i-1}+1},\ldots ,t_{k_0+\cdots +k_{i}}), \end{aligned}$$
where \(f_i\) is a suitable smooth function supported on the simplex
$$\begin{aligned} \Bigg \{(t_1,\ldots ,t_{k_i})\in [0,1]^{k_i}:\sum _{j=1}^{k_i}t_j\le \beta _i\Bigg \}. \end{aligned}$$
Here \((\beta _i)_{i\ge 1}\) is a sequence of real numbers such that \(\sum _{i=1}^{\infty }\beta _i \le 1\) (cf. the sieve weights defined in [9, Proposition 4.1]). We will take \(\beta _i=2^{-i},\) and in this instance one might say “we have allocated 50% of the sieve power to \(B_1\).”
2.3 Concentration of measure
We can deal with the second problem by showing the random variables \(\mathbf {X}_i\) exhibit “enough” independence. This is precisely what concentration of measure arguments are used for. For example, an application of the union bound and Chebyshev’s inequality tells us that
$$\begin{aligned} \mathbb {P}\Bigg (\bigcup _{i=1}^{M}\Big \{|\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i]|\ge t_i\Big \}\Bigg )\le \sum _{i=1}^{M}\frac{\mathbb {E}[(\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i])^2]}{t_i^2}, \end{aligned}$$
where \(t_i \ge 1\) are concentration parameters. Thus, if we can show the variances \(\mathbb {E}[(\mathbf {X}_i-\mathbb {E}[\mathbf {X}_i])^2]\) are small, then we should be able to show each random variable concentrates in a (small) interval about its mean with high probability. In particular, we should be able to show that we get (at least) one accepted translate coming from each bin after the sieving process. We implement this analytically by using a second moment estimate (see Proposition 5.2).
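The shape of this argument can be illustrated with a toy simulation, in which independent binomial counts stand in for the \(\mathbf {X}_i\) (this is of course not the joint distribution produced by the sieve):

```python
import random

random.seed(0)

# Toy model: M independent binomial counts X_i with mean mu. The event
# "some bin is empty" forces |X_i - mu| >= mu for some i, so by Chebyshev
# and the union bound its probability is at most M * Var(X_i) / mu^2.
M, n, p, trials = 5, 100, 0.1, 20_000
mu, var = n * p, n * p * (1 - p)

misses = sum(
    any(sum(random.random() < p for _ in range(n)) == 0 for _ in range(M))
    for _ in range(trials)
)
print(misses / trials, "<=", M * var / mu**2)  # empirical frequency vs bound
```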
Remark 2.1
A similar second moment estimate was considered by Banks, Freiberg and Maynard in their paper [1]. They showed a partition result (for primes), where the bins are allowed to grow with k. A key aspect of our work is that the bin sizes are fixed. Moreover they only needed upper bounds of the correct order of magnitude for the sieve sums, whereas we require precise asymptotics.
2.4 Hooley’s \(\rho \) function
In practice, utilising a second moment estimate requires an understanding of the two-point correlations
$$\begin{aligned} \sum _{N\le n<2N}\rho _{\mathcal {A}}(n)\rho _{\mathcal {A}}(n+h), \end{aligned}$$(2.6)
where \(\rho _{\mathcal {A}}\) is a non-negative function supported on \(\mathcal {A}.\) This means our methods are limited to cases in which estimates of the above type are known. In particular we cannot deal with the case of primes, as evaluating the above sum asymptotically with \(\mathbb {1}_{\mathbb {P}}\) or the von Mangoldt function \(\Lambda \) (say) is equivalent to the twin prime conjecture.
One can do much better when working with sums of two squares. We cannot evaluate (2.6) asymptotically using the indicator function, but we can if we work with the representation function \(r_2(n)\) instead. Unfortunately \(r_2(n)\) is too large for our purposes, and it proves necessary to consider a weighted version instead. In Hooley’s work [7] on the distribution of numbers representable as the sum of two squares, he considers a weighted representation function \(\rho (n)=t(n)r_2(n)\) where
$$\begin{aligned} t(n)=\frac{1}{\log {v}}\sum _{\begin{array}{c} a|n,\,a\le v \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{g_2(a)}\log {\frac{v}{a}},\qquad v=N^{\theta _1}, \end{aligned}$$(2.7)
and \(\theta _1\) is a suitably small, fixed constant (for example Hooley takes \(\theta _1=1/20\)). Here \(g_2(p)\) is the multiplicative function defined on primes by
$$\begin{aligned} g_2(p)={\left\{ \begin{array}{ll} 2-\frac{1}{p},\,\,&{}\text {if } p\equiv 1\,\,(\text {mod}\,\,4), \\ \frac{1}{p},\,\,&{}\text {if } p\equiv 3\,\,(\text {mod}\,\,4). \end{array}\right. } \end{aligned}$$(2.8)
The t(n) factor acts to dampen down the oscillations due to \(r_2(n).\) Thus \(\rho (n)\) acts as a proxy for the indicator function \(\mathbb {1}_{n=\Box +\Box }\) and moreover asymptotics for (2.6) are available for \(\rho (n).\) This is the function we will be working with.
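The need for the damping factor can be seen numerically: by the classical identity \(r_2(n)=4\sum _{d|n}\chi _4(d),\) the function \(r_2\) has average value \(\pi \) but spikes at integers with many prime factors \(\equiv 1\,\,(\text {mod}\,\,4).\) An illustrative computation:

```python
def chi4(n):
    """The non-trivial Dirichlet character mod 4."""
    return 0 if n % 2 == 0 else (1 if n % 4 == 1 else -1)

def r2(n):
    """Classical identity: r_2(n) = 4 * sum of chi4(d) over divisors d of n."""
    s, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            s += chi4(d)
            if d * d != n:
                s += chi4(n // d)
        d += 1
    return 4 * s

N = 10_000
values = [r2(n) for n in range(1, N)]
print(sum(values) / N)  # ~ pi: the average value of r_2 ...
print(max(values))      # ... but with large spikes, hence the damping by t(n)
```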
2.5 Outline of the paper
In Sect. 3 we deduce the results about quantum limits contained in Theorem’s 1.5 and 1.6 from Theorem 1.3. In Sect. 5 we state a few preliminary lemmas that will be needed in the sieve calculations. We defer the proofs of these results to the Appendix. In Sect. 6 we state our main sieve results, and from them we deduce Theorem 1.4. We isolate a key lemma (see Lemma 6.6) from which all of our sieve estimates follow. Sections 7 and 8 are dedicated to proving this lemma.
3 Proofs of quantum limit results
In this section we deduce the results of Theorem 1.5 and Theorem 1.6 from Theorem 1.3. Following [8], we note that \(\varphi _k\) is an eigenfunction of the Laplacian on \(\mathbb {T}^d\) with eigenvalue \(\lambda _k = 4\pi ^2 n_k\) for some \(n_k\in \mathbb {N}\) if and only if its Fourier expansion is of the form
$$\begin{aligned} \varphi _k(x)=\sum _{\begin{array}{c} \xi \in \mathbb {Z}^d \\ |\xi |^2=n_k \end{array}}a_{\xi }e^{2\pi i\langle \xi ,x\rangle } \end{aligned}$$
for \(a_{\xi }\in \mathbb {C}.\) Moreover \(\left\Vert \varphi _k \right\Vert _2=1\) if and only if \(\sum _{\xi }|a_{\xi }|^2=1\). It follows that
$$\begin{aligned} |\varphi _k(x)|^2=\sum _{\tau \in \mathbb {Z}^d}b_{\tau }(k)e^{2\pi i\langle \tau ,x\rangle },\qquad \text {where}\quad b_{\tau }(k)=\sum _{\begin{array}{c} \xi -\eta =\tau \\ |\xi |^2=|\eta |^2=n_k \end{array}}a_{\xi }\overline{a_{\eta }}. \end{aligned}$$(3.2)
Let \(\mathrm {d}v\) be a quantum limit on \(\mathbb {T}^d\) with Fourier expansion as in (1.1). By \(|\varphi _k|^2\mathrm {d}x \rightarrow \mathrm {d}v\) weak-\(*\) as \(k\rightarrow \infty \) we mean that for every \(\tau \in \mathbb {Z}^d\) we have \( c_{\tau } = \lim _{k\rightarrow \infty } b_{\tau }(k).\)
Fix \(a_1<a_2<\cdots \) and \(M_1<M_2<\cdots \) as in the statement of Theorem 1.3, and let \(b_j^{(k)},c_{j}^{(k)}\in \mathbb {Z}\) be such that
$$\begin{aligned} M_k = a_j^2+(b_j^{(k)})^2+(c_j^{(k)})^2\quad \text {for } 1\le j\le k. \end{aligned}$$
Let \(0<\epsilon <2\) and let \(F=F_{\epsilon }:\mathbb {N}\rightarrow \mathbb {N}\) be a rapidly increasing function whose rate of growth will be specified later. As we are assuming both \(a_i\rightarrow \infty \) and \(r(a_i)\rightarrow \infty \), by passing to a subsequence if necessary (and relabelling the indices of the sequence \(M_k\)), we may suppose \(a_i,r(a_i) \gg F(i)\).
We will require information about the number of integer points on the surface of the d-dimensional sphere. For this we recall the following results: writing \(n=2^km\) and letting \(\sigma (n)=\sum _{d|n}d\) denote the sum-of-divisors function, we have the identities
Here \(C_d(n^2)\) is a singular series which satisfies \(C_d(n^2) \asymp _d 1.\)
We prove each statement similarly: in each case we consider a suitable sequence of \(L^2\)-normalised eigenfunctions with eigenvalues \(\lambda _k = 4\pi ^2 M_k\) and show that the limit has the desired property.
Proof of (Theorem 1.3 \(\Rightarrow \) Theorem 1.5) Consider the sequence of \(L^{2}\)-normalised eigenfunctions on \(\mathbb {T}^3\) that arise by choosing coefficients
Fix \(i\ge 1.\) With this choice, for any \(k\ge i\) we obtain
because the \(a_i\) are distinct and so the only contribution to the sum comes from \(\xi =(a_i,b_i^{(k)},c_i^{(k)})\) and \(\eta =(-a_i,b_i^{(k)},c_i^{(k)})\). Hence
which proves the theorem. \(\square \)
Proof of (Theorem 1.3 \(\Rightarrow \) Theorem 1.6) It suffices to prove part (i) for \(d=4,\) by identifying the eigenfunctions on \(\mathbb {T}^d\) with the eigenfunctions on \(\mathbb {T}^{d+l}\) all of whose non-zero frequencies lie in the subspace \(\{(x_1,\ldots ,x_{d+l}):x_{d+1}=\cdots =x_{d+l}=0\}\subseteq \mathbb {Z}^{d+l}\).
Consider the sequence of \(L^{2}\)-normalised eigenfunctions on \(\mathbb {T}^4\) that arise by choosing
Fix i and suppose \(k\ge i\). Given two non-zero coefficients \(a_{\xi },a_{\xi '}\), corresponding to vectors of the form \(\xi =(X,Y,b_i^{(k)},c_i^{(k)})\) and \(\xi '=(X',Y',b_i^{(k)},c_i^{(k)}),\) we see the difference vector is
and the norm of this vector is \(\le 2a_i\) by the triangle inequality. From (3.2), it follows that if we sum \(b_{\tau }(k)\) over all \(|\tau |\le 2a_i\) then we pick up all such differences. There are \(r(a_i^2)^2\) of them, leading to
Taking the limit as \(k\rightarrow \infty \) we conclude that
Now we can choose \(F=F_{\epsilon }\) so that the expression on the right hand side is unbounded as \(i\rightarrow \infty .\) It follows that \(\sum _{\tau }|c_{\tau }|^{2-\epsilon }\) does not converge, proving part (i).
For part (ii), fix \(d\ge 5.\) We proceed as in part (i), except this time because \(d\ge 5\) we have the lower bound \(r_{d-2}(a_i^2) \gg _{d} a_i^{d-4}\). We remark that to obtain this bound for \(d\in \{5,6\}\) we are using property (2) given by Theorem 1.3. For \(d\ge 7\) the bound holds without this extra assumption on our sequence.
Now consider the sequence of eigenfunctions on \(\mathbb {T}^d\) with densities
Fix i and suppose \(k\ge i\). As above we can conclude
Taking the limit as \(k\rightarrow \infty \) we conclude that
It follows that
Choosing \(F=F_{\epsilon }\) appropriately and letting \(i\rightarrow \infty ,\) we see that for this choice of quantum limit we have
where \(\Sigma (\rho )\) is defined as in (1.2). This proves part (ii). \(\square \)
4 Notation
We will use both Landau and Vinogradov asymptotic notation throughout the paper. N will denote a large integer, and all asymptotic notation is to be understood as referring to the limit as \(N\rightarrow \infty .\) Any dependence of the implied constants on other parameters A will be denoted by a subscript, for example \(X\ll _{A} Y\) or \(X=O_{A}(Y),\) unless stated otherwise. We let \(\epsilon \) denote a small positive constant, and we adopt the convention that it is allowed to change at each occurrence, and even within a line.
We will denote the non-trivial Dirichlet character \((\text {mod}\,\,4)\) by \(\chi _4,\) and we may omit the subscript and simply write \(\chi \). As usual, we let \(\varphi (n)\) denote the Euler totient function, \(\tau _r(n)\) denote the number of ways of writing n as a product of r natural numbers, \(\mu (n)\) denote the Möbius function, and \(r_d(n)\) denote the number of representations of n as a sum of d squares. For the rest of the paper we will write r(n) when \(d=2.\) For integers a, b we let (a, b) denote their highest common factor, and [a, b] denote their lowest common multiple.
We define the Ramanujan–Landau constant
$$\begin{aligned} A = \frac{1}{\sqrt{2}}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1/2}, \end{aligned}$$(4.1)
which will appear in many of our results.
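Numerically \(A\approx 0.7642\); truncating the product gives a quick sanity check (illustrative only):

```python
from math import sqrt

def ramanujan_landau(limit):
    """Truncation of A = (1/sqrt(2)) * prod over p = 3 (mod 4) of
    (1 - 1/p^2)^(-1/2), with the product cut off at `limit`."""
    sieve = bytearray([1]) * (limit + 1)
    prod_val = 1.0
    for p in range(2, limit + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
            if p % 4 == 3:
                prod_val /= 1.0 - p ** -2
    return sqrt(prod_val / 2.0)

print(ramanujan_landau(10**6))  # ~ 0.76422, the classical numerical value
```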
5 Preliminaries
In this section, we formalise some of the notions discussed in Sect. 2, and state a few key estimates that will be required later in the sieve calculations.
5.1 A pigeonhole argument
The following proposition allows us to go from the set-up in Theorem 1.4 to the model problem discussed in Sect. 2.
Proposition 5.1
(Pigeonhole argument for infinite bin set-up). Fix \(\mathcal {A}\subseteq \mathbb {N}\) and a set \(\mathcal {H}^{*}=\{h_1,h_2,\ldots \}\) of integers. Suppose that there exists a partition \(\mathcal {H}^{*}=B_1\cup B_2\cup \ldots \) where each bin \(B_i\) has a fixed, finite size, such that for every \(M\ge 1,\) there exist infinitely many n and M translates \(n+h_{i,M}\in \mathcal {A}\) with \(h_{i,M}\in B_i\) for \(1\le i \le M.\) Then there exist increasing sequences \(a_j\) and \(n_k\) such that for every \(k\ge 1\) we have \(n_k+h_{a_j} \in \mathcal {A}\) for \(1\le j\le k\) and moreover \(h_{a_j}\in B_j\) for all j.
Proof
With the above set-up, for each \(M\ge 1\) obtain translates \(n+h_{i,M}\in \mathcal {A}\) with \(h_{i,M}\in B_i\) for \(1\le i \le M.\) Record this process in the following infinite table:
| \(B_1\) | \(B_2\) | \(B_3\) | \(\ldots \) | \(B_M\) | \(B_{M+1}\) | \(\ldots \) |
| --- | --- | --- | --- | --- | --- | --- |
| \(h_{1,1}\) | | | | | | |
| \(h_{1,2}\) | \(h_{2,2}\) | | | | | |
| \(h_{1,3}\) | \(h_{2,3}\) | \(h_{3,3}\) | | | | |
| \(\vdots \) | \(\vdots \) | \(\vdots \) | | | | |
| \(h_{1,M}\) | \(h_{2,M}\) | \(h_{3,M}\) | \(\ldots \) | \(h_{M,M}\) | | |
| \(h_{1,M+1}\) | \(h_{2,M+1}\) | \(h_{3,M+1}\) | \(\ldots \) | \(h_{M,M+1}\) | \(h_{M+1,M+1}\) | |
| \(\vdots \) | \(\vdots \) | \(\vdots \) | \(\ldots \) | \(\vdots \) | \(\vdots \) | \(\vdots \) |
Look at the first column. By the pigeonhole principle, since \(B_1\) is finite, there must exist an element \(h_{a_1}\in B_1\) which appears infinitely many times. Choose the smallest such \(h_{a_1},\) and choose any \(n_1\in \mathbb {N}\) for which \(n_1+h_{a_1}\in \mathcal {A}\). Now erase all the rows that do not start with \(h_{a_1},\) and look at the remaining (infinite) table. Again, since \(B_2\) is finite, some element \(h_{a_2}\in B_2\) must occur infinitely many times in the second column. Choose the smallest such \(h_{a_2},\) and choose any \(n_2>n_1\) such that \(n_2+h_{a_2}\in \mathcal {A}\) (which we can do because there are infinitely many such \(n_2\)). By construction this \(n_2\) will be such that \(n_2+h_{a_1}\in \mathcal {A}.\) Now erase all rows that don’t start with \(h_{a_1},h_{a_2},\) and repeat this process for \(B_3,\) and so on. We will end up with increasing sequences \(a_j\) and \(n_k\) which by construction satisfy the required conditions. \(\square \)
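The proof is effectively an algorithm, and a finite truncation of it can be sketched as follows (all names illustrative; “appears infinitely often” is replaced by “appears most often” in the finite table):

```python
from collections import Counter

def extract_diagonal(rows, num_cols):
    """Finite-data sketch of the column-by-column extraction in
    Proposition 5.1: rows[M] plays the role of (h_{1,M}, ..., h_{M,M})."""
    chosen = []
    for j in range(num_cols):
        rows = [r for r in rows if len(r) > j]      # rows reaching column j
        counts = Counter(r[j] for r in rows)
        # the smallest element that occurs most often survives
        h = min(x for x in counts if counts[x] == max(counts.values()))
        chosen.append(h)
        rows = [r for r in rows if r[j] == h]       # erase non-matching rows
    return chosen

rows = [(1,), (1, 5), (2, 5, 9), (1, 5, 9, 3), (1, 5, 9, 3), (1, 5, 2, 3)]
print(extract_diagonal(rows, 3))  # [1, 5, 9]
```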
5.2 A second moment estimate
As discussed in Sect. 2, our work will require input about the joint distribution of the bins, which we will achieve via concentration of measure arguments. The following second moment estimate will suffice for our purposes.
Proposition 5.2
(Second moment estimate). Fix \(\mathcal {A}\subset \mathbb {N}\) and \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) a set of integers. Suppose we have a partition \(\mathcal {H}=B_1\cup \ldots \cup B_M.\) Let \(\mu _i, t_i\ge 1\) be real numbers for \(1\le i\le M.\) Let \(\rho _{\mathcal {A}}\) be a non-negative function supported on \(\mathcal {A}\) and w(n) be non-negative weights for each integer n. If
$$\begin{aligned} \sum _{N\le n<2N}w(n)\sum _{i=1}^{M}\frac{t_i^2}{\mu _i^2}\Bigg (\sum _{h\in B_i}\rho _{\mathcal {A}}(n+h)-\mu _i\Bigg )^2 < \sum _{N\le n<2N}w(n), \end{aligned}$$(5.1)
then there exists an \(n\in [N,2N)\) and elements \(h_{a_i}\in B_i\) such that \(n+h_{a_i}\in \mathcal {A}\) for \(1\le i\le M.\)
Proof
By positivity we deduce the existence of an \(n\in [N,2N)\) such that
$$\begin{aligned} \Bigg (\sum _{h\in B_i}\rho _{\mathcal {A}}(n+h)-\mu _i\Bigg )^2 < \frac{\mu _i^2}{t_i^2}\quad \text {for every } 1\le i\le M. \end{aligned}$$
If \(n+h\notin \mathcal {A}\) for all \(h\in B_i,\) then by assumption on the support of \(\rho _{\mathcal {A}}\) the left hand side of the above expression is \(\ge \mu _i^2/t_i^2,\) a contradiction. \(\square \)
Thus, if the second moment estimate (5.1) holds for all \(M\ge 1\) and sufficiently large N, then we are in a situation where the hypotheses of Proposition 5.1 are satisfied.
5.3 Estimates in arithmetic progressions
We require an understanding of how \(\rho (n)\) and \(\rho (n)\rho (n+h)\) behave in arithmetic progressions for our sieve calculations. Essentially this reduces to understanding the corresponding sums for r(n) and \(r(n)r(n+h),\) where the estimates we need are known with power-saving error terms. This means that the error terms in the sieve calculations can be bounded trivially (cf. the case of primes [4, 9], where one has to use equidistribution results such as the Bombieri–Vinogradov theorem to bound the error terms that arise).
We have the following lemmas. We note that the functions \(g_1,\ldots ,g_7\) defined in this section will be used frequently throughout the rest of the paper.
Lemma 5.3
Suppose \((a,q)=(d,q)=1\) where d, q are square-free and odd, of size \(\ll N^{O(1)}.\) Then we have
where \(g_2\) is defined as in (2.8), \(g_1\) is the multiplicative function defined on primes by \(g_1(p)=1-\chi (p)/p,\) and
Lemma 5.4
Suppose that \((a,q)=(a+h,q)=(d_1,q)=(d_2,q)=(d_1,d_2)=1\) and 4|h, where \(d_1,d_2,q\) are square-free and odd, of size \(\ll N^{O(1)}.\) Moreover suppose \(h>0\) is fixed such that \(p|h\Rightarrow p|2q\). Then we have
where
and
Lemma 5.5
Let \((a,q)=(d,q)=1\) where d, q are square-free and odd, of size \(\ll N^{O(1)}.\) Then we have
where
Here \(g_3,g_4\) are the multiplicative functions defined on primes by
and \(g_5(p),g_6(p)\) are defined by
We will prove each of these results in Appendix A. Lemma 5.3 follows from two known results. Lemma 5.4 follows by adapting the method used by Plaksin in [10], where a similar sum is considered. Finally Lemma 5.5 can be shown using standard Perron’s formula arguments, together with a fourth moment estimate for Dirichlet L-functions.
When finding the corresponding estimates for \(\rho (n)\) the following sums naturally appear (for the definitions of \(W,W_1\) and \(D_0\) see (6.1) below):
$$\begin{aligned} X_{N,W}=\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{a}\log {\frac{v}{a}}, \end{aligned}$$(5.2)
$$\begin{aligned} Y_{N,W}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ (a,b)=1 \\ p|a,b \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_7(a)g_7(b)}\log {\frac{v}{a}}\log {\frac{v}{b}}, \end{aligned}$$(5.3)
$$\begin{aligned} Z_{N,W}^{(1)}= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}, \end{aligned}$$(5.4)
$$\begin{aligned} Z_{N,W}^{(2)}= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}} \sum _{p|[a,b]}g_6(p). \end{aligned}$$(5.5)
Here \(g_7\) is the multiplicative function defined on primes by \(g_7(p)=p+1.\) The following lemma evaluates the auxiliary sums above.
Lemma 5.6
(Auxiliary estimates for \(\rho (n)\)). We have
where A is defined as in (4.1). In each case one may take the o(1) term to be \(O(D_0^{-1}).\)
We remark that the estimate for \(X_{N,1}\) appears in Hooley’s work (after correcting a misprint—cf. [7, Lemma 5] and note his slightly different definition of A). We prove Lemma 5.6 in Appendix B. Each sum can be evaluated by the Selberg–Delange method.
6 The sieve set-up
We now state our sieve results and use them to deduce Theorem 1.4. For the rest of the paper k is fixed, \(\mathcal {H}=\{h_1,\ldots ,h_k\}\) is a fixed admissible set such that \(4|h_i\) for each i, and N is sufficiently large in terms of any fixed quantity. We allow any of the constants hidden in the Landau notation to depend on k, without explicitly specifying so.
We will employ a 4W-trick in our sieve calculations. Let
$$\begin{aligned} W=\prod _{2<p<D_0}p, \end{aligned}$$(6.1)
where \(D_0=(\log \log {N})^3,\) so that \(W\ll (\log {N})^{2(\log \log {N})^2} \ll _{\epsilon } N^{\epsilon }\) for any fixed \(\epsilon >0\) by the prime number theorem. It will prove useful to define
$$\begin{aligned} W_1=\prod _{\begin{array}{c} p<D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}p,\qquad W_3=\prod _{\begin{array}{c} p<D_0 \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}p, \end{aligned}$$
so that \(W=W_1W_3.\)
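For concreteness, these moduli can be computed for a toy value of \(D_0\), assuming as in (6.1) that W is the product of the odd primes below \(D_0\):

```python
from math import prod

def sieve_moduli(D0):
    """Compute W = W1 * W3, with W the product of the odd primes below D0,
    split according to the residue class mod 4 (convention as in (6.1))."""
    primes = [p for p in range(3, D0) if all(p % q for q in range(2, p))]
    W1 = prod(p for p in primes if p % 4 == 1)
    W3 = prod(p for p in primes if p % 4 == 3)
    return W1 * W3, W1, W3

W, W1, W3 = sieve_moduli(20)
print(W, W1, W3)  # 4849845 1105 4389, and indeed W = W1 * W3
```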
By admissibility of \(\mathcal {H}\) there exists a fixed residue class \(v_0\,\,(\text {mod}\,\,W)\) such that \((v_0+h_i,W)=1\) for each i. Fix \(1\le m,l\le k\) with \(m\ne l.\) We consider four types of sums:
$$\begin{aligned} S_1&= \sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (\sum _{d_i|n+h_i\,\,\forall i}\lambda _{d_1,\ldots ,d_k}\Bigg )^2,\qquad S_2^{(m)} = \sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\rho (n+h_m)\Bigg (\sum _{d_i|n+h_i\,\,\forall i}\lambda _{d_1,\ldots ,d_k}\Bigg )^2, \\ S_3^{(m,l)}&= \sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\rho (n+h_m)\rho (n+h_l)\Bigg (\sum _{d_i|n+h_i\,\,\forall i}\lambda _{d_1,\ldots ,d_k}\Bigg )^2,\qquad S_4^{(m)} = \sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\rho ^2(n+h_m)\Bigg (\sum _{d_i|n+h_i\,\,\forall i}\lambda _{d_1,\ldots ,d_k}\Bigg )^2. \end{aligned}$$
Because k is fixed, we may assume that \(D_0\) is sufficiently large so that
Remark 6.1
For the second moment estimate, it proves important to control the residue classes of the translates \(n+h\,\,(\text {mod}\,\,4),\) hence the condition \(n\equiv 1\,\,(\text {mod}\,\,4)\) in our sieve sums and also the assumption 4|h for our admissible set. This is because of the inherent bias that numbers representable as a sum of two squares have modulo 4.
Our first Proposition evaluates these sums for general half-dimensional Maynard–Tao sieve weights. Fix \(0<\theta _1<1/18\) in the definition of \(\rho (n)\) (see (2.7)). We also define the normalisation constant
Proposition 6.2
(Half-dimensional Maynard–Tao sieve estimates). Let \(R=N^{\theta _2/2}\) for some small fixed positive constant \(\theta _2\) such that \(0<\theta _1+\theta _2<1/18\). Let \(\lambda _{d_1,\ldots ,d_k}\) be defined in terms of a fixed smooth function F by
whenever \(\prod _{i=1}^{k}d_i\le R\) is squarefree, \((\prod _{i=1}^{k}d_i,W)=1\) and \(p|\prod _{i=1}^{k}d_i\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4),\) and let \(\lambda _{d_1,\ldots ,d_k}=0\) otherwise. Moreover let F be supported on \(R_{k}=\{(x_1,\ldots ,x_k)\in [0,1]^{k}: \sum _{i=1}^{k}x_i \le 1\}.\) Then we have
provided \(L_k(F),L_{k;m}(F)\) and \(L_{k;m,l}(F)\) are non-zero, where
From this, one can deduce the corresponding results for the modification of the Maynard–Tao sieve described in Sect. 2.
Proposition 6.3
(Modified Maynard–Tao sieve estimates). Suppose in addition to the hypotheses of Proposition 6.2 we have a partition \(\mathcal {H}=\{h_1,\ldots ,h_k\}=B_1\cup \ldots \cup B_M\) into bins \(B_i\) of fixed and finite size \(k_i.\) Write \(B_i=\{h_{k_0+\ldots +k_{i-1}+1},\ldots ,h_{k_0+\ldots +k_i}\}\) with the convention that \(k_0=0\). Suppose further we have a corresponding factorisation
where each \(F_i\) is smooth and supported on the simplex
Here \((\beta _i)_{i=1}^{\infty }\) is a sequence of real numbers such that \(\sum _{i=1}^{\infty }\beta _i\le 1.\) Then for \(h_m,h_l\in B_j\) we have
Proof
The hypotheses imply \(F=\prod _{i=1}^{M}F_i\) is also smooth and supported on \(R_{k},\) and hence the results of Proposition 6.2 apply. It suffices to show the functionals factorise in the forms stated. Because our set-up ensures that \(\text {supp}(F)= \text {supp}(F_1)\times \cdots \times \text {supp}(F_M)\) one can easily check that if \(h_m,h_l\in B_j\) then
\(\square \)
With the following lemma we will be in a position to prove Theorem 1.4.
Lemma 6.4
(Evaluation of sieve functionals). Let \(F(t_1,\ldots ,t_k)=\prod _{i=1}^{k}g(kt_i)\) where
Then for any m, l we have
Proof
The definition implies F is supported on the cube \([0,\frac{\beta }{k}]^k\subseteq R_{k,\beta }.\) In this case the functionals factorise completely and the lemma follows from the fact
\(\square \)
Remark 6.5
We have restricted the support of our functions to the cube \([0,\frac{\beta }{k}]^k\subseteq R_{k,\beta }\) so that the integrals can be evaluated exactly. This essentially means we are using weights of similar strength to the original GPY weights (cf. [4]). For the half-dimensional case one can show that for large k these weights are essentially optimal. (In particular, following a similar optimisation process as in [9, Section 7] one arrives at the same results as above.)
We are now in a position to prove Theorem 1.4.
Proof of Theorem 1.4
Let \(\mathcal {H}=\{h_1,h_2,\ldots \}\) be a fixed admissible set. Fix real numbers \(\theta _1,\theta _2\) subject to \(0<\theta _1+\theta _2 < 1/18\) and define the constant
With notation as above, consider a partition \(\mathcal {H}=B_1\cup B_2\cup \ldots \) where \(k_1>2\Delta ^3\) and for \(i\ge 2\) we choose \(B_{i}\) such that \(k_i > 2^{7i}\). By Proposition 5.2 we will be done if we can show, for every \(M\ge 1,\) the inequality
for all sufficiently large N and some choice of real numbers \(\mu _i,t_i \ge 1.\) Choose weights \(\lambda _{d_1,\ldots ,d_k}\) as in Proposition 6.2, and let \(F(x_1,\ldots ,x_k)=\prod _{i=1}^{M}F_i(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i})\) where each \(F_i\) is supported on \(R_{B_i,2^{-i}}.\) Let \(F_i(x_{k_0+\ldots +k_{i-1}+1},\ldots ,x_{k_0+\ldots +k_i}) = \prod _{j=k_0+\ldots +k_{i-1}+1}^{k_0+\ldots +k_i}g(k_ix_j)\) where for \(j\in \{b_{2i-1},\ldots ,{b_{2i}}\}\) we define
Expanding out (6.9), we have to evaluate the expression
(where by abuse of notation we have written \(S_2^{(h)}\) for \(S_2^{(i)}\) where \(h=h_i\) say). A convenient choice of \(\mu _i, t_i\) is
where
Evaluating these sums using Proposition 6.3 and Lemma 6.4 we see that this is asymptotically
Hence (6.9) will be satisfied for all sufficiently large N provided
But now our choice of bins ensures that
and so (6.10) is satisfied for all \(M \ge 1.\) \(\square \)
It remains to prove Proposition 6.2. Each sum can be treated similarly. The following lemma handles all of them at once. First, given a function F satisfying the hypotheses of Proposition 6.2, we define
The lemma can now be stated as follows.
Lemma 6.6
(General sieve lemma). Let \(J\subseteq \{1,\ldots ,k\}\) (possibly empty) and \(p_1,p_2\in \mathbb {P}\cup \{1\}\) be fixed. Write \(I=\{1,\ldots ,k\}\backslash J.\) Define the sieve sum \(S_{J,p_1,p_2,m}=S_{J,p_1,p_2,m,f,g}\) by
with weights \(\lambda _{d_1,\ldots ,d_k}\) defined as in Proposition 6.2. If \(J=\emptyset \) we define \(f(p)=1/p\) (and there is no dependence on g in the sum). Otherwise, f and g are non-zero multiplicative functions defined on primes by
and moreover we assume that \(f(p)\ne 1/p.\) We write \(S_{J}\) for \(S_{J,1,1,m}\). Suppose \(\lambda _{d_1,\ldots ,d_k}\) satisfy the same hypotheses as in Proposition 6.2. Then for \(|J|\in \{0,1,2\}\) we have the following:
(i) If \(m\in J\) then
$$\begin{aligned} S_{J,p_1,p_2,m} \ll \frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{(p_1p_2/(p_1,p_2))^2} . \end{aligned}$$

(ii) If \(m\notin J\) then
$$\begin{aligned} S_{J,p_1,p_2,m} \ll \frac{F_{\text {max}}^2B^{k+|J|}(\log \log {R})^2}{p_1p_2/(p_1,p_2)} . \end{aligned}$$

(iii) We have
$$\begin{aligned} S_{J} = (1+o(1))B^{k+|J|}L_J(F), \end{aligned}$$
where the integral operators are defined by Proposition 6.2 above, and we write \(L_J(F)\) as shorthand for \(L_{k;j\in J}(F).\)
We now show how this implies Proposition 6.2.
Proof of (Lemma 6.6\(\Rightarrow \) Proposition 6.2) We consider each sum in turn. First we note that using the definition of \(\lambda _{d_1,\ldots ,d_k},\) the exact same calculation as in [9, p. 394] gives
and so we have a trivial bound
As mentioned in Sect. 5, because we can obtain power-saving in the error terms for the formulae stated there, this trivial bound will suffice for our purposes.
(i) Rewrite \(S_1\) in the form
$$\begin{aligned} S_1 = \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} N\le n<2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i]) \end{array}}1. \end{aligned}$$We may assume \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) are pairwise coprime, as otherwise the inner sum is empty. In this case, by the Chinese Remainder Theorem, these congruences are equivalent to a single congruence (mod q) where \(q=4W\prod _{i=1}^{k}[d_i,e_i]\). The inner sum evaluates to
$$\begin{aligned} \frac{N}{q}+O(1). \end{aligned}$$The error term contributes \(O_{\epsilon }(F_{\text {max}}^2N^{\theta _2+\epsilon })\) which is negligible. The main term is
$$\begin{aligned} \frac{N}{4W} \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \prod _{i=1}^{k}\frac{1}{[d_i,e_i]}. \end{aligned}$$This is of the form \(S_{J}\) where \(|J|=0\). Evaluating it according to Lemma 6.6 we obtain
$$\begin{aligned} S_1 = (1+o(1))\frac{B^kN}{4W}L_k^{(0)}(F). \end{aligned}$$

(ii) Rewrite \(S_2^{(m)}\) in the form
$$\begin{aligned} S_2^{(m)} = \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} N\le n < 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho (n+h_m). \end{aligned}$$By definition of \(\rho (n+h_m)\) this is equal to
$$\begin{aligned} \frac{1}{\log {v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} a\le v \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{g_2(a)}\log {\frac{v}{a}}\sum _{\begin{array}{c} N\le n < 2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_m\,\,(\text {mod}\,\,a) \end{array}}r(n+h_m). \end{aligned}$$From considering the support of \(\lambda _{d_1,\ldots ,d_k}\) we see that for non-zero contribution we may assume \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) and a are pairwise coprime. In this case the inner sum can be evaluated according to Lemma 5.3, taking \(q=W\prod _{i\ne m}[d_i,e_i]\) and \(d=a[d_m,e_m]\). As \(q \ll WR^2 \ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d \ll vR^2 \ll N^{\theta _1+\theta _2}\) we see that the inner sum evaluates to
$$\begin{aligned} \frac{g_1(q)g_2(d)}{2qd}\pi N +O_{\epsilon }(N^{\frac{1}{3}+\frac{1}{2}(\theta _1+\theta _2)+\epsilon }). \end{aligned}$$Bounding the sum over a trivially by \(v\log {v}\) and using (6.12), we see the error term contributes \(O_{\epsilon }(N^{\frac{1}{3}+\frac{3}{2}(\theta _1+\theta _2)+\epsilon })\) which is negligible. We obtain a main term
$$\begin{aligned}&\frac{X_{N,W}g_1(W)\pi N}{2W\log {v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{\begin{array}{c} i\ne m \end{array}}\frac{g_1([d_i,e_i])}{[d_i,e_i]} \frac{1}{[d_m,e_m]^2}, \end{aligned}$$where we have defined
$$\begin{aligned} X_{N,W}=\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{a}\log {\frac{v}{a}} \end{aligned}$$as in (5.2). The sieve sum above is of the form \(S_{J}\) with \(|J|=1.\) We can evaluate this by Lemma 6.6 to obtain
$$\begin{aligned} S_2^{(m)} = (1+o(1))\frac{X_{N,W}g_1(W)\pi B^{k+1}N}{2W\log {v}}L_{k;m}^{(1)}(F). \end{aligned}$$Recalling the definition of B in (6.8), evaluating \(X_{N,W}\) according to Lemma 5.6, and using the fact
$$\begin{aligned} \frac{g_1(W_3)\varphi (W_3)}{W_3} = \prod _{\begin{array}{c} p|W_3 \end{array}}\Bigg (1-\frac{1}{p^2}\Bigg ) = \frac{1}{2A^2}+O(D_0^{-1}), \end{aligned}$$we obtain
$$\begin{aligned} S_2^{(m)} = (1+o(1))\frac{4 \sqrt{\frac{\log {R}}{\log {v}}}B^kN}{\pi W}L_{k;m}^{(1)}(F). \end{aligned}$$

(iii) Rewrite \(S_3^{(m,l)}\) in the form
$$\begin{aligned} S_3^{(m,l)}=\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k} \sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho (n+h_{m})\rho (n+h_l). \end{aligned}$$Expanding out the definition of \(\rho \) this is
$$\begin{aligned} \frac{1}{\log ^2{v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \end{array}}\lambda _{d_1,\ldots ,d_k}&\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} a,b\le v \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \\&\cdot \sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_{m}\,\,(\text {mod}\,\,a) \\ n\equiv -h_{l}\,\,(\text {mod}\,\,b) \end{array}}r(n+h_{m})r(n+h_l). \end{aligned}$$Similarly to the above, for non-zero contribution we may restrict to the case \(W,[d_1,e_1],\ldots ,[d_k,e_k],a,b\) are pairwise coprime (note that the last two congruences are solvable if and only if \((a,b)|h_l-h_m,\) and in the case \((a,2W)=(b,2W)=1\) this is true if and only if \((a,b)=1\)). We evaluate the inner sum according to Lemma 5.4, taking \(q=W\prod _{i\ne m,l}[d_i,e_i],\) \(d_1 = a[d_m,e_m]\) and \(d_2=b[d_l,e_l].\) We note that \(q\ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d_1,d_2\ll N^{\theta _1+\theta _2}.\) Using the fact \(\theta _1+\theta _2<1/18,\) we see the second error term in the definition of \(R_2(N;d_1,d_2,q)\) dominates, and so the inner sum evaluates to
$$\begin{aligned} \frac{g_1(q)^2\Gamma (d_1,d_2,q)}{q}\pi ^2N+ O_{\epsilon }(N^{\frac{5}{6}+\theta _1+\theta _2+\epsilon }). \end{aligned}$$Bounding the rest of the sum trivially, we obtain a total error of size \(O_{\epsilon }(N^{\frac{5}{6}+3\theta _1+2\theta _2+\epsilon })\) which, again, is negligible in the range \(\theta _1+\theta _2<1/18.\) We obtain a main term
$$\begin{aligned}&\frac{g_1(W)^2\pi ^2N}{W\log ^{2}{v}}\sum _{\begin{array}{c} a,b\le v \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \\&\cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m,l}\frac{g_1([d_i,e_i])^2}{[d_i,e_i]} \Gamma ([d_m,e_m]a,[d_l,e_l]b,W\prod _{i\ne m,l}[d_i,e_i]). \end{aligned}$$For arbitrary (square-free) moduli \(d_1,d_2\) and q, we can write \(\Gamma (d_1,d_2,q)\) as a product over primes (cf. the definition of \(\Gamma (d_1,d_2,q)\) given in Lemma 5.4 and note that we are summing a multiplicative function). By considering the Euler product and the various support restrictions on the variables \(d_i,e_i,a,b\), one can write \(\Gamma ([d_m,e_m]a,[d_l,e_l]b,W\prod _{i\ne m,l}[d_i,e_i])\) in the form
$$\begin{aligned} \prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}\frac{g_2(a)g_2(b)}{g_7(a)g_7(b)}\prod _{i\ne m,l} \frac{[d_i,e_i]}{g_1([d_i,e_i])\varphi ([d_i,e_i])} \prod _{j=m,l} \frac{1}{[d_j,e_j]\varphi ([d_j,e_j])}, \end{aligned}$$leaving us with a main term
$$\begin{aligned}&\prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}\frac{g_1(W)^2Y_{N,W}\pi ^2N}{W\log ^{2}{v}} \\&\cdot \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots , e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m,l}\frac{g_1([d_i,e_i])}{\varphi ([d_i,e_i])}\prod _{j=m,l} \frac{1}{[d_j,e_j]\varphi ([d_{j},e_{j}])}. \end{aligned}$$Here we have defined
$$\begin{aligned} Y_{N,W}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ (a,b)=1 \\ p|a,b \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_7(a)g_7(b)}\log {\frac{v}{a}}\log {\frac{v}{b}} \end{aligned}$$as in (5.3). The main term is of the form \(S_J\) for \(|J|=2.\) By Lemma 6.6 it can be evaluated as
$$\begin{aligned} S_{3}^{(m,l)} = (1+o(1))\frac{Y_{N,W}g_1(W)^2\pi ^2B^{k+2}N}{W\log ^2{v}}L_{k;m,l}^{(2)}(F), \end{aligned}$$where we have written
$$\begin{aligned} \prod _{p\not \mid 2W}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1}=1+O(D_0^{-1}). \end{aligned}$$Evaluating \(Y_{N,W}\) as in Lemma 5.6, this simplifies to
$$\begin{aligned} S_{3}^{(m,l)} = (1+o(1))\frac{64(\frac{\log {R}}{\log {v}}) B^{k}N}{\pi ^2 W}L_{k;m,l}^{(2)}(F). \end{aligned}$$

(iv) Rewrite \(S_4^{(m)}\) in the form
$$\begin{aligned} \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \end{array}}\rho ^2(n+h_m). \end{aligned}$$Expanding out the definition of \(\rho ^2(n)\) we see this is equal to
$$\begin{aligned} \frac{1}{\log ^2{v}}\sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\sum _{a,b\le v}\frac{\mu (a)\mu (b)}{g_2(a)g_2(b)}\log {\frac{v}{a}}\log {\frac{v}{b}}\sum _{\begin{array}{c} N\le n <2N \\ n\equiv v_0\,\,(\text {mod}\,\,W) \\ n\equiv 1\,\,(\text {mod}\,\,4) \\ n\equiv -h_i\,\,(\text {mod}\,\,[d_i,e_i])\,\,\forall i \\ n\equiv -h_m\,\,(\text {mod}\,\,[a,b]) \end{array}}r^2(n+h_m). \end{aligned}$$Again, we may restrict to the case \(W,[d_1,e_1],\ldots ,[d_k,e_k],[a,b]\) are pairwise coprime. In this case the inner sum can be evaluated according to Lemma 5.5, taking \(q=W\prod _{i\ne m}[d_i,e_i]\) and \(d=[a,b][d_m,e_m].\) We note that \(q\ll _{\epsilon } N^{\theta _2+\epsilon }\) and \(d\ll N^{2\theta _1+\theta _2},\) and so the inner sum becomes
$$\begin{aligned} \frac{g_3(q)g_4(d)}{qd}\Bigg (\log {N}+A_2+2\sum _{p|q}g_5(p)-2\sum _{p|d}g_6(p)\Bigg )N+O_{\epsilon }(N^{\frac{3}{4}+\theta _2+\epsilon }). \end{aligned}$$Bounding the rest of the sum trivially, we see the error term contributes \(O_{\epsilon }(N^{\frac{3}{4}+2(\theta _1+\theta _2)+\epsilon })\) which is small. For the main term, let
$$\begin{aligned} Z_{N,W}^{(1)}&= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}},\\ Z_{N,W}^{(2)}&= \sum _{a,b\le v}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}} \sum _{p|[a,b]}g_6(p) \end{aligned}$$be as in (5.4) and (5.5). We can express \(S_4^{(m)}=\Lambda _1+\Lambda _2+\Lambda _3+\Lambda _4\) where
$$\begin{aligned} \Lambda _1&= \frac{g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}}\Bigg (\log {N}+A_2+2\sum _{p|W}g_5(p)\Bigg )T,\\ \Lambda _2&= \frac{2g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}} \sum _{i\ne m} \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}g_5(p)T^{(p,i)}, \\ \Lambda _3&= -\frac{2g_3(W)Z_{N,W}^{(1)}N}{W\log ^2{v}}\sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}g_6(p)T^{(p,m)}, \\ \Lambda _4&=-\frac{2g_3(W)Z_{N,W}^{(2)}N}{W\log ^2{v}}T \end{aligned}$$and
$$\begin{aligned} T&= \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m}\frac{g_1([d_i,e_i])}{[d_i,e_i]}\frac{1}{[d_m,e_m]^2}, \\ T^{(p,i)}&= \sum _{\begin{array}{c} d_1,\ldots ,d_k \\ e_1,\ldots ,e_k \\ p|[d_i,e_i] \\ W,[d_1,e_1],\ldots ,[d_k,e_k]\text { coprime} \end{array}}\lambda _{d_1,\ldots ,d_k}\lambda _{e_1,\ldots ,e_k}\prod _{i\ne m}\frac{g_1([d_i,e_i])}{[d_i,e_i]}\frac{1}{[d_m,e_m]^2}. \end{aligned}$$T is of the form \(S_{J}\) for \(|J|=1,\) and so by Lemma 6.6 it can be evaluated as
$$\begin{aligned} T = (1+o(1))B^{k+1}L_{k;m}^{(1)}(F). \end{aligned}$$To evaluate \(T^{(p,i)},\) note by inclusion-exclusion we can write it as
where the superscript \(*\) denotes the condition that \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) are pairwise coprime. Thus we see it is of the form \(S_{J,p,1,i}+S_{J,1,p,i}-S_{J,p,p,i}\) for \(|J|=1.\) By Lemma 6.6 we conclude
$$\begin{aligned} T^{(p,i)} \ll {\left\{ \begin{array}{ll} \frac{F_{\text {max}}^2B^{k+1}(\log \log {R})^2}{p},\,\,&{}\text {if }i\ne m \\ \frac{F_{\text {max}}^2B^{k+1}(\log \log {R})^2}{p^2},\,\,&{}\text {if } i=m \end{array}\right. } \end{aligned}$$Now we note that
$$\begin{aligned} \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{g_5(p)}{p}&\ll \sum _{p>D_0} \frac{\log {p}}{p^2} \ll \frac{\log {D_0}}{D_0},\\ \sum _{\begin{array}{c} D_0<p\le v \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{g_6(p)}{p^2}&\ll \sum _{p>D_0} \frac{\log {p}}{p^2} \ll \frac{\log {D_0}}{D_0}, \end{aligned}$$and so, with our choice \(D_0 = (\log \log {N})^3,\) the contributions from \(\Lambda _2\) and \(\Lambda _3\) are negligible. Because
$$\begin{aligned} \sum _{\begin{array}{c} p|W \end{array}}g_5(p)&\ll \sum _{p<D_0} \frac{\log {p}}{p} \ll \log {D_0}, \end{aligned}$$we see that the only contribution to the main term comes from the \(\Lambda _1\) term corresponding to \(\log {N},\) and \(\Lambda _4,\) leaving us with
$$\begin{aligned} S_4^{(m)} = (1+o(1)) \frac{g_3(W)B^{k+1}N}{W\log ^2{v}}\Bigg [Z_{N,W}^{(1)}\log {N}-2Z_{N,W}^{(2)}\Bigg ]L_{k;m}^{(1)}(F). \end{aligned}$$Evaluating these according to Lemma 5.6, and using the fact
$$\begin{aligned} \frac{g_3(W_1)}{g_1(W_1)^3}=\prod _{\begin{array}{c} p<D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p^2}\Bigg )^{-1} = \frac{3\zeta (2)}{8A^2}+O(D_0^{-1}) = \frac{\pi ^2}{16A^2}+O(D_0^{-1}), \end{aligned}$$we obtain
$$\begin{aligned} S_4^{(m)} = (1+o(1)) \frac{2 \sqrt{\frac{\log {R}}{\log {v}}}(\frac{\log {N}}{\log {v}}+1)B^k N}{\pi W}L_{k;m}^{(1)}(F). \end{aligned}$$
This finishes the proof of Proposition 6.2. \(\square \)
Thus it remains to establish Lemma 6.6. First we require a few technical sieve lemmas. We list these in the following section.
7 Technical sieve sums lemmas
In the various sieve calculations that appear in the proof of Lemma 6.6, we will frequently encounter sums of the form
$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d)G\Bigg (\frac{\log {d}}{\log {R}}\Bigg ), \end{aligned}$$
where f is a multiplicative function satisfying \(f(p)=O(1/p).\) We can evaluate sums of this type with the following lemmas.
Lemma 7.1
(Technical sieve sum lemma). Let \(A_1,A_2,L>0.\) Let \(\gamma \) be a multiplicative function satisfying the sieve axioms
and
for any \(2\le w\le z.\) Let g be the totally multiplicative function defined on primes by \(g(p) = \frac{\gamma (p)}{p-\gamma (p)}\). Finally, let \(G:[0,1]\rightarrow \mathbb {R}\) be a piecewise differentiable function, and let \(G_{\text {max}}=\sup _{t\in [0,1]} (|G(t)|+|G'(t)|)\). Then
where
Here, the implied constant in the Landau notation is independent of G and L.
Proof
This is [5, Lemma 4] with slight changes to notation. \(\square \)
To use this lemma in practice, we need to be able to evaluate the singular series \(c_{\gamma }\) which appears. In the next lemma we do this for a function \(\gamma (p)\) which covers the cases of interest to us.
Lemma 7.2
(Evaluation of singular series). Let
With the notation of Lemma 7.1, we have
where A is the Ramanujan–Landau constant defined in (4.1).
Proof
Let \(\gamma (p)=1+\alpha (p)\) where \(\alpha (p)=O(1/p).\) Define the auxiliary function
One can easily show \(c_{\delta } = A/\sqrt{L(1)}.\) The result follows because
The latter product is \(1+O(D_0^{-1})\) by our assumption \(\alpha (p)=O(1/p).\) \(\square \)
The next lemma collects both of these results together. First we recall the definition of the normalising constant from (6.8):
Lemma 7.3
(Evaluation of sieve sums). Let f be a multiplicative function such that
Then for any piece-wise smooth function G we have
Proof
Let \(f(p)=1/p+g(p)\) where \(g(p)=O(1/p^2),\) and consider the function \(\gamma \) defined on primes by
With this choice of \(\gamma (p)\) we have
Note that
Therefore we can apply Lemma 7.1 with \(\gamma (p),\) taking \(L\ll 1+\log {D_0}\) and \(A_2\) a suitable constant. We obtain
We can evaluate \(c_{\gamma }\) by Lemma 7.2 to find
When we substitute this back into our expression we see the error incurred here contributes \(O(G_{\text {max}}B/D_0)\) and dominates. The result follows as \(\Gamma (1/2)\sqrt{L(1)} = \pi /2.\) \(\square \)
We highlight the following two results, the first of which follows immediately from Lemma 7.3, and the second of which is trivial.
(1) For multiplicative functions f satisfying \(f(p)=1/p+O(1/p^2)\) we have the upper bound
$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)f(d) \ll B. \end{aligned}$$(7.2)

(2) For multiplicative functions g satisfying \(g(p)=O(1/p^2)\) we have the upper bound
$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ (d,W)=1 \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\mu ^2(d)g(d) \ll 1. \end{aligned}$$(7.3)
These sums will appear frequently in our calculations, and we will use these bounds without comment in the arguments which follow.
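The half-dimensional behaviour behind (7.2) can be seen numerically: taking \(f(p)=1/p\) and dropping the W-coprimality condition for simplicity, the sum grows like a constant multiple of \(\sqrt{\log {R}},\) the rate characteristic of a sieve of dimension 1/2. An illustrative computation:

```python
from math import log, sqrt

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def half_dim_sum(R):
    """Sum of 1/d over squarefree d <= R all of whose prime factors are
    3 mod 4 (the W-coprimality condition of (7.2) is omitted here)."""
    ps = [p for p in primes_up_to(R) if p % 4 == 3]
    total = 0.0

    def extend(i, d):
        nonlocal total
        total += 1.0 / d
        for j in range(i, len(ps)):
            if d * ps[j] > R:
                break
            extend(j + 1, d * ps[j])

    extend(0, 1)
    return total

for R in (10**3, 10**4, 10**5):
    print(R, half_dim_sum(R) / sqrt(log(R)))  # slowly stabilising ratio
```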
8 Establishing Lemma 6.6
Our attention now turns to establishing Lemma 6.6. We follow the combinatorial arguments used by Maynard; the steps which follow mirror those found in [9].
8.1 Change of variables
Our first step to evaluating the sums appearing in Lemma 6.6 is to make a change of variables. We do so in the following proposition.
Proposition 8.1
(Diagonalising the sieve sum). With notation as in Lemma 6.6, denote by \(f^*,g^*\) the convolutions
$$\begin{aligned} f^{*}=\mu *\frac{1}{f},\qquad g^{*}=\mu *\frac{1}{g}. \end{aligned}$$
Define the diagonalising vectors \(y_{r_1,\ldots ,r_k}^{(J,p,m)} = y_{r_1,\ldots ,r_k}^{(J,p,m,f,g)}\) by
Let \(y_{\text {max}}^{(J,p,m)}=\sup _{r_1,\ldots ,r_k} |y_{r_1,\ldots ,r_k}^{(J,p,m)}|\) and \(\tilde{y}_{\text {max}}^{(J,p,m)}=\sup _{\begin{array}{c} r_1,\ldots ,r_k \\ (r_m,p)=1 \end{array}} |y_{r_1,\ldots ,r_k}^{(J,p,m)}|.\) (Note that these coincide if \(p=1.\)) Then we have
If \(m\in J\) then the (error) term E satisfies
If \(m\notin J\) then E satisfies a similar estimate, namely that which is obtained upon replacing all occurrences of \(p_i^2\) with \(p_i\) in the above expression, for \(i\in \{1,2\}\). Moreover, in both of these cases, we adopt the convention that if \(p_i=1\) then any term in our expression for E involving \(p_i\) in the denominator may be omitted.
Proof
Recall the definition of \(S_{J,p_1,p_2,m}\) given in Lemma 6.6:
We can write this in the form
using multiplicativity of the functions f and g, together with the fact that \([d_i,e_i]\) is square-free for each i. We remark that because f, g are non-zero, the functions 1/f, 1/g are well-defined. We note the convolution identities
$$\begin{aligned} \frac{1}{f(d)}=\sum _{e|d}f^{*}(e),\qquad \frac{1}{g(d)}=\sum _{e|d}g^{*}(e), \end{aligned}$$
for \(f^*\) and \(g^*\). Substituting these into (8.2) and swapping the order of summation, we obtain
From the support of the \(\lambda _{d_1,\ldots ,d_k},\) we see the only restriction coming from the pairwise coprimality of \(W,[d_1,e_1],\ldots ,[d_k,e_k]\) is from the possibility \((d_i,e_j)\ne 1\) for \(i\ne j\). We can take care of this constraint by Möbius inversion: multiplying by \(\sum _{s_{i,j}|d_i,e_j}\mu (s_{i,j})\) for all \(i\ne j,\) we obtain
We may restrict to the case where \(s_{i,j}\) is coprime to \(s_{i,a},s_{b,j}\) and \(u_i,u_j,\) for \(a\ne j\) and \(b\ne i,\) because the vectors \(\lambda _{d_1,\ldots ,d_k}\) are supported on square-free integers \(d=\prod _{i=1}^{k}d_i.\) Denote the sum over \(s_{i,j}\) with these conditions by \(\sum ^{*}\). Define the diagonalising vectors
From the support of \(\lambda _{d_1,\ldots ,d_k}\) we see that \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) is also supported on \(r_1,\ldots ,r_k\) with \(r=\prod _{i=1}^{k}r_i\) square-free, \((r,W)=1\) and \(p|r\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4).\) We claim this change of variables is invertible. Indeed, from the definition (8.4), for \(d_1,\ldots ,d_k\) with \(\prod _{i=1}^{k}d_i\) square-free, we have
With this transformation our sum (8.3) becomes
where we have defined \(a_i = u_i \prod _{j\ne i}s_{i,j}\) and \(b_j = u_j\prod _{i\ne j}s_{i,j}.\) Because of our constraints on the \(s_{i,j}\) variables, we can use multiplicativity to write this as
We now wish to reduce to the case when \((a_m,p_1)=(b_m,p_2)=1.\) Indeed, we will show the contribution from the alternative cases is negligible. Of course, depending on whether or not \(p_1=1\) and/or \(p_2=1,\) some (or all) of the analysis which follows is not necessary, and this accounts for the convention we assert in the statement of the proposition. First let us note the estimate
and similarly
Now, there are three cases to consider.
(1) Suppose that \(p_1|a_m\) and \(p_2|b_m.\) This occurs if and only if \(p_1|u_m\) or \(p_1|s_{m,j}\) for some \(j\ne m,\) and \(p_2|u_m\) or \(p_2|s_{i,m}\) for some \(i\ne m.\) Suppose, for example, that \(p_1|u_m\) and \(p_2|u_m.\) Moreover let us assume that \(m\in J.\) Then one can bound the contribution as follows:
$$\begin{aligned}&\ll (y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})\Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{f^*(u)}\Bigg )^{|I|}\\&\qquad \cdot \Bigg (\sum _{\begin{array}{c} u\le R \\ (u,W)=1 \\ p|u\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u)}{g^*(u)}\Bigg )^{|J|-1} \Bigg (\sum _{\begin{array}{c} u_m\le R \\ (u_m,W)=1 \\ p_1,p_2|u_m \\ p|u_m\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(u_m)}{g^*(u_m)}\Bigg )\\&\qquad \times \Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)^2}\Bigg )^{|I|^2-|I|}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{g^*(s)^2}\Bigg )^{|J|^2-|J|}\Bigg (\sum _{s\ge 1}\frac{\mu ^2(s)}{f^*(s)g^*(s)}\Bigg )^{2|I||J|} \\&\quad \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})B^{|I|}}{(p_1p_2/(p_1,p_2))^2}. \end{aligned}$$It is easy to see that this bound also holds in any of the other possible cases in which \(p_1|a_m\) and \(p_2|b_m\) and \(m\in J.\) If instead \(m\notin J,\) then again it is easy to see the contribution is
$$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(y_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1p_2/(p_1,p_2)}, \end{aligned}$$in all possible cases.
(2) Suppose that \(p_1|a_m\) and \(p_2\not \mid b_m.\) If \(m\in J,\) then similarly to the above, one can bound the contribution by
$$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1^2}. \end{aligned}$$If \(m\notin J\) then likewise one obtains a contribution
$$\begin{aligned} \ll \frac{(y_{\text {max}}^{(J,p_1,m)})(\tilde{y}_{\text {max}}^{(J,p_2,m)})B^{|I|}}{p_1}. \end{aligned}$$ -
(3)
Finally, the case \(p_1\not \mid a_m\) and \(p_2|b_m\) proceeds as above, interchanging the roles of \(p_1\) and \(p_2.\)
Thus, we may now suppose that \((a_m,p_1)=(b_m,p_2)=1.\) From the support of the \(y_{a_1,\ldots ,a_k}^{(J,p_1,m)}\) we see there is no contribution from \((s_{i,j},W)\ne 1\) and so either \(s_{i,j}=1\) or \(s_{i,j}>D_0.\) The contribution from \(s_{i,j}>D_0\) with \(i,j\in I\) is
This contribution will be negligible. The cases \(i,j\in J\) and \(i\in I,j\in J\) can be treated the same way. This leaves us with a main term
To finish, we claim the contribution from \(u_j>1\) is small whenever \(j\in J\). Indeed if \(u_j>1\) then it must be divisible by a prime \(p>D_0\) (with \(p\equiv 3\,\,(\text {mod}\,\,4)\)). So suppose \(|J| \ge 1\) and let \(j\in J.\) If \(u_j>1\) we get a contribution
which is small. Putting all of these facts together establishes Proposition 8.1. \(\square \)
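We take this opportunity to record the classical Möbius detection identity underlying the removal of the coprimality conditions \((d_i,e_j)=1\) at the start of the proof above (a standard fact, stated here only for orientation):
$$\begin{aligned} \sum _{s|n}\mu (s) = {\left\{ \begin{array}{ll} 1\,\,&{}\text {if }n=1, \\ 0\,\,&{}\text {if }n>1, \end{array}\right. } \end{aligned}$$
applied with \(n=(d_i,e_j)\) for each pair \(i\ne j,\) so that the condition \((d_i,e_j)=1\) is detected by the free sum \(\sum _{s_{i,j}|d_i,\,s_{i,j}|e_j}\mu (s_{i,j}).\)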
8.2 Transformation for \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) and proof of Lemma 6.6 parts (i) and (ii)
Define
and let \(y_{\text {max}}=\sup _{r_1,\ldots ,r_k}|y_{r_1,\ldots ,r_k}|.\) By the inversion formula (8.5), our definition of \(\lambda _{d_1,\ldots ,d_k}\) in Proposition 6.2 is equivalent to taking
We now wish to relate the more complicated diagonalisation vectors \(y_{r_1,\ldots ,r_k}^{(J,p,m)}\) to these simpler vectors. We first deal with the case when \(J=\emptyset \), which is straightforward. By inspecting the proof of Proposition 6.2, it is clear that we only need to understand this case when \(f(p)=1/p.\)
Lemma 8.2
(Relating \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) to \(y_{r_1,\ldots ,r_k}\) when \(J=\emptyset \)). Suppose \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\ne 0, J=\emptyset \) and \(m\in \{1,\ldots ,k\}.\) Moreover suppose \(f(p)=1/p\) for all primes p. Then the following hold.
-
(1)
If \(p_1|r_m\) then
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)} = y_{r_1,\ldots ,r_k}. \end{aligned}$$ -
(2)
If \(p_1\not \mid r_m\) then
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)} = \frac{y_{r_1,\ldots ,p_1r_m,\ldots ,r_k}}{\mu (p_1)\varphi (p_1)}. \end{aligned}$$
Proof
If \(f(p)=1/p\) then \(f^{*}(p) = p-1 = \varphi (p).\) Hence we are assuming
The result then follows by comparing with the definition of \(y_{r_1,\ldots ,r_k}\) given above. \(\square \)
Proceeding, we may therefore suppose that \(|J| \ge 1.\) We may further suppose that \(f(p)\ne 1/p,\) since the case \(f(p)=1/p\) is of no interest to us here (again, this is clear by inspecting the proof of Proposition 6.2). The following proposition gives the result in full.
Proposition 8.3
(Relating \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) to \(y_{r_1,\ldots ,r_k}\) when \(|J| \ge 1\)). Suppose \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\ne 0, |J| \ge 1\) and \(f(p)\ne 1/p\). Then the following hold:
-
(i)
If \(m\in J\) and \((r_m,p_1)=1\) we have
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1g(p_1)}{\varphi (p_1)}\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\quad \cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots ,e_k \\ r_i|e_i\,\,\forall i\ne m \\ r_m|e_m' \\ e_i=r_i\,\,\forall i\in I \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots ,e_k}\Bigg (\prod _{\begin{array}{c} j\in J \\ j\ne m \end{array}}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )\Bigg (\frac{g^{**}(e_m')}{\varphi (e_m')}\Bigg )\\&\quad +O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1^2}\Bigg ). \end{aligned}$$ -
(ii)
If \(m\notin J\) and \((r_m,p_1)=1\) we have
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \frac{\mu (p_1)p_1f(p_1)}{\varphi (p_1)}\Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\cdot \sum _{\begin{array}{c} e_1,\ldots ,e_m',\ldots ,e_k \\ r_i|e_i\,\,\forall i\ne m \\ e_i=r_i\,\,\forall i\in I\backslash \{m\} \\ e'_m = r_m \end{array}}y_{e_1,\ldots ,p_1e_m',\ldots ,e_k}\Bigg (\prod _{j\in J}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )+O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0p_1}\Bigg ). \end{aligned}$$ -
(iii)
If \(p_1|r_m\) we have
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J,p_1,m)}&= \Bigg (\prod _{j\in J}\frac{r_jg(r_j)g^{*}(r_j)}{g^{**}(r_j)}\Bigg ) \\&\cdot \sum _{\begin{array}{c} e_1,\ldots ,e_k \\ r_i|e_i\,\,\forall i \\ e_i=r_i\,\,\forall i\in I \end{array}}y_{e_1,\ldots ,e_k}\Bigg (\prod _{j\in J}\frac{g^{**}(e_j)}{\varphi (e_j)}\Bigg )+O\Bigg (\frac{y_{\text {max}}B^{|J|}\log \log {R}}{D_0}\Bigg ). \end{aligned}$$
Here \(f^{**}\) and \(g^{**}\) are defined by the convolutions
where \(\iota \) is the identity function, \(\iota (p)=p\).
Proof
We prove (i), with the rest proved in exactly the same way. Directly from the definition (8.4) we have
From the inversion formula (8.5) and the definition of \(y_{r_1,\ldots ,r_k},\) we see that the right hand side of (8.10) equals
Swapping sums we obtain
We can evaluate the inner sums using the convolution identities
We note that
and similarly
With our assumption \(f(p)\ne 1/p,\) we may suppose both of these functions are non-zero. Now, recall that we are assuming \(m\in J.\) Using these identities transforms (8.11) into
Here we are using the fact \(y_{e_1,\ldots ,e_k}\) is supported on square-free integers \(e=\prod _{i=1}^{k}e_i.\) Hence, from (8.10), it follows that
Here we have substituted \(e_m= p_1 e_m'\) and used the fact that \((r_m,p_1)=1.\) Now, since
it follows that
Similarly, since
we have the bound
Here we have used the fact \(r = \prod _{i=1}^{k}r_i \le R\) and the standard estimate
Now, if \(i\ne m\) then either \(e_i=r_i\) or \(e_i>D_0 r_i\). Suppose \(e_{i_0}>D_0r_{i_0}\) for some \(i_0\in I.\) By first using multiplicativity of the sum over the \(e_i\) variables, and then using estimates (8.14) and (8.15), we see that this gives a contribution
where we have used the facts \(f^{**}(p)/\varphi (p)=O(1/p^2)\) and \(g^{**}(p)/\varphi (p)=1/p+O(1/p^2),\) which follow from (8.12) and (8.13) respectively. This contribution is small. Thus
From the support restrictions on \(y_{r_1,\ldots ,r_k}^{(J,p_1,m)}\) we necessarily have \((r_i,W)=1.\) From the above, we see the first product may be replaced by \(1+O(D_0^{-1}).\) This incurs an acceptable error
which gives the result stated. \(\square \)
We record the following useful corollary.
Corollary 8.4
With notation as in the statement of Proposition 8.1, the following estimates hold.
-
(1)
If \(m\in J\) then
$$\begin{aligned} \tilde{y}_{\text {max}}^{(J,p_1,m)} \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{p_1^2}. \end{aligned}$$ -
(2)
If \(m\notin J\) then
$$\begin{aligned} \tilde{y}_{\text {max}}^{(J,p_1,m)} \ll \frac{y_{\text {max}}B^{|J|}\log \log {R}}{p_1}. \end{aligned}$$ -
(3)
We have
$$\begin{aligned} y_{\text {max}}^{(J,p,m)} \ll y_{\text {max}}B^{|J|}\log \log {R}. \end{aligned}$$
Proof
This follows easily from Lemma 8.2 and Proposition 8.3. (We note that in the special case \(J=\emptyset ,\) in items (2) and (3) we could drop the extra \(\log \log {R}\) factor if required.) \(\square \)
We are now in a position to prove the first two parts of Lemma 6.6.
Proof of Lemma 6.6 parts (i) and (ii)
We prove part (i), in the case \(m\in J.\) The rest of the argument proceeds along the same lines. From Proposition 8.3 we have
Here we have used Corollary 8.4 to control the various error terms in the statement of the proposition. We have also used the fact \(|I|+|J|=k.\) It follows that
again using Corollary 8.4. This gives the result, as required. \(\square \)
From now on we are only interested in the sums \(S_{J} = S_{J,1,1,m},\) and in particular the cases \(|J|\in \{0,1, 2\}\). For ease of notation we let \(y_{r_1,\ldots ,r_k}^{(J)} = y_{r_1,\ldots ,r_k}^{(J,1,m)}\) and \(y_{\text {max}}^{(J)} = \sup _{r_1,\ldots ,r_k} |y_{r_1,\ldots ,r_k}^{(J)}|.\) For future reference we note the bound
proved above.
8.3 Relating vectors to functionals and proof of Lemma 6.6 part (iii)
We first prove the following corollary of Proposition 8.3.
Corollary 8.5
(Relating \(y_{r_1,\ldots ,r_k}^{(J)}\) to integral operators). Let \(y_{r_1,\ldots ,r_k}\) be defined in terms of a fixed, smooth function F, supported on \({R}_k=\{\vec {x}\in [0,1]^k:\sum _{i=1}^{k}x_i\le 1\}\), by
Let
Define the integral operators
Then
-
(i)
if \(J=\{m\}\) we have
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\bigg |_{r_j=1\,\,\forall j\in J} = B\bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )I_{r_1,\ldots ,r_k; m}(F)+ O\Bigg (\frac{F_{\text {max}}B}{D_0}\Bigg ). \end{aligned}$$ -
(ii)
if \(J=\{m,l\}\) we have
$$\begin{aligned} y_{r_1,\ldots ,r_k}^{(J)}\Bigg |_{r_j=1\,\,\forall j\in J} = B^2\Bigg (\prod _{i\in I}\frac{\varphi (r_i)}{r_i}\Bigg )^2I_{r_1,\ldots ,r_k; m,l}(F)+ O\Bigg (\frac{F_{\text {max}}B^2}{D_0}\Bigg ). \end{aligned}$$
Proof
First suppose \(J=\{m\}.\) From Proposition 8.3 we have
Consider the function
With this choice of \(\gamma _1(p)\) we have
and one can easily check \(\gamma _1(p) = 1+O(1/p).\) By an argument identical to the proof of Lemma 7.3, we can evaluate the sum in (8.17) as
which proves (i). If \(J=\{m,l\}\) then, again using Proposition 8.3, we find
Take the sum over \(e_l\) first. Consider the function
Reasoning as above, the sum on the right hand side of (8.18) becomes
We can evaluate this sum in much the same way, this time using the function
to get the stated result. \(\square \)
If we define the (identity) operator
then the results of Corollary 8.5 can be concisely written as
for \(|J|\in \{0,1,2\}.\) Here we have used \(I_{r_1,\ldots ,r_k;J}(F)\) to denote \(I_{r_1,\ldots ,r_k;j:j\in J}(F).\) We are now in a position to prove the remaining part of Lemma 6.6.
Proof of Lemma 6.6 part (iii)
We can write the operators in the statement of Lemma 6.6 as follows:
where we have defined
From Proposition 8.1 we see that
From Corollary 8.5, for \(|J|\in \{0,1,2\},\) we have
Substituting this into (8.19), and using (8.16), yields
The first error contributes
For the main term, if \((u_i,u_j)\ne 1\) then \(u_i\) and \(u_j\) must share a prime factor \(q>D_0\) with \(q\equiv 3\,\,(\text {mod}\,\,4)\). In this case we get a contribution
Thus this constraint can be removed at the cost of a negligible error and we are left with
Now, since
we can evaluate this multidimensional sum by applying Lemma 7.3 a total of \(|I|=k-|J|\) times. We obtain
This completes the proof of Lemma 6.6. \(\square \)
Notes
This is precisely why the authors were limited to using upper bounds and not asymptotics in [1] (cf. Remark 2.1).
By Rankin’s trick we have
$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \tau _k(d) \le R\sum _{\begin{array}{c} d\le R \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \frac{\tau _k(d)}{d} \ll R(\log {R})^{k/2}. \end{aligned}$$
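For completeness, we justify the final bound in this footnote (a standard Euler product computation, using only Mertens' theorem in arithmetic progressions):
$$\begin{aligned} \sum _{\begin{array}{c} d\le R \\ p|d\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4) \end{array}} \frac{\tau _k(d)}{d} \le \prod _{\begin{array}{c} p\le R \\ p\equiv 3\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p}\Bigg )^{-k} \ll (\log {R})^{k/2}, \end{aligned}$$
since \(\sum _{j\ge 0}\tau _k(p^j)p^{-j}=(1-1/p)^{-k}\) and the primes \(p\equiv 3\,\,(\text {mod}\,\,4)\) have Dirichlet density 1/2.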
References
Banks, W.D., Freiberg, T., Maynard, J.: On limit points of the sequence of normalised prime gaps. Proc. Lond. Math. Soc. 113(4), 515–539 (2016)
Bui, H.M., Heath-Brown, D.R.: A note on the fourth moment of Dirichlet \(L\)-functions. Acta Arith. 141, 335–344 (2010)
Davenport, H.: Multiplicative Number Theory. Graduate Texts in Mathematics, vol. 74. Springer, New York (2000)
Goldston, D.A., Pintz, J., Yildirim, C.Y.: Primes in tuples I. Ann. Math. 170(2), 819–862 (2009)
Goldston, D.A., Graham, S.W., Pintz, J., Yildirim, C.Y.: Small gaps between products of two primes. Proc. Lond. Math. Soc. 98(3), 741–774 (2009)
Granville, A.: Primes in intervals of bounded length. Bull. Am. Math. Soc. 52(2), 171–222 (2015)
Hooley, C.: On the intervals between numbers that are sums of two squares. Acta Math. 127, 279–297 (1971)
Jakobson, D.: Quantum limits on flat tori. Ann. Math. 145(2), 235–266 (1997)
Maynard, J.: Small gaps between primes. Ann. Math. 181(1), 383–413 (2015)
Plaksin, V.A.: The distribution of numbers representable as the sum of two squares. Math. USSR Izv. 31(1), 171–191 (1988)
Polymath, D.H.J.: Variants of the Selberg sieve, and bounded intervals containing many primes. Res. Math. Sci. 1, Art. 12, 83 pp. (2014)
Tenenbaum, G.: Introduction to Analytic and Probabilistic Number Theory. Graduate Studies in Mathematics, vol. 163. American Mathematical Society, Providence (2015)
Tolev, D.I.: On the remainder term in the circle problem in an arithmetic progression. Proc. Steklov Inst. Math. 276, 261–274 (2012)
Acknowledgements
The author would like to thank James Maynard for introducing them to this problem in the first instance, and for being available for many helpful conversations thereafter. The author would also like to thank the anonymous referee for a careful and thorough reading of the paper. The author is funded by an Engineering and Physical Sciences Research Council (EPSRC) Studentship.
Appendices
Appendix A: Estimates for r(n) and \(r(n)r(n+h)\)
We sketch proofs for Lemmas 5.3, 5.4 and 5.5. The first follows immediately from the following two results, due to Tolev [13, Theorem] and Plaksin (cf. discussion just before [10, Lemma 4]) respectively.
Lemma A.1
We have
where
Lemma A.2
We have
where
and \(P(N;Q,\Delta )\) satisfies
Here \(c_r(\Delta )=\sum _{(a,r)=1}e^{\frac{2\pi i a \Delta }{r}}\) denotes the Ramanujan sum.
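For later reference we note the standard evaluation of the Ramanujan sum:
$$\begin{aligned} c_r(\Delta ) = \sum _{d|(r,\Delta )}d\,\mu \Big (\frac{r}{d}\Big ), \end{aligned}$$
so that \(c_r(\Delta )=\mu (r)\) whenever \((r,\Delta )=1,\) and \(r\mapsto c_r(\Delta )\) is multiplicative. These facts will be used without further comment below.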
Proof of Lemma 5.3
Equating (A.1) and (A.2), dividing through by \(\pi N,\) and letting \(N\rightarrow \infty \) we see that
holds for any fixed \(Q,\Delta .\) Thus we may write
Let \(Q=4qd\) and \(\Delta \,\,(\text {mod}\,\,Q)\) be the solution to the congruence system \(\Delta \equiv a\,\,(\text {mod}\,\,q),\) \(\Delta \equiv 1\,\,(\text {mod}\,\,4),\) and \(\Delta \equiv 0\,\,(\text {mod}\,\,d),\) where \((a,q)=(d,q)=1\) and q, d are square-free and odd. Then by multiplicativity of the Ramanujan sum and our assumptions about \(\Delta ,q\) and d from (A.3) we obtain
and so we obtain the result of Lemma 5.3 on dividing through by \(Q=4qd\). \(\square \)
For Lemma 5.4 we have the following result.
Lemma A.3
Suppose that \((a,q)=(a+h,q)=(d_1,q)=(d_2,q)=(d_1,d_2)=1\) and \(4|h,\) where \(d_1,d_2,q\) are square-free and odd, of size \(\ll N^{O(1)}\). Then for \(0<h<N^{\frac{3}{4}}\) we have
where
where
and \(g_1\) is the multiplicative function defined on primes by \(g_1(p) =1- \frac{\chi (p)}{p}\) and \(\Psi (d_1,r)=g_2((d_1,r/(r,d_1))).\)
Proof of Lemma 5.4
Under the additional assumption that \(p|h \Rightarrow p|2q,\) we see that \(c_r(h)=\mu (r)\) for the r considered in the sum. Restricting now to square-free r, since \(d_1,d_2\) are square-free we have \(\Psi (d_1,r)=\Psi (d_2,r)=1.\) Thus in this case
This is the form stated in Lemma 5.4. \(\square \)
Now we briefly outline the proof of Lemma A.3. In [10, Lemma 4] a sum similar to (A.6) is considered, this time under the hypotheses \(p|d_1,d_2\Rightarrow p\equiv 3\,\,(\text {mod}\,\,4)\) and \(q=1,\) and with the congruence \(n\equiv 1\,\,(\text {mod}\,\,4)\) omitted. The proof of Lemma A.3 is similar to the proof found there, with a few minor changes. The key point is to note that, using the convolution identity \(r=4(\chi *1)\) and the complete multiplicativity of \(\chi ,\) for \(n\equiv 1\,\,(\text {mod}\,\,4)\) we can write
Using this we may expand out r(n) in the sum (A.6). We obtain (after swapping the order of summation)
These congruences have a solution if and only if \((m,q)=1\) and \((m,d_2)|h\). In this case we can use the Chinese Remainder Theorem and write
where \(Q=4q[m,d_1,d_2]\) and \(\Delta \,\,(\text {mod}\,\,Q)\) satisfies the congruence system \(\Delta \equiv a+h\,\,(\text {mod}\,\,q), \Delta \equiv 1\,\,(\text {mod}\,\,4), \Delta \equiv h\,\,(\text {mod}\,\,m), \Delta \equiv h\,\,(\text {mod}\,\,d_1)\) and \(\Delta \equiv 0\,\,(\text {mod}\,\,d_2)\). The error term arises by estimating the intervals of length h left over: taking absolute values, using the divisor bound \(r(n) \ll n^{\epsilon }\), and noting that in this regime \(h\ll N^{3/4},\) we have to estimate sums of type
Applying the standard estimate \(h/m+O(1)\) to the inner sum and then carrying out the summation over m yields the desired estimate (after redefining our choice of \(\epsilon \)).
The inner sums in (A.9) now amount to estimating r(n) in arithmetic progressions. It proves convenient to proceed using formula (A.4). We obtain
where \(R_2\) is an error term. Using the fact \((\Delta ,Q)\le d_1d_2(m,h)\) and \(Q\ll mqd_1d_2\) one can show \(R_2\) contributes
and so this error term dominates. Now one can estimate the main term in (A.10) using the same techniques found in [10, Lemma 4].
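As a quick numerical check of the convolution identity \(r=4(\chi *1)\) used above (included purely as a sanity check): for \(n=25\equiv 1\,\,(\text {mod}\,\,4)\) we have
$$\begin{aligned} 4\sum _{d|25}\chi (d) = 4(\chi (1)+\chi (5)+\chi (25)) = 12 = r(25), \end{aligned}$$
the twelve representations of 25 being \((\pm 5,0),(0,\pm 5),(\pm 3,\pm 4)\) and \((\pm 4,\pm 3).\)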
For Lemma 5.5 we have the following result.
Lemma A.4
Let \((a,q)=(d,q)=1\) where d, q are square-free and odd, of size \(\ll N^{O(1)}\). Then we have
where
Proof of Lemma 5.5
One obtains the main term as stated in Lemma 5.5 by taking the logarithmic derivative of H(s, d, 4q) (taking an appropriate branch-cut) and evaluating at \(s=1.\) \(\square \)
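The computation alluded to in this proof is elementary. If, purely for illustration, \(H(s)=\prod _p(1+a_pp^{-s})\) is an Euler product of this general shape, absolutely convergent near \(s=1\) (the precise form of H(s, d, 4q) is as in Lemma A.4), then
$$\begin{aligned} \frac{H'}{H}(1) = -\sum _p \frac{a_p\log {p}}{p+a_p}, \end{aligned}$$
as follows upon differentiating \(\log {H(s)}=\sum _p\log (1+a_pp^{-s})\) term by term and setting \(s=1.\)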
We now outline the proof of Lemma A.4. The proof uses a standard application of Perron’s formula. The following lemma, which combines [12, Theorem II.8.20] and [12, Theorem II.8.22], will prove necessary.
Lemma A.5
Let \(\mathcal {L}= \log {(|t|+Q+1)}\) and let \(\chi _Q\,\,(\text {mod}\,\,Q)\) be a Dirichlet character. For \(\sigma \ge 1\) we have the following
-
(1)
If \(\chi _{Q}^2\) is complex then \(L(s,\chi _Q^2)^{-1} \ll \mathcal {L}^7.\)
-
(2)
If \(\chi _Q^2\) is real, non-trivial, then there exists an absolute constant \(c_0>0\) such that
$$\begin{aligned} L(s,\chi _Q^2)^{-1} \ll {\left\{ \begin{array}{ll} \mathcal {L}^6 (\mathcal {L}+1/|t|)\,\,&{}\text {if }|t| > c_0 Q^{-\frac{1}{2}}(\log {2Q})^{-2}, \\ Q^{\frac{1}{2}}\,\,&{}\text {if }|t| \le c_0 Q^{-\frac{1}{2}}(\log {2Q})^{-2}. \\ \end{array}\right. } \end{aligned}$$
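We also record the elementary identity for the trivial character \(\chi _0\,\,(\text {mod}\,\,Q)\):
$$\begin{aligned} L(s,\chi _0) = \zeta (s)\prod _{p|Q}\Bigg (1-\frac{1}{p^s}\Bigg ), \end{aligned}$$
which is used immediately below in the case where \(\chi _Q^2\) is trivial.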
Write \(Q=4q\) and let \(\Delta \,\,(\text {mod}\,\,Q)\) be the unique solution to the congruences \(\Delta \equiv a\,\,(\text {mod}\,\,q)\) and \(\Delta \equiv 1\,\,(\text {mod}\,\,4)\). Since \((\Delta ,Q)=1\) by character orthogonality we can write
We study these sums by considering their generating series
Write \(F(s,\chi _Q)=F_1(s,\chi _Q).\) Then
Here \(\chi _4\) denotes the unique non-trivial character \((\text {mod}\,\,4),\) and \(L(s,\chi _D)\) denotes the L series corresponding to the Dirichlet character \(\chi _D\,\,(\text {mod}\,\,D).\) It follows that
say, using the fact d is square-free and \((d,Q)=1\). We note the divisor bound \(|r^2(n)\chi _q(n)| \le \tau ^2(n) \le e^{C\frac{\log {n}}{\log \log {n}}}\) for some explicit \(C>0.\) We can apply an effective form of Perron’s formula (for example [12, Corollary II.2.4]), averaging over the height T, to obtain
where
Here \(c=1+1/\log {x}.\) We move the contour to the rectangle with sides \([c-it_0,c+it_0],\) \([1/2+it_0,c+it_0],\) \([1/2-it_0,1/2+it_0]\) and \([1/2-it_0,c-it_0].\) The integrand has a pole of order 2 at \(s=1,\) coming from the trivial character \(\chi _0\,\,(\text {mod}\,\,Q)\). The residue of this pole is
where
We need bounds for the integrand in the region \(\sigma \ge 1/2\) and \(|t|\le 2T.\) One can easily check the trivial bound \(|A(s,d,Q)|\ll 1\) (uniformly in d and q). We require a lower bound for \(|L(2s,\chi _Q^2)|\) in this region. If \(\chi _Q\) is real then \(\chi _Q^2\) is the trivial character, and so we have
In the region \(\sigma \ge \frac{1}{2}\) we can bound the product from below by \(\varphi (Q)/Q.\) Standard bounds for \(\zeta (s)\) on the 1-line (see for example [12, Theorem II.3.9]) then tell us that
If \(\chi _{Q}\) is complex then we have to be more careful. We use Lemma A.5, and there is a minor technical complication: we must bound the contribution from \(s=1/2+it\) with \(|t|\le 2c_0\) (say) separately (with this choice of cut-off we can use the bound \(L(2s,\chi _Q^2)^{-1} \ll (QT)^{\epsilon }\) uniformly in \(\chi _Q\) for all the other contours). One can easily show the contribution from these values of s is \(\ll Q^{\frac{1}{2}},\) which is small.
To bound the contribution from the other integrals, we can use the fourth-moment estimate for Dirichlet L-functions on the critical line in the form
(cf. [2, Theorem 1]) together with the Cauchy-Schwarz inequality. One can show that, with the choice \(T=x^{\frac{1}{4}},\) the contours contribute \(\ll _{\epsilon }Qx^{\frac{3}{4}+\epsilon }.\)
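In slightly more detail (a sketch, under the assumption suggested by the factorisation of \(F(s,\chi _Q)\) above that the integrand on the critical line is dominated by a product of two Dirichlet L-functions squared): by the Cauchy-Schwarz inequality,
$$\begin{aligned} \int _{-2T}^{2T}\Big |L\Big (\frac{1}{2}+it,\chi \Big )\Big |^2\Big |L\Big (\frac{1}{2}+it,\chi '\Big )\Big |^2\,\mathrm {d}t \le \Bigg (\int _{-2T}^{2T}\Big |L\Big (\frac{1}{2}+it,\chi \Big )\Big |^4\,\mathrm {d}t\Bigg )^{\frac{1}{2}}\Bigg (\int _{-2T}^{2T}\Big |L\Big (\frac{1}{2}+it,\chi '\Big )\Big |^4\,\mathrm {d}t\Bigg )^{\frac{1}{2}} \ll _{\epsilon }(QT)^{1+\epsilon }, \end{aligned}$$
and the choice \(T=x^{\frac{1}{4}}\) balances the resulting \(x^{\frac{1}{2}}(QT)^{1+\epsilon }\) contribution of the left-hand contour against the \(O(x^{1+\epsilon }/T)\) loss in Perron's formula.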
Appendix B: Auxiliary estimates for \(\rho (n)\)
To evaluate the sums appearing in Lemma 5.6 we use the Selberg–Delange method.
Lemma B.1
(Selberg–Delange method). Let \(F(s)=\sum _{n=1}^{\infty }a_n n^{-s}\) be a Dirichlet series such that the function \(G(s;z)=F(s)\zeta (s)^{-z}\) can be analytically continued to the region
for some constant \(c_0>0\) and all \(z\in \mathbb {C}\) with \(|z|\le A,\) and moreover satisfies the bound \(|G(s;z)|\ll M(1+|t|)^{1-\delta }\) for some \(\delta >0\) in this region. Let \(Z_1(s;z)=[\zeta (s)(s-1)]^z, Z_2(s;z)=\frac{Z_1(s;z)}{s}\) (both holomorphic in the disc \(|s-1|<1\)) and let
be the Taylor series in this region. Then for any \(N\ge 0\) and \(|z|\le A\) we have
and moreover if \(a_n>0\) then we also have
where
for some suitable constants \(c_1,c_2>0\). These positive constants, and the implicit constants in the Landau symbol, depend at most on \(c_0,\delta \) and A.
The second statement is exactly [12, Theorem II.5.2]. The first statement follows by the same proof with a few minor changes.
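As a model application of Lemma B.1 (included for orientation only; it is not used in what follows): the Dirichlet series of the indicator function of sums of two squares factors as \((\zeta (s)L(s,\chi _4))^{\frac{1}{2}}G(s)\) with G(s) holomorphic and bounded in a region of the form (B.1), so taking \(z=\frac{1}{2}\) recovers the classical Landau–Ramanujan asymptotic
$$\begin{aligned} \#\{n\le x: n=a^2+b^2\} \sim \frac{Kx}{(\log {x})^{\frac{1}{2}}} \end{aligned}$$
for a certain constant \(K>0.\)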
Lemma 5.6 follows from suitable applications of Lemma B.1. We sketch the details in each case.
Proof of Lemma 5.6
-
(i)
For \(X_{N,W}\) recall from (5.2) the definition
$$\begin{aligned} X_{N,W}=\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)}{a}\log {\frac{v}{a}}. \end{aligned}$$(B.5)Consider the generating series, for \(\sigma >1\)
$$\begin{aligned} h_{1}(W,s) = \sum _{\begin{array}{c} n=1 \\ (n,W)=1 \\ p|n\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}^{\infty }\frac{\mu (n)}{n^s} = \prod _{\begin{array}{c} p\not \mid W \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p^s}\Bigg ) = \frac{K_1(s)G_1(W,s)}{(\zeta (s)L(s,\chi _4))^{\frac{1}{2}}}, \end{aligned}$$where
$$\begin{aligned} K_1(s)^2&= \Bigg (1-\frac{1}{2^s}\Bigg )^{-1}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1-\frac{1}{p^{2s}}\Bigg )^{-1}, \\ G_1(W,s)&= \prod _{\begin{array}{c} p|W \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p^s}\Bigg )^{-1} \end{aligned}$$(taking the positive determination of the square-root in the first instance). Both of these functions are analytic for \(\sigma >3/4\) (say). \(K_1(s)\) is bounded in this region, and \(G_1(W,s)\) satisfies
$$\begin{aligned} |G_{1}(W,s)|&\le \prod _{\begin{array}{c} p|W \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p^{\frac{3}{4}}}\Bigg )^{-1} = \exp \Bigg [- \sum _{\begin{array}{c} p\le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\log \Bigg (1-\frac{1}{p^{3/4}}\Bigg )\Bigg ] \\&= \exp \sum _{\begin{array}{c} p\le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg [\frac{1}{p^{3/4}}+O\Bigg (\frac{1}{p^{3/2}}\Bigg )\Bigg ] \ll \exp \sum _{\begin{array}{c} p\le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{D_0^{1/4}}{p} \\&\ll \exp (D_0^{1/4}\log \log {D_0}). \end{aligned}$$We remark that with the choice \(D_0 = (\log \log {N})^3,\) we certainly have the bound
$$\begin{aligned} \exp (D_0^{1/4}\log \log {D_0}) \ll _{\epsilon } (\log {N})^{\epsilon } \end{aligned}$$for any fixed \(\epsilon >0,\) and using this estimate, it is a simple task to verify that all the error terms which follow are indeed controlled. (Recall \(v=N^{\theta }\) for some fixed \(\theta >0\).)
From the classical zero-free region for Dirichlet L-functions, \(L(s,\chi _4)^{-1}\) can be analytically continued to a region of the form (B.1) for some \(c_0\) and moreover satisfies a bound \(L(s,\chi _4)^{-1}\ll \log {(2+|t|)}\) in this region (cf. [3, Chapter 14]). Thus we can apply Lemma B.1 with \(M=\exp (D_0^{1/4}\log \log {D_0}), z=-\frac{1}{2},N=0\) and some suitable choices of \(c_0,\delta ,\) to obtain
$$\begin{aligned} X_{N,W} = \frac{K_1(1)G_{1}(W)}{\Gamma (3/2)\sqrt{L(1,\chi _4)}}(\log {v})^{\frac{1}{2}}+O\Bigg (\frac{\exp (D_0^{\frac{1}{4}}\log \log {D_0})}{(\log {v})^{\frac{1}{2}}}\Bigg ). \end{aligned}$$Here we have written \(G_{1}(W)=G_{1}(W,1)\) (and this convention will be continued below the fold). This simplifies to the stated result. Note that
$$\begin{aligned} G_1(W) = \prod _{\begin{array}{c} p\le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{p}\Bigg )^{-1} \asymp (\log {D_0})^{\frac{1}{2}} \end{aligned}$$by Mertens’ theorem.
-
(ii)
For \(Y_{N,W}\) recall from (5.3) the definition
$$\begin{aligned} Y_{N,W}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ (a,b)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\mu (a)\mu (b)}{g_7(a)g_7(b)}\log {\frac{v}{a}}\log {\frac{v}{b}}. \end{aligned}$$(B.6)Take the sum over b on the inside. We can evaluate
$$\begin{aligned} \sum _{\begin{array}{c} b\le v \\ (b,aW)=1 \\ p|b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (b)}{g_7(b)}\log {\frac{v}{b}} \end{aligned}$$using the generating function, for \(\sigma >1\)
$$\begin{aligned} h_2(aW,s)&= \sum _{\begin{array}{c} n=1 \\ (n,aW)=1 \\ p|n\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}^{\infty }\frac{\mu (n)n}{n^sg_7(n)} = h_{1}(aW,s)K_2(s)G_2(aW,s), \end{aligned}$$where
$$\begin{aligned} K_2(s)&= \prod _{p\equiv 1\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{1}{(p^s-1)(p+1)}\Bigg ), \\ G_2(aW,s)&= \prod _{\begin{array}{c} p|aW\\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{1}{(p^s-1)(p+1)}\Bigg )^{-1}. \end{aligned}$$Arguing as above, Lemma B.1 yields
$$\begin{aligned} Y_{N,W} = T_1+T_2+O(T_3), \end{aligned}$$this time taking \(M=\exp (D_0^{1/4}\log \log {D_0}), z=-1/2\) and \(N=1.\) Here
$$\begin{aligned} T_3&= \frac{M}{(\log {v})^{\frac{3}{2}}}\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(a)\tau (a)}{g_7(a)}\log {\frac{v}{a}} \ll \frac{M}{(\log {v})^{\frac{1}{2}}} \prod _{\begin{array}{c} p\le v \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{2}{p+1}\Bigg ) \\&\ll \frac{M}{(\log {v})^{\frac{1}{2}}} \prod _{\begin{array}{c} p\le v \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{1}{p}\Bigg )^2 \ll M (\log {v})^{\frac{1}{2}} = \exp (D_0^{1/4}\log \log {D_0})(\log {v})^{\frac{1}{2}}. \end{aligned}$$To evaluate \(T_2,\) we need a handle on the second Taylor series coefficient. With notation as in the statement of Lemma B.1, this is labelled \(\mu _1(-1/2)\) and is precisely equal to
$$\begin{aligned} G(1)Z_1'(1;-1/2)+G'(1) Z_1(1;-1/2), \end{aligned}$$where
$$\begin{aligned} G(s) = \frac{K_1(s)G_1(aW,s)K_2(s)G_2(aW,s)}{L(s,\chi _4)^{1/2}} \end{aligned}$$(taking an appropriate branch-cut). Here all derivatives are taken with respect to s. By a series expansion, one can directly show that \(Z_1(1;-1/2)=1\) and \(Z_1'(1;-1/2)=-\gamma /2,\) where \(\gamma \) is the Euler-Mascheroni constant. We now note the following:
$$\begin{aligned} G_1(p)G_2(p)&=\frac{g_7(p)}{p}, \\ G_1'(d)&= G_1(d) \sum _{\begin{array}{c} p|d \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\log {p}}{p-1}, \\ G_2'(d)&= G_2(d) \sum _{\begin{array}{c} p|d \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\log {p}}{p(p-1)}. \end{aligned}$$Here we have defined \(G_i'(d)=\frac{\mathrm {d}G_i(d,s)}{\mathrm {d}s}\big |_{s=1}\) for \(i\in \{1,2\}\). These results are valid for primes \(p\equiv 1\,\,(\text {mod}\,\,4),\) and for any square-free d. Thus one can write \(T_2\) as a finite linear combination
$$\begin{aligned} T_2&= \frac{1}{(\log {v})^{\frac{1}{2}}}\sum _{i} c_i \alpha _i(W_1) R_i, \end{aligned}$$where \(c_i\in \mathbb {R}\) are bounded,
$$\begin{aligned} \alpha _i(W_1)\in \Bigg \{\frac{g_7(W_1)}{W_1}, \,\,\frac{g_7(W_1)}{W_1}\sum _{p|W_1}\frac{\log {p}}{p-1},\,\,\frac{g_7(W_1)}{W_1}\sum _{p|W_1}\frac{\log {p}}{p(p-1)}\Bigg \}, \end{aligned}$$and \(R_i\) is one of
$$\begin{aligned}&\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{a}\log {\frac{v}{a}} \ll (\log {D_0})^{1/2}(\log {v})^{\frac{1}{2}} \end{aligned}$$or
$$\begin{aligned}&\sum _{\begin{array}{c} D_0<q<v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q-1}\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ q|a \\ p|a \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{a}\log {\frac{v}{a}} \ll \exp (D_0^{1/4}\log \log {D_0})(\log {v})^{1/2}, \end{aligned}$$or finally
$$\begin{aligned}&\sum _{\begin{array}{c} D_0<q<v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q(q-1)}\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ q|a \\ p|a \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{a}\log {\frac{v}{a}} \ll \frac{(\log {D_0})^{3/2}(\log {v})^{1/2}}{D_0^2}. \end{aligned}$$Here q denotes a prime variable (a convention that will also be used below the fold). The first estimate follows directly from our expression for \(X_{N,W}\) found above. For the last two estimates, we note that the inner sum appearing in both is precisely \(X_{N,W}-X_{N,qW},\) and from our work above it follows that
$$\begin{aligned} X_{N,W} - X_{N,qW}&= \frac{K_1(1)}{\Gamma (3/2)\sqrt{L(1,\chi _4)}}(\log {v})^{\frac{1}{2}}(G_1(W)-G_1(qW)) \\&\qquad + O\Bigg (\frac{\exp (D_0^{\frac{1}{4}}\log \log {D_0})}{(\log {v})^{\frac{1}{2}}}\Bigg ) \\&\ll \frac{(\log {D_0})^{\frac{1}{2}}(\log {v})^{\frac{1}{2}}}{q} \\&\quad + \frac{\exp (D_0^{\frac{1}{4}}\log \log {D_0})}{(\log {v})^{\frac{1}{2}}}. \end{aligned}$$Now, standard estimates for sums over primes such as
$$\begin{aligned} \sum _{\begin{array}{c} q>D_0 \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q(q-1)} \ll \frac{\log {D_0}}{D_0}, \end{aligned}$$yields the results stated.
Now, using the fact
$$\begin{aligned} \sum _{p|W_1}\frac{\log {p}}{p-1}\ll \log {D_0}, \end{aligned}$$and
$$\begin{aligned} \frac{g_7(W_1)}{W_1}&= \prod _{\begin{array}{c} p \le D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{1}{p}\Bigg ) \ll (\log {D_0})^{\frac{1}{2}}, \end{aligned}$$it follows that the total contribution from \(T_2\) is
$$\begin{aligned} T_2 \ll (\log {D_0})^{3/2}\exp (D_0^{1/4}\log \log {D_0})(\log {v})^{1/2} \end{aligned}$$which is negligible.
Finally, we can write
$$\begin{aligned} T_1 = \frac{K_1(1)K_2(1)G_1(W_1)G_2(W_1)(\log {v})^{\frac{1}{2}}}{\Gamma (3/2)\sqrt{L(1,\chi _4)}}\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a \Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)}{a}\log {\frac{v}{a}} \end{aligned}$$This last sum is exactly \(X_{N,W}.\) Using our bound from part (i), and also the fact
$$\begin{aligned} K_2(1)G_2(W_1)=\prod _{\begin{array}{c} p>D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{1}{p^2-1}\Bigg ) = 1+O(D_0^{-1}), \end{aligned}$$we arrive at a final estimate of
$$\begin{aligned} Y_{N,W} = (1+O(D_0^{-1}))\Bigg [&\frac{K_1(1)G_{1}(W)}{\Gamma (3/2)\sqrt{L(1,\chi _4)}}\Bigg ]^2\log {v}\\&+O((\log {D_0})^{3/2}\exp (D_0^{1/4}\log \log {D_0})(\log {v})^{1/2}). \end{aligned}$$ -
(iii)
For \(Z_{N,W}^{(1)}\) recall from (5.4) the definition
$$\begin{aligned} Z_{N,W}^{(1)}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}. \end{aligned}$$(B.7)We write this as
$$\begin{aligned} \sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)g_4(a)}{g_2(a)a}\log {\frac{v}{a}}\sum _{d|a}\frac{d}{g_4(d)}\sum _{\begin{array}{c} b\le v \\ (b,W)=1 \\ (b,a)=d \\ p|b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (b)g_4(b)}{g_2(b)b}\log {\frac{v}{b}}. \end{aligned}$$Now substitute \(b=md.\) The inner sums can be rewritten as
$$\begin{aligned} \sum _{d|a}\frac{\mu (d)}{g_2(d)} \sum _{\begin{array}{c} m\le v/d \\ (m,aW)=1 \\ p|m\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (m)g_4(m)}{g_2(m)m}\log {\frac{v}{md}}, \end{aligned}$$where we have used the fact a is square-free. To handle the inner sum we consider the generating series
$$\begin{aligned} h_{3}(aW,s)&= \sum _{\begin{array}{c} n=1 \\ (n,aW)=1 \\ p|n\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}^{\infty }\frac{\mu (n)g_4(n)}{n^sg_2(n)} =\frac{K_3(s)K_4(s)G_{3}(aW,s)G_{4}(aW,s)}{\zeta (s)L(s,\chi _4)}, \end{aligned}$$where
$$\begin{aligned} K_3(s)&=\Bigg (1-\frac{1}{2^s}\Bigg )^{-1}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1-\frac{1}{p^{2s}}\bigg )^{-1} \prod _{p\equiv 1\,\,(\text {mod}\,\,4)}\bigg (1-\frac{1}{(p^s-1)^2}\Bigg ), \\ K_4(s)&=\prod _{p\equiv 1\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{5p-3}{(p+1)(2p-1)(p^s-2)}\Bigg ), \\ G_{3}(aW,s)&= \prod _{\begin{array}{c} p|aW \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{2}{p^s}\Bigg )^{-1}, \\ G_{4}(aW,s)&= \prod _{\begin{array}{c} p|aW \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{5p-3}{(p+1)(2p-1)(p^s-2)}\Bigg )^{-1}. \end{aligned}$$These four functions are analytic in the region \(\sigma > 3/4\) (say), where they all satisfy the bound O(1) except for \(G_3.\) Since \((a,W)=1\), by multiplicativity we can write \(G_3(aW,s) = G_3(a,s)G_3(W,s).\) As above, we can bound \(|G_3(W,s)| \ll \exp (D_0^{1/4}\log \log {D_0})\) in this region. On the other hand, note that
$$ 1\le \Bigg (1-\frac{2}{p^{3/4}}\Bigg )^{-1} \le 2$$whenever \(p\ge 7.\) Since we are assuming a is square-free, it follows that we can bound \(|G_3(a,s)| \ll \tau (a).\)
By similar arguments to part (i) we can apply Lemma B.1 taking \(z=-1, M=\tau (a)\exp (D_0^{1/4}\log \log {D_0})\) and suitable \(\delta ,c_0.\) Note that for \(N\ge 1\) the terms \(\mu _k(z)(\log {x})^{z+1-k}/\Gamma (z+2-k)\) are 0 for \(1\le k\le N.\) Hence we can choose \(N=\left\lfloor (\log {x})/ec_1 \right\rfloor \) (for some suitable \(c_1>0\)) to balance the error terms, yielding a stronger error term of the form \(O(Me^{-c_1\sqrt{\log {x}}}).\) Using this choice of N we obtain
$$\begin{aligned} \sum _{\begin{array}{c} m\le v/d \\ (m,aW)=1 \\ p|m\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (m)g_4(m)}{g_2(m)m}\log {\frac{v}{md}} = \frac{K_3(1)K_4(1)G_{3}(aW)G_4(aW)}{L(1,\chi _4)} +O\Bigg (Me^{-c_1\sqrt{\log {v/d}}}\Bigg ). \end{aligned}$$This error term contributes
$$\begin{aligned} \ll \exp (D_0^{1/4}\log \log {D_0})\sum _{\begin{array}{c} a\le v \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(a)g_4(a)\tau (a)}{g_2(a)a}\log {\frac{v}{a}}\sum _{d|a}\frac{\mu ^2(d)}{g_2(d)}e^{-c_1\sqrt{\log {v/d}}}. \end{aligned}$$Swapping sums yields
$$\begin{aligned} \sum _{\begin{array}{c} d\le v \\ p|d\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(d)g_4(d)\tau (d)}{g_2(d)^2d}e^{-c_1\sqrt{\log {v/d}}} \sum _{\begin{array}{c} m\le v/d \\ (m,d)=1 \\ p|m\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(m)g_4(m)\tau (m)}{g_2(m)m}\log {\frac{v}{md}} \end{aligned}$$The inner sum can be bounded by
$$\begin{aligned}&\ll \log {\frac{v}{d}} \prod _{\begin{array}{c} p\le v/d \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{2(4p^2-3p+1)}{p(p+1)(2p-1)}\Bigg ) \ll \log {\frac{v}{d}} \prod _{\begin{array}{c} p\le v/d\\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{4}{p}\Bigg ), \\&\ll \log {\frac{v}{d}}\prod _{\begin{array}{c} p\le v/d\\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1+\frac{1}{p}\Bigg )^4 \ll \Bigg (\log {\frac{v}{d}}\Bigg )^3 \ll e^{c_2\sqrt{\log {v/d}}} \end{aligned}$$for some suitably small \(c_2>0.\) Note this final bound is valid for all \(d\le v.\) Using the fact \(g_4(d)/g_2(d)^2 \le 1\), we see the total error is
$$\begin{aligned}&\ll \exp (D_0^{1/4}\log \log {D_0}) \sum _{\begin{array}{c} d\le v \\ p|d\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(d)\tau (d)}{d}e^{-c_3\sqrt{\log {v/d}}} \end{aligned}$$(B.8)for some \(0<c_3<c_1.\) For this sum, we use the generating series, for \(\sigma >1\)
$$\begin{aligned} \sum _{\begin{array}{c} n=1 \\ p|n\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}^{\infty } \frac{\mu ^2(n)\tau (n)}{n^s} = \prod _{p\equiv 1\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{2}{p^s}\Bigg ) = \frac{\zeta (s)L(s,\chi _4)K(s)}{\zeta (2s)L(2s,\chi _4)}, \end{aligned}$$where
$$\begin{aligned} K(s) = \Bigg (1+\frac{1}{2^s}\Bigg )^{-1}\prod _{p\equiv 3\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{1}{p^{2s}}\Bigg )^{-1}\prod _{p\equiv 1\,\,(\text {mod}\,\,4)}\Bigg (1+\frac{1}{p^s(p^s+2)}\Bigg ). \end{aligned}$$By the second part of Lemma B.1, taking \(z=1\) and \(N=\left\lfloor (\log {x})/ec_4 \right\rfloor \) for some suitable \(c_4>0\) we get
$$\begin{aligned} \sum _{\begin{array}{c} d\le v \\ p|d\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \mu ^2(d)\tau (d) = \frac{L(1,\chi _4)K(1)}{\zeta (2)L(2,\chi _4)}v+O(ve^{-c_4\sqrt{\log {v}}}). \end{aligned}$$Now by splitting the sum at \(v^{1-\epsilon }\) and using partial summation one can show that
$$\begin{aligned} \sum _{\begin{array}{c} d\le v \\ p|d\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu ^2(d)\tau (d)}{d}e^{-c_3\sqrt{\log {v/d}}} \ll 1+ (\epsilon + e^{-c_3\sqrt{\epsilon \log {v}}})\log {v}. \end{aligned}$$Hence, taking \(\epsilon =\frac{(\log \log {v})^3}{\log {v}},\) we see the error (B.8) is \(O(\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3),\) which is small. Let \(\gamma _1(a)=\sum _{d|a}\mu (d)/g_2(d).\) We obtain a main term
$$\begin{aligned} \frac{K_3(1)K_4(1)G_{3}(W)G_4(W)}{L(1,\chi _4)}\sum _{\begin{array}{c} a\le v \\ (a,W)=1 \\ p|a\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)g_4(a)\gamma _1(a)G_{3}(a)G_4(a)}{g_2(a)a}\log {\frac{v}{a}}. \end{aligned}$$For this sum we consider the generating series, for \(\sigma >1\)
$$\begin{aligned} h_4(W,s)&= \sum _{\begin{array}{c} n=1 \\ (n,W)=1 \\ p|n\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}^{\infty }\frac{\mu (n)g_4(n)\gamma _1(n)G_3(n)G_4(n)}{n^sg_2(n)} = h_1(W,s)K_5(s)G_5(W,s), \end{aligned}$$where
$$\begin{aligned} K_5(s)&= \prod _{p\equiv 1\,\,(\text {mod}\,\,4)} \Bigg (1-\frac{(p-1)^2}{(p^s-1)(2p-1)(2p^2-p+1)}\Bigg ), \\ G_5(W,s)&= \prod _{\begin{array}{c} p|W \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{(p-1)^2}{(p^s-1)(2p-1)(2p^2-p+1)}\Bigg )^{-1}. \end{aligned}$$Applying Lemma B.1 with \(N=0\) gives
$$\begin{aligned} Z_{N,W}^{(1)}&= C_{W} (\log {v})^{\frac{1}{2}}+O(\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3), \end{aligned}$$where
$$\begin{aligned} C_W= \frac{K_1(1)K_3(1)K_4(1)K_5(1)G_1(W)G_3(W)G_4(W)G_5(W)}{\Gamma (3/2)L(1,\chi _4)^{\frac{3}{2}}} \end{aligned}$$(B.9)which reduces to the stated result (note that
$$\begin{aligned} K_3(1)G_3(W)&=\frac{A^2}{g_1(W_1)^2}\prod _{\begin{array}{c} p>D_0 \\ p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\Bigg (1-\frac{1}{(p-1)^2}\Bigg ) = \frac{A^2}{g_1(W_1)^2}(1+O(D_0^{-1})) \end{aligned}$$to get the form stated). We note for future reference that \(|C_W| \asymp (\log {D_0})^{3/2}.\)
-
(iv)
For \(Z_{N,W}^{(2)}\) recall from (5.5) the definition
$$\begin{aligned} Z_{N,W}^{(2)}=\sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}\sum _{p|[a,b]}g_6(p). \end{aligned}$$(B.10)We can write
$$\begin{aligned} Z_{N,W}^{(2)}&= \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}}g_6(q) \sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ q|[a,b] \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}} \\&= \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}}g_6(q) \Bigg [T_1+T_2-T_3\Bigg ], \end{aligned}$$where q is prime, and
$$\begin{aligned} T_1&= \sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ q|a \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}, \\ T_2&= \sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ q|b \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}, \\ T_3&= \sum _{\begin{array}{c} a,b\le v\\ (a,W)=(b,W)=1 \\ q|a,b \\ p|a,b\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a)\mu (b)g_4([a,b])}{g_2(a)g_2(b)[a,b]}\log {\frac{v}{a}}\log {\frac{v}{b}}. \end{aligned}$$\(T_1\) can be evaluated similarly to part (iii) to give
$$\begin{aligned} T_1&= \frac{C_W\beta _1(q)}{q}(\log {v/q})^{\frac{1}{2}}+O\Bigg (\frac{\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3}{q}\Bigg ), \end{aligned}$$where
$$\begin{aligned} \beta _1(q)&= \frac{\mu (q)g_4(q)\gamma _1(q)G_1(q)G_3(q)G_4(q)G_5(q)}{g_2(q)} = -\frac{q(4q^2-3q+1)}{2(q-1)(2q^2-2q+1)}. \end{aligned}$$\(T_2\) can be evaluated similarly. For \(T_3,\) write \(a=a'q, b=b'q\) then \([a,b]=[qa',qb']=q[a',b']\) so that
$$\begin{aligned} T_3 = \frac{\mu ^2(q)g_4(q)}{g_2(q)^2q} \sum _{\begin{array}{c} a',b'\le v/q\\ (a',qW)=(b',qW)=1 \\ p|a',b'\Rightarrow p\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\mu (a')\mu (b')g_4([a',b'])}{g_2(a')g_2(b')[a',b']}\log {\frac{v}{a'}}\log {\frac{v}{b'}}. \end{aligned}$$This is the same form as \(Z_{N,W}^{(1)}.\) By the same considerations as above we obtain
$$\begin{aligned} T_3&= \frac{C_W\beta _2(q)}{q}(\log {v/q})^{\frac{1}{2}}+O\Bigg (\frac{\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3}{q}\Bigg ), \end{aligned}$$where
$$\begin{aligned} \beta _2(q)&= \frac{\mu ^2(q)g_4(q)G_1(q)G_3(q)G_4(q)G_5(q)}{g_2(q)^2} = \frac{q^2(4q^2-3q+1)}{2(q-1)^2(2q^2-2q+1)}. \end{aligned}$$Thus we obtain
$$\begin{aligned} Z_{N,W}^{(2)} = C_{W}&\sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{2\beta _1(q)-\beta _2(q)}{q}g_6(q)(\log {(v/q)})^{\frac{1}{2}}\\&+O\Bigg (\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3\sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{g_6(q)}{q}\Bigg ), \end{aligned}$$where \(C_W\) is defined as in (B.9). Recalling the definition of \(g_6(q),\) we see the error term contributes
$$\begin{aligned} \ll \exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3\log {v}. \end{aligned}$$Now note that
$$\begin{aligned} \frac{2\beta _1(q)-\beta _2(q)}{q}g_6(q) = -\frac{(3q-2)(2q+1)\log {q}}{2(q+1)(2q^2-2q+1)} = -\frac{3\log {q}}{2q}+O\Bigg (\frac{\log {q}}{q^2}\Bigg ) \end{aligned}$$This error contributes
$$\begin{aligned} \ll C_W \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}}\frac{\log {q}}{q^2}(\log {v})^{\frac{1}{2}} \ll C_W (\log {v})^{\frac{1}{2}} \end{aligned}$$which is small. We are left with a main term
$$\begin{aligned} -\frac{3C_W}{2} \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q} \Bigg (\log {\frac{v}{q}}\Bigg )^{\frac{1}{2}}. \end{aligned}$$By partial summation one can show
$$\begin{aligned} \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q} \Bigg (\log {\frac{v}{q}}\Bigg )^{\frac{1}{2}} = \frac{1}{3}(\log {v})^{\frac{3}{2}}+O((\log {v})^{\frac{1}{2}}\log {D_0}), \end{aligned}$$so that
$$\begin{aligned} Z_{N,W}^{(2)} = -\frac{C_W}{2}(\log {v})^{\frac{3}{2}}+O(\exp (D_0^{1/4}\log \log {D_0})(\log \log {v})^3\log {v}). \end{aligned}$$This simplifies to the stated result.
\(\square \)
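We conclude with the verification, promised in part (iv) above, of the partial summation estimate (a sketch, using only Mertens' theorem in arithmetic progressions in the form \(\sum _{q\le t,\,q\equiv 1\,\,(\text {mod}\,\,4)}\frac{\log {q}}{q}=\frac{1}{2}\log {t}+O(1)\)). Partial summation gives
$$\begin{aligned} \sum _{\begin{array}{c} D_0<q\le v \\ q\equiv 1\,\,(\text {mod}\,\,4) \end{array}} \frac{\log {q}}{q} \Bigg (\log {\frac{v}{q}}\Bigg )^{\frac{1}{2}} = \frac{1}{2}\int _{\log {D_0}}^{\log {v}}(\log {v}-u)^{\frac{1}{2}}\,\mathrm {d}u+O((\log {v})^{\frac{1}{2}}\log {D_0}), \end{aligned}$$
and the integral equals \(\frac{2}{3}(\log {v}-\log {D_0})^{\frac{3}{2}} = \frac{2}{3}(\log {v})^{\frac{3}{2}}+O((\log {v})^{\frac{1}{2}}\log {D_0}),\) which gives the claim.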