1 Introduction

A quartic del Pezzo surface X over \(\mathbb {Q}\) is a smooth projective surface in \(\mathbb {P}^4\) cut out by a pair of quadrics defined over \(\mathbb {Q}\). When X contains a conic defined over \(\mathbb {Q}\) it may be equipped with a dominant \(\mathbb {Q}\)-morphism \(X\rightarrow \mathbb {P}^1\), all of whose fibres are conics, giving X the structure of a conic bundle surface. Let \(U\subset X\) be the Zariski open set obtained by deleting the 16 lines from X and consider the counting function

$$\begin{aligned} N(B) = \sharp \{ x \in U(\mathbb {Q}): H(x) \leqslant B \} , \end{aligned}$$

for \(B\geqslant 1\), where H is the standard height function on \(\mathbb {P}^{4}(\mathbb {Q})\). The Batyrev–Manin conjecture [13] predicts the existence of a constant \(c\geqslant 0\) such that \(N(B)\sim cB(\log B)^{\rho -1}\), as \(B\rightarrow \infty \), where \(\rho ={{\mathrm{rank}}}\, {{\mathrm{Pic}}}_{\mathbb {Q}}(X)\leqslant 6\). To date, as worked out by de la Bretèche and Browning [2], the only example for which this conjecture has been settled is the surface

$$\begin{aligned} x_0x_1-x_2x_3=x_0^2+x_1^2+x_2^2-x_3^2-2x_4^2=0, \end{aligned}$$

with Picard rank \(\rho =5\). For a general quartic del Pezzo surface the best upper bound we have is \(N(B) =O_{\varepsilon ,X}(B^{\frac{3}{2}+\varepsilon }),\) for any \(\varepsilon >0\), which appears in forthcoming work of Salberger.
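To make the counting function concrete, here is a minimal Python sketch that enumerates, by brute force, the rational points of height at most B on the surface displayed above. It is only an illustration under stated assumptions: it counts points on all of X rather than on U, since the explicit equations of the 16 lines are not recorded here, and the function names `on_surface` and `points_up_to_height` are ours. The search is exponential in the number of coordinates and is only feasible for very small B.

```python
from functools import reduce
from itertools import product
from math import gcd


def on_surface(x):
    """Check both defining quadrics of the surface at an integer point."""
    x0, x1, x2, x3, x4 = x
    return (x0 * x1 - x2 * x3 == 0
            and x0**2 + x1**2 + x2**2 - x3**2 - 2 * x4**2 == 0)


def points_up_to_height(B):
    """List points of X(Q) with H(x) <= B, one primitive representative each.

    Assumption: points lying on the 16 lines are NOT removed, so the length
    of the output is an upper bound for N(B), not N(B) itself.
    """
    points = []
    for x in product(range(-B, B + 1), repeat=5):
        if not any(x):
            continue  # the zero vector is not a projective point
        if reduce(gcd, x) != 1:
            continue  # keep primitive integer representatives only
        if next(c for c in x if c != 0) < 0:
            continue  # identify x with -x by fixing the leading sign
        if on_surface(x):
            points.append(x)
    return points
```

For instance, `points_up_to_height(1)` contains the point (1,1,1,1,1), which visibly satisfies both equations.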

In work presented at the conference “Higher dimensional varieties and rational points” at Budapest in 2001, Salberger noticed that one can get much better upper bounds for N(B) when X has a conic bundle structure over \(\mathbb {Q}\), ultimately showing that \(N(B)=O_{\varepsilon ,X}( B^{1 + \varepsilon })\), for all \(\varepsilon >0\). Leung [21] revisited Salberger’s argument to promote the \(B^\varepsilon \) to an explicit power of \(\log B\). On the other hand, recent work of Frei, Loughran and Sofos [15, Thm. 1.2] provides a lower bound for N(B) of the predicted order of magnitude for any quartic del Pezzo surface over \(\mathbb {Q}\) with a \(\mathbb {Q}\)-conic bundle structure and Picard rank \(\rho \geqslant 4\). (In fact they have results over any number field and for conic bundle surfaces of any degree.) Our main result goes further and shows that the expected upper and lower bounds can be obtained for any conic bundle quartic del Pezzo surface over \(\mathbb {Q}\).

Theorem 1.1

Let X be a quartic del Pezzo surface defined over \(\mathbb {Q}\), such that \(X(\mathbb {Q})\ne \varnothing \). If X contains a conic defined over \(\mathbb {Q}\) then there exist effectively computable constants \(c_1,c_2, B_0>0\), depending on X, such that for all \(B\geqslant B_0\) we have

$$\begin{aligned} c_1B (\log B)^{\rho -1} \leqslant N(B)\leqslant c_2 B (\log B)^{\rho -1}. \end{aligned}$$

It is worth emphasising that this appears to be the first time that sharp bounds are achieved towards the Batyrev–Manin conjecture for del Pezzo surfaces that are not necessarily rational over \(\mathbb {Q}\).

Let X be a quartic del Pezzo surface defined over \(\mathbb {Q}\), with a conic bundle structure \(\pi : X\rightarrow \mathbb {P}^1\). There are 4 degenerate geometric fibres of \(\pi \) and it follows from work of Colliot-Thélène [10] and Salberger [25], using independent approaches, that the Brauer–Manin obstruction is the only obstruction to the Hasse principle and weak approximation. Let \(\delta _0\leqslant \delta _1\leqslant 4\), where \(\delta _1\) is the number of closed points in \(\mathbb {P}^1\) above which \(\pi \) is degenerate and \(\delta _0\) is the number of these with split fibres. (Recall from [28, Def. 0.1] that a scheme over \(\mathbb {Q}\) is called split if it contains a non-empty geometrically integral open subscheme.) It follows from [15, Lemma 2.2] that

$$\begin{aligned} \rho =2+\delta _0. \end{aligned}$$
(1.1)

For comparison, Leung’s work [21, Chapter 4] establishes an upper bound for N(B) with the potentially larger exponent \(1+\delta _1\). This exponent agrees with the Batyrev–Manin conjecture if and only if \(X\rightarrow \mathbb {P}^1\) is a conic bundle with a section over \(\mathbb {Q}\), a hypothesis that our main result avoids.

Our proof of the upper bound makes essential use of [29], where detector functions are worked out for the fibres with \(\mathbb {Q}\)-rational points. Combining this with height machinery and a uniform estimate [7] for the number of rational points of bounded height on a conic, the problem is reduced to finding optimal upper bounds for divisor sums of the shape

$$\begin{aligned} \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}^2\\ \max \{|s|,|t|\}\leqslant x \end{array}} \prod _{i=1}^n \sum _{\begin{array}{c} d_i \mid \Delta _i(s,t) \end{array}} \left( \frac{G_i(s,t)}{d_i} \right) . \end{aligned}$$
(1.2)

Here, \(n=\delta _1\) and \(\Delta _1,\dots ,\Delta _n\in \mathbb {Z}[s,t]\) are the binary forms defining the closed points of \(\mathbb {P}^1\) above which \(\pi \) is degenerate, with \(G_1,\dots ,G_n\in \mathbb {Z}[s,t]\) being certain associated forms of even degree. Thus far, such sums have only been examined in the special case that \(G_1,\dots ,G_n\) all have degree zero. In this setting, work of de la Bretèche and Browning [1] can be invoked to yield the desired upper bound. Unfortunately, this result is no longer applicable when one of \(G_1,\dots ,G_n\) has positive degree.

Using [15], we shall see in Sect. 3 that our proof of the lower bound in Theorem 1.1 may proceed for surfaces \(X\rightarrow \mathbb {P}^1\) of Picard rank \(\rho =2\). In this case the fibre above any degenerate closed point of \(\mathbb {P}^1\) must be non-split by (1.1). Ultimately, following the strategy of [15], this leads to the problem of proving tight lower bounds for sums like (1.2) in the special case that none of the characters \((\frac{G_i(s,t)}{\cdot })\) are trivial. One of the key ingredients in this endeavour is a generalised Hooley \(\Delta \)-function. Let \(K/\mathbb {Q}\) be a number field and let \(\psi _K\) be a quadratic Dirichlet character on K. We define an arithmetic function on integral ideals of K via

$$\begin{aligned} \Delta (\mathfrak {a};\psi _K)= \sup _{\begin{array}{c} u \in \mathbb {R}\\ 0\leqslant v\leqslant 1 \end{array}} \Big | \sum _{\begin{array}{c} \mathfrak {d}\mid \mathfrak {a}\\ \mathrm {e}^u<\text {N}\,_K\mathfrak {d}\leqslant \mathrm {e}^{u+v} \end{array}} \psi _K(\mathfrak {d}) \Big |, \end{aligned}$$

for any ideal \(\mathfrak {a}\) in the ring of integers \(\mathfrak {o}_K\) of K, where \(\text {N}\,_K \) denotes the ideal norm. When \(K=\mathbb {Q}\) this recovers the twisted \(\Delta \)-function considered by de la Bretèche–Tenenbaum [3] and Brüdern [9]. Our treatment of the lower bound requires a second moment estimate for \(\Delta (\mathfrak {a};\psi _K)\) and this is supplied in a companion paper of Sofos [30].
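In the case \(K=\mathbb {Q}\) the definition can be evaluated directly, and the following Python sketch does so for small integers. It rests on an elementary observation: a window \((\mathrm {e}^u,\mathrm {e}^{u+v}]\) with \(0\leqslant v\leqslant 1\) captures exactly a run of consecutive divisors whose largest element is less than \(\mathrm {e}\) times its smallest. The choice of the non-principal character modulo 4 as a sample quadratic character, and the names `chi4` and `twisted_delta`, are assumptions made purely for illustration.

```python
from math import e


def chi4(d):
    """Non-principal Dirichlet character modulo 4 (a sample quadratic character)."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def twisted_delta(n, psi=chi4):
    """Delta(n; psi) for K = Q.

    Returns the largest |sum of psi(d)| over divisors d of n lying in a
    window (e^u, e^(u+v)] with 0 <= v <= 1.
    """
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    best = 0
    for i, low in enumerate(divisors):
        total = 0
        for high in divisors[i:]:
            # A window of logarithmic length at most 1 can capture exactly
            # the divisors from low to high precisely when high / low < e.
            if high >= e * low:
                break
            total += psi(high)
            best = max(best, abs(total))
    return best
```

Taking `psi` to be identically 1 gives the classical Hooley \(\Delta \)-function, for example `twisted_delta(12, lambda d: 1)` evaluates the window containing the divisors 2, 3, 4.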

Remark 1.2

Châtelet surfaces provide the other family of relatively minimal conic bundle surfaces of degree 4. When they are defined over \(\mathbb {Q}\), the Batyrev–Manin conjecture also makes a prediction for the distribution of \(\mathbb {Q}\)-rational points on them. Work of Browning [6] shows that the relevant counting function satisfies an upper bound of the expected size. Although we shall not provide any details here, if we suppose that the Châtelet surface has a \(\mathbb {Q}\)-rational point, then a lower bound of the proper size follows from the work in this paper, on taking the forms \(G_1,\dots ,G_n\) to have degree 0 in (1.2).

The main novelty in our work lies in how we overcome the difficulty of divisor sums involving characters without a fixed modulus in (1.2). In Sect. 2.2, drawing inspiration from recent work of Reuss [24], we replace the divisor functions at hand by generalised divisor functions which run over certain integral ideal divisors belonging to the number field obtained by adjoining a root of \(\Delta _i\), for each \(1\leqslant i\leqslant n\). Our proof of Theorem 1.1 then relies upon an extension to number fields of work by Nair and Tenenbaum [22] on short sums of non-negative arithmetic functions. This is achieved in an auxiliary investigation [8], the outcome of which is recorded in Sect. 2.1.

2 Preliminary results

2.1 Nair–Tenenbaum over number fields

Let \(K/\mathbb {Q}\) be a number field and let \(\mathfrak {o}_K\) be its ring of integers. Denote by \(\mathscr {I}_K\) the set of ideals in \(\mathfrak {o}_K\). We say that a function \(f:\mathscr {I}_K\rightarrow \mathbb {R}_{\geqslant 0}\) is pseudomultiplicative if there exist strictly positive constants \(A,B,\varepsilon \) such that

$$\begin{aligned} f(\mathfrak {a}\mathfrak {b}) \leqslant f(\mathfrak {a}) \min \{A^{\Omega _K(\mathfrak {b})}, B(\text {N}\,_K\mathfrak {b})^\varepsilon \}, \end{aligned}$$

for all coprime ideals \(\mathfrak {a},\mathfrak {b}\in \mathscr {I}_K\), where \( \Omega _{K}(\mathfrak {b})=\sum _{ \mathfrak {p}\mid \mathfrak {b}} \nu _\mathfrak {p}(\mathfrak {b}). \) We denote the class of all pseudomultiplicative functions associated to A, B and \(\varepsilon \) by \(\mathscr {M}_K=\mathscr {M}_K(A,B,\varepsilon )\). Note that any \(f\in \mathscr {M}_K\) satisfies the bounds \(f(\mathfrak {a}) \ll A^{\Omega _K(\mathfrak {a})}\) and \(f(\mathfrak {a}) \ll (\text {N}\,_K \mathfrak {a})^\varepsilon \), for any \(\mathfrak {a}\in \mathscr {I}_K\).
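As a sanity check on the definition, one can take \(K=\mathbb {Q}\) and f equal to the divisor function \(\tau \). Since \(\tau \) is multiplicative with \(\tau (b)\leqslant 2^{\Omega (b)}\) and \(\tau (b)\leqslant 2\sqrt{b}\), it should belong to \(\mathscr {M}_{\mathbb {Q}}(2,2,\frac{1}{2})\). The Python sketch below tests the defining inequality numerically on a finite range; the helper names are ours and a finite check is of course not a proof.

```python
from math import gcd


def big_omega(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)


def tau(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def is_pseudomultiplicative_on(f, A, B, eps, limit):
    """Check f(ab) <= f(a) * min(A^Omega(b), B * b^eps) for coprime a, b <= limit."""
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            if gcd(a, b) != 1:
                continue
            bound = f(a) * min(A ** big_omega(b), B * b ** eps)
            if f(a * b) > bound + 1e-9:  # small tolerance for float rounding
                return False
    return True
```

With these conventions `is_pseudomultiplicative_on(tau, 2, 2, 0.5, 40)` succeeds, whereas replacing A by 1 fails already at \(b=2\).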

We will need to work with functions supported away from ideals of small norm. To facilitate this, for any ideal \(\mathfrak {a}\in \mathscr {I}_K\) and \(W\in \mathbb {N}\), we set

$$\begin{aligned} \mathfrak {a}_W=\prod _{\begin{array}{c} \mathfrak {p}^\nu \Vert \mathfrak {a}\\ \gcd (\text {N}\,_K \mathfrak {p},W)=1 \end{array}} \mathfrak {p}^\nu . \end{aligned}$$
(2.1)

We extend this to rational integers in the obvious way. Similarly, for any \(f\in \mathscr {M}_K\), we define \( f_W(\mathfrak {a})=f(\mathfrak {a}_W).\)

Remark 2.1

We will always assume that W is of the form

$$\begin{aligned} W=\prod _{p\leqslant w} p^\nu , \end{aligned}$$
(2.2)

for some \(w>0\) and \(\nu \) a positive integer. Throughout Sect. 3 we shall take \(\nu \) to be a large constant depending only on various polynomials that are determined by X, while in Sect. 4 we shall take \(\nu =1\). In either case, writing \(\text {N}\,_K\mathfrak {p}=p^{f_\mathfrak {p}}\) for some \(f_\mathfrak {p}\in \mathbb {N}\), we have \(\gcd (\text {N}\,_K\mathfrak {p},W)=1\) if and only if \(p>w\). Our notation is reminiscent of the “W-trick” that appears in work of Green and Tao [16]. Whereas in their context it is important that the parameter w tends to infinity, in our setting we shall choose w to be a suitably large constant, where the meaning of “suitably large” is allowed to change at various points of the proof.
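For rational integers the operations in (2.1) and (2.2) are easy to carry out explicitly. The following Python sketch builds W from w and \(\nu \) and extracts the part of an integer that is coprime to W, which is the integer analogue of \(\mathfrak {a}_W\). The function names `build_W` and `coprime_part` are hypothetical and serve only as illustration.

```python
from math import gcd


def primes_up_to(w):
    """All primes p <= w, by trial division (adequate for small w)."""
    return [p for p in range(2, int(w) + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def build_W(w, nu=1):
    """W = product of p^nu over primes p <= w, as in (2.2)."""
    W = 1
    for p in primes_up_to(w):
        W *= p ** nu
    return W


def coprime_part(n, W):
    """Largest divisor of |n| coprime to W: the integer analogue of (2.1)."""
    n = abs(n)
    if n == 0:
        raise ValueError("n must be non-zero")
    g = gcd(n, W)
    while g > 1:
        n //= g          # strip one layer of the primes shared with W
        g = gcd(n, W)
    return n
```

For example, with \(w=5\) and \(\nu =1\) one gets \(W=30\), and the part of \(360\cdot 77\) coprime to W is 77. A function \(f_W\) is then obtained by composing f with `coprime_part`.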

Let

$$\begin{aligned} \mathscr {P}_K^\circ = \{\mathfrak {a}\subset \mathfrak {o}_K: \mathfrak {p}\mid \mathfrak {a}\Rightarrow f_{\mathfrak {p}}=1 \} \end{aligned}$$
(2.3)

be the multiplicative span of all prime ideals \(\mathfrak {p}\subset \mathfrak {o}_K\) with residue degree \(f_\mathfrak {p}=1\). For any \(x>0\) and \(f\in \mathscr {M}_K\) we set

$$\begin{aligned} E_{f}(x;W)=\exp \left( \sum _{\begin{array}{c} \mathfrak {p}\in \mathscr {P}_K^\circ \text { prime}\\ w<\text {N}\,_K \mathfrak {p}\leqslant x \\ f_\mathfrak {p}=1 \end{array}}\frac{f(\mathfrak {p})}{\text {N}\,_K\mathfrak {p}}\right) , \end{aligned}$$

if f is submultiplicative, and

$$\begin{aligned} E_{f}(x;W)= \sum _{\begin{array}{c} \text {N}\,_K\mathfrak {a}\leqslant x\\ \mathfrak {a}\in \mathscr {P}_K^\circ \text { square-free} \\ \gcd (\text {N}\,_K \mathfrak {a},W)=1 \end{array}}\frac{f(\mathfrak {a})}{\text {N}\,_K\mathfrak {a}}, \end{aligned}$$

otherwise.

Suppose now that we are given irreducible binary forms \(F_1,\dots ,F_N\in \mathbb {Z}[x,y]\), which we assume to be pairwise coprime. Let \(i\in \{1,\dots ,N\}\). Suppose that \(F_i\) has degree \(d_i\) and that it is not proportional to y, so that \(b_i=F_i(1,0)\) is a non-zero integer. It will be convenient to form the homogeneous polynomial

$$\begin{aligned} \tilde{F}_i(x,y)=b_i^{d_i-1}F_i(b_i^{-1}x,y). \end{aligned}$$
(2.4)

This has integer coefficients and satisfies \(\tilde{F}_i(1,0)=1\). We let \(\theta _i\) be a root of the monic polynomial \(\tilde{F}_i(x,1)\). Then \(\theta _i\) is an algebraic integer and we denote the associated number field of degree \(d_i\) by \(K_i=\mathbb {Q}(\theta _i)\). Moreover,

$$\begin{aligned} N_{K_i/\mathbb {Q}}(b_is-\theta _it)=\tilde{F}_i(b_is,t)=b_i^{d_i-1}F_i(s,t), \end{aligned}$$

for any \((s,t)\in \mathbb {Z}^2\). If \(b_i=0\), so that \(F_i(x,y)=c y\) for some non-zero \(c \in \mathbb {Z}\), we take \(\theta _i=-c\) and \(K_i=\mathbb {Q}\) in this discussion. Our work on Theorem 1.1 requires tight upper bounds for averages of \(f_{1,W}((b_1s-\theta _1 t))\dots f_{N,W}((b_Ns-\theta _N t))\), over primitive vectors \((s,t)\in \mathbb {Z}^2\), for general pseudomultiplicative functions \(f_i\in \mathscr {M}_{K_i}\) and suitably large w.
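The norm identity above can be checked numerically without constructing the number field. In the Python sketch below, the norm of \(b_is-\theta _it\) is computed as the determinant of \(b_is\cdot I-t\cdot C\), where C is the companion matrix of \(\tilde{F}_i(x,1)\), which represents multiplication by \(\theta _i\) when \(F_i\) is irreducible. This is only a sketch: the function names are ours, irreducibility is assumed rather than tested, and the sample forms in the usage note are hypothetical.

```python
from fractions import Fraction


def monic_twist(coeffs):
    """Coefficients of F~(x,y) = b^(d-1) F(x/b, y).

    coeffs lists the coefficients of F at x^d, x^(d-1) y, ..., y^d,
    and b = coeffs[0] is assumed non-zero.
    """
    b = coeffs[0]
    return [1] + [a * b ** (j - 1) for j, a in enumerate(coeffs) if j >= 1]


def evaluate_form(coeffs, x, y):
    """Evaluate the binary form with the given coefficients at (x, y)."""
    d = len(coeffs) - 1
    return sum(a * x ** (d - j) * y ** j for j, a in enumerate(coeffs))


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    n = len(matrix)
    A = [[Fraction(v) for v in row] for row in matrix]
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            result = -result  # a row swap flips the sign
        result *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return result


def norm_of_linear_form(coeffs, s, t):
    """N_{K/Q}(b s - theta t), computed as det(b s I - t C)."""
    b, d = coeffs[0], len(coeffs) - 1
    monic = monic_twist(coeffs)            # 1, c_1, ..., c_d
    C = [[0] * d for _ in range(d)]        # companion matrix of F~(x, 1)
    for i in range(1, d):
        C[i][i - 1] = 1
    for i in range(d):
        C[i][d - 1] = -monic[d - i]
    M = [[(b * s if i == j else 0) - t * C[i][j] for j in range(d)]
         for i in range(d)]
    return determinant(M)
```

For the sample form \(2x^2+xy+3y^2\) one finds \(\tilde{F}(x,y)=x^2+xy+6y^2\), and `norm_of_linear_form([2, 1, 3], s, t)` agrees with \(2F(s,t)\) for every integer pair tried.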

For any \(k\in \mathbb {N}\) and any polynomial \(P\in \mathbb {Z}[x]\), we set

$$\begin{aligned} \rho _{P}(k)=\sharp \{x\,({{\mathrm{mod}}}{\,k}) : P(x)\equiv 0 \,({{\mathrm{mod}}}{\,k})\}. \end{aligned}$$
(2.5)

Let \(\overline{\rho }_i(k)= \rho _{F_i(x,1)}(k)\) if \(F_i(1,0)\ne 0\) and \(\overline{\rho }_i(k)=1\) if \(F_i(1,0)= 0\). Moreover, put

$$\begin{aligned} h^*(k)=\prod _{p\mid k} \left( 1-\frac{\overline{\rho }_1(p)+\dots +\overline{\rho }_N(p)}{p+1}\right) ^{-1}. \end{aligned}$$
(2.6)
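Both (2.5) and (2.6) are straightforward to evaluate for small moduli. The Python sketch below counts roots by exhaustive search and assembles \(h^*(k)\) as an exact rational number. It assumes that every form satisfies \(F_i(1,0)\ne 0\), so that \(\overline{\rho }_i=\rho _{F_i(x,1)}\), and it represents each \(F_i(x,1)\) by a Python callable; these conventions and the function names are ours.

```python
from fractions import Fraction


def rho(poly, k):
    """Number of residues x mod k with poly(x) = 0 mod k, as in (2.5)."""
    return sum(1 for x in range(k) if poly(x) % k == 0)


def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def h_star(polys, k):
    """h^*(k) from (2.6) for the dehomogenised forms F_i(x, 1) in polys.

    Raises ZeroDivisionError if the root counts at some prime p sum to p + 1,
    which can only happen for small primes.
    """
    value = Fraction(1)
    for p in prime_divisors(k):
        total = sum(rho(P, p) for P in polys)
        value /= 1 - Fraction(total, p + 1)
    return value
```

For the single polynomial \(x^2+1\) there are two roots modulo 5, so that \(h^*(5)=(1-\frac{2}{6})^{-1}=\frac{3}{2}\).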

To any non-empty bounded measurable region \(\mathscr {R} \subset \mathbb {R}^2\), we associate

$$\begin{aligned} K_\mathscr {R}=1+ \Vert \mathscr {R}\Vert _{\infty } +\partial (\mathscr {R}) \log (1+\Vert \mathscr {R}\Vert _\infty ) +\frac{\mathrm {vol}(\mathscr {R})}{1+\Vert \mathscr {R}\Vert _\infty } ,\end{aligned}$$

where \(\Vert \mathscr {R}\Vert _\infty = \sup _{(x,y) \in \mathscr {R}}\max \{ |x|,|y|\}\). We say that such a region \(\mathscr {R}\) is regular if its boundary is piecewise differentiable, \(\mathscr {R}\) contains no zeros of \(F_1\cdots F_N\) and there exists \(c_1>0\) such that \(\mathrm {vol}(\mathscr {R})\geqslant K_\mathscr {R}^{c_1}\). Bearing all of this in mind, the following result is [8, Thm. 1.1].

Lemma 2.2

Let \(\mathscr {R} \subset \mathbb {R}^2\) be a regular region, let \(V=\mathrm {vol}(\mathscr {R})\) and let \(G \subset \mathbb {Z}^2\) be a lattice of full rank, with determinant \(q_G\) and first successive minimum \(\lambda _G\). Assume that \(q_G \leqslant V^{c_2}\) for some \(c_2>0\). Let \(f_i \in \mathscr {M}_{K_i}(A_i,B_i,\varepsilon _i )\) for \(1\leqslant i\leqslant N\) and let

$$\begin{aligned}\varepsilon _0= \max \bigg \{1+\frac{4}{c_1},\frac{4(5+3 \max \{\varepsilon _1,\dots ,\varepsilon _N\}) }{c_1}\bigg \} \left( \sum _{i=1}^N d_i \varepsilon _i \right) . \end{aligned}$$

Then, for any \(\varepsilon >0\) and \(w>w_0(f_i,F_i,N)\), we have

$$\begin{aligned} \sum _{(s,t) \in \mathbb {Z}_{\text {prim}}^2 \cap \mathscr {R}\cap G} \prod _{i=1}^N f_{i,Wq_G}((b_i s -\theta _i t)) \ll&\frac{V}{(\log V)^N} \frac{h_W^*(q_G) }{q_G} \prod _{i=1}^N E_{f_i}(V;W)\\&\quad +\frac{K_{\mathscr {R}}^{1+\varepsilon _0+\varepsilon }}{\lambda _G} , \end{aligned}$$

where the implied constant depends at most on \(c_1,c_2,A_i,B_i,F_i, \varepsilon , \varepsilon _i, N,W\).

Let \(1\leqslant i\leqslant N\). In the statement of this result we recall the convention that the function \(f_{i,W q_G}\) is defined in such a way that \(f_{i,Wq_G}(\mathfrak {a})=f_{i}(\mathfrak {a}_{W q_G})\) for any integral ideal \(\mathfrak {a}\subset \mathfrak {o}_{K_i}\), where

$$\begin{aligned} \mathfrak {a}_{Wq_G}= \prod _{\begin{array}{c} \mathfrak {p}^\nu \Vert \mathfrak {a}\\ \gcd (\text {N}\,_K \mathfrak {p},W)=1 \\ \gcd (\text {N}\,_K \mathfrak {p},q_G)=1 \end{array}} \mathfrak {p}^\nu . \end{aligned}$$

2.2 Divisor sums over number fields

Let \(K/\mathbb {Q}\) be a finite extension of degree d. We write \(\mathfrak {o}=\mathfrak {o}_K\) and \(\text {N}\,=\text {N}\,_K\) for the ring of integers and ideal norm, respectively. Let \(\sigma _1,\dots ,\sigma _d:K \hookrightarrow \mathbb {C}\) be the associated embeddings and let \(\{\omega _1,\dots ,\omega _d\}\) be a \(\mathbb {Z}\)-basis for \(\mathfrak {o}\). Let \(\mathfrak {a}\subset \mathfrak {o}\) be an integral ideal with \(\mathbb {Z}\)-basis \(\{\alpha _1,\dots ,\alpha _d\}\). We henceforth set \(\Delta (\alpha _1,\dots ,\alpha _d)=|\det (\sigma _i(\alpha _j))|^2\), and similarly for \(\{\omega _1,\dots ,\omega _d\}\). According to [20, Satz 103], we have

$$\begin{aligned} \Delta (\alpha _1,\dots ,\alpha _d)=(\text {N}\,\mathfrak {a})^2 D_K, \end{aligned}$$
(2.7)

where \(D_K=\Delta (\omega _1,\dots ,\omega _d)\) is the discriminant of K.

Let \(F,G\in \mathbb {Z}[x,y]\) be non-zero binary forms with F irreducible, G of even degree and non-zero resultant \(\mathrm {Res}(F,G)\). We shall assume that F has degree d and that it is not proportional to y. In particular \(b=F(1,0)\) is a non-zero integer. Let \(W\in \mathbb {N}\). For any \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\) such that \(F(s,t)\ne 0\), we define

$$\begin{aligned} h_W(s,t)=\sum _{\begin{array}{c} k \mid F(s,t)\\ \gcd (k,W)=1 \end{array}}\left( \frac{G(s,t)}{k}\right) . \end{aligned}$$
(2.8)

This is a modified version of the functions that appear in (1.2). We recall from (2.4) the associated binary form \(\tilde{F}(x,y)=b^{d-1}F(b^{-1}x,y)\), with integer coefficients and \(\tilde{F}(1,0)=1\). We conclude that for all non-zero integer multiples c of b, we have

$$\begin{aligned} h_{cW}(s,t)=\sum _{\begin{array}{c} k \mid \tilde{F}(bs,t)\\ \gcd (k,cW)=1 \end{array}}\left( \frac{G(s,t)}{k}\right) , \end{aligned}$$

since \(k\mid \tilde{F}(bs,t) \) if and only if \(k\mid F(s,t)\).
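The function \(h_W\) in (2.8) can be computed directly for small arguments. The Python sketch below interprets the symbol \((\frac{G(s,t)}{k})\) as a Jacobi symbol, which requires k to be odd and positive. We therefore assume that W is even, as is the case in (2.2) once \(w\geqslant 2\). The forms are passed as Python callables, the function names are ours, and the sample forms in the usage note are chosen purely for illustration.

```python
from math import gcd


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, via quadratic reciprocity."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):   # (2/n) = -1 exactly when n = 3, 5 mod 8
                result = -result
        a, n = n, a               # reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def h_W(F, G, s, t, W):
    """Sum of (G(s,t)/k) over positive divisors k of F(s,t) coprime to W."""
    if W % 2 != 0:
        raise ValueError("W is assumed even, so that every k is odd")
    value = abs(F(s, t))
    if value == 0:
        raise ValueError("F(s,t) must be non-zero")
    g = G(s, t)
    return sum(jacobi(g, k) for k in range(1, value + 1)
               if value % k == 0 and gcd(k, W) == 1)
```

Taking \(F(s,t)=s^2+2t^2\) and G equal to the constant \(-1\), the point \((s,t)=(1,1)\) gives \(F(1,1)=3\) and \(h_2(1,1)=1+(\frac{-1}{3})=0\).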

We henceforth let \(\theta \) be a root of the polynomial \(f(x)=\tilde{F}(x,1)\). Then \(\theta \) is an algebraic integer and \(K=\mathbb {Q}(\theta )\) is a number field of degree d over \(\mathbb {Q}\). It follows that \(\mathbb {Z}[\theta ]\subset \mathfrak {o}\) is an order of K with discriminant \(\Delta _\theta =\Delta (1,\theta ,\dots ,\theta ^{d-1})\). In view of (2.7) we have

$$\begin{aligned} \Delta _\theta =[\mathfrak {o}:\mathbb {Z}[\theta ]]^2D_K. \end{aligned}$$
(2.9)

We now let \(L=K(\sqrt{g(\theta )})\), where \(g(x)=G(b^{-1}x,1)\in \mathbb {Q}[x]\). We shall assume that \(L/K\) is a quadratic extension and we let \(D_{L/K}\) be the ideal norm of the relative discriminant \(\mathfrak {D}_{L/K}\). Let \({\mathfrak {f}}={\mathfrak {f}}_{L/K}\) be the conductor of the extension \(L/K\). Let \(J^{\mathfrak {f}}\) be the group of fractional ideals in K coprime to \({\mathfrak {f}}\) and let \(P^{\mathfrak {f}}\) be the group of principal ideals (a) such that \(a\equiv 1\,({{\mathrm{mod}}}{\,{\mathfrak {f}}})\) and a totally positive. As explained by Neukirch [23, §VII.10], the Artin symbol \(\psi (\mathfrak {a})=(\frac{L/K}{\mathfrak {a}})\) gives rise to a character \( \psi :J^{\mathfrak {f}}/P^{\mathfrak {f}}\rightarrow \{\pm 1\} \) of the ray class group \(J^{\mathfrak {f}}/P^{\mathfrak {f}}\), with \(\mathfrak {a}\,({{\mathrm{mod}}}{\,P^{{\mathfrak {f}}}})\mapsto (\frac{L/K}{\mathfrak {a}})\). This has the property that \(\psi (\mathfrak {p})=1\) if and only if \(\mathfrak {p}\) splits in L, for any unramified prime ideal \({\mathfrak {p}}\in J^{\mathfrak {f}}\).

Let

$$\begin{aligned} D=2b D_{L/K} \Delta _\theta \text {N}\,{\mathfrak {f}}. \end{aligned}$$
(2.10)

Note that D is a non-zero integer. Recall the definition (2.3) of \(\mathscr {P}_K^\circ \) of the multiplicative span of degree 1 prime ideals. We shall mainly work with the subset

$$\begin{aligned} \mathscr {P}_K= \{\mathfrak {a}\in \mathscr {P}_K^\circ : \mathfrak {p}_1\mathfrak {p}_2\mid \mathfrak {a}\Rightarrow \text {N}\,_K\mathfrak {p}_1\ne \text {N}\,_K\mathfrak {p}_2 \text { or } \mathfrak {p}_1= \mathfrak {p}_2 \} \end{aligned}$$
(2.11)

cut out by ideals divisible by at most one prime ideal above each rational prime. It is not hard to see that \(\mathscr {P}_K\) has positive density in \(\mathscr {I}_K\). The proof of the following result is inspired by an argument found in recent work of Reuss [24, Lemma 4].

Lemma 2.3

Let \(W\in \mathbb {N}\), let \((s,t)\in \mathbb {Z}_{\text {prim}}^2\) such that \(F(s,t) \ne 0\), and let D be given by (2.10). Then the following hold:

  (i) \(\mathfrak {a}\in \mathscr {P}_K\) for any integral ideal \(\mathfrak {a}\mid (bs-\theta t)\) such that \(\gcd (\text {N}\,\mathfrak {a},DW)=1\);

  (ii) there exists a bijection between divisors \(\mathfrak {a}\mid (bs-\theta t)\) with \(\text {N}\,\mathfrak {a}=k\) coprime to DW and divisors \(k\mid \tilde{F}(bs,t)\) coprime to DW, in which \(\Omega (k)=\Omega _K(\mathfrak {a})\) and \((\frac{G(s,t)}{k})=\psi (\mathfrak {a})\);

  (iii) we have

    $$\begin{aligned} h_{DW}(s,t)=\sum _{\begin{array}{c} \mathfrak {a}\mid (bs-\theta t)\\ \gcd (\text {N}\,\mathfrak {a},DW)=1 \end{array}} \psi (\mathfrak {a}). \end{aligned}$$

In particular, when \(G(s,t)\) is the constant polynomial 1 in (2.8), then \(L=K\) and \(\psi \) is just the trivial character in part (iii). We note that \(\Omega _K(\mathfrak {a})=\Omega (\text {N}\,\mathfrak {a})\) and \(\tau _K(\mathfrak {a})=\tau (\text {N}\,\mathfrak {a})\) for any ideal \(\mathfrak {a}\in {\mathscr {P}}_K\), where \(\tau _K(\mathfrak {a})=\sum _{\mathfrak {d}\mid \mathfrak {a}}1\). Similarly, if \(h:\mathbb {N}\rightarrow \mathbb {R}_{\geqslant 0}\) is any arithmetic function, we have

$$\begin{aligned} \prod _{\mathfrak {p}\mid \mathfrak {a}} \left( 1+h(\text {N}\,\mathfrak {p})\right) =\prod _{p\mid \text {N}\,\mathfrak {a}} \left( 1+h(p)\right) , \end{aligned}$$

for any \(\mathfrak {a}\in \mathscr {P}_K\). We shall use these facts without further comment in the remainder of the paper.

Proof of Lemma 2.3

Let \((s,t)\in \mathbb {Z}_{\text {prim}}^2\) such that \(F(s,t) \ne 0\). We form the integral ideal \(\mathfrak {n}=(bs-\theta t).\) This has norm \(\text {N}\,\mathfrak {n}=|\tilde{F}(bs,t)|.\) Let \(k\mid \tilde{F}(bs,t)\) with \(\gcd (k,DW)=1\). In particular \(\gcd (k,\Delta _\theta )=1\).

Part (i) is proved in [8, Lemma 2.3]. Turning to part (ii), it follows from (i) that \((p,\mathfrak {n})\) is a prime ideal for any \(p\mid k\). Thus there is a bijection between each factorisation \(|\tilde{F}(bs,t)|=ke\), with \(\gcd (k,DW)=1\), and each ideal factorisation \(\mathfrak {n}={\mathfrak {a}}\mathfrak {b}\), with \(\text {N}\,\mathfrak {a}=k\) coprime to DW and \(\text {N}\,\mathfrak {b}=e\). In order to complete the proof of part (ii) of the lemma, it will suffice to show that

$$\begin{aligned} \left( \frac{G(s,t)}{p}\right) =\psi ({\mathfrak {p}}), \end{aligned}$$

where \(\mathfrak {p}=(p,\mathfrak {n})\). Since G has even degree we have

$$\begin{aligned} \left( \frac{G(s,t)}{p}\right) =\left( \frac{G(s\overline{t},1)}{p}\right) . \end{aligned}$$

Recall the notation \(g(x)=G(b^{-1}x,1)\). We may suppose that \(\mathfrak {p}=(p,\theta -n)\), for some \(n\in \mathbb {Z}/p\mathbb {Z}\) such that \(bs\overline{t}-n\equiv 0\,({{\mathrm{mod}}}{\,p})\), and we recall from (2.10) that \(p\not \mid 2D_{L/K}\). We observe that \({\mathfrak {p}}\) splits in \(L=K(\sqrt{g(\theta )})\) if and only if g(n) is a square in \(\mathfrak {o}/\mathfrak {p}\), since \(g(\theta )\equiv g(n)\,({{\mathrm{mod}}}{\,\mathfrak {p}}).\) But this is if and only if

$$\begin{aligned} \left( \frac{g(bs\overline{t})}{p}\right) =1, \end{aligned}$$

since \(n\equiv bs\overline{t}\,({{\mathrm{mod}}}{\,p})\) and \(\text {N}\,\mathfrak {p}=p\). Noting that \(g(bs\overline{t})=G(s\overline{t},1)\), this completes the proof of part (ii). Finally, part (iii) follows from part (ii). \(\square \)

We close this section with an observation about the condition \({\mathfrak {a}} \mid (b s-\theta t)\) that appears in Lemma 2.3, the proof of which is found in [8, Lemma 2.4].

Lemma 2.4

Let \(\mathfrak {a}\in \mathscr {P}_K\) such that \(\gcd (\text {N}\,\mathfrak {a},D_K)=1\). Then there exists \(k=k(\mathfrak {a})\in \mathbb {Z}\) such that \( \mathfrak {a}\mid (bs-\theta t) \Leftrightarrow bs\equiv k t\,({{\mathrm{mod}}}{\,\text {N}\,\mathfrak {a}}) \), for all \((s,t)\in \mathbb {Z}^2\).
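The simplest instance of Lemma 2.4 can be seen at the level of a single rational prime p not dividing b: for primitive \((s,t)\), one has \(p\mid \tilde{F}(bs,t)\) if and only if \(bs\equiv nt \,({{\mathrm{mod}}}{\,p})\) for some root n of \(\tilde{F}(x,1)\) modulo p. The Python sketch below checks this equivalence by exhaustive search. It is an illustration only: it does not construct the ideals \(\mathfrak {a}\), and the identification of \(k(\mathfrak {a})\) with such a root for a degree 1 prime ideal is our reading of the proof of Lemma 2.3 rather than a statement taken from [8].

```python
from math import gcd


def roots_mod_p(monic_coeffs, p):
    """Roots modulo p of the monic polynomial with the given coefficients."""
    d = len(monic_coeffs) - 1
    return [n for n in range(p)
            if sum(c * n ** (d - j) for j, c in enumerate(monic_coeffs)) % p == 0]


def divisibility_matches_congruence(monic_coeffs, b, p, bound):
    """Check p | F~(bs,t)  <=>  bs = n t mod p for some root n.

    Runs over primitive (s,t) with |s|, |t| <= bound and assumes p does
    not divide b.
    """
    if b % p == 0:
        raise ValueError("p is assumed not to divide b")
    d = len(monic_coeffs) - 1
    roots = roots_mod_p(monic_coeffs, p)
    for s in range(-bound, bound + 1):
        for t in range(-bound, bound + 1):
            if gcd(s, t) != 1:
                continue
            value = sum(c * (b * s) ** (d - j) * t ** j
                        for j, c in enumerate(monic_coeffs))
            divisible = value % p == 0
            congruent = any((b * s - n * t) % p == 0 for n in roots)
            if divisible != congruent:
                return False
    return True
```

For \(\tilde{F}(x,y)=x^2+y^2\), \(b=1\) and \(p=5\) the roots are 2 and 3, and the check passes for all primitive pairs in the range tested.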

2.3 Uniform upper bounds for conics

Let \(Q\in \mathbb {Z}[y_1,y_2,y_3]\) be a non-singular isotropic quadratic form. Denote its discriminant by \(\Delta _Q\) and the greatest common divisor of the \(2\times 2\) minors of the associated matrix by \(D_Q\). It follows from [26, §IV.2] that there is a quadratic Dirichlet character \(\chi _Q\) such that

$$\begin{aligned} \sharp \{\mathbf {y}\,({{\mathrm{mod}}}{\,p}):Q(\mathbf {y})\equiv 0 \,({{\mathrm{mod}}}{\,p}), ~p\not \mid \mathbf {y}\} =p(p-1)\left( 1+\chi _Q(p)\right) +p-1, \end{aligned}$$

for any prime p such that \(p\mid \Delta _Q\) and \(p\not \mid 2 D_Q\).
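This point count is easy to confirm by brute force in the diagonal case. The Python sketch below takes \(Q=a_1y_1^2+a_2y_2^2+py_3^2\) with \(p\not \mid 2a_1a_2\), for which \(p\mid \Delta _Q\) and \(p\not \mid 2D_Q\), and uses \(\chi _Q(p)=(\frac{-a_1a_2}{p})\), the value that appears for diagonal forms in the proof of Lemma 2.5. The function names are ours.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def count_nonzero_solutions(a1, a2, p):
    """Count non-zero y mod p with a1*y1^2 + a2*y2^2 + p*y3^2 = 0 mod p."""
    return sum(1
               for y1 in range(p) for y2 in range(p) for y3 in range(p)
               if (y1, y2, y3) != (0, 0, 0)
               and (a1 * y1 * y1 + a2 * y2 * y2 + p * y3 * y3) % p == 0)


def predicted_count(a1, a2, p):
    """The value p(p-1)(1 + chi_Q(p)) + p - 1 with chi_Q(p) = (-a1*a2 / p)."""
    chi = legendre(-a1 * a2, p)
    return p * (p - 1) * (1 + chi) + p - 1
```

For \(a_1=a_2=1\) and \(p=5\) both functions return 44, while for \(p=3\) they return 2, reflecting \(\chi _Q(3)=-1\).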

The main aim of this section is to establish the following result.

Lemma 2.5

Let \(w,B_1,B_2,B_3>0\) be given. Then

$$\begin{aligned} \sharp \left\{ \mathbf {y}\in \mathbb {Z}_{\mathrm {prim}}^3: Q(\mathbf {y})=0, ~|y_i|\leqslant B_i \right\} \ll C(Q,w) \left( 1+\frac{\left( B_1B_2B_3\right) ^{\frac{1}{3}}D_Q^{\frac{1}{2}}}{|\Delta _Q|^{\frac{1}{3}} }\right) , \end{aligned}$$

with an absolute implied constant, where

$$\begin{aligned} C(Q,w)= \prod _{\begin{array}{c} p^\xi \Vert \Delta _Q \\ p \mid 2 D_Q \text { or } p\leqslant w \end{array}} \tau (p^\xi ) \prod _{\begin{array}{c} p^\xi \Vert \Delta _Q \\ p>w \\ p\not \mid 2D_Q \end{array}} \left( \sum _{k=0}^\xi \chi _Q(p)^k \right) . \end{aligned}$$

Since \(C(Q,w)\leqslant \tau (\Delta _Q)\), this result is a refinement of work due to Browning and Heath-Brown [7, Cor. 2]. In fact, although not needed here, one can show that for any prime \(p\not \mid 2D_Q\), the p-adic factor appearing above is commensurate with the p-adic Hardy–Littlewood density for the conic \(Q=0\). Furthermore, if this curve has no \(\mathbb {Q}_p\)-points for some prime \(p\not \mid 2D_Q\), then the constant in the upper bound vanishes. Therefore, Lemma 2.5 detects conics with a rational point. This is the point of view adopted in the work of Sofos [29].

Proof of Lemma 2.5

The proof of [7, Cor. 2] relies on earlier work of Heath-Brown [17, Thm. 2]. The latter work produces an upper bound for the number of lattices (with determinant depending on the coefficients of Q) that any non-trivial zero of Q is constrained to lie in. For each prime p such that \(p^\xi \Vert \Delta _Q\), it turns out that there are at most \(L(p^\xi )\leqslant c_p\tau (p^\xi )\) lattices to consider, where \(c_p=1\) for \(p>2\).

Suppose that \(\mathbf {y}\in \mathbb {Z}_{\mathrm {prim}}^3\) is a non-zero vector for which \(Q(\mathbf {y})=0\). Let p be a prime such that \(p^\xi \Vert \Delta _Q\), with \(p\not \mid 2 D_Q\) and \(\chi _Q(p)=-1\). On diagonalising over \(\mathbb {Z}/p^{\xi +1}\mathbb {Z}\), we may assume that

$$\begin{aligned} a_1y_1^2+a_2y_2^2+p^\xi y_3^2\equiv 0 \,({{\mathrm{mod}}}{\,p^{\xi +1}}), \end{aligned}$$

for coefficients \(a_1,a_2\in \mathbb {Z}\) such that \(p\not \mid a_1a_2\). In particular, we have \(\chi _Q(p)=(\frac{-a_1a_2}{p})=-1\). Hence \(L(p^\xi )=1\) when \(\xi \) is even, since then \(\mathbf {y}\) is merely constrained to lie on the lattice \(\{\mathbf {y}\in \mathbb {Z}^3: y_1\equiv y_2\equiv 0\,({{\mathrm{mod}}}{\,p^{\xi /2}})\}\). Likewise, when \(\xi \) is odd, there can be no solutions in primitive integers \(\mathbf {y}\).

Note that

$$\begin{aligned} \sum _{k=0}^\xi \chi _Q(p)^k={\left\{ \begin{array}{ll} \tau (p^\xi ) &{} \quad \text { if } \,\, \chi _Q(p)=1,\\ 1 &{} \quad \text { if } \,\, \chi _Q(p)=-1 \hbox { and } \xi \hbox { is even},\\ 0 &{}\quad \text { if } \,\, \chi _Q(p)=-1 \hbox { and } \xi \hbox { is odd.} \end{array}\right. } \end{aligned}$$

It follows that the total number of lattices emerging is

$$\begin{aligned}&\ll \mathbf {1}(\Delta _Q) \prod _{\begin{array}{c} p^\xi \Vert \Delta _Q \\ p | 2 D_Q \end{array}} \tau (p^\xi ) \prod _{\begin{array}{c} p^\xi \Vert \Delta _Q \\ p \leqslant w \\ p\not \mid 2 D_Q \end{array}} \tau (p^\xi ) \prod _{\begin{array}{c} p^\xi \Vert \Delta _Q \\ \chi _Q(p)=1 \\ p>w \\ p\not \mid 2D_Q \end{array}} \tau (p^\xi ) =C(Q,w), \end{aligned}$$

where \(\mathbf {1}(\Delta _Q)=0\) (resp. \(\mathbf {1}(\Delta _Q)=1\)) if there exists \(p^\xi \Vert \Delta _Q\) such that \(\chi _Q(p)=-1\), with \(\xi \) odd and \(p\not \mid 2D_Q\) (resp. otherwise). This completes the proof of the lemma. \(\square \)
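The constant \(C(Q,w)\) is simple to evaluate once \(\Delta _Q\), \(D_Q\) and the values \(\chi _Q(p)\) are known. The Python sketch below does this, taking the character as a user-supplied callable. It makes no attempt to compute \(\chi _Q\) from Q, and the sample character values in the usage note are hypothetical.

```python
def factorise(n):
    """Prime factorisation of n >= 1 as a dict {p: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def local_factor(chi_p, xi):
    """Sum of chi_p^k for 0 <= k <= xi."""
    return sum(chi_p ** k for k in range(xi + 1))


def conic_constant(delta, D, w, chi):
    """C(Q, w) for Delta_Q = delta, D_Q = D and chi(p) = chi_Q(p)."""
    value = 1
    for p, xi in factorise(abs(delta)).items():
        if (2 * D) % p == 0 or p <= w:
            value *= xi + 1                    # tau(p^xi)
        else:
            value *= local_factor(chi(p), xi)  # 0 signals no primitive zeros
    return value
```

With \(\Delta _Q=3^2\cdot 5\cdot 7\), \(D_Q=1\) and hypothetical values \(\chi _Q(3)=-1\), \(\chi _Q(5)=1\), \(\chi _Q(7)=-1\), taking \(w=2\) gives \(C(Q,w)=0\) because 7 divides \(\Delta _Q\) to an odd power, whereas \(w=10\) gives \(\tau (\Delta _Q)=12\).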

2.4 Lattice point counting

We will need general results about counting lattice points in an expanding region. Let \(\mathscr {D}\subset \mathbb {R}^2\setminus \{\mathbf {0}\}\) be a non-empty open disc and put \(\delta (\mathscr {D})=\Vert \mathscr {D}\Vert _\infty \), in the notation of Sect. 2.1. Let \(b,c,q \in \mathbb {Z}\) and \(\mathbf {x}_0\in \mathbb {Z}^2\) such that \(q\geqslant 1\) and \(\gcd (\mathbf {x}_0,q)=1\). For each \(e\in \mathbb {N}\) such that \(\gcd (e,q)=\gcd (b,c,e)=1\), we define the non-empty set

$$\begin{aligned} \Lambda (e)=\{(s,t)\in \mathbb {Z}^2:bs\equiv ct \left( \text {mod}\ e\right) \}. \end{aligned}$$

We then fix, once and for all, a non-zero vector of minimal Euclidean length within \(\Lambda (e)\) and we call it \(\mathbf {v}(e)\). We are interested in

$$\begin{aligned} N(x)=\sharp \{\mathbf {x}\in \mathbb {Z}_{\text {prim}}^2 \cap x\mathscr {D}\cap \Lambda (e):\mathbf {x}\equiv \mathbf {x}_0\left( \text {mod}\ q\right) \}, \end{aligned}$$

as \(x\rightarrow \infty \). We shall prove the following result.

Lemma 2.6

Let \(\mathscr {D}, b,c,\mathbf {x}_0,q,\Lambda (e), \mathbf {v}(e), N(x)\) be as above, and assume that \(|\mathbf {v}(e)| \leqslant \delta (\mathscr {D}) x\). Then

$$\begin{aligned} N(x)&=\frac{\mathrm {vol}(\mathscr {D})x^2}{\zeta (2) e q^2} \prod _{p | e} \left( 1+\frac{1}{p}\right) ^{-1} \prod _{p | q} \left( 1-\frac{1}{p^2}\right) ^{-1}\\&\quad +O\left( \left( \beta +\gamma \right) x \left\{ \left( \sum _{d|e}\frac{1}{d|\mathbf {v}(e/d)|} \log \left( 2+ \frac{\delta (\mathscr {D}) x}{d|\mathbf {v}(e/d)|}\right) \right) + \frac{1}{e} \sum _{d|e}|\mathbf {v}(d)| \right\} \right) , \end{aligned}$$

where

$$\begin{aligned} \beta =\delta (\mathscr {D})+\frac{\partial {\mathscr {D}}}{q}, \quad \gamma = \frac{\mathrm {vol}(\mathscr {D})}{\delta (\mathscr {D}) q^2} . \end{aligned}$$

The implied constant in this estimate is absolute.
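The main term of Lemma 2.6 can be compared with a direct count in small cases. The Python sketch below evaluates the main term and counts the relevant primitive lattice points by exhaustive search over a disc \(\mathscr {D}\) specified by its centre and radius. It is a numerical illustration only: the error term is not computed, the function names are ours, and the caller is assumed to supply data with \(\gcd (e,q)=\gcd (b,c,e)=1\).

```python
from math import gcd, pi


def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def main_term(vol, x, e, q):
    """Main term of Lemma 2.6, using zeta(2) = pi^2 / 6."""
    value = vol * x * x / ((pi ** 2 / 6) * e * q * q)
    for p in prime_divisors(e):
        value /= 1 + 1 / p
    for p in prime_divisors(q):
        value /= 1 - 1 / p ** 2
    return value


def brute_force_count(centre, radius, x, b, c, e, q, x0):
    """Count primitive (s,t) in x*D and Lambda(e) with (s,t) = x0 mod q.

    D is the open disc with the given centre and radius.
    """
    cx, cy, r = centre[0] * x, centre[1] * x, radius * x
    count = 0
    for s in range(int(cx - r) - 1, int(cx + r) + 2):
        for t in range(int(cy - r) - 1, int(cy + r) + 2):
            if (s - cx) ** 2 + (t - cy) ** 2 >= r * r:
                continue  # outside the open disc
            if gcd(s, t) != 1 or (b * s - c * t) % e != 0:
                continue  # not primitive, or not in Lambda(e)
            if (s - x0[0]) % q == 0 and (t - x0[1]) % q == 0:
                count += 1
    return count
```

For the disc of radius 1 centred at (3,0), with \(x=100\), \(b=c=1\), \(e=3\) and \(q=1\), the main term is \(\pi \cdot 10^4\cdot \frac{6}{\pi ^2}\cdot \frac{1}{3}\cdot \frac{3}{4}\approx 4775\), and the direct count is close to this value.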

For any \(d\mid e\), let us denote \(\mathbf {v}(e/d)\) by \((x_0,x_1)\), temporarily. Then

$$\begin{aligned} \frac{e}{d} \mid (bx_0-c x_1) \Rightarrow (dx_0,dx_1) \in \Lambda (e), \end{aligned}$$

whence

$$\begin{aligned} |\mathbf {v}(e)| \leqslant d |\mathbf {v}(e/d)| .\end{aligned}$$
(2.12)

Moreover, using the basic properties of the minimal basis vector, one obtains

$$\begin{aligned} \frac{1}{e}\sum _{d\mid e} |\mathbf {v}(d)| \ll \frac{1}{e}\sum _{d\mid e} \sqrt{d}\leqslant \frac{\tau (e)}{\sqrt{e}}\ll \frac{\tau (e)}{|\mathbf {v}(e)|}. \end{aligned}$$
(2.13)

These inequalities may be used to simplify the error term in Lemma 2.6.
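Inequality (2.12) and the bound \(|\mathbf {v}(d)|\ll \sqrt{d}\) used in (2.13) can be observed numerically. The Python sketch below finds the squared length of a shortest non-zero vector of \(\Lambda (e)\) by exhaustive search, using the fact that \((e,0)\in \Lambda (e)\) to bound the search box. The function names are ours. The explicit constant in the second check comes from the classical bound \(\lambda _1^2\leqslant \frac{2}{\sqrt{3}}\det \Lambda \) for planar lattices, which is not the route taken in the text.

```python
def shortest_vector_sq(b, c, e):
    """Squared length of a shortest non-zero vector of Lambda(e).

    Lambda(e) = {(s,t) in Z^2 : b*s = c*t mod e}. Since (e, 0) always lies
    in Lambda(e), it suffices to search the box |s|, |t| <= e.
    """
    best = e * e
    for s in range(-e, e + 1):
        for t in range(-e, e + 1):
            if (s, t) != (0, 0) and (b * s - c * t) % e == 0:
                best = min(best, s * s + t * t)
    return best


def check_2_12(b, c, e):
    """Verify |v(e)| <= d * |v(e/d)| for every divisor d of e."""
    v_e = shortest_vector_sq(b, c, e)
    return all(v_e <= d * d * shortest_vector_sq(b, c, e // d)
               for d in range(1, e + 1) if e % d == 0)
```

For example, \(\Lambda (5)\) with \(b=1\) and \(c=2\) has shortest vector \((2,1)\) of squared length 5, and `check_2_12(1, 5, 12)` confirms (2.12) for every divisor of 12.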

Proof of Lemma 2.6

Our argument is based on a modification of the proof of [29, Lemma 5.3]. We write \(\delta =\delta (\mathscr {D})\) for short and put \(\mathbf {x}_0=(s_0,t_0)\). Since \(\gcd (s_0,t_0,q)=1\), an application of Möbius inversion gives

$$\begin{aligned} N(x)= \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \end{array}} \mu (m) \sum _{\begin{array}{c} (u,v) \in \frac{x}{m}\mathscr {D}\cap \Lambda (e)\\ \gcd (u,v,e)=1 \\ (u,v)\equiv \overline{m}(s_0,t_0)\left( \text {mod}\ q\right) \end{array}} 1 \end{aligned}$$

on making the substitution \(s=mu\) and \(t=mv\). The inner sum is empty if m is large enough. Indeed, if it contains any terms then we must have

$$\begin{aligned} 1\leqslant |\mathbf {v}(e)| = \min \{|\mathbf {y}|:\mathbf {y} \in \Lambda (e)\setminus \{\mathbf {0}\}\} \leqslant \max \left\{ |\mathbf {y}|:\mathbf {y} \in \frac{x}{m}\mathscr {D}\right\} \leqslant \frac{\delta x}{m}. \end{aligned}$$

Thus, on using the Möbius function to remove the condition \(\gcd (u,v,e)=1\), we find that

$$\begin{aligned} N(x)= \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \\ m\leqslant \frac{\delta x}{|\mathbf {v}(e)|} \end{array}} \mu (m) \sum _{d\mid e}\mu (d) \sum _{\begin{array}{c} (u,v) \in \frac{x}{m}\mathscr {D}\cap \Lambda (e) \\ d\mid u, ~d\mid v\\ (u,v)\equiv \overline{m} (s_0,t_0)\left( \text {mod}\ q\right) \end{array}} 1 . \end{aligned}$$
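The mechanism of this step is easiest to see in the simplest case. The Python sketch below is a toy version under the assumptions \(e=q=1\) and \(\mathscr {D}=[-1,1]^2\) (so that \(\delta =1\) and \(|\mathbf {v}(e)|=1\)), which are our simplifications and not the setting of the lemma. It compares a direct count of primitive vectors with the Möbius sum truncated at \(m\leqslant x\).

```python
from math import gcd


def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def primitive_count_brute(x):
    """Count (s, t) with max(|s|, |t|) <= x and gcd(s, t) = 1 directly."""
    return sum(1 for s in range(-x, x + 1) for t in range(-x, x + 1)
               if gcd(s, t) == 1)


def primitive_count_mobius(x):
    """Same count via Moebius inversion; the terms with m > x are empty."""
    return sum(mobius(m) * ((2 * (x // m) + 1) ** 2 - 1)
               for m in range(1, x + 1))
```

Here \((2\lfloor x/m\rfloor +1)^2-1\) is the number of non-zero integer vectors in the box that are divisible by m, which plays the role of the inner sum above.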

Making the substitution \(u=ds\) and \(v=dt\), and arguing as before we find that

$$\begin{aligned} N(x)= \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \\ m\leqslant \frac{\delta x}{|\mathbf {v}(e)|} \end{array}} \mu (m) \sum _{\begin{array}{c} d\mid e\\ d\leqslant \frac{\delta x}{|\mathbf {v}(e/d)|m} \end{array}}\mu (d) \sum _{\begin{array}{c} (s,t) \in \frac{x}{dm}\mathscr {D}\cap \Lambda (e/d) \\ (s,t)\equiv \overline{dm} (s_0,t_0)\left( \text {mod}\ q\right) \end{array}} 1. \end{aligned}$$

Now let \(n \in \mathbb {Z}\) be such that \(n\equiv \overline{dm} \left( \text {mod}\ q\right) \). Then we can make the change of variables \( (s,t)=n(s_0,t_0)+q(s',t') \) in the inner sum. Noting that \(\Lambda (e/d)\) defines a lattice in \(\mathbb {Z}^2\) of determinant \(e/d\), the inner sum is found to be

$$\begin{aligned} \frac{\mathrm {vol}(\mathscr {D})x^2}{dem^2q^2} +O\left( 1+\frac{\frac{x}{dm}\partial {\mathscr {D}}}{q|\mathbf {v}(e/d)|}\right) =\frac{\mathrm {vol}(\mathscr {D})x^2}{dem^2q^2} +O\left( \beta \frac{x}{md|\mathbf {v}(e/d)|} \right) , \end{aligned}$$

with an absolute implied constant, since the upper bound on d implies that

$$\begin{aligned} 1 \leqslant \frac{\delta x}{dm |\mathbf {v}(e/d)|} . \end{aligned}$$

In summary, we have shown that

$$\begin{aligned} N(x)= \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \\ m\leqslant \frac{\delta x}{|\mathbf {v}(e)|} \end{array}} \mu (m) \sum _{\begin{array}{c} d\mid e\\ d\leqslant \frac{\delta x}{|\mathbf {v}(e/d)|m} \end{array}}\mu (d) \left( \frac{\mathrm {vol}(\mathscr {D})x^2}{dem^2q^2} +O\left( \beta \frac{x}{md|\mathbf {v}(e/d)|} \right) \right) . \end{aligned}$$

The contribution from the error term is

$$\begin{aligned} \ll \beta x \sum _{d|e}\frac{1}{d|\mathbf {v}(e/d)|} \sum _{m \leqslant \frac{\delta x}{d|\mathbf {v}(e/d)|}}\frac{1}{m} \ll \beta x \sum _{d|e}\frac{1}{d|\mathbf {v}(e/d)|} \log \left( 2+\frac{\delta x}{d|\mathbf {v}(e/d)|}\right) . \end{aligned}$$

The main term equals

$$\begin{aligned} \frac{\mathrm {vol}(\mathscr {D})x^2}{eq^2} \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \end{array}} \frac{\mu (m)}{m^2} \sum _{\begin{array}{c} d|e\\ d\leqslant \frac{\delta x}{|\mathbf {v}(e/d)|m} \end{array}}\frac{\mu (d)}{d}, \end{aligned}$$

since (2.12) shows that the extra constraint in the m-sum is implied by the constraint in the d-sum. But this is equal to

$$\begin{aligned} \frac{\mathrm {vol}(\mathscr {D})x^2}{eq^2} \sum _{d\mid e}\frac{\mu (d)}{d} \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \end{array}} \frac{\mu (m)}{m^2} +O \left( \frac{\mathrm {vol}(\mathscr {D})x}{\delta q^2} \cdot \frac{1}{e} \sum _{d\mid e} |\mathbf {v}(e/d)| \right) , \end{aligned}$$

which thereby completes the proof. \(\square \)
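For the reader's convenience, here is how the completed sums produce the Euler products in the statement of the lemma. The computation assumes that \(\gcd (e,q)=1\), which we take to be among the hypotheses of Lemma 2.6. We have

$$\begin{aligned} \sum _{d\mid e}\frac{\mu (d)}{d}=\prod _{p\mid e}\left( 1-\frac{1}{p}\right) , \qquad \sum _{\begin{array}{c} m \in \mathbb {N}\\ \gcd (m,eq)=1 \end{array}} \frac{\mu (m)}{m^2} =\frac{1}{\zeta (2)}\prod _{p\mid eq}\left( 1-\frac{1}{p^2}\right) ^{-1}. \end{aligned}$$

Since \((1-\frac{1}{p})(1-\frac{1}{p^2})^{-1}=(1+\frac{1}{p})^{-1}\), the product of the two sums is

$$\begin{aligned} \frac{1}{\zeta (2)}\prod _{p\mid e}\left( 1+\frac{1}{p}\right) ^{-1}\prod _{p\mid q}\left( 1-\frac{1}{p^2}\right) ^{-1}, \end{aligned}$$

in agreement with the main term of the lemma.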

2.5 Twisted Hooley \(\Delta \)-function over number fields

Adopting the notation of Sect. 1, it is now time to reveal the version of the Hooley \(\Delta \)-function that arises in our work. Let \(K/\mathbb {Q}\) be a number field and let \(\psi _K\) be a quadratic Dirichlet character on K. We let \(\Delta :\mathscr {I}_{K} \rightarrow \mathbb {R}_{>0}\) be the function given by

$$\begin{aligned} \Delta (\mathfrak {a};\psi _K)= \sup _{\begin{array}{c} u \in \mathbb {R}\\ 0\leqslant v\leqslant 1 \end{array}} \Big | \sum _{\begin{array}{c} \mathfrak {d}\mid \mathfrak {a}\\ \mathrm {e}^u < \text {N}\,_K\mathfrak {d}\leqslant \mathrm {e}^{u+v} \end{array}} \psi _K(\mathfrak {d}) \Big | ,\end{aligned}$$
(2.14)

for any integral ideal \(\mathfrak {a}\in \mathscr {I}_K\). We shall put \(\Delta (\mathfrak {a})=\Delta (\mathfrak {a};\mathbf {1})\) for the corresponding function in which \(\psi _K\) is replaced by the constant function \(\mathbf {1}\).

We begin by showing that \(\Delta \) belongs to the class \(\mathscr {M}_K\) of pseudomultiplicative functions introduced in Sect. 2.1. For coprime ideals \(\mathfrak {a}_1,\mathfrak {a}_2 \subset \mathfrak {o}_K\), any ideal divisor \(\mathfrak {d}\mid \mathfrak {a}_1 \mathfrak {a}_2\) can be written uniquely as \(\mathfrak {d}=\mathfrak {d}_1 \mathfrak {d}_2\), where \(\mathfrak {d}_i\mid \mathfrak {a}_i\). Therefore

$$\begin{aligned} \sum _{\begin{array}{c} \mathfrak {d}\mid \mathfrak {a}_1 \mathfrak {a}_2\\ \mathrm {e}^u< \text {N}\,_K\mathfrak {d}\leqslant \mathrm {e}^{u+v} \end{array}} \psi _K(\mathfrak {d}) = \sum _{\mathfrak {d}_1\mid \mathfrak {a}_1} \psi _K(\mathfrak {d}_1) \sum _{\begin{array}{c} \mathfrak {d}_2\mid \mathfrak {a}_2 \\ \mathrm {e}^{u-\log \text {N}\,_K\mathfrak {d}_1} < \text {N}\,_K \mathfrak {d}_2 \leqslant \mathrm {e}^{u-\log \text {N}\,_K\mathfrak {d}_1} \mathrm {e}^{v} \end{array}} \psi _K(\mathfrak {d}_2). \end{aligned}$$

Thus the triangle inequality yields \(\Delta (\mathfrak {a}_1\mathfrak {a}_2;\psi _K) \leqslant \tau _K(\mathfrak {a}_1) \Delta (\mathfrak {a}_2;\psi _K)\), where \(\tau _K\) is the divisor function on ideals of \(\mathfrak {o}_K\). This shows that \(\Delta (\cdot ;\psi _K)\) belongs to \(\mathscr {M}_K\) and an identical argument confirms this for \(\Delta (\cdot )\).
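The definition (2.14) and the inequality just proved can be explored numerically in the simplest case \(K=\mathbb {Q}\). The Python sketch below is an illustration under our own assumptions: it takes \(\psi \) to be the non-principal character modulo 4 by default, and it uses the observation that the divisors counted in a window \((\mathrm {e}^u,\mathrm {e}^{u+v}]\) with \(0\leqslant v\leqslant 1\) are exactly the runs of consecutive divisors \(d_i<\dots <d_j\) with \(d_j<\mathrm {e}\,d_i\). All function names are hypothetical.

```python
from math import e as EULER, gcd


def chi4(d):
    """Non-principal Dirichlet character modulo 4."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def divisors(a):
    """All positive divisors of a, in increasing order."""
    return [d for d in range(1, a + 1) if a % d == 0]


def hooley_delta(a, psi=chi4):
    """Twisted Hooley Delta-function of the positive integer a (K = Q)."""
    divs = divisors(a)
    best = 0
    for i in range(len(divs)):
        total = 0
        for j in range(i, len(divs)):
            if divs[j] >= EULER * divs[i]:
                break  # window would need length v > 1
            total += psi(divs[j])
            best = max(best, abs(total))
    return best


def check_submultiplicative(limit, psi=chi4):
    """Check Delta(a1*a2) <= tau(a1) * Delta(a2) for coprime a1, a2 < limit."""
    return all(
        hooley_delta(a1 * a2, psi) <= len(divisors(a1)) * hooley_delta(a2, psi)
        for a1 in range(1, limit) for a2 in range(1, limit)
        if gcd(a1, a2) == 1
    )
```

Passing `psi=lambda d: 1` gives the untwisted function \(\Delta (\mathfrak {a})\) in the same toy setting.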

We shall need the following result proved in [30].

Lemma 2.7

Define the function

$$\begin{aligned} \widehat{\varepsilon }(x) =\sqrt{\frac{\log \log \log (16+x)}{\log \log (3+x)}}, \end{aligned}$$

for any \(x\geqslant 1\) and recall the definition (2.3) of \(\mathscr {P}_K^\circ \).

  1. (i)

    There exists a positive constant \(c=c(K)\) such that

    $$\begin{aligned} \sum _{\begin{array}{c} \mathfrak {a}\in \mathscr {P}_K^\circ \text { square-free} \\ \text {N}\,_K \mathfrak {a}\leqslant x \end{array}}\frac{\Delta (\mathfrak {a})}{\text {N}\,_K\mathfrak {a}}\ll (\log x)^{1+c\widehat{\varepsilon }(x)} .\end{aligned}$$
  2. (ii)

    Let \(\psi _K\) be a quadratic Dirichlet character on K and let \(W\in \mathbb {N}\). There exists a positive constant \(c=c(K,\psi _K)\) such that

    $$\begin{aligned} \sum _{\begin{array}{c} \mathfrak {a}\in \mathscr {P}_K^\circ \text { square-free} \\ \gcd (\text {N}\,_K \mathfrak {a},W)=1 \\ \text {N}\,_K \mathfrak {a}\leqslant x \end{array}}\frac{\Delta (\mathfrak {a};\psi _K)^2}{\text {N}\,_K\mathfrak {a}}\ll (\log x)^{1+c\widehat{\varepsilon }(x)}. \end{aligned}$$

The implied constant in both estimates is allowed to depend on K and, in the second estimate, also on W and the character \(\psi _K\).

3 The lower bound

In order to prove the lower bound in Theorem 1.1, we first appeal to work of Frei, Loughran and Sofos [15]. It follows from [15, Thm. 1.2] that the desired lower bound holds when \(\rho \geqslant 4\). Suppose that \(\rho =3\). Then (1.1) implies that in the fibration \(\pi :X\rightarrow \mathbb {P}^1\) there is at least one closed point \(P\in \mathbb {P}^1\) above which the singular fibre \(X_P\) is split. Since the sum \(c(\pi )\) defining the complexity of \(\pi \) in [15, Def. 1.5] is at most 4 for conic bundle quartic del Pezzo surfaces, we infer that \(c(\pi )\leqslant 3\) when \(\rho =3\), so that the lower bound in Theorem 1.1 is a consequence of [15, Thm. 1.7]. Throughout this section, it therefore suffices to assume that \(\rho =2\) and \(\delta _0=0\), so that X is a minimal conic bundle surface.

Invoking [15, Thm. 1.6], the lower bound in Theorem 1.1 is a direct consequence of the divisor sum conjecture that is recorded in [14, Con. 1], for the relevant data associated to the fibration \(\pi \). Note that the principal result in [14] only covers cubic divisor sums, since we still lack the technology to asymptotically evaluate divisor sums of higher degree with a power saving in the error term. The goal of this section is to estimate certain quartic divisor sums, with a logarithmic saving in the error term, which turns out to be sufficient for proving the lower bound in Theorem 1.1. The divisor sums relevant here shall involve complicated quadratic symbols whose modulus tends to infinity, a delicate task that will be the entire focus of this section.

We proceed to explain the particular case of the divisor sum conjecture that is germane here. Assume that we have forms \(F_1,\ldots ,F_n,G_1,\ldots ,G_n \in \mathbb {Z}[x,y]\) with

$$\begin{aligned} F_i \hbox { irreducible}, \quad F_i\not \mid G_i, \quad 2\mid \deg (G_i), \quad \hbox {and } \quad \prod _{i=1}^n F_i \hbox { separable} . \end{aligned}$$

For each i such that \(F_i(1,0)\ne 0\), we define the associated binary form \(\tilde{F}_i(x,y)=b_i^{d_i-1}F_i(b_i^{-1}x,y)\), as in (2.4), where \(d_i=\deg F_i\) and \(b_i=F_i(1,0)\). For such i we let \(\theta _i \in \overline{\mathbb {Q}}\) be a fixed root of \(\tilde{F}_i(x,1)=0\). If, on the other hand, \(F_i(x,y)\) is proportional to y, we define \(\theta _i=-F_i(0,1)\). We may assume that

$$\begin{aligned} \sum _{i=1}^n d_i=4 \end{aligned}$$
(3.1)

and that \(G_i(\theta _i,1) \notin \mathbb {Q}(\theta _i)^2\) for every i, because in the correspondence outlined in [15], the binary forms \(F_1,\dots ,F_n\) are equal to the closed points \(\Delta _1,\dots ,\Delta _n\) from Sect. 1. Indeed, under this correspondence, the statement \(G_i(\theta _i,1) \notin \mathbb {Q}(\theta _i)^2\) is equivalent to the singular fibre above \(\Delta _i\) being non-split, which holds for any i since we are working with minimal conic bundle surfaces.

Let

$$\begin{aligned} f(d)=\prod _{p\mid d}\left( 1-\frac{2}{p}\right) . \end{aligned}$$
(3.2)

We need to prove that there exists a finite set of primes \(S_{\text {bad}}= S_{\text {bad}}(F_i,G_i)\) such that for all \(W\in \mathbb {N}\), all \((s_0,t_0) \in \mathbb {Z}_{\text {prim}}^2\), and all non-empty compact discs \(\mathscr {D} \subset \mathbb {R}^2\), which together satisfy the conditions

  1. (C1)

    \(p\in S_{\text {bad}} \Rightarrow p\mid W\);

  2. (C2)

    \(\prod _{i=1}^n F_i(s_0,t_0)\ne 0\);

  3. (C3)

    \((s,t)\in \mathbb {R}^2\cap \mathscr {D}\Rightarrow \prod _{i=1}^n F_i(s,t)\ne 0\); and

  4. (C4)

    for all \((s,t) \in \mathbb {Z}_{\text {prim}}^2\cap x\mathscr {D}\) with \(x\geqslant 1\) and \((s,t)\equiv (s_0,t_0)\left( \text {mod}\ W\right) \) we have

    $$\begin{aligned} \left( \frac{G_i(s,t)}{F_i(s,t)_{W}}\right) =1; \end{aligned}$$

we have the lower bound \(D_{W}(x)\gg x^2\), where

$$\begin{aligned} D_W(x)= \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} \prod _{i=1}^n \left( f(F_i(s,t)_{W}) \sum _{d\mid F_i(s,t)_{W}} \left( \frac{G_i(s,t)}{d} \right) \right) . \end{aligned}$$
(3.3)

Here, we recall the notation \(m_W=\prod _{p\not \mid W} p^{\nu _p(m)}\) for all \(m,W \in \mathbb {N}\).
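The two arithmetic ingredients of (3.3) that do not involve quadratic symbols are simple to compute. The following Python sketch (with function names of our choosing) evaluates the W-coprime part \(m_W\) and the weight f from (3.2) in exact rational arithmetic.

```python
from fractions import Fraction
from math import gcd


def coprime_part(m, W):
    """Return m_W, the product of p^{nu_p(m)} over primes p not dividing W."""
    m = abs(m)
    g = gcd(m, W)
    while g > 1:
        m //= g  # strip primes shared with W, a factor at a time
        g = gcd(m, W)
    return m


def prime_divisors(d):
    """Distinct prime divisors of the positive integer d."""
    primes = []
    p = 2
    while p * p <= d:
        if d % p == 0:
            primes.append(p)
            while d % p == 0:
                d //= p
        p += 1
    if d > 1:
        primes.append(d)
    return primes


def f(d):
    """The weight f(d) = prod_{p | d} (1 - 2/p) from (3.2)."""
    value = Fraction(1)
    for p in prime_divisors(d):
        value *= 1 - Fraction(2, p)
    return value
```

Note that f vanishes on even arguments. This causes no difficulty in (3.3) once \(2\mid W\), since f is only evaluated at integers coprime to W.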

We shall prove this conjectured lower bound when \(S_{\text {bad}}\) is taken to be the set of all primes up to a constant \(w=w(F_i,G_i)\). In what follows we shall often write that we need to enlarge w. This statement is to be interpreted as having already taken a very large constant w at the outset of the proof of the conjecture, rather than increasing w within the confines of the lower bound arguments. The primary goal of this section is now to establish the following bound, which directly leads to the lower bound in Theorem 1.1.

Proposition 3.1

Let \(F_i,G_i,f\) be as above. Then there exists a constant \(w=w(F_i,G_i)\) such that for any \(W,(s_0,t_0),\mathscr {D}\) satisfying (C1)–(C4) as above, we have

$$\begin{aligned} D_W(x)\gg x^2. \end{aligned}$$

Here the implied constant depends on \(F_i,G_i,s_0,t_0,\mathscr {D},w\) and W, but not on x.

Suppose that \(\nu >\nu _p({W})\) for all \(p\mid W\) and write \(W_0=\prod _{p\mid W } p^\nu \). Then, since every summand in (3.3) is non-negative and \(F_i(s,t)_{W}=F_i(s,t)_{W_0}\) for all \(1\leqslant i\leqslant n\), we conclude that \(D_W(x)\geqslant D_{W_0}(x)\). In this way we see that it will suffice to prove the lower bound in Proposition 3.1 under the assumption that \(W=\prod _{p\mid W } p^\nu \) with

$$\begin{aligned} \nu >\max _{\begin{array}{c} 1\leqslant i \leqslant n\\ p\mid W \end{array}}\{\nu _p(F_i(s_0,t_0))\}. \end{aligned}$$

In this case the congruence \( F_i(s_0+p^\nu X,t_0+p^\nu Y) \equiv F_i(s_0,t_0) \left( \text {mod}\ p^\nu \right) \) guarantees that \(\nu _p(F_i(s,t))=\nu _p(F_i(s_0,t_0))\) for any \((s,t)\) appearing in the outer summation of (3.3) and any \(p\mid W\). Hence, for such \((s,t)\), we can always assume that

$$\begin{aligned} F_i(s,t)_W= |F_i(s,t)| \prod _{p\mid W}p^{-\nu _p(F_i(s_0,t_0))} . \end{aligned}$$
(3.4)
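The stability of the p-adic valuation that leads to (3.4) can be checked on a small example. In the Python sketch below the binary form \(s^2+t^2\) is a hypothetical stand-in for \(F_i\), chosen only for illustration, and the function names are ours.

```python
def valuation(p, m):
    """Exponent of the prime p in the non-zero integer m."""
    m = abs(m)
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k


def sample_form(s, t):
    """Hypothetical sample binary form, standing in for F_i."""
    return s * s + t * t


def valuation_is_stable(p, nu, s0, t0, span=5):
    """Check nu_p(F(s, t)) = nu_p(F(s0, t0)) for (s, t) = (s0, t0) mod p^nu."""
    base = valuation(p, sample_form(s0, t0))
    if nu <= base:
        raise ValueError("need nu > nu_p(F(s0, t0))")
    q = p ** nu
    return all(
        valuation(p, sample_form(s0 + q * X, t0 + q * Y)) == base
        for X in range(-span, span + 1) for Y in range(-span, span + 1)
    )
```

For example, with \(p=5\), \(\nu =3\) and \((s_0,t_0)=(1,2)\) one has \(\nu _5(F(s_0,t_0))=1<\nu \), and the check confirms that the valuation does not change along the residue class.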

3.1 Dirichlet’s hyperbola trick

Let \(i\in \{1,\dots ,n\}\). For any \((s,t) \in \mathbb {Z}^2\) appearing in (3.3), let

$$\begin{aligned} r_i(s,t) = \sum _{k\mid F_i(s,t)_W} \left( \frac{G_i(s,t)}{k} \right) . \end{aligned}$$

Then, possibly on enlarging w, it follows from Lemma 2.3 that

$$\begin{aligned} r_i(s,t) =\sum _{\begin{array}{c} \mathfrak {d} \mid (b_i s-\theta _i t) \\ \gcd (\text {N}\,_i \mathfrak {d},W)=1 \\ \mathfrak {d}\in \mathscr {P}_i \end{array} } \psi _i(\mathfrak {d}) , \end{aligned}$$

where \(\mathfrak {d}\) runs over integral ideals of \(K_i=\mathbb {Q}(\theta _i)\), \(\text {N}\,_i\) denotes the ideal norm \(\text {N}\,_{K_i/\mathbb {Q}}\) and \(\mathscr {P}_i=\mathscr {P}_{K_i}\), in the notation of (2.11). Furthermore, for all \((s,t)\) in (3.3), we have

$$\begin{aligned} \text {N}\,_i \mathfrak {d}\leqslant \text {N}\,_i(b_is-\theta _i t) =|\tilde{F}_i (b_is,t)| \leqslant c_i x^{d_i} , \end{aligned}$$

for some positive constant \(c_i\) that depends at most on \(F_i\) and \(\mathscr {D}\). We define

$$\begin{aligned} X=x \max \left\{ c_1^{\frac{1}{d_1}}, \dots , c_n^{\frac{1}{d_n}}\right\} , \end{aligned}$$

so that the previous inequality becomes \(\text {N}\,_i \mathfrak {d}\leqslant X^{d_i}\).

On relabelling the indices we may suppose that \(d_n=\min _{1\leqslant i\leqslant n} d_i\). In particular, we have

$$\begin{aligned} d_n \leqslant \min _{1\leqslant i\leqslant n}\deg (\Delta _i) .\end{aligned}$$
(3.5)

Suppose that \(n>1\). Then for each \(i\in \{1,\dots ,n-1\}\) and \((s,t)\) appearing in (3.3), we set

$$\begin{aligned} r_i^{(0)}(s,t)&=\sum _{\begin{array}{c} \mathfrak {d} \mid (b_i s-\theta _i t),~ \mathfrak {d}\in \mathscr {P}_i\\ \\ \gcd (\text {N}\,_i \mathfrak {d},W)=1 \\ \text {N}\,_i \mathfrak {d}\leqslant X^{\frac{d_i}{2}} \end{array} } \psi _i(\mathfrak {d}),\quad r_i^{(1)}(s,t) = \sum _{\begin{array}{c} \mathfrak {e} \mid (b_i s-\theta _i t), ~\mathfrak {e}\in \mathscr {P}_i\\ \\ \gcd (\text {N}\,_i \mathfrak {e},W)=1 \\ \text {N}\,_i \mathfrak {e}\leqslant X^{-\frac{d_i}{2}} F_i(s,t)_W \end{array} } \psi _i(\mathfrak {e}). \end{aligned}$$

Dirichlet’s hyperbola trick implies that

$$\begin{aligned} r_i(s,t) = r_i^{(0)}(s,t)+r_i^{(1)}(s,t). \end{aligned}$$
(3.6)

Indeed, if \((b_is-\theta _i t)_W\) denotes the part of the ideal \((b_is-\theta _i t)\) that is composed solely of prime ideals whose norms are coprime to W, as in (2.1), then the sum in \(r_i(s,t)\) is over ideals \(\mathfrak {d}, \mathfrak {e}\) such that \(\mathfrak {d}\mathfrak {e}=(b_is-\theta _i t)_W\). Recalling (C4), it follows from part (ii) of Lemma 2.3 that \(\psi _i((b_is-\theta _i t)_W)=1\). This concludes the proof of (3.6).
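The symmetry behind (3.6) is visible already over \(\mathbb {Q}\). The Python sketch below is a toy model under our own assumptions: \(\psi \) is the non-principal character modulo 4, N is a positive integer with \(\psi (N)=1\), and the cutoff Y is a real number that is not itself a divisor of N, so that no divisor is counted twice. Divisors \(d>Y\) are exchanged for their complementary divisors \(e=N/d<N/Y\), and \(\psi (d)=\psi (e)\) because \(\psi (d)\psi (e)=\psi (N)=1\).

```python
def chi4(d):
    """Non-principal Dirichlet character modulo 4."""
    if d % 2 == 0:
        return 0
    return 1 if d % 4 == 1 else -1


def split_divisor_sum(N, Y, psi=chi4):
    """Return (full, small, cosmall) for the divisor sum of psi over d | N.

    full    : sum of psi(d) over all d | N
    small   : sum of psi(d) over d | N with d <= Y
    cosmall : sum of psi(e) over e | N with e < N / Y
    """
    divs = [d for d in range(1, N + 1) if N % d == 0]
    full = sum(psi(d) for d in divs)
    small = sum(psi(d) for d in divs if d <= Y)
    cosmall = sum(psi(e) for e in divs if e * Y < N)
    return full, small, cosmall
```

Whenever \(\psi (N)=1\) the three values satisfy `full == small + cosmall`, which is the analogue of (3.6) in this setting.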

We proceed by introducing the quantity

$$\begin{aligned} L=(\log x)^\alpha ,\end{aligned}$$
(3.7)

for some \(\alpha >0\) that will be determined in due course. (When \(n>1\) we shall take \(\alpha \) to be a large constant, but when \(n=1\) it will be important to restrict to \(0<\alpha <1\).) For \((s,t)\) appearing in (3.3), we define

$$\begin{aligned} r_n^{(0)}(s,t)&= \sum _{\begin{array}{c} \mathfrak {d}\mid (b_ns-\theta _n t), ~\mathfrak {d}\in \mathscr {P}_n \\ \gcd (\text {N}\,_n \mathfrak {d},W)=1 \\ \text {N}\,_n\mathfrak {d}\leqslant L^{-1} {X^{\frac{d_n}{2}}} \end{array} } \psi _n(\mathfrak {d}),\qquad r_n^{(1)}(s,t)= \sum _{\begin{array}{c} \mathfrak {e} \mid (b_ns-\theta _n t), ~\mathfrak {e}\in \mathscr {P}_n \\ \gcd (\text {N}\,_n\mathfrak {e},W)=1 \\ \text {N}\,_n\mathfrak {e}\leqslant L^{-1}X^{-\frac{d_n}{2}} F_n(s,t)_W \end{array}} \psi _n(\mathfrak {e}) \end{aligned}$$

and

$$\begin{aligned} r_n^{(\infty )}(s,t)= \sum _{\begin{array}{c} \mathfrak {d}\mid (b_ns-\theta _n t), ~\mathfrak {d}\in \mathscr {P}_n \\ \gcd (\text {N}\,_n\mathfrak {d},W)=1 \\ L^{-1}X^{\frac{d_n}{2}}< \text {N}\,_n\mathfrak {d}< L X^{\frac{d_n}{2}} \end{array} } \psi _n(\mathfrak {d}) . \end{aligned}$$

As before, we may now write

$$\begin{aligned} r_n(s,t) = r_n^{(\infty )}(s,t) + r_n^{(0)}(s,t)+r_n^{(1)}(s,t) . \end{aligned}$$
(3.8)

For each \(\mathbf {j}=(j_1,\ldots ,j_n) \in \{0,1\}^n\), we define

$$\begin{aligned} D_{\mathbf {j}}(x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} \prod _{i=1}^n f(F_i(s,t)_W) r_i^{(j_i)}(s,t) , \end{aligned}$$

and

$$\begin{aligned} D_\infty (x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} r_n^{(\infty )}(s,t) \prod _{i=1}^{n-1} r_i(s,t), \end{aligned}$$

in which we recall the definition (3.2) of f. (Here, we recall our convention that products over empty sets are equal to 1.) Injecting (3.6) and (3.8) into (3.3) yields

$$\begin{aligned} D_W(x)- \sum _{\mathbf {j} \in \{0,1\}^n} D_{\mathbf {j}}(x) \ll D_{\infty }(x) . \end{aligned}$$
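The expansion behind this step is routine, but it may help to record it. By (3.6) and (3.8), for every \((s,t)\) appearing in (3.3) we have

$$\begin{aligned} \prod _{i=1}^n r_i(s,t)&= \left( r_n^{(0)}(s,t)+r_n^{(1)}(s,t)+r_n^{(\infty )}(s,t)\right) \prod _{i=1}^{n-1} \left( r_i^{(0)}(s,t)+r_i^{(1)}(s,t)\right) \\&= \sum _{\mathbf {j} \in \{0,1\}^n} \prod _{i=1}^n r_i^{(j_i)}(s,t) + r_n^{(\infty )}(s,t) \prod _{i=1}^{n-1} r_i(s,t). \end{aligned}$$

After multiplying by the weights \(f(F_i(s,t)_W)\) and summing over \((s,t)\), the first part recovers \(\sum _{\mathbf {j}} D_{\mathbf {j}}(x)\), while the remaining part is the one governed by \(D_\infty (x)\).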

The validity of Proposition 3.1 is therefore assured, provided we can show that

$$\begin{aligned} D_{\mathbf {j}}(x) \gg x^2 \end{aligned}$$
(3.9)

and

$$\begin{aligned} D_{\infty }(x) =o(x^2). \end{aligned}$$
(3.10)

We shall devote Sects. 3.2–3.4 to the proof of (3.10) and Sect. 3.5 to the proof of (3.9).

3.2 The generalised Hooley \(\Delta \)-function

In this section we initiate the proof of (3.10). Define

$$\begin{aligned} A_n^{(\infty )}(x) = \left\{ (s,t) \in \mathbb {Z}_{\text {prim}}^2\cap x\mathscr {D}: \begin{array}{l} (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \\ \exists \mathfrak {d}\in \mathscr {P}_n \text { such that:}\\ ~\bullet ~ \mathfrak {d}\mid (b_ns-\theta _n t)_W\\ ~\bullet ~L^{-1}X^{\frac{d_n}{2}}< \text {N}\,_n\mathfrak {d}< L X^{\frac{d_n}{2}} \end{array} \right\} . \end{aligned}$$
(3.11)

It immediately follows that

$$\begin{aligned} D_{\infty }(x) = \sum _{(s,t) \in A_n^{(\infty )}(x)} r_n^{(\infty )}(s,t) \prod _{i=1}^{n-1} r_i(s,t). \end{aligned}$$

Defining

$$\begin{aligned} B_{\infty }(x) = \sum _{(s,t) \in A_n^{(\infty )}(x)} \prod _{i=1}^{n-1} r_i(s,t) ,\end{aligned}$$
(3.12)

we use Cauchy’s inequality to arrive at

$$\begin{aligned} D_{\infty }(x) \leqslant B_{\infty }(x)^{\frac{1}{2}} \left( \sum _{(s,t) \in A_n^{(\infty )}(x)} \Big |r_n^{(\infty )}(s,t)\Big |^2 \prod _{i=1}^{n-1} r_i(s,t) \right) ^{\frac{1}{2}} . \end{aligned}$$

Recall the definition (2.14) of the twisted Hooley \(\Delta \)-function \(\Delta (\mathfrak {a};\psi _n)\) associated to the Dirichlet character \(\psi _n\) and any integral ideal \(\mathfrak {a}\). Putting

$$\begin{aligned} H_{\infty }(x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} \Delta ((b_ns-\theta _n t);\psi _n)_W^2 \prod _{i=1}^{n-1} r_i(s,t), \end{aligned}$$
(3.13)

and partitioning the interval \((L^{-1}X^{\frac{d_n}{2}}, L X^{\frac{d_n}{2}})\) into \(O(\log \log x)\) \(\mathrm {e}\)-adic intervals, we deduce that

$$\begin{aligned} \sum _{(s,t) \in A_n^{(\infty )}(x)} \Big |r_n^{(\infty )}(s,t)\Big |^2 \prod _{i=1}^{n-1} r_i(s,t) \ll (\log \log x)^2 H_{\infty }(x) . \end{aligned}$$
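The count of \(\mathrm {e}\)-adic intervals is elementary: the endpoints have ratio \(L^2\), so about \(2\log L=2\alpha \log \log x\) intervals are needed. If k intervals are used then \(|r_n^{(\infty )}(s,t)|\) is at most k times the relevant value of the \(\Delta \)-function, which is the source of the factor \((\log \log x)^2\). The Python sketch below, with names of our choosing, constructs such a partition.

```python
import math


def e_adic_cover(lo, hi):
    """Split (lo, hi] into consecutive pieces (a, b] with b <= e * a."""
    pieces = []
    a = lo
    while a < hi:
        b = min(math.e * a, hi)
        pieces.append((a, b))
        a = b
    return pieces


def number_of_pieces(x, alpha, Y):
    """Number of e-adic pieces covering (Y / L, Y * L] with L = (log x)^alpha."""
    L = math.log(x) ** alpha
    return len(e_adic_cover(Y / L, Y * L))
```

The value of `number_of_pieces(x, alpha, Y)` does not depend on Y and grows like \(2\alpha \log \log x\).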

In summary, we have shown that

$$\begin{aligned} D_{\infty }(x) \ll (\log \log x)\sqrt{ B_{\infty }(x) H_{\infty }(x)} . \end{aligned}$$

Therefore, in order to prove (3.10), it will be sufficient to prove that there exists a constant \(\delta >0\), that depends only on the data given at the start of Sect. 3, such that

$$\begin{aligned} B_{\infty }(x) \ll x^2 (\log x)^{-\delta } \end{aligned}$$
(3.14)

and

$$\begin{aligned} H_{\infty }(x) \ll x^2 (\log x)^{o(1)} .\end{aligned}$$
(3.15)

We shall call \(B_{\infty }(x)\) the interval sum and \(H_{\infty }(x)\) the Bretèche–Tenenbaum sum.

3.3 The interval sum

By recycling work of de la Bretèche and Tenenbaum [4, § 7.4], the case \(n=1\) is easy to handle. Indeed, in this case \(F_1\) is an irreducible quartic form and (3.12) becomes

$$\begin{aligned} B_{\infty }(x)=\sharp A_1^{(\infty )}(x)\leqslant \sharp \left\{ (s,t) \in \mathbb {Z}_{\text {prim}}^2\cap x\mathscr {D}: \begin{array}{l} (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \\ \exists \mathfrak {d}\in \mathscr {P}_1 \text { such that:}\\ ~\bullet ~ \mathfrak {d}\mid (b_1s-\theta _1 t)_W\\ ~\bullet ~ X^2/L< \text {N}\,_1\mathfrak {d}< L X^2 \end{array} \right\} . \end{aligned}$$

Note that assumption (C3) ensures that \(|F_1(s,t)| \asymp 1\) whenever \((s,t) \in \mathscr {D}\). Increasing w so that every prime factor of \(b_1\) also divides W shows that

$$\begin{aligned} \tilde{F}_1(b_1s,t)_W=(b_1^{d_1-1}F_1(s,t))_W=F_1(s,t)_W. \end{aligned}$$

Thus it follows from (3.4) that \(\tilde{F}_1(b_1s,t)_W \asymp |F_1(s,t)|\), for implied constants that depend on \(F_1,s_0,t_0,w\) and W. Hence

$$\begin{aligned} \text {N}\,_1((b_1s-\theta _1 t)_W)=\tilde{F}_1(b_1s,t)_W \asymp |F_1(s,t)| \asymp x^4\asymp X^4. \end{aligned}$$

Therefore, on introducing \(\mathfrak {e}\) through the factorisation \(\mathfrak {d}\mathfrak {e}=(b_1 s-\theta _1 t)_W\), we can infer that we must have either

$$\begin{aligned} X^2/L \ll \text {N}\,_1 \mathfrak {d}\ll X^2 \quad \text { or } \quad X^2/L \ll \text {N}\,_1 \mathfrak {e}\ll X^2 . \end{aligned}$$

Without loss of generality we shall assume that we are in the former setting. Therefore there exist constants \(c_0,c_1>0\) such that

$$\begin{aligned} B_\infty (x) \ll \sharp \left\{ (s,t) \in \mathbb {Z}_{\text {prim}}^2\cap x\mathscr {D}: \begin{array}{l} (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \\ \exists d\mid F_1(s,t) \text { s.t. } c_0x^2/L< d < c_1 x^2 \end{array} \right\} . \end{aligned}$$

But now we can employ the bound [4, Eq. (7.41)], with

$$\begin{aligned} T=F_1,\quad \Xi = \xi =x, \quad y_1=c_0x^2/L, \quad y_2=c_1x^2, \quad \text { and } \quad 1\ll \sigma ,\vartheta \ll 1. \end{aligned}$$

This implies that for any \(\eta \in (0,\frac{1}{2})\), we have

$$\begin{aligned} B_\infty (x)\ll x^2 \left( \frac{L}{(\log x)^{Q(2\eta )}} + \frac{\log \log x}{(\log x)^{Q(1+\eta )}} \right) , \end{aligned}$$

where \(Q(\lambda )=\lambda \log \lambda - \lambda +1\). In particular, \(Q(2\eta )\rightarrow 1\) as \(\eta \rightarrow 0+\) and \(Q(1+\eta )>0\) for all \(\eta >0\). Recalling the definition (3.7) of L, this means that provided \(\alpha <1\), we may fix \(\eta >0\) small enough that \(Q(2\eta )>\alpha \), so as to ensure that (3.14) holds when \(F_1\) is irreducible.
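To see how the choice of \(\eta \) interacts with \(\alpha \), one can evaluate the two exponents numerically. In the Python sketch below the helper `log_saving` is our own device: it returns \(\min \{Q(2\eta )-\alpha ,Q(1+\eta )\}\), which is the power of \(\log x\) saved by the bound above, up to the factor \(\log \log x\).

```python
import math


def Q(lam):
    """The function Q(lambda) = lambda * log(lambda) - lambda + 1, lambda > 0."""
    return lam * math.log(lam) - lam + 1


def log_saving(alpha, eta):
    """Power of log x saved when L = (log x)^alpha, for a given eta."""
    return min(Q(2 * eta) - alpha, Q(1 + eta))
```

For instance, taking \(\alpha =\frac{1}{2}\) and \(\eta =\frac{1}{20}\) gives a positive saving, although a small one, because \(Q(1+\eta )\) is of size \(\eta ^2/2\) for small \(\eta \).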

It remains to establish (3.14) when \(n>1\). In this case (3.5) implies that \(d_n=\deg (F_n)\leqslant 2\). Fix \(\eta \in (0,1)\). To estimate \(B_\infty (x)\), drawing inspiration from [4, § 9.3], we shall divide the terms in the sum (3.12) into two categories.

3.3.1 First case: \((b_n s-\theta _n t)\) has many prime divisors

We denote by \(B_{\infty }^{(1)}(x)\) the contribution to \(B_{\infty }(x)\) from the set of vectors \((s,t)\) for which \( \Omega _{n}((b_ns-\theta _n t)_W)>(1+\eta ) \log \log x, \) where \(\Omega _n(\mathfrak {a})=\Omega _{K_n}(\mathfrak {a})\) is the total number of prime ideal factors of an ideal \(\mathfrak {a}\subset \mathfrak {o}_{K_n}\). Recall that, as in Sect. 3.1, we denote \(\text {N}\,_{K_n}(\mathfrak {a})\) by \(\text {N}\,_{n}(\mathfrak {a})\). We have

$$\begin{aligned} B_{\infty }^{(1)}(x) \leqslant (\log x)^{-(1+\eta ) \log (1+\eta )} \sum _{(s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}} (1+\eta )^{\Omega _n((b_n s-\theta _n t)_W)} \prod _{i=1}^{n-1} r_i(s,t), \end{aligned}$$
(3.16)

since \((1+\eta )^{-(1+\eta )\log \log x}=(\log x)^{-(1+\eta ) \log (1+\eta )}\). Our plan is now to apply Lemma 2.2 for \(N=n\), with \(f_N(\mathfrak {a})=(1+\eta )^{ \Omega _n(\mathfrak {a}_W) }\) and

$$\begin{aligned} f_i(\mathfrak {a})=\sum _{\begin{array}{c} \mathfrak {d}\mid \mathfrak {a}\\ \mathfrak {d}\in \mathscr {P}_i \end{array}}\psi _i(\mathfrak {d}), \end{aligned}$$

for \(i< N\). Fix any \(\varepsilon >0\). It is easy to see that if \(i<N\) then there exists \(B>0\) such that \(f_i\in \mathscr {M}_{K_i}(2,B,\varepsilon )\). Thus, in the notation of Lemma 2.2, one can take

$$\begin{aligned} i<N\Rightarrow \varepsilon _i=\varepsilon .\end{aligned}$$
(3.17)

When \(i=N\), however, we will show that for every \(\varepsilon >0\) there exists w such that if W is given by (2.2) then

$$\begin{aligned}(1+\eta )^{\Omega _n(\mathfrak {a}_W)} \in \mathscr {M}_{K_n}(1+\eta ,1,\varepsilon ).\end{aligned}$$

Indeed, we have

$$\begin{aligned} (1+\eta )^{\Omega _n(\mathfrak {a}_W)} =\prod _{\begin{array}{c} \mathfrak {p}^\xi \Vert \mathfrak {a}\\ \gcd (\text {N}\,_n \mathfrak {p},W)=1 \end{array}}(1+\eta )^\xi \leqslant \prod _{\begin{array}{c} \mathfrak {p}^\xi \Vert \mathfrak {a}\\ \text {N}\,_n \mathfrak {p}>w \end{array}}(1+\eta )^\xi . \end{aligned}$$

Taking \(w\geqslant 2^{1/\varepsilon }\), so that \((1+\eta )\leqslant w^\varepsilon \), yields

$$\begin{aligned} \prod _{\begin{array}{c} \mathfrak {p}^\xi \Vert \mathfrak {a}\\ \text {N}\,_n \mathfrak {p}>w \end{array}}(1+\eta )^\xi \leqslant \prod _{\begin{array}{c} \mathfrak {p}^\xi \Vert \mathfrak {a}\\ \text {N}\,_n \mathfrak {p}>w \end{array}}w^{\varepsilon \xi } \leqslant \prod _{\begin{array}{c} \mathfrak {p}^\xi \Vert \mathfrak {a}\\ \text {N}\,_n \mathfrak {p}>w \end{array}}(\text {N}\,_n \mathfrak {p})^{\varepsilon \xi } \leqslant (\text {N}\,_n \mathfrak {a})^\varepsilon . \end{aligned}$$

This means that in the notation of Lemma 2.2 one can take

$$\begin{aligned} \varepsilon _N=\varepsilon . \end{aligned}$$
(3.18)
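The chain of inequalities above can be sanity-checked over \(\mathbb {Q}\). The Python sketch below assumes \(K_n=\mathbb {Q}\) and that W is divisible by every prime up to \(w=2^{1/\varepsilon }\); both are simplifications made for illustration, and the function names are ours.

```python
def big_omega_above(a, w):
    """Number of prime factors p > w of a, counted with multiplicity."""
    count = 0
    p = 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            if p > w:
                count += 1
        p += 1
    if a > 1 and a > w:
        count += 1  # the remaining cofactor is a prime exceeding w
    return count


def bound_holds(a, eta, eps):
    """Check (1 + eta)^{Omega(a_W)} <= a^eps with w = 2^(1/eps)."""
    w = 2 ** (1 / eps)
    return (1 + eta) ** big_omega_above(a, w) <= a ** eps
```

With \(\varepsilon =\frac{1}{2}\) one has \(w=4\), so only primes \(p\geqslant 5\) contribute, and each contributes a factor \(1+\eta <2<\sqrt{5}\).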

Furthermore, we shall take \(G=\mathbb {Z}^2\) and \(\mathscr {R}=x\mathscr {D}\). Thus \(q_G=1\), \(\mathscr {R}\) is regular and we have \(V\asymp x^2\) and \(K_{\mathscr {R}}\asymp x \log x\), in the notation of the lemma. This means that for large x we can take \(c_1=1\), hence by (3.1), (3.17) and (3.18) we have

$$\begin{aligned} \sum _{i=1}^N d_i \varepsilon _i=4\varepsilon . \end{aligned}$$

Therefore, assuming that \(\varepsilon \in (0,1)\) is fixed, the relevant constant in Lemma 2.2 is \(\varepsilon _0= \max \{5,20+12\varepsilon \} 4\varepsilon \leqslant 199 \varepsilon \). This shows that if \(\varepsilon \) is fixed and \(200 \varepsilon <1/3\) then

$$\begin{aligned} \frac{K_{\mathscr {R}}^{1+\varepsilon _0+\varepsilon }}{\lambda _G} \ll (x \log x)^{1+200 \varepsilon } \ll x^{3/2} , \end{aligned}$$

hence the secondary term of Lemma 2.2 makes a satisfactory contribution. The contribution of the first term of Lemma 2.2 towards the sum in (3.16) is

$$\begin{aligned}&\ll \frac{x^2}{(\log x)^n} \exp \left( \sum _{i =1}^{n-1} \sum _{\begin{array}{c} \mathfrak {p}\in \mathscr {P}_i^\circ \\ \text {N}\,_i\mathfrak {p}\ll x^2 \end{array}}\frac{1+\psi _i(\mathfrak {p})}{\text {N}\,_i \mathfrak {p}} + (1+\eta ) \sum _{\begin{array}{c} \mathfrak {p}\in \mathscr {P}_n^\circ \\ \text {N}\,_n\mathfrak {p}\ll x^2 \end{array}}\frac{1}{\text {N}\,_n \mathfrak {p}}\right) \\&\ll \frac{x^2}{(\log x)^n} \exp ((n-1)\log \log x+(1+\eta ) \log \log x)\\&\ll x^2 (\log x)^{\eta }. \end{aligned}$$

The proof of these estimates is standard and will not be repeated here. (See Heilbronn [18], for example.) Thus \(B_{\infty }^{(1)}(x) \ll x^2 (\log x)^{-(1+\eta )\log (1+\eta )+\eta }\). The exponent of the logarithm is strictly negative for all \(\eta >0\), which is clearly sufficient for (3.14).
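The negativity of the exponent is a one-line calculus exercise, which we record for completeness. Writing \(\varphi (\eta )=\eta -(1+\eta )\log (1+\eta )\), where the notation \(\varphi \) is introduced only for this remark, we have

$$\begin{aligned} \varphi (0)=0 \quad \text { and } \quad \varphi '(\eta )=-\log (1+\eta )<0 \quad \text { for } \eta >0, \end{aligned}$$

so that \(\varphi (\eta )<0\) for all \(\eta >0\).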

3.3.2 Second case: \((b_n s-\theta _n t)\) has few prime divisors

We denote by \(B_\infty ^{(2)}(x)\) the contribution to \(B_{\infty }(x)\) from the set of vectors \((s,t)\) for which \( \Omega _n((b_n s-\theta _n t)_W) \leqslant (1+\eta ) \log \log x. \) Recall from the definition (3.11) of \(A_n^{(\infty )}(x)\) that there exists \(\mathfrak {d}\in \mathscr {P}_n\) such that \(\mathfrak {d}\mid (b_n s-\theta _n t)\), with \(\gcd (\text {N}\,_n\mathfrak {d},W)=1\) and

$$\begin{aligned} L^{-1}X^{\frac{d_n}{2}}< \text {N}\,_n\mathfrak {d}< L X^{\frac{d_n}{2}} . \end{aligned}$$

Condition (C3) ensures that \( \text {N}\,_n((b_ns-\theta _n t)_W)\asymp X^{d_n} \). Defining \(\mathfrak {e}\) via the factorisation \(\mathfrak {d}\mathfrak {e}=(b_n s-\theta _n t)_W\), we can then infer that \(\gcd (\text {N}\,_n\mathfrak {e},W)=1\) and \(\mathfrak {e}\in \mathscr {P}_n\), with \( L^{-1}X^{\frac{d_n}{2}} \ll \text {N}\,_n\mathfrak {e}\ll L X^{\frac{d_n}{2}}, \) where the implied constants depend at most on \(\mathscr {D}\) and \(F_n\). Note that

$$\begin{aligned} \Omega _n(\mathfrak {d}) + \Omega _n(\mathfrak {e}) =\Omega _n((b_n s-\theta _n t)_W) \leqslant (1+\eta ) \log \log x. \end{aligned}$$

Thus, either \(\Omega _n(\mathfrak {d}) \leqslant \frac{1}{2} (1+\eta ) \log \log x\), or \(\Omega _n(\mathfrak {e}) \leqslant \frac{1}{2} (1+\eta ) \log \log x\). We will assume without loss of generality that we are in the latter case.

It follows that

$$\begin{aligned} B_{\infty }^{(2)}(x) \ll \sum _{\begin{array}{c} \mathfrak {e}\in \mathscr {P}_n\\ L^{-1}X^{\frac{d_n}{2}} \ll \text {N}\,_n\mathfrak {e}\ll L X^{\frac{d_n}{2}} \\ \Omega _n(\mathfrak {e}) \leqslant \frac{1}{2} (1+\eta ) \log \log x \\ \gcd (\text {N}\,_n\mathfrak {e},W)=1 \end{array}} B_{\mathfrak {e}}(x), \end{aligned}$$

where

$$\begin{aligned} B_{\mathfrak {e}}(x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D} \\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \\ \mathfrak {e}| (b_ns -\theta _n t) \end{array}} \prod _{i=1}^{n-1} r_i(s,t). \end{aligned}$$

This is a non-archimedean version of Dirichlet’s hyperbola trick, where instead of looking at the complementary divisor to reduce the size, we have tried to reduce the number of prime divisors. Lemma 2.4 implies that the condition \(\mathfrak {e}\mid (b_ns -\theta _n t)\) defines a lattice in \(\mathbb {Z}^2\) of determinant \(e=\text {N}\,_n\mathfrak {e}\), which we shall call G. Hence we may write

$$\begin{aligned} B_{\mathfrak {e}}(x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\text {prim}}^2\cap x\mathscr {D} \cap G \\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} \prod _{i=1}^{n-1} r_i(s,t). \end{aligned}$$

Let \(\mathbf {v}\in \mathbb {Z}^2\) be such that \(|\mathbf {v}|=\max \{|v_1|,|v_2|\}\) is the first successive minimum of G. Lemma 2.2 can be applied with \(\mathscr {R}=x\mathscr {D}\), \(q_G=e\), \(N=n-1\), and

$$\begin{aligned} f_i(\mathfrak {a})=\sum _{\mathfrak {d}\mid \mathfrak {a}}\psi _i(\mathfrak {d}), \end{aligned}$$

for \(1\leqslant i\leqslant n-1\). For such \(f_i\) one can take \(\varepsilon _i\) in Lemma 2.2 to be arbitrarily small, whence

$$\begin{aligned} B_\mathfrak {e}(x)\ll x^2\frac{h^*(e)}{e} +\frac{x^{1+\varepsilon }}{|\mathbf {v}|}, \end{aligned}$$

for any \(\varepsilon >0\), where

$$\begin{aligned} h^*(e)=\prod _{p\mid e} \left( 1-\frac{\overline{\rho }_1(p)+\dots +\overline{\rho }_{n-1}(p)}{p+1}\right) ^{-1}. \end{aligned}$$

(Note that \(h_W^*(e)=h^*(e)\), since \(\gcd (e,W)=1\).)

We have \(e=\text {N}\,_n\mathfrak {e}\ll LX^{\frac{d_n}{2}}\) and so \(|\mathbf {v}|\ll \sqrt{LX^{\frac{d_n}{2}}}\leqslant \sqrt{LX}\), since \(d_n\leqslant 2\). Since \(F_n\) is irreducible, we note that \(d_n=1\) when \(F_n(\mathbf {v})=0\). Next, we introduce \(g(e)=\sharp \{\mathfrak {e}\in \mathscr {P}_n: \text {N}\,_n\mathfrak {e}=e\}.\) The second term is therefore seen to make the overall contribution

$$\begin{aligned}&\ll x^{1+\varepsilon } \sum _{\begin{array}{c} |\mathbf {v}|\ll \sqrt{LX} \\ F_n(\mathbf {v})\ne 0 \end{array}} \frac{1}{|\mathbf {v}|} \sum _{e\mid F_n(\mathbf {v})} g(e) + x^{1+\varepsilon } \sum _{\begin{array}{c} |\mathbf {v}|\ll \sqrt{LX} \\ F_n(\mathbf {v})= 0 \end{array}} \frac{1}{|\mathbf {v}|} \sum _{e\ll L\sqrt{X}} g(e) \ll x^{\frac{3}{2}+2\varepsilon }, \end{aligned}$$

which is satisfactory.

Next, the overall contribution from the term \(x^2 h^*(e)/e\) is \(O( x^2 \Sigma )\), where

$$\begin{aligned} \Sigma = \sum _{\begin{array}{c} L^{-1}X^{\frac{d_n}{2}} \ll e \ll L X^{\frac{d_n}{2}} \\ \Omega (e) \leqslant \frac{1}{2} (1+\eta ) \log \log x \\ \gcd (e,W)=1 \end{array}} \frac{g(e)h^*(e)}{e} . \end{aligned}$$

Letting \(A= \left( \frac{1+\eta }{2}\right) ^{-1} >1\), we get

$$\begin{aligned} \Sigma \ll (\log x)^{\frac{\log A}{A}} \sum _{\begin{array}{c} L^{-1}X^{\frac{d_n}{2}} \ll e \ll L X^{\frac{d_n}{2}} \\ \gcd (e,W)=1 \end{array}} \frac{g(e)h^*(e)}{e} A^{-\Omega (e)} . \end{aligned}$$

Put

$$\begin{aligned} S(y)= \sum _{\begin{array}{c} e \leqslant y \\ \gcd (e,W)=1 \end{array}} g(e)h^*(e)A^{-\Omega (e)}. \end{aligned}$$

Then it follows from Shiu’s work [27] that

$$\begin{aligned} S(y) \ll \frac{y}{\log y} \exp \left( A^{-1}\sum _{\begin{array}{c} p\leqslant y\\ p\not \mid W \end{array}}\frac{g(p)h^*(p)}{p}\right)&\ll \frac{y}{\log y} \exp \left( A^{-1}\sum _{\begin{array}{c} p\leqslant y\\ p\not \mid W \end{array}}\frac{\overline{\rho }_n(p)}{p}\right) \\&\ll y(\log y)^{\frac{1}{A}-1}. \end{aligned}$$

Partial summation now leads to the estimate

$$\begin{aligned} B_\infty ^{(2)}(x)&\ll x^2 (\log \log x) (\log x)^{\frac{\log A}{A}+\frac{1}{A}-1}\\&= x^2 (\log \log x) (\log x)^{\frac{\eta -1}{2}-\left( \frac{1+\eta }{2}\right) \log \left( \frac{1+\eta }{2}\right) }. \end{aligned}$$

The exponent of \(\log x\) is strictly negative for all \(\eta \in (0,1)\), which thereby completely settles the proof of (3.14).
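The negativity of this exponent can be sanity-checked numerically. The following is a minimal sketch, not part of the argument: the function names `log_exponent` and `alt_exponent` are illustrative, and the grid of sample values of \(\eta \) is an arbitrary choice. It evaluates the exponent in both of the forms displayed above and confirms that they agree.

```python
import math

def log_exponent(eta: float) -> float:
    """Exponent (eta-1)/2 - ((1+eta)/2) * log((1+eta)/2) of log x."""
    u = (1.0 + eta) / 2.0  # u = 1/A lies in (1/2, 1) when eta is in (0, 1)
    return (eta - 1.0) / 2.0 - u * math.log(u)

def alt_exponent(eta: float) -> float:
    """The same exponent written as log(A)/A + 1/A - 1 with A = 2/(1+eta)."""
    A = 2.0 / (1.0 + eta)
    return math.log(A) / A + 1.0 / A - 1.0

# Sample eta on a grid strictly inside (0, 1); the grid is an illustrative choice.
samples = [k / 1000.0 for k in range(1, 1000)]
values = [log_exponent(eta) for eta in samples]
all_negative = all(v < 0 for v in values)
```

Writing \(u=\frac{1+\eta }{2}\), the exponent equals \(u-1-u\log u\), which vanishes at \(u=1\) and has derivative \(-\log u>0\) on \((\frac{1}{2},1)\); this is the analytic reason behind what the sketch observes.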

3.4 The Bretèche–Tenenbaum sum

We saw in Sect. 2.5 that the Hooley \(\Delta \)-function defined in (2.14) belongs to \(\mathscr {M}_n\). The stage is now set for an application of Lemma 2.2 with \(N=n\) and \(G=\mathbb {Z}^2\), and with \(f_N(\mathfrak {a})= \Delta (\mathfrak {a};\psi _n)^2\) and \( f_i(\mathfrak {a})=\sum _{\mathfrak {d}\mid \mathfrak {a}} \psi _i(\mathfrak {d}), \) for \(i<N\). For such \(f_i\) one can take \(\varepsilon _i\) in Lemma 2.2 to be arbitrarily small, which gives

$$\begin{aligned} H_{\infty }(x) \ll \frac{x^2}{\log x} E_{\Delta (\cdot ;\psi _n)^2}(x^2;W) \end{aligned}$$

in (3.13). The statement of (3.15) now follows from part (ii) of Lemma 2.7.

3.5 Small divisors

In this section we establish (3.9), as required to complete the proof of Proposition 3.1. When \(n>1\), the proof follows from the treatment in [15] and will not be repeated here. Thus, provided that one takes \(\alpha \) to be sufficiently large in the definition (3.7) of L, one gets an asymptotic formula for \(D_{\mathbf {j}}(x)\) with a logarithmic saving in the error term. The proof of (3.9) when \(n=1\) is more complicated. In this case \(F_1\) is an irreducible binary quartic form. In order to simplify the notation, we shall drop the index \(n=1\) in what follows (and in particular, we shall denote \(\mathscr {P}_{K_1}=\mathscr {P}_1\) by \(\mathscr {P}\)). Our task is to estimate

$$\begin{aligned} D_j(x) = \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} f(F(s,t)_W) r^{(j)}(s,t) , \end{aligned}$$

for \(j\in \{0,1\}\). Opening up the definition of \(f(F(s,t)_W)\), it follows from parts (i) and (ii) of Lemma 2.3 that

$$\begin{aligned} f(F(s,t)_W)= & {} \sum _{\begin{array}{c} e\mid F(s,t)\\ \gcd (e,W)=1 \end{array}} \frac{\tau (e)\mu (e)}{e} =\sum _{\begin{array}{c} \mathfrak {e}\mid (bs-\theta t)\\ \gcd (\text {N}\,\mathfrak {e},W)=1 \\ \mathfrak {e}\in \mathscr {P} \end{array}} \frac{\tau (\mathfrak {e})\mu (\mathfrak {e})}{\text {N}\,\mathfrak {e}}, \end{aligned}$$

since \(\tau (\text {N}\,\mathfrak {e})=\tau _{K_1}(\mathfrak {e})=\tau (\mathfrak {e})\), say, for any \(\mathfrak {e}\in {\mathscr {P}}\).

Let \(y>0\). The overall contribution to \(D_j(x)\) from \(\mathfrak {e}\) such that \(\text {N}\,\mathfrak {e}>y\) is

$$\begin{aligned} \ll \sum _{\begin{array}{c} y<\text {N}\,\mathfrak {e}\ll x^4\\ \gcd (\text {N}\,\mathfrak {e},W)=1 \\ \mathfrak {e}\in \mathscr {P} \end{array}} \frac{\tau (\mathfrak {e})|\mu (\mathfrak {e})|}{\text {N}\,\mathfrak {e}} \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ \mathfrak {e}\mid (bs-\theta t) \end{array}} r^{(j)}(s,t). \end{aligned}$$

The condition \(\mathfrak {e}\mid (bs-\theta t)\) defines a lattice in \(\mathbb {Z}^2\) of determinant \(\text {N}\,\mathfrak {e}\) by Lemma 2.4. Thus we can apply Lemma 2.2, finding that

$$\begin{aligned} \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ \mathfrak {e}\mid (bs-\theta t) \end{array}} r^{(j)}(s,t)\ll x^2\frac{h_W^*(\text {N}\,\mathfrak {e})}{\text {N}\,\mathfrak {e}}+x^{1+\frac{\varepsilon }{2}}, \end{aligned}$$

for any \(\varepsilon >0\), where \(h^*\) is given by (2.6) with \(N=1\). Hence we arrive at the overall contribution

$$\begin{aligned}&\ll x^2\sum _{\begin{array}{c} \text {N}\,\mathfrak {e}>y \end{array}} (\text {N}\,\mathfrak {e})^{-2+\varepsilon } + x^{1+\frac{\varepsilon }{2}}\sum _{\begin{array}{c} \text {N}\,\mathfrak {e}\ll x^4 \end{array}} (\text {N}\,\mathfrak {e})^{-1+\frac{\varepsilon }{8}} \ll \frac{x^{2}}{\sqrt{y}} + x^{1+\varepsilon }, \end{aligned}$$

from \(\text {N}\,\mathfrak {e}>y\). Taking \(y=\log \log x\), we therefore conclude that

$$\begin{aligned} D_j(x)= \sum _{\begin{array}{c} \text {N}\,\mathfrak {e}\leqslant \log \log x\\ \gcd (\text {N}\,\mathfrak {e},W)=1 \\ \mathfrak {e}\in \mathscr {P} \end{array}} \frac{\tau (\mathfrak {e})\mu (\mathfrak {e})}{\text {N}\,\mathfrak {e}} \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ \mathfrak {e}\mid (bs-\theta t)\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}} r^{(j)}(s,t) +O\left( \frac{x^2}{\sqrt{\log \log x}}\right) . \end{aligned}$$

Note that by enlarging w we may assume that any prime factor of b is present in the factorisation of W.

We henceforth focus on the case \(j=0\), the case \(j=1\) being similar. First, we define for any \(\mathfrak {a}\in \mathscr {P}\) with \(\gcd (\text {N}\,\mathfrak {a},W)=1\) the set

$$\begin{aligned} \mathscr {H}(\mathfrak {a}) =\big \{(s,t)\in \mathbb {Z}^2:\mathfrak {a}\mid (bs-\theta t)\big \}. \end{aligned}$$

By Lemma 2.4 there exists \(k=k(\mathfrak {a})\in \mathbb {Z}\) such that a vector \((s,t) \in \mathbb {Z}^2\) belongs to \(\mathscr {H}(\mathfrak {a})\) if and only if \(\text {N}\,\mathfrak {a}\mid bs-k t\). Therefore, \(\mathscr {H}(\mathfrak {a})\) is a lattice in \(\mathbb {Z}^2\) of determinant \(\text {N}\,\mathfrak {a}\). Recalling the definition of \(r^{(0)}(s,t)\) we obtain

$$\begin{aligned} \begin{aligned} D_{0}(x) =~&\sum _{\begin{array}{c} \text {N}\,\mathfrak {e}\leqslant \log \log x\\ \gcd (\text {N}\,\mathfrak {e},W)=1 \\ \mathfrak {e}\in \mathscr {P} \end{array}} \frac{\tau (\mathfrak {e})\mu (\mathfrak {e})}{\text {N}\,\mathfrak {e}} \sum _{\begin{array}{c} \text {N}\,\mathfrak {d}\leqslant L^{-1}X^2 \\ \gcd (\text {N}\,\mathfrak {d},W)=1\\ \mathfrak {d}\in \mathscr {P} \end{array} } \psi (\mathfrak {d}) \sum _{\begin{array}{c} (s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\cap x\mathscr {D}\\ (s,t) \in \mathscr {H}(\mathfrak {d}) \cap \mathscr {H}(\mathfrak {e})\\ (s,t)\equiv (s_0,t_0)\,({{\mathrm{mod}}}{\,W}) \end{array}}1\\&+ O\left( \frac{x^2}{\sqrt{\log \log x}}\right) . \end{aligned} \end{aligned}$$
(3.19)

In fact, for coprime integers s, t, part (i) of Lemma 2.3 ensures that we can only have \((s,t) \in \mathscr {H}(\mathfrak {d}) \cap \mathscr {H}(\mathfrak {e})\) if the least common multiple \([\mathfrak {d},\mathfrak {e}]\) of \(\mathfrak {d}\) and \(\mathfrak {e}\) belongs to \( \mathscr {P}\). It now follows from Lemma 2.4 that there exists \(k=k(\mathfrak {d},\mathfrak {e})\in \mathbb {Z}\) such that \((s,t) \in \mathscr {H}(\mathfrak {d}) \cap \mathscr {H}(\mathfrak {e})\) if and only if \(bs\equiv kt \,({{\mathrm{mod}}}{\,M})\), where \(M=[\text {N}\,\mathfrak {d},\text {N}\,\mathfrak {e}]\) is the least common multiple of \(\text {N}\,\mathfrak {d}\) and \(\text {N}\,\mathfrak {e}\). We let \(\mathbf {v}(M)=\mathbf {v}(M;\mathfrak {d},\mathfrak {e})\) denote a fixed non-zero vector \((s,t)\in \mathbb {Z}^2\) of minimal length such that \(bs\equiv kt \,({{\mathrm{mod}}}{\,M})\).

Note that \(\gcd (b,k,M)=1\), since \(\gcd (M,W)=1\) and we chose W in such a way that any prime factor of b also divides W. The inner sum over s, t is now in a form that is suitable for Lemma 2.6, with \(c=k\), \(e=M\) and
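To make the definition of \(\mathbf {v}(M)\) concrete, here is a minimal brute-force sketch. It assumes that length is measured in the sup-norm \(\max \{|s|,|t|\}\), as for the first successive minimum used earlier, and the function name `minimal_vector` is hypothetical. The search box of side \(\lfloor \sqrt{M}\rfloor +1\) relies on the standard consequence of Minkowski's theorem that a lattice of determinant M in \(\mathbb {Z}^2\) contains a non-zero vector of sup-norm at most \(\sqrt{M}\).

```python
from math import gcd, isqrt

def minimal_vector(b: int, k: int, M: int) -> tuple:
    """Brute-force a non-zero (s, t) of least sup-norm with b*s = k*t (mod M).

    Assumes gcd(b, k, M) = 1, so that the congruence cuts out a lattice
    of determinant M in Z^2.
    """
    assert gcd(gcd(b, k), M) == 1
    bound = isqrt(M) + 1  # Minkowski: a solution of sup-norm <= sqrt(M) exists
    best_norm, best_vec = None, None
    for s in range(-bound, bound + 1):
        for t in range(-bound, bound + 1):
            if (s, t) == (0, 0) or (b * s - k * t) % M != 0:
                continue
            norm = max(abs(s), abs(t))
            if best_norm is None or norm < best_norm:
                best_norm, best_vec = norm, (s, t)
    return best_vec

# Illustrative parameters, not taken from the text.
v = minimal_vector(3, 5, 101)
```

The same pigeonhole bound is what underlies the estimate \(|\mathbf {v}(d')|\ll \sqrt{d'}\) used later in this section.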

$$\begin{aligned} 1\ll \delta (\mathscr {D})\ll 1, \quad \beta , \gamma \ll 1. \end{aligned}$$

Arguing as in [15, Sects.4.3–4.5], once inserted into (3.19), the contribution from the main term (denoted by \(M_{\varvec{\psi }}\) in [15]) in Lemma 2.6 is \(\gg x^2\). This is satisfactory for (3.9). It remains to consider the effect of substituting the error term in Lemma 2.6.

Let

$$\begin{aligned} r^*(m)=\sharp \{\mathfrak {a}\in \mathscr {P}^\circ : \text {N}\,\mathfrak {a}=m,~ \gcd (\text {N}\,\mathfrak {a},W)=1\}, \end{aligned}$$

for any \(m\in \mathbb {N}\), where we recall that \(\mathscr {P}^\circ \) is the multiplicative span of prime ideals with residue degree 1. This function is multiplicative and has constant average order. We claim that \( r^*(cd)\leqslant r^*(c)r^*(d) \) for all \(c,d \in \mathbb {N}\), a fact that we shall use throughout this section. It is enough to consider the case \(c=p^a\) and \(d=p^b\) for a rational prime \(p\not \mid W\) with \(r^*(p)\ne 0\). Letting \(\mathfrak {p}_1,\ldots ,\mathfrak {p}_{m+1}\) be all the degree 1 prime ideals above p, we easily see that \(r^*(p^k)={ k+ m \atopwithdelims ()m }\). We therefore have to verify that

$$\begin{aligned} { a+b+m \atopwithdelims ()m } \leqslant { a+m \atopwithdelims ()m } { b+m \atopwithdelims ()m }, \end{aligned}$$

for all integers \(a,b,m\geqslant 0\). This is obvious when \(m=0\). When \(m\geqslant 1\) the inequality is equivalent to

$$\begin{aligned} 1\leqslant \prod _{i=1}^m\frac{(a+i)(b+i)}{i(a+b+i)}, \end{aligned}$$

the validity of which is clear.
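Both the count \(r^*(p^k)={ k+ m \atopwithdelims ()m }\) and the binomial inequality can be checked on small cases. The sketch below is illustrative only: it models an ideal of norm \(p^k\) as a multiset of size k drawn from the \(m+1\) degree 1 primes above p, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def r_star_prime_power(k: int, m: int) -> int:
    """Count multisets of size k from m+1 primes, i.e. products
    p_1^{a_1} ... p_{m+1}^{a_{m+1}} with a_1 + ... + a_{m+1} = k."""
    return sum(1 for _ in combinations_with_replacement(range(m + 1), k))

def submultiplicative(a: int, b: int, m: int) -> bool:
    """Check binom(a+b+m, m) <= binom(a+m, m) * binom(b+m, m)."""
    return comb(a + b + m, m) <= comb(a + m, m) * comb(b + m, m)

# Small ranges chosen for illustration.
counts_match = all(
    r_star_prime_power(k, m) == comb(k + m, m)
    for k in range(7) for m in range(4)
)
inequality_holds = all(
    submultiplicative(a, b, m)
    for a in range(12) for b in range(12) for m in range(6)
)
```

Each factor in the displayed product satisfies \((a+i)(b+i)=i(a+b+i)+ab\geqslant i(a+b+i)\), which is why no counterexample can appear.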

The error term in Lemma 2.6 is composed of two parts. According to (2.13), the second part contributes

$$\begin{aligned}&\ll x \sum _{\begin{array}{c} \text {N}\,\mathfrak {d}\ll x^2/L\\ \text {N}\,\mathfrak {e}\leqslant \log \log x \end{array}} \frac{\tau (\mathfrak {e})|\mu (\mathfrak {e})|}{\text {N}\,\mathfrak {e}}\cdot \frac{1}{M} \sum _{d\mid M} \sqrt{d}, \end{aligned}$$

with \(M=[\text {N}\,\mathfrak {d},\text {N}\,\mathfrak {e}].\) Taking \(M\geqslant \text {N}\,\mathfrak {d}=q\), say, and

$$\begin{aligned} \sum _{d\mid M} \sqrt{d}\leqslant \tau (\text {N}\,\mathfrak {e})\sqrt{\text {N}\,\mathfrak {e}} \sum _{d\mid q} \sqrt{d}, \end{aligned}$$

we conclude that the second part contributes

$$\begin{aligned} \ll x \sum _{\begin{array}{c} q\ll x^2/L\\ \text {N}\,\mathfrak {e}\leqslant \log \log x \end{array}} \frac{\tau (\mathfrak {e})^2|\mu (\mathfrak {e})|}{\sqrt{\text {N}\,\mathfrak {e}}}\frac{r^*(q)}{q} \sum _{d\mid q} \sqrt{d} \ll x \log \log x \sum _{\begin{array}{c} q\ll x^2/L \end{array}} \frac{r^*(q)}{q} \sum _{d\mid q} \sqrt{d}. \end{aligned}$$

Writing \(q=cd\) and recalling \(r^*(cd)\leqslant r^*(c)r^*(d)\), this is

$$\begin{aligned}&\ll x \log \log x \sum _{\begin{array}{c} cd\ll x^2/L \end{array}} \frac{r^*(c)r^*(d)}{c\sqrt{d}}\\&\ll x \log \log x \sum _{\begin{array}{c} c\ll x^2/L \end{array}} \frac{r^*(c)}{c}\sqrt{\frac{x^2}{cL}}\\&\ll L^{-\frac{1}{2}}x^2 \log \log x. \end{aligned}$$

This is satisfactory for any \(\alpha >0\) in (3.7).

Finally, the overall contribution from the first part of the error term of Lemma 2.6 is

$$\begin{aligned}&\ll {x} \sum _{\begin{array}{c} {\text {N}\,}{\mathfrak {d}}\ll x^2/L\\ {\text {N}\,}{\mathfrak {e}}\leqslant \log {\log } x\\ {[}{\mathfrak {d}},{\mathfrak {e}}]\in {\mathscr {P}}\\ \gcd ({\text {N}\,}{\mathfrak {d}}{\text {N}\,}{\mathfrak {e}},W)=1 \end{array}} \frac{\tau (\mathfrak {e})|\mu ({\mathfrak {e}})|}{\text {N}\,{\mathfrak {e}}} \sum _{\begin{array}{c} u \in \mathbb {N}\\ u\mid M \end{array}} \frac{1}{u|\mathbf {v}(M/u)|} \log \left( 2+\frac{ x}{u|\mathbf {v}(M/u)|} \right) . \end{aligned}$$

Here we recall that \(\mathbf {v}(M/u)\) is a vector \((s,t)\in \mathbb {Z}^2\) of minimal length for which \(bs\equiv kt \,({{\mathrm{mod}}}{\,M/u})\). In particular it also depends on \(\mathfrak {d}\) and \(\mathfrak {e}\) since k does. Put \(d=\text {N}\,\mathfrak {d}\) and \(e=\text {N}\,\mathfrak {e}\), so that \(M=[d,e].\) If \(u\mid [d,e]\) then we claim that there is a factorisation \(u=u'u''\) such that \(u'\mid d\), \(u''\mid e\) and such that \(d/u'\) divides \([d,e]/u\). To see this let \(\nu _p(d)=\delta \) and \(\nu _p(e)=\varepsilon \) for any prime p. If \(u\mid [d,e]\) then \(\nu _p(u)\leqslant \max \{\delta ,\varepsilon \}\) for any prime p. We take

$$\begin{aligned} u'=\prod _{p^\nu \Vert u} p^{\min \{\nu ,\delta \}} \quad \text { and } \quad u''=\prod _{p^\nu \Vert u} p^{\nu -\min \{\nu ,\delta \}}. \end{aligned}$$

It is clear that \(u'\mid d\) and \(u''\mid e\). Moreover, one easily checks that

$$\begin{aligned} \nu _p(d/u')=\delta -\min \{\nu ,\delta \} \leqslant \max \{\delta ,\varepsilon \}-\nu = \nu _p([d,e]/u), \end{aligned}$$

for any prime p, whence \(d/u'\mid [d,e]/u\). In particular, this implies that

$$\begin{aligned} |\mathbf {v}([d,e]/u;\mathfrak {d},\mathfrak {e})|\geqslant |\mathbf {v}(d/u';\mathfrak {d},\mathfrak {e})|. \end{aligned}$$
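The factorisation \(u=u'u''\) can be illustrated computationally. Since \(\nu _p(u')=\min \{\nu _p(u),\nu _p(d)\}\) for every prime p, one has \(u'=\gcd (u,d)\), and the sketch below uses this observation; the function names `lcm` and `split_divisor` are hypothetical.

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)

def split_divisor(u: int, d: int, e: int) -> tuple:
    """Split a divisor u of [d, e] as u = u1 * u2 with u1 | d and u2 | e.

    Here u1 = gcd(u, d) has p-adic valuation min(v_p(u), v_p(d)),
    matching the definition of u' in the text.
    """
    assert lcm(d, e) % u == 0
    u1 = gcd(u, d)
    u2 = u // u1
    return u1, u2

def check_all(limit: int) -> bool:
    """Verify the three divisibility claims for all d, e <= limit."""
    for d in range(1, limit + 1):
        for e in range(1, limit + 1):
            m = lcm(d, e)
            for u in range(1, m + 1):
                if m % u != 0:
                    continue
                u1, u2 = split_divisor(u, d, e)
                if d % u1 or e % u2 or (m // u) % (d // u1):
                    return False
    return True

claims_hold = check_all(30)
```

The final divisibility \(d/u'\mid [d,e]/u\) is what allows the congruence modulo \([d,e]/u\) to be weakened to one modulo \(d/u'\), so the shortest vector can only get shorter.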

Our argument so far shows that the term in which we are interested is

$$\begin{aligned} \ll x \sum _{\begin{array}{c} e\leqslant \log \log x \end{array}} \frac{\tau (e)^2}{e} \sum _{ \begin{array}{c} \mathfrak {e}\in \mathscr {P}\\ \text {N}\,\mathfrak {e}=e \end{array}} S(\mathfrak {e}), \end{aligned}$$
(3.20)

where

$$\begin{aligned} S(\mathfrak {e})&= \sum _{\begin{array}{c} d\ll x^2/L\\ \gcd (d,W)=1 \end{array}} \sum _{ \begin{array}{c} \mathfrak {d}\in \mathscr {P}\\ \text {N}\,\mathfrak {d}=d \end{array}} \sum _{\begin{array}{c} u'\mid d \end{array}} \frac{1}{u'|\mathbf {v}(d/u')|} \log \left( 2+\frac{ x}{u'|\mathbf {v}(d/u')|}\right) \\&\leqslant \sum _{\begin{array}{c} u'\ll x^2/L\\ \gcd (u',W)=1 \end{array}} \frac{1}{u'} \sum _{\begin{array}{c} d'\ll x^2/(u'L) \\ \gcd (d',W)=1 \end{array}} \sum _{ \begin{array}{c} \mathfrak {d}\in \mathscr {P}\\ \text {N}\,\mathfrak {d}=d'u' \end{array}} \frac{1}{|\mathbf {v}(d')|} \log \left( 2+\frac{ x}{|\mathbf {v}(d')|}\right) , \end{aligned}$$

with the caveat that \(\mathbf {v}(d')\) still depends on \(\mathfrak {d}\) and \(\mathfrak {e}\). Moreover if there exists \(\mathfrak {d}\in \mathscr {P}\) with \(\gcd (\text {N}\,\mathfrak {d},W)=1\) such that \(\text {N}\,\mathfrak {d}=d'u'\) then there exists \(\mathfrak {d}'\in \mathscr {P}\) with \(\gcd (\text {N}\,\mathfrak {d}',W)=1\) such that \(\text {N}\,\mathfrak {d}'=d'\). Hence \(\mathfrak {d}'\) must divide \((\mathbf {v}(d')_1-\theta \mathbf {v}(d')_2)\) and so it follows that \(d' \mid F(\mathbf {v}(d'))\). Furthermore, we note that \(|\mathbf {v}(d')|\ll \sqrt{d'}\ll x/\sqrt{L}\) in our upper bound for \(S(\mathfrak {e})\).

The contribution from \(d',\mathfrak {d}\) for which \(|\mathbf {v}(d')|\leqslant x/(\log x)^\Upsilon \) is seen to be

$$\begin{aligned}&\ll \log x \sum _{\begin{array}{c} u'\ll x^2/L\\ \gcd (u',W)=1 \end{array}} \frac{r^*(u')}{u'} \sum _{\begin{array}{c} \mathbf {v}=(v_1,v_2)\in \mathbb {Z}^2\\ 0<|\mathbf {v}|\leqslant x/(\log x)^\Upsilon \end{array}} \frac{1}{|\mathbf {v}|} \sum _{\begin{array}{c} d'\mid F(\mathbf {v}) \end{array}} r^*(d')\ll x(\log x) ^{-\Upsilon +10}, \end{aligned}$$

by [1]. Here we have used the fact that \(r^*(d')\leqslant \tau _4(d')\) and

$$\begin{aligned} \sum _{\begin{array}{c} u'\leqslant U \end{array}} \frac{r^*(u')}{u'} \leqslant \sum _{\begin{array}{c} u'\leqslant U \end{array}}\frac{r_K(u')}{u'} \ll \log U, \end{aligned}$$
(3.21)

where \(r_K\) are the coefficients in the associated Dedekind zeta function. Once inserted into (3.20) this contributes

$$\begin{aligned} \ll x^2(\log x)^{-\Upsilon +10} \sum _{e\leqslant \log \log x} \frac{\tau (e)^2r^*(e)}{e}\ll x^2(\log x)^{-\Upsilon +9}, \end{aligned}$$

which is satisfactory, on taking \(\Upsilon \) sufficiently large.

In the opposite case, we plainly have \( d'\gg |\mathbf {v}(d')|^2\geqslant x^2/(\log x)^{2\Upsilon }, \) whence \(\log (2+x/|\mathbf {v}(d')|)\ll _\Upsilon \log \log x\). Moreover, the inequalities \(d'\ll x^2/(u'L)\) and \(d'\gg x^2/(\log x)^{2\Upsilon }\) together provide us with \(u'\ll (\log x)^{2\Upsilon }\). Thus it remains to study the contribution

$$\begin{aligned}&\ll _\Upsilon \log \log x \sum _{\begin{array}{c} u'\ll (\log x)^{2\Upsilon } \end{array}} \frac{1}{u'} \sum _{\begin{array}{c} x^2/(\log x)^{2\Upsilon }\ll d'\ll x^2/L \\ \gcd (d',W)=1 \end{array}} \sum _{ \begin{array}{c} \mathfrak {d}\in \mathscr {P}\\ \text {N}\,\mathfrak {d}=d'u' \\ |\mathbf {v}(d')|\geqslant x/(\log x)^\Upsilon \end{array}} \frac{1}{|\mathbf {v}(d')|}\\&\ll _\Upsilon \log \log x \sum _{\begin{array}{c} u'\ll (\log x)^{2\Upsilon } \end{array}} \frac{1}{u'} \sum _{\begin{array}{c} \mathbf {v}\in \mathbb {Z}^2 \\ |\mathbf {v}|\ll x/\sqrt{L} \end{array}} \frac{1}{|\mathbf {v}|} \sum _{\begin{array}{c} x^2/(\log x)^{2\Upsilon }\ll d'\ll x^2/L \\ \gcd (d',W)=1 \end{array}} \sum _{ \begin{array}{c} \mathfrak {d}\in \mathscr {P}\\ \text {N}\,\mathfrak {d}=d'u' \\ bv_1\equiv kv_2\,({{\mathrm{mod}}}{\,d'}) \end{array}}1, \end{aligned}$$

where we recall that k depends on \(\mathfrak {d}\) and \(\mathfrak {e}\). For any \(\mathfrak {d}\in \mathscr {P}\) with \(\text {N}\,\mathfrak {d}=d'u'\) and \(\gcd (\text {N}\,\mathfrak {d},W)=1\), there is a factorisation \(\mathfrak {d}=\mathfrak {d}_1\mathfrak {d}_2\) with \(\mathfrak {d}_1,\mathfrak {d}_2\in \mathscr {P}\) such that \(\text {N}\,\mathfrak {d}_1=d'\), \(\text {N}\,\mathfrak {d}_2=u'\). Hence

$$\begin{aligned} \sum _{ \begin{array}{c} \mathfrak {d}\in \mathscr {P}\\ \text {N}\,\mathfrak {d}=d'u' \\ bv_1\equiv kv_2\,({{\mathrm{mod}}}{\,d'}) \end{array}}1\leqslant r^*(u') \sum _{ \begin{array}{c} \mathfrak {d}_1\in \mathscr {P}\\ \text {N}\,\mathfrak {d}_1=d' \\ \mathfrak {d}_1\mid (bv_1- \theta v_2) \end{array}} 1, \end{aligned}$$

by Lemma 2.4. On appealing to (3.21) to estimate the \(u'\)-sum, we are left with the contribution

$$\begin{aligned}&\ll _\Upsilon (\log \log x)^2 \sum _{\begin{array}{c} \mathbf {v}\in \mathbb {Z}^2 \\ |\mathbf {v}|\ll x/\sqrt{L} \end{array}} \frac{1}{|\mathbf {v}|} \sum _{ \begin{array}{c} \mathfrak {d}_1\in \mathscr {P}\\ \mathfrak {d}_1\mid (bv_1- \theta v_2)\\ x^2/(\log x)^{2\Upsilon }\ll \text {N}\,\mathfrak {d}_1 \ll x^2/L \\ \gcd (\text {N}\,\mathfrak {d}_1,W)=1 \end{array}} 1. \end{aligned}$$

We will need to restrict the outer sum to a sum over primitive vectors in order to bring Lemma 2.2 into play. Let \(h=\gcd (v_1,v_2)\) so that \(\mathbf {v}=h\mathbf {w}\) for \(\mathbf {w}\in \mathbb {Z}_{\text {prim}}^2\). Then \((bv_1- \theta v_2) = (h)(bw_1- \theta w_2)\), where (h) is the principal ideal generated by h. By unique factorisation, we have \(\mathfrak {d}_1 \mid (h)(bw_1- \theta w_2)\) if and only if

$$\begin{aligned} {\mathfrak {f}}^{-1}\mathfrak {d}_1 \mid (bw_1- \theta w_2), \end{aligned}$$

where \({\mathfrak {f}}\) is defined to be the greatest common ideal divisor of \(\mathfrak {d}_1\) and (h). Writing \(\mathfrak {c}={\mathfrak {f}}^{-1}\mathfrak {d}_1\), we see that

$$\begin{aligned} \sum _{ \begin{array}{c} \mathfrak {d}_1\in \mathscr {P}\\ \mathfrak {d}_1\mid (bv_1- \theta v_2)\\ x^2/(\log x)^{2\Upsilon }\ll \text {N}\,\mathfrak {d}_1 \ll x^2/L \\ \gcd (\text {N}\,\mathfrak {d}_1,W)=1 \end{array}} 1 \leqslant \sum _{\begin{array}{c} {\mathfrak {f}}\in {\mathscr {P}}\\ {\mathfrak {f}}\mid (h)\\ \gcd (\text {N}\,{\mathfrak {f}},W)=1 \end{array}} \sum _{ \begin{array}{c} \mathfrak {c}\in \mathscr {P}\\ \mathfrak {c}\mid (bw_1- \theta w_2)\\ \frac{x^2}{(\log x)^{2\Upsilon }\text {N}\,{\mathfrak {f}} }\ll \text {N}\,\mathfrak {c}\ll \frac{x^2}{L \text {N}\,{\mathfrak {f}}} \\ \gcd (\text {N}\,\mathfrak {c},W)=1 \end{array}}1. \end{aligned}$$

Splitting into \(\mathrm {e}\)-adic intervals the inner sum is easily seen to be

$$\begin{aligned} \ll _\Upsilon ( \log \log x) \Delta ((bw_1-\theta w_2)_W), \end{aligned}$$

where \(\Delta (\cdot )= \Delta (\cdot ,\mathbf {1})\), in the notation of Sect. 2.5. Since there are at most \(r^*(h)\) ideals \({\mathfrak {f}}\in {\mathscr {P}}\) such that \({\mathfrak {f}}\mid (h)\) and \(\gcd (\text {N}\,{\mathfrak {f}},W)=1\), we are left with the final contribution

$$\begin{aligned}&\ll _\Upsilon (\log \log x)^3 \sum _{h} \frac{r^*(h)}{h} \sum _{\begin{array}{c} \mathbf {w}\in \mathbb {Z}_{\text {prim}}^2 \\ |\mathbf {w}|\ll x/(h\sqrt{L}) \end{array}} \frac{\Delta ((bw_1-\theta w_2)_W)}{|\mathbf {w}|}. \end{aligned}$$

Splitting into dyadic intervals, we now apply Lemma 2.2 with \(G=\mathbb {Z}^2\), combined with part (i) of Lemma 2.7. Noting that one can take \(\varepsilon _1>0\) in Lemma 2.2 to be arbitrarily small, we deduce that the sum over \(\mathbf {w}\) can be bounded by

$$\begin{aligned} \ll _\varepsilon (\log x)^{\varepsilon /2} \frac{x}{h\sqrt{L}} \end{aligned}$$

for any \(\varepsilon >0\). This leads to the overall bound

$$\begin{aligned}&\ll _{\varepsilon ,\Upsilon } \frac{x(\log x)^\varepsilon }{\sqrt{L}} \sum _{h} \frac{r^*(h)}{h^2} \ll _{\varepsilon ,\Upsilon } \frac{x(\log x)^\varepsilon }{\sqrt{L}}, \end{aligned}$$

which thereby completes the proof of (3.9).

4 The upper bound

This section is concerned with proving the upper bound in Theorem 1.1. Let X be a quartic del Pezzo surface defined over \(\mathbb {Q}\), containing a conic defined over \(\mathbb {Q}\). We continue to follow the convention that all implied constants are allowed to depend in any way upon the surface X.

We appeal to [15, Thm. 5.6 and Rem. 5.9]. This shows that there are binary quadratic forms \(q_{1,1}^{(i)},q_{1,2}^{(i)},q_{2,2}^{(i)}\in \mathbb {Z}[s,t],\) for \(i=1,2\), such that

$$\begin{aligned} N (B) \leqslant \sum _{i=1,2} \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2 \\ |s|,|t| \ll \sqrt{B} \\ \Delta ^{(i)}(s,t)\ne 0 \end{array} } \sharp \left\{ \mathbf {y}\in \mathbb {Z}_{\mathrm {prim}}^3: Q_{s,t}^{(i)}(\mathbf {y})=0, ~\Vert \mathbf {y}\Vert _{s,t}\ll B\right\} , \end{aligned}$$
(4.1)

where \( \Vert \mathbf {y}\Vert _{s,t} = \max \{|s|,|t|\}\max \{|y_1|, |y_2|\}\) and

$$\begin{aligned} Q_{s,t}^{(i)}(\mathbf {y})=q_{1,1}^{(i)}(s,t)y_1^2+q_{1,2}^{(i)}(s,t)y_1y_2+q_{2,2}^{(i)}(s,t)y_2^2+y_3^2. \end{aligned}$$

Moreover, the discriminant \(\Delta ^{(i)}(s,t)\) of \(Q^{(i)}_{s,t}\) is a separable quartic form. The indices \(i=1,2\) are related to the existence of the two complementary conic bundle fibrations. The two cases \(i=1,2\) are treated identically and we shall therefore find it convenient to suppress the index i in the notation. It is now clear that we will need a good upper bound for the number of rational points of bounded height on a conic, which is uniform in the coefficients of the defining equation, a topic that was addressed in Sect. 2.2.

4.1 Application of the bound for conics

Returning to (4.1), we apply Lemma 2.5 to estimate the inner cardinality. For any \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\), an argument of Broberg [5, Lemma 7] shows that \(D_{Q_{s,t}}=O(1)\). In our work W is given by (2.2), with \(\nu =1\) and w a large parameter depending only on X, which we will need to enlarge at various stages of the argument. In the first instance, we assume that \(2D_{Q_{s,t}}<w\ll 1\). We deduce that

$$\begin{aligned} N (B) \ll \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2 \\ |s|,|t| \ll \sqrt{B} \\ \Delta (s,t)\ne 0 \end{array} } C(Q_{s,t},w) \left( 1+\frac{B}{|\Delta (s,t)|^{\frac{1}{3}} \max \{|s|,|t|\}^{\frac{2}{3}}} \right) , \end{aligned}$$

for any \(w>0\), where

$$\begin{aligned} C(Q_{s,t},w) \ll \prod _{\begin{array}{c} p^\xi \Vert \Delta (s,t) \\ p\leqslant w \end{array}} \tau (p^\xi ) \prod _{\begin{array}{c} p^\xi \Vert \Delta (s,t) \\ p>w \end{array}} \left( \sum _{k=0}^\xi \chi _{Q_{s,t}}(p)^k\right) . \end{aligned}$$

Since \(s, t\ll \sqrt{B}\) and \(\deg (\Delta )=4\), we see that

$$\begin{aligned} |\Delta (s,t)|^{\frac{1}{3}} \max \{|s|,|t|\}^{\frac{2}{3}} \ll \max \{|s|,|t|\}^2\ll B ,\end{aligned}$$

whence

$$\begin{aligned} 1+\frac{B}{|\Delta (s,t)|^{\frac{1}{3}} \max \{|s|,|t|\}^{\frac{2}{3}} } \ll \frac{B}{|\Delta (s,t)|^{\frac{1}{3}} \max \{|s|,|t|\}^{\frac{2}{3}} } .\end{aligned}$$

Now let

$$\begin{aligned} \Delta (s,t)=\prod _{i=1}^n \Delta _i(s,t) \end{aligned}$$
(4.2)

be the factorisation of \(\Delta (s,t)\) into irreducible factors over \(\mathbb {Q}\). Each \(\Delta _i\) is separable and \({{\mathrm{Res}}}(\Delta _i,\Delta _j)\ne 0\), whenever \(i\ne j\). We suppose that X has \(\delta _0=m\) split degenerate fibres and we re-order the factorisation of \(\Delta (s,t)\) in such a way that the split degenerate fibres correspond to the closed points \(\Delta _1(s,t),\dots ,\Delta _m(s,t)\), with the non-split fibres corresponding to the closed points \(\Delta _{m+1}(s,t),\dots ,\Delta _n(s,t)\). We enlarge w so that

$$\begin{aligned} w>\max _{i\ne j}|\,{{\mathrm{Res}}}(\Delta _i,\Delta _j)|. \end{aligned}$$

Frei, Loughran and Sofos [15, Part (5) of Lemma 4.8] have shown that for each \(i>m\) there exists a binary form \(G_i(s,t)\in \mathbb {Z}[s,t]\) of even non-negative degree, with \({{\mathrm{Res}}}(G_i,\Delta _i)\) non-zero, such that

$$\begin{aligned} \chi _{Q_{s,t}}(p)= \left( \frac{G_i(s,t)}{p}\right) , \end{aligned}$$

for all \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\) with \(\Delta (s,t)\ne 0\), and all primes \(p>w\) with \(p\mid \Delta _i(s,t)\).

We proceed by introducing the arithmetic functions

$$\begin{aligned} \tau _0(s,t)= \sum _{\begin{array}{c} d\mid \Delta (s,t)\\ d\mid W^\infty \end{array}}1, \qquad \tau _i(s,t)= \sum _{\begin{array}{c} d\mid \Delta _i(s,t) \\ \gcd (d,W)=1 \end{array}} 1, \quad (1\leqslant i\leqslant m), \end{aligned}$$
(4.3)

and

$$\begin{aligned} r_i(s,t)= \sum _{\begin{array}{c} d\mid \Delta _i(s,t) \\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(s,t)}{d} \right) , \quad (m< i\leqslant n). \end{aligned}$$
(4.4)

We put

$$\begin{aligned} \mathfrak {S}(s,t)= \tau _0(s,t) \prod _{i=1}^m \tau _i(s,t) \prod _{i=m+1}^n r_i(s,t), \end{aligned}$$
(4.5)

for any \((s,t) \in \mathbb {Z}_{\mathrm {prim}}^2\). Note that \(\mathfrak {S}(s,t)\geqslant 0\). Our work so far shows that

$$\begin{aligned} N(B) \ll B \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2 \\ |s|, |t| \ll \sqrt{B} \\ \Delta (s,t)\ne 0 \end{array} } \frac{\mathfrak {S}(s,t)}{|\Delta (s,t)|^{\frac{1}{3}} \max \{|s|,|t|\}^{\frac{2}{3}} }. \end{aligned}$$

Since we are only interested in coprime integers s, t, there is a satisfactory contribution of O(B) to the right hand side from those vectors (s, t) in which one of the components is zero. Hence, by symmetry, Theorem 1.1 will follow from a bound of the shape

$$\begin{aligned} \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ \Delta (s,t)\ne 0 \end{array}} \frac{ \mathfrak {S}(s,t) }{|\Delta (s,t)|^{\frac{1}{3}} |t|^{\frac{2}{3}} } \ll (\log B)^{m+1}, \end{aligned}$$
(4.6)

since (1.1) implies that \(m+1=\rho -1\).

4.2 Reduction to divisor sums

For \(\beta \in \mathbb {C}\) and \(x,y>0\) we let

$$\begin{aligned} \mathscr {V}= \{ (s,t) \in \mathbb {R}^2: 1\leqslant |s| \leqslant |t| \leqslant x,~ |s-\beta t | \leqslant y,~ \Delta (s,t) \ne 0 \}. \end{aligned}$$

Consider the divisor function

$$\begin{aligned} D_\beta (x,y)= \sum _{ (s,t) \in \mathscr {V}\cap \mathbb {Z}_{\mathrm {prim}}^2 } \mathfrak {S}(s,t), \end{aligned}$$
(4.7)

where \(\mathfrak {S}(s,t)\) is given by (4.5). In this section we shall establish (4.6) subject to the following bound for \(D_\beta (x,y)\), whose proof will occupy the remainder of the paper.

Proposition 4.1

Let \(\beta \in \mathbb {C}\), let \(\eta \in (0,1)\) and assume that \(x^\eta \leqslant y \leqslant x\). Then \( D_\beta (x,y) \ll _{\beta ,\eta } xy \left( \log x\right) ^{m}. \)

We proceed to show how (4.6) follows from Proposition 4.1. Since \(\Delta (s,t)\) is separable, it may contain the polynomial factor t at most once. Therefore there exist \(c_0 \in \mathbb {Q}^*\) and pairwise distinct \(\alpha _i \in \overline{\mathbb {Q}}\) such that \(\Delta (s,t)\) admits the factorisation \( c_0 t \prod _{i=1}^3 (s-\alpha _i t)\) or \(c_0 \prod _{i=1}^4 (s-\alpha _i t)\), according to whether \(t{\mid }\Delta (s,t)\) or not. Putting

$$\begin{aligned} \alpha = \frac{1}{2} \min _{\begin{array}{c} i,j,k\\ i\ne j \end{array}}\left\{ |\alpha _i-\alpha _j|, |\alpha _k|\right\} , \end{aligned}$$
(4.8)

the set of integer pairs (s, t) appearing in (4.6) can be partitioned according to whether or not (s, t) belongs to the set

$$\begin{aligned} \mathscr {A}= \{ (s,t) \in \mathbb {R}^2:|s-\alpha _i t| \geqslant \alpha |t|, \ \text {for all } i \}. \end{aligned}$$

If \((s,t)\in \mathscr {A}\) then \(|\Delta (s,t)|\gg |t|^4\) and it follows that

$$\begin{aligned} \sum _{\begin{array}{c} (s,t ) \in \mathscr {A}\cap \mathbb {Z}_{\mathrm {prim}}^2\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ \Delta (s,t)\ne 0 \end{array}} \frac{ \mathfrak {S}(s,t) }{|\Delta (s,t)|^{\frac{1}{3}} |t|^{\frac{2}{3}} } \ll \sum _{\begin{array}{c} (s,t ) \in \mathscr {A}\cap \mathbb {Z}_{\mathrm {prim}}^2\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ \Delta (s,t)\ne 0 \end{array}} \frac{ \mathfrak {S}(s,t) }{|t|^{2} }. \end{aligned}$$

Breaking into dyadic intervals \(T/2<|t|\leqslant T\) and applying Proposition 4.1 with \(x=y=T\) and \(\beta =0\), we readily find that the right hand side is \(O((\log B)^{m+1})\), which is satisfactory for (4.6).

It remains to consider the contribution to (4.6) from \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\setminus \mathscr {A}\). For each i we define

$$\begin{aligned} S_i(B)= \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ \Delta (s,t)\ne 0\\ |s-\alpha _i t|<\alpha |t| \end{array}} \frac{ \mathfrak {S}(s,t) }{|\Delta (s,t)|^{\frac{1}{3}} |t|^{\frac{2}{3}} }. \end{aligned}$$

It now suffices to prove \(S_i(B)=O((\log B)^{m+1})\) for each i. If (s, t) is counted by \(S_i(B)\) then (4.8) implies that for any \(j\ne i\) we have

$$\begin{aligned} |s-\alpha _jt| \geqslant \frac{1}{2} |\alpha _i-\alpha _j| |t|, \end{aligned}$$

thus \(|\Delta (s,t)| \gg |t|^3 |s-\alpha _it|\) in \(S_i(B)\). Likewise, we obviously have the reverse inequality \(|\Delta (s,t)| \ll |t|^3 |s-\alpha _it|\).

We begin by dealing with the contribution of pairs (s, t) with \(|s-\alpha _i t|\geqslant 1\). For given S, T satisfying \(1\leqslant S\ll T\ll \sqrt{B}\), the overall contribution to \(S_i(B)\) from elements s, t such that \(T/2<|t|\leqslant T\) and \(S/2<|s-\alpha _i t|\leqslant S\) is seen to be

$$\begin{aligned} \ll \frac{1}{S^{\frac{1}{3}} T^{\frac{5}{3}}} D_{\alpha _i} (T,S), \end{aligned}$$

in the notation of (4.7). If \(S\gg T^{\frac{1}{10}}\) then Proposition 4.1 shows that this is

$$\begin{aligned} \ll \frac{S^{\frac{2}{3}}(\log B)^{m}}{ T^{\frac{2}{3}}}. \end{aligned}$$

Summing over dyadic S, T satisfying \(T^{\frac{1}{10}}\ll S\ll T\ll \sqrt{B}\) gives an overall contribution \(O((\log B)^{m+1})\). On the other hand, if \(S\ll T^{\frac{1}{10}}\), we take \(\mathfrak {S}(s,t)\ll T^{\varepsilon }\) for any \(\varepsilon >0\), by the standard estimate for the divisor function, so that \(D_{\alpha _i} (T,S)\ll ST^{1+\varepsilon }\). Taking \(\varepsilon =\frac{1}{30}\), we therefore arrive at the contribution

$$\begin{aligned} \ll \frac{S^{\frac{2}{3}}T^{\frac{1}{30}}}{ T^{\frac{2}{3}}} \ll T^{-\frac{2}{3}+\frac{1}{10}}, \end{aligned}$$

from this case. Again, summing over dyadic \(S,T\) satisfying \(S\ll T^{\frac{1}{10}}\) and \(1\ll T\ll \sqrt{B}\), this shows that we have an overall contribution O(1), which is plainly satisfactory.
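The two dyadic summations can be made explicit as follows. This is only a sketch, writing \(S=2^j\) and \(T=2^k\) (labels introduced here for illustration). In the first range the inner sum over S is a geometric series dominated by \(S\asymp T\), so that

$$\begin{aligned} \sum _{2^k\ll \sqrt{B}} \sum _{0\leqslant j\leqslant k+O(1)} \frac{2^{\frac{2}{3}j}(\log B)^m}{2^{\frac{2}{3}k}} \ll (\log B)^m \sum _{2^k\ll \sqrt{B}} 1 \ll (\log B)^{m+1}. \end{aligned}$$

In the second range the bound \(T^{-\frac{2}{3}+\frac{1}{10}}=T^{-\frac{17}{30}}\) is uniform in S, and there are \(O(\log T)\) dyadic values of S, whence

$$\begin{aligned} \sum _{2^k\ll \sqrt{B}} \sum _{2^j\ll 2^{k/10}} 2^{-\frac{17}{30}k} \ll \sum _{k\geqslant 0} (k+1)2^{-\frac{17}{30}k} \ll 1. \end{aligned}$$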

It remains to consider the contribution to \(S_i(B)\) from \((s,t)\) with \(|s-\alpha _i t| <1 \). In fact for irrational \(\alpha _i\) there are infinitely many pairs of coprime integers \(s,t\) for which \(|s-\alpha _i t|<|t|^{-1}\). The divisor bound gives \(\mathfrak {S}(s,t)\ll |t|^{\frac{1}{10}}\), which leads to the contribution

$$\begin{aligned} \ll \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2, ~\Delta (s,t)\ne 0\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ |s-\alpha _i t|< 1 \end{array}} \frac{ 1 }{|s-\alpha _i t|^{\frac{1}{3}} |t|^{\frac{5}{3}-\frac{1}{10}} } \end{aligned}$$
(4.9)

to \(S_i(B)\). We now invoke a result of Davenport and Roth [12, Cor. 2], which shows that \(\sharp {\mathscr {L}}=O(1)\), where

$$\begin{aligned} \mathscr {L} =\left\{ (s,t)\in \mathbb {Z}_{\text {prim}}^2: \left| \alpha _i-\frac{s}{t}\right| <\frac{1}{ |t|^{2+\frac{1}{100}}} \right\} . \end{aligned}$$

Moreover, the implied constant is effective and only depends on the coefficients of \(\Delta (s,t)\). The contribution to (4.9) from \(\mathscr {L}\) is therefore seen to be

$$\begin{aligned} \sum _{\begin{array}{c} (s,t ) \in \mathscr {L}, ~\Delta (s,t)\ne 0\\ 1 \leqslant |s|\leqslant |t| \end{array}} \frac{ 1 }{|s-\alpha _i t|^{\frac{1}{3}} |t|^{\frac{5}{3}-\frac{1}{10}} }\ll \sharp \mathscr {L} \ll 1, \end{aligned}$$

since \(|s-\alpha _it|\gg |\Delta (s,t)| |t|^{-3}\gg |t|^{-3}\). On the other hand, the contribution to (4.9) outside of \(\mathscr {L}\) is

$$\begin{aligned} \ll \sum _{\begin{array}{c} (s,t ) \in \mathbb {Z}_{\mathrm {prim}}^2\setminus \mathscr {L}\\ 1 \leqslant |s|\leqslant |t|\leqslant \sqrt{B}\\ |s-\alpha _i t|<1 \end{array}} \frac{1}{|t|^{\frac{4}{3}-\frac{1}{10}-\frac{1}{300}}} \ll \sum _{|t|\leqslant \sqrt{B}} \frac{1}{|t|^{\frac{4}{3}-\frac{1}{10}-\frac{1}{300}}} \ll 1, \end{aligned}$$

since for given t there are only O(1) integers s in the interval \(|s-\alpha _i t|<1\). This completes the deduction of (4.6) from Proposition 4.1.
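For the record, the exponent bookkeeping behind the last two estimates runs as follows. For \((s,t)\in \mathscr {L}\) with \(\Delta (s,t)\ne 0\), the lower bound \(|s-\alpha _it|\gg |t|^{-3}\) gives

$$\begin{aligned} \frac{ 1 }{|s-\alpha _i t|^{\frac{1}{3}} |t|^{\frac{5}{3}-\frac{1}{10}} } \ll \frac{|t|}{|t|^{\frac{5}{3}-\frac{1}{10}}} = |t|^{-\frac{17}{30}} \leqslant 1, \end{aligned}$$

so each of the O(1) elements of \(\mathscr {L}\) contributes O(1). For \((s,t)\not \in \mathscr {L}\) we have \(|s-\alpha _it|\geqslant |t|^{-1-\frac{1}{100}}\), and therefore

$$\begin{aligned} \frac{ 1 }{|s-\alpha _i t|^{\frac{1}{3}} |t|^{\frac{5}{3}-\frac{1}{10}} } \leqslant |t|^{\frac{1}{3}+\frac{1}{300}-\frac{5}{3}+\frac{1}{10}} = |t|^{-\frac{369}{300}}, \end{aligned}$$

where \(\frac{369}{300}=\frac{4}{3}-\frac{1}{10}-\frac{1}{300}>1\), which is what makes the sum over t converge.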

4.3 Small divisors

The function \(\tau _0(s,t)\) in (4.5) is concerned with the contribution to \(\mathfrak {S}(s,t)\) from small primes \(p\leqslant w\). Our work in Sect. 2.2 only applies to divisor sums supported away from small prime divisors. Hence we shall begin by using the geometry of numbers to deal with the function \(\tau _0(s,t)\), before handling the remaining factors in \(\mathfrak {S}(s,t)\).

Following Daniel [11], for any \(a \in \mathbb {N}\) we call two vectors \(\mathbf {x} ,\mathbf {y}\in \mathbb {Z}^2\) equivalent modulo a if

$$\begin{aligned} \gcd (\mathbf {x},a)= \gcd (\mathbf {y},a)=1 \quad \text { and } \quad \Delta (\mathbf {x})\equiv \Delta (\mathbf {y})\equiv 0 \,({{\mathrm{mod}}}{\,a}), \end{aligned}$$

and, moreover, there exists \(\lambda \,({{\mathrm{mod}}}{\,a})\) such that \(\mathbf {x}\equiv \lambda \mathbf {y} \,({{\mathrm{mod}}}{\,a})\). The set of equivalence classes is denoted by \(\mathfrak {A}(a)\), and a typical class by \(\mathscr {A}\). Letting

$$\begin{aligned} \varrho ^*(a)=\sharp \left\{ (\sigma ,\tau ) \,({{\mathrm{mod}}}{\,a}): \gcd (\sigma ,\tau ,a)=1,~ \Delta (\sigma ,\tau ) \equiv 0 \,({{\mathrm{mod}}}{\,a}) \right\} , \end{aligned}$$

we find that \( \varrho ^*(a)= \varphi (a) \sharp \mathfrak {A}(a). \) Moreover, we clearly have

$$\begin{aligned} \varrho ^*(a)\leqslant \varphi (a)( \rho _{\Delta (x,1)}(a)+ \rho _{\Delta (1,x)}(a)), \end{aligned}$$

in the notation of (2.5). Since \(\Delta (s,t)\) is separable, it follows from Huxley [19] that \(\rho _{\Delta (x,1)}(a)\leqslant 4^{\omega (a)}|{{\mathrm{disc}}}(\Delta )|^{\frac{1}{2}}\), and similarly for \(\rho _{\Delta (1,x)}(a)\). Hence

$$\begin{aligned} \sharp \mathfrak {A}(a)= \frac{\varrho ^*(a)}{ \varphi (a)} \ll 4^{\omega (a)}. \end{aligned}$$
(4.10)
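As an illustration of (4.10), consider the hypothetical separable quartic \(\Delta (s,t)=st(s-t)(s+t)\), which is chosen here only as an example and is not taken from the surfaces under study. Let \(a=p\) be an odd prime. The non-zero solutions of \(\Delta (\sigma ,\tau )\equiv 0 \,({{\mathrm{mod}}}{\,p})\) lie on the four lines \(\sigma =0\), \(\tau =0\), \(\sigma =\tau \) and \(\sigma =-\tau \) in \(\mathbb {F}_p^2\), which are distinct and meet only at the origin. Each line carries \(p-1\) non-zero points and forms a single equivalence class, so that

$$\begin{aligned} \varrho ^*(p)=4(p-1), \quad \varphi (p)=p-1, \quad \sharp \mathfrak {A}(p)=\frac{\varrho ^*(p)}{\varphi (p)}=4=4^{\omega (p)}. \end{aligned}$$

In this example the classes are exactly the zeros of \(\Delta \) in \(\mathbb {P}^1(\mathbb {F}_p)\), and the bound (4.10) is attained.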

For each \((s,t)\in \mathscr {V}\cap \mathbb {Z}_{\mathrm {prim}}^2\), write

$$\begin{aligned} r(s,t)= \prod _{i=1}^m \tau _i(s,t) \prod _{i=m+1}^n r_i(s,t) . \end{aligned}$$

Then

$$\begin{aligned} D_\beta (x,y)&\leqslant \sum _{\begin{array}{c} q \ll x^{4}\\ q \mid W^\infty \end{array}} \sum _{\begin{array}{c} (s,t) \in \mathscr {V}\cap \mathbb {Z}_{\mathrm {prim}}^2\\ q \mid \Delta (s,t) \end{array}} r(s,t)\\&\leqslant \sum _{\begin{array}{c} q \ll x^{4}\\ q \mid W^\infty \end{array}} \sum _{\mathscr {A}\in \mathfrak {A}(q)} \sum _{\begin{array}{c} (s,t) \in \mathscr {V}\cap G(\mathscr {A})\cap \mathbb {Z}_{\mathrm {prim}}^2 \end{array}} r(s,t), \end{aligned}$$

where \(G(\mathscr {A})=\{\mathbf {x}\in \mathbb {Z}^2: \exists \lambda \in \mathbb {Z}~\exists \mathbf {y}\in \mathscr {A}\text { s.t. } \mathbf {x}\equiv \lambda \mathbf {y}\,({{\mathrm{mod}}}{\,q})\}\) is the lattice generated by the vectors in \(\mathscr {A}\). The determinant of this lattice is q. We shall establish the following result.

Proposition 4.2

Let \(\eta \in (0,1)\) and assume that \(x^\eta \leqslant y \leqslant x\). Then

$$\begin{aligned} \sum _{\begin{array}{c} (s,t) \in \mathscr {V}\cap G(\mathscr {A})\cap \mathbb {Z}_{\mathrm {prim}}^2 \end{array}} r(s,t) \ll _{\beta ,\eta ,N} xy \left( \frac{(\log x)^m}{q}+ \frac{1}{(\log x)^{N}} \right) , \end{aligned}$$

for any \(N>0\), where the implied constant is independent of q.

We now show how Proposition 4.1 follows from this result. Employing (4.10), we deduce that

$$\begin{aligned} D_\beta (x,y) \ll _{\beta , \eta ,N} xy (\log x)^m \sum _{\begin{array}{c} q \ll x^{4}\\ q \mid W^\infty \end{array}} \frac{ 4^{\omega (q)}}{q} + \frac{xy}{(\log x)^{N}} \sum _{\begin{array}{c} q \ll x^{4}\\ q \mid W^\infty \end{array}} 4^{\omega (q)}. \end{aligned}$$

The first sum is \( \ll (\log w)^{4} \ll 1\). On the other hand, the second sum is

$$\begin{aligned} \leqslant \prod _{p \leqslant w} \left( 16\log x+O(1)\right) \ll (\log x)^{\pi (w)}. \end{aligned}$$

Choosing \(N=\pi (w)\), we therefore conclude the deduction of Proposition 4.1 from Proposition 4.2.
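The two sums over q can be checked by a short computation, sketched here under the assumption that \(W=\prod _{p\leqslant w}p\), so that \(q\mid W^\infty \) means that every prime factor of q is at most w. Dropping the constraint \(q\ll x^4\), the first sum is at most the Euler product

$$\begin{aligned} \sum _{q\mid W^\infty } \frac{4^{\omega (q)}}{q} = \prod _{p\leqslant w}\left( 1+4\sum _{k\geqslant 1}\frac{1}{p^{k}}\right) = \prod _{p\leqslant w}\left( 1+\frac{4}{p-1}\right) \leqslant \exp \left( \sum _{p\leqslant w}\frac{4}{p-1}\right) \ll (\log w)^4, \end{aligned}$$

by Mertens' theorem. For the second sum, each q is determined by the exponents of the primes \(p\leqslant w\), and each exponent is \(O(\log x)\), while \(4^{\omega (q)}\leqslant 4^{\pi (w)}\). With \(N=\pi (w)\) the second term is therefore

$$\begin{aligned} \ll \frac{xy}{(\log x)^{\pi (w)}} (\log x)^{\pi (w)} = xy \ll xy(\log x)^m. \end{aligned}$$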

4.4 The final push

The aim of this section is to prove Proposition 4.2. Recall from (4.2) that we have a factorisation

$$\begin{aligned} \Delta (s,t)=\prod _{i=1}^m \Delta _i(s,t) \prod _{i=m+1}^n \Delta _i(s,t), \end{aligned}$$

where each \(\Delta _i\in \mathbb {Z}[s,t]\) is irreducible and the fibre above the closed point \(\Delta _i\) is split if and only if \(i\leqslant m\). We now want to bring into play the work in Sect. 2.2, in order to transform the sum in Proposition 4.2 into one that can be handled by Lemma 2.2.

Let \(i\in \{1,\dots ,n\}\). Recall from (4.3) and (4.4) that we are interested in the divisor sum

$$\begin{aligned} \sum _{\begin{array}{c} d\mid \Delta _i(s,t) \\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(s,t)}{d} \right) , \end{aligned}$$

where \(G_i(s,t)\in \mathbb {Z}[s,t]\) is a form of even degree (and we allow \(G_i(s,t)\) to be identically equal to 1). This is exactly of the form considered in (2.8). Let \(b_i=\Delta _i(1,0)\in \mathbb {Z}\) and suppose for the moment that \(b_i\ne 0\). As previously, let \(\theta _i\) be a root of the polynomial \(\tilde{\Delta }_i(x,1)\), in the notation of (2.4), and write \(K_i=\mathbb {Q}(\theta _i)\). Let \(\mathfrak {o}_i\) denote the ring of integers of \(K_i\). We enlarge w to ensure that \(w>2b_i D_{L_i/K_i} \Delta _{\theta _i}\), where \(\Delta _{\theta _i}\) is given by (2.9) and \(L_i=K_i(\sqrt{G_i(b_i^{-1}\theta _i,1)})\). Thus

$$\begin{aligned}{}[L_i:K_i]={\left\{ \begin{array}{ll} 1 &{}\text { if }i\leqslant m,\\ 2 &{}\text { if }i> m. \end{array}\right. } \end{aligned}$$

Next, let \(\psi _i\) be the quadratic Dirichlet character constructed in Sect. 2.2 (taking \(\psi _i=1\) when \(G_i(s,t)\) is identically 1). Let \(\text {N}\,_i\) denote the ideal norm in \(K_i\). Then it follows from part (iii) of Lemma 2.3 that for any \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\) such that \(\Delta _i(s,t) \ne 0\), we have

$$\begin{aligned} \sum _{\begin{array}{c} d\mid \Delta _i(s,t) \\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(s,t)}{d} \right) = \sum _{\begin{array}{c} \mathfrak {a}\mid (b_is-\theta _i t)\\ \gcd (\text {N}\,_i\mathfrak {a},W)=1 \end{array}} \psi _i(\mathfrak {a}). \end{aligned}$$
(4.11)

Moreover, if \(\mathscr {P}_i^\circ \), \( \mathscr {P}_i\) are defined as in (2.3) and (2.11), respectively, then part (i) of Lemma 2.3 implies that \(\mathfrak {a}\in \mathscr {P}_i\) for any \(\mathfrak {a}\mid (b_is-\theta _i t)\) such that \(\gcd (\text {N}\,_i\mathfrak {a},W)=1\).

Suppose now that \(b_i=0\), so that \(\Delta _i(s,t)=c t\) for some non-zero \(c \in \mathbb {Z}\). We enlarge w to ensure that \(w>|c|\). In this case we have

$$\begin{aligned} \sum _{\begin{array}{c} d\mid \Delta _i(s,t) \\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(s,t)}{d} \right) = \sum _{\begin{array}{c} d\mid t\\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(s,t)}{d} \right) = \sum _{\begin{array}{c} d\mid t\\ \gcd (d,W)=1 \end{array}} \left( \frac{G_i(1,0)}{d} \right) , \end{aligned}$$

since \(G_i\) has even degree and \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\). But this is of the shape (4.11), with \(b_i=0\), \(\theta _i=1\), \(K_i=\mathbb {Q}\), and \(\psi _i(d)=( \frac{G_i(1,0)}{d} )\).
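The second equality in the display can be verified directly. In this sketch we write 2k for the degree of \(G_i\) (a label introduced here) and assume that \(2\mid W\), so that every d with \(\gcd (d,W)=1\) is odd. If \(d\mid t\) and \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\) then \(\gcd (s,d)=1\) and

$$\begin{aligned} G_i(s,t)\equiv G_i(s,0) = s^{2k}G_i(1,0) \,({{\mathrm{mod}}}{\,d}), \end{aligned}$$

whence

$$\begin{aligned} \left( \frac{G_i(s,t)}{d} \right) = \left( \frac{s^{k}}{d} \right) ^2 \left( \frac{G_i(1,0)}{d} \right) = \left( \frac{G_i(1,0)}{d} \right) . \end{aligned}$$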

Let \(i\in \{1,\dots ,n\}\) and let \(\mathfrak {c}\subset \mathfrak {o}_i\) be an integral ideal. We define multiplicative functions \(\mathfrak {t}_i,\mathfrak {r}_i\in \mathscr {M}_{K_i}\), in the notation of Sect. 2.1, via

$$\begin{aligned} \mathfrak {t}_i(\mathfrak {c})= \sum _{ \begin{array}{c} \mathfrak {a}\in \mathscr {P}_i\\ \mathfrak {a}\mid \mathfrak {c} \end{array}} 1, \quad (1\leqslant i\leqslant m), \end{aligned}$$

and

$$\begin{aligned} \mathfrak {r}_i(\mathfrak {c})= \sum _{ \begin{array}{c} \mathfrak {a}\in \mathscr {P}_i\\ \mathfrak {a}\mid \mathfrak {c} \end{array}} \psi _i(\mathfrak {a}), \quad (m< i\leqslant n). \end{aligned}$$

It follows that

$$\begin{aligned} r(s,t) = \prod _{i=1}^m \mathfrak {t}_{i,W}(b_i s-\theta _i t) \prod _{i=m+1}^n \mathfrak {r}_{i,W}(b_i s-\theta _i t) \end{aligned}$$

in Proposition 4.2, for any \((s,t)\in \mathbb {Z}_{\mathrm {prim}}^2\).

We are now in a position to apply Lemma 2.2 with \(\mathscr {R}=\mathscr {V}\), \(G=G(\mathscr {A})\) and \(q_G=q\). In particular it follows that

$$\begin{aligned} xy\ll V={{\mathrm{vol}}}(\mathscr {R}) \ll xy \quad \text { and } \quad x\ll K_{\mathscr {R}}\ll x. \end{aligned}$$

According to the statement of Proposition 4.2, we are given \(\eta \in (0,1)\) and \(x,y\) such that \(x^\eta \leqslant y\leqslant x\). Thus \(\mathscr {R}\) is regular. Since \(q\ll x^4\), it therefore follows that all the hypotheses of Lemma 2.2 are met with each \(\varepsilon _i>0\) being arbitrarily small. On enlarging w suitably, we deduce that

$$\begin{aligned} \sum _{\begin{array}{c} (s,t) \in \mathscr {V}\cap G(\mathscr {A})\cap \mathbb {Z}_{\mathrm {prim}}^2 \end{array}} r(s,t) \ll _{\eta , W}~&\frac{xy}{(\log x)^n} \frac{h_W^*(q)}{q} \prod _{i=1}^m E_{\mathfrak {t}_i}(x^2;1) \prod _{i=m+1}^n E_{\mathfrak {r}_i}(x^2;1)\\&+x^{1+\frac{\eta }{2}}. \end{aligned}$$

Note that \(h_W^*(q)=1\), since \(q\mid W^\infty \). Moreover, since \(x^{1+\frac{\eta }{2}}\ll _N xy(\log x)^{-N}\), for any \(N>0\), the second term here is plainly satisfactory for Proposition 4.2.
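The bound for the second term uses only the hypothesis \(y\geqslant x^\eta \). Indeed, \(xy\geqslant x^{1+\eta }\), so that

$$\begin{aligned} \frac{x^{1+\frac{\eta }{2}}}{xy} \leqslant \frac{x^{1+\frac{\eta }{2}}}{x^{1+\eta }} = x^{-\frac{\eta }{2}} \ll _{\eta ,N} \frac{1}{(\log x)^{N}}, \end{aligned}$$

since any fixed negative power of x decays faster than any power of \(\log x\).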

Finally, we have

$$\begin{aligned} E_{\mathfrak {t}_i}(z;1) =\exp \left( \sum _{\begin{array}{c} \text {N}\,_i \mathfrak {p}\leqslant z \\ \mathfrak {p}\in \mathscr {P}_i^\circ \end{array}}\frac{\mathfrak {t}_i(\mathfrak {p})}{\text {N}\,_i\mathfrak {p}}\right) = \exp \left( \sum _{\begin{array}{c} \text {N}\,_i \mathfrak {p}\leqslant z \\ \mathfrak {p}\in \mathscr {P}_i^\circ \end{array}}\frac{2}{\text {N}\,_i\mathfrak {p}}\right) \ll (\log z)^{2}, \end{aligned}$$

for \(i\in \{1,\dots ,m\}\), and

$$\begin{aligned} E_{\mathfrak {r}_i}(z;1) =\exp \left( \sum _{\begin{array}{c} \text {N}\,_i \mathfrak {p}\leqslant z \\ \mathfrak {p}\in \mathscr {P}_i^\circ \end{array}}\frac{\mathfrak {r}_i(\mathfrak {p})}{\text {N}\,_i\mathfrak {p}}\right) = \exp \left( \sum _{\begin{array}{c} \text {N}\,_i \mathfrak {p}\leqslant z \\ \mathfrak {p}\in \mathscr {P}_i^\circ \end{array}}\frac{1+\psi _i(\mathfrak {p})}{\text {N}\,_i\mathfrak {p}}\right) \ll \log z, \end{aligned}$$

for \(i\in \{m+1,\dots ,n\}\). Thus the first term makes the overall contribution

$$\begin{aligned} \ll \frac{xy(\log x)^{m}}{q}, \end{aligned}$$

which thereby completes the proof of Proposition 4.2.
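To make the final count of logarithms explicit, take \(z=x^2\) in the two bounds above, so that \(E_{\mathfrak {t}_i}(x^2;1)\ll (\log x)^2\) for \(i\leqslant m\) and \(E_{\mathfrak {r}_i}(x^2;1)\ll \log x\) for \(i>m\). Together with \(h_W^*(q)=1\) this gives

$$\begin{aligned} \frac{xy}{(\log x)^n} \frac{1}{q} (\log x)^{2m} (\log x)^{n-m} = \frac{xy(\log x)^{m}}{q}, \end{aligned}$$

since \(-n+2m+(n-m)=m\). Thus only the m split fibres contribute a power of \(\log x\).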