1 Background

This paper is part 4 of a sequence of papers devoted to understanding how to conjecture all of the integral moments of the Riemann zeta-function from a number-theoretic perspective. The method is to approximate \(\zeta (s)^k\) by a long Dirichlet polynomial and then compute the mean square of the Dirichlet polynomial (cf. [9]). There will be many off-diagonal terms, and it is the careful treatment of these that is the concern of these papers. In particular it is necessary to treat the off-diagonal terms by a method invented by Bogomolny and Keating [1, 2]. Our perspective on this method is that it is most properly viewed as a multi-dimensional Hardy-Littlewood circle method.

In part 3 [7] we considered the type I off-diagonal terms from a general perspective. Now we look at the simplest type II sums.

The formula we obtain is in complete agreement with all of the main terms predicted by the recipe of [3] (and in particular, with the leading order term conjectured in [10]).

2 Shifted moments

We are interested in developing a number theoretic approach to the moments of the Riemann zeta-function on the critical line, in particular to the general “shifted” moment given by

$$\begin{aligned} I^\psi _{A,B}(T)=\int _0^\infty \psi \left( \frac{t}{T} \right) \prod _{\alpha \in A}\zeta (s+\alpha )\prod _{\beta \in B} \zeta (1-s+\beta ) ~dt \end{aligned}$$
(1)

where \(\psi \) is a smooth function with compact support, say \(\psi \in C^\infty [1,2]\), \(s=1/2+it\), and A and B are sets of small complex numbers, referred to as the shifts. It is useful to consider as well the general shifted moment of a long Dirichlet polynomial. To express this we first introduce the generalized divisor function \(\tau _A(n)\) by way of its generating function:

$$\begin{aligned} \prod _{\alpha \in A} \zeta (s+\alpha )=\sum _{n=1}^\infty \frac{\tau _A(n)}{n^s}=:D_A(s). \end{aligned}$$
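For example, \(\tau _A\) is multiplicative, with

$$\begin{aligned} \tau _A(p)=\sum _{\alpha \in A}p^{-\alpha }, \end{aligned}$$

and if A consists of k copies of 0 then \(\tau _A(n)=d_k(n)\) is the usual k-fold divisor function.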

Then we let

$$\begin{aligned} \mathcal D_A(s;X)=\sum _{n\le X} \frac{\tau _A(n)}{n^s} \end{aligned}$$

and consider

$$\begin{aligned} I^\psi _{A,B}(T;X):= & {} \int _0^\infty \psi \left( \frac{t}{T}\right) D_{A}(s;X)D_{B}(1-s;X) ~dt\nonumber \\= & {} T\sum _{m,n\le X }\frac{\tau _A(m)\tau _B(n){\hat{\psi }}\left( \frac{T}{2\pi }\log \frac{m}{n} \right) }{\sqrt{mn}}. \end{aligned}$$
(2)

The recipe [3] tells us how to predict the behaviour of these moments. Firstly, we conjecture that

$$\begin{aligned} I^\psi _{A,B}(T)= & {} T\int _0^\infty \psi (t)\sum \limits _{\begin{array}{c} U\subset A, V\subset B\\ |U|=|V| \end{array}} \left( \frac{tT}{2\pi }\right) ^{-\sum _{\begin{array}{c} {\alpha }\in U\\ {\beta }\in V \end{array}}({\alpha }+{\beta }) }\\&\times \,\mathcal B(A-U+V^-,B-V+U^-)~dt+o(T) \end{aligned}$$

where \(\mathcal B\) is given by

$$\begin{aligned} \mathcal B(A,B)=\sum _{n=1}^\infty \frac{\tau _A(n)\tau _B(n)}{n} \end{aligned}$$

in the case that this series converges (for example if \(\mathfrak {R}\alpha , \mathfrak {R}\beta >0\) for all \(\alpha \in A\) and \(\beta \in B\)) and is given by analytic continuation otherwise. An alternate expression is \(\mathcal B(A,B)=\mathcal A(A,B) Z(A,B)\) where

$$\begin{aligned} Z(A,B):=\prod \limits _{\begin{array}{c} \alpha \in A \\ \beta \in B \end{array}} \zeta (1+\alpha +\beta ) \end{aligned}$$

and \(\mathcal A (A,B)\) is a product over primes that converges nicely in the domains under consideration (see below). We have used an unconventional notation here; by \(A-U+V^-\) we mean the following: start with the set A and remove the elements of U and then include the negatives of the elements of V. We think of the process as “swapping” equal numbers of elements between A and B; when elements are removed from A and put into B they first get multiplied by \(-1\). We keep track of these swaps with our equal-sized subsets U and V of A and B; and when we refer to the “number of swaps” in a term we mean the cardinality |U| of U (or, since they are of equal size, of V).
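For example, if \(A=\{\alpha _1,\alpha _2\}\) and \(B=\{\beta _1,\beta _2\}\) and we swap \(U=\{\alpha _1\}\) with \(V=\{\beta _1\}\), then

$$\begin{aligned} A-U+V^-=\{\alpha _2,-\beta _1\} \qquad \text{ and } \qquad B-V+U^-=\{\beta _2,-\alpha _1\}; \end{aligned}$$

this is a one-swap term, since \(|U|=|V|=1\).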

The Euler product \(\mathcal A\) is given by

$$\begin{aligned} \mathcal {A} (A,B)=\prod _p Z_{p}(A,B)\int _0^1\mathcal {A}_{p,\theta }(A,B)~d\theta , \end{aligned}$$

where \(z_p(x):=(1-p^{-x})^{-1}\), \(Z_p(A,B)=\prod _{\begin{array}{c} \alpha \in A\\ \beta \in B \end{array}} z_p(1+\alpha +\beta )^{-1}\) and

$$\begin{aligned} \mathcal {A}_{p,\theta }(A,B):= \prod _{\alpha \in A} z_{p,-\theta }\left( \frac{1}{2} +\alpha \right) \prod _{\beta \in B}z_{p,\theta }\left( \frac{1}{2} +\beta \right) \end{aligned}$$

with \(z_{p,\theta }(x):=(1-e(\theta )p^{-x})^{-1}\).
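To see that this factorization indeed recovers \(\mathcal B\), note that expanding the geometric series and integrating in \(\theta \) picks out the diagonal terms:

$$\begin{aligned} \int _0^1\mathcal {A}_{p,\theta }(A,B)~d\theta =\sum _{m=0}^\infty \frac{\tau _A(p^m)\tau _B(p^m)}{p^m}, \end{aligned}$$

which is precisely the local factor of \(\mathcal B(A,B)\); the factor \(Z_p(A,B)\) then cancels the local factor of \(Z(A,B)\), so that \(\mathcal A(A,B)Z(A,B)=\mathcal B(A,B)\) wherever the series converges.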

The technique we are developing in the present series of papers is to approach our moment problem (1) through the moments \(I^\psi _{A,B}(T;X)\) of long Dirichlet polynomials for various ranges of X. The recipe of [3] also leads to a conjectural formula for \(I^\psi _{A,B}(T;X)\). To explain this we begin with Perron’s formula

$$\begin{aligned} D_{A}(s;X)=\frac{1}{2\pi i} \int _{w} \frac{X^{w}}{w}D_{A_w}(s)~dw \end{aligned}$$

where we use the convenient notation

$$\begin{aligned} A_w=\{\alpha +w:\alpha \in A\}. \end{aligned}$$

Thus, we have

$$\begin{aligned} I^\psi _{A,B}(T;X)= \frac{1}{(2\pi i)^2} \iint _{z,w}\frac{X^{z+w}}{zw} I^\psi _{A_{w},B_{z}}(T)~dw~dz. \end{aligned}$$

We insert the conjecture above from the recipe and expect that

$$\begin{aligned} I^\psi _{A,B}(T;X)= & {} T\int _0^\infty \psi (t)\frac{1}{(2\pi i)^2} \iint _{z,w}\frac{X^{z+w}}{zw}\sum \limits _{\begin{array}{c} U\subset A, V\subset B\\ |U|=|V| \end{array}} \left( \frac{tT}{2\pi }\right) ^{-\sum _{\begin{array}{c} {\alpha }\in U\\ {\beta }\in V \end{array}}({\alpha }+w+{\beta }+z) }\\&\times \, \mathcal B(A_w-U_w+V^-_z,B_z-V_z+U^-_w)~dw~dz~dt +o(T). \end{aligned}$$

We have done a little simplification in this expression: instead of writing \(U\subset A_w\) we have written \(U\subset A\) and changed the exponent of \((tT/2\pi )\) accordingly.

Notice that there is a factor \((X/T^{|U|})^{w+z}\) here. As mentioned above we refer to |U| as the number of “swaps” in the recipe, and now we see more clearly the role it plays; in the terms above for which \(X<T^{|U|}\) we move the path of integration in w or z to \(+\infty \) so that the factor \((X/T^{|U|})^{w+z}\rightarrow 0\) and the contribution of such a term is 0. Thus, the size of X determines how many “swaps” we must keep track of.

Our principal aim in this series of papers is to evaluate \(I^\psi _{A,B}(T;X) \) directly using a conjecture for the correlations of \(\tau _A(n)\) and then to compare with the above formula coming from the recipe of [3]. In [5] and [7] we considered the situation of 0 swaps which leads to the usual “diagonal” terms and 1 swap which corresponds to the usual “shifted divisor” problem. In [6] we considered a special case of 2 swaps. Now we look at the general case of two swaps. This means that we are interested in the terms for which \(X>T^2\) and for which \(|U|=|V|=2\).

It is helpful to review the result of [7] before proceeding. The mathematical content of that paper is basically a conjecture and a theorem. First of all let \(\epsilon >0\) be a small fixed number for this discussion and let \(|\alpha |,|\beta |<\epsilon \) for all \(\alpha \in A\) and \(\beta \in B\). The conjecture is about the analytic continuation of

$$\begin{aligned} \mathcal S_{A,B}(s,h):=\sum _{m=1}^\infty \frac{\tau _A(m)\tau _B(m+h)}{m^s} \end{aligned}$$

and the sum of the residues of this series near \(s=1\):

$$\begin{aligned} \mathcal {R}_{A,B}(y;h):=\sum _{|s-1|<\epsilon } {\text {Res}}\,\mathcal S_{A,B}(s,h) y^{s-1} \end{aligned}$$

where we intend this notation to mean that \(\mathcal R_{A,B}(y;h)\) is the sum of the residues of \(\mathcal S_{A,B}(s,h) y^{s-1}\) over all of the poles in \(|s-1|<\epsilon \). Let

$$\begin{aligned} D_A\left( s,e\left( \frac{1}{q}\right) \right) :=\sum _{m=1}^\infty \frac{\tau _A(m)e(\frac{m}{q}) }{m^s} \end{aligned}$$

and

$$\begin{aligned} \mathcal R_A(y,q)=\sum _{|s-1|<\epsilon } {\text {Res}} D_A\left( s,e\left( \frac{1}{q}\right) \right) y^{s-1} \end{aligned}$$

be the sum of the residues near \(s=1\), i.e. including poles at \(s=1-\alpha \) for \(\alpha \in A\). Let

$$\begin{aligned} \mathcal R_{A,B}^*(y;h):= \sum _{q=1}^\infty r_q(h) \mathcal R_A(y,q)\mathcal R_B(y,q) \end{aligned}$$

where \(r_q(h)\) is the Ramanujan sum.
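Explicitly, the Ramanujan sum is given by

$$\begin{aligned} r_q(h)=\sum \limits _{\begin{array}{c} 1\le a\le q\\ (a,q)=1 \end{array}} e\left( \frac{ah}{q}\right) . \end{aligned}$$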

Conjecture 1

We conjecture for each fixed \(h>0\) that \(\mathcal S_{A,B}(s,h)\) has a meromorphic continuation to \(\mathfrak {R}s>\frac{1}{2}+\epsilon \) with all of its poles in the region \(|s-1|<\epsilon \) and that

$$\begin{aligned} \mathcal R_{A,B}(y;h)=\mathcal R_{A,B}^*(y;h). \end{aligned}$$

The above is essentially the obvious pole structure that one would conjecture by using, for example, the \(\delta \)-method.
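For instance, when \(A=B=\{0,0\}\) (the case relevant to the fourth moment of \(\zeta \)),

$$\begin{aligned} \mathcal S_{A,B}(s,h)=\sum _{m=1}^\infty \frac{d(m)d(m+h)}{m^s} \end{aligned}$$

is the Dirichlet series attached to the classical binary additive divisor problem.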

Now we briefly describe the calculation of [7]. We evaluate

$$\begin{aligned} \sum \limits _{\begin{array}{c} m,n\le X\\ m\ne n \end{array}} \frac{\tau _A(m)\tau _B(n)}{\sqrt{mn}}\hat{\psi }\left( \frac{T}{2\pi }\log \frac{m}{n}\right) \end{aligned}$$

as

$$\begin{aligned} 2\sum _{h>0}\int _T^X \langle \tau _A(m)\tau _B(m+h)\rangle _{m\sim u} \hat{\psi }\left( \frac{Th}{2\pi u}\right) \frac{du}{u} \end{aligned}$$

where \(\langle \tau _A(m)\tau _B(m+h)\rangle _{m\sim u}\) denotes the average of \(\tau _A(m)\tau _B(m+h)\) over m near u. We evaluate this average by differentiating Perron’s formula with respect to u and then moving the s-contour to the left to give

$$\begin{aligned} 2\sum _{h>0}\int _T^X\mathcal R_{A,B}(u,h)\hat{\psi }\left( \frac{Th}{2\pi u}\right) \frac{du}{u} \end{aligned}$$

We make the change of variable \(v= \frac{Th}{2\pi u}\) and rewrite this as

$$\begin{aligned} 2\int _0^\infty \hat{\psi }(v)\sum _{h\le \frac{2 \pi Xv}{T}} \mathcal R_{A,B}\left( \frac{Th}{2\pi v},h\right) \frac{dv}{v} \end{aligned}$$

At this point we replace \(\mathcal R\) by \(\mathcal R^*\) and have

$$\begin{aligned} 2\int _0^\infty \hat{\psi }(v)\sum _{h\le \frac{2 \pi Xv}{T}} \sum _{q=1}^\infty r_q(h)\mathcal R_{A}\left( \frac{Th}{2\pi v},q\right) \mathcal R_{B}\left( \frac{Th}{2\pi v},q\right) \frac{dv}{v}. \end{aligned}$$

Now

$$\begin{aligned} r_q(h)=\sum \limits _{\begin{array}{c} d\mid h\\ d\mid q \end{array}} d\mu (q/d) \end{aligned}$$

so, replacing h by hd and q by qd the above is

$$\begin{aligned} 2\int _0^\infty \hat{\psi }(v)\sum _{q=1}^\infty \mu (q)\sum _{hd\le \frac{2 \pi Xv}{T}} d~\mathcal R_{A}\left( \frac{Thd}{2\pi v},qd\right) \mathcal R_{B}\left( \frac{Thd}{2\pi v},qd\right) \frac{dv}{v}. \end{aligned}$$

Now we express this using Cauchy’s theorem as

$$\begin{aligned}&2\int _0^\infty \hat{\psi }(v)\sum _{q=1}^\infty \frac{\mu (q)}{(2\pi i)^3} \iiint _{\begin{array}{c} \mathfrak {R}s=2\\ |w-1|<\epsilon \\ |z-1|<\epsilon \end{array}}X^{s} \sum _{h,d=1}^\infty \left( \frac{Thd}{2\pi v}\right) ^{z+w-s-2} d\\&\quad \times ~\mathcal D_{A}\left( w,e(-\frac{1}{qd})\right) \mathcal D_{B}\left( z,e(-\frac{1}{qd})\right) \frac{dv}{v}~dz ~dw ~\frac{ds}{s}. \end{aligned}$$

Now we replace the sum over h by \(\zeta (2+s-w-z)\) and the integral over v by

$$\begin{aligned} \frac{\chi (w+z-s-1)}{2}\int _0^\infty \psi (t) t^{z+w-s-2} ~dt. \end{aligned}$$

where \(\chi \) denotes the factor in the functional equation \(\zeta (s)=\chi (s)\zeta (1-s)\). This leads to

$$\begin{aligned}&\int _0^\infty \psi (t)\sum _{q=1}^\infty \frac{\mu (q)}{(2\pi i)^3} \iiint _{\begin{array}{c} \mathfrak {R}s=2\\ |w-1|<\epsilon \\ |z-1|<\epsilon \end{array}}X^s \zeta (w+z-s-1) \sum _{d=1}^\infty d\left( \frac{Tdt}{2\pi }\right) ^{z+w-s-2} \\&\quad \times \mathcal D_{A}\left( w,e(-\frac{1}{qd})\right) \mathcal D_{B}\left( z,e(-\frac{1}{qd})\right) ~dt ~dz ~dw ~ds. \end{aligned}$$

Upon comparison with the recipe we have the identity

Theorem 1

$$\begin{aligned}&\mathop {{\text {Res}}}\limits _{\begin{array}{c} w=1-\alpha \\ z=1-\beta \end{array}}\sum _{q=1}^\infty \mu (q) \sum _{d=1}^\infty d^{z+w-1} \zeta (w+z-1) \mathcal D_{A}\left( w,e\left( -\frac{1}{qd}\right) \right) \mathcal {D}_{B}\left( z,e\left( -\frac{1}{qd}\right) \right) \\&\quad = \mathcal B(A'\cup \{-\beta \},B'\cup \{-\alpha \}) . \end{aligned}$$

Here \(\alpha \in A\) and \(\beta \in B\) are arbitrary, \(A'=A-\{\alpha \}\) and \(B'=B-\{\beta \}\). Theorem 1 follows from the identity stated at the end of Sect. 3 of [7] and the fact that the singular part of \(\mathcal {D}_A(s,e(\frac{1}{q}))\) is the same as that of \(q^{-s}\prod _{\alpha \in A}\zeta (s+\alpha )G_A(s,q)\), as proved in [4].

We call this theorem the “analytic version of the general shifted divisor sum.” In this paper we prove an identity that is an analogue of Theorem 1 but for a convolution of two shifted divisor sums; this is a step forward in the process of understanding moments. The key result is the following convolution identity.

Theorem 2

$$\begin{aligned}&\mathop {{\text {Res}}}\limits _{\begin{array}{c} w_1=1-\alpha _1\\ z_1=1-\beta _1\\ w_2=1-\alpha _2\\ z_2=1-\beta _2 \end{array}} \zeta (w_1+z_1-1)\zeta (w_2+z_2-1) \sum \limits _{\begin{array}{c} (M,N)=1\\ d_1,d_2\\ q_1,q_2 \end{array}}\frac{ \mu (q_1)\mu (q_2)d_1^{z_1+w_1-1} d_2^{z_2+w_2-1}}{M^{z_1+w_2-1} N^{w_1+z_2-1}} \\&\qquad \times \mathcal D_{A_1}\left( w_1,e\left( -\frac{N}{q_1d_1}\right) \right) \mathcal D_{A_2}\left( w_2,e\left( -\frac{M}{q_2d_2}\right) \right) \\&\qquad \times \mathcal D_{B_1}\left( z_1,e\left( -\frac{M}{q_1d_1}\right) \right) \mathcal D_{B_2}\left( z_2,e\left( -\frac{N}{q_2d_2}\right) \right) \\&\quad = \mathcal B(A''\cup \{-\beta _1,-\beta _2\},B''\cup \{-\alpha _1,-\alpha _2\}) . \end{aligned}$$

Here \(\alpha _i\in A_i\) and \(\beta _i\in B_i\), and, with \(A=A_1\cup A_2\) and \(B=B_1\cup B_2\), we write \(A''=A-\{\alpha _1,\alpha _2\}\) and \(B''=B-\{\beta _1,\beta _2\}\). Theorem 2 follows from Sect. 11 because if \((a,q)=1\) then the singular part of \(\mathcal {D}_A(s,e(\frac{a}{q}))\) is identical to that of \(q^{-s}\prod _{\alpha \in A}\zeta (s+\alpha )G_A(s,q)\).

A particularly interesting feature of this theorem is the appearance of the sum over M and N. It is these parameters which prompt us to liken this calculation to a circle method calculation. Basically, M and N make their appearance because of a splitting of the equation \(m_1m_2-n_1n_2=h\) into a pair of equations: when \(m_1/n_1\approx M/N\approx n_2/m_2\) we write \(m_1N-n_1M=h_1\) and \(m_2M-n_2N=h_2\). This is the fundamental new idea of the paper.

In the next section of this paper we present the basic set-up, which involves a convolution of two shifted divisor sums. In Sects. 4, 5 and 6 we deal with the semi-diagonal case where one of the shifted divisor sums is degenerate. In Sects. 7, 8 and 9 we motivate heuristically the identity of Theorem 2. This identity is sufficiently complicated that we find it convenient to recast it as an equality of certain power series. Sections 10 and 11 are devoted to the rigorous proof of this identity.

3 Type II convolution sums

To proceed we approach the moment \(I_{A,B}^\psi (T;X)\) through arithmetic means. To do this, we consider a convolution of shifted correlation sums.

We first make use of the fact that if \(A=A_1\cup A_2\) and \(B=B_1\cup B_2\) then \(\tau _A\) and \(\tau _B\) are convolutions: \(\tau _A=\tau _{A_1}*\tau _{A_2}\) and \(\tau _B=\tau _{B_1}*\tau _{B_2}\). We are thus interested in

$$\begin{aligned} \mathcal O_{II}&=\sum \limits _{\begin{array}{c} m_1m_2,n_1n_2 \le X\\ 0<|m_1m_2-n_1n_2|<m_1m_2/\tau \end{array}} \frac{\tau _{A_1}(m_1)\tau _{A_2}(m_2) \tau _{B_1}(n_1)\tau _{B_2}(n_2)}{m_1m_2} \\&\quad \times \hat{\psi }\left( \frac{T}{2\pi }\log ((n_1n_2)/(m_1m_2))\right) . \end{aligned}$$

Now we embark on a discrete analogue of the circle method, which basically consists of approximating a ratio, say \(m_1/n_1\), by a rational number with a small denominator, say \(M/N\), and then summing all of the terms with \(m_1/n_1\) close to \(M/N\).

To this end we introduce a parameter Q and subdivide the interval [0, 1] into Farey intervals associated with the fractions \(M/N\) with \(1\le M\le N\le Q\) and \((M,N)=1\) from the Farey sequence \(\mathcal F_Q\); see [6] for details. We define

$$\begin{aligned} h_1 := m_1 N - n_1 M \end{aligned}$$

and

$$\begin{aligned} h_2: = m_2 M -n_2 N. \end{aligned}$$

We have

$$\begin{aligned} m_1m_2 MN -n_1n_2 MN = h_1m_2M+h_2m_1 N -h_1 h_2 \end{aligned}$$

so that

$$\begin{aligned} \frac{m_1 m_2 -n_1 n_2}{m_1m_2}=\frac{h_1}{m_1N}+\frac{h_2}{m_2 M} -\frac{h_1h_2}{m_1m_2MN} \end{aligned}$$

and

$$\begin{aligned} \log \frac{n_1n_2}{m_1m_2}=\frac{h_1}{m_1N}+\frac{h_2}{m_2 M} +O\big (\frac{h_1h_2}{m_1m_2MN}\big ). \end{aligned}$$

The error term is negligible so we have now arranged the sum as

$$\begin{aligned} \sum \limits _{\begin{array}{c} M\le N\le Q\\ (M,N)=1 \end{array}} \sum _{h_1,h_2 } \sum \limits _{\begin{array}{c} {m_1m_2\le X }\\ {(*_1), (*_2) } \end{array}} \frac{\tau _{A_1}(m_1)\tau _{A_2}(m_2) \tau _{B_1}(n_1)\tau _{B_2}(n_2)}{m_1m_2} \hat{\psi }\left( \frac{Th_1}{2\pi m_1N}+\frac{Th_2}{2\pi m_2 M} \right) \end{aligned}$$
(3)

where

$$\begin{aligned} (*_1): m_1N-n_1M=h_1 \quad \text{ and } \quad (*_2): m_2M-n_2N=h_2 \end{aligned}$$

Note that for given \(m_1\), \(n_1\) and \(h_1\) the condition \((*_1)\) implies that \(m_1/n_1 \in \mathcal M_{M,N}\), the Farey arc associated with \(M/N\), so we do not need to impose that condition separately.

4 The case of \(h_2=0\)

We remark first of all that the terms with \(h_1=h_2=0\) are precisely the diagonal terms. Now we consider what happens if \(h_2=0\) and \(h_1\ne 0\). We call this a “semi-diagonal” term after [1].

If \(h_2=0\) then \(m_2M=n_2N\). Since \((M,N)=1\) it follows that \(m_2=N\ell \) and \(n_2=M\ell \) for some \(\ell \). Thus we have

$$\begin{aligned} \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}}\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \sum _{ h_1} \sum \limits _{\begin{array}{c} m_1, n_1 ,\ell \\ (*_1) \\ n_1\ge |h_1|Q \end{array}} \frac{\tau _{A_1}(m_1)\tau _{A_2}(N\ell ) \tau _{B_1}(n_1)\tau _{B_2}(M\ell )}{m_1N\ell } \hat{\psi }\left( \frac{Th_1}{2\pi m_1N} \right) \end{aligned}$$

where

$$\begin{aligned} (*_1): m_1N-n_1M=h_1 . \end{aligned}$$

In general, with \(*:mN-nM=h \), we expect by the delta-method that

$$\begin{aligned}&\langle \tau _{A}(m)\tau _{B}(n)\rangle ^{(*)}_{m= u}\sim \sum \limits _{\begin{array}{c} \alpha \in A\\ \beta \in B \end{array}}u^{-\alpha - \beta }M^{-1+\beta }N^{-\beta } Z(A'_{-\alpha })Z(B'_{-\beta })\\&\quad \times \sum _{d\mid h} \frac{1}{d^{1-\alpha - \beta }} \sum _{q} \frac{\mu (q)(qd,M)^{1-\beta } (qd,N)^{1- \alpha }}{q^{2-\alpha -\beta }} \nonumber \\&\quad \times \, G_{A}\left( 1-\alpha ,\frac{qd}{(qd,N)}\right) G_{B}\left( 1-\beta ,\frac{qd}{(qd,M)}\right) ,\nonumber \end{aligned}$$
(4)

where G is a multiplicative function for which

$$\begin{aligned} G_A(1- \alpha ,p^r)=\prod _{\hat{\alpha }\in A'}\left( 1-\frac{1}{p^{1+\hat{\alpha }-\alpha }}\right) \sum _{j=0}^\infty \frac{\tau _{A'}(p^{j+r})}{p^{j(1- \alpha )}} \end{aligned}$$

with \(A'=A-\{ \alpha \}\) and where

$$\begin{aligned} Z(A)=\prod _{a\in A}\zeta (1+a). \end{aligned}$$

5 A diversion

The ensuing calculations are about to become (more) complicated, largely because of the arithmetic factors. We pause to show what the calculations look like without the arithmetic factors; this should help the reader when we complete the full calculation in the next section. Basically we ignore the terms with \(q\ge 2\) and we replace \(G_A(1-\alpha ,d)\) by \(\tau _{A'}(d)\).

Altogether we now have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}} Z((A_1')_{-\alpha })Z((B_1')_{- \beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}}\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \sum _{\ell , h_1} \frac{\tau _{A_2}(\ell )\tau _{B_2}(\ell )}{\ell }\\&\quad \times \int _{u\le \frac{X}{N\ell }} \sum _{d\mid h_1} \frac{(d,N)^{1-{\alpha }}(d,M)^{1-{\beta }} \tau _{A_1'}\big (\frac{d}{(d,N)}\big )\tau _{B_1'}\big (\frac{d}{(d,M)}\big )\tau _{A_2}(N)\tau _{B_2}(M)}{ M^{1- \beta }N^{1+ \beta }u^{\alpha +\beta }d^{1-\alpha - \beta }} \hat{\psi }\left( \frac{Th_1}{2\pi uN} \right) \frac{du}{u}. \end{aligned}$$

We make the substitution

$$\begin{aligned} v=\frac{Th_1}{2\pi uN}. \end{aligned}$$

The above is

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}Z((A_1')_{-\alpha })Z((B_1')_{- \beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}}\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \sum _{\ell , h_1} \frac{\tau _{A_2}(\ell )\tau _{B_2}(\ell )}{\ell }\\&\quad \times \int _{v\ge \frac{Th_1\ell }{2\pi X}} \sum _{d\mid h_1} \frac{(d,N)^{1-{\alpha }}(d,M)^{1-{\beta }} \tau _{A_1'}\left( \frac{d}{(d,N)}\right) \tau _{B_1'}\left( \frac{d}{(d,M)}\right) \tau _{A_2}(N)\tau _{B_2}(M)}{ M^{1- \beta }N^{1+ \beta }\left( \frac{Th_1}{2\pi vN}\right) ^{\alpha +\beta }d^{1-\alpha - \beta }} \hat{\psi }(v)\frac{dv}{v}. \end{aligned}$$

Now we switch the sums around; replacing \(h_1\) by \(h_1d\) and bringing the sum over \(h_1\) and \(\ell \) to the inside, we have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}\left( \frac{T}{2\pi }\right) ^{- \alpha - \beta }Z((A_1')_{-\alpha })Z((B_1')_{- \beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}}\frac{\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \tau _{A_2}(N)\tau _{B_2}(M) }{M^{1- \beta }N^{1- \alpha }}\\&\quad \times \int _v \frac{ \hat{\psi }(v)}{v^{1-\alpha - \beta }} \sum _{d\ell h_1 \le \frac{2\pi Xv}{T}} \frac{(d,N)^{1-{\alpha }}(d,M)^{1-{\beta }} \tau _{A_1'}\big (\frac{d}{(d,N)}\big )\tau _{B_1'}\big (\frac{d}{(d,M)}\big )\tau _{A_2}(\ell )\tau _{B_2}(\ell )}{ d \ell h_1^{ \alpha + \beta }} ~dv. \end{aligned}$$

Using Perron’s formula we write this as

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}\left( \frac{T}{2\pi }\right) ^{- \alpha - \beta }Z((A_1')_{-\alpha })Z((B_1')_{- \beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}}\frac{\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \tau _{A_2}(N)\tau _{B_2}(M) }{M^{1- \beta }N^{1- \alpha }}\\&\quad \times \int _v \frac{ \hat{\psi }(v)}{v^{1-\alpha - \beta }} \frac{1}{2\pi i}\int _{(2)}\sum _{d,\ell , h_1 } \frac{(d,N)^{1-{\alpha }}(d,M)^{1-{\beta }} \tau _{A_1'}\big (\frac{d}{(d,N)}\big )\tau _{B_1'}\big (\frac{d}{(d,M)}\big )\tau _{A_2}(\ell )\tau _{B_2}(\ell )}{ d^{s+1}\ell ^{s+1} h_1^{s+ \alpha + \beta }}\\&\quad \times \left( \frac{2\pi Xv}{T}\right) ^s\frac{ds}{s} ~dv. \end{aligned}$$

The sum over \(\ell \) and \(h_1\) here is essentially

$$\begin{aligned} \zeta (s+\alpha + \beta ) Z((A_2)_s,B_2). \end{aligned}$$

The sum over d, M and N we evaluate to a first approximation by looking at the polar parts of

$$\begin{aligned} \sum \limits _{\begin{array}{c} d, M,N\\ (M,N)=1 \end{array}} \frac{(d,M)^{1-\beta }(d,N)^{1-\alpha }\tau _{A_1'}\left( \frac{d}{(d,N)}\right) \tau _{B_1'}\left( \frac{d}{(d,M)}\right) \tau _{A_2}(N) \tau _{B_2}(M)}{ d^{1+s}M^{1-\beta } N^{1-\alpha } }; \end{aligned}$$

these are calculated with the help of the following table:

$$\begin{aligned} \begin{array}{|l|l|l|l|l|l|} \hline d&{}M&{}N&{} \text{ Euler } \text{ term }&{} Z-\text{ factor }\\ \hline p&{}1&{}1&{}\tau _{A_1'}(p)\tau _{B_1'}(p)/p^{1+s}&{}Z((A_1')_{s}, B_1')\\ \hline 1&{}p&{}1&{} \tau _{B_2}(p)/p^{1-\beta } &{}Z(B_2,\{- \beta \})\\ \hline 1&{}1&{}p&{}\tau _{A_2}(p)/p^{1-\alpha } &{}Z(A_2,\{- \alpha \})\\ \hline p&{}1&{}p&{} \tau _{B_1'}(p)\tau _{A_2}(p)/p^{1+s}&{}Z(B_1', (A_2)_{s})\\ \hline p&{}p&{}1&{} \tau _{A_1'}(p)\tau _{B_2}(p)/p^{1+s} &{}Z((A_1')_{s}, B_2)\\ \hline \end{array} \end{aligned}$$

Each row of the table records the first-order Euler term that arises when the indicated variables among d, M and N are divisible by p, together with the zeta-factor that this term generates; to a first approximation the polar behaviour is therefore governed by the product of all of these Z-factors, and it is this product that we take. Now the v-integral is

$$\begin{aligned} \int _v \frac{ \hat{\psi }(v)}{v^{1-s-\alpha - \beta }}~dv=(1/2) \chi (1-s-\alpha -\beta ) \int _0^\infty \psi (t) t^{-s-\alpha -\beta }~dt. \end{aligned}$$

Note that

$$\begin{aligned} \chi (1-s-\alpha -\beta )\zeta (s+\alpha + \beta )= \zeta (1-s-\alpha -\beta ). \end{aligned}$$

If we include the factors \(Z((A_2)_s,B_2)\), \(Z((A_1')_{-\alpha })Z((B_1')_{- \beta }) \), and \(Z(\{-s-\alpha \}, \{- \beta \})\) then the product of all of these Z-factors is

$$\begin{aligned} Z\bigg ((A_1' \cup A_2)_{s} \cup \{- \beta \} , B_1' \cup B_2 \cup \{-s-\alpha \} \bigg )=Z\bigg ((A')_{s} \cup \{- \beta \} , B' \cup \{-s-\alpha \} \bigg ). \end{aligned}$$

Indeed, writing \(A'=A_1'\cup A_2\) and \(B'=B_1'\cup B_2\) and expanding this last product block by block, one obtains exactly the five Z-factors from the table together with the four factors just listed; note, for example, that \(Z((A_1')_{s},\{-s-\alpha \})=Z((A_1')_{-\alpha })\), the s’s cancelling. Thus, altogether we have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}\left( \frac{T}{2\pi }\right) ^{- \alpha - \beta } \int _t \frac{ \psi (t)}{t^{\alpha + \beta }} \frac{1}{2\pi i}\int _{(2)}Z\bigg ((A')_{s} \cup \{- \beta \} , B' \cup \{-s-\alpha \} \bigg ) \left( \frac{2\pi X}{tT}\right) ^s\frac{ds}{s} ~dt. \end{aligned}$$

Compare this with Eq. (4) of [7] which gives the “one-swap” terms from the recipe:

$$\begin{aligned}&\int _0^\infty \psi (t) \sum \limits _{\begin{array}{c} \alpha \in A\\ \beta \in B \end{array}} \left( \frac{Tt}{2\pi }\right) ^{- \alpha - \beta } Z((A')_{-\alpha })Z((B')_{- \beta }) \\&\quad \times \frac{1}{2\pi i} \int _{\mathfrak {R}s=4} \frac{\left( \frac{2\pi X}{Tt}\right) ^s}{s} \mathcal A(A'\cup \{-\beta -s\},B'_s\cup \{-\alpha \}) Z(A'_{s},B') \zeta (1-\alpha -\beta -s)~ds. \nonumber \end{aligned}$$

The only differences are that so far we have ignored the arithmetic factors, and that in the expression we just derived we have the restrictions \(\alpha \in A_1\) and \( \beta \in B_1\). But as \(A_1\) and \(B_1\) vary through subsets of A and B every possible \(\alpha \) and \( \beta \) will appear. Also, we still have to account for the terms with \(M>N\) and those with \(h_1=0\).

6 The same calculation with the arithmetic factors

We replace \(m_1\) by \(u_1\); taking into account the arithmetic considerations and also using \(u_1\ell N=m_1m_2\le X\), we have that our sum is \(\sum _{\alpha ,\beta } Z((A'_1)_{-\alpha })Z((B'_1)_{-\beta }) \)

$$\begin{aligned}&\quad \times \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}} \phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) \sum _{h_1} \sum _{ \ell }\frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{M^{1-\beta }N^{1+ \beta }\ell } \\&\quad \times \int _{u_1\ell \le \frac{X}{ N} } \sum _{d\mid h_1 } \sum _{q=1}^\infty \frac{\mu (q)(qd,M)^{1-\beta } (qd,N)^{1- \alpha }}{d^{1- \alpha -\beta }q^{2- \alpha - \beta }}\\&\quad \times \, G_{A_1}\big (1-\alpha ,\frac{qd}{(qd,N)}\big ) G_{B_1}\big (1-\beta ,\frac{qd}{(qd,M)}) \hat{\psi }\left( \frac{Th_1}{2\pi u_1N} \right) ~\frac{du_1}{u_1^{1+\alpha +\beta }}. \end{aligned}$$

The term with \(h_1=0\) just leads to diagonal terms which are easy to deal with. Now we group the non-zero terms \(h_1\) and \(-h_1\) together and use \( \hat{\psi }(-v)=\overline{ \hat{\psi }(v)}\). We replace \(h_1\) by \(h_1d\). We make the substitution \(v_1=\frac{Th_1d}{2\pi u_1N}\) in the integral and switch the integral over \(v_1\) with the sum over \(h_1\), d and \(\ell \). Then (with \(h_1>0\)) we have that

$$\begin{aligned} \frac{\ell NTh_1d}{2\pi v_1N}=u_1\ell N\le X \end{aligned}$$

implies that

$$\begin{aligned} \ell h_1 d\le \frac{2\pi Xv_1}{T}. \end{aligned}$$

Thus we have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}} \left( \frac{T}{2\pi }\right) ^{-\alpha - \beta } Z((A_1')_{- \alpha })Z((B_1')_{ -\beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}} \frac{\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) }{M^{1-\beta } N^{1-\alpha }} \int _{0}^\infty (2\mathfrak {R}{\hat{\psi }}(v_1))\\&\quad \times \sum _{h_1\ell d\le \frac{2\pi Xv_1}{T}}\frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{h_1^{\alpha +\beta } \ell d} \sum _{q\ge 1} \frac{\mu (q)(qd,M)^{1-\beta }(qd,N)^{1-\alpha }}{q^{2-\alpha - \beta }} \\&\quad \times \, G_{{A_1}}\left( 1- \alpha ,\frac{qd}{(qd,N)} \right) G_{{B_1}}\left( 1-\beta ,\frac{qd}{(qd,M)} \right) ~\frac{dv_1}{v_1^{1-\alpha -\beta }}. \end{aligned}$$

Now we use Perron’s formula to detect the condition \(h_1\ell d\le \frac{2\pi Xv_1}{T}\) in the sum over \(h_1\), \(\ell \) and d. This gives

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}\left( \frac{T}{2\pi }\right) ^{- \alpha - \beta } Z((A_1')_{- \alpha })Z((B_1')_{ -\beta }) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}} \frac{\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) }{M^{1-\beta }N^{1- \alpha }} \frac{1}{2\pi i}\int _{(2)}\\&\quad \times \int _{0}^\infty (2\mathfrak {R}{\hat{\psi }}(v_1)) \left( \frac{2\pi Xv_1}{T}\right) ^s \sum _{h_1,\ell , d} \frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{h_1^{s+\alpha +\beta } \ell ^{1+s} d^{1+s } } \sum _{q\ge 1} \frac{\mu (q)(qd,N)^{1-\alpha }(qd,M)^{1- \beta }}{q^{2-\alpha - \beta }} \\&\quad \times \, G_{{A_1}}\left( 1- \alpha ,\frac{qd}{(qd,N)} \right) G_{{B_1}}\left( 1-\beta ,\frac{qd}{(qd,M)} \right) ~\frac{dv_1}{v_1^{1- \alpha - \beta }}\frac{ds}{s}. \end{aligned}$$

The sum over \(h_1\) is \(\zeta (s+\alpha +\beta )\). The integral over \(v_1\) is

$$\begin{aligned} \int _{0 }^{\infty } v_1^{-1+s+ \alpha + \beta } (2\mathfrak {R}\hat{\psi }(v_1))~dv_1 =\chi (1-s- \alpha - \beta ) \int _0^\infty \psi (t)t^{-s-\alpha - \beta }~dt. \end{aligned}$$
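This is the standard Mellin evaluation (interpreted, as usual, in the sense of analytic continuation): writing \(2\mathfrak {R}\hat{\psi }(v_1)=2\int _0^\infty \psi (t)\cos (2\pi t v_1)~dt\) and using

$$\begin{aligned} 2\int _0^\infty v^{c-1}\cos (2\pi t v)~dv=\frac{2\,\Gamma (c)\cos (\pi c/2)}{(2\pi t)^{c}}=t^{-c}\chi (1-c), \qquad c=s+\alpha +\beta . \end{aligned}$$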

Combining these two facts and using the functional equation for \(\zeta \) we have

$$\begin{aligned}&\sum \limits _{\begin{array}{c} \alpha \in A_1\\ \beta \in B_1 \end{array}}\left( \frac{T}{2\pi }\right) ^{- \alpha - \beta } Z((A_1')_{- \alpha })Z((B_1')_{ -\beta }) \int _{0 }^{\infty } t^{- \alpha - \beta } \psi (t) \sum \limits _{\begin{array}{c} M\le N\\ (M,N)=1 \end{array}} \frac{\phi \left( \frac{M}{Q}\right) \phi \left( \frac{N}{Q}\right) }{M^{1- \beta }N^{1-\alpha }}\\&\quad \times \frac{1}{2\pi i}\int _{(2)}\zeta (1-s-\alpha - \beta ) \left( \frac{2\pi X}{tT}\right) ^s\sum _{\ell , d} \frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{ \ell ^{1+s} d^{1+s } } \\&\quad \times \sum _{q\ge 1} \frac{\mu (q)(qd,N)^{1-\alpha }(qd,M)^{1- \beta }}{q^{2-\alpha - \beta }} G_{{A_1}}\left( 1- \alpha ,\frac{qd}{(qd,N)} \right) G_{{B_1}}\left( 1-\beta ,\frac{qd}{(qd,M)} \right) ~ \frac{ds}{s}~dt. \end{aligned}$$

This requires studying the Dirichlet series

$$\begin{aligned}&\sum _{ (M,N)=1} \frac{1 }{M^{1-\beta }N^{1-\alpha }} \sum _{\ell , d} \frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{ \ell ^{1+s} d^{1+s } } \sum _{q\ge 1} \frac{\mu (q)(qd,N)^{1-\alpha }(qd,M)^{1- \beta }}{q^{2-\alpha - \beta }} \\&\qquad \qquad \times G_{{A_1}}\left( 1- \alpha ,\frac{qd}{(qd,N)} \right) G_{{B_1}}\left( 1-\beta ,\frac{qd}{(qd,M)} \right) \end{aligned}$$

See the appendix (Sect. 10) for the resolution of this arithmetic factor.

Regarding multiplicities, see the section on automorphisms (Sect. 8) below.

Taking into account also the terms with \(h_1=0\) and \(h_2\ne 0\), which are treated symmetrically, we find that we have now accounted for all of the one-swap terms arising from the semi-diagonal contributions.

7 \(h_1h_2\ne 0\)

Now we come to the crux of the paper, the terms where neither \(h_1\) nor \(h_2\) is 0; we need to match these up with the two-swap terms.

In the formula (3) above we replace the convolution sums by their averages, i.e.

$$\begin{aligned} \iint _{u_1u_2\le X }\langle \tau _{A_1}(m_1)\tau _{B_1}(n_1)\rangle ^{(*_1)}_{m_1\sim u_1} \langle \tau _{A_2}(m_2)\tau _{B_2}(n_2)\rangle ^{(*_2)}_{m_2\sim u_2} \hat{\psi }\left( \frac{Th_1}{2\pi u_1N}+\frac{Th_2}{2\pi u_2 M} \right) ~\frac{du_1}{u_1}\frac{du_2}{u_2}. \end{aligned}$$

We insert the formula (4) for these averages. After switching the ensuing sums over \(h_1,h_2\) and \(d_1,d_2\) we have \(\sum _{M,N} \phi (M/Q)\phi (N/Q)\) times

$$\begin{aligned}&\sum \limits _{\begin{array}{c} {\alpha _1}\in A_1\\ \alpha _2\in A_2 \end{array}} \sum \limits _{\begin{array}{c} {\beta _1}\in B_1\\ \beta _2\in B_2 \end{array}} Z((A_1')_{- \alpha _1})Z((A_2')_{- \alpha _2}) Z((B_1')_{- \beta _1})Z((B_2')_{- \beta _2}) \sum \limits _{\begin{array}{c} q_1,d_1,h_1 \\ q_2,d_2,h_2 \end{array}} \frac{\mu (q_1)\mu (q_2)}{q_1^{2-\alpha _1- {\beta }_1}q_2^{2-\alpha _2- {\beta }_2}} \\&\quad \times \frac{G_{A_1}(1- \alpha _1, \frac{q_1d_1}{(q_1d_1,N)}) G_{A_2}(1- \alpha _2 ,\frac{q_2d_2}{(q_2d_2,M)}) G_{B_1}(1- \beta _1, \frac{q_1d_1}{(q_1d_1,M)}) G_{B_2}(1- \beta _2 ,\frac{q_2d_2}{(q_2d_2,N)})}{(q_1d_1,N)^{-1+\alpha _1}(q_1d_1,M)^{-1+\beta _1}(q_2d_2,M)^{-1+\alpha _2}(q_2d_2,N)^{-1+\beta _2} d_1^{1- \alpha _1 - \beta _1} d_2^{1- \alpha _2 - \beta _2} } \\&\quad \times \iint _{T^2\le u_1u_2\le X}M^{-1+\beta _1-\beta _2}N^{-1+ \beta _2 -\beta _1} u_1^{- \alpha _1 - \beta _1} u_2^{- \alpha _2 - \beta _2}\\&\quad \times \hat{\psi }\left( \frac{Th_1d_1}{2\pi u_1N}+ \frac{Th_2d_2}{2\pi u_2M}\right) \frac{du_1}{u_1}\frac{du_2}{u_2}. \end{aligned}$$

Let’s first assume that \(h_1>0\) and \(h_2>0\). We make the changes of variable \(v_1=\frac{Th_1d_1}{2\pi u_1N}\) and \(v_2=\frac{Th_2d_2}{2\pi u_2M}\) and bring the sums over \(h_1\) and \(h_2\) to the inside; \(u_1u_2<X\) implies that

$$\begin{aligned} h_1d_1h_2d_2< \frac{4\pi ^2 Xv_1v_2MN}{T^2}. \end{aligned}$$

Then the sums over the \(q_i,h_i,d_i\) are \(N^{-1+\alpha _1+\beta _2} M^{-1+\alpha _2+\beta _1}\) times

$$\begin{aligned}&\left( \frac{T}{2\pi }\right) ^{- \alpha _1 - \alpha _2- \beta _1 - \beta _2} \iint _{v_1,v_2} v_1^{ \alpha _1 + \beta _1}v_2^{ \alpha _2 + \beta _2}\hat{\psi }(v_1+v_2) \frac{1}{2\pi i} \int _{(2)} \sum \limits _{\begin{array}{c} q_1,d_1,h_1 \\ q_2,d_2,h_2 \end{array}} \frac{\mu (q_1)\mu (q_2)}{q_1^{2-\alpha _1- {\beta _1}} q_2^{2-\alpha _2- {\beta }_2}}\\&\quad \times \frac{G_{A_1}(1- \alpha _1, \frac{q_1d_1}{(q_1d_1,N)}) G_{A_2}(1- \alpha _2 ,\frac{q_2d_2}{(q_2d_2,M)}) G_{B_1}(1- \beta _1, \frac{q_1d_1}{(q_1d_1,M)}) G_{B_2}(1- \beta _2 ,\frac{q_2d_2}{(q_2d_2,N)})}{(q_1d_1,N)^{-1+\alpha _1}(q_1d_1,M)^{-1+\beta _1}(q_2d_2,M)^{-1+\alpha _2} (q_2d_2,N)^{-1+\beta _2}d_1^{1+s} d_2^{1+s}h_1^{\alpha _1+\beta _1+s}h_2^{\alpha _2+\beta _2+s} }\\&\quad \times \frac{\left( \frac{4\pi ^2 Xv_1v_2MN}{T^2}\right) ^s}{s} ~ds \frac{dv_1}{v_1}\frac{dv_2}{v_2}. \end{aligned}$$

The sums over \(h_1\) and \(h_2\) are \(\zeta (s+\alpha _1+\beta _1)\zeta (s+\alpha _2+\beta _2)\). The other 3 cases of the signs of \(h_1\) and \(h_2\) can be taken care of similarly. Then we use

$$\begin{aligned} \hat{\psi }(v_1+v_2)=\int _0^\infty \psi (t)e(t(v_1+v_2))~dt \end{aligned}$$

to see that

$$\begin{aligned}&\hat{\psi }(v_1+v_2)+ \hat{\psi }(v_1-v_2)+ \hat{\psi }(-v_1+v_2)+ \hat{\psi }(-v_1-v_2)\\&\quad =\int _0^\infty \psi (t)\big (e(tv_1)+e(-tv_1)\big )\big (e(tv_2)+e(-tv_2)\big )~dt. \end{aligned}$$

Also

$$\begin{aligned} \int _0^\infty v_1^{s+\alpha +\gamma -1}(e(tv_1)+e(-tv_1))~dv_1 =t^{-s-\alpha -\gamma }\chi (1-s-\alpha -\gamma ), \end{aligned}$$

and similarly for the integral over \(v_2\). This leaves us with a total for the sum over \(M,N,q_i,h_i,d_i\) of

$$\begin{aligned}&\int _0^\infty \psi (t) \left( \frac{tT}{2\pi }\right) ^{- \alpha _1 - \alpha _2- \beta _1 - \beta _2} \frac{1}{2\pi i} \int _{(2)}\frac{\left( \frac{4\pi ^2 X }{t^2T^2}\right) ^s}{s} \zeta (1-s-\alpha _1- \beta _1)\zeta (1-s-\alpha _2- \beta _2)\\&\quad \times \sum \limits _{\begin{array}{c} (M,N)=1\\ M\le N \end{array}}\frac{\phi (M/Q)\phi (N/Q)}{M^{1-s-\alpha _2-\beta _1}N^{1-s-\alpha _1-\beta _2}} \sum \limits _{\begin{array}{c} q_1,d_1 \\ q_2,d_2 \end{array}} \frac{\mu (q_1)\mu (q_2)}{q_1^{2-\alpha _1- {\beta }_1}q_2^{2-\alpha _2- {\beta }_2}} \frac{G_{A_1}(1- \alpha _1, \frac{q_1d_1}{(q_1d_1,N)})}{(q_1d_1,N)^{-1+\alpha _1}}\\&\quad \times \frac{ G_{A_2}(1- \alpha _2 ,\frac{q_2d_2}{(q_2d_2,M)}) G_{B_1}(1- \beta _1, \frac{q_1d_1}{(q_1d_1,M)}) G_{B_2}(1- \beta _2 ,\frac{q_2d_2}{(q_2d_2,N)})}{ (q_1d_1,M)^{-1+\beta _1}(q_2d_2,M)^{-1+\alpha _2}(q_2d_2,N)^{-1+\beta _2}d_1^{1+s} d_2^{1+s} }~ds ~dt. \end{aligned}$$

Recall that

$$\begin{aligned} G_A(1- \alpha , p)= & {} \tau _{A'}(p)+O(1/p); \end{aligned}$$

we use this to calculate the polar part of

$$\begin{aligned} \sum \limits _{\begin{array}{c} d_1,d_2\\ (M,N)=1\\ M\le N \end{array}} \frac{\tau _{A_1'}(\frac{d_1}{(d_1,N)}) \tau _{B_1'}(\frac{d_1}{(d_1,M)})\tau _{A_2'}(\frac{d_2}{(d_2,M)}) \tau _{B_2'}(\frac{d_2}{(d_2,N)})}{(d_1,N)^{-1+\alpha _1}(d_1,M)^{-1+\beta _1}(d_2,M)^{-1+\alpha _2}(d_2,N)^{-1+\beta _2} d_1^{1+s}d_2^{1+s}M^{1-s-\alpha _2 - \beta _1} N^{1-s-\alpha _1 - \beta _2} }. \end{aligned}$$

We do this by calculating the significant parts of the Euler product. The following table is helpful; we let \(A_1'=A_1-\{\alpha _1\}\), \(A_2'=A_2-\{\alpha _2\}\), \(B_1'=B_1-\{ \beta _1\}\), and \(B_2'=B_2-\{ \beta _2\}\).

$$\begin{aligned} \begin{array}{|l|l|l|l|l|l|} \hline d_1&{}d_2&{}N&{}M&{} \text{ Euler } \text{ term }&{} Z-\text{ factor }\\ \hline p&{}1&{}1&{}1&{}\tau _{A_1'}(p)\tau _{B_1'}(p)/p^{1+s}&{}Z((A_1')_s, B_1')\\ \hline 1&{}p&{}1&{}1&{}\tau _{A_2'}(p)\tau _{B_2'}(p)/p^{1+s}&{}Z((A_2')_s, B_2')\\ \hline 1&{}1&{}p&{}1&{}p^{ \alpha _1+ \beta _2}/p^{1-s}&{}Z(\{- \alpha _1-s\}, \{- \beta _2\})\\ \hline 1&{}1&{}1&{}p&{}p^{ \alpha _2+ \beta _1}/p^{1-s}&{}Z(\{- \alpha _2-s\}, \{- \beta _1\})\\ \hline p&{}1&{}p&{}1&{} \tau _{B_1'}(p)/p^{1- \beta _2}&{}Z(B_1', \{- \beta _2\})\\ \hline p&{}1&{}1&{}p&{} \tau _{A_1'}(p)/p^{1- \alpha _2 }&{}Z(A_1', \{- \alpha _2\})\\ \hline 1&{}p&{}p&{}1&{} \tau _{A_2'}(p)/p^{1- \alpha _1}&{}Z(A_2', \{- \alpha _1\})\\ \hline 1&{}p&{}1&{}p&{} \tau _{B_2'}(p)/p^{1- \beta _1}&{}Z(B_2', \{- \beta _1\})\\ \hline p&{}p&{}p&{}1&{} \tau _{A_2'}(p)\tau _{B_1'}(p)/p^{1+s} &{}Z((A_2')_s, B_1')\\ \hline p&{}p&{}1&{}p&{} \tau _{A_1'}(p)\tau _{B_2'}(p)/p^{1+s}&{}Z((A_1')_s, B_2') \\ \hline \end{array} \end{aligned}$$

If we include the factors \(Z((A_1')_s,\{-s-\alpha _1\})Z(B_1',\{-\beta _1\})\), \(Z((A_2')_s,\{-s-\alpha _2\})Z(B_2',\{-\beta _2\})\), \(Z(\{-s-\alpha _1\}, \{- \beta _1\})\) and \(Z(\{-s-\alpha _2\}, \{- \beta _2\})\) then the product of all of these Z-factors is

$$\begin{aligned}&Z\big ((A_1' \cup A_2')_s \cup \{- \beta _1\} \cup \{- \beta _2\} , B_1' \cup B_2'\cup \{-s- \alpha _1\}\cup \{-s- \alpha _2\}\big )\\&\qquad =Z((A-S)_s+T^-,B-T+(S_s)^-) \end{aligned}$$

where \(S=\{\alpha _1, \alpha _2\}\) and \(T=\{ \beta _1,\beta _2\}\).

The predicted two-swap terms from the recipe are

$$\begin{aligned}&\sum \limits _{\begin{array}{c} S\subset A,T\subset B \\ |S|=|T|=2 \end{array}}\int _0^\infty \psi (t)\left( \frac{tT}{2\pi }\right) ^{-\sum \limits _{\begin{array}{c} \alpha \in S \\ \beta \in T \end{array}}( \alpha +\beta )} \\&\quad \times \frac{1}{2\pi i} \int _{(2)}\frac{\left( \frac{4\pi ^2 X }{t^2T^2}\right) ^s}{s}\mathcal AZ((A-S)_s+T^-,B-T+S_s^-)~ds ~dt \end{aligned}$$

which matches the above, except that in the recipe version S and T are allowed to range over all two-element subsets of A and B, whereas in the correlation version we first split \(A=A_1\cup A_2\) and \(B=B_1\cup B_2\) and then take one element from \(A_1\) and one from \(A_2\) to make up our two-element set S, and similarly one element from \(B_1\) and one from \(B_2\) to make up T.

See the last two sections for the calculation of the arithmetic factor.

8 Automorphisms

The final step of this paper is to account for the apparent over-counting that has occurred. The explanation is that there are automorphisms that have to be taken into account. In this section we work out these multiplicities.

We start with

$$\begin{aligned} \left\{ \begin{array}{ll} Nm_1&{}=Mn_1+h_1\\ Mm_2&{}=Nn_2+h_2 \end{array} \right. \end{aligned}$$

Suppose \(m_1=\mu _1\hat{\mu _1}\), \(m_2=\mu _2\hat{\mu _2}\), \(n_1=\nu _1\hat{\nu _1}\) and \(n_2=\nu _2\hat{\nu _2}\). Multiply the first equation by \(\mu _2\nu _2\) and the second equation by \(\mu _1\nu _1\). Let

$$\begin{aligned} \tilde{M}= \nu _1\mu _2 M; \quad \tilde{N}=\mu _1\nu _2 N; \quad \tilde{m_1}=\hat{\mu _1}\mu _2; \quad \tilde{m_2}=\mu _1\hat{\mu _2};\quad \tilde{n_1} =\hat{\nu _1}\nu _2 ;\quad \tilde{n_2}=\nu _1\hat{\nu _2}. \end{aligned}$$

Then we have

$$\begin{aligned} \left\{ \begin{array}{ll} \tilde{N}\tilde{m_1}&{}=\tilde{M}\tilde{n_1}+\tilde{h_1}\\ \tilde{M}\tilde{m_2}&{}=\tilde{N}\tilde{n_2}+\tilde{h_2} \end{array} \right. \end{aligned}$$

where

$$\begin{aligned} \tilde{h_1}=\mu _2\nu _2 h_1 \qquad \text{ and } \qquad \tilde{h_2}=\mu _1\nu _1h_2. \end{aligned}$$
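To verify the first of these equations note that

$$\begin{aligned} \tilde{N}\tilde{m_1}=\mu _1\nu _2 N\cdot \hat{\mu _1}\mu _2 =\mu _2\nu _2\, Nm_1 \qquad \text{ and } \qquad \tilde{M}\tilde{n_1}=\nu _1\mu _2 M\cdot \hat{\nu _1}\nu _2 =\mu _2\nu _2\, Mn_1, \end{aligned}$$

so it is just the first equation of the original system multiplied through by \(\mu _2\nu _2\); the second equation is obtained in the same way using the factor \(\mu _1\nu _1\).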

This scheme provides lots of automorphisms and explains the overcounting we have.

Basically there is one automorphism for each quadruple of divisors of \(m_1,m_2,n_1\) and \(n_2\). We have \(m=m_1m_2\) and \(n=n_1n_2\) where, if \(|A|=k\) and \(|B|=\ell \), then \(\tau _A\) is a convolution of k atomic functions and \(\tau _B\) is a convolution of \(\ell \) atomic functions. We can think of

$$\begin{aligned} A=\{\alpha _1,\dots ,\alpha _k\} \qquad B=\{\beta _1,\dots ,\beta _\ell \} \end{aligned}$$

and with \(I=\{1,2,\dots ,k\}\) and \(J=\{1,2,\dots ,\ell \}\) we partition \(I=I_1\cup I_2\) and \(J=J_1\cup J_2\). Then in our decompositions \(A=A_1\cup A_2\) and \(B=B_1\cup B_2\) we have \(A_1=\{\alpha _i:i\in I_1\}\) etc. These correspond to the decompositions \(m=m_1m_2\) and \(n=n_1n_2\). If we write \(m=\mu _1\dots \mu _k\) and \(n=\nu _1\dots \nu _\ell \) then we can put \(m_1=\prod _{i\in I_1}\mu _i\) etc. The number of such decompositions of A or of m is just the number of subsets of A, i.e. \(2^k\); and the number for B is \(2^\ell \). We can associate an automorphism as above with each such decomposition. Therefore, there are \(2^{k+\ell }\) automorphisms in total, so each term is counted with multiplicity \(2^{k+\ell }\). Now let’s see that this overcounting is in agreement with the number of ways of producing the term from the recipe with, say, \(S=\{\alpha _1,\alpha _2\}\) and \(T=\{\beta _1,\beta _2\}\). The term from the recipe will occur whenever we have a decomposition of \(A=A_1\cup A_2\) and \(B=B_1\cup B_2\) in which precisely one of \(\alpha _1\) and \(\alpha _2\) is in \(A_1\) and the other in \(A_2\), and similarly for B and the \(\beta \)s. How many ways are there to do this? If we say that \(\alpha _1\) is to be in \(A_1\) and \(\alpha _2\) in \(A_2\) then we have \(k-2\) other elements to be partitioned into two sets. There are \(2^{k-2}\) subsets and the chosen subset can be assigned to go with \(\alpha _1\) or with \(\alpha _2\), so we have an extra factor of 2; and then there is another factor of 2 from putting \(\alpha _1\) in \(A_2\) and \(\alpha _2 \) in \(A_1\). This gives a total of \(2^k\) ways, and likewise \(2^\ell \) ways of distributing the \(\beta \)s into the Bs. So, we have the same amount of overcounting as there are automorphisms. Taking this into account, we obtain just a single copy of each term from the recipe.

9 Conclusion

We have shown how to obtain, in two different ways, an asymptotic formula with power savings for the mean square of a Dirichlet polynomial of length X, where \(T^2\ll X\ll T^3\), whose coefficients are general divisor functions: one way is via Perron’s formula and the recipe, and the other is by calculating a convolution of shifted divisor correlations. The two approaches give exactly the same answer.

In the next paper, which will conclude this introductory series, we will consider the completely general situation of a Dirichlet polynomial of arbitrary length.

10 The semi-diagonal arithmetic factor

It remains to prove that the arithmetic factors agree. This calculation is surprisingly involved. In order to carry it out with minimal notational difficulties we introduce new notation. These appendices are self-contained.

We begin by introducing a little notation. First of all, we are working locally; basically we are identifying the local p-factor in an Euler product. As far as we are concerned p is fixed for this discussion, so we often suppress it. In fact we write X for \(1/p\) and mostly consider power series in X. We take the unusual step of suppressing not only the prime p but also the divisor function, and so we write A(n) in place of \(\tau _A(p^n)\). Also, for a set A we let

$$\begin{aligned} A_\alpha =\{a+\alpha :a\in A\}. \end{aligned}$$

A further piece of notation: \(A^+=A\cup \{0\}\). We have two important identities. The first is

$$\begin{aligned} A^+(d)=A(d)+A^+(d-1). \end{aligned}$$

This is a special case of

$$\begin{aligned} (A\cup \{-\alpha \})(d-1) = X^{\alpha } \bigg ( (A\cup \{-\alpha \})(d) - A(d)\bigg ). \end{aligned}$$
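Both identities can be read off from the convolution expansion (recall that, in this notation, \(\{-\alpha \}(m)=\tau _{\{-\alpha \}}(p^m)=X^{-\alpha m}\))

$$\begin{aligned} (A\cup \{-\alpha \})(d)=\sum _{j=0}^{d}X^{-\alpha (d-j)}A(j). \end{aligned}$$

Subtracting \(A(d)\) removes the \(j=d\) term and leaves \(X^{-\alpha }\) times the same expression with \(d-1\) in place of d, which is the second identity; setting \(\alpha =0\) gives \(A^+(d)=\sum _{j\le d}A(j)\), from which the first identity is immediate.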

The other identity is

$$\begin{aligned} \sum _{r=0}^R A(r+M)= A^+(R+M)- A^+(M-1) \end{aligned}$$

which follows by repeated application of the first identity.

For arbitrary sets A,B, C and D we let

$$\begin{aligned} \mathcal C(A,B):=\sum _{M=0}^\infty A(M)B(M)X^M \end{aligned}$$

and

$$\begin{aligned} \mathcal F(A,B;C,D)=\sum _{K,L,M} A(K)B(K+M)C(L)D(L+M)X^{K+L+M}; \end{aligned}$$

Also, we let

$$\begin{aligned} Z(A)=\sum _{j=0}^\infty A(j) X^j=\prod _{a\in A}(1-X^{1+a})^{-1}. \end{aligned}$$

We have a lemma about \(\mathcal F\) and \(\mathcal C\) which is really just a formal manipulation; consequently we state it in a more general form.

Lemma 1

For any four functions \(a, A, b, B\) let

$$\begin{aligned} F(a,A;b,B)=\sum _{K,L,M} a(K)A(K+M)b(L)B(L+M)X^{K+L+M} \end{aligned}$$

and

$$\begin{aligned} C(a,b)=\sum _{r=0}^\infty a(r)b(r)X^r \end{aligned}$$

Then, with \(\star \) denoting the additive convolution \((u\star v)(n)=\sum _{i+j=n}u(i)v(j)\), we have

$$\begin{aligned} F(a,A;b,B)+F(A,a;B,b)=C(A\star b,a\star B)+C(a,A)C(b,B). \end{aligned}$$

Proof

Let \(Y=\sqrt{X}e(\theta )\). Then

$$\begin{aligned} F(a,A;b,B)= & {} \int _0^1 \sum _{r,s,M,N}a(r)A(r+M) {Y}^{r+s+M} b(s)B(s+N) \overline{Y}^{r+s+N} ~d\theta \\= & {} \int _0^1 \sum \limits _{\begin{array}{c} R,S\\ r\le R; s\le S \end{array}} a(r)A(R) {Y}^{s+R} b(s)B(S) \overline{Y}^{r+S} ~d\theta \\= & {} \sum \limits _{\begin{array}{c} r+S=R+s\\ r\le R; s\le S \end{array}}a(r)A(R) b(s)B(S) X^{r+S} \end{aligned}$$

The latter sum is

$$\begin{aligned}&\sum _{r+S=R+s}a(r)A(R) b(s)B(S) X^{r+S} - \sum \limits _{\begin{array}{c} r+S=R+s\\ r> R; s> S \end{array}}a(r)A(R) b(s)B(S) X^{r+S}\\&\qquad = C(a\star B,A\star b)+C(a,A)C(b,B)- F(A,a;B,b) \end{aligned}$$

as desired. \(\square \)
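Since Lemma 1 is a purely formal identity, valid for arbitrary coefficient sequences, it can be checked numerically for finitely supported data. The following short Python sketch (an illustration only; the variable names and the truncation bound are ours and play no role in the proof) verifies the identity with random coefficients, implementing \(\star \) as the additive convolution:

```python
import random

# Finite check of Lemma 1: with coefficient sequences supported on {0,...,N-1},
# every sum below is finite and the identity
#   F(a,A;b,B) + F(A,a;B,b) = C(A*b, a*B) + C(a,A) C(b,B)
# (where * denotes additive convolution) can be tested up to floating-point rounding.
N = 8          # support bound for the random coefficient sequences
X = 0.3        # the formal variable X, specialised to a number in (0,1)

rnd = lambda: [random.random() for _ in range(N)]
a, A, b, B = rnd(), rnd(), rnd(), rnd()

def get(u, n):
    # value of a finitely supported sequence (zero beyond its list of values)
    return u[n] if n < len(u) else 0.0

def F(a, A, b, B):
    # F(a,A;b,B) = sum_{K,L,M >= 0} a(K) A(K+M) b(L) B(L+M) X^(K+L+M)
    return sum(get(a, K) * get(A, K + M) * get(b, L) * get(B, L + M) * X ** (K + L + M)
               for K in range(N) for L in range(N) for M in range(N))

def C(u, v):
    # C(u,v) = sum_{r >= 0} u(r) v(r) X^r
    return sum(get(u, r) * get(v, r) * X ** r for r in range(2 * N))

def conv(u, v):
    # additive convolution: (u*v)(n) = sum_{i+j=n} u(i) v(j)
    return [sum(get(u, i) * get(v, n - i) for i in range(n + 1)) for n in range(2 * N)]

lhs = F(a, A, b, B) + F(A, a, B, b)
rhs = C(conv(A, b), conv(a, B)) + C(a, A) * C(b, B)
print(lhs, rhs)   # the two values agree to machine precision
```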

Now we address the arithmetic factor from the semi-diagonal term. The p part of

$$\begin{aligned}&Z((A'_1)_{-\alpha })Z((B'_1)_{-\beta }) \sum _{ (M,N)=1} \frac{1 }{M^{1-\beta }N^{1-\alpha }} \sum _{\ell , d} \frac{\tau _{A_2}(N\ell )\tau _{B_2}(M\ell )}{ \ell ^{1+s} d^{1+s } }\\&\quad \times \sum _{q\ge 1} \frac{\mu (q)(qd,N)^{1-\alpha }(qd,M)^{1- \beta }}{q^{2-\alpha - \beta }} \\&\quad \times \, G_{{A_1}}\left( 1- \alpha ,\frac{qd}{(qd,N)} \right) G_{{B_1}}\left( 1-\beta ,\frac{qd}{(qd,M)} \right) \end{aligned}$$

is (after setting \(s=0\))

$$\begin{aligned}&\sum _{ \min (M,N)=0} X^{M(1-\beta )+N(1-\alpha )} \sum _{\ell , d}A_2(N+\ell )B_2(M+\ell ) X^{\ell +d }\\&\quad \times \sum _{q } \mu (p^q)X^{-\min (q+d,N)(1-\alpha )-\min (q+d,M)(1- \beta )+q(2-\alpha - \beta )} \\&\quad \times \sum _{j,k}A_1'(j+q+d-\min (q+d,N)) \\&\quad \times \, B_1'(k+q+d-\min (q+d,M))X^{j(1-\alpha )+k(1-\beta )} \end{aligned}$$

We use

$$\begin{aligned} \sum _{\min (M,N)=0}f(M,N)=\sum _M f(M,0)+\sum _N f(0,N)-f(0,0) \end{aligned}$$

and get \(S_L+S_R-S_0\) where

$$\begin{aligned} S_0= & {} \sum _{\ell , d,q,j,k}A_2(\ell )B_2(\ell ) \mu (p^q) A_1'(j+q+d )B_1'(k+q+d) \\&\times \, X^{ q(2-\alpha - \beta )+\ell + d +j(1-\alpha )+k(1-\beta )}, \end{aligned}$$
$$\begin{aligned} S_L= & {} \sum _{ M} X^{M(1-\beta )} \sum _{\ell , d}A_2(\ell )B_2(M+\ell ) X^{\ell + d } \sum _{q } \mu (p^q)X^{- \min (q+d,M)(1- \beta )+q(2-\alpha - \beta )} \\&\times \sum _{j,k}A_1'(j+q+d )B_1'(k+q+d-\min (q+d,M) )X^{j(1-\alpha )+k(1-\beta )}. \end{aligned}$$
$$\begin{aligned} S_R= & {} \sum _{ N} X^{N(1-\alpha )} \sum _{\ell , d}A_2(N+\ell )B_2(\ell ) X^{\ell + d } \sum _{q } \mu (p^q)X^{-\min (q+d,N)(1-\alpha ) +q(2-\alpha - \beta )} \\&\times \sum _{j,k}A_1'(j+q+d-\min (q+d,N))B_1'(k+q+d)X^{j(1-\alpha )+k(1-\beta )}. \end{aligned}$$

We expand the q sum in \(S_0\) to get

$$\begin{aligned} S_0= & {} \sum _{\ell , d,j,k}A_2(\ell )B_2(\ell ) A_1'(j+d )B_1'(k+d) X^{ \ell + d +j(1-\alpha )+k(1-\beta )}\\&-\sum _{\ell , d,j,k}A_2(\ell )B_2(\ell ) A_1'(j+1+d )B_1'(k+1+d) X^{ 2-\alpha - \beta +\ell + d +j(1-\alpha )+k(1-\beta )}. \end{aligned}$$

This telescopes in j and k to give

$$\begin{aligned} S_0= & {} \sum _{\ell , d,j}A_2(\ell )B_2(\ell ) A_1'(j+d )B_1'(d) X^{ \ell + d +j(1-\alpha ) }\\&+ \sum _{\ell , d,k}A_2(\ell )B_2(\ell ) A_1'(d )B_1'(k+d) X^{ \ell + d +k(1-\beta )}\\&- \sum _{\ell , d }A_2(\ell )B_2(\ell ) A_1'(d )B_1'(d) X^{ \ell + d }\\= & {} \mathcal C(A_2,B_2)\bigg (\sum _{r} X^{ r(1 -\alpha )} A_1'(r )\sum _{d\le r} X^{d\alpha } B_1'(d)\\&+\sum _{r} X^{ r(1 -\beta )} B_1'(r )\sum _{d\le r} X^{d\beta } A_1'(d) -\mathcal C(A_1',B_1')\bigg )\\= & {} \mathcal C(A_2,B_2)\left( \sum _{r} X^{ r(1 -\alpha )} A_1'(r ) ((B_1')_{\alpha })^+(r) \right. \\&\left. +\sum _{r} X^{ r(1 -\beta )} B_1'(r ) (( A_1')_{\beta })^+(r) -\mathcal C(A_1',B_1')\right) \\= & {} \mathcal C(A_2,B_2)\left( \mathcal C((A_1')_{-\alpha }, ((B_1')_{\alpha })^+) + \mathcal C( ( B_1')_{-\beta },(( A_1')_{\beta })^+)-\mathcal C(A_1',B_1')\right) . \end{aligned}$$

This may be rewritten as

$$\begin{aligned} S_0= \mathcal C(A_2,B_2)\big ( \mathcal C(A_1', B_1'\cup \{-\alpha \}) + \mathcal C( B_1',A_1'\cup \{-\beta \})-\mathcal C(A_1',B_1')\big ). \end{aligned}$$

Now we turn to \(S_L\). Expanding in q we have

$$\begin{aligned} S_L= & {} \sum _{ M,\ell ,d,j,k} A_2(\ell )B_2(M+\ell ) A_1'(j+d )B_1'(k+d-\min (d,M) )\\&\times X^{\ell +d+M(1-\beta )-\min (d,M)(1- \beta )+j(1-\alpha )+k(1-\beta )}\\&- \sum _{ M,\ell ,d,j,k} A_2(\ell )B_2(M+\ell )A_1'(j+1+d )B_1'(k+1+d-\min (1+d,M) ) \\&\times X^{\ell + d +M(1-\beta )- \min (1+d,M)(1- \beta )+2-\alpha - \beta +j(1-\alpha )+k(1-\beta )} . \end{aligned}$$

We split this into \(S_L=S_L^-+S_L^+\) where \(S_L^-\) denotes those terms for which \(d<M\) and \(S_L^+\) contains those terms with \(d\ge M\). We have

$$\begin{aligned} S_L^-= & {} \sum \limits _{\begin{array}{c} M,\ell ,j,k\\ d<M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(j+d )B_1'(k ) X^{\ell +M(1-\beta )+d \beta +j(1-\alpha )+k(1-\beta )}\\&- \sum \limits _{\begin{array}{c} M,\ell ,j,k\\ d<M \end{array}} A_2(\ell )B_2(M+\ell )A_1'(j+1+d )B_1'(k ) X^{\ell +M(1-\beta ) + d\beta +(j+1)(1-\alpha )+k(1-\beta )} . \end{aligned}$$

The sum over j telescopes; we are left with

$$\begin{aligned} S_L^-= & {} \sum \limits _{\begin{array}{c} M,\ell ,k\\ d<M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(d )B_1'(k ) X^{\ell +M(1-\beta )+d \beta +k(1-\beta )} \end{aligned}$$

We execute the sum over k to obtain

$$\begin{aligned} S_L^-= & {} Z((B_1')_{- \beta })\sum \limits _{\begin{array}{c} M,\ell ,\\ d<M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(d ) X^{\ell +M(1-\beta )+d \beta } . \end{aligned}$$

The sum over d gives

$$\begin{aligned} S_L^-= & {} Z((B_1')_{- \beta })\sum _{ M,\ell } A_2(\ell )B_2(M+\ell ) ((A_1')_{\beta })^+(M-1 ) X^{\ell +M(1-\beta ) } \\= & {} Z((B_1')_{- \beta })\sum _{ M,\ell } A_2(\ell )B_2(M+\ell ) \big (((A_1')_{\beta })^+(M)-(A_1')_{\beta }(M)\big ) X^{\ell +M(1-\beta ) } \\= & {} Z((B_1')_{- \beta })\sum _{ M,\ell } A_2(\ell )B_2(M+\ell ) \big ((A_1'\cup \{-\beta \})(M)-A_1'(M)\big ) X^{\ell +M }\\= & {} Z((B_1')_{- \beta })\big (\mathcal C (A_1'\cup A_2\cup \{-\beta \},B_2)-\mathcal C(A_1'\cup A_2,B_2)\big ). \end{aligned}$$

Now we consider \(S_L^+\). We have

$$\begin{aligned} S_L^+= & {} \sum \limits _{\begin{array}{c} M,\ell ,j,k\\ d\ge M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(j+d )B_1'(k+d-M ) X^{\ell +d +j(1-\alpha )+k(1-\beta )}\\&- \sum \limits _{\begin{array}{c} M,\ell ,j,k\\ d\ge M \end{array}} A_2(\ell )B_2(M+\ell )A_1'(j+1+d ) \\&\times \, B_1'(k+1+d-M ) X^{\ell + d +2- \alpha - \beta +j(1- \alpha )+k(1- \beta )} . \end{aligned}$$

This sum telescopes in j and k. We have

$$\begin{aligned} S_L^+= & {} \sum \limits _{\begin{array}{c} M,\ell ,j\\ d\ge M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(j+d )B_1'(d-M ) X^{\ell +d +j(1-\alpha ) }\\&+ \sum \limits _{\begin{array}{c} M,\ell ,k\\ d\ge M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(d )B_1'(k+d-M ) X^{\ell +d +k(1-\beta )}\\&- \sum \limits _{\begin{array}{c} M,\ell \\ d\ge M \end{array}} A_2(\ell )B_2(M+\ell ) A_1'(d )B_1'(d-M ) X^{\ell +d }. \end{aligned}$$

We replace d by \(d+M\) and have

$$\begin{aligned} S_L^+= & {} \sum _{ M,\ell ,j,d} A_2(\ell )B_2(M+\ell ) A_1'(j+d +M)B_1'(d ) X^{\ell +d+M +j(1-\alpha ) }\\&+ \sum _{ M,\ell ,k,d} A_2(\ell )B_2(M+\ell ) A_1'(d+M )B_1'(k+d ) X^{\ell +d +M +k(1-\beta )}\\&- \sum _{ M,\ell ,d} A_2(\ell )B_2(M+\ell ) A_1'(d +M)B_1'(d ) X^{\ell +d+M }. \end{aligned}$$

In the first term we replace \(j+d\) by r and sum over d; it becomes

$$\begin{aligned} \sum _{ M,\ell ,r} A_2(\ell )B_2(M+\ell ) (A_1')_{-\alpha }(r +M)((B_1')_{\alpha })^+(r ) X^{\ell +r+M+M\alpha }. \end{aligned}$$

In the second term we execute the sum over k as follows:

$$\begin{aligned}&\sum _{k,d} A_1'(d+M )B_1'(k+d ) X^{ d +k(1-\beta )}\\&\quad = \sum _{K}(B_1')_{-\beta }(K)\sum _{d\le K}A_1'(d+M) X^{ d +K\beta }\\&\quad =X^{-M\beta }\sum _{K}(B_1')_{-\beta }(K)X^K\sum _{d\le K}(A_1')_{\beta }(d+M) \\&\quad = X^{-M\beta }\sum _{K}(B_1')_{-\beta }(K)X^K\big ( ((A_1')_{\beta })^+(K+M)-((A_1')_{\beta })^+(M-1)\big )\\ \end{aligned}$$

This may be rewritten as

$$\begin{aligned}&X^{-M\beta }\sum _{K}(B_1')_{-\beta }(K)X^K ((A_1')_{\beta })^+(K+M)\\&\qquad -X^{-M\beta } ((A_1')_{\beta })^+(M)Z((B_1')_{-\beta })+X^{-M\beta } (A_1')_{\beta }(M)Z((B_1')_{-\beta }). \end{aligned}$$

Thus, altogether we have

$$\begin{aligned} S_L^+= & {} \sum _{ M,\ell ,r} A_2(\ell )B_2(M+\ell ) (A_1')_{-\alpha }(r +M)((B_1')_{\alpha })^+(r ) X^{\ell +r+M+M\alpha }\\&+ \sum _{ M,\ell } A_2(\ell )B_2(M+\ell )\bigg ( X^{-M\beta }\sum _{K}(B_1')_{-\beta }(K)X^K ((A_1')_{\beta })^+(K+M)\\&-\,X^{-M\beta } ((A_1')_{\beta })^+(M)Z((B_1')_{-\beta })+X^{-M\beta } (A_1')_{\beta }(M)Z((B_1')_{-\beta }) \bigg ) X^{\ell +M}\\&-\sum _{ M,\ell ,d} A_2(\ell )B_2(M+\ell ) A_1'(d +M)B_1'(d ) X^{\ell +d+M }. \end{aligned}$$

In the first line notice that \((((B_1')_{\alpha })^+)_{-\alpha }=B_1'\cup \{-\alpha \}\). Also, recall our notation:

$$\begin{aligned} \mathcal F (A,B;C,D)=\sum _{K,L,M}A(K)B(K+M)C(L)D(L+M)X^{K+L+M} . \end{aligned}$$

Using this notation we have that

$$\begin{aligned} S_L^+= & {} \mathcal F(B_1'\cup \{-\alpha \},A_1' ;A_2,B_2)+ \mathcal F (B_1',A_1'\cup \{-\beta \};A_2,B_2)- \mathcal F(B_1',A_1';A_2,B_2)\\&-Z((B_1')_{-\beta })\mathcal C(A_1'\cup A_2\cup \{-\beta \},B_2)+Z((B_1')_{-\beta })\mathcal C(A_1'\cup A_2,B_2). \end{aligned}$$

We add this with our expression for \(S_L^-\) and have

$$\begin{aligned} S_L= & {} \mathcal F(B_1'\cup \{-\alpha \},A_1' ;A_2,B_2)+ \mathcal F (B_1',A_1'\cup \{-\beta \};A_2,B_2)- \mathcal F(B_1',A_1';A_2,B_2). \end{aligned}$$

The expression for \(S_R\) is obtained by the symmetry \(\alpha \leftrightarrow \beta \); \(A_1\leftrightarrow B_1\); and \(A_2\leftrightarrow B_2\). Thus,

$$\begin{aligned} S_R= & {} \mathcal F(A_1'\cup \{-\beta \},B_1' ;B_2,A_2)+ \mathcal F (A_1',B_1'\cup \{-\alpha \};B_2,A_2)- \mathcal F(A_1',B_1';B_2,A_2). \end{aligned}$$

Recall that

$$\begin{aligned} \mathcal F(A,B;C,D)+\mathcal F(B,A;D,C)=\mathcal C(A\cup D,B\cup C)+\mathcal C(A,B)\mathcal C(C,D). \end{aligned}$$

Thus,

$$\begin{aligned} S_L+S_R= & {} \mathcal C(A_1'\cup A_2\cup \{-\beta \},B_1'\cup B_2)+ C(A_1'\cup A_2 ,B_1'\cup B_2\cup \{-\alpha \})\\&-\,C(A_1'\cup A_2 ,B_1'\cup B_2)+\mathcal C(A_1'\cup \{-\beta \},B_1')\mathcal C(B_2,A_2) \\&+\,\mathcal C(A_1' ,B_1'\cup \{-\alpha \})\mathcal C(B_2,A_2) -\,\mathcal C(A_1' ,B_1')\mathcal C(B_2,A_2). \end{aligned}$$

Adding this to \(-S_0\) we have

$$\begin{aligned} S_L+S_R -S_0= & {} \mathcal C(A_1'\cup A_2\cup \{-\beta \},B_1'\cup B_2)+ C(A_1'\cup A_2 ,B_1'\cup B_2\cup \{-\alpha \})\\&-\,C(A_1'\cup A_2 ,B_1'\cup B_2). \end{aligned}$$

This is equal to

$$\begin{aligned} (1-X^{1-\alpha -\beta }) ~ \mathcal C(A_1'\cup A_2\cup \{-\beta \},B_1'\cup B_2\cup \{-\alpha \}) \end{aligned}$$

as desired.

11 Proof of Theorem 2

We shall find it convenient to recast the identity of Theorem 2 using a set-theoretic language.

11.1 A reformulation of the identity

We begin with four sets A, B, C and D and four numbers \(\alpha ,\beta ,\gamma \) and \(\delta \). We consider

$$\begin{aligned}&\sum _{\min (M,N)=0} X^{-M(\gamma +\beta )-N(\alpha +\delta )} \Sigma _1(M,N)\Sigma _2(M,N)X^{M+N} \end{aligned}$$

where

$$\begin{aligned} \Sigma _1(M,N)= & {} \sum \limits _{\begin{array}{c} d,j,k\\ q\le 1 \end{array}} (-1)^q X^{d(\alpha +\beta )}{A_{-\alpha }}(j+q+d-\min (q+d,N))\\&\times {B_{-\beta }}(k+q+d-\min (q+d,M)) X^{2q+d+j+k-\min (q+d,M)-\min (q+d,N)} \end{aligned}$$

and

$$\begin{aligned} \Sigma _2(M,N)= & {} \sum \limits _{\begin{array}{c} d,j,k\\ q\le 1 \end{array}} (-1)^q X^{d(\gamma +\delta )}{C_{-\gamma }}(j+q+d-\min (q+d,M))\\&\times \, {D_{-\delta }}(k+q+d-\min (q+d,N)) X^{2q+d+j+k-\min (q+d,M)-\min (q+d,N)}. \end{aligned}$$

The problem is to express this quantity in terms of the \(\mathcal C\) function, namely we want to prove that the above is

$$\begin{aligned} =(1-X^{1-\alpha -\beta })(1-X^{1-\gamma -\delta })\mathcal C(A\cup C\cup \{-\beta ,-\delta \}, B\cup D \cup \{-\alpha ,-\gamma \}). \end{aligned}$$

11.2 Initial reductions

We can decompose the sum over M and N via

$$\begin{aligned} \sum _{\min (M,N)=0}f(M,N)=\sum _{M=0}^\infty f(M,0)+\sum _{N=0}^\infty f(0,N)-f(0,0). \end{aligned}$$

Thus, the sum above is \(S_L+S_R-S_0\) where

$$\begin{aligned} S_L= \sum _{M=0}^\infty X^{-M(\gamma +\beta ) } \Sigma _1(M,0)\Sigma _2(M,0)X^{M} \end{aligned}$$
$$\begin{aligned} S_R=\sum _{N=0}^\infty X^{ -N(\alpha +\delta )} \Sigma _1(0,N)\Sigma _2(0,N)X^{N} \end{aligned}$$

and

$$\begin{aligned} S_0= \Sigma _1(0,0)\Sigma _2(0,0). \end{aligned}$$

We have

$$\begin{aligned} S_0= & {} \left( \sum \limits _{\begin{array}{c} d,j,k\\ q\le 1 \end{array}} (-1)^q X^{d(\alpha +\beta )}{A_{-\alpha }}(j+q+d ) {B_{-\beta }}(k+q+d ) X^{2q+d+j+k} \right) \\&\times \left( \sum \limits _{\begin{array}{c} d,j,k\\ q\le 1 \end{array}} (-1)^q X^{d(\gamma +\delta )}{C_{-\gamma }}(j+q+d ) {D_{-\delta }}(k+q+d ) X^{2q+d+j+k }\right) . \end{aligned}$$

The first factor here is

$$\begin{aligned}&\sum _{d,j,k } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(k+d ) X^{d+j+k}\\&\quad - \sum _{d,j,k } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+1+d ) {B_{-\beta }}(k+1+d ) X^{2+d+j+k} \end{aligned}$$

which telescopes in \(j\) and \(k\): for each fixed \(d\) the difference has the form \(\sum _{j,k\ge 0}\left( G(j,k)-G(j+1,k+1)\right) \), so only the terms with \(j=0\) or \(k=0\) survive. Thus, it is

$$\begin{aligned}&\sum _{d,j } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(d ) X^{d+j} +\sum _{d,k } X^{d(\alpha +\beta )}{A_{-\alpha }}(d ) {B_{-\beta }}(k+d ) X^{d+k}\\&\quad -\sum _{d } X^{d(\alpha +\beta )}{A_{-\alpha }}(d ) {B_{-\beta }}(d ) X^{d}. \end{aligned}$$

The first term here is

$$\begin{aligned}&\sum _{d,j } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(d ) X^{d+j} =\sum _{J=0}^\infty {A_{-\alpha }}(J )X^J \sum _{d\le J} {B_{-\beta }}(d ) X^{d(\alpha +\beta )}. \end{aligned}$$

Now

$$\begin{aligned} {B_{-\beta }}(d ) X^{d(\alpha +\beta )}={B_{\alpha }}(d ) \end{aligned}$$

and

$$\begin{aligned} \sum _{d\le J}{B_{\alpha }}(d ) =(B_{\alpha })^+(J ). \end{aligned}$$

Thus, the above is

$$\begin{aligned} \sum _{J=0}^\infty {A_{-\alpha }}(J )(B_{\alpha })^+(J )X^J =\mathcal C(A_{-\alpha },(B_{\alpha })^+)=\mathcal C(A ,B\cup \{-\alpha \}). \end{aligned}$$

The second term is

$$\begin{aligned} \mathcal C((A_{\beta })^+,B_{-\beta })=\mathcal C(A\cup \{-\beta \},B ) \end{aligned}$$

and the third term is

$$\begin{aligned} \mathcal C(A ,B) . \end{aligned}$$

We can do the same with the second factor. The net result is that

$$\begin{aligned} S_0= & {} \bigg (\mathcal C(A ,B\cup \{-\alpha \}) + \mathcal C(A\cup \{-\beta \},B )-\mathcal C(A,B)\bigg )\\&\times \bigg (\mathcal C(C ,D\cup \{-\gamma \}) + \mathcal C(C\cup \{-\delta \},D )-\mathcal C(C,D)\bigg ). \end{aligned}$$

Next we analyze \(S_L\). First consider \(\Sigma _1(M,0)\):

$$\begin{aligned} \Sigma _1(M,0)= & {} \sum _{d,j,k } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(k+d-\min (d,M)) X^{d+j+k-\min (d,M) }\\&- \sum _{d,j,k } X^{d(\alpha +\beta )}{A_{-\alpha }}(j+1+d ) {B_{-\beta }}(k+1+d-\min (1+d,M)) \\&\times \, X^{2+d+j+k-\min (1+d,M) }. \end{aligned}$$

We split this into the terms with \(d< M\) and those with \(d\ge M\). We have

$$\begin{aligned} \Sigma _1^-(M,0)= & {} \sum \limits _{\begin{array}{c} j,k\\ d< M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(k ) X^{j+k }\\&- \sum \limits _{\begin{array}{c} j,k\\ d< M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+1+d ) {B_{-\beta }}(k ) X^{1+j+k } \\= & {} Z(B_{-\beta })\left( \sum \limits _{\begin{array}{c} j\\ d< M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) X^{j }- \sum \limits _{\begin{array}{c} j\\ d < M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+1+d ) X^{1+j }\right) . \end{aligned}$$

The sum over j telescopes so that this is

$$\begin{aligned} \Sigma _1^-(M,0)= & {} Z(B_{-\beta }) \sum _{ d< M } X^{d(\alpha +\beta )}{A_{-\alpha }}(d ) \\= & {} Z(B_{-\beta }) \sum _{ d < M } {A_{\beta }}(d ) = Z(B_{-\beta })(A_{\beta })^+(M-1). \end{aligned}$$

Next we consider

$$\begin{aligned} \Sigma _1^+(M)= & {} \sum \limits _{\begin{array}{c} j,k\\ d \ge M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+d ) {B_{-\beta }}(k+d-M) X^{d+j+k-M }\\&- \sum \limits _{\begin{array}{c} j,k\\ d \ge M \end{array}} X^{d(\alpha +\beta )}{A_{-\alpha }}(j+1+d ) {B_{-\beta }}(k+1+d-M) X^{2+d+j+k-M }. \end{aligned}$$

We replace d by \(d+M\) and have

$$\begin{aligned} \Sigma _1^+(M)= & {} \sum _{j,k,d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(j+d+M ) {B_{-\beta }}(k+d) X^{d+j+k }\\&- \sum _{j,k,d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(j+1+d+M ) {B_{-\beta }}(k+1+d) X^{2+d+j+k }. \end{aligned}$$

Now the sum over j and k telescopes and we have

$$\begin{aligned} \Sigma _1^+(M)= & {} \sum _{j,d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(j+d+M ) {B_{-\beta }}(d) X^{d+j }\\&+ \sum _{k,d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(d+M ) {B_{-\beta }}(k+d) X^{d+k }\\&- \sum _{d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(d+M ) {B_{-\beta }}(d) X^{d }. \end{aligned}$$

In the first term we use \({B_{-\beta }}(d ) X^{d(\alpha +\beta )}={B_{\alpha }}(d )\) and group the terms according to \(r=j+d\); recognizing the convolution \(\sum _{d\le r}{B_{\alpha }}(d )=(B_{\alpha })^+(r)\), we rewrite this as

$$\begin{aligned} \Sigma _1^+(M)= & {} \sum _{r } X^{M(\alpha +\beta )}{A_{-\alpha }}(r+M ) (B_{\alpha })^+(r) X^{r }\\&+ \sum _{k,d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(d+M ) {B_{-\beta }}(k+d) X^{d+k }\\&- \sum _{d } X^{(d+M)(\alpha +\beta )}{A_{-\alpha }}(d+M ) {B_{-\beta }}(d) X^{d }. \end{aligned}$$

The middle term here may be written as

$$\begin{aligned}&\sum _{K } {B_{-\beta }}(K) X^{K } \sum _{d\le K}A_{\beta }(d+M )\\&\quad = \sum _{K } {B_{-\beta }}(K) X^{K } \left( (A_{\beta })^+(K+M )-(A_{\beta })^+(M-1 )\right) \\&\quad = \sum _K {B_{-\beta }}(K) (A_{\beta })^+(K+M )X^K-Z(B_{-\beta })(A_{\beta })^+(M-1 ). \end{aligned}$$

The second term of this cancels with \(\Sigma _1^-(M,0)\) and so we have

$$\begin{aligned} \Sigma _1(M,0)= & {} X^{M(\alpha +\beta )}\sum _{K }(B_{\alpha })^+(K) {A_{-\alpha }}(K+M ) X^{K }\\&- X^{M(\alpha +\beta )}\sum _{K }B_{\alpha }(K) {A_{-\alpha }}(K+M ) X^{K }\\&+\sum _K {B_{-\beta }}(K) (A_{\beta })^+(K+M )X^K. \end{aligned}$$

This may be rewritten, upon extracting a factor \(X^{M\beta }\) and noting that \(((B_{\alpha })^+)_{-\alpha }=B\cup \{-\alpha \}\) and \(((A_{\beta })^+)_{-\beta }=A\cup \{-\beta \}\), as

$$\begin{aligned} \Sigma _1(M,0)= & {} X^{M\beta } \bigg ( \sum _{K }(B\cup \{-\alpha \})(K) A (K+M ) X^{K } - \sum _{K }B (K) {A }(K+M ) X^{K }\\&+\sum _K {B }(K) (A\cup \{-\beta \})(K+M )X^K\bigg ) \end{aligned}$$

By symmetry

$$\begin{aligned} \Sigma _2(M,0)= & {} X^{M\gamma }\bigg ( \sum _{L }(C\cup \{-\delta \})(L) D (L+M ) X^{L } - \sum _{L }C (L)D(L+M ) X^{L }\\&+\sum _L C(L) (D\cup \{-\gamma \})(L+M )X^L\bigg ). \end{aligned}$$

Recall that we are trying to evaluate

$$\begin{aligned} S_L=\sum _{M}X^{M(1-\gamma -\beta ) } \Sigma _1(M,0)\times \Sigma _2(M,0). \end{aligned}$$

If we multiply out the three terms of \(\Sigma _1\) by the three terms of \(\Sigma _2\) and then sum over M we get a total of nine expressions, the first of which is

$$\begin{aligned}&\sum _{K,L,M}(B\cup \{-\alpha \})(K) A (K+M ) (C\cup \{-\delta \})(L) D (L+M ) X^{K+L+M}\\&\quad =\mathcal F(B\cup \{-\alpha \},A; C\cup \{-\delta \}, D). \end{aligned}$$

Thus we now see that \(S_L\) is a sum of nine \(\mathcal F\)-values at different arguments, which we encapsulate in the following table for \(S_L\):

$$\begin{aligned} \begin{array}{cccccc} \#&{}\text{ sign }&{}K&{}K+M&{}L&{}L+M\\ \hline 1 &{}+&{} B\cup \{-\alpha \}&{}A&{} C\cup \{-\delta \}&{} D\\ 2 &{}-&{} B\cup \{-\alpha \}&{}A&{} C &{} D\\ 3 &{}+&{} B\cup \{-\alpha \}&{}A&{} C &{} D\cup \{-\gamma \}\\ 4 &{}-&{} B &{}A&{} C\cup \{-\delta \}&{} D\\ 5 &{}+&{} B &{}A&{} C &{} D\\ 6 &{}-&{} B &{}A&{} C &{} D\cup \{-\gamma \}\\ 7 &{}+&{} B &{}A\cup \{-\beta \}&{} C\cup \{-\delta \}&{} D\\ 8 &{}-&{} B &{}A\cup \{-\beta \}&{} C &{} D\\ 9 &{}+&{} B &{}A\cup \{-\beta \}&{} C &{} D\cup \{-\gamma \}\\ \end{array} \end{aligned}$$
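The table is mechanical bookkeeping of the nine products; the following sketch regenerates it from the two three-term expressions above (the strings are merely labels for the sets, and nothing beyond those expressions is assumed):

```python
# Generate the nine rows of the table for S_L: row = (Sigma_1 term) x (Sigma_2 term).
from itertools import product

sigma1 = [  # (sign, set placed at K, set placed at K+M)
    (+1, "B u {-alpha}", "A"),
    (-1, "B",            "A"),
    (+1, "B",            "A u {-beta}"),
]
sigma2 = [  # (sign, set placed at L, set placed at L+M)
    (+1, "C u {-delta}", "D"),
    (-1, "C",            "D"),
    (+1, "C",            "D u {-gamma}"),
]

for row, ((s1, K, KM), (s2, L, LM)) in enumerate(product(sigma1, sigma2), start=1):
    sign = "+" if s1 * s2 > 0 else "-"
    print(row, sign, K, KM, L, LM)
```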

Note that \(S_R\) is just the same as \(S_L\) but with \(\alpha \leftrightarrow \beta \); \(\gamma \leftrightarrow \delta \); \(A\leftrightarrow B\); and \(C\leftrightarrow D\). Thus, we have the table for \(S_R\):

$$\begin{aligned} \begin{array}{cccccc} \#&{}\text{ sign }&{}K&{}K+M&{}L&{}L+M\\ \hline 1 &{}+&{} A\cup \{-\beta \}&{}B&{} D\cup \{-\gamma \}&{} C\\ 2 &{}-&{} A\cup \{-\beta \}&{}B&{} D &{} C\\ 3 &{}+&{} A\cup \{-\beta \}&{}B&{} D &{} C\cup \{-\delta \}\\ 4 &{}-&{} A &{}B&{} D\cup \{-\gamma \}&{} C\\ 5 &{}+&{} A &{}B&{} D &{} C\\ 6 &{}-&{} A &{}B&{} D &{} C\cup \{-\delta \}\\ 7 &{}+&{} A &{}B\cup \{-\alpha \}&{} D\cup \{-\gamma \}&{} C\\ 8 &{}-&{} A &{}B\cup \{-\alpha \}&{} D &{} C\\ 9 &{}+&{} A &{}B\cup \{-\alpha \}&{} D &{} C\cup \{-\delta \}\\ \end{array} \end{aligned}$$

Now we pair up line \(x\) from \(S_L\) with line \(10-x\) from \(S_R\) and use the lemma to express the sum of the \(\mathcal F\)-terms as \(\mathcal C\)’s. We have

$$\begin{aligned} S_L+S_R= & {} \mathcal C( A\cup C \cup \{-\delta \},B\cup D\cup \{-\alpha \}) +\mathcal C( A\cup \{-\beta \}, B) ~ \mathcal C( D\cup \{-\gamma \}, C) \\&- \, \mathcal C( A\cup C ,B\cup D\cup \{-\alpha \})- \mathcal C( A\cup \{-\beta \}, B) ~\mathcal C( D , C)\\&+ \, \mathcal C( A\cup C ,B\cup D\cup \{-\alpha ,-\gamma \}) +\mathcal C( A\cup \{-\beta \}, B) ~\mathcal C( D , C\cup \{-\delta \})\\&- \, \mathcal C( A\cup C \cup \{-\delta \},B\cup D ) - \mathcal C( A , B) ~\mathcal C( D\cup \{-\gamma \}, C)\\&+ \, \mathcal C( A\cup C ,B\cup D )+ \mathcal C( A, B)~\mathcal C( D , C)\\&- \, \mathcal C( A\cup C ,B\cup D\cup \{-\gamma \}) - \mathcal C(A , B) ~\mathcal C( D , C\cup \{-\delta \})\\&+ \, \mathcal C( A\cup C \cup \{-\beta ,-\delta \},B\cup D )+ \mathcal C( A , B\cup \{-\alpha \}) ~\mathcal C( D\cup \{-\gamma \}, C)\\&- \, \mathcal C( A\cup C \cup \{-\beta \},B\cup D )- \mathcal C ( A , B\cup \{-\alpha \})~\mathcal C( D , C)\\&+ \, \mathcal C( A\cup C \cup \{-\beta \},B\cup D\cup \{-\gamma \}) + \mathcal C( A , B\cup \{-\alpha \})~\mathcal C( D , C\cup \{-\delta \}). \end{aligned}$$

When we subtract \(S_0\) all of the terms that are products of two \(\mathcal C\)s cancel:

$$\begin{aligned} S_L+S_R-S_0= & {} \mathcal C( A\cup C \cup \{-\delta \},B\cup D\cup \{-\alpha \}) - \mathcal C( A\cup C ,B\cup D\cup \{-\alpha \}) \\&+ \, \mathcal C( A\cup C ,B\cup D\cup \{-\alpha ,-\gamma \}) - \mathcal C( A\cup C \cup \{-\delta \},B\cup D ) \\&+ \, \mathcal C( A\cup C ,B\cup D ) - \mathcal C( A\cup C ,B\cup D\cup \{-\gamma \}) \\&+ \, \mathcal C( A\cup C \cup \{-\beta ,-\delta \},B\cup D ) -\mathcal C( A\cup C \cup \{-\beta \},B\cup D ) \\&+ \, \mathcal C( A\cup C \cup \{-\beta \},B\cup D\cup \{-\gamma \}). \end{aligned}$$

11.3 The final reckoning

A generalization of

$$\begin{aligned} A^+(d)=A(d)+A^+(d-1) \end{aligned}$$

is

$$\begin{aligned} (A\cup \{-\alpha \})(d-1) = X^{\alpha } \bigg ( (A\cup \{-\alpha \})(d) - A(d)\bigg ). \end{aligned}$$
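Indeed, with the convolution description of the union (as we read it from the earlier sections), \((A\cup \{-\alpha \})(d)=\sum _{j\le d}A(j)X^{-(d-j)\alpha }\), so that

$$\begin{aligned} (A\cup \{-\alpha \})(d)-A(d)=\sum _{j\le d-1}A(j)X^{-(d-j)\alpha }=X^{-\alpha }\,(A\cup \{-\alpha \})(d-1), \end{aligned}$$

which is the stated relation; the case \(\alpha =0\) recovers \(A^+(d)=A(d)+A^+(d-1)\).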

We apply this to the expression

$$\begin{aligned}&(1-X^{1-\alpha -\beta })(1-X^{1-\gamma -\delta }) \sum _{r=0}^\infty (A\cup C \cup \{-\beta ,-\delta \})(r) (B\cup D \cup \{-\alpha ,-\gamma \})(r)X^r \end{aligned}$$

and after some work find that it is equal to the expression above for \(S_L+S_R-S_0\).
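In more detail, that work amounts to the following. For any sets \(E\) and \(F\), the relation above gives \((E\cup \{-\beta \})(r)-E(r)=X^{-\beta }(E\cup \{-\beta \})(r-1)\) and \((F\cup \{-\alpha \})(r)-F(r)=X^{-\alpha }(F\cup \{-\alpha \})(r-1)\). Multiplying these two identities and summing against \(X^r\) (the term \(r=0\) contributes nothing, the coefficients vanishing at negative indices) yields

$$\begin{aligned}&\mathcal C(E\cup \{-\beta \},F\cup \{-\alpha \})-\mathcal C(E\cup \{-\beta \},F)-\mathcal C(E,F\cup \{-\alpha \})+\mathcal C(E,F)\\&\quad =X^{1-\alpha -\beta }\,\mathcal C(E\cup \{-\beta \},F\cup \{-\alpha \}), \end{aligned}$$

that is,

$$\begin{aligned} (1-X^{1-\alpha -\beta })\,\mathcal C(E\cup \{-\beta \},F\cup \{-\alpha \})=\mathcal C(E\cup \{-\beta \},F)+\mathcal C(E,F\cup \{-\alpha \})-\mathcal C(E,F), \end{aligned}$$

which is also the identity used at the end of the previous section. Applying it once with the pair \((\gamma ,\delta )\) and \(E=A\cup C\cup \{-\beta \}\), \(F=B\cup D\cup \{-\alpha \}\), and then once with the pair \((\alpha ,\beta )\) to each of the three resulting \(\mathcal C\)-terms, produces exactly the nine terms in the expression for \(S_L+S_R-S_0\) above.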