1 Background

Let \(\mathbb {F}_q\) be a field of q elements, where q is a power of an odd prime p. Throughout this paper, A, B, C, D, \(\chi \), \(\lambda \), \(\nu \), \(\mu \), \(\varepsilon \), \(\phi \) denote complex multiplicative characters on \(\mathbb {F}_q^*\), extended to map 0 to 0. Here \(\varepsilon \) and \(\phi \) always denote the trivial and quadratic characters, respectively. Define \(\delta (A)\) to be 1 or 0 according as A is trivial or not, and let \(\delta (j,k)\) denote the Kronecker delta for \(j,k \in \mathbb {F}_q\).

Much of this paper deals with the extension field \(\mathbb {F}_{q^2}\) of \(\mathbb {F}_q\). Let \(M_4\) denote a fixed quartic character on \(\mathbb {F}_{q^2}\) and let \(M_8\) denote a fixed octic character on \(\mathbb {F}_{q^2}\) such that \(M_8^2 = M_4\).

Define the additive character \(\psi \) on \(\mathbb {F}_q\) by

$$\begin{aligned} \psi (y) := \exp \Bigg ( \frac{2 \pi i}{p} \Big ( y^p + y^{p^2} + \dots + y^q \Big ) \Bigg ), \quad y \in \mathbb {F}_q. \end{aligned}$$

The corresponding additive character on \(\mathbb {F}_{q^2}\) will be denoted by \(\psi _2\).

Recall the definitions of the Gauss and Jacobi sums over \(\mathbb {F}_q\):

$$\begin{aligned} G(A) = \sum _{y \in \mathbb {F}_q} A(y) \psi (y), \quad J(A, B) = \sum _{y \in \mathbb {F}_q} A(y) B(1-y). \end{aligned}$$

These sums have the familiar properties

$$\begin{aligned} G(\varepsilon ) = -1, \quad J(\varepsilon ,\varepsilon ) = q-2, \end{aligned}$$

and for nontrivial A,

$$\begin{aligned} G(A) G(\overline{A}) = A(-1) q, \quad J(A, \overline{A}) = -A(-1), \quad J(\varepsilon , A)=-1. \end{aligned}$$

Gauss and Jacobi sums are related by [5, p. 59]

$$\begin{aligned} J(A,B) = \frac{G(A) G(B)}{G(AB)}, \quad \text {if } AB \ne \varepsilon \end{aligned}$$

and

$$\begin{aligned} J(A,\overline{C}) = \frac{A(-1)G(A) G(\overline{A}C)}{G(C)}=A(-1)J(A,\overline{A}C), \quad \text {if } C \ne \varepsilon . \end{aligned}$$

The Hasse–Davenport product relation [5, p. 351] yields

$$\begin{aligned} A(4) G(A) G(A \phi ) = G(A^2) G(\phi ). \end{aligned}$$
(1.1)
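The identities above are easy to test numerically. The following sketch does so for a prime field (so it assumes \(q=p\)); the helper names `primitive_root` and `gauss_jacobi_max_error` are ours and purely illustrative, and characters are indexed by powers of one fixed generator of the character group.

```python
import cmath

def primitive_root(p):
    """Return a generator of F_p^* by brute force (p an odd prime)."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def gauss_jacobi_max_error(p):
    """Largest deviation from the displayed identities over all characters of F_p^*."""
    n = p - 1
    g = primitive_root(p)
    log = {pow(g, m, p): m for m in range(n)}   # discrete logarithms base g

    def char(k, y):
        # k-th power of a generator of the character group, extended by 0 at y = 0
        y %= p
        return 0 if y == 0 else cmath.exp(2j * cmath.pi * k * log[y] / n)

    def psi(y):
        return cmath.exp(2j * cmath.pi * (y % p) / p)

    def G(k):
        return sum(char(k, y) * psi(y) for y in range(p))

    def J(k, l):
        return sum(char(k, y) * char(l, 1 - y) for y in range(p))

    phi = n // 2                                 # index of the quadratic character
    errs = [abs(G(0) + 1), abs(J(0, 0) - (p - 2))]
    for k in range(n):
        # Hasse-Davenport product relation (1.1)
        errs.append(abs(char(k, 4) * G(k) * G(k + phi) - G(2 * k) * G(phi)))
        for l in range(n):
            if (k + l) % n:                      # AB nontrivial
                errs.append(abs(J(k, l) - G(k) * G(l) / G(k + l)))
        if k:                                    # A nontrivial
            errs.append(abs(G(k) * G(-k) - char(k, -1) * p))
            errs.append(abs(J(k, -k) + char(k, -1)))
            errs.append(abs(J(0, k) + 1))
    return max(errs)
```

Calling `gauss_jacobi_max_error(7)` or `gauss_jacobi_max_error(13)` should return a value at the level of floating-point rounding. This is a sanity check only and plays no role in the proofs.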

As in [12, p. 82], define the hypergeometric \({}_2F_1\) function over \(\mathbb {F}_q\) by

$$\begin{aligned} {}_2F_1 \left( \begin{array}{r|}A,B \\ C \end{array}\ x \right) =\frac{\varepsilon (x)}{q} \sum _{y \in \mathbb {F}_q} B(y) \overline{B}C(y-1) \overline{A}(1-xy), \quad x \in \mathbb {F}_q. \end{aligned}$$
(1.2)

For \(j, k \in \mathbb {F}_q\) and \(a \in \mathbb {F}_q^*\), Katz [13, p. 224] defined the mixed exponential sums

$$\begin{aligned}&P(j,k): = \delta (j,k) + \phi (-1)\delta (j,-k) \nonumber \\&\qquad \qquad +\frac{1}{G(\phi )}\sum _{x \in \mathbb {F}_q^*} \phi (a/x - x)\psi (x(j+k)^2 + (a/x)(j-k)^2). \end{aligned}$$
(1.3)

Note that

$$\begin{aligned} P(j,k)=P(k,j), \quad P(-j,k) = \phi (-1)P(j,k). \end{aligned}$$
(1.4)
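As an illustration of (1.3) and (1.4), here is a minimal sketch that evaluates P(j,k) directly from the definition over a prime field (assuming \(q=p\)) and measures how far the two symmetries are from holding in floating point. The function names `legendre`, `katz_P` and `symmetry_max_error` are hypothetical helpers introduced only for this example.

```python
import cmath

def legendre(y, p):
    """Quadratic character phi on F_p, with phi(0) = 0."""
    r = pow(y % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def katz_P(p, a, j, k):
    """The mixed exponential sum P(j,k) of (1.3) over the prime field F_p."""
    psi = lambda y: cmath.exp(2j * cmath.pi * (y % p) / p)
    gauss_phi = sum(legendre(y, p) * psi(y) for y in range(1, p))   # G(phi)
    total = 0
    for x in range(1, p):
        b = a * pow(x, p - 2, p) % p            # b = a/x in F_p
        total += legendre(b - x, p) * psi(x * (j + k) ** 2 + b * (j - k) ** 2)
    delta = lambda u, v: 1 if (u - v) % p == 0 else 0
    return delta(j, k) + legendre(-1, p) * delta(j, -k) + total / gauss_phi

def symmetry_max_error(p, a):
    """Largest violation of the two relations in (1.4) over all j, k in F_p."""
    errs = []
    for j in range(p):
        for k in range(p):
            base = katz_P(p, a, j, k)
            errs.append(abs(base - katz_P(p, a, k, j)))
            errs.append(abs(katz_P(p, a, -j, k) - legendre(-1, p) * base))
    return max(errs)
```

For instance, with \(p=3\) a short hand computation gives \(P(1,1)=1\) for \(a=1\) and \(P(1,1)=2\) for \(a=2\), and `katz_P(3, 1, 1, 1)` and `katz_P(3, 2, 1, 1)` reproduce these values up to rounding.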

Katz proved an equidistribution conjecture of Wootters [13, p. 226], [1] connected with quantum physics by constructing explicit character sums V(j) [13, pp. 226–229] for which the identities

$$\begin{aligned} P(j,k) = V(j)V(k) \end{aligned}$$
(1.5)

hold for all \(j,k \in \mathbb {F}_q\). (The q-dimensional vector \((V(j))_{j \in \mathbb {F}_q} \) is a minimum uncertainty state, as described by Sussman and Wootters [17].) Katz’s proof [13, Theorem 10.2] of the identities (1.5) required the characteristic p to exceed 3, in order to guarantee that various sheaves of ranks 2, 3, and 4 have geometric and arithmetic monodromy groups which are SL(2), SO(3), and SO(4), respectively.

As Katz indicated in [13, p. 223], his proof of (1.5) is quite complex, invoking the theory of Kloosterman sheaves and their rigidity properties, as well as results of Deligne [6] and Beilinson, Bernstein, Deligne [4]. Katz [13, p. 223] wrote, “It would be interesting to find direct proofs of these identities.”

The goal of this paper is to respond to Katz’s challenge by giving a direct proof of (1.5) (a “character sum proof” not involving algebraic geometry). This has the benefit of making the demonstration of his useful identities accessible to a wider audience of mathematicians and physicists. Since a direct proof for \(q \equiv 1 \pmod 4\) has been given in [8], we will assume from here on that \(q \equiv 3 \pmod 4\).

A big advantage of our proof is that it works for all odd characteristics p, including \(p=3\). As a bonus, we obtain some elegant new double character sum evaluations in (5.11)–(5.14).

Our method of proof is to show (in Sect. 6) that the double Mellin transforms of both sides of (1.5) are equal. The double Mellin transforms of the left and right sides of (1.5) are given in Theorems 5.1 and 3.1, respectively. A key feature of our proof is a formula (Theorem 4.1) relating a norm-restricted Jacobi sum over \(\mathbb {F}_{q^2}\) to a hypergeometric \({}_2F_1\) character sum over \(\mathbb {F}_q\). Theorem 4.1 will be applied to prove Theorem 5.3, an identity for a weighted sum of hypergeometric \({}_2F_1\) character sums. Theorem 5.3 is crucial for our proof of (1.5) in Sect. 6.

Hypergeometric character sums over finite fields have had a variety of applications in number theory. For some recent examples, see [2, 3, 7, 11, 14, 15, 16].

Since \(q \equiv 3 \pmod 4\), we have \(\phi (-1)=-1\), and every element \(z \in \mathbb {F}_{q^2}\) has the form

$$\begin{aligned} z= x + iy, \quad x,y \in \mathbb {F}_q, \end{aligned}$$

where i is a fixed primitive fourth root of unity in \(\mathbb {F}_{q^2}\). Write \(\overline{z}= x - iy\) and note that \(\overline{z}= z^q\). The restriction of \(M_8\) to \(\mathbb {F}_q\) equals \(\varepsilon \) or \(\phi \) according as q is congruent to 7 or 3 mod 8. In particular,

$$\begin{aligned} M_8(-1) = \phi (2). \end{aligned}$$
(1.6)

For a character C on \(\mathbb {F}_q\), we let CN denote the character on \(\mathbb {F}_{q^2}\) obtained by composing C with the norm map N on \(\mathbb {F}_{q^2}\) defined by

$$\begin{aligned} Nz = z\overline{z}\in \mathbb {F}_q. \end{aligned}$$

Given a character B on \(\mathbb {F}_q\), BCN is to be interpreted as the character (BC)N, i.e., BNCN.

For the same a as in (1.3), define

$$\begin{aligned} \tau = -\sqrt{qM_8(-a)}, \end{aligned}$$

where the choice of square root is fixed. Katz defined the sums V(j) to be the following norm-restricted Gauss sums:

$$\begin{aligned} V(j): = \tau ^{-1}\phi (j) \sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ Nz=a \end{array}} M_8(z)\psi _2(j^2 z), \quad j \in \mathbb {F}_q. \end{aligned}$$
(1.7)

Note that

$$\begin{aligned} V(-j) = -V(j). \end{aligned}$$
(1.8)
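With V(j) now defined, the identities (1.5) can be tested numerically for small primes. The sketch below assumes \(q=p\) is a prime congruent to 3 mod 4 and models \(\mathbb {F}_{p^2}\) as pairs (x, y) standing for \(x+iy\). The helper names (`fp2_logs`, `katz_V`, `katz_identity_max_error`) and the particular octic character, built from a brute-force generator, are our own choices for illustration.

```python
import cmath

TWO_PI_I = 2j * cmath.pi

def legendre(y, p):
    """Quadratic character phi on F_p, with phi(0) = 0."""
    r = pow(y % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def katz_P(p, a, j, k):
    """P(j,k) of (1.3) over F_p."""
    psi = lambda y: cmath.exp(TWO_PI_I * (y % p) / p)
    gauss_phi = sum(legendre(y, p) * psi(y) for y in range(1, p))
    total = 0
    for x in range(1, p):
        b = a * pow(x, p - 2, p) % p            # b = a/x in F_p
        total += legendre(b - x, p) * psi(x * (j + k) ** 2 + b * (j - k) ** 2)
    delta = lambda u, v: 1 if (u - v) % p == 0 else 0
    return delta(j, k) + legendre(-1, p) * delta(j, -k) + total / gauss_phi

def fp2_logs(p):
    """Discrete logs in F_{p^2}^*, modelled as F_p[i] with i^2 = -1 (p = 3 mod 4)."""
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)
    for gx in range(p):
        for gy in range(1, p):
            z, log = (1, 0), {}
            while z not in log:                  # walk through the powers of (gx, gy)
                log[z] = len(log)
                z = mul(z, (gx, gy))
            if len(log) == p * p - 1:            # (gx, gy) generates the whole group
                return log
    raise ValueError("no generator found; is p a prime congruent to 3 mod 4?")

def katz_V(p, a, j, log):
    """The norm-restricted Gauss sum V(j) of (1.7) for q = p."""
    m8 = lambda z: cmath.exp(TWO_PI_I * log[z] / 8)               # an octic character
    psi2 = lambda z: cmath.exp(TWO_PI_I * (2 * z[0] % p) / p)     # trace of x+iy is 2x
    tau = -cmath.sqrt(p * m8(((-a) % p, 0)))                      # one fixed square root
    s = 0
    for x in range(p):
        for y in range(p):
            if (x * x + y * y - a) % p == 0:                      # N(z) = a
                s += m8((x, y)) * psi2((j * j * x % p, j * j * y % p))
    return legendre(j, p) * s / tau

def katz_identity_max_error(p):
    """max |P(j,k) - V(j)V(k)| over all j, k in F_p and all a in F_p^*."""
    log = fp2_logs(p)
    worst = 0.0
    for a in range(1, p):
        V = [katz_V(p, a, j, log) for j in range(p)]
        for j in range(p):
            for k in range(p):
                worst = max(worst, abs(katz_P(p, a, j, k) - V[j] * V[k]))
    return worst
```

The product V(j)V(k) depends on \(\tau \) only through \(\tau ^2\), so the branch of the square root chosen by `cmath.sqrt` does not matter here. For \(p=3\), where Katz's original proof does not apply, the case \(a=1\), \(j=k=1\) can be checked by hand: both sides equal 1.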

2 Mellin transform of the sums V(j)

This section begins with some results related to Gauss sums over \(\mathbb {F}_{q^2}\) that will be used in this paper. We use the notation \(G_2\) and \(J_2\) for Gauss and Jacobi sums over \(\mathbb {F}_{q^2}\), in order to distinguish them from the Gauss and Jacobi sums G and J over \(\mathbb {F}_q\). For any character \(\beta \) on \(\mathbb {F}_{q^2}\), we have

$$\begin{aligned} G_2(\beta ) = G_2(\beta ^q); \end{aligned}$$
(2.1)

for example, for a character C on \(\mathbb {F}_q\), \(G_2(CN M_8)\) equals \(G_2(CN \overline{M}_8)\) or \(G_2(CN M_8^3)\) according as q is congruent to 7 or 3 mod 8. The Hasse–Davenport theorem on lifted Gauss sums [5, Theorem 11.5.2] gives

$$\begin{aligned} G_2(CN) = - G(C)^2. \end{aligned}$$
(2.2)

From [9, (4.10)],

$$\begin{aligned} G_2(CN M_4) = G_2(CN \overline{M}_4) = -\overline{C}^2\phi (2)G(C^2\phi )G(\phi ). \end{aligned}$$
(2.3)

For any character \(\beta \) on \(\mathbb {F}_{q^2}\), define

$$\begin{aligned} E(\beta ): = \sum _{y \in \mathbb {F}_q} \beta (1+iy). \end{aligned}$$
(2.4)

It is easily seen that

$$\begin{aligned} E(\beta ) = \beta (2) E_2(\beta ), \end{aligned}$$
(2.5)

where \(E_2(\beta )\) is the Eisenstein sum

$$\begin{aligned} E_2(\beta ): = \sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ z+z^q =1 \end{array}} \beta (z). \end{aligned}$$
(2.6)

Let \(\beta ^*\) denote the restriction of \(\beta \) to \(\mathbb {F}_q\). Applying [5, Theorem 12.1.1] with q in place of p, we can express \(E_2(\beta )\) in terms of Gauss sums when \(\beta \) is nontrivial, as follows:

$$\begin{aligned} E_2(\beta ) = {\left\{ \begin{array}{ll} G_2(\beta )/G(\beta ^*) &{}\text{ if } \beta ^* \ne \varepsilon \\ -G_2(\beta )/q &{} \text{ if } \beta ^* = \varepsilon . \end{array}\right. } \end{aligned}$$
(2.7)
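Relations (2.5) and (2.7) can be checked numerically in the same spirit. The sketch below assumes \(q=p\) is a prime congruent to 3 mod 4, again represents \(\mathbb {F}_{p^2}\) by pairs (x, y), and runs over every nontrivial character \(\beta \) of \(\mathbb {F}_{p^2}^*\). The names `fp2_logs` and `eisenstein_max_error` are illustrative helpers, not notation from the references.

```python
import cmath

TWO_PI_I = 2j * cmath.pi

def fp2_logs(p):
    """Discrete logs in F_{p^2}^*, modelled as F_p[i] with i^2 = -1 (p = 3 mod 4)."""
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)
    for gx in range(p):
        for gy in range(1, p):
            z, log = (1, 0), {}
            while z not in log:
                log[z] = len(log)
                z = mul(z, (gx, gy))
            if len(log) == p * p - 1:
                return log
    raise ValueError("no generator found; is p a prime congruent to 3 mod 4?")

def eisenstein_max_error(p):
    """Check (2.5) and (2.7) for every nontrivial character beta of F_{p^2}^*."""
    log = fp2_logs(p)
    n = p * p - 1
    psi = lambda x: cmath.exp(TWO_PI_I * (x % p) / p)
    half = (p + 1) // 2                                   # the element 1/2 of F_p
    worst = 0.0
    for k in range(1, n):
        beta = lambda z: cmath.exp(TWO_PI_I * k * log[z] / n)
        E = sum(beta((1, y)) for y in range(p))           # (2.4)
        E2 = sum(beta((half, y)) for y in range(p))       # (2.6): z + z^q = 2x = 1
        G2 = sum(beta(z) * psi(2 * z[0]) for z in log)    # Gauss sum over F_{p^2}
        if k % (p - 1):                                   # restriction beta* is nontrivial
            G_star = sum(beta((x, 0)) * psi(x) for x in range(1, p))
            predicted = G2 / G_star
        else:                                             # restriction beta* is trivial
            predicted = -G2 / p
        worst = max(worst, abs(E - beta((2, 0)) * E2), abs(E2 - predicted))
    return worst
```

The test for a trivial restriction uses the fact that \(\mathbb {F}_p^*\) is generated by the \((p+1)\)-th power of a generator of \(\mathbb {F}_{p^2}^*\), so the k-th character restricts trivially exactly when \(p-1\) divides k.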

For any character \(\chi \) on \(\mathbb {F}_q\), define the Mellin transform

$$\begin{aligned} S(\chi ): = \sum _{j \in \mathbb {F}_q^*} \chi (j)V(j). \end{aligned}$$
(2.8)

In the case that \(\chi \) is odd, the character \(\chi \phi \) is even and hence a square, so we may write \(\chi = \phi \lambda ^2\) for some character \(\lambda \) on \(\mathbb {F}_q\). We may assume without loss of generality that \(\lambda \) is even (otherwise replace \(\lambda \) by \(\phi \lambda \)), so that \(\lambda \) is itself a square. In summary, when \(\chi \) is odd,

$$\begin{aligned} \chi = \phi \lambda ^2 = \phi \nu ^4, \quad \lambda = \nu ^2 \end{aligned}$$
(2.9)

for some character \(\nu \) on \(\mathbb {F}_q\).

The next theorem gives an evaluation of \(S(\chi )\) in terms of Gauss sums.

Theorem 2.1

If \(\chi \) is even, then \(S(\chi )=0\). If \(\chi \) is odd [so that (2.9) holds], then

$$\begin{aligned} S(\chi )= \overline{\nu }(a)\tau ^{-1}G_2(\nu N M_8)+\phi \overline{\nu }(a)\tau ^{-1}G_2(\nu N M_8^5). \end{aligned}$$
(2.10)

Proof

If \(\chi \) is even, then \(S(\chi )\) vanishes by (1.8) and (2.8). Now assume that \(\chi \) is odd, so that \(\chi = \phi \nu ^4\). Then

$$\begin{aligned} S(\chi )=\frac{\tau ^{-1}}{q-1}\sum _{z \in \mathbb {F}_{q^2}} \sum _{j \in \mathbb {F}_q^*} M_8(z)\psi _2(z j^2) \nu ^4(j) \sum _{C} C(N(z)/a). \end{aligned}$$

Replace z by \(z/j^2\) to get

$$\begin{aligned} S(\chi )=\frac{\tau ^{-1}}{q-1}\sum _{C} \overline{C}(a)\sum _{z \in \mathbb {F}_{q^2}} M_8(z)\psi _2(z)CN(z)\sum _{j \in \mathbb {F}_q^*} \nu ^4 \overline{C}^4(j). \end{aligned}$$

The sum on j on the right equals \(q-1\) when \(C \in \{\nu , \phi \nu \}\) and it equals 0 otherwise. Since \(\phi N = M_8^4\), the result now follows from the definition of \(G_2\). \(\square \)
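Theorem 2.1 can also be tested numerically. The following sketch assumes \(q=p\) is a prime congruent to 3 mod 4 and compares the Mellin transform (2.8), computed from the definition of V(j), with the right side of (2.10). Characters of \(\mathbb {F}_p^*\) are indexed by exponents relative to the norm of a generator of \(\mathbb {F}_{p^2}^*\); the function names are hypothetical.

```python
import cmath

TWO_PI_I = 2j * cmath.pi

def fp2_logs(p):
    """Discrete logs in F_{p^2}^*, modelled as F_p[i] with i^2 = -1 (p = 3 mod 4)."""
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)
    for gx in range(p):
        for gy in range(1, p):
            z, log = (1, 0), {}
            while z not in log:
                log[z] = len(log)
                z = mul(z, (gx, gy))
            if len(log) == p * p - 1:
                return log
    raise ValueError("no generator found; is p a prime congruent to 3 mod 4?")

def theorem_2_1_max_error(p, a):
    """Compare S(chi) from (2.8) with Theorem 2.1 for all characters chi of F_p^*."""
    n = p - 1
    log2 = fp2_logs(p)

    def char(k, x):
        # character of F_p^*; logs are taken relative to the norm of the generator
        x %= p
        return 0 if x == 0 else cmath.exp(TWO_PI_I * k * (log2[(x, 0)] // (p + 1)) / n)

    m8 = lambda z, e=1: cmath.exp(TWO_PI_I * e * log2[z] / 8)     # M_8^e
    psi2 = lambda z: cmath.exp(TWO_PI_I * (2 * z[0] % p) / p)     # trace of x+iy is 2x
    norm = lambda z: (z[0] ** 2 + z[1] ** 2) % p
    phi = n // 2                                                  # quadratic character
    tau = -cmath.sqrt(p * m8(((-a) % p, 0)))

    def V(j):                                                     # definition (1.7)
        s = sum(m8(z) * psi2((j * j * z[0] % p, j * j * z[1] % p))
                for z in log2 if norm(z) == a % p)
        return char(phi, j) * s / tau

    def S(k):                                                     # Mellin transform (2.8)
        return sum(char(k, j) * V(j) for j in range(1, p))

    def G2(nu, e):                                                # G_2(nu N M_8^e)
        return sum(char(nu, norm(z)) * m8(z, e) * psi2(z) for z in log2)

    worst = 0.0
    for k in range(0, n, 2):                                      # even characters
        worst = max(worst, abs(S(k)))
    for nu in range(n):                                           # odd chi = phi nu^4
        rhs = (char(-nu, a) * G2(nu, 1) + char(phi - nu, a) * G2(nu, 5)) / tau
        worst = max(worst, abs(S(phi + 4 * nu) - rhs))
    return worst
```

Both sides of (2.10) carry the same factor \(\tau ^{-1}\), so the comparison does not depend on which square root `cmath.sqrt` selects for \(\tau \).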

3 Double Mellin transform of V(j)V(k)

For characters \(\chi _1, \chi _2\), define the double Mellin transform

$$\begin{aligned} S=S(\chi _1, \chi _2): = \sum _{j, k \in \mathbb {F}_q^*} \chi _1(j)\chi _2(k)V(j)V(k). \end{aligned}$$
(3.1)

As in (2.9), when \(\chi _1\) and \(\chi _2\) are both odd,

$$\begin{aligned} \chi _i = \phi \lambda _i^2 = \phi \nu _i^4, \quad \lambda _i = \nu _i^2, \quad i=1,2, \end{aligned}$$
(3.2)

for some characters \(\nu _1\), \(\nu _2\) on \(\mathbb {F}_q\). In this case, write

$$\begin{aligned} \mu = \nu _1 \nu _2. \end{aligned}$$
(3.3)

The following theorem evaluates S in terms of Gauss and Jacobi sums.

Theorem 3.1

If \(\chi _1\) or \(\chi _2\) is even, then \(S=0\). If \(\chi _1\) and \(\chi _2\) are both odd [so that (3.2) and (3.3) hold], then

$$\begin{aligned} S=\sum _{i=0}^{1} \phi ^i \overline{\mu }(a)\frac{q }{G_2(\phi ^i \overline{\mu }N)} \{J_2(\nu _1 N M_8, \phi ^i \overline{\mu }N) + J_2(\nu _1 N M_8^5, \phi ^i \overline{\mu }N)\}. \end{aligned}$$
(3.4)

Proof

By (3.1), \(S=S(\chi _1)S(\chi _2)\). By Theorem 2.1, \(S=0\) when \(\chi _1\) or \(\chi _2\) is even. Thus assume that \(\chi _1\) and \(\chi _2\) are both odd. Then Theorem 2.1 yields

$$\begin{aligned}&S=\sum _{i=0}^{1} \phi ^i \overline{\mu }(a)\frac{M_8(-a) }{q} \nonumber \\&\qquad \times \ \{G_2(\nu _1 N M_8)G_2( \phi ^i \mu \overline{\nu _1}N M_8) + G_2(\nu _1 N M_8^5)G_2( \phi ^i \mu \overline{\nu _1}N M_8^5)\}. \end{aligned}$$
(3.5)

A straightforward computation with the aid of (2.1) shows that (3.5) is equivalent to (3.4). The computation is facilitated by noting that \(M_8(-a)\) equals 1 or \(-\phi (a)\) according as q is congruent to 7 or 3 mod 8, so that the bracketed expression for \(i=0\) in (3.4) is to be compared to that for \(i=1\) in (3.5) when q is congruent to 3 mod 8. \(\square \)

4 Identity for a norm-restricted Jacobi sum in terms of a \({}_2F_1\)

Let D be a character on \(\mathbb {F}_q\). Define the norm-restricted Jacobi sums

$$\begin{aligned} R(D,j):=\sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ N(z)=j^4 \end{array}} M_8(z)\overline{D}N(1-z), \quad j \in \mathbb {F}_q^*. \end{aligned}$$
(4.1)

The next theorem provides a formula expressing R(D,j) in terms of a \({}_2F_1\) hypergeometric character sum.

Theorem 4.1

For \(j = \pm 1\),

$$\begin{aligned} R(D,j) = -\overline{D}(4)J(\phi D^2, \phi ). \end{aligned}$$
(4.2)

For all other \(j \in \mathbb {F}_q^*\),

$$\begin{aligned} R(D,j)=-\phi (j) q \overline{D}^4(j-1) {}_2F_1 \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) . \end{aligned}$$
(4.3)

Proof

Replace z in (4.1) by \(-zj^2\). By (1.6), we obtain

$$\begin{aligned} R(D,j)=\phi (2)\sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ N(z)=1 \end{array}} M_8(z)\overline{D}N(1+zj^2). \end{aligned}$$

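Each z in the sum must be a square, since N(z) is a square in \(\mathbb {F}_q\). Write \(z=w^2\), so that \(N(w)=\pm 1\) and each z arises from exactly two values of w. The terms with \(N(w)=-1\) cancel in conjugate pairs: for such w we have \(M_4(w)^2=\phi N(w)=-1\), so \(M_4(\overline{w})=\overline{M}_4(w)=-M_4(w)\), while \(N(1+\overline{w}^2j^2)=N(1+w^2j^2)\). Thus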

$$\begin{aligned} R(D,j)=\frac{\phi (2)}{2}\sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ N(z)=1 \end{array}} M_4(z)\overline{D}N(1+z^2j^2). \end{aligned}$$

Writing \(z=x+iy\), we have

$$\begin{aligned} R(D,j)=\frac{\phi (2)}{2}\sum _{x^2+y^2=1} M_4(x+iy)\overline{D}((1-j^2)^2 + 4 j^2 x^2), \end{aligned}$$

where it is understood that the sum is over all \(x,y \in \mathbb {F}_q\) for which \(x^2 + y^2 =1\). Thus, since \(M_4(\pm i)=M_8(-1)=\phi (2)\),

$$\begin{aligned} R(D,j) = \overline{D}^2(1-j^2) + Q(D,j), \end{aligned}$$
(4.4)

where

$$\begin{aligned} Q(D,j)=\frac{\phi (2)}{2}\sum _{\begin{array}{c} x^2+y^2=1 \\ x \ne 0 \end{array}} M_4(x+iy)\overline{D}((1-j^2)^2 + 4 j^2 x^2). \end{aligned}$$

Replacing y by yx, we have

$$\begin{aligned} Q(D,j)=\frac{\phi (2)}{2}\sum _{\begin{array}{c} 1+y^2=x^{-2} \\ x \ne 0 \end{array}} M_4(1+iy)\overline{D}((1-j^2)^2 + 4 j^2 /(1+ y^2)). \end{aligned}$$

Since \(\overline{M}_4= \phi N M_4\), this yields

$$\begin{aligned} Q(D,j)=\frac{\phi (2)}{2}\sum _{y \in \mathbb {F}_q} \{M_4(1+iy)+\overline{M}_4(1+iy)\} \overline{D}((1-j^2)^2 + 4 j^2 /(1+ y^2)). \end{aligned}$$
(4.5)

First consider the case where \(j = \pm 1\). By (4.4) and (4.5),

$$\begin{aligned} R(D,j)=Q(D,j) = \frac{\overline{D}(4) \phi (2)}{2}\sum _{y \in \mathbb {F}_q}\left\{ DN M_4(1+iy)+DN \overline{M}_4(1+iy)\right\} . \end{aligned}$$

By (2.4) and (2.5),

$$\begin{aligned} R(D,j)=\frac{\phi (2)}{2}\{E_2(DN M_4) + E_2(DN \overline{M}_4)\}. \end{aligned}$$
(4.6)

The restriction of \(DN M_4\) to \(\mathbb {F}_q\) is \(D^2\). Thus by (2.7),

$$\begin{aligned} R(D,j)=\frac{\phi (2)}{2 G(D^2)}\{G_2(DN M_4) + G_2(DN \overline{M}_4)\}, \end{aligned}$$

if \(D^2\) is nontrivial, and

$$\begin{aligned} R(D,j)=\frac{\phi (2)}{-2 q}\{G_2(DN M_4) + G_2(DN \overline{M}_4)\}, \end{aligned}$$

if \(D^2\) is trivial. By (2.3),

$$\begin{aligned} G_2(DN M_4) = G_2(DN \overline{M}_4) = -\overline{D}(4) \phi (2)G(D^2 \phi )G(\phi ). \end{aligned}$$

Consequently,

$$\begin{aligned} R(D,j) = -\overline{D}(4)J(\phi D^2, \phi ) \end{aligned}$$
(4.7)

for every D, which completes the proof when \(j = \pm 1\). Thus assume for the remainder of this proof that \(j^2 \ne 1\).

By (4.5), Q(D,j) equals

$$\begin{aligned} \frac{\overline{D}^2(1-j^2)}{2\phi (2)}\sum _{y \in \mathbb {F}_q} \{M_4(1+iy)+\overline{M}_4(1+iy)\} \overline{D}\left( 1 + \frac{4 j^2}{(1+ y^2)(1-j^2)^2}\right) . \end{aligned}$$

By the “binomial theorem” [12, (2.10)], the rightmost factor above equals

$$\begin{aligned} \frac{q}{q-1}\sum _{\chi } \begin{pmatrix} D \chi \\ \chi \end{pmatrix} \chi \left( \frac{-4 j^2}{(1+ y^2)(1-j^2)^2}\right) , \end{aligned}$$

where the “binomial coefficient” over \(\mathbb {F}_q\) is defined by [12, p. 80]

$$\begin{aligned} \begin{pmatrix} A \\ B \end{pmatrix} = \frac{B(-1)}{q} J(A, \overline{B}). \end{aligned}$$

Replacing \(\chi \) with \(\overline{\chi }\) and observing that [12, p. 80]

$$\begin{aligned} \begin{pmatrix} D \overline{\chi }\\ \overline{\chi }\end{pmatrix}= D(-1)\begin{pmatrix} \chi \\ \overline{D}\chi \end{pmatrix}, \end{aligned}$$

we see that

$$\begin{aligned} Q(D,j)=\frac{\overline{D}^2(1-j^2)D(-1)\phi (2)q}{2(q-1)}\sum _{\chi } \begin{pmatrix} \chi \\ \overline{D}\chi \end{pmatrix} \chi \left( \frac{-(1-j^2)^2}{4j^2}\right) \kappa (\chi ), \end{aligned}$$

where

$$\begin{aligned} \kappa (\chi ):=\sum _{y \in \mathbb {F}_q} \{\chi N M_4(1+iy) + \chi N \overline{M}_4(1+iy)\}. \end{aligned}$$

By (2.4) and (2.5),

$$\begin{aligned} \kappa (\chi )=\chi (4)\{E_2(\chi N M_4) + E_2(\chi N \overline{M}_4)\}. \end{aligned}$$

Comparing (4.6) and (4.7), we see that

$$\begin{aligned} \kappa (\chi )= -2\phi (2)J(\phi \chi ^2, \phi )= 2q\phi (2)\overline{\chi }(4) \begin{pmatrix} \phi \chi ^2 \\ \chi \end{pmatrix}, \end{aligned}$$

where the last equality follows from the Hasse–Davenport relation (1.1). Consequently,

$$\begin{aligned} Q(D,j)=\frac{\overline{D}^2(1-j^2)D(-1)q^2}{q-1}\sum _{\chi } \begin{pmatrix} \chi \\ \overline{D}\chi \end{pmatrix} \begin{pmatrix} \phi \chi ^2 \\ \chi \end{pmatrix} \chi \left( \frac{-(1-j^2)^2}{16j^2}\right) . \end{aligned}$$

Replace \(\chi \) by \(D \chi \) to get

$$\begin{aligned} Q(D,j)=\frac{\overline{D}(16j^2)q^2}{q-1}\sum _{\chi } \begin{pmatrix} D\chi \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi ^2 \\ D \chi \end{pmatrix} \chi \left( \frac{-(1-j^2)^2}{16j^2}\right) . \end{aligned}$$

By [12, (2.15)] with \(A=D \chi \), \(B=\chi \), and \(C=D^2 \phi \chi ^2\),

$$\begin{aligned}&\frac{q^2}{q-1} \begin{pmatrix} D\chi \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi ^2 \\ D \chi \end{pmatrix} \\&\quad =\frac{q^2}{q-1} \begin{pmatrix} D^2 \phi \chi ^2 \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi \\ D \phi \chi \end{pmatrix} -\chi (-1)\delta (D \chi ) + D(-1) \delta (D^2 \phi \chi ), \end{aligned}$$

since by [12, (2.6)],

$$\begin{aligned} \begin{pmatrix} D^2 \phi \chi \\ D \end{pmatrix} = \begin{pmatrix} D^2 \phi \chi \\ D \phi \chi \end{pmatrix}. \end{aligned}$$

Thus

$$\begin{aligned}&Q(D,j)=\frac{\overline{D}(16j^2)q^2}{q-1}\sum _{\chi } \begin{pmatrix} D^2 \phi \chi ^2 \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi \\ D \phi \chi \end{pmatrix} \chi \left( \frac{-(1-j^2)^2}{16j^2}\right) \nonumber \\&\qquad \qquad \,\,-\,\,\overline{D}^2(1-j^2) -D(-16j^2)\overline{D}^4(1-j^2). \end{aligned}$$
(4.8)

By [12, Theorem 4.16] with \(A=D\), \(B=\phi D^2\), and \(x = -(j+1)^2/(j-1)^2\), we have

$$\begin{aligned}&-\phi (j)D^4(j-1)\overline{D}(16j^2) \displaystyle \frac{q}{q-1}\sum \limits _{\chi } \begin{pmatrix} D^2 \phi \chi ^2 \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi \\ D \phi \chi \end{pmatrix} \chi \left( \dfrac{-(1-j^2)^2}{16j^2}\right) \\&={}_2F_1 \displaystyle \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) -\phi (j)D(-16j^2)\overline{D}^4(j+1)/q. \end{aligned}$$

Multiply by \(-\phi (j)q\overline{D}^4(j-1)\) to get

$$\begin{aligned}&\overline{D}(16j^2)\frac{q^2}{q-1}\sum _{\chi } \begin{pmatrix} D^2 \phi \chi ^2 \\ \chi \end{pmatrix} \begin{pmatrix} D^2 \phi \chi \\ D \phi \chi \end{pmatrix} \chi \left( \frac{-(1-j^2)^2}{16j^2}\right) \\&\quad =-\phi (j)q\overline{D}^4(j-1){}_2F_1 \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) +D(-16j^2)\overline{D}^4(1 -j^2). \end{aligned}$$

Thus by (4.8),

$$\begin{aligned}&Q(D,j) + \overline{D}^2(1-j^2) + D(-16j^2) \overline{D}^4(1-j^2) \nonumber \\&\quad =-\phi (j)q\overline{D}^4(j-1){}_2F_1 \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) +D(-16j^2)\overline{D}^4(1-j^2). \end{aligned}$$
(4.9)

Combining (4.4) and (4.9), we arrive at the desired result (4.3). \(\square \)
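Theorem 4.1 lends itself to a direct numerical test. The sketch below assumes \(q=p\) is a prime congruent to 3 mod 4, evaluates R(D,j) from the definition (4.1), and compares it with (4.2) and (4.3), using the definition (1.2) of the \({}_2F_1\) sum. All function names, and the choice of octic character via a brute-force generator, are ours and only illustrative.

```python
import cmath

TWO_PI_I = 2j * cmath.pi

def fp2_logs(p):
    """Discrete logs in F_{p^2}^*, modelled as F_p[i] with i^2 = -1 (p = 3 mod 4)."""
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)
    for gx in range(p):
        for gy in range(1, p):
            z, log = (1, 0), {}
            while z not in log:
                log[z] = len(log)
                z = mul(z, (gx, gy))
            if len(log) == p * p - 1:
                return log
    raise ValueError("no generator found; is p a prime congruent to 3 mod 4?")

def theorem_4_1_max_error(p):
    """max |R(D,j) - right side of (4.2)/(4.3)| over all D and all j in F_p^*."""
    n, q = p - 1, p
    log2 = fp2_logs(p)

    def char(k, x):                                   # character of F_p^*, zero at 0
        x %= p
        return 0 if x == 0 else cmath.exp(TWO_PI_I * k * (log2[(x, 0)] // (p + 1)) / n)

    m8 = lambda z: cmath.exp(TWO_PI_I * log2[z] / 8)
    norm = lambda z: (z[0] ** 2 + z[1] ** 2) % p
    phi = n // 2

    def R(D, j):                                      # definition (4.1)
        t = pow(j, 4, p)
        return sum(m8(z) * char(-D, norm(((1 - z[0]) % p, (-z[1]) % p)))
                   for z in log2 if norm(z) == t)

    def J(A, B):                                      # Jacobi sum over F_p
        return sum(char(A, y) * char(B, 1 - y) for y in range(p))

    def two_f1(A, B, C, x):                           # definition (1.2)
        if x % p == 0:
            return 0
        return sum(char(B, y) * char(C - B, y - 1) * char(-A, 1 - x * y)
                   for y in range(p)) / q

    worst = 0.0
    for D in range(n):
        for j in range(1, p):
            if j in (1, p - 1):
                rhs = -char(-D, 4) * J(phi + 2 * D, phi)                    # (4.2)
            else:
                x = -pow((j + 1) * pow(j - 1, p - 2, p), 2, p)              # -((j+1)/(j-1))^2
                rhs = (-char(phi, j) * q * char(-4 * D, j - 1)
                       * two_f1(D, 2 * D + phi, D + phi, x))                # (4.3)
            worst = max(worst, abs(R(D, j) - rhs))
    return worst
```

As a hand-checkable instance, for \(p=7\), \(D=\phi \) and \(j=2\), both sides of (4.3) equal 4. For \(p=3\) only the case \(j=\pm 1\) of the theorem occurs.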

5 Double Mellin transform of P(j,k)

For characters \(\chi _1, \chi _2\), define the double Mellin transform

$$\begin{aligned} T=T(\chi _1, \chi _2): = \sum _{j, k \in \mathbb {F}_q^*} \chi _1(j)\chi _2(k)P(j,k). \end{aligned}$$
(5.1)

Note that \(T(\chi _1, \chi _2)\) is symmetric in \(\chi _1\), \(\chi _2\).

The following theorem evaluates T.

Theorem 5.1

If \(\chi _1\) or \(\chi _2\) is even, then \(T=0\). If \(\chi _1\) and \(\chi _2\) are both odd [so that (3.2) and (3.3) hold], then

$$\begin{aligned} T=\sum _{i=0}^1 \overline{\mu }\phi ^i(a)\frac{G(\phi \mu ^2)}{G(\phi )} \left\{ \sum _{j \in \mathbb {F}_q^*} \chi _1(j) h(\mu \phi ^i,j) +2(q-1)\delta (\mu \phi ^i)\right\} , \end{aligned}$$
(5.2)

where for a character D on \(\mathbb {F}_q\) and \(j \in \mathbb {F}_q^*\), we define

$$\begin{aligned} h(D,j):=\sum _{x \in \mathbb {F}_q^*} D(x)\phi (1-x) \phi \overline{D}^2(x(j+1)^2 +(j-1)^2). \end{aligned}$$
(5.3)

Proof

By (1.4), \(P(j, k)=-P(j,-k)\), so \(T=0\) if \(\chi _1\) or \(\chi _2\) is even. Thus assume that (3.2) and (3.3) hold. Replacing j by jk in (1.3), we obtain

$$\begin{aligned} T= & {} 2(q-1)\delta (\mu ^4) +\frac{1}{G(\phi )}\sum _{x,j,k \in \mathbb {F}_q^*} \chi _1(j)\mu ^4(k)\phi (a/x - x)\psi (k^2(x(j+1)^2\\&+\,a(j-1)^2/x)). \end{aligned}$$

Since \(\delta (\mu ^4)=\delta (\mu ^2)\), this becomes

$$\begin{aligned} T= & {} 2(q-1)\delta (\mu ^2) +\frac{1}{G(\phi )}\sum _{x,j,k \in \mathbb {F}_q^*} \chi _1(j)\mu ^2(k)\phi (a/x - x)\psi (k(x(j+1)^2\\&+\,a(j-1)^2/x))(1+\phi (k)). \end{aligned}$$

There is no contribution from the 1 in the rightmost factor \((1+\phi (k))\); to see this, replace k and x by their negatives. Therefore,

$$\begin{aligned} T= & {} 2(q-1)\delta (\mu ^2) +\frac{G(\phi \mu ^2)}{G(\phi )}\sum _{x,j \in \mathbb {F}_q^*} \chi _1(j)\phi (a/x - x)\phi \overline{\mu }^2(x(j+1)^2\\&+\,a(j-1)^2/x). \end{aligned}$$

It follows that

$$\begin{aligned} T= & {} 2(q-1)\delta (\mu ^2)\frac{G(\phi \mu ^2)}{G(\phi )}+\frac{G(\phi \mu ^2)}{G(\phi )}\sum _{x,j \in \mathbb {F}_q^*} \chi _1(j)\phi (a - x)\phi \overline{\mu }^2(x(j+1)^2 \\&+\,a(j-1)^2)\mu (x)(1+\phi (x)). \end{aligned}$$

After replacing x by ax and employing (5.3), the desired result (5.2) readily follows. \(\square \)

We proceed to analyze h(D,j).

Lemma 5.2

We have

$$\begin{aligned} h(D,j) = -\phi (j) \overline{D}(16) J(D, \phi ), \quad \text{ if } j = \pm 1, \end{aligned}$$
(5.4)

and for \(j \ne \pm 1\) and nontrivial D, we have

$$\begin{aligned} h(D,j) = \frac{G(\phi )G(D)^2}{G(\phi D^2)}\overline{D}^4(j-1) {}_2F_1 \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) . \end{aligned}$$
(5.5)

Finally, if \(j \ne \pm 1\) and D is trivial, then \(h(D,j)=0\).

Proof

The evaluation in (5.4) follows directly from the definition of h(D,j) in (5.3). The evaluation in (5.5) is the same as that in [8, (5.21)], the proof of which is valid for q congruent to either 1 or 3 mod 4. Finally, let \(j \ne \pm 1\). Then since

$$\begin{aligned} h(\varepsilon ,j) = -1 + \sum _{x \in \mathbb {F}_q} \phi (1-x) \phi (x(j+1)^2 + (j-1)^2), \end{aligned}$$

replacement of x by \(1-x(2j^2+2)/(j+1)^2\) shows that \(h(\varepsilon ,j)= -1 + 1 = 0\). \(\square \)
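The three cases of Lemma 5.2 can be checked numerically over a prime field. The sketch below assumes \(q=p\) is a prime congruent to 3 mod 4 and evaluates h(D,j) directly from (5.3). The helper names are illustrative, and characters are indexed by powers of a fixed generator of the character group.

```python
import cmath

def primitive_root(p):
    """Return a generator of F_p^* by brute force (p an odd prime)."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def lemma_5_2_max_error(p):
    """max |h(D,j) - value predicted by Lemma 5.2| over all D and all j in F_p^*."""
    n, q = p - 1, p
    g = primitive_root(p)
    log = {pow(g, m, p): m for m in range(n)}

    def char(k, y):                                   # zero at y = 0
        y %= p
        return 0 if y == 0 else cmath.exp(2j * cmath.pi * k * log[y] / n)

    psi = lambda y: cmath.exp(2j * cmath.pi * (y % p) / p)
    G = lambda k: sum(char(k, y) * psi(y) for y in range(p))
    J = lambda k, l: sum(char(k, y) * char(l, 1 - y) for y in range(p))
    phi = n // 2

    def two_f1(A, B, C, x):                           # definition (1.2)
        if x % p == 0:
            return 0
        return sum(char(B, y) * char(C - B, y - 1) * char(-A, 1 - x * y)
                   for y in range(p)) / q

    def h(D, j):                                      # definition (5.3)
        return sum(char(D, x) * char(phi, 1 - x)
                   * char(phi - 2 * D, x * (j + 1) ** 2 + (j - 1) ** 2)
                   for x in range(1, p))

    worst = 0.0
    for D in range(n):
        for j in range(1, p):
            if j in (1, p - 1):
                rhs = -char(phi, j) * char(-D, 16) * J(D, phi)              # (5.4)
            elif D == 0:
                rhs = 0                                                     # trivial D
            else:
                x = -pow((j + 1) * pow(j - 1, p - 2, p), 2, p)
                rhs = (G(phi) * G(D) ** 2 / G(phi + 2 * D) * char(-4 * D, j - 1)
                       * two_f1(D, 2 * D + phi, D + phi, x))                # (5.5)
            worst = max(worst, abs(h(D, j) - rhs))
    return worst
```

For example, when \(p=7\), \(D=\phi \) and \(j=2\), a short hand computation gives \(h(\phi ,2)=4\), in agreement with (5.5).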

Theorem 5.3

For a character D on \(\mathbb {F}_q\), define

$$\begin{aligned} W(D): = \sum _{j \in \mathbb {F}_q^*} \phi \nu _1^4(j)h(D,j). \end{aligned}$$
(5.6)

Then \(W(\varepsilon )=2\), and for nontrivial D,

$$\begin{aligned} W(D)=\frac{-G(\phi )G(D)^2}{qG(\phi D^2)} \left\{ J_2(\nu _1N M_8, \overline{D}N)+ J_2(\nu _1N M_8^5, \overline{D}N)\right\} . \end{aligned}$$
(5.7)

Proof

It follows directly from Lemma 5.2 that \(W(\varepsilon )=2\). Let D be nontrivial. By Lemma 5.2,

$$\begin{aligned} W(D)= & {} -2\overline{D}(16)J(D, \phi ) \ \nonumber \\&- \frac{G(\phi )G(D)^2}{qG(\phi D^2)} \sum _{\begin{array}{c} j \in \mathbb {F}_q^*\\ j \ne \pm 1 \end{array}} \nu _1^4(j)\left( -q\phi (j)\overline{D}^4(j-1) {}_2F_1 \left( \begin{array}{r|}D,D^2\phi \\ D\phi \end{array}\ -\left( \frac{j+1}{j-1}\right) ^2 \right) \right) .\nonumber \\ \end{aligned}$$
(5.8)

Thus by (4.1)–(4.3),

$$\begin{aligned} W(D)= & {} -2\overline{D}(16)J(D, \phi ) -\frac{G(\phi )G(D)^2}{qG(\phi D^2)} \sum _{\begin{array}{c} j \in \mathbb {F}_q^*\\ j \ne \pm 1 \end{array}} \nu _1^4(j)R(D,j) \\= & {} -2\overline{D}(16)J(D, \phi ) - \frac{G(\phi )G(D)^2}{qG(\phi D^2)}\left\{ 2\overline{D}(4)J(\phi D^2, \phi ) \ + \sum _{j \in \mathbb {F}_q^*} \nu _1^4(j)R(D,j)\right\} . \end{aligned}$$

This simplifies to

$$\begin{aligned} W(D) = -\frac{G(\phi )G(D)^2}{qG(\phi D^2)}\sum _{j \in \mathbb {F}_q^*} \nu _1^4(j)R(D,j). \end{aligned}$$
(5.9)

For brevity, let Y(D) denote this sum on j. It remains to prove that

$$\begin{aligned} Y(D): = \sum _{j \in \mathbb {F}_q^*} \nu _1^4(j)R(D,j)= J_2(\nu _1N M_8, \overline{D}N)+ J_2(\nu _1N M_8^5, \overline{D}N). \end{aligned}$$
(5.10)

Since the fourth powers in \(\mathbb {F}_q\) are precisely the squares, it follows from definition (4.1) that

$$\begin{aligned} Y(D) = \sum _{j \in \mathbb {F}_q^*} \nu _1(j^2) \sum _{\begin{array}{c} z \in \mathbb {F}_{q^2}\\ N(z)=j^2 \end{array}} M_8(z)\overline{D}N(1-z). \end{aligned}$$

Thus

$$\begin{aligned} Y(D)= & {} \frac{1}{q-1}\sum _{j \in \mathbb {F}_q^*} \nu _1(j^2) \sum _{z \in \mathbb {F}_{q^2}^*}M_8(z)\overline{D}N(1-z)\sum _{\chi } \chi (N(z)/j^2) \\= & {} \frac{1}{q-1}\sum _{\chi } J_2(\chi N M_8, \overline{D}N) \sum _{j \in \mathbb {F}_q^*} (\nu _1\overline{\chi })^2(j). \end{aligned}$$

The sum on j on the right vanishes unless \(\chi \in \{\nu _1, \nu _1\phi \}\), and so we obtain the desired result (5.10). \(\square \)

As interesting consequences of Theorem 5.3, we record the elegant double character sum evaluations (5.11)–(5.14) below.

Theorem 5.4

For any character \(\nu \) on \(\mathbb {F}_q\),

$$\begin{aligned}&\sum _{j, x \in \mathbb {F}_q^*} \phi \nu ^4(j) \phi (x)\phi (1-x)\phi (x(j+1)^2+(j-1)^2) \nonumber \\&\quad = J_2(\nu N M_8, \phi N) + J_2(\nu N M_8^5, \phi N). \end{aligned}$$
(5.11)

Proof

This follows by putting \(D=\phi \) in (5.7). \(\square \)

Theorem 5.5

When \(q \equiv 7 \pmod 8\), we have

$$\begin{aligned} \sum _{j, x \in \mathbb {F}_q^*} \phi (jx)\phi (1-x)\phi (x(j+1)^2+(j-1)^2)=2q. \end{aligned}$$
(5.12)

When \(q \equiv 3 \pmod 8\), we have

$$\begin{aligned} \sum _{j, x \in \mathbb {F}_q^*} \phi (jx)\phi (1-x)\phi (x(j+1)^2+(j-1)^2)=2u, \end{aligned}$$
(5.13)

where |u|, |v| is the unique pair of positive integers with \(p \not \mid u\) for which \(q^2=u^2 + 2v^2\), and where the sign of u is determined by the congruence \(u \equiv -1 \pmod 8\). In particular, when \(q=p \equiv 3 \pmod 8\), we have

$$\begin{aligned} \sum _{j, x \in \mathbb {F}_q^*} \phi (jx)\phi (1-x)\phi (x(j+1)^2+(j-1)^2)=4a_8^2 -2p, \end{aligned}$$
(5.14)

where \(p = a_8^2 + 2b_8^2\).

Proof

By (5.11) with \(\nu = \varepsilon \), the sum in (5.12) equals

$$\begin{aligned} J_2(M_8, \phi N) + J_2(M_8^5, \phi N). \end{aligned}$$

First suppose that \(q \equiv 7 \pmod 8\). Then

$$\begin{aligned} G_2(M_8) = G_2(M_8^5), \quad G_2(\phi N) = q \end{aligned}$$

by [5, Theorem 11.6.1]. Thus each Jacobi sum above equals q, which proves (5.12).

Now suppose that \(q \equiv 3 \pmod 8\). An application of (2.1) shows that \(J_2(M_8^5, \phi N)\) is the complex conjugate of \(J_2(M_8, \phi N)\), so that the sum in (5.13) equals \(2 \mathfrak {R}J_2(M_8, \phi N)\).

First consider the case where q is prime, i.e., \(q=p\). Then \(G_2(\phi N) = p\) and by [5, Theorems 12.1.1 and 12.7.1(b)],

$$\begin{aligned} G_2(M_8) = G(\phi )\pi , \quad G_2(M_8^5)=G(\phi )\overline{\pi }, \quad J_2(M_8, \phi N) = \pi ^2, \end{aligned}$$

where \(\pi =a_8 + i b_8 \sqrt{2}\) is a prime in \(\mathbb {Q}(i\sqrt{2})\) of norm \(p=\pi \overline{\pi }=a_8^2 + 2b_8^2\). Note that \(\pi ^2 = u_1 + iv_1\sqrt{2}\), where

$$\begin{aligned} u_1 = 2a_8^2 - p, \quad v_1= 2a_8b_8, \quad u_1^2 + 2v_1^2 = p^2, \end{aligned}$$

so that

$$\begin{aligned} \mathfrak {R}J_2(M_8, \phi N) = u_1 = 2a_8^2 -p \equiv -1 \pmod 8. \end{aligned}$$

In the general case where say \(q = p^t\), the Hasse–Davenport lifting theorem [5, Theorem 11.5.2] yields

$$\begin{aligned} J_2(M_8, \phi N) = (-1)^{t-1}\pi ^{2t}= (-1)^{t-1}(u_1 +iv_1\sqrt{2})^t=u+iv\sqrt{2}, \end{aligned}$$

for integers u, v such that \(q^2 = u^2 + 2v^2\). Since \(u_1 \equiv -1 \pmod 8\), it is easily seen using the binomial theorem that \(u \equiv -1 \pmod 8\). If \(p=\pi \overline{\pi }\) divided u, then p would divide v, so that the prime \(\overline{\pi }\) would divide \(\pi ^{2t}\), which is impossible. Thus \(p \not \mid u\). For an elementary proof of the uniqueness of |u|, |v|, see [5, Lemma 3.0.1]. \(\square \)
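For prime q, the evaluations (5.12) and (5.14) involve only Legendre symbols and can be confirmed in exact integer arithmetic. The following sketch assumes \(q=p\) is a prime congruent to 3 mod 4; the function names `double_sum` and `predicted` are hypothetical.

```python
import math

def legendre(y, p):
    """Quadratic character phi on F_p, with phi(0) = 0."""
    r = pow(y % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def double_sum(p):
    """Left side of (5.12)-(5.14) for q = p prime."""
    total = 0
    for j in range(1, p):
        for x in range(1, p):
            total += (legendre(j * x, p) * legendre(1 - x, p)
                      * legendre(x * (j + 1) ** 2 + (j - 1) ** 2, p))
    return total

def predicted(p):
    """2p if p = 7 mod 8 (5.12); 4*a8^2 - 2p with p = a8^2 + 2*b8^2 if p = 3 mod 8 (5.14)."""
    if p % 8 == 7:
        return 2 * p
    if p % 8 == 3:
        for a8 in range(1, p):
            rest = p - a8 * a8                      # should equal 2*b8^2
            if rest > 0 and rest % 2 == 0 and math.isqrt(rest // 2) ** 2 == rest // 2:
                return 4 * a8 * a8 - 2 * p
    raise ValueError("p must be a prime congruent to 3 mod 4")
```

For \(p=3\) the sum has only four terms and equals \(-2 = 4\cdot 1^2 - 2\cdot 3\); for \(p=7\) it equals \(14 = 2p\). Both values can be confirmed by hand.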

Remark

The sum in Theorem 5.5, namely

$$\begin{aligned} Z =\sum _{j \in \mathbb {F}_q^*} \phi (j) h(\phi , j), \end{aligned}$$
(5.15)

can be evaluated when \(q \equiv 1 \pmod 4\) as well. We have \(Z=0\) when \(q \equiv 5 \pmod 8\), which can be seen by applying [8, Lemma 5.1] with \(\phi \) in place of D, and then replacing j by jI, where I is a primitive fourth root of unity in \(\mathbb {F}_q\). More work is needed to evaluate Z in the remaining case where \(q \equiv 1 \pmod 8\). In this case Z is equal to the sum \(R_2\) in [8, (5.44)] with \(\nu _1= B_8\) and \(A_4 = B_8^2\) for an octic character \(B_8\) on \(\mathbb {F}_q\). The proof of [5, Theorem 3.3.1] shows that

$$\begin{aligned} J(B_8, \phi ) = J(B_8^3, \phi ) \in \mathbb {Q}(i\sqrt{2}). \end{aligned}$$

Using this equality to evaluate the sum \(R_2\), we have

$$\begin{aligned} Z= 2q + 2 \mathfrak {R}J(B_8, \phi )^2, \quad q \equiv 1 \pmod 8. \end{aligned}$$
(5.16)

We will use (5.16) to show that

$$\begin{aligned} Z=4q, \ \text{ when } p \equiv 5 \text{ or } 7 \pmod 8, \end{aligned}$$
(5.17)

and

$$\begin{aligned} Z=4c^2, \ \text{ when } p \equiv 1 \text{ or } 3 \pmod 8, \end{aligned}$$
(5.18)

where c and d are the unique pair of integers up to sign for which

$$\begin{aligned} q = c^2 + 2d^2, \quad p \not \mid c. \end{aligned}$$
(5.19)

First suppose that \(p \equiv 7 \pmod 8\). Then \(q = p^{2t}\) for some \(t \ge 1\). If \(t=1\), then \(J(B_8, \phi )=p\) by [5, Theorem 11.6.1]. For general t, the Hasse–Davenport lifting theorem thus yields \(J(B_8, \phi )=(-1)^{t-1}p^t\), so that \(J(B_8, \phi )^2=q\). Thus \(Z=4q\) by (5.16).

Now suppose that \(p \equiv 5 \pmod 8\). Since \(G(B_8)=G(B_8^p)=G(B_8^5)\) by [5, Theorem 1.1.4(d)], \(J(B_8, \phi ) = G(\phi )\). Thus \(J(B_8, \phi )^2=q\), so again \(Z=4q\). This completes the proof of (5.17).

Next suppose that \(p \equiv 3 \pmod 8\). Then \(q = p^{2t}\) for some \(t \ge 1\). Since \(-2\) is a square \(\pmod p\), we have the prime splitting \(p = \pi \overline{\pi }\) in \(\mathbb {Q}(i\sqrt{2})\). Assume first that \(t=1\). Then

$$\begin{aligned} J(B_8, \phi ) \overline{J}(B_8, \phi ) = q = p^2 = \pi ^2 \overline{\pi }^2, \quad t=1. \end{aligned}$$
(5.20)

We cannot have \(J(B_8, \phi )= \pm p\), otherwise the prime ideal factorization of \(J(B_8, \phi )\) in [5, Theorems 11.2.3, 11.2.9] would yield the contradiction that p ramifies in the cyclotomic field \(\mathbb {Q}(\exp (2\pi i/8))\). In view of (5.20) and unique factorization in \(\mathbb {Q}(i\sqrt{2})\), we may suppose without loss of generality that \(J(B_8, \phi ) = \pi ^2\) when \(t=1\). For general t,

$$\begin{aligned} J(B_8, \phi ) = (-1)^{t-1} \pi ^{2t} = c+ di\sqrt{2} \end{aligned}$$
(5.21)

for some integers c and d such that \(c^2 + 2d^2 = q\). Note that p cannot divide c, for otherwise p also divides d (since \(c^2 + 2d^2 = q\)), so that p divides \(\pi ^{2t}\), yielding the contradiction that the prime \(\overline{\pi }\) divides \(\pi \). By (5.21),

$$\begin{aligned} \mathfrak {R}J(B_8, \phi )^2 = c^2 -2d^2 = 2c^2 -q, \end{aligned}$$

so that by (5.16), \(Z = 4c^2\).

Finally, suppose that \(p \equiv 1 \pmod 8\), and write \(q=p^t\) for some \(t \ge 1\). Since \(-2\) is a square \(\pmod p\), we have the prime splitting \(p = \pi \overline{\pi }\) in \(\mathbb {Q}(i\sqrt{2})\). If \(t=1\), then without loss of generality, \(J(B_8, \phi ) = \pi \). For general t,

$$\begin{aligned} J(B_8, \phi ) = (-1)^{t-1} \pi ^t = c+ di\sqrt{2} \end{aligned}$$

for some integers c and d such that \(c^2 + 2d^2 = q\). Arguing as in the case \(p \equiv 3 \pmod 8\), we again obtain \(Z=4c^2\) for c as in (5.19). This completes the proof of (5.18).

6 Proof of Katz’s identities (1.5)

When \(jk=0\), both sides of (1.5) vanish, by (1.4) and (1.8). We thus assume that \(jk \ne 0\). It suffices to show that the double Mellin transforms of the left and right sides of (1.5) agree for all pairs of characters \(\chi _1, \chi _2\), for then (1.5) follows by taking inverse Mellin transforms. Thus it remains to show that \(S=T\), where S and T are given in Theorems 3.1 and 5.1, respectively. These theorems show that S and T both vanish when \(\chi _1\) or \(\chi _2\) is even, so we may assume that (3.2) and (3.3) hold. For brevity, write \(D = \mu \phi ^i\), where \(i \in \{0,1\}\). Comparing (3.4) and (5.2) term by term, we see that \(S=T\) will follow once we show, for each i, that

$$\begin{aligned} \frac{q }{G_2(\overline{D}N)} \{J_2(\nu _1 N M_8, \overline{D}N) + J_2(\nu _1 N M_8^5, \overline{D}N)\} = \frac{G(\phi D^2)}{G(\phi )}(W(D) + 2(q-1)\delta (D)). \end{aligned}$$
(6.1)

Noting that \(G_2(\overline{D}N)= -G(\overline{D})^2\) by (2.2), and using the formula for W(D) in Theorem 5.3, we easily see that (6.1) holds. This completes the proof that \(S=T\).
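For completeness, the inversion step invoked at the start of this section can be written out explicitly. For any complex-valued function f on \(\mathbb {F}_q^* \times \mathbb {F}_q^*\) (the symbol f is introduced here only for illustration), the orthogonality relations for multiplicative characters give

$$\begin{aligned} f(j,k) = \frac{1}{(q-1)^2} \sum _{\chi _1, \chi _2} \overline{\chi }_1(j) \overline{\chi }_2(k) \sum _{j', k' \in \mathbb {F}_q^*} \chi _1(j') \chi _2(k') f(j',k'), \quad j,k \in \mathbb {F}_q^*. \end{aligned}$$

Applying this with \(f(j,k) = P(j,k) - V(j)V(k)\), whose double Mellin transform is \(T-S=0\) for every pair \(\chi _1, \chi _2\), yields (1.5) for all \(j,k\) with \(jk \ne 0\).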