1 Introduction and results

1.1 Background

Classically, for \(k \ge 2\), the k-th divisor function is defined, for \(n \in {\mathbb {N}}\), by

$$\begin{aligned} d_k (n) := |\{ (a_1 , \ldots , a_k ) \in {\mathbb {N}}^k : a_1 \ldots a_k = n \} |, \end{aligned}$$

where \({\mathbb {N}}\) is the set of positive integers; and when \(k=2\) we will often write d instead of \(d_2\).
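For concreteness, \(d_k\) can be computed by brute force from this definition via the standard recursion \(d_1 \equiv 1\) and \(d_k (n) = \sum _{a \mid n} d_{k-1} (n/a)\). A minimal sketch (in Python, our choice for the illustrative snippets in this section):

```python
def divisors(n):
    # All positive divisors of n (brute force; fine for small n).
    return [a for a in range(1, n + 1) if n % a == 0]

def d_k(k, n):
    # Number of ordered k-tuples of positive integers with product n,
    # via d_1(n) = 1 and d_k(n) = sum over a | n of d_{k-1}(n // a).
    if k == 1:
        return 1
    return sum(d_k(k - 1, n // a) for a in divisors(n))

assert d_k(2, 12) == 6   # divisors of 12: 1, 2, 3, 4, 6, 12
assert d_k(3, 4) == 6    # (1,1,4),(1,4,1),(4,1,1),(1,2,2),(2,1,2),(2,2,1)
```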

It was shown by Dirichlet that

$$\begin{aligned} \sum _{n \le x} d(n) = x \log x + (2 \gamma -1) x + \Delta (x) , \end{aligned}$$
(1)

where the remainder satisfies \(\Delta (x) = O (x^{\frac{1}{2}})\); while, for \(k \ge 2\), it can be shown that

$$\begin{aligned} \sum _{n \le x} d_k (n) = x P_k (\log x ) + \Delta _k (x) , \end{aligned}$$

where \(P_k\) is a polynomial of degree \(k-1\) and \(\Delta _k (x)\) is a lower order term. It is of particular interest to understand the behavior of the remainder \(\Delta _k (x)\), and we do so by studying its moments. Various results on this and the above are given in Chapter 12 of [30].
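As a quick numerical illustration of (1), one can compute \(\sum _{n \le x} d(n)\) exactly via the elementary identity \(\sum _{n \le x} d(n) = \sum _{a \le x} \lfloor x/a \rfloor \) and compare with the main term; the choice of threshold below is illustrative, and in practice the remainder fluctuates on a much smaller scale than the stated \(O(x^{\frac{1}{2}})\) bound:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def divisor_sum(x):
    # sum_{n <= x} d(n) = sum_{a <= x} floor(x/a): count pairs (a, b)
    # with a*b <= x by their first coordinate.
    return sum(x // a for a in range(1, x + 1))

x = 10_000
main_term = x * math.log(x) + (2 * EULER_GAMMA - 1) * x
delta = divisor_sum(x) - main_term
# Delta(x) = O(x^(1/2)); at this scale it is far smaller than x itself.
assert abs(delta) < 2 * x ** 0.5
```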

We mention here that Cramér [10] proved that

$$\begin{aligned} X^{-\frac{3}{2}} \int _{x=0}^{X} \Delta (x)^2 \textrm{d} x \sim \frac{1}{6 \pi ^2} \sum _{n=1}^{\infty } d(n)^2 n^{-\frac{3}{2}} , \end{aligned}$$

while Tong [31] proved that

$$\begin{aligned} X^{-\frac{5}{3}} \int _{x=0}^{X} \Delta _3 (x)^2 \textrm{d} x \sim \frac{1}{10 \pi ^2} \sum _{n=1}^{\infty } d_3 (n)^2 n^{-\frac{4}{3}} . \end{aligned}$$

We can also consider higher moments of \(\Delta \): \(\int _{x=0}^{X} \Delta (x)^k \textrm{d} x\). For this, we refer the reader to the works of Heath-Brown [16] and Tsang [32]. We also mention that Heath-Brown proved that \(x^{-\frac{1}{4}} \Delta (x)\) has a distribution function. That is, there is a function f such that

$$\begin{aligned} X^{-1} {{\,\textrm{meas}\,}}\{ x \in [1,X]: x^{-\frac{1}{4}} \Delta (x) \in I \} \longrightarrow \int _{t \in I} f(t) \textrm{d} t \end{aligned}$$

as \(X \longrightarrow \infty \). The function f extends to an entire function on \({\mathbb {C}}\) and satisfies certain bounds on its derivatives.

A related topic is that of the divisor function over intervals. That is, we are interested in

$$\begin{aligned} \sum _{ x < n \le x+H} d(n) = \sum _{n \le x+H} d(n) - \sum _{n \le x} d(n) . \end{aligned}$$

Applying (1), we obtain

$$\begin{aligned} \sum _{ x < n \le x+H} d(n)&= x \log \Big ( 1 + \frac{H}{x} \Big ) + H \log (x+H) + (2 \gamma -1) H \\&\quad + \Delta (x+H) - \Delta (x) . \end{aligned}$$

Given that \(\Delta (x) = O (x^{\frac{1}{2}})\), it is clear that, for \(H \le x\), the error term

$$\begin{aligned} \Delta (x;H) := \Delta (x+H) - \Delta (x) \end{aligned}$$

is of lower order. Nonetheless, it is not fully understood.

For the k-th divisor problem, the analogous object to study is

$$\begin{aligned} \Delta _k (x;H) := \Delta _k (x+H) - \Delta _k (x) . \end{aligned}$$
(2)

It is the short intervals (which have \(H \le x^{1-\frac{1}{k}}\)) that are of particular interest. We highlight some of the main results in this area. Let \(\epsilon > 0\), and consider the range \(X^{\epsilon }< H < X^{\frac{1}{2} - \epsilon }\). Ivić [21] (see also [9, 22]) proved the asymptotic formula

$$\begin{aligned} \frac{1}{X} \int _{x = X}^{2X} \Delta (x;H)^2 \textrm{d} x = H \sum _{j=0}^{3} c_j \log ^j \Big ( \frac{X^{\frac{1}{2}}}{H} \Big ) \;\; + O_{\epsilon } \big ( X^{- \frac{1}{2} + \epsilon } H^2 \big ) + O_{\epsilon } \big ( X^{\epsilon } H^{\frac{1}{2}} \big ) , \end{aligned}$$

where \(c_0, c_1, c_2\) are constants and \(c_3 = \frac{8}{\pi ^2}\). Assuming the Riemann hypothesis, for \(k \ge 3\) and the range \(X^{\epsilon }< H < X^{1-\epsilon }\), Milinovich and Turnage-Butterbaugh [27] obtained the upper bound

$$\begin{aligned} \frac{1}{X} \int _{x = X}^{2X} \Delta _k (x;H)^2 \textrm{d} x \ll H (\log X)^{k^2 + o (1)}. \end{aligned}$$

Asymptotic formulas can be obtained given certain restrictions on H. For \(k \ge 3\) (and assuming the Lindelöf Hypothesis for \(k >3\)) and \(2 \le L \le X^{\frac{1}{k(k-1)} - \epsilon }\), Lester [24] proved

$$\begin{aligned} \frac{1}{X} \int _{x = X}^{2X} \Delta _k \Big ( x; \frac{x^{1-\frac{1}{k}}}{L} \Big )^2 \textrm{d} x =C_k \frac{X^{1-\frac{1}{k}}}{L} (\log L)^{k^2 -1} + O \Big ( \frac{X^{1-\frac{1}{k}}}{L} (\log L)^{k^2 -2} \Big ) . \end{aligned}$$

Finally, as \(L,X \longrightarrow \infty \) with \(\log L = o (\log X)\), and \(\alpha < \beta \), Lester and Yesha [25] proved that

$$\begin{aligned} \frac{1}{X} {{\,\textrm{meas}\,}}\bigg \{ x \in [X , 2X] : \alpha \le \frac{ \Delta \Big ( x ; \frac{x^{\frac{1}{2}}}{L} \Big )}{x^{\frac{1}{4}} \sqrt{\frac{8}{\pi ^2} \frac{\log ^3 L}{L} }} \le \beta \bigg \} \sim \frac{1}{\sqrt{2 \pi }} \int _{t=\alpha }^{\beta } e^{-\frac{t^2}{2}} \textrm{d} t . \end{aligned}$$

That is, we have a Gaussian distribution function.

Let us now consider the divisor function over short intervals in the polynomial ring \({\mathcal {A}}:= {\mathbb {F}}_q [T]\), where q is a prime power. Before proceeding, we define \({\mathcal {M}}\) to be the set of monic polynomials in \({\mathcal {A}}\); and for \({\mathcal {B}} = {\mathcal {A}}, {\mathcal {M}}\) we define \({\mathcal {B}}_n\) and \({\mathcal {B}}_{\le n}\) to be the set of polynomials in \({\mathcal {B}}\) with degree equal to n and degree \(\le n\), respectively. It should be noted that, as \({\mathcal {A}}\) is a Euclidean domain, primality and irreducibility are equivalent. We write \(E \mid A\) to indicate that the polynomial E divides the polynomial A, and when this appears in the ranges for summations or products it is with the understanding that we are considering divisors that are monic. For non-zero \(A \in {\mathcal {A}}\), we define \(|A |:= q^{{{\,\textrm{deg}\,}}A}\), and we define \(|0 |:= 0\). It is convenient to take \({{\,\textrm{deg}\,}}0 = - \infty \), and so (unless otherwise indicated) the range \({{\,\textrm{deg}\,}}A \le n\) should be taken to include the zero polynomial. The k-th divisor function is defined for \(N \in {\mathcal {M}}\) by

$$\begin{aligned} d_k (N) := |\{ (A_1 , \ldots , A_k ) \in {\mathcal {M}}^k : A_1 \ldots A_k = N \} |. \end{aligned}$$
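This definition can be checked by brute force in a tiny case. The sketch below (over \({\mathbb {F}}_2\), with polynomials represented by coefficient tuples \((a_0, \ldots , a_n)\)) counts ordered monic factorizations directly; the expected values use the standard fact that \(d_k (P^a) = \left( {\begin{array}{c}a+k-1\\ k-1\end{array}}\right) \) for prime P:

```python
from itertools import product

q = 2  # work over F_2 for a tiny exhaustive check

def monic(deg):
    # Monic polynomials of the given degree, as coefficient tuples
    # (a_0, ..., a_{deg-1}, 1).
    return [t + (1,) for t in product(range(q), repeat=deg)]

def mul(E, F):
    # Product of two coefficient tuples over F_q.
    out = [0] * (len(E) + len(F) - 1)
    for i, e in enumerate(E):
        for j, f in enumerate(F):
            out[i + j] = (out[i + j] + e * f) % q
    return tuple(out)

def d_k(k, N):
    # Count ordered k-tuples of monic polynomials with product N.
    n = len(N) - 1
    count = 0
    for degs in product(range(n + 1), repeat=k):
        if sum(degs) != n:
            continue
        for factors in product(*(monic(l) for l in degs)):
            prod = (1,)
            for F in factors:
                prod = mul(prod, F)
            count += (prod == N)
    return count

T_sq = (0, 0, 1)            # the polynomial T^2 over F_2
assert d_k(2, T_sq) == 3    # monic divisors of T^2: 1, T, T^2
assert d_k(3, T_sq) == 6    # binom(2 + 3 - 1, 3 - 1) = 6
```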

For \(A \in {\mathcal {M}}_n\) and \(0 \le h \le n\), we define the interval

$$\begin{aligned} I (A;<h) := \{ B \in {\mathcal {M}} : {{\,\textrm{deg}\,}}(B-A) < h \} \end{aligned}$$
(3)

and define

$$\begin{aligned} {\mathcal {N}}_{d_k} (A;<h) := \sum _{B \in I(A;<h)} d_k (B) . \end{aligned}$$
(4)

Remark 1.1.1

This definition and notation should be compared to the following given in [13, 23]:

$$\begin{aligned} I (A;h) := \{ B \in {\mathcal {M}} : {{\,\textrm{deg}\,}}(B-A) \le h \} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {N}}_{d_k} (A;h) := \sum _{B \in I(A;h)} d_k (B) . \end{aligned}$$

Our definition is slightly different (\(<h\) compared to \(\le h\)), and our notation is also slightly different to reflect this and avoid confusion. Beyond this slight difference, we attempt to keep our notation consistent with previous work in the area.

The reason we use a slightly different definition is because we feel it is more natural. For example, we have \(|I (A; <h) |= q^h\) as opposed to \(|I (A;h) |= q^{h+1}\); and, for \(A \in {\mathcal {M}}_n\), we have

$$\begin{aligned} |\{ A' \in {\mathcal {M}}_n : I (A';<h) = I (A; <h) \} |= q^{h} \end{aligned}$$

as opposed to

$$\begin{aligned} |\{ A' \in {\mathcal {M}}_n : I (A';h) = I (A;h) \} |= q^{h+1}. \end{aligned}$$

In the remainder of the paper, we use only our notation, and so some of the results we reference from [13, 23] may appear slightly different from theirs, but they are in fact the same statement.

Continuing, it is not difficult to obtain an exact expression for the mean value of \({\mathcal {N}}_{d_k} (A;<h)\) (see [1] for a proof):

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} {\mathcal {N}}_{d_k} (A;<h) = q^{h} \left( {\begin{array}{c}n+k-1\\ k-1\end{array}}\right) . \end{aligned}$$
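This mean value is easy to verify exhaustively in a tiny case. The following sketch (for \(k=2\), with the illustrative parameters \(q=3\), \(n=3\), \(h=1\)) tallies \(d = d_2\) over \({\mathcal {M}}_n\) by enumerating ordered pairs of monic factors, and then averages \({\mathcal {N}}_{d} (A;<h)\):

```python
from itertools import product
from math import comb

q, n, h, k = 3, 3, 1, 2  # tiny illustrative parameters (k = 2)

def monic(deg):
    # Monic polynomials of the given degree over F_q, as coefficient
    # tuples (a_0, ..., a_{deg-1}, 1).
    return [t + (1,) for t in product(range(q), repeat=deg)]

def mul(E, F):
    # Product of two coefficient tuples over F_q.
    out = [0] * (len(E) + len(F) - 1)
    for i, e in enumerate(E):
        for j, f in enumerate(F):
            out[i + j] = (out[i + j] + e * f) % q
    return tuple(out)

# d = d_2 on M_n, tallied over ordered pairs of monic factors.
d = {}
for l in range(n + 1):
    for E in monic(l):
        for F in monic(n - l):
            A = mul(E, F)
            d[A] = d.get(A, 0) + 1

def N(A):
    # N_d(A; <h): sum of d over I(A; <h), i.e. over the B in M_n that
    # agree with A in every coefficient of index >= h.
    return sum(d[low + A[h:]] for low in product(range(q), repeat=h))

total = sum(N(A) for A in monic(n))
assert total == q**n * q**h * comb(n + k - 1, k - 1)  # mean = q^h * binom(n+k-1, k-1)
```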

We can now define

$$\begin{aligned} \Delta _k (A;<h) := {\mathcal {N}}_{d_k} (A;<h) - q^{h} \left( {\begin{array}{c}n+k-1\\ k-1\end{array}}\right) . \end{aligned}$$
(5)

It was shown by Keating et al. [23] that, as \(q \longrightarrow \infty \),

$$\begin{aligned}&\frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _k (A;<h) |^2 \\&\quad = {\left\{ \begin{array}{ll} 0 &{}\text { for }\Big \lfloor \Big ( 1 - \frac{1}{k} \Big ) n \Big \rfloor +1 \le h \le n+1, \\ O \Big ( \frac{q^h}{\sqrt{q}} \Big ) &{}\text { for }h = \Big \lfloor \Big ( 1 - \frac{1}{k} \Big ) n \Big \rfloor , \\ q^h I_k (n ; n-h-3) + O \Big ( \frac{q^h}{\sqrt{q}} \Big ) &{}\text { for } 1 \le h \le \min \Big \{ n-4 , \Big \lfloor \Big ( 1 - \frac{1}{k} \Big ) n \Big \rfloor -1 \Big \}; \end{array}\right. } \end{aligned}$$

where \(I_k (n; n-h-3)\) is an integral over a group of unitary matrices, defined by (1.27) in [23]. In particular, when \(k=2\), and \(n \ge 5\) and \(h \le \frac{n}{2} - 1\), we have

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _2 (A;<h) |^2 \sim q^h \frac{(n-2h-1)(n-2h)(n-2h+1)}{6} \end{aligned}$$
(6)

as \(q \longrightarrow \infty \). (This result follows from equation (1.34) in [23] with \(k=2\). It is also given explicitly in (1.33), although there is a slight error in the evaluation of the binomial there.) Recently, in his thesis [13, Subsection 3.2.1], Gorodetsky obtained an exact formula for the case \(k=2\):

$$\begin{aligned} \begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta (A;<h) |^2 = {\left\{ \begin{array}{ll} (q-1) q^{h-1} \frac{(n-2h-1)(n-2h)(n-2h+1)}{6} &{}\text { for }h \le \lfloor \frac{n}{2} \rfloor -1, \\ 0 &{}\text { for }h \ge \lfloor \frac{n}{2} \rfloor . \end{array}\right. } \end{aligned} \end{aligned}$$
(7)
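Formula (7) can likewise be checked exhaustively in a small case. The sketch below (with the illustrative parameters \(q=3\), \(n=4\), \(h=1\), so that \(h \le \lfloor \frac{n}{2} \rfloor - 1\)) computes the variance by brute force and compares it with the right-hand side of (7):

```python
from itertools import product

q, n, h = 3, 4, 1  # n >= 4 and 1 <= h <= floor(n/2) - 1, as in (7)

def monic(deg):
    # Monic polynomials of the given degree over F_q, as coefficient
    # tuples (a_0, ..., a_{deg-1}, 1).
    return [t + (1,) for t in product(range(q), repeat=deg)]

def mul(E, F):
    out = [0] * (len(E) + len(F) - 1)
    for i, e in enumerate(E):
        for j, f in enumerate(F):
            out[i + j] = (out[i + j] + e * f) % q
    return tuple(out)

# d = d_2 on M_n, tallied over ordered pairs of monic factors.
d = {}
for l in range(n + 1):
    for E in monic(l):
        for F in monic(n - l):
            A = mul(E, F)
            d[A] = d.get(A, 0) + 1

mean = q**h * (n + 1)  # the mean of N_d(A; <h) over M_n
variance_times_qn = sum(
    (sum(d[low + A[h:]] for low in product(range(q), repeat=h)) - mean) ** 2
    for A in monic(n)
)
# Right-hand side of (7); note h >= 1, so q**(h-1) is an integer.
rhs = (q - 1) * q**(h - 1) * (n - 2*h - 1) * (n - 2*h) * (n - 2*h + 1) // 6
assert variance_times_qn == q**n * rhs
```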

Let us now turn our attention to divisor correlations, and consider the classical case first. The most common example is

$$\begin{aligned} \sum _{n \le x} d(n) d(n+h) , \end{aligned}$$

where h is a fixed positive integer, and we are interested in the limit as \(x \longrightarrow \infty \). Ingham [19] showed that

$$\begin{aligned} \sum _{n \le x} d(n) d(n+h) \sim \frac{1}{\zeta (2) } \sigma _{-1} (h) x (\log x)^2 , \end{aligned}$$

where \(\sigma _{t} (h):= \sum _{a \mid h} a^t\). Estermann [11] later proved that there exist constants \(a_{1,h}\) and \(a_{0,h}\) (dependent on h) such that for all \(\epsilon > 0\) we have

$$\begin{aligned} \sum _{n \le x} d(n) d(n+h) = \frac{1}{\zeta (2) } \sigma _{-1} (h) x (\log x)^2 + a_{1,h} x \log x + a_{0,h} x + O_{\epsilon } \big ( x^{\frac{11}{12}} (\log x)^{\frac{17}{6} + \epsilon } \big ) . \end{aligned}$$

Heath-Brown [14] subsequently showed that, given \(h \le x^{\frac{5}{6}}\) (and uniformly over this range), we can improve the error term above to \(O_{\epsilon } (x^{\frac{5}{6} + \epsilon })\). The importance of these results lies in their application to the fourth moment of the Riemann zeta-function on the critical line [14].

The analogous problem for higher divisor functions, namely

$$\begin{aligned} \sum _{n \le x} d_k (n) d_k (n+h) \end{aligned}$$

for \(k \ge 3\), is also of great importance, specifically in the application to higher moments of the Riemann zeta-function (see the work of Ivić [20], Conrey and Gonek [3], and the five papers by Conrey and Keating [4,5,6,7,8]). It is conjectured (see equation (1.6) of [20]) that

$$\begin{aligned} \sum _{n \le x} d_k (n) d_k (n+h) = x P_{2k-2} (\log x ;h) + \Delta (x;h) , \end{aligned}$$

where \(P_{2k-2} (\log x;h)\) is a polynomial in \(\log x\) of degree \(2k-2\) with coefficients dependent on h, and we expect the error term to satisfy \(\Delta (x;h) = o (x)\) as \(x \longrightarrow \infty \).

Another example of divisor correlations is

$$\begin{aligned} \sum _l \sum _n d(lk+n) d(n) , \end{aligned}$$
(8)

where k is fixed, l ranges over a certain interval, and n ranges over integers of a certain interval that also satisfy \((n,k)=1\). This appears in the off-diagonal terms for the fourth moment of Dirichlet L-functions as can be seen in [15, 29] (with the function field analogue appearing in [2]).

In the function field setting, Andrade, Bary-Soroker, and Rudnick [1] proved

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} d_k (A) d_k (A+B) = \left( {\begin{array}{c}n+k-1\\ k-1\end{array}}\right) ^2 + O \big ( q^{-\frac{1}{2}} \big ) , \end{aligned}$$

uniformly over all \(B \in {\mathcal {A}} \backslash \{ 0 \}\) with \({{\,\textrm{deg}\,}}B \le n\), and as \(q \longrightarrow \infty \) (recall q is the order of the finite field \({\mathbb {F}}_q\) and \({\mathcal {A}}:= {\mathbb {F}}_q [T]\)). Gorodetsky [13, Lemma 3.3] improves on the case \(k=2\) by showing that

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} d (A) d (A+B) = (n+1)^2 + \sum _{i=1}^{\lfloor \frac{n}{2} \rfloor } \frac{(n-2i+1)^2}{q^i} \Big ( d (B;i) - d (B;i-1) \Big ) , \end{aligned}$$
(9)

where \(d (B;i)\) is the number of monic divisors of B of degree i.
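Formula (9) can be verified by brute force in a tiny case. The sketch below (with the illustrative parameters \(q=3\), \(n=2\)) checks it for every non-zero B of degree \(< n\) (a restriction we impose so that \(A+B\) is guaranteed to remain monic of degree n), computing \(d(B;i)\) by trial division:

```python
from itertools import product
from fractions import Fraction

q, n = 3, 2  # tiny illustrative parameters

def monic(deg):
    return [t + (1,) for t in product(range(q), repeat=deg)]

def mul(E, F):
    out = [0] * (len(E) + len(F) - 1)
    for i, e in enumerate(E):
        for j, f in enumerate(F):
            out[i + j] = (out[i + j] + e * f) % q
    return tuple(out)

def divides(E, B):
    # Does the monic polynomial E divide B over F_q? (Trial division.)
    R = list(B)
    dE = len(E) - 1
    for i in range(len(R) - 1, dE - 1, -1):
        c = R[i]
        if c:
            for j in range(dE + 1):
                R[i - dE + j] = (R[i - dE + j] - c * E[j]) % q
    return not any(R)

# d = d_2 on M_n, tallied over ordered pairs of monic factors.
d = {}
for l in range(n + 1):
    for E in monic(l):
        for F in monic(n - l):
            A = mul(E, F)
            d[A] = d.get(A, 0) + 1

def add(A, B):
    B = B + (0,) * (len(A) - len(B))
    return tuple((a + b) % q for a, b in zip(A, B))

# Check (9) for every non-zero B with deg B < n.
for B in product(range(q), repeat=n):
    if not any(B):
        continue
    lhs = Fraction(sum(d[A] * d[add(A, B)] for A in monic(n)), q**n)
    dB = [sum(1 for E in monic(i) if divides(E, B)) for i in range(n // 2 + 1)]
    rhs = (n + 1)**2 + sum(
        Fraction((n - 2*i + 1)**2, q**i) * (dB[i] - dB[i - 1])
        for i in range(1, n // 2 + 1)
    )
    assert lhs == rhs
```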

Finally, we mention another generalization of the divisor function d. Let \(z\in {\mathbb {C}}\), and define \(\sigma _{z}\) on the set of positive integers by

$$\begin{aligned} \sigma _{z} (n) := \sum _{a \mid n} a^{z} . \end{aligned}$$

The function field analogue is defined similarly: For \(A \in {\mathcal {M}}\),

$$\begin{aligned} \sigma _{z} (A) := \sum _{E \mid A} |E |^{z} , \end{aligned}$$

where we remind the reader that the sum is over monic divisors. The case \(z=0\) gives back the standard divisor function d. In function fields, as a function of z, \(\sigma _{z} (A)\) is periodic with respect to the imaginary part with period \(\frac{2 \pi i}{\log q}\). As far as we are aware, the variance and correlations of \(\sigma _{z}\) have not been evaluated. We will consider these in the function field setting, and so we will need to establish some notation. As with d, we define

$$\begin{aligned} {\mathcal {N}}_{\sigma _{z}} (A;<h) := \sum _{B \in I(A;<h)} \sigma _{z} (B) . \end{aligned}$$

For \(z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have already established the average

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} {\mathcal {N}}_{\sigma _{z}} (A;<h) = \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} {\mathcal {N}}_{d} (A;<h) = q^{h} (n+1) . \end{aligned}$$

For \(z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}\), Lemma 3.0.2 shows

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} {\mathcal {N}}_{\sigma _{z}} (A;<h) = q^h \Big ( \frac{q^{(n+1)z} -1}{q^z -1} \Big ) . \end{aligned}$$

(Note that \(\frac{q^{(n+1)z} -1}{q^z -1}\) tends to \(n+1\) as z tends to any point in \(\frac{2 \pi i}{\log q} {\mathbb {Z}}\), and so this average is continuous with respect to z). We now define

$$\begin{aligned} \Delta _{\sigma _{z}} (A;<h) := {\mathcal {N}}_{\sigma _{z}} (A;<h) - {\left\{ \begin{array}{ll} q^{h} (n+1) &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ q^h \Big ( \frac{q^{(n+1)z} -1}{q^z -1} \Big ) &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}. \end{array}\right. } \end{aligned}$$
(10)

1.2 Statement of results

We obtain an exact formula for the variance of the divisor function d over intervals in \({\mathbb {F}}_q [T]\), where q is a prime power. We also provide the generalization to \(\sigma _z\) for any \(z\in {\mathbb {C}}\).

Theorem 1.2.1

Let \(n \ge 4\) and \(n_1:= \lfloor \frac{n+2}{2} \rfloor \). For \(z\in \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have

$$\begin{aligned}&\frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta (A;<h) |^2 \\&\quad = \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _{\sigma _{z}} (A;<h) |^2\\&\quad = {\left\{ \begin{array}{ll} (q-1) q^{h-1} \frac{(n-2h-1)(n-2h)(n-2h+1)}{6} &{}\text { for }h \le n_1 -2, \\ 0 &{}\text { for }n_1 -1 \le h \le n. \end{array}\right. } \end{aligned}$$

While, for \(z\in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have

$$\begin{aligned}&\frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _{\sigma _{z}} (A;<h) |^2\\&\quad = {\left\{ \begin{array}{ll} (q-1) q^{h-1} \sum _{r=h+1}^{n_1 -1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 &{}\text { for }h \le n_1 -2, \\ 0 &{}\text { for }n_1 -1 \le h \le n. \end{array}\right. } \end{aligned}$$

Note that these variances are continuous with respect to z, since

$$\begin{aligned} \lim _{z \longrightarrow 0} \sum _{r=h+1}^{n_1 -1} \left( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \right) ^2&= \sum _{r=h+1}^{n_1 -1} (n+1-2r)^2\\&= \frac{(n-2h-1)(n-2h)(n-2h+1)}{6} . \end{aligned}$$
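The summation identity in the last step is elementary (the product of three consecutive integers is divisible by 6) and can be confirmed by a quick exhaustive check:

```python
def closed_form(n, h):
    # (n-2h-1)(n-2h)(n-2h+1) is a product of three consecutive integers,
    # hence divisible by 6, so integer division is exact.
    return (n - 2*h - 1) * (n - 2*h) * (n - 2*h + 1) // 6

def direct_sum(n, h):
    n1 = (n + 2) // 2  # n_1 = floor((n+2)/2), as in Theorem 1.2.1
    return sum((n + 1 - 2*r)**2 for r in range(h + 1, n1))

for n in range(4, 25):
    for h in range((n + 2) // 2 - 1):  # h <= n_1 - 2
        assert direct_sum(n, h) == closed_form(n, h)
```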

To prove Theorem 1.2.1, we use the orthogonality relation of a non-trivial additive character on \({\mathbb {F}}_q\) to express the problem in terms of Hankel matrices over \({\mathbb {F}}_q\). Most of this article consists of results on Hankel matrices over finite fields, which can be found in Sect. 2. Once these results are established, the proof of Theorem 1.2.1, in Sect. 3, is relatively short.

In Sect. 4, a very slight adaptation of our proof of Theorem 1.2.1 allows us to prove the following result on divisor correlations:

Theorem 1.2.2

Let \(n \ge 4\) and \(n_1:= \lfloor \frac{n+2}{2} \rfloor \). For \(z\in \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have

$$\begin{aligned}&\frac{1}{q^{h+n}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } d (A) d (A+B) \\&\quad = \frac{1}{q^{h+n}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } \sigma _{z} (A) \sigma _{z} (A+B) \\&\quad = {\left\{ \begin{array}{ll} (n+1)^2 + (1 - q^{-1}) (q^{-h}) \frac{(n-2h-1) (n-2h) (n-2h+1)}{6} &{}\text { for }h \le n_1 -2, \\ (n+1)^2 &{}\text { for }n_1 -1 \le h \le n. \end{array}\right. } \end{aligned}$$

While, for \(z\in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have

$$\begin{aligned}&\frac{1}{q^{h+n}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } \sigma _{z} (A) \sigma _{z} (A+B) \\&\quad = {\left\{ \begin{array}{ll} \Big ( \frac{q^{(n+1)z} - 1}{q^z -1} \Big )^2 + (1 - q^{-1}) (q^{-h}) \sum _{r=h+1}^{n_1 -1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 &{}\text { for }h \le n_1 -2, \\ \Big ( \frac{q^{(n+1)z} - 1}{q^z -1} \Big )^2 &{}\text { for }n_1 -1 \le h \le n. \end{array}\right. } \end{aligned}$$

This result, and its proof, allow us to clearly see the relationship between divisor variance and correlations.

Theorem 1.2.3

Let \(Q \in {\mathcal {M}}\) be prime, and let \(n,k\) be such that \(0 \le n \le {{\,\textrm{deg}\,}}Q - 1 \le k\). Then,

$$\begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) \\&\quad = \bigg ( \frac{1}{q^n } \sum _{N \in {\mathcal {M}}_n} d(N) \bigg ) \bigg ( \frac{1}{q^{k + n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) \bigg ) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) (n+1) - q^{-{{\,\textrm{deg}\,}}Q} (k - {{\,\textrm{deg}\,}}Q -1) (n+1) . \end{aligned}$$

This is the function field analogue of (8), although we are considering the special case where Q is prime. However, if we wish to apply this to the fourth moment of Dirichlet L-functions in function fields, then we would require the restriction \(k < {{\,\textrm{deg}\,}}Q -1\) instead of \(k \ge {{\,\textrm{deg}\,}}Q -1\), which is more difficult. We discuss this further in Sect. 1.3 and Remark 4.0.1. Nonetheless, Theorem 1.2.3 is an interesting result, not only because it is exact, but also because it shows that \(d (KQ+N)\) and d(N) are uncorrelated over the given ranges of K and N.

Theorem 1.2.3 can be extended to \(\sigma _z\), but we omit this because the main interest in these kinds of correlations lies in the application to moments of Dirichlet L-functions, which concerns \(d, d_3, d_4, \ldots \) and not \(\sigma _z\). Theorems 1.2.1 and 1.2.2, on the other hand, are natural to study for any arithmetic function, which is why those results have been generalized to \(\sigma _z\).

The cases \(z=0\) of Theorems 1.2.1 and 1.2.2 have also been obtained by Gorodetsky [13] independently from this author using a different approach.

Our approach here allows for further results such as Theorem 1.2.3, and it can be used to translate other divisor sum problems into linear algebraic problems involving Hankel matrices. In Sect. 1.4 we discuss this for problems such as moments higher than the variance over short intervals, analogues for \(d_k\), and more.

Furthermore, in a related paper [33], we consider the variance of a restricted sum-of-squares function over short intervals. Again, the problem is translated to Hankel matrices, and by developing a deeper understanding of these matrices we are able to obtain an exact formula for the variance. Moreover, in upcoming work, we use our approach to calculate the variance of lattice points in thin elliptic annuli (in function fields), which is equivalent to the variance over short intervals of the arithmetic function that counts representations of the form \(UE^2 + VF^2\), where \(U, V\) are fixed and \(E, F\) can vary. This is of particular interest because, unlike the standard sum-of-squares function or the divisor function, the number of representations of the form \(UE^2 + VF^2\) is not multiplicative for general \(U, V\). Consequently, we cannot associate an L-function, and so the classical techniques that have been used for the standard sum-of-squares function and the divisor function do not apply.

The approach of Hankel matrices that we introduce in this paper provides a new framework for the kind of problems described above, and the results we develop for these matrices in Sect. 2 set the foundation for this. Indeed, let us briefly discuss our results on Hankel matrices and our use of additive characters.

An additive character \(\psi \) on \({\mathbb {F}}_q\) is a function from \({\mathbb {F}}_q\) to \({\mathbb {C}}^*\) satisfying \(\psi (a+b) = \psi (a) \psi (b)\) (note this implies \(\psi (0) =1\) and \(\psi (-a) = \psi (a)^{-1}\) for all \(a \in {\mathbb {F}}_q\)). We say \(\psi \) is non-trivial if there exists \(a \in {\mathbb {F}}_q^*\) such that \(\psi (a) \ne 1\), and in this case we have the orthogonality relation

$$\begin{aligned} \frac{1}{q} \sum _{\alpha \in {\mathbb {F}}_q } \psi (\alpha b) = {\left\{ \begin{array}{ll} 1 &{}\text {if }b=0, \\ 0 &{}\text {if }b \in {\mathbb {F}}_q^*. \end{array}\right. } \end{aligned}$$
(11)

The first case follows from the fact that if \(b=0\), then \(\psi (\alpha b) = 1\) for all \(\alpha \in {\mathbb {F}}_q\). The second case follows from the fact that if \(b \in {\mathbb {F}}_q^*\), then \(\alpha b\) and \(\alpha b + a\) both vary over \({\mathbb {F}}_q\) as \(\alpha \) varies over \({\mathbb {F}}_q\), and so

$$\begin{aligned} \sum _{\alpha \in {\mathbb {F}}_q } \psi (\alpha b) = \sum _{\alpha \in {\mathbb {F}}_q } \psi (\alpha b + a) = \psi (a) \sum _{\alpha \in {\mathbb {F}}_q } \psi (\alpha b). \end{aligned}$$

Since \(\psi (a) \ne 1\), we deduce that \(\sum _{\alpha \in {\mathbb {F}}_q } \psi (\alpha b) =0\). In the remainder of this article, \(\psi \) is a non-trivial character on \({\mathbb {F}}_q\), and we will make significant use of (11).
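For concreteness, when \(q = p\) is prime one may take \(\psi (a) = e^{2 \pi i a / p}\) (for prime powers one composes such a character with the absolute trace to \({\mathbb {F}}_p\)); the orthogonality relation (11) is then easy to verify numerically:

```python
import cmath

p = 7  # q = p prime; psi(a) = e^{2*pi*i*a/p} is a non-trivial additive character

def psi(a):
    return cmath.exp(2j * cmath.pi * (a % p) / p)

# psi is additive: psi(a + b) = psi(a) * psi(b) on F_p.
for a in range(p):
    for b in range(p):
        assert abs(psi(a + b) - psi(a) * psi(b)) < 1e-9

# The orthogonality relation (11).
for b in range(p):
    s = sum(psi(alpha * b) for alpha in range(p)) / p
    target = 1 if b % p == 0 else 0
    assert abs(s - target) < 1e-9
```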

An \(l \times m\) Hankel matrix over \({\mathbb {F}}_q\) is a matrix of the form

$$\begin{aligned} \begin{pmatrix} \alpha _0 &{} \alpha _1 &{} \cdots &{} \alpha _{m-1} \\ \alpha _1 &{} \alpha _2 &{} \cdots &{} \alpha _{m} \\ \vdots &{} \vdots &{} &{} \vdots \\ \alpha _{l-1} &{} \alpha _{l} &{} \cdots &{} \alpha _{l+m-2} \end{pmatrix} , \end{aligned}$$

where \(\alpha _0, \ldots , \alpha _{l+m-2} \in {\mathbb {F}}_q\). It is natural to consider the sequence \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _{l+m-2}) \in {\mathbb {F}}_q^{l+m-1}\) that is associated to the matrix above, and it will be convenient to denote the matrix by \(H_{l,m} (\varvec{\alpha })\).
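A sketch of this construction, together with an exhaustive check of the rank–nullity fact used repeatedly below (an \(l \times m\) matrix of rank r over \({\mathbb {F}}_q\) has exactly \(q^{m-r}\) vectors in its kernel); the helper names are our own:

```python
from itertools import product
import random

q = 3  # a prime, so that Z/qZ is the field F_q
random.seed(1)

def hankel(l, m, alpha):
    # H_{l,m}(alpha): the (i, j) entry is alpha_{i+j} (0-indexed), so the
    # matrix is constant along skew-diagonals.
    assert len(alpha) == l + m - 1
    return [[alpha[i + j] for j in range(m)] for i in range(l)]

def rank_mod_q(M):
    # Rank over F_q via Gaussian elimination (q prime; inverses by Fermat).
    M = [row[:] for row in M]
    rank = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rank, len(M)) if M[r][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], q - 2, q)
        M[rank] = [x * inv % q for x in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][c]:
                M[r] = [(x - M[r][c] * y) % q for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

# Rank-nullity check: a rank-r matrix has q^(m - r) kernel vectors.
for _ in range(50):
    l, m = random.randint(1, 4), random.randint(1, 4)
    alpha = [random.randrange(q) for _ in range(l + m - 1)]
    H = hankel(l, m, alpha)
    kernel_size = sum(
        1 for f in product(range(q), repeat=m)
        if all(sum(H[i][j] * f[j] for j in range(m)) % q == 0 for i in range(l))
    )
    assert kernel_size == q ** (m - rank_mod_q(H))
```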

Theorem 2.3.1 gives the number of Hankel matrices of a given size and rank. Theorems 2.4.4 and 2.4.9 demonstrate the kernel structure of Hankel matrices, and we see how function field arithmetic is incorporated in these matrices. To see this, we must view the coefficients of a polynomial as the entries in a vector, and vice versa. For example, the polynomial \(a_0 + a_1 T + \ldots + a_n T^n\) should be associated to the vector \((a_0, a_1, \ldots , a_n )^T\). We prove that, generally, for a Hankel matrix H, there are polynomials \(A_1, A_2\) such that the kernel of H consists exactly of the polynomials

$$\begin{aligned} B_1 A_1 + B_2 A_2 \end{aligned}$$

where \(B_1, B_2 \in {\mathcal {A}}\) are any polynomials satisfying a certain bound on their degrees. The polynomials \(A_1, A_2\) are called the characteristic polynomials of H, not to be confused with the characteristic polynomial of a square matrix. In Theorem 2.4.10 and Corollary 2.4.11, we show that if \(H'\) is a top-left submatrix of H, then the characteristic polynomials associated to \(H'\) are the same as the polynomials that we obtain after applying a certain number of steps of the Euclidean algorithm to \(A_1, A_2\); the exact number of steps is related to the size of the submatrix \(H'\). Theorem 2.4.13 demonstrates how the kernel structure of a Hankel matrix changes if we extend it by a single row or column. We prove various other results as well.

1.3 Motivation

We will briefly give an explanation of how additive characters can allow us to express sums of the divisor function in \({\mathbb {F}}_q [T]\) in terms of Hankel matrices. This is best achieved by considering the sum

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2 . \end{aligned}$$

We have that

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2 = \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \\ EF = A \end{array}} 1 \bigg )^2 = \sum _{A \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \\ EF = A \end{array}} 1 \bigg )^2 . \end{aligned}$$

We note that the conditions on E and F force A to be monic and of degree n, and that is why for the last equality we were able to replace the condition \(A \in {\mathcal {M}}_n\) with \(A \in {\mathcal {A}}_{\le n}\). Now, let us write \(a_i, e_i, f_i\) for the i-th coefficients of \(A, E, F\), respectively. We also write \(\{ EF \}_i\) for the i-th coefficient of EF when we do not yet wish to express the coefficients of EF in terms of the \(e_i\) and \(f_i\). We have

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2 =&\sum _{A \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \end{array}} \prod _{k=0}^{n} \mathbb {1}_{\{ EF \}_k = a_k } \bigg )^2 \\ =&\frac{1}{q^{2n+2}} \sum _{A \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \end{array}} \prod _{k=0}^{n} \sum _{\alpha _k \in {\mathbb {F}}_q} \psi \Big ( \alpha _k \big ( \sum _{\begin{array}{c} i,j \ge 0 \\ i+j=k \end{array}} e_i f_j \hspace{1em} - a_k \big ) \Big ) \bigg )^2 . \end{aligned}$$

Here, for a proposition \({\textbf{P}}\), we define \(\mathbb {1}_{{\textbf{P}}}\) to be 1 if the proposition is true, and 0 if false. For the last equality, we used (11) with \(b = \{ EF \}_k - a_k\). We will now collect all of the terms involving \(\psi \). To do this, we note that

$$\begin{aligned} \sum _{k=0}^{n} \alpha _k \sum _{\begin{array}{c} i,j \ge 0 \\ i+j=k \end{array}} e_i f_j = {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} , \end{aligned}$$

where

$$\begin{aligned} {\textbf{e}} := (e_0 , \ldots , e_{l-1} , 1)^T \in {\mathbb {F}}_q^{l} \times \{ 1 \} \quad \text { and } \quad {\textbf{f}} := (f_0 , \ldots , f_{m-1} , 1)^T \in {\mathbb {F}}_q^{m} \times \{ 1 \} . \end{aligned}$$

Thus, we have

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2&= \frac{1}{q^{2n+2}} \sum _{A \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^l \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^m \times \{ 1 \} \end{array}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - \sum _{k=0}^{n} \alpha _k a_k \Big ) \bigg )^2 \\ =&\frac{1}{q^{2n+2}} \sum _{A \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^l \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^m \times \{ 1 \} \end{array}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - \sum _{k=0}^{n} \alpha _k a_k \Big ) \bigg ) \\&\quad \times \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' =n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \sum _{\varvec{\beta } \in {\mathbb {F}}_q^{n+1} } \psi \Big ( {\textbf{g}}^T H_{l' +1 , m' +1} (\varvec{\beta }) {\textbf{h}} - \sum _{k=0}^{n} \beta _k a_k \Big ) \bigg ) . \end{aligned}$$
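The pairing \({\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}}\) used to collect the terms here is just the coefficient convolution \(\sum _k \alpha _k \sum _{i+j=k} e_i f_j\) in matrix form; the identity holds for arbitrary vectors (the monic condition only pins the last entries of \({\textbf{e}}, {\textbf{f}}\) to 1), and can be checked directly:

```python
import random

q = 5
random.seed(0)

def hankel(l, m, alpha):
    # H_{l,m}(alpha): the (i, j) entry is alpha_{i+j} (0-indexed).
    return [[alpha[i + j] for j in range(m)] for i in range(l)]

for _ in range(200):
    l = random.randint(1, 5)
    m = random.randint(1, 5)
    alpha = [random.randrange(q) for _ in range(l + m - 1)]
    e = [random.randrange(q) for _ in range(l)]
    f = [random.randrange(q) for _ in range(m)]
    H = hankel(l, m, alpha)
    # e^T H f ...
    bilinear = sum(e[i] * H[i][j] * f[j] for i in range(l) for j in range(m)) % q
    # ... equals sum_k alpha_k sum_{i+j=k} e_i f_j, the coefficient
    # pairing appearing in the exponent of psi above.
    convolution = sum(
        alpha[k] * sum(e[i] * f[k - i]
                       for i in range(max(0, k - m + 1), min(l, k + 1)))
        for k in range(l + m - 1)
    ) % q
    assert bilinear == convolution
```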

Let us now consider only the terms involving a given \(a_k\) in the sum above. Since \(A \in {\mathcal {A}}_{\le n}\), we can see that \(a_k\) ranges over \({\mathbb {F}}_q\). We have

$$\begin{aligned} \frac{1}{q} \sum _{a_k \in {\mathbb {F}}_q } \psi \Big ( - (\alpha _k + \beta _k) a_k \Big ) = {\left\{ \begin{array}{ll} 1 &{}\text { if }\alpha _k = -\beta _k, \\ 0 &{}\text { if }\alpha _k \ne - \beta _k, \end{array}\right. } \end{aligned}$$

where we have used (11) with \(b = \alpha _k + \beta _k\). Essentially, by considering all \(k= 0, 1, \ldots , n\) this means \(\varvec{\alpha } = - \varvec{\beta }\), and we have effectively removed the sum over A. For simplicity, we will take \(\varvec{\alpha } = \varvec{\beta }\). So, we have

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2 =&\frac{1}{q^{n+1}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^l \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^m \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \bigg )^2 . \end{aligned}$$

Now, let us consider the sum over a given \(e_i\):

$$\begin{aligned} \frac{1}{q} \sum _{e_i \in {\mathbb {F}}_q } \psi \big ( e_i R_{i+1} {\textbf{f}} \big ) \end{aligned}$$

where \(R_{i+1}\) is the \((i+1)\)-th row of \(H_{l+1, m+1} (\varvec{\alpha })\) (the row indexing begins at 1, not 0, and that is why we have \(R_{i+1}\) and not \(R_i\)). By using (11) again, with \(b = R_{i+1} {\textbf{f}}\), we can see that the sum over \(e_i\) will give a non-zero contribution only when \(R_{i+1} {\textbf{f}} = 0\), and this non-zero contribution will be 1. Applying this to all \(i = 0, 1, \ldots , l\), we see that a non-zero contribution occurs only when

$$\begin{aligned} H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} = {\textbf{0}}. \end{aligned}$$
(12)

That is, when \({\textbf{f}}\) is in the kernel of \(H_{l+1, m+1} (\varvec{\alpha })\).

Technically, this is not quite true as the last entry of \({\textbf{e}}\) is 1, and so it cannot take any value in \({\mathbb {F}}_q\). Ultimately, this actually limits the \(\varvec{\alpha }\) that we can take, and so simplifies our final calculations. However, for now, let us assume that the last entry of \({\textbf{e}}\) can take any value in \({\mathbb {F}}_q\).

Continuing, noting that the number of \({\textbf{f}}\) in the kernel of \(H_{l+1, m+1} (\varvec{\alpha })\) is \(q^{m+1 - {{\,\textrm{rank}\,}}H_{l+1, m+1} (\varvec{\alpha })}\), we have

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} d(A)^2 \approx&q \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} q^{- {{\,\textrm{rank}\,}}H_{l+1 , m+1} (\varvec{\alpha })} \bigg )^2 . \end{aligned}$$

So, we can now see how using additive characters allows us to express the sum \(\sum _{A \in {\mathcal {M}}_n} d(A)^2 \) in terms of Hankel matrices in a concise manner, and how knowing the exact number of Hankel matrices of a given rank and size will allow us to obtain an exact evaluation of the original divisor sum.

Now, Theorem 1.2.1 is concerned with the variance of the divisor function. That is, it is concerned with the sum

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _2 (A;<h) |^2 =&\frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} \Big |{\mathcal {N}}_{d_2} (A;<h) - q^{h} (n+1) \Big |^2 \\ =&\frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A) < h \end{array}} d(B) \bigg )^2 \hspace{1em} - q^{2h} (n+1)^2 , \end{aligned}$$

and it suffices to consider

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A) < h \end{array}} d(B) \bigg )^2 \end{aligned}$$

which is similar to the sum \(\sum _{A \in {\mathcal {M}}_n} d(A)^2 \) that we worked with earlier. Here, the sum over A appears outside the squared parentheses, while the sum over B appears within. We can proceed similarly to before: the sum over A will force \(\alpha _k + \beta _k = 0\) for \(k = h, h+1, \ldots , n\), while the sum over B will force \(\alpha _k = 0\) and \(\beta _k = 0\) for \(k=0, 1, \ldots , h-1\). Ultimately, this means we will need to understand how many Hankel matrices there are of a given size and rank whose first h skew-diagonals are 0.

Note that if the first h entries of \(\varvec{\alpha }\) are zero, and \(h \ge \frac{n}{2}\), then the matrix \(H_{\frac{n}{2} +1, \frac{n}{2} +1} (\varvec{\alpha })\) is lower skew-triangular (for simplicity we are assuming n is even, so that \(\frac{n}{2}\) is an integer). We can easily determine the rank of such a matrix by finding the first non-zero skew-diagonal; and, as we will see later, we can use that to easily determine the rank of all \(H_{l, m} (\varvec{\alpha })\) with \(l+m-2 = n\). On the other hand, if \(h < \frac{n}{2}\), then it is more difficult to determine the rank of the matrix \(H_{\frac{n}{2} +1, \frac{n}{2} +1} (\varvec{\alpha })\), which is no longer necessarily lower skew-triangular. This demonstrates why it is easier to work with large intervals (\(h \ge \frac{n}{2}\)) than short intervals (\(h < \frac{n}{2}\)). Note that the condition \(h < \frac{n}{2}\) is equivalent to \(q^h < (q^n)^{\frac{1}{2}}\), which is analogous to the classical \(H < x^{\frac{1}{2}}\) (see (2) and the paragraph below that).
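The rank computation in the lower skew-triangular case can be illustrated computationally (a sketch with hypothetical parameters of our own choosing, over \({\mathbb {F}}_5\)): if the first non-zero entry of \(\varvec{\alpha }\) is \(\alpha _k\) with \(k \ge \frac{n}{2}\), then the square matrix \(H_{\frac{n}{2}+1, \frac{n}{2}+1}(\varvec{\alpha })\) has rank \(n+1-k\), since the non-zero rows have their leading entries in distinct columns.

```python
def rank_mod_p(M, p):
    """Rank of an integer matrix over F_p (p prime), by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse in F_p
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

q, n = 5, 6                       # F_5 and an even n, both chosen for illustration
alpha = [0, 0, 0, 0, 2, 1, 3]     # first non-zero entry at index k = 4 >= n/2
k = next(i for i, a in enumerate(alpha) if a != 0)

n1 = n // 2 + 1
H = [[alpha[i + j] for j in range(n1)] for i in range(n1)]  # lower skew-triangular

# the rank is read off from the first non-zero skew-diagonal
assert rank_mod_p(H, q) == n + 1 - k                        # here: 3
```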

1.4 Extensions

The first extension that we consider is the analogue of Theorem 1.2.1 for the k-th divisor function, \(d_k\). The approach is similar, but we will ultimately be working with Hankel tensors instead of Hankel matrices. A matrix is a two-dimensional array, which appeared because we were working with the standard divisor function, \(d=d_2\). When working with higher divisor functions, we will work with higher-dimensional arrays (i.e., tensors). Consider the case \(k=3\). We will have tensors of the form

$$\begin{aligned} (\alpha _{i+j+l-3})_{\begin{array}{c} 1 \le i \le i_1 \\ 1 \le j \le j_1 \\ 1 \le l \le l_1 \end{array}} \end{aligned}$$

and we will need to determine how many \({\textbf{f}} = (f_1, \ldots , f_{j_1})^T\) and \({\textbf{g}} = (g_1, \ldots , g_{l_1})^T\) there are such that

$$\begin{aligned} \sum _{j=1}^{j_1} \sum _{l=1}^{l_1} \alpha _{i+j+l-3} f_j g_l = 0 \end{aligned}$$

for all \(1 \le i \le i_1\). This is analogous to (12).
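For small parameters, the number of such pairs \(({\textbf{f}}, {\textbf{g}})\) can be counted by brute force. The sketch below (with \(q=2\), \(i_1=j_1=l_1=2\), and an arbitrary \(\varvec{\alpha }\) of our own choosing; indices are shifted to start at 0) simply enumerates all pairs and tests the displayed system.

```python
from itertools import product

q = 2                      # F_2, chosen small so that brute force is instant
i1 = j1 = l1 = 2           # tensor dimensions, a hypothetical toy case
alpha = [1, 0, 1, 1]       # alpha_0, ..., alpha_{i1 + j1 + l1 - 3}, arbitrary

def satisfies(f, g):
    """Check the displayed system, with all indices shifted to start at 0."""
    return all(
        sum(alpha[i + j + l] * f[j] * g[l]
            for j in range(j1) for l in range(l1)) % q == 0
        for i in range(i1)
    )

# enumerate all pairs (f, g) in F_q^{j1} x F_q^{l1}
count = sum(satisfies(f, g)
            for f in product(range(q), repeat=j1)
            for g in product(range(q), repeat=l1))
```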

Let us now consider another extension. In Theorem 1.2.1, we consider the variance of the divisor function, which is essentially the second moment. We could consider higher moments, such as the third, and if we are aiming to obtain an exact evaluation then it would be sufficient to obtain an exact formula for

$$\begin{aligned} \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A) < h \end{array}} d(B) \bigg )^3 . \end{aligned}$$

With the variance, we needed to determine how many \(\varvec{\alpha }, \varvec{\beta }\) there are such that \(H_{l+1, m+1}(\varvec{\alpha })\) and \(H_{l+1, m+1}(\varvec{\beta })\) have certain given ranks, and \(\varvec{\alpha } + \varvec{\beta } = \varvec{0}\). Of course, the last condition means the matrices are essentially identical, making the problem slightly simpler. However, for the third moment, we will need to determine how many \(\varvec{\alpha }, \varvec{\beta }, \varvec{\gamma }\) there are such that \(H_{l+1, m+1}(\varvec{\alpha })\), \(H_{l+1, m+1}(\varvec{\beta })\), and \(H_{l+1, m+1}(\varvec{\gamma })\) have certain given ranks, and \(\varvec{\alpha } + \varvec{\beta } + \varvec{\gamma } = \varvec{0}\). This last condition makes the problem more difficult. We do indicate in Remark 2.4.16 how we can reduce this problem to special cases of \(\varvec{\alpha }, \varvec{\beta }, \varvec{\gamma }\). We are effectively taking two Hankel matrices for which we understand their rank, and we wish to understand the rank of their sum. Related problems have been considered for Toeplitz-plus-Hankel matrices over the complex numbers [17, 28] (compared to Hankel-plus-Hankel matrices, which is what we are interested in).

If we can obtain a result such as Theorem 1.2.3, or at least a strong approximation, for the case \(k < {{\,\textrm{deg}\,}}Q -1\), then this would allow us to obtain lower order terms in the asymptotic expansion of the fourth moment of Dirichlet L-functions in function fields (the average would be over characters of prime modulus, and this would be analogous to Young’s results [34]). We discuss what is required for this in Remark 4.0.1, after having established the necessary results and notation in Sect. 2. For now, we can give the following indication: We must understand how many \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _{{{\,\textrm{deg}\,}}Q +k})\) there are that satisfy the following three conditions:

  • The square matrix \(H_{\frac{{{\,\textrm{deg}\,}}Q +k}{2} +1, \frac{{{\,\textrm{deg}\,}}Q +k}{2} +1} (\varvec{\alpha })\) has rank \(r_1\);

  • The square matrix \(H_{\frac{n}{2} +1, \frac{n}{2} +1} (\varvec{\alpha }')\) has rank \(r_2\);

  • The matrix \(H_{{{\,\textrm{deg}\,}}Q +1, k +1} (\varvec{\alpha })\) has \((q_0, q_1, \ldots , q_{{{\,\textrm{deg}\,}}Q} )^T\) in its kernel;

where \(\varvec{\alpha }'\) is defined to be the subsequence \((\alpha _0, \alpha _1, \ldots , \alpha _{n})\), the integers \(r_1, r_2\) are fixed, and we are assuming for simplicity that \({{\,\textrm{deg}\,}}Q + k\) and n are even. Also, \(q_0, q_1, \ldots , q_{{{\,\textrm{deg}\,}}Q}\) are defined by \(Q = q_0 + q_1 T + \ldots + q_{{{\,\textrm{deg}\,}}Q} T^{{{\,\textrm{deg}\,}}Q}\). Working with any two of the above conditions is possible given the results we establish later. However, the difficulty lies in working with all three.

If we were to work with higher moments of Dirichlet L-functions then we would consider correlations of higher divisor functions (such as \(d_3\)), and so we would need to work with Hankel tensors instead of Hankel matrices. Specifically, we would have a tensor analogue for the three conditions above. Of course, moments higher than the fourth are notoriously difficult, and no rigorous results have been obtained. However, the above is interesting as it provides an alternative approach to the problem.

Finally, we can consider arithmetic functions other than the divisor function; for example, the number of ways we can express a polynomial in \({\mathcal {A}}\) as a sum of two squares. As with the divisor function, this involves multiplication, and so our approach via Hankel matrices can be applied.

2 Hankel matrices over \({\mathbb {F}}_q\)

2.1 Introduction

While we are concerned with Hankel matrices over finite fields, Hankel matrices over the complex numbers have received considerably more attention. Heinig and Rost [18] provide a detailed account of the results that have been established in the complex setting. While fewer in number, there are also publications specifically on Hankel matrices over finite fields [12, 26]. About half of the results we provide are completely original, while the rest are either finite field analogues of results in [18] or generalizations of results in [12]; even then, the proofs are often different, with the intention of being more intuitive. Wherever possible, we will adhere to the notation established in [12, 18], and when this is not possible we make clear what the differences are.

As mentioned previously, an \(l \times m\) Hankel matrix over \({\mathbb {F}}_q\) is a matrix of the form

(13)

where \(\alpha _0, \ldots , \alpha _{l+m-2} \in {\mathbb {F}}_q\). As we can see, all the entries on a given skew-diagonal take the same value. We index our entries from zero; as we will see later, this is necessary in order to be consistent with the indexing of the coefficients of a polynomial, which also begins at zero.

Define the \(n \times n\) counter-identity matrix, \(J_n\), to be the matrix with zero entries everywhere except for the main skew-diagonal going from bottom-left to top-right. If H is an \(l \times m\) Hankel matrix, then \(J_l H\) and \(H J_m\) are \(l \times m\) Toeplitz matrices. Similarly, if T is an \(l \times m\) Toeplitz matrix, then \(J_l T\) and \(T J_m\) are \(l \times m\) Hankel matrices. Thus, we can see that Hankel matrices and their kernels are inextricably linked to the better-known Toeplitz matrices, although we will focus on the former. We denote the set of all \(l \times m\) Hankel matrices over \({\mathbb {F}}_q\) by \({\mathscr {H}}_{l,m}\).
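This Hankel–Toeplitz correspondence is straightforward to verify computationally; the following sketch (an arbitrary example over \({\mathbb {F}}_3\), with parameters of our own choosing) checks that multiplying a Hankel matrix by a counter-identity on either side yields a matrix that is constant along ordinary diagonals.

```python
def J(n):
    """The n x n counter-identity matrix."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]

def matmul(A, B, p):
    """Matrix product over F_p."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % p
             for j in range(len(B[0]))] for i in range(len(A))]

def is_toeplitz(M):
    """Toeplitz: constant along ordinary (top-left to bottom-right) diagonals."""
    return all(M[i][j] == M[i - 1][j - 1]
               for i in range(1, len(M)) for j in range(1, len(M[0])))

q, l, m = 3, 3, 4
alpha = [1, 2, 0, 1, 2, 1]                                # length l + m - 1
H = [[alpha[i + j] for j in range(m)] for i in range(l)]  # constant skew-diagonals
assert is_toeplitz(matmul(J(l), H, q))                    # J_l H is Toeplitz
assert is_toeplitz(matmul(H, J(m), q))                    # H J_m is Toeplitz
```

Multiplying by \(J_l\) on the left simply reverses the order of the rows, which turns constant skew-diagonals into constant diagonals; similarly for \(J_m\) on the right and the columns.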

It is natural to consider the finite sequence \(( \alpha _0, \alpha _1, \ldots , \alpha _{l+m-2}) \in {\mathbb {F}}_q^{l+m-1}\) that is associated to the matrix (13). Generally, for \(\varvec{\alpha }:= (\alpha _0, \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1}\) and \(l+m-2 = n\) with \(l,m \ge 1\), we define the matrices

(14)

That is, we associate \(n+1\) matrices with \(\varvec{\alpha }\). As we will later see, there is a crucial relationship between the kernels of these matrices. We note that we can extend the above definition to the case where \(l+m-2 = n' < n\), in which case we have \(n' +1\) matrices, and the last \(n-n'\) entries of \(\varvec{\alpha }\) do not appear in any of them.

Throughout this paper, for an integer \(n \ge 0\), we will always define \(n_1:= \lfloor \frac{n+2}{2} \rfloor \) and \(n_2:= \lfloor \frac{n+3}{2} \rfloor \).

Now, let \(\varvec{\alpha } = (\alpha _0, \ldots , \alpha _{n}) \in {\mathbb {F}}_{q}^{n+1}\), and consider the square matrices

$$\begin{aligned} H_{1,1} (\varvec{\alpha }) , H_{2,2} (\varvec{\alpha }) , \ldots , H_{n_1 , n_1} (\varvec{\alpha }) . \end{aligned}$$

Note that all of these matrices have at most \(n+1\) skew-diagonals and so, given the length of the sequence \(\varvec{\alpha }\), they are well defined. Intuitively, they are all the square Hankel matrices that can be obtained from \(\varvec{\alpha }\) or its truncations; and they are all top-left submatrices of \(H_{n_1, n_2} ( \varvec{\alpha })\). We now make the following definition.

Definition 2.1.1

(\(\rho (\varvec{\alpha })\) and \(\pi (\varvec{\alpha })\)) Let \(\varvec{\alpha } = (\alpha _0, \ldots , \alpha _{n}) \in {\mathbb {F}}_{q}^{n+1}\) and consider the matrices \(H_{1,1} (\varvec{\alpha }), H_{2,2} (\varvec{\alpha }), \ldots , H_{n_1, n_1} (\varvec{\alpha })\). If at least one of these matrices has non-zero determinant, then define \(\rho (\varvec{\alpha })\) to be the largest \(l \in \{ 1,2, \ldots , n_1 \}\) with the property that \(\det H_{l, l } (\varvec{\alpha }) \ne 0\). If all of these matrices have determinant equal to zero, then define \(\rho (\varvec{\alpha })\) to be zero.

Now consider the matrix \(H_{n_1, n_2} (\varvec{\alpha })\), which is square if n is even and almost square if n is odd.Footnote 2 We define

$$\begin{aligned} \pi (\varvec{\alpha }) := {{\,\textrm{rank}\,}}H_{n_1 , n_2} (\varvec{\alpha }) - \rho (\varvec{\alpha }) . \end{aligned}$$

We also extend the definitions of \(\rho (\varvec{\alpha })\) and \(\pi (\varvec{\alpha })\) to the matrices associated to \(\varvec{\alpha }\): For \(l+m-2 = n\), we define \(\rho \big ( H_{l,m} (\varvec{\alpha }) \big ) = \rho (\varvec{\alpha })\) and \(\pi \big ( H_{l,m} (\varvec{\alpha }) \big ) = \pi (\varvec{\alpha })\).Footnote 3
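Definition 2.1.1 is directly computable. The sketch below (our own illustration; invertibility of \(H_{l,l}(\varvec{\alpha })\) is detected via full rank, and the two sequences over \({\mathbb {F}}_3\) are hypothetical examples) computes \(\rho (\varvec{\alpha })\) and \(\pi (\varvec{\alpha })\).

```python
def rank_mod_p(M, p):
    """Rank of an integer matrix over F_p (p prime), by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse in F_p
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

def hankel(alpha, l, m):
    """The l x m Hankel matrix H_{l,m}(alpha), with 0-indexed entries alpha[i+j]."""
    return [[alpha[i + j] for j in range(m)] for i in range(l)]

def rho_pi(alpha, p):
    """(rho(alpha), pi(alpha)) as in Definition 2.1.1, for a 0-indexed sequence."""
    n = len(alpha) - 1
    n1, n2 = (n + 2) // 2, (n + 3) // 2
    # rho: largest l with det H_{l,l}(alpha) != 0, i.e. with full rank l
    rho = max((l for l in range(1, n1 + 1)
               if rank_mod_p(hankel(alpha, l, l), p) == l), default=0)
    return rho, rank_mod_p(hankel(alpha, n1, n2), p) - rho

# hypothetical examples over F_3 with n = 4
assert rho_pi([0, 0, 0, 1, 0], 3) == (0, 2)
assert rho_pi([1, 0, 0, 0, 0], 3) == (1, 0)
```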

We will later see that \(\rho (\varvec{\alpha })\) and \(\pi (\varvec{\alpha })\) will help us in understanding the rank and kernel of \(H_{l, m} (\varvec{\alpha })\) for all \(l+m-2 =n\). There are a couple of remarks we must make before continuing.

Remark 2.1.2

In [12] they give the same definition for \(\rho \), although they use the letter \(\delta \) instead and it applies only to square Hankel matrices. We use the letter \(\rho \) because it is consistent with the notation established in [18, Subsection 5.6]. Here they define the \((\rho , \pi )\)-characteristic of a Hankel matrix H (denoted by \({{\,\textrm{char}\,}}H\)) to be \(\big ( \rho (H), \pi (H) \big )\). Technically, they give a different definition; although, the results we establish later allow us to see that it is equivalent to the definition we give. The benefit of our definition is that it can be given before introducing results on the kernel structure of Hankel matrices.

Remark 2.1.3

Suppose \(\varvec{\alpha } = (\alpha _0, \ldots , \alpha _n) \in {\mathbb {F}}_q^{n+1}\). By definition, \(\rho (\varvec{\alpha }) \) can take values in \(\{ 0, 1, \ldots , n_1 \}\). Given that \(\pi (\varvec{\alpha }):= {{\,\textrm{rank}\,}}H_{n_1, n_2 } (\varvec{\alpha }) - \rho (\varvec{\alpha }) \le n_1 - \rho (\varvec{\alpha })\), we can see that \(\pi (\varvec{\alpha })\) can take values in \(\{ 0, 1, \ldots , n_1 - \rho (\varvec{\alpha }) \}\).

In fact, these are all attainable if n is odd, but if n is even then \(\pi (\varvec{\alpha })\) can only attain values in \(\big \{ 0, 1, \ldots , \max \{ 0, n_1 - \rho (\varvec{\alpha }) -1 \} \big \}\). That is, we cannot have \(\pi (\varvec{\alpha }) = n_1 - \rho (\varvec{\alpha })\) when \(\rho (\varvec{\alpha }) \ne n_1\). Indeed, for a contradiction, suppose we do have \(\pi (\varvec{\alpha }) = n_1 - \rho (\varvec{\alpha })\) and \(\rho (\varvec{\alpha }) \ne n_1\). Then,

$$\begin{aligned} {{\,\textrm{rank}\,}}H_{n_1 , n_1 } (\varvec{\alpha }) = {{\,\textrm{rank}\,}}H_{n_1 , n_2 } (\varvec{\alpha }) = \rho (\varvec{\alpha }) + \pi (\varvec{\alpha }) = n_1 , \end{aligned}$$

where the first equality uses the fact that \(n_1 = n_2\) (since n is even). Thus, \(H_{n_1, n_1 } (\varvec{\alpha })\) has full rank and so \(\rho (\varvec{\alpha }) = n_1\) by definition, which obviously contradicts \(\rho (\varvec{\alpha }) \ne n_1\). When n is odd, we have \(n_1 \ne n_2\) and so the first relation above does not hold, meaning we do not encounter this contradiction.

We now make several definitions for sets of finite sequences and Hankel matrices.

Definition 2.1.4

We define,

$$\begin{aligned} {\mathscr {L}}_n (r) :=&\{ \varvec{\alpha } \in {\mathbb {F}}_q^{n+1} : {{\,\textrm{rank}\,}}H_{n_1 , n_2 } (\varvec{\alpha }) = r \} , \\ {\mathscr {L}}_n (r , \rho _1 , \pi _1 ) :=&\{ \varvec{\alpha } \in {\mathscr {L}}_n (r) : \rho (\varvec{\alpha }) = \rho _1 , \pi (\varvec{\alpha }) = \pi _1 \} . \end{aligned}$$

Of course, by definition of \(\pi (\varvec{\alpha })\), we must have \(\rho _1 + \pi _1 = r\), and so at times we may write \({\mathscr {L}}_n (\rho _1 + \pi _1, \rho _1, \pi _1 )\) or \({\mathscr {L}}_n (r, \rho _1, r - \rho _1 )\), depending on what parameters we are using. We also define, for \(h=0, \ldots , n+1\),

$$\begin{aligned} {\mathscr {L}}_n^h :=&\{ \varvec{\alpha } = (\alpha _0 , \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1} : \alpha _0 , \ldots ,\alpha _{h-1} = 0 \} , \\ {\mathscr {L}}_n^h (r) :=&\{ \varvec{\alpha } = (\alpha _0 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r) : \alpha _0 , \ldots ,\alpha _{h-1} = 0 \} , \\ {\mathscr {L}}_n^h (r , \rho _1 , \pi _1 ) :=&\{ \varvec{\alpha } = (\alpha _0 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r, \rho _1 , \pi _1) : \alpha _0 , \ldots ,\alpha _{h-1} = 0 \} . \end{aligned}$$

In the above, we have three sets of parameters. The first relates to the length of the sequence \(\varvec{\alpha }\), the second relates to the rank of the associated square (or nearly square) Hankel matrix, and the third relates to entries equal to zero at the start of the sequence. Note that when \(h=0\) we have \({\mathscr {L}}_n^h = {\mathbb {F}}_q^{n+1}\), \({\mathscr {L}}_n^h (r) = {\mathscr {L}}_n (r)\), and \({\mathscr {L}}_n^h (r, \rho _1, \pi _1 ) = {\mathscr {L}}_n (r, \rho _1, \pi _1 )\).

Definition 2.1.5

For \(h = 0, \ldots , l+m-1\), we define

$$\begin{aligned} {\mathscr {H}}_{l,m} (r) :=\,&\{ H \in {\mathscr {H}}_{l,m}: {{\,\textrm{rank}\,}}H = r \} , \\ {\mathscr {H}}_{l,m}^h :=\,&\{ H = (\alpha _{i+j-2})_{\begin{array}{c} 1 \le i \le l \\ 1 \le j \le m \end{array}} \in {\mathscr {H}}_{l,m} : \alpha _ 0 , \ldots , \alpha _{h-1} = 0 \} , \\ {\mathscr {H}}_{l,m}^h (r) :=\,&\{ H = (\alpha _{i+j-2})_{\begin{array}{c} 1 \le i \le l \\ 1 \le j \le m \end{array}} \in {\mathscr {H}}_{l,m} (r) : \alpha _ 0 , \ldots , \alpha _{h-1} = 0 \} . \end{aligned}$$

Note that when \(h=0\) we have \({\mathscr {H}}_{l,m}^h = {\mathscr {H}}_{l,m}\) and \({\mathscr {H}}_{l,m}^h (r) = {\mathscr {H}}_{l,m} (r)\). Note also that the parameter r appearing in \({\mathscr {L}}_n (r)\) is not analogous to the parameter r appearing in \({\mathscr {H}}_{l,m} (r)\). For example, if \(\varvec{\alpha } \in {\mathscr {L}}_n (r)\) and \(l+m-2 = n\), then we do not necessarily have \(H_{l,m} (\varvec{\alpha }) \in {\mathscr {H}}_{l,m} (r)\); indeed, if \(l<r\) then we have \(H_{l,m} (\varvec{\alpha }) \in {\mathscr {H}}_{l,m} (l)\).

As we will see later, the number of zeros that appear at the start of our matrix is important for the variance of the divisor function over intervals. The definitions above incorporate various parameters, which are not all considered in [18] or [12]. Thus, our notation is different.

Remark 2.1.6

Consider \({\mathscr {L}}_n^h (r, \rho _1, \pi _1 )\), and suppose \(\rho _1 \ne 0\). We must have \(h \le \rho _1 -1\): otherwise, for any \(\varvec{\alpha } \in {\mathscr {L}}_n^h (r, \rho _1, \pi _1 )\), the matrix \(H_{\rho _1, \rho _1} (\varvec{\alpha })\) would be strictly lower skew-triangular,Footnote 4 contradicting the fact that \(\det H_{\rho _1, \rho _1} (\varvec{\alpha }) \ne 0\).

Now suppose that \(\rho _1 = 0\). Then, \(h \le n+1-r\). This can be seen from the following reasoning. Since \(\rho _1 = 0\), we have

$$\begin{aligned} \det H_{1,1} (\varvec{\alpha }) = 0 , \hspace{1em} \ldots , \hspace{1em} \det H_{n_1 , n_1} (\varvec{\alpha }) = 0 , \end{aligned}$$

and so, by induction (as we later demonstrate), we have that \(H_{n_1, n_1} (\varvec{\alpha })\) is strictly lower skew-triangular. Thus, the matrix \(H_{n_1, n_2} (\varvec{\alpha })\) is of the form

if n is even or odd, respectively. Given that \({{\,\textrm{rank}\,}}H_{n_1, n_2} (\varvec{\alpha }) = r\), we can see that \(H_{n_1, n_2} (\varvec{\alpha })\) must be of the form

with \(\alpha _{n+1-r} \ne 0\). (If n is odd and \(r=n_1\), then there should be no rows of zeros at the top, and the matrix above should be interpreted as such). In particular, this forces \(h \le n+1-r\), as required.

The following definition is the same as Definition 5.8 of [18].

Definition 2.1.7

(Quasi-regular) Suppose we have \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1}\). We say that \(\varvec{\alpha }\) is quasi-regular if \(\pi (\varvec{\alpha }) = 0\). For any integers \(l,m \ge 1\) with \(l+m-2=n\) we say \(H_{l,m} (\varvec{\alpha })\) is quasi-regular if \(\varvec{\alpha }\) is quasi-regular.

Before proceeding, we make a couple of remarks on notation. Let M be an \(l \times m\) matrix and let \(l_1, l_2, m_1, m_2 \ge 0\) satisfy \(l_1 + l_2 \le l\) and \(m_1 + m_2 \le m\). Then, we define \(M[l_1, -l_2; m_1, -m_2]\) to be the submatrix of M consisting of the first \(l_1\) and last \(l_2\) rows, and the first \(m_1\) and last \(m_2\) columns. In the special cases when one or more of \(l_1, l_2, m_1, m_2\) are zero, we may omit them. For example, \(M[l_1; -m_2]\) should be taken to be \(M[l_1, 0; 0, -m_2 ]\). Note that the sign of the parameters in \(M[l_1; -m_2]\) makes it clear whether we start at the beginning or the end: \(l_1\) is positive, and so we are taking the first \(l_1\) rows; while \(-m_2\) is negative, and so we are taking the last \(m_2\) columns.

There will be times where we will use the matrix

perhaps with different letters and indexing. If i is equal to the number of rows of the matrix above, then there should be no rows of zeros at the top of the matrix. In that case, the matrix should be interpreted as such even if there are rows of zeros indicated. Similarly, if i is equal to the number of columns, then the matrix above should be interpreted as having no columns of zeros on the left. This is to avoid unnecessary technicalities when we are working with a range of values of i.

2.2 The \((\rho , \pi )\)-form of a Hankel matrix

We are now able to introduce the \((\rho , \pi )\)-form of a Hankel matrix. Generally, we apply a series of row operations to transform the matrix into one whose structure demonstrates the \((\rho , \pi )\)-characteristic of the original matrix. It allows us to understand the ranks and kernels of Hankel matrices more easily, as we will see in Sects. 2.3 and 2.4. This form was used for square Hankel matrices in [12], although no terminology for this was given there, or elsewhere, as far as we are aware. Thus, we have introduced the terminology of “\((\rho , \pi )\)-form”. We require a few results before giving the definition of \((\rho , \pi )\)-form.

Lemma 2.2.1

Suppose \(\varvec{\alpha } = (\alpha _0, \ldots , \alpha _n )\) with \(\rho (\varvec{\alpha }) = 0\) and let

$$\begin{aligned} \pi _1 \in {\left\{ \begin{array}{ll} \{ 0 , 1 , \ldots , n_1 \} &{}\text { if }n\text { is odd,} \\ \{ 0 , 1 , \ldots , n_1 - 1 \} &{}\text { if }n\text { is even} \end{array}\right. } \end{aligned}$$

(See Remark 2.1.3 regarding the values that \(\pi _1\) can take). Then, \(\pi (\varvec{\alpha }) = \pi _1\) if and only if

$$\begin{aligned} \alpha _i \in {\left\{ \begin{array}{ll} \{ 0 \} &{}\text { for }i < (n+1) - \pi _1 \\ {\mathbb {F}}_q^* &{}\text { for }i = (n+1) - \pi _1 \\ {\mathbb {F}}_q &{}\text { for }i > (n+1) - \pi _1 . \end{array}\right. } \end{aligned}$$

Proof

We begin with the forward implication. Let \(H:= H_{n_1, n_2} (\varvec{\alpha })\). Since \(\rho (\varvec{\alpha }) = 0\), we have \(\det H [i;i] = 0\) for \(i=1, \ldots , n_1\). When \(i=1\) this gives

$$\begin{aligned} 0 = \det H [1;1] = \alpha _0 . \end{aligned}$$

When \(i=2\) it gives

$$\begin{aligned} 0 = \det H [2;2] = \det \begin{pmatrix} 0 &{} \alpha _1 \\ \alpha _1 &{} \alpha _2 \end{pmatrix} = - {\alpha _1 }^2 , \end{aligned}$$

meaning \(\alpha _1 = 0\). Proceeding as above in an inductive manner, we deduce that \(\alpha _0, \ldots , \alpha _{n_1 -1} = 0\). Thus, if n is even or odd, we have

respectively. Now, suppose i is the largest element in the set \(\{ 1, 2, \ldots , (n+1) - n_1 \}\) satisfying \(\alpha _{(n+1) - i} \ne 0\) (such an i must exist, unless we are working with the zero matrix, in which case we take \(i=0\)). Then,

Recalling that \(\alpha _{(n+1)-i} \ne 0\), we can clearly see that \({{\,\textrm{rank}\,}}H = i\). Since \({{\,\textrm{rank}\,}}H = \rho (\varvec{\alpha }) + \pi (\varvec{\alpha }) = \pi _1\), we see that \(i=\pi _1\). In particular, \(\alpha _{(n+1) - \pi _1} \ne 0\); while for \(i < (n+1) - \pi _1\) we have \(\alpha _i = 0\), and for \(i > (n+1) - \pi _1\) we have \(\alpha _i \in {\mathbb {F}}_q\). This concludes the forward implication.

The backward implication follows easily given some of the reasoning that we have established above. \(\square \)
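The forward implication of Lemma 2.2.1 can also be verified exhaustively for small parameters. The following sketch (our own choice of \(q=2\) and odd \(n=5\)) checks, for every \(\varvec{\alpha }\) with \(\rho (\varvec{\alpha }) = 0\), that the first non-zero entry sits at index \((n+1) - \pi (\varvec{\alpha })\) and all earlier entries vanish.

```python
from itertools import product

def rank_mod_p(M, p):
    """Rank of an integer matrix over F_p (p prime), by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse in F_p
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

def hankel(alpha, l, m):
    return [[alpha[i + j] for j in range(m)] for i in range(l)]

def rho_pi(alpha, p):
    """(rho(alpha), pi(alpha)) as in Definition 2.1.1, for a 0-indexed sequence."""
    n = len(alpha) - 1
    n1, n2 = (n + 2) // 2, (n + 3) // 2
    rho = max((l for l in range(1, n1 + 1)
               if rank_mod_p(hankel(alpha, l, l), p) == l), default=0)
    return rho, rank_mod_p(hankel(alpha, n1, n2), p) - rho

q, n = 2, 5              # F_2 and odd n, small enough for an exhaustive check
for alpha in product(range(q), repeat=n + 1):
    rho, pi = rho_pi(list(alpha), q)
    if rho != 0:
        continue
    # first non-zero entry at index (n + 1) - pi (the zero sequence gives pi = 0)
    k = next((i for i, a in enumerate(alpha) if a != 0), n + 1)
    assert k == n + 1 - pi
```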

Remark 2.2.2

Let \(l+m -2 = n\). It is helpful to visualize what the matrix \(H_{l,m} (\varvec{\alpha })\) looks like given \(\rho (\varvec{\alpha }) = 0\) and \(\pi (\varvec{\alpha }) = \pi _1\). For presentational purposes, we use 1 to denote an entry in \({\mathbb {F}}_q^*\), we use \(*\) to denote an entry in \({\mathbb {F}}_q\), and 0 denotes 0 as usual. For \(l < \pi _1\), we have

This has full row rank. We can describe the kernel in some manner, although it is more helpful when \(l,m \ge \pi _1\). In this case, we have

where there are exactly \(\pi _1\) 1s. This has rank equal to \(\pi _1\), and we can clearly see that any vector in the kernel must have zeros in its last \(\pi _1\) positions. For \(m < \pi _1\), we have

This has full column rank and so the kernel is trivial.

Lemma 2.2.3

Suppose \(\varvec{\alpha } = (\alpha _0, \ldots , \alpha _n )\) with \(\rho (\varvec{\alpha }) = \rho _1 \in \{ 1, 2, \ldots , n_1 -1 \}\) and

$$\begin{aligned} \pi (\varvec{\alpha }) = \pi _1 \in {\left\{ \begin{array}{ll} \{ 0 , 1 , \ldots , n_1 - \rho _1 \} &{}\text { if }n\text { is odd,} \\ \{ 0 , 1 , \ldots , n_1 - \rho _1 - 1 \} &{}\text { if }n\text { is even} \end{array}\right. } \end{aligned}$$

(See Remark 2.1.3 regarding the values that \(\pi _1\) can take). Suppose \(l+m-2 = n\) with \(l > \rho _1\), and let \(H:= H_{l, m} (\varvec{\alpha })\). Define \({\textbf{x}} = (x_0, \ldots , x_{\rho _1 -1})^T\) to be the vector that satisfies

$$\begin{aligned} H [\rho _1 , \rho _1 ] {\textbf{x}} = \begin{pmatrix} \alpha _{\rho _1} \\ \alpha _{\rho _1 +1} \\ \vdots \\ \alpha _{2\rho _1 - 1} \end{pmatrix}. \end{aligned}$$

Let \(R_i\) be the i-th row of \(H_{l,m} (\varvec{\alpha })\). If we apply the row operations

$$\begin{aligned} R_i \longrightarrow R_i - (x_0 , \ldots , x_{\rho _1 -1}) \begin{pmatrix} R_{i - \rho _1 } \\ \vdots \\ R_{i-1} \end{pmatrix} = R_i - x_0 R_{i - \rho _1 } - \ldots - x_{\rho _1 -1} R_{i-1} \end{aligned}$$

for \(i = l, l -1, \ldots , \rho _1 +1\) in that order, then we obtain a matrix

$$\begin{aligned} \begin{pmatrix} H_{\rho _1 , m} (\varvec{\alpha }' )\\ \hline H_{l - \rho _1 , m} (\varvec{\beta }) \end{pmatrix} , \end{aligned}$$
(15)

where

$$\begin{aligned} \varvec{\alpha }' =&(\alpha _0 , \ldots , \alpha _{\rho _1 + m - 2} ) \\ \varvec{\beta } =&(\beta _{\rho _1} , \ldots , \beta _{n}) \end{aligned}$$

and

$$\begin{aligned} \beta _i \in {\left\{ \begin{array}{ll} \{ 0 \} &{}\text { if }i < (n+1) - \pi _1 \\ {\mathbb {F}}_q^* &{}\text { if }i = (n+1) - \pi _1 \\ {\mathbb {F}}_q &{}\text { if }i > (n+1) - \pi _1 . \end{array}\right. } \end{aligned}$$
(16)

Furthermore, the sequence \(\varvec{\beta }\) is independent of the specific values taken by l, m (as long as \(l > \rho _1\)).

Remark 2.2.4

In order to keep the lemma above succinct, we avoided various explanatory remarks. We give them here, for clarity.

The matrix \(H [\rho _1, \rho _1 ]\) is the largest top-left submatrix of H that is invertible, by definition of \(\rho _1\). It is independent of the specific values taken by l, m. The vector \((\alpha _{\rho _1}, \alpha _{\rho _1 +1}, \ldots , \alpha _{2\rho _1 - 1})^T\) is simply the column of entries directly to the right of \(H [\rho _1, \rho _1 ]\) in H, which is also equal to the transpose of the row of entries directly below \(H [\rho _1, \rho _1 ]\).

The row operations that we apply start at the last row and end at the row just below the submatrix \(H [\rho _1, \rho _1 ]\).

The lemma states that after the row operations are applied, we are left with the matrix

(17)

Of course, the lemma states that some of the \(\beta _i\) are zero, but we do not demonstrate this above so that we can instead see that the indexing is preserved: An entry in the i-th skew-diagonal is always either \(\alpha _i\) or \(\beta _i\).
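The row operations of Lemma 2.2.3 can also be carried out mechanically. Below is a minimal sketch for a hypothetical \(\varvec{\alpha }\) over \({\mathbb {F}}_3\) with \(\rho _1 = 1\) (so that the linear system for \({\textbf{x}}\) reduces to a scalar division); it checks that the top \(\rho _1\) rows are untouched and that the bottom block has the zero pattern claimed in (16).

```python
q = 3
alpha = [1, 1, 1, 1, 2]      # hypothetical example over F_3 with rho_1 = 1, pi_1 = 1
rho1, l, m = 1, 3, 3         # l + m - 2 = n = 4 and l > rho_1

H = [[alpha[i + j] for j in range(m)] for i in range(l)]

# H[rho_1, rho_1] x = (alpha_{rho_1}, ..., alpha_{2 rho_1 - 1})^T; here rho_1 = 1,
# so this is a scalar division in F_q
x = [alpha[rho1] * pow(alpha[0], q - 2, q) % q]

# row operations R_i -> R_i - sum_t x_t R_{i - rho_1 + t}, applied bottom row upwards
for i in range(l - 1, rho1 - 1, -1):
    H[i] = [(H[i][j] - sum(x[t] * H[i - rho1 + t][j] for t in range(rho1))) % q
            for j in range(m)]

assert H[0] == [1, 1, 1]     # top rho_1 rows untouched: this is H_{rho_1, m}(alpha')
# bottom block H_{l - rho_1, m}(beta): read off beta_{rho_1}, ..., beta_n
beta = H[rho1][:] + [H[i][m - 1] for i in range(rho1 + 1, l)]
assert beta == [0, 0, 0, 1]  # beta_i = 0 for i < (n + 1) - pi_1 = 4, and beta_4 != 0
```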

Remark 2.2.5

As in Remark 2.2.2, it is helpful to visualize what the matrix

$$\begin{aligned} H' := \begin{pmatrix} H_{\rho _1 , m} (\varvec{\alpha }' ) \\ \hline H_{l - \rho _1 , m} (\varvec{\beta }) \end{pmatrix} \end{aligned}$$

looks like. Again, for presentational purposes, we use 1 to denote an entry in \({\mathbb {F}}_q^*\), we use \(*\) to denote an entry in \({\mathbb {F}}_q\), and 0 denotes 0 as usual. If \(l \le \rho _1\), then the lemma above does not apply. We simply remark that in this case H has full row rank, which follows from the fact that \(H [ \rho _1, \rho _1 ]\) is invertible and thus has full rank.

Now, suppose \(l,m \ge \rho _1 + \pi _1\). Given (15) and (16), we can see that

(18)

where there are exactly \(\pi _1\) 1s in the bottom-right submatrix. One reason that this is helpful is that the bottom two submatrices imply that any vector in the kernel of \(H'\) must have zeros in its last \(\pi _1\) entries. Given that row operations do not affect the kernel, the same can be said for H. Furthermore, we can see that the rank of \(H'\) (which is equal to the rank of H) is equal to the number of rows of \(H [\rho _1, \rho _1 ]\) (which is invertible) added to the number of 1s in the bottom-right submatrix. That is, the rank of \(H'\) (and of H) is \(\rho _1 + \pi _1 = r\).

Now, we wish to consider the case \(\rho _1< l < \rho _1 + \pi _1\) (which requires \(\pi _1 \ge 2\)). We can do this by repeatedly removing a row and adding a column to (18), while maintaining that the bottom two matrices form a Hankel matrix and that the top two matrices form a Hankel matrix. This is permissible because, as stated in Lemma 2.2.3, the sequence \(\varvec{\beta }\) is independent of the values of l and m. We can then see that when \(\rho _1< l < \rho _1 + \pi _1\) we have a matrix of the form

(19)

Note that the 1s still appear to the right of \(H [\rho _1, \rho _1]\). Given this fact, and the invertibility of \(H [\rho _1, \rho _1]\), we see that \(H'\) (and H) has full row rank.

For the case \(m < \rho _1 + \pi _1\), we can again take (18), but this time we repeatedly remove a column and add a row. Similar to above, we can see that we will have full column rank. In particular, the kernel will be trivial.

Remarks 2.2.2 and 2.2.5 effectively establish the following Corollary.

Corollary 2.2.6

Let \(\varvec{\alpha } \in {\mathscr {L}}_n (r)\), and let \(l+m-2 = n\). Then,

$$\begin{aligned} {{\,\textrm{rank}\,}}H_{l,m} (\varvec{\alpha }) = {\left\{ \begin{array}{ll} r &{}\text { if }\min \{ l , m \} \ge r, \\ \min \{ l,m \} &{}\text { if }\min \{ l , m \} < r. \end{array}\right. } \end{aligned}$$

Of course, if we are working with particular values of \(\rho (\varvec{\alpha })\) and \(\pi (\varvec{\alpha })\), then we can replace r with \(\rho (\varvec{\alpha }) + \pi (\varvec{\alpha })\).
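Corollary 2.2.6 can be checked exhaustively for small parameters. The sketch below (our own choice of \(q=2\) and even \(n=4\), so that \(n_1 = n_2\)) compares the rank of every \(H_{l,m}(\varvec{\alpha })\) with \(l+m-2=n\) against the stated formula.

```python
from itertools import product

def rank_mod_p(M, p):
    """Rank of an integer matrix over F_p (p prime), by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse in F_p
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

def hankel(alpha, l, m):
    return [[alpha[i + j] for j in range(m)] for i in range(l)]

q, n = 2, 4
n1, n2 = (n + 2) // 2, (n + 3) // 2         # n even, so n1 = n2 = 3
for alpha in product(range(q), repeat=n + 1):
    r = rank_mod_p(hankel(alpha, n1, n2), q)
    for l in range(1, n + 2):
        m = n + 2 - l                       # all shapes with l + m - 2 = n
        expected = r if min(l, m) >= r else min(l, m)
        assert rank_mod_p(hankel(alpha, l, m), q) == expected
```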

We give a further, final remark in order to demonstrate the usefulness of the \((\rho , \pi )\)-form.

Remark 2.2.7

Lemma 2.2.1 allows us to easily determine the number of \(\varvec{\alpha } \in {\mathbb {F}}_q^{n+1}\) with \(\rho (\varvec{\alpha }) = 0\) and \(\pi (\varvec{\alpha }) = \pi _1\). Regarding Lemma 2.2.3, we can easily count the number of possible values that the matrix \(H_{l - \rho _1, m} (\varvec{\beta })\) could take. It is not immediately obvious how many values the matrix \(H_{\rho _1, m} (\varvec{\alpha }' )\) could take, but we are now working with a smaller matrix, which suggests an inductive argument; this is what we do in Sect. 2.3.

We now proceed to prove Lemma 2.2.3.

Proof of Lemma 2.2.3

Given that row operations are only applied to rows \(\rho _1 +1\) to l, it is clear that we do indeed have \(H_{\rho _1, m} (\varvec{\alpha }' )\) in the top submatrix of (15).

If a row operation is applied to an entry \(\alpha _i\) that is found on the i-th skew-diagonal, then it is mapped to \(\alpha _i - x_0 \alpha _{i - \rho _1 } - \ldots - x_{\rho _1 -1} \alpha _{i-1}\). This is independent of where on the i-th skew-diagonal the entry is found (but, of course, it must be on a row that has a row operation applied to it). It is also independent of the particular values of l, m (as long as \(l > \rho _1\), as given in the lemma). The former demonstrates that the bottom submatrix of (15) is indeed a Hankel matrix, while the latter demonstrates that \(\varvec{\beta }\) is independent of the specific values taken by l, m.

Thus, all that remains to be proven is (16); and, since \(\varvec{\beta }\) is independent of the specific values taken by l, m, it suffices to work with the case \(l=n_1\) and \(m=n_2\). To this end, after the row operations are applied, we have the matrix

$$\begin{aligned} H' = \left( \begin{array}{ccccccccccc} \alpha _0 &{} \alpha _1 &{} \cdots &{} \cdots &{} \alpha _{\rho _1-1} &{} \vline &{} \alpha _{\rho _1} &{} \alpha _{\rho _1 + 1} &{} \cdots &{} \cdots &{} \alpha _{n_2 -1} \\ \alpha _1 &{} &{} &{} &{} \vdots &{} \vline &{} \alpha _{\rho _1 + 1} &{} &{} &{} &{} \vdots \\ \vdots &{} &{} &{} &{} \vdots &{} \vline &{} \vdots &{} &{} &{} &{} \vdots \\ \vdots &{} &{} &{} &{} \alpha _{2\rho _1 - 3} &{} \vline &{} \vdots &{} &{} &{} &{} \alpha _{n_2 + \rho _1 - 1} \\ \alpha _{\rho _1-1} &{} \cdots &{} \cdots &{} \alpha _{2\rho _1 - 3} &{} \alpha _{2\rho _1 - 2} &{} \vline &{} \alpha _{2 \rho _1-1} &{} \cdots &{} \cdots &{} \alpha _{n_2 + \rho _1 - 1} &{} \alpha _{n_2 + \rho _1 - 2} \\ \hline \beta _{\rho _1} &{} \beta _{\rho _1 + 1} &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \beta _{n_2 + \rho _1 - 1} \\ \beta _{\rho _1 +1} &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \vdots \\ \vdots &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \vdots \\ \vdots &{} &{} &{} &{} &{} &{} &{} &{} &{} &{} \beta _{n-1} \\ \beta _{n_1 -1} &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \cdots &{} \beta _{n-1} &{} \beta _{n} \end{array}\right) . \end{aligned}$$

This is similar to (17), but here we indicate the top-left \(\rho _1 \times \rho _1\) submatrix, which is the largest invertible top-left submatrix of \(H'\) (by definition of \(\rho _1\)).

We recall that the row operations that we applied do not change the rank of H or any of its top-left submatrices. We now consider the following top-left submatrices of \(H'\):

$$\begin{aligned} H' [\rho _1 +1 | \rho _1 +1 ] , \hspace{1em} H' [\rho _1 +2 | \rho _1 +2 ] , \hspace{0.5em} \ldots \hspace{0.5em} , H' [n_1 | n_1 ] . \end{aligned}$$

For the first, we have

$$\begin{aligned} H' [\rho _1 +1 | \rho _1 +1 ] =&\begin{pmatrix} \alpha _0 &{} \alpha _1 &{} \cdots &{} \cdots &{} \alpha _{\rho _1-1} &{} \vline &{} \alpha _{\rho _1} \\ \alpha _1 &{} &{} &{} &{} \vdots &{} \vline &{} \alpha _{\rho _1 + 1} \\ \vdots &{} &{} &{} &{} \vdots &{} \vline &{} \vdots \\ \vdots &{} &{} &{} &{} \alpha _{2\rho _1 - 3} &{} \vline &{} \vdots \\ \alpha _{\rho _1-1} &{} \cdots &{} \cdots &{} \alpha _{2\rho _1 - 3} &{} \alpha _{2\rho _1 - 2} &{} \vline &{} \alpha _{2 \rho _1-1} \\ \hline \beta _{\rho _1} &{} \cdots &{} \cdots &{} \beta _{2 \rho _1-2} &{} \beta _{2 \rho _1-1} &{} \vline &{} \beta _{2 \rho _1} \end{pmatrix} \\&\vspace{0.5em} \\ =&\begin{pmatrix} \alpha _0 &{} \alpha _1 &{} \cdots &{} \cdots &{} \alpha _{\rho _1-1} &{} \vline &{} \alpha _{\rho _1} \\ \alpha _1 &{} &{} &{} &{} \vdots &{} \vline &{} \alpha _{\rho _1 + 1} \\ \vdots &{} &{} &{} &{} \vdots &{} \vline &{} \vdots \\ \vdots &{} &{} &{} &{} \alpha _{2\rho _1 - 3} &{} \vline &{} \vdots \\ \alpha _{\rho _1-1} &{} \cdots &{} \cdots &{} \alpha _{2\rho _1 - 3} &{} \alpha _{2\rho _1 - 2} &{} \vline &{} \alpha _{2 \rho _1-1} \\ \hline 0 &{} \cdots &{} \cdots &{} \cdots &{} 0 &{} \vline &{} \beta _{2 \rho _1} \end{pmatrix} , \end{aligned}$$

where the second equality follows from the definition of \({\textbf{x}}\) and the row operation that we applied to row \(\rho _1 +1\). Now, by the definition of \(\rho _1\), we have that \(\det H' [\rho _1 +1 | \rho _1 +1 ] = 0\); while, by the form of \(H' [\rho _1 +1 | \rho _1 +1 ] \) above, we can see that \(\det H' [\rho _1 +1 | \rho _1 +1 ] = \beta _{2 \rho _1} \cdot \det H [\rho _1 | \rho _1 ] \). Given that \(\det H [\rho _1 | \rho _1 ] \ne 0\) (by definition of \(\rho _1\)), we must have that \(\beta _{2 \rho _1} = 0\).

Now consider \(H' [\rho _1 +2 | \rho _1 +2 ]\). We have

$$\begin{aligned} H' [\rho _1 +2 | \rho _1 +2 ] =&\begin{pmatrix} \alpha _0 &{} \alpha _1 &{} \cdots &{} \cdots &{} \alpha _{\rho _1-1} &{} \vline &{} \alpha _{\rho _1} &{} \alpha _{\rho _1 + 1} \\ \alpha _1 &{} &{} &{} &{} \vdots &{} \vline &{} \alpha _{\rho _1 + 1} &{} \alpha _{\rho _1 + 2} \\ \vdots &{} &{} &{} &{} \vdots &{} \vline &{} \vdots &{} \vdots \\ \vdots &{} &{} &{} &{} \alpha _{2\rho _1 - 3} &{} \vline &{} \vdots &{} \vdots \\ \alpha _{\rho _1-1} &{} \cdots &{} \cdots &{} \alpha _{2\rho _1 - 3} &{} \alpha _{2\rho _1 - 2} &{} \vline &{} \alpha _{2 \rho _1-1} &{} \alpha _{2 \rho _1} \\ \hline 0 &{} \cdots &{} \cdots &{} \cdots &{} 0 &{} \vline &{} 0 &{} \beta _{2 \rho _1 +1} \\ 0 &{} \cdots &{} \cdots &{} \cdots &{} 0 &{} \vline &{} \beta _{2 \rho _1 +1} &{} \beta _{2 \rho _1 +2} \end{pmatrix} . \end{aligned}$$

By similar reasoning as above, we have

$$\begin{aligned} 0&= \det H' [\rho _1 +2 | \rho _1 +2 ] = \det \begin{pmatrix} 0 &{} \beta _{2 \rho _1 +1} \\ \beta _{2 \rho _1 +1} &{} \beta _{2 \rho _1 +2} \end{pmatrix} \hspace{0.5em} \cdot \hspace{0.5em} \det H [\rho _1 | \rho _1 ]\\&= - {\beta _{2 \rho _1 +1}}^2 \cdot \det H [\rho _1 | \rho _1 ] . \end{aligned}$$

Given that \(\det H [\rho _1 | \rho _1 ] \ne 0\), we must have that \(\beta _{2 \rho _1 +1} = 0\).

Proceeding as above in an inductive manner, we see that \(\beta _{\rho _1} = \beta _{\rho _1 +1} = \cdots = \beta _{n_1 + \rho _1 -1} = 0\). That is,

(20)

Now, if n is even, then \(n_2 = n_1\) and \(H'= H'[n_1, n_1 ]\), whereas if n is odd, then \(n_2 = n_1 +1\) and we have an additional column:

(21)

In either case, in the last \(n_1 - \rho _1\) rows, all the entries are zero except for the last \(n_2 - \rho _1 -1\) skew-diagonals which may or may not be zero. We now consider the first such skew-diagonal that is non-zero: Suppose i is the largest element in the set \( \{ 1, 2, \ldots , (n+1) - (n_1 + \rho _1 ) \}\) that satisfies \(\beta _{(n+1)-i} \ne 0\) (if no such i exists, then we take \(i=0\), and only a slight adaptation of the following reasoning is required). Then, the bottom-right quadrant of \(H'\) (bounded by the vertical and horizontal lines in (20) and (21)) is of the form

Given that \(\beta _{(n+1)-i} \ne 0\), we can see that the last i columns appearing in this quadrant are linearly independent, and that the rank of this quadrant (matrix) is i. Recall also that \(H' [\rho _1, \rho _1 ]\) has full column rank. So, given the form of \(H'\), we can see that the first \(\rho _1\) columns and the last i columns of \(H'\) form a basis for its column space. In particular,

$$\begin{aligned} {{\,\textrm{rank}\,}}H' = \rho _1 + i . \end{aligned}$$

Since \({{\,\textrm{rank}\,}}H' = {{\,\textrm{rank}\,}}H = \rho _1 + \pi _1\), we have \(i = \pi _1\). This proves (16) as required. \(\square \)

Lemmas 2.2.1 and 2.2.3 extend [12, Section 5], where similar lemmas are proved for square Hankel matrices only, and with column operations in place of row operations. We chose the latter in order to preserve the kernel. We will now formally give the definition of the \((\rho , \pi )\)-form.

Definition 2.2.8

(The \((\rho , \pi )\)-form) Let \(\varvec{\alpha } \in {\mathbb {F}}_q^{n+1}\) and let \(l+m-2 = n\). Consider \(H:= H_{l,m} (\varvec{\alpha })\). If \(\rho (\varvec{\alpha }) = 0\), then we define the \((\rho , \pi )\)-form of H to be itself.

If \(\rho (\varvec{\alpha }) = \rho _1 \in \{ 1, 2, \ldots , n_1 -1 \}\) and \(l \le \rho _1\), then we also define the \((\rho , \pi )\)-form of H to be itself. Whereas, if \(l > \rho _1\), then we define the \((\rho , \pi )\)-form of H to be the matrix that we obtain after applying the row operations

$$\begin{aligned} R_i \longrightarrow R_i - (x_0 , \ldots , x_{\rho _1 -1}) \begin{pmatrix} R_{i - \rho _1 } \\ \vdots \\ R_{i-1} \end{pmatrix} = R_i - x_0 R_{i - \rho _1 } - \ldots - x_{\rho _1 -1} R_{i-1} \end{aligned}$$

for \(i = n_1, n_1 -1, \ldots , \rho _1 +1\) in that order; where \(R_i\) is the i-th row of H, and \({\textbf{x}} = (x_0, \ldots , x_{\rho _1 -1})^T\) is the vector that satisfies

$$\begin{aligned} H [\rho _1 , \rho _1 ] {\textbf{x}} = \begin{pmatrix} \alpha _{\rho _1} \\ \alpha _{\rho _1 +1} \\ \vdots \\ \alpha _{2\rho _1 - 1} \end{pmatrix}. \\ \end{aligned}$$

If \(\rho (\varvec{\alpha }) = n_1\), then we define the \((\rho , \pi )\)-form of H to be itself.
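Before moving on, we note that the \((\rho , \pi )\)-form is easy to experiment with numerically. The following sketch (our own, not part of the paper's argument; it assumes q is prime so that arithmetic mod q realizes \({\mathbb {F}}_q\), and all function names are ours) computes \({\textbf{x}}\) and \(\varvec{\beta }\) for a concrete sequence over \({\mathbb {F}}_5\) and checks the zero pattern asserted by Lemma 2.2.3:

```python
# Our own numerical sketch: compute the data of Definition 2.2.8 for a small
# example over F_5 and check the zero pattern of beta from Lemma 2.2.3.
p = 5  # a prime, so the integers mod p form the field F_p

def hankel(alpha, l, m):
    # l x m Hankel matrix H_{l,m}(alpha): entry (i, j) is alpha_{i+j} (0-indexed)
    return [[alpha[i + j] % p for j in range(m)] for i in range(l)]

def rref(M):
    # Gaussian elimination over F_p; returns (rank, reduced matrix)
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse via Fermat's little theorem
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r, M

def rho(alpha, n1):
    # rho(alpha): the largest k <= n1 with H[k | k] invertible (0 if none)
    return max([k for k in range(1, n1 + 1)
                if rref(hankel(alpha, k, k))[0] == k], default=0)

# Fibonacci mod 5, with the last entry altered so that the order-2 recurrence
# fails at the final index; for this sequence rho_1 = 2 and pi_1 = 1.
alpha = [1, 1, 2, 3, 0, 3, 3, 1, 0]
n = len(alpha) - 1                 # n = 8 (even), so n_1 = n_2 = 5
n1 = n // 2 + 1
r = rref(hankel(alpha, n1, n + 2 - n1))[0]   # rank of H_{n_1, n_2}(alpha)
rho1 = rho(alpha, n1)
pi1 = r - rho1

# Solve H[rho_1 | rho_1] x = (alpha_{rho_1}, ..., alpha_{2 rho_1 - 1})^T over F_p
aug = [row + [alpha[rho1 + i]]
       for i, row in enumerate(hankel(alpha, rho1, rho1))]
x = [row[-1] for row in rref(aug)[1][:rho1]]

# beta_s = alpha_s - x_0 alpha_{s - rho_1} - ... - x_{rho_1 - 1} alpha_{s - 1},
# the entries of the bottom Hankel block of the (rho, pi)-form
beta = {s: (alpha[s] - sum(x[t] * alpha[s - rho1 + t] for t in range(rho1))) % p
        for s in range(rho1, n + 1)}

assert all(beta[s] == 0 for s in range(rho1, n - pi1 + 1))  # predicted zeros
assert beta[n + 1 - pi1] != 0      # the first nonzero entry, giving pi_1
```

Here \(\beta _2 = \cdots = \beta _7 = 0\) while \(\beta _8 \ne 0\), so \({{\,\textrm{rank}\,}}H' = \rho _1 + \pi _1 = 3\), in line with (16).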

2.3 Matrices of a given size and rank

In [12, Section 5], the number of square Hankel matrices of a given size, rank, and \((\rho , \pi )\)-form is determined. Our results in this section generalize this by determining the size of sets of the form \({\mathscr {L}}_n^h (r, \rho _1, \pi _1)\), \({\mathscr {L}}_n^h (r)\), and \({\mathscr {H}}_{l,m}^h (r)\). That is, we consider rectangular (not just square) Hankel matrices, the associated sequences, and the condition on the number of zeros at the start of those sequences.

In the following theorem, it is necessary that we consider several cases separately. It is important to note that these cases account for all possible sequences and Hankel matrices. This may not be obvious at first, due to the restrictions on the parameters \(r, \rho _1, \pi _1, h\) that appear in each case, but Remarks 2.1.3 and 2.1.6 justify this and explain the restrictions on the parameters. It is useful to keep in mind that in the following theorem, by definition of \(\pi _1\), we have \(r = \rho _1 + \pi _1\).

Theorem 2.3.1

Let \(n \ge 0\) and \(0 \le h \le n+1\), and consider \({\mathscr {L}}_{n}^{h} (r, \rho _1, \pi _1 )\). Let us also define \(\delta _{\textrm{E}}(n)\) to be 1 if n is even, and 0 if n is odd.

Claim 1 (In this claim, we consider the case \(\rho _1 = 0\)). Suppose \(0 \le r \le \min \{ n_1 - \delta _{\textrm{E}}(n), n-h+1 \}\). Then,

$$\begin{aligned} |{\mathscr {L}}_{n}^{h} (r , 0 , r ) |= {\left\{ \begin{array}{ll} 1 &{}\text { if }r=0, \\ (q-1) q^{r-1} &{}\text { if }r > 0. \end{array}\right. } \end{aligned}$$

Claim 2 Suppose that \(h+1 \le \rho _1 \le n_1 - 1\) (by definition, we cannot have \(1 \le \rho _1 \le h\)) and \(0 \le \pi _1 \le n_1 - \rho _1 - \delta _{\textrm{E}}(n)\). Then,

$$\begin{aligned} |{\mathscr {L}}_{n}^{h} (\rho _1 + \pi _1 , \rho _1 , \pi _1 ) |= {\left\{ \begin{array}{ll} (q-1) q^{2 \rho _1 - h - 1} &{}\text { if }\pi _1 = 0, \\ (q-1)^2 q^{2 \rho _1 + \pi _1 - h -2} &{}\text { if }\pi _1 > 0. \end{array}\right. } \\ \end{aligned}$$

Claim 3 (In this claim, we consider the case \(\rho _1 = n_1\), which requires \(h+1 \le n_1\)). We have

$$\begin{aligned} |{\mathscr {L}}_{n}^{h} (n_1 , n_1 , 0 ) |= (q-1) q^{n-h} . \end{aligned}$$

Claim 4 Consider \({\mathscr {L}}_n^h (r )\). We have

$$\begin{aligned} |{\mathscr {L}}_n^h (r ) |= {\left\{ \begin{array}{ll} 1 &{}\text { if }r=0, \\ (q-1) q^{r-1} &{}\text { if }1 \le r \le \min \{ h , n-h+1 \} , \\ (q^2 -1) q^{2r-h-2} &{}\text { if }h+1 \le r \le n_1 - 1, \\ q^{n-h+1} - q^{2n_1 - h -2} &{}\text { if }r = n_1\text { (which is only possible if }h+1 \le n_1). \end{array}\right. } \end{aligned}$$

Claim 5 Let \(l+m-2 = n\). If \(r < \min \{ l, m \}\), then

$$\begin{aligned} |{\mathscr {H}}_{l,m}^h (r) |= {\left\{ \begin{array}{ll} 1 &{}\text { if }r=0, \\ (q-1) q^{r-1} &{}\text { if }1 \le r \le \min \{ h , n-h+1 \} , \\ (q^2 -1) q^{2r-h-2} &{}\text { if }h+1 \le r \le n_1 - 1. \end{array}\right. } \end{aligned}$$

If \(r = \min \{ l, m \}\), then

$$\begin{aligned} |{\mathscr {H}}_{l,m}^h (r) |=&\big |{\mathscr {H}}_{l,m}^h \big ( \min \{ l , m \} \big ) \big |\\ =&{\left\{ \begin{array}{ll} q^{l+m-h-1} - q^{\min \{ l , m \} -1} &{}\text { if }\min \{ l , m \} - 1 \le \min \{ h , n-h+1 \}, \\ q^{l+m-h-1} - q^{2 \min \{ l , m \} -h-2} &{}\text { if }\min \{ l , m \} - 1 \ge h+1. \end{array}\right. } \end{aligned}$$

Proof

Claim 1 When \(r=0\), the only element in \({\mathscr {L}}_{n}^{h} (r, 0, r )\) is the zero sequence. Thus, \(|{\mathscr {L}}_{n}^{h} (r, 0, r ) |= 1\). Suppose \(r > 0\). Lemma 2.2.1 tells us that \(\varvec{\alpha } \in {\mathscr {L}}_{n}^{h} (r, 0, r )\) if and only if

$$\begin{aligned} \varvec{\alpha } = (0 , \ldots , 0, \alpha _{(n+1)-r } , \ldots , \alpha _n ) \end{aligned}$$

with \(\alpha _{(n+1)-r} \in {\mathbb {F}}_q^*\) and \(\alpha _{(n+1)-r +1}, \ldots , \alpha _n \in {\mathbb {F}}_q\). Thus, we have \(|{\mathscr {L}}_{n}^{h} (r, 0, r ) |= (q-1) q^{r-1}\).

Claim 2 For \(\pi _1 \ge 1\), there is a bijection between \({\mathscr {L}}_{n}^{h} (\rho _1 + \pi _1, \rho _1, \pi _1 )\) and

$$\begin{aligned} {\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 ) \times {\mathbb {F}}_q \times \{ 0 \}^{n - \pi _1 - 2\rho _1 +1} \times {\mathbb {F}}_q^* \times {\mathbb {F}}_q^{\pi _1 - 1} . \end{aligned}$$

Indeed, suppose \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n ) \in {\mathscr {L}}_{n}^{h} (\rho _1 + \pi _1, \rho _1, \pi _1 )\) and consider \(H:= H_{n_1, n_2} (\varvec{\alpha })\). Lemma 2.2.3 gives us the following information.

  • The submatrix \(H [\rho _1, \rho _1]\) is invertible. In particular,

    $$\begin{aligned} {\alpha }' := (\alpha _0 , \alpha _1 , \ldots , \alpha _{2\rho _1 -2}) \in {\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 , \rho _1 , 0 ) = {\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 ) . \end{aligned}$$
  • The entry \(\alpha _{2 \rho _1 -1}\) is free to take any value in \({\mathbb {F}}_q\). Note that the vector \({\textbf{x}}\) is uniquely determined by \(\varvec{\alpha }'\) and \(\alpha _{2 \rho _1 -1}\).

  • The entries \(\beta _{2 \rho _1}, \ldots , \beta _{n - \pi _1}\), of which there are \(n - \pi _1 - 2\rho _1 +1\), must all take the value 0; and the invertibility of the row operations (which are uniquely determined by \({\textbf{x}}\)) means that the corresponding \(\alpha _{2 \rho _1}, \ldots , \alpha _{n - \pi _1}\) can also take only a single value.

  • Similarly, \(\beta _{(n+1) - \pi _1}\) can take any value in \({\mathbb {F}}_q^*\), and so the corresponding \(\alpha _{(n+1) - \pi _1}\) can take \((q-1)\) possible values.

  • Similarly again, \(\beta _{(n+2) - \pi _1}, \ldots , \beta _{n}\), of which there are \(\pi _1 - 1\), can take any value in \({\mathbb {F}}_q\), and so the corresponding \(\alpha _{(n+2) - \pi _1}, \ldots , \alpha _{n}\) can each take any value in \({\mathbb {F}}_q\).

By similar reasoning, when \(\pi _1 = 0\) we have a bijection between \({\mathscr {L}}_{n}^{h} (\rho _1 + \pi _1, \rho _1, \pi _1 ) = {\mathscr {L}}_{n}^{h} (\rho _1, \rho _1, 0 )\) and

$$\begin{aligned} {\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 ) \times {\mathbb {F}}_q \times \{ 0 \}^{n - 2\rho _1 +1} . \end{aligned}$$

So, we have

$$\begin{aligned} |{\mathscr {L}}_{n}^{h} (\rho _1 + \pi _1 , \rho _1 , \pi _1 ) |= {\left\{ \begin{array}{ll} |{\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 ) |\cdot (q-1) q^{\pi _1 } &{}\text { if }\pi _1 \ge 1, \\ |{\mathscr {L}}_{2 \rho _1 -2}^{h} (\rho _1 ) |\cdot q &{}\text { if }\pi _1 = 0. \end{array}\right. } \end{aligned}$$
(22)

Therefore, what we must understand are the sets \({\mathscr {L}}_{2 k -2}^{h} (k)\). We have that

$$\begin{aligned} \begin{aligned} |{\mathscr {L}}_{2 k -2}^{h} (k) |=&|{\mathscr {L}}_{2 k -2}^{h} |- \sum _{i=0}^{k -1} |{\mathscr {L}}_{2 k -2}^{h} (i) |\\ =&q^{2 k - h -1} - 1 - \sum _{i=1}^{k -1} |{\mathscr {L}}_{2 k -2}^{h} (i) |. \end{aligned} \end{aligned}$$
(23)

Let us now partition the sets \({\mathscr {L}}_{2 k -2}^{h} (i)\) above according to the \((\rho , \pi )\)-form of the sequences they contain. Suppose first that \(1 \le i \le h\) and let \(\varvec{\alpha } \in {\mathscr {L}}_{2 k -2}^{h} (i)\). We must have that \(\rho (\varvec{\alpha }) = 0\). Indeed, consider the matrix \(H:= H_{k,k} (\varvec{\alpha })\) that is associated to \(\varvec{\alpha }\). Its rank is i, and so we must have that \(\rho (\varvec{\alpha }) \le i\). However, the fact that \(i \le h\) means that the following matrices are lower skew-triangular, and thus not invertible:

$$\begin{aligned} H[1,1] , H[2,2] , \ldots , H[i,i]. \end{aligned}$$

Therefore, \(\rho (\varvec{\alpha }) \not \in \{ 1, 2, \ldots , i \}\), and so we must have \(\rho (\varvec{\alpha }) = 0\). Note this implies that \(\pi (\varvec{\alpha }) = i - \rho (\varvec{\alpha }) = i\). Hence, by Claim 1, we have

$$\begin{aligned} |{\mathscr {L}}_{2 k -2}^{h} (i) |= |{\mathscr {L}}_{2 k -2}^{h} (i , 0 , i) |= (q-1) q^{i-1} . \end{aligned}$$
(24)

Now suppose that \(h+1 \le i \le k-1\), and let \(\varvec{\alpha } \in {\mathscr {L}}_{2 k -2}^{h} (i)\). By similar reasoning as above, we must have that \(\rho (\varvec{\alpha }) = 0\) or \(h+1 \le \rho (\varvec{\alpha }) \le i\). Hence,

$$\begin{aligned} \begin{aligned} |{\mathscr {L}}_{2 k -2}^{h} (i) |=&|{\mathscr {L}}_{2 k -2}^{h} (i , 0 , i) |+ \sum _{j=h+1}^{i} |{\mathscr {L}}_{2 k -2}^{h} (i , j , i-j) |\\ =&(q-1) q^{i-1} + \sum _{j=h+1}^{i} |{\mathscr {L}}_{2 k -2}^{h} (i , j , i-j) |. \end{aligned} \end{aligned}$$
(25)

Substituting (24) and (25) into (23), we obtain

$$\begin{aligned} |{\mathscr {L}}_{2 k -2}^{h} (k ) |=&q^{2 k - h -1} - q^{k -1} - \sum _{i=h+1}^{k -1} \sum _{j=h+1}^{i} |{\mathscr {L}}_{2 k -2}^{h} (i , j , i-j) |\\ =&q^{2 k - h -1} - q^{k -1} - \sum _{j=h+1}^{k -1} \sum _{i=j}^{k -1} |{\mathscr {L}}_{2 k -2}^{h} (i , j , i-j) |\\ =&q^{2 k - h -1} - q^{k -1} - q^{k} \sum _{j=h+1}^{k -1} |{\mathscr {L}}_{2 j -2}^{h} (j) |\cdot q^{-j} , \end{aligned}$$

where the last line applies (22). This is a recurrence relation. The initial condition is

$$\begin{aligned} |{\mathscr {L}}_{2 (h+1) -2}^{h} (h+1) |= (q-1) q^h ; \end{aligned}$$

Indeed, if \(\varvec{\alpha } \in {\mathscr {L}}_{2 (h+1) -2}^{h} (h+1)\) then \(H_{h+1, h+1} (\varvec{\alpha })\) has all entries above the main skew-diagonal equal to 0, and so to have rank equal to \(h+1\) (i.e., full rank) we must have that the entries in the main skew-diagonal are in \({\mathbb {F}}_q^*\), while the entries in the last h skew-diagonals are free to take any values in \({\mathbb {F}}_q\). Now, it can easily be verified that the solution to the recurrence relation is

$$\begin{aligned} |{\mathscr {L}}_{2 k -2}^{h} (k ) |= (q-1) q^{2k-h-2} . \end{aligned}$$
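Indeed, the verification amounts to a geometric series: substituting \(|{\mathscr {L}}_{2 j -2}^{h} (j) |= (q-1) q^{2j-h-2}\) into the right-hand side of the recurrence gives

$$\begin{aligned} q^{2 k - h -1} - q^{k -1} - q^{k} \sum _{j=h+1}^{k -1} (q-1) q^{2j-h-2} \cdot q^{-j} =&q^{2 k - h -1} - q^{k -1} - (q-1) q^{k-h-2} \sum _{j=h+1}^{k -1} q^{j} \\ =&q^{2 k - h -1} - q^{k -1} - q^{k-h-2} \big ( q^{k} - q^{h+1} \big ) \\ =&q^{2 k - h -1} - q^{2k-h-2} = (q-1) q^{2k-h-2} , \end{aligned}$$

and at \(k = h+1\) the empty sum leaves \(q^{2(h+1)-h-1} - q^{h} = (q-1) q^{h}\), matching the initial condition.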

Substituting this into (22) proves Claim 2.

Claims 3 and 4 We begin with Claim 4. If \(r=0\), the only element in \({\mathscr {L}}_n^h (r )\) is the sequence of zeros, thus proving this case.

Suppose instead that \(1 \le r \le \min \{ h, n-h+1 \}\). As described in the theorem, if \(\varvec{\alpha } \in {\mathscr {L}}_n^h (r )\) then \(\rho (\varvec{\alpha }) = 0\), and so this case follows from Claim 1.

Now suppose that \(h+1 \le r \le n_1 - 1\). If \(\varvec{\alpha } \in {\mathscr {L}}_n^h (r )\) then \(\rho (\varvec{\alpha }) = 0\) or \(\rho (\varvec{\alpha }) \in \{ h+1, h+2, \ldots , r \}\). Hence, we have

$$\begin{aligned} |{\mathscr {L}}_n^h (r ) |=&|{\mathscr {L}}_n^h (r , 0 , r) |+ \sum _{\rho _1 = h+1}^{r} |{\mathscr {L}}_n^h (r , \rho _1 , r - \rho _1 ) |\\ =&(q-1) q^{r-1} + (q-1)^2 \sum _{\rho _1 = h+1}^{r-1} q^{\rho _1 + r - h - 2} \hspace{1em} + (q-1) q^{2r-h-1} \\ =&(q^2 -1) q^{2r-h-2} , \end{aligned}$$

where the second equality uses Claim 2.

Finally, suppose that \(r=n_1\). Then,

$$\begin{aligned} |{\mathscr {L}}_n^h (r ) |=&|{\mathscr {L}}_n^h |- \sum _{r=0}^{n_1 -1} |{\mathscr {L}}_n^h (r ) |\\ =&q^{n-h+1} - 1 - \sum _{r=1}^{h} (q-1) q^{r-1} - \sum _{r=h+1}^{n_1 -1} (q^2 -1) q^{2r-h-2} \\ =&q^{n-h+1} - q^{2n_1 - h -2}. \end{aligned}$$

For Claim 3, if n is even, then we have \({\mathscr {L}}_n^h (n_1, n_1, 0 ) = {\mathscr {L}}_n^h (n_1 )\). So, by the last case of Claim 4, we have

$$\begin{aligned} |{\mathscr {L}}_n^h (n_1 , n_1 , 0 ) |= q^{n-h+1} - q^{2n_1 - h -2} = (q-1) q^{n-h} . \end{aligned}$$

Now suppose n is odd, and let

$$\begin{aligned} \varvec{\alpha } =&(\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathscr {L}}_n^h (n_1 , n_1 , 0 ), \\ \varvec{\alpha }' :=&(\alpha _0 , \alpha _1 , \ldots , \alpha _{n-1} ), \end{aligned}$$

and \(H:= H_{n_1, n_2 } (\varvec{\alpha })\). Since \(\rho ( \varvec{\alpha }) = n_1\), we have that \(H_{n_1, n_1 } (\varvec{\alpha }') = H[n_1 | n_1]\) has full rank. Therefore, \(\varvec{\alpha }' \in {\mathscr {L}}_{n-1}^h (n_1, n_1, 0 ) = {\mathscr {L}}_{n-1}^h (n_1 )\), and so there are \((q-1) q^{n-h-1}\) possible values it could take (by the even case of Claim 3, which we have just established). Meanwhile, \(\alpha _n\) is free to take any value in \({\mathbb {F}}_q\), of which there are q possibilities. Thus, \(|{\mathscr {L}}_n^h (n_1, n_1, 0 ) |= (q-1) q^{n-h}\), as required.

We remark that the difference between the odd case and the even case is that when n is odd, the matrix \(H_{n_1, n_2} (\varvec{\alpha })\) is not quite square; the additional column allows the matrix to have full rank without necessarily having \(\rho (\varvec{\alpha }) = n_1\). This is why Claim 3 gives the same result as the last case of Claim 4 only when n is even, but not when n is odd.

Claim 5 If \(r < \min \{ l, m \}\), then Corollary 2.2.6 implies that there is a bijection between \({\mathscr {H}}_{l,m}^h (r)\) and \({\mathscr {L}}_{n}^{h} (r)\). The result then follows by the first three cases of Claim 4.

Now suppose \(r = \min \{ l, m \}\). If \(\min \{ l, m \} - 1 \le \min \{ h, n-h+1 \}\), then

$$\begin{aligned} |{\mathscr {H}}_{l,m}^h (r) |=&|{\mathscr {H}}_{l,m}^h \big ( \min \{ l , m \} \big ) |\\ =&|{\mathscr {H}}_{l,m}^h |- \sum _{i=0}^{\min \{ l , m \} -1} |{\mathscr {H}}_{l,m}^h (i) |\\ =&q^{l+m-h-1} - 1 - (q-1) \sum _{i=1}^{\min \{ l , m \} -1} q^{i-1} \\ =&q^{l+m-h-1} - q^{\min \{ l , m \} -1} , \end{aligned}$$

where the third equality uses the first part of Claim 5. Now suppose that \(\min \{ l, m \} - 1 \ge h+1\). Note that this gives \(h \le \min \{ l, m \} -2 \le n_1 -2\), and so \(h < n-h+1\). Thus, we have

$$\begin{aligned} |{\mathscr {H}}_{l,m}^h (r) |=\,&|{\mathscr {H}}_{l,m}^h \big ( \min \{ l , m \} \big ) |\\ =\,&|{\mathscr {H}}_{l,m}^h |- \sum _{i=0}^{\min \{ l , m \} -1} |{\mathscr {H}}_{l,m}^h (i) |\\ =\,&q^{l+m-h-1} - 1 - (q-1) \sum _{i=1}^{h} q^{i-1} \hspace{1em} - (q^2 -1) \sum _{i=h+1}^{\min \{ l , m \} -1} q^{2i-h-2} \\ =\,&q^{l+m-h-1} - q^{2 \min \{ l , m \} -h-2}. \end{aligned}$$

Again, the third equality uses the first part of Claim 5. \(\square \)
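The counts in Theorem 2.3.1 can also be checked by brute force for small parameters. The following sketch (our own; the helper names are not from the paper) takes \(q = 2\), \(n = 4\), \(h = 0\), tallies all 32 sequences by \((r, \rho _1, \pi _1)\), and compares against the values predicted by Claims 1–3:

```python
# Our own brute-force check of Theorem 2.3.1 for q = 2, n = 4, h = 0.
from itertools import product

p = 2  # q = 2

def hankel(alpha, l, m):
    return [[alpha[i + j] % p for j in range(m)] for i in range(l)]

def rank(M):
    # Gaussian elimination over F_p
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

n, h = 4, 0
n1 = n // 2 + 1        # n even, so the associated matrix is n_1 x n_1
counts = {}
for tail in product(range(p), repeat=n + 1 - h):
    alpha = (0,) * h + tail            # first h entries are forced to be zero
    r = rank(hankel(alpha, n1, n + 2 - n1))
    rho1 = max([k for k in range(1, n1 + 1)
                if rank(hankel(alpha, k, k)) == k], default=0)
    key = (r, rho1, r - rho1)
    counts[key] = counts.get(key, 0) + 1

# Values predicted by Claims 1-3 with q = 2, n = 4, h = 0:
expected = {(0, 0, 0): 1, (1, 0, 1): 1, (2, 0, 2): 2,   # Claim 1: rho_1 = 0
            (1, 1, 0): 2, (2, 1, 1): 2, (2, 2, 0): 8,   # Claim 2
            (3, 3, 0): 16}                              # Claim 3: rho_1 = n_1
assert counts == expected
```

The seven cells partition all \(q^{n-h+1} = 32\) sequences, and summing over \(\rho _1, \pi _1\) recovers Claim 4: for instance, \(|{\mathscr {L}}_4^0 (2)| = 2 + 2 + 8 = 12 = (q^2-1) q^{2 \cdot 2 - 0 - 2}\).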

2.4 Kernel structure

We will now investigate the kernel structure of Hankel matrices. We begin with an extension to Corollary 2.2.6.

Corollary 2.4.1

Suppose \(\varvec{\alpha } \in {\mathscr {L}}_n^h (r)\). We have

$$\begin{aligned} \dim {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = {\left\{ \begin{array}{ll} 0 &{}\text { if }1 \le m \le r, \\ \dim {{\,\textrm{ker}\,}}H_{l+1 , m-1} (\varvec{\alpha }) + 1 &{}\text { if }r< m \le n+2-r, \\ \dim {{\,\textrm{ker}\,}}H_{l+1 , m-1}(\varvec{\alpha }) + 2 &{}\text { if }n+2-r < m \le n+1. \end{array}\right. } \end{aligned}$$

(For the case \(r=0\) we must define \(\dim {{\,\textrm{ker}\,}}H_{n+2,0} (\varvec{\alpha }):= 0\)). Thus,

$$\begin{aligned} \dim {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = {\left\{ \begin{array}{ll} 0 &{}\text { if }1 \le m \le r, \\ m-r &{}\text { if }r< m \le n+2-r, \\ 2m - n -2 &{}\text { if }n+2-r < m \le n+1. \end{array}\right. } \end{aligned}$$

Proof

The first statement follows from the second. The second statement follows directly from Corollary 2.2.6 and the fact that the dimension of the kernel of a matrix is just the number of columns minus the rank. \(\square \)
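The second display of Corollary 2.4.1 can be verified numerically on a concrete example (a sketch of our own, assuming a prime field; the helper names are ours):

```python
# Our own numerical check of Corollary 2.4.1: for a fixed sequence alpha we
# compare dim ker H_{l,m}(alpha) = m - rank against the piecewise formula,
# for every m with l + m - 2 = n.
p = 5

def hankel(alpha, l, m):
    # l x m Hankel matrix H_{l,m}(alpha): entry (i, j) is alpha_{i+j}
    return [[alpha[i + j] % p for j in range(m)] for i in range(l)]

def rank(M):
    # Gaussian elimination over F_p
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r

alpha = [1, 1, 2, 3, 0, 3, 3, 1, 0]   # Fibonacci mod 5 with altered last entry
n = len(alpha) - 1                    # n = 8
r = rank(hankel(alpha, n // 2 + 1, n // 2 + 1))  # rank of the square matrix

def predicted(m):
    # the piecewise formula of Corollary 2.4.1
    if m <= r:
        return 0
    if m <= n + 2 - r:
        return m - r
    return 2 * m - n - 2

for m in range(1, n + 2):
    l = n + 2 - m
    dim_ker = m - rank(hankel(alpha, l, m))  # columns minus rank
    assert dim_ker == predicted(m)
```

For this sequence \(r = 3\), and the kernel dimensions for \(m = 1, \ldots , 9\) come out as \(0,0,0,1,2,3,4,6,8\), with the jump from increments of 1 to increments of 2 occurring at the second characteristic degree \(n+2-r = 7\).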

Remark 2.4.2

For intuition it is helpful to understand the first result in Corollary 2.4.1 by making use of the \((\rho , \pi )\)-form. We start with \(m=1\), and add a column and remove a row incrementally (while maintaining that we have a Hankel matrix). For simplicity, assume \(2 \le r \le n_1 -1\) and \(\rho (\varvec{\alpha }) = \rho _1 \in \{ 1, 2, \ldots , r -1 \}\). When \(m \le r\) we have full column rank and hence the kernel is trivial. If \(l,m > r\), then (18) gives

where there are \(r - \rho _1\) ones at the bottom-right (recall that 1 represents an element in \({\mathbb {F}}_q^*\), while \(*\) represents an element in \({\mathbb {F}}_q\)). The rank of this matrix is r. Now, adding a column and removing a row maintains this form until \(l=r\); that is, until \(m = n+2-r\). In particular, the rank remains the same, but the number of columns increases by 1 each time; thus, the dimension of the kernel increases by 1 each time. If we now take \(2 \le l \le r\), that is, \(n+2-r \le m \le n\), then (19) gives

Removing a row will decrease the rank by 1. If we also add a column then the effect is to increase the dimension of the kernel by 2.

Definition 2.4.3

(Characteristic Degrees) Suppose \(\varvec{\alpha } \in {\mathscr {L}}_n^h (r)\). The characteristic degrees of \(\varvec{\alpha }\) are defined to be r and \(n+2-r\). We extend this definition to any Hankel matrix \(H_{l,m} (\varvec{\alpha })\) associated to \(\varvec{\alpha }\).

The characteristic degrees are just the boundaries for the cases in Corollary 2.4.1. Note that we always have \(r \le n+2-r\), with equality occurring if n is even and \(r= \frac{n+2}{2}\). Corollary 2.4.1 is given in [18] as Proposition 5.4, although it is stated differently and the proof is different. Definition 2.4.3 is also given in [18] as Definition 5.3.

Now, in what follows, it will be necessary to view vectors in \({\mathbb {F}}_q^{k+1}\) (for any integer \(k \ge 0\)) as polynomials in \({\mathcal {A}}:= {\mathbb {F}}_q [T]\). A vector \((v_0, v_1, \ldots , v_k)^T\) should be considered the same as the polynomial \(v_0 + v_1 T + \ldots + v_k T^k\), and vice versa. Clearly, a vector has a unique polynomial associated with it. However, a polynomial does not have a unique vector associated with it. For example, \({\textbf{v}}_1:= (v_0, v_1, \ldots , v_k)^T\) and \({\textbf{v}}_2:= (v_0, v_1, \ldots , v_k, 0)^T\) are different vectors, but they are associated with the same polynomial. In order to avoid confusion and to ensure everything is well defined, we will make it clear what vector space we are working with, and its dimension will inform us of the number of zeros that should appear at the end of the vector. It should be noted that in [18] the polynomial associated with \({\textbf{v}}_2\) is said to have a root at infinity (associated with the 0 in the last entry of the vector), thus distinguishing it from the polynomial associated with \({\textbf{v}}_1\); however, we will not employ this convention. Finally, it is helpful to keep in mind that a polynomial of degree k has \(k+1\) coefficients, and so any vector associated to it must be in at least \((k+1)\)-dimensional space.
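This convention can be made concrete in a couple of lines (an illustration only; the function name is ours):

```python
# Illustration of the vector/polynomial convention above (the function name
# is ours): (v_0, ..., v_k)^T corresponds to v_0 + v_1 T + ... + v_k T^k.
def as_poly(v):
    terms = [f"{c}T^{i}" for i, c in enumerate(v) if c]
    return " + ".join(terms) if terms else "0"

v1 = [2, 0, 1]        # a vector in F_q^3
v2 = [2, 0, 1, 0]     # a different vector, in F_q^4
# One polynomial, two distinct vectors: the ambient dimension (3 vs. 4) is
# what records the trailing zero that the polynomial itself forgets.
assert as_poly(v1) == as_poly(v2) == "2T^0 + 1T^2"
```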

In this subsection we prove the following four theorems and their associated corollaries. The proofs of the theorems are provided at the end of this subsection.

Theorem 2.4.4

Let \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\), where \(n > 0\); and denote the characteristic degrees by \(c_1:= r\) and \(c_2:= n-r+2\). Also, let l, m be such that \(1 \le m \le n+1\) and \(l+m-2=n\). Then, there exist coprime polynomials \(A_1, A_2 \in {\mathcal {A}}\) with

$$\begin{aligned}&{{\,\textrm{deg}\,}}A_1 = \rho _1 , \\&{{\,\textrm{deg}\,}}A_2 \le c_2 , \end{aligned}$$

such that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1 \\ {{\,\textrm{deg}\,}}B_2 \le m - c_2 - 1 \end{array} \bigg \} \subseteq {\mathbb {F}}_q^m . \end{aligned}$$
(26)

If \(\rho _1\) is not equal to \(r=c_1\), then \({{\,\textrm{deg}\,}}A_2\) is necessarily equal to \(c_2\).

The proof of this theorem begins on Page 35.

This leads us to the following definition.

Definition 2.4.5

(Characteristic Polynomials) In Theorem 2.4.4, we define the polynomials \(A_1\) and \(A_2\) to be the first and second characteristic polynomials of the sequence \(\varvec{\alpha }\), respectively.

Remark 2.4.6

For certain values of m, the restrictions \({{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1\) and \({{\,\textrm{deg}\,}}B_2 \le m - c_2 - 1\) may force \(B_1 = 0\) or \(B_2 = 0\) (recall we define \({{\,\textrm{deg}\,}}0 = - \infty \) and so \({{\,\textrm{deg}\,}}B \le -1\) implies \(B=0\)). Indeed, we can rewrite (26) in the following manner:

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = {\left\{ \begin{array}{ll} \{ {\textbf{0}} \} &{}\text { if }1 \le m \le c_1, \\ \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1 \end{array} \Big \} \subseteq {\mathbb {F}}_q^m &{}\text { if }c_1 + 1 \le m \le c_2, \\ \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1 \\ {{\,\textrm{deg}\,}}B_2 \le m - c_2 - 1 \end{array} \bigg \} \subseteq {\mathbb {F}}_q^m&\text { if }c_2 +1 \le m \le n+1. \end{array}\right. } \end{aligned}$$

We further note that when \(r=0,1\), the polynomial \(B_2\) will always be forced to be 0, regardless of the values of l, m. Thus, the value, or even the existence, of \(A_2\) may seem irrelevant when \(r=0,1\). However, this is not the case. Indeed, we often need to consider extensions \(\varvec{\alpha }'\) of \(\varvec{\alpha }\) and understand how the characteristic polynomials of \(\varvec{\alpha }'\) depend on those of \(\varvec{\alpha }\); and in this case it is necessary to always have a definition of \(A_2\) for \(\varvec{\alpha }\). We do this in Theorem 2.4.13. So, we make the following natural definitions, which are consistent with any claims made about \(A_2\) in Theorem 2.4.4 above:

$$\begin{aligned} A_2 := {\left\{ \begin{array}{ll} 0 &{}\text { if }\varvec{\alpha } \in {\mathscr {L}}_n (0 , 0 , 0 )\text { (that is, }\varvec{\alpha }= {\textbf{0}}), \\ 1 &{}\text { if }\varvec{\alpha } \in {\mathscr {L}}_n (1 , 1 , 0 ), \\ T^{n+1} &{}\text { if }\varvec{\alpha } \in {\mathscr {L}}_n (1 , 0 , 1 ). \end{array}\right. } \end{aligned}$$
(27)
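The middle case of the display in Remark 2.4.6 states that, for \(c_1 + 1 \le m \le c_2\), every polynomial in the kernel is a multiple of \(A_1\). This too can be checked numerically (our own sketch, with our own helper names): for the altered-Fibonacci sequence used earlier we have \(r = 3\) and \(\rho _1 = 2\), and one can check that the kernel of \(H_{6,4}\) is spanned by the coefficient vector of \(A_1 = T^2 + 4T + 4\) over \({\mathbb {F}}_5\), which we take as given here.

```python
# Our own numerical check: for c_1 + 1 <= m <= c_2, every kernel vector of
# H_{l,m}(alpha), read as a polynomial, is divisible by A_1.
p = 5

def hankel(alpha, l, m):
    return [[alpha[i + j] % p for j in range(m)] for i in range(l)]

def rref(M):
    # Gaussian elimination over F_p; returns (rank, reduced matrix)
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[r])]
        r += 1
    return r, M

def nullspace(M):
    # basis of ker M over F_p, via the free columns of the reduced matrix
    r, R = rref(M)
    cols = len(M[0])
    pivots = [next(j for j, x in enumerate(row) if x) for row in R[:r]]
    basis = []
    for f in (j for j in range(cols) if j not in pivots):
        v = [0] * cols
        v[f] = 1
        for i, pc in enumerate(pivots):
            v[pc] = (-R[i][f]) % p
        basis.append(v)
    return basis

def divisible(v, a):
    # is the polynomial v (v[i] = coefficient of T^i) a multiple of monic a?
    v = v[:]
    da = len(a) - 1
    for i in range(len(v) - 1, da - 1, -1):
        c = v[i]
        if c:
            for t in range(da + 1):
                v[i - da + t] = (v[i - da + t] - c * a[t]) % p
    return not any(v)

alpha = [1, 1, 2, 3, 0, 3, 3, 1, 0]   # altered Fibonacci mod 5: r = 3
A1 = [4, 4, 1]                        # A_1 = T^2 + 4T + 4 (monic, deg rho_1 = 2)
n, c1 = 8, 3                          # c_1 = r

m = 6                                 # c_1 + 1 <= m <= c_2 = 7
basis = nullspace(hankel(alpha, n + 2 - m, m))
assert len(basis) == m - c1           # dim ker = m - r
assert all(divisible(v, A1) for v in basis)
```

Note that the degree bound is governed by \(c_1 = r = 3\) rather than by \({{\,\textrm{deg}\,}}A_1 = \rho _1 = 2\): the shift \(T^{m - \rho _1 - 1} A_1\) has degree \(m-1\), but it is not in the kernel, exactly as the restriction \({{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1\) predicts.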

Remark 2.4.7

Let us comment on whether \(A_1, A_2\) are unique. Suppose \(c_1 \ne c_2\). We can see that \(A_1\) is unique up to multiplication by a unit in \({\mathbb {F}}_q^*\), and unless otherwise stated \(A_1\) should be taken to be monic. For \(r \ge 2\), the polynomial \(A_2\) should also be taken to be monic unless otherwise stated. However, even then it is not unique: we can multiply it by a unit in \({\mathbb {F}}_q^*\) and add \(B_2 A_1\) to it, for any \({{\,\textrm{deg}\,}}B_2 \le c_2 - c_1\). Thus, if we state that \(A_2\) is the second characteristic polynomial, it is with the understanding that it is generally not unique. Note that all possibilities for \(A_2\) are equivalent modulo \(A_1\); and in particular, if \(\rho _1\) is equal to \(r=c_1\) (that is, the sequence \(\varvec{\alpha }\) is quasi-regular), then we can choose \(A_2\) to be monic and of degree less than \({{\,\textrm{deg}\,}}A_1\).

The case when \(c_1 = c_2\) occurs when n is even and \(c_1 = c_2 = n_1\). In this case we have

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{n_1 , n_1 } (\varvec{\alpha }) =&\{ {\textbf{0}} \} , \\ {{\,\textrm{ker}\,}}H_{n_1 -1 , n_1 +1 } (\varvec{\alpha }) =&\{ B A + B' A' : B , B' \in {\mathbb {F}}_q \} , \end{aligned}$$

for some \(A,A' \in {\mathcal {A}}\) with at least one having degree equal to \(n_1\). As both of these polynomials first appear in the same matrix, it is not immediately obvious which should be the first characteristic polynomial and which the second. However, this can be addressed in the following manner. We let \(A_2\) be the polynomial of smaller degree between \(A,A'\), multiplied by an element of \({\mathbb {F}}_q^*\) so that it is monic; and we let \(A_1\) be the polynomial of higher degree, again normalized to be monic. If both \(A,A'\) have the same degree, then we take \(A_2\) to be the smallest monic representative of A modulo \(A'\), and we take \(A_1\) to be \(A'\), again normalized to be monic. There is more than one possibility for the specific values that \(A,A'\) can take, with all possibilities spanning \({{\,\textrm{ker}\,}}H_{n_1 -1, n_1 +1 } (\varvec{\alpha })\) as above; but regardless of which possibility we have, the value of \(A_2\) is the same. In cases where we do not have \(c_1 = c_2 = n_1\), this uniqueness would apply to \(A_1\), not \(A_2\); however, the definition we have here is consistent with the degree bounds on \(A_1, A_2\) given in the theorem for the other cases.

It should be noted that the characteristic degrees and characteristic polynomials of a sequence \(\varvec{\alpha }\) completely determine the kernel structure. However, the characteristic polynomials alone do not, as we will see in Theorem 2.4.13 where a sequence \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n )\) and a certain extension \(\varvec{\alpha }' = (\alpha _0, \alpha _1, \ldots , \alpha _n, \alpha _{n+1} )\) can have the same characteristic polynomials (but different characteristic degrees).

The following corollary is easily deduced from Theorem 2.4.4.

Corollary 2.4.8

Suppose \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\) and \(H:= H_{l,m} (\varvec{\alpha })\) where \(l+m-2=n\). We have already established that if \(m \le r\), then the kernel of H is trivial.

If \(r < m \le n+2-r\) and \(\varvec{\alpha }\) is not quasi-regular (that is, \(\pi _1 \ne 0\)), then there are no vectors in the kernel of H of the form \((v_0, v_1, \ldots , v_{m-2}, 1)^T\), for some \(v_0, \ldots , v_{m-2} \in {\mathbb {F}}_q\); that is, none of the polynomials in the kernel are monic and of degree \(m-1\). Whereas, if \(\varvec{\alpha }\) is quasi-regular (that is, \(\pi _1 = 0\)), then exactly \(\frac{1}{q}\) of the vectors in the kernel of H are of the form \((v_0, v_1, \ldots , v_{m-2}, 1)^T\), for some \(v_0, \ldots , v_{m-2} \in {\mathbb {F}}_q\); that is, \(\frac{1}{q}\) of the polynomials in the kernel are monic and of degree \(m-1\).

If \(n+2-r < m \le n+1\), then, regardless of the value of \(\pi _1\), exactly \(\frac{1}{q}\) of the vectors in the kernel of H are of the form \((v_0, v_1, \ldots , v_{m-2}, 1)^T\), for some \(v_0, \ldots , v_{m-2} \in {\mathbb {F}}_q\); that is, \(\frac{1}{q}\) of the polynomials in the kernel are monic and of degree \(m-1\).
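The fraction \(\frac{1}{q}\) in the quasi-regular case can be observed by brute force. The following Python sketch is an illustration only, not part of the formal development: it assumes \(q = 2\) and the toy sequence \(\varvec{\alpha } = (1, 0, \ldots , 0)\) of length 7, for which the kernel polynomials are exactly the multiples of \(T\), and it enumerates the kernel of \(H_{4,4} (\varvec{\alpha })\), counting the kernel vectors whose last entry is 1 (the monic kernel polynomials of top degree).

```python
import itertools

q = 2
alpha = [1, 0, 0, 0, 0, 0, 0]  # n = 6; the kernel polynomials are the multiples of A_1 = T
l, m = 4, 4                    # l + m - 2 = n

# Hankel matrix H_{l,m}(alpha) and a brute-force kernel computation over F_q
H = [[alpha[i + j] for j in range(m)] for i in range(l)]
ker = [v for v in itertools.product(range(q), repeat=m)
       if all(sum(row[j] * v[j] for j in range(m)) % q == 0 for row in H)]
monic = [v for v in ker if v[-1] == 1]  # monic kernel polynomials of top degree

print(len(ker), len(monic))  # 8 4
```

Here the kernel has size \(q^3 = 8\), and exactly \(q^2 = 4\) of its vectors end in a 1, so the monic proportion is indeed \(\frac{1}{q}\).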

The following can be viewed as a converse to Theorem 2.4.4.

Theorem 2.4.9

Claim 1 Suppose we have \(A_1 \in {\mathcal {A}} \backslash \{ 0 \}\) with \(\rho _1:= {{\,\textrm{deg}\,}}A_1 \le 1\), and let \(n \ge \rho _1\). Then, there exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (\rho _1, \rho _1, 0)\) with first characteristic polynomial equal to \(A_1\). If \(\rho _1 = 0\), then there also exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (0, 0, 1)\) with first characteristic polynomial equal to \(A_1\). The second characteristic polynomials will be as in (27).

Claim 2 Suppose we have \(A_1 \in {\mathcal {A}} \backslash \{ 0 \}\) with \(\rho _1:= {{\,\textrm{deg}\,}}A_1 \le 1\), and \(A_2 \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A_2 \ge {{\,\textrm{deg}\,}}A_1 +2\). Also, let \(n \ge {{\,\textrm{deg}\,}}A_2\) and \(r:= n+2- {{\,\textrm{deg}\,}}A_2\). Then, there exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, r - \rho _1 )\) with characteristic polynomials equal to \(A_1, A_2\).

Claim 3 Suppose we have coprime \(A_1, A_2 \in {\mathcal {A}}\) with \(r:= {{\,\textrm{deg}\,}}A_1 \ge 2\), and let \(n \ge \max \{ r, {{\,\textrm{deg}\,}}A_2 \} + r - 2\). Then, there exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (r,r,0)\) with characteristic polynomials \(A_1, A_2\). Furthermore, \(\varvec{\alpha }\) is unique up to multiplication by elements in \({\mathbb {F}}_q^*\).

Claim 4 Suppose we have coprime \(A_1, A_2 \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A_2 > {{\,\textrm{deg}\,}}A_1 \ge 2\), and let

$$\begin{aligned} \rho _1 :=&{{\,\textrm{deg}\,}}A_1 \\ \pi _1 :=&{{\,\textrm{deg}\,}}A_2 - {{\,\textrm{deg}\,}}A_1 \\ r :=&\rho _1 + \pi _1 \\ n :=&{{\,\textrm{deg}\,}}A_2 + r -2 . \end{aligned}$$

Then, there exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\) with characteristic polynomials \(A_1, A_2\). Furthermore, \(\varvec{\alpha }\) is unique up to multiplication by elements in \({\mathbb {F}}_q^*\).

The proof of this theorem begins on page 41. This theorem demonstrates the extent to which we can take any coprime polynomials \(A_1, A_2\) and integer n such that there is a sequence \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n )\) with characteristic polynomials equal to \(A_1, A_2\). Claims 1 and 2 address the cases where \(\rho _1 \le 1\); these are not difficult, but they are included for completeness. Claim 3 considers the case where \(\varvec{\alpha }\) is quasi-regular, and we can see by Theorem 2.4.4 that this allows for the possibility that \({{\,\textrm{deg}\,}}A_2 \le {{\,\textrm{deg}\,}}A_1\). On the other hand, if \(\varvec{\alpha }\) is not quasi-regular, then by Theorem 2.4.4 we must have \({{\,\textrm{deg}\,}}A_2 > {{\,\textrm{deg}\,}}A_1\), and this is the case that Claim 4 considers.

Regarding the definition of r in Claim 2: if \(A_2\) is to be the second characteristic polynomial, then we need the second characteristic degree of \(\varvec{\alpha }\), which is \(n+2 - r\), to be equal to \({{\,\textrm{deg}\,}}A_2\). Regarding the bounds on n: for Claim 3, we note that the characteristic degrees are r and \(n+2-r\), and since the latter must be at least as large as the former, we obtain the requirement that \(n \ge r+r-2\). Furthermore, by Theorem 2.4.4, we must have \({{\,\textrm{deg}\,}}A_2 \le n+2-r\), and thus \(n \ge {{\,\textrm{deg}\,}}A_2 + r -2\). For Claim 4, by Theorem 2.4.4 we must have \({{\,\textrm{deg}\,}}A_2 = n+2-r\); that is, \(n = {{\,\textrm{deg}\,}}A_2 + r -2\). The values given for \(r, \rho _1, \pi _1\) are also required by Theorem 2.4.4.

Theorem 2.4.10

Suppose

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r , r , 0 ) \end{aligned}$$

with \(r \ge 2\) (note that \(\varvec{\alpha }\) is quasi-regular). Let \(A_1, A_2\) be the characteristic polynomials. We necessarily have

$$\begin{aligned} d_1 := {{\,\textrm{deg}\,}}A_1 = r , \end{aligned}$$

and we can choose \(A_2\) such that

$$\begin{aligned} d_2 := {{\,\textrm{deg}\,}}A_2 < {{\,\textrm{deg}\,}}A_1 . \end{aligned}$$

Now, if \(d_2 \ge 1\), then let \(A_3\) be the unique polynomial satisfying

$$\begin{aligned} A_1 = R_2 A_2 + A_3 \hspace{1em} \text { and } \hspace{1em} d_3 := {{\,\textrm{deg}\,}}A_3 < {{\,\textrm{deg}\,}}A_2 , \end{aligned}$$

for some polynomial \(R_2\).

Case 1: If \(d_2 \ge 2\), then

$$\begin{aligned} \varvec{\alpha }^{(2)} := (\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 + d_2 -2}) \end{aligned}$$

is in \({\mathscr {L}}_{d_1 + d_2 -2} (d_2, d_2, 0 )\) and has characteristic polynomials \(A_2, A_3\).

Case 2: If \(d_2 = 1\), then

$$\begin{aligned} \varvec{\alpha }^{(2)} := (\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 }) \end{aligned}$$

is in \({\mathscr {L}}_{d_1 } (2, 1, 1 )\) and has characteristic polynomials \(A_2, A_1\) (note that the order is important).

Case 3: If \(d_2 = 0\), then

$$\begin{aligned} \varvec{\alpha }^{(2)} := (\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 }) \end{aligned}$$

is in \({\mathscr {L}}_{d_1 } (2, 0, 2 )\) and has characteristic polynomials \(A_2, A_1\) (note that the order is important).

Furthermore, \(\varvec{\alpha }\) is the unique sequence in \({\mathscr {L}}_n (r, r, 0 )\) that has characteristic polynomials \(A_1, A_2\) and gives the above properties for \(\varvec{\alpha }^{(2)}\).

Suppose now that

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r , \rho _1 , \pi _1 ) , \end{aligned}$$

where \(r \ge 2\) and \(\pi _1 \ge 1\); and let \(A_1, A_2\) be the characteristic polynomials. Then

$$\begin{aligned} \varvec{\alpha }^{(1)} := (\alpha _0 , \alpha _1 , \ldots , \alpha _{n - \pi _1 }) \end{aligned}$$

is in \({\mathscr {L}}_{n - \pi _1 } (\rho _1, \rho _1, 0 )\) and has characteristic polynomials \(A_1, A_2\). A similar result holds for the cases \(r \le 1\), except that the second characteristic polynomial of \(\varvec{\alpha }^{(1)}\) will not be that of \(\varvec{\alpha }\); rather, it will be as defined in Theorem 2.4.4.

The proof of this theorem begins on page 39. The theorem above demonstrates the manifestation of the Euclidean algorithm in Hankel matrices, and this is made clearer in the corollaries below. The final claim in the theorem is given in order to demonstrate that even if a sequence is not quasi-regular (which is required for the main part of the theorem) a truncation can be taken that is quasi-regular and has the same characteristic polynomials.
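The chain of remainders appearing here and in the corollaries below can be computed directly. The following Python sketch is illustrative only: it represents polynomials over \({\mathbb {F}}_p\) as coefficient lists (lowest degree first), and the example pair \(A_1 = T^3 + T + 1\), \(A_2 = T^2 + 1\) over \({\mathbb {F}}_2\) is an arbitrary coprime choice.

```python
p = 2  # work over F_p, p prime

def trim(a):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    while a and a[-1] % p == 0:
        a = a[:-1]
    return a

def polymod(a, b):
    """Remainder of a divided by b over F_p (b nonzero)."""
    a, b = trim(list(a)), trim(list(b))
    inv = pow(b[-1], p - 2, p)  # inverse of the leading coefficient of b
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        for i, bc in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bc) % p
        a = trim(a)
    return a

def euclid_chain(a1, a2):
    """The polynomials A_1, A_2, A_3, ... produced by the Euclidean algorithm."""
    chain = [trim(list(a1)), trim(list(a2))]
    while chain[-1]:
        chain.append(polymod(chain[-2], chain[-1]))
    return chain[:-1]  # drop the final zero remainder

# A_1 = T^3 + T + 1, A_2 = T^2 + 1 over F_2 (coprime, since A_1 is irreducible)
chain = euclid_chain([1, 1, 0, 1], [1, 0, 1])
print([len(c) - 1 for c in chain])  # degrees d_1, d_2, ...: [3, 2, 0]
```

Since \(A_1, A_2\) are coprime, the chain terminates at a (unit multiple of the) constant polynomial 1, matching the strictly decreasing degrees \(d_1> d_2 > d_3 > \cdots \) in the corollaries below.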

Corollary 2.4.11

Let

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r , r , 0 ) \end{aligned}$$

where \(r \ge 2\). Let the characteristic polynomials of \(\varvec{\alpha }\) be \(A_1, A_2\), where we choose \(A_2\) such that \({{\,\textrm{deg}\,}}A_2 < {{\,\textrm{deg}\,}}A_1\) (which is possible since \(\varvec{\alpha }\) is quasi-regular) and note that \({{\,\textrm{deg}\,}}A_1 = r\). Define \(d_1:= {{\,\textrm{deg}\,}}A_1\) and \(d_2:= {{\,\textrm{deg}\,}}A_2\). Suppose the Euclidean algorithm gives

$$\begin{aligned} A_1 =&R_2 A_2 + A_3{} & {} d_3 := {{\,\textrm{deg}\,}}A_3< {{\,\textrm{deg}\,}}A_2 , \\ A_2 =&R_3 A_3 + A_4{} & {} d_4 := {{\,\textrm{deg}\,}}A_4< {{\,\textrm{deg}\,}}A_3 , \\&\vdots{} & {} \vdots \\ A_t =&R_{t+1} A_{t+1} + A_{t+2}{} & {} d_{t+2} := {{\,\textrm{deg}\,}}A_{t+2} < {{\,\textrm{deg}\,}}A_{t+1} , \end{aligned}$$

where t is such that \({{\,\textrm{deg}\,}}A_{t} \ge 2 > {{\,\textrm{deg}\,}}A_{t+1} \ge 0\). Finally, let \(\varvec{\alpha }^{(1)}:= \varvec{\alpha }\); if \(t \ge 2\) then let

$$\begin{aligned} \varvec{\alpha }^{(2)} :=&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 + d_2 - 2} ) , \\ \varvec{\alpha }^{(3)} :=&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_2 + d_3 - 2} ) , \\&\vdots \\ \varvec{\alpha }^{(t)} :=&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_{t-1} + d_{t} - 2} ) ; \end{aligned}$$

and for all \(t \ge 1\), let

$$\begin{aligned} \varvec{\alpha }^{(t+1)} :=&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_{t}} ) . \end{aligned}$$

Then,

$$\begin{aligned} \begin{aligned}&\varvec{\alpha }^{(1)} \in {\mathscr {L}}_{n} (r , r , 0)\text { and has characteristic polynomials }A_1 ,\ A_2, \\&\varvec{\alpha }^{(2)} \in {\mathscr {L}}_{d_1 + d_2 - 2} (d_2 , d_2 , 0)\text { and has characteristic polynomials }A_2 ,\ A_3, \\&\varvec{\alpha }^{(3)} \in {\mathscr {L}}_{d_2 + d_3 - 2} (d_3 , d_3 , 0)\text { and has characteristic polynomials }A_3 ,\ A_4, \\&\vdots \\&\varvec{\alpha }^{(t)} \in {\mathscr {L}}_{d_{t-1} + d_t -2} (d_{t} , d_{t} , 0)\text { and has characteristic polynomials }A_{t} ,\ A_{t+1}; \end{aligned} \end{aligned}$$
(28)

and

$$\begin{aligned} \varvec{\alpha }^{(t+1)} \in {\left\{ \begin{array}{ll} {\mathscr {L}}_{d_{t}} (2 , 1 , 1) &{}\text { if }d_{t+1} =1, \\ {\mathscr {L}}_{d_{t}} (2 , 0 , 2) &{}\text { if }d_{t+1} =0, \end{array}\right. } \hspace{1em} \text { and has characteristic polynomials }A_{t+1} ,\ A_{t}. \end{aligned}$$

Furthermore, for any given \(1 \le i \le t\), the sequence \(\varvec{\alpha }^{(i)}\) is the unique extension of \(\varvec{\alpha }^{(t+1)}\) satisfying the associated conditions in (28).

Proof

This follows by successive applications of Theorem 2.4.10. \(\square \)

Corollary 2.4.12

Suppose \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n ) \in {\mathscr {L}}_n (r,r,0)\) with \(r \ge 2\), and let the characteristic polynomials be \(A_1, A_2\). Note that \({{\,\textrm{deg}\,}}A_1 = r \ge 2\), and we choose \(A_2\) such that \({{\,\textrm{deg}\,}}A_2 < {{\,\textrm{deg}\,}}A_1\). Let \(A_1, A_2, \ldots , A_s, 1\) be the polynomials we obtain by applying the Euclidean algorithm to \(A_1, A_2\), and let \(d_1, d_2, \ldots , d_s, 0\) be their respective degrees. Then, there are exactly \(h:= {{\,\textrm{deg}\,}}A_{s} -1\) consecutive zeros at the beginning of the sequence \(\varvec{\alpha }\).

Proof

We begin with the case where \({{\,\textrm{deg}\,}}A_{s} = 1\). We must show that the first term of \(\varvec{\alpha }\) is non-zero. Note that since \({{\,\textrm{deg}\,}}A_1 \ge 2\), we must have \(s \ge 2\). Let us define \(\varvec{\alpha }^{(s)}\) as in Corollary 2.4.11. That is, \(\varvec{\alpha }^{(s)} = (\alpha _0, \alpha _1, \ldots , \alpha _{d_{s-1}}) \in {\mathscr {L}}_{d_{s-1}} (2,1,1)\) and has characteristic polynomials \(A_s\) and \(A_{s-1}\). In particular, the kernel of the \(1 \times (d_{s-1} +1)\) matrix \(H_{1, d_{s-1} +1} (\varvec{\alpha }^{(s)}) = (\alpha _0, \alpha _1, \ldots , \alpha _{d_{s-1}})\) contains the polynomials

$$\begin{aligned} A_s \hspace{1em} , \hspace{1em} T A_s \hspace{1em} , \hspace{1em} \ldots \hspace{1em} , \hspace{1em} T^{d_{s-1} -2} A_s \end{aligned}$$
(29)

and

$$\begin{aligned} A_{s-1}. \end{aligned}$$
(30)

Suppose for a contradiction that \(\alpha _0 = 0\). Then, (29) implies that \(\alpha _1 = \cdots = \alpha _{d_{s-1} -1} = 0\), and then (30) implies also that \(\alpha _{d_{s-1}} = 0\). Thus, \(\varvec{\alpha }^{(s)} = {\textbf{0}}\), contradicting that \(\varvec{\alpha }^{(s)} \in {\mathscr {L}}_{d_{s-1}} (2,1,1)\).

We now consider the case where \({{\,\textrm{deg}\,}}A_s \ge 2\), and we define \(A_{s+1} =1\). We define \(\varvec{\alpha }^{(s+1)}\) as in Corollary 2.4.11. That is, \(\varvec{\alpha }^{(s+1)} = (\alpha _0, \alpha _1, \ldots , \alpha _{d_{s}}) \in {\mathscr {L}}_{d_{s}} (2,0,2)\) and has characteristic polynomials \(A_{s+1}\) and \(A_{s}\). In particular, the kernel of the matrix \((\alpha _0, \alpha _1, \ldots , \alpha _{d_{s}})\) contains the polynomials

$$\begin{aligned} A_{s+1} \hspace{1em} , \hspace{1em} T A_{s+1} \hspace{1em} , \hspace{1em} \ldots \hspace{1em} , \hspace{1em} T^{d_{s} -2} A_{s+1} \end{aligned}$$
(31)

and

$$\begin{aligned} A_{s}. \end{aligned}$$
(32)

Since \(A_{s+1} = 1\), we deduce from (31) that the first \(d_s - 1\) entries of \(\varvec{\alpha }^{(s+1)}\) are 0. We now need only show that \(\alpha _{d_s -1}\) is non-zero, which follows easily by a contradiction argument: If it were zero, then (32) would imply that \(\alpha _{d_s}\) is also zero, meaning \(\varvec{\alpha }^{(s+1)} = {\textbf{0}}\), which contradicts that \(\varvec{\alpha }^{(s+1)} \in {\mathscr {L}}_{d_{s}} (2,0,2)\). \(\square \)

The following theorem demonstrates how the characteristic polynomials of a sequence change if we increase the length of the sequence. The intuition behind this result is made clear in the proof.

Theorem 2.4.13

Let

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1} , \end{aligned}$$

and let \(c_1, c_2\) be the characteristic degrees and \(A_1, A_2\) be the characteristic polynomials. Now let

$$\begin{aligned} \varvec{\alpha }' := (\alpha _0 , \alpha _1 , \ldots , \alpha _n , \alpha _{n+1} ) \in {\mathbb {F}}_q^{n+2}. \end{aligned}$$

be an extension of \(\varvec{\alpha }\), and denote the characteristic degrees by \(c_1', c_2'\) and the characteristic polynomials by \(A_1', A_2'\). In what follows we define \(n_1':= \lfloor \frac{(n+1)+2}{2} \rfloor \) and \(n_2':= \lfloor \frac{(n+1)+3}{2} \rfloor \).

Claim 1 Suppose that \(\varvec{\alpha } \in {\mathscr {L}}_n (r,r,0)\) where \(0 \le r \le n_1 -1\); and so \(c_1 =r\) and \(c_2 = n+2-r\), and \(A_1 \in {\mathcal {A}}_{r}\) and \(A_2 \in {\mathcal {A}}_{<r}\).

There is one value of \(\alpha _{n+1}\) such that \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r,r,0)\). In which case, we have \(c_1' = c_1 = r\) and \(c_2' = c_2 +1 = n+3-r\), and \(A_1' = A_1\) and \(A_2'= A_2\).

There are \(q-1\) values of \(\alpha _{n+1}\) such that \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r+1,r,1)\). In which case, we have \(c_1' = c_1 +1 = r+1\) and \(c_2' = c_2 = n+2-r\), and \(A_1' = A_1\) and \(A_2' = \beta A_2 + T^{c_2 - c_1} A_1\) for some \(\beta \in {\mathbb {F}}_q^*\). There is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).

Claim 2 Suppose that \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\) where \(\pi _1 \ge 1\) and \(0 \le r \le n_1 -1\) (and, by definition, \(r=\rho _1 + \pi _1\)); and so \(c_1 = r\) and \(c_2 = n+2-r\), and \(A_1 \in {\mathcal {A}}_{\rho _1 }\) and \(A_2 \in {\mathcal {A}}_{c_2}\).

For any value that \(\alpha _{n+1}\) takes in \({\mathbb {F}}_q\), we have \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r+1, \rho _1, \pi _1 +1 )\). We have \(c_1' = c_1 +1 = r+1\) and \(c_2' = c_2 = n+2-r\), and \(A_1' = A_1\) and \(A_2' = \beta T^{c_2 - c_1} A_1 + A_2\) for some \(\beta \in {\mathbb {F}}_q\). There is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).

Claim 3 Suppose n is even and \(\varvec{\alpha } \in {\mathscr {L}}_n (n_1, n_1, 0 )\); and so \(c_1 = c_2 = n_1\), and \(A_1 \in {\mathcal {A}}_{n_1}\) and \(A_2 \in {\mathcal {A}}_{<n_1}\). For any value that \(\alpha _{n+1}\) takes, we have \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (n_1, n_1, 0 )\). We also have \(c_1' = c_1 = n_1\) and \(c_2' = c_2 +1 = n+3 - n_1 = n_1 +1\), and \(A_1' = \beta A_2 + A_1\) and \(A_2' = A_2\), for some \(\beta \in {\mathbb {F}}_q\). There is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).

Suppose n is odd and \(\varvec{\alpha } \in {\mathscr {L}}_n (n_1, n_1, 0 )\); and so \(c_1 = n_1\) and \(c_2 = n_1 +1\), and \(A_1 \in {\mathcal {A}}_{n_1}\) and \(A_2 \in {\mathcal {A}}_{<n_1}\).

  • There is one value of \(\alpha _{n+1}\) such that \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (n_1, n_1, 0 )\); in which case \(c_1' = c_1 = n_1\) and \(c_2' = c_2 +1 = n_1 +2\), and \(A_1' = A_1\) and \(A_2' = A_2\).

  • There are \(q-1\) values of \(\alpha _{n+1}\) such that \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (n_1 +1, n_1 +1, 0 )\); in which case \(c_1' = c_1 +1 = n_1 +1\) and \(c_2' = c_2 = n_1 +1\), and \(A_1' = \beta A_2 + T A_1\) and \(A_2' = A_1\). There is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).

Suppose n is odd and \(\varvec{\alpha } \in {\mathscr {L}}_n (n_1, \rho _1, \pi _1 )\), where \(\pi _1 \ge 1\) and \(0 \le \rho _1 \le n_1 -1\); and so \(c_1 = n_1\) and \(c_2 = n_1 +1\), and \(A_1 \in {\mathcal {A}}_{\rho _1}\) and \(A_2 \in {\mathcal {A}}_{n_1 +1}\). For any value that \(\alpha _{n+1}\) takes, we have \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (n_1 +1, n_1 +1, 0 )\); and \(c_1' = c_2' = n_1 +1\), and \(A_1' = \beta T A_1 + A_2\) and \(A_2' = A_1\), for some \(\beta \in {\mathbb {F}}_q\). There is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).

The proof of this theorem begins on page 43.
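The counts in Claim 1 (one value of \(\alpha _{n+1}\) preserving the first characteristic degree, and \(q-1\) values increasing it) can be checked numerically for a small example. The Python sketch below is illustrative only: it assumes \(q=2\) and the toy sequence \(\varvec{\alpha } = (1, 0, \ldots , 0)\), for which \(r = 1\), and it computes the first characteristic degree as one less than the smallest m for which \(H_{n+2-m,m} (\varvec{\alpha })\) has a nontrivial kernel.

```python
import itertools

q = 2

def has_nontrivial_kernel(alpha, m):
    """Brute-force test of whether ker H_{n+2-m,m}(alpha) is nontrivial over F_q."""
    n = len(alpha) - 1
    l = n + 2 - m
    H = [[alpha[i + j] for j in range(m)] for i in range(l)]
    return any(all(sum(row[j] * v[j] for j in range(m)) % q == 0 for row in H)
               for v in itertools.product(range(q), repeat=m) if any(v))

def first_char_degree(alpha):
    m = 1
    while not has_nontrivial_kernel(alpha, m):
        m += 1
    return m - 1  # the kernels are trivial exactly when m <= r

alpha = [1, 0, 0, 0, 0, 0, 0]  # n = 6; first characteristic degree r = 1
extended = [first_char_degree(alpha + [x]) for x in range(q)]
print(extended)  # [1, 2]
```

One extension (here \(\alpha _{n+1} = 0\)) keeps the first characteristic degree at \(r = 1\), while the other \(q-1 = 1\) extension increases it to \(r+1 = 2\), as in Claim 1.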

We now proceed to prove our four theorems, but first we will need the following two lemmas.

Lemma 2.4.14

Suppose \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1}\), and that we have integers \(l, m, k\) satisfying \(l+m-2 = n\) and \(l>k \ge 1\).

A vector

$$\begin{aligned} {\textbf{v}} = (v_0 , \ldots , v_{m-1})^T \end{aligned}$$

is in the kernel of \(H_{l,m} (\varvec{\alpha })\) if and only if the vectors

$$\begin{aligned} ({\textbf{v}} \mid 0) =&(v_0 , \ldots , v_{m-1} , 0)^T , \\ (0 \mid {\textbf{v}} ) =&(0 , v_0 , \ldots , v_{m-1})^T \end{aligned}$$

are in the kernel of \(H_{l-1,m+1} (\varvec{\alpha })\). This can be extended, and expressed in terms of polynomials, to give the following result:

A polynomial \(A \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A \le m-1\) is in the kernel of \(H_{l,m} (\varvec{\alpha })\) if and only if YA is in the kernel of \(H_{l-k,m+k} (\varvec{\alpha })\) for any \({{\,\textrm{deg}\,}}Y \le k\).

Proof

For the forward implication of the first claim, suppose \({\textbf{v}}\) is in the kernel of \(H_{l,m} (\varvec{\alpha })\), and let

$$\begin{aligned} \varvec{\alpha }' := (\alpha _0 , \alpha _1 , \ldots , \alpha _{n-1}) , \hspace{1em} \varvec{\alpha }'' := (\alpha _1 , \alpha _2 , \ldots , \alpha _{n}) . \end{aligned}$$

Due to the last entry being zero, we can see that \(({\textbf{v}} \mid 0)\) is in the kernel of \(H_{l-1,m+1} (\varvec{\alpha })\) if and only if \({\textbf{v}}\) is in the kernel of \(H_{l-1, m} (\varvec{\alpha }')\). The latter is true, because \(H_{l-1, m} (\varvec{\alpha }')\) is the matrix we obtain by removing the last row from \(H_{l,m} (\varvec{\alpha })\).

Similarly, \((0 \mid {\textbf{v}})\) is in the kernel of \(H_{l-1,m+1} (\varvec{\alpha })\) if and only if \({\textbf{v}}\) is in the kernel of \(H_{l-1, m} (\varvec{\alpha }'')\). Again, the latter is true, because \(H_{l-1, m} (\varvec{\alpha }'')\) is the matrix we obtain by removing the first row from \(H_{l,m} (\varvec{\alpha })\).

The backward implication of the first claim follows from what we have established above.

We now consider the second claim. The first claim tells us that a polynomial \(A \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A \le m-1\) is in the kernel of \(H_{l,m} (\varvec{\alpha })\) if and only if A and TA are in the kernel of \(H_{l-1,m+1} (\varvec{\alpha })\).

Successive applications of this tell us that this holds if and only if

$$\begin{aligned} A, T A, \ldots , T^k A \end{aligned}$$

are in the kernel of \(H_{l-k,m+k} (\varvec{\alpha })\).

Using the fact that the kernel is an \({\mathbb {F}}_q\)-vector space, and thus closed under \({\mathbb {F}}_q\)-linear combinations, we can see that the above holds if and only if \(Y A\) is in the kernel of \(H_{l-k,m+k} (\varvec{\alpha })\) for any \({{\,\textrm{deg}\,}}Y \le k\). \(\square \)
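The first claim of the lemma can be verified exhaustively for small parameters. The Python sketch below is illustrative only; it assumes \(q = 2\), and the test sequence over \({\mathbb {F}}_2\) is arbitrary.

```python
import itertools

q = 2
alpha = [1, 1, 0, 1, 0, 1, 1]  # n = 6; an arbitrary test sequence over F_2
l, m = 4, 4                    # l + m - 2 = n

def hankel(seq, rows, cols):
    return [[seq[i + j] for j in range(cols)] for i in range(rows)]

def kernel(H, cols):
    """Brute-force kernel of H over F_q, as a set of coefficient tuples."""
    return {v for v in itertools.product(range(q), repeat=cols)
            if all(sum(row[j] * v[j] for j in range(cols)) % q == 0 for row in H)}

K = kernel(hankel(alpha, l, m), m)               # ker H_{l,m}(alpha)
K2 = kernel(hankel(alpha, l - 1, m + 1), m + 1)  # ker H_{l-1,m+1}(alpha)

# v lies in ker H_{l,m} iff both paddings (v|0) and (0|v) lie in ker H_{l-1,m+1}
for v in itertools.product(range(q), repeat=m):
    assert (v in K) == ((v + (0,)) in K2 and ((0,) + v) in K2)
```

Replacing `alpha` by any other sequence leaves the assertions true, since the two padded vectors encode exactly the defining equations of \(H_{l,m} (\varvec{\alpha })\) split between its first and last \(l-1\) rows.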

A related lemma is the following.

Lemma 2.4.15

Let \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1}\), and define \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{n-1})\). Also, let \(l+m-2 = n\) with \(l \ge 2\). A vector with zero last entry

$$\begin{aligned} {\textbf{v}} = ({\textbf{u}} \mid 0) = (u_0 , \ldots , u_{m-2} , 0)^T \end{aligned}$$

is in the kernel of \(H_{l,m} (\varvec{\alpha } )\) if and only if the vectors

$$\begin{aligned} ({\textbf{u}} \mid 0) =&(u_0 , \ldots , u_{m-2} , 0)^T , \\ (0 \mid {\textbf{u}} ) =&(0 , u_0 , \ldots , u_{m-2})^T \end{aligned}$$

are in the kernel of \(H_{l-1,m} (\varvec{\alpha }' )\) (which is just the matrix \(H_{l,m} (\varvec{\alpha } )\) after removing the last row).

The proof of this lemma is similar to the proof of Lemma 2.4.14.

We now give the proofs of the four theorems, beginning with Theorem 2.4.4.

Proof of Theorem 2.4.4

If \(r=0\), then \(\varvec{\alpha } = 0\) and so \({{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = {\mathbb {F}}_q^m\) for all m. Therefore, we can take any \(A_1 \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A_1 = 0\).

Now suppose \(r=1\) and \(\rho _1 = 1\). Then, \(\alpha _0 \ne 0\), and so the matrix \(H_{n+1,1} (\varvec{\alpha })\) has full column rank and its kernel is trivial. The \((\rho , \pi )\)-form of \(H_{n,2} (\varvec{\alpha })\) is

$$\begin{aligned} \begin{pmatrix} \alpha _0 &{} \vline &{} \alpha _1 \\ \hline 0 &{} &{} 0 \\ 0 &{} &{} 0 \\ \vdots &{} &{} \vdots \\ 0 &{} &{} 0 \end{pmatrix} . \end{aligned}$$

So, we can see that \({{\,\textrm{ker}\,}}H_{n,2} (\varvec{\alpha }) = \{ \gamma A_1: \gamma \in {\mathbb {F}}_q \}\) for some \(A_1 \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A_1 = \rho _1 = 1\). Lemma 2.4.14 tells us that

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - 2 \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) \end{aligned}$$

for \(2 \le m \le n+1\). The dimension of the left side is \(m-1\) (recall a polynomial of degree \(m-2\) has \(m-1\) coefficients); while Corollary 2.4.1 tells us that the right side has dimension \(m-1\) as well. Therefore, we must have equality, as required.

If, instead, we have \(r=1\) and \(\rho _1 = 0\), then \(\varvec{\alpha } = (0, \ldots , 0, \alpha _n)\) with \(\alpha _n \ne 0\), and so the matrix \(H_{n+1,1} (\varvec{\alpha })\) has full column rank and its kernel is trivial. We have that

$$\begin{aligned} H_{n,2} (\varvec{\alpha }) = \begin{pmatrix} 0 &{}\quad 0 \\ 0 &{}\quad 0 \\ \vdots &{}\quad \vdots \\ 0 &{}\quad 0 \\ 0 &{}\quad \alpha _n \end{pmatrix} , \end{aligned}$$

and so by similar means as above we have

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - 2 \end{array} \Big \} \end{aligned}$$

for \(2 \le m \le n+1\), where \({{\,\textrm{deg}\,}}A_1 = 0 = \rho _1\), as required.

We now consider the case \(r \ge 2\). We will work up to \(m = c_2 +1\) first, before considering \(m > c_2 +1\).

We will first address the special subcase when \(c_1 = c_2\). This occurs when n is even and \(r = \frac{n+2}{2} = n_1\) (which also implies that \(\rho _1 = r\)). When \(1 \le m \le c_1\), we have \({{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \{ 0 \}\), by Corollary 2.4.1. In this subcase, there are no m satisfying \(c_1 + 1 \le m \le c_2\). Suppose now that \(m = c_2 +1 = n_1 +1\) and \(l=c_2 -1=n_1 -1\). Corollary 2.4.1 tells us that \(\dim {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = 2\). Thus, there are polynomials \(A_1, A_2\) (neither being a multiple of the other) such that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

All that remains to be shown is that at least one of \(A_1, A_2\) has degree equal to \(\rho _1 = r\) (without loss of generality, this will be \(A_1\)). To show this, suppose for a contradiction that \({{\,\textrm{deg}\,}}A_1, {{\,\textrm{deg}\,}}A_2 < \rho _1 = r\). Then, the vectors associated to these polynomials are of the form

$$\begin{aligned} {\textbf{v}} =&(v_0 , v_1 , \ldots , v_{r-1} , 0), \\ {\textbf{w}} =&(w_0 , w_1 , \ldots , w_{r-1} , 0) ; \end{aligned}$$

and so the vectors

$$\begin{aligned} {\textbf{v}}' =&(v_0 , v_1 , \ldots , v_{r-1}) , \\ {\textbf{w}}' =&(w_0 , w_1 , \ldots , w_{r-1}) \end{aligned}$$

are in the kernel of \(H_{l,m-1} (\varvec{\alpha }')\), where \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{n-1})\). In particular, \( \dim {{\,\textrm{ker}\,}}H_{l,m-1} (\varvec{\alpha }') \ge 2\). From this, and the fact that \(H_{l,m-1} (\varvec{\alpha }')\) is the matrix we obtain by removing the last row from \(H_{l+1, m-1} (\varvec{\alpha }) = H_{n_1, n_1} (\varvec{\alpha })\) (restoring a single row can decrease the dimension of the kernel by at most 1), we deduce that the kernel of \(H_{n_1, n_1} (\varvec{\alpha })\) is at least one-dimensional. This contradicts the triviality of the kernel established above.

Suppose now that \(c_1 \ne c_2\). When \(1 \le m \le c_1 = r\), Corollary 2.4.1 tells us that \({{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \{ {\textbf{0}} \}\).

Now suppose that \(m= c_1 +1 = r+1\). Corollary 2.4.1 tells us that the kernel of \(H: = H_{l,m} (\varvec{\alpha })\) has dimension 1, and so we let \({\textbf{v}} \ne {\textbf{0}}\) be a vector that spans the kernel. Consider the \((\rho , \pi )\)-form of H, in which 1 represents an element in \({\mathbb {F}}_q^*\) and \(*\) represents an element in \({\mathbb {F}}_q\): the top-left submatrix \(H [\rho _1, \rho _1]\) is invertible, and there are \(\pi _1\) number of 1s in the bottom-right submatrix. Of course, if \(\rho _1 = 0\), then the top two submatrices and the bottom-left submatrix disappear; and if \(\pi _1 = 0\), then the bottom-right submatrix is a zero matrix. Regardless, the \((\rho , \pi )\)-form shows us that \({\textbf{v}}\) must have zeros in its last \(\pi _1\) entries. That is,

$$\begin{aligned} {\textbf{v}} = \begin{pmatrix} v_0 \\ \vdots \\ v_{\rho _1 -1} \\ v_{\rho _1 } \\ 0 \\ \vdots \\ 0 \end{pmatrix} . \end{aligned}$$

We must also have that \(v_{\rho _1 } \ne 0\). When \(\rho _1 = 0\), this is clear. When \(\rho _1 > 0\), because \({\textbf{v}}\) is in the kernel of H, we can see that \((v_0, \ldots , v_{\rho _1 -1}, v_{\rho _1 })^T\) is in the kernel of

$$\begin{aligned} \begin{pmatrix} H [\rho _1 , \rho _1 ] &{} \vline &{} \begin{matrix} \alpha _{\rho _1} \\ \vdots \\ \alpha _{2 \rho _1 -2} \\ \alpha _{2 \rho _1 -1} \end{matrix} \end{pmatrix} ; \end{aligned}$$

and so, if \(v_{\rho _1 } = 0\), then \((v_0, \ldots , v_{\rho _1 -1})^T \ne {\textbf{0}}\) is in the kernel of \(H [\rho _1, \rho _1 ]\), contradicting that it is invertible.

In terms of polynomials, we have shown, for \(m= c_1 +1 = r+1\), that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \end{array} \Big \} . \end{aligned}$$

As previously, Lemma 2.4.14 and Corollary 2.4.1 tell us, for \(c_1 +1 \le m \le c_2\), that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 -1 \end{array} \Big \} . \end{aligned}$$

Now suppose that \(m = c_2 +1\). Lemma 2.4.14 tells us that

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 -1 \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) . \end{aligned}$$

However, the left side has dimension \(m - c_1 = n+3 - 2r\), while, by Corollary 2.4.1, the right side has dimension \(2m-n-2 = n+4 - 2r\). Thus, there is some polynomial \(A_2 \in {\mathcal {A}}\) in the kernel of \(H_{l,m} (\varvec{\alpha })\) with

$$\begin{aligned} A_2 \not \in \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 -1 \end{array} \Big \} . \end{aligned}$$
(33)

Thus, we have

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

The fact that \({{\,\textrm{deg}\,}}A_2 \le c_2\) (that is, \(A_2\) has at most its first \(c_2 +1\) coefficients being non-zero) follows from the fact that the kernel is in \((c_2 +1)\)-dimensional space.

We will now show that if \(\rho _1\) is not equal to \(r=c_1\), then \({{\,\textrm{deg}\,}}A_2\) is necessarily equal to \(c_2\). Let \(m = c_2 +1\), and let \({\textbf{v}} = (a_0, a_1, \ldots , a_{c_2})^T\) be the vector associated with \(A_2\). Note that the condition \({{\,\textrm{deg}\,}}A_2 = c_2\) is equivalent to \(a_{c_2} \ne 0\).

Suppose for a contradiction that \(\rho _1 \ne r\) and \({{\,\textrm{deg}\,}}A_2 \ne c_2\). This means that \(\pi _1 \ge 1\) and \({\textbf{v}} = (a_0, a_1, \ldots , a_{c_2 -1}, 0)^T\). The latter implies that \({\textbf{v}}':= (a_0, a_1, \ldots , a_{c_2 -1})^T\) is in the kernel of \(H_{l, m-1} (\varvec{\alpha }')\), where \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{n-1})\). In terms of polynomials, this means \(A_2 \in {{\,\textrm{ker}\,}}H_{l, m-1} (\varvec{\alpha }')\).

Note that \(H_{l, m-1} (\varvec{\alpha }')\) is the matrix we obtain by removing the last row from \(H_{l+1, m-1} (\varvec{\alpha })\). We have already established that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l+1 , m-1} (\varvec{\alpha }) = {{\,\textrm{ker}\,}}H_{c_1 , c_2} (\varvec{\alpha }) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 - 1 \end{array} \Big \} . \end{aligned}$$

Hence, an application of Lemma 2.4.15 gives us that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l , m-1} (\varvec{\alpha }') = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \end{array} \Big \} . \end{aligned}$$

(Note that, because \(\pi _1 \ge 1\), every vector in the kernel of \(H_{l+1, m-1} (\varvec{\alpha })\) has a zero in its last entry, which is a requirement for our application of Lemma 2.4.15). However, since we have established \(A_2 \in {{\,\textrm{ker}\,}}H_{l, m-1} (\varvec{\alpha }')\), this implies that

$$\begin{aligned} A_2 \in \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \end{array} \Big \} , \end{aligned}$$

which contradicts (33).

Finally, it remains to consider when \(m > c_2 + 1\). By Lemma 2.4.14, we can see that

$$\begin{aligned} \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1 - 1 \\ {{\,\textrm{deg}\,}}B_2 \le m - c_2 - 1 \end{array} \bigg \} \subseteq {{\,\textrm{ker}\,}}H_{l,m} (\varvec{\alpha }) . \end{aligned}$$
(34)

We will show that

$$\begin{aligned} B_1 A_1 + B_2 A_2 \ne 0 \end{aligned}$$
(35)

for all \(B_1 , B_2 \in {\mathcal {A}}\), not both zero, with \({{\,\textrm{deg}\,}}B_1 \le n+1 - c_1\) and \({{\,\textrm{deg}\,}}B_2 \le n+1 - c_2\). Thus, the left side of (34) has dimension \(2m - c_1 - c_2\), which is equal to the dimension of the right side by Corollary 2.4.1, and thus we have equality.

Note that (35) also proves that \(A_1, A_2\) are coprime. Indeed, if they were not then we could write

$$\begin{aligned} A_1 =&C A_1' , \\ A_2 =&C A_2' , \end{aligned}$$

where

$$\begin{aligned} {{\,\textrm{deg}\,}}C \ge&1 , \\ {{\,\textrm{deg}\,}}A_1' \le&{{\,\textrm{deg}\,}}A_1 -1 = \rho _1 -1 \le c_1 -1 = n+1 -c_2, \\ {{\,\textrm{deg}\,}}A_2' \le&{{\,\textrm{deg}\,}}A_2 -1 = c_2 -1 = n+1 - c_1 . \end{aligned}$$

In particular, taking \(B_1 = A_2'\) and \(B_2 = -A_1'\) would contradict (35).
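The coprimality argument above can be illustrated concretely: a common factor \(C\) immediately produces the forbidden relation \(B_1 A_1 + B_2 A_2 = 0\) with \(B_1 = A_2'\), \(B_2 = -A_1'\). Below is a minimal Python sketch (not part of the proof) with polynomials as coefficient lists over \({\mathbb F}_q\), lowest degree first; \(q = 7\), \(C\), \(A_1'\), and \(A_2'\) are all hypothetical illustrative choices.

```python
# Sketch of the coprimality argument: if A_1 = C*A_1' and A_2 = C*A_2', then
# A_2'*A_1 - A_1'*A_2 = 0, i.e. the relation (35) that was shown impossible.
# Polynomials over F_q are coefficient lists, low degree first.
q = 7

def pmul(a, b):
    """Product of two polynomials over F_q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def padd(a, b):
    """Sum of two polynomials over F_q."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a)); b = b + [0] * (n - len(b))
    return [(x + y) % q for x, y in zip(a, b)]

def pneg(a):
    """Negation of a polynomial over F_q."""
    return [(-x) % q for x in a]

C = [1, 1]        # hypothetical common factor T + 1
A1p = [2, 0, 1]   # hypothetical cofactor A_1'
A2p = [3, 1]      # hypothetical cofactor A_2'
A1 = pmul(C, A1p)
A2 = pmul(C, A2p)

# B_1 = A_2', B_2 = -A_1' gives B_1*A_1 + B_2*A_2 = 0:
relation = padd(pmul(A2p, A1), pmul(pneg(A1p), A2))
assert all(c == 0 for c in relation)
```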

To prove (35), suppose for a contradiction that

$$\begin{aligned} B_1 A_1 + B_2 A_2 = 0 \end{aligned}$$

with

$$\begin{aligned} {{\,\textrm{deg}\,}}B_1 \le&m' - c_1 , \\ {{\,\textrm{deg}\,}}B_2 \le&m' - c_2 , \end{aligned}$$

where \(c_2 \le m' \le n+1\). Suppose further that \(m'\) is minimal with this property; in particular, there is equality in at least one of the inequalities above. Let us write

$$\begin{aligned} B_1 =&x_0 + x_1 T + \ldots + x_{m' - c_1} T^{m' - c_1} , \\ B_2 =&y_0 + y_1 T + \ldots + y_{m' - c_2} T^{m' - c_2} ; \end{aligned}$$

and we note that at least one of \(x_{m' - c_1}, y_{m' - c_2}\) is non-zero due to there being at least one equality in the inequalities above. Since \(B_1 A_1 + B_2 A_2 = 0\), we have that

$$\begin{aligned}&( x_{m' - c_1} T^{m' - c_1} A_1 + y_{m' - c_2} T^{m' - c_2} A_2 ) \\&\quad = - \Big ( ( x_0 + x_1 T + \ldots + x_{m' - c_1 -1} T^{m' - c_1 -1}) A_1 + (y_0 + y_1 T + \ldots + y_{m' - c_2 -1} T^{m' - c_2 -1}) A_2 \Big ) . \end{aligned}$$

(Note that this is non-zero because at least one of \(x_{m' - c_1}, y_{m' - c_2}\) is non-zero, and because \(A_2\) is not a multiple of \(A_1\).) By (34), the right side tells us that

$$\begin{aligned} T^{m' - c_2} Z \in {{\,\textrm{ker}\,}}H_{l' , m'} (\varvec{\alpha }) , \end{aligned}$$

where \(l'\) is such that \(l' + m' = n+2\) and

$$\begin{aligned} Z := ( x_{m' - c_1} T^{c_2 - c_1} A_1 + y_{m' - c_2} A_2 ) . \end{aligned}$$

Again by (34), we also have that

$$\begin{aligned} T^{s} Z = ( x_{m' - c_1} T^{s + c_2 - c_1} A_1 + y_{m' - c_2} T^{s} A_2 ) \in {{\,\textrm{ker}\,}}H_{l' , m'} (\varvec{\alpha }) \end{aligned}$$

for all \(0 \le s \le m' - c_2 -1\). But then, using the second result in Lemma 2.4.14 for the first relation below, we have that

$$\begin{aligned} Z \in {{\,\textrm{ker}\,}}H_{l' + (m' - c_2) , m' - (m' - c_2)} (\varvec{\alpha }) = {{\,\textrm{ker}\,}}H_{c_1 , c_2} (\varvec{\alpha }) = \Big \{ X A_1 : \begin{array}{c} X \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}X \le c_2 - c_1 - 1 \end{array} \Big \} . \end{aligned}$$

Recalling the definition of Z, we can see that this implies

$$\begin{aligned} A_2 \in \Big \{ X A_1 : \begin{array}{c} X \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}X \le c_2 - c_1 \end{array} \Big \} , \end{aligned}$$

which contradicts (33). \(\square \)

We now prove Theorem 2.4.10, which will be required for the proof of Theorem 2.4.9 afterward.

Proof of Theorem 2.4.10

For what follows, we define \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{2 d_1})\).

Now, consider the matrix \(H_{d_1 -1, n+3 - d_1 } (\varvec{\alpha })\). It has kernel

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{d_1 -1 , n+3 - d_1 } (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+3 - 2 d_1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

Recall that \({{\,\textrm{deg}\,}}A_1 = d_1\) and \({{\,\textrm{deg}\,}}A_2 = d_2 < d_1\). In particular, the vectors associated to \(A_1, A_2\) in \({{\,\textrm{ker}\,}}H_{d_1 -1, n+3 - d_1 } (\varvec{\alpha })\) have at least \(n+2 - 2d_1\) zero entries at the end. Therefore, consider the matrix that we obtain by removing the last \(n+2 - 2d_1\) columns of \(H_{d_1 -1, n+3 - d_1 } (\varvec{\alpha })\), which is \(H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\). From the above, we can see that

$$\begin{aligned} \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} \subseteq {{\,\textrm{ker}\,}}H_{d_1 -1 , d_1 +1} (\varvec{\alpha }') . \end{aligned}$$

Given that \(H_{d_1, d_1 } (\varvec{\alpha }') = H_{r, r } (\varvec{\alpha }')\) is invertible, we can see that the dimension of \( {{\,\textrm{ker}\,}}H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\) is 2 and so we must have equality above:

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{d_1 -1 , d_1 +1} (\varvec{\alpha }') = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$
(36)

Now, we have \({{\,\textrm{deg}\,}}A_2 = d_2 < d_1\), and so the vector associated to \(A_2\) in \({{\,\textrm{ker}\,}}H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\) has \(d_1 - d_2\) zero entries at the end. What we do next depends on the size of \(d_2\).

Case 1: If \(d_2 \ge 2\), then \(d_1 - d_2 < d_1 -1\). Thus, we can remove the last \(d_1 - d_2\) rows from \(H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\). This gives the matrix \(H_{d_2 -1, d_1 +1} (\varvec{\alpha }^{(2)} )\), and we can see that it has kernel

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{d_2 -1 , d_1 +1} (\varvec{\alpha }^{(2)} ) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \\ {{\,\textrm{deg}\,}}B_2 \le d_1 - d_2 \end{array} \bigg \} . \end{aligned}$$

The fact that the left side is a subset of the right side follows from (36) and successive applications of Lemma 2.4.15. Equality then follows by noting that removing a row will increase the dimension by (at most) 1. Also, we can see that \(A_3\) is in the kernel of \(H_{d_2 -1, d_1 +1} (\varvec{\alpha }^{(2)} )\) and we can replace \(A_1\) with \(A_3\):

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{d_2 -1 , d_1 +1} (\varvec{\alpha }^{(2)} ) = \bigg \{ B_2 A_2 + B_3 A_3: \begin{array}{c} B_2 , B_3 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_2 \le d_1 - d_2 \\ {{\,\textrm{deg}\,}}B_3 \le 0 \end{array} \bigg \} . \end{aligned}$$

We can now deduce that the characteristic polynomials of \(\varvec{\alpha }^{(2)}\) are \(A_2, A_3\), and that \(\varvec{\alpha }^{(2)} \in {\mathscr {L}}_{d_1 + d_2 -2} (d_2, d_2, 0)\).
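The dimension count used in Case 1 (removing a row increases the kernel dimension by at most 1) is just rank-nullity: the column count is fixed and the rank drops by at most 1. A numerical Python illustration, not part of the proof, using Gaussian elimination over \({\mathbb F}_q\) with an illustrative Hankel matrix and \(q = 5\):

```python
# Illustrative check: deleting the last row of a matrix over F_q raises
# the kernel dimension by 0 or 1 (rank-nullity with a fixed column count).
q = 5

def rank(M):
    """Rank of a matrix over F_q via Gaussian elimination (q prime)."""
    M = [[x % q for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], q - 2, q)          # inverse of pivot entry
        M[r] = [x * inv % q for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(x - f * y) % q for x, y in zip(M[i], M[r])]
        r += 1
    return r

alpha = [1, 2, 3, 1, 4, 0, 2]                         # illustrative sequence
H = [[alpha[i + j] for j in range(4)] for i in range(4)]  # 4 x 4 Hankel matrix
ncols = 4
dim_ker_full = ncols - rank(H)
dim_ker_trim = ncols - rank(H[:-1])                   # last row removed
assert 0 <= dim_ker_trim - dim_ker_full <= 1
```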

Case 2: If \(d_2 = 1\), then \(d_1 - d_2 = d_1 -1\). Thus, we can only remove the last \(d_1 - d_2 -1 = d_1 -2\) rows from \(H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\). This gives the matrix \(H_{d_2, d_1 +1} (\varvec{\alpha }^{(2)} ) = H_{1, d_1 +1} (\varvec{\alpha }^{(2)} )\) (recall that the definition of \(\varvec{\alpha }^{(2)}\) differs between the cases), and we can see by similar means as in Case 1 that it has kernel

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{1 , d_1 +1} (\varvec{\alpha }^{(2)} ) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le 0 \\ {{\,\textrm{deg}\,}}B_2 \le d_1 - d_2 -1 \end{array} \bigg \} . \end{aligned}$$

Note that \(B_2\) cannot have degree equal to \(d_1 - d_2\), and so \(A_3\) is not in the kernel above. Thus, we have that the characteristic polynomials are \(A_2, A_1\) (recall that the order matters). The fact that \(B_2\) cannot have degree equal to \(d_1 - d_2\) also tells us that \(\pi (\varvec{\alpha }^{(2)} ) = 1\). We deduce that \(\varvec{\alpha }^{(2)} \in {\mathscr {L}}_{d_1 } (2, 1, 1)\).

Case 3: This case is very similar to Case 2.

We now prove the uniqueness claim that is made in the theorem. We do so for Case 1, and the remaining cases are almost identical and only need to take into account the difference in definition of \(\varvec{\alpha }^{(2)}\). To this end, consider the sequences

$$\begin{aligned} \varvec{\alpha }^{(2)} =&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 + d_2 -1}) \\ \varvec{\alpha }' =&(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_1 + d_2 -1} , \alpha _{d_1 + d_2 } , \ldots , \alpha _{2 d_1} ) . \end{aligned}$$

Note that \(\alpha _{d_1 + d_2 }, \ldots , \alpha _{2 d_1}\) form the last \(d_1 - d_2 +1\) entries in the last column of the matrix \(H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\), and by assumption we must have that \(A_1\) is in the kernel of this matrix. Thus, for \(d_1 + d_2 \le i \le 2 d_1\), the i-th row of \(H_{d_1 -1, d_1 +1} (\varvec{\alpha }')\) is orthogonal to the vector associated with \(A_1\). Since this vector has non-zero final entry, we can express the last entry in the i-th row in terms of the previous entries. An inductive argument proves that the entries \(\alpha _{d_1 + d_2 }, \ldots , \alpha _{2 d_1}\) can be uniquely determined in terms of the entries \(\alpha _0, \alpha _1, \ldots , \alpha _{d_1 + d_2 -1}\).

Now consider the sequence

$$\begin{aligned} \varvec{\alpha } =&(\alpha _0 , \alpha _1 , \ldots , \alpha _{2 d_1} , \alpha _{2 d_1 +1} , \ldots , \alpha _{n} ) , \end{aligned}$$

and note that \(\alpha _{2 d_1 +1}, \ldots , \alpha _{n}\) form the last \(n - 2 d_1\) entries of the final row of \(H_{d_1 -1, n+3 - d_1 } (\varvec{\alpha })\). By the assumptions made in the uniqueness claim, we must have that the polynomials

$$\begin{aligned} A_1 , T A_1 , \ldots , T^{n - 2 d_1 -1} A_1 \end{aligned}$$

are in the kernel of \(H_{d_1 -1, n+3 - d_1 } (\varvec{\alpha })\) (and thus orthogonal to its final row). Hence, a similar inductive argument as above will show that the entries \(\alpha _{2 d_1 +1}, \ldots , \alpha _{n}\) can be uniquely determined in terms of the entries \(\alpha _0, \alpha _1, \ldots , \alpha _{2 d_1}\), thus concluding the proof of the uniqueness claim.

All that remains is to prove the final claim made in the theorem, which we do for the case \(r \ge 2\). We have that

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathscr {L}}_n (r , \rho _1 , \pi _1 ) . \end{aligned}$$

By considering the \((\rho , \pi )\)-form of the matrix \(H_{n_1, n_2} (\varvec{\alpha })\) and removing the last \(\pi _1\) columns, we can deduce that \(\varvec{\alpha }^{(1)}\) is in \({\mathscr {L}}_{n - \pi _1 } ( \rho _1, \rho _1, 0)\). Given that \(A_1\) is in the kernel of \(H_{n+1-r, r+1} (\varvec{\alpha })\), and that the associated vector has \(\pi _1\) zero entries at the end, we can see that \(A_1\) is in the kernel of the matrix obtained by removing the last \(\pi _1\) columns of \(H_{n+1-r, r+1} (\varvec{\alpha })\); that is, the matrix \(H_{n+1-r, \rho _1 +1} (\varvec{\alpha }^{(1)} )\). This tells us that \(A_1\) is the first characteristic polynomial. Similarly, since \(A_2\) is in the kernel of \(H_{r-1, n+3-r} (\varvec{\alpha })\), we can see that it is in the kernel of the matrix obtained by removing the last \(\pi _1\) rows of \(H_{r-1, n+3-r} (\varvec{\alpha })\); that is, the matrix \(H_{\rho _1 -1, n+3-r} (\varvec{\alpha }^{(1)} )\). This tells us that \(A_2\) is the second characteristic polynomial. \(\square \)
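The column-removal step used repeatedly in the proof above (kernel vectors ending in zeros survive deletion of the trailing columns) can be checked numerically. The Python sketch below is illustrative only: the matrix is a generic matrix over \({\mathbb F}_q\) rather than a Hankel matrix in \((\rho ,\pi )\)-form, and \(q = 5\) and all entries are arbitrary choices.

```python
# Illustrative check: if a kernel vector ends in k zeros, dropping the last
# k columns of the matrix (and the trailing zeros of the vector) preserves
# kernel membership, since those columns never contribute to H v.
q = 5

def matvec(H, v):
    """Matrix-vector product over F_q."""
    return [sum(r * x for r, x in zip(row, v)) % q for row in H]

H = [[1, 2, 0, 0],
     [2, 4, 0, 1]]        # hypothetical matrix over F_5
v = [3, 1, 0, 0]          # kernel vector with two trailing zeros
assert matvec(H, v) == [0, 0]

H_trim = [row[:-2] for row in H]        # drop the last two columns
assert matvec(H_trim, v[:-2]) == [0, 0]  # membership is preserved
```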

We now prove Theorem 2.4.9.

Proof of Theorem 2.4.9

The construction of the sequences \(\varvec{\alpha }\) in Claims 1 and 2 is not difficult, and so we proceed to Claims 3 and 4.

Claim 3 Let us write \(A_2'\) to be the unique polynomial satisfying

$$\begin{aligned} A_2 = R A_1 + A_2' \hspace{3em} {{\,\textrm{deg}\,}}A_2' < {{\,\textrm{deg}\,}}A_1 . \end{aligned}$$

That is, \(A_2'\) is the smallest representative of \(A_2\) modulo \(A_1\). Recalling our discussion on uniqueness in Definition 2.4.5, it is equivalent to prove that there exists a sequence \(\varvec{\alpha } \in {\mathscr {L}}_n (r,r,0)\) with characteristic polynomials \(A_1, A_2'\). Now, let us define \(d_1:= {{\,\textrm{deg}\,}}A_1\) and \(d_2:= {{\,\textrm{deg}\,}}A_2'\), and we apply the Euclidean algorithm to \(A_1, A_2'\):

$$\begin{aligned} A_1&= R_2 A_2' + A_3 ,&d_3&:= {{\,\textrm{deg}\,}}A_3< {{\,\textrm{deg}\,}}A_2' , \\ A_2'&= R_3 A_3 + A_4 ,&d_4&:= {{\,\textrm{deg}\,}}A_4< {{\,\textrm{deg}\,}}A_3 , \\ A_3&= R_4 A_4 + A_5 ,&d_5&:= {{\,\textrm{deg}\,}}A_5< {{\,\textrm{deg}\,}}A_4 , \\&\;\;\vdots&&\;\;\vdots \\ A_t&= R_{t+1} A_{t+1} + A_{t+2} ,&d_{t+2}&:= {{\,\textrm{deg}\,}}A_{t+2} < {{\,\textrm{deg}\,}}A_{t+1} , \end{aligned}$$

where t is such that \({{\,\textrm{deg}\,}}A_{t} \ge 2 > {{\,\textrm{deg}\,}}A_{t+1} \ge 0\). Also, let

$$\begin{aligned} \varvec{\alpha }^{(t+1)} := (\alpha _0 , \alpha _1 , \ldots , \alpha _{d_{t+1}} ) . \end{aligned}$$

Corollary 2.4.11 tells us that our desired \(\varvec{\alpha }\) exists if and only if we have

$$\begin{aligned} \varvec{\alpha }^{(t+1)} \in {\left\{ \begin{array}{ll} {\mathscr {L}}_{d_{t}} (2 , 1 , 1) &{}\text { if }d_{t+1} =1, \\ {\mathscr {L}}_{d_{t}} (2 , 0 , 2) &{}\text { if }d_{t+1} =0, \end{array}\right. } \end{aligned}$$

with characteristic polynomials \(A_{t+1}, A_{t}\).
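The remainder chain displayed above is the polynomial Euclidean algorithm over \({\mathbb F}_q\), with strictly decreasing degrees \(d_2> d_3 > \cdots \). A self-contained Python sketch, not part of the proof: polynomials are coefficient lists, lowest degree first, \(q = 5\) is illustrative, and the inputs standing in for \(A_1, A_2'\) are hypothetical.

```python
# Illustrative remainder chain A_1 = R_2*A_2' + A_3, A_2' = R_3*A_3 + A_4, ...
# over F_q, q prime. Coefficient lists, low degree first.
q = 5

def pdeg(a):
    """Degree over F_q; -1 for the zero polynomial, by convention."""
    for i in range(len(a) - 1, -1, -1):
        if a[i] % q:
            return i
    return -1

def pdivmod(a, b):
    """(quotient, remainder) of a / b over F_q; b must be non-zero."""
    a = [x % q for x in a]
    db = pdeg(b)
    lead_inv = pow(b[db], q - 2, q)   # inverse of leading coefficient
    quot = [0] * max(1, len(a) - db)
    while pdeg(a) >= db:
        shift = pdeg(a) - db
        c = (a[pdeg(a)] * lead_inv) % q
        quot[shift] = c
        for j in range(db + 1):
            a[shift + j] = (a[shift + j] - c * b[j]) % q
    return quot, a

def remainder_chain(f, g):
    """[f, g, A_3, A_4, ...] with strictly decreasing degrees."""
    chain = [f, g]
    while pdeg(chain[-1]) >= 0:
        _, r = pdivmod(chain[-2], chain[-1])
        if pdeg(r) < 0:
            break
        chain.append(r)
    return chain

A1 = [1, 0, 2, 0, 1]   # hypothetical A_1 of degree 4
A2 = [3, 1, 0, 1]      # hypothetical A_2' of degree 3 < deg A_1
chain = remainder_chain(A1, A2)
degs = [pdeg(p) for p in chain]
assert all(degs[i] > degs[i + 1] for i in range(1, len(degs) - 1))
```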

Suppose \(d_{t+1} = 1\). Then, it is equivalent to find \(\varvec{\alpha }^{(t+1)}\) such that

$$\begin{aligned} A_{t+1} , \; T A_{t+1} , \; \ldots , \; T^{d_{t} -2} A_{t+1} \in {{\,\textrm{ker}\,}}(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_{t}} ) \end{aligned}$$
(37)

and

$$\begin{aligned} A_{t} \in {{\,\textrm{ker}\,}}(\alpha _0 , \alpha _1 , \ldots , \alpha _{d_{t}} ) . \end{aligned}$$
(38)

To this end, we let \(\alpha _0\) take any value in \({\mathbb {F}}_q\). Since \({{\,\textrm{deg}\,}}A_{t+1} = 1\), we can see that (37) uniquely determines the values of \(\alpha _1, \ldots , \alpha _{d_{t} -1}\) in terms of \(\alpha _0\). Similarly, since \({{\,\textrm{deg}\,}}A_{t} = d_{t}\), we can see that (38) uniquely determines the value of \(\alpha _{d_{t}}\) in terms of \(\alpha _0\). Ultimately, we have shown that our desired \(\varvec{\alpha }\) exists. The fact that it is unique up to multiplication by elements in \({\mathbb {F}}_q^*\) follows from the following four facts:

  1. We could let \(\alpha _0\) take any value in \({\mathbb {F}}_q\).

  2. The entries \(\alpha _1, \ldots , \alpha _{d_{t}}\) can be expressed uniquely in terms of \(\alpha _0\). In fact, each such \(\alpha _i\) can be expressed as a linear function of \(\alpha _0\) that passes through the origin.

  3. The sequence \(\varvec{\alpha }\) is uniquely determined by the sequence \(\varvec{\alpha }^{(t+1)}\), which follows from the uniqueness claim at the end of Corollary 2.4.11. In fact, we can show that for all \(i \le n\) the entry \(\alpha _i\) can be expressed as a linear function of \(\alpha _0\) that passes through the origin.

  4. Finally, we note from the above that if \(\alpha _0 = 0\), then \(\varvec{\alpha } = {\textbf{0}}\). We dismiss this case as it does not give us \(\varvec{\alpha }^{(t+1)} \in {\mathscr {L}}_{d_{t}} (2, 1, 1)\).

Now suppose that we have \(d_{t+1} = 0\) instead. We can use a similar argument as above. The main difference is that \(\alpha _0 = \ldots = \alpha _{d_{t} -2} = 0\), while \(\alpha _{d_{t} -1}\) is free to take any value in \({\mathbb {F}}_q^*\), and for all \(d_{t} \le i \le n\) the entry \(\alpha _i\) can be expressed as a linear function of \(\alpha _{d_{t} -1}\) that passes through the origin.
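The linear determination in the case \(d_{t+1} = 1\) can be made concrete. Writing \(A_{t+1} = x_0 + x_1 T\), the condition \(T^s A_{t+1} \in {{\,\textrm{ker}\,}}(\alpha _0, \ldots , \alpha _{d_t})\) reads \(\alpha _s x_0 + \alpha _{s+1} x_1 = 0\), so each \(\alpha _{s+1}\) is a fixed multiple of \(\alpha _s\) and hence a linear function of \(\alpha _0\) through the origin, as in facts (2) and (4) above. In the Python sketch below (not part of the proof), \(x_0, x_1\), the value \(q = 7\), and the interpretation of kernel membership as orthogonality of each shifted coefficient vector to the sequence are all illustrative assumptions.

```python
# Illustrative determination of alpha_1, ..., alpha_{d_t - 1} from alpha_0
# when A_{t+1} = x_0 + x_1*T has degree 1, over F_q with q = 7.
q = 7
x0, x1 = 3, 2                       # hypothetical A_{t+1} = 3 + 2T
ratio = (-x0 * pow(x1, q - 2, q)) % q   # alpha_{s+1} = ratio * alpha_s

def determined_alphas(alpha0, length):
    """Sequence forced by the orthogonality conditions, given alpha_0."""
    alphas = [alpha0 % q]
    for _ in range(length - 1):
        alphas.append((alphas[-1] * ratio) % q)
    return alphas

alphas = determined_alphas(4, 6)    # alpha_0 = 4 gives alpha_0, ..., alpha_5
# every shifted copy of A_{t+1} is orthogonal to the sequence:
for s in range(len(alphas) - 1):
    assert (alphas[s] * x0 + alphas[s + 1] * x1) % q == 0
# and alpha_0 = 0 forces the whole sequence to vanish:
assert determined_alphas(0, 6) == [0] * 6
```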

Claim 4 We know by Claim 2 that there is a sequence \(\varvec{\alpha }^{(1)} = (\alpha _0, \alpha _1, \ldots , \alpha _{n - \pi _1} ) \in {\mathscr {L}}_{n-\pi _1 } (\rho _1, \rho _1, 0)\) with characteristic polynomials \(A_1, A_2\). Note that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{\rho _1 -1 , n+3 - r} (\varvec{\alpha }^{(1)}) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+2 - r - \rho _1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$
(39)

We now define the extension \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _n )\) of \(\varvec{\alpha }^{(1)}\) by the following property:

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{r -1 , n+3 - r} (\varvec{\alpha }) \subseteq \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+2 - 2r \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$
(40)

Note that removing the last \(\pi _1\) rows of \(H_{r -1, n+3 - r} (\varvec{\alpha })\) will leave us with the matrix \( H_{\rho _1 -1, n+3 - r} (\varvec{\alpha }^{(1)})\) from (39). Hence, by successive applications of Lemma 2.4.15, regardless of the way we extended \(\varvec{\alpha }^{(1)}\), we certainly have that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{r -1 , n+3 - r} (\varvec{\alpha }) \supseteq \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+2 - 2r \end{array} \Big \} . \end{aligned}$$

The requirement that \(A_2\) is in the kernel above will actually uniquely determine the entries \(\alpha _{n - \pi _1 +1}, \ldots , \alpha _{n}\) (which form the extended part of \(\varvec{\alpha }\)) in terms of \(\alpha _0, \ldots , \alpha _{n - \pi _1 }\). This follows from the fact that \(\alpha _{n - \pi _1 +1}, \ldots , \alpha _{n}\) appear as the final entries in the last \(\pi _1\) rows of \(H_{r -1, n+3 - r} (\varvec{\alpha })\), and the fact that \(A_2\) has degree \(n+2-r\).

We will now show that \(\varvec{\alpha }\) is in \({\mathscr {L}}_n (r, \rho _1, \pi _1 )\). This is a condition of Claim 4, but it also allows us to determine the dimension of the left side of (40) by using Corollary 2.4.1. By comparing this to the dimension of the right side of (40), we see that we must have equality, thus allowing us to deduce that \(A_1, A_2\) are the characteristic polynomials of \(\varvec{\alpha }\). All that will be left to prove is that \(\varvec{\alpha }\) is unique up to multiplication by elements in \({\mathbb {F}}_q^*\).

Consider the sequence \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{n - \pi _1 +1} )\) and the associated matrix \(H':= H_{\rho _1, n+3 - r} (\varvec{\alpha }')\). Note that if we remove the last row from this matrix then we are left with the matrix \(H:= H_{\rho _1 -1, n+3 - r} (\varvec{\alpha }^{(1)})\) from (39). Hence, by Lemma 2.4.15 we have that

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+1 - r - \rho _1 \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H' . \end{aligned}$$

By our construction of \(\varvec{\alpha }\), we also have that \(A_2\) is in this kernel, and thus

$$\begin{aligned} \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le n+1 - r - \rho _1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} \subseteq {{\,\textrm{ker}\,}}H' . \end{aligned}$$

In fact, we must have equality above. This follows from the fact that the dimension of \({{\,\textrm{ker}\,}}H'\) is one less than that of \({{\,\textrm{ker}\,}}H\). Indeed, the number of columns remains the same, but the rank of \(H'\) is one more than the rank of \(H\), which follows from the fact that \(H'\) has full row rank since \(\rho (\varvec{\alpha }') \ge \rho (\varvec{\alpha }^{(1)}) = \rho _1\). Now, because \(T^{n+2 - r - \rho _1} A_1\) is not in the kernel of \(H'\), we can see that \(\pi (H') = 1\), and thus \(\pi (\varvec{\alpha }') = 1\). By considering \((\rho , \pi )\)-forms, we can see that this forces \(\pi (\varvec{\alpha }) = \pi _1\) (indeed, extending the sequence by one entry will increase the \(\pi \)-characteristic by one), and thus \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\).

Finally, the uniqueness claim follows from the uniqueness claim in Claim 2 and the way we formed our extension \(\varvec{\alpha }\) of \(\varvec{\alpha }^{(1)}\). Technically, we should also prove a converse result: that any sequence \(\varvec{\alpha }\) with the properties in Claim 4 has a truncation \(\varvec{\alpha }^{(1)} \in {\mathscr {L}}_{n-\pi _1 } (\rho _1, \rho _1, 0)\) with characteristic polynomials \(A_1, A_2\). This ensures that our construction above actually addresses all possibilities for \(\varvec{\alpha }\). This is not difficult to prove, and we have done something similar in the proof of Theorem 2.4.10. \(\square \)

Finally, we prove Theorem 2.4.13.

Proof of Theorem 2.4.13

Claim 1 The cases \(r \le 1\) are considerably easier than the cases \(r \ge 2\); we consider only the latter.

Let \(H:=H_{n_1, n_2} (\varvec{\alpha })\). The \((\rho , \pi )\)-form of H is

By recalling the definition of \((\rho , \pi )\)-form, we can see that the \((\rho , \pi )\)-form of \(H':=H_{n_1', n_2'} (\varvec{\alpha }')\) is

where \(\gamma \in {\mathbb {F}}_q\), and there is a one-to-one correspondence between \(\gamma \) and \(\alpha _{n+1}\). If \(\gamma = 0\), then we have \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r,r,0)\), while if \(\gamma \ne 0\), then we have \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r+1,r,1)\). The claims on the characteristic degrees follow by definition.

Let us now consider the characteristic polynomials. Consider the case \(\gamma = 0\) first. By definition, we have that \(A_1\) spans the kernel of \(H_{c_2 -1, c_1 +1} (\varvec{\alpha })\). By comparing the \((\rho , \pi )\)-form of \(H_{c_2 -1, c_1 +1} (\varvec{\alpha })\) to the \((\rho , \pi )\)-form of \(H_{c_2' -1, c_1' +1} (\varvec{\alpha }') = H_{c_2, c_1 +1} (\varvec{\alpha }')\) (the latter having an extra row of zeros at the bottom, compared to the former), we see that \(A_1\) spans the kernel of \(H_{c_2' -1, c_1' +1} (\varvec{\alpha }')\). Thus, \(A_1' = A_1\).

Regarding \(A_2'\), by definition it is the polynomial in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\) that is not a multiple of \(A_1'\). Now, let \({\textbf{a}}_2\) be the vector in the kernel of \(H_{c_1 -1, c_2 +1} (\varvec{\alpha })\) that is associated to \(A_2\). We can see that \(({\textbf{a}}_2 \mid 0)\) is in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }') = H_{c_1 -1, c_2 +2} (\varvec{\alpha }')\) (the latter matrix has an extra column compared to the former). In terms of polynomials, this means \(A_2\) is in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\); and since it is not a multiple of \(A_1' = A_1\), we have that \(A_2' = A_2\).

Now consider the case where \(\gamma \ne 0\). Let us write \({\textbf{a}}_1\) for the vector associated to \(A_1\) in the kernel of \(H_{c_2 -1, c_1 +1} (\varvec{\alpha })\). Similar to above, we compare the \((\rho , \pi )\)-form of \(H_{c_2 -1, c_1 +1} (\varvec{\alpha })\) to the \((\rho , \pi )\)-form of \(H_{c_2' -1, c_1' +1} (\varvec{\alpha }') = H_{c_2 -1, c_1 +2} (\varvec{\alpha }')\) (the latter having an additional column compared to the former, with a non-zero entry in the bottom-right), and we see that \(({\textbf{a}}_1 \mid 0)\) spans the kernel of \(H_{c_2' -1, c_1' +1} (\varvec{\alpha }')\). Thus, \(A_1' = A_1\).

For \(A_2'\), as above, it is the polynomial in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }') = H_{c_1, c_2 +1} (\varvec{\alpha }')\) that is not a multiple of \(A_1'\). Since \(A_1' = A_1\) is the first characteristic polynomial, we have

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{c_1' -1 , c_2' +1} (\varvec{\alpha }') \supseteq \Big \{ B_1 A_1' : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2' - c_1' \end{array} \Big \} = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 -1 \end{array} \Big \} . \end{aligned}$$

We also have

$$\begin{aligned} T^{c_2 - c_1} A_1 \not \in {{\,\textrm{ker}\,}}H_{c_1' -1 , c_2' +1} (\varvec{\alpha }'). \end{aligned}$$
(41)

Otherwise, by Lemma 2.4.14, we would have that \(A_1\) is in the kernel of \(H_{c_2' -2, c_1'} (\varvec{\alpha }')\), which is a contradiction. Note also that \(H_{c_1 -1, c_2 +1} (\varvec{\alpha })\) is the matrix we obtain after removing the last row from \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }') = H_{c_1, c_2 +1} (\varvec{\alpha }')\). In particular, the kernel of the latter is a subspace of the kernel of the former. That is,

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{c_1' -1 , c_2' +1} (\varvec{\alpha }') \subseteq {{\,\textrm{ker}\,}}H_{c_1 -1 , c_2 +1} (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

Thus, we have established that

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 -1 \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H_{c_1' -1 , c_2' +1} (\varvec{\alpha }') \subseteq \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

Corollary 2.4.1 tells us that the dimensions of adjacent vector spaces above differ by 1. Now, \(A_2'\) is the vector in \({{\,\textrm{ker}\,}}H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\) that is not a multiple of \(A_1' = A_1\), and recall that by Theorem 2.4.4 the degree of \(A_2'\) must be equal to \(c_2'\). Thus, we can deduce

$$\begin{aligned} A_2' = \beta A_2 + T^{c_2 - c_1} A_1 \end{aligned}$$

for some \(\beta \in {\mathbb {F}}_q^*\) (\(\beta \) cannot be 0 by (41)). It is not difficult to see that if we change the value of \(\alpha _{n+1}\) (which appears in the last row of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\)), then the value of \(\beta \) must change to ensure that the vector associated to \(A_2'\) remains orthogonal to the last row of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\), and thus in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\). (Note that this makes use of the fact that \(A_2\) is not in the kernel of \(H_{c_1' -1, c_2' +1} (\varvec{\alpha }')\); indeed, otherwise, \(A_2\) would be orthogonal to the last row and altering the value of \(\beta \) would have no effect.) Hence, there is a one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \).
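The correspondence between \(\alpha _{n+1}\) and \(\beta \) amounts to one linear equation: orthogonality of \(\beta \, {\textbf{a}}_2 + {\textbf{t}}\) to the last row \({\textbf{r}}\) gives \(\beta \, ({\textbf{r}} \cdot {\textbf{a}}_2) = -({\textbf{r}} \cdot {\textbf{t}})\), uniquely solvable since \({\textbf{r}} \cdot {\textbf{a}}_2 \ne 0\) (as \(A_2\) is not in the kernel). The Python sketch below is purely illustrative: the vectors standing in for \(A_2\), \(T^{c_2 - c_1} A_1\), and the last row, and the value \(q = 7\), are hypothetical.

```python
# Illustrative solve for beta: beta*(r.a2) = -(r.t) over F_q, q = 7 prime.
q = 7

def dot(u, v):
    """Inner product over F_q."""
    return sum(x * y for x, y in zip(u, v)) % q

a2 = [1, 3, 0, 2]   # hypothetical vector of A_2
t = [0, 2, 1, 5]    # hypothetical vector of T^{c_2 - c_1} A_1
r = [4, 1, 2, 3]    # hypothetical last row, carrying alpha_{n+1}

assert dot(r, a2) != 0                               # A_2 not in the kernel
beta = (-dot(r, t)) * pow(dot(r, a2), q - 2, q) % q  # the unique solution
combo = [(beta * x + y) % q for x, y in zip(a2, t)]  # vector of A_2'
assert dot(r, combo) == 0                            # orthogonal to the last row
```

Changing the entry \(\alpha _{n+1}\) inside \({\textbf{r}}\) changes \({\textbf{r}} \cdot {\textbf{t}}\) and \({\textbf{r}} \cdot {\textbf{a}}_2\), and hence forces a different \(\beta \), which is the one-to-one correspondence described above.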

Claim 2 Let \(H:= H_{n_1, n_2} (\varvec{\alpha })\). The \((\rho , \pi )\)-form of H is of the form

(42)

where there are \(\pi _1\) entries equal to 1 in the bottom-right submatrix (recall, 1 represents an element in \({\mathbb {F}}_q^*\), 0 represents 0 as usual, and \(*\) represents an element in \({\mathbb {F}}_q\)). Now consider the matrix \(H' = H_{n_1', n_2'} (\varvec{\alpha }')\), and note that \(H'\) is the same matrix as \(H\) but with either an additional row or an additional column at the end (depending on whether n is even or odd). Thus, we can deduce that the \((\rho , \pi )\)-form of \(H'\) is the same as (42) but with an additional row or column. This additional row or column contributes an additional 1 in the bottom-right submatrix, and thus we have \(\pi (\varvec{\alpha }') = \pi _1 +1\). Clearly \(\rho (\varvec{\alpha }') = \rho _1\), and thus \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (r+1, \rho _1, \pi _1 +1 )\), as required. The claims on the characteristic degrees follow by definition.

Now, the proof that \(A_1' = A_1\) is similar to that in the second case of Claim 1. The proof for the second characteristic polynomial is also similar, but slightly different, and so we give an outline of it.

By definition of the first characteristic polynomial, we have that \(A_1' = A_1\) spans the kernel of \(H_{c_2' -1, c_1' +1} (\varvec{\alpha }')\). Thus, using Lemma 2.4.14 for the second relation below, we have that

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 -1 \end{array} \Big \} = \Big \{ B_1 A_1' : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2' - c_1' \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H_{c_1'-1 , c_2'+1} (\varvec{\alpha }') , \end{aligned}$$

but

$$\begin{aligned} T^{c_2 - c_1} A_1 \not \in {{\,\textrm{ker}\,}}H_{c_1'-1 , c_2'+1} (\varvec{\alpha }') . \end{aligned}$$
(43)

We also have that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{c_1'-1 , c_2'+1} (\varvec{\alpha }') \subseteq {{\,\textrm{ker}\,}}H_{c_1'-2 , c_2'+1} (\varvec{\alpha })&= {{\,\textrm{ker}\,}}H_{c_1-1 , c_2+1} (\varvec{\alpha }) \\&= \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} . \end{aligned}$$

Thus, we have

$$\begin{aligned} \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 -1 \end{array} \Big \} \subseteq {{\,\textrm{ker}\,}}H_{c_1'-1 , c_2'+1} (\varvec{\alpha }') \subseteq \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le c_2 - c_1 \\ {{\,\textrm{deg}\,}}B_2 \le 0 \end{array} \bigg \} , \end{aligned}$$

and, by Corollary 2.4.1, the dimensions of adjacent vector spaces above differ by 1. Thus, by (43), and the fact that Theorem 2.4.4 tells us that \(A_2'\) must be of degree \(c_2'\), we have that

$$\begin{aligned} A_2' = \beta T^{c_2 - c_1} A_1 + A_2 \end{aligned}$$

for some \(\beta \in {\mathbb {F}}_q\). The one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \) follows by similar reasoning as in Claim 1.

Claim 3 Consider the first case given in Claim 3. The fact that \(\varvec{\alpha }' \in {\mathscr {L}}_{n+1} (n_1, n_1, 0 )\), and the claims on the characteristic degrees, are not difficult to deduce. Thus, we restrict our attention to the characteristic polynomials. By definition, \(A_1'\) is the polynomial that spans the kernel of \(H_{n_1, n_1 +1} (\varvec{\alpha }')\), and Theorem 2.4.4 tells us that \({{\,\textrm{deg}\,}}A_1' = n_1\). Now, the matrix \(H_{n_1, n_1 +1} (\varvec{\alpha }')\) is the same as the matrix \(H_{n_1 -1, n_1 +1} (\varvec{\alpha })\) but with an additional row, and we know that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{n_1 -1 , n_1 +1} (\varvec{\alpha }) = \{ B_1 A_1 + B_2 A_2 : B_1 , B_2 \in {\mathbb {F}}_q \}, \end{aligned}$$

and so

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{n_1 , n_1 +1} (\varvec{\alpha }') \subseteq \{ B_1 A_1 + B_2 A_2 : B_1 , B_2 \in {\mathbb {F}}_q \}. \end{aligned}$$

Since we know \({{\,\textrm{deg}\,}}A_1' = n_1\), and that \({{\,\textrm{deg}\,}}A_1 = n_1\) and \({{\,\textrm{deg}\,}}A_2 < n_1\), we must have that \(A_1' = \beta A_2 + A_1\) for some \(\beta \in {\mathbb {F}}_q\). The one-to-one correspondence between \(\alpha _{n+1}\) and \(\beta \) follows as previously. The second characteristic polynomial is, by definition, the polynomial in the kernel of \(H_{n_1 -1, n_1 +2} (\varvec{\alpha }')\) that is not a multiple of \(A_1'\). Similar to the previous claims, we note that \(H_{n_1 -1, n_1 +2} (\varvec{\alpha }')\) is the same as the matrix \(H_{n_1 -1, n_1 +1} (\varvec{\alpha }')\) but with an additional column. So, because \(A_2\) is in the kernel of the latter, we can see that it must be in the kernel of the former. Hence \(A_2' = A_2\).

The second case of Claim 3 is very similar to Claim 1. However, for the second bullet point, one should note that, because \(\varvec{\alpha }\) is quasi-regular (unlike in the analogous result in Claim 1), we require that \({{\,\textrm{deg}\,}}A_1' = n_1 +1\), and this is why we take \(A_1' = \beta A_2 + T A_1\) (as opposed to \(A_2' = \beta A_2 + T A_1 = \beta A_2 + T^{c_2 - c_1} A_1\) which would be completely analogous to Claim 1). However, this “swapping” of \(A_1'\) and \(A_2'\) is important and natural as it plays a role in the manifestation of the Euclidean algorithm that we saw in Theorem 2.4.10 and its corollaries.

The third case of Claim 3 is very similar to Claim 2. Again, there is a similar “swapping” of \(A_1'\) and \(A_2'\). \(\square \)

Remark 2.4.16

Recall that in Sect. 1.4 we considered the generalization of Theorem 1.2.1 to higher moments such as the third, and we described that we would need to determine how many \(\varvec{\alpha }, \varvec{\beta }, \varvec{\gamma }\) there are such that \(H_{l+1, m+1}(\varvec{\alpha })\), \(H_{l+1, m+1}(\varvec{\beta })\), and \(H_{l+1, m+1}(\varvec{\gamma })\) have certain given ranks, and \(\varvec{\alpha } + \varvec{\beta } + \varvec{\gamma } = \varvec{0}\). Now that we have established various results on Hankel matrices, we can reduce this problem to certain special cases.

Let us write

$$\begin{aligned} \varvec{\alpha } =&(\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \\ \varvec{\beta } =&(\beta _0 , \beta _1 , \ldots , \beta _n ) \\ \varvec{\gamma } =&(\gamma _0 , \gamma _1 , \ldots , \gamma _n ) \end{aligned}$$

and

$$\begin{aligned} \varvec{\alpha }^- =&(\alpha _0 , \alpha _1 , \ldots , \alpha _{n-1} ) \\ \varvec{\beta }^- =&(\beta _0 , \beta _1 , \ldots , \beta _{n-1} ) \\ \varvec{\gamma }^- =&(\gamma _0 , \gamma _1 , \ldots , \gamma _{n-1} ) \end{aligned}$$

Suppose we know how many \(\varvec{\alpha }^-, \varvec{\beta }^-, \varvec{\gamma }^-\) there are with certain given \((\rho , \pi )\)-characteristics and \(\varvec{\alpha }^- + \varvec{\beta }^- + \varvec{\gamma }^- = \varvec{0}\). If at least one of these sequences \(\varvec{\alpha }^-, \varvec{\beta }^-, \varvec{\gamma }^-\) has non-zero \(\pi \)-characteristic (that is, not all are quasi-regular) then we can determine the \((\rho , \pi )\)-characteristics of \(\varvec{\alpha }, \varvec{\beta }, \varvec{\gamma }\), even when we impose the condition that \(\varvec{\alpha } + \varvec{\beta } + \varvec{\gamma } = \varvec{0}\).

Indeed, without loss of generality, suppose that \(\pi (\varvec{\gamma }^- ) \ge 1\). Theorem 2.4.13 allows us to determine the \((\rho , \pi )\)-characteristics of \(\varvec{\alpha }, \varvec{\beta }\) based on the \((\rho , \pi )\)-characteristics of \(\varvec{\alpha }^-, \varvec{\beta }^-\). However, if, for example, \(\varvec{\alpha }^-\) is quasi-regular, then the \((\rho , \pi )\)-characteristic of \(\varvec{\alpha }\) would depend on \(\alpha _n\). Theorem 2.4.13 tells us how many values \(\alpha _n\) can take so that the \((\rho , \pi )\)-characteristic of \(\varvec{\alpha }\) takes a specific value. However, it does not tell us exactly what those values are, and this is why it is important that \(\pi (\varvec{\gamma }^- ) \ge 1\): Regardless of the values of \(\alpha _n\) and \(\beta _n\), we know there exists \(\gamma _n\) such that \(\alpha _n + \beta _n + \gamma _n = 0\), and we still know what the \((\rho , \pi )\)-characteristic of \(\varvec{\gamma }\) is because it is independent of the value of \(\gamma _n\) (which uses the fact that \(\pi (\varvec{\gamma }^- ) \ge 1\)).

Thus we have reduced our problem to the cases where \(\varvec{\alpha }^-, \varvec{\beta }^-, \varvec{\gamma }^-\) are all quasi-regular.

3 The variance of the divisor function

We now prove Theorem 1.2.1.

Proof of Theorem 1.2.1

In what follows, \(n \ge 4\).

By (10), we have that

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _{\sigma _{z}} (A;<h) |^2&= \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _z (B) \bigg )^2 \nonumber \\&\quad - {\left\{ \begin{array}{ll} q^{2h} (n+1)^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ q^{2h} \Big ( \frac{q^{(n+1)z} - 1}{q^z -1} \Big )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}. \end{array}\right. } \end{aligned}$$
(44)

So, we will consider the first term on the right side.

$$\begin{aligned}&\sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _{z} (B) \bigg )^2 \\&\quad = \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A)< h \end{array}} \sigma _{z} (B) \bigg )^2 \\&\quad = \sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A)< h \end{array}} \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} \sum _{E \in {\mathcal {M}}_l} \sum _{F \in {\mathcal {M}}_m} \mathbb {1}_{EF=B} \hspace{0.5em} |F |^z \bigg )^2 \\&\quad = \sum _{A \in {\mathcal {A}}_{\le n }} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A) < h \end{array}} \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{E \in {\mathcal {M}}_l} \sum _{F \in {\mathcal {M}}_m} \mathbb {1}_{EF=B} \bigg )^2 . \end{aligned}$$

For the last line, we changed the first summation range from \(A \in {\mathcal {M}}_n\) to all \(A \in {\mathcal {A}}\) with \({{\,\textrm{deg}\,}}A \le n\). This does not change the result because the conditions on E and F force A to be in \({\mathcal {M}}_n\). Continuing, we have

$$\begin{aligned}&\sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _{z} (B) \bigg )^2 \\&\quad = \sum _{A \in {\mathcal {A}}_{\le n }} \bigg ( \sum _{\begin{array}{c} B \in {\mathcal {M}}_n \\ {{\,\textrm{deg}\,}}(B-A) < h \end{array}} \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{E \in {\mathcal {M}}_l} \sum _{F \in {\mathcal {M}}_m} \prod _{i=0}^{n} \mathbb {1}_{\{ EF \}_i = b_i } \bigg )^2 . \end{aligned}$$

Here, for a polynomial C, we let \(\{ C \}_i\) denote its i-th coefficient, which is convenient when working with products of polynomials such as EF. We also denote the i-th coefficients of A, B, E, F by \(a_i, b_i, e_i, f_i\), respectively.

We will now express the sums over A, B, E, F as sums over their coefficients. For example, \(\sum _{A \in {\mathcal {A}}_{\le n }}\) will be expressed as \(\sum _{a_0, \ldots , a_n \in {\mathbb {F}}_q}\). We also note that if \(A'\) satisfies \({{\,\textrm{deg}\,}}(A-A') < h\), then (due to the sum over B) both A and \(A'\) give the same contribution. For a given A, there are \(q^h\) such \(A'\). Thus, we can multiply the whole expression by \(q^h\) and consider only the A that are of the form

$$\begin{aligned} A = 0 + 0 T + 0 T^2 + \ldots + 0 T^{h-1} + a_h T^h + a_{h+1} T^{h+1} + \ldots + a_{n} T^{n} . \end{aligned}$$

Furthermore, this means that B is of the form

$$\begin{aligned} B = b_0 + b_1 T + b_2 T^2 + \ldots + b_{h-1} T^{h-1} + a_h T^h + a_{h+1} T^{h+1} + \ldots + a_{n} T^{n} . \end{aligned}$$

Thus, using the above and applying (11) to the terms \(\mathbb {1}_{\{ EF \}_i = b_i }\), we obtain

$$\begin{aligned} \begin{aligned}&\sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _{z} (B) \bigg )^2 \\&\quad = \frac{q^{h}}{q^{2n+2}} \sum _{a_h , \ldots , a_n \in {\mathbb {F}}_q } \\&\quad \quad \Bigg ( \sum _{b_0 , \ldots , b_{h-1} \in {\mathbb {F}}_q} \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} e_0 , \ldots , e_{l-1} \in {\mathbb {F}}_q \\ e_l =1 \end{array}} \sum _{\begin{array}{c} f_0 , \ldots , f_{m-1} \in {\mathbb {F}}_q \\ f_m =1 \end{array}}\\&\quad \quad \prod _{i=0}^{h-1} \sum _{\alpha _i \in {\mathbb {F}}_q } \psi \bigg ( \alpha _i \Big ( b_i - \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \Big ) \bigg ) \\&\quad \quad \times \prod _{i=h}^{n} \sum _{\alpha _i \in {\mathbb {F}}_q } \psi \bigg ( \alpha _i \Big ( a_i - \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \Big ) \bigg ) \Bigg )^2 , \end{aligned} \end{aligned}$$
(45)

where we remind the reader that \(\psi \) is a non-trivial additive character, defined on page 7. Now, consider the sum over one of the coefficients \(b_i\), and apply (11). We obtain

$$\begin{aligned} \frac{1}{q} \sum _{b_i \in {\mathbb {F}}_q } \psi ( \alpha _i b_i ) = {\left\{ \begin{array}{ll} 1 &{}\text { if }\alpha _i = 0, \\ 0 &{}\text { if }\alpha _i \in {\mathbb {F}}_q^*. \end{array}\right. } \end{aligned}$$

Thus, we require \(\alpha _i = 0\) in order to have a non-zero contribution to (45). Applying this for \(i = 0, \ldots , h-1\), we obtain

$$\begin{aligned} \begin{aligned}&\sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _{z} (B) \bigg )^2 \\&\quad = \frac{q^{3h}}{q^{2n+2}} \sum _{a_h , \ldots , a_n \in {\mathbb {F}}_q } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} e_0 , \ldots , e_{l-1} \in {\mathbb {F}}_q \\ e_l =1 \end{array}} \sum _{\begin{array}{c} f_0 , \ldots , f_{m-1} \in {\mathbb {F}}_q \\ f_m =1 \end{array}} \prod _{i=h}^{n} \sum _{\alpha _i \in {\mathbb {F}}_q } \psi \\&\quad \quad \bigg ( \alpha _i \Big ( a_i - \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \Big ) \bigg ) \Bigg )^2 \\&\quad = \frac{q^{3h}}{q^{2n+2}} \sum _{a_h , \ldots , a_n \in {\mathbb {F}}_q } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} e_0 , \ldots , e_{l-1} \in {\mathbb {F}}_q \\ e_l =1 \end{array}} \sum _{\begin{array}{c} f_0 , \ldots , f_{m-1} \in {\mathbb {F}}_q \\ f_m =1 \end{array}} \prod _{i=h}^{n} \sum _{\alpha _i \in {\mathbb {F}}_q } \psi \\&\quad \quad \bigg ( \alpha _i \Big ( a_i - \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \Big ) \bigg ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} g_0 , \ldots , g_{l' -1} \in {\mathbb {F}}_q \\ g_{l'} =1 \end{array}} \sum _{\begin{array}{c} h_0 , \ldots , h_{m' -1} \in {\mathbb {F}}_q \\ h_{m'} =1 \end{array}} \prod _{i=h}^{n} \sum _{\beta _i \in {\mathbb {F}}_q } \psi \bigg ( \beta _i \Big ( a_i - \sum _{\begin{array}{c} 0 \le l_1' \le l' \\ 0 \le m_1' \le m' \\ l_1' + m_1' = i \end{array}} g_{l_1'} h_{m_1'} \Big ) \bigg ) \Bigg ) . \end{aligned} \end{aligned}$$
(46)

We now consider the sum over one of the coefficients \(a_i\). Unlike the \(b_i\) which appeared within the largest parentheses, the \(a_i\) appear outside. Thus, we must simultaneously consider the terms within each pair of parentheses whose product forms the square, and that is why we have written them explicitly in the last line above. Again, we apply (11) to obtain

$$\begin{aligned} \frac{1}{q} \sum _{a_i \in {\mathbb {F}}_q } \psi \big ( (\alpha _i + \beta _i) a_i \big ) = {\left\{ \begin{array}{ll} 1 &{}\text { if }\alpha _i = - \beta _i, \\ 0 &{}\text { if }\alpha _i \ne - \beta _i. \end{array}\right. } \end{aligned}$$

Applying this to (46) gives

$$\begin{aligned}&\sum _{A \in {\mathcal {M}}_n} \bigg ( \sum _{B \in I (A;<h)} \sigma _{z} (B) \bigg )^2 \nonumber \\&\quad = q^{2h-n-1} \sum _{\alpha _{h} , \alpha _{h+1} , \ldots , \alpha _{n} \in {\mathbb {F}}_q} \nonumber \\&\quad \quad \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} e_0 , \ldots , e_{l-1} \in {\mathbb {F}}_q \\ e_l =1 \end{array}} \sum _{\begin{array}{c} f_0 , \ldots , f_{m-1} \in {\mathbb {F}}_q \\ f_m =1 \end{array}} \prod _{i=h}^{n} \psi \bigg ( - \alpha _i \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \bigg ) \Bigg ) \nonumber \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} g_0 , \ldots , g_{l' -1} \in {\mathbb {F}}_q \\ g_{l'} =1 \end{array}} \sum _{\begin{array}{c} h_0 , \ldots , h_{m' -1} \in {\mathbb {F}}_q \\ h_{m'} =1 \end{array}} \prod _{i=h}^{n} \psi \bigg ( \alpha _i \sum _{\begin{array}{c} 0 \le l_1' \le l' \\ 0 \le m_1' \le m' \\ l_1' + m_1' = i \end{array}} g_{l_1'} h_{m_1'} \bigg ) \Bigg ) \nonumber \\&\quad = q^{2h-n-1} \sum _{\varvec{\alpha } \in {\mathscr {L}}_n^h } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} e_0 , \ldots , e_{l-1} \in {\mathbb {F}}_q \\ e_l =1 \end{array}} \sum _{\begin{array}{c} f_0 , \ldots , f_{m-1} \in {\mathbb {F}}_q \\ f_m =1 \end{array}} \prod _{i=h}^{n} \psi \bigg ( - \alpha _i \sum _{\begin{array}{c} 0 \le l_1 \le l \\ 0 \le m_1 \le m \\ l_1 + m_1 = i \end{array}} e_{l_1} f_{m_1} \bigg ) \Bigg ) \nonumber \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} g_0 , \ldots , g_{l' -1} \in {\mathbb {F}}_q \\ g_{l'} =1 \end{array}} \sum _{\begin{array}{c} h_0 , \ldots , h_{m' -1} \in {\mathbb {F}}_q \\ h_{m'} =1 \end{array}} \prod _{i=h}^{n} \psi \bigg ( \alpha _i \sum 
_{\begin{array}{c} 0 \le l_1' \le l' \\ 0 \le m_1' \le m' \\ l_1' + m_1' = i \end{array}} g_{l_1'} h_{m_1'} \bigg ) \Bigg ) \nonumber \\&\quad = q^{2h-n-1} \sum _{\varvec{\alpha } \in {\mathscr {L}}_n^h } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \bigg ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \bigg ) \Bigg ) \nonumber \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {{\mathbb {F}}_q}^{m'} \times \{ 1 \} \end{array}} \psi \bigg ( {\textbf{g}}^T H_{l' + 1 , m' + 1} (\varvec{\alpha }) {\textbf{h}} \bigg ) \Bigg ) \nonumber \\&\quad = {\left\{ \begin{array}{ll} q^{2h+n} \sum _{r=0, h+1 , h+2 , \ldots , n_1 -1} \sum _{\varvec{\alpha }^- \in {\mathscr {L}}_{n-1}^h (r,r,0) } (n+1-2r)^2 q^{-2r} &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ q^{2h+n} \sum _{r=0, h+1 , h+2 , \ldots , n_1 -1} \sum _{\varvec{\alpha }^- \in {\mathscr {L}}_{n-1}^h (r,r,0) } \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 q^{-2r} &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}} \end{array}\right. }\nonumber \\&\quad = {\left\{ \begin{array}{ll} (q-1) q^{h+n-1} \sum _{r=h+1}^{n_1 -1} (n+1-2r)^2 \hspace{2em} + q^{2h+n} (n+1)^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ (q-1) q^{h+n-1} \sum _{r=h+1}^{n_1 -1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 \hspace{2em} + q^{2h+n} \Big ( \frac{q^{(n+1)z} - 1}{q^z -1} \Big )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}; \end{array}\right. } \end{aligned}$$
(47)

where we remind the reader that \(n_1:= \lfloor \frac{n+2}{2} \rfloor \) and \(n_2:= \lfloor \frac{n+3}{2} \rfloor \). For the second equality, \(\varvec{\alpha }:= (\alpha _0, \alpha _1, \ldots , \alpha _n)\); of course, since \(\varvec{\alpha } \in {\mathscr {L}}_n^h\), we have \(\alpha _0, \alpha _1, \ldots , \alpha _{h-1} = 0\), which is consistent with the previous line. For the third equality, we are writing \({\textbf{e}} = (e_0, e_1, \ldots , e_l )^T\) with \(e_l = 1\), and similarly for \({\textbf{f}}, {\textbf{g}}, {\textbf{h}}\). The fourth equality uses Lemma 3.0.1, and the fifth equality uses Claims 2 and 3 from Theorem 2.3.1. So, applying this to (44) gives

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _{\sigma _{z}} (A;<h) |^2 = {\left\{ \begin{array}{ll} (q-1) q^{h-1} \frac{(n-2h-1)(n-2h)(n-2h+1)}{6} &{}\text { for }h \le n_1 -2, \\ 0 &{}\text { for }h \ge n_1 -1 \end{array}\right. } \end{aligned}$$

if \(z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}\); and

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} |\Delta _{\sigma _{z}} (A;<h) |^2 = {\left\{ \begin{array}{ll} (q-1) q^{h-1} \sum _{r=h+1}^{n_1 -1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 &{}\text { for }h \le n_1 -2, \\ 0 &{}\text { for }h \ge n_1 -1 \end{array}\right. } \end{aligned}$$

if \(z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}\). \(\square \)
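Since the case \(z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}\) reduces to the classical divisor function \(d = \sigma _0\), the formula above can be checked by brute force for small parameters. The following sketch is ours: it takes \(q = 3\), \(n = 4\), \(h = 1\) (so \(h = n_1 - 2\)), enumerates all monic polynomials of degree n over \({\mathbb {F}}_3\) as coefficient tuples, and compares the empirical variance of the short-interval sums with the stated closed form.

```python
import itertools
from fractions import Fraction

# Our arbitrary small parameters: n >= 4 and h <= n_1 - 2 = 1.
q, n, h = 3, 4, 1

def monic(deg):
    # All monic polynomials of the given degree, as coefficient
    # tuples (c_0, ..., c_deg) with c_deg = 1.
    for lower in itertools.product(range(q), repeat=deg):
        yield lower + (1,)

def mul(a, b):
    # Product of two coefficient tuples over F_q.
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return tuple(out)

# d(B) = #{(E, F) monic : EF = B}, for every monic B of degree n.
d = {}
for l in range(n + 1):
    for E in monic(l):
        for F in monic(n - l):
            B = mul(E, F)
            d[B] = d.get(B, 0) + 1

mean = q**h * (n + 1)  # mean of the short-interval sums
var = Fraction(0)
for A in monic(n):
    # I(A; <h): monic B of degree n agreeing with A from T^h upwards.
    s = sum(d[b + A[h:]] for b in itertools.product(range(q), repeat=h))
    var += Fraction((s - mean) ** 2, q**n)

assert var == (q - 1) * q**(h - 1) * Fraction((n - 2*h - 1) * (n - 2*h) * (n - 2*h + 1), 6)
```

Here d(B) is computed by counting ordered factorisations \(B = EF\) into monic polynomials, matching the definition of \(d_2\).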

Lemma 3.0.1

In what follows we have

$$\begin{aligned} \varvec{\alpha } = (\alpha _0 , \alpha _1 , \ldots , \alpha _n ) \in {\mathbb {F}}_q^{n+1} \end{aligned}$$

and we define

$$\begin{aligned} \varvec{\alpha }^- := (\alpha _0 , \alpha _1 , \ldots , \alpha _{n-1} ) . \end{aligned}$$

The two claims below address all possible values that \(\varvec{\alpha }\) could take.

Claim 1 Suppose \(\varvec{\alpha }\) is such that \(\varvec{\alpha }^{-} \in {\mathscr {L}}_{n-1} (r, \rho _1, \pi _1 )\), where \(0 \le r \le n_1 -1\) and \(\pi _1 \ge 1\). For all \(l,m \ge 0\) with \(l+m=n\), we have

$$\begin{aligned} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \bigg ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \bigg ) = 0 . \end{aligned}$$
(48)

Now suppose n is odd and \(\varvec{\alpha }\) is such that \(\varvec{\alpha }^{-} \in {\mathscr {L}}_{n-1} (n_1, 0, 0 )\). For all \(l,m \ge 0\) with \(l+m=n\), we have

$$\begin{aligned} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \bigg ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \bigg ) = 0 . \end{aligned}$$

Claim 2 Suppose \(\varvec{\alpha }\) is such that \(\varvec{\alpha }^- \in {\mathscr {L}}_{n-1} (r,r,0)\), where \(0 \le r \le n_1 -1\). We will fix \(\varvec{\alpha }^-\) and consider all possible extensions \(\varvec{\alpha }\); that is, we let \(\alpha _n\) vary in \({\mathbb {F}}_q\). We have

$$\begin{aligned}&\sum _{\alpha _n \in {\mathbb {F}}_q } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {{\mathbb {F}}_q}^{m'} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{g}}^T H_{l' + 1 , m' + 1} (\varvec{\alpha }) {\textbf{h}} \Big ) \Bigg ) \\&\quad ={\left\{ \begin{array}{ll} q^{2n - 2r +1} (n+1-2r)^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ q^{2n - 2r +1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}. \end{array}\right. } \end{aligned}$$

Proof

In what follows, \(e_i\) and \(f_i\) are the i-th entries of \({\textbf{e}}\) and \({\textbf{f}}\), respectively.

Claim 1 Consider the first result in this claim. Since \(\pi _1 \ge 1\), Theorem 2.4.4 tells us that for \(m \le n-r-1\) there is no monic polynomial of degree m in the kernel of \(H_{l, m+1} (\varvec{\alpha }^-)\). That is, there is no vector \({\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \}\) such that \(H_{l, m+1} (\varvec{\alpha }^-) {\textbf{f}} = {\textbf{0}}\). Therefore, for any \({\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \}\) we can find a row, \(R_i\), of \(H_{l, m+1} (\varvec{\alpha })\) such that \(R_i {\textbf{f}} \ne 0\). Now, consider only the sum and terms involving \(e_i\) on the left side of (48); we have

$$\begin{aligned} \sum _{e_i \in {\mathbb {F}}_q} \psi ( - e_i R_i {\textbf{f}} ) = 0 , \end{aligned}$$

where we have used (11) and the fact that \(R_i {\textbf{f}} \ne 0\). Thus, the left side of (48) is indeed 0.
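The orthogonality relation (11) that drives this vanishing can be checked numerically. The sketch below is ours and assumes q is prime, taking \(\psi (x) = e^{2 \pi i x / q}\) as one concrete choice of non-trivial additive character:

```python
import cmath

# Assumption: q is prime, so F_q = Z/qZ and psi(x) = exp(2*pi*i*x/q)
# is a non-trivial additive character.
q = 7

def psi(x):
    return cmath.exp(2j * cmath.pi * (x % q) / q)

# (11): (1/q) * sum_{b in F_q} psi(alpha * b) is 1 if alpha = 0, else 0.
for alpha in range(q):
    s = sum(psi(alpha * b) for b in range(q)) / q
    assert abs(s - (1 if alpha == 0 else 0)) < 1e-9
```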

Let us now consider when \(m \ge n-r\). Let \(c_1^-:= r\) and \(c_2^-:= n+1-r\). Theorem 2.4.4 tells us that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-) = \bigg \{ B_1 A_1 + B_2 A_2 : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1^- \\ {{\,\textrm{deg}\,}}B_2 \le m - c_2^- \end{array} \bigg \} , \end{aligned}$$
(49)

for some \(A_1 \in {\mathcal {A}}_{\rho _1}\) and \(A_2 \in {\mathcal {A}}_{c_2^-}\). Note that in this case, there are monic polynomials of degree m in the kernel of \(H_{l,m+1} (\varvec{\alpha }^-)\). That is, there are vectors \({\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \}\) such that \(H_{l, m+1} (\varvec{\alpha }^-) {\textbf{f}} = {\textbf{0}}\). By similar reasoning as above, these are the only candidates for non-zero contributions to the left side of (48).

Now, \(A_1, A_2\) are the characteristic polynomials of \(\varvec{\alpha }^-\). Theorem 2.4.13 tells us that

$$\begin{aligned} \varvec{\alpha } \in {\left\{ \begin{array}{ll} {\mathscr {L}}_{n} (n_1, n_1 , 0 ) &{}\text { if }n\text { is even and }r= n_1 -1, \\ {\mathscr {L}}_{n} (r+1 , \rho _1 , \pi _1 +1) &{}\text { otherwise.} \end{array}\right. } \end{aligned}$$

Consider the latter case. Theorem 2.4.13 also tells us that the characteristic polynomials of \(\varvec{\alpha }\) are \(A_1\) and \(\beta T^{c_2^- - c_1^- } A_1 + A_2\), and so Theorem 2.4.4 tells us that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l+1 , m+1} (\varvec{\alpha }) = \bigg \{ B_1 A_1 + B_2 (\beta T^{c_2^- - c_1^- } A_1 + A_2 ) : \begin{array}{c} B_1 , B_2 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m - c_1^- -1 \\ {{\,\textrm{deg}\,}}B_2 \le m - c_2^- \end{array} \bigg \} . \end{aligned}$$

Comparing this to (49), we see that

$$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-) ={{\,\textrm{ker}\,}}H_{l+1 , m+1} (\varvec{\alpha }) \hspace{1em} \oplus \hspace{1em} \{ \gamma T^{m-c_1^- } A_1 : \gamma \in {\mathbb {F}}_q \} . \end{aligned}$$
(50)

(This is not surprising as the dimension of \({{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-) \) is just one more than the dimension of \({{\,\textrm{ker}\,}}H_{l+1, m+1} (\varvec{\alpha })\)). So, for \({\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \}\) that are in \({{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-) \), we can write \({\textbf{f}} = {\textbf{f}}^- + \gamma {\textbf{g}}_m\); where \({\textbf{f}}^- \in {{\,\textrm{ker}\,}}H_{l+1, m+1} (\varvec{\alpha })\), \(\gamma \in {\mathbb {F}}_q\), and \({\textbf{g}}_m\) is the vector associated with the polynomial \(T^{m-c_1^- } A_1\). Hence,

$$\begin{aligned} {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} = \gamma R_{l+1} {\textbf{g}}_m , \end{aligned}$$

where \(R_{l+1}\) is the last row of \(H_{l+1, m+1} (\varvec{\alpha })\). Thus, we have

$$\begin{aligned}&\sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \\&\quad = \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}}^- \in ({{\mathbb {F}}_q}^m \times \{ 1 \}) \cap {{\,\textrm{ker}\,}}H_{l+1 , m+1} (\varvec{\alpha }) \\ \gamma \in {\mathbb {F}}_q \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) ({\textbf{f}}^- +\gamma {\textbf{g}}_m ) \Big ) \\&\quad = \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}}^- \in ({{\mathbb {F}}_q}^m \times \{ 1 \}) \cap {{\,\textrm{ker}\,}}H_{l+1 , m+1} (\varvec{\alpha }) \\ \gamma \in {\mathbb {F}}_q \end{array}} \psi \Big ( - \gamma R_{l+1} {\textbf{g}}_m \Big ) \\&\quad = q^l C \sum _{\gamma \in {\mathbb {F}}_q} \psi \Big ( - \gamma R_{l+1} {\textbf{g}}_m \Big ) = 0 , \end{aligned}$$

where for the last equality we have used (11) and the fact that \(R_{l+1} {\textbf{g}}_m \ne 0\) (since \({\textbf{g}}_m \not \in {{\,\textrm{ker}\,}}H_{l+1, m+1} (\varvec{\alpha })\)), and C is the number of \({\textbf{f}}^-\) in \(({{\mathbb {F}}_q}^m \times \{ 1 \}) \cap {{\,\textrm{ker}\,}}H_{l+1, m+1} (\varvec{\alpha })\) (which we can calculate explicitly but do not need to).

The case where n is even and \(r=n_1 -1\) follows similarly as above.

The second result of Claim 1 is also proved similarly to the above.
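The computations in these claims all exploit the fact that, for the Hankel matrix \(H_{l+1, m+1} (\varvec{\alpha })\) with (i, j) entry \(\alpha _{i+j}\), the bilinear form \({\textbf{e}}^T H_{l+1, m+1} (\varvec{\alpha }) {\textbf{f}}\) equals the pairing \(\sum _{i=0}^{n} \alpha _i \{ EF \}_i\) of \(\varvec{\alpha }\) with the coefficients of the product EF. This can be verified directly on a small random example (a sketch with our own parameter choices):

```python
import itertools, random

q, l, m = 5, 2, 3   # our arbitrary small parameters; n = l + m
n = l + m
random.seed(0)
alpha = [random.randrange(q) for _ in range(n + 1)]

# Coefficient vectors of monic E of degree l and F of degree m.
e = [random.randrange(q) for _ in range(l)] + [1]
f = [random.randrange(q) for _ in range(m)] + [1]

# Coefficients of the product EF: {EF}_i = sum_{l1 + m1 = i} e_{l1} f_{m1}.
ef = [0] * (n + 1)
for i, j in itertools.product(range(l + 1), range(m + 1)):
    ef[i + j] = (ef[i + j] + e[i] * f[j]) % q

# e^T H_{l+1, m+1}(alpha) f, with Hankel matrix H[i][j] = alpha_{i+j}.
bilinear = sum(e[i] * alpha[i + j] * f[j]
               for i in range(l + 1) for j in range(m + 1)) % q

# The bilinear form equals the pairing of alpha with the coefficients of EF.
assert bilinear == sum(a * c for a, c in zip(alpha, ef)) % q
```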

Claim 2 By similar means as in Claim 1, we can show that

$$\begin{aligned} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) = 0 , \end{aligned}$$
(51)

for \(m \le r-1\) and for \(m \ge n+1-r\).

So, suppose that \(r \le m \le n-r\). Consider

$$\begin{aligned} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) . \end{aligned}$$

As previously, a non-zero contribution requires that \({\textbf{f}} \in {{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-)\). Let \(R_{l+1} (\varvec{\alpha })\) be the last row of \(H_{l+1,m+1} (\varvec{\alpha })\). We can deduce the following two points from Theorem 2.4.13:

  • There are \(q-1\) values of \(\alpha _{n}\) such that

    $$\begin{aligned} \varvec{\alpha } \in {\left\{ \begin{array}{ll} {\mathscr {L}}_{n} (n_1 , n_1 , 0) &{}\text { if }n\text { is even and }r=n_1 -1, \\ {\mathscr {L}}_{n} (r+1 , r , 1) &{}\text { otherwise.} \end{array}\right. } \end{aligned}$$

    Theorem 2.4.4 tells us that

    $$\begin{aligned} {{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m-r \end{array} \Big \} \end{aligned}$$

    and

    $$\begin{aligned} {{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha }) = \Big \{ B_1 A_1 : \begin{array}{c} B_1 \in {\mathcal {A}} \\ {{\,\textrm{deg}\,}}B_1 \le m-1-r \end{array} \Big \} , \end{aligned}$$

    for some \(A_1 \in {\mathcal {M}}_{r}\). Thus, any \({\textbf{f}} \in ({{\mathbb {F}}_q}^m \times \{ 1 \}) \cap {{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-)\) can be written as \({\textbf{f}} = {\textbf{f}}^- + {\textbf{a}}_m\), where \({\textbf{f}}^- \in {{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha })\) and \({\textbf{a}}_{m}\) is the vector associated with the polynomial \(T^{m-r} A_1\). This gives

    $$\begin{aligned} {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} = R_{l+1} (\varvec{\alpha }) {\textbf{a}}_m \ne 0. \end{aligned}$$

    Note that \(R_{l+1} (\varvec{\alpha }) {\textbf{a}}_m\) is independent of the values of l and m (so long as \(r \le m \le n-r\)). This is because the non-zero entries of \({\textbf{a}}_m\) occur in the last \(r+1\) entries, and the last \(r+1\) entries of \(R_{l+1} (\varvec{\alpha })\) are independent of the value of l. Similar statements hold for the sums over \({\textbf{g}}, {\textbf{h}}\), but we should keep in mind that, in these sums, there is no negative sign in front of \({\textbf{g}}^T\) (unlike for \({\textbf{e}}^T\)). Hence, we have

    $$\begin{aligned}&\Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {{\mathbb {F}}_q}^{m'} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{g}}^T H_{l' + 1 , m' + 1} (\varvec{\alpha }) {\textbf{h}} \Big ) \Bigg ) \\&\quad = \Bigg ( \sum _{\begin{array}{c} l+m=n \\ r \le m \le n-r \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}}^- \in {{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha }) \end{array}} \psi \Big ( - R_{l+1} (\varvec{\alpha }) {\textbf{a}}_m \Big ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} l' + m' = n \\ r \le m' \le n-r \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}}^- \in {{\,\textrm{ker}\,}}H_{l' +1,m' +1} (\varvec{\alpha }) \end{array}} \psi \Big ( R_{l' +1} (\varvec{\alpha }) {\textbf{a}}_{m'} \Big ) \Bigg ) \\&\quad = \Bigg ( \sum _{\begin{array}{c} l+m=n \\ r \le m \le n-r \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}}^- \in {{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha }) \end{array}} 1 \Bigg ) \Bigg ( \sum _{\begin{array}{c} l' + m' = n \\ r \le m' \le n-r \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}}^- \in {{\,\textrm{ker}\,}}H_{l' +1,m' +1} (\varvec{\alpha }) \end{array}} 1 \Bigg ) \\&\quad ={\left\{ \begin{array}{ll} \big ( (n+1-2r) q^{n-r} \big )^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ \bigg ( \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big ) q^{n-r} \bigg )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}; \end{array}\right. } \end{aligned}$$

    where we have used the fact that \(|{{\mathbb {F}}_q}^l \times \{ 1 \} |= q^l\) and \(|{{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha }) |= q^{(m+1) - (r+1)}\).

  • There is one value of \(\alpha _{n}\) such that \(\varvec{\alpha } \in {\mathscr {L}}_{n} (r,r,0)\). In that case, all \({\textbf{f}}\) in \({{\,\textrm{ker}\,}}H_{l,m+1} (\varvec{\alpha }^-)\) are also in \({{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha })\), thus giving \(R_{l+1} (\varvec{\alpha }) {\textbf{f}} = 0\). A similar statement holds for the sums over \({\textbf{g}}, {\textbf{h}}\). Hence, we have

    $$\begin{aligned}&\Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {{\mathbb {F}}_q}^{m'} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{g}}^T H_{l' + 1 , m' + 1} (\varvec{\alpha }) {\textbf{h}} \Big ) \Bigg ) \\&\quad = \Bigg ( \sum _{\begin{array}{c} l+m=n \\ r \le m \le n-r \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}}^- \in {{\,\textrm{ker}\,}}H_{l+1,m+1} (\varvec{\alpha }) \end{array}} 1 \Bigg ) \Bigg ( \sum _{\begin{array}{c} l' + m' = n \\ r \le m' \le n-r \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}}^- \in {{\,\textrm{ker}\,}}H_{l' +1,m' +1} (\varvec{\alpha }) \end{array}} 1 \Bigg ) \\&\quad ={\left\{ \begin{array}{ll} \big ( (n+1-2r) q^{n-r} \big )^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ \bigg ( \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big ) q^{n-r} \bigg )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}; \end{array}\right. } \end{aligned}$$

Thus, we have

$$\begin{aligned}&\sum _{\alpha _n \in {\mathbb {F}}_q } \Bigg ( \sum _{\begin{array}{c} 0 \le l,m \le n \\ l+m=n \end{array}} q^{mz} \sum _{\begin{array}{c} {\textbf{e}} \in {{\mathbb {F}}_q}^l \times \{ 1 \} \\ {\textbf{f}} \in {{\mathbb {F}}_q}^m \times \{ 1 \} \end{array}} \psi \Big ( - {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \Bigg ) \\&\quad \quad \times \Bigg ( \sum _{\begin{array}{c} 0 \le l' , m' \le n \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {{\mathbb {F}}_q}^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {{\mathbb {F}}_q}^{m'} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{g}}^T H_{l' + 1 , m' + 1} (\varvec{\alpha }) {\textbf{h}} \Big ) \Bigg ) \\&\quad ={\left\{ \begin{array}{ll} q^{2n - 2r +1} (n+1-2r)^2 &{}\text { if }z \in \frac{2 \pi i}{\log q} {\mathbb {Z}}, \\ q^{2n - 2r +1} \Big ( \frac{q^{(n+1-r)z} - q^{rz}}{q^z -1} \Big )^2 &{}\text { if }z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}. \end{array}\right. } \end{aligned}$$

\(\square \)

Lemma 3.0.2

For \(z \in {\mathbb {C}} \backslash \frac{2 \pi i}{\log q} {\mathbb {Z}}\), we have that

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} N_{\sigma _z} (A;<h) = q^h \Big ( \frac{q^{(n+1)z} -1}{q^z -1} \Big ) . \end{aligned}$$

Proof

First, we note that

$$\begin{aligned} \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} N_{\sigma _z} (A;<h) = \frac{1}{q^n} \sum _{A \in {\mathcal {M}}_n} \sum _{B \in I (A;<h)} \sigma _z (B) = \frac{q^h}{q^n} \sum _{A \in {\mathcal {M}}_n} \sigma _z (A) , \end{aligned}$$

and so it suffices to evaluate the far-right side. Now, on one hand we have

$$\begin{aligned} \bigg ( \sum _{A \in {\mathcal {M}}} \frac{1}{|A |^s} \bigg ) \bigg ( \sum _{A \in {\mathcal {M}}} \frac{|A |^z }{|A |^s} \bigg ) = \sum _{A \in {\mathcal {M}}} \frac{\sigma _z (A)}{|A |^s} , \end{aligned}$$

for \({\text {Re}}(s) > {\text {Re}}(z)+1\). On the other hand, we have

$$\begin{aligned} \bigg ( \sum _{A \in {\mathcal {M}}} \frac{1}{|A |^s} \bigg ) \bigg ( \sum _{A \in {\mathcal {M}}} \frac{|A |^z }{|A |^s} \bigg ) = \bigg ( \sum _{n=0}^{\infty } q^{n(1-s)} \bigg ) \bigg ( \sum _{n=0}^{\infty } q^{n(z+1-s)} \bigg ) = \sum _{n=0}^{\infty } q^{n(1-s)} \sum _{i=0}^{n} q^{iz} . \end{aligned}$$

Comparing the coefficients of \(q^{-ns}\), we obtain the required result. \(\square \)
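As a quick sanity check (not part of the proof), the coefficient identity \(\sum _{A \in {\mathcal {M}}_n} \sigma _z (A) = q^n \sum _{i=0}^{n} q^{iz}\) can be verified by brute force for small parameters. The following sketch is illustrative only: it assumes q prime, takes z a positive integer so all sums are exact, and represents a polynomial over \({\mathbb {F}}_q\) by its list of coefficients (lowest degree first); all function names are our own.

```python
from itertools import product

q, n, z = 3, 3, 1  # small test parameters; q must be prime here

def divides(b, a):
    # Does the monic polynomial b divide a over F_q?  Schoolbook long division;
    # since b is monic, each quotient coefficient is just the current lead term.
    r = list(a)
    while len(r) >= len(b) and any(r):
        while r and r[-1] == 0:
            r.pop()
        if len(r) < len(b):
            break
        c, shift = r[-1], len(r) - len(b)
        for i, x in enumerate(b):
            r[shift + i] = (r[shift + i] - c * x) % q
    return not any(r)

def monics(d):
    # All monic polynomials of degree d, as coefficient lists (low to high).
    return [list(t) + [1] for t in product(range(q), repeat=d)]

def sigma(a):
    # sigma_z(A) = sum over monic divisors B of |B|^z, with |B| = q^{deg B}.
    return sum(q ** (d * z)
               for d in range(len(a))
               for b in monics(d) if divides(b, a))

lhs = sum(sigma(a) for a in monics(n))
rhs = q ** n * sum(q ** (i * z) for i in range(n + 1))
print(lhs == rhs)  # True
```

For these parameters both sides equal \(q^3 (1 + q + q^2 + q^3) = 1080\), matching the pair-counting \(\sum _{A \in {\mathcal {M}}_n} \sigma _z (A) = \sum _{m=0}^{n} q^{n} q^{mz}\).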

4 Correlations

We begin this section by proving Theorem 1.2.2.

Proof of Theorem 1.2.2

The proof is very similar to the proof of Theorem 1.2.1. We have

$$\begin{aligned}&\frac{1}{q^{n+h}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } \sigma _{z} (A) \sigma _{z} (A+B) \\&\quad = \frac{1}{q^{n+h}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \end{array}} \mathbb {1}_{EF=A} |F |^{z} \bigg ) \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} G \in {\mathcal {M}}_{l'} \\ H \in {\mathcal {M}}_{m'} \end{array}} \mathbb {1}_{GH=A+B} |H |^{z}\bigg ) \\&\quad = \frac{1}{q^{n+h}} \sum _{A \in {\mathcal {A}}_{\le n} } \sum _{B \in {\mathcal {A}}_{<h} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} q^{m z} \sum _{\begin{array}{c} E \in {\mathcal {M}}_l \\ F \in {\mathcal {M}}_m \end{array}} \prod _{i=0}^{n} \mathbb {1}_{ \{ EF \}_i = \{ A \}_i } \bigg )\\&\quad \quad \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} G \in {\mathcal {M}}_{l'} \\ H \in {\mathcal {M}}_{m'} \end{array}} \prod _{i=0}^{n} \mathbb {1}_{ \{ GH \}_i = \{ A \}_i + \{ B\}_i } \bigg ) \\&\quad = \frac{1}{q^{3n+h+2}} \sum _{{\textbf{a}} \in {\mathbb {F}}_q^{n+1} } \sum _{{\textbf{b}} \in {\mathbb {F}}_q^{h} } \bigg ( \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} q^{m z} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - {\textbf{a}} \cdot \varvec{\alpha } \Big ) \bigg ) \\&\quad \quad \times \bigg ( \sum _{\varvec{\beta } \in {\mathbb {F}}_q^{n+1} } \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} q^{m' z} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \Big ( {\textbf{g}}^T H_{l'+1 , m'+1} (\varvec{\beta }) {\textbf{h}} - ({\textbf{a}} + {\textbf{b}}) \cdot \varvec{\beta } \Big ) \bigg ) . \end{aligned}$$

As previously, the sum over \({\textbf{a}}\) will force \(\varvec{\beta } = - \varvec{\alpha }\), while the sum over \({\textbf{b}}\) will force the first h entries of \(\varvec{\alpha }\) (and \(\varvec{\beta }\)) to be zero. Thus, we have

$$\begin{aligned}&\frac{1}{q^{n+h}} \sum _{A \in {\mathcal {M}}_n } \sum _{B \in {\mathcal {A}}_{<h} } \sigma _{z} (A) \sigma _{z} (A+B) \\&\quad = \frac{1}{q^{2n+1}} \sum _{\varvec{\alpha } \in {\mathscr {L}}_{n}^h } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l+m=n \end{array}} q^{m z} \hspace{-0.75em} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \hspace{-0.5em} \psi \Big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \Big ) \bigg ) \\&\quad \quad \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} q^{m' z} \hspace{-0.75em} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \hspace{-0.5em} \psi \Big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }) {\textbf{h}} \Big ) \bigg ) . \end{aligned}$$

This is identical to the third-to-last equality of (47) multiplied by \(q^{-n-2h}\). Thus, the remainder of (47) immediately gives our required result. \(\square \)
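The mechanism driving all of these manipulations is that, for the coefficient vectors \({\textbf{e}}, {\textbf{f}}\) of monic E and F, the bilinear form \({\textbf{e}}^T H_{l+1,m+1} (\varvec{\alpha }) {\textbf{f}}\) pairs \(\varvec{\alpha }\) against the coefficient vector of EF, which is what makes the character sums over \({\textbf{a}}\) and \({\textbf{b}}\) act as indicator functions. A small numerical sketch (our own illustration, assuming, as in the paper's setup, that \(H_{l+1,m+1} (\varvec{\alpha })\) is the \((l+1) \times (m+1)\) Hankel matrix with (i, j) entry \(\alpha _{i+j}\); the specific e and f below are arbitrary choices):

```python
from itertools import product

q = 3
l, m = 2, 1
e = [2, 0, 1]  # coefficients (low to high) of a monic E of degree l
f = [1, 1]     # coefficients of a monic F of degree m
n = l + m

def hankel(alpha, rows, cols):
    # The rows x cols Hankel matrix with (i, j) entry alpha_{i+j}.
    return [[alpha[i + j] for j in range(cols)] for i in range(rows)]

def bilinear(u, H, v):
    # u^T H v over F_q.
    return sum(u[i] * H[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v))) % q

def coeffs_of_product(a, b):
    # Coefficients of the product polynomial over F_q.
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return out

c = coeffs_of_product(e, f)  # coefficient vector of EF, degree n
for alpha in product(range(q), repeat=n + 1):
    H = hankel(alpha, l + 1, m + 1)
    # e^T H(alpha) f = sum_k alpha_k {EF}_k, by rearranging the double sum.
    assert bilinear(e, H, f) == sum(a * ck for a, ck in zip(alpha, c)) % q
print("bilinear form matches the coefficient pairing for all alpha")
```

Since \({\textbf{e}}^T H (\varvec{\alpha }) {\textbf{f}} = \sum _k \alpha _k \{ EF \}_k\), summing \(\psi \big ( {\textbf{e}}^T H (\varvec{\alpha }) {\textbf{f}} - {\textbf{a}} \cdot \varvec{\alpha } \big )\) over \(\varvec{\alpha }\) detects exactly when the coefficients of EF agree with \({\textbf{a}}\).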

We will now prove Theorem 1.2.3.

Proof of Theorem 1.2.3

As with previous proofs, we will use additive characters. We first prove that

$$\begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) (n+1) - q^{-{{\,\textrm{deg}\,}}Q} (k - {{\,\textrm{deg}\,}}Q +1) (n+1) . \end{aligned}$$

We have

$$\begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) \\&\quad = \frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_{l} \\ F \in {\mathcal {M}}_{m} \\ EF = KQ+N \end{array}} 1 \bigg ) \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} G \in {\mathcal {M}}_{l'} \\ H \in {\mathcal {M}}_{m'} \\ GH = N \end{array}} 1 \bigg ) \\&\quad = \frac{1}{q^{k+n}} \sum _{K \in {\mathcal {A}}_{\le k} } \sum _{N \in {\mathcal {A}}_{\le n} } \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} E \in {\mathcal {M}}_{l} \\ F \in {\mathcal {M}}_{m} \end{array}} \prod _{i=0}^{k+{{\,\textrm{deg}\,}}Q} \mathbb {1}_{ \{ EF \}_i = \{ KQ \}_i + \{ N \}_i } \bigg ) \\&\quad \quad \times \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} G \in {\mathcal {M}}_{l'} \\ H \in {\mathcal {M}}_{m'} \end{array}} \prod _{i=0}^{n} \mathbb {1}_{ \{ GH \}_i = \{ N \}_i } \bigg ) \\&\quad = \frac{1}{q^{{{\,\textrm{deg}\,}}Q + 2k + 2n + 2}} \sum _{{\textbf{k}} \in {\mathbb {F}}_q^{k+1}} \sum _{{\textbf{n}} \in {\mathbb {F}}_q^{n+1}} \\&\quad \quad \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q +1}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - {\textbf{k}}^T H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} - {\textbf{n}} \cdot \varvec{\alpha }' \big ) \bigg ) \\&\quad \quad \times \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \sum _{\varvec{\beta } \in {\mathbb {F}}_q^{n +1}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (\varvec{\beta }) {\textbf{h}} - {\textbf{n}} \cdot \varvec{\beta } \big ) \bigg ) ; \end{aligned}$$

where \(\varvec{\alpha } = (\alpha _0, \alpha _1, \ldots , \alpha _{k + {{\,\textrm{deg}\,}}Q})\) and we define \(\varvec{\alpha }':= (\alpha _0, \alpha _1, \ldots , \alpha _{n})\), and we also write \(Q = q_0 + q_1 T + \ldots + q_{{{\,\textrm{deg}\,}}Q} T^{{{\,\textrm{deg}\,}}Q}\) and define \({\textbf{q}}:= (q_0, q_1, \ldots , q_{{{\,\textrm{deg}\,}}Q})\). Now, similar to what we have seen in previous proofs, the sum over \({\textbf{n}}\) will force \(\varvec{\beta }\) to equal \(-\varvec{\alpha }'\), while the sum over \({\textbf{k}}\) will require \(\varvec{\alpha }\) to be such that \(H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}}\). Thus, we have

$$\begin{aligned} \begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) \\&\quad = \frac{1}{q^{{{\,\textrm{deg}\,}}Q +k+n}} \sum _{\begin{array}{c} \varvec{\alpha } \in {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q +1} \\ H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}} \end{array}} \bigg ( \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \big ) \bigg ) \\&\quad \quad \times \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }') {\textbf{h}} \big ) \bigg ) . \end{aligned} \end{aligned}$$
(52)

First, we consider the sum

$$\begin{aligned} \sum _{\begin{array}{c} l , m \ge 0 \\ l + m = {{\,\textrm{deg}\,}}Q +k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \big ). \end{aligned}$$

Let \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\), and let us investigate the possible values of \(r, \rho _1, \pi _1\). The theorem assumes that \(k \ge {{\,\textrm{deg}\,}}Q -1\), and so, by Theorem 2.4.4, there exists some \(A_1 \in {\mathcal {M}}_{\rho _1 }\) such that \({{\,\textrm{ker}\,}}H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) = \{ B_1 A_1: B_1 \in {\mathcal {A}}, {{\,\textrm{deg}\,}}B_1 \le {{\,\textrm{deg}\,}}Q - r \}\). Since Q is in this kernel, and since it is prime, we must have that either \(B_1\) can take the value Q, or \(A_1 = Q\). That is, either \(r=0\) and so \(\rho _1 = \pi _1 = 0\), or \(r = {{\,\textrm{deg}\,}}Q\) and so \(\rho _1 = {{\,\textrm{deg}\,}}Q\) and \(\pi _1 = 0\). The former simply means \(\varvec{\alpha } = {\textbf{0}}\). We will consider each case separately.

For \(\varvec{\alpha } = {\textbf{0}}\), we have

$$\begin{aligned}&\sum _{\begin{array}{c} l , m \ge 0 \\ l + m = {{\,\textrm{deg}\,}}Q +k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} ({\textbf{0}}) {\textbf{f}} \big ) \\&\quad = \sum _{\begin{array}{c} l , m \ge 0 \\ l + m = {{\,\textrm{deg}\,}}Q +k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( 0 \big ) = q^{{{\,\textrm{deg}\,}}Q +k} \sum _{\begin{array}{c} l , m \ge 0 \\ l + m = {{\,\textrm{deg}\,}}Q +k \end{array}} 1 \\&\quad = q^{{{\,\textrm{deg}\,}}Q +k} ({{\,\textrm{deg}\,}}Q + k +1) . \end{aligned}$$

For \(\varvec{\alpha } \in {\mathscr {L}}_n ({{\,\textrm{deg}\,}}Q, {{\,\textrm{deg}\,}}Q, 0 )\), we have

$$\begin{aligned}&\sum _{\begin{array}{c} l , m \ge 0 \\ l + m = {{\,\textrm{deg}\,}}Q +k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \big )\! =\! \sum _{\begin{array}{c} l + m = {{\,\textrm{deg}\,}}Q +k \\ {{\,\textrm{deg}\,}}Q \le m \le k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \big ) \\&\quad = \sum _{\begin{array}{c} l + m = {{\,\textrm{deg}\,}}Q +k \\ {{\,\textrm{deg}\,}}Q \le m \le k \end{array}} q^l \sum _{\begin{array}{c} {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \\ {\textbf{f}} \in {{\,\textrm{ker}\,}}H_{l , m+1} (\varvec{\alpha }) \end{array}} 1 = \sum _{\begin{array}{c} l + m = {{\,\textrm{deg}\,}}Q +k \\ {{\,\textrm{deg}\,}}Q \le m \le k \end{array}} q^l q^{m - {{\,\textrm{deg}\,}}Q} = q^k (k - {{\,\textrm{deg}\,}}Q +1) , \end{aligned}$$

where the first equality uses (51).

We apply these two results to (52). To do so, we define

$$\begin{aligned} S := \bigg \{ \varvec{\alpha } \in {\mathbb {F}}_{q}^{{{\,\textrm{deg}\,}}Q + k +1} : \begin{array}{c} \alpha _i = 0 \text { for all } i \in \{ 0 , \ldots , n -1 , n+1 , \ldots , {{\,\textrm{deg}\,}}Q -1 \} \\ H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}} \end{array} \bigg \} . \end{aligned}$$

For \(\varvec{\alpha } \in S\), it may be helpful to note that, due to the condition \(H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}}\), the terms \(\alpha _{{{\,\textrm{deg}\,}}Q}, \ldots , \alpha _{{{\,\textrm{deg}\,}}Q + k}\) can be expressed entirely in terms of \(\alpha _{0}, \ldots , \alpha _{{{\,\textrm{deg}\,}}Q -1}\) of which only \(\alpha _{n}\) could be non-zero. Further, if \(\alpha _n = 0\), then \(\varvec{\alpha } = {\textbf{0}}\); while if \(\alpha _{n} \ne 0\), then \(\varvec{\alpha } \in {\mathscr {L}}_n ({{\,\textrm{deg}\,}}Q, {{\,\textrm{deg}\,}}Q, 0 )\). We also note that for \(\varvec{\alpha } \not \in S\) satisfying \(H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}}\), we have \(\varvec{\alpha } \in {\mathscr {L}}_n ({{\,\textrm{deg}\,}}Q, {{\,\textrm{deg}\,}}Q, 0 )\).

Now, we consider the cases \(\varvec{\alpha } = {\textbf{0}}\), \(\varvec{\alpha } \in S \backslash \{ {\textbf{0}} \}\), and \(\varvec{\alpha } \not \in S\) separately. We have

$$\begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N)\nonumber \\&\quad = q^{-n} ({{\,\textrm{deg}\,}}Q + k +1) \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} ({\textbf{0}} ) {\textbf{h}} \big )\nonumber \\&\quad \quad + q^{-{{\,\textrm{deg}\,}}Q - n} (k - {{\,\textrm{deg}\,}}Q +1) \sum _{\varvec{\alpha } \in S \backslash \{ {\textbf{0}} \} } \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }') {\textbf{h}} \big ) \bigg ) \nonumber \\&\quad \quad + q^{-{{\,\textrm{deg}\,}}Q - n} (k - {{\,\textrm{deg}\,}}Q +1) \sum _{\begin{array}{c} \varvec{\alpha } \in {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q +1} \\ H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}} \\ \varvec{\alpha } \not \in S \end{array}} \nonumber \\&\quad \quad \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }') {\textbf{h}} \big ) \bigg ) . \end{aligned}$$
(53)

Consider the case \(\varvec{\alpha } \not \in S\) first. Let \(\varvec{\alpha }'' = ( \alpha _0, \alpha _1, \ldots , \alpha _{n-1})\). By Claim 1 in Lemma 3.0.1, the non-zero contributions occur when \(\varvec{\alpha }'' \in {\mathscr {L}}_{n-1} (r,r,0)\) for some \(0 \le r \le n_1 -1\). By (51), we need only consider when \(r \le m' \le n-r\).

Note that if \(r \ne 0\), then \(\alpha _{n+1}, \ldots , \alpha _{{{\,\textrm{deg}\,}}Q -1}\) do not affect our sum, and so they are free to take any values in \({\mathbb {F}}_q\), giving \(q^{{{\,\textrm{deg}\,}}Q -n-1}\) possibilities; while if \(r=0\), they can take any values except all being 0 simultaneously, for otherwise \(\varvec{\alpha } \in S\), giving \(q^{{{\,\textrm{deg}\,}}Q -n-1} -1\) possibilities. We define

$$\begin{aligned} c_{\varvec{\alpha }^{\prime \prime }} = {\left\{ \begin{array}{ll} q^{{{\,\textrm{deg}\,}}Q -n-1} &{}\text { if }\varvec{\alpha }'' \in {\mathscr {L}}_{n-1} (r,r,0) \text { with }r \ne 0, \\ q^{{{\,\textrm{deg}\,}}Q -n-1} -1 &{}\text { if }\varvec{\alpha }'' \in {\mathscr {L}}_{n-1} (r,r,0) \text { with }r=0. \end{array}\right. } \end{aligned}$$

So, we have

$$\begin{aligned}&\sum _{\begin{array}{c} \varvec{\alpha } \in {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q +1} \\ H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}} \\ \varvec{\alpha } \not \in S \end{array}} \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }') {\textbf{h}} \big ) \\&\quad = \sum _{r=0}^{n_1 -1} \sum _{\begin{array}{c} \varvec{\alpha }'' \in {\mathscr {L}}_{n-1} (r,r,0) \\ \alpha _{n} \in {\mathbb {F}}_q \end{array}} c_{\varvec{\alpha }'' } \sum _{\begin{array}{c} l' + m' = n \\ r \le m' \le n-r \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (\varvec{\alpha }') {\textbf{h}} \big ) \\&\quad = \sum _{r=0}^{n_1 -1} \sum _{\varvec{\alpha }'' \in {\mathscr {L}}_{n-1} (r,r,0)} c_{\varvec{\alpha }'' } \sum _{\begin{array}{c} l' + m' = n \\ r \le m' \le n-r \end{array}} q^{l'} \sum _{\begin{array}{c} {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \\ H_{l' , m'+1} (- \varvec{\alpha }') {\textbf{h}} = {\textbf{0}} \end{array}} \sum _{\alpha _{n} \in {\mathbb {F}}_q} \psi \Big ( R_{l' +1} (\varvec{\alpha }') \cdot {\textbf{h}} \Big ) \\&\quad = 0 , \end{aligned}$$

where \(R_{l' +1} (\varvec{\alpha }')\) is the \((l'+1)\)-th row of \(H_{l' +1, m'+1} (- \varvec{\alpha }')\), and we have used the fact that \(\sum _{\alpha _{n} \in {\mathbb {F}}_q} \psi \Big ( R_{l' +1} (\varvec{\alpha }') \cdot {\textbf{h}} \Big ) = \sum _{\alpha _{n} \in {\mathbb {F}}_q} \psi ( \alpha _n ) = 0\).
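The two character-sum evaluations used here and in the next case, namely \(\sum _{\alpha \in {\mathbb {F}}_q} \psi ( \alpha ) = 0\) and \(\sum _{\alpha \in {\mathbb {F}}_q \backslash \{ 0 \}} \psi ( \alpha ) = -1\), can be checked numerically. The following sketch assumes q prime and takes \(\psi (x) = e^{2 \pi i x / q}\), the standard additive character in that case:

```python
import cmath

q = 7  # any prime works here

def psi(x):
    # The additive character psi(x) = exp(2*pi*i*x/q) on F_q (q prime).
    return cmath.exp(2j * cmath.pi * (x % q) / q)

full = sum(psi(a) for a in range(q))        # sum over all of F_q
punctured = sum(psi(a) for a in range(1, q))  # sum over F_q \ {0}

print(abs(full) < 1e-12, abs(punctured + 1) < 1e-12)  # True True
```

The second fact is just the first minus the term \(\psi (0) = 1\), which is exactly how it enters the \(\varvec{\alpha } \in S \backslash \{ {\textbf{0}} \}\) case below.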

Let us now consider the case \(\varvec{\alpha } \in S \backslash \{ 0 \}\) in (53). By a similar argument as above, but using \(\sum _{\alpha _{n} \in {\mathbb {F}}_q \backslash \{ 0 \}} \psi ( \alpha _n ) = -1\) instead, we obtain

$$\begin{aligned}&\sum _{\varvec{\alpha } \in S \backslash \{ {\textbf{0}} \} } \bigg ( \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (- \varvec{\alpha }') {\textbf{h}} \big ) \bigg ) \\&\quad = \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \sum _{\alpha _{n} \in {\mathbb {F}}_q \backslash \{ 0 \}} \psi \big ( \alpha _n \big ) \\&\quad = - q^n (n+1) . \end{aligned}$$

Finally, it is not difficult to see that

$$\begin{aligned}&\sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} ({\textbf{0}} ) {\textbf{h}} \big ) \\&\quad = q^n (n+1) . \end{aligned}$$

Applying these three results to (53) gives

$$\begin{aligned}&\frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) (n+1) - q^{-{{\,\textrm{deg}\,}}Q} (k - {{\,\textrm{deg}\,}}Q + 1) (n+1) . \\ \end{aligned}$$

We now prove that

$$\begin{aligned}&\bigg ( \frac{1}{q^n } \sum _{N \in {\mathcal {M}}_n} d(N) \bigg ) \bigg ( \frac{1}{q^{k + n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) \bigg ) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) (n+1) - q^{-{{\,\textrm{deg}\,}}Q} (k - {{\,\textrm{deg}\,}}Q +1) (n+1) . \end{aligned}$$

We have

$$\begin{aligned} \frac{1}{q^n } \sum _{N \in {\mathcal {M}}_n} d(N)&= \frac{1}{q^{2n+1} } \sum _{{\textbf{n}} \in {\mathbb {F}}_q^{n+1}} \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{n+1} } \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} (\varvec{\alpha } ) {\textbf{h}} - {\textbf{n}} \cdot \varvec{\alpha } \big ) \\&= \frac{1}{q^{n} } \sum _{\begin{array}{c} l' , m' \ge 0 \\ l' + m' = n \end{array}} \sum _{\begin{array}{c} {\textbf{g}} \in {\mathbb {F}}_q^{l'} \times \{ 1 \} \\ {\textbf{h}} \in {\mathbb {F}}_q^{m'} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{g}}^T H_{l'+1 , m'+1} ({\textbf{0}} ) {\textbf{h}} \big ) \\&= n+1 , \end{aligned}$$

where, for the second equality, similar to what we have seen previously, the sum over \({\textbf{n}}\) forces \(\varvec{\alpha } = {\textbf{0}}\).
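The mean value \(\frac{1}{q^n} \sum _{N \in {\mathcal {M}}_n} d(N) = n+1\) also follows directly from counting: d(N) is the number of ordered pairs (G, H) of monics with GH = N, and there are \(q^l \cdot q^{n-l}\) such pairs for each degree split, giving \((n+1) q^n\) pairs in total. A direct brute-force sketch (our own, with small illustrative parameters):

```python
from collections import Counter
from itertools import product

q, n = 2, 3  # small test parameters

def polymul(a, b):
    # Multiply two polynomials over F_q (coefficients low to high).
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return tuple(out)

def monics(d):
    # All monic polynomials of degree d, as coefficient tuples (low to high).
    return [tuple(t) + (1,) for t in product(range(q), repeat=d)]

# d(N) = number of ordered monic pairs (G, H) with GH = N, so tally every
# product of a monic of degree l with a monic of degree n - l.
counts = Counter()
for l in range(n + 1):
    for g in monics(l):
        for h in monics(n - l):
            counts[polymul(g, h)] += 1

assert len(counts) == q ** n  # every monic N of degree n occurs (e.g. as 1 * N)
print(sum(counts.values()) / q ** n)  # 4.0, i.e. n + 1
```

The same count is what produces the \(q^n (n+1)\) in the \(\varvec{\alpha } = {\textbf{0}}\) evaluation above, since every character value there is \(\psi (0) = 1\).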

We also have

$$\begin{aligned}&\frac{1}{q^{k + n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) \\&\quad = \frac{1}{q^{{{\,\textrm{deg}\,}}Q + 2k + n + 1}} \sum _{{\textbf{k}} \in {\mathbb {F}}_q^{k+1}} \sum _{{\textbf{n}} \in {\mathbb {F}}_q^{n} \times \{ 1 \} } \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \sum _{\varvec{\alpha } \in {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q +1}} \\&\quad \quad \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - {\textbf{k}}^T H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} - {\textbf{n}} \cdot \varvec{\alpha }' \big ) \\&\quad = \frac{1}{q^{{{\,\textrm{deg}\,}}Q + k }} \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} \varvec{\alpha } \in \{ 0 \}^{n} \times {\mathbb {F}}_q^{k+{{\,\textrm{deg}\,}}Q - n +1} \\ H_{k+1 , {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}} \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - \alpha _n \big ) . \end{aligned}$$

Again, for the last equality, the sum over \({\textbf{n}} \in {\mathbb {F}}_q^{n} \times \{ 1 \}\) forces \(\alpha _0, \alpha _1, \ldots , \alpha _{n-1} = 0\), while the sum over \({\textbf{k}}\) forces the requirement that \(H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}}\). As previously, the contribution of \(\varvec{\alpha } \not \in S\) is zero. Thus, we have

$$\begin{aligned}&\frac{1}{q^{k + n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) \\&\quad = \frac{1}{q^{{{\,\textrm{deg}\,}}Q + k }} \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} ({\textbf{0}}) {\textbf{f}} \big ) \\&\quad \quad + \frac{1}{q^{{{\,\textrm{deg}\,}}Q + k }} \sum _{\begin{array}{c} l,m \ge 0 \\ l + m = k + {{\,\textrm{deg}\,}}Q \end{array}} \sum _{\begin{array}{c} \varvec{\alpha } \in S \backslash \{ 0 \} \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} - \alpha _n \big ) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) + \frac{1}{q^{{{\,\textrm{deg}\,}}Q + k }} \sum _{\begin{array}{c} l + m = k + {{\,\textrm{deg}\,}}Q \\ {{\,\textrm{deg}\,}}Q \le m \le k \end{array}} \sum _{\begin{array}{c} {\textbf{e}} \in {\mathbb {F}}_q^{l} \times \{ 1 \} \\ {\textbf{f}} \in {\mathbb {F}}_q^{m} \times \{ 1 \} \end{array}} \psi \big ( {\textbf{e}}^T H_{l+1 , m+1} (\varvec{\alpha }) {\textbf{f}} \big ) \sum _{\alpha _n \in {\mathbb {F}}_q^* } \psi ( - \alpha _n ) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) + \frac{1}{q^{{{\,\textrm{deg}\,}}Q + k }} \bigg ( \sum _{\begin{array}{c} l + m = k + {{\,\textrm{deg}\,}}Q \\ {{\,\textrm{deg}\,}}Q \le m \le k \end{array}} q^{l+m-{{\,\textrm{deg}\,}}Q} \bigg ) \bigg ( \sum _{\alpha _n \in {\mathbb {F}}_q^* } \psi ( - \alpha _n ) \bigg ) \\&\quad = ({{\,\textrm{deg}\,}}Q + k +1) - \frac{1}{q^{{{\,\textrm{deg}\,}}Q }} (k - {{\,\textrm{deg}\,}}Q +1) . \end{aligned}$$

\(\square \)

Remark 4.0.1

We can see from the proof of Theorem 1.2.3 that in evaluating the sum

$$\begin{aligned} \frac{1}{q^{k+n}} \sum _{K \in {\mathcal {M}}_k } \sum _{N \in {\mathcal {M}}_n } d (KQ+N) d (N) , \end{aligned}$$

the sequences denoted by \(\varvec{\alpha }\) address the polynomial \(KQ+N\), while their truncations \(\varvec{\alpha }'\) address the polynomial N. However, because of the range we have for k (the degree of K), the value of \(\varvec{\alpha }'\) does not affect the \((\rho , \pi )\)-form of \(\varvec{\alpha }\) (except the special case where \(\varvec{\alpha } = {\textbf{0}}\)). This is why \(d (KQ+N)\) and d(N) are uncorrelated for the given ranges of k and n.

If, instead, we took a smaller value of k, which is what we would find in fourth moment calculations of Dirichlet L-functions, then the \((\rho , \pi )\)-form of \(\varvec{\alpha }\) becomes dependent on the value of \(\varvec{\alpha }'\), thus making it more difficult to evaluate the sum. In effect, for given \(r, \rho _1, \pi _1\) and \(r', \rho _1 ', \pi _1 '\) we must determine how many \(\varvec{\alpha }\) there are such that

1. \(\varvec{\alpha } \in {\mathscr {L}}_n (r, \rho _1, \pi _1 )\),

2. \(H_{k+1, {{\,\textrm{deg}\,}}Q + 1} (\varvec{\alpha }) {\textbf{q}} = {\textbf{0}}\),

3. \(\varvec{\alpha }' \in {\mathscr {L}}_n (r', \rho _1 ', \pi _1 ' )\).

In fact, by Claim 1 of Lemma 3.0.1, we need only consider the cases where \(\pi _1, \pi _1 ' \in \{ 0, 1 \}\). We can reformulate the three conditions above in terms of coprime polynomials \(A_1, A_2\). Indeed, by Theorem 2.4.4, condition 1 is equivalent to certain degree restrictions on the characteristic polynomials \(A_1, A_2\); condition 2 is equivalent to Q being a certain linear combination of \(A_1, A_2\); and by Corollary 2.4.11, condition 3 is equivalent to certain degree restrictions on the polynomials we obtain by applying the Euclidean algorithm to \(A_1, A_2\). It is not difficult to satisfy any two of the three conditions, but satisfying all three is more difficult.