1 Introduction

Denote by \({{\mathbb {F}}}_q\) the finite field with q elements, where q is a prime power. Let \({{\mathbb {F}}}_q[x]\) denote the ring of polynomials over \({{\mathbb {F}}}_q\) in the indeterminate x. For any ring R and positive integers n, k, define \(M_{n,k}(R)\) to be the set of all \(n\times k\) matrices over R. Similarly, \(M_k(R)\) denotes the ring of \(k\times k\) matrices over R. Denote by \(I_{n,k}\) the matrix in \(M_{n,k}({{\mathbb {F}}}_q)\) whose (i, j)th entry is zero whenever \(i\ne j\) and equal to 1 for \(i=j\).

The main objects of study in this paper are matrix polynomials over finite fields. A matrix polynomial over a field F in the variable x is a sum \(\sum _{i=0}^{d}A_ix^i\), where \(A_i\in M_{n,k}(F)\) \((0\le i\le d)\) for some fixed positive integers n, k. It is often convenient to view such a matrix polynomial as a single matrix whose entries are polynomials in x (sometimes referred to as a polynomial matrix), and we freely alternate between these two points of view. A matrix polynomial \(\mathbf{A}=\sum _{i=0}^{d}A_ix^i\in M_{n,k}({{\mathbb {F}}}_q[x])\) is unimodular if the greatest common divisor of all \(r\times r\) minors of \(\mathbf{A}\) is equal to 1, where \(r=\min \{n,k\}\). The notion of unimodularity can be defined more generally for rectangular matrices over an arbitrary integral domain. A landmark result in the setting of unimodularity is the Quillen–Suslin theorem [22, 25], formerly known as Serre’s conjecture. We refer to [13, 18, 20, 26] for other contexts where unimodularity is considered. We begin with a combinatorial question concerning matrix polynomials over a finite field.

Question 1.1

Given positive integers n, k and a prime power q, determine the number of matrices \(A\in M_{n,k}({{\mathbb {F}}}_q)\) for which the matrix polynomial \(xI_{n,k}-A\) is unimodular.

This question was essentially considered by Kocięcki and Przyłuski [16] (also see [24, Prob. 1.2]) in an attempt to determine the number of reachable pairs of matrices over a finite field. Reachability is a fundamental notion in the control theory of linear systems. The question was fully answered only recently by Lieb, Jordan and Helmke [19, Thm. 1] who showed that the answer is equal to \(\prod _{i=1}^{k}(q^n-q^i)\). We refer to the introduction of [24] for details and alternate formulations of the result of Lieb et al. Our main result is Lemma 2.10 which allows us to give a new proof (Corollary 2.13) of the theorem of Lieb et al. An essential ingredient in our main lemma is a control-theoretic result of Brunovský on completely controllable pairs.

Further applications of our results appear in Sects. 3 and 4. In Sect. 3, we consider splitting subspaces (defined below) which were introduced by Niederreiter [21, Def. 1] in the context of his work on the multiple recursive matrix method for pseudorandom number generation.

Definition 1.2

Let d, m be positive integers and consider the vector space \({{\mathbb {F}}}_{q^{md}}\) over \({{\mathbb {F}}}_q\). For any element \(\alpha \in {{\mathbb {F}}}_{q^{md}}\), an m-dimensional subspace W of \({{\mathbb {F}}}_{q^{md}}\) is \(\alpha \)-splitting if

$$\begin{aligned} {{\mathbb {F}}}_{q^{md}}=W\oplus \alpha W\oplus \cdots \oplus \alpha ^{d-1}W. \end{aligned}$$

Niederreiter was interested in the following question on splitting subspaces.

Question 1.3

Given \(\alpha \in {{\mathbb {F}}}_{q^{md}}\) such that \({{\mathbb {F}}}_{q^{md}}={{\mathbb {F}}}_q(\alpha )\), what is the number of \(\alpha \)-splitting subspaces of \({{\mathbb {F}}}_{q^{md}}\) of dimension m?

It may be noted that the same question was also considered by Goresky and Klapper (see the remark in [12, p. 1653] and [12, Thm. 3(4)]). In addition to the evident cryptographic aspect, Niederreiter’s question also has interesting connections with group theory and finite projective geometry via block companion Singer cycles. We refer to [8, 9] for more on this topic. The case \(m=2\) of Niederreiter’s question was settled in [9] using a result that answers the following question: What is the probability that two randomly chosen polynomials of a fixed positive degree over a finite field are coprime? This question on the probability of coprime polynomials goes back to an exercise in Knuth [15, §4.6.1, Ex. 5] and has subsequently been considered by Corteel, Savage, Wilf and Zeilberger [4] in the more general setting of combinatorial prefabs. Further results on the degree distribution of the greatest common divisor of random polynomials over a finite field appear in [6]. In fact, our main result relies on Lemma 2.4 which may be viewed as a probabilistic result on coprime polynomials. Chen and Tseng [2, Cor. 3.4] eventually answered Niederreiter’s question on splitting subspaces by proving the following theorem which was initially conjectured in [9, Conj. 5.5].

Theorem 1.4

(Splitting Subspace Theorem). For any \(\alpha \in {{\mathbb {F}}}_{q^{md}}\) such that \({{\mathbb {F}}}_{q^{md}}={{\mathbb {F}}}_q(\alpha )\), the number of \(\alpha \)-splitting subspaces of \({{\mathbb {F}}}_{q^{md}}\) of dimension m is precisely

$$\begin{aligned} \frac{q^{md}-1}{q^m-1}q^{m(m-1)(d-1)}. \end{aligned}$$
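For small parameters, the count in the Splitting Subspace Theorem can be checked by exhaustive search. The following Python sketch (an illustration only; the field model, the choice of \(\alpha \) as the class of x, and all names are ours) verifies the case \(q=2\), \(m=2\), \(d=2\), realizing \({{\mathbb {F}}}_{16}\) as \({{\mathbb {F}}}_2[x]/(x^4+x+1)\); the theorem predicts \(\frac{15}{3}\cdot 2^{2}=20\) splitting subspaces.

```python
# Brute-force check of the Splitting Subspace Theorem for q = 2, m = 2, d = 2.
# F_16 is modeled as F_2[x]/(x^4 + x + 1); elements are 4-bit integers and
# alpha = 2 (the class of x) generates F_16 over F_2.

def mul16(a, b):
    """Multiply two elements of F_16 = F_2[x]/(x^4 + x + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= 0b10011  # reduce modulo x^4 + x + 1
    return r

def rank_gf2(vecs):
    """Rank over F_2 of a list of bitmask vectors (xor-basis reduction)."""
    basis = []
    for v in vecs:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

alpha = 2
# Count ordered bases (u, v) of 2-dimensional subspaces W with
# F_16 = W (+) alpha*W, i.e. u, v, alpha*u, alpha*v linearly independent;
# each 2-dimensional subspace has (2^2 - 1)(2^2 - 2) = 6 ordered bases.
ordered = sum(1 for u in range(1, 16) for v in range(1, 16) if v != u
              and rank_gf2([u, v, mul16(alpha, u), mul16(alpha, v)]) == 4)
splitting = ordered // 6
```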

In this paper, a control-theoretic result of Wimmer (Theorem 3.8) is used to prove Theorem 3.9, from which the Splitting Subspace Theorem follows as a corollary. In Sect. 4, a generalization of Question 1.1 is considered. The answer to Question 1.1, stated earlier, can be given a probabilistic flavor as follows.

Theorem 1.5

If a matrix A is selected uniformly at random from \(M_{n,k}({{\mathbb {F}}}_q)\), then the probability that \(xI_{n,k}-A\) is unimodular is given by \(\prod _{i=1}^k (1-q^{i-n})\).

Using results in Sect. 2, we prove a conjecture (Theorem 4.1) proposed in [24] on the proportion of unimodular polynomial matrices which generalizes Theorem 1.5.

2 Simple linear transformations

We begin by recalling the notion of a simple linear transformation [24, Def. 3.1].

Definition 2.1

Let V denote a vector space over a field F and let W be a subspace of V. An F-linear transformation \(T:W\rightarrow V\) is simple if the only T-invariant subspace properly contained in V is the zero subspace.

Remark 2.2

Note that the definition requires that there be no T-invariant subspaces properly contained in V rather than in W; the reason is that if W is a proper subspace of V, then the definition thereby forbids W itself from being T-invariant. In the case \(W=V\), the subspace W is necessarily T-invariant. It can be shown that a linear operator T on a finite-dimensional vector space V is simple if and only if its characteristic polynomial is irreducible. In fact, simple maps defined on a proper subspace W of a vector space V are precisely the restrictions to W of simple maps defined on all of V.
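The operator case (\(W=V\)) of the irreducibility criterion in the remark can be made concrete. The following Python sketch (our own illustration, not part of the text's development) checks two \(2\times 2\) operators over \({{\mathbb {F}}}_2\) by brute force over the three one-dimensional subspaces:

```python
# For n = 2 over F_2, an operator is simple iff no 1-dimensional subspace is
# invariant, which should match irreducibility of its characteristic polynomial.

def apply(M, v, p=2):
    """Apply a matrix (list of rows) to a vector over F_p."""
    return tuple(sum(m * x for m, x in zip(row, v)) % p for row in M)

def is_simple(M):
    # The proper nonzero subspaces of F_2^2 are exactly the spans of the
    # three nonzero vectors; span{v} is invariant iff M v lies in {0, v}.
    return all(apply(M, v) not in {(0, 0), v} for v in [(1, 0), (0, 1), (1, 1)])

# [[0,1],[1,1]] has characteristic polynomial x^2 + x + 1 (irreducible over F_2),
# while [[1,0],[0,0]] has x(x - 1) and fixes span{(1,0)}.
```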

The following proposition elucidates the connection between simple linear transformations and unimodularity.

Proposition 2.3

Let V be an n-dimensional vector space over F with ordered basis \({\mathcal {B}}_n=\{v_1,\ldots ,v_n\}\). Let \({\mathcal {B}}_k=\{v_1,\ldots ,v_k\}\) denote the ordered basis for the subspace W spanned by \(v_1,\ldots ,v_k\). Let \(T:W\rightarrow V\) be a linear transformation and let \(Y\in M_{n,k}(F)\) denote the matrix of T with respect to \({\mathcal {B}}_k\) and \({\mathcal {B}}_n\). Then, T is simple if and only if \(xI_{n,k}-Y\) is unimodular.

Proof

See [24, Prop. 2.5] and [24, Prop. 3.2]. \(\square \)

Let m be a positive integer and let \(\mathbf{a} = (a_1,\ldots ,a_m)\in {\mathbb {F}}_{q}^m\) be an arbitrary but fixed nonzero vector. Let t be the largest index such that \(a_t\ne 0\). Let \(d_1\ge d_2\ge \cdots \ge d_m\) be a nonincreasing sequence of integers with \(d_t \ge -1\). Let \(N_\mathbf{a}(d_1,\ldots ,d_m)\) denote the number of m-tuples \((f_1,\ldots ,f_m)\) of polynomials over \({{\mathbb {F}}}_q\) such that \(f_i = a_i x^{d_i+1} + h_i\) and \(\deg h_i\le d_i\) for \(1\le i\le m\) with \(\gcd (f_1,\ldots ,f_m)=1\). Here, we interpret negative powers of x to be zero. Since \(a_t\ne 0\), we necessarily have \(\deg f_t = d_t+1\) for any tuple \((f_1,\ldots ,f_m)\) counted by \(N_\mathbf{a}(d_1,\ldots ,d_m)\). We adopt the convention that the degree of the zero polynomial is \(-\infty \). Note that if there is some \(s \ge t\) such that \(d_i<0\) for each \(s < i\le m\), then \(N_\mathbf{a}(d_1,\ldots ,d_m)=N_\mathbf{a'}(d_1,\ldots ,d_s)\) where \(\mathbf{a'}= (a_1,\ldots ,a_s)\).

We adapt an argument in the proof of [7, Thm. 4.1] to prove the following lemma which is central to our main result.

Lemma 2.4

Let m be a positive integer and let \(d_1\ge d_2\ge \cdots \ge d_m\ge 0\) be a sequence of integers. Let \(\mathbf{a} = (a_1,\ldots ,a_m)\in {\mathbb {F}}_{q}^m\) be a fixed nonzero vector. We have

$$\begin{aligned}N_\mathbf{a}(d_1,\ldots ,d_m)=q^{k+m}-q^{k+1},\end{aligned}$$

where \(k=d_1+\cdots +d_m\).

Proof

Fix a positive integer m. Let \(S(d_1,\ldots ,d_m)\) denote the set of ordered m-tuples \((f_1,\ldots ,f_m)\) where \(f_i = a_i x^{d_i+1} + h_i\) for some \(h_i\) with \(\deg h_i\le d_i\) for \(1\le i\le m\). Let t be the largest index such that \(a_t\ne 0\). We partition \(S(d_1,\ldots ,d_m)\) into disjoint subsets \(S_0,S_1,\ldots ,S_{d_t+1}\), where \(S_d \;(0\le d\le d_t+1)\) denotes the set of m-tuples in \(S(d_1,\ldots ,d_m)\) whose greatest common divisor is a monic polynomial of degree d. For each monic polynomial h over \({{\mathbb {F}}}_q\) of degree d and any coprime m-tuple \((g_1,\ldots ,g_m)\) of polynomials in \(S(d_1-d,\ldots ,d_m-d)\), it is easy to see that \((g_1h,g_2h,\ldots ,g_mh)\in S_d\). Conversely, for any tuple \((f_1,\ldots ,f_m)\in S_d\), the polynomial \(h=\gcd (f_1,\ldots ,f_m)\) is monic of degree d and \((f_1/h,\ldots ,f_m/h)\) is an ordered m-tuple of coprime polynomials in \(S(d_1-d,\ldots ,d_m-d)\). As a result, we have \(|S_d|=q^d N_\mathbf{a}(d_1-d,\ldots ,d_m-d)\) for \(0\le d\le d_t+1\). For \(k=d_1+\cdots +d_m\), we have

$$\begin{aligned} q^{k+m}=\sum _{d=0}^{d_t+1}|S_d|=\sum _{d=0}^{d_t+1}q^dN_\mathbf{a}(d_1-d,\ldots ,d_m-d). \end{aligned}$$
(1)

Replacing \(d_i\) by \(d_{i}+1\) for each \(1\le i\le m\), we obtain

$$\begin{aligned} q^{k+2m}&=\sum _{d=0}^{d_t+2}q^dN_\mathbf{a}(d_1+1-d,\ldots ,d_m+1-d) \\&=\sum _{d=-1}^{d_t+1}q^{d+1}N_\mathbf{a}(d_1-d,\ldots ,d_m-d) \\&=N_\mathbf{a}(d_1+1,\ldots ,d_m+1)+q\sum _{d=0}^{d_t+1}q^dN_\mathbf{a}(d_1-d,\ldots ,d_m-d)\\&=N_\mathbf{a}(d_1+1,\ldots ,d_m+1)+q(q^{k+m}), \end{aligned}$$

where the last equality follows from (1). It follows that \(N_\mathbf{a}(d_1+1,\ldots ,d_m+1)=q^{k+2m}(1-q^{1-m})\), or equivalently, \(N_\mathbf{a}(d_1,\ldots ,d_m)=q^{k+m}-q^{k+1}\) as desired. \(\square \)
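The lemma is easy to confirm by brute force for small parameters. The following Python sketch (an illustration with our own encoding of polynomials as bitmasks) verifies the case \(q=2\), \(m=2\), \(\mathbf{a}=(1,1)\), \(d_1=d_2=1\), where the formula predicts \(q^{k+m}-q^{k+1}=16-8=8\) coprime pairs:

```python
# Brute-force check of Lemma 2.4 for q = 2, m = 2, a = (1, 1), d1 = d2 = 1.
# Polynomials over F_2 are bitmasks (bit i = coefficient of x^i).

def pdeg(f):
    return f.bit_length() - 1  # degree; -1 for the zero polynomial

def pmod(f, g):
    """Remainder of f modulo a nonzero g over F_2."""
    while pdeg(f) >= pdeg(g):
        f ^= g << (pdeg(f) - pdeg(g))
    return f

def pgcd(f, g):
    while g:
        f, g = g, pmod(f, g)
    return f

# f_i = x^2 + h_i with deg h_i <= 1, i.e. f_i = 0b100 | h_i for h_i in {0,...,3}.
count = sum(1 for h1 in range(4) for h2 in range(4)
            if pgcd(0b100 | h1, 0b100 | h2) == 1)
# Lemma 2.4 with k = d1 + d2 = 2 predicts q^(k+m) - q^(k+1) = 16 - 8 = 8.
```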

As the language of control theory is used in the proof of our main result, we collate here a few definitions [11, IX.2] and results that are referred to later on. In what follows, F denotes an arbitrary field and \(k,\ell \) are fixed positive integers.

Definition 2.5

A matrix pair \((A,B)\in M_{k,k}(F)\times M_{k,\ell }(F)\) is a reachable pair if the \(k\times k\ell \) matrix \(S(A,B):= \begin{bmatrix} B&AB&\cdots&A^{k-1}B \end{bmatrix}\) has rank equal to k.

Remark 2.6

A pair \((A,B)\) is reachable if and only if the polynomial matrix \([xI_k-A \; B]\) is unimodular.
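The rank condition of Definition 2.5 is straightforward to test computationally. A minimal sketch over a prime field \({{\mathbb {F}}}_p\) (our own implementation; the helper names are not from the text):

```python
# The reachability test of Definition 2.5: rank [B  AB ... A^(k-1)B] == k.

def matmul(A, B, p):
    return [[sum(a * b for a, b in zip(row, col)) % p
             for col in zip(*B)] for row in A]

def rank_mod(rows, p):
    """Rank over F_p (p prime) by Gaussian elimination."""
    M = [r[:] for r in rows]
    rank = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][col] % p), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        M[rank] = [x * inv % p for x in M[rank]]
        for i in range(len(M)):
            if i != rank and M[i][col] % p:
                c = M[i][col]
                M[i] = [(a - c * b) % p for a, b in zip(M[i], M[rank])]
        rank += 1
    return rank

def is_reachable(A, B, p):
    """Build S(A, B) block by block and compare its rank with k = order of A."""
    k = len(A)
    S, M = [row[:] for row in B], B
    for _ in range(k - 1):
        M = matmul(A, M, p)
        S = [s + m for s, m in zip(S, M)]  # concatenate the next block A^i B
    return rank_mod(S, p) == k

# For A = [[0,1],[0,0]], B = [[0],[1]] over F_2, S(A, B) = [[0,1],[1,0]] has
# rank 2, so the pair is reachable, matching unimodularity of [xI_2 - A  B].
```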

Definition 2.7

Associate with each pair \((A,B)\in M_{k,k}(F)\times M_{k,\ell }(F)\) a sequence of integers \(p_i\;(i\ge 1)\) by defining \(p_1:=\mathrm {rank}\; B\) and for \(i\ge 2\),

$$\begin{aligned} p_i:= {\text {rank}}\begin{bmatrix} B&AB&\cdots&A^{i-1}B \end{bmatrix} - {\text {rank}}\begin{bmatrix} B&AB&\cdots&A^{i-2}B \end{bmatrix}. \end{aligned}$$

Consider the dual sequence \(k_j\;(j\ge 1)\) defined by \(k_j=\#\{r:p_r\ge j\}.\) The numbers \(k_1,\ldots ,k_\ell \) are called the controllability indices of the pair \((A,B)\).

For any positive integer m, denote by \({\text {GL}}_m(F)\) the general linear group of \(m\times m\) nonsingular matrices over F. Define [28, p. 3]

$$\begin{aligned} \Gamma _{k,\ell }:=\left\{ \begin{bmatrix} P &{} \mathbf{0}\\ R &{} Q \end{bmatrix}\in {\text {GL}}_{k+\ell }(F): P\in {\text {GL}}_k(F), Q\in {\text {GL}}_\ell (F), R\in M_{\ell ,k}(F)\right\} . \end{aligned}$$

Definition 2.8

Two pairs \((A_1,B_1)\) and \((A_2,B_2)\) in \( M_{k,k}(F)\times M_{k,\ell }(F)\) are said to be \(\Gamma _{k,\ell }\)-equivalent [28, Def. 2.1] if there exists a matrix \(P\in \Gamma _{k,\ell }\) such that for each pair of matrices \(C_1\in M_{\ell ,k}(F)\) and \(D_1\in M_\ell (F)\), there exist matrices \(C_2\in M_{\ell ,k}(F)\) and \(D_2\in M_{\ell }(F)\) such that

$$\begin{aligned} P \begin{bmatrix} A_1 &{}\quad B_1\\ C_1 &{}\quad D_1 \end{bmatrix} P^{-1}=\begin{bmatrix} A_2 &{}\quad B_2\\ C_2 &{}\quad D_2 \end{bmatrix}. \end{aligned}$$

When the values of \(k,\ell \) are clear from the context, we refer to \(\Gamma _{k,\ell }\)-equivalence simply as \(\Gamma \)-equivalence. The following result [1, 28, Lem. 2.7] is due to Brunovský.

Theorem 2.9

Let \((A,B)\in M_{k,k}(F)\times M_{k,\ell }(F)\). Suppose \((A,B)\) is a reachable pair with \({\text {rank}}B=r\) and \(k_1\ge \cdots \ge k_r>k_{r+1}=\cdots =k_\ell (=0)\) are the controllability indices of \((A,B)\). Then, \((A,B)\) is \(\Gamma \)-equivalent to a pair \((A_c,B_c)\in M_{k,k}(F)\times M_{k,\ell }(F)\) of the following form:

  1. (i)

    \(A_c\) is the block diagonal matrix \({\text {diag}}(A_1,\ldots ,A_r)\) where \(A_i\) is the \(k_i\times k_i\) matrix

    $$\begin{aligned} \begin{bmatrix} \mathbf{0} &{}\quad I_{k_i-1}\\ 0 &{}\quad \mathbf{0} \end{bmatrix}; \end{aligned}$$
  2. (ii)

    \(B_c\) is of the block form \([ B'\; \mathbf{0}]\), where \(B'\) denotes the \(k\times r\) matrix

    $$\begin{aligned} B'=\begin{bmatrix} E_1 \\ \vdots \\ E_r \end{bmatrix}, \text{ where } E_i=\begin{bmatrix} \mathbf{0}\\ e_i \end{bmatrix} \in M_{k_i,r}(F), \end{aligned}$$

    and \(e_i\) denotes the ith row of the \(r\times r\) identity matrix.
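As a sanity check on Definition 2.7 and Theorem 2.9, one can verify computationally that the canonical pair attached to prescribed indices recovers exactly those indices. A Python sketch for \((k_1,k_2)=(2,1)\) over \({{\mathbb {F}}}_2\) (illustration only; helper names are ours):

```python
# Verify the controllability indices of the Brunovsky canonical pair with
# (k_1, k_2) = (2, 1), so k = 3 and r = ell = 2, over F_2.

def rank_gf2(rows):
    """Rank over F_2 of a list of 0/1 rows (xor-basis reduction)."""
    basis = []
    for row in rows:
        v = 0
        for b in row:
            v = (v << 1) | (b & 1)
        for w in basis:
            v = min(v, v ^ w)
        if v:
            basis.append(v)
    return len(basis)

def matmul2(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % 2
             for col in zip(*B)] for row in A]

A = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]   # diag(A_1, A_2): A_1 is the 2x2 block of Theorem 2.9, A_2 = [0]
B = [[0, 0],
     [1, 0],
     [0, 1]]      # B_c stacked from E_1, E_2 as in Theorem 2.9

ranks, S, M = [], [], B
for _ in range(3):  # the k = 3 block columns of S(A, B)
    S = [s + m for s, m in zip(S, M)] if S else [row[:] for row in M]
    ranks.append(rank_gf2(S))
    M = matmul2(A, M)
p = [ranks[0]] + [ranks[i] - ranks[i - 1] for i in (1, 2)]
indices = [sum(1 for pi in p if pi >= j) for j in (1, 2)]  # dual sequence k_j
```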

The following lemma is our main result.

Lemma 2.10

Let n, k be integers with \(0\le k<n-1\). Let V be an n-dimensional vector space over \({{\mathbb {F}}}_q\) and let \(W,W'\) be fixed subspaces of V of dimensions k and \(k+1\), respectively, with \(W\subset W'\). Suppose \(T:W\rightarrow V\) is a simple linear transformation. Then, the number of simple linear transformations \(T':W'\rightarrow V\) such that \(T'_{\mid W}=T\) (the restriction of \(T'\) to W is T) is equal to \(q^n-q^{k+1}\).

Proof

First suppose \(k=0\). In this case \(W'\) is spanned by some nonzero vector \(w\in V\). Then, \(T'\) is simple precisely when \(T'(w)\) does not lie in the span of w. So the number of such linear transformations is clearly \(q^n-q\).

Suppose \(k\ge 1\). Let \({\mathcal {B}}_k = \{v_1,\ldots ,v_k \}\) be an ordered basis for W and \({\mathcal {B}}_n = \{v_1,\ldots ,v_n \}\) be an ordered basis for V obtained by extending \({\mathcal {B}}_k\). Let Y be the matrix of T with respect to \({\mathcal {B}}_k\) and \({\mathcal {B}}_n\). Since T is simple, \( \mathbf{Y} = x I_{n,k} - Y\) is unimodular by Proposition 2.3. Suppose that \(Y = \left[ \!\!\begin{array}{c} A\\ C\\ \end{array}\!\!\right] \) for some \(A \in M_{k,k}({\mathbb {F}}_{q})\) and \(C\in M_{n-k,k}({\mathbb {F}}_{q})\). Since \( \mathbf{Y} = x I_{n,k} - Y \) is unimodular, it follows by Remark 2.6 that \((A^t,C^t)\) is a reachable pair. Suppose that \(\text{ rank }(C) = r\), and \(k_1\ge k_2\ge \cdots \ge k_r > k_{r+1} = \cdots = k_{n-k} = 0\) are the controllability indices of the pair \((A^t,C^t)\). We have \(k_1 + \cdots + k_r = k\). By Theorem 2.9 (applied to the pair \((A^t,C^t)\) and then transposing), we may assume that A and C are of the following form:

\(A = \text{ diag }(A_1,A_2,\ldots ,A_r)\), where \(A_i\) is the \(k_i\times k_i\) matrix \(\left[ \begin{array}{cc} \mathbf{0}&{}\quad 0\\ I_{k_i-1} &{}\quad \mathbf{0}\\ \end{array}\right] ;\)

\(C = \left[ \!\!\begin{array}{c} C'\\ \mathbf{0}\\ \end{array}\!\!\right] \), where \( C' = \left[ E_1\,\cdots \,E_r\right] \in M_{r,k}({\mathbb {F}}_{q})\) with \(E_i = [\mathbf{0}\,e_i] \in M_{r,k_i}({\mathbb {F}}_{q}),\) and \(e_i\) denotes the ith column of the \(r\times r\) identity matrix for \(1\le i\le r\). Let \(\lambda _s = \sum _{i=1}^s k_i\) for \(1\le s\le r\) and set \(\lambda _0=0\). Then, the linear transformation T can be described by

$$\begin{aligned} T(v_j) = {\left\{ \begin{array}{ll} v_{k+s} &{}\quad \ \text{ if }\ j = \lambda _s\ \text{ for } \text{ some }\ s,\, 1\le s\le r;\\ v_{j+1} &{}\quad \quad \ \text{ otherwise, } \end{array}\right. } \end{aligned}$$
(2)

where \(1\le j\le k\). Also the matrix Y can be described by

$$\begin{aligned} Y = [\mathbf{e}_{\lambda _0+2},\ldots ,\mathbf{e}_{\lambda _1},\mathbf{e}_{k+1}, \mathbf{e}_{\lambda _1+2},\ldots ,\mathbf{e}_{\lambda _2},\mathbf{e}_{k+2}, \ldots , \mathbf{e}_{\lambda _{r-1}+2},\ldots ,\mathbf{e}_{\lambda _r},\mathbf{e}_{k+r}], \end{aligned}$$

where \(\mathbf{e}_i\) is the ith column of the identity matrix \(I_n\). Let \(U = \mathrm {span}({\mathcal {B}}_n{\setminus }{\mathcal {B}}_k)\) be the subspace of V spanned by \(\{v_{k+1},\ldots ,v_n\}\). We have \(V = W\oplus U\).

Now \(W\subset W'\) and \(W'\) is of dimension \(k+1\). Since \(V = W\oplus U\), there is a nonzero vector \(w\in W' \cap U\). Let \(\{v_{k+1}'=w,v_{k+2}',\ldots ,v_n'\}\) be an ordered basis for U. Since \(V = W\oplus U\), it follows that \({\mathcal {B}}_n' = \{v_1,\ldots ,v_k,v_{k+1}',\ldots ,v_n'\}\) is an ordered basis for V. Let R be the matrix of the identity map \(1_V\) on V with respect to the bases \({\mathcal {B}}_n'\) and \({\mathcal {B}}_n\). Note that the matrix R can be expressed as

$$\begin{aligned} \left[ \begin{array}{cc} I_k &{}\quad \mathbf{0}\\ \mathbf{0}&{}\quad S\\ \end{array}\right] , \end{aligned}$$

where S is the matrix of the identity map \(1_U\) on U with respect to the bases \({\mathcal {B}}_n'{\setminus } {\mathcal {B}}_k\) and \({\mathcal {B}}_n{\setminus } {\mathcal {B}}_k\). Let \(v_{k+1}' = \sum _{j=k+1}^{n} c_j v_j\) for some scalars \(c_j\). Then, the first column of S is given by \((c_{k+1},\ldots ,c_n)^t\in {\mathbb {F}}_{q}^{n-k}\). The matrix \({\tilde{Y}}\) of T with respect to \({\mathcal {B}}_k\) and \({\mathcal {B}}_n'\) is given by \({\tilde{Y}} = R^{-1}Y\). Define \({\mathcal {B}}_{k+1}' = \{v_1,\ldots ,v_k,v_{k+1}' \}\) and let \(Y'\) be the matrix of \(T'\) with respect to the bases \({\mathcal {B}}_{k+1}'\) and \({\mathcal {B}}_n'\). Since \(T'_{\mid W}=T\), we have \(Y' = R^{-1} [Y\,\mathbf{b}]\) for some column vector \(\mathbf{b}\in {\mathbb {F}}_q^n\). By Proposition 2.3, \(T'\) is simple if and only if \(\mathbf{Y}' = x I_{n,k+1} - Y'\) is unimodular. Let \(\mathbf{Y_b} = R \mathbf{Y}' = R (x I_{n,k+1} - R^{-1} [Y\,\mathbf{b}]) = x R_{k+1} - [Y\,\mathbf{b}]\), where \(R_{k+1}\) is the submatrix formed by the first \((k+1)\) columns of R. We have \(\mathbf{Y_b} = [\mathbf{Y}\, x\mathbf{c}-\mathbf{b}]\), where \(\mathbf{c} = (0,\ldots ,0,c_{k+1},\ldots ,c_n)^t\in {\mathbb {F}}_{q}^n\).

Suppose \(\mathbf{b}= (b_1, b_2, \ldots , b_n)^t \in {\mathbb {F}}_{q}^n\). Then, the matrix \(Y_\mathbf{b} = [Y\ \mathbf{b}]\) is of the form

$$\begin{aligned} Y_\mathbf{b} = \left[ \begin{array}{ccccc} A_1 &{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_1 \\ \mathbf{0}&{}\quad A_2 &{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_2 \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots \\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{}\quad A_r &{}\quad \mathbf{b}_r\\ E_1 &{}\quad E_2 &{}\quad \ldots &{}\quad E_r &{}\quad \tilde{\mathbf{b}}\\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{} \quad \mathbf{0}&{}\quad {\hat{\mathbf{b}}}\\ \end{array}\right] , \end{aligned}$$
(3)

where \(\mathbf{b}_i = (b_{\lambda _{i-1}+1},\ldots ,b_{\lambda _{i}})^t \in {\mathbb {F}}_{q}^{k_i}\) for \(1\le i\le r\), \(\tilde{\mathbf{b}} = (b_{k+1},\ldots ,b_{k+r})^t\in {\mathbb {F}}_{q}^{r}\), and \({\hat{\mathbf{b}}} = (b_{k+r+1},\ldots ,b_{n})^t\in {\mathbb {F}}_{q}^{n-k-r}\).

Now consider the polynomial matrix \(\mathbf{Y}_\mathbf{b} = [\mathbf{Y}\, x\mathbf{c}-\mathbf{b}]\). We permute the rows of \(\mathbf{Y}_\mathbf{b}\) in the following way: for each \(1\le i\le r-1\), arrange the \((k+i)\)th row of \(\mathbf{Y}_\mathbf{b}\) in between the ith and \((i+1)\)th block rows appearing in (3). The resulting matrix \(\mathbf{Z}\) is of the following form:

$$\begin{aligned} \mathbf{Z} = \left[ \begin{array}{ccccc} \mathbf{Z}_1 &{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_1' \\ \mathbf{0}&{}\quad \mathbf{Z}_2 &{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_2' \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots \\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{Z}_r &{}\quad \mathbf{b}_r'\\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}'\\ \end{array}\right] , \end{aligned}$$
(4)

where \(\mathbf{Z}_i = x \left[ \!\!\begin{array}{c} I_{k_i}\\ \mathbf{0}\\ \end{array}\!\!\right] - \left[ \!\!\begin{array}{c} \mathbf{0}\\ I_{k_i}\\ \end{array}\!\!\right] \), \(\mathbf{b}_i' = \left[ \!\!\begin{array}{c} -\mathbf{b}_i\\ c_{k+i}x-b_{k+i}\\ \end{array}\!\!\right] \) for \(1\le i\le r \text{ and } \mathbf{b}' = (c_{k+r+1}x-b_{k+r+1},\ldots ,c_{n}x-b_{n})^t\). Now we apply the following sequence of elementary row operations to \(\mathbf{Z}\) to eliminate x in the first k columns: in the first block row appearing in (4), add x times the \((i+1)\)th row to the ith row successively for \(i=k_1, k_1-1, \ldots , 1\) in that order. Similarly, we apply elementary row operations to the other block rows. By appropriate elementary column operations, the entries in the last column can be made zero at suitable positions. Eventually, we can transform the matrix to the following form:

$$\begin{aligned} \mathbf{Z}' = \left[ \begin{array}{ccccc} \mathbf{Z}'_1 &{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_1'' \\ \mathbf{0}&{}\quad \mathbf{Z}_2' &{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}_2'' \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots \\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{Z}_r' &{}\quad \mathbf{b}_r''\\ \mathbf{0}&{}\quad \mathbf{0}&{}\quad \ldots &{}\quad \mathbf{0}&{}\quad \mathbf{b}'\\ \end{array}\right] , \end{aligned}$$
(5)

where \(\mathbf{Z}_i' = -\left[ \!\!\begin{array}{c} \mathbf{0}\\ I_{k_i}\\ \end{array}\!\!\right] \), \(\mathbf{b}_i'' = \left[ \!\!\begin{array}{c} f_i\\ \mathbf{0}\\ \end{array}\!\!\right] \) with \(f_i(x) = c_{k+i} x^{k_i+1} - b_{k+i} x^{k_i} - \sum _{j=1}^{k_i} b_{\lambda _{i-1}+j} x^{j-1}\) for \(1\le i\le r\) and \(\mathbf{b}' = (c_{k+r+1}x-b_{k+r+1},\ldots ,c_{n}x-b_{n})^t\).

Let \(g = \gcd (f_1,f_2,\ldots ,f_r, c_{k+r+1} x - b_{k+r+1},\ldots ,c_n x - b_n)\). The matrix \(\mathbf{Z}'\) is unimodular if and only if \(g=1\). By Lemma 2.4, it follows that the number of vectors \(\mathbf{b}\in {\mathbb {F}}_{q}^n\) such that \(g = 1\) is given by \(q^n- q^{k+1}\). As \(\mathbf{Y}'\) and \(\mathbf{Z}'\) are equivalent, the result follows.

\(\square \)

The lemma can be recast in the setting of matrices as follows.

Corollary 2.11

Let \(Y\in M_{n,k}({{\mathbb {F}}}_q)\) be such that the linear matrix polynomial \(xI_{n,k}-Y\) is unimodular. For each column vector \(\mathbf{b}\in {{\mathbb {F}}}_{q}^n\), let \(Y_\mathbf{b}=[Y \;\mathbf{b}]\in M_{n,k+1}({{\mathbb {F}}}_q)\). Then, the number of column vectors \(\mathbf{b}\in {{\mathbb {F}}}_{q}^n\) for which \(xI_{n,k+1}-Y_\mathbf{b}\) is unimodular equals \(q^n-q^{k+1}\).

We can now give an alternate proof of [24, Thm. 3.8] concerning the number of simple linear transformations with a fixed domain.

Corollary 2.12

Let V be an n-dimensional vector space over \({{\mathbb {F}}}_q\) and W be a proper k-dimensional subspace of V. The number of simple linear transformations \(T:W\rightarrow V\) equals

$$\begin{aligned} \prod _{i=1}^{k}(q^n-q^i). \end{aligned}$$

We may use Proposition 2.3 to reformulate the corollary in terms of matrices. This allows us to answer Question 1.1 stated in the introduction.

Corollary 2.13

Let nk be positive integers with \(k<n\). The number of matrices \(A\in M_{n,k}({{\mathbb {F}}}_q)\) such that \(xI_{n,k}-A\) is unimodular equals

$$\begin{aligned} \prod _{i=1}^{k}(q^n-q^i). \end{aligned}$$
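The corollary can be confirmed by exhaustive search for small parameters, using the reachability criterion from the proof of Lemma 2.10 as the unimodularity test. A Python sketch for \(q=2\), \(n=3\), \(k=2\) (illustration only; names are ours):

```python
# Exhaustive check of Corollary 2.13 for q = 2, n = 3, k = 2. As in the proof
# of Lemma 2.10, for Y = [A; C] (A the top k x k block, C the rest),
# x I_{n,k} - Y is unimodular iff (A^t, C^t) is a reachable pair.

def rank_gf2(rows):
    """Rank over F_2 of a list of 0/1 rows (xor-basis reduction)."""
    basis = []
    for row in rows:
        v = 0
        for b in row:
            v = (v << 1) | (b & 1)
        for w in basis:
            v = min(v, v ^ w)
        if v:
            basis.append(v)
    return len(basis)

def unimodular_linear(Y):
    """Unimodularity of x I_{n,k} - Y over F_2 via reachability of (A^t, C^t)."""
    n, k = len(Y), len(Y[0])
    At = [[Y[j][i] for j in range(k)] for i in range(k)]
    M = [[Y[j][i] for j in range(k, n)] for i in range(k)]  # C^t
    S = [row[:] for row in M]
    for _ in range(k - 1):
        M = [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*M)]
             for row in At]
        S = [s + m for s, m in zip(S, M)]  # append the next block of S(A^t, C^t)
    return rank_gf2(S) == k

count = sum(unimodular_linear([[bits >> 0 & 1, bits >> 1 & 1],
                               [bits >> 2 & 1, bits >> 3 & 1],
                               [bits >> 4 & 1, bits >> 5 & 1]])
            for bits in range(64))
# Corollary 2.13 predicts (2^3 - 2)(2^3 - 2^2) = 24 of the 64 matrices.
```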

By repeated application of Corollary 2.11, we obtain the following extension which is used later on in Sects. 3 and 4.

Lemma 2.14

Let n, k, t be positive integers such that \(k+t<n\). Suppose that the matrix polynomial \(xI_{n,k}-Y\) is unimodular for some \(Y\in M_{n,k}({{\mathbb {F}}}_q)\). The number of matrices \(A\in M_{n,t}({\mathbb {F}}_{q})\) such that the matrix polynomial

$$\begin{aligned} x I_{n,k+t} - [Y \; A] \end{aligned}$$

is unimodular is equal to \(\prod _{i=1}^{t} (q^n - q^{k+i})\).

3 Splitting subspaces

Recall the definition of splitting subspace given earlier in the introduction.

Definition 3.1

Let dm be positive integers and consider the vector space \({{\mathbb {F}}}_{q^{md}}\) over \({{\mathbb {F}}}_q\). For any element \(\alpha \in {{\mathbb {F}}}_{q^{md}}\), an m-dimensional subspace W of \({{\mathbb {F}}}_{q^{md}}\) is \(\alpha \)-splitting if

$$\begin{aligned} {{\mathbb {F}}}_{q^{md}}=W\oplus \alpha W\oplus \cdots \oplus \alpha ^{d-1}W. \end{aligned}$$

Closely related to splitting subspaces are block companion matrices which we define below.

Definition 3.2

For positive integers md, an (md)-block companion matrix over \({{\mathbb {F}}}_q\) is a matrix in \(M_{md}({{\mathbb {F}}}_q)\) of the form

$$\begin{aligned} \begin{pmatrix} {\mathbf {0}} &{}\quad {\mathbf {0}} &{}\quad \cdots &{}\quad {\mathbf {0}} &{}\quad {\mathbf {0}} &{}\quad C_0\\ I_m &{}\quad {\mathbf {0}} &{}\quad \cdots &{}\quad {\mathbf {0}} &{}\quad {\mathbf {0}} &{}\quad C_1\\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ {\mathbf {0}} &{}\quad {\mathbf {0}} &{}\quad \cdots &{}\quad I_m &{}\quad {\mathbf {0}} &{}\quad C_{d-2}\\ {\mathbf {0}} &{}\quad {\mathbf {0}} &{}\quad \cdots &{}\quad {\mathbf {0}} &{}\quad I_m &{}\quad C_{d-1} \end{pmatrix}, \end{aligned}$$
(6)

where \(C_0, C_1, \dots , C_{d-1}\in M_m({{\mathbb {F}}}_q)\) and \(I_m\) denotes the \(m\times m\) identity matrix over \({{\mathbb {F}}}_q\) while \({\mathbf {0}}\) denotes the zero matrix in \(M_m({{\mathbb {F}}}_q)\).

Remark 3.3

It was shown (see the discussion after Conjecture 5.5 in [9] or Appendix A in [10] for an overview) that the Splitting Subspace Theorem is in fact equivalent to the following theorem on block companion matrices.

Theorem 3.4

For any irreducible polynomial \(f\in {{\mathbb {F}}}_q[x]\) of degree md, the number of (md)-block companion matrices over \({{\mathbb {F}}}_q\) having f as their characteristic polynomial equals

$$\begin{aligned} q^{m(m-1)(d-1)}\prod _{i=1}^{m-1}(q^m-q^i). \end{aligned}$$

It is noteworthy that the problem of counting specific types of block companion matrices having irreducible characteristic polynomial has been considered in other contexts [3, 14, 23] where pseudorandom number generation is of interest. We now deduce Theorem 3.4 as a special case of Theorem 3.9 which we prove below, thereby providing an alternate proof of the Splitting Subspace Theorem.

Definition 3.5

For positive integers \(k,\ell \) with \(k<\ell \), let \(J^{\ell ,k}\) denote the \(\ell \times k\) matrix given by

$$\begin{aligned} J^{\ell ,k} := \left[ \!\!\begin{array}{c}\mathbf{0}\\ I_k\\ \end{array}\!\!\right] . \end{aligned}$$

Lemma 3.6

The linear matrix polynomial

$$\begin{aligned} x \left[ \!\!\begin{array}{c} I_k\\ \mathbf{0}\\ \end{array}\!\!\right] - J^{\ell ,k} \end{aligned}$$

is unimodular.

Proof

Since the \(k\times k\) minor formed by the last k rows of the above matrix polynomial equals \((-1)^k\), it follows that the GCD of all \(k\times k\) minors is 1. \(\square \)

Definition 3.7

Let \(m, \ell \) be positive integers such that \(m<\ell \). An m-companion matrix of order \(\ell \) over \({\mathbb {F}}_{q}\) is a square matrix C of the form

$$\begin{aligned} C = [J^{\ell ,\ell -m}\ A]\end{aligned}$$

for some \(A\in M_{\ell ,m}({\mathbb {F}}_{q})\). We denote the set of all m-companion matrices of order \(\ell \) over \({{\mathbb {F}}}_q\) by \({{\mathcal {C}}}(\ell , m; q)\). Note that \(|{{\mathcal {C}}}(\ell , m; q)| = q^{\ell m}\).

Let \({{\mathcal {P}}}(\ell ,{\mathbb {F}}_{q})\) denote the set of all monic polynomials of degree \(\ell \) over \({\mathbb {F}}_{q}\). Now consider the map \(\Phi : {{\mathcal {C}}}(\ell , m; q) \rightarrow {{\mathcal {P}}}(\ell ,{\mathbb {F}}_{q})\) given by

$$\begin{aligned}\Phi (C) := \det (x I_\ell - C).\end{aligned}$$

To determine the size of the fibers of \(\Phi \), we require a theorem of Wimmer.

Theorem 3.8

(Wimmer) Let F be an arbitrary field and let \(Y\in M_{\ell ,k}(F)\). Suppose \(f\in F[x]\) is a monic polynomial of degree \(\ell \) and let \(f_1(x)\mid \cdots \mid f_k(x)\) be the invariant factors of the polynomial matrix \(xI_{\ell ,k}-Y\). There exists a matrix \(Z \in M_{\ell ,\ell -k}(F)\) such that the block matrix \([Y\; Z]\) has characteristic polynomial f(x) if and only if the product \(\prod _{i=1}^kf_i(x)\) divides f(x).

Proof

See Wimmer [27] or Cravo [5, Thm. 15]. \(\square \)

Theorem 3.9

Suppose that \(f\in {{\mathcal {P}}}(\ell ,{\mathbb {F}}_{q})\) is irreducible. Then,

$$\begin{aligned}|\Phi ^{-1}(f)| = \prod _{t=1}^{m-1} (q^\ell - q^{\ell -t}).\end{aligned}$$

Proof

Let \(C = [J^{\ell ,\ell -m}\ A]\in {{\mathcal {C}}}(\ell ,m;q)\) with \(A = [\mathbf{a}_1\ \mathbf{a}_2\,\cdots \,\mathbf{a}_{m-1}\,\mathbf{a}_m]\), where the \(\mathbf{a}_i\)’s are the columns of A. Let \(C_0 = J^{\ell ,\ell -m}\) and let \(C_i = [J^{\ell ,\ell -m}\ \mathbf{a}_1\ \mathbf{a}_2\,\cdots \,\mathbf{a}_{i}] \) denote the submatrix of C formed by the first \(\ell -m+i\) columns for \(1\le i < m\). Suppose that \(\Phi (C) = f\). Since f is irreducible, it follows by Lemma 3.6 and Wimmer’s theorem that the linear matrix polynomials

$$\begin{aligned} x \left[ \!\!\begin{array}{c} I_{\ell -m+i}\\ \mathbf{0}\\ \end{array}\!\!\right] - C_{i} \end{aligned}$$
(7)

are unimodular for \(0 \le i\le m-1\). Conversely, if \(\mathbf{a}_1,\ldots , \mathbf{a}_{m-1}\) are chosen such that the matrix polynomials in (7) are unimodular, then there is a unique choice of \(\mathbf{a}_m\) for which \(\Phi (C)=f\). Indeed, since the matrix polynomial in (7) corresponding to \(i=m-1\) is unimodular, all its invariant factors equal 1, so Wimmer’s theorem ensures that every monic polynomial g of degree \(\ell \) arises as the characteristic polynomial of \([C_{m-1}\ \mathbf{a}_m]\) for some choice of \(\mathbf{a}_m\); as there are exactly \(q^\ell \) choices for \(\mathbf{a}_m\) and \(q^\ell \) monic polynomials of degree \(\ell \), each monic polynomial, and in particular f, arises from precisely one choice. By Lemma 2.14, it follows that the number of choices for the first \(m-1\) columns of A is equal to \(\prod _{i=1}^{m-1}(q^\ell - q^{\ell -m+i})\) which proves the result. \(\square \)
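For small parameters, the fiber size can be confirmed exhaustively. The following Python sketch (illustration only; names are ours) checks \(q=2\), \(\ell =3\), \(m=2\) with \(f=x^3+x+1\), computing characteristic polynomials over \({{\mathbb {F}}}_2\) directly from the trace, the principal \(2\times 2\) minors, and the determinant:

```python
# Brute-force check of Theorem 3.9 for q = 2, ell = 3, m = 2:
# C = [J^{3,1}  A] with A in M_{3,2}(F_2), and f = x^3 + x + 1 (irreducible).

def charpoly_gf2_3x3(M):
    """Coefficients (c2, c1, c0) of det(xI - M) = x^3 + c2 x^2 + c1 x + c0
    over F_2 (all signs vanish modulo 2)."""
    c2 = (M[0][0] + M[1][1] + M[2][2]) % 2
    c1 = sum(M[i][i] * M[j][j] + M[i][j] * M[j][i]      # principal 2x2 minors
             for i in range(3) for j in range(i + 1, 3)) % 2
    c0 = (M[0][0] * (M[1][1] * M[2][2] + M[1][2] * M[2][1])
        + M[0][1] * (M[1][0] * M[2][2] + M[1][2] * M[2][0])
        + M[0][2] * (M[1][0] * M[2][1] + M[1][1] * M[2][0])) % 2
    return (c2, c1, c0)

fiber = 0
for bits in range(64):
    a = [bits >> i & 1 for i in range(6)]
    C = [[0, a[0], a[1]],
         [0, a[2], a[3]],
         [1, a[4], a[5]]]                    # first column is J^{3,1}
    if charpoly_gf2_3x3(C) == (0, 1, 1):     # x^3 + x + 1
        fiber += 1
# Theorem 3.9 predicts prod_{t=1}^{1} (2^3 - 2^2) = 4.
```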

Remark 3.10

In the case where m divides \(\ell \), say \(d = \ell /m \), the set \({{\mathcal {C}}}(\ell , m; q)\) consists precisely of all (md)-block companion matrices over \({\mathbb {F}}_{q}\). This observation yields the following corollary stated earlier as Theorem 3.4.

Corollary 3.11

For any irreducible polynomial \(f\in {{\mathbb {F}}}_q[x]\) of degree md, the number of (md)-block companion matrices over \({{\mathbb {F}}}_q\) having f as their characteristic polynomial equals

$$\begin{aligned} q^{m(m-1)(d-1)}\prod _{i=1}^{m-1}(q^m-q^{i}). \end{aligned}$$

Proof

It follows by the above remark that the number of (md)-block companion matrices over \({{\mathbb {F}}}_q\) having f as their characteristic polynomial equals

$$\begin{aligned} \prod _{i=1}^{m-1}(q^{md}-q^{m(d-1)+i})=\prod _{i=1}^{m-1}q^{m(d-1)}(q^m-q^{i}), \end{aligned}$$

which is clearly equal to the given product. \(\square \)

In light of the above corollary and Remark 3.3, we can view Theorem 3.9 as a more general result than the Splitting Subspace Theorem. While our proof relies on results in control theory, it is shorter than the proofs of the theorem appearing in [2] and [17].

4 Probability of unimodular polynomial matrices

We apply Lemma 2.14 to positively resolve a conjecture [24, Conj. 4.1] concerning the number of unimodular polynomial matrices. For positive integers d, k, n with \(k< n\), define

$$\begin{aligned} M_{n,k}({\mathbb {F}}_{q}[x];d) := \left\{ \mathbf{A} = x^d I_{n,k} + \sum _{i=0}^{d-1} x^i A_i\ : \ A_i \in M_{n,k}({\mathbb {F}}_{q})\ \text{ for }\ 0\le i\le d-1\right\} .\end{aligned}$$

Theorem 4.1

The probability that a uniformly random element of \(M_{n,k}({\mathbb {F}}_{q}[x];d)\) is unimodular is given by \(\prod _{i=1}^k (1-q^{i-n})\).

Proof

To each element \(\mathbf{A}\) in \(M_{n,k}({\mathbb {F}}_{q}[x];d)\), we associate the corresponding d-tuple of its coefficients \((A_0, A_1,\ldots ,A_{d-1}) \in [M_{n,k}({\mathbb {F}}_{q})]^d\). Now consider the matrix

$$\begin{aligned} B = \left[ \begin{array}{cccc|c} \mathbf{0}&{}\quad \cdots &{}\quad \mathbf{0}&{}\quad \mathbf{0}&{}\quad -A_0\\ I_n &{}\quad \cdots &{}\quad \mathbf{0}&{}\quad \mathbf{0}&{}\quad -A_1\\ \vdots &{}\quad \ddots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ \mathbf{0}&{}\quad \cdots &{}\quad I_n &{}\quad \mathbf{0}&{}\quad -A_{d-2}\\ \mathbf{0}&{}\quad \cdots &{}\quad \mathbf{0}&{}\quad I_n &{}\quad -A_{d-1}\\ \end{array}\right] = \left[ J^{nd,(d-1)n}\;\; \widetilde{A}\right] \end{aligned}$$
(8)

of dimension \(nd \times (nd-n+k)\), where \(\widetilde{A}\in M_{nd,k}({\mathbb {F}}_{q})\) is the block column stacking \(-A_0,\ldots ,-A_{d-1}\). Let

$$\begin{aligned} \mathbf{B} = x \left[ \!\!\begin{array}{c} I_{(d-1)n+k}\\ \mathbf{0}\\ \end{array}\!\!\right] - B . \end{aligned}$$

By adding x times the ith block row to the \((i-1)\)th block row successively for \(i = d, d-1, \ldots , 2\) in \(\mathbf{B}\) and using suitable column block operations, we obtain

$$\begin{aligned} \mathbf{B}' = \left[ \begin{array}{cc} \mathbf{0}&{}\quad \mathbf{A}\\ -I_{(d-1)n} &{}\quad \mathbf{0}\\ \end{array}\right] , \end{aligned}$$

where \(\mathbf{A}=x^d I_{n,k}+ \sum _{i=0}^{d-1} x^i A_i \in M_{n,k}({\mathbb {F}}_{q}[x];d)\). Observe that \(\mathbf{B}\) is equivalent to \(\mathbf{B}'\). So the invariant factors of \(\mathbf{B}\) and \(\mathbf{B}'\) are the same. Therefore, \(\mathbf{B}\) is unimodular if and only if \(\mathbf{A}\) is unimodular. By Lemma 2.14, the number of ways to choose the last k columns of the matrix B in (8) in such a way that \(\mathbf{B}\) is unimodular is

$$\begin{aligned}\prod _{i=1}^{k} (q^{nd}-q^{n(d-1)+i}).\end{aligned}$$

On the other hand, the cardinality of \(M_{n,k}({\mathbb {F}}_{q}[x];d)\) is clearly \(q^{nkd}\) and therefore the probability that a uniformly random element of \(M_{n,k}({\mathbb {F}}_{q}[x];d)\) is unimodular is precisely \(\prod _{i=1}^k (1-q^{i-n})\). \(\square \)

Note that the probability computed in the theorem is independent of d.
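Theorem 4.1 is likewise easy to confirm exhaustively for small parameters. The following Python sketch (illustration only; the polynomial encoding is ours) checks \(q=2\), \(n=2\), \(k=1\), \(d=2\), where the predicted probability is \(1-2^{-1}=1/2\), i.e. 8 of the 16 elements:

```python
# Brute-force check of Theorem 4.1 for q = 2, n = 2, k = 1, d = 2.
# Elements of M_{2,1}(F_2[x]; 2) are columns (x^2 + a1 x + a0, b1 x + b0)^t;
# a 2 x 1 polynomial matrix is unimodular iff its two entries are coprime.
# Polynomials over F_2 are bitmasks (bit i = coefficient of x^i).

def pdeg(f):
    return f.bit_length() - 1  # degree; -1 for the zero polynomial

def pmod(f, g):
    """Remainder of f modulo a nonzero g over F_2."""
    while pdeg(f) >= pdeg(g):
        f ^= g << (pdeg(f) - pdeg(g))
    return f

def pgcd(f, g):
    while g:
        f, g = g, pmod(f, g)
    return f

total = unimodular = 0
for a in range(4):          # a1 x + a0, so the first entry is 0b100 | a
    for b in range(4):      # b1 x + b0
        total += 1
        if pgcd(0b100 | a, b) == 1:
            unimodular += 1
# Theorem 4.1 predicts probability (1 - 2^{1-2}) = 1/2, i.e. 8 of 16.
```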

Remark 4.2

The above theorem is a generalization of Corollary 2.13 which is evidently the special case \(d=1\).

Theorem 4.1 parallels a result of Guo and Yang [13, Thm. 1] who prove that the natural density of unimodular \(n\times k\) matrices over \({{\mathbb {F}}}_q[x]\) is precisely \(\prod _{i=1}^k (1-q^{i-n})\).

Remark 4.3

To study the invariant factors of an element \(\mathbf{A} \in M_{n,k}({\mathbb {F}}_{q}[x];d)\), it suffices to study those of the corresponding linear matrix polynomial \(\mathbf{B}\) associated with the matrix B as defined in Equation (8). The matrix polynomial \(\mathbf{B}\) is called the linearization of \(\mathbf{A}\).